All of Programming [ PART I ]

Published by Willington Island, 2021-09-02 03:28:09

Description: All of Programming provides a platform for instructors to design courses which properly place their focus on the core fundamentals of programming, or to let a motivated student learn these skills independently. A student who masters the material in this book will not just be a competent C programmer, but also a competent programmer. We teach students how to solve programming problems with a 7-step approach centered on thinking about how to develop an algorithm. We also teach students to deeply understand how the code works by teaching students how to execute the code by hand.

where the pointer itself cannot be changed, but what it points to can be). In the example code above, this rule is applied to the declaration of the parameter x in line 1 and to the variable r in line 9.

Initialization. When we initialize the reference, we implicitly take the address of the value we are initializing from. This initialization must happen in the same statement as the declaration for a variable and is thus different from any other assignment to the reference. We can translate the initialization by imagining a & before the initializing expression. In the example code above, we see this rule applied to the initialization of the variable r in line 9, where we convert the right side of the initialization from z to &z (the address of z, just like in C). For a parameter, initialization happens when we pass the parameter value into the function. This rule means that instead of passing a copy of the value of the expression passed in, we pass an arrow pointing to it. We can see this rule applied to parameter passing in line 8, where we translate passing x in the reference-based code into passing &x in the pointer-based code.

Uses. Any other time we use the reference (whether to name a box on the left side of an assignment statement or in an expression), we implicitly dereference the pointer. We can therefore translate any other use of the reference to a dereference of the pointer that we translated the reference into. We can see

this rule applied in the example code above in lines 2, 3, and 10.

Note that sometimes these rules lead to pairs of operations that “cancel out” (such as &*x, which is just x), when one reference is initialized from another. This behavior is exactly correct, as the pointer is just used directly in these cases: the reference being initialized is another name for whatever “box” the other reference names.

Another important consequence of the way references work is that a call to a function that returns a (non-const) reference may be used as the left side of an assignment statement. To see an example of this principle, consider the following code (which is a simple example of this concept, not at all a good OO design):

Code with References

    class Point {
    private:
      int x;
    public:
      int & getX() {
        return x;
      }
    };
    //...
    //...
    myPoint.getX() = 3;

Equivalent Code With Pointers

    class Point {
    private:
      int x;
    public:
      int * const getX() {
        return &x;
      }
    };
    //...
    //...
    *(myPoint.getX()) = 3;

In the first piece of code (with references), myPoint.getX() is a valid lvalue, as it names the x box inside of myPoint. We can see how this behaves by looking at the equivalent pointer-based code, in which *(myPoint.getX()) would name the same box in the same way: by following an arrow to it.

Video 14.2: Executing code with references.

Video 14.2 shows execution of C++ code with references by thinking about how they translate into pointer-based code.

Video 14.3: Executing swap, written with references.

As another basic example of references, we can revisit the swap function we saw in Video 8.3. Recall that in this video, we saw how to execute a function that swaps two ints by taking pointers to the boxes that should be swapped. We could instead accomplish the same task by passing references to our swap function. Video 14.3 illustrates such code and its execution. Observe how similar the behavior is to the pointer-based swap function from Video 8.3.

In describing these rules, we put a caveat of “for the most part” on them. There are two reasons for this caveat. The first reason is that a const reference may be initialized from something that is not an lvalue. Remember that lvalue is the technical term for “something that names a box.” When something is not an lvalue, we cannot take its address (as there is no box to get an arrow pointing to), thus we cannot apply the translation rules above. To see a concrete example of this, consider the following C++ code fragment (where most of the bodies of the functions are elided, as indicated by the ellipses):

    void someFunction(const int & x) {
      //...
    }
    void anotherFunction(void) {
      //...
      someFunction(3);
      //...
    }

Passing 3 to someFunction is legal because x is a const int & (a const int reference). If it were just int &, and thus not const, this call would be illegal. However, our translation rules say that this would be equivalent to the following (illegal) code with pointers:

    void someFunction(const int * const x) {
      //...
    }
    void anotherFunction(void) {
      //...
      someFunction(&3);
      //...
    }

Attempting to compile this pointer-based code results in an error, as 3 is not an lvalue, thus we cannot take its address. What actually happens is that the compiler creates a temporary const int variable and passes the address of that variable, such as this (note that the temporary variable does not really have a name, even though we show it with one below):

    void someFunction(const int * const x) {
      //...
    }
    void anotherFunction(void) {
      //...
      {
        const int tempArg = 3;
        someFunction(&tempArg);
      }
      //...
    }

Initializing a reference from something that is not an lvalue is only legal if the reference is constant. Much like a const pointer, a const reference cannot be used to modify the box it refers to. Even though we could imagine letting such code be legal (the temporarily created variable would be modified, then discarded), such a language design choice is more likely to lead to mistakes, as it leads to nonsensical code, such as

    //this code is not legal
    void swap(int & x, int & y) {
      int temp = x;
      x = y;
      y = temp;
    }
    //....
    swap(3, 4); //what is this even supposed to mean?

Of course, we can always declare a temporary variable ourselves, initialize it with whatever value we want, and then use that variable to initialize a non-const reference.

The other reason for our “for the most part” caveat is that pointers and references are distinct types. While these rules give us a translation semantics to understand how references behave (meaning you know how to execute the code by applying familiar rules), actually translating to use pointers instead of references changes the types that variables (and thus expressions that use them) have in the program. While this distinction may seem to be subtle and insignificant, it is actually quite important. As we will see shortly in Section 14.4 and Section 14.5, C++ permits us to declare functions with the same name but different parameter types, as well as to provide definitions of many operators (such as +, *, =, ==, etc.) where user-defined types (or references to user-defined types) are involved. We can define an operator that operates on two references but not on two pointers.

14.3 Namespaces

In C, functions and type names (as well as global variables, if you use them) reside in a global scope, which is visible throughout the entire program. The only way to restrict the visibility of a function’s name is to declare it as static, which restricts its visibility to the compilation unit it is declared in. If a function is not declared static, its name must be unique in the entire program. If it is static, it may not be used in any other compilation unit.

Such a design is not ideal for large pieces of software with many developers, as it introduces the problem of name collisions: developers attempting to name different functions with the same name. This problem can be especially bad if developers want to use multiple libraries (which may be pre-compiled) that have a name collision.

C++ introduces a way to create named scopes, called namespaces, which can be used from anywhere in the program. Declarations can be placed inside of a namespace (e.g., one called somename) by wrapping namespace somename { .... } around the declarations. For example, we might write:

    namespace dataAnalysis {
      class DataPoint { ... };
      class DataSet { ... };
      DataSet combineDataSets(DataSet * array, int n);
    }

While we will not be writing programs large enough that we need to declare multiple namespaces in this book, knowing how to use them is important, especially since the C++ standard library declares its functions and types in the std namespace. Additionally, many popular C++ libraries declare the functions that they provide

inside of a namespace, exactly to avoid name collisions between libraries.

There are two ways to reference a name declared inside of a namespace. The first is to use the scope resolution operator ::. For example, if we want to use the vector class in the C++ standard library, which is in the std namespace, we can reference it by its fully qualified name, std::vector.

The second way to reference names inside of a namespace is to open the namespace with the using keyword. The using keyword instructs the compiler to bring the names from the requested namespace into the current scope. If we were to write using namespace std; we would open the entire namespace, and could just write vector to refer to std::vector. We can also use using to bring particular items from a namespace into the current scope. For example, we can write using std::vector; to bring only the name vector from the std namespace into scope.

We can open multiple namespaces into the same scope with using, but if any of the namespaces have functions of the same name, then we must explicitly specify which function we want when we use it. Note that the compiler only requires you to explicitly specify which function you want if the “best choice” of functions with the same name is ambiguous. As we will discuss shortly, there may be one particular best choice based on the parameter types of the functions in question.

Opening namespaces is generally regarded as something to be done sparingly and possibly avoided entirely in large scopes. As a general rule, the larger the scope, the more wary you should be of opening an entire namespace. Furthermore, the more namespaces you

already have open in a scope, the more wary you should be of opening another. Instead, opening only the particular names you desire (e.g., using std::vector; instead of just using namespace std;) is typically preferable.

14.4 Function Overloading

In C, we can only have one function of a given name visible at any time. If we want to write a max function that takes two doubles and another max function that takes two ints, we must give them different names. Suppose we try to give them both the same name, e.g., we write the following code:

    int max(int x, int y) {
      if (x > y) {
        return x;
      }
      return y;
    }
    double max(double x, double y) {
      if (x > y) {
        return x;
      }
      return y;
    }

Then the compiler will give us the error:

    error: conflicting types for ’max’

However, in C++ this code is perfectly legal: multiple functions of the same name are legal if they have different parameter types. This concept of allowing multiple functions of the same name is called function overloading. Note that an overloading is legal if (and only if) the functions can be distinguished by their parameter types, read as an ordered list. That is, the following

functions all represent valid overloadings (actual contents of the function omitted, as they are irrelevant, and of course, f is generally a terrible name for a function):

    int f(int x, double d) {...}
    double f(double d, int x) {...}
    void f(int y) {...}
    int f(double d) {...}
    double f(void) {}
    int f(int * p) {}
    int f(const int * p) {}

Every one of these functions has a different parameter list than the others (a const int * is a different type from an int *). Note that functions that only differ in their return types and/or parameter names are not valid overloadings, as the compiler determines which function you are referencing by looking at the types of the parameters you pass in.

Overloading of member functions (a.k.a. methods) is also legal: a class may have multiple functions of the same name, as long as they have different parameter types. Methods that differ only in the const-ness of this are legal, as that constitutes a different parameter type.

The compiler must also be able to determine an unambiguous best choice of which overloaded function you want from the parameters you pass in. Given the previous declarations, a call to f(3,3) is illegal, and results in the following error messages:

    error: call of overloaded ’f(int, int)’ is ambiguous
    note: candidates are:
    note: int f(int, double)
    note: double f(double, int)

Here, the compiler considers converting f(3, 3) to f(3.0, 3) (which would be a call to double f(double, int)) to be “just as good as” converting

it to f(3, 3.0) (which would be a call to int f(int, double)). The compiler does not look at how the return type is used in making this determination either. However, if we had another overloading of f that took (int, int) as its parameters, that would be unambiguously the best choice, and the call would be legal.

If that last paragraph seemed a bit complex, let it be a warning to use function overloading sparingly, if at all. Many programmers view function overloading as a horrible idea for several reasons. First, it can confuse the reader of the code, as you have to find all the possible functions of a given name and then know the rules for determining what is the “best match” to know what happens in the code. Second, it provides a source of potential errors in writing the code. The programmer may think she is calling one function, when, in fact, she is calling another. Third, it introduces the possibility of very surprising errors when working code is modified: introducing another function of the same name may change what the “best choice” is for some other part of the code, making it so that a piece of code that was working and has not been modified is no longer correct.

Note that this issue is closely related to one of the reasons why opening entire namespaces (especially multiple entire namespaces) is generally considered a bad idea. Consider the following code fragment:

    namespace libraryX {
      double aFunction(double x) { ... }
    }
    namespace moduleY {
      int somethingElse(int y) { ... }
    }
    ...
    using namespace libraryX;
    using namespace moduleY;
    double x = aFunction(2);
    int z = somethingElse(42);

Here, we open two namespaces, and all of the code is completely legal. The call to aFunction references libraryX::aFunction, and the compiler converts 2 to 2.0. Now, however, suppose that a developer working on moduleY writes a function of the same name, but it takes an int as a parameter. This developer is unaware of the names used in libraryX, which is not a part of the code she is concerned with (which is part of the advantage of properly used namespaces). Now, the code reads as follows:

    namespace libraryX {
      double aFunction(double x) { ... }
    }
    namespace moduleY {
      int somethingElse(int y) { ... }
      int aFunction(int x) { ... }
    }
    ...
    using namespace libraryX;
    using namespace moduleY;
    double x = aFunction(2);
    int z = somethingElse(42);

Here, there are two aFunctions in scope; however, moduleY::aFunction is unambiguously a better choice than libraryX::aFunction, so the compiler will select moduleY::aFunction as the target of the call (and then convert the returned result from an int to a double). Calling this other function (which likely behaves quite differently) breaks the code in a surprising, and thus hard-to-debug, way.

If you are going to use function overloading, you should follow a few simple guidelines. First, you should only overload functions when they perform the same task but on different types. For example, our max functions that we described earlier do the same computation but on

different types. Second, you should only overload functions in such a way that understanding what the “best choice” is for a particular call is straightforward.

14.4.1 Name Mangling

When you compile a C++ program, the C++ compiler generates assembly, which is then assembled into an object file. As with C programs, the linker then links together the object files, resolving symbol references. The linker does not understand function overloading, nor does it have type information available to it. Instead, the C++ compiler must ensure that the names of the symbols the linker sees are unique. To accomplish this goal, the C++ compiler performs name mangling: adjusting the function names to encode the parameter type information, as well as what class and namespace the function resides inside of, so that each name is unique.

While you generally do not need to know the specific details of name mangling, it is useful to understand that it happens, and that C does not mangle names. If you mix C and C++ code, the C++ compiler must be explicitly informed of functions that were compiled with a C compiler by declaring them extern "C", such as

    extern "C" {
      void someFunction(int x);
      int anotherFunction(const char * str);
    }

Note that main is treated specially (e.g., generally as if it were declared extern "C"), as it may be declared with or without parameters for the command line arguments, and is called by the startup library, which is frequently written in C.

14.5 Operator Overloading

C++ takes function overloading one step further, allowing operator overloading. You can write “function declarations” that define the behavior of the operators (such as +, -, *, ++, =, and many others) when at least one user-defined type (e.g. class) is involved. For example, we might write:

    class Point {
    private:
      int x;
      int y;
    public:
      Point operator+(const Point & rhs) const {
        Point ans;
        ans.x = x + rhs.x;
        ans.y = y + rhs.y;
        return ans;
      }
    };

Here, we have defined an overloading of the + operator inside of the Point class, which takes two points as arguments. The first Point is this (the implicit argument that points at the object a method is invoked upon), which points at the left-hand operand of the + operator. The second operand is a const reference to the right-hand operand of the + operator (named rhs, since it is the right-hand-side operand of the operator). We pass a const reference rather than the value so that we avoid copying the entire object as we pass the parameter (which is generally the correct way to pass the right-operand argument of an overloaded operator).

Operator overloading is quite common in C++; however, as with function overloading, it should only be used when it is appropriate (not just any time you can). C++ does not enforce any rules about what the

overloaded operators do, but they should obey common sense. If you overload +, it should correspond in some way to addition. For example, if you have a class for a matrix (in the mathematical sense), it is logical to overload the + operator to add two matrices. Such an overloaded operator would be declared inside of the Matrix class as

    Matrix operator+(const Matrix & rhs) {
      //implementation goes here
    }

Note that the first (left) operand is the this object. Likewise, if we overloaded the += operator on a Matrix, we would expect it to add another Matrix to the current one, updating the current matrix with the sum. Such an overloaded operator would be declared inside of the Matrix class as:

    Matrix & operator+=(const Matrix & rhs) {
      //implementation goes here
      return *this;
    }

Note that operators that modify the object they are inside of, such as +=, return a reference to that object. That is, their return type is a reference to their own class type, and they return *this. Note that when returning *this, a reference (the return value) is being initialized, thus the address is implicitly taken; that is, the pointer is &*this, which is just this. The reason why operators such as this return a reference to the object (rather than void) is that a = b += c; is legal, even if we strongly discourage writing such code (it performs b += c; then a = b; and is better written as two statements).

As we mentioned earlier, const versus non-const functions constitute valid overloadings. For example, in

our Matrix class, we might overload the indexing operator ([]) to give us a row of the matrix (assume we have some class, MatrixRow, which represents one row of the matrix). We might wish to provide two versions of this operator, one that is non-const and returns a non-const MatrixRow reference, and one that is const and returns a const MatrixRow reference:

    MatrixRow & operator[](int index) {
      //code here
    }
    const MatrixRow & operator[](int index) const {
      //implementation here
    }

Such an implementation allows us to get a const (i.e., read-only) row from a const Matrix, allowing us to read the row (and thus presumably the elements) out of it but not modify them. However, if we have a non-const Matrix, we can get a non-const MatrixRow, allowing us to read or write the elements of the Matrix.

Video 14.4: Executing code with an overloaded operator.

Executing code with overloaded operators is primarily a matter of thinking of the overloaded operators as functions and using the rules you are well familiar with (with the additional rule about passing this, which we just learned for operators that are members of classes). Video 14.4 illustrates.

14.6 Other Aspects of Switching to C++

There are a variety of small topics that are all important to know as you get started in C++ programming. We cover several of these here.

14.6.1 Compilation

Instead of compiling with GCC, C++ programs are compiled with G++ (g++ on the command line). For the most part, the options we have discussed for GCC work for G++; however, the option for the language standard is different. For everything we are going to do in this book, -std=gnu++98 will work fine, specifying the GNU extensions to the C++98 standard. This was the default in G++ releases before version 6; newer releases default to a later standard. However, if you want to use any of the new features in C++11, you should specify -std=gnu++11. For a list of C++11 features and their availability in various versions of G++, consult the documentation page: https://gcc.gnu.org/projects/cxx0x.html. We also give an overview in Section E.6.

14.6.2 The bool Type

C++ has an actual bool type with values true and false (C99 has _Bool, and stdbool.h defines bool as an alias for it, as well as defining true and false; however, bool, true, and false are actually part of the language in C++). C++ will still respect C-style use of an integer as meaning false if the integer is 0 and true otherwise; however, you should generally use bool for the type of parameters and variables that represent a boolean value.

14.6.3 void Pointers

In C, we can assign any pointer type to a void pointer and assign a void pointer to any pointer type without a cast. C++ removes part of this flexibility: assigning a pointer to a void pointer still needs no cast, but assigning a void pointer to any other pointer type requires an explicit cast. While this change may seem annoying, other new features that C++ provides make it less cumbersome than it may seem, and it is in fact eminently reasonable.

As we will see in Chapter 15, C++ provides a new mechanism for dynamic allocation, which returns a correctly typed pointer instead of a void *. C++ also provides a nicer mechanism for referencing data of any type, via templates (which we will discuss in Chapter 17), which allow classes and functions to be parameterized over one or more types (e.g., what type of data they hold or operate on). Being able to write classes that can hold any type using templates (rather than void *s) is one of the compelling reasons to switch to C++ before learning about data structures in Part III.

14.6.4 Standard Headers

In C, the standard header files end in .h (e.g., stdio.h, stdlib.h). In C++, the standard headers do not have any dot suffix. For example, C++ has a header file for the std::vector class (which we have mentioned, but not delved into). However, this header file is not vector.h; it is simply vector. That is, you would include it with #include <vector>.

You can still include the C standard header files if you want. However, as these header files put names into the global scope (rather than the std namespace), they are not the preferred way to include the C standard library. Instead, C++ provides a header file that has the same name as the C header file but starts with a c and does not end with .h (e.g., cstdlib for stdlib.h and cstdio for stdio.h). These versions of the header files provide similar definitions, except they are placed in the std namespace (so cstdio would provide std::printf).

14.6.5 Code Organization

In C, we had the organization that a header file declared the interface, and the corresponding C source file defined the implementation. In C++, a similar arrangement occurs with header files and C++ source files (which end with a .cpp extension).

In C++, class declarations typically occur in header files (as they describe the interface of the class). It is generally considered to be fine to write the implementation of very short methods directly inside of the class declaration in the header file. For example, if you want to have a private field that can be read outside the class, you would write a public accessor (a.k.a. “getter”) method. Such a method might look like int getX() const {return x;}. Such a method is short enough that it is fine to write it inside the class declaration in the header file.

The remaining methods need their implementations written in the .cpp file. However, when we write these methods, we must specify which class they belong in. We do so with the scope resolution operator (::) to give the fully qualified name (Classname::methodName) in the declaration. For example, if we wanted to write the implementations of our BankAccount class in a separate .cpp file, we would write the following in the header file:

    class BankAccount {
    private:
      double balance;
    public:
      void deposit(double amount);
      double withdraw(double desiredAmount);
      double getBalance() const;
      void initAccount();
    };

Then, in the .cpp file, we would write:

    #include "bank.h"

    void BankAccount::deposit(double amount) {
      balance += amount;
    }

    double BankAccount::withdraw(double desiredAmount) {
      if (desiredAmount <= balance) {
        balance -= desiredAmount;
        return desiredAmount;
      }
      else {
        double actualAmount = balance;
        balance = 0;
        return actualAmount;
      }
    }

    double BankAccount::getBalance() const {
      return balance;
    }

    void BankAccount::initAccount() {
      balance = 0; //we will see a better way to d
    }

We will note that there is no rule enforced by the compiler about how long of a method is acceptable inside the class declaration. Instead, such a decision is governed by coding standards in whatever organization you work in or your own personal preference. However, the guiding principle in making such a decision is that someone reading the header file should easily be able to see the interface for the class (i.e., how they can use it) without the declaration being cluttered up by method implementations.

14.6.6 Default Values

C++ allows default values to be specified for some or all of the parameters of a function (or method). These default values provide the caller of the function in question with the ability to omit the arguments for certain parameters if the default values are desired. The

arguments that are omitted must be the rightmost arguments of the call. For example, suppose we write the following function prototype with default parameter values:

    int f(int x, int y = 3, int z = 4, bool b = false);

Then the following calls are legal and interpreted as noted in the comments:

    f(9);             //x = 9, y = 3, z = 4, b = false
    f(9, 8);          //x = 9, y = 8, z = 4, b = false
    f(9, 8, 7);       //x = 9, y = 8, z = 7, b = false
    f(9, 8, 7, true); //x = 9, y = 8, z = 7, b = true

When reading code with default parameters, you make the function call the same way as normal but copy the default values into the frame as if they were passed normally (that is, as if all the arguments were specified).

We will note that default parameter values are often abused by novice programmers (save some typing! put whatever I use most often as a default!). You should only use them when you really mean for that to be the default behavior of a function. For example, a reasonable use would be:

    int doSomething(int arg1, int arg2, bool verboseD

In this example, not having verbose debugging is a good default behavior. By providing this parameter, we could turn on verbose debugging on a call-by-call basis, by passing true for that parameter.

We will also note that the values of default parameters used at a call site are based on what is “seen” by the compiler at that point in the code. That is, if you declare the function in the header file that was

included without default values, then you cannot make use of them, even if the implementation declares them. Consequently, if you are going to use default parameters, you should specify them in the prototype in the header file, and only in the prototype in the header file. 14.6.7 Reference Material The man pages are the best reference material for C. However, for C++, you often know what class you want to work with, and wish to look at the methods inside of it. http://www.cplusplus.com/ provides an excellent reference for the C++ library in this format—you can look up the page for a particular class and then examine a list of the methods in that class to find what you need. 14.7 Practice Exercises Selected questions have links to answers in the back of the book. • Question 14.1 : What is an object? What is a class? How are the two related? • Question 14.2 : What is the difference between a struct and a class? • Question 14.3 : What is the this pointer? How do you determine what it points at? • Question 14.4 : What does const mean when it appears after the parameter list but before the open curly brace of a function body, as in void getX() const {...}? • Question 14.5 : What is a reference? How is it like a pointer? How is it different? • Question 14.6 : What is operator overloading? • Question 14.7 : When you overload an operator that modifies the object (such as +=), what type

and value should your operator typically return? • Question 14.8 : What is the output when the following code is executed? 1 #include <cstdio> 2 #include <cstdlib> 3 4 class Point { 5 private: 6 int x; 7 int y; 8 public: 9 void setLocation(int newX, int ne 10 x = newX; 11 y = newY; 12 } 13 int getX() const { 14 return x; 15 } 16 int getY() const { 17 return y; 18 } 19 }; 20 void printPoint(const char * name, 21 printf(\"%s: (%d,%d)\\n\", name, p.g 22 } 23 void f(Point & p) { 24 printPoint(\"p\", p); 25 p.setLocation(p.getX() + 2, p.get 26 } 27 28 int main(void) { 29 Point p1; 30 Point p2; 31 p1.setLocation(2,4); 32 p2.setLocation(3,5); 33 f(p1); 34 f(p2); 35 printPoint(\"p1\", p1); 36 printPoint(\"p2\", p2); 37 return EXIT_SUCCESS; 38 } • Question 14.9 : Write a class for a Square, which has the following members: –

A private double for the edge length – A public method void setEdgeLength(double) that uses assert to check that the passed-in edge length is non-negative (and aborts the program if not). If the edge length is non- negative, it sets the edge length field to the passed-in value. – A public method double getEdgeLength() const, which returns the edge length. – A public method double getArea() const, which returns the area of the square. – A public method double getPerimeter() const, which returns the perimeter of the square. Test your code with the following main: 1 int main(void) { 2 Square squares[4]; 3 for (int i = 0; i < 4; i++) { 4 squares[i].setEdgeLength(2 * i 5} 6 for (int i = 0; i < 4; i++) { 7 printf(\"Square %d has edge leng 8 printf(\" and area %f\\n\", squa 9 printf(\" and perimeter %f\\n\", 10 } 11 printf(\"Trying to set a negative 12 squares[0].setEdgeLength(-1); 13 return EXIT_FAILURE; 14 } Which should output the following (to stdout): Square 0 has edge length 2.000000 and area 4.000000 and perimeter 8.000000 Square 1 has edge length 4.000000 and area 16.000000

     and perimeter 16.000000
    Square 2 has edge length 6.000000
     and area 36.000000
     and perimeter 24.000000
    Square 3 has edge length 8.000000
     and area 64.000000
     and perimeter 32.000000
    Trying to set a negative edge length (should abort)

then print an assertion failure message and abort.

• Question 14.10: Take the Point class from Question 14.8 and add three overloaded operators to it:
– A += operator, which takes a const Point & and increases this Point’s x by the passed-in Point’s x and this Point’s y by the passed-in Point’s y. It should then return a reference to this object.
– A == operator, which takes a const Point & and determines if it has the same coordinates as this Point.
– A *= operator, which takes an int and scales (multiplies) this Point’s x and y by the passed-in integer. This operator should return a reference to this Point.
Write a main to test your code.

• Question 14.11: What is the output when the following code is executed?

    #include <cstdio>
    #include <cstdlib>

    void f(int & y, int * z) {
      printf("y = %d, *z = %d\n", y, *z);
      y += *z;
      *z = 42;
    }

    int main(void) {
      int a = 3;
      int b = 4;
      int & x = a;
      x = b;
      printf("a = %d, b = %d, x = %d\n", a, b, x);
      f(b, &x);
      printf("a = %d, b = %d, x = %d\n", a, b, x);
      return EXIT_SUCCESS;
    }

Generated on Thu Jun 27 15:08:37 2019 by LaTeXML


Chapter 15 Object Creation and Destruction

One of the benefits of object-oriented languages (such as C++) is the ability to design classes such that the privacy of their data can ensure their invariants are always maintained. By keeping fields private, the only way they can be manipulated is through the public interface of the class, which only changes the fields in ways that respect the invariants of the objects. Of course, the implementations of these methods must be written correctly to achieve these goals; however, the designer of the class does not need to worry about unrelated code modifying the object in unexpected ways.

While the access restrictions help designers maintain invariants, in order to be truly useful, objects must be able to initialize their state properly as well: the invariants must be initially established before they can be maintained. An idea that goes nowhere would be to have the code creating the object initialize its fields directly. Such an approach would not only require relaxation of the visibility restrictions to allow the code direct access to the private fields but would also require the class designer to trust external code to set up the initial state correctly.

A slightly better approach would be to have each class provide a public method (e.g., called initialize) that code should call immediately after allocating an object. This approach, which is what we showed in our example

in the previous chapter (because we will not learn the right way until this chapter), would mean that we would need to do something like this:

    BankAccount * account = malloc(sizeof(*account));
    account->initialize();

However, this approach has several problems. First, it is easy for a programmer to forget to call the initialize method when creating an object. Such forgetfulness would lead to the fields of the object being used uninitialized, which you are already aware results in the worst sort of errors: those that only show up sometimes. The second problem with this approach is that we cannot enforce that initialize is only called on a newly created object. As the method is public, it could be called by any other piece of code at any time.

We also have a similar problem when an object is about to be destroyed. Instead of initializing the state, we want to clean up any resources in use by the object (e.g., freeing any memory that only it has references to, closing files that it has open, etc.). We might imagine each class providing a public cleanup method, but this approach suffers similar problems to its initialization counterpart: a programmer may forget to call cleanup or call it at an inappropriate time.

15.1 Object Construction

The approach of having a particular method to initialize the object could be fixed if we could make this method “special” in such a way that (1) it is always called when you create an object, and (2) it cannot be called directly (at any time) by the programmer but instead can only be

called during object creation. C++ (and many other object-oriented languages) takes exactly this approach: these “special methods” are called constructors. In C++, a constructor has no return type (not even void) and the same name as the class it is inside of. For example, if we wanted to change our BankAccount class from the last chapter to use a constructor, it would look like this:

    class BankAccount {
    private:
      double balance;
    public:
      BankAccount() {
        balance = 0;
      }
      //other methods remain the same
    };

Of course, we could also write the BankAccount constructor outside of the class declaration, in which case, we would write:

    BankAccount::BankAccount() {
      balance = 0;
    }

Video 15.1: Executing code that creates an object with a constructor.

Now, if we declare a variable of type BankAccount, C++ will automatically call the constructor for that variable when the “box” for the variable is created. As the constructor is “inside of” an object, it has an implicit this parameter, which points at the newly created object. Video 15.1 shows the execution of code in which an object with a constructor is created.

Note that the constructor only runs when a value whose type is a class is created. If you create a pointer (or reference) to an object, then no new object is created, just a pointer, so no constructor is run.

15.1.1 Overloading

As with all other functions in C++, constructors can be overloaded. We might want to overload constructors so that we can allow different ways to initialize an object, based on different information. For example, in our BankAccount class, we might want to write a second constructor that takes in an initial balance and initializes the balance field appropriately:

    BankAccount::BankAccount() {
      balance = 0;
    }
    BankAccount::BankAccount(double initialBalance) {
      balance = initialBalance;
    }

The constructor we have seen so far that takes no parameters has a special name: the default constructor. If you do not write any constructors in a class, the C++ compiler will provide a default constructor that basically behaves as if you declared it like this:

    class MyClass {
    public:
      MyClass() {}
    };

Note that if you write any other constructor, the C++ compiler does not provide this constructor for you.

If you write any constructors, the class is non-POD (it requires “special” initialization, not just allocation of a chunk of bytes) by virtue of having a constructor. If you do not write a constructor and the compiler provides an implicit default constructor, the class is non-POD if that constructor turns out to be nontrivial. If the constructor is trivial, then it does not automatically make the class non-POD, but other aspects of the class may, of course, make

it non-POD. The non-technical description of a constructor being “nontrivial” is that it “does something.” The provided constructor looks like it obviously does nothing; however, as we will discuss in Section 15.1.4, there may be some implicit initializations that happen in such a constructor.

A class type that has a public default constructor, whether explicitly declared or automatically provided, is called a default constructible class. Many pieces of the C++ library only work with classes that are default constructible: if you try to use them with a class that is not default constructible, you will get a compiler error. Generally, you want your class to be default constructible unless you have a compelling reason not to.

To make use of a constructor other than the default constructor when initializing an object, you must specify the arguments you wish to pass to the constructor. For local variables, you do so by placing parentheses with the arguments after the name of the new variable, for example:

    BankAccount myAccount(42.3); //pass 42.3 as initialBalance

Note that while it may be tempting (for consistency) to write empty parentheses after a variable we wish to construct via the default constructor, this approach unfortunately does not work: BankAccount x(); is interpreted as the declaration of a function named x that takes no parameters and returns a BankAccount.

15.1.2 Dynamic Allocation

Suppose that instead of wanting to create a local variable for a BankAccount, we wanted to dynamically allocate a

BankAccount object. In C, we would use malloc to allocate the proper number of bytes (i.e., malloc(sizeof(BankAccount))), and everything would work fine. However, now that our BankAccount class has a constructor, it is not a POD class, so using malloc will not work properly. Recall from Section 14.1.4 that (basically) a class is only POD (“plain old data”) if it has a direct C analog. C does not have anything resembling constructors: there is no way to have structs initialized automatically whenever they are allocated.

The underlying reason why malloc will not work properly (in this case) is that it will not run the constructor. There is no way for malloc to actually know what type of object it is allocating space for and thus no way for it to call the proper constructor. Recall that the compiler will just evaluate sizeof(BankAccount) to a size_t, and malloc will be called with that integer as a parameter. malloc will simply allocate a contiguous sequence of the number of bytes you have requested of it.

However, the whole point of having the constructor in the BankAccount class was to guarantee that it would be properly initialized. By virtue of having a constructor (or possibly multiple constructors), a BankAccount is no longer just “a bunch of fields”; it also includes a procedure for setting up the initial state of those fields. It does not behave like a C struct anymore; therefore, using C’s tools to just allocate “a chunk of memory” is inappropriate.

In C++, the proper way to allocate memory dynamically is to use the new operator. For example, we might write:

    BankAccount * accountPtr = new BankAccount();

Evaluating the new BankAccount() expression allocates memory for one object of type

BankAccount and calls the default (no argument) constructor to initialize the newly created object. This expression evaluates to a pointer to the newly allocated object and has type BankAccount * (contrast this type, which is completely accurate, to void *, which is what malloc returns).

If we need to pass arguments to the constructor for BankAccount, we can do so by placing them in the parentheses:

    BankAccount * accountPtr = new BankAccount(initia

We can also use the new[] operator to allocate space for an array. For example, if we wanted an array of 42 BankAccounts, we could do:

    BankAccount * accountArray = new BankAccount[42];

This code would allocate space for 42 consecutive BankAccount objects and invoke the default constructor on each of them in ascending order of index. If you create an array of objects, you can only have them constructed with their default constructor. If the class does not have a default constructor or you want to initialize the elements of the array with some other constructor, you need to create an array of pointers and then write a loop to create each object and put the pointer to it into the array:

    BankAccount ** accountPointerArray = new BankAccount*[42];
    for (int i = 0; i < 42; i++) {
      accountPointerArray[i] = new BankAccount(initia
    }

Video 15.2: Execution of code with new and new[].

Video 15.2 illustrates the execution of code with new and new[].

15.1.3 Types of Initialization

One aspect of C++ object creation that is often misunderstood is the difference between this line of code:

    BankAccount * accountPtr = new BankAccount(); //p

and this line of code:

    BankAccount * accountPtr = new BankAccount; //no

The two appear quite similar, but one has parentheses after BankAccount and the other does not. Odds are good that if you ask 20 different C++ programmers what the difference is, you will get 15 different answers, and none will be correct. We will explain the difference, not so much because you need to remember it to write good programs, but rather because it strongly motivates doing what you should do anyway.

These two uses of new make use of two different types of initialization. The first (with the parentheses) uses value initialization, and the second uses default initialization. Each of these has different behavior, and the specifics depend on whether the type being initialized is POD or non-POD.

When value initialization is used, a class with a default constructor is initialized by its default constructor. A class without any constructor has every field value initialized. Non-class types are zero initialized. Arrays have their elements value initialized.

When default initialization is used, non-POD types are initialized by their default constructor. POD types are

left uninitialized. Arrays have their elements default initialized.

Believe it or not, that is actually a simplification of the rules. However, most C++ programmers get by just fine without understanding them. How can this be? The best approach is to just always include a default constructor in your classes. Notice that the main similarity between the two is that classes with a default constructor that you wrote (remember: having a programmer-defined constructor makes the class a non-POD type) will be initialized by their default constructor under either scheme. If you write a default constructor, you do not need to remember the distinctions, since both will do the same thing: use that constructor.

15.1.4 Initializer Lists

While our constructor for the BankAccount works fine, it makes use of traditional assignment statements, rather than the preferred C++ approach, which is to use an initializer list. An initializer list is a list of initializations written by placing a colon after the close parenthesis of the constructor’s parameter list and writing a sequence of initializations before the open curly brace of the function’s body. Each initialization takes the form name(values) that we just saw for constructing objects with parameters to their constructors. For example, our BankAccount’s constructors would be rewritten as follows:

    BankAccount::BankAccount() : balance(0) {}
    BankAccount::BankAccount(double initBal) : balance(initBal) {}

Video 15.3: Creating objects whose constructors use initializer lists.

Video 15.3 shows the execution of code where objects whose constructors make use of initializer lists are created. This video also shows the behavior of a field declared as static, which we discussed briefly in Section 14.1.5.

A natural question for a C programmer learning C++ is “Why should I bother with initializer lists? Assignment statements seem to work just fine…” However, there are several reasons why initializer lists are preferable. Most of these reasons are based on the fact that C++ makes a distinction between initialization and assignment. Any assignment statements in the constructor are treated as regular assignment statements, while initializers in the initializer list are treated as initializations.

One important way that this distinction between initialization and assignment matters is that C++ ensures all fields have some form of initialization before the open curly brace of the constructor. If you specify the initialization you want in the initializer list, then you get exactly what you want. Otherwise, the field is default initialized. However, remember that default initialization for POD types leaves them with unspecified values.

The distinction between assignment statements and initialization is critical here, as an assignment statement in the constructor does not specify the initialization behavior. Instead, if you place an assignment statement that assigns to the field in the body of the constructor, it will assign to the already created object, changing its fields. Depending on the exact details, you may get the behavior you want (e.g., if the field has a POD type). You may get the behavior you want, except that it is slower (from creating, then overwriting the object). You may also

get undesired behavior, if constructing and/or destroying the objects has some noticeable effects.

Another case where the distinction between initializing and assigning to a field matters is for references. Recall that initializing a reference sets what the reference refers to (points at), while assigning to the reference implicitly dereferences the reference and assigns to whatever it refers to. If your class has fields of a reference type, you must initialize them in the initializer list. Otherwise, you will receive an error message like:

    uninitialized reference member ’ClassName::fieldName’

A third case where this distinction matters is if you have a const field (i.e., one whose type includes a const modifier, which does not allow its value to be changed). A const field must be initialized in the initializer list and may not be assigned to anywhere. This requirement exists so that the compiler may ensure the const field will be initialized but never changed. If you fail to initialize a const field in an initializer list, you will get an error message like this:

    uninitialized member ’ClassName::fieldName’ with ’const’ type ’const int’

The best practice for C++ is to use the initializer list to initialize the fields of the object. You may still need to write code in the body of the constructor if more complex setup is needed after the fields are initialized.

There is one final detail about initializer lists you should understand, as not knowing about it often leads to confusing warnings or errors. The order in which fields are initialized by the initializer list is the order in which

they are declared in the class, not the order their initializers are written in the initializer list. For example, if we wrote the following code:

    class Something {
    private:
      int x;
      int y;
      int z;
    public:
      Something(int a, int b, int c) : z(c), y(b), x(a) {}
    };

The order of initialization would be x(a), y(b), z(c) because that is the order the fields are declared in, even though it is not the order the initializations appear in the list. Although this may seem like an insignificant technicality, it matters in cases where one field is used to initialize another. Consider the slightly different code below:

    class Something {
    private:
      int x;
      int y;
      int z;
    public:
      Something(int a, int b) : z(a), y(b), x(z + y) {}
    };

Here, the fact that x is initialized first is quite significant. The expression from which we initialize x is the sum z + y, and neither z nor y has been initialized yet, so x gets initialized to some unknown value! Note that if you compile with warnings turned off, this code will compile just fine (despite the serious problem lurking in its initializer list). If you compile with -Wall -Werror, then the compiler will produce an error message so you know to fix your problem:

    In constructor ’Something::Something(int, int)’:
    ’Something::z’ will be initialized after
    ’int Something::y’
    when initialized here
    ’Something::y’ will be initialized after
    ’int Something::x’
    when initialized here

Whenever you encounter this type of error message, fix the ordering of your initializer list, and make sure you are not using uninitialized fields to initialize other fields.

Video 15.4: Initialization list ordering and implicit initialization of non-POD fields.

Video 15.4 shows the execution of some more code with initializer lists. This video shows both initializer lists whose elements are in a different order from their declarations and the implicit initialization of non-POD types when they are omitted from the initializer list.

15.1.5 What to Do

Object construction in C++ is a fairly complex topic. This section has a significant amount of information, including more deep technical details than we typically like to give in this book. However, these technical details provide the motivation for the why of several things that are generally important to do. Without this information, telling you what to do would seem like an arbitrary set of rules without any explanation for the reasons.

Although there are exceptions to the following rules, you should understand much more about the details of object construction, POD versus non-POD types, and how objects are implemented before you break them (complete mastery of all the material in this entire book is a good start, but you need to know a bit more about the

internals of C++ than we will cover to truly know when it is safe to break some of these rules):

• Make your classes default constructible. Write a public default constructor in every class that you write. Make that constructor initialize the object in a sane way.
• Use initializer lists to initialize your class’s fields. Initialize the fields of your class in the initializer list. Do not try to initialize them in the constructor’s body with assignment statements. The two are different.
• In the initializer list, explicitly initialize every field. Do not rely on the implicit default initialization for any field. Explicitly initialize it to the value you want. If you think you do not care if it is uninitialized, pick some value to initialize it to. Guaranteeing that the field has one particular value will make testing and debugging that much easier.
• Initialize the fields in the order they are declared. You should be compiling with -Wall -Werror anyway, so doing anything else should produce an error.
• Use new and new[], not malloc. Using malloc on a type that is not POD will result in problems. Depending on why the type is not a POD type, you may experience a variety of strange and difficult-to-debug behaviors (understanding what is going on and why requires understanding of the internals of C++, which is what you are violating by using malloc with a non-POD type). In C++, just always use new

and new[] (there is no need to ever use malloc, only danger in doing so).

15.2 Object Destruction

In much the same way that we would like to have our classes be able to specify an initialization procedure that is guaranteed to happen (and guaranteed to happen only when an object is being created), we would like to be able to specify a cleanup procedure for object destruction (and only for object destruction). Such a procedure is called a destructor.

A destructor is named the same as the class it appears in, except that the name is preceded by a tilde (~) (which is typically typed with shift plus the key left of the 1 key). Like a constructor, a destructor has no return type, not even void. Unlike constructors, destructors may not be overloaded. A class may only have one destructor, and it must take no parameters. As the destructor is inside the class, it receives an implicit this parameter, just like any other member function. Destructors are typically public but may be private; however, if they are, only that class can destroy objects of that type (thus, the usefulness of private destructors is quite limited).

At present, it would not make sense to add a destructor to our BankAccount class, as there is nothing that it would need to do. However, if we imagined our class having some dynamically allocated memory associated with it, such as a transaction history stored as a dynamically allocated array, then it would make sense:

    class BankAccount {
    private:
      Transaction * transactionHistory; //arra
      int numTransactions;
      double balance;
      void addTransaction(double amount, cons
        Transaction * temp = new Transaction[
        for (int i = 0; i < numTransactions;
          temp[i] = transactionHistory[i];
        }
        temp[numTransactions].message = messa
        temp[numTransactions].amount = amount
        gettimeofday(&temp[numTransactions].w
        Transaction * old = transactionHistor
        transactionHistory = temp;
        numTransactions++;
        delete[] old;
      }
    public:
      BankAccount() : transactionHistory(NULL
                      balance(0) {}
      ~BankAccount() {
        delete[] transactionHistory;
      }
      void deposit(double amount) {
        addTransaction(amount, "Deposit");
        balance += amount;
      }
      double withdraw(double desiredAmount) {
        if (desiredAmount <= balance) {
          addTransaction(-desiredAmount, "Wit
          balance -= desiredAmount;
          return desiredAmount;
        }
        else {
          double actualAmount = balance;
          addTransaction(-actualAmount, "With
          balance = 0;
          return actualAmount;
        }
      }
      double getBalance() const {
        return balance;
      }
    };

Here, our BankAccount class tracks an array of all transactions it has performed. Every time the withdraw or deposit methods are called, they call the private addTransaction method to record the transaction in this history. This method then reallocates the array to be larger, adds the new entry to the end, and uses the

delete[] operator to free the memory from the old array. delete and delete[] free the memory allocated by new and new[] respectively. We will discuss them more in a moment.

Before we proceed, we should briefly note that if we wanted to do this in a real C++ program, we would want to use the built-in vector class, which provides an array-like interface, but also has the ability to add more elements to the end (causing it to grow). However, we are not yet ready to discuss vector, as it is a templated class, and we will not learn about templates until Chapter 17. We will note that, unfortunately, new[] does not have a realloc analog (for good reason: remember we can move POD around at will, but not non-POD types).

We should also take a moment to note that this example illustrates the use of visibility restrictions to enforce invariants. This BankAccount class has the invariant that the current balance must be equal to the sum of the amounts in the transaction history (since the history is a log of the changes to the balance). No external code can modify the balance or transactionHistory, nor call addTransaction to violate this invariant. Consequently, we need only convince ourselves that the class maintains this invariant properly (through a combination of code inspection, testing, and debugging) to know that the invariant will hold in the entire program for any program we write with this class.

15.2.1 When Are Destructors Invoked

A destructor is invoked whenever the “box” for an object is about to be destroyed. This destruction can happen either due to dynamic deallocation through delete or delete[], by a local variable going out of scope, or by one object that contains another (as a field) being

destroyed. Whenever a destructor is called, the object it is invoked on (i.e., where the this pointer points when the destructor is called) is the object that is about to be destroyed. The “box” is only actually destroyed after the destructor completes.

In the first of these cases, delete and delete[] free the memory allocated by new and new[] (respectively), just like free frees the memory allocated by malloc. It is only correct to use delete to free memory allocated by new and delete[] to free memory allocated by new[]. Mixing these up can lead to memory leaks or program crashes.

If you deallocate memory for an array with delete[], the elements of the array have their destructor (if any) invoked in decreasing order of index, the opposite order from which they were constructed. In fact, as a general rule, whenever construction and destruction occur in a group, the order of destruction is the opposite of the order of construction.

For local variables (and parameters), their “box” is destroyed whenever they go out of scope. If the variable’s type is a class with a destructor (note: not a pointer to a class with a destructor, in which case, only the pointer is being destroyed, not the object), then the destructor must be invoked before the box is destroyed. If multiple variables go out of scope at the same point, their destructors are invoked in the opposite order from the order in which they were constructed.

We will note that this rule (combined with the fact that destructors may have arbitrary effects) means you must be careful about when variables go out of scope when executing code by hand (or, more generally, understanding what it is doing). Consider:

    #include <cstdio>
    #include <cstdlib>
    class Example {
      int x;
    public:
      Example(int i) : x(i) {
        std::printf("Created example %d\n", x);
      }
    public:
      ~Example() {
        std::printf("Destroyed example %d\n", x);
      }
    };
    int main(void) {
      Example e1(42);
      Example e2(99);
      for (int i = 0; i < 4; i++) {
        Example eloop(i);
      }
      return EXIT_SUCCESS;
    }

This code does not make a particularly good use of its constructor or destructor, but having them print a message is informative for illustrating this point about how object creation and destruction behave. The variable eloop goes out of scope at the close curly brace that ends the for loop, and another object by the same name is initialized the next time around the loop. Therefore, this code would print:

    Created example 42
    Created example 99
    Created example 0
    Destroyed example 0
    Created example 1
    Destroyed example 1
    Created example 2
    Destroyed example 2
    Created example 3
    Destroyed example 3
    Destroyed example 99
    Destroyed example 42

Video 15.5: Execution of code with object destruction.

Video 15.5 illustrates the execution of code with object destruction, making better use of the destructor. Here, the Polygon class has a pointer to an array of Points, which needs to be deleted when the object that owns the array is destroyed.

Fields inside of a class are also destroyed as part of the destruction of the enclosing object. After the class’s destructor completes, each of the fields is destroyed in the reverse of the order in which they were initialized. For any field whose type is a class with a destructor, that destructor is invoked before the field is destroyed. Note that, if you have a field that is a pointer, then the box being destroyed only contains a pointer, not an object, so no destructor is invoked. If you want to destroy the object that the pointer points at, you must explicitly delete it in the class’s destructor.

If you do not explicitly declare a destructor for a class, the compiler implicitly provides one that looks like:

    class MyClass {
    public:
      ~MyClass() {}
    };

The automatically supplied destructor is a trivial destructor if the class has no fields that have nontrivial destructors. If you explicitly write a destructor (even if it has nothing in it), that destructor is considered nontrivial. Put another way, a destructor is trivial if (a) the compiler automatically supplies it, and (b) it actually does nothing. Any class with a nontrivial destructor is not a POD type.

Video 15.6: Execution of more complex code with destructors.

Video 15.6 shows the execution of code with destructors, where objects have fields that need to be destroyed along with the object they belong to.

15.3 Object Copying

There are many ways in which programs copy values, such as when a parameter is passed to a function (its value is copied into the called function’s frame), when a value is assigned to a variable (its value is copied into the destination “box”), or when a value is returned (it is copied out of the returning function’s frame to be returned to the caller). In C, all of these copies occur in a simple fashion. All of C’s types are plain old data (POD), so the underlying numeric representation is directly copied from one location to another.

Video 15.7: Naïvely copying an object via parameter passing.

Video 15.8: Naïvely copying an object via assignment.

In C++, however, there are types that are not POD. Objects that are not POD cannot simply be copied from one location to another. Instead, the class may wish to specify a way that the object should be copied. To see why, Video 15.7 and Video 15.8 illustrate what happens when we have a non-POD type and naïvely copy its fields when copying during parameter passing and when copying by assignment.

In both videos, the fundamental problem is the same: we cannot just naïvely copy the fields from one object to another. When we simply copy the points field, both the original object and the copy point at the same memory. When one object is destroyed, it frees this memory,

leaving the other’s pointer dangling. In the case of assigning to an existing object, there is an additional problem, which is not present when creating a new object as a copy—as we saw in Video 15.8, assigning to p1 leaks the memory it previously pointed to, as we have overwritten the pointer without freeing that memory. Instead of always simply making shallow copies, we need to allow our classes to specify how their objects should be copied. A shallow copy may suffice for some types (even if they are not POD), but clearly does not suffice for all types. C++ distinguishes between two types of copying: copying during initialization (the copy constructor) and copying during assignment (the copy assignment operator). This distinction allows us to free existing resources when assigning to an existing object. We must do so during assignment to avoid memory leaks; however, we cannot do so during object creation as those fields are not initialized. 15.3.1 Copy Constructor Copying during initialization occurs when a new object is created as a copy of an old one. This form of copying occurs when objects are passed to functions by value (as opposed to passing a reference or pointer), when an object is returned from a function by value, or explicitly when the programmer writes another object of the same type as the initializer for a newly declared object. Whatever the reason for initializing an object from a copy of another object, the copy constructor is invoked to perform the copying in the fashion the class defines. We might modify the Polygon class from Video 15.7 to have a copy constructor as follows: 1 class Polygon {

2 Point * points; 3 size_t numPoints; 4 public: 5 Polygon(size_t n) : points(new Point[n]), numPo 6 //copy constructor: makes a deep copy 7 Polygon(const Polygon & rhs) : points(new Point 8 numPoints(rhs.nu 9 for (size_t i = 0; i < numPoints; i++) { 10 points[i] = rhs.points[i]; 11 } 12 } 13 ~Polygon() { 14 delete[] points; 15 } 16 }; As with all constructors, the copy constructor is named the same as the class in which it resides, and it has no return type (not even void). However, the copy constructor is a very specific overloading of the constructor. It takes a reference (generally a const reference) to its own type (in this case, as const Polygon &). Passing the argument by reference means that the argument points at the original object, rather than being a copy of it. If we were to pass the argument as the value of the object, we would have to make a copy of it just to get it into the copy constructor! We typically pass a const reference as we typically should not modify the object that we are making a copy of. However, we can write a copy constructor that takes a non-const reference if we have a good reason to do so. If we desire, we can have multiple copy constructors: one that takes a const reference, and one that takes a non- const reference. If we do not explicitly specify a copy constructor, then C++ automatically provides one. The provided copy constructor performs as if it contains an initializer list that initializes every field from the corresponding field in the argument. That is, in general it looks

    class SomeClass {
      Type1 field1;
      Type2 field2;
      ...
      TypeN fieldN;
    public:
      //this is what the provided copy constructor would
      //look like if you did not write any copy constructor
      SomeClass(const SomeClass & rhs) : field1(rhs.
                                         field2(rhs.
                                         ...
                                         fieldN(rhs.
    };

Video 15.9: Executing code with a copy constructor: parameter-passing example revisited.

For fields of non-class types, this initialization simply copies the value. For fields of class types, this initialization invokes the corresponding copy constructor (which may either be written by the user or automatically provided). The automatically generated copy constructor will take a const reference as an argument if possible (i.e., if all fields have copy constructors that take const references); otherwise, it will take a non-const reference.

As with default constructors, a copy constructor may be classified as trivial. In order for a copy constructor to be trivial, it must be automatically provided by the compiler (no user-defined copy constructors are trivial), and the fields must all have trivial copy constructors. There are also some other conditions, which are related to topics we will learn about later. A trivial copy constructor essentially copies the bytes in memory from one object to another, much as we would copy a struct in C.
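To make the behavior of the automatically provided copy constructor concrete, here is a minimal sketch. The Counted and Wrapper classes and the implicitCopyDemo function are hypothetical names invented for this illustration. Wrapper declares no copy constructor, so the compiler provides one that copies the int field by value and invokes Counted's copy constructor for the class-typed field.

```cpp
// Hypothetical class that counts how many times its copy constructor runs.
class Counted {
public:
  static int copies;
  Counted() {}
  Counted(const Counted &) { copies++; }
};
int Counted::copies = 0;

// Hypothetical class with no user-written copy constructor.
class Wrapper {
  int id;      // non-class type: the provided copy constructor copies the value
  Counted c;   // class type: the provided copy constructor calls Counted's
public:
  explicit Wrapper(int i) : id(i), c() {}
  int getId() const { return id; }
};

// Copies one Wrapper and reports how many Counted copies that caused.
int implicitCopyDemo() {
  Counted::copies = 0;
  Wrapper a(5);
  Wrapper b(a);            // uses the compiler-provided copy constructor
  return Counted::copies;  // one Counted field was copy-constructed
}
```

Because Wrapper has a field whose copy constructor is user-defined, Wrapper's provided copy constructor is not trivial, even though we never wrote it ourselves.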

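Returning to the Polygon class with its deep-copying copy constructor, the following is a sketch of a complete, compilable version. The Polygon listing in this section is cut off at the right margin, so the ends of the two constructor lines below are a likely completion rather than the original text. The Point definition, the at accessors, and the functions moveFirstPoint and copyDemo are also assumptions added for illustration.

```cpp
#include <cstddef>

// Hypothetical Point type; its definition is not shown in this section.
struct Point {
  int x;
  int y;
};

class Polygon {
  Point * points;
  std::size_t numPoints;
public:
  // Assumed completion of the cut-off line; the () zero-initializes the Points.
  Polygon(std::size_t n) : points(new Point[n]()), numPoints(n) {}
  // copy constructor: makes a deep copy (assumed completion of cut-off lines)
  Polygon(const Polygon & rhs) : points(new Point[rhs.numPoints]),
                                 numPoints(rhs.numPoints) {
    for (std::size_t i = 0; i < numPoints; i++) {
      points[i] = rhs.points[i];
    }
  }
  ~Polygon() {
    delete[] points;
  }
  // Hypothetical accessors so the demonstration can read and write points.
  Point & at(std::size_t i) { return points[i]; }
  const Point & at(std::size_t i) const { return points[i]; }
};

// Takes its parameter by value, so the copy constructor runs at each call.
void moveFirstPoint(Polygon p) {
  p.at(0).x = 99;  // changes only the copy's array
}                  // p is destroyed here and deletes only its own array

int copyDemo() {
  Polygon original(3);
  original.at(0).x = 7;
  moveFirstPoint(original);     // copy during parameter passing
  Polygon duplicate(original);  // copy during explicit initialization
  duplicate.at(0).x = 42;       // does not affect original
  return original.at(0).x;      // original still holds 7
}
```

Each Polygon here owns its own array, so destroying a copy never frees memory that another object still uses. Note that this sketch only addresses copying during initialization: assigning one existing Polygon to another would still copy the pointer shallowly, which is the problem the copy assignment operator exists to solve.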
