The C struct
holds data members of various types, and
the C union
also defines data members of various types. However, a
union's data members all occupy the same location in memory and the programmer
may decide on which one to use.
In this chapter classes are introduced. A class is a kind of struct, but its
content is by default inaccessible to the outside world, whereas the content
of a C++ struct is by default accessible to the outside world. In C++ structs
find little use: they are mainly used to aggregate data within the context of
classes or to define elaborate return values. Often a C++ struct merely
contains plain old data (POD, cf. section 9.10). In C++ the class is the main
data structuring device, by default enforcing two core concepts of current-day
software engineering: data hiding and encapsulation (cf. sections 3.2.1 and
7.1.1).
The union
is another data structuring device the language offers. The
traditional C union is still available, but C++ also offers
unrestricted unions. Unrestricted unions are unions whose data fields may
be of class types. The C++ Annotations covers these unrestricted unions in
section 9.9, after having introduced several other new concepts of
C++.
C++ extends the C struct
and union
concepts by allowing the
definition of member
functions (introduced in this chapter) within these data types. Member
functions are functions that can only be used with objects of these data types
or within the scope of these data types. Some of these member functions are
special in that they are always, usually automatically, called when an object
starts its life (the so-called constructor) or ends its life (the
so-called destructor). These and other types of member functions, as well
as the design and construction of, and philosophy behind, classes are
introduced in this chapter.
We step-by-step construct a class Person
, which could be used in a
database application to store a person's name, address and phone number.
Let's start by creating a class Person
right away. From the
outset, it is important to distinguish between the class
interface and its implementation. A class may loosely be defined as
`a set of data and all the functions operating on those data'. This definition
is later refined but for now it is sufficient to get us started.
A class interface is a definition, defining the organization of objects of
that class. Normally a definition results in memory reservation. E.g., when
defining int variable the compiler ensures that some memory is reserved in
the final program to store variable's values. A class definition is different:
although it is a definition, the compiler sets no memory aside once it has
processed it. A class definition does follow the one definition rule, though:
in C++ entities may be defined only once. As a class definition does not imply
that memory is being reserved, the term class interface is preferred instead.
Class interfaces are normally contained in a class header file, e.g.,
person.h
. We'll start our class Person
interface here (cf. section
7.7 for an explanation of the const
keywords behind some
of the class's member functions):
    #include <string>

    class Person
    {
        std::string d_name;         // name of person
        std::string d_address;      // address field
        std::string d_phone;        // telephone number
        size_t      d_mass;         // the mass in kg.

        public:                     // member functions
            void setName(std::string const &name);
            void setAddress(std::string const &address);
            void setPhone(std::string const &phone);
            void setMass(size_t mass);

            std::string const &name()    const;
            std::string const &address() const;
            std::string const &phone()   const;
            size_t mass()                const;
    };

The member functions that are declared in the interface must still be implemented. The implementation of these members is properly called their definition.
In addition to member functions classes also commonly define the data
that are manipulated by those member functions. These data are called the
data members. In Person
they are d_name, d_address,
d_phone
and d_mass
. Data members should be given private
access rights. Since the class uses private access rights by default
they are usually simply listed at the top of the class interface.
All communication between the outer world and the class data is routed through
the class's member functions. Data members may receive new values (e.g., using
setName
) or they may be retrieved for inspection (e.g., using
name
). Functions merely returning values stored inside the object, not
allowing the caller to modify these internally stored values, are called
accessors.
Syntactically there is only a marginal difference between a class and a struct. Classes by default define private members, structs define public members. Conceptually, though, there are differences. In C++ structs are used in the way they are used in C: to aggregate data, which are all freely accessible. Classes, on the other hand, hide their data from access by the outside world (which is aptly called data hiding) and offer member functions to define the communication between the outer world and the class's data members.
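The difference in default access rights can be made concrete with a small sketch. The names PointStruct, PointClass and readBoth are made up for this illustration and are not part of the Person example:

```cpp
// Sketch: default access rights of a struct versus a class.
// PointStruct, PointClass and readBoth are hypothetical names.
struct PointStruct
{
    int x = 0;                  // public by default
};

class PointClass
{
    int d_x = 0;                // private by default

    public:
        void setX(int x)        // manipulator
        {
            d_x = x;
        }
        int x() const           // accessor
        {
            return d_x;
        }
};

int readBoth()
{
    PointStruct ps;
    ps.x = 3;                   // OK: direct access to a public member

    PointClass pc;
    // pc.d_x = 4;              // would not compile: d_x is private
    pc.setX(4);                 // access is routed through member functions

    return ps.x + pc.x();
}
```

Uncommenting the assignment to pc.d_x results in a compilation error, which is data hiding at work.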
Following Lakos (Lakos, J., 2001) Large-Scale C++ Software Design (Addison-Wesley) I suggest the following setup of class interfaces:
- All data member names start with d_, followed by a name suggesting their meaning (in chapter 8 we'll also encounter data members starting with s_).
- Members modifying data (manipulators) start with set. E.g., setName.
- For accessors a get-prefix is still frequently encountered, e.g., getName. However, following the conventions promoted by the Qt (see http://www.trolltech.com) Graphical User Interface Toolkit, the get-prefix is now deprecated. So, rather than defining the member getAddress, it should simply be named address.
- The keyword private is needed beyond the public members to switch back from public members to private access rights, which nicely separates the members that may be used `by the general public' from the class's own support members.
Finally, recall from section 3.1.2 that
using namespace std;
must be used in most (if not all) examples of source code. As
explained in sections 7.11 and 7.11.1 the using
directive should follow the preprocessor directive(s) including the header
files, using a setup like the following:
    #include <iostream>
    #include "person.h"
    using namespace std;

    int main()
    {
        ...
    }
Constructors are recognized by their names which are equal to their class
names. Constructors do not specify return values, not even void
. E.g.,
the class Person
may define a constructor Person::Person()
. The
C++ run-time system ensures that the constructor of a class is called when
a variable of the class is defined. It is possible to define a class lacking
any constructor. In that case the compiler defines a
default constructor that is called when an object of
that class is defined. What actually happens in that case depends on the data
members that are defined by that class (cf. section 7.3.1).
Objects may be defined locally or globally. However, in C++ most objects are defined locally. Globally defined objects are hardly ever required and are somewhat deprecated.
When a function defines a local object, that object's constructor is called every time the function is called. The object's constructor is activated at the point where the object is defined (a subtlety is that an object may be defined implicitly as, e.g., a temporary variable in an expression).
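That a local object is constructed anew at every call can be verified with a small sketch. The class Counter and the functions work and callThreeTimes are hypothetical; the static data member used for counting anticipates chapter 8:

```cpp
#include <cstddef>

// Counter is a hypothetical class counting its constructor calls.
class Counter
{
    static std::size_t s_count;     // constructor calls so far

    public:
        Counter()
        {
            ++s_count;
        }
        static std::size_t count()
        {
            return s_count;
        }
};

std::size_t Counter::s_count = 0;

void work()
{
    Counter local;                  // constructed each time work is called
}

std::size_t callThreeTimes()
{
    std::size_t before = Counter::count();

    work();
    work();
    work();

    return Counter::count() - before;   // constructions caused by the calls
}
```

Each call of work constructs a new Counter object, so three calls result in three constructor calls.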
When an object is defined as a static object
it is constructed when
the program starts. In this case its constructor is called even
before the function main
starts. Example:
    #include <iostream>
    using namespace std;

    class Demo
    {
        public:
            Demo();
    };

    Demo::Demo()
    {
        cout << "Demo constructor called\n";
    }

    Demo d;

    int main()
    {}
    /*
        Generated output:
    Demo constructor called
    */

The program contains one global object of the class
Demo
with main
having an empty body. Nonetheless, the program produces some output generated
by the constructor of the globally defined Demo
object.
Constructors have a very important and well-defined role. They must ensure that all the class's data members have sensible or at least well-defined values once the object has been constructed. We'll get back to this important task shortly.
The default constructor has no parameters. It is defined by the compiler unless another constructor is defined and unless its definition is suppressed (cf. section 7.6). If a default constructor is required in addition to another constructor then the default constructor must explicitly be defined as well. C++ provides special syntax to realize that without much effort, which is also covered by section 7.6.
Person
has three string data members and a size_t
d_mass
data member. Access to these data members is controlled by
interface functions.
Whenever an object is defined the class's constructor(s) ensure that its data
members are given `sensible' values. Thus, objects never suffer from
uninitialized values. Data members may be given new values, but that should
never be directly allowed. It is a core principle (called
data hiding) of good class design that its data members are private. The
modification of data members is therefore fully controlled by member functions
and thus, indirectly, by the class-designer. The class encapsulates all
actions performed on its data members and due to this
encapsulation the class object may assume the `responsibility' for its
own data-integrity. Here is a minimal definition of Person
's manipulating
members:
    #include "person.h"                 // given earlier
    using namespace std;

    void Person::setName(string const &name)
    {
        d_name = name;
    }
    void Person::setAddress(string const &address)
    {
        d_address = address;
    }
    void Person::setPhone(string const &phone)
    {
        d_phone = phone;
    }
    void Person::setMass(size_t mass)
    {
        d_mass = mass;
    }

It's a minimal definition in that no checks are performed. But it should be clear that checks are easy to implement. E.g., to ensure that a phone number only contains digits one could define:
    void Person::setPhone(string const &phone)
    {
        if (phone.empty())
            d_phone = " - not available -";
        else if (phone.find_first_not_of("0123456789") == string::npos)
            d_phone = phone;
        else
            cout << "A phone number may only contain digits\n";
    }
Note the double negation in this implementation. Double negations are very
hard to read, and an encapsulating member bool hasOnly
handles the test,
and improves setPhone's
readability:
    bool Person::hasOnly(char const *characters, string const &object)
    {
                        // object only contains 'characters'
        return object.find_first_not_of(characters) == string::npos;
    }
and setPhone
becomes:
    void Person::setPhone(string const &phone)
    {
        if (phone.empty())
            d_phone = " - not available -";
        else if (hasOnly("0123456789", phone))
            d_phone = phone;
        else
            cout << "A phone number may only contain digits\n";
    }
Since hasOnly
is an encapsulated member function we can ensure that
it's only used with non-empty string objects, so hasOnly
itself doesn't
have to check for that.
Access to the data members is controlled by accessor
members. Accessors ensure that data members cannot suffer from uncontrolled
modifications. Since accessors conceptually do not modify the object's data
(but only retrieve the data) these member functions are given the predicate
const
. They are called const member functions,
which, as they are guaranteed not to modify their object's data, are available
to both modifiable and constant objects (cf. section 7.7).
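The effect of the const keyword on member functions can be sketched as follows. Label and readThroughConst are hypothetical names used for illustration only:

```cpp
#include <string>

// Label is a hypothetical class; it is not part of the Person example.
class Label
{
    std::string d_text;

    public:
        void setText(std::string const &text)   // manipulator: not const
        {
            d_text = text;
        }
        std::string const &text() const         // accessor: const member
        {
            return d_text;
        }
};

std::string readThroughConst()
{
    Label label;
    label.setText("draft");         // OK: label is modifiable

    Label const &fixed = label;     // constant view of the same object
    // fixed.setText("final");      // would not compile: setText isn't const

    return fixed.text();            // OK: text is a const member function
}
```

The accessor can be used with both label and fixed, whereas the manipulator is only available to the modifiable object.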
To prevent backdoors we must also make sure that the data member is not modifiable through an accessor's return value. For values of built-in primitive types that's easy, as they are usually returned by value, which are copies of the values found in variables. But since objects may be fairly large making copies is usually prevented by returning objects by reference. A backdoor is created by returning a data member by reference, as in the following example, showing the allowed abuse below the function definition:
    string &Person::name()          // returns a modifiable reference
    {
        return d_name;
    }

    Person somebody;
    somebody.setName("Nemo");
    somebody.name() = "Eve";        // Oops, backdoor changing the name
To prevent the backdoor objects are returned as const references from
accessors. Here are the implementations of Person
's accessors:
    #include "person.h"                 // given earlier
    using namespace std;

    string const &Person::name() const
    {
        return d_name;
    }
    string const &Person::address() const
    {
        return d_address;
    }
    string const &Person::phone() const
    {
        return d_phone;
    }
    size_t Person::mass() const
    {
        return d_mass;
    }
The Person
class interface remains the starting point for the class
design: its member functions define what can be asked of a Person
object. In the end the implementation of its members merely is a technicality
allowing Person
objects to do their jobs.
The next example shows how the class Person
may be used. An object is
initialized and passed to a function printperson()
, printing the person's
data. Note the reference operator in the parameter list of the function
printperson
. Only a reference to an existing Person
object is passed
to the function, rather than a complete object. The fact that
printperson
does not modify its argument is evident from the fact that
the parameter is declared const
.
    #include <iostream>
    #include "person.h"                 // given earlier
    using namespace std;

    void printperson(Person const &p)
    {
        cout << "Name : "    << p.name()     << "\n"
                "Address : " << p.address()  << "\n"
                "Phone : "   << p.phone()    << "\n"
                "Mass : "    << p.mass()     << '\n';
    }

    int main()
    {
        Person p;

        p.setName("Linus Torvalds");
        p.setAddress("E-mail: Torvalds@cs.helsinki.fi");
        p.setPhone("");
        p.setMass(75);           // kg.

        printperson(p);
    }
    /*
        Produced output:

    Name : Linus Torvalds
    Address : E-mail: Torvalds@cs.helsinki.fi
    Phone :  - not available -
    Mass : 75

    */
Person
's constructor so far has not received any
parameters. C++ allows constructors to be defined with or without
parameter lists. The arguments are supplied when an object is defined.
For the class Person
a constructor expecting three strings and a
size_t
might be useful, representing, respectively, the person's name,
address, phone number and mass. This constructor can be implemented like this
(but see also section 7.3.1):
    Person::Person(string const &name, string const &address,
                   string const &phone, size_t mass)
    {
        d_name = name;
        d_address = address;
        setPhone(phone);
        d_mass = mass;
    }
It must of course also be declared in the class interface:
    class Person
    {
        // data members (not altered)

        public:
            Person(std::string const &name, std::string const &address,
                   std::string const &phone, size_t mass);

            // rest of the class interface (not altered)
    };
Now that this constructor has been declared, the default constructor must
explicitly be declared as well if we still want to be able to construct a
plain Person
object without any specific initial values for its data
members. The class Person
would thus support two constructors, and the
part declaring the constructors now becomes:
    class Person
    {
        // data members

        public:
            Person();
            Person(std::string const &name, std::string const &address,
                   std::string const &phone, size_t mass);

            // additional members
    };
In this case, the default constructor doesn't have to do very much, as it
doesn't have to initialize the string
data members of the Person
object. As these data members are objects themselves, they are initialized to
empty strings by their own default constructor. However, there is also a
size_t
data member. That member is a variable of a built-in type and such
variables do not have constructors and so are not initialized automatically.
Therefore, unless the value of the d_mass
data member is explicitly
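initialized its value is:
- indeterminate for local Person objects;
- 0 for global and static Person objects.
So even the default constructor has a job to do: it must initialize the data members that are not automatically initialized to sensible values:
    Person::Person()
    {
        d_mass = 0;
    }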
Using constructors with and without arguments is illustrated next. The
object karel
is initialized by the constructor defining a non-empty
parameter list while the default constructor is used for the anon
object. When constructing objects using constructors requiring arguments you
are advised to surround the arguments by curly braces. Parentheses can often
also be used, and sometimes even have to be used (cf. section
12.4.2), but mindlessly using parentheses instead of curly braces may
easily result in unexpected problems (cf. section 7.2). Hence the
advice to prefer curly braces rather than parentheses. Here's the
example showing two constructor-calls:
    int main()
    {
        Person karel{ "Karel", "Rietveldlaan 37", "542 6044", 70 };
        Person anon;
    }
The two Person
objects are defined when main
starts as they are
local objects, living only for as long as main
is active.
If Person
objects must be definable using other arguments,
corresponding constructors must be added to Person
's interface. Apart from
overloading class constructors it is also possible to provide constructors
with default argument values. These default arguments must be specified with
the constructor declarations in the class interface, like so:
    class Person
    {
        public:
            Person(std::string const &name,
                   std::string const &address = "--unknown--",
                   std::string const &phone   = "--unknown--",
                   size_t mass = 0);
    };
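How default arguments are filled in can be checked with a reduced variant of the class. The class Contact below is hypothetical: it is a stripped-down stand-in for Person, and its accessors phone and mass merely mirror those of Person. Note that the default values only appear in the declaration, not in the constructor's implementation:

```cpp
#include <cstddef>
#include <string>

// Contact is a hypothetical, reduced variant of Person.
class Contact
{
    std::string d_name;
    std::string d_phone;
    std::size_t d_mass;

    public:
                                // defaults: specified in the declaration
        Contact(std::string const &name,
                std::string const &phone = "--unknown--",
                std::size_t mass = 0);

        std::string const &phone() const;
        std::size_t mass() const;
};

                                // no default values are repeated here
Contact::Contact(std::string const &name, std::string const &phone,
                 std::size_t mass)
{
    d_name = name;
    d_phone = phone;
    d_mass = mass;
}

std::string const &Contact::phone() const
{
    return d_phone;
}

std::size_t Contact::mass() const
{
    return d_mass;
}
```

With this interface `Contact one{ "Karel" }' receives the phone number "--unknown--" and mass 0, while `Contact two{ "Karel", "542 6044", 70 }' uses the values that were explicitly provided.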
Often, constructors use highly similar implementations. This results from
the fact that the constructor's parameters are often defined for convenience:
a constructor not requiring a phone
number but requiring a mass
cannot
be defined using default arguments, since phone
is not the constructor's
last parameter. Consequently a special constructor is required not having
phone
in its parameter list. However, this doesn't necessarily mean that
constructors must duplicate their code, as constructors may call each other
(called constructor delegation). Constructor delegation is illustrated in
section 7.4.1 below.
The order in which objects are constructed is illustrated by the next
program, using a class Test. The program defines a global
Test
object and two local Test
objects. The order of construction is
as expected: first global, then main's first local object, then func
's
local object, and then, finally, main
's second local object:
    #include <iostream>
    #include <string>
    using namespace std;

    class Test
    {
        public:
            Test(string const &name);   // constructor with an argument
    };

    Test::Test(string const &name)
    {
        cout << "Test object " << name << " created" << '\n';
    }

    Test globaltest("global");

    void func()
    {
        Test functest("func");
    }

    int main()
    {
        Test first{ "main first" };
        func();
        Test second{ "main second" };
    }
    /*
        Generated output:
    Test object global created
    Test object main first created
    Test object func created
    Test object main second created
    */
Consider the following interface of a class Data:
    class Data
    {
        public:
            Data();
            Data(int one);
            Data(int one, int two);

            void display();
    };
The intention is to define two objects of the class Data, using, respectively, the first and second constructors, while using parentheses in the object definitions. Your code looks like this (and compiles correctly):
    #include "data.h"

    int main(int argc, char **argv)
    {
        Data d1();
        Data d2(argc);
    }
Now it's time to make some good use of the Data
objects. Let's add two
statements to main
:
    d1.display();
    d2.display();
But, surprise, the compiler complains about the first of these two:
error: request for member 'display' in 'd1', which is of non-class type 'Data()'
What's going on here? First of all, notice the data type the compiler refers
to: Data()
, rather than Data
. What are those ()
doing there?
Before answering that question, let's broaden our story somewhat. We know that
somewhere in a library a factory function dataFactory
exists. A
factory function creates and returns an object of a certain type. This
dataFactory
function returns a Data
object, constructed using
Data
's default constructor. Hence, dataFactory
needs no arguments. We
want to use dataFactory
in our program, but must declare the function. So
we add the declaration to main
, as that's the only location where
dataFactory
will be used. It's a function, not requiring arguments,
returning a Data
object:
Data dataFactory();
This, however, looks remarkably similar to our d1
object definition:
Data d1();
We found the source of our problem: Data d1()
apparently is not
the definition of a d1
object, but the declaration of a function,
returning a Data
object. So, what's happening here and how should we
define a Data
object using Data
's default constructor?
First: what's happening here is that the compiler, when confronted with
Data d1()
, actually had a choice. It could either define a Data
object, or declare a function. It declares a function.
In fact, we're encountering an ambiguity in C++'s grammar here, which is solved, according to the language's standard, by always letting a declaration prevail over a definition. We'll encounter more situations where this ambiguity occurs later on in this section.
Second: there are several ways we can solve this ambiguity the way we want it to be solved. To define an object using its default constructor:
- merely mention it (like int x): Data d1;
- use curly braces: Data d1{};
- initialize it using an anonymous default constructed Data object: Data d1 = Data{}, or possibly Data d1 = Data().
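The alternatives can be compared in a small sketch. The class Data shown here is a hypothetical stand-in that merely counts its default constructions (using a static data member, cf. chapter 8); defineObjects is a made-up function name. Compilers may warn about the first definition, which is exactly the declaration the text warns about:

```cpp
// Data is a hypothetical stand-in for the class Data used in the text.
class Data
{
    static int s_constructed;       // counts default constructions

    public:
        Data()
        {
            ++s_constructed;
        }
        static int constructed()
        {
            return s_constructed;
        }
};

int Data::s_constructed = 0;

int defineObjects()
{
    int before = Data::constructed();

    Data d0();              // NOT an object: declares a function d0
    Data d1;                // object, using the default constructor
    Data d2{};              // object, using the default constructor
    Data d3 = Data{};       // object, from an anonymous Data object
    Data d4 = Data();       // same, using parentheses

    return Data::constructed() - before;    // default constructions seen
}
```

Five lines look like object definitions, but the default constructor is called only four times: d0 is a function declaration and constructs nothing.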
Data()
in the above context defines a default constructed anonymous
Data
object. This takes us back to the compiler error. According to the
compiler, our original d1
apparently was not of type Data
, but of type
Data()
. So what's that?
Let's first have a look at our second constructor. It expects an
int
. We would like to define another Data
object, using the second
constructor, but want to pass the default int
value to the constructor,
using int()
. We know this defines a default int
value, as cout <<
int() << '\n'
nicely displays 0, and int x = int()
also initializes x to
0. So we define `Data di(int())'
in main
.
Not good: again the compiler complains when we try to use
di
. After `di.display()'
the compiler tells us:
error: request for member 'display' in 'di', which is of non-class type 'Data(int (*)())'
Oops, again not as expected.... Didn't we pass 0? Why the sudden pointer? It's
that same `use a declaration when possible' strategy again. The notation
Type()
not only represents the default value of type Type
, but it's
also a shorthand notation for an anonymous pointer to a function, not
expecting arguments, and returning a Type
value, which you can verify by
defining `int (*ip)() = nullptr'
, and passing ip
as argument to
di
: di(ip)
compiles fine.
So why doesn't the error occur when inserting int() or assigning int() to
int x? In these latter cases nothing is declared. Rather, `cout <<' and
`int x =' require expressions determining values, which is provided by
int()'s `natural' interpretation. But with `Data di(int())' the compiler again
has a choice, and (by design) it chooses a declaration because the declaration
takes priority. Now int()'s interpretation as an anonymous pointer is
available and therefore used.
Likewise, if int x
has been defined, `Data b1(int(x))'
declares b1
as a function, expecting an int
(as int(x)
represents a type), while
`Data b2((int)x)'
defines b2
as a Data
object, using the
constructor expecting a single int
value.
Again, to use default entities, values or objects, prefer {}
over ()
:
Data di{ int{} }
defines di
of type Data
, calling the Data(int
x)
constructor and uses int's
default value 0.
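The compiler's interpretation can be made visible by inspecting the type of di. The following is merely a sketch: the class Data is a minimal, hypothetical stand-in (its member value is an assumption, not part of the original interface), and std::is_same from <type_traits> is used to compare types:

```cpp
#include <type_traits>

// Data is a minimal, hypothetical stand-in for the class used in the text.
class Data
{
    int d_value;

    public:
        Data()
        {
            d_value = -1;
        }
        Data(int value)
        {
            d_value = value;
        }
        int value() const
        {
            return d_value;
        }
};

bool diIsAFunction()
{
    Data di(int());         // declares a function, doesn't define an object

                            // di's type: a function expecting a pointer to
                            // a function returning an int, returning Data
    return std::is_same<decltype(di), Data (int (*)())>::value;
}

int braceVariant()
{
    Data di{ int{} };       // defines an object, passing int's default value
    return di.value();
}
```

Here diIsAFunction returns true, confirming that the parenthesized form declares a function, while braceVariant returns 0, the value passed to the Data(int) constructor.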
Assume that somewhere in our program we defined int b. Then,
in a compound statement we need to construct an anonymous Data
object,
initialized using b
, followed by displaying b
:
    int b = 18;
    {
        Data(b);
        cout << b;
    }
About that cout
statement the compiler tells us (I modified the
error message to reveal its meaning):
error: cannot bind `std::ostream & << Data const &'
Here we didn't insert int b
but Data b
. Had we omitted the compound
statement, the compiler would have complained about a doubly defined b
entity, as Data(b)
simply means Data b
, a Data
object constructed
by default. The compiler may omit superfluous parentheses when parsing a
definition or declaration.
Of course, the question now becomes how a temporary object Data
,
initialized with int b
can be defined. Remember that the compiler may
remove superfluous parentheses. So, what we need to do is to pass an int
to the anonymous Data
object, without using the int
's name.
- use a cast: Data(static_cast<int>(b));
- use curly braces: Data{ b }.
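The three forms can be told apart by recording which constructor was used. In this sketch the class Data is again a hypothetical stand-in: its static member s_lastValue and the functions shadowed, usingCast and usingBraces are assumptions made for the illustration:

```cpp
// Data is a hypothetical stand-in recording which constructor was used.
class Data
{
    static int s_lastValue;     // argument of the most recent constructor

    public:
        Data()
        {
            s_lastValue = -1;   // -1 marks the default constructor
        }
        Data(int value)
        {
            s_lastValue = value;
        }
        static int lastValue()
        {
            return s_lastValue;
        }
};

int Data::s_lastValue = 0;

bool shadowed()
{
    int b = 18;
    {
        Data(b);                // same as `Data b;': a Data object named b
    }
                                // int b is untouched, and the default
                                // constructor was used
    return b == 18 && Data::lastValue() == -1;
}

int usingCast()
{
    int b = 18;
    Data(static_cast<int>(b));  // anonymous object, initialized with b
    return Data::lastValue();
}

int usingBraces()
{
    int b = 18;
    Data{ b };                  // anonymous object, initialized with b
    return Data::lastValue();
}
```

Only the last two functions pass b's value to the Data(int) constructor; in shadowed the parentheses are considered superfluous and a default constructed object b is defined.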
Values and types make big differences. Consider the following definitions:
    Data (*d4)(int);    // 1
    Data (*d5)(3);      // 2
Definition 1 should cause no problems: it's a pointer to a function,
expecting an int
, returning a Data
object. Hence, d4
is a pointer
variable.
Definition 2 is slightly more complex. Yes, it's a pointer. But it has
nothing to do with a function. So what's that argument list containing 3 doing
there? Well, it's not an argument list. It's an initialization that looks like
an argument list. Remember that variables can be initialized using the
assignment operator, by parentheses or by curly braces. So instead of
`(3)'
we could have written `= 3'
or `{3}'
. Let's pick the first
alternative, resulting in:
Data (*d5) = 3;
Now we get to `play compiler' again. Removing some superfluous parentheses we get:
Data *d5 = 3;
It's a pointer to a Data
object, initialized to 3. This is
semantically incorrect, but that's only clear after the syntactical
analysis. If I had initially written
Data (*d5)(&d1); // 2
the fun resulting from contrasting int
and 3
would most likely
have been spoiled.
Assume a function process
expecting an int
exists in a library. We
want to use this function to process some int
data values. So in main
process
is declared and called:
    int process(int Data);
    process(argc);
No problems here. But unfortunately we once decided to `beautify' our code, by throwing in some superfluous parentheses, like so:
    int process(int (Data));
    process(argc);
Now we're in trouble. The compiler now generates an error, caused by its
rule to let declarations prevail over definitions. Data
now becomes the
name of the class Data
, and analogous to int (x)
the parameter int
(Data)
is parsed as int (*)(Data)
: a pointer to a function, expecting a
Data
object, returning an int
.
Here is another example. When, instead of declaring
int process(int Data[10]);
we declare, e.g., to emphasize the fact that an array is passed to
process
:
int process(int (Data[10]));
the process
function does not expect a pointer to int
values, but
a pointer to a function expecting a pointer to Data
elements, returning an
int
.
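To summarize the findings in the `Ambiguity Resolution' section:
- the compiler may remove superfluous parentheses when parsing a definition or declaration;
- when a construction can be interpreted as a declaration, the declaration takes priority over the definition of an object or variable;
- the problems encountered in this section were caused by parentheses, so prefer curly braces when constructing objects or values.
In the class Person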
objects are used as data members. This construction
technique is called composition.
Composition is neither extraordinary nor C++ specific: in C
a struct
or union
field is commonly used in other compound types. In
C++ it requires some special thought as their initialization sometimes is
subject to restrictions, as discussed in the next few sections.
Earlier we've encountered the following constructor of the Person
:
    Person::Person(string const &name, string const &address,
                   string const &phone, size_t mass)
    {
        d_name = name;
        d_address = address;
        d_phone = phone;
        d_mass = mass;
    }
Think briefly about what is going on in this constructor. In the
constructor's body we encounter assignments to string objects. Since
assignments are used in the constructor's body their left-hand side objects
must exist. But when objects are coming into existence constructors must
have been called. The initialization of those objects is thereupon immediately
undone by the body of Person
's constructor. That is
not only inefficient but sometimes downright impossible. Assume that the class
interface mentions a string const
data member: a data member whose value
is not supposed to change at all (like a birthday, which usually doesn't
change very much and is therefore a good candidate for a string const
data
member). Constructing a birthday object and providing it with an initial value
is OK, but changing the initial value isn't.
The body of a constructor allows assignments to data members. The initialization of data members happens before that. C++ defines the member initializer syntax allowing us to specify the way data members are initialized at construction time. Member initializers are specified as a list of constructor specifications between a colon following a constructor's parameter list and the opening curly brace of a constructor's body, as follows:
    Person::Person(string const &name, string const &address,
                   string const &phone, size_t mass)
    :
        d_name(name),
        d_address(address),
        d_phone(phone),
        d_mass(mass)
    {}
In this example the member initialization used parentheses surrounding the
initialization expression. Instead of
parentheses curly braces may also be used. E.g., d_name
could also be
initialized this way:
d_name{ name },
Member initialization always occurs when objects are composed in classes: if no constructors are mentioned in the member initializer list the default constructors of the objects are called. Note that this only holds true for objects. Data members of primitive data types are not initialized automatically.
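The situation of the const data member mentioned above can be sketched as follows. The class Member and its accessors are hypothetical names chosen for this illustration:

```cpp
#include <string>

// Member is a hypothetical class having a const data member.
class Member
{
    std::string const d_birthday;   // cannot be assigned after construction
    std::string d_name;

    public:
        Member(std::string const &name, std::string const &birthday);

        std::string const &birthday() const;
        std::string const &name() const;
};

Member::Member(std::string const &name, std::string const &birthday)
:
    d_birthday(birthday),           // initialization: the only option here
    d_name(name)
{
    // d_birthday = birthday;       // would not compile: d_birthday is const
}

std::string const &Member::birthday() const
{
    return d_birthday;
}

std::string const &Member::name() const
{
    return d_name;
}
```

Without the member initializer for d_birthday the data member would be default constructed as an empty string, with no way to change it afterwards.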
Member initialization can, however, also be used for primitive data members,
like int
and double
. The above example shows the initialization of the
data member d_mass
from the parameter mass
. When member
initializers are used the data member could even have the same name as the
constructor's parameter (although this is deprecated) as there is no ambiguity
and the first (left) identifier used in a member initializer is always a data
member that is initialized whereas the identifier between parentheses is
interpreted as the parameter.
The order in which data members are initialized is defined by the order in which those members are declared in the composing class interface, not by the order of the member initializers. If the order of the initializers in the constructor differs from the order in the class interface, compilers usually issue a warning, and the members are nevertheless initialized in the order of the class interface.
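The initialization order can be observed by letting data members log their own construction. This is only a sketch: Part, Whole, the global g_log and the function constructionOrder are hypothetical names, and the member initializers are deliberately listed in the `wrong' order (for which a compiler may issue a warning):

```cpp
#include <string>

std::string g_log;      // hypothetical global recording construction order

class Part
{
    public:
        Part(char id)
        {
            g_log += id;        // note which part is being constructed
        }
};

class Whole
{
    Part d_first;               // declared first: initialized first
    Part d_second;

    public:
        Whole()
        :
            d_second('2'),      // listed first, but initialized last
            d_first('1')
        {}
};

std::string constructionOrder()
{
    g_log.clear();
    Whole whole;
    return g_log;
}
```

Although d_second's initializer is listed first, constructionOrder returns "12": d_first is declared first in the class interface and therefore is initialized first.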
Member initializers should be used as often as possible. As shown it may be required to use them (e.g., to initialize const data members, or to initialize objects of classes lacking default constructors) but not using member initializers also results in inefficient code as the default constructor of a data member is always automatically called unless an explicit member initializer is specified. Reassignment in the constructor's body following default construction is then clearly inefficient. Of course, sometimes it is fine to use the default constructor, but in those cases the explicit member initializer can be omitted.
As a rule of thumb: if a value is assigned to a data member in the constructor's body then try to avoid that assignment in favor of using a member initializer.
Apart from using member initializers to initialize composed objects (be they const
objects or not), there is another situation where member
initializers must be used. Consider the following situation.
A program uses an object of the class Configfile
, defined in main
to access the information in a configuration file. The configuration file
contains parameters of the program which may be set by changing the values in
the configuration file, rather than by supplying command line arguments.
Assume another object used in main
is an object of the class Process
,
doing `all the work'. What possibilities do we have to tell the object of the
class Process
that an object of the class Configfile
exists?
The Configfile
object may be passed to the Process
object at
construction time. Bluntly passing an object (i.e., by value) might not
be a very good idea, since the object must be copied into the Configfile
parameter, and then a data member of the Process
class can be used to make
the Configfile
object accessible throughout the Process
class. This
might involve yet another object-copying task, as in the following situation:
    Process::Process(Configfile conf)   // a copy from the caller
    {
        d_conf = conf;                  // copying to d_conf member
    }
Instead of passing the object by value, pointers to Configfile
objects are used, as in:
    Process::Process(Configfile *conf)  // pointer to external object
    {
        d_conf = conf;                  // d_conf is a Configfile *
    }
This construction as such is OK, but forces us to use the `->
' field
selector operator, rather than the `.
' operator, which is (disputably)
awkward. Conceptually one tends to think of the Configfile
object as an
object, and not as a pointer to an object. In C this would probably have
been the preferred method, but in C++ we can do better.
Configfile
parameter could be defined as a reference parameter of Process
's
constructor. Next, use a Configfile
reference data member in the
class Process
.
    Process::Process(Configfile &conf)
    {
        d_conf = conf;      // wrong: no assignment
    }
The statement d_conf = conf
fails, because it is not an
initialization, but an assignment of one Configfile
object (i.e.,
conf
), to another (d_conf
). An assignment to a reference variable is
actually an assignment to the variable the reference variable refers to. But
which variable does d_conf
refer to? To no variable at all, since we
haven't initialized d_conf
. After all, the whole purpose of the statement
d_conf = conf
was to initialize d_conf
....
How to initialize d_conf
? We once again use the member initializer
syntax. Here is the correct way to initialize d_conf
:
    Process::Process(Configfile &conf)
    :
        d_conf(conf)        // initializing reference member
    {}
The above syntax must be used in all cases where reference data members
are used. E.g., if d_ir
would have been an int
reference data member,
a construction like
Process::Process(int &ir) : d_ir(ir) {}
would have been required.
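A minimal, self-contained sketch of the reference approach follows. The `Configfile` class shown here is a hypothetical stand-in (it merely stores a name), and the members `name`, `rename` and `configName` are assumptions made for the illustration.

```cpp
#include <string>

// Hypothetical stand-in for the Configfile class of the text.
class Configfile
{
    std::string d_name;

    public:
        Configfile(std::string const &name) : d_name(name) {}
        std::string const &name() const { return d_name; }
        void rename(std::string const &name) { d_name = name; }
};

class Process
{
    Configfile &d_conf;             // reference data member

    public:
        Process(Configfile &conf)
        :
            d_conf(conf)            // can only be initialized here
        {}
        std::string const &configName() const
        {
            return d_conf.name();   // uses `.', not `->'
        }
};
```

Since `d_conf` refers to the caller's object, no copy is made, and changes made to the original `Configfile` object are visible through the `Process` object.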
Consider a class defining several data members: a pointer to data, a data member storing the number of data elements the pointer points at, and a data member storing the sequence number of the object. The class also offers a basic set of constructors, as shown in the following class interface:
class Container { Data *d_data; size_t d_size; size_t d_nr; static size_t s_nObjects; public: Container(); Container(Container const &other); Container(Data *data, size_t size); Container(Container &&tmp); };
The initial values of the data members are easy to describe, but somewhat
hard to implement. Consider the initial situation and assume the default
constructor is used: all data members should be set to 0, except for d_nr
which must be given the value ++s_nObjects
. Since these are
non-default actions, we can't declare the default constructor using =
default
, but we must provide an actual implementation:
Container() : d_data(0), d_size(0), d_nr(++s_nObjects) {}
In fact, all constructors require us to state the
d_nr(++s_nObjects)
initialization. So if d_data
's type would have been
a (move aware) class type, we would still have to provide implementations for
all of the above constructors.
C++, however, also supports
data member initializers,
simplifying the initialization of non-static data members. Data member
initializers allow us to assign initial values to data members. The compiler
must be able to compute these initial values from initialization expressions,
but the initial values do not have to be constant expressions. So
++s_nObjects
can be an initial value.
Using data member initializers for the class Container
we get:
class Container { Data *d_data = 0; size_t d_size = 0; size_t d_nr = ++s_nObjects; static size_t s_nObjects; public: Container() = default; Container(Container const &other); Container(Data *data, size_t size); Container(Container &&tmp); };
Note that the data member initializations are recognized by the compiler, and are applied to its implementation of the default constructor. In fact, all constructors will apply the data member initializations, unless explicitly initialized otherwise. E.g., the move-constructor may now be implemented like this:
Container(Container &&tmp) : d_data(tmp.d_data), d_size(tmp.d_size) { tmp.d_data = 0; }
Although d_nr
's initialization is left out of the implementation, it
is initialized due to the data member initialization provided in the
class's interface.
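The following sketch shows the sequence numbering at work. It assumes `int` as the element type instead of `Data`, and adds the accessors `nr` and `size`, which are not part of the interface shown above.

```cpp
#include <cstddef>

class Container
{
    int *d_data = 0;
    std::size_t d_size = 0;
    std::size_t d_nr = ++s_nObjects;    // evaluated for each new object
    static std::size_t s_nObjects;

    public:
        Container() = default;
        Container(int *data, std::size_t size)
        :
            d_data(data),               // overrules `= 0'
            d_size(size)
        {}                              // d_nr still uses ++s_nObjects
        Container(Container &&tmp)
        :
            d_data(tmp.d_data),
            d_size(tmp.d_size)
        {
            tmp.d_data = 0;
            tmp.d_size = 0;
        }
        std::size_t nr() const { return d_nr; }
        std::size_t size() const { return d_size; }
};
std::size_t Container::s_nObjects = 0;
```

None of the constructors mentions `d_nr`, yet each newly constructed object receives the next sequence number.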
An aggregate is an array or a class
(usually a struct
) with no
user-defined constructors, no private or protected non-static data members,
no base classes (cf. chapter 13), and no virtual functions
(cf. chapter 14). E.g.,
    struct POD      // defining aggregate POD
    {
        int first = 5;
        double second = 1.28;
        std::string hello{ "hello" };
    };
To initialize such aggregates braced initializer lists can be used. In fact, their use is preferred over using the older form (using parentheses), as using braces avoids confusion with function declarations. E.g.,
POD pod{ 4, 13.5, "hi there" };
When using braced-initializer lists not all data members need to be initialized. Specification may stop at any data member, in which case the default values (or the explicitly defined initialization values) of the remaining data members are used. E.g.,
POD pod{ 4 }; // uses second: 1.28, hello: "hello"
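Both forms can be combined in a small sketch. The function `partial` is a hypothetical helper added for the illustration, and the sketch assumes a compiler supporting at least C++14, since earlier standards do not treat a class with data member initializers as an aggregate.

```cpp
#include <string>

struct POD
{
    int first = 5;
    double second = 1.28;
    std::string hello{ "hello" };
};

// Returns an aggregate for which only `first' was specified:
// second and hello receive their interface-specified values.
inline POD partial()
{
    return POD{ 4 };
}
```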
Before the C++11 standard common practice was to define a member like init
performing all initializations common to constructors. Such an init
function, however, cannot be used to initialize const
or reference data
members, nor can it be used to perform so-called base class
initializations (cf. chapter 13).
Here is an example where such an init
function might have been used. A
class Stat
is designed as a wrapper class around C's stat(2)
function. The class might define three constructors: one expecting no
arguments and initializing all data members to appropriate values; a second
one doing the same, but it calls stat
for the filename provided to the
constructor; and a third one expecting a filename and a search path for the
provided file name. Instead of repeating the initialization code in each
constructor, the common code can be factorized into a member init
which
is called by the constructors.
C++ offers an alternative by allowing constructors to call each other. Constructors calling other constructors are called delegating constructors, as illustrated by the next example:
    class Stat
    {
        public:
            Stat()
            :
                Stat("", "")        // no filename/searchpath
            {}
            Stat(std::string const &fileName)
            :
                Stat(fileName, "")  // only a filename
            {}
            Stat(std::string const &fileName, std::string const &searchPath)
            :
                d_filename(fileName),
                d_searchPath(searchPath)
            {
                // remaining actions to be performed by the constructor
            }
    };
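A compilable variant of this idea might look as follows. The data member `d_initCalls` and the three accessors are assumptions added to make the effect observable; the actual call to stat(2) is left out.

```cpp
#include <string>

class Stat
{
    std::string d_filename;
    std::string d_searchPath;
    int d_initCalls = 0;            // hypothetical: counts the common setup

    public:
        Stat()
        :
            Stat("", "")            // delegate: no filename/searchpath
        {}
        Stat(std::string const &fileName)
        :
            Stat(fileName, "")      // delegate: only a filename
        {}
        Stat(std::string const &fileName, std::string const &searchPath)
        :
            d_filename(fileName),
            d_searchPath(searchPath)
        {
            ++d_initCalls;          // the common setup is done in one place
        }
        std::string const &filename() const { return d_filename; }
        std::string const &searchPath() const { return d_searchPath; }
        int initCalls() const { return d_initCalls; }
};
```

Whichever constructor is used, the body of the two-argument constructor is executed exactly once.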
C++ allows static const integral data members to be initialized within
the class interfaces (cf. chapter
8). The C++11 standard adds to this the facility to define
default initializations for plain data members in class interfaces (these data
members may or may not be const
or of integral types, but (of course) they
cannot be reference data members).
These default initializations may be overruled by constructors. E.g., if
the class Stat
uses a data member bool d_hasPath
which is false
by
default but the third constructor (see above) should initialize it to true
then the following approach is possible:
    class Stat
    {
        bool d_hasPath = false;

        public:
            Stat(std::string const &fileName, std::string const &searchPath)
            :
                d_hasPath(true)     // overrule the interface-specified
            {}                      // value
    };
Here d_hasPath
receives its value only once: it's always initialized
to false
except when the shown constructor is used in which case it is
initialized to true
.
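A self-contained sketch of this behavior is given below. The accessors `hasPath` and `filename`, and the way the path and file name are joined, are assumptions made for the illustration.

```cpp
#include <string>

class Stat
{
    bool d_hasPath = false;         // default specified in the interface
    std::string d_filename;

    public:
        Stat() = default;
        Stat(std::string const &fileName)
        :
            d_filename(fileName)    // d_hasPath remains false
        {}
        Stat(std::string const &fileName, std::string const &searchPath)
        :
            d_hasPath(true),        // overrules the interface-specified value
            d_filename(searchPath + '/' + fileName)
        {}
        bool hasPath() const { return d_hasPath; }
        std::string const &filename() const { return d_filename; }
};
```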
C++ supports a comparable initialization, called uniform initialization. It uses the following syntax:
Type object{ value list };
When defining objects using a list of objects each individual object may use its own uniform initialization.
The advantage of uniform initialization over using constructors is that using
constructor arguments may sometimes result in an ambiguity as constructing an
object may sometimes be confused with using the object's overloaded function
call operator (cf. section 11.10). As initializer lists can only be used
with plain old data (POD) types (cf. section 9.10) and with classes
that are `initializer list aware' (like std::vector
) the ambiguity does
not arise when initializer lists are used.
Uniform initialization can be used to initialize an object or variable, but also to initialize data members in a constructor or implicitly in the return statement of functions. Examples:
class Person { // data members public: Person(std::string const &name, size_t mass) : d_name {name}, d_mass {mass} {} Person copy() const { return {d_name, d_mass}; } };
Object definitions may be encountered in unexpected places, easily resulting in
(human) confusion. Consider a function `func
' and a very simple
class Fun
(struct
is used, as data hiding is not an issue here;
in-class implementations are used for brevity):
void func(); struct Fun { Fun(void (*f)()) { std::cout << "Constructor\n"; }; void process() { std::cout << "process\n"; } };
Assume that in main
a Fun
object is defined as follows:
Fun fun(func);
Running this program displays Constructor
, confirming that the
object fun
is constructed.
Next we change this line of code, intending to call process
from an
anonymous Fun
object:
Fun(func).process();
As expected, Constructor
appears, followed by the text process
.
What about just defining an anonymous Fun
object? We do:
Fun(func);
Now we're in for a surprise. The compiler complains that Fun
's default
constructor is missing. Why's that? Insert some blanks immediately after
Fun
and you get Fun (func)
. Parentheses around an identifier are OK,
and are stripped off once the parenthesized expression has been parsed. In
this case: (func)
equals func
, and so we have Fun func
: the
definition of a Fun func
object, using Fun
's default constructor (which
isn't provided).
So why does Fun(func).process()
compile? In this case we have a member
selector operator, whose left-hand operand must be a class-type object. The
object must exist, and Fun(func)
represents that object. It's not the name
of an existing object, but a constructor expecting a function like func
exists. The compiler now creates an anonymous Fun
, passing it func
as
its argument.
Clearly, in this example, parentheses cannot be used to create an anonymous
Fun
object. However, the uniform initialization can be used. To define
the anonymous Fun
object use this syntax:
Fun{ func };
(which can also be used to immediately call one of its members. E.g.,
Fun{func}.process()
).
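The brace form can be tried out with the following sketch. The counter `s_constructed`, the return value of `process` and the function `demo` are hypothetical additions, used only to observe that both anonymous objects are indeed constructed.

```cpp
struct Fun
{
    static int s_constructed;       // hypothetical counter

    Fun(void (*f)())
    {
        (void)f;                    // the function itself isn't used here
        ++s_constructed;
    }
    int process() const
    {
        return 42;
    }
};
int Fun::s_constructed = 0;

void func()
{}

int demo()
{
    Fun{ func };                    // an anonymous Fun object, not `Fun func'
    return Fun{ func }.process();   // a second one, calling a member
}
```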
Although the uniform initialization syntax is slightly different from the syntax of an initializer list (the latter using the assignment operator), the compiler nevertheless uses the initializer list if a constructor supporting an initializer list is available. As an example consider:
class Vector { public: Vector(size_t size); Vector(std::initializer_list<int> const &values); }; Vector vi = {4};
When defining vi
the constructor expecting the initializer list is
called rather than the constructor expecting a size_t
argument. If the
latter constructor is required the definition using the standard constructor
syntax must be used. I.e., Vector vi(4)
.
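The difference between both definitions can be observed with the following sketch. It assumes that `Vector` stores its values in a `std::vector<int>` and adds the members `size` and `operator[]`, none of which appear in the interface shown above.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

class Vector
{
    std::vector<int> d_values;

    public:
        Vector(std::size_t size)
        :
            d_values(size)          // size elements, each set to 0
        {}
        Vector(std::initializer_list<int> const &values)
        :                           // copy the listed values
            d_values(values.begin(), values.end())
        {}
        std::size_t size() const { return d_values.size(); }
        int operator[](std::size_t idx) const { return d_values[idx]; }
};
```

Here `Vector vi = {4}` defines a `Vector` holding the single value 4, whereas `Vector vi(4)` defines a `Vector` holding four zeroes.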
Initializer lists are themselves objects that may be constructed using another initializer list. However, values stored in an initializer list are immutable. Once the initializer list has been defined their values remain as-is.
Before using initializer lists the
initializer_list
header file must be included.
Initializer lists support a basic set of member functions and constructors:
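initializer_list<Type> object: defines object as an empty initializer list;
initializer_list<Type> object { list of Type values }: defines object as an initializer list containing Type values;
initializer_list<Type> object(other): initializes object using the values stored in other;
size_t size() const: returns the number of elements in the initializer list;
Type const *begin() const: returns a pointer to the first element of the initializer list;
Type const *end() const: returns a pointer just beyond the last element of the initializer list.
= default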
'
syntax. A class specifying `= default
' with its default constructor
declaration indicates that the trivial default constructor should be
provided by the compiler. A trivial default constructor performs the following
actions:
Conversely, situations exist where some (otherwise automatically provided)
members should not be made available. This is realized by specifying
`= delete
'. Using = default
and = delete
is illustrated
by the following example. The default constructor receives its trivial
implementation, copy-construction is prevented:
class Strings { public: Strings() = default; Strings(std::string const *sp, size_t size); Strings(Strings const &other) = delete; };
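The effect of both specifications can be checked at compile time using the type traits facilities. In this sketch the data member `d_size` and the member `size` are assumptions; the pointer passed to the second constructor is simply ignored.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

class Strings
{
    std::size_t d_size = 0;

    public:
        Strings() = default;
        Strings(std::string const *, std::size_t size)
        :
            d_size(size)            // the strings themselves aren't stored
        {}
        Strings(Strings const &other) = delete;

        std::size_t size() const { return d_size; }
};

static_assert(std::is_default_constructible<Strings>::value,
              "the default constructor is available");
static_assert(!std::is_copy_constructible<Strings>::value,
              "copy construction is prevented");
```

A statement like `Strings copy{ original };` is now rejected by the compiler.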
const
is often used behind the parameter list of member
functions. This keyword indicates that a member function does not alter the
data members of its object. Such member functions are called
const member functions. In the class Person
, we see that the
accessor functions were declared const
:
class Person { public: std::string const &name() const; std::string const &address() const; std::string const &phone() const; size_t mass() const; };
The rule of thumb given in section 3.1.1 applies here too:
whatever appears to the left of the keyword const
is not
altered. With member functions this should be interpreted as `doesn't alter
its own data'.
When implementing a const member function the const
attribute must be
repeated:
string const &Person::name() const { return d_name; }
The compiler prevents the data members of a class from being modified by one of its const member functions. Therefore a statement like
d_name[0] = toupper(static_cast<unsigned char>(d_name[0]));
results in a compiler error when added to the above function's definition.
Const
member functions are used to prevent inadvertent data
modification. Except for constructors and the destructor (cf. chapter
9) only const member functions can be used with (plain, references
or pointers to) const
objects.
Const objects are frequently encountered as const &
parameters of
functions. Inside such functions only the object's const members may be
used. Here is an example:
void displayMass(ostream &out, Person const &person) { out << person.name() << " weighs " << person.mass() << " kg.\n"; }
Since person
is defined as a Person const &
the function
displayMass
cannot call, e.g.,
person.setMass(75)
.
The const
member function attribute can be used to
overload member functions. When functions are
overloaded by their const
attribute the compiler uses the member
function matching most closely the const-qualification of the object:
When the object is a const object, only const member functions can be used.
When the object is not a const object, non-const member functions are used, unless only a const member function is available. In that case, the const member function is used.
The next example shows how (non) const member functions are selected:
    #include <iostream>
    using namespace std;

    class Members
    {
        public:
            Members();
            void member();
            void member() const;
    };

    Members::Members()
    {}
    void Members::member()
    {
        cout << "non const member\n";
    }
    void Members::member() const
    {
        cout << "const member\n";
    }

    int main()
    {
        Members const constObject;
        Members nonConstObject;

        constObject.member();
        nonConstObject.member();
    }
    /*
        Generated output:

        const member
        non const member
    */

As a general principle of design: member functions should always be given the
const
attribute, unless they actually modify the object's data.
Print
offers a facility to
print a string, using a configurable prefix and suffix. A partial class
interface could be:
class Print { public: Print(ostream &out); void print(std::string const &prefix, std::string const &text, std::string const &suffix) const; };
An interface like this would allow us to do things like:
Print print{ cout }; for (int idx = 0; idx != argc; ++idx) print.print("arg: ", argv[idx], "\n");
This works fine, but it could greatly be improved if we could pass
print
's invariant arguments to Print
's constructor. This would
simplify print
's prototype (only one argument would need to be passed
rather than three) and we could wrap the above code in a function
expecting a Print
object:
void allArgs(Print const &print, int argc, char **argv) { for (int idx = 0; idx != argc; ++idx) print.print(argv[idx]); }
The above is a fairly generic piece of code, at least it is with respect
to Print
. Since prefix
and suffix
don't change they can be passed
to the constructor which could be given the prototype:
Print(ostream &out, string const &prefix = "", string const &suffix = "");
Now allArgs
may be used as follows:
    Print p1{ cout, "arg: ", "\n" };        // prints to cout
    Print p2{ cerr, "err: --", "--\n" };    // prints to cerr

    allArgs(p1, argc, argv);                // prints to cout
    allArgs(p2, argc, argv);                // prints to cerr
But now we note that p1
and p2
are only used inside the
allArgs
function. Furthermore, as we can see from print
's prototype,
print
doesn't modify the internal data of the Print
object it is
using.
In such situations it is actually not necessary to define objects before they are used. Instead anonymous objects may be used. Anonymous objects can be used:
const
reference to
an object;
const &
parameters of
functions they are considered constant as they merely exist for passing the
information of (class type) objects to those functions. This way, they cannot
be modified, nor may their non-const member functions be used. Of course, a
const_cast
could be used to cast away the const reference's constness, but
that's considered bad practice on behalf of the function receiving the
anonymous objects. Also, any modification to the anonymous object is lost once
the function returns as the anonymous object ceases to exist after calling the
function. These anonymous objects used to initialize const references should
not be confused with passing anonymous objects to parameters defined as rvalue
references (section 3.3.2) which have a completely different purpose in
life. Rvalue references primarily exist to be `swallowed' by functions
receiving them. Thus, the information made available by rvalue references
outlives the rvalue reference objects which are also anonymous.
Anonymous objects are defined when a constructor is used without providing a name for the constructed object. Here is the corresponding example:
    allArgs(Print{ cout, "arg: ", "\n" }, argc, argv);      // prints to cout
    allArgs(Print{ cerr, "err: --", "--\n" }, argc, argv);  // prints to cerr
In this situation the Print
objects are constructed and immediately
passed as first arguments to the allArgs
functions, where they are
accessible as the function's print
parameter. While the allArgs
function is executing they can be used, but once the function has completed,
the anonymous Print
objects are no longer accessible.
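A compact, self-contained version of this design could look as follows. The implementation of `Print` is only a sketch of the interface described above, and the function `demo`, which collects the output in a string, is a hypothetical helper.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Print
{
    std::ostream &d_out;
    std::string d_prefix;
    std::string d_suffix;

    public:
        Print(std::ostream &out, std::string const &prefix = "",
                                 std::string const &suffix = "")
        :
            d_out(out),
            d_prefix(prefix),
            d_suffix(suffix)
        {}
        void print(std::string const &text) const
        {
            d_out << d_prefix << text << d_suffix;
        }
};

void allArgs(Print const &print, int argc, char const *const *argv)
{
    for (int idx = 0; idx != argc; ++idx)
        print.print(argv[idx]);
}

// Hypothetical helper: passes an anonymous Print object to allArgs
// and returns the text that was written.
std::string demo()
{
    char const *args[] = { "prog", "one" };
    std::ostringstream out;

    allArgs(Print{ out, "arg: ", "\n" }, 2, args);
    return out.str();
}
```

The anonymous `Print` object exists only while `allArgs` is executing; `demo` returns the two lines `arg: prog` and `arg: one`.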
const
references to objects. These objects are created just
before such a function is called, and are destroyed once the function has
terminated. C++'s grammar allows us to use anonymous objects in other
situations as well. Consider the following snippet of code:
    int main()
    {
        // initial statements
        Print{ "hello", "world" };  // assume a matching constructor
                                    // is available
        // later statements
    }
In this example an anonymous Print
object is constructed, and it is
immediately destroyed thereafter. So, following the `initial
statements' our Print
object is constructed. Then it is destroyed again
followed by the execution of the `later statements'.
The example illustrates that the standard lifetime rules do not apply to anonymous objects. Their lifetimes are limited to the statements, rather than to the end of the block in which they are defined.
Plain anonymous objects are useful in at least one situation. Assume we want to put markers in our code producing some output when the program's execution reaches a certain point. An object's constructor could be implemented so as to provide that marker functionality, allowing us to put markers in our code by defining anonymous, rather than named, objects.
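Such a marker could be sketched as follows. The class `Marker`, the global `g_log` and the function `work` are hypothetical; instead of producing output, the marker appends its messages to a log, which also shows when the anonymous object is destroyed.

```cpp
#include <string>
#include <vector>

std::vector<std::string> g_log;     // hypothetical event log

class Marker
{
    std::string d_label;

    public:
        Marker(std::string const &label)
        :
            d_label(label)
        {
            g_log.push_back("reached " + d_label);
        }
        ~Marker()
        {
            g_log.push_back("left " + d_label);
        }
};

void work()
{
    g_log.push_back("initial statements");
    Marker{ "checkpoint" };         // constructed and destroyed right here
    g_log.push_back("later statements");
}
```

After calling `work` the log contains, in this order: `initial statements`, `reached checkpoint`, `left checkpoint` and `later statements`.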
C++'s grammar contains another remarkable characteristic illustrated by the next example:
    int main(int argc, char **argv)
    {
        // assume a matching constructor is available:
        Print p{ cout, "", "" };                // 1
        allArgs(Print{ p }, argc, argv);        // 2
    }
In this example a non-anonymous object p
is constructed in statement
1, which is then used in statement 2 to initialize an anonymous
object. The anonymous object, in turn, is then used to initialize
allArgs
's const
reference parameter. This use of an existing object
to initialize another object is common practice, and is based on the existence
of a so-called
copy constructor. A copy constructor creates an object (as it is a
constructor) using an existing object's characteristics to initialize the data
of the object that's created. Copy constructors are discussed in depth in
chapter 9, but presently only the concept of a copy constructor is
used.
In the above example a copy constructor is used to initialize an anonymous
object. The anonymous object was then used to initialize a parameter of a
function. However, when we try to apply the same trick (i.e., using an
existing object to initialize an anonymous object) to a plain statement, the
compiler generates an error: the object p
can't be redefined (in statement
3, below):
    int main(int argc, char *argv[])
    {
        Print p{ "", "" };                      // 1
        allArgs(Print(p), argc, argv);          // 2
        Print(p);                               // 3 error!
    }
Does this mean that using an existing object to initialize an anonymous object that is used as function argument is OK, while an existing object can't be used to initialize an anonymous object in a plain statement?
The compiler actually provides us with the answer to this apparent contradiction. About statement 3 the compiler reports something like:
error: redeclaration of 'Print p'
which solves the problem when realizing that within a compound statement
objects and variables may be defined. Inside a compound statement, a
type name followed by a variable name
is the grammatical form of a
variable definition. Parentheses can be used to break priorities, but if
there are no priorities to break, they have no effect, and are simply ignored
by the compiler. In statement 3 the parentheses allowed us to get rid of the
blank that's required between a type name and the variable name, but to the
compiler we wrote
Print (p);
which is, since the parentheses are superfluous, equal to
Print p;
thus producing p
's redeclaration.
As a further example: when we define a variable using a built-in type (e.g.,
double
) using superfluous parentheses the compiler quietly removes
these parentheses for us:
double ((((a)))); // weird, but OK.
To summarize our findings about anonymous variables:
const
reference
parameters.
Person::name()
:
std::string const &Person::name() const { return d_name; }
This function is used to retrieve the name field of an object of the class
Person
. Example:
void showName(Person const &person) { cout << person.name(); }
To insert person
's name the following actions are performed:
Person::name()
is called.
person
's d_name
as a reference.
cout
.
name
field.
Sometimes a faster procedure immediately making the d_name
data member
available is preferred without ever actually calling a function
name
. This can be realized using inline
functions. An inline
function is a request to the compiler to insert the function's code at the
location of the function's call. This may speed up execution by avoiding a
function call, which typically comes with some (stack handling and parameter
passing) overhead. Note that inline
is a request to the compiler: the
compiler may decide to ignore it, and will probably ignore it when the
function's body contains much code. Good programming discipline suggests to be
aware of this, and to avoid inline
unless the function's body is fairly
small. More on this in section 7.8.2.
Person
this results in the following implementation of name
:
class Person { public: std::string const &name() const { return d_name; } };
Note that the inline code of the function name
now literally
occurs inline in the interface of the class Person
. The keyword const
is again added to the function's header.
Although members can be defined in-class (i.e., inside the class interface itself), it is considered bad practice for the following reasons:
Person::name
member is therefore preferably defined as follows:
class Person { public: std::string const &name() const; }; inline std::string const &Person::name() const { return d_name; }
If it is ever necessary to cancel Person::name
's inline
implementation, then this becomes its non-inline implementation:
    #include "person.ih"

    std::string const &Person::name() const
    {
        return d_name;
    }
Only the inline
keyword needs to be removed to obtain the correct
non-inline implementation.
Defining members inline has the following effect: whenever an inline-defined function is called, the compiler may insert the function's body at the location of the function call. It may be that the function itself is never actually called.
This construction, where the function code itself is inserted rather than a call to the function, is called an inline function. Note that using inline functions may result in multiple occurrences of the code of those functions in a program: one copy for each invocation of the inline function. This is probably OK if the function is a small one, and needs to be executed fast. It's not so desirable if the code of the function is extensive. The compiler knows this too, and handles the use of inline functions as a request rather than a command. If the compiler considers the function too long, it will not grant the request. Instead it will treat the function as a normal function.
Person::name
).
inline void Person::printname() const { cout << d_name << '\n'; }
This function contains only one statement. However, the statement takes a
relatively long time to execute. In general, functions which perform input and
output take lots of time. The effect of the conversion of this function
printname()
to inline would therefore lead to an insignificant gain in
execution time.
inline
is the topic of this section this is
considered the appropriate location for the advice.
There are situations where the compiler is confronted with so-called
vague linkage
(cf.
http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc/Vague-Linkage.html). These
situations occur when the compiler does not have a clear indication in what
object file to put its compiled code. This happens, e.g., with inline
functions, which are usually encountered in multiple source files. Since the
compiler may insert the code of ordinary inline functions in places where
these functions are called, vague linking is usually no problem with these
ordinary functions.
However, as explained in chapter 14, when using polymorphism
the compiler must ignore the inline
keyword and define so-called
virtual members as true (out-of-line)
functions. In this situation the vague linkage may cause problems, as the
compiler must decide in which object file(s) to put their code. Usually that's
not a big problem as long as the function is at least called once. But virtual
functions are special in the sense that they may very well never be explicitly
called. On some architectures (e.g., armel) the compiler may fail to compile
such inline virtual functions. This may result in missing symbols in programs
using them. To make matters slightly more complex: the problem may emerge when
shared libraries are used, but not when static libraries are used.
To avoid all of these problems virtual functions should never be defined inline, but they should always be defined out-of-line. I.e., they should be defined in source files.
    inline int value = 15;                      // OK

    class Demo
    {
        // static int s_value = 15;             // ERROR
        static int constexpr s_value = 15;      // OK

        static int s_inline;                    // OK: see below: the inline
                                                // definition follows the
                                                // class declaration
    };
    inline int Demo::s_inline = 20;             // OK
Local classes can be very useful in advanced applications involving inheritance or templates (cf. section 13.8). At this point in the C++ Annotations they have limited use, although their main features can be described. At the end of this section an example is provided.
enum
may be anonymous, exposing only the
enum
values.
Local
cannot directly access main
's argc
parameter.
    #include <iostream>
    #include <string>

    using namespace std;

    int main(int argc, char **argv)
    {
        static size_t staticValue = 0;

        class Local
        {
            int d_argc;             // non-static data members OK

            public:
                enum                // enums OK
                {
                    VALUE = 5
                };
                Local(int argc)     // constructors and member functions OK
                :                   // in-class implementation required
                    d_argc(argc)
                {
                                    // global data: accessible
                    cout << "Local constructor\n";
                                    // static function variables: accessible
                    staticValue += 5;
                }
                static void hello() // static member functions: OK
                {
                    cout << "hello world\n";
                }
        };
        Local::hello();             // call Local static member
        Local loc{ argc };          // define object of a local class.
    }
C++ also allows the declaration of data members which may be modified,
even by const member functions. Declarations of such data members start with
the keyword mutable
.
Mutable should be used for those data members that may be modified without logically changing the object, which might therefore still be considered a constant object.
An example of a situation where mutable
is appropriately used is found in
the implementation of a string class. Consider the std::string
's c_str
and data
members. The actual data returned by the two members are
identical, but c_str
must ensure that the returned string is terminated by
a 0-byte. As a string object has both a length and a capacity, an easy
way to implement c_str
is to ensure that the string's capacity exceeds its
length by at least one character. This invariant allows c_str
to be
implemented as follows:
char const *string::c_str() const { d_data[d_length] = 0; return d_data; }
This implementation logically does not modify the object's data as the
bytes beyond the object's initial (length) characters have undefined
values. But in order to use this implementation d_data
must be declared
mutable
:
mutable char *d_data;
The keyword mutable
is also useful in classes implementing, e.g.,
reference counting. Consider a class implementing reference counting for
strings. The object doing the reference counting might be a const object,
but the class may define a copy constructor. Since const objects can't be
modified, how would the copy constructor be able to increment the reference
count? Here the mutable
keyword may profitably be used: a reference count declared mutable can be
incremented and decremented, even though its object is a const object.
The keyword mutable
should be used sparingly. Modifications made by const
member functions should never logically alter the object, and it should be
easy to demonstrate this. As a rule of thumb: do not use mutable
unless there is a very clear reason (the object is logically not altered) for
violating this rule.
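As a small illustration, consider a class keeping track of how often its text is inspected. The class `Text` and its members are hypothetical; the counter is mere bookkeeping and doesn't belong to the object's logical state.

```cpp
#include <cstddef>
#include <string>

class Text
{
    std::string d_text;
    mutable std::size_t d_nLookups = 0;     // bookkeeping only

    public:
        Text(std::string const &text)
        :
            d_text(text)
        {}
        std::string const &text() const
        {
            ++d_nLookups;           // allowed: d_nLookups is mutable
            return d_text;
        }
        std::size_t nLookups() const
        {
            return d_nLookups;
        }
};
```

Without the keyword `mutable` the increment in the const member `text` would be rejected by the compiler.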
First, source files. With the exception of the occasional classless function, source files contain the code of member functions of classes. Basically, there are two approaches:
include
-directives and to think about the header files which are needed in
a particular source file.
The second alternative has the advantage of economy for the program developer: the header file of the class accumulates header files, so it tends to become more and more generally useful. It has the disadvantage that the compiler frequently has to process many header files which aren't actually used by the function to compile.
With computers running faster and faster (and compilers getting smarter and
smarter) I think the second alternative is to be preferred over the first
alternative. So, as a starting point source files of a particular class
MyClass
could be organized according to the following example:
    #include <myclass.h>

    int MyClass::aMemberFunction()
    {}
There is only one include
-directive. Note that the directive refers to
a header file in a directory mentioned in the INCLUDE
-file environment
variable. Local header files (using #include "myclass.h"
) could be used
too, but that tends to complicate the organization of the class header file
itself somewhat.
The organization of the header file itself requires some attention. Consider
the following example, in which two classes File
and String
are
used.
Assume the File
class has a member gets(String &destination)
, while
the class String
has a member function getLine(File &file)
. The
(partial) header file for the class String
is then:
    #ifndef STRING_H_
    #define STRING_H_

    #include <project/file.h>   // to know about a File

    class String
    {
        public:
            void getLine(File &file);
    };
    #endif
Unfortunately a similar setup is required for the class File
:
    #ifndef FILE_H_
    #define FILE_H_

    #include <project/string.h> // to know about a String

    class File
    {
        public:
            void gets(String &string);
    };
    #endif
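Now we have created a problem. The compiler, trying to compile the source file of the function File::gets, proceeds as follows:
- The header file project/file.h is opened to be read;
- FILE_H_ is defined;
- The header file project/string.h is opened to be read;
- STRING_H_ is defined;
- The header file project/file.h is (again) opened to be read;
- FILE_H_ is already defined, so the remainder of project/file.h is skipped;
- The class interface of the class String is now parsed;
- In the class interface a reference to a File object is encountered;
- As the class File hasn't been parsed yet, a File is still an undefined type, and the compiler quits with an error.
The solution is to use a forward class reference before the class interface, and to include the corresponding class header file below the class interface. For the class String: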

    #ifndef STRING_H_
    #define STRING_H_

    class File;                     // forward reference

    class String
    {
        public:
            void getLine(File &file);
    };

    #include <project/file.h>       // to know about a File

    #endif

A similar setup is required for the class File:

    #ifndef FILE_H_
    #define FILE_H_

    class String;                   // forward reference

    class File
    {
        public:
            void gets(String &string);
    };

    #include <project/string.h>     // to know about a String

    #endif

This works well in all situations where either references or pointers to other classes are involved and with (non-inline) member functions having class-type return values or parameters.
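
The forward-reference setup can be tried out in a single translation unit. The following is only a sketch: it collapses both header files into one file, and the members d_text, d_content, text, assign, asFile and content, as well as the function roundTrip, are hypothetical additions that merely give getLine and gets something to do.

```cpp
#include <cassert>
#include <string>

class File;                         // forward reference: File is incomplete here

class String
{
    std::string d_text;             // hypothetical data member

    public:
        void getLine(File &file);   // reference parameter: incomplete type suffices
        File asFile() const;        // class-type return value: fine in a declaration

        void assign(std::string const &text) { d_text = text; }
        std::string const &text() const      { return d_text; }
};

class File
{
    std::string d_content;          // hypothetical data member

    public:
        explicit File(std::string const &content) : d_content(content) {}

        void gets(String &destination);
        std::string const &content() const   { return d_content; }
};

// Both classes are complete from here on, so the non-inline members
// can be defined.

void String::getLine(File &file)
{
    file.gets(*this);
}

File String::asFile() const
{
    return File(d_text);
}

void File::gets(String &destination)
{
    destination.assign(d_content);
}

// Small self-check exercising both directions of the dependency.
bool roundTrip()
{
    File file("hello");
    String line;
    line.getLine(file);
    assert(line.text() == "hello");
    return line.asFile().content() == file.content();
}
```

Calling roundTrip() from main is enough to exercise the example. Note that the member function definitions appear only after both class interfaces are complete, which mirrors placing the include-directives below the class interface.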
This setup doesn't work with composition, nor with in-class inline member functions. Assume the class File has a composed data member of the class String. In that case, the class interface of the class File must include the header file of the class String before the class interface itself, because otherwise the compiler can't tell how big a File object is. A File object contains a String member, but the compiler can't determine the size of that String data member and thus, by implication, it can't determine the size of a File object.
In cases where classes contain composed objects (or are derived from other
classes, see chapter 13) the header files of the classes of the
composed objects must have been read before the class interface itself.
In such a case the class File might be defined as follows:

    #ifndef FILE_H_
    #define FILE_H_

    #include <project/string.h>     // to know about a String

    class File
    {
        String d_line;              // composition !

        public:
            void gets(String &string);
    };
    #endif

The class String can't declare a File object as a composed member: such a situation would again result in an undefined class while compiling the sources of these classes.
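
The difference between a composed member and a pointer member can be sketched in one file. The class Buffer and the members d_text, d_buffer, length, lineLength and hasBuffer are assumptions made for this illustration only.

```cpp
#include <cstddef>
#include <string>

class Buffer;                       // forward reference only: size still unknown

class String
{
    std::string d_text;             // hypothetical content

    public:
        std::size_t length() const  { return d_text.length(); }
};

class File
{
    String d_line;                  // composition: String must be complete here
    Buffer *d_buffer = nullptr;     // a pointer's size is known: Buffer may be incomplete
    // Buffer d_copy;               // would not compile: Buffer is incomplete

    public:
        std::size_t lineLength() const  { return d_line.length(); }
        bool hasBuffer() const          { return d_buffer != nullptr; }
};

// A File object stores its String member and its Buffer pointer, so its
// size can only be computed once sizeof(String) is known.
static_assert(sizeof(File) >= sizeof(String) + sizeof(Buffer *),
              "a File holds a String and a Buffer pointer");
```

Uncommenting the d_copy line should make the compiler complain about an incomplete type, which is the situation described above.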
All remaining header files (appearing below the class interface itself) are required only because they are used by the class's source files.
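This approach allows us to introduce yet another refinement:
- The information in the class header file is protected by the #ifndef ... #endif construction introduced in section 2.5.10.
- The class header file is made available in a well-known location, such as a directory on the INCLUDE path.
- To implement member functions the class header file is usually not enough: additional header files (like the string header file) are needed as well. The class header file itself as well as these additional header files should be included in a separate internal header file (for which the extension .ih (`internal header') is suggested).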
The .ih file should be defined in the same directory as the source files of the class. It has the following characteristics:
- There is no need for a protective #ifndef .. #endif shield, as the header file is never included by other header files.
- The standard .h header file defining the class interface is included.
- The header files of classes that are merely forward-declared in the standard .h header file are included.
An example of such a header file organization:
- The class header file, e.g., /usr/local/include/myheaders/file.h:

    #ifndef FILE_H_
    #define FILE_H_

    #include <fstream>      // for composed 'ifstream'

    class Buffer;           // forward reference

    class File              // class interface
    {
        std::ifstream d_instream;

        public:
            void gets(Buffer &buffer);
    };
    #endif

- The internal header file, e.g., ~/myproject/file/file.ih, in the directory where all sources of the class File are stored:

    #include <myheaders/file.h>     // make the class File known

    #include <buffer.h>             // make Buffer known to File
    #include <string>               // used by members of the class
    #include <sys/stat.h>           // File.

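
A rough single-file sketch of this layering follows. The file boundaries are only indicated by comments so that the example compiles as one translation unit. The member d_name, the constructor, the class Buffer with its members store and data, and the function viaBuffer are all assumptions made for the illustration (the File shown above composes a std::ifstream instead).

```cpp
#include <cassert>
#include <string>

// ---- file.h: the class header, with as few dependencies as possible ----
#ifndef FILE_H_
#define FILE_H_

class Buffer;                           // forward reference suffices here

class File
{
    std::string d_name;                 // composed member: <string> is required

    public:
        explicit File(std::string const &name);
        void gets(Buffer &buffer);
};
#endif

// ---- buffer.h: hypothetical header of the forward-declared class ----
#ifndef BUFFER_H_
#define BUFFER_H_

class Buffer
{
    std::string d_data;

    public:
        void store(std::string const &data) { d_data = data; }
        std::string const &data() const     { return d_data; }
};
#endif

// ---- file.ih: would merely include file.h, buffer.h and the other ----
// ---- headers needed by File's sources; it has no include guard    ----

// ---- File's source files: each would start with #include "file.ih" ----
File::File(std::string const &name)
:
    d_name(name)
{}

void File::gets(Buffer &buffer)
{
    buffer.store(d_name);               // Buffer must be complete here
}

// Hypothetical helper showing that the pieces fit together.
std::string viaBuffer(std::string const &name)
{
    File file(name);
    Buffer buffer;
    file.gets(buffer);
    assert(buffer.data() == name);
    return buffer.data();
}
```

The point of the split is that users of File only pay for what file.h includes, while the complete Buffer interface is pulled in only where File's members are implemented.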
No using directive should be specified in header files if they are to be used as general header files declaring classes or other entities from a library. When the using directive is used in a header file then users of such a header file are forced to accept and use the declarations in all code that includes the particular header file.
For example, if in a namespace special an object Inserter cout is declared, then special::cout is of course a different object than std::cout. Now, if a class Flaw is constructed, in which the constructor expects a reference to a special::Inserter, then the class should be constructed as follows:

    namespace special
    {
        class Inserter;             // forward declaration inside its namespace
    }

    class Flaw
    {
        public:
            Flaw(special::Inserter &ins);
    };

Now the person designing the class Flaw may be in a lazy mood, and might get bored by continuously having to prefix special:: before every entity from that namespace. So, the following construction is used:

    using namespace special;

    class Inserter;
    class Flaw
    {
        public:
            Flaw(Inserter &ins);
    };

This works fine, up to the point where somebody wants to include flaw.h in other source files: because of the using directive, this latter person is now by implication also using namespace special, which could produce unwanted or unexpected effects:

    #include <flaw.h>
    #include <iostream>
    using std::cout;

    int main()
    {
        cout << "starting\n";       // won't compile
    }

The compiler is confronted with two interpretations for cout: first, because of the using directive in the flaw.h header file, it considers cout a special::Inserter; then, because of the using declaration in the user program, it considers cout a std::ostream. Consequently, the compiler reports an error.
As a rule of thumb, header files intended for general use should not contain using directives or using declarations. This rule does not hold true for header files which are only included by the sources of a class: here the programmer is free to apply as many of them as desired, as they never reach other sources.
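
A sketch of the well-behaved variant, again collapsed into a single file: the part playing the role of the general header contains no using directive, so the user's own using declaration stays unambiguous. The empty class body, the data member d_ins, the member inserter and the function startUp are assumptions made for the example.

```cpp
#include <iostream>
#include <type_traits>

// ---- what a general-purpose flaw.h might contain: no using directive ----
namespace special
{
    class Inserter
    {};

    Inserter cout;                      // special::cout, unrelated to std::cout
}

class Flaw
{
    special::Inserter &d_ins;

    public:
        explicit Flaw(special::Inserter &ins) : d_ins(ins) {}
        special::Inserter &inserter() const { return d_ins; }
};

// ---- user code ----
using std::cout;                        // no clash: special::cout needs its prefix

static_assert(std::is_same<decltype(cout), std::ostream>::value,
              "unqualified cout denotes std::cout");
static_assert(std::is_same<decltype(special::cout), special::Inserter>::value,
              "special::cout is a different object of a different type");

bool startUp()
{
    cout << "starting\n";               // unambiguously std::cout
    Flaw flaw(special::cout);           // the other object, named explicitly
    return &flaw.inserter() == &special::cout;
}
```

Adding using namespace special; to the first part should bring back the ambiguity described above: the unqualified cout could then denote either object.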
Traditionally, to use a function like printf in main the preprocessor directive #include <stdio.h> had to be specified.
This method still works in C++, but gradually proved to be inefficient. One reason is that header files have to be processed again for every source file that includes them. The drawback quickly becomes apparent once classes are used, as the compiler repeatedly has to process the class's header file for each source file using that class. Usually it's not just that one header file: header files tend to include other header files, resulting in an avalanche of header files that must be processed again and again for every single source file. If a typical source file includes h header files, and s source files must be compiled, then the compiler must process s * h header files, which is a significant compilation load.
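
To get a feeling for the s * h load, the product can be written down as a compile-time computation. The function name headerLoad and the project size used below are made up for the illustration.

```cpp
// Number of header files the compiler reads when each of `sources` source
// files includes `headersPerSource` header files (the s * h of the text).
constexpr unsigned long headerLoad(unsigned long sources,
                                   unsigned long headersPerSource)
{
    return sources * headersPerSource;
}

// Made-up project: 200 source files, each pulling in 50 header files.
static_assert(headerLoad(200, 50) == 10000,
              "200 * 50 header files are processed");
```

So even a moderately sized project may have the compiler read thousands of header files, most of them many times over.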
Precompiled headers offered an initial attempt to reduce this excessive workload. But precompiled headers have problems of their own: they're enormously big (a header file of less than 100 bytes can easily result in a precompiled header of 25 MB or more), and they're kind of fragile: simply recompiling a header whenever it is younger than its precompiled form may quickly result in much overhead, e.g., if merely a comment was added to the header.
Another common defense mechanism encountered in traditional headers is the use of include guards, ensuring that a header file is processed once if it is included by multiple other header files. Such include guards are macros, and were extensively discussed in section 7.11. Include guards work, but completely depend on the uniqueness of the guard-identifier, which is usually a long name, written in capitals using several underscores to increase the probability of their uniqueness.
By offering modules the C++20 standard provides solutions to the problems mentioned above. At the time of this writing the Gnu g++ compiler (still) experiences problems with modules. Once these problems have been solved a separate chapter about modules will definitely be added to the C++ Annotations.
Since C++11 the sizeof operator can be applied to data members of classes without the need to specify an object as well. Consider:

    class Data
    {
        std::string d_name;
        ...
    };

To obtain the size of Data's d_name member the following expression can be used:

    sizeof(Data::d_name);

However, note that the compiler observes data protection here as well. sizeof(Data::d_name) can only be used where d_name is accessible, i.e., by Data's member functions and friends.