Chapter 11: More Operator Overloading

Having covered the overloaded assignment operator in chapter 9, and having shown several examples of other overloaded operators as well (i.e., the insertion and extraction operators in chapters 3 and 6), we now take a look at operator overloading in general.

11.1: Overloading `operator[]()'

As our next example of operator overloading, we introduce a class IntArray encapsulating an array of ints. The array's elements can be indexed using the standard array index operator [], but in addition the index is checked against the array's bounds. Note, however, that index operators do not normally check their indices: since it's good practice to avoid surprises, overloaded index operators should normally not perform array bound checks either. The index operator (operator[]) is interesting because it can be used in expressions as both an lvalue and an rvalue.

Here is an example illustrating the basic use of the class:

    int main()
    {
        IntArray x{ 20 };               // 20 ints

        for (int idx = 0; idx < 20; ++idx)
            x[idx] = 2 * idx;                   // assign the elements

        for (int idx = 0; idx <= 20; ++idx)     // result: boundary overflow
            cout << "At index " << idx << ": value is " << x[idx] << '\n';
    }
First, the constructor is used to create an object containing 20 ints. The elements stored in the object can be assigned or retrieved. The first for-loop assigns values to the elements using the index operator, the second for-loop retrieves the values but also results in a run-time error once the non-existing value x[20] is addressed. The IntArray class interface is:
    #include <cstddef>

    class IntArray
    {
        size_t d_size;
        int     *d_data;

         public:
            IntArray(size_t size = 1);
            IntArray(IntArray const &other);
            ~IntArray();
            IntArray &operator=(IntArray const &other);

                                                // overloaded index operators:
            int &operator[](size_t index);              // first
            int const &operator[](size_t index) const;  // second

            void swap(IntArray &other);         // trivial

        private:
            void boundary(size_t index) const;
            int &operatorIndex(size_t index) const;
    };
This class has the following characteristics:

Now, the implementation of the members (omitting the trivial implementation of swap, cf. chapter 9) is:

    #include "intarray.ih"

    IntArray::IntArray(size_t size)
    :
        d_size(size)
    {
        if (d_size < 1)
            throw "IntArray: size of array must be >= 1"s;

        d_data = new int[d_size];
    }

    IntArray::IntArray(IntArray const &other)
    :
        d_size(other.d_size),
        d_data(new int[d_size])
    {
        memcpy(d_data, other.d_data, d_size * sizeof(int));
    }

    IntArray::~IntArray()
    {
        delete[] d_data;
    }

    IntArray &IntArray::operator=(IntArray const &other)
    {
        IntArray tmp(other);
        swap(tmp);
        return *this;
    }

    int &IntArray::operatorIndex(size_t index) const
    {
        boundary(index);
        return d_data[index];
    }

    int &IntArray::operator[](size_t index)
    {
        return operatorIndex(index);
    }

    int const &IntArray::operator[](size_t index) const
    {
        return operatorIndex(index);
    }

    void IntArray::swap(IntArray &other)
    {
        // swaps the d_size and d_data data members
        // of *this and other
    }


    void IntArray::boundary(size_t index) const
    {
        if (index < d_size)
            return;
        ostringstream out;
        out  << "IntArray: boundary overflow, index = " <<
                index << ", should be < " << d_size << '\n';
        throw out.str();
    }
Note how the operator[] members were implemented. Non-const members may call const member functions, and the implementation of the const member function is identical to that of the non-const member function. Both operator[] members could therefore be defined inline using an auxiliary function int &operatorIndex(size_t index) const. A const member function may return a non-const reference (or pointer) referring to one of the data members of its object. Of course, this is a potentially dangerous backdoor that may break data hiding. However, operatorIndex is a private member and the members in the public interface prevent this breach, so the two public operator[] members may safely call the same int &operatorIndex() const member.
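
The interplay of the two index operators can be tried out with a condensed variant of the class. The following is a sketch only, under some assumptions: the name CheckedArray is hypothetical, the data are stored in a std::vector instead of a raw pointer (so no copy constructor, destructor or assignment operator is needed), and a std::out_of_range exception is thrown instead of a std::string.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, condensed variant of IntArray (not the class shown above):
// std::vector manages the memory, so the copy members can be omitted.
class CheckedArray
{
    std::vector<int> d_data;

    public:
        explicit CheckedArray(std::size_t size)
        :
            d_data(size)                // `size' ints, initialized to 0
        {}

        // selected for modifiable objects: the element may be assigned to
        int &operator[](std::size_t index)
        {
            boundary(index);
            return d_data[index];
        }

        // selected for const objects: the element can only be read
        int const &operator[](std::size_t index) const
        {
            boundary(index);
            return d_data[index];
        }

    private:
        void boundary(std::size_t index) const
        {
            if (index >= d_data.size())
                throw std::out_of_range{ "CheckedArray: index " +
                        std::to_string(index) + " should be < " +
                        std::to_string(d_data.size()) };
        }
};

// Sums the first `count' elements through the const index operator.
int sum(CheckedArray const &array, std::size_t count)
{
    int total = 0;
    for (std::size_t idx = 0; idx != count; ++idx)
        total += array[idx];
    return total;
}
```

Assigning through x[idx] uses the first operator; sum, receiving a CheckedArray const &, uses the second one. Requesting more elements than are available causes boundary to throw.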

11.2: Overloading insertion and extraction operators

Classes may be adapted in such a way that their objects may be inserted into and extracted from, respectively, a std::ostream and std::istream.

The class std::ostream defines insertion operators for primitive types, such as int, char *, etc. In this section we learn how to extend the existing functionality of classes (in particular std::istream and std::ostream) in such a way that they can also be used in combination with classes developed much later in history.

In particular we will show how the insertion operator can be overloaded allowing the insertion of any type of object, say Person (see chapter 9), into an ostream. Having defined such an overloaded operator we're able to use the following code:

    Person kr("Kernighan and Ritchie", "unknown", "unknown");

    cout << "Name, address and phone number of Person kr:\n" << kr << '\n';

The statement cout << kr uses operator<<. This operator has two operands: an ostream & and a Person const &. The required action is defined in an overloaded free function operator<< expecting two arguments:

                                // declared in `person.h'
    std::ostream &operator<<(std::ostream &out, Person const &person);

                                // defined in some source file
    ostream &operator<<(ostream &out, Person const &person)
    {
        return
            out <<
                "Name:    " << person.name() << ", "
                "Address: " << person.address() << ", "
                "Phone:   " << person.phone();
    }

The free function operator<< has the following noteworthy characteristics:

In order to overload the extraction operator for, e.g., the Person class, members are needed modifying the class's private data members. Such modifiers are normally offered by the class interface. For the Person class these members could be the following:

    void setName(char const *name);
    void setAddress(char const *address);
    void setPhone(char const *phone);

These members may easily be implemented: the memory pointed to by the corresponding data member must be deleted, and the data member should point to a copy of the text pointed to by the parameter. E.g.,

    void Person::setAddress(char const *address)
    {
        delete[] d_address;
        d_address = strdupnew(address);
    }

A more elaborate function should check the reasonableness of the new address (address also shouldn't be a 0-pointer). This, however, is not further pursued here. Instead, let's have a look at the final operator>>. A simple implementation is:

    istream &operator>>(istream &in, Person &person)
    {
        string name;
        string address;
        string phone;

        if (in >> name >> address >> phone)    // extract three strings
        {
            person.setName(name.c_str());
            person.setAddress(address.c_str());
            person.setPhone(phone.c_str());
        }
        return in;
    }

Note the stepwise approach that is followed here. First, the required information is extracted using available extraction operators. Then, if that succeeds, modifiers are used to modify the data members of the object to be extracted. Finally, the stream object itself is returned as a reference.
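
Since the Person class itself is not repeated in this section, the stepwise approach may be illustrated by a self-contained sketch. It rests on assumptions: Contact is a hypothetical stand-in for Person having public std::string members, roundTrip is a hypothetical helper, and the fields are separated by blanks, so the fields themselves may not contain white space.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Person: public std::string members, so no
// modifiers or memory management are needed in this sketch.
struct Contact
{
    std::string name;
    std::string address;
    std::string phone;
};

std::ostream &operator<<(std::ostream &out, Contact const &contact)
{
    return out << contact.name << ' ' << contact.address << ' ' <<
                  contact.phone;
}

std::istream &operator>>(std::istream &in, Contact &contact)
{
    Contact tmp;
                                        // first extract into a temporary
    if (in >> tmp.name >> tmp.address >> tmp.phone)
        contact = tmp;                  // only modify contact on success
    return in;
}

// Inserts `contact' into a string stream and extracts it again.
Contact roundTrip(Contact const &contact)
{
    std::stringstream stream;
    stream << contact;

    Contact result;
    stream >> result;
    return result;
}
```

As both operators return their stream they can be chained, and the extraction operator's return value can directly be used as a condition. When an extraction fails the object receiving the data is left unaltered.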

11.3: Conversion operators

A class may be constructed around a built-in type. E.g., a class String, constructed around the char * type. Such a class may define all kinds of operations, like assignments. Take a look at the following class interface, designed after the string class:
    class String
    {
        char *d_string;

        public:
            String();
            String(char const *arg);
            ~String();
            String(String const &other);
            String &operator=(String const &rvalue);
            String &operator=(char const *rvalue);
    };

Objects of this class can be initialized from a char const *, and also from a String itself. There is an overloaded assignment operator, allowing the assignment from a String object and from a char const * (note that the assignment from a char const * also allows the null-pointer: an assignment like stringObject = 0 is perfectly in order).

Usually, in classes that are less directly linked to their data than this String class, there will be an accessor member function, like a member char const *String::c_str() const. However, the need to use this latter member doesn't appeal to our intuition when an array of String objects is defined by, e.g., a class StringArray. If this latter class provides the operator[] to access individual String members, it would most likely offer at least the following class interface:

    class StringArray
    {
        String *d_store;
        size_t d_n;

        public:
            StringArray(size_t size);
            StringArray(StringArray const &other);
            StringArray &operator=(StringArray const &rvalue);
            ~StringArray();

            String &operator[](size_t index);
    };

This interface allows us to assign String elements to each other:

    StringArray sa{ 10 };

    sa[4] = sa[3];  // String to String assignment

But it is also possible to assign a char const * to an element of sa:

    sa[3] = "hello world";

Here, the following steps are taken:

Now we try to do it the other way around: how to access the char const * that's stored in sa[3]? The following attempt fails:
    char const *cp = sa[3];

It fails since we would need an overloaded assignment operator for the 'class' char const *. Unfortunately, there isn't such a class, and therefore we can't build that overloaded assignment operator (see also section 11.14). Furthermore, casting won't work as the compiler doesn't know how to cast a String to a char const *. How to proceed?

One possibility is to define an accessor member function c_str():

    char const *cp = sa[3].c_str();

This compiles fine but looks clumsy. A far better approach would be to use a conversion operator.

A conversion operator is a kind of overloaded operator, but this time the overloading is used to cast the object to another type. In class interfaces, the general form of a conversion operator is:

    operator <type>() const;

Conversion operators usually are const member functions: they are automatically called when their objects are used as rvalues in expressions in which a value of the conversion operator's type is expected. Using a conversion operator a String object may be interpreted as a char const * rvalue, allowing us to perform the above assignment.

Conversion operators are somewhat dangerous. The conversion is automatically performed by the compiler and unless its use is perfectly transparent it may confuse those who read code in which conversion operators are used. E.g., novice C++ programmers are frequently confused by statements like `if (cin) ...'.

As a rule of thumb: classes should define at most one conversion operator. Multiple conversion operators may be defined but frequently result in ambiguous code. E.g., if a class defines operator bool() const and operator int() const then passing an object of this class to a function expecting a size_t argument results in an ambiguity as an int and a bool may both be used to initialize a size_t.

In the current example, the class String could define the following conversion operator for char const *:

    String::operator char const *() const
    {
        return d_string;
    }
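
The effect of such a conversion operator can be seen in a small self-contained sketch. It is an illustration under assumptions: Word is a hypothetical stand-in for String that keeps its text in a std::string (avoiding memory management), and length is a hypothetical function expecting a char const *.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for String: the text is kept in a std::string so
// the sketch needs no memory management of its own.
class Word
{
    std::string d_text;

    public:
        Word(char const *text)
        :
            d_text(text)
        {}

        operator char const *() const   // the conversion operator
        {
            return d_text.c_str();
        }
};

// Expects a char const *: a Word argument is implicitly converted.
std::size_t length(char const *ntbs)
{
    return std::strlen(ntbs);
}
```

Having defined Word word{ "hello world" }, both char const *cp = word and length(word) compile: in each case the compiler calls the conversion operator to obtain the required char const *.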

Notes:

Conversion operators are also used when objects of classes defining conversion operators are inserted into streams. Realize that the right hand sides of insertion operators are function parameters that are initialized by the operator's right hand side arguments. The rules are simple: In the following example an object of class Insertable is directly inserted; an object of the class Convertor uses the conversion operator; an object of the class Error cannot be inserted since it does not define an insertion operator and the type returned by its conversion operator cannot be inserted either (Text does define an operator int() const, but the fact that a Text itself cannot be inserted causes the error):
    #include <iostream>
    #include <string>
    using namespace std;

    struct Insertable
    {
        operator int() const
        {
            cout << "op int()\n";
            return 0;
        }
    };
    ostream &operator<<(ostream &out, Insertable const &ins)
    {
        return out << "insertion operator";
    }
    struct Convertor
    {
        operator Insertable() const
        {
            return Insertable();
        }
    };
    struct Text
    {
        operator int() const
        {
            return 1;
        }
    };
    struct Error
    {
        operator Text() const
        {
            return Text{};
        }
    };

    int main()
    {
        Insertable insertable;
        cout << insertable << '\n';
        Convertor convertor;
        cout << convertor << '\n';
        Error error;
        cout << error << '\n';
    }

Some final remarks regarding conversion operators:

11.4: The keyword `explicit'

Conversions are not only performed by conversion operators, but also by constructors accepting one argument (i.e., constructors having one or multiple parameters, specifying default argument values for all parameters or for all but the first parameter).

Assume a data base class DataBase is defined in which Person objects can be stored. It defines a Person *d_data pointer, and so it offers a copy constructor and an overloaded assignment operator.

In addition to the copy constructor DataBase offers a default constructor and several additional constructors:

The above constructors all are perfectly reasonable. But they also allow the compiler to compile the following code without producing any warning at all:

    DataBase db;
    DataBase db2;
    Person person;

    db2 = db;           // 1
    db2 = person;       // 2
    db2 = 10;           // 3
    db2 = cin;          // 4

Statement 1 is perfectly reasonable: db is used to redefine db2. Statement 2 might be understandable since we designed DataBase to contain Person objects. Nevertheless, we might question the logic that's used here as a Person is not some kind of DataBase. The logic becomes even more opaque when looking at statements 3 and 4. Statement 3 in effect waits for the data of 10 persons to appear at the standard input stream. Nothing like that is suggested by db2 = 10.

Implicit promotions are used with statements 2 through 4. Since constructors accepting, respectively a Person, an istream, and a size_t and an istream have been defined for DataBase and since the assignment operator expects a DataBase right-hand side (rhs) argument the compiler first converts the rhs arguments to anonymous DataBase objects which are then assigned to db2.

It is good practice to prevent implicit promotions by using the explicit modifier when declaring a constructor. Constructors using the explicit modifier can only be used to construct objects explicitly. Statements 2-4 would not have compiled if the constructors that can be called with one argument had been declared using explicit. E.g.,

    explicit DataBase(Person const &person);
    explicit DataBase(size_t count, std::istream &in);

Having declared all constructors accepting one argument as explicit the above assignments would have required the explicit specification of the appropriate constructors, thus clarifying the programmer's intent:

    DataBase db;
    DataBase db2;
    Person person;

    db2 = db;                   // 1
    db2 = DataBase{ person };   // 2
    db2 = DataBase{ 10 };       // 3
    db2 = DataBase{ cin };      // 4

As a rule of thumb prefix one argument constructors with the explicit keyword unless implicit promotions are perfectly natural (string's char const * accepting constructor is a case in point).
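
The difference between the two kinds of constructors can be verified at compile time. The following sketch is an illustration only: the classes Implicit and Explicit and the two count functions are hypothetical, and the type traits std::is_convertible and std::is_constructible are used to report whether an implicit conversion, respectively an explicit construction, is available.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical classes, only used to contrast the two kinds of constructors.
class Implicit
{
    std::size_t d_count;

    public:
        Implicit(std::size_t count)             // allows promotions
        :
            d_count(count)
        {}
        std::size_t count() const
        {
            return d_count;
        }
};

class Explicit
{
    std::size_t d_count;

    public:
        explicit Explicit(std::size_t count)    // no promotions
        :
            d_count(count)
        {}
        std::size_t count() const
        {
            return d_count;
        }
};

// is_convertible: is an implicit conversion available?
static_assert(std::is_convertible<std::size_t, Implicit>::value,
              "a size_t can be promoted to an Implicit");
static_assert(!std::is_convertible<std::size_t, Explicit>::value,
              "a size_t cannot be promoted to an Explicit");
// is_constructible: explicit construction remains possible
static_assert(std::is_constructible<Explicit, std::size_t>::value,
              "an Explicit can explicitly be constructed from a size_t");

std::size_t implicitCount(Implicit const &object)
{
    return object.count();
}

std::size_t explicitCount(Explicit const &object)
{
    return object.count();
}
```

Here implicitCount(10) compiles as the argument is promoted to an anonymous Implicit object, whereas explicitCount requires the caller to write explicitCount(Explicit{ 10 }).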

11.4.1: Explicit conversion operators

In addition to explicit constructors, C++ supports explicit conversion operators.

For example, a class might define operator bool() const returning true if an object of that class is in a usable state and false if not. Since the type bool is an arithmetic type this could result in unexpected or unintended behavior. Consider:

    void process(bool value);

    class StreamHandler
    {
        public:
            operator bool() const;      // true: object is fit for use
            ...
    };

    int fun(StreamHandler &sh)
    {
        int sx;

        if (sh)                         // intended use of operator bool()
            ... use sh as usual; also use `sx'

        process(sh);                    // typo: `sx' was intended
    }

In this example process unintentionally receives the value returned by operator bool: sh is implicitly converted to bool, and so the typo is not detected by the compiler.

When defining explicit conversion operators implicit conversions like the one shown in the example are prevented. Such conversion operators can only be used in situations where the converted type is explicitly required (as in the condition clauses of if or while statements), or is explicitly requested using a static_cast. To declare an explicit bool conversion operator in class StreamHandler's interface replace the above declaration by:

        explicit operator bool() const;

Since the C++11 standard istreams define an explicit operator bool() const. As a consequence:

    while (cin.get(ch)) // compiles OK
        ;

    bool fun1()
    {
        return cin;     // 'bool = istream' won't compile as
    }                   // istream defines 'explicit operator bool'

    bool fun2()
    {
        return static_cast<bool>(cin); // compiles OK
    }
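
The same behavior can be obtained for our own classes. The sketch below is an illustration under assumptions: Handle is a hypothetical class whose state is simply passed to its constructor, and describe and usable are hypothetical functions showing the two situations in which an explicit operator bool can be used.

```cpp
#include <string>
#include <type_traits>

// Hypothetical class offering an explicit bool conversion operator.
class Handle
{
    bool d_ok;

    public:
        explicit Handle(bool ok)
        :
            d_ok(ok)
        {}

        explicit operator bool() const  // true: object is fit for use
        {
            return d_ok;
        }
};

std::string describe(Handle const &handle)
{
    if (handle)                         // condition clause: allowed
        return "usable";
    return "not usable";
}

bool usable(Handle const &handle)
{
    return static_cast<bool>(handle);   // explicitly requested: allowed
}

// Implicit conversions are not available anymore:
static_assert(!std::is_convertible<Handle, bool>::value,
              "no implicit conversion to bool");
static_assert(!std::is_convertible<Handle, int>::value,
              "no implicit conversion to int");
```

A function expecting a bool or an int argument can therefore no longer accidentally receive a Handle object.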

11.5: Overloading increment and decrement operators

Overloading the increment operator (operator++) and decrement operator ( operator--) introduces a small problem: there are two versions of each operator, as they may be used as postfix operator (e.g., x++) or as prefix operator (e.g., ++x).

Used as postfix operator, the object's original value is returned as an rvalue (a temporary const object) and the post-incremented variable itself disappears from view. Used as prefix operator, the variable is incremented and is then returned as an lvalue, so it may be altered again by modifying the prefix operator's return value. Although these characteristics are not required when the operator is overloaded, it is strongly advised to implement them in any overloaded increment or decrement operator.

Suppose we define a wrapper class around the size_t value type. Such a class could offer the following (partially shown) interface:

    class Unsigned
    {
        size_t d_value;

        public:
            Unsigned();
            explicit Unsigned(size_t init);

            Unsigned &operator++();
    };

The class's last member declares the prefix overloaded increment operator. The returned lvalue is Unsigned &. The member is easily implemented:

    Unsigned &Unsigned::operator++()
    {
        ++d_value;
        return *this;
    }

To define the postfix operator, an overloaded version of the operator is defined, expecting a (dummy) int argument. This might be considered a kludge, or an acceptable application of function overloading. Whatever your opinion in this matter, the following can be concluded:

The postfix increment operator is declared as follows in the class Unsigned's interface:
    Unsigned operator++(int);

It may be implemented as follows:

    Unsigned Unsigned::operator++(int)
    {
        Unsigned tmp{ *this };
        ++d_value;
        return tmp;
    }

Note that the operator's parameter is not used. It is only part of the implementation to disambiguate the prefix- and postfix operators in implementations and declarations.
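
Both operators can be combined in one compilable sketch. It is an illustration only: Counter is a hypothetical condensed version of Unsigned, and its value member is an assumed accessor, added so the results can be inspected.

```cpp
#include <cstddef>

// Hypothetical condensed version of Unsigned; value() is an assumed
// accessor which is not part of the interface shown above.
class Counter
{
    std::size_t d_value;

    public:
        explicit Counter(std::size_t init)
        :
            d_value(init)
        {}

        std::size_t value() const
        {
            return d_value;
        }

        Counter &operator++()           // prefix: returns the object itself
        {
            ++d_value;
            return *this;
        }

        Counter operator++(int)         // postfix: returns the old value
        {
            Counter tmp{ *this };
            ++d_value;
            return tmp;
        }
};
```

Starting from Counter c{ 13 }, the expression c++ returns a Counter holding 13 while c itself then holds 14. The expression ++c returns a reference to c, so ++(++c) increments c twice.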

In the above example the statement incrementing the current object offers the nothrow guarantee as it only involves an operation on a primitive type. If the initial copy construction throws then the original object is not modified, if the return statement throws the object has safely been modified. But incrementing an object could itself throw exceptions. How to implement the increment operators in that case? Once again, swap is our friend. Here are the pre- and postfix operators offering the strong guarantee when the member increment performing the increment operation may throw:

    Unsigned &Unsigned::operator++()
    {
        Unsigned tmp{ *this };
        tmp.increment();
        swap(tmp);
        return *this;
    }
    Unsigned Unsigned::operator++(int)
    {
        Unsigned tmp{ *this };
        tmp.increment();
        swap(tmp);
        return tmp;
    }

Both operators first create copies of the current objects. These copies are incremented and then swapped with the current objects. If increment throws the current objects remain unaltered; the swap operations ensure that the correct objects are returned (the incremented object for the prefix operator, the original object for the postfix operator) and that the current objects become the incremented objects.
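
The strong guarantee can be observed using a class whose increment member actually throws. The following sketch is an illustration under assumptions: Bounded is a hypothetical class, its increment member throws a std::overflow_error when the largest size_t value would be exceeded, and value is an assumed accessor.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <utility>

// Hypothetical class: increment() throws instead of wrapping around.
class Bounded
{
    std::size_t d_value;

    public:
        explicit Bounded(std::size_t init)
        :
            d_value(init)
        {}

        std::size_t value() const
        {
            return d_value;
        }

        void swap(Bounded &other) noexcept
        {
            std::swap(d_value, other.d_value);
        }

        Bounded &operator++()
        {
            Bounded tmp{ *this };
            tmp.increment();            // may throw: *this not yet modified
            swap(tmp);                  // commit, cannot throw
            return *this;
        }

        Bounded operator++(int)
        {
            Bounded tmp{ *this };
            tmp.increment();            // may throw: *this not yet modified
            swap(tmp);                  // tmp now holds the original value
            return tmp;
        }

    private:
        void increment()
        {
            if (d_value == std::numeric_limits<std::size_t>::max())
                throw std::overflow_error{ "Bounded: overflow" };
            ++d_value;
        }
};
```

When a Bounded object holding the largest size_t value is incremented the exception leaves the operator before swap is called, and so the object still holds its original value.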

When calling the increment or decrement operator using its full member function name then any int argument passed to the function results in calling the postfix operator. Omitting the argument results in calling the prefix operator. Example:

    Unsigned uns{ 13 };

    uns.operator++();     // prefix-incrementing uns
    uns.operator++(0);    // postfix-incrementing uns

The prefix and postfix increment operators were deprecated for variables of type bool, and since the C++17 standard they can no longer be applied to bool variables (the decrement operators could never be applied to bool variables). In situations where a postfix increment operator could be useful the std::exchange (cf. section 19.1.11) should be used.

11.6: Overloading binary operators

In various classes overloading binary operators (like operator+) can be a very natural extension of the class's functionality. For example, the std::string class has various overloaded operator+ members.

Most binary operators come in two flavors: the plain binary operator (like the + operator) and the compound binary assignment operator (like operator+=). Whereas the plain binary operators return values, the compound binary assignment operators usually return references to the objects for which the operators were called. For example, with std::string objects the following code (annotations below the example) may be used:

    std::string s1;
    std::string s2;
    std::string s3;

    s1 = s2 += s3;                  // 1
    (s2 += s3) + " postfix";        // 2
    s1 = "prefix " + s3;            // 3
    "prefix " + s3 + "postfix";     // 4

Now consider the following code, in which a class Binary supports an overloaded operator+:

    class Binary
    {
        public:
            Binary();
            Binary(int value);
            Binary operator+(Binary const &rhs);
    };

    int main()
    {
        Binary b1;
        Binary b2{ 5 };

        b1 = b2 + 3;            // 1
        b1 = 3 + b2;            // 2
    }
Compilation of this little program fails for statement // 2, with the compiler reporting an error like:
    error: no match for 'operator+' in '3 + b2'

Why is statement // 1 compiled correctly whereas statement // 2 won't compile?

In order to understand this remember promotions. As we have seen in section 11.4, constructors expecting single arguments may implicitly be activated when an argument of an appropriate type is provided. We've already encountered this with std::string objects, where NTBSs may be used to initialize std::string objects.

Analogously, in statement // 1, operator+ is called, using b2 as its left-hand side operand. This operator expects another Binary object as its right-hand side operand. However, an int is provided. But as a constructor Binary(int) exists, the int value can be promoted to a Binary object. Next, this Binary object is passed as argument to the operator+ member.

Unfortunately, in statement // 2 promotions are not available: here the left-hand side operand of the + operator is an int. An int is a primitive type and primitive types have no knowledge of `constructors', `member functions' or `promotions'.

How, then, are promotions of left-hand operands implemented in statements like "prefix " + s3? Since promotions can be applied to function arguments, we must make sure that both operands of binary operators are arguments. This implies that plain binary operators supporting promotions for either their left-hand side operand or right-hand side operand should be declared as free operators, also called free functions.

Functions like the plain binary operators conceptually belong to the class for which they implement these operators. Consequently they should be declared in the class's header file. We cover their implementations shortly, but here is our first revision of the declaration of the class Binary, declaring an overloaded + operator as a free function:

    class Binary
    {
        public:
            Binary();
            Binary(int value);
    };

    Binary operator+(Binary const &lhs, Binary const &rhs);

After defining binary operators as free functions, several promotions are available:

The next step consists of implementing the required overloaded binary compound assignment operators, having the form @=, where @ represents a binary operator. As these operators always have left-hand side operands which are objects of their own classes, they are implemented as genuine member functions. Compound assignment operators usually return references to the objects for which the binary compound assignment operators were requested, as these objects might be modified in the same statement. E.g., (s2 += s3) + " postfix".

Here is our second revision of the class Binary, showing the declaration of the plain binary operator as well as the corresponding compound assignment operator:

    class Binary
    {
        public:
            Binary();
            Binary(int value);

            Binary &operator+=(Binary const &rhs);
    };

    Binary operator+(Binary const &lhs, Binary const &rhs);

How should the compound addition assignment operator be implemented? When implementing compound binary assignment operators the strong guarantee should always be kept in mind: if the operation might throw use a temporary object and swap. Here is our implementation of the compound assignment operator:

    Binary &Binary::operator+=(Binary const &rhs)
    {
        Binary tmp{ *this };
        tmp.add(rhs);           // this might throw
        swap(tmp);
        return *this;
    }

It's easy to implement the free binary operator: the lhs argument is copied into a Binary tmp to which the rhs operand is added. Then tmp is returned, using copy elision. The class Binary declares the free binary operator as a friend (cf. chapter 15), so it can call Binary's add member:

    class Binary
    {
        friend Binary operator+(Binary const &lhs, Binary const &rhs);

        public:
            Binary();
            Binary(int value);

            Binary &operator+=(Binary const &other);

        private:
            void add(Binary const &other);
    };

The binary operator's implementation becomes:

    Binary operator+(Binary const &lhs, Binary const &rhs)
    {
        Binary tmp{ lhs };
        tmp.add(rhs);
        return tmp;
    }
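
The pieces developed so far can be assembled into a complete class. The following is a sketch under assumptions, as the source does not specify what a Binary object stores: here it merely wraps an int, value is an assumed accessor, swap is implemented using std::swap, and add cannot actually throw.

```cpp
#include <utility>

// Sketch of the class Binary, assuming it merely wraps an int.
class Binary
{
    friend Binary operator+(Binary const &lhs, Binary const &rhs);

    int d_value = 0;

    public:
        Binary() = default;
        Binary(int value)               // not explicit: allows promotions
        :
            d_value(value)
        {}

        int value() const               // assumed accessor
        {
            return d_value;
        }

        Binary &operator+=(Binary const &rhs)
        {
            Binary tmp{ *this };
            tmp.add(rhs);               // might throw in a real class
            swap(tmp);
            return *this;
        }

        void swap(Binary &other) noexcept
        {
            std::swap(d_value, other.d_value);
        }

    private:
        void add(Binary const &other)
        {
            d_value += other.d_value;
        }
};

Binary operator+(Binary const &lhs, Binary const &rhs)
{
    Binary tmp{ lhs };
    tmp.add(rhs);
    return tmp;
}
```

Using this class both b2 + 3 and 3 + b2 compile: as operator+ is a free function the int can be promoted to a Binary for either operand.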

If the class Binary is move-aware then it's attractive to add move-aware binary operators. In this case we also need operators whose left-hand side operands are rvalue references. When a class is move aware various interesting implementations are suddenly possible, which we encounter below, and in the next (sub)section. First have a look at the signature of such a binary operator (which should also be declared as a friend in the class interface):

    Binary operator+(Binary &&lhs, Binary const &rhs);

Since the lhs operand is an rvalue reference, we can modify it ad lib. Binary operators are commonly designed as factory functions, returning objects created by those operators. However, the (modified) object referred to by lhs should itself not be returned. As stated in the C++ standard,

    A temporary object bound to a reference parameter in a function call persists until the completion of the full-expression containing the call.

and furthermore:

    The lifetime of a temporary bound to the returned value in a function return statement is not extended; the temporary is destroyed at the end of the full-expression in the return statement.

In other words, a temporary object cannot itself be returned as the function's return value: a Binary && return type should therefore not be used. Therefore functions implementing binary operators are factory functions (note, however, that the returned object may be constructed using the class's move constructor whenever a temporary object has to be returned).

Alternatively, the binary operator can first create an object by move constructing it from the operator's lhs operand, performing the binary operation on that object and the operator's rhs operand, and then return the modified object (allowing the compiler to apply copy elision). It's a matter of taste which one is preferred.

Here are the two implementations. Because of copy elision the explicitly defined ret object is created in the location of the return value. Both implementations, although they appear to be different, show identical run-time behavior:

                // first implementation: modify lhs
    Binary operator+(Binary &&lhs, Binary const &rhs)   
    {
        lhs.add(rhs);
        return std::move(lhs);
    }
                // second implementation: move construct ret from lhs
    Binary operator+(Binary &&lhs, Binary const &rhs)   
    {
        Binary ret{ std::move(lhs) };
        ret.add(rhs);
        return ret;
    }

Now, when executing expressions like (all Binary objects) b1 + b2 + b3 the following functions are called:

    copy operator+          = b1 + b2 
    Copy constructor        = tmp(b1) 
        adding              = tmp.add(b2)
    copy elision            : tmp is returned from b1 + b2
        
    move operator+          = tmp + b3 
    adding                  = tmp.add(b3)
    Move construction       = tmp2(move(tmp)) is returned
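
The number of copy constructions can be checked by instrumenting a class. The sketch below is an illustration under assumptions: Tracked is a hypothetical class wrapping an int, its static members s_copies and s_moves count the calls of the copy and move constructors, add is public to keep the sketch short, and the first of the two implementations shown above is used for the move-aware operator.

```cpp
#include <utility>

// Hypothetical instrumented class: counts copy and move constructions.
class Tracked
{
    int d_value;

    public:
        static int s_copies;
        static int s_moves;

        explicit Tracked(int value)
        :
            d_value(value)
        {}
        Tracked(Tracked const &other)
        :
            d_value(other.d_value)
        {
            ++s_copies;
        }
        Tracked(Tracked &&tmp) noexcept
        :
            d_value(tmp.d_value)
        {
            ++s_moves;
        }

        int value() const
        {
            return d_value;
        }
        void add(Tracked const &rhs)
        {
            d_value += rhs.d_value;
        }
};

int Tracked::s_copies = 0;
int Tracked::s_moves = 0;

                // copy operator+: used when lhs is an lvalue
Tracked operator+(Tracked const &lhs, Tracked const &rhs)
{
    Tracked tmp{ lhs };                 // the only copy construction
    tmp.add(rhs);
    return tmp;
}

                // move operator+: used when lhs is a temporary
Tracked operator+(Tracked &&lhs, Tracked const &rhs)
{
    lhs.add(rhs);
    return std::move(lhs);              // move constructs the return value
}
```

Evaluating b1 + b2 + b3 for three Tracked objects results in a single copy construction, performed by the first operator. The number of move constructions depends on whether the compiler applies copy elision when returning tmp, but the return statement of the second operator always performs one.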

But we're not there yet: in the next section we encounter possibilities for several more interesting implementations, in the context of compound assignment operators.

11.6.1: Member function reference bindings (& and &&)

We've seen that binary operators (like operator+) can be implemented very efficiently, but require at least move constructors.

An expression like

    Binary{} + varB + varC + varD

therefore returns a move constructed object representing Binary{} + varB, then another move constructed object receiving the first return value and varC, and finally yet another move constructed object receiving the second returned object and varD as its arguments.

Now consider the situation where we have a function defining a Binary && parameter, and a second Binary const & parameter. Inside that function these values need to be added, and their sum is then passed as argument to two other functions. We could do this:

    void fun1(Binary &&lhs, Binary const &rhs)
    {
        lhs += rhs;
        fun2(lhs);
        fun3(lhs);
    }

But realize that when using operator+= we first construct a copy of the current object, so a temporary object is available to perform the addition on, and then swap the temporary object with the current object to commit the results. But wait! Our lhs operand already is a temporary object. So why create another?

In this example another temporary object is indeed not required: lhs remains in existence until fun1 ends. But unlike the binary operators, the binary compound assignment operators don't have explicitly defined left-hand side operands. We can still inform the compiler that a particular member (so, not merely a compound assignment operator) should only be used when the object calling that member is an anonymous temporary object, or a non-anonymous (modifiable or non-modifiable) object. For this we use reference bindings, a.k.a. reference qualifiers.

Reference bindings consist of a reference token (&), optionally preceded by const, or an rvalue reference token (&&). Such reference qualifiers are immediately affixed to the function's head (this applies to the declaration and the implementation alike). Functions provided with rvalue reference bindings are selected by the compiler when used by anonymous temporary objects, whereas functions provided with lvalue reference bindings are selected by the compiler when used by other types of objects.
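
The selection rules can be tried out with a minimal sketch. The class Probe and the four helper functions are hypothetical and exist only to show which of the reference-qualified overloads the compiler picks for each kind of object:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class, only used to show which overload is selected.
struct Probe
{
    std::string who() &             // modifiable named objects
    {
        return "modifiable lvalue";
    }
    std::string who() const &       // non-modifiable named objects
    {
        return "const lvalue";
    }
    std::string who() &&            // anonymous temporary objects
    {
        return "rvalue";
    }
};

std::string fromLvalue()
{
    Probe probe;
    return probe.who();
}

std::string fromConst()
{
    Probe const probe{};
    return probe.who();
}

std::string fromTemporary()
{
    return Probe{}.who();
}

std::string fromMoved()
{
    Probe probe;
    return std::move(probe).who();  // std::move turns probe into an rvalue
}
```

Note that a named object passed through std::move also selects the && overload, as it is then treated as an rvalue.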

Reference qualifiers allow us to fine-tune our implementations of compound assignment operators like operator+=. If we know that the object calling the compound assignment operator is itself a temporary, then there's no need for a separate temporary object. The operator may directly perform its operation and could then return itself as an rvalue reference. Here is the implementation of operator+= tailored to being used by temporary objects:

    Binary &&Binary::operator+=(Binary const &rhs) &&
    {
        add(rhs);                   // directly add rhs to *this, 
        return std::move(*this);    // return the temporary object itself
    }

This implementation is about as fast as it gets. But be careful: in the previous section we learned that a temporary is destroyed at the end of the full expression of a return statement. In this case, however, the temporary already exists, and so (also see the previous section) it should persist until the expression containing the (operator+=) function call is completed. As a consequence,

    cout << (Binary{} += existingBinary) << '\n';

is OK, but

    Binary &&rref = (Binary{} += existingBinary);
    cout << rref << '\n';

is not, since rref becomes a dangling reference immediately after its initialization.

A foolproof alternative implementation of the rvalue-reference bound operator+= returns a move-constructed copy:

    Binary Binary::operator+=(Binary const &rhs) &&
    {
        add(rhs);                   // directly add rhs to *this, 
        return std::move(*this);    // return a move constructed copy
    }

The price to pay for this foolproof implementation is an extra move construction. Now, using the previous example (using rref), operator+= returns a copy of the Binary{} temporary, which is still a temporary object which can safely be referred to by rref.
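
That the by-value variant can safely be bound to an rvalue reference may be checked with the following sketch. The Binary class shown here is a hypothetical stand-in wrapping a single int, and boundToReference is a helper name invented for the illustration:

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for Binary, wrapping a single int.
class Binary
{
    int d_value;

    public:
        explicit Binary(int value = 0)
        :
            d_value(value)
        {}
        Binary(Binary const &other) = default;
        Binary(Binary &&tmp) = default;

        void add(Binary const &rhs)
        {
            d_value += rhs.d_value;
        }
        int value() const
        {
            return d_value;
        }

        Binary operator+=(Binary const &rhs) &&     // returns by value
        {
            add(rhs);                   // directly add rhs to *this
            return std::move(*this);    // return a move constructed copy
        }
};

int boundToReference()
{
    Binary existing{ 5 };

        // the returned object's lifetime is extended to that of rref,
        // so rref does not dangle
    Binary &&rref = (Binary{ 2 } += existing);

    return rref.value();
}
```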

Which implementation to use may be a matter of choice: if users of Binary know what they're doing then the former implementation can be used, since these users will never use the above rref initialization. If you're not so sure about your users, use the latter implementation: formally your users will do something they shouldn't do, but there's no penalty for that.

For the compound assignment operator called by an lvalue reference (i.e., a named object) we use the implementation for operator+= from the previous section (note the reference qualifier):

    Binary &Binary::operator+=(Binary const &other) &
    {
        Binary tmp(*this);
        tmp.add(other);     // this might throw
        swap(tmp);
        return *this;
    }

With this implementation adding Binary objects to each other (e.g., b1 += b2 += b3) boils down to

    operator+=    (&)       = b2 += b3
    Copy constructor        = tmp(b2) 
        adding              = tmp.add(b3)
        swap                = b2 <-> tmp
    return                  = b2

    operator+=    (&)       = b1 += b2
    Copy constructor        = tmp(b1) 
        adding              = tmp.add(b2)
        swap                = b1 <-> tmp
    return                  = b1

When the leftmost object is a temporary then a copy construction and swap call are replaced by the construction of an anonymous object. E.g., with Binary{} += b2 += b3 we observe:

    operator+=    (&)       = b2 += b3
    Copy constructor        = tmp(b2) 
        adding              = tmp.add(b3)
        swap                = b2 <-> tmp
    
    Anonymous object        = Binary{}

    operator+=    (&&)      = Binary{} += b2
        adding              = add(b2)

    return                  = move(Binary{})

For Binary &Binary::operator+=(Binary const &other) & an alternative implementation exists, merely using a single return statement, but in fact requiring two extra function calls. It's a matter of taste whether you prefer writing less code or executing fewer function calls:

    Binary &Binary::operator+=(Binary const &other) &
    {
        return *this = Binary{ *this } += other;
    }

Notice that the implementations of operator+ and operator+= are independent of the actual definition of the class Binary. Adding standard binary operators to a class (i.e., operators operating on arguments of their own class types) can therefore easily be realized.

11.6.2: The three-way comparison operator `<=>'

The C++20 standard added the three-way comparison operator <=>, also known as the spaceship operator, to the language.

This operator is closely related to comparison classes, covered in section 18.7. At this point we focus on using the std::strong_ordering class: the examples of the spaceship operator presented in this section all return strong_ordering objects. These objects are

Standard operand conversions are handled by the compiler. Note that

Other standard conversions, like lvalue transformations and qualification conversions (cf. section 21.4), are automatically performed.

Now about the spaceship operator itself. Why would you want it? Of course, if it's defined then you can use it. As it's available for integral numeric types the following correctly compiles:

    auto isp =    3 <=> 4;
whereafter isp's value can be compared to available outcome-values:
    cout << ( isp == strong_ordering::less ? "less\n" : "not less\n" );

But that by itself doesn't make the spaceship operator all that interesting. What does make it interesting is that, in combination with operator==, it handles all comparison operators. So after providing a class with operator== and operator<=> its objects can be compared for equality and inequality, and they can be ordered by <, <=, >, and >=. As an example consider books. To book owners the titles and author names are the books' important characteristics. To sort them on bookshelves we must use operator<, to find a particular book we use operator==, to determine whether two books are different we use operator!=, and if you want to order them in a country where Arabic is the main language you might want to sort them using operator>, considering that the prevalent reading order in those countries is from right to left. Ignoring constructors, destructors and other members, this is the interface of our class Book (note the inclusion of the <compare> header file, containing the declarations of the comparison classes):

    #include <string>
    #include <compare>

    class Book
    {
        friend bool operator==(Book const &lhs, Book const &rhs);
        friend std::strong_ordering operator<=>(Book const &lhs, 
                                                Book const &rhs);
        std::string d_author;
        std::string d_title;

        // ...
    };
Both friend-functions are easy to implement:
    bool operator==(Book const &lhs, Book const &rhs)
    {
        return lhs.d_author == rhs.d_author and lhs.d_title == rhs.d_title;
    }
    
    strong_ordering operator<=>(Book const &lhs, Book const &rhs)
    {
        return lhs.d_author < rhs.d_author  ? strong_ordering::less     :
               lhs.d_author > rhs.d_author  ? strong_ordering::greater  :
               lhs.d_title  < rhs.d_title   ? strong_ordering::less     :
               lhs.d_title  > rhs.d_title   ? strong_ordering::greater  :
                                              strong_ordering::equal;
    }
And that's it! Now all comparison operators (and of course the spaceship operator itself) are available. The following now compiles flawlessly:
    void books(Book const &b1, Book const &b2)
    {
        cout << (b1 == b2) << (b1 != b2) << (b1 <  b2) << 
                (b1 <= b2) << (b1 >  b2) << (b1 >= b2) << '\n';
    }
Calling books for two identical books inserts 100101 into cout.

The spaceship operator is available for integral numeric types and may have been defined for class types. E.g., it is defined for std::string. It is also available for floating point types, but there it returns a std::partial_ordering rather than a std::strong_ordering object.

11.7: Overloading `operator new(size_t)'

When operator new is overloaded, it must define a void * return type, and its first parameter must be of type size_t. The default operator new defines only one parameter, but overloaded versions may define multiple parameters. The first one is not explicitly specified but is deduced from the size of objects of the class for which operator new is overloaded. In this section overloading operator new is discussed. Overloading new[] is discussed in section 11.9.

It is possible to define multiple versions of the operator new, as long as each version defines its own unique set of arguments. When overloaded operator new members must dynamically allocate memory they can do so using the global operator new, applying the scope resolution operator ::. In the next example the overloaded operator new of the class String initializes the substrate of dynamically allocated String objects to 0-bytes:

    #include <cstring>
    #include <string>

    class String
    {
        std::string *d_data;

        public:
            void *operator new(size_t size)
            {
                return memset(::operator new(size), 0, size);
            }
            bool empty() const
            {
                return d_data == 0;
            }
    };

The above operator new is used in the following program, illustrating that even though String's default constructor does nothing the object's data member d_data is initialized to zero:

    #include "string.h"
    #include <iostream>
    using namespace std;

    int main()
    {
        String *sp = new String;

        cout << boolalpha << sp->empty() << '\n';   // shows: true
    }

At new String the following took place:

As String::operator new initialized the allocated memory to zero bytes the allocated String object's d_data member had already been initialized to a 0-pointer by the time it started to exist.

All member functions (including constructors and destructors) we've encountered so far define a (hidden) pointer to the object on which they should operate. This hidden pointer becomes the function's this pointer.

In the next example of pseudo C++ code, the pointer is explicitly shown to illustrate what's happening when operator new is used. In the first part a String object str is directly defined, in the second part of the example the (overloaded) operator new is used:

    String::String(String *const this);     // real prototype of the default
                                            // constructor

    String *sp = new String;                // This statement is implemented
                                            // as follows:

    String *sp = static_cast<String *>(            // allocation
                        String::operator new(sizeof(String))
                 );
    String::String{ sp };                          // initialization

In the above fragment the member functions were treated as object-less member functions of the class String. Such members are called static member functions (cf. chapter 8). Actually, operator new is such a static member function. Since it has no this pointer it cannot reach data members of the object for which it is expected to make memory available. It can only allocate and initialize the allocated memory, but cannot reach the object's data members by name as there is as yet no data object layout defined.

Following the allocation, the memory is passed (as the this pointer) to the constructor for further processing.
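
The two steps of the pseudo code can also be written in real C++. The sketch below uses a hypothetical class Item and a hypothetical function twoSteps. It calls Item::operator new without an object, which is possible because the member is static, and then uses the global placement new to run the constructor in the memory just obtained:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class; d_value is set by the constructor only.
class Item
{
    int d_value;

    public:
        static std::size_t s_requested;     // size last passed to operator new

        Item()
        :
            d_value(42)
        {}
        int value() const
        {
            return d_value;
        }
        void *operator new(std::size_t size)    // implicitly static
        {
            s_requested = size;
            return ::operator new(size);
        }
        void operator delete(void *ptr)
        {
            ::operator delete(ptr);
        }
};

std::size_t Item::s_requested = 0;

// Performs the two steps of `new Item' by hand.
int twoSteps()
{
                                            // 1. allocation: no object needed
    void *memory = Item::operator new(sizeof(Item));

    Item *ip = ::new(memory) Item;          // 2. initialization in that memory
    int ret = ip->value();

    delete ip;                              // destructor, then operator delete
    return ret;
}
```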

Operator new can have multiple parameters. The first parameter is initialized as an implicit argument and is always a size_t parameter. Additional overloaded operators may define additional parameters. An interesting additional operator new is the placement new operator. With the placement new operator a block of memory has already been set aside and one of the class's constructors is used to initialize that memory. Overloading placement new requires an operator new having two parameters: size_t and char *, pointing to the memory that was already available. The size_t parameter is implicitly initialized, but the remaining parameters must explicitly be initialized using arguments to operator new. Hence we reach the familiar syntactical form of the placement new operator in use:

    char buffer[sizeof(String)];        // predefined memory
    String *sp = new(buffer) String;    // placement new call

The declaration of the placement new operator in our class String looks like this:

    void *operator new(size_t size, char *memory);

It could be implemented like this (also initializing the String's memory to 0-bytes):

    void *String::operator new(size_t size, char *memory)
    {
        return memset(memory, 0, size);
    }
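
A variant that does not rely on the memory's initial content is sketched below. The class Slot and the function constructedInBuffer are hypothetical. Two details go beyond the text above and are assumptions of this sketch: the buffer is declared with alignas so that it is suitably aligned for the object, and a matching placement operator delete is declared, which the compiler calls if the constructor throws. As the buffer was not allocated dynamically, the object is ended by calling its destructor explicitly:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class offering its own placement new.
class Slot
{
    int d_value;

    public:
        Slot()
        :
            d_value(7)
        {}
        int value() const
        {
            return d_value;
        }
        void *operator new(std::size_t, char *memory)
        {
            return memory;          // memory was provided by the caller
        }
        void operator delete(void *, char *)
        {}                          // used if the constructor throws
};

bool constructedInBuffer()
{
    alignas(Slot) char buffer[sizeof(Slot)];    // suitably aligned storage

    Slot *sp = new(buffer) Slot;                // placement new call

    bool ok = static_cast<void *>(sp) == buffer and sp->value() == 7;

    sp->~Slot();                    // no delete: buffer isn't on the heap
    return ok;
}
```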

Any other overloaded version of operator new could also be defined. Here is an example showing the use and definition of an overloaded operator new storing the object's address immediately in an existing array of pointers to String objects (assuming the array is large enough):

        // use:
    String *next(String **pointers, size_t *idx)
    {
        return new(pointers, (*idx)++) String;
    }

        // implementation:
    void *String::operator new(size_t size, String **pointers, size_t idx)
    {
        return pointers[idx] = static_cast<String *>(::operator new(size));
    }

11.8: Overloading `operator delete(void *)'

The delete operator may also be overloaded. In fact it's good practice to overload operator delete whenever operator new is also overloaded.

Operator delete must define a void * parameter. A second overloaded version defining a second parameter of type size_t is related to overloading operator new[] and is discussed in section 11.9.

Overloaded operator delete members return void.

The `home-made' operator delete is called when deleting a dynamically allocated object after executing the destructor of the associated class. So, the statement

    delete ptr;

with ptr being a pointer to an object of the class String for which the operator delete was overloaded, is a shorthand for the following statements:

    ptr->~String(); // call the class's destructor

                    // and do things with the memory pointed to by ptr
    String::operator delete(ptr);

The overloaded operator delete may do whatever it wants to do with the memory pointed to by ptr. It could, e.g., simply delete it. If that would be the preferred thing to do, then the default delete operator can be called using the :: scope resolution operator. For example:

    void String::operator delete(void *ptr)
    {
        // any operation considered necessary, then, maybe:
        ::operator delete(ptr);
    }

To declare the above overloaded operator delete simply add the following line to the class's interface:

    void operator delete(void *ptr);

Like operator new operator delete is a static member function (see also chapter 8).
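
The order in which the overloaded operators, the constructor and the destructor are called can be observed with a small sketch. The class Tracer, the global g_log and the function lifeCycle are hypothetical names introduced for this illustration only:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

std::string g_log;                  // hypothetical log of the steps taken

class Tracer
{
    public:
        Tracer()
        {
            g_log += "constructor;";
        }
        ~Tracer()
        {
            g_log += "destructor;";
        }
        void *operator new(std::size_t size)    // implicitly static
        {
            g_log += "operator new;";
            return ::operator new(size);
        }
        void operator delete(void *ptr)         // implicitly static
        {
            g_log += "operator delete;";
            ::operator delete(ptr);
        }
};

// Allocates and immediately deletes a Tracer, returning the logged steps.
std::string lifeCycle()
{
    g_log.clear();
    delete new Tracer;
    return g_log;
}
```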

11.9: Operators `new[]' and `delete[]'

In sections 9.1.1, 9.1.2 and 9.2.1 operator new[] and operator delete[] were introduced. Like operator new and operator delete the operators new[] and delete[] may be overloaded.

As it is possible to overload new[] and delete[] as well as operator new and operator delete, one should be careful in selecting the appropriate set of operators. The following rule of thumb should always be applied:

If new is used to allocate memory, delete should be used to deallocate memory. If new[] is used to allocate memory, delete[] should be used to deallocate memory.

By default these operators act as follows:

11.9.1: Overloading `new[]'

To overload operator new[] in a class (e.g., in the class String) add the following line to the class's interface:
    void *operator new[](size_t size);

The member's size parameter is implicitly provided and is initialized by C++'s run-time system to the amount of memory that must be allocated. Like the simple one-object operator new it should return a void *. The number of objects that must be initialized can easily be computed from size / sizeof(String) (and of course replacing String by the appropriate class name when overloading operator new[] for another class). The overloaded new[] member may allocate raw memory using e.g., the default operator new[] or the default operator new:

    void *operator new[](size_t size)
    {
        return ::operator new[](size);
        // alternatively:
        // return ::operator new(size);
    }

Before returning the allocated memory the overloaded operator new[] has a chance to do something special. It could, e.g., initialize the memory to zero-bytes.

Once the overloaded operator new[] has been defined, it is automatically used in statements like:

    String *op = new String[12];

Like operator new additional overloads of operator new[] may be defined. One opportunity for an operator new[] overload is overloading placement new specifically for arrays of objects. This operator is available by default but becomes unavailable once at least one overloaded operator new[] is defined. Implementing placement new is not difficult. Here is an example, initializing the available memory to 0-bytes before returning:

    void *String::operator new[](size_t size, char *memory)
    {
        return memset(memory, 0, size);
    }

To use this overloaded operator, the second parameter must again be provided, as in:

    char buffer[12 * sizeof(String)];
    String *sp = new(buffer) String[12];

11.9.2: Overloading `delete[]'

To overload operator delete[] in a class String add the following line to the class's interface:
    void operator delete[](void *memory);

Its parameter is initialized to the address of a block of memory previously allocated by String::operator new[].

There are some subtleties to be aware of when implementing operator delete[]. Although the addresses returned by new and new[] point to the allocated object(s), there is an additional size_t value available immediately before the address returned by new and new[]. This size_t value is part of the allocated block and contains the actual size of the block. This of course does not hold true for the placement new operator.

When a class defines a destructor the size_t value preceding the address returned by new[] does not contain the size of the allocated block, but the number of objects specified when calling new[]. Normally that is of no interest, but when overloading operator delete[] it might become a useful piece of information. In those cases operator delete[] does not receive the address returned by new[] but rather the address of the initial size_t value. Whether this is at all useful is not clear. By the time delete[]'s code is executed all objects have already been destroyed, so operator delete[] is only able to determine how many objects were destroyed, but the objects themselves cannot be used anymore.

Here is an example showing this behavior of operator delete[] for a minimal Demo class:

    struct Demo
    {
        size_t idx;
        Demo()
        {
            cout << "default cons\n";
        }
        ~Demo()
        {
            cout << "destructor\n";
        }
        void *operator new[](size_t size)
        {
            return ::operator new[](size);
        }
        void operator delete[](void *vp)
        {
            cout << "delete[] for: " << vp << '\n';
            ::operator delete[](vp);
        }
    };

    int main()
    {
        Demo *xp;
        cout << ((size_t *)(xp = new Demo[3]))[-1] << '\n';
        cout << xp << '\n';
        cout << "==================\n";
        delete[] xp;
    }
    // This program displays (your 0x?????? addresses might differ, but
    // the difference between the two should be sizeof(size_t)):
    //  default cons
    //  default cons
    //  default cons
    //  3
    //  0x8bdd00c
    //  ==================
    //  destructor
    //  destructor
    //  destructor
    //  delete[] for: 0x8bdd008

Having overloaded operator delete[] for a class String, it will be used automatically in statements like:

        delete[] new String[5];

Operator delete[] may also be overloaded using an additional size_t parameter:

    void operator delete[](void *p, size_t size);

Here size is automatically initialized to the size (in bytes) of the block of memory to which void *p points. If this form is defined, then void operator delete[](void *) should not be defined, to avoid ambiguities. An example of this latter form of operator delete[] is:

    void String::operator delete[](void *p, size_t size)
    {
        cout << "deleting " << size << " bytes\n";
        ::operator delete[](p);
    }
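
The size argument can be inspected with a sketch like the following. The struct Block, its static members and the function allocateAndRelease are hypothetical. As the block of memory may contain bookkeeping information in addition to the objects themselves, the sketch only assumes that the reported sizes are at least the combined size of the objects:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class recording the sizes seen by its array operators.
struct Block
{
    static std::size_t s_allocated;     // size received by operator new[]
    static std::size_t s_released;      // size received by operator delete[]

    int value = 0;

    ~Block()                            // user-provided destructor
    {}
    void *operator new[](std::size_t size)
    {
        s_allocated = size;
        return ::operator new[](size);
    }
    void operator delete[](void *p, std::size_t size)
    {
        s_released = size;
        ::operator delete[](p);
    }
};

std::size_t Block::s_allocated = 0;
std::size_t Block::s_released = 0;

void allocateAndRelease()
{
    delete[] new Block[4];
}
```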

Additional overloads of operator delete[] may be defined, but to use them they must explicitly be called as static member functions (cf. chapter 8). Example:

        // declaration:
    void String::operator delete[](void *p, ostream &out);
        // usage:
    String *xp = new String[3];
    String::operator delete[](xp, cout);

11.9.3: The `operator delete(void *, size_t)' family

As we've seen classes may overload their operator delete and operator delete[] members.

Since the C++14 standard the global void operator delete(void *, size_t size) and void operator delete[](void *, size_t size) functions can also be overloaded.

When a global sized deallocation function is defined, it is automatically used instead of the default, non-sized deallocation function. The performance of programs may improve if a sized deallocation function is available (cf. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3663.html).

11.9.4: `new[]', `delete[]' and exceptions

When an exception is thrown while executing a new[] expression, what will happen? In this section we'll show that new[] is exception safe even when only some of the objects were properly constructed.

To begin, new[] might throw while trying to allocate the required memory. In this case a bad_alloc is thrown and we don't leak as nothing was allocated.

Having allocated the required memory the class's default constructor is going to be used for each of the objects in turn. At some point a constructor might throw. What happens next is defined by the C++ standard: the destructors of the already constructed objects are called and the memory allocated for the objects themselves is returned to the common pool. Assuming that the failing constructor offers the basic guarantee new[] is therefore exception safe even if a constructor may throw.

The following example illustrates this behavior. A request to allocate and initialize five objects is made, but after constructing two objects construction fails by throwing an exception. The output shows that the destructors of properly constructed objects are called and that the allocated substrate memory is properly returned:

    #include <iostream>
    using namespace std;

    static size_t count = 0;

    class X
    {
        int x;

        public:
            X()
            {
                if (count == 2)
                    throw 1;
                cout << "Object " << ++count << '\n';
            }
            ~X()
            {
                cout << "Destroyed " << this << "\n";
            }
            void *operator new[](size_t size)
            {
                cout << "Allocating objects: " << size << " bytes\n";
                return ::operator new(size);
            }
            void operator delete[](void *mem)
            {
                cout << "Deleting memory at " << mem << ", containing: " <<
                    *static_cast<int *>(mem) << "\n";
                ::operator delete(mem);
            }
    };

    int main()
    try
    {
        X *xp = new X[5];
        cout << "Memory at " << xp << '\n';
        delete[] xp;
    }
    catch (...)
    {
        cout << "Caught exception.\n";
    }
    // Output from this program (your 0x??? addresses and the number of
    // allocated bytes might differ)
    //  Allocating objects: 24 bytes
    //  Object 1
    //  Object 2
    //  Destroyed 0x8428010
    //  Destroyed 0x842800c
    //  Deleting memory at 0x8428008, containing: 5
    //  Caught exception.

11.10: Function Objects

Function Objects are created by overloading the function call operator operator(). By defining the function call operator an object masquerades as a function, hence the term function objects. Function objects are also known as functors.

Function objects are important when using generic algorithms. The use of function objects is preferred over alternatives like pointers to functions. The fact that they are important in the context of generic algorithms leaves us with a didactic dilemma. At this point in the C++ Annotations it would have been nice if generic algorithms would already have been covered, but for the discussion of the generic algorithms knowledge of function objects is required. This bootstrapping problem is solved in a well known way: by ignoring the dependency for the time being, for now concentrating on the function object concept.

Function objects are objects for which operator() has been defined. Function objects are not just used in combination with generic algorithms, but also as a (preferred) alternative to pointers to functions.

Function objects are frequently used to implement predicate functions. Predicate functions return boolean values. Predicate functions and predicate function objects are commonly referred to as `predicates'. Predicates are frequently used by generic algorithms such as the count_if generic algorithm, covered in chapter 19, returning the number of times its function object has returned true. In the standard template library two kinds of predicates are used: unary predicates receive one argument, binary predicates receive two arguments.
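
As a first impression of a unary predicate, consider the following sketch. The class Exceeds and the function nExceeding are hypothetical. The count_if generic algorithm calls the function object for each element in the range and counts how often it returns true:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical unary predicate: true for values exceeding a limit.
class Exceeds
{
    int d_limit;

    public:
        explicit Exceeds(int limit)
        :
            d_limit(limit)
        {}
        bool operator()(int value) const    // the function call operator
        {
            return value > d_limit;
        }
};

// Returns the number of values in [begin, end) exceeding limit.
std::size_t nExceeding(int const *begin, int const *end, int limit)
{
    return std::count_if(begin, end, Exceeds{ limit });
}
```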

Assume we have a class Person and an array of Person objects. Further assume that the array is not sorted. A well known procedure for finding a particular Person object in the array is to use the function lsearch, which performs a linear search in an array. Example:

    Person &target = targetPerson();    // determine the person to find
    Person *pArray;
    size_t n = fillPerson(&pArray);

    cout << "The target person is";

    if (!lsearch(&target, pArray, &n, sizeof(Person), compareFunction))
        cout << " not";
    cout << " found\n";

The function targetPerson determines the person we're looking for, and fillPerson is called to fill the array. Then lsearch is used to locate the target person.

The comparison function must be available, as its address is one of the arguments of lsearch. It must be a real function having an address. If it is defined inline then the compiler has no choice but to ignore that request as inline functions don't have addresses. CompareFunction could be implemented like this:

    int compareFunction(void const *p1, void const *p2)
    {
        return *static_cast<Person const *>(p1)     // lsearch wants 0
                !=                                  // for equal objects
                *static_cast<Person const *>(p2);
    }

This, of course, assumes that the operator!= has been overloaded in the class Person. But overloading operator!= is no big deal, so let's assume that that operator is actually available.

On average, the following actions take place at least n / 2 times:

  1. The two arguments of the compare function are pushed on the stack;
  2. The value of the final parameter of lsearch is determined, producing compareFunction's address;
  3. The compare function is called;
  4. Then, inside the compare function, the address of the right-hand operand of Person::operator!= is pushed on the stack;
  5. Person::operator!= is evaluated;
  6. The argument of Person::operator!= is popped off the stack;
  7. The two arguments of the compare function are popped off the stack.
Using function objects results in a different picture. Assume we have constructed a function PersonSearch, having the following prototype (this, however, is not the preferred approach. Normally a generic algorithm is preferred over a home-made function. But for now we focus on PersonSearch to illustrate the use and implementation of a function object):
    Person const *PersonSearch(Person *base, size_t nmemb,
                               Person const &target);

This function can be used as follows:

    Person &target = targetPerson();
    Person *pArray;
    size_t n = fillPerson(&pArray);

    cout << "The target person is";

    if (!PersonSearch(pArray, n, target))
        cout << " not";

    cout << " found\n";

So far, not much has been changed. We've replaced the call to lsearch with a call to another function: PersonSearch. Now look at PersonSearch itself:

    Person const *PersonSearch(Person *base, size_t nmemb,
                                Person const &target)
    {
        for (size_t idx = 0; idx < nmemb; ++idx)
            if (target(base[idx]))
                return base + idx;
        return 0;
    }

PersonSearch implements a plain linear search. However, in the for-loop we see target(base[idx]). Here target is used as a function object. Its implementation is simple:

    bool Person::operator()(Person const &other) const
    {
        return *this == other;
    }

Note the somewhat peculiar syntax: operator(). The first set of parentheses defines the operator that is overloaded: the function call operator. The second set of parentheses defines the parameters that are required for this overloaded operator. In the class header file this overloaded operator is declared as:

    bool operator()(Person const &other) const;

Clearly Person::operator() is a simple function. It contains but one statement, and we could consider defining it inline. Assuming we do, then this is what happens when operator() is called:

  1. The address of the right-hand operand of Person::operator== is pushed on the stack;
  2. The operator== function is evaluated (which probably also is a semantic improvement over calling operator!= when looking for an object equal to a specified target object);
  3. The argument of Person::operator== is popped off the stack.
Due to the fact that operator() is an inline function, it is not actually called. Instead operator== is called immediately. Moreover, the required stack operations are fairly modest.

Function objects may truly be defined inline. Functions that are called indirectly (i.e., using pointers to functions) can never be defined inline as their addresses must be known. Therefore, even if the function object needs to do very little work it is defined as an ordinary function if it is going to be called through pointers. The overhead of performing the indirect call may annihilate the advantage of the flexibility of calling functions indirectly. In these cases using inline function objects can result in an increase of a program's efficiency.

An added benefit of function objects is that they may access the private data of their objects. In a search algorithm where a compare function is used (as with lsearch) the target and array elements are passed to the compare function using pointers, involving extra stack handling. Using function objects, the target person doesn't vary within a single search task. Therefore, the target person could be passed to the function object's class constructor. This is in fact what happens in the expression target(base[idx]) receiving as its only argument the subsequent elements of the array to search.
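
Passing the target to a constructor could look like the following sketch. All names in it are hypothetical: Person is reduced to a struct holding a name, SamePerson is a separate function object class remembering the target, and personSearch is a variant of PersonSearch that receives the function object instead of the target itself:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal Person, identified by name only.
struct Person
{
    std::string name;
};

bool operator==(Person const &lhs, Person const &rhs)
{
    return lhs.name == rhs.name;
}

// Function object remembering the target it was constructed with.
class SamePerson
{
    Person const &d_target;

    public:
        explicit SamePerson(Person const &target)
        :
            d_target(target)
        {}
        bool operator()(Person const &other) const
        {
            return d_target == other;
        }
};

// Plain linear search: returns the first element accepted by isTarget,
// or a null pointer if there is none.
Person const *personSearch(Person const *base, std::size_t nmemb,
                           SamePerson const &isTarget)
{
    for (std::size_t idx = 0; idx != nmemb; ++idx)
        if (isTarget(base[idx]))
            return base + idx;
    return nullptr;
}
```

The target is stored only once, when the function object is constructed. During the search only the elements of the array are passed to operator().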

11.10.1: Constructing manipulators

In chapter 6 we saw constructions like cout << hex << 13 to display the value 13 in hexadecimal format. One may wonder by what magic the hex manipulator accomplishes this. In this section the construction of manipulators like hex is covered.

Actually the construction of a manipulator is rather simple. To start, a definition of the manipulator is needed. Let's assume we want to create a manipulator w10 which sets the field width of the next field to be written by the ostream object to 10. This manipulator is constructed as a function. The w10 function needs to know about the ostream object in which the width must be set. By providing the function with an ostream & parameter, it obtains this knowledge. Now that the function knows about the ostream object we're referring to, it can set the width in that object.

Next, it must be possible to use the manipulator in an insertion sequence. This implies that the return value of the manipulator must be a reference to an ostream object also.

From the above considerations we're now able to construct our w10 function:

    #include <ostream>
    #include <iomanip>

    std::ostream &w10(std::ostream &str)
    {
        return str << std::setw(10);
    }
The w10 function can of course be used in a `stand alone' mode, but it can also be used as a manipulator. E.g.,
        #include <iostream>
        #include <iomanip>

        using namespace std;

        extern ostream &w10(ostream &str);

        int main()
        {
            w10(cout) << 3 << " ships sailed to America\n";
            cout << "And " << w10 << 3 << " more ships sailed too.\n";
        }
The w10 function can be used as a manipulator because the class ostream has an overloaded operator<< accepting a pointer to a function expecting an ostream & and returning an ostream &. Its definition is:
    ostream& operator<<(ostream &(*func)(ostream &str))
    {
        return (*func)(*this);
    }

In addition to the above overloaded operator<< another one is defined

    ostream &operator<<(ios_base &(*func)(ios_base &base))
    {
        (*func)(*this);
        return *this;
    }

This latter function is used when inserting, e.g., hex or internal.
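
The mechanism is not specific to ostream. As a minimal sketch, the hypothetical class Collector below (not part of any library; all names are assumptions) offers an insertion operator accepting the address of a function, which turns the plain functions shout and calm into manipulators:

```cpp
#include <cctype>
#include <string>

// Hypothetical stream-like class collecting text in a std::string.
class Collector
{
    std::string d_text;
    bool d_upper = false;

    public:
        // ordinary insertion: appends text, in uppercase if requested
        Collector &operator<<(std::string const &text)
        {
            for (char ch: text)
                d_text += d_upper ?
                    static_cast<char>(
                        std::toupper(static_cast<unsigned char>(ch))) : ch;
            return *this;
        }

        // insertion of a function's address: the function is called
        Collector &operator<<(Collector &(*func)(Collector &))
        {
            return (*func)(*this);
        }

        void setUpper(bool upper)
        {
            d_upper = upper;
        }
        std::string const &text() const
        {
            return d_text;
        }
};

// Two manipulators: plain functions having the expected signature.
Collector &shout(Collector &collector)
{
    collector.setUpper(true);
    return collector;
}
Collector &calm(Collector &collector)
{
    collector.setUpper(false);
    return collector;
}
```

With a Collector object c, the statement c << "ab" << shout << "cd" << calm << "ef" collects the text abCDef.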

The above procedure does not work for manipulators requiring arguments. It is of course possible to overload operator<< to accept an ostream reference and the address of a function expecting an ostream & and, e.g., an int, but while the address of such a function may be specified with the <<-operator, the arguments themselves cannot be specified. So, one wonders how the following construction has been implemented:

    cout << setprecision(3)

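In this case the manipulator is a function receiving the argument and returning an object of an unspecified type, for which an insertion operator has been defined (older, pre-standard implementations relied on macros to generate the required code). Macros, however, are the realm of the preprocessor, and may easily suffer from unwelcome side-effects. In C++ programs they should be avoided whenever possible. The following section introduces a way to implement manipulators requiring arguments without resorting to macros, but using anonymous objects.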

11.10.1.1: Manipulators requiring arguments

Historically, manipulators taking arguments were implemented using macros: these are handled by the preprocessor, and are not available beyond the preprocessing stage.

Manipulators, maybe requiring arguments, can also be defined without using macros. One solution, suitable for modifying globally available objects (like cin, or cout), is based on using anonymous objects.

Here is an example of a little program using such a home-made manipulator expecting multiple arguments:
    #include <iostream>
    #include <iomanip>

    class Align
    {
        unsigned d_width;
        std::ios::fmtflags d_alignment;

        public:
            Align(unsigned width, std::ios::fmtflags alignment);
            std::ostream &operator()(std::ostream &ostr) const;
    };

    Align::Align(unsigned width, std::ios::fmtflags alignment)
    :
        d_width(width),
        d_alignment(alignment)
    {}

    std::ostream &Align::operator()(std::ostream &ostr) const
    {
        ostr.setf(d_alignment, std::ios::adjustfield);
        return ostr << std::setw(d_width);
    }

    std::ostream &operator<<(std::ostream &ostr, Align &&align)
    {
        return align(ostr);
    }

    using namespace std;

    int main()
    {
        cout
            << "`" << Align{ 5, ios::left } << "hi" << "'"
            << "`" << Align{ 10, ios::right } << "there" << "'\n";
    }

    /*
        Generated output:

        `hi   '`     there'
    */

When (local) objects must be manipulated, then the class that must provide manipulators may define function call operators receiving the required arguments. E.g., consider a class Matrix that should allow its users to specify the value and line separators when inserting the matrix into an ostream.

Two data members (e.g., char const *d_valueSep and char const *d_lineSep) are defined (and initialized to acceptable values). The insertion function inserts d_valueSep between values, and d_lineSep at the end of inserted rows. The member operator()(char const *valueSep, char const *lineSep) simply assigns values to the corresponding data members.

Given an object Matrix matrix, then at this point matrix(" ", "\n") can be called. The function call operator should probably not insert the matrix, as the responsibility of manipulators is to manipulate, not to insert. So, to insert a matrix a statement like

        cout << matrix(" ", "\n") << matrix << '\n';

should probably be used. The manipulator (i.e., function call operator) assigns the proper values to d_valueSep and d_lineSep, which are then used during the actual insertion.

The return value of the function call operator remains to be specified. The return value should be insertable, but in fact should not insert anything at all. An empty NTBS could be returned, but that's a bit kludge-like. Instead the address of a manipulator function, not performing any action, can be returned. Here's the implementation of such an empty manipulator:

        // static       (alternatively a free function could be used)
        std::ostream &Matrix::nop(std::ostream &out)
        {
            return out;
        }

Thus, the implementation of the Matrix's manipulator becomes:

        std::ostream &( 
            *Matrix::operator()(char const *valueSep, char const *lineSep) ) 
                                                            (std::ostream &)
        {
            d_valueSep = valueSep;
            d_lineSep = lineSep;
            return nop;
        }

Instead of returning the address of an empty function (probably a matter of taste) the manipulator could first set the required insertion-specific values and could then return the Matrix object itself: the Matrix is then inserted using the separator values that were just assigned:

        Matrix const &Matrix::operator()
            (char const *valueSep, char const *lineSep)
        {
            d_valueSep = valueSep;
            d_lineSep = lineSep;
            return *this;
        }

In this case the insertion statement is simplified to

    cout << matrix(" ", "\n") << '\n';
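
To see the pieces working together, here is a minimal sketch of such a Matrix. Its storage (a vector of rows of ints), its constructor and the function demoMatrix are assumptions made for this illustration only; the manipulator itself follows the second design, returning the object:

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal Matrix: only what the manipulator needs.
class Matrix
{
    std::vector<std::vector<int>> d_rows;
    char const *d_valueSep = " ";
    char const *d_lineSep = "\n";

    public:
        explicit Matrix(std::vector<std::vector<int>> rows)
        :
            d_rows(std::move(rows))
        {}

        // the manipulator: stores the separators, returns the object
        Matrix const &operator()(char const *valueSep, char const *lineSep)
        {
            d_valueSep = valueSep;
            d_lineSep = lineSep;
            return *this;
        }

        friend std::ostream &operator<<(std::ostream &out,
                                        Matrix const &matrix)
        {
            for (auto const &row: matrix.d_rows)
            {
                for (std::size_t idx = 0; idx != row.size(); ++idx)
                {
                    if (idx != 0)
                        out << matrix.d_valueSep;
                    out << row[idx];
                }
                out << matrix.d_lineSep;
            }
            return out;
        }
};

// Returns the text produced when inserting a 2 x 2 matrix using
// ", " as value separator and ";" as line separator.
std::string demoMatrix()
{
    std::vector<std::vector<int>> rows{ {1, 2}, {3, 4} };
    Matrix matrix{ rows };

    std::ostringstream out;
    out << matrix(", ", ";");
    return out.str();
}
```

Here demoMatrix returns the string "1, 2;3, 4;".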

11.11: Lambda expressions

C++ supports lambda expressions. As we'll see in chapter 19 generic algorithms often accept arguments that can either be function objects or plain functions. Examples are the sort (cf. section 19.1.59) and find_if (cf. section 19.1.17) generic algorithms. As a rule of thumb: when a called function must remember its state a function object is appropriate, otherwise a plain function can be used.

Frequently the function or function object is not readily available, and it must be defined in or near the location where it is used. This is commonly realized by defining a class or function in the anonymous namespace (say: class or function A), passing an A to the code needing A. If that code is itself a member function of the class B, then A's implementation might benefit from having access to the members of class B.

This scheme usually results in a significant amount of code (defining the class), or it results in complex code (to make available software elements that aren't automatically accessible to A's code). It may also result in code that is irrelevant at the current level of specification. Nested classes don't solve these problems either. Moreover, nested classes can't be used in templates.

Lambda expressions solve these problems. A lambda expression defines an anonymous function object which may immediately be passed to functions expecting function object arguments, as explained in the next few sections.

According to the C++ standard, lambda expressions provide a concise way to create simple function objects. The emphasis here is on simple: a lambda expression's size should be comparable to the size of inline-functions: just one or maybe two statements. If you need more code, then encapsulate that code in a separate function which is then called from inside the lambda expression's compound statement, or consider designing a separate function object.

11.11.1: Lambda expressions: syntax

A lambda expression defines an anonymous function object, also called a closure object or simply a closure.

When a lambda expression is evaluated it results in a temporary function object (the closure object). This temporary function object is of a unique anonymous class type, called its closure type.

Lambda expressions are used inside blocks, classes or namespaces (i.e., pretty much anywhere you like). Their implied closure type is defined in the smallest block, class or namespace scope containing the lambda expression. The closure object's visibility starts at its point of definition and ends where its closure type ends (i.e., their visibility is identical to the visibility of plain variables).

The closure type defines a const public inline function call operator. Here is an example of a lambda expression:

    []                      // the `lambda-introducer'
    (int x, int y)          // the `lambda-declarator'
    {                       // a normal compound-statement
        return x * y;
    }

The function (formally: the function call operator of the closure type created by this lambda expression) expects two int arguments and returns their product. This function is an inline const member of its closure type. Its const attribute is removed if the lambda expression specifies mutable. E.g.,

    [](int x, int y) mutable
    ...

The lambda-declarator may be omitted if no parameters are defined, but when specifying mutable (or constexpr, see below) a lambda-declarator must be specified (at least as an empty set of parentheses). In C++11 the parameters in a lambda-declarator cannot be given default arguments; since C++14 default arguments are allowed.
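
The effect of mutable can be illustrated with a small sketch (the function countCalls and its variable names are made up for this example). The lambda captures count by value; because it is mutable its function call operator may modify the closure's own copy, leaving the local variable untouched:

```cpp
#include <vector>

// Returns the values of three consecutive calls of a mutable lambda,
// followed by the (unmodified) value of the local variable count.
std::vector<int> countCalls()
{
    int count = 0;
    auto next = [count]() mutable   // here () precedes mutable
                {
                    return ++count; // modifies the closure's copy
                };

    std::vector<int> seen{ next(), next(), next() };
    seen.push_back(count);          // the local variable is still 0
    return seen;
}
```

The returned vector contains 1, 2, 3 and 0. Without mutable the statement ++count would not compile, as the function call operator would be a const member.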

Declarator specifiers can be mutable, constexpr, or both. A constexpr lambda-expression is itself a constexpr, which may be compile-time evaluated if its arguments qualify as const-expressions. By implication, if a lambda-expression is defined inside a constexpr function then the lambda-expression itself is a constexpr, and the constexpr declarator specifier is not required. Thus, the following function definitions are identical:

    int constexpr change10(int n)
    {
        return [n] 
               { 
                   return n > 10 ? n - 10 : n + 10; 
               }();
    }
    
    int constexpr change10(int n)
    {
        return [n] () constexpr 
               { 
                   return n > 10 ? n - 10 : n + 10; 
               }();
    }
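
That such a function can indeed be evaluated at compile-time may be verified using static_assert. The following sketch assumes a compiler supporting C++17 (where lambda expressions may be used in constant expressions); the array table is merely added for illustration:

```cpp
int constexpr change10(int n)
{
    return [n]
           {
               return n > 10 ? n - 10 : n + 10;
           }();
}

// Checked by the compiler: no run-time code is needed for these tests.
static_assert(change10(5) == 15, "5 isn't beyond 10: 10 is added");
static_assert(change10(25) == 15, "25 is beyond 10: 10 is subtracted");

// The function's value can be used where a constant is required:
int table[change10(12)];            // an array of 2 ints
```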

A closure object like the one defined by the earlier lambda expression computing x * y could, for example, be used in combination with the accumulate generic algorithm (cf. section 19.1.1) to compute the product of a series of int values stored in a vector:

    cout << accumulate(vi.begin(), vi.end(), 1,
                [] (int x, int y) 
                { 
                    return x * y; 
                }
            );
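
Wrapped in a function (the name product is an assumption made for this sketch) the same call can be compiled and used as is:

```cpp
#include <numeric>
#include <vector>

// Returns the product of all values in vi (1 for an empty vector).
int product(std::vector<int> const &vi)
{
    return std::accumulate(vi.begin(), vi.end(), 1,
                [](int x, int y)
                {
                    return x * y;
                }
           );
}
```

The initial value 1 is passed as the lambda expression's first argument at the first call; each following call receives the product computed thus far.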

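This lambda expression implicitly defines its return type as decltype(x * y). Implicit return types can be used when the lambda expression does not contain a return statement (its return type is then void), when it contains a single return statement, or when it contains multiple return statements all returning values of the same type.

If there are multiple return statements returning values of different types then the lambda expression's return type must explicitly be specified using a late-specified return type (cf. section 3.3.7):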

    [](bool neg, double y) -> int
    {
        return neg ? -y : y;
    }
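
The difference between deduced and explicitly specified return types can be checked by the compiler itself. In this sketch the names half and clamp0 are invented for the occasion:

```cpp
#include <type_traits>

// One return statement: the return type is deduced (here: double).
auto half = [](int x)
            {
                return x / 2.0;
            };

// Return statements returning values of different types (int and
// double): the return type must be specified explicitly.
auto clamp0 = [](double y) -> double
              {
                  if (y < 0)
                      return 0;         // int, converted to double
                  return y;             // double
              };

static_assert(std::is_same<decltype(half(3)), double>::value,
              "deduced from x / 2.0");
static_assert(std::is_same<decltype(clamp0(1.5)), double>::value,
              "explicitly specified");
```

Omitting -> double from clamp0 results in a compilation error, as the types of the values returned by its two return statements differ.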

Variables visible at the location of a lambda expression may be accessible from inside the lambda expression's compound statement. Which variables and how they are accessed depends on the content of the lambda-introducer.

When the lambda expression is defined inside a class member function the lambda-introducer may contain this or *this; where used in the following overview this class-context is assumed.

Global variables are always accessible, and can be modified if their definitions allow so (this in general holds true in the following overview: when stated that `variables can be modified' then that only applies to variables that themselves allow modifications).

Local variables of the lambda expression's surrounding function may also be specified inside the lambda-introducer. The specification local is used to refer to any comma-separated list of local variables of the surrounding function that are visible at the lambda expression's point of definition. There is no required ordering of the this, *this and local specifications.

Finally, where in the following overview mutable is mentioned it must be specified, where mutable_opt is specified it is optional.

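Access to globals, and maybe to data members and local variables, is determined by the lambda-introducer: an empty lambda-introducer ([]) merely provides access to globals, while this, *this and local specifications add access to, respectively, the object's members, a copy of the object, and (by value) the specified local variables.

When = is used it must be the first element of the lambda-introducer. It allows accessing local variables by value, unless they are explicitly specified as being accessed by reference (e.g., [=, &local]).

When & is used it must be the first element of the lambda-introducer. It allows accessing local variables by reference, unless they are explicitly specified as being accessed by value (e.g., [&, local]).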
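
The effects of using = or & as first element of the lambda-introducer, optionally followed by exceptions for specific variables, can be illustrated as follows (captureDemo and its variables are hypothetical names):

```cpp
#include <vector>

// Returns { a, b, byValue() } after calling lambda expressions using
// different lambda-introducers on the local variables a and b.
std::vector<int> captureDemo()
{
    int a = 1;
    int b = 10;

    auto byValue = [=] { return a + b; };   // copies of a and b
    auto byRef   = [&] { ++a; ++b; };       // references to a and b
    auto mixed1  = [=, &b] { b += a; };     // a by value, b by reference
    auto mixed2  = [&, b] { a += b; };      // a by reference, b by value

    byRef();            // a == 2, b == 11
    mixed1();           // uses its copy a == 1:  b == 12
    mixed2();           // uses its copy b == 10: a == 12

    return { a, b, byValue() };             // byValue still sees 1 + 10
}
```

Values are copied when the lambda expressions are defined, not when they are called. Therefore the returned vector contains 12, 12 and 11.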

When a lambda expression defined in a member function starts its lambda-introducer with = or &, it implicitly captures the this pointer, and class members are accessed relative to this. But when lambda expressions are called asynchronously (cf. chapter 20) a problem may arise, because the asynchronously called lambda function may refer to members of an object whose lifetime ended shortly after asynchronously calling the lambda function. This potential problem is solved by using `*this' in the lambda-capture if it starts with =, e.g., [=, *this] (in addition, variables may still also be captured, as usual). When specifying `*this' a copy of the object to which this refers is stored in the closure object: if the original object's lifetime ends the copy remains available for as long as the closure object exists. In order to use the `*this' specification, the object must be available. Consider the following example:

    struct S2 
    {
        double ohseven = .007;

        auto f() 
        {
            return [this]                       // (1, see below)
                   {
                        return [*this]          // (2)
                               {
                                    return ohseven; // OK
                               };
                   }();                         // (3)
        }

        auto g() 
        {
            return [] 
                   {
                        return [*this] 
                        { 
                            // error: *this not captured by 
                            // the outer lambda-expression 
                        }; 
                    }();
        }
    };
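
The difference between capturing this and capturing *this shows up once the original object changes or ceases to exist. The following sketch assumes C++17; the struct Counter and the functions using it are made up for this illustration:

```cpp
// Hypothetical struct: [this] refers to the object, [*this] copies it.
struct Counter
{
    int value = 0;

    auto viaThis()
    {
        return [this] { return value; };    // reads the object itself
    }
    auto viaCopy()
    {
        return [*this] { return value; };   // reads the closure's copy
    }
};

// Returns true if the copy made by [*this] isn't affected by
// modifications of the original object.
bool copyIsIndependent()
{
    Counter counter;
    auto current = counter.viaThis();
    auto snapshot = counter.viaCopy();      // copies counter: value == 0

    counter.value = 5;
    return current() == 5 && snapshot() == 0;
}

// The closure object remains usable after the original object has
// ceased to exist.
int outlives()
{
    auto closure = Counter{ 7 }.viaCopy();  // the temporary is gone
    return closure();                       // the copy still holds 7
}
```

Calling a closure obtained from viaThis after its Counter object has ceased to exist would result in undefined behavior.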

Although lambda expressions are anonymous function objects, they can be assigned to variables. Often, the variable is defined using the keyword auto. E.g.,

    auto sqr = [](int x)
               {
                   return x * x;
               };

The lifetime of such lambda expressions is equal to the lifetime of the variable receiving the lambda expression as its value.

Note also that defining a lambda expression is different from calling its function operator. The function S2::f() returns what the lambda expression (1)'s function call operator returns: its function call operator is called by using () (at (3)). What it in fact returns is another anonymous function object (defined at (2)). As that's just a function object, to retrieve its value it must still be called from f's return value using something like this:

    S2 s2;
    s2.f()();

Here, the second set of parentheses activates the returned function object's function call operator. Had the parentheses been omitted at (3) then S2::f() would have returned a mere anonymous function object (defined at (1)), in which case it would require three sets of parentheses to retrieve ohseven's value: s2.f()()().

11.11.2: Using lambda expressions

Now that the syntax of lambda expressions has been covered, let's see how they can be used in various situations.

First we consider named lambda expressions. Named lambda expressions nicely fit in the niche of local functions: when a function needs to perform computations which are at a conceptually lower level than the function's task itself, then it's attractive to encapsulate these computations in a separate support function and call the support function where needed. Although support functions can be defined in anonymous namespaces, that quickly becomes awkward when the requiring function is a class member and the support function also must access the class's members.

In that case a named lambda expression can be used: it can be defined inside a requiring function, and it may be given full access to the surrounding class. The name to which the lambda expression is assigned becomes the name of a function which can be called from the surrounding function. Here is an example, converting a numeric IP address to a dotted decimal string, which can also be accessed directly from a Dotted object (all implementations in-class to conserve space):

    class Dotted
    {
        std::string d_dotted;
        
        public:
            std::string const &dotted() const
            {
                return d_dotted;
            }
            std::string const &dotted(size_t ip)
            {
                auto octet = 
                    [](size_t idx, size_t numeric)
                    {
                        return std::to_string((numeric >> idx * 8) & 0xff);
                    };

                d_dotted = 
                        octet(3, ip) + '.' + octet(2, ip) + '.' +
                        octet(1, ip) + '.' + octet(0, ip);

                return d_dotted;
            }
    };
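
A named lambda expression that needs the surrounding class's members may capture this. In the following sketch (the class Report and its members are hypothetical) the named lambda expression label uses the private data member d_prefix:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: the named lambda `label' captures this, and thus
// has access to the private member d_prefix.
class Report
{
    std::string d_prefix;

    public:
        explicit Report(std::string prefix)
        :
            d_prefix(std::move(prefix))
        {}

        // Returns one line of text for each of the values.
        std::string lines(std::vector<int> const &values) const
        {
            auto label =
                [this](int value)
                {
                    return d_prefix + std::to_string(value) + '\n';
                };

            std::string ret;
            for (int value: values)
                ret += label(value);
            return ret;
        }
};
```

A support function in the anonymous namespace would have needed d_prefix as an additional parameter, or would have required an accessor.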

Next we consider the use of generic algorithms, like for_each (cf. section 19.1.18):

    void showSum(vector<int> const &vi)
    {
        int total = 0;
        for_each(
            vi.begin(), vi.end(),
            [&](int x)
            {
                total += x;
            }
        );
        std::cout << total << '\n';
    }

Here the variable int total is passed to the lambda expression by reference and is directly accessed by the function. Its parameter list merely defines an int x, which is initialized in sequence by each of the values stored in vi. Once the generic algorithm has completed showSum's variable total has received a value that is equal to the sum of all the vector's values. It has outlived the lambda expression and its value is displayed.

But although generic algorithms are extremely useful, there may not always be one that fits the task at hand. Furthermore, an algorithm like for_each looks a bit unwieldy, now that the language offers range-based for-loops. So let's try this, instead of the above implementation:

    void showSum(vector<int> const &vi)
    {
        int total = 0;
        for (auto el: vi)
            [&](int x)
            {
                total += x;
            };

        std::cout << total << '\n';
    }

But when showSum is now called, its cout statement consistently reports 0. What's happening here?

When a generic algorithm is given a lambda expression, it receives the closure object as its argument, and the closure object's function call operator is thereupon called from within the generic algorithm. But in the above example the range-based for-loop's nested statement merely represents the definition of a lambda function. Nothing is actually called, and hence total remains equal to 0.

Thus, to make the above example work we must not only define the lambda expression, but we must also call the lambda function. We can do this by giving the lambda function a name, and then calling the lambda function by its given name:

    void showSum(vector<int> const &vi)
    {
        int total = 0;
        for (auto el: vi)
        {
            auto lambda = [&](int x)
                            {
                                total += x;
                            };

            lambda(el);
        }
        std::cout << total << '\n';
    }

In fact, there is no need to give the lambda function a name: the auto lambda definition represents the lambda function, which could also directly be called. The syntax for doing this may look a bit weird, but there's nothing wrong with it, and it allows us to drop the compound statement, required in the last example, completely. Here goes:

    void showSum(vector<int> const &vi)
    {
        int total = 0;
        for (auto el: vi)
            [&](int x)
            {
                total += x;
            }(el);          // immediately append the 
                            // argument list to the lambda
                            // function's definition
        std::cout << total << '\n';
    }

Lambda expressions can also be used to prevent spurious returns from condition_variable's wait calls (cf. section 20.4.3).

The class condition_variable allows us to do so by offering wait members expecting a lock and a predicate. The predicate checks the data's state, and returns true if the data's state allows the data's processing. Here is an alternative implementation of the down member shown in section 20.4.3, checking for the data's actual availability:

    void down()
    {
        unique_lock<mutex> lock(sem_mutex);
        condition.wait(lock, 
            [&]()
            {
                return semaphore != 0;
            }
        );
        --semaphore;
    }

The lambda expression ensures that wait only returns once semaphore's value is no longer zero.
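
For reference, here is a minimal self-contained sketch of a class offering such a down member. The class name Semaphore, its data member names and its members up and count are assumptions: the class of section 20.4.3 may well be organized differently:

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Assumption: all names are illustrative.
class Semaphore
{
    std::mutex d_mutex;
    std::condition_variable d_condition;
    std::size_t d_semaphore;

    public:
        explicit Semaphore(std::size_t count = 0)
        :
            d_semaphore(count)
        {}

        void up()
        {
            std::lock_guard<std::mutex> lock(d_mutex);
            ++d_semaphore;
            d_condition.notify_one();
        }

        void down()
        {
            std::unique_lock<std::mutex> lock(d_mutex);
            // wait only returns when the predicate returns true, so
            // spurious wake-ups are ignored
            d_condition.wait(lock,
                [this]
                {
                    return d_semaphore != 0;
                }
            );
            --d_semaphore;
        }

        std::size_t count()
        {
            std::lock_guard<std::mutex> lock(d_mutex);
            return d_semaphore;
        }
};
```

As the lambda expression is defined in a member function it captures this to reach d_semaphore. If the predicate already returns true when wait is called then wait returns immediately.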

Lambda expressions are primarily used to obtain functors that are used in a very localized section of a program. Since they are used inside an existing function we should realize that once we use lambda functions multiple aggregation levels are mixed. Normally a function implements a task which can be described at its own aggregation level using just a few sentences. E.g., ``the function std::sort sorts a data structure by comparing its elements in a way that is appropriate to the context where sort is called''. By using an existing comparison method the aggregation level is kept, and the statement is clear by itself. E.g.,

    sort(data.begin(), data.end(), greater<DataType>());

If an existing comparison method is not available, a tailor-made function object must be created. This could be realized using a lambda expression. E.g.,

    sort(data.begin(), data.end(), 
        [&](DataType const &lhs, DataType const &rhs)
        {
            return lhs.greater(rhs);
        }
    );

Looking at the latter example, we should realize that here two different aggregation levels are mixed: at the top level the intent is to sort the elements in data, but at the nested level (inside the lambda expression) something completely different happens. Inside the lambda expression we define how the decision is made about which of the two objects is the greater. Code exhibiting such mixed aggregation levels is hard to read, and should be avoided.

On the other hand: lambda expressions also simplify code because the overhead of defining tailor-made functors is avoided. The advice, therefore, is to use lambda expressions sparingly. When they are used make sure that their sizes remain small. As a rule of thumb: lambda expressions should be treated like in-line functions, and should merely consist of one, or maybe occasionally two expressions.
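
Both approaches are shown side by side in the next sketch. The struct DataType, its members, and the function names are assumptions made for this example only:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical data type, used for illustration only.
struct DataType
{
    std::string name;
    int weight;
};

// The comparison's details are defined at their own level, in a
// named function.
bool heavier(DataType const &lhs, DataType const &rhs)
{
    return lhs.weight > rhs.weight;
}

// The aggregation level is kept: sort, using an existing comparison.
void sortByWeight(std::vector<DataType> &data)
{
    std::sort(data.begin(), data.end(), heavier);
}

// A lambda expression of a single expression: still acceptable.
void sortByName(std::vector<DataType> &data)
{
    std::sort(data.begin(), data.end(),
        [](DataType const &lhs, DataType const &rhs)
        {
            return lhs.name < rhs.name;
        }
    );
}
```

Once the comparison requires more than one or two expressions, the form used by sortByWeight is probably the better choice.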

A special group of lambda expressions is known as generic lambda expressions. As generic lambda expressions are in fact class templates, their coverage is postponed until chapter 22.

11.12: The case of [io]fstream::open()

Earlier, in section 6.4.2.1, it was noted that the [io]fstream::open members expect an ios::openmode value as their final argument. E.g., to open an fstream object for writing you could do as follows:
    fstream out;
    out.open("/tmp/out", ios::out);

Combinations are also possible. To open an fstream object for both reading and writing the following stanza is often seen:

    fstream out;
    out.open("/tmp/out", ios::in | ios::out);

When trying to combine enum values using a `home made' enum we may run into problems. Consider the following:

    enum Permission
    {
        READ =      1 << 0,
        WRITE =     1 << 1,
        EXECUTE =   1 << 2
    };

    void setPermission(Permission permission);

    int main()
    {
        setPermission(READ | WRITE);
    }

When offering this little program to the compiler it replies with an error message like this:

invalid conversion from 'int' to 'Permission'

The question is of course: why is it OK to combine ios::openmode values passing these combined values to the stream's open member, but not OK to combine Permission values?

Combining enum values using arithmetic or bitwise operators results in int-typed values, which was never our intention. Combining enum values can be considered correct if the resulting value makes sense as a value that is still within the original enumeration's domain. Note that after adding a value READWRITE = READ | WRITE to the above enum we're still not allowed to specify READ | WRITE as an argument to setPermission.

To answer the question about combining enumeration values and yet stay within the enumeration's domain we turn to operator overloading. Up to this point operator overloading has been applied to class types. Free functions like operator<< have been overloaded, and those overloads are conceptually within the domain of their class.

As C++ is a strongly typed language, realize that defining an enum is really something beyond the mere association of int-values with symbolic names. An enumeration type is really a type of its own, and as with any type its operators can be overloaded. When writing READ | WRITE the compiler performs the default conversion from enum values to int values and applies the operator to ints. It does so when it has no alternative.

But it is also possible to overload the enum type's operators. Thus we may ensure that we'll remain within the enum's domain even though the resulting value wasn't defined by the enum. The advantage of type-safety and conceptual clarity is considered to outweigh the somewhat peculiar introduction of values hitherto not defined by the enum.

Here is an example of such an overloaded operator:

    Permission operator|(Permission left, Permission right)
    {
        return static_cast<Permission>(static_cast<int>(left) | right);
    }

Other operators can easily and analogously be constructed.
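
As an illustration, here is a sketch adding operator& and operator|= to the Permission enum. The helper function allows is hypothetical, and was merely added to show the operators at work:

```cpp
enum Permission
{
    READ =      1 << 0,
    WRITE =     1 << 1,
    EXECUTE =   1 << 2
};

Permission operator|(Permission left, Permission right)
{
    return static_cast<Permission>(static_cast<int>(left) | right);
}

Permission operator&(Permission left, Permission right)
{
    return static_cast<Permission>(static_cast<int>(left) &
                                   static_cast<int>(right));
}

// The compound operator uses the overloaded operator|
Permission &operator|=(Permission &left, Permission right)
{
    return left = left | right;
}

// Hypothetical helper: true if permission includes flag
bool allows(Permission permission, Permission flag)
{
    return (permission & flag) == flag;
}
```

Having defined these operators, a definition like Permission p = READ | WRITE compiles, and p |= EXECUTE adds the EXECUTE permission to p.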

Operators like the above were defined for the ios::openmode enumeration type, allowing us to specify ios::in | ios::out as argument to open while specifying the corresponding parameter as ios::openmode as well. Clearly, operator overloading can be used in many situations, not necessarily only involving class-types.

11.13: User-defined literals

In addition to the well-known literals, like numerical constants (with or without suffixes), character constants and string (textual) literals, C++ also supports user-defined literals, also known as extensible literals.

A user-defined literal is defined by a function (see also section 23.3) that must be defined at namespace scope. Such a function is called a literal operator. A literal operator cannot be a class member function. The name of a literal operator must start with an underscore, and a literal operator is used (called) by suffixing its name (including the underscore) to the argument that must be passed to it. Assuming _NM2km (nautical mile to km) is the name of a literal operator, then it could be called as 100_NM2km, producing, e.g., the value 185.2.

Using Type to represent the return type of the literal operator its generic declaration looks like this:

    Type operator "" _identifier(parameter-list);

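In C++11 the blank space trailing the empty string is required; since C++14 it may be omitted (operator""_identifier). The parameter list of a literal operator can be unsigned long long int (for integral constants), long double (for floating point constants), char const * (receiving the characters of a numeric constant as an NTBS), a single character type (e.g., char, for character constants), or a pointer to constant characters followed by a size_t (e.g., char const *, size_t, for string literals).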

If literal operators are overloaded the compiler will pick the literal operator requiring the least `effort'. E.g., 120 is processed by a literal operator defining an unsigned long long int parameter and not by its overloaded version, defining a char const * parameter. But if overloaded literal operators exist defining char const * and long double parameters then the operator defining a char const * parameter is used when the argument 120 is provided, while the operator defining a long double parameter is used with the argument 120.3.
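
Which literal operator is selected can be made visible by letting each overload report itself. The suffixes _kind and _raw in this sketch are hypothetical; the operators are written without the blank following the empty string, which is accepted since C++14:

```cpp
#include <cstddef>
#include <cstring>

// Each overload returns a character identifying the selected overload.
constexpr char operator""_kind(unsigned long long int)
{
    return 'i';             // integral constant
}
constexpr char operator""_kind(long double)
{
    return 'd';             // floating point constant
}
constexpr char operator""_kind(char const *, std::size_t)
{
    return 's';             // string literal
}

static_assert(120_kind == 'i', "unsigned long long int version");
static_assert(120.3_kind == 'd', "long double version");
static_assert("120"_kind == 's', "char const *, size_t version");

// For _raw only a char const * version exists: it receives the
// constant's characters, for integral and floating point constants
// alike. Here the number of characters is returned.
std::size_t operator""_raw(char const *text)
{
    return std::strlen(text);
}
```

The expression 120_raw returns 3, and 120.3_raw returns 5 (the number of characters in, respectively, 120 and 120.3).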

A literal operator can define any return type. Here is an example of a definition of the _NM2km literal operator:

    double operator "" _NM2km(char const *nm)
    {
        return std::stod(nm) * 1.852;
    }

    double value = 120_NM2km;   // example of use

Of course, the argument could also have been a long double constant. Here's an alternative implementation, explicitly expecting a long double:

    double constexpr operator "" _NM2km(long double nm)
    {
        return nm * 1.852;
    }

    double value = 450.5_NM2km;   // example of use

A numeric constant can also be processed completely at compile-time. Section 23.3 provides the details of this type of literal operator.

Arguments to literal operators are themselves always constants. A literal operator like _NM2km cannot be used to convert, e.g., the value of a variable. A literal operator, although it is defined as a function, cannot be called like a function. The following examples therefore result in compilation errors:

    double speed;

    speed_NM2km;        // no identifier 'speed_NM2km'
    _NM2km(speed);      // no function _NM2km
    _NM2km(120.3);      // no function _NM2km
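
If a conversion is needed for variables as well, then one option is to implement it in an ordinary function, which is called by the literal operators. In this sketch the name nm2km is an assumption, and the operators are written without the blank following the empty string (accepted since C++14):

```cpp
// Hypothetical helper: an ordinary function performs the conversion,
// so it can also be used with variables.
constexpr double nm2km(long double nm)
{
    return static_cast<double>(nm * 1.852L);
}

// The literal operators merely forward to the ordinary function.
constexpr double operator""_NM2km(long double nm)
{
    return nm2km(nm);
}
constexpr double operator""_NM2km(unsigned long long int nm)
{
    return nm2km(static_cast<long double>(nm));
}
```

Constants are now converted using, e.g., 100_NM2km or 450.5_NM2km, while the value of a variable like speed is converted by calling nm2km(speed).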

11.14: Overloadable operators

The following operators can be overloaded:
    +       -       *       /       %       ^       &       |
    ~       !       ,       =       <=>     <       >       <=
    >=      ++      --      <<      >>      ==      !=      &&
    ||      +=      -=      *=      /=      %=      ^=      &=
    |=      <<=     >>=     []      ()      ->      ->*     new
    new[]   delete  delete[]

Several operators have textual alternatives:

    textual alternative     operator
    -------------------     --------
    and                     &&
    and_eq                  &=
    bitand                  &
    bitor                   |
    compl                   ~
    not                     !
    not_eq                  !=
    or                      ||
    or_eq                   |=
    xor                     ^
    xor_eq                  ^=

`Textual' alternatives of operators are also overloadable (e.g., operator and). However, note that textual alternatives are not additional operators. So, within the same context operator&& and operator and cannot both be overloaded.
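
That a textual alternative and its operator are one and the same can be shown using a small sketch. The struct Flags is hypothetical; only operator bitor is defined, and yet both spellings can be used:

```cpp
// Hypothetical struct, used for illustration only.
struct Flags
{
    unsigned bits;
};

// Defining operator bitor is defining operator|
constexpr Flags operator bitor(Flags lhs, Flags rhs)
{
    return Flags{ lhs.bits | rhs.bits };
}

static_assert((Flags{ 1 } | Flags{ 4 }).bits == 5, "spelled |");
static_assert((Flags{ 1 } bitor Flags{ 2 }).bits == 3, "spelled bitor");
```

Adding a definition of operator| for two Flags arguments to this code results in a redefinition error.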

Several of these operators may only be overloaded as member functions within a class. This holds true for the '=', the '[]', the '()' and the '->' operators. Consequently, it isn't possible to redefine, e.g., the assignment operator globally in such a way that it accepts a char const * as an lvalue and a String & as an rvalue. Fortunately, that isn't necessary either, as we have seen in section 11.3.

Finally, the following operators cannot be overloaded:

    .       .*      ::      ?:      sizeof  typeid