C++11: Function and method changes

17 Sep 2013 at 7:50PM in Software

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers various changes to function and method declaration and definition.

This is the 4th of the 8 articles that currently make up the “C++11 Features” series.


Defaulted and deleted methods

C++11 contains various changes which perhaps don’t expand what’s possible, but certainly improve the clarity with which they can be expressed. Among these changes are explicit defaulting and deleting of methods.

C++ has long had the ability for the compiler to automatically generate some special methods on demand — specifically, it will generate:

  • Default constructor: a constructor which takes no arguments.
  • Copy constructor: a constructor which takes only a const reference to its own type.
  • Copy assignment operator: assignment from a const reference to its own type.
  • Destructor: performs various low-level cleanup functions.

Incidentally, C++11 adds two more to this list:

  • Move constructor: as the copy constructor but using move semantics.
  • Move assignment operator: as the copy assignment operator but using move semantics.

However, there are a number of limitations to this in C++03, where the compiler either generates a method that you don’t want or fails to generate a method that you do. For example, you may wish the compiler to provide a default constructor even if there are other constructors defined on the class (normally it only generates one if no other constructors are defined). Or perhaps your class semantics are such that copying it will lead to double-freeing some internal resource so you want to disable the copy constructor and assignment.

Some of this has always been possible with little tricks — for example, you could define a private copy constructor and fail to provide an implementation, leading to compile errors if anybody tries to use it. However, in C++11 the default and delete keywords provide a way to do this cleanly and explicitly in a manner consistent with the syntax for declaring a pure virtual method:

class Uncopyable
{
public:
    Uncopyable() = default;
    Uncopyable(const Uncopyable&) = delete;
    Uncopyable& operator=(const Uncopyable&) = delete;
};

This class first declares explicitly that it wants the compiler to generate a default constructor for it. Then it explicitly prevents the compiler from generating a copy constructor and assignment operator, in essence preventing another instance of the class attempting to copy it.
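As a quick sanity check, here’s a minimal sketch that confirms those properties using the standard traits in <type_traits>. The class is repeated so the snippet stands alone, and the checkUncopyable() helper is purely my own invention for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the class above, repeated so the sketch is self-contained.
class Uncopyable
{
public:
    Uncopyable() = default;
    Uncopyable(const Uncopyable&) = delete;
    Uncopyable& operator=(const Uncopyable&) = delete;
};

// Compile-time checks: default construction is allowed, copying is not.
static_assert(std::is_default_constructible<Uncopyable>::value,
              "default constructor should be available");
static_assert(!std::is_copy_constructible<Uncopyable>::value,
              "copy constructor should be deleted");
static_assert(!std::is_copy_assignable<Uncopyable>::value,
              "copy assignment should be deleted");

// Hypothetical helper: exercises the class at run time.
void checkUncopyable()
{
    Uncopyable first;
    // Uncopyable second(first);  // would not compile: copy constructor is deleted
    // first = Uncopyable();      // would not compile: copy assignment is deleted
    assert(!std::is_copy_constructible<Uncopyable>::value);
    (void)first;  // silence unused-variable warnings
}
```

Uncommenting either of the two commented lines should produce a compile error that mentions the deleted function, which is a good deal friendlier than the link errors the old private-and-unimplemented trick could produce.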

When combined with appropriate templating, the delete keyword can also be used to restrict a function call to only taking one or more explicitly defined types:

class OnlyLongs
{
public:
    void method(long int arg);
    template<class T> void method(T) = delete;
};

I think of this as conceptually extending the notion of an explicit constructor to regular methods, although I’m sure C++ experts might wince at the comparison.
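To see why this works, here’s a small sketch of which calls are accepted. The `last` member and the checkOnlyLongs() helper are assumptions added so that there’s something to observe; they aren’t part of the original class.

```cpp
#include <cassert>

class OnlyLongs
{
public:
    // The only usable overload; records the argument so it can be inspected.
    void method(long int arg) { last = arg; }

    // Any other argument type is an exact match for this template, which
    // beats converting to long, and so the call selects a deleted function.
    template<class T> void method(T) = delete;

    long int last = 0;  // hypothetical member, for demonstration only
};

// Hypothetical helper showing which calls compile.
void checkOnlyLongs()
{
    OnlyLongs obj;
    obj.method(42L);     // fine: the non-template overload is preferred for long
    // obj.method(42);   // would not compile: T = int selects the deleted template
    // obj.method(4.2);  // would not compile: T = double selects the deleted template
    assert(obj.last == 42L);
}
```

The key point is that overload resolution happens before the deleted-ness is considered, so the template wins for anything that isn’t already a long and the call is then rejected.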

Explicit overrides and final

In a similar vein, it’s now possible to override base class methods more reliably in C++11. Consider the following inheritance:

class Base
{
public:
    virtual void method(double arg);
};

class Derived : public Base
{
public:
    virtual void method(int arg);
};

Perhaps the author of Derived intended to override the virtual method(), but has instead overloaded it with an alternative version taking an int instead of a double. This might seem an obvious mistake, but consider that perhaps Base::method() took an int when Derived was originally defined but has since been modified by someone who was unaware of the existence of Derived. Due to function overloading this is quite valid so the compilation won’t fail.
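To make the consequence concrete, here’s a minimal sketch in which the methods return an int purely so we can tell which one ran. The return values, the virtual destructor and the callThroughBase() and checkSilentOverload() helpers are my additions for illustration.

```cpp
#include <cassert>

class Base
{
public:
    virtual ~Base() { }
    virtual int method(double) { return 1; }  // 1 identifies the Base version
};

class Derived : public Base
{
public:
    // Intended as an override, but the differing parameter type makes it
    // a separate function which merely hides Base::method in Derived.
    virtual int method(int) { return 2; }
};

// Calls through the base interface, as generic code typically would.
int callThroughBase(Base& obj)
{
    return obj.method(3);  // 3 is converted to double; only Base::method exists here
}

// Hypothetical helper demonstrating the mismatch.
void checkSilentOverload()
{
    Derived d;
    assert(callThroughBase(d) == 1);  // the Base version ran, not the Derived one
    assert(d.method(3) == 2);         // calling directly on Derived finds the overload
}
```

So the code compiles cleanly, yet anything holding a Base reference quietly gets the base class behaviour.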

In C++11 this sort of problem can be avoided by marking a method with the override specifier, which states that it must override a virtual method with the same signature in a base class, otherwise a compilation error will result:

class Base
{
public:
    virtual void method1(double arg);
    virtual void method2(double arg);
};

class Derived : public Base
{
public:
    virtual void method1(double arg) override; // this is fine
    virtual void method2(int arg) override;    // fails to compile in C++11
};
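For completeness, here’s a compilable sketch of the working case, again with int return values added only so the dispatch can be observed. The helper names are hypothetical.

```cpp
#include <cassert>

class Base
{
public:
    virtual ~Base() { }
    virtual int method1(double) { return 1; }
};

class Derived : public Base
{
public:
    // Signature matches Base::method1, so override is satisfied.
    int method1(double) override { return 2; }

    // int method1(int) override { return 3; }  // would not compile: overrides nothing
};

int callThroughBase(Base& obj)
{
    return obj.method1(3);  // dispatched virtually on the dynamic type
}

// Hypothetical helper checking that the override is really used.
void checkOverride()
{
    Base b;
    Derived d;
    assert(callThroughBase(b) == 1);
    assert(callThroughBase(d) == 2);
}
```

Note that the virtual keyword is optional on the overriding method; override already implies it, since only virtual methods can be overridden.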

Similar to override is final which is used in the same way but explicitly prevents a derived class from overriding a method. It can also be used on a base class to prevent any other classes deriving from it further:

class Base1 final { };
class Derived1 : public Base1 { }; // fails to compile, Base1 is final

class Base2
{
public:
    virtual void method(int arg) final;
};

class Derived2 : public Base2
{
public:
    virtual void method(int arg); // fails to compile, method() is final
};

I haven’t quite made up my mind what I think of this particular feature yet - it seems like it could be useful in some limited cases, but I worry a little that base class authors may overuse it and make future improvements difficult. On the other hand, you already have to make things virtual explicitly in C++ and I suppose that final isn’t much higher on the risk spectrum than that.

Constructor chaining and in-place initialisation

One aspect of C++03 that’s a bit of a pain is the fact that you can’t call one constructor from another. So, if you want any commonality of code between constructors then you need to move it into another method and call it from the relevant places:

class Person
{
public:
    Person(const char* name)
    {
        init(name);
    }

    Person(const std::string& name)
    {
        init(name.c_str());
    }

    void init(const char *name)
    {
        // Initialise from name here
    }
};

This works fine but it’s a little annoying having that extra method. Derived classes also must re-implement their own constructors even if a base class constructor would have done the job just as well, which is also a pain.

C++11 solves these issues by allowing constructors to delegate to other constructors (i.e. call them), where the syntax is the same as that used to invoke a base class constructor:

class Person
{
public:
    Person(const char* name)
    {
        // Initialise from name here
    }

    Person(const std::string& name) : Person(name.c_str()) { }
};
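Here’s a runnable sketch of the same idea. The name_ member, the name() accessor and the checkPerson() helper are assumptions of mine, added so there’s some state to initialise and inspect.

```cpp
#include <cassert>
#include <string>

class Person
{
public:
    // Target constructor: does the real initialisation work.
    Person(const char* name) : name_(name) { }

    // Delegating constructor: forwards to the const char* overload.
    // No other member initialisers may appear alongside the delegation.
    Person(const std::string& name) : Person(name.c_str()) { }

    const std::string& name() const { return name_; }

private:
    std::string name_;  // hypothetical member, not in the original example
};

// Hypothetical helper: both construction routes end up in the same state.
void checkPerson()
{
    Person fromLiteral("Ada");
    Person fromString(std::string("Grace"));
    assert(fromLiteral.name() == "Ada");
    assert(fromString.name() == "Grace");
}
```

One restriction worth knowing: a constructor that delegates can’t also initialise members in the same initialiser list, so any extra setup has to go in its body.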

One slightly subtle point to note here is for default arguments in library code. Consider the following constructor which we’ll assume is somewhere in a library:

class FileReader
{
public:
    FileReader(size_t bufferSize=4096);
    // ...
};

This is fine as far as it goes, but remember that the constant 4096 is in the header file and hence is incorporated into all code which links against this library — this means that changing the constant requires recompilation of all the code which uses the library, even if the library itself is built as a shared object. However, using delegation we can instead arrange that the constant is fixed in the library code without compromising the interface offered to clients of the library:

// In header file:
class FileReader
{
public:
    FileReader(size_t bufferSize);
    FileReader();
    // ...
};

// In cpp file:
FileReader::FileReader() : FileReader(4096)
{
}

This is perhaps not a major concern outside those building shared libraries (or people with very large projects worried about recompilation time) but it’s worth bearing in mind.
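A single-file sketch of that arrangement might look like the following. The bufferSize_ member, its accessor and the checkFileReader() helper are hypothetical additions, and in a real library the two constructor definitions would of course live in the cpp file.

```cpp
#include <cassert>
#include <cstddef>

// Would live in the header file.
class FileReader
{
public:
    explicit FileReader(std::size_t bufferSize);
    FileReader();

    std::size_t bufferSize() const { return bufferSize_; }

private:
    std::size_t bufferSize_;  // hypothetical member for illustration
};

// Would live in the cpp file, so the constant is compiled into the library only.
FileReader::FileReader(std::size_t bufferSize) : bufferSize_(bufferSize) { }
FileReader::FileReader() : FileReader(4096) { }

// Hypothetical helper checking both constructors.
void checkFileReader()
{
    assert(FileReader().bufferSize() == 4096);
    assert(FileReader(512).bufferSize() == 512);
}
```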

In C++11 it’s also possible to expose a base class’ constructors directly in the derived class with a slightly quirky new use of the using keyword:

class Base
{
public:
    Base(int value);
};

class Derived : public Base
{
public:
    using Base::Base;
};
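Here’s a minimal sketch showing the inherited constructor in use. The value_ member and the value(), doubled() and checkInheritedConstructor() functions are my own additions so that the result can be checked.

```cpp
#include <cassert>

class Base
{
public:
    Base(int value) : value_(value) { }
    int value() const { return value_; }

private:
    int value_;  // hypothetical member for illustration
};

class Derived : public Base
{
public:
    using Base::Base;  // makes Derived(int) available without writing it out

    int doubled() const { return value() * 2; }
};

// Hypothetical helper: constructs Derived via the inherited constructor.
void checkInheritedConstructor()
{
    Derived d(21);
    assert(d.value() == 21);
    assert(d.doubled() == 42);
}
```

Bear in mind that any members added by the derived class aren’t touched by an inherited constructor, so they’d need in-class initial values if they’re to start in a known state.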

Finally with regard to construction, it’s now possible to provide non-static data members (const or not) with initial values at the point of declaration, like this:

class Point
{
public:
    Point() { }
    Point(int xArg, int yArg) : x(xArg), y(yArg) { }

private:
    int x = 10;
    int y = 20;
};

These initial values apply to any constructors which don’t explicitly assign a different value in their initialiser lists. So in the example above, the default constructor leaves x=10 and y=20, whereas the explicit one leaves the values as whatever was passed in by the caller.
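A runnable sketch of that behaviour follows. The accessors, the one-argument constructor and the checkPoint() helper are assumptions I’ve added to show a constructor that overrides only one of the two defaults.

```cpp
#include <cassert>

class Point
{
public:
    Point() { }                                       // uses both in-class values
    explicit Point(int xArg) : x(xArg) { }            // hypothetical: overrides x only
    Point(int xArg, int yArg) : x(xArg), y(yArg) { }  // overrides both

    int getX() const { return x; }
    int getY() const { return y; }

private:
    int x = 10;
    int y = 20;
};

// Hypothetical helper covering the three cases.
void checkPoint()
{
    Point defaults;
    assert(defaults.getX() == 10 && defaults.getY() == 20);

    Point onlyX(1);
    assert(onlyX.getX() == 1 && onlyX.getY() == 20);

    Point both(1, 2);
    assert(both.getX() == 1 && both.getY() == 2);
}
```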

Trailing return type

On the vaguely related subject of method declarations in general, one thing I forgot to mention in my previous post on type inference was that as of C++11, function and method declarations can employ an alternative syntax where the return type is specified after the argument list.

To illustrate why this is useful, consider the following template function:

template <class Left, class Right>
Left divide(const Left& lhs, const Right& rhs)
{
    return lhs / rhs;
}

This is all very generic apart from one problem — the programmer has to specify in advance the return type which constrains the interface for those using it. For example, the original coder has specified here that the return type must be Left, but what about an instantiation where Left is int and Right is float, and the return type is expected to be float?

Since the compiler already must know the result type of the expression at compile-time, C++11 allows type inference to be used. Given my previous post on the topic, what we’d like is something like this:

template <class Left, class Right>
decltype(lhs / rhs) divide(const Left& lhs, const Right& rhs)
{
    return lhs / rhs;
}

This is invalid, however, because the decltype expression refers to parameter names which are not yet defined. To solve this, the new C++11 syntax of trailing return type is used:

template <class Left, class Right>
auto divide(const Left& lhs, const Right& rhs) -> decltype(lhs / rhs)
{
    return lhs / rhs;
}
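Here’s a self-contained sketch of the trailing return type in action, written with the parameters spelled const Left& and with decltype applied to the same lhs / rhs expression that’s returned. The static_assert lines and the checkDivide() helper are my additions to confirm what the compiler deduces.

```cpp
#include <cassert>
#include <type_traits>

template <class Left, class Right>
auto divide(const Left& lhs, const Right& rhs) -> decltype(lhs / rhs)
{
    return lhs / rhs;
}

// The return type follows the usual arithmetic conversions of the operands.
static_assert(std::is_same<decltype(divide(1, 2.0f)), float>::value,
              "int / float should yield float");
static_assert(std::is_same<decltype(divide(7, 2)), int>::value,
              "int / int should yield int");

// Hypothetical helper exercising both instantiations.
void checkDivide()
{
    assert(divide(7, 2) == 3);        // integer division truncates
    assert(divide(1, 2.0f) == 0.5f);  // computed in float, not truncated to int
}
```

Had the return type been fixed as Left, the second call would have returned 0 rather than 0.5.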

Note that auto here means something slightly different to its use in type inference since it doesn’t actually do any deduction. The decltype in this case is doing the deduction, but this syntax can just as easily be used for standard function definitions:

auto divide(int x, float y) -> double
{
    return x / y;
}

Personally I’d stick to the standard syntax unless you’re using type inference, however.

Lambdas

Last on the list of ever-increasingly-loosely-related topics in this post is the subject of anonymous functions, or lambdas as they’re often known. These allow functions to be defined inline with other code, taking arguments and returning a value. The basic syntax is:

[capture](parameters)->return { body }

The parameters take the form of a parameter list just as for a standard function, and you’ll note that the return type is specified using the alternative syntax described above. In the case of a lambda, incidentally, the return type can be omitted and the compiler will infer it, provided that all return expressions evaluate to the same type (strictly speaking, C++11 as published only promises this when the body is a single return statement; the more general rule was made official in C++14). Similarly the parameter list can be omitted if empty, although I’m not sure if this is good practice. The function body is also that of a standard function. The capture list is related to making closures and I’ll cover that in a second. First let’s see an example.

Consider a container implementing a vector of strings where the user wishes to create a filtered version based on some arbitrary criteria they specify. A flexible way to implement this is to allow the user to pass in a function which is passed each item in turn and have it return a bool to indicate whether the item should be included in the filtered list:

#include <functional>
#include <string>
#include <vector>

class MyStrings
{
public:
    // ...
    std::vector<std::string>
    filteredList(std::function<bool (const std::string&)> func);
};

std::vector<std::string>
MyStrings::filteredList(std::function<bool (const std::string&)> func)
{
    std::vector<std::string> filtered;
    for (auto it = _strings.begin(); it != _strings.end(); ++it) {
        if (func(*it)) {
            filtered.push_back(*it);
        }
    }
    return filtered;
}

Note the use of std::function: this is another C++11 addition which can hold any callable (a function pointer, a functor or a lambda) that can be invoked with the specified prototype. In this case it takes a const std::string& and returns bool. Note that you need to include <functional> to use std::function.
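To illustrate that flexibility, here’s a small sketch in which one std::function parameter accepts a plain function, a functor and a lambda in turn. All of the names here (isEmpty, LongerThan, countMatching and checkPredicates) are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// A plain named function with the right prototype.
bool isEmpty(const std::string& s) { return s.empty(); }

// A functor: a class with the function call operator overloaded.
struct LongerThan
{
    std::size_t limit;
    bool operator()(const std::string& s) const { return s.size() > limit; }
};

// Applies any compatible callable to a fixed set of words.
int countMatching(const std::function<bool (const std::string&)>& pred)
{
    const char* words[] = { "", "hi", "hello" };
    int count = 0;
    for (const char* word : words) {
        if (pred(word)) {  // each word is converted to a temporary std::string
            ++count;
        }
    }
    return count;
}

void checkPredicates()
{
    LongerThan longerThanTwo = { 2 };
    auto exactlyTwo = [](const std::string& s) { return s.size() == 2; };

    assert(countMatching(isEmpty) == 1);        // only ""
    assert(countMatching(longerThanTwo) == 1);  // only "hello"
    assert(countMatching(exactlyTwo) == 1);     // only "hi"
}
```

The price of this generality is a little run-time overhead from type erasure, so a template parameter is sometimes preferred where the callable’s type is known at compile time.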

In principle this approach would work fine in C++03 with an appropriate function pointer or functor, but defining either a named function or a class with the function call operator overloaded is heavier than required here. Also, including the filter function in the context of the calling code makes it easier to follow, rather than having to jump elsewhere in the code to find it.

So, how are things better in C++11? Enter lambdas:

std::vector<std::string> getStringsWithHello(MyStrings &strings)
{
    return strings.filteredList([](const std::string& str) {
        return str.find("hello") != std::string::npos;
    } );
}

Please excuse the indentation — I’m still fairly new to these things so I haven’t settled on an appropriate coding style. Hopefully the intent of this (not particularly useful) function should be clear, however, and it’s entirely self-contained. Note that we’ve omitted the return type as described above.

To make this function remotely useful in the real world we need to be able to search for any string, not just "hello", and for this we can use the capture list. Effectively this specifies inputs to the lambda from the defining code (as opposed to the code which calls the lambda once it’s defined), but the form this takes is direct access to the local variables in the scope of the defining function. This is usually known as a closure. Let’s see the example above updated to use this:

std::vector<std::string>
getStringsWithSubstring(MyStrings &strings, const std::string& substring)
{
    return strings.filteredList([&substring](const std::string& str) {
        return str.find(substring) != std::string::npos;
    } );
}

In this example the substring parameter has been defined in the capture list of the lambda to make it available to the code in the body. In this case &substring means “capture the substring variable by reference”. It’s worth noting that because it’s captured by reference, it’s a reference to the real value on the stack in the context of getStringsWithSubstring() (which in turn is a reference to the value in the caller as it happens in this example) so any changes the lambda makes will be reflected in the outer scope.
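To show a by-reference capture actually writing back to the outer scope, here’s a self-contained sketch. I’ve had to assume a constructor for MyStrings and added an extra `examined` output parameter plus a checkCaptureByReference() helper, none of which appear in the original.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

class MyStrings
{
public:
    // Hypothetical constructor so the sketch has some data to filter.
    explicit MyStrings(const std::vector<std::string>& strings)
        : _strings(strings) { }

    std::vector<std::string>
    filteredList(std::function<bool (const std::string&)> func) const
    {
        std::vector<std::string> filtered;
        for (auto it = _strings.begin(); it != _strings.end(); ++it) {
            if (func(*it)) {
                filtered.push_back(*it);
            }
        }
        return filtered;
    }

private:
    std::vector<std::string> _strings;
};

// Filters by substring and also reports how many items the lambda examined.
std::vector<std::string>
getStringsWithSubstring(const MyStrings& strings, const std::string& substring,
                        std::size_t& examined)
{
    std::size_t calls = 0;
    auto matches = strings.filteredList(
        [&substring, &calls](const std::string& str) -> bool {
            ++calls;  // writes to the local variable in the enclosing function
            return str.find(substring) != std::string::npos;
        });
    examined = calls;
    return matches;
}

void checkCaptureByReference()
{
    std::vector<std::string> input;
    input.push_back("hello world");
    input.push_back("goodbye");
    input.push_back("say hello");

    std::size_t examined = 0;
    auto matches = getStringsWithSubstring(MyStrings(input), "hello", examined);
    assert(matches.size() == 2);  // two of the three strings contain "hello"
    assert(examined == 3);        // the lambda's increments reached the outer variable
}
```

The explicit -> bool is there because the lambda body is more than a single return statement. Even though std::function stores its own copy of the lambda, that copy still refers to the same `calls` variable.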

Of course, like any reference to a local variable within a function it will no longer be valid once the function has returned. As a result, if you wish to return a lambda from a function then you’d better capture its locals by value instead of by reference — just omit the &:

std::function<bool (const std::string&)>
getFilterFunction(const std::string& str)
{
    return [str](const std::string& s) {
        return s.find(str) != std::string::npos;
    };
}
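Here’s a short sketch demonstrating that the returned lambda remains usable after the original string has gone. The filterOutlivesArgument() helper and the test strings are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>

std::function<bool (const std::string&)>
getFilterFunction(const std::string& str)
{
    // str is copied into the lambda, so the result doesn't depend on the caller's string.
    return [str](const std::string& s) {
        return s.find(str) != std::string::npos;
    };
}

// Hypothetical helper: uses the filter after its argument has been destroyed.
bool filterOutlivesArgument()
{
    std::function<bool (const std::string&)> filter;
    {
        std::string needle = "cat";
        filter = getFilterFunction(needle);
    }  // needle is destroyed here; the lambda still holds its own copy

    return filter("concatenate") && !filter("dog");
}

void checkReturnedLambda()
{
    assert(filterOutlivesArgument());
}
```

Had the lambda captured by reference with [&str], calling the filter after the inner block would be undefined behaviour.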

For methods it’s also possible to include this in the capture list, which will give the lambda the same implicit access to the members of the class that methods have (i.e. there’s no need to write this-> explicitly). Also, a lone & can be used to allow the compiler to automatically determine which variables to capture by reference, and a lone = is equivalent except that the variables are captured by value. So, some examples of capture lists:

  • [] means capture nothing.
  • [foo,&bar] means capture foo by value and bar by reference.
  • [&] means capture everything by reference.
  • [=,&foo] means capture everything by value except foo, which is captured by reference.
  • [this,foo] means allow the lambda access to class members as well as capturing foo by value.

Bear in mind that both [&] and [=] also include this implicitly. This is very convenient, but it does mean that lambdas which use these specifiers inside methods need the same care when accessing variables as the methods themselves. Personally I’d suggest keeping your capture list as tightly specified as possible, since I have a great preference for making things explicit in code, though that’s very much a personal view.
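A few of those capture forms can be seen side by side in the following sketch. The variable names mirror the list above, while captureDemo(), the Accumulator class and checkCaptureLists() are hypothetical.

```cpp
#include <cassert>

// Shows when by-value copies are taken and that by-reference captures write through.
int captureDemo()
{
    int foo = 1;
    int bar = 2;

    auto byValue = [foo]() { return foo; };      // copies foo now, while it is 1
    auto mixed   = [=, &bar]() { bar += foo; };  // foo by value, bar by reference

    foo = 10;  // neither lambda sees this: both copied foo when they were created
    mixed();   // bar becomes 2 + 1 = 3

    return byValue() * 100 + bar;  // 1 * 100 + 3
}

class Accumulator
{
public:
    int addTwice(int amount)
    {
        // Capturing this allows total to be used as if inside a method.
        auto add = [this, amount]() { total += amount; };
        add();
        add();
        return total;
    }

private:
    int total = 0;
};

void checkCaptureLists()
{
    assert(captureDemo() == 103);

    Accumulator acc;
    assert(acc.addTwice(5) == 10);
}
```

The thing to notice is that by-value captures are snapshots taken when the lambda is created, not when it’s called.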

This might all seem a little strange to anybody not familiar with closures in other languages, so it might help to think of lambdas not as functions per se, but as little class instances with the function call operator overloaded. These instances can maintain their own copies of the variables passed into the capture section and use them in the body of the lambda when it’s called. In fact, this is probably more or less how the compiler implements them, but the syntax is a good deal more convenient than writing such a class by hand.
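To make that mental model concrete, here’s a hand-written class that does roughly what a [needle] by-value lambda does. It’s only a sketch of the idea, not a claim about what any particular compiler generates, and the ContainsSubstring, sameAnswers() and checkFunctorEquivalence() names are invented for the example.

```cpp
#include <cassert>
#include <string>

// Roughly the class a by-value lambda capturing one string stands for.
class ContainsSubstring
{
public:
    // The constructor plays the role of the capture list.
    explicit ContainsSubstring(const std::string& needle) : needle_(needle) { }

    // The function call operator plays the role of the lambda body.
    bool operator()(const std::string& s) const
    {
        return s.find(needle_) != std::string::npos;
    }

private:
    std::string needle_;  // the "captured" copy
};

// Compares the hand-written functor against the equivalent lambda.
bool sameAnswers(const std::string& needle, const std::string& text)
{
    ContainsSubstring functor(needle);
    auto lambda = [needle](const std::string& s) {
        return s.find(needle) != std::string::npos;
    };
    return functor(text) == lambda(text);
}

void checkFunctorEquivalence()
{
    assert(sameAnswers("ell", "hello"));
    assert(sameAnswers("xyz", "hello"));
    assert(ContainsSubstring("ell")("hello"));
}
```

The const on operator() mirrors the fact that a lambda can’t modify its by-value captures unless it’s declared mutable.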

Anyway, that was a bit of a long post, but hopefully covered some interesting ground. Well, you’re still reading, aren’t you? What do you mean you just skipped to the end to see if there’d be an amusing picture of a cat?!

The next article in the “C++11 Features” series is C++11: Template changes
Tue 26 Nov, 2013