I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers various changes to function and method declaration and definition.
This is the 4th of the 8 articles that currently make up the “C++11 Features” series.
C++11 contains various changes which perhaps don’t expand what’s possible, but certainly improve the clarity with which they can be expressed. Among these changes are explicit defaulting and deleting of methods.
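C++ has long had the ability for the compiler to automatically generate some special methods on demand. Specifically, it will generate:
- A default constructor.
- A copy constructor, which takes a `const` reference to its own type.
- A destructor.
- A copy assignment operator, which takes a `const` reference to its own type.

Incidentally, C++11 adds two more to this list:
- A move constructor.
- A move assignment operator.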
However, there are a number of limitations to this in C++03, where the compiler either generates a method that you don’t want or fails to generate a method that you do. For example, you may wish the compiler to provide a default constructor even if there are other constructors defined on the class (normally it only generates one if no other constructors are defined). Or perhaps your class semantics are such that copying it will lead to double-freeing some internal resource so you want to disable the copy constructor and assignment.
Some of this has always been possible with little tricks. For example, you could declare a `private` copy constructor and fail to provide an implementation, leading to compile errors if anybody outside the class tries to use it (and link errors if the class itself does). However, in C++11 the `default` and `delete` keywords provide a way to do this cleanly and explicitly in a manner consistent with the syntax for declaring a pure virtual method:
    class Uncopyable
    {
    public:
        Uncopyable() = default;
        Uncopyable(const Uncopyable&) = delete;
        Uncopyable& operator=(const Uncopyable&) = delete;
    };
This class first declares explicitly that it wants the compiler to generate a default constructor for it. It then deletes the copy constructor and copy assignment operator, so any attempt to copy an instance, or to assign one instance to another, fails to compile.
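As a quick way to convince yourself of this, here’s a minimal sketch that checks the behaviour at compile time using the standard type traits from `<type_traits>`. The `static_assert` lines and the `checkUncopyable()` helper are my own additions for illustration and aren’t needed in real code.

```cpp
#include <cassert>
#include <type_traits>

class Uncopyable
{
public:
    Uncopyable() = default;
    Uncopyable(const Uncopyable&) = delete;
    Uncopyable& operator=(const Uncopyable&) = delete;
};

// The defaulted constructor is available...
static_assert(std::is_default_constructible<Uncopyable>::value,
              "default constructor should be generated");
// ...but both forms of copying have been removed.
static_assert(!std::is_copy_constructible<Uncopyable>::value,
              "copy constructor should be deleted");
static_assert(!std::is_copy_assignable<Uncopyable>::value,
              "copy assignment should be deleted");

// Hypothetical helper: default construction still works at runtime.
void checkUncopyable()
{
    Uncopyable original;
    (void)original;
    // Uncopyable copy(original);  // would fail to compile: deleted function
    assert(!std::is_copy_constructible<Uncopyable>::value);
}
```

Uncommenting the copy inside `checkUncopyable()` should produce an error which names the deleted constructor, which is rather more helpful than the old private-and-unimplemented trick.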
When combined with appropriate templating, the `delete` keyword can also be used to restrict a function call to only taking one or more explicitly defined types:
    class OnlyLongs
    {
    public:
        void method(long int arg);
        template<class T> void method(T) = delete;
    };
I think of this as conceptually extending the notion of an `explicit` constructor to regular methods, although I’m sure C++ experts might wince at the comparison.
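To see which argument types actually get through, here’s a rough sketch which probes `OnlyLongs::method()` at compile time. The `Accepts` trait and the `acceptsImpl` helpers are hypothetical names I’ve made up for the purpose, and the empty body of `method()` is just a placeholder.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class OnlyLongs
{
public:
    void method(long int) { }
    template<class T> void method(T) = delete;
};

// Hypothetical detection helpers: the first overload only takes part in
// overload resolution if calling method() with a T is well-formed.
template <class T>
auto acceptsImpl(int)
    -> decltype(std::declval<OnlyLongs&>().method(std::declval<T>()),
                std::true_type());
template <class T>
std::false_type acceptsImpl(...);

template <class T>
struct Accepts : decltype(acceptsImpl<T>(0)) { };

// A long matches the non-template overload exactly.
static_assert(Accepts<long>::value, "long should be accepted");
// Anything else is a better match for the deleted template.
static_assert(!Accepts<int>::value, "int should be rejected");
static_assert(!Accepts<double>::value, "double should be rejected");
```

The point to note is that an `int` argument doesn’t quietly convert to `long`: the template is an exact match, so it wins overload resolution and then fails because it’s deleted.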
In a similar vein, it’s now possible to override base class methods more reliably in C++11. Consider the following inheritance:
    class Base
    {
    public:
        virtual void method(double arg);
    };

    class Derived : public Base
    {
    public:
        virtual void method(int arg);
    };
Perhaps the author of `Derived` intended to override the virtual `method()`, but has instead overloaded it with an alternative version taking an `int` instead of a `double`. This might seem an obvious mistake, but consider that perhaps `Base::method()` took an `int` when `Derived` was originally defined but has since been modified by someone who was unaware of the existence of `Derived`. Due to function overloading this is quite valid so the compilation won’t fail.
In C++11 this sort of problem can be avoided by marking a method with the `override` specifier, which states that it must override a virtual method with the same signature in a base class or a compilation error will result:
    class Base
    {
    public:
        virtual void method1(double arg);
        virtual void method2(double arg);
    };

    class Derived : public Base
    {
    public:
        virtual void method1(double arg) override; // this is fine
        virtual void method2(int arg) override;    // fails to compile in C++11
    };
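To make the consequences of the silent overload a bit more concrete, here’s a small sketch which shows what happens at runtime when the call goes through a reference to the base class. The class names `Overloaded` and `Overridden` and the `callThroughBase()` helper are purely illustrative.

```cpp
#include <cassert>
#include <string>

class Base
{
public:
    virtual ~Base() { }
    virtual std::string method(double) { return "Base"; }
};

// Declares a brand new method(int) instead of overriding method(double).
class Overloaded : public Base
{
public:
    virtual std::string method(int) { return "Overloaded"; }
};

// With override, the compiler checks the signature against Base for us.
class Overridden : public Base
{
public:
    std::string method(double) override { return "Overridden"; }
};

// Calls the virtual method via the base class interface.
std::string callThroughBase(Base& object)
{
    return object.method(1.5);
}
```

Passing an `Overloaded` instance to `callThroughBase()` ends up in `Base::method()`, since nothing in the derived class overrides it, whereas an `Overridden` instance behaves as its author intended.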
Similar to `override` is `final`, which is used in the same way but explicitly prevents a derived class from overriding a method. It can also be applied to a class as a whole to prevent any other classes deriving from it:
    class Base1 final { };
    class Derived1 : public Base1 { }; // fails to compile, Base1 is final

    class Base2
    {
    public:
        virtual void method(int arg) final;
    };

    class Derived2 : public Base2
    {
    public:
        virtual void method(int arg); // fails to compile, method() is final
    };
I haven’t quite made up my mind what I think of this particular feature yet. It seems like it could be useful in some limited cases, but I worry a little that base class authors may overuse it and make future improvements difficult. On the other hand, you already have to make things `virtual` explicitly in C++ and I suppose that `final` isn’t much higher on the risk spectrum than that.
One aspect of C++03 that’s a bit of a pain is the fact that you can’t call one constructor from another. So, if you want any commonality of code between constructors then you need to move it into another method and call it from the relevant places:
    class Person
    {
    public:
        Person(const char* name)
        {
            init(name);
        }
        Person(const std::string& name)
        {
            init(name.c_str());
        }
        void init(const char *name)
        {
            // Initialise from name here
        }
    };
This works fine but it’s a little annoying having that extra method. Derived classes also must re-implement their own constructors even if a base class constructor would have done the job just as well, which is also a pain.
C++11 solves these issues by allowing constructors to delegate to other constructors (i.e. call them), where the syntax is the same as that used to invoke a base class constructor:
    class Person
    {
    public:
        Person(const char* name)
        {
            // Initialise from name here
        }
        Person(const std::string& name) : Person(name.c_str()) { }
    };
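Here’s a slightly fuller sketch of the delegating version which actually stores the name, so the effect can be observed. The `name_` member and the `name()` accessor are my own additions for illustration.

```cpp
#include <cassert>
#include <string>

class Person
{
public:
    // The "real" constructor which does the initialisation.
    Person(const char* name) : name_(name) { }

    // Delegates to the constructor above rather than duplicating it.
    Person(const std::string& name) : Person(name.c_str()) { }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

One restriction worth knowing is that a delegating constructor can’t also initialise members in the same initialiser list: the delegation must be the only entry, so any extra work has to go in the constructor body.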
One slightly subtle point to note here is for default arguments in library code. Consider the following constructor which we’ll assume is somewhere in a library:
    class FileReader
    {
    public:
        FileReader(size_t bufferSize=4096);
        // ...
    };
This is fine as far as it goes, but remember that the constant `4096` is in the header file and hence is incorporated into all code which links against this library. This means that changing the constant requires recompilation of all the code which uses the library, even if the library itself is built as a shared object. However, using delegation we can instead arrange that the constant is fixed in the library code without compromising the interface offered to clients of the library:
    // In header file:
    class FileReader
    {
    public:
        FileReader(size_t bufferSize);
        FileReader();
        // ...
    };

    // In cpp file:
    FileReader::FileReader() : FileReader(4096)
    {
    }
This is perhaps not a major concern outside those building shared libraries (or people with very large projects worried about recompilation time) but it’s worth bearing in mind.
In C++11 it’s also possible to expose a base class’ constructors directly in the derived class with a slightly quirky new use of the `using` keyword:
    class Base
    {
    public:
        Base(int value);
    };

    class Derived : public Base
    {
    public:
        using Base::Base;
    };
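As a small sketch of this in action, the version below gives `Base` something to store so we can see that the inherited constructor really does initialise it. The `value_` member and the `value()` and `doubled()` methods are hypothetical additions of mine.

```cpp
#include <cassert>

class Base
{
public:
    Base(int value) : value_(value) { }
    int value() const { return value_; }

private:
    int value_;
};

class Derived : public Base
{
public:
    // Makes Base(int) usable to construct a Derived directly.
    using Base::Base;

    int doubled() const { return value() * 2; }
};
```

With this in place `Derived d(21);` compiles without `Derived` declaring any constructors of its own.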
Finally with regard to construction, it’s now possible to provide non-static data members with initial values like this:
    class Point
    {
    public:
        Point() { }
        Point(int xPos, int yPos) : x(xPos), y(yPos) { }
    private:
        int x = 10;
        int y = 20;
    };
These initial values apply to any constructors which don’t explicitly specify a different value in their initialiser lists. So in the example above, the default constructor leaves `x=10` and `y=20`, whereas the explicit one leaves the values as whatever was passed in by the caller.
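To check that rule, here’s a minimal sketch of a `Point` class with accessors added so the values can be inspected. The single-argument constructor and the `getX()` and `getY()` methods are my own additions, included to show a constructor which overrides only one of the two defaults.

```cpp
#include <cassert>

class Point
{
public:
    // Uses both in-class initial values.
    Point() { }

    // Overrides x only, so y keeps its in-class initial value.
    explicit Point(int xPos) : x(xPos) { }

    // Overrides both values.
    Point(int xPos, int yPos) : x(xPos), y(yPos) { }

    int getX() const { return x; }
    int getY() const { return y; }

private:
    int x = 10;
    int y = 20;
};
```

So `Point()` yields (10, 20), `Point(5)` yields (5, 20) and `Point(1, 2)` yields (1, 2).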
On the vaguely related subject of method declarations in general, one thing I forgot to mention in my previous post on type inference was that as of C++11, function declarations can employ an alternative syntax where the return type is specified after the argument list.
To illustrate why this is useful, consider the following template function:
    template <class Left, class Right>
    Left divide(const Left& lhs, const Right& rhs)
    {
        return lhs / rhs;
    }
This is all very generic apart from one problem: the programmer has to specify in advance the return type, which constrains the interface for those using it. For example, the original coder has specified here that the return type must be `Left`, but what about an instantiation where `Left` is `int` and `Right` is `float`, and the return type is expected to be `float`?
Since the compiler already must know the result type of the expression at compile-time, C++11 allows type inference to be used. Given my previous post on the topic, what we’d like is something like this:
    template <class Left, class Right>
    decltype(lhs / rhs) divide(const Left& lhs, const Right& rhs) // invalid
    {
        return lhs / rhs;
    }
This is invalid, however, because the `decltype` expression refers to parameter names which are not yet defined. To solve this, the new C++11 syntax of trailing return type is used:
    template <class Left, class Right>
    auto divide(const Left& lhs, const Right& rhs) -> decltype(lhs / rhs)
    {
        return lhs / rhs;
    }
Note that `auto` here means something slightly different to its use in type inference since it doesn’t actually do any deduction. The `decltype` in this case is doing the deduction, but this syntax can just as easily be used for standard function definitions:
    auto divide(int x, float y) -> double
    {
        return x / y;
    }
Personally I’d stick to the standard syntax unless you’re using type inference, however.
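For anyone who wants to confirm what the trailing return type actually deduces, here’s a short sketch of the template version of `divide()` with a couple of compile-time checks. The `static_assert` lines are my own additions and simply document the types I’d expect from the usual arithmetic conversions.

```cpp
#include <cassert>
#include <type_traits>

template <class Left, class Right>
auto divide(const Left& lhs, const Right& rhs) -> decltype(lhs / rhs)
{
    return lhs / rhs;
}

// int / float is promoted to float, so that's the deduced return type.
static_assert(std::is_same<decltype(divide(1, 2.0f)), float>::value,
              "int / float should yield float");
// int / int stays as int, so integer division applies.
static_assert(std::is_same<decltype(divide(7, 2)), int>::value,
              "int / int should yield int");
```

With the original fixed return type of `Left`, the first case would have been truncated back to an `int`.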
Last on the list of ever-increasingly-loosely-related topics in this post is the subject of anonymous functions, or lambdas as they’re often known. These allow functions to be defined inline with other code, taking arguments and returning a value. The basic syntax is:
[capture](parameters)->return { body }
The parameters take the form of a parameter list just as a standard function and you’ll note that the return type is specified using the alternative syntax described above. In the case of a lambda, incidentally, the return type can be omitted and the compiler will infer it, provided that all `return` expressions evaluate to the same type. Similarly the parameter list can be omitted if empty, although I’m not sure if this is good practice. The function body is also that of a standard function. The capture list is related to making closures and I’ll cover that in a second, but first let’s see an example.
Consider a container implementing a vector of strings where the user wishes to create a filtered version based on some arbitrary criteria they specify. A flexible way to implement this is to allow the user to pass in a function which is passed each item in turn and have it return a `bool` to indicate whether the item should be included in the filtered list:
    #include <functional>
    #include <string>
    #include <vector>

    class MyStrings
    {
    public:
        // ...
        std::vector<std::string>
        filteredList(std::function<bool (const std::string&)> func);
    private:
        std::vector<std::string> _strings; // the strings being filtered
    };

    std::vector<std::string>
    MyStrings::filteredList(std::function<bool (const std::string&)> func)
    {
        std::vector<std::string> filtered;
        for (auto it = _strings.begin(); it != _strings.end(); ++it) {
            if (func(*it)) {
                filtered.push_back(*it);
            }
        }
        return filtered;
    }
Note the use of `std::function`, which is another C++11 addition that stands for any type of function which has the specified prototype. In this case it takes a `const std::string&` and returns `bool`. Note that you need to include `<functional>` to use `std::function`.
In principle this approach would work fine in C++03 with an appropriate function pointer or functor, but both defining a named function and defining a class with the function call operator overloaded are heavier than required here. Also, including the filter function in the context of the calling code makes it easier to follow, rather than having to jump elsewhere in the code to find it.
So, how are things better in C++11? Enter lambdas:
    std::vector<std::string> getStringsWithHello(MyStrings &strings)
    {
        return strings.filteredList([](const std::string& str) {
            return str.find("hello") != std::string::npos;
        } );
    }
Please excuse the indentation — I’m still fairly new to these things so I haven’t settled on an appropriate coding style. Hopefully the intent of this (not particularly useful) function should be clear, however, and it’s entirely self-contained. Note that we’ve omitted the return type as described above.
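Putting the pieces so far together, here’s a self-contained sketch that can be compiled and played with. The constructor taking a vector and the `const` qualifiers are assumptions of mine, since the post doesn’t show how `MyStrings` gets its contents.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

class MyStrings
{
public:
    // Assumed constructor: simply copies the strings to be filtered.
    explicit MyStrings(const std::vector<std::string>& strings)
        : _strings(strings) { }

    std::vector<std::string>
    filteredList(std::function<bool (const std::string&)> func) const
    {
        std::vector<std::string> filtered;
        for (auto it = _strings.begin(); it != _strings.end(); ++it) {
            if (func(*it)) {
                filtered.push_back(*it);
            }
        }
        return filtered;
    }

private:
    std::vector<std::string> _strings;
};

// The lambda is the filter: it keeps any string containing "hello".
std::vector<std::string> getStringsWithHello(const MyStrings& strings)
{
    return strings.filteredList([](const std::string& str) {
        return str.find("hello") != std::string::npos;
    });
}
```

Given a `MyStrings` holding "hello world", "goodbye" and "say hello", calling `getStringsWithHello()` returns the first and last of those in their original order.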
To make this function remotely useful in the real world we need to be able to search for any string, not just `"hello"`, and for this we can use the capture list. Effectively this specifies inputs to the lambda from the defining code (as opposed to the code which calls the lambda once it’s defined), but the form this takes is direct access to the local variables in the scope of the defining function. This is usually known as a closure. Let’s see the example above updated to use this:
    std::vector<std::string>
    getStringsWithSubstring(MyStrings &strings, const std::string& substring)
    {
        return strings.filteredList([&substring](const std::string& str) {
            return str.find(substring) != std::string::npos;
        } );
    }
In this example the `substring` parameter has been defined in the capture list of the lambda to make it available to the code in the body. In this case `&substring` means “capture the `substring` variable by reference”. It’s worth noting that because it’s captured by reference, it’s a reference to the real value on the stack in the context of `getStringsWithSubstring()` (which in turn is a reference to the value in the caller as it happens in this example) so any changes the lambda makes will be reflected in the outer scope. Here `substring` happens to be `const`, so the lambda can only read it.
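To illustrate the point about changes being visible in the outer scope, here’s a small sketch where the lambda updates a local counter which it has captured by reference. The `countMatches()` function is a hypothetical example of mine rather than part of the `MyStrings` class.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Counts how many items contain the specified substring.
int countMatches(const std::vector<std::string>& items,
                 const std::string& substring)
{
    int matches = 0;
    std::for_each(items.begin(), items.end(),
                  [&matches, &substring](const std::string& item) {
        if (item.find(substring) != std::string::npos) {
            ++matches;  // modifies the local variable in countMatches()
        }
    });
    return matches;
}
```

Had `matches` been captured by value instead, the increment wouldn’t even compile without marking the lambda `mutable`, and the outer variable would be left at zero regardless.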
Of course, like any reference to a local variable within a function it will no longer be valid once the function has returned. As a result, if you wish to return a lambda from a function then you’d better capture its locals by value instead of by reference, which just means omitting the `&`:
    std::function<bool (const std::string&)>
    getFilterFunction(const std::string& str)
    {
        return [str](const std::string& s) {
            return s.find(str) != std::string::npos;
        };
    }
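As a sketch of why capturing by value matters here, the hypothetical `makeGreetingFilter()` function below builds a filter from a local string which has been destroyed by the time the filter is used. It works because the lambda holds its own copy.

```cpp
#include <cassert>
#include <functional>
#include <string>

std::function<bool (const std::string&)>
getFilterFunction(const std::string& str)
{
    // str is copied into the lambda, so the copy travels with it.
    return [str](const std::string& s) {
        return s.find(str) != std::string::npos;
    };
}

// Hypothetical caller: the local string goes out of scope on return.
std::function<bool (const std::string&)> makeGreetingFilter()
{
    std::string greeting = "hello";
    return getFilterFunction(greeting);
}
```

If the capture list were changed to `[&str]` this would compile just as happily, but calling the returned filter would be undefined behaviour since it would refer to a string which no longer exists.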
For methods it’s also possible to include `this` in the capture list, which will give the lambda the same implicit access to the members of the class that methods have (i.e. there’s no need to explicitly mention `this`). Also, a lone `&` can be used to allow the compiler to automatically determine which variables to capture by reference, and a lone `=` is equivalent except that values are captured by value. So, some examples of capture lists:
- `[]` means capture nothing.
- `[foo,&bar]` means capture `foo` by value and `bar` by reference.
- `[&]` means capture everything by reference.
- `[=,&foo]` means capture everything by value except `foo`, which is captured by reference.
- `[this,foo]` means allow the lambda access to class members as well as capturing `foo` by value.

Bear in mind that both `[&]` and `[=]` also include `this` implicitly. This is very convenient, but it does mean you need to employ the same care accessing variables inside lambdas that use these specifiers inside methods as you would in the methods themselves. Personally I’d suggest keeping your capture list as tightly specified as possible, since I have a great preference for making things explicit in code, although that’s very much a personal view.
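Here’s a brief sketch of capturing `this` explicitly inside a method. The `Threshold` class is entirely made up for the example, and it uses `std::copy_if` (another C++11 addition) to do the filtering.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

class Threshold
{
public:
    explicit Threshold(int limit) : limit_(limit) { }

    // Returns the values which are strictly greater than the limit.
    std::vector<int> above(const std::vector<int>& values) const
    {
        std::vector<int> result;
        std::copy_if(values.begin(), values.end(),
                     std::back_inserter(result),
                     // Capturing this allows limit_ to be used directly.
                     [this](int value) { return value > limit_; });
        return result;
    }

private:
    int limit_;
};
```

Without `this` in the capture list the reference to `limit_` fails to compile, which I rather like as a reminder that the lambda depends on the object’s state.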
This might all seem a little strange to anybody not familiar with closures in other languages so it might help to not consider lambdas as functions per se, but instead as little class instantiations with the function call operator overloaded. These instances can maintain their own copies of the variables passed into the capture section and use them in the body of the lambda when it’s called. In fact, this is probably more or less how the compiler implements them, but the syntax is a good deal more convenient than attempting to do the same.
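To make that comparison concrete, here’s a sketch of a hand-written class which behaves roughly like a lambda capturing a string by value, alongside the lambda itself. The names `ContainsSubstring` and `lambdaContains()` are hypothetical, and this is only my approximation of what a compiler might generate rather than a description of any real implementation.

```cpp
#include <cassert>
#include <string>

// Roughly what [substring](const std::string& s) { ... } stands for.
class ContainsSubstring
{
public:
    // The constructor plays the part of the capture list.
    explicit ContainsSubstring(const std::string& substring)
        : substring_(substring) { }

    // The function call operator plays the part of the lambda body.
    bool operator()(const std::string& s) const
    {
        return s.find(substring_) != std::string::npos;
    }

private:
    std::string substring_;  // the instance's own copy of the capture
};

// The same check written as a lambda, for comparison.
bool lambdaContains(const std::string& substring, const std::string& s)
{
    auto matcher = [substring](const std::string& candidate) {
        return candidate.find(substring) != std::string::npos;
    };
    return matcher(s);
}
```

Both versions give the same answers, but the lambda manages it in three lines at the point of use.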
Anyway, that was a bit of a long post, but hopefully covered some interesting ground. Well, you’re still reading, aren’t you? What do you mean you just skipped to the end to see if there’d be an amusing picture of a cat?!