I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers a miscellaneous set of language improvements which I haven’t yet discussed.
This is the 6th of the 8 articles that currently make up the “C++11 Features” series.
This post contains a collection of smaller changes which I didn’t feel were a good fit into other posts, but which I wanted to cover nonetheless.
C++11 has finally added a type-safe equivalent of C’s NULL macro for pointers,
so one no longer has to use 0 and risk all sorts of confusion where a function
has overloads that take an integral type and a pointer. The new constant is
nullptr and is implicitly convertible to any pointer type, including pointers
to members. Its type is std::nullptr_t. To remain backwards-compatible, the old
0 constant will still work.
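To make the overload problem concrete, here is a minimal sketch. The function name which() and its return codes are invented purely for illustration.

```cpp
// Hypothetical overload set: one integral parameter, one pointer parameter.
int which(int)         { return 1; }  // selected for the literal 0
int which(const char*) { return 2; }  // selected for nullptr and real pointers

// which(0)        picks the int overload, even if a null pointer was intended.
// which(nullptr)  picks the pointer overload, because nullptr is never an integer.
```

With only these two overloads, passing 0 silently selects the integral version, while nullptr can only ever match the pointer version.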
‘Nuff said on that one.
In C++03 enumerations seem like a wonderfully clear and safe way to specify
arbitrary groupings of values. Unfortunately they suffer from a few issues
which can quite easily bite you. The main problems stem from the fact that
they’re just a teensy tinsy dusting of syntactic sugar over plain old integers,
and can be treated like them in most contexts. The compiler won’t implicitly
convert between different enum types, or from an integer to an enum, but it
will convert from an enum to an integer quite happily. Worse still, the members
of the enumeration aren’t
scoped, they’re exposed directly in the outer scope — the programmer almost
invariably ends up doing this scoping with crude name prefixing, which is ugly
and prone to inconsistency.
Fortunately C++11 has remedied this lack by adding a new syntax for declaring type-safe enumerations:
enum class MyEnum
{
First,
Second,
Third=103,
Fourth=104
};
As can be seen, values can be assigned to enumeration members or the compiler
can assign them. The identifiers here are scoped within MyEnum and are referred
to as, for example, MyEnum::First, so two different enumerations can freely use
the same constants without concern. Also, these values can no longer be compared
with integers directly, only with other members of the same enumeration, unless
an explicit cast is used.
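A short sketch of what the scoping and type safety look like in practice. The enumerations and helper functions below are hypothetical, chosen only to show two enumerations sharing member names.

```cpp
enum class Colour { Red, Green, Blue };
enum class Alert  { Green, Amber, Red };  // reuses Red and Green without clashing

// Members must be qualified with the enumeration name.
bool is_go(Alert a) { return a == Alert::Green; }

// Getting at the integer value requires an explicit cast.
int as_int(Colour c) { return static_cast<int>(c); }

// None of the following would compile:
// int n  = Colour::Red;                  // no implicit conversion to int
// bool a = (Colour::Red == 0);           // no comparison with integers
// bool b = (Colour::Red == Alert::Red);  // no comparison across enumerations
```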
One of the more minor, but still occasionally annoying, problems with
enumerations in C++03 was that the underlying type of an enumeration was
implementation-defined, and could even vary according to the values in the
enumeration, which could lead to portability problems. As of C++11 the
underlying integral type of an enum class is always well-defined. It defaults
to int in declarations such as that shown above, but can be explicitly
specified like so:
enum class MyBigEnum : unsigned long { /* ... */ };
There’s also a transitional syntax to allow legacy enumeration declarations to benefit from just this change:
enum MyOldEnum : unsigned long { /* ... */ };
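The underlying types can be checked at compile time with std::underlying_type from the standard type_traits header. This is only a sketch, and the enumeration names are made up for the purpose.

```cpp
#include <type_traits>

enum class Plain { A, B };                         // underlying type defaults to int
enum class Big : unsigned long { A, B };           // explicit underlying type
enum Legacy : unsigned char { LegacyA, LegacyB };  // transitional syntax

static_assert(std::is_same<std::underlying_type<Plain>::type, int>::value,
              "enum class defaults to int");
static_assert(std::is_same<std::underlying_type<Big>::type, unsigned long>::value,
              "explicit type is honoured");
static_assert(sizeof(Legacy) == sizeof(unsigned char),
              "legacy enum with a fixed type has that type's size");
```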
Finally, new-style enumerations can also be forward-declared, something that wasn’t possible in C++03 [1], as long as the underlying type is known (either implicitly or explicitly):
enum MyEnum1; // Illegal: legacy syntax, no type
enum MyEnum2 : unsigned long; // Legal in C++11: type explicit
enum class MyEnum3; // Legal in C++11: type implicit (int)
enum class MyEnum4 : short; // Legal in C++11: type explicit
enum class MyEnum3 : short; // Illegal: can't change type once declared
Of course, as the final example shows it’s not legal to change the type once it’s declared, even if only implicitly.
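As an illustration of why forward declaration is useful, the sketch below mimics a header/implementation split in a single block. The Status type and exit_code() function are hypothetical.

```cpp
// What a header might contain: the opaque declaration is enough to
// use the type in function signatures.
enum class Status : short;
int exit_code(Status s);

// What the implementation file might contain: the full definition,
// which must repeat the same underlying type.
enum class Status : short { Ok, Failed };

int exit_code(Status s) { return s == Status::Ok ? 0 : 1; }
```

Code that only passes Status values around never needs to see the list of members, which can cut down on recompilation when that list changes.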
Iterating with a for loop is a common occurrence in C++ code, but the syntax
for it is still rooted in its C origins. It’s a flexible construct which has
served us well, but there are times when it’s just unpleasantly verbose. As a
result, C++11 has added a new version of for which works with a limited set
of iterables: C-style arrays, initializer lists, and any type with begin() and
end() methods (e.g. STL containers).
The new syntax is basically a clone of the same construct in Java:
int myArray[] = {0, 1, 2, 3, 4, 5, 6, 7};
for (int& x : myArray) {
// ...
}
Note that within the loop x will be a reference to the real values in the
array, so they may be modified. I could have also declared x as a simple int
instead of int& and, as you might expect, this will create a copy of each value
as x in the loop, so modifications wouldn’t be reflected in the original array.
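The difference between the two declarations can be sketched as follows. The helper functions are hypothetical and use std::vector rather than a raw array, but the loop semantics are the same.

```cpp
#include <vector>

// By reference: each x aliases an element, so the container is changed.
void double_all(std::vector<int>& values) {
    for (int& x : values) {
        x *= 2;
    }
}

// By value: each x is a copy, so incrementing it leaves the container alone.
int sum_plus_one(const std::vector<int>& values) {
    int total = 0;
    for (int x : values) {
        x += 1;      // modifies the copy only
        total += x;
    }
    return total;
}
```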
This is particularly convenient for STL-style containers when combined with type inference:
std::map<std::string, std::string> myMap;
myMap["hello"] = "world";
myMap["foo"] = "bar";
// Old C++03 version
for (std::map<std::string, std::string>::iterator it = myMap.begin();
it != myMap.end(); ++it) {
std::cout << it->first << ": " << it->second << std::endl;
}
// New C++11 version
for (auto it : myMap) {
std::cout << it.first << ": " << it.second << std::endl;
}
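Note how with the new syntax the iterator is implicitly dereferenced: despite its name, it here is a copy of each key/value pair rather than an iterator. Declaring it as const auto& instead would avoid the copy.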
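A variant that binds each element by const reference, avoiding a copy of every pair, might look like the sketch below. The dump() function is a hypothetical helper which writes to a string instead of std::cout so the result is easy to inspect.

```cpp
#include <map>
#include <sstream>
#include <string>

std::string dump(const std::map<std::string, std::string>& m) {
    std::ostringstream out;
    // entry is a const reference to std::pair<const std::string, std::string>
    for (const auto& entry : m) {
        out << entry.first << ": " << entry.second << "\n";
    }
    return out.str();
}
```

Since std::map keeps its keys ordered, the entries come out sorted by key regardless of insertion order.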
Operator overloading allows classes to work intuitively in similar ways to
builtins, and one application of this is value conversion. For example,
defining operator bool() allows a class instance to be evaluated in a boolean
context. Unfortunately C++’s implicit type conversions mean that defining such
operators also brings with it a slew of potentially unwanted other behaviour,
which leads to ugly workarounds such as the safe bool idiom.
As a cleaner solution, C++11 has extended the possible uses of the explicit
keyword to cover such conversion functions. Using this for the bool conversion,
for example, allows the class to operate as a boolean but prevents it being
further implicitly converted to, say, an integral type.
class Testable
{
public:
explicit operator bool() const { /* ... */ }
};
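To show where the conversion is and isn’t allowed, here is a small sketch built around a hypothetical Handle class wrapping a file-descriptor-like integer.

```cpp
class Handle {
public:
    explicit Handle(int fd) : fd_(fd) {}
    // Usable in if/while/!/&& contexts, but not as an implicit conversion.
    explicit operator bool() const { return fd_ >= 0; }
private:
    int fd_;
};

int check(const Handle& h) {
    if (h) {          // fine: contextual conversion to bool
        return 1;
    }
    return 0;
}

// Neither of these would compile with the explicit operator:
// int n  = h + 1;   // no silent promotion to an integer
// bool b = h;       // needs static_cast<bool>(h)
```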
In C++03 there are two types of string literal:
const char normal[] = "normal char literal";
const wchar_t wide[] = L"wide char literal";
The size and encoding of wchar_t are implementation-defined, however, which sometimes limits its usefulness. C++11 improves the situation significantly by adding support for the encodings UTF-8, UTF-16 and UTF-32:
const char utf8[] = u8"UTF-8 encoded string";
const char16_t utf16[] = u"UTF-16 encoded string";
const char32_t utf32[] = U"UTF-32 encoded string";
Within these literals the escape \uXXXX can be used to specify a 16-bit Unicode
code point in hex and \UXXXXXXXX a 32-bit one.
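The effect of the encoding on storage can be checked at compile time. In the sketch below, code_units() is a hypothetical helper that counts the code units in a literal, excluding the terminator.

```cpp
#include <cstddef>

// Number of code units in a string literal, not counting the terminator.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// U+00E9 (e with acute accent) fits in one UTF-16 unit but needs two UTF-8 bytes.
static_assert(code_units(u8"\u00e9") == 2, "two UTF-8 code units");
static_assert(code_units(u"\u00e9") == 1, "one UTF-16 code unit");

// U+1F600 lies outside the 16-bit range, so UTF-16 needs a surrogate pair.
static_assert(code_units(u8"\U0001F600") == 4, "four UTF-8 code units");
static_assert(code_units(u"\U0001F600") == 2, "surrogate pair in UTF-16");
static_assert(code_units(U"\U0001F600") == 1, "one UTF-32 code unit");
```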
My hope is that wide character support can now quietly expire and be replaced
by the standard UTF encodings that everyone should be using. Worst case, I
would hope all platform vendors would be working towards wchar_t becoming
simply an alias for one of the UTF types.
In addition to Unicode strings, C++11 also adds a new syntax for reducing the need for escaping special characters such as quotes within strings:
const char old[] = "Quotes \"within\" strings must be escaped.";
const char raw[] = R"xyz(Little "escaping" \ "quoting" required)xyz"; // "new" is a keyword, so it can't be used as a name
The delimiter (xyz above) can be anything up to 16 characters, and can be
chosen so that it doesn’t occur in the string itself. The delimiter can also be
empty, making the literal R"(...)".
I’ll outline this only briefly as I haven’t had much cause to play with it myself yet, but C++11 has added the ability to define new types of literal.
Ever since pure C, it’s been possible to clarify the type of a potentially
ambiguous literal with a suffix. For example, 1.23 is a double, but add the f
suffix to form 1.23f and the literal is instead of type float. In C++11 the
programmer can define new such suffixes to convert raw literals to specific
types. These conversions take the form of functions which can accept the raw
form of the literal as a string:
#include <cstdlib> // for strtol

long operator"" _squareint(const char *literal)
{
    long value = strtol(literal, NULL, 10); // Check errors, kids
    return value * value;
}
long foo = 12_squareint; // foo has value 144
Alternatively the code can rely on the compiler to convert the literal to a numeric or string type and use that instead:
// Allow literals in any time unit to be stored as seconds.
unsigned long long operator"" _hours(unsigned long long literal)
{
return literal * 3600;
}
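For completeness, here is a sketch of how such suffixes might be used together. The _minutes and _len suffixes are hypothetical additions, the operators are marked constexpr so they can be used in constant expressions, and they are written without a space after the quotes, a form current compilers accept.

```cpp
#include <cstddef>

// Numeric form: the compiler parses the literal and passes its value.
constexpr unsigned long long operator""_hours(unsigned long long n) {
    return n * 3600;
}
constexpr unsigned long long operator""_minutes(unsigned long long n) {
    return n * 60;
}

// String form: the compiler passes a pointer to the characters and their count.
constexpr std::size_t operator""_len(const char*, std::size_t n) {
    return n;
}

// Both units end up stored as seconds.
constexpr unsigned long long timeout = 2_hours + 30_minutes;
static_assert(timeout == 9000, "2h30m is 9000 seconds");
```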
I must admit I suspect I’ll have limited use for this, but I suppose it’s potentially a nice idea for code which makes heavy use of a particular type - complex numbers spring to mind, for example.
C/C++ provide the assert() facility for checking invariants at runtime and the
#error preprocessor directive for compile-time errors. However, templates can
also benefit from compile-time checks and the new static_assert keyword allows
this:
template <class TYPE>
class MyContainer
{
static_assert(sizeof(TYPE) >= sizeof(int), "TYPE is too small");
};
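static_assert combines naturally with the standard type_traits header, and it also works outside templates. The Packed class below is a hypothetical example, not anything from a real library.

```cpp
#include <type_traits>

template <class T>
class Packed {
    // Checked whenever the template is instantiated with a particular T.
    static_assert(std::is_integral<T>::value, "Packed<T> requires an integral T");
public:
    explicit Packed(T v) : value_(v) {}
    T get() const { return value_; }
private:
    T value_;
};

// Also usable at namespace scope for platform assumptions.
static_assert(sizeof(int) >= 2, "int is narrower than the standard allows");

// Packed<double> p(1.0);  // would fail to compile with the message above
```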
Finally, those targeting many architectures may rejoice that C++11 has added
alignof and alignas to query and force the memory address alignment of
variables. If you don’t know what alignment is, you probably don’t need to
know. Seriously, don’t worry about it — go back and read about lambdas again.
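For those who do need it, a minimal sketch of both keywords follows. The Vec4 type and the is_aligned() helper are hypothetical, and the alignment of 16 is just a typical SIMD-friendly value.

```cpp
#include <cstddef>
#include <cstdint>

// Force every Vec4 to start on a 16-byte boundary.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

static_assert(alignof(Vec4) == 16, "alignas sets the type's alignment");
static_assert(sizeof(Vec4) % alignof(Vec4) == 0, "size is a multiple of alignment");

// Runtime check that an address meets a given alignment.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```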
[1] The reason is that the size of the enumeration type couldn’t be known before the full list of members was declared, because implementations were allowed to vary the underlying type based on the values of the members.