C++11: Library Changes: Pointers, Randoms, Wrappers and more

24 Apr 2015 at 11:33AM in Software

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this is the second of the final two that cover what I feel to be the most important changes to the standard template library.

This is the 8th of the 8 articles that currently make up the “C++11 Features” series.


Continuing on from the last post, here are the remaining library changes for C++11 which I think are particularly noteworthy.

Smart pointers

The venerable old std::auto_ptr has been in the STL for some time and works tolerably well for simple cases of an RAII pointer. It has, however, some significant drawbacks:

  • It can’t be stored in containers.
  • Changes of ownership aren’t always obvious and may be unexpected.
  • It will always call delete on its owned pointer, never delete[].

As a result, std::auto_ptr is deprecated in C++11 and has been replaced by the improved std::unique_ptr. This has similar semantics in that ownership can only exist with one instance at a time: if ownership is transferred, the pointer is set to nullptr in the old instance, as with auto_ptr.

The advantages of this class are essentially the removal of the three limitations mentioned above:

  • Can be stored in a container.
  • Move semantics are used to transfer ownership, so code will fail to compile without use of an explicit std::move, reducing the chances that the transfer could happen at times the programmer didn’t intend.
  • Support of arrays so either delete or delete[] will be called as appropriate.
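
As a rough sketch of those three points, something like the following should work. The Widget type and the function names are my own inventions purely for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical resource type, used only for illustration.
struct Widget
{
    explicit Widget(int i) : id(i) {}
    int id;
};

// Builds a vector of uniquely-owned widgets; the vector owns each one.
std::vector<std::unique_ptr<Widget>> makeWidgets(int count)
{
    std::vector<std::unique_ptr<Widget>> widgets;
    for (int i = 0; i < count; ++i) {
        std::unique_ptr<Widget> w(new Widget(i));
        // widgets.push_back(w);          // Would not compile: no copying.
        widgets.push_back(std::move(w));  // Explicit transfer of ownership.
    }
    return widgets;
}

// After a move, the old instance is empty and the new one owns the pointer.
bool movedFromIsEmpty()
{
    std::unique_ptr<Widget> first(new Widget(1));
    std::unique_ptr<Widget> second = std::move(first);
    return first == nullptr && second != nullptr;
}

// The array form calls delete[] rather than delete when it goes out of scope.
int sumOfSquares(std::size_t count)
{
    std::unique_ptr<int[]> squares(new int[count]);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        squares[i] = static_cast<int>(i * i);
        total += squares[i];
    }
    return total;
}
```

Uncommenting the plain push_back is a quick way to see the compile error that protects against accidental transfers.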

This improved std::unique_ptr is a fairly thin wrapper around a native pointer and incurs pretty minimal overhead on dereference. On the other hand, its single-owner semantics make it unsuitable for cases where several parts of a program need to own the same resource, as often happens in multithreaded code. That’s where the other new smart pointer, std::shared_ptr, comes in.

As the name implies, std::shared_ptr allows multiple references to the same underlying pointer. The pointer itself is wrapped in some reference counting structure and the destructor of each std::shared_ptr reduces the reference count by 1, the resource being deleted when the count reaches zero. Assignment of a different pointer to a std::shared_ptr reduces the reference count on the previous pointer as well, of course. In short, it’s the obvious reference-counting semantics you’d expect.
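
To make the counting concrete, here’s a small sketch that records use_count() at a few points. The Counts struct and observeCounts() are names I’ve made up for the purpose.

```cpp
#include <memory>

// Hypothetical record of the reference count at four points in time.
struct Counts
{
    long afterCreate;
    long afterCopy;
    long afterScopeExit;
    long afterReassign;
};

Counts observeCounts()
{
    Counts counts;
    std::shared_ptr<int> first = std::make_shared<int>(42);
    counts.afterCreate = first.use_count();     // One owner.
    {
        std::shared_ptr<int> second = first;    // Copying adds an owner.
        counts.afterCopy = first.use_count();
    }                                           // second is destroyed here.
    counts.afterScopeExit = first.use_count();
    std::shared_ptr<int> third = first;         // Two owners again.
    third = std::make_shared<int>(7);           // third gives up its share.
    counts.afterReassign = first.use_count();
    return counts;
}
```

The four recorded values should come out as 1, 2, 1 and 1 respectively.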

Since one of the common use-cases of std::shared_ptr is in multithreaded applications, it’s probably no surprise that the underlying reference counting structure is thread-safe — multiple threads can all safely own their own std::shared_ptr instances which point to the same underlying resource. It’s worth remembering that the non-const methods of the std::shared_ptr themselves are not thread-safe, however, so sharing instances of std::shared_ptr itself between threads could lead to races. The semantics of the class make it easy to simply assign to a new instance when the pointer passes to a new thread, however, so it’s not hard to use correctly.

The final new smart pointer in C++11 is the std::weak_ptr. This can refer to a resource owned by std::shared_ptr instances (not std::unique_ptr), but the reference so created is not sufficient to prevent the underlying resource from being deleted. This could be useful for secondary structures which, for example, keep track of all allocated instances of a particular class, but don’t want those references to keep the instances in memory once the primary owner has deleted them.

Because the resource behind a std::weak_ptr can be deleted at any point due to operations in other threads, some care has to be taken when dereferencing it. It’s possible to construct a std::shared_ptr from one, which will guarantee that either the resource is still extant and will remain so for the lifetime of the std::shared_ptr, or it’s no longer extant and the std::shared_ptr constructor will throw std::bad_weak_ptr. std::weak_ptr itself also has a lock() method which returns a std::shared_ptr for the underlying pointer. The behaviour in this case is quite similar, except that if the resource has been freed then an empty pointer object is returned instead of an exception being thrown.
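
The two approaches can be sketched side by side as below. The helper names, and the choice of -1 to signal an expired resource, are just assumptions for the example.

```cpp
#include <memory>

// Returns the value if the resource is still alive, or -1 if it has expired.
int readViaLock(const std::weak_ptr<int>& weak)
{
    if (std::shared_ptr<int> strong = weak.lock()) {
        return *strong;     // strong keeps the resource alive in this scope.
    }
    return -1;
}

// The same check, but using the throwing std::shared_ptr constructor.
int readViaConstructor(const std::weak_ptr<int>& weak)
{
    try {
        std::shared_ptr<int> strong(weak);
        return *strong;
    } catch (const std::bad_weak_ptr&) {
        return -1;
    }
}

// Hypothetical helper: makes a weak_ptr whose resource has already gone.
std::weak_ptr<int> makeExpired()
{
    std::shared_ptr<int> owner = std::make_shared<int>(5);
    std::weak_ptr<int> weak = owner;
    return weak;            // owner is destroyed on return, so weak expires.
}
```

I’d guess that lock() is the more natural choice where expiry is an expected event rather than an error.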

All quite straightforward — exactly how language features should be.

Random Numbers

The core C and C++ libraries have never had hugely comprehensive facilities for random number generation. Well, let’s face it, they’ve had rand() and srand() and that’s about your lot. For many programmers I’m sure that’s plenty, and part of me thinks that more specialised use-cases in statistics and the like perhaps belong in their own third party libraries.

That said, if you’re writing code in a commercial context where things like licences and packaging become troublesome issues to deal with, there’s definite value in not having to resort to third party code. Comparing writing code in, say, Perl and Python, I vastly prefer Python. Now partly that’s just because I like the language better, of course, but it’s also the standard library, a part of a language that’s too often discounted from comparisons in my opinion[1].

Perl always seems to involve grabbing a whole raft of stuff from CPAN. Then you have the fun of figuring out where to install it so that your code can see it, whether anyone else in your organisation is already using it so that you don’t have to install multiple copies, how you handle sudden urgent security updates to the libraries while you’re in the midst of a release cycle, how you fix urgent bugs in your own copy whilst not losing the ability to merge in upstream fixes, whether your company allows you to contribute your fixes back to the upstream at all, and so on and so forth. All these issues have resolutions, of course, but the point is that it’s a pain when one of the main reasons you used a library was probably to reduce effort.

Taking all that[2] into account, and getting back to the matter in hand, I’m therefore quite glad that the C++ standards committee has decided to include some much improved random number facilities in C++11 even if I don’t necessarily think I’ll have much cause to use them in the foreseeable future.

The new facilities split random number generation into two parts:

  • The engine is responsible for actually generating pseudorandom numbers. It contains the basic algorithm and current state.
  • The distribution takes values from an engine and transforms them such that the resultant set follows some specified probability distribution.

Looking at the engine first, this is a functor which, when called, yields the next item in whatever pseudorandom sequence it’s generating. It also has a seed() method to set the initial seed state, discard() to skip ahead a specified amount in the sequence and static min() and max() methods to query the potential range of generated values.
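
As a quick sketch of that interface, here’s how it might look using std::mt19937, the standard’s predefined Mersenne Twister typedef. The function names are mine.

```cpp
#include <random>
#include <vector>

// Draws `count` raw values from a Mersenne Twister seeded with `seed`.
std::vector<std::mt19937::result_type> draw(unsigned int seed, int count)
{
    std::mt19937 engine;
    engine.seed(seed);
    std::vector<std::mt19937::result_type> values;
    for (int i = 0; i < count; ++i) {
        values.push_back(engine());  // Calling the functor yields the next value.
    }
    return values;
}

// Returns true if discard(n) is equivalent to drawing and ignoring n values.
bool discardSkipsAhead(unsigned int seed, unsigned long long n)
{
    std::mt19937 skipped(seed);
    skipped.discard(n);
    std::mt19937 stepped(seed);
    for (unsigned long long i = 0; i < n; ++i) {
        stepped();
    }
    return skipped() == stepped();
}
```

Since the sequence is entirely determined by the seed, two calls to draw() with the same seed give identical results, which is handy for reproducible tests.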

The library provides three basic options:

  • std::linear_congruential_engine: Fast and with small storage requirements.
  • std::mersenne_twister_engine: Not quite so fast, and larger storage requirements, but with a longer non-repeating sequence and (in some ways) more randomly distributed results.
  • std::subtract_with_carry_engine: Fast even on architectures with only simple arithmetic instruction sets, but with larger storage requirements.

Typically I suspect the architecture will determine the choice used. In the absence of strict performance and/or memory requirements, I’d suggest the old favourite Mersenne Twister.

There are also engine adapters such as the std::discard_block_engine which discards some of the output of an underlying base engine. Presumably the intention is that these adapters can be used to improve some randomness criteria of imperfect underlying engines, but I can’t help but think that in most cases where one is so concerned about the quality of the random numbers one should probably be using a true entropy source. Still, I’m hardly an expert and I’m sure there are good reasons why these were added.

Once the underlying engine has generated a raw stream of random numbers, the distribution transforms these in such a way that the resultant stream matches some probability distribution. There are 20 options for this in the standard library that I’m aware of, and presumably the potential for many more to be defined later. A handful for flavour:

  • std::uniform_int_distribution: Generates integers distributed evenly across a specified range.
  • std::uniform_real_distribution: Generates real numbers distributed evenly across a specified range.
  • std::normal_distribution: Generates values that are normally distributed around specified mean and standard deviation values.
  • std::binomial_distribution: Generates values that are binomially distributed given the number of trials and probability of success.
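
Putting an engine and a distribution together might look something like this sketch. The dice and sample-mean functions are illustrative names of my own, not anything from the library.

```cpp
#include <random>

// Simulates rolling a six-sided die `rolls` times and returns the sum.
int rollDice(std::mt19937& engine, int rolls)
{
    std::uniform_int_distribution<int> die(1, 6);  // Inclusive at both ends.
    int total = 0;
    for (int i = 0; i < rolls; ++i) {
        total += die(engine);  // The distribution pulls values from the engine.
    }
    return total;
}

// Averages `count` samples drawn from a normal distribution.
double sampleMean(std::mt19937& engine, double mean, double stddev, int count)
{
    std::normal_distribution<double> dist(mean, stddev);
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += dist(engine);
    }
    return sum / count;
}
```

The engine is passed by reference so that successive calls continue the same sequence rather than restarting it.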

Overall it seems quite comprehensive, even if I’m still not quite sure for how many people this will remove the need for more specialist third party libraries.

Wrapper References

STL containers are both flexible and convenient, but to my mind they’ve always had two weaknesses: they involve an annoying amount of copying; and you can’t store references, only pointers. C++11 provides move semantics as a solution to the first problem — and wrapper references as a solution to the second.

A wrapper reference is really just a placeholder that can be used very much like a reference, although some minor hoops must be jumped through when performing assignment. The following code snippet shows how they can be used to assign to underlying values:

#include <functional>
#include <iostream>
#include <vector>

int main()
{
    int array[5] = {1, 2, 3, 4, 5};
    std::vector<std::reference_wrapper<int>> vec;
    vec.push_back(array[0]);
    vec.push_back(array[2]);
    for (int& it : vec) {
        it = 99;
    }
    for (unsigned int i = 0; i < 5; ++i) {
        std::cout << array[i] << std::endl;
    }
    return 0;
}

The result is that array[0] and array[2] are set to 99, whereas all the other values in array remain the same.

As well as the template std::reference_wrapper there’s also the std::ref and closely related std::cref helper functions which allow a slightly more convenient syntax for creating wrapper references by using type inference.
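
Here’s a small sketch of those helpers in use, along with the get() method that’s needed when assigning through a wrapper. The function names are hypothetical.

```cpp
#include <functional>
#include <vector>

// Doubles both ints by way of references stored in a vector.
void doubleBoth(int& a, int& b)
{
    // std::ref deduces std::reference_wrapper<int> from its argument.
    std::vector<std::reference_wrapper<int>> refs = {std::ref(a), std::ref(b)};
    for (int& value : refs) {
        value *= 2;
    }
}

// std::cref produces a std::reference_wrapper<const int>, which is read-only.
int sumOf(const int& a, const int& b)
{
    std::vector<std::reference_wrapper<const int>> refs = {std::cref(a), std::cref(b)};
    int total = 0;
    for (const int& value : refs) {
        total += value;
    }
    return total;
}

// Assigning to the referenced value needs get(); assigning to the wrapper
// itself would rebind it to a different object instead.
void setThrough(std::reference_wrapper<int> ref, int value)
{
    ref.get() = value;
}
```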

As well as storing references in containers there are a number of other potential uses for wrapper references, one of the main ones being to instantiate template functions such that they take references to their arguments:

#include <functional>
#include <iostream>

template <class T>
void function(T a, T b)
{
    a += b;
}

int main()
{
    int x = 3;
    int y = 4;
    // Will instantiate function<int>.
    function(x, y);
    std::cout << "x=" << x << ", y=" << y << std::endl;
    // Will instantiate function<std::reference_wrapper<int>>.
    function(std::ref(x), std::ref(y));
    std::cout << "x=" << x << ", y=" << y << std::endl;
    return 0;
}

The first call operates on copies, so x is unchanged and the program prints x=3, y=4. The second call operates on references to the originals, so it prints x=7, y=4.

Polymorphic Function Wrappers

C++ has lots of mechanisms to make things generic, such as polymorphism, overloading and templating. One thing that’s always been difficult to do, however, is have a generic notion of a callable — i.e. something which can be invoked in a function-like manner, but which may not necessarily be a function.

There are various ways to refer to callables — function pointers, member function pointers and functors are all examples. But C++’s type-safety makes it tricky to declare something that can wrap all these up.

No longer, however — now we have std::function for just this purpose. This can wrap up any callable which matches the return and argument types of the wrapper and otherwise acts like a regular functor.

One other benefit is that it has the property of a pointer in that it can be uninitialised. In this case calling it will throw the std::bad_function_call exception. Here’s an example that demonstrates this, as well as using the wrapper with both a function and functor:

#include <functional>
#include <iostream>
#include <string>

std::string multiplyStringFunc(std::string s, unsigned int num)
{
    std::string ret;
    for (/* blank */; num > 0; --num) {
        ret.append(s);
    }
    return ret;
}

class MultiplyStringFunctor
{
  public:
    std::string operator()(std::string s, unsigned int num)
    {
        return multiplyStringFunc(s, num);
    }
};

int main()
{
    std::function<std::string(std::string, unsigned int)> wrapper;
    try {
        std::cout << wrapper("bad", 1) << std::endl;
    } catch (const std::bad_function_call&) {
        std::cout << "Called invalid function" << std::endl;
    }
    wrapper = multiplyStringFunc;
    std::cout << wrapper("hello ", 2) << std::endl;
    wrapper = MultiplyStringFunctor();
    std::cout << wrapper("world ", 3) << std::endl;
    return 0;
}
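
The same wrapper type should also cope with lambdas and member function pointers, and an empty wrapper can be tested before it’s called rather than catching the exception. A sketch, with a made-up Greeter class:

```cpp
#include <functional>
#include <string>

// Hypothetical class used only for illustration.
class Greeter
{
  public:
    explicit Greeter(const std::string& greeting) : greeting_(greeting) {}
    std::string greet(const std::string& name) const
    {
        return greeting_ + ", " + name;
    }
  private:
    std::string greeting_;
};

// Wraps a lambda that captures some local state.
std::function<std::string(std::string)> makeShouter()
{
    std::string suffix = "!";
    return [suffix](std::string s) { return s + suffix; };
}

// Wraps a member function pointer; the object becomes the first argument.
std::function<std::string(const Greeter&, const std::string&)> makeGreetCall()
{
    return &Greeter::greet;
}

// An empty wrapper converts to false, so it can be checked before calling.
bool isCallable(const std::function<std::string(std::string)>& f)
{
    return static_cast<bool>(f);
}
```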

While this is all undeniably convenient, you should bear in mind that there is, unsurprisingly, an overhead to using these wrappers. As a run-time mechanism there’s no way they’ll ever be as efficient as, say, a raw templated function, where the heavy lifting is done at compile time. The Boost documentation has some discussion of performance, but I think an issue that’s at least as critical as the raw overhead itself is the opportunity cost of the compiler being unable to inline or otherwise optimise your code at compile time.

Nonetheless, such performance concerns are often not dominant in a design compared to, say, extensibility and maintainability, so it’s definitely good to know there’s now a proper callable interface class available.

Type Traits

I’m only going to skim this one because it leans heavily in the direction of template metaprogramming, in whose waters I’ve dipped just the most tentative toe. However, I think the basic mechanism is simple enough to explain, so I’ll try to illustrate with some simplistic examples and leave it as an exercise to the reader to extrapolate into, say, an entire BitTorrent tracker that runs wholly at compile time[3].

C++ has had templates for quite a long time — these are a great way to write generic code across multiple types. You make some basic assumptions about your type (e.g. it has a method foo() or supports operator +) and the compiler should let you know if your assumptions don’t hold at compile time — possibly in a fairly cryptic manner, but it should let you know.

Since it can be hard to write pleasant/optimal/correct code that works for a range of disparate types, there are also template specialisations. These are alternative implementations of the template for specific types which can override the generic implementation. So far so C++03.

This system is quite restrictive, however, in that it’s quite hard to write code that modifies its behaviour in more subtle ways based on the types it’s given. It’s also hard to write a system that will restrict the types on which you can instantiate your template. If you write code to convert between big- and little-endian, it would be nice if someone using it on a non-integral type would get a pleasant compile-time error as opposed to randomly scrambling bytes around at runtime.

This is where type traits come in handy. Essentially they’re just expressions that return information about a type, evaluated at compile-time. One of the simplest uses is in concert with C++11’s new static_assert feature, to constrain the types of templates.

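#include <type_traits>

template <typename T>
T swapEndian(T value)
{
    // C++11 requires a message argument for static_assert.
    static_assert(std::is_integral<T>::value,
                  "swapEndian requires an integral type");
    // Implementation left as an exercise
}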

As well as std::is_integral there’s also std::is_floating_point, std::is_array, std::is_pointer and more. Things like std::is_rvalue_reference could come in handy when writing complex code that attempts to support move semantics efficiently, and perhaps std::is_trivially_copyable could be used to swap in some optimisations in common cases. I also think that std::is_base_of sounds very promising for enforcing interfaces.
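
Here’s a sketch of how a couple of those might be used. The Shape interface, Square class and both function names are hypothetical, and a real optimisation would probably fall back to an element-wise copy rather than refusing to compile.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical interface and implementation, for illustration only.
struct Shape
{
    virtual ~Shape() {}
    virtual double area() const = 0;
};

struct Square : Shape
{
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};

// Only compiles when T derives from Shape.
template <typename T>
double doubledArea(const T& shape)
{
    static_assert(std::is_base_of<Shape, T>::value,
                  "T must derive from Shape");
    return 2.0 * shape.area();
}

// Copies with memcpy, which is only valid for trivially copyable types.
template <typename T>
void rawCopy(T& destination, const T& source)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::memcpy(&destination, &source, sizeof(T));
}
```

Calling doubledArea(42) or rawCopy() on a pair of Square objects should fail at compile time with the given message, which is rather friendlier than the usual template error.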

Not all of the expressions are boolean in nature, although it has to be said that most of them are. The few exceptions include std::rank, which evaluates to the number of dimensions in an array, or 0 for non-array types; and std::extent, which evaluates to the size of an array in a specified dimension, or 0 for unknown bounds or non-array types. I could imagine these could make it very convenient to implement efficient templated operations on fixed-size arrays without the hassle of tricks like null-terminating them.
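
A minimal sketch of that idea, assuming a hypothetical sumArray() helper which reads the array length from the type rather than taking it as a parameter:

```cpp
#include <cstddef>
#include <type_traits>

// A few compile-time checks showing what the two traits evaluate to.
static_assert(std::rank<int>::value == 0, "non-array types have rank 0");
static_assert(std::rank<int[2][3]>::value == 2, "two dimensions");
static_assert(std::extent<int[2][3], 1>::value == 3, "size of dimension 1");
static_assert(std::extent<int[]>::value == 0, "unknown bound gives 0");

// Sums a fixed-size array of ints; no separate length argument is needed.
template <typename Array>
int sumArray(const Array& values)
{
    static_assert(std::rank<Array>::value == 1,
                  "expected a one-dimensional array");
    int total = 0;
    for (std::size_t i = 0; i < std::extent<Array>::value; ++i) {
        total += values[i];
    }
    return total;
}
```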

As I said, I’m just touching on the surface here, but I think it’s worth bearing these new facilities in mind even if you just plan to use fairly simplistic forms of templating.

Return Type Inference

Prolific use of templates and overloading can make it quite difficult to implement generic code. For example, if you want to implement map() where some functor is applied to every item in a container then you have to decide how you’re going to template it. You could just assume that the return type matches the parameter types, but this is a constraint it would be nice to do without.

C++11 has therefore introduced std::result_of which, when given a callable type along with the types of its arguments, will evaluate to the return type of that call. This is perhaps demonstrated best with an (extremely contrived!) example:

#include <type_traits>
#include <iostream>

struct PercentageConverter
{
    double operator()(int value);
    int operator()(double value);
};

double PercentageConverter::operator()(int value)
{
    return static_cast<double>(value) / 100.0;
}

int PercentageConverter::operator()(double value)
{
    return static_cast<int>(value * 100.0);
}

int main()
{
    PercentageConverter instance;
    std::result_of<PercentageConverter(int)>::type res1 =
            instance(50);
    std::cout << res1 << std::endl;
    std::result_of<PercentageConverter(double)>::type res2 =
            instance(0.15);
    std::cout << res2 << std::endl;
    return 0;
}
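
Returning to the map() case mentioned earlier, a sketch might look like the following. The mapVector() and ToFraction names are my own, and note that std::result_of was later deprecated in C++17 and removed in C++20 in favour of std::invoke_result, so this assumes a compiler in C++11 to C++17 mode.

```cpp
#include <type_traits>
#include <vector>

// Applies func to each item and collects the results. The element type of
// the returned vector is whatever func returns when called with a const T&.
template <typename T, typename Func>
std::vector<typename std::result_of<Func&(const T&)>::type>
mapVector(const std::vector<T>& input, Func func)
{
    std::vector<typename std::result_of<Func&(const T&)>::type> output;
    output.reserve(input.size());
    for (const T& item : input) {
        output.push_back(func(item));
    }
    return output;
}

// Hypothetical functor: converts whole percentages into fractions.
struct ToFraction
{
    double operator()(int percent) const { return percent / 100.0; }
};
```

Passing a std::vector<int> and a ToFraction gives back a std::vector<double>, without the caller having to spell out the result type.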

One of those things that’s probably incredibly useful in the few cases that you need it, but unless you’re using templates in a fairly heavy-duty manner, there’s a good chance that you can live in blissful ignorance of its existence.

Conclusion

So that’s (finally!) the end of my posts on C++11. Wow, it’s taken rather longer than I first anticipated, but hopefully it’s been useful. I know it has for me, at least.

As a final note, if you have any more C++ questions in general, I can strongly recommend both the C++ Super-FAQ and cppreference.com as great resources. Personally I’d strongly suggest consulting them before resorting to the usual Stack Overflow search as their hit rate is great and the quality of the information is fantastic.


  1. For example, my mild dislike of Java has very little to do with the language, which I feel is quite adequate, and a lot to do with the standard library. I don’t like living in the kingdom of the nouns.

  2. And let’s face it, there was rather too much of it. 

  3. Don’t worry, this is a joke. At least I hope so. 

This is the most recent article in the “C++11 Features” series, which started with C++11: Move Semantics
Tue 9 Jul, 2013