☑ C++11: Move Semantics

9 Jul 2013 at 11:04AM in Software

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers move semantics.

This is the 1st of the 8 articles that currently make up the “C++11 Features” series.


As most C++ programmers will know, a new version of the standard was approved a couple of years ago, replacing the previous C++03. This is called C++11, and was formerly known as C++0x. Since I’ve recently happened across a few Stack Overflow questions which mentioned C++11 features I thought I’d have a look over the (to me, at least) more interesting ones and jot down the highlights here for anyone who’s interested.

This post covers move semantics.

The STL tends to be very clever at enabling fairly high-level functionality whilst minimising performance impact. One of its weakest areas, however, is the fact that one often wants to initialise containers from temporary values, or return a container from a function by value, and this involves a potentially expensive copy.

This issue has been addressed in C++11 with the addition of move constructors. These are essentially the same as copy constructors, but they take a non-const reference (of a new kind, described below) and are allowed to modify the source object, provided they leave it in a valid, if unspecified, state that can still be safely destroyed or assigned to. The compiler uses them in cases like initialising from a temporary value, where the source object is about to be destroyed anyway and hence cannot be accessed after the operation.

This allows a class to construct a new object more efficiently by having the destination take ownership of some underlying data directly instead of having to copy it, much as std::vector::swap() exchanges the contents of two vectors without copying any elements.

In C++03 this couldn’t work because references to temporary values could only ever be const:

#include <iostream>
#include <string>

std::string function()
{
    return std::string("hello, world");
}

void anotherFunction()
{
    // const is required on line below or code won't compile.
    const std::string& str = function();
    std::cout << "Value: " << str << std::endl;
}

To enable this behaviour, C++11 introduces a new type of reference known as an rvalue reference. These may only be bound to rvalues (e.g. temporary values) but, unlike ordinary references bound to temporaries, they do not have to be const. They are specified using an extra ampersand as shown below:

#include <iostream>
#include <string>

std::string function()
{
    return std::string("hello, world");
}

void anotherFunction()
{
    // Note extra & denoting rvalue ref, allowed to be non-const.
    std::string&& str = function();
    std::cout << "Value: " << str << std::endl;
}

If a function or method is overloaded with variants taking lvalue and rvalue references, the compiler selects the rvalue variant when passed a temporary, which allows code to take a cheaper path whenever its argument can safely be plundered. As well as the previously-mentioned move constructor, it’s also possible to define move assignment operators in the same way.

The following trivial class which holds a block of memory shows the definition of both a move and copy constructor to illustrate how the move constructor is more efficient:

#include <cstddef>  // size_t
#include <cstring>  // memcpy
#include <vector>

class MemoryBlob
{
public:
  // Standard constructor -- expensive copy from caller's buffer.
  MemoryBlob(const char *blob, size_t blobSize)
      : size(blobSize), buffer(new char[size])
  {
    memcpy(buffer, blob, size);
  }

  // Copy constructor -- expensive copy from other class's buffer.
  MemoryBlob(const MemoryBlob& other)
      : size(other.size), buffer(new char[size])
  {
    memcpy(buffer, other.buffer, size);
  }

  // C++11 move constructor -- cheap theft of other object's pointer.
  MemoryBlob(MemoryBlob&& other) noexcept
      : size(other.size), buffer(other.buffer)
  {
    // Leave the source empty so its destructor won't free our buffer.
    other.size = 0;
    other.buffer = nullptr;
  }

  // Destructor -- remember delete[] of a null pointer is harmless.
  ~MemoryBlob()
  {
    delete[] buffer;  // Must be delete[] to match new char[].
  }

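  // Standard assignment operator.
  MemoryBlob& operator=(const MemoryBlob& other)
  {
    if (this != &other)  // Self-assignment would read a freed buffer.
    {
      // Allocate first so a failed new leaves this object untouched.
      char *newBuffer = new char[other.size];
      memcpy(newBuffer, other.buffer, other.size);
      delete[] buffer;
      buffer = newBuffer;
      size = other.size;
    }
    return *this;
  }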

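  // C++11 move assignment operator.
  MemoryBlob& operator=(MemoryBlob&& other) noexcept
  {
    if (this != &other)  // Guard against x = std::move(x).
    {
      delete[] buffer;
      size = other.size;
      buffer = other.buffer;
      other.size = 0;
      other.buffer = nullptr;
    }
    return *this;
  }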

  // Remainder of class...

private:
  size_t size;
  char *buffer;
};

int main()
{
  std::vector<MemoryBlob> vec;
  vec.push_back(MemoryBlob("hello", 5));  // Will invoke move constructor.
  return 0;
}

In the example above it’s important to note how the buffer pointer of the source object gets reset to null by the move constructor and move assignment operator. Without this, the destructor of the source object would free the buffer now owned by the destination object, leaving it with a dangling pointer and eventually causing a double free.

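As a final note, a named variable is never considered an rvalue, even if it was declared as an rvalue reference, so passing one to an overloaded function will always select the lvalue variant. There are, however, cases where you may need to treat an lvalue as an rvalue, such as when you’ve finished with a local variable and want to hand its contents to a container. In these instances the std::move() function (declared in the utility header) can be used to “cast” an lvalue to an rvalue reference. Note that std::move() doesn’t move anything itself; it merely makes the object eligible to be moved from. Of course, careless use of this could cause all sorts of problems, just as with casting, since the source object shouldn’t be relied upon to hold its old value afterwards.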

For a more detailed discussion of move semantics, see this article.

The next article in the “C++11 Features” series is C++11: Initialization
Mon 15 Jul, 2013