Tuning in the static

In C++ the static keyword has quite a few wrinkles that may not be immediately apparent. One of them is related to the order in which objects are constructed and destroyed, and I briefly describe it here.

The static keyword will be familiar to most C and C++ programmers. It has various uses, but for the purposes of this post I’m going to focus on static, local variables within a function.

In C you might find code like this:

int function()
{
  static int value = 9;
  return ++value;
}

On each call this function will return int values starting at 10 and incrementing by one on each call, as value is static and hence persists between calls to the function. The following, however, is not valid in C:

int another_function()
{
  return 123 + 456;
}

int function()
{
  static int value = another_function();
  return ++value;
}

This is because in C, objects with static storage duration must be initialised with constant expressions [1]. This makes life easy for the compiler because it can typically put the initial value directly into the data segment of the binary and then omit any runtime initialisation of that variable when the function is called.

Not so in C++, where things are rather more complicated. Here, static variables within a function or method can be initialised with any expression that their non-static counterparts would accept, and the initialisation happens the first time execution reaches the declaration, which is typically on the first call to that function [2].

This makes sense when you think about it, because in C++ variables can be class types and their constructors can execute arbitrary code anyway. So, calling a function to get the initial value isn’t really much of a leap. However, it’s potentially quite a pain for the compiler and, by extension, the performance-conscious coder as well.

The reason that this might impact performance is that the compiler can no longer perform initialisation by including literals in the binary, since the values aren’t, in general, known until runtime. It now needs to track whether the static variables in a function have been initialised, and it needs to check this every time the function is called. Now I’m not sure which approach compilers take to achieve this, but it’s most likely going to add some overhead [3], even if just a little. In a commonly-used function on a performance-critical data path, this could become significant.

A further complicating factor is that each static variable must be tracked separately, because any given run through the function may not pass the definition if it’s within a conditional block. Also, objects are required [4] to be destroyed in the reverse of the order in which they were constructed. Put these two together and there’s quite a bit of variability. Consider this small program:

#include <iostream>
#include <sstream>
#include <string>

class MyClass
{
public:
  MyClass(std::string id);
  ~MyClass();
private:
  std::string id_;
};

MyClass::MyClass(std::string id) : id_(id)
{
  std::cout << "Created " << id << std::endl;
}

MyClass::~MyClass()
{
  std::cout << "Destroyed " << id_ << std::endl;
}

void function(bool do_first)
{
  if (do_first) {
    static MyClass first("first");
  }
  static MyClass second("second");
}

int main(int argc, char *argv[])
{
  MyClass instance("zero");
  function(argc % 2 == 0);
  if (argc > 2) {
    function(true);
  }

  return 0;
}

With this code, you get a different order of construction and destruction based on the number of arguments you provide. We may skip the construction of first entirely:

$ static-order
Created zero
Created second
Destroyed zero
Destroyed second

We may construct first and second on the first call to function() and hence have them destroyed in the opposite order:

$ static-order one
Created zero
Created first
Created second
Destroyed zero
Destroyed second
Destroyed first

Or we may skip over constructing first on the first call, and then have it performed on the second, in which case we get the opposite order of destruction to the above:

$ static-order one two
Created zero
Created second
Created first
Destroyed zero
Destroyed first
Destroyed second

In all cases you’ll note that zero is both constructed and destroyed first. It’s constructed first because it’s created at the start of main() before any calls to function(). It’s destroyed first because it’s an ordinary local variable, so its destructor runs when main() returns. Static objects, whether local or global, are only destroyed after that, as the program terminates.

Static variables get more complex every time you look at them — I haven’t covered the order (or lack thereof) of initialising static objects in different compilation units, and we haven’t even begun to talk about multithreaded environments yet…

Just be careful with static. Oh, and, uh, the rest of C++ too, I suppose. In fact, have you ever considered Python?


  1. See §6.7.8, paragraph 4 of the C99 standard

  2. Incidentally, this enables a useful way to prevent the static initialisation order fiasco, but that’s another story. 

  3. Well, if you assume you can write to your own code section then there are probably ways of branching to the static initialiser and then overwriting the branch with a no-op or similar, and this would be almost zero overhead. However, I believe on Linux at least the code section is read-only at runtime which puts the kibosh on sneaky tricks like that. 

  4. See §3.6.3, paragraph 1 of the C++ standard (C++03 and C++11 numbering)

14 Jun 2013 at 4:18PM by Andy Pearce in Software  | Photo by Frantzou Fleurine on Unsplash  | Tags: c++ c