In C++ the `static` keyword has quite a few wrinkles that may not be immediately apparent. One of them is related to constructor order, and I briefly describe it here.
The `static` keyword will be familiar to most C and C++ programmers. It has various uses, but for the purposes of this post I’m going to focus on static local variables within a function.
In C you might find code like this:
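A minimal sketch of such a function, valid as both C and C++. The name `next_value` is made up here; `value` and the starting value of 10 are the ones the text refers to:

```cpp
// Returns 10 on the first call, 11 on the second, and so on.
int next_value(void)
{
    // Initialised once with a constant; the variable persists between calls.
    static int value = 10;
    return value++;
}
```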
Successive calls to this function return `int` values starting at `10` and incrementing by one each time, because `value` is `static` and hence persists between calls to the function. The following, however, is not valid in C:
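For instance, something along these lines, where `initial_value` and `next_value` are hypothetical names chosen for illustration. A C compiler rejects the initialiser because a function call is not a constant expression, whereas a C++ compiler accepts the same source:

```cpp
#include <stdio.h>

/* Hypothetical helper whose result is only known at runtime. */
int initial_value(void)
{
    printf("computing initial value\n");
    return 10;
}

int next_value(void)
{
    /* Error in C: the initialiser is not a constant expression.
       Accepted in C++: it runs once, the first time this line is reached. */
    static int value = initial_value();
    return value++;
}
```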
This is because in C, objects with static storage duration must be initialised with constant expressions[1]. This makes life easy for the compiler because it can typically put the initial value directly into the data segment of the binary and then omit any initialisation of that variable when the function is called.
Not so in C++, where things are rather more complicated. Here, static variables within a function or method can be initialised with any expression that their non-static counterparts would accept, and the initialisation happens the first time execution reaches the declaration, which is usually the first call to that function[2].
This makes sense when you think about it, because in C++ variables can be of class type and their constructors can run arbitrary code anyway. So, calling a function to get the initial value isn’t really much of a leap. However, it’s potentially quite a pain for the compiler and, by extension, the performance-conscious coder as well.
The reason that this might impact performance is that the compiler can no longer perform initialisation by including literals in the binary, since the values aren’t, in general, known until runtime. It now needs to track whether the static variables have been initialised in a function, and it needs to check this every time the function is called. Now I’m not sure which approach compilers take to achieve this, but it’s most likely going to add some overhead[3], even if just a little. In a commonly-used function on a performance-critical data path, this could become significant.
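One way to picture the bookkeeping is a hidden flag that is tested on every call. The following is a conceptual sketch only, not what any particular compiler emits, and `compute_initial`, `counter_with_static` and `counter_with_flag` are made-up names. It also ignores exceptions thrown by the initialiser and concurrent first calls, both of which a real implementation has to handle:

```cpp
#include <iostream>

// Hypothetical stand-in for an initialiser whose value is only known at runtime.
int compute_initial() {
    std::cout << "initialiser ran\n";
    return 10;
}

// What the programmer writes.
int counter_with_static() {
    static int value = compute_initial();
    return value++;
}

// A hand-written approximation: the flag is checked on every single call.
int counter_with_flag() {
    static bool initialised = false;  // constant, so it can live in the binary
    static int value;                 // zero-initialised storage
    if (!initialised) {
        value = compute_initial();
        initialised = true;
    }
    return value++;
}
```

Both functions return 10, 11, 12 and so on, and each runs its initialiser exactly once; the second simply makes the per-call check visible.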
A further complicating factor is that each static variable must be tracked separately, because any given run through the function may not pass its definition if it’s within a conditional block. Also, objects are required[4] to be destroyed in the reverse of the order in which they were constructed. Put these two together and there’s quite a bit of variability. Consider this small program:
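A sketch of a program with this behaviour follows. It is a reconstruction from the output shown in this post, so the class name `Tracer`, the `events` log and the `run()` wrapper are assumptions rather than the original source; only `zero`, `first`, `second` and `function()` are names the text itself uses.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Every construction and destruction is printed and also recorded here.
std::vector<std::string> events;

class Tracer {
public:
    explicit Tracer(const std::string& name) : name_(name) { report("Created "); }
    ~Tracer() { report("Destroyed "); }

private:
    void report(const std::string& what) {
        events.push_back(what + name_);
        std::cout << events.back() << std::endl;
    }
    std::string name_;
};

// `first` is only constructed on a call where make_first is true;
// `second` is constructed the first time the function is called at all.
void function(bool make_first) {
    if (make_first) {
        static Tracer first("first");
    }
    static Tracer second("second");
}

// Stands in for the body of main(), taking argc as a parameter.
int run(int argc) {
    Tracer zero("zero");   // automatic variable: destroyed when run() returns
    function(argc == 2);   // one argument: first is built on the first call
    function(argc == 3);   // two arguments: first is built on the second call
    return 0;
}
```

Adding `int main(int argc, char*[]) { return run(argc); }` and naming the binary `static-order` should reproduce the three transcripts that follow.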
With this code, you get a different order of construction and destruction based on the number of arguments you provide. We may skip the construction of `first` entirely:
$ static-order
Created zero
Created second
Destroyed zero
Destroyed second
We may construct `first` and then `second` on the first call to `function()`, and hence have them destroyed in the opposite order:
$ static-order one
Created zero
Created first
Created second
Destroyed zero
Destroyed second
Destroyed first
Or we may skip over constructing `first` on the first call, and then have it performed on the second, in which case we get the opposite order of destruction to the above:
$ static-order one two
Created zero
Created second
Created first
Destroyed zero
Destroyed first
Destroyed second
In all cases you’ll note that `zero` is both constructed and destroyed first. It’s constructed first because it’s created at the start of `main()` before any calls to `function()`. It’s destroyed first because it is an automatic variable, destroyed when `main()` returns, which happens just before the termination of the program, the point at which static objects, whether local or global, are destroyed.
Static variables get more complex every time you look at them — I haven’t covered the order (or lack thereof) of initialising static objects in different compilation units, and we haven’t even begun to talk about multithreaded environments yet…
Just be careful with `static`. Oh, and, uh, the rest of C++ too, I suppose.
In fact, have you ever considered Python?
[1] See §6.7.8.4 of the C standard.
[2] Incidentally, this enables a useful way to prevent the static initialisation order fiasco, but that’s another story.
[3] Well, if you assume you can write to your own code section then there are probably ways of branching to the static initialiser and then overwriting the branch with a no-op or similar, and this would be almost zero overhead. However, I believe on Linux at least the code section is read-only at runtime, which puts the kibosh on sneaky tricks like that.
[4] See §3.6.3.1 of the C++ standard.