☑ Python destructor drawbacks

Python’s behaviour with regard to destructors can be a little surprising in some cases.

green python

As you learn Python, sooner or later you’ll come across the special method __del__() on classes. Many people, especially those coming from a C++ background, consider this to be the “destructor” just as they consider __init__() to be the “constructor”. Unfortunately, they’re often not quite correct on either count, and Python’s behaviour in this area can be a little quirky.

Take the following console session:

>>> class MyClass(object):
...   def __init__(self, init_dict):
...     self.my_dict = init_dict.copy()
...   def __del__(self):
...     print "Destroying MyClass instance"
...     print "Value of my_dict: %r" % (self.my_dict,)
... 
>>> instance = MyClass({1:2, 3:4})
>>> del instance
Destroying MyClass instance
Value of my_dict: {1: 2, 3: 4}

Hopefully this is all pretty straightforward. The instance is constructed and __init__() takes an initial dict and stores a copy of it as the my_dict attribute of the MyClass instance. Once the final reference to the MyClass instance is removed (with del in this case) it is garbage collected and the __del__() method is called, displaying the appropriate message.

However, what happens if __init__() is interrupted? In C++ if the constructor terminates by throwing an exception then the class isn’t counted as fully constructed and hence there’s no reason to invoke the destructor1. How about in Python? Consider this:

>>> try:
...   instance = MyClass([1,2,3,4])
... except Exception as e:
...   print "Caught exception: %s" % (e,)
... 
Caught exception: 'list' object has no attribute 'copy'
Destroying MyClass instance
Exception AttributeError: "'MyClass' object has no attribute 'my_dict'" in <bound method MyClass.__del__ of <__main__.MyClass object at 0x7fd309fbc450>> ignored

Here we can see that a list instead of a dict has been passed, which causes an AttributeError exception in __init__() because list lacks the copy() method which __init__() calls. We catch the exception, but then we can see that __del__() has still been called.

Indeed, we get a further exception there because the my_dict attribute hasn’t had a chance to be set by __init__() due to the earlier exception. Because __del__() methods are called in quite an odd context, exceptions thrown in them result in a simple error message on stderr instead of being propagated. That explains the odd message about an exception being ignored which appeared above.

This is quite a gotcha of Python’s __del__() methods — in general, you can never rely on any particular piece of initialisation of the object having been performed, which does reduce their usefulness for some purposes. Of course, it’s possible to be fairly safe with judicious use of hasattr() and getattr(), or catching the relevant exceptions, but this sort of fiddliness is going to lead to tricky bugs sooner or later.
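
For example, __del__() can be written defensively with getattr() so that it copes with a partially-initialised instance (just a sketch of the style, with a made-up class name, not part of the original example):

class SaferClass(object):
    def __init__(self, init_dict):
        self.my_dict = init_dict.copy()
    def __del__(self):
        # my_dict may never have been set if __init__() raised part-way through.
        my_dict = getattr(self, "my_dict", None)
        if my_dict is not None:
            print "Destroying SaferClass instance, my_dict: %r" % (my_dict,)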

This all seems a little puzzling until you realise that __del__() isn’t actually the opposite of __init__() — in fact, it’s the opposite of __new__(). Indeed, if __new__() of the base class (which is typically responsible for actually doing the allocation) fails then __del__() won’t be called, just as in C++. Of course, this doesn’t mean the appropriate thing to do is shift all your initialisation into __new__() — it just means you have to be aware of the implications of what you’re doing.
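
As a quick illustration (a made-up class, not one of the examples above), if __new__() itself raises then no instance is ever allocated and __del__() is never invoked:

class Unbuildable(object):
    def __new__(cls, *args):
        # Fail before the base class ever allocates an instance.
        raise RuntimeError("refusing to allocate")
    def __del__(self):
        print "This never gets printed"

try:
    obj = Unbuildable()
except RuntimeError as e:
    print "Caught exception: %s" % (e,)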

There are other gotchas of using __del__() for things like resource locking as well, primarily that it’s a little too easy for stray references to sneak out and keep an object alive longer than you expected. Consider the previous example, modified so that the exception isn’t caught:

>>> instance = MyClass([1,2,3,4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
AttributeError: 'list' object has no attribute 'copy'
>>>

Hmm, how odd — the instance can’t have been created because of the exception, and yet there’s no message from the destructor. Let’s double-check that instance wasn’t somehow created in some weird way:

>>> print instance
Destroying MyClass instance
Exception AttributeError: "'MyClass' object has no attribute 'my_dict'" in <bound method MyClass.__del__ of <__main__.MyClass object at 0x7fd309fbc2d0>> ignored
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'instance' is not defined

Isn’t that interesting! See if you can have a guess at what’s happened…

… Give up? So, it’s true that instance was never defined. That’s why when we try to print it subsequently, we get the NameError exception we can see at the end of the second example. So the only real question is why was __del__() invoked later than we expected? There must be a reference kicking around somewhere which prevented it from being garbage collected, and using gc.get_referrers() we can find out where it is:

>>> instance = MyClass([1,2,3,4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
AttributeError: 'list' object has no attribute 'copy'
>>> import sys
>>> import gc
>>> import types
>>> 
>>> for obj in gc.get_objects():
...   if isinstance(obj, MyClass):
...     for i in gc.get_referrers(obj):
...       if isinstance(i, types.FrameType):
...         print repr(i)
... 
<frame object at 0x1af19c0>
>>> sys.last_traceback.tb_next.tb_frame
<frame object at 0x1af19c0>

Because we don’t have a reference to the instance any more, we have to trawl through the gc.get_objects() output to find it, and then use gc.get_referrers() to find who has the reference. Since I happen to know the answer already, I’ve filtered it to only show the frame object — without this filtering it also includes the list returned by gc.get_objects() and calling repr() on that yields quite a long string!

We then compare this to the parent frame of sys.last_traceback and we get a match. So, the reference that still exists is from a stack frame attached to sys.last_traceback, which holds the traceback of the most recent exception thrown. When we attempted print instance earlier, that raised a fresh exception which replaced the previous traceback (only the most recent one is kept), removing the final reference to the MyClass instance and hence finally triggering its __del__() method.

Phew! I’ll never complain about C++ destructors again. As an aside, many of the uses for the __del__() method can be replaced by careful use of the context manager protocol, although this does typically require your resource management to extend over only a single function call at some level in the call stack as opposed to the lifetime of a class instance. In many cases I would argue this is actually a good thing anyway, because you should always try to minimise the time when a resource is acquired, but like anything it’s not always applicable.
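
As a rough sketch of that alternative (the class and filename here are invented for illustration), the context manager protocol makes acquisition and release explicit and avoids relying on __del__() at all:

class ManagedFile(object):
    def __init__(self, path):
        self.path = path
    def __enter__(self):
        self.handle = open(self.path, "w")   # acquire on entry to the with block
        return self.handle
    def __exit__(self, exc_type, exc_value, traceback):
        self.handle.close()                  # released promptly, even on exceptions
        return False                         # don't suppress any exception

with ManagedFile("/tmp/example.txt") as handle:
    handle.write("hello, world\n")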

Still, if you must use __del__(), bear these quirks in mind and hopefully that’s one less debugging nightmare you’ll need to go through in future.


  1. The exception (haha) to this is when a derived class’s constructor throws an exception, then the destructor of any base classes will still be called. This makes sense because by the time the derived class constructor was called, the base class constructors have already executed fully and may need cleaning up just as if an instance of the base class was created directly. 

23 Apr 2013 at 10:48AM by Andy Pearce in Software  | Photo by Alfonso Castro on Unsplash  | Tags: python destructors  |  See comments

☑ When is a closure not a closure?

Python’s simple scoping rules occasionally hide some surprising behaviour.

closed sign

Scoping in Python is pretty simple, especially in Python 2.x. Essentially you have three scopes:

  • Local scope
  • Enclosing scope
  • Global scope

Local scope is anything defined in the same function as you. Enclosing scopes are those of the functions in which you’re defined — this only applies to functions which are lexically contained within other functions1. Global scope is anything at the module level. There’s also a special “builtin” scope outside of that, but let’s ignore that for now. Classes also have their own special sorts of scopes, but we’ll ignore that as well.

When you assign to a variable within a function, this counts as a declaration and the variable is created in the local scope2 of the function. This is unless you use the global keyword to force the variable to refer to one at module scope instead3.

When you read the value of a variable, Python starts with the local scope and attempts to look up the name there. If it’s not found, it recurses up through the enclosing scopes looking for it until it reaches the module scope (and finally the magic builtin scope). This is more or less as you’d expect if you’re used to normal lexically-scoped languages.
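
A tiny example of that lookup order (the names here are made up purely for illustration):

message = "module scope"

def outer():
    message = "enclosing scope"
    def inner():
        # Not local to inner(), so Python finds it in outer()'s enclosing scope.
        return message
    return inner()

print outer()   # prints "enclosing scope"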

However, if you were paying attention you’ll notice that I specifically said that a local scope is defined by a function. In particular, constructs such as for loops do not define their own scopes — they operate entirely in the local scope of the enclosing function (or module). This has some beneficial side-effects — for example, loop counters are still available once the loop has exited, which is rather handy. It has some potential pitfalls — take this code snippet, for example4:

functions = [(lambda: i) for i in xrange(5)]
print ", ".join(str(func()) for func in functions)

So, this builds a list of functions5 and then executes each one in turn, concatenating and printing the results. Intuitively one would expect the output to be 0, 1, 2, 3, 4, but actually we get 4, 4, 4, 4, 4 — eh?

What’s happening is that each of the functions created simply refers to the variable i from the scope in which it was defined (at module level, that’s the global scope). However, each iteration just rebinds that same loop counter in the local scope of the enclosing function (or module), so all the functions end up referring to the same variable i. In other words, closures in Python refer directly to their enclosing scopes, they don’t create “frozen copies” of them6.

This works fine when a closure is created by a function and then returned, because the enclosing scope is then kept alive only by the closure and inaccessible elsewhere. Further invocations of the same function will produce new scopes and different closures. In this case, though, the functions are all defined under the same scope. So when they’re evaluated, they all return the final value of i as it was when the loop terminated.

We can illustrate this by amending the example to delete the loop counter:

functions = [(lambda: i) for i in xrange(5)]
del i
print ", ".join(str(func()) for func in functions)

Now the third line raises an exception:

NameError: global name 'i' is not defined

Of course, if you use the generator expression form to defer generation of the functions until the point of invocation then everything works as you’d expect:

# This prints "0, 1, 2, 3, 4" as expected.
functions = ((lambda: i) for i in xrange(5))
print ", ".join(str(func()) for func in functions)

So, all this is quite comprehensible once you understand what’s going on, but I do wonder how many people get bitten by this sort of thing when using closures in loops.

As a final note, this behaviour is the same in Python 3.x. There is one small difference with regard to scopes: the addition of the nonlocal keyword, which is the equivalent of global except that it allows updating variables in enclosing scopes between the local and global scopes. I believe that with regard to reading the values of such variables, however, the behaviour is unchanged.
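
For completeness, here’s roughly what that looks like in Python 3 (a made-up counter example):

def make_counter():
    count = 0
    def increment():
        nonlocal count    # rebinds count in make_counter()'s scope
        count += 1
        return count
    return increment

counter = make_counter()
print(counter())   # 1
print(counter())   # 2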


  1. Note that this is a lexical definition of enclosure, which is to say it’s to do with where the function is defined. It’s nothing to do with where the function was called from. Unlike dynamically-scoped languages, Python gives a function no access to variables defined in the scope of a calling function. 

  2. This actually extends to the entire function, which is why it’s an error to read the value of a variable assigned to later in the function even if it exists in an enclosing scope. 

  3. Or the nonlocal keywords in Python 3.x — see the note at the end of this post. 

  4. This example uses a list comprehension for concision, but the issues described would apply equally to a for loop. 

  5. Yes I’m using lambda — so sue me, it’s just an example. 

  6. Actually, once you think of closures as references to a scope rather than some sort of “freeze-frame” of the state, some things are easier to understand. For example, if two functions close over the same scope, updates that each of them makes to it are visible to the other. This is especially relevant if they use Python 3’s nonlocal keyword (see the note at the end of this post). 

10 Apr 2013 at 3:41PM by Andy Pearce in Software  | Photo by Tim Mossholder on Unsplash  | Tags: python scoping  |  See comments

☑ The Dark Arts of C++ Streams

I’ve always avoided C++ streams as they never seemed as convenient as printf() and friends for formatted output, but I’ve decided I should finally take the plunge.

pagan ritual

I’ve never really delved into the details of C++ streams. For formatted output of builtin types they’ve always seemed less convenient than printf() and friends due to the way they mix together the format and the values. However, I recently decided it was time to figure out the basics for future reference, and I’m noting my conclusions in this blog post in case it proves a useful summary for anybody else.

A stream in C++ is a generic character-oriented serial interface to a resource. Streams may only accept input, only produce output or be available for both input and output. Writing and reading are achieved using the << operator, which is overloaded beyond its standard bit-shifting function to mean “write RHS to LHS stream”, and the >> operator, which is similarly overloaded to mean “read from LHS stream into RHS”.

To use streams, include the iostream header file for the basic functionality, and any additional headers for the specific streams to use — for example, file-oriented streams require the fstream header. Here’s an example of writing to a file using the stream interface1:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    myFile.open("/tmp/output.txt", std::ios::out | std::ios::trunc);
    myFile << "hello, world" << std::endl;
    myFile.close();

    return 0;
}

Most of this example is pretty standard. Since streams use the standard bit-shift operators, which are left-associative, the first operation performed in the third line of main() above is myFile << "hello, world". This expression also evaluates to a reference to the stream, allowing the operators to be chained to write multiple values in sequence. In this case, the std::endl identifier pushes a newline into the output stream and also implicitly calls the flush() method.

So far so obvious. What about reading from a file? Reading into strings is fairly obvious:

#include <iostream>
#include <fstream>
#include <string>

int main()
{
    std::fstream myFile;
    std::string myString;
    myFile.open("/tmp/input.txt", std::ios::in);
    myFile >> myString;
    std::cout << "Read: " << myString << std::endl;

    return 0;
}

In this example the cout stream represents stdout, but otherwise things are quite straightforward. However, if you run it you’ll see that the string contains only the text up to the first whitespace in the file. It turns out that this is the defined behaviour when reading into strings, which strikes me as a little quirky, but hey ho.

It’s quite possible to also read integer and other types — the example below demonstrates this as well as a file stream open for read/write and seeking:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    int intVar;
    float floatVar;
    bool boolVar;

    myFile.open("/tmp/testfile", std::ios::in | std::ios::out | std::ios::trunc);
    myFile << 123 << " " << "1.23" << " true" << std::endl;
    myFile.seekg(0);
    myFile >> intVar >> floatVar >> std::boolalpha >> boolVar;
    std::cout << "Int=" << intVar << " Float=" << floatVar
              << " Bool=" << boolVar << std::endl;
    myFile.close();

    return 0;
}

The file is opened for both read and write here, and also any existing file will be truncated to zero length upon opening. The output line is much the same as previous examples, but the input line demonstrates how input streams are overloaded based on the destination type to parse character input into the appropriate type. The seekg() method fairly obviously seeks within the stream in a similar way to standard C file IO.

Also demonstrated here is an IO manipulator, in this case std::boolalpha, which causes bool values to be read and written as the strings "true" and "false" rather than as 1 and 0. It can be used on both input and output streams. The important thing to remember about manipulators like this is that they set persistent flags on the stream; they don’t just apply to the following value. For example, the following function will show the first bool as an integer, the next two as strings and the final one as an integer again:

void showbools(bool one, bool two, bool three, bool four)
{
    std::cout << one << ", " << std::boolalpha << two << ", " << three
              << ", " << std::noboolalpha << four << std::endl;
}

Other examples include std::setbase, which displays integers in other number bases; and std::fixed, which displays floating point values to a fixed number of decimal places, determined by the std::setprecision manipulator.

All these manipulators are really just shorthand for calling setf() and related methods at the appropriate points on the stream. So, printf()-like formatting can be done, albeit in a slightly more verbose manner. Many of them require the iomanip header to be included.
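
For instance, a small sketch of those manipulators in action (the values are arbitrary):

#include <iostream>
#include <iomanip>

int main()
{
    // Integers in hexadecimal, then a float to two decimal places.
    std::cout << std::setbase(16) << 255 << std::endl;    // prints "ff"
    std::cout << std::fixed << std::setprecision(2)
              << 3.14159 << std::endl;                    // prints "3.14"
    return 0;
}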

So what about more basic file IO, such as reading an entire line into a string as opposed to a single word? To do this you need to avoid the stream operators and instead use appropriate methods — for example, std::getline() will read up to a newline into a string:

// Displays specified file to stdout with line numbers.
// Requires <iomanip> for std::setw (and <string> for std::getline).
void dumpFile(const char *filename)
{
    std::fstream myFile;
    myFile.open(filename, std::ios::in);
    unsigned int lineNum = 0;
    std::string line;
    while (std::getline(myFile, line)) {
        std::cout << std::setw(3) << ++lineNum << " " << line << std::endl;
    }
    myFile.close();
}

To instead read an arbitrary number of characters, the read() method can be used, which can read into a standard C char array. There are also overloads of the get() method which do the same thing, but since the read() method has only a single purpose it’s probably clearer to use that.
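
As a quick sketch (the filename and buffer size are arbitrary), read() pulls a fixed number of characters into a plain buffer and gcount() reports how many were actually read:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    myFile.open("/tmp/input.txt", std::ios::in);
    char buffer[17] = {0};            // one spare byte for the terminator
    myFile.read(buffer, 16);
    std::cout << "Read " << myFile.gcount() << " chars: " << buffer << std::endl;
    myFile.close();
    return 0;
}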

To read into a std::string, however, requires another concept of C++ streams — the streambuf. This is just a generalisation of a buffer holding characters which can be sent to or received from streams. Existing streams use a buffer to hold the characters read from and written to the stream, and this can be accessed via the rdbuf() method. Using this and our own std::stringstream, which is a stream wrapper around a std::string, we can read an entire file into a std::string:

// Requires the <sstream> header.
std::string readFile(std::fstream &inFile)
{
    std::stringstream buffer;
    buffer << inFile.rdbuf();
    return buffer.str();
}

However, this still doesn’t address the issue of reading n characters from the stream directly into a std::string. I originally didn’t think this was possible without resorting to reading into a char array, but as a result of a Stack Overflow question I asked I’ve realised that it can be done into a std::string safely. The trick is to call resize() to make sure the string has enough valid space to store the result of the read, and then use the non-const version of operator[] to get the address of the string’s character storage2. Crucially you can use neither c_str() nor data(), which both return read-only pointers; the result of modifying what they point to is undefined.
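
That approach might look something like the following sketch (error handling omitted as before):

#include <fstream>
#include <string>

// Read up to 'count' characters from a stream directly into a std::string.
std::string readChars(std::fstream &inFile, std::string::size_type count)
{
    std::string result;
    result.resize(count);              // make sure valid storage exists
    inFile.read(&result[0], count);    // non-const operator[] gives writable storage
    result.resize(inFile.gcount());    // trim if fewer characters were available
    return result;
}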

Finally, I’ll very briefly cover the issue of customising a class so that it can be sent to and from streams like the built-in types. Actually this is just as simple as providing a new overload of the operator<< (or operator>>) function for the appropriate stream type. The example below shows a class which can output itself:

#include <iostream>

class MyIntWrapper
{
public:
  MyIntWrapper(int value) : value(value) { }
private:
  int value;

  friend std::ostream &operator<<(std::ostream &o, const MyIntWrapper &i);
};

std::ostream &operator<<(std::ostream &ost, const MyIntWrapper &instance)
{
  ost << "<MyIntWrapper: " << instance.value << ">";
  return ost;
}

int main()
{
  MyIntWrapper sample(123);
  std::cout << sample << std::endl;
  return 0;
}

The only thing to note is the friend declaration, which is required to allow the operator function to access the private data members of the class. Input operators can be overloaded in a similar way.


  1. Note that error handling has been omitted for clarity in all examples. 

  2. Implementations which use copy-on-write, such as the GNU STL, are forced to perform any required copy operations when the non-const version of operator[] is used. Interestingly, C++11 effectively forbids copy-on-write implementations of std::string which makes the whole thing rather less tricky (but also potentially slower for some use-cases, although those cases should probably be using their own classes anyway). 

8 Apr 2013 at 12:28PM by Andy Pearce in Software  | Photo by freestocks.org on Unsplash  | Tags: c++ c++-stl  |  See comments

☑ Function Template Fun

Function template specialisation is a tricky beast — the short answer is, avoid it.

stencil lake

For my sins I’ve recently had to implement some fairly generic code in C++ using function templates, and I’ve realised something a little quirky about them that hadn’t occurred to me before.

For anybody who’s unaware of templating in C++ this post probably won’t be of much interest, but as a refresher of the syntax you can declare a class template like this:

template <typename Type>
class MyIncrementer
{
public:
    Type increment(Type value)
    {
        return value + 1;
    }
};

Aside from having to define these classes in the header file (because the compiler requires the implementation of each method to be available to instantiate the template) this works more or less as you’d expect. You can also specialise templates, providing an alternate implementation for particular types:

// Requires <string>, <sstream> and <cstdlib>.
template <>
class MyIncrementer<std::string>
{
public:
    std::string increment(std::string value)
    {
        unsigned long int v = strtoul(value.c_str(), NULL, 10);
        std::stringstream out;
        out << v + 1;
        return out.str();
    }
};

This works really well for classes, but you can also template functions and methods in the same way:

template <typename Type>
Type myIncrementFunction(Type value)
{
    return value + 1;
}

And finally, you can also specialise them in the same way1:

template <>
std::string myIncrementFunction(std::string value)
{
    unsigned long int v = strtoul(value.c_str(), NULL, 10);
    std::stringstream out;
    out << v + 1;
    return out.str();
}

So, that’s all there is to it right? Well, not quite. You see, function templates are slightly different beasts to class templates. For example, you can still overload templated functions just like you can with regular functions. The following are all quite valid at the same time:

// Template on any type.
template <typename Type>
void myFunction(Type value)
{ /* ... */ }

// Overload previous template with a pointer version.
template <typename Type>
void myFunction(Type *value)
{ /* ... */ }

// A simple non-templated overload.
void myFunction(int* value)
{ /* ... */ }

// A template specialisation.
template <>
void myFunction<int*>(int *value)
{ /* ... */ }

Clearly there’s some potential for ambiguity here, and since this is all valid syntax, the compiler can’t just give you a warning. The first rule is that non-templated functions (the third one above) always take precedence over templated ones. So, if myFunction() is passed an int* then the third declaration above will always be called.

Essentially this means that the fourth declaration will never be called, right? Well, almost — you can actually force it to be invoked by calling it as myFunction<int*>(...). But otherwise, the non-templated function is always counted as a better match.
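
Here’s a cut-down sketch of that (only the first template, its specialisation and the plain overload, so there’s no ambiguity about which template the specialisation belongs to):

#include <iostream>

template <typename Type>
void myFunction(Type value)
{ std::cout << "primary template" << std::endl; }

template <>
void myFunction<int*>(int *value)
{ std::cout << "int* specialisation" << std::endl; }

void myFunction(int *value)
{ std::cout << "non-templated overload" << std::endl; }

int main()
{
    int i = 0;
    myFunction(&i);        // prints "non-templated overload"
    myFunction<int*>(&i);  // prints "int* specialisation"
    return 0;
}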

However, there is still potential for more confusion here. Let’s omit the non-templated function and reorder the definitions slightly2:

// The same basic template as above.
template <typename Type>
void myFunction(Type value)
{ /* ... */ }

// The same specialisation as above, but moved earlier.
template <>
void myFunction<int*>(int* value)
{ /* ... */ }

// The same pointer template as above.
template <typename Type>
void myFunction(Type *value)
{ /* ... */ }

Now we call myFunction() with an int* again. Since we’ve removed the non-templated function, the specialisation will be called, right?

WRONG!3

Simply re-ordering the declarations means that the specialisation is now an explicit specialisation of the first template, since the second didn’t exist at the point it was defined. When the second template is added, it makes a better match for int* than the first template, so the specialisation of the first template isn’t even considered.

The general rule of overloading templates like this is that only the generic templates themselves overload on each other; specialisations aren’t considered until a base template has been chosen, even if there’s a specialisation which would have made a better match.

Confusing, eh? Still, at least there’s an easy moral lesson here — don’t specialise function templates. In general you shouldn’t have to, since you can just overload them with non-templated functions, which will take precedence anyway and is probably what you wanted in the first place.

Or just avoid templates entirely. And C++. Use Python. Or just take a walk in the park instead. Look, there’s a bird making a nest. Hopefully out of the shredded remains of the C++11 standard.


  1. Note that the <std::string> suffix to the function name can only be omitted if the compiler can unambiguously determine the template type. 

  2. This code example is essentially one presented by Peter Dimov and Dave Abrahams — I found it in an article originally written about this which I’ve fairly shamelessly plagiarised here. 

  3. I can’t really blame you for getting it wrong, however. Especially if you didn’t — it’s all a bit presumptuous of me, really. Let’s move on. 

3 Apr 2013 at 1:12PM by Andy Pearce in Software  | Photo by rawpixel.com on Unsplash  | Tags: c++ c++-templates  |  See comments

☑ Git in “almost does what you expect” shocker

Git’s filename filtering surprisingly works as you’d expect. To an extent.

index drawers

I recently had cause to find out about Git commits which affected files whose name matched a particular pattern. In a classic case of trying to be too clever, I hunted through all the documentation in a vain effort to find some option which would meet my requirements. You can filter by commit comments, by changes added or removed within the diff, by commit time or a bundle of other options, but I couldn’t find anything to do with the filename.

I was just about to ask a question on Stack Overflow when it suddenly occurred to me to try the standard commit filtering by filename, but using a shell glob instead of a full filename, and I was surprised to see it simply worked — perhaps I shouldn’t have been.

So, if you want to find changes to, say, any file with a .txt extension in your repository you can use this:

git log -- '*.txt'

This behaviour is perhaps a little surprising, because that * is matching any character including slashes, which is why this pattern works recursively across the repository. Intrigued as to what was going on I did a little judicious ltracing and found that Git indeed calls fnmatch() to match filenames in this case. Further, it doesn’t pass any flags to the call; in particular it doesn’t pass FNM_PATHNAME, which would prevent wildcards from matching path separators.

The slightly quirky thing I then observed is that Git appears to notice whether you’ve used any wildcards and decides whether or not to use fnmatch() on this basis, presumably to make operations not using wildcards faster. I tried digging around in the source code and believe I’ve located where the matching is done: the slightly fearsome-looking tree_entry_interesting() function. This calls into git_fnmatch(), which is a fairly thin wrapper around fnmatch(). Indeed, it’s easy to see that GFNM_PATHNAME is not passed into git_fnmatch(), which would otherwise be converted into FNM_PATHNAME.

This glob matching behaviour is potentially quite useful, but it’s inconsistent with standard shell glob matches which behave as if FNM_PATHNAME is set. It also makes it difficult to express things such as “match all text files in the current directory only”.
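
If you want to experiment with the same matching rules without reaching for ltrace, Python’s fnmatch module happens to behave the same way (it has no FNM_PATHNAME equivalent), so it makes a handy sandbox - purely illustrative, this isn’t what Git itself calls:

import fnmatch

# '*' crosses directory separators, just like Git's unflagged fnmatch() call.
print fnmatch.fnmatch("subdir/notes.txt", "*.txt")   # True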

I wonder how many people will find this behaviour confusing. Mind you, probably not a massive proportion of people once you’ve already removed everyone who finds everything else about Git confusing as well.

18 Mar 2013 at 12:01PM by Andy Pearce in Software  | Photo by Sanwal Deen on Unsplash  | Tags: git scm  |  See comments

☑ File Capabilities In Linux

File capabilities offer a secure alternative to SUID executables, but can be a little confusing at first.

mixing desk

We all know that SUID binaries are bad news from a security perspective. Fortunately, if your application requires some limited privileges then there is a better way known as capabilities.

To save you reading the article above in detail, in essence this allows processes which start as root, and hence have permission to do anything, to retain certain limited abilities chosen from this set when they drop privileges and run as a standard user. This means that if an attacker manages to compromise the process via a buffer overrun or similar exploit, they can’t take advantage of anything except the specific minimal privileges that the process actually needs.

This is great for services, which are typically always run as root, but what about command-line utilities? Fortunately this is catered for as well, provided you have the correct utilities installed. If you’re on Ubuntu you’ll need the libcap2-bin package, for example. You’ll also need to be running a non-archaic kernel (anything since 2.6.24).

These features allow you to associate capabilities with executables, in a similar way to setting the SUID bit but only for the specific capabilities set. The setcap utility is used to add and remove capabilities from a file.

The first step is to select the capabilities you need. For the purposes of this blog, I’ll assume there’s a network diagnostic tool called tracewalk which needs to be able to use raw sockets. This would typically require the application to be run as root, but consulting the list it turns out that only the CAP_NET_RAW capability is required.

Assuming you’re in the directory where the tracewalk binary is located, you can add this capability as follows:

sudo setcap cap_net_raw=eip tracewalk

For now ignore that =eip suffix on the capability, I’ll explain that in a second. Note that the capability name is lowercase. You can now verify that you’ve set the capabilities properly with:

setcap -v cap_net_raw=eip tracewalk

Or you can recover a list of all capabilities set on a given executable:

getcap tracewalk

For reference, you can also remove all capabilities from an executable with:

setcap -r tracewalk

At this point you should be able to run the executable as an unprivileged user and it should have the ability to deal with raw sockets, but none of the other privileges that the root user has.

So, what’s the meaning of the strange =eip suffix? This requires a brief digression into the nature of capabilities. Each process has three sets of capabilities — inheritable, permitted and effective:

  • Effective capabilities are those which define what a process can actually do. For example, it can’t deal with raw sockets unless CAP_NET_RAW is in the effective set.
  • Permitted capabilities are those which a process is allowed to have should it ask for them with the appropriate call. These don’t allow a process to actually do anything unless it’s been specially written to ask for the capability specified. This allows processes to be written to add particularly sensitive capabilities to the effective set only for the duration when they’re actually required.
  • Inheritable capabilities are those which can be inherited into the permitted set of a spawned child process. During a fork() or clone() operation the child process is always given a duplicate of the capabilities of the parent process, since at this point it’s still running the same executable. The inheritable set is used when an exec() (or similar) is called to replace the running executable with another. At this point the permitted set of the process is masked with the inheritable set to obtain the permitted set that will be used for the new process.

So, the setcap utility allows us to add capabilities to these three sets independently for a given executable. Note, however, that the meaning of the sets is interpreted slightly differently when applied to files rather than processes:

  • Permitted file capabilities are those which are always available to the executable, even if the parent process which invoked it did not have them. These used to be called “forced” capabilities.
  • Inheritable file capabilities specify an additional mask which is ANDed with the calling process’s inheritable set, so a capability is only inherited if it exists in both sets.
  • Effective file capability is actually just a single bit rather than a set, and if set then it indicates that the entire permitted set is also copied to the effective set of the new process. This can be used to add capabilities to processes which weren’t specifically written to request them. Since it is a single bit, if you set it for any capability then it must be set for all capabilities. You can think of this as the “legacy” bit because it’s used to allow capabilities to be used for applications which don’t support them.

When specifying capabilities via setcap the three letters e, i and p refer to the effective, inheritable and permitted sets respectively. So the earlier specification:

sudo setcap cap_net_raw=eip tracewalk

… specifies that the CAP_NET_RAW capability should be added to the permitted and inheritable sets and that the effective bit should also be set. This will replace any previously set capabilities on the file. To set multiple capabilities, use a comma-separated list:

sudo setcap cap_net_admin,cap_net_raw=eip tracewalk

The capabilities man page discusses this all in more detail, but hopefully this post has demystified things slightly. The only remaining things to mention are a few caveats and gotchas.

Firstly, file capabilities don’t work with symlinks — you have to apply them to the binary itself (i.e. the target of the symlink).

Secondly, they don’t work with interpreted scripts. For example, if you have a Python script that you’d like to assign a capability to, you have to assign it to the Python interpreter itself. Obviously this is a potential security issue because then all scripts run with that interpreter will have the specified capability, although it’s still significantly better than making it SUID. The most common workaround appears to be to write a separate executable in C or similar which can perform the required operations and invoke that from within the script. This is similar to the approach used by Wireshark which uses the binary /usr/bin/dumpcap to perform privileged operations:

$ getcap /usr/bin/dumpcap 
/usr/bin/dumpcap = cap_net_admin,cap_net_raw+eip

Thirdly, file capabilities are disabled if you use the LD_LIBRARY_PATH environment variable for hopefully obvious security reasons1. The same also applies to LD_PRELOAD as far as I’m aware.


  1. Because an attacker could obviously subvert one of the standard libraries and use LD_LIBRARY_PATH to cause their subverted library to be invoked in preference to the system one, and hence have their own arbitrary code executed with the same privileges as the calling application. 

11 Mar 2013 at 12:07PM by Andy Pearce in Software  | Photo by Alexey Ruban on Unsplash  | Tags: linux capabilities  |  See comments

☑ (A)IEEE, maths!

Writing some notes on IEEE 754 led to discovering a nifty way to render formulae on the web.

maths blackboard

For my sins I recently had to do some work on IEEE 754 format floating point numbers. For the uninitiated1, IEEE 754 is a standard which specifies the underlying binary representation of floating point numbers, and the rules for their manipulation. Typically it’s not something that you need to worry about because compilers handle it for you, but in this case I couldn’t make any assumptions about the underlying hardware and the representation really had to be compliant.

As usual when I encounter something new, I set about pulling together my own little reference on the subject, and this led me to another issue — I needed to write some simple mathematical formulae on my wiki page, but HTML is really quite deficient in its ability to express such things. I did discover that a GTK text box can accept arbitrary unicode characters by holding down CTRL and SHIFT and typing U followed by four hex digits - this lasted all of about three seconds before I got sick and tired of continually2 looking up code points.

It was at this point a quick bit of Googling led me to MathJax, which is a great little Javascript-based solution to displaying formulae in browsers. Essentially you enter it in LaTeX format3 and the library handles rendering it elegantly in whatever browser is in use. Since I use Dokuwiki, I was also pleasantly surprised to see that there’s even a plugin for it already. You don’t even need to host the library yourself, since they have their own CDN for that, which means there’s a decent chance it’s already cached in a given browser anyway.
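
To give an idea of what that involves, the value of a normalised IEEE 754 double could be written in the sort of LaTeX that MathJax accepts as something like:

$$ x = (-1)^{s} \times 1.f \times 2^{e - 1023} $$

… where s is the sign bit, f the 52-bit fraction field and e the stored 11-bit exponent (which is biased by 1023).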

So, I still had to wrap my head around strange floating point conversion issues, but at least I had a pleasant way to write about it, which almost4 made up for it.


  1. You lucky, lucky people. 

  2. For “continually” read “twice”. 

  3. Or MathML if you’re a total masochist. 

  4. For “almost” read “not even slightly”. 

25 Feb 2013 at 1:18PM by Andy Pearce in Software  | Photo by Roman Mager on Unsplash  | Tags: maths  web floatingpoint  |  See comments

☑ Herding children with a python

It’s possible to manage multiple subprocesses in Python, but there are a few gotchas.

herding cattle

Happy Valentines Day!

Right, now we’ve got that out of the way…

The subprocess module in Python is a handy interface to manage forked commands. If you’re still using os.popen() or os.system() give yourself a sharp rap on the knuckles1 and go and read that Python documentation page right now. No really, I’ll wait.

One of the things that’s slightly tricky with subprocess, however, is the management of multiple child processes running in parallel. This is quite possible, but there are a few quirks.

There are two basic approaches you can take — using the select module to watch the various file descriptors, or spawning two separate threads for each process, one for stdout and one for stderr. I’ll discuss the former option here because it’s a little lighter on system resources, especially where many processes are concerned. It works fine on POSIX systems2, but on Windows you may need to use threads3.

The first point is that you can’t use any form of blocking I/O when interacting with the processes, or one of the other processes may fill the pipe it’s using to communicate with the main process and hence block. This means that your main process needs to be continually reading from both stdout and stderr of each process, lest they suddenly produce reams of output, but it shouldn’t read anything unless it knows it won’t block. This is why either select.select() or select.poll() are used.

The second issue is that you have to use the underlying operating system I/O primitives such as os.read(), because the builtin Python ones have a tendency to loop round reading until they’ve got as many bytes as requested or they reach EOF. You could read 1 byte at a time, but that’s dreadfully inefficient.

The third issue is that you have to manage closing down processes properly and removing their file descriptors from your watched set or they may continually flag themselves as being read-ready.

The fourth issue, which is a bit of an obscure one, is that you may wish to consider processes which close either or both of stdout and stderr before terminating — you may not need to care about this.
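
To make the shape of the select-based approach concrete, here’s a heavily simplified sketch (this is not the ProcPoller class itself - the commands are arbitrary and there’s no error handling):

import os
import select
import subprocess

commands = (["ls", "-l", "/etc"], ["uname", "-a"])
procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
         for cmd in commands]

poller = select.poll()
watched = {}
for proc in procs:
    for pipe in (proc.stdout, proc.stderr):
        poller.register(pipe.fileno(), select.POLLIN | select.POLLHUP)
        watched[pipe.fileno()] = proc

while watched:
    for fd, event in poller.poll():
        data = os.read(fd, 4096)     # OS-level read: returns whatever is available
        if data:
            print data,              # real code would parse or store this
        else:
            poller.unregister(fd)    # EOF: stop watching this descriptor
            del watched[fd]

for proc in procs:
    proc.wait()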

Fortunately for all you lucky people, I’ve already written the ProcPoller class which handles all this — I wrote this some time ago, but I only just dug it up out of an old backup.

Essentially it works in a similar way to select.poll() — you create an instance of it, and then you call the run_command() method to fork off processes in the background. Each command is also passed a context object, which can be anything hashable (i.e. non-mutable). This is used as the index to refer to that command in future.

Once everything is running that you want to watch, you call the poll() method, optionally with a timeout if you like. This will watch both stdout and stderr of each process and pass anything from them into the handle_output() method, which you can override in a derived class for your own purposes.

The base class will collect this output into two strings per process in the output dictionary, which may be fine for commands which produce little output. For more voluminous quantities, or long-running processes, you’re better off overriding handle_output() to parse the data as you go. You can also return True from handle_output() to terminate the process once you have the result you need.

Do drop a comment in if you find this useful at any point, or if you need any help using it.

But seriously, whether you use this class or not, stop using os.system(). Pretty please!


  1. Preferably from a respected artist like Antony Carmichael

  2. A portable implementation should support both select.select() and select.poll() as some platforms only provide one of these. The implementation I link to below uses poll() only for simplicity, but I only ever intended this code to support Linux systems. If you look at the implementation of communicate() on POSIX systems in the subprocess library you’ll see an example of supporting both. 

  3. If you want to see how this might be done on Windows, take a look at the implementation of communicate() on Windows systems in the subprocess library. 

14 Feb 2013 at 11:15AM by Andy Pearce in Software  | Photo by Daniel Burka on Unsplash  | Tags: python subprocess  |  See comments
