☑ Python destructor drawbacks

23 Apr 2013 at 10:48AM in Software
 |  | 

Python’s behaviour with regards to destructors can be a little surprising in some cases.

green python

As you learn Python, sooner or later you’ll come across the special method __del__() on classes. Many people, especially those coming from a C++ background, consider this to be the “destructor” just as they consider __init__() to be the “constructor”. Unfortunately, they’re often not quite correct on either count, and Python’s behaviour in this area can be a little quirky.

Take the following console session:

>>> class MyClass(object):
...   def __init__(self, init_dict):
...     self.my_dict = init_dict.copy()
...   def __del__(self):
...     print "Destroying MyClass instance"
...     print "Value of my_dict: %r" % (self.my_dict,)
... 
>>> instance = MyClass({1:2, 3:4})
>>> del instance
Destroying MyClass instance
Value of my_dict: {1: 2, 3: 4}

Hopefully this is all pretty straightforward. The class is constructed and __init__() takes an initial dict instance and stores a copy of it as the my_dict attribute of the MyClass instance. Once the final reference to the MyClass instance is removed (with del in this case) then it is garbage collected and the __del__() method is called, displaying the appropriate message.

However, what happens if __init__() is interrupted? In C++ if the constructor terminates by throwing an exception then the class isn’t counted as fully constructed and hence there’s no reason to invoke the destructor1. How about in Python? Consider this:

>>> try:
...   instance = MyClass([1,2,3,4])
... except Exception as e:
...   print "Caught exception: %s" % (e,)
... 
Caught exception: 'list' object has no attribute 'copy'
Destroying MyClass instance
Exception AttributeError: "'MyClass' object has no attribute 'my_dict'" in <bound method MyClass.__del__ of <__main__.MyClass object at 0x7fd309fbc450>> ignored

Here we can see that a list instead of a dict has been passed, which is going to cause an AttributeError exception in __init__() because list lacks the copy() method which is called. Here we catch the exception, but then we can see that __del__() has still been called.

Indeed, we get a further exception there because the my_dict attribute hasn’t had chance to be set by __init__() due to the earlier exception. Because __del__() methods are called in quite an odd context, exceptions thrown in them actually result in a simple error to stderr instead of being propagated. That explains the odd message about an exception being ignored which appeared above.

This is quite a gotcha of Python’s __del__() methods — in general, you can never rely on any particular piece of initialisation of the object having been performed, which does reduce their usefulness for some purposes. Of course, it’s possible to be fairly safe with judicious use of hasattr() and getattr(), or catching the relevant exceptions, but this sort of fiddliness is going to lead to tricky bugs sooner or later.

This all seems a little puzzling until you realise that __del__() isn’t actually the opposite of __init__() — in fact, it’s the opposite of __new__(). Indeed, if __new__() of the base class (which is typically responsible for actually doing the allocation) fails then __del__() won’t be called, just as in C++. Of course, this doesn’t mean the appropriate thing to do is shift all your initialisation into __new__() — it just means you have to be aware of the implications of what you’re doing.

There are other gotchas of using __del__() for things like resource locking as well, primarily that it’s a little too easy for stray references to sneak out and keep an object alive longer than you expected. Consider the previous example, modified so that the exception isn’t caught:

>>> instance = MyClass([1,2,3,4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
AttributeError: 'list' object has no attribute 'copy'
>>>

Hmm, how odd — the instance can’t have been created because of the exception, and yet there’s no message from the destructor. Let’s double-check that instance wasn’t somehow created in some weird way:

>>> print instance
Destroying MyClass instance
Exception AttributeError: "'MyClass' object has no attribute 'my_dict'" in <bound method MyClass.__del__ of <__main__.MyClass object at 0x7fd309fbc2d0>> ignored
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'instance' is not defined

Isn’t that interesting! See if you can have a guess at what’s happened…

… Give up? So, it’s true that instance was never defined. That’s why when we try to print it subsequently, we get the NameError exception we can see at the end of the second example. So the only real question is why was __del__() invoked later than we expected? There must be a reference kicking around somewhere which prevented it from being garbage collected, and using gc.get_referrers() we can find out where it is:

>>> instance = MyClass([1,2,3,4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
AttributeError: 'list' object has no attribute 'copy'
>>> import sys
>>> import gc
>>> import types
>>> 
>>> for obj in gc.get_objects():
...   if isinstance(obj, MyClass):
...     for i in gc.get_referrers(obj):
...       if isinstance(i, types.FrameType):
...         print repr(i)
... 
<frame object at 0x1af19c0>
>>> sys.last_traceback.tb_next.tb_frame
<frame object at 0x1af19c0>

Because we don’t have a reference to the instance any more, we have to trawl through the gc.get_objects() output to find it, and then use gc.get_referrers() to find who has the reference. Since I happen to know the answer already, I’ve filtered it to only show the frame object — without this filtering it also includes the list returned by gc.get_objects() and calling repr() on that yields quite a long string!

We then compare this to the parent frame of sys.last_traceback and we get a match. So, the reference that still exists is from a stack frame attached to sys.last_traceback, which is the traceback of the most recent exception thrown. What happened earlier when we then attempted print instance is that this threw an exception which replaced the previous traceback (only the most recent one is kept) and this removed the final reference to the MyClass instance hence causing its __del__() method to finally be called.

Phew! I’ll never complain about C++ destructors again. As an aside, many of the uses for the __del__() method can be replaced by careful use of the context manager protocol, although this does typically require your resource management to extend over only a single function call at some level in the call stack as opposed to the lifetime of a class instance. In many cases I would argue this is actually a good thing anyway, because you should always try to minimise the time when a resource is acquired, but like anything it’s not always applicable.

Still, if you must use __del__(), bear these quirks in mind and hopefully that’s one less debugging nightmare you’ll need to go through in future.


  1. The exception (haha) to this is when a derived class’s constructor throws an exception, then the destructor of any base classes will still be called. This makes sense because by the time the derived class constructor was called, the base class constructors have already executed fully and may need cleaning up just as if an instance of the base class was created directly. 

23 Apr 2013 at 10:48AM in Software
 |  |