☑ That's Good Enough
Gilding lilies might be fun, but gold leaf doesn't grow on trees.

Blog

Site

Me

☑ When is a closure not a closure?

10 Apr 2013 at 3:41PM in Software

｜

Photo by Tim Mossholder on Unsplash

｜

See comments

python scoping

Python’s simple scoping rules occasionally hide some surprising behaviour.

Scoping in Python is pretty simple, especially in Python 2.x. Essentially you have three scopes:

Local scope
Enclosing scope
Global scope

Local scope is anything defined in the same function as you. Enclosing scopes are those of the functions in which you’re defined — this only applies to functions which are lexically contained within other functions¹. Global scope is anything at the module level. There’s also a special “builtin” scope outside of that, but let’s ignore that for now. Classes also have their own special sorts of scopes, but we’ll ignore that as well.

When you assign to a variable within a function, this counts as a declaration and the variable is created in the local scope² of the function. This is unless you use the global keyword to force the variable to refer to one at module scope instead³.

When you read the value of a variable, Python starts with the local scope and attempts to look up the name there. If it’s not found, it recurses up through the enclosing scopes looking for it until it reaches the module scope (and finally the magic builtin scope). This is more or less as you’d expect if you’re used to normal lexically-scoped languages.

However, if you were paying attention you’ll notice that I specifically said that a local scope is defined by a function. In particular, constructs such as for loops do not define their own scopes — they operate entirely in the local scope of the enclosing function (or module). This has some beneficial side-effects — for example, loop counters are still available once the loop has exited, which is rather handy. It has some potential pitfalls — take this code snippet, for example⁴:

functions = [(lambda: i) for i in xrange(5)]
print ", ".join(str(func()) for func in functions)

So, this builds a list of functions⁵ and then executes each one in turn and concatenates and prints the results. Intuitively one would expect the results to be 0 1 2 3 4, but actually we get 4 4 4 4 4 — eh?

What’s happening is that each of the functions created is in a closure with the variable i in its global scope bound to the one used in the loop. However, each iteration just updates the same loop counter in the local scope of the enclosing function (or module) and so all the functions end up with a reference to the same variable i. In other words, closures in Python refer directly to the enclosing scopes, they don’t create “frozen copies” of them⁶.

This works fine when a closure is created by a function and then returned, because the enclosing scope is then kept alive only by the closure and inaccessible elsewhere. Further invocations of the same function will produce new scopes and different closures. In this case, though, the functions are all defined under the same scope. So when they’re evaluated, they all return the final value of i as it was when the loop terminated.

We can illustrate this by amending the example to delete the loop counter:

functions = [(lambda: i) for i in xrange(5)]
del i
print ", ".join(str(func()) for func in functions)

Now the third line raises an exception:

NameError: global name 'i' is not defined

Of course, if you use the generator expression form to defer generation of the functions until the point of invocation then everything works as you’d expect:

# This prints "0 1 2 3 4" as expected.
functions = ((lambda: i) for i in xrange(5))
print ", ".join(str(func()) for func in functions)

So, all this is quite comprehensible once you understand what’s going on, but I do wonder how many people get bitten by this sort of thing when using closures in loops.

As a final note, this behaviour is the same in Python 3.x. There is a small difference with regards to scopes that is the addition of the nonlocal keyword which is the equivalent of global except it allows updating the value of variables in enclosing scopes which are between the local and global scopes. I believe that with regards to reading the values of such variables, however, the behaviour is unchanged.

Note that this is a lexical definition of enclosure, which is to say it’s to do with where the function is defined. It’s nothing to do with where the function was called from. Unlike dynamically-scoped languages, Python gives a function no access to variables defined in the scope of a calling function. ↩
This actually extends to the entire function, which is why it’s an error to read the value of a variable assigned to later in the function even if it exists in an enclosing scope. ↩
Or the nonlocal keywords in Python 3.x — see the note at the end of this post. ↩
This example uses a list comprehension for concision, but the issues described would apply equally to a for loop. ↩
Yes I’m using lambda — so sue me, it’s just an example. ↩
Actually, once you think of closures as references to a scope rather than some sort of “freeze-frame” of the state, some things are easier to understand. For example, if two functions are defined in the same closure, updates that each of them makes to the state can be felt by the other. This is especially relevant if they use Python 3’s nonlocal keyword (see the note at the end this post). ↩

10 Apr 2013 at 3:41PM in Software

｜

Photo by Tim Mossholder on Unsplash

｜

See comments

python scoping