Python 2to3: What’s New in 3.2

7 Feb 2021 at 1:08PM in Software

Another installment in my look at all the new features added to Python in each 3.x release, this one covering 3.2. There’s a lot covered including the argparse module, support for futures, changes to the GIL implementation, SNI support in SSL/TLS, and much more besides. This is my longest article ever by far! If you’re puzzled why I’m looking at releases that are years old, check out the first post in the series.

This is the 3rd of the 14 articles that currently make up the “Python 2to3” series.


In this post I’m going to continue my examination of every Python 3.x release to date with a look at Python 3.2. I seem to remember this as a pretty big one, so there’s some possibility that this article will rival the first one in this series for length. In fact, it got so long that I also implemented “Table of Contents” support in my articles! So, grab yourself a coffee and snacks and let’s jump right in and see what hidden gems await us.

Command-line Arguments

We kick off with one of my favourite Python modules, argparse, defined in PEP 389. This is the latest in a series of modules for parsing command-line arguments, which is a topic close to my heart as I’ve written a lot of command-line utilities over the years. I spent a number of those years getting increasingly frustrated with the amount of boilerplate I needed to add every time for things like validating arguments and presenting help strings.

Python’s first attempt at this was the getopt module, which was essentially just exposing the POSIX getopt() function in Python, even offering a version that’s compatible with the GNU version. This works, and it’s handy for C programmers familiar with the API, but it makes you do most of the work of validation and such. The next option was optparse, which did a lot more work for you and was very useful indeed.

Whilst optparse did a lot of the work of parsing options for you (e.g. --verbose), it left any other arguments in the list for you to parse yourself. This was always slightly frustrating for me: if you expect the user to pass, say, a list of integers, it seemed inconvenient to force them to use options just to take advantage of the parsing and validation the module offers. Also, more complex command-line applications like git often have subcommands, which are tedious to validate by hand as well.

The argparse module is a replacement for optparse which aims to address these limitations, and I think by this point we’ve got to something pretty comprehensive. Its usage is fairly similar to optparse, but it adds enough flexibility to parse all sorts of arguments. It can also validate the types of arguments, provide command-line help automatically and allow subcommands to be validated.
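For instance, type validation of positional arguments needs almost no code at all. This is a minimal sketch; the program name and arguments here are invented:

```python
import argparse

# Hypothetical utility which sums a list of integers passed as positional
# arguments; argparse converts and validates each one via type=int.
parser = argparse.ArgumentParser(prog="sum-ints")
parser.add_argument("values", metavar="N", type=int, nargs="+",
                    help="one or more integers to sum")

args = parser.parse_args(["1", "2", "3"])
print(sum(args.values))  # 6
```

Passing something that isn’t an integer makes argparse exit with a usage error, with no hand-written validation required.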

The variety of options this module provides is massive, so there’s no way I’m going to attempt an exhaustive examination here. By way of illustration, I’ve implemented a very tiny subset of the git command-line as a demonstration of how subcommands work:

import argparse
import os

# These functions would normally carry out the subcommands.
def do_status(args):
    print("Normally I'd run the status command here")

def do_log(args):
    print("Normally I'd run the log command here")

# We construct the base parser here and add global options.
parser = argparse.ArgumentParser()
parser.add_argument("--version", action="version", version="%(prog)s 2.24")
parser.add_argument("-C", action="store", dest="working_dir", metavar="<path>",
                    help="Run as if git was started in <path>")
parser.add_argument("-p", "--paginate", action="store_true", dest="paginate",
                    help="Enable pagination of output")
parser.add_argument("-P", "--no-pager", action="store_false", dest="paginate",
                    help="Disable pagination of output")
parser.set_defaults(subcommand=None, paginate=True, working_dir=os.getcwd())

# We add a "status" subcommand with its own parser.
subparsers = parser.add_subparsers(title="Subcommands", description="Valid subcommands",
                                   help="additional help")
parser_status = subparsers.add_parser("status", help="Show working tree status")
parser_status.add_argument("-s", "--short", action="store_const", const="short",
                           dest="format", help="Use short format")
parser_status.add_argument("-z", action="store_const", const="\x00", dest="lineend",
                           help="Terminate output lines with NUL instead of LF")
parser_status.add_argument("pathspecs", metavar="<pathspec>", nargs="*",
                           help="One or more pathspecs to show")
parser_status.set_defaults(subcommand=do_status, format="long", lineend="\n")

# We add a "log" subcommand as well.
parser_log = subparsers.add_parser("log", help="Show commit logs")
parser_log.add_argument("-p", "--patch", action="store_true", dest="patch",
                        help="Generate patch")
parser_log.set_defaults(subcommand=do_log, patch=False)

# Shows how this parser could be used.
args = parser.parse_args()
if args.subcommand is None:
    print("No subcommand chosen")
    parser.print_help()
else:
    args.subcommand(args)

You can see the command-line help generated by argparse below. First up, the output of running fakegit.py --help:

usage: fakegit.py [-h] [--version] [-C <path>] [-p] [-P] {status,log} ...

optional arguments:
  -h, --help      show this help message and exit
  --version       show program's version number and exit
  -C <path>       Run as if git was started in <path>
  -p, --paginate  Enable pagination of output
  -P, --no-pager  Disable pagination of output

Subcommands:
  Valid subcommands

  {status,log}    additional help
    status        Show working tree status
    log           Show commit logs

The subcommands also support their own command-line help, such as fakegit.py status --help:

usage: fakegit.py status [-h] [-s] [-z] [<pathspec> [<pathspec> ...]]

positional arguments:
  <pathspec>   One or more pathspecs to show

optional arguments:
  -h, --help   show this help message and exit
  -s, --short  Use short format
  -z           Terminate output lines with NUL instead of LF

Logging

The logging module has acquired the ability to be configured by passing a dict, as per PEP 391. Previously it could accept a config file in .ini format as parsed by the configparser module, but formats such as JSON and YAML have become more popular these days. To allow these to be used, logging now accepts a dict specifying the configuration, given that most of these formats can be trivially converted into that form, as illustrated here for JSON:

import json
import logging.config
with open("logging-config.json") as conf_fd:
    config = json.load(conf_fd)
logging.config.dictConfig(config)

When you’re packaging a decent-sized application, storing logging configuration in a file makes it easier to maintain than hard-coding it in executable code. For example, it becomes easier to swap in a different logging configuration in different environments (e.g. pre-production and production). The fact that more popular formats can now be supported will open this flexibility to more developers.
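As a rough illustration, here’s the sort of dict that dictConfig() expects, equivalent to what might be loaded from a JSON file. The logger name "myapp" and the format string are just examples:

```python
import logging
import logging.config

# A minimal configuration: one formatter, one console handler, one logger.
config = {
    "version": 1,
    "formatters": {
        "brief": {"format": "%(name)s -> %(levelname)s: %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "brief",
        },
    },
    "loggers": {
        "myapp": {"handlers": ["console"], "level": "INFO"},
    },
}
logging.config.dictConfig(config)
logging.getLogger("myapp").info("configured from a dict")
```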

In addition to this, the logging.basicConfig() function now has a style parameter where you can select which type of string formatting token to use for the format string itself. All of the following are equivalent:

>>> import logging
>>> logging.basicConfig(style='%', format="%(name)s -> %(levelname)s: %(message)s")
>>> logging.basicConfig(style='{', format="{name} -> {levelname}: {message}")
>>> logging.basicConfig(style='$', format="$name -> $levelname: $message")

Also, if a log event occurs prior to configuring logging, there is a default setup of a StreamHandler connected to sys.stderr, which displays any message of WARNING level or higher. If you need to fiddle with this handler for any reason, it’s available as logging.lastResort.

Some other smaller changes:

  • Levels can now be supplied to setLevel() as strings such as INFO instead of integers like logging.INFO.
  • A getChild() method on Logger instances now returns a logger with a suffix appended to the name. For example, logging.getLogger("foo").getChild("bar.baz") will return the same logger as logging.getLogger("foo.bar.baz"). This is convenient when the first level of the name is __name__, as it often is by convention, or in cases where a parent logger is passed to some code which wants to create its own child logger from it.
  • The hasHandlers() method has also been added to Logger which returns True iff this logger, or a parent to which events are propagated, has at least one configured handler.
  • A new logging.setLogRecordFactory() and a corresponding getLogRecordFactory() have been added to allow programmers to override the log record creation process.
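A quick sketch of the last two of these; the hostname attribute added by the factory here is just an invented example:

```python
import logging
import socket

# getChild() builds dotted child names relative to a parent logger.
parent = logging.getLogger("foo")
child = parent.getChild("bar.baz")
assert child is logging.getLogger("foo.bar.baz")

# A custom record factory can annotate every record with extra attributes.
old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    record = old_factory(*args, **kwargs)
    record.hostname = socket.gethostname()  # hypothetical extra attribute
    return record

logging.setLogRecordFactory(record_factory)
```

Any handler’s format string could then reference %(hostname)s.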

Concurrency

There are a number of changes in concurrency this release.

Futures

The largest change is a new concurrent.futures module in the library, specified by PEP 3148, and it’s a pretty useful one. The intention with the new concurrent namespace is to collect together high-level code for managing concurrency, but so far it’s only acquired the one futures module.

The intention here is to provide what has become a standard abstraction for concurrency: an object representing the eventual result of a concurrent operation. In the Python module, the API is deliberately decoupled from the implementation detail of what form of concurrency is used, whether it’s a thread, another process or some RPC to another host. This is useful as it allows the mechanism to be changed later if necessary without invalidating the business logic around it.

The style is to construct an executor which is where the flavour of concurrency is selected. Currently the module supports two options, ThreadPoolExecutor and ProcessPoolExecutor. The code can then schedule jobs to the executor, which returns a Future instance which can be used to obtain the results of the operation once it’s complete.
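The basic shape of the API looks like this; a trivial sketch in which the function and values are invented:

```python
import concurrent.futures

def square(x):
    return x * x

# submit() returns a Future immediately; result() blocks until the
# outcome of the job is available.
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(square, 7)
    print(future.result())  # 49
    # map() offers a simpler interface when you just want results in order.
    print(list(executor.map(square, [1, 2, 3])))  # [1, 4, 9]
```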

To exercise these in a simple example I wrote a basic password cracker, something that should benefit from parallelisation. I used PBKDF2 with SHA-256 for hashing the passwords, although only with 1000 iterations1 to keep running times reasonable on my laptop. Also, to keep things simple we assume that the password is a single dictionary word with no variations in case.

For comparison I first wrote a simple implementation which checks every word in /usr/share/dict/words with no parallelism:

import concurrent.futures
import hashlib
import sys

# The salt is normally stored alongside the password hash.
SALT = b"\xe2\x13*\xbb\x1a\xaar\t"
# This is the hash of a dictionary word.
TARGET_HASH = b"\xba<\xdfU\xc3\xdanx\x1b\x1c\xb0js\xf1\x19\xa9\xc5\xb9"\
              b"d!l\xa2\x14\x11K\x86\xac#\xc8\xc7\x8a\x91"
ITERATIONS = 1000

def calc_checksum(line):
    word = line.strip()
    return (word, hashlib.pbkdf2_hmac("sha256", word, SALT, ITERATIONS))

def main():
    with open("/usr/share/dict/words", "rb") as fd:
        for line in fd:
            check = calc_checksum(line)
            if check[1] == TARGET_HASH:
                print(line.strip())
    return 0

if __name__ == "__main__":
    sys.exit(main())

Here’s the output of time running it:

python3 crack.py  257.08s user 0.25s system 99% cpu 4:17.72 total

On my modest 2016 MacBook Pro, this took 4m 17s in total, and the CPU usage figures indicated that one core was basically maxed out, as you’d expect. Then I swapped out main() for a version that used ThreadPoolExecutor from concurrent.futures:

def main():
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
        futures = set()
        with open("/usr/share/dict/words", "rb") as dict_fd:
            for line in dict_fd:
                futures.add(executor.submit(calc_checksum, line))
        for future in concurrent.futures.as_completed(futures):
            word, check = future.result()
            if check == TARGET_HASH:
                print(word)
    return 0

After creating a ThreadPoolExecutor which can use a maximum of 8 worker threads at any time, we then need to submit jobs to the executor. We do this in a loop around reading /usr/share/dict/words, submitting each word as a job to the executor to distribute among its workers. Once all the jobs are submitted, we then wait for them to complete and harvest the results.

Again, here’s the time output:

python3 crack.py  506.42s user 2.50s system 680% cpu 1:14.83 total

With my laptop’s four cores, I’d expect this would run around four times as fast2 and it more or less did, allowing for some overhead scheduling the work to the threads. The total run time was 1m 14s so a little less than the expected four times faster, but not a lot. The CPU usage was around 85% of the total of all four cores, which is again roughly what I’d expect. Running in a quarter of the time seems like a pretty good deal for only four lines of additional code!

Finally, just for fun I then swapped out ThreadPoolExecutor for ProcessPoolExecutor, which is the same but using child processes instead of threads:

    with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
        

And the time output with processes:

python3 crack.py  575.08s user 15.50s system 669% cpu 1:28.15 total

I didn’t expect this to make much difference to a CPU-bound task like this, provided that the hashing routines release the GIL as they’re supposed to. Indeed, it was actually somewhat slower than the threaded case, taking 1m 28s to execute in total. The total user time was higher for the same amount of work, so this definitely points to some decreased efficiency rather than just differences in background load or similar. I’m assuming that the overhead of the additional IPC and associated memory copying accounts for the increased time, but this sort of thing may well be platform-dependent.

As one final flourish, I tried to reduce the inefficiencies of the multiprocess case by batching the work into larger chunks using a recipe from the itertools documentation:

import concurrent.futures
import hashlib
import itertools
import sys

# The salt is normally stored alongside the password hash.
SALT = b'\xe2\x13*\xbb\x1a\xaar\t'
# This is the hash of a dictionary word.
TARGET_HASH = b"\xba<\xdfU\xc3\xdanx\x1b\x1c\xb0js\xf1\x19\xa9\xc5\xb9"\
              b"d!l\xa2\x14\x11K\x86\xac#\xc8\xc7\x8a\x91"
ITERATIONS = 1000

def calc_checksums(lines):
    return {
        word: hashlib.pbkdf2_hmac('sha256', word, SALT, ITERATIONS)
        for word in (line.strip() for line in lines if line is not None)
    }

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

def main():
    with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
        futures = set()
        with open("/usr/share/dict/words", "rb") as dict_fd:
            for lines in grouper(dict_fd, 1000):
                futures.add(executor.submit(calc_checksums, lines))
        for future in concurrent.futures.as_completed(futures):
            results = future.result()
            for word, check in results.items():
                if check == TARGET_HASH:
                    print(word)
    return 0

if __name__ == "__main__":
    sys.exit(main())

This definitely made some difference, bringing the time down from 1m 28s to 1m 6s. The CPU usage also indicates more of the CPU time is being spent in user space, presumably due to less IPC.

python3 crack.py  509.95s user 1.20s system 764% cpu 1:06.83 total

I suspect that the multithreaded case would also benefit from some batching, but at this point I thought I’d better draw a line under it or I’d never finish this article.

Overall, I really like the concurrent.futures module, as it takes so much hassle out of processing things in parallel. There are still cases where the threading module is going to be more appropriate, such as some background thread which performs periodic actions asynchronously. But for cases where you have a specific task that you want to tackle synchronously but in parallel, this module wraps up a lot of the annoying details.

I’m excited to see what else might be added to concurrent in the future3!

Threading

Despite all the attention on concurrent.futures this release, the threading module has also had some attention with the addition of a new Barrier class. This is initialised with the number of threads to wait for. As individual threads call wait() on the barrier they are held up until the required number of threads are waiting, at which point all are allowed to proceed simultaneously. This is a little like the join() method, except the threads can continue to execute after the barrier.

import threading
import time

def wait_thread(name, barrier, delay):
    for i in range(3):
        print("{} starting {}s delay".format(name, delay))
        time.sleep(delay)
        print("{} finishing delay".format(name))
        barrier.wait()

num_threads = 5
barrier = threading.Barrier(num_threads)
threads = [
    threading.Thread(target=wait_thread, args=(str(i), barrier, (i+1) * 2))
    for i in range(num_threads)
]
print("Starting threads...")
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print("All threads finished.")

The Barrier can also be initialised with a timeout argument. If the timeout expires before the required number of threads have called wait() then all currently waiting threads are released and a BrokenBarrierError exception is raised from all the wait() methods.
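A minimal sketch of the broken-barrier case, using a single thread and a deliberately short timeout:

```python
import threading

# A barrier expecting two threads, but only one ever arrives: after the
# 0.1s timeout every waiter gets BrokenBarrierError.
barrier = threading.Barrier(2, timeout=0.1)
try:
    barrier.wait()
except threading.BrokenBarrierError:
    print("barrier broken by timeout")

# Once broken, the barrier stays broken until reset() is called.
assert barrier.broken
```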

I can think of a few use-cases where this synchronisation primitive might come in handy, such as multiple threads producing streams of output which need to stay in step with each other. For example, if one thread is producing chunks of audio data and another chunks of video, you could use a barrier to ensure that neither gets ahead of the other.

Another small but useful change in threading is that the Lock.acquire(), RLock.acquire() and Semaphore.acquire() methods can now accept a timeout, instead of only allowing a simple choice between blocking and non-blocking as before. Also there’s been a fix to allow lock acquisitions to be interrupted by signals on pthreads platforms, which means that programs that deadlock on locks can be killed by repeated SIGINT (as opposed to requiring SIGKILL as they used to sometimes).
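For example, assuming a lock already held elsewhere:

```python
import threading

lock = threading.Lock()
lock.acquire()  # simulate the lock being held by other code

# Instead of blocking forever, give up after 100ms.
acquired = lock.acquire(timeout=0.1)
print(acquired)  # False: the lock was still held when the timeout expired

lock.release()
```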

Finally, threading.RLock has been moved from pure Python to a C implementation, which results in a 10-15x speedup when using them.

GIL Overhaul

In another change that will impact all forms of threading in CPython, the code behind the GIL has been rewritten. The new implementation aims to offer more predictable switching intervals and reduced overhead due to lock contention.

Prior to this change, the GIL was released after a fixed number of bytecode instructions had been executed. However, this is a very crude way to measure a timeslice, since the time taken to execute an instruction can vary from a few nanoseconds to much longer, because not all the expensive C functions in the library release the GIL while they operate. This can mean that scheduling between threads is very unbalanced depending on their workload.

To replace this, the new approach releases the GIL at a fixed time interval, although the GIL is still only released at an instruction boundary. The specific interval is tunable through sys.setswitchinterval(), with the current default being 5 milliseconds. As well as being a more balanced way to share processor time among threads, this can also reduce the overhead of locks in heavily contended situations — this is because waiting for a lock which is already held by another thread can add significant overhead on some platforms (apparently OS X is particularly impacted by this).
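The interval can be inspected and tuned at runtime:

```python
import sys

# The default switch interval is 5ms; raising it reduces switching
# overhead at the cost of longer possible delays before a waiting
# thread gets a chance to run.
previous = sys.getswitchinterval()
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())
sys.setswitchinterval(previous)  # restore the default
```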

If you want to get technical4, threads wishing to take the GIL first wait on a condition variable for it to be released, with a timeout equal to the switch interval. Hence, it’ll wake up either after this interval, or if the GIL is released by the holding thread if that’s earlier. At this point the requesting thread checks whether any context switches have already occurred, and if not it sets the volatile flag gil_drop_request, shared among all threads, to indicate that it’s requesting the release of the GIL. It then continues around this loop until it gets the lock, re-requesting GIL drop after a delay every time a new thread acquires it.

The holding thread, meanwhile, attempts to release the GIL when it performs blocking operations, or otherwise every time around the eval loop it checks if gil_drop_request is set and releases the GIL if so. In so doing, it wakes up any threads which are waiting on the GIL and relies on the OS to ensure fair scheduling among threads.

The advantage of this approach is that it provides an advisory cap on the amount of time a thread may hold the GIL, by delaying setting the gil_drop_request flag, but also allows the eval loop as long as it needs to finish processing its current bytecode instruction. It also minimises overhead in the simple case where no other thread has requested the GIL.

The final change is around thread switching. Prior to Python 3.2, the GIL was released for a handful of CPU cycles to allow the OS to schedule another thread, and then it was immediately reacquired. This was efficient when the common case was that no other threads were ready to run, and meant that threads running lots of very short opcodes weren’t unduly penalised, but in some cases this delay wasn’t sufficient to trigger the OS to context switch to a different thread. This can cause particular problems when you have an I/O-bound thread competing with a CPU-intensive one: the OS will attempt to schedule the I/O-bound thread, but it will immediately attempt to acquire the GIL and be suspended again. Meanwhile, the CPU-bound thread will tend to cling to the GIL for longer than it should, leading to higher I/O latency.

To combat this, the new system forces a thread switch at the end of the fixed interval if any other threads are waiting on the GIL. The OS is still responsible for scheduling which thread, this change just ensures that it’s not the previously running thread. It does this using a last_holder shared variable which points to the last holder of the GIL. When a thread releases the GIL, it additionally checks if last_holder is its own ID and if so, it waits on a condition variable for the value to change to another thread. This can’t cause a deadlock if no other threads are waiting, because in that case gil_drop_request isn’t set and this whole operation is skipped.

Overall I’m hopeful that these changes should make a positive impact to fair scheduling in multithreaded Python applications. As much as I’m sure everyone would love to find a way to remove the GIL entirely, it doesn’t seem like that’s likely for some time to come.

Date and Time

There are a host of small improvements to the datetime module to blast through.

First and foremost is that there’s now a timezone type which implements the tzinfo interface and can be used in simple cases of fixed offsets from UTC (i.e. no DST adjustments or the like). This means that creating a timezone-aware datetime at a known offset from UTC is now straightforward:

>>> from datetime import datetime, timedelta, timezone
>>> # Naive datetime (no timezone attached)
>>> datetime.now()
datetime.datetime(2021, 2, 6, 15, 26, 37, 818998)
>>> # Time in UTC (happens to be my timezone also!)
>>> datetime.now(timezone.utc)
datetime.datetime(2021, 2, 6, 15, 26, 46, 488588, tzinfo=datetime.timezone.utc)
>>> # Current time in New York (UTC-5) ignoring DST
>>> datetime.now(timezone(timedelta(0, -5*3600)))
datetime.datetime(2021, 2, 6, 10, 27, 41, 764597, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400)))

Also, timedelta objects can now be multiplied and divided by integers or floats, as well as divided by each other to determine how many of one interval fit into the other interval. This is all fairly straightforward by converting the values to a total number of seconds to perform the operations, but it’s convenient not to have to.

>>> timedelta(1, 20*60*60) * 1.5
datetime.timedelta(days=2, seconds=64800)
>>> timedelta(8, 3600) / 4
datetime.timedelta(days=2, seconds=900)
>>> timedelta(8, 3600) / timedelta(2, 900)
4.0

If you’re using Python to store information about the Late Medieval Period then you’re in luck, as datetime.date.strftime() can now cope with dates prior to 1900. If you want to expand your research to the Dark Ages, however, you’re out of luck since it still only handles dates from 1000 onwards.

Also, use of two-digit years is being discouraged. Until now, setting time.accept2dyear to True would allow you to use a 2-digit year in a time tuple and have its century guessed. However, as of Python 3.2 using this logic will raise a DeprecationWarning. Quite right too, 2-digit years are quite an anachronism these days.

String Formatting

The str.format() method for string formatting is now joined by str.format_map() which, as the name implies, takes a mapping type to supply arguments by name.

>>> "You must cut down the mightiest {plant} in the forest with... a {fish}!".format_map(
...     {"fish": "herring", "plant": "tree"})
'You must cut down the mightiest tree in the forest with... a herring!'

As well as a standard dict instance, you can pass any dict-like object and Python has plenty of these, such as ConfigParser and the objects created by the dbm modules.
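One nice trick this enables is a dict subclass whose __missing__ leaves unknown placeholders untouched, which is handy for partial substitution (something plain str.format() can’t do):

```python
# A mapping which returns the placeholder itself for any missing key.
class Default(dict):
    def __missing__(self, key):
        return "{" + key + "}"

template = "{greeting}, {name}!"
print(template.format_map(Default(greeting="Hello")))  # Hello, {name}!
```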

There have also been some minor changes to the formatting of numeric values as strings. Prior to this release, converting a float or complex to string form with str() would show fewer decimal places than repr(). This was because the repr() level of precision would occasionally show surprising results, and the pragmatic way to avoid this being more of an issue was to make str() round to a lower precision.

However, as discussed in the previous article, repr() was changed to always select the shortest equivalent representation for these types in Python 3.1. Hence, in Python 3.2 the str() and repr() forms of these types have been unified to the same precision.
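A quick check of the unified behaviour:

```python
# From 3.2, str() and repr() agree on the shortest form which still
# round-trips back to the same float.
value = 0.1
print(str(value), repr(value))
assert str(value) == repr(value)
assert float(repr(value)) == value
```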

Function Enhancements

There are a series of enhancements to decorators provided by the functools module, plus a change to contextlib.

Firstly, just to make the example from the previous article more pointless, there is now a functools.lru_cache() decorator which can cache the results of a function based on its parameters. If the function is called with the same parameters, a cached result will be used if present.

This is really handy to drop in to commonly-used but slow functions for a very low effort speed boost. What’s even more useful is that you can call a cache_info() method of the decorated function to get statistics about the cache. There’s also a cache_clear() method if you need to invalidate the cache, although there’s unfortunately no option to clear only selected parameters.

>>> import functools
>>> @functools.lru_cache(maxsize=10)
... def slow_func(arg):
...   return arg + 1
...
>>> slow_func(100)
101
>>> slow_func(200)
201
>>> slow_func(100)
101
>>> slow_func.cache_info()
CacheInfo(hits=1, misses=2, maxsize=10, currsize=2)

Secondly, there have been some improvements to functools.wraps() to improve introspection, such as a __wrapped__ attribute pointing back to the original callable and copying __annotations__ across to the wrapped version, if defined.

Thirdly, a new functools.total_ordering() class decorator has been provided. This is very useful for producing classes which support all the rich comparison operators with minimal effort. If you define a class with __eq__ and __lt__ and apply the @functools.total_ordering decorator to it, all the other rich comparison operators will be synthesized.

>>> import functools
>>> @functools.total_ordering
... class MyClass:
...     def __init__(self, value):
...         self.value = value
...     def __lt__(self, other):
...         return self.value < other.value
...     def __eq__(self, other):
...         return self.value == other.value
...
>>> one = MyClass(100)
>>> two = MyClass(200)
>>> one < two
True
>>> one > two
False
>>> one == two
False
>>> one != two
True

Finally, there have been some changes which mean that a function decorated with contextlib.contextmanager() can now be used not only as a context manager (as previously) but also as a function decorator. This could be pretty handy, although bear in mind that if you yield a value which would normally be bound in a with statement, there’s no equivalent way to access it when used as a function decorator.

import contextlib

@contextlib.contextmanager
def log_entry_exit(ident):
    print("Entering {}".format(ident))
    yield
    print("Leaving {}".format(ident))

with log_entry_exit("foo"):
    print("In context")

@log_entry_exit("my_func")
def my_func(value):
    print("Value is {}".format(value))

my_func(123)

Itertools

There’s only one improvement to itertools, the addition of an accumulate() function. However, this has the potential to be pretty handy so I’ve given it its own section.

Passed an iterable, itertools.accumulate() will return the cumulative sum of all elements so far. This works with any type that supports the + operator:

>>> import itertools
>>> list(itertools.accumulate([1,2,3,4,5]))
[1, 3, 6, 10, 15]
>>> list(itertools.accumulate([[1,2],[3],[4,5,6]]))
[[1, 2], [1, 2, 3], [1, 2, 3, 4, 5, 6]]

For other types, you can supply any binary function to combine them (strictly speaking, the func argument only arrived in Python 3.3):

>>> import operator
>>> list(itertools.accumulate((set((1,2,3)), set((3,4,5))),
         func=operator.or_))
[{1, 2, 3}, {1, 2, 3, 4, 5}]

And it’s also possible to start with an initial value before anything’s added by providing the initial argument (although that only arrived much later, in Python 3.8).

Collections

The collections module has had a few improvements.

Counter

The collections.Counter class added in the previous release has now been extended with a subtract() method which supports negative numbers. Previously the semantics of -= as applied to a Counter would never reduce a value beyond zero — it would simply be removed from the set. This is consistent with how you’d expect a counter to work:

>>> x = Counter(a=10, b=20)
>>> x -= Counter(a=5, b=30)
>>> x
Counter({'a': 5})

However, in its interpretation as a multiset, you might actually want values to go negative. If so, you can use the new subtract() method:

>>> x = Counter(a=10, b=20)
>>> x.subtract(Counter(a=5, b=30))
>>> x
Counter({'a': 5, 'b': -10})

OrderedDict

As demonstrated in the previous article, it’s a little inconvenient to move something to the end of the insertion order. That’s been addressed in this release with the OrderedDict.move_to_end() method. By default this moves the item to the last position in the ordered sequence in the same way as x[key] = x.pop(key) would but is significantly more efficient. Alternatively you can call move_to_end(key, last=False) to move it to the first position in the sequence.
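For example:

```python
import collections

d = collections.OrderedDict.fromkeys("abcde")
d.move_to_end("b")               # move to the last position
print("".join(d))                # acdeb
d.move_to_end("b", last=False)   # move to the first position
print("".join(d))                # bacde
```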

Deque

Finally, collections.deque has two new methods, count() and reverse() which allow them to be used in more situations where code was designed to take a list.

>>> import collections
>>> x = collections.deque('antidisestablishmentarianism')
>>> x.count('i')
5
>>> x.reverse()
>>> x
deque(['m', 's', 'i', 'n', 'a', 'i', 'r', 'a', 't', 'n', 'e', 'm', 'h', 's',
'i', 'l', 'b', 'a', 't', 's', 'e', 's', 'i', 'd', 'i', 't', 'n', 'a'])

Internet Modules

The three modules email, mailbox and nntplib now properly support the str and bytes types that Python 3 introduced. In particular, this means that messages in mixed encodings are now handled correctly, which necessitated a number of changes in the mailbox module.

The email module has new functions message_from_bytes() and message_from_binary_file(), and classes BytesFeedParser and BytesParser, to allow messages read or stored in the form of bytes to be parsed into model objects. Also, the get_payload() method and Generator class have been updated to properly support the Content-Transfer-Encoding header, encoding or decoding as appropriate.
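A minimal sketch of parsing a message held as bytes (the message and addresses here are made up for illustration):

```python
import email

# A minimal RFC 2822 message held as bytes rather than str.
raw = b"From: alice@example.com\r\nSubject: Greetings\r\n\r\nHello, world\r\n"
msg = email.message_from_bytes(raw)
print(msg["Subject"])             # Greetings
print(msg.get_payload().strip())  # Hello, world
```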

Sticking with the theme of email, imaplib now supports upgrade of an existing connection to TLS using the new imaplib.IMAP4.starttls() method.

The ftplib.FTP class now supports the context manager protocol, consuming any socket.error exceptions thrown within the block and closing the connection when done. This makes it pretty handy, but due to the way that FTP opens additional sockets, you need to be careful to close all of these before the context manager exits or your application will hang. Consider the following example:

from ftplib import FTP

with FTP("ftp1.at.proftpd.org") as ftp:
    ftp.login()
    print(ftp.dir())
    sock = ftp.transfercmd("RETR README.MIRRORS")
    while True:
        data = sock.recv(8192)
        if not data:
            break
        print(data)
    sock.close()

Assuming that FTP site is still up, and README.MIRRORS is still available, that should execute fine. However, if you remove that sock.close() line then you should find it just hangs and never terminates (perhaps until the TCP connection gets terminated due to being idle).

The socket.create_connection() function can also be used as a context manager, and swallows errors and closes the connection in the same way as the FTP class above.
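Here's a self-contained sketch; rather than relying on a real remote host, it spins up a throwaway local echo server to connect to:

```python
import socket
import threading

# Throwaway local echo server so the example needs no real network host.
server = socket.socket()
server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    conn, _ = server.accept()
    conn.sendall(conn.recv(1024))    # echo the request back
    conn.close()

threading.Thread(target=serve_once).start()

# The socket is closed automatically when the with block exits.
with socket.create_connection(("127.0.0.1", port)) as sock:
    sock.sendall(b"hello")
    reply = sock.recv(1024)

print(reply)          # b'hello'
print(sock.fileno())  # -1, i.e. already closed
server.close()
```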

The ssl module has seen some love with a host of small improvements. There’s a new SSLContext class to hold persistent connection data such as settings, certificates and private keys. This allows the settings to be reused for multiple connections, and provides a wrap_socket() method for creating a socket using the stored details.

There’s a new ssl.match_hostname() function which applies RFC-specified rules for confirming that a specified certificate matches the specified hostname. The certificate specification it expects is as returned by SSLSocket.getpeercert(), but it’s not particularly hard to fake, as shown in the session below.

>>> import ssl
>>> cert = {'subject': ((('commonName', '*.andy-pearce.com'),),)}
>>> ssl.match_hostname(cert, "www.andy-pearce.com")
>>> ssl.match_hostname(cert, "ftp.andy-pearce.com")
>>> ssl.match_hostname(cert, "www.andy-pearce.org")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.2.6/lib/python3.2/ssl.py", line 162, in match_hostname
    % (hostname, dnsnames[0]))
ssl.CertificateError: hostname 'www.andy-pearce.org' doesn't match '*.andy-pearce.com'

This release also adds support for SNI (Server Name Indication), which is like virtual hosting but for SSL connections. This removes the longstanding issue whereby you could host as many domains on a single IP address as you liked for standard HTTP, but for SSL you needed a unique IP address for each domain. This is essentially because the virtual hosting of websites is implemented by passing the HTTP Host header, but since the SSL connection is set up prior to sending the HTTP request (by definition!), the only thing you have to connect to is an IP address. The remote end needs to decide which certificate to send you, and since all it has to go on is the IP address, you can’t have different certificates for different domains on the same IP. This is problematic because the certificate needs to match the domain or the browser will reject it.

SNI handles this by extending the SSL ClientHello message to include the domain. To implement this with the ssl module in Python, you need to specify the server_hostname parameter to SSLContext.wrap_socket().

The http.client module has been updated to use the new certificate verification processes when using a HTTPSConnection. The request() method is now more flexible on sending request bodies — previously it required a file object, but now it will also accept an iterable, provided that an explicit Content-Length header was sent. According to HTTP/1.1 this header shouldn’t be required, since requests can be sent using chunked encoding, which doesn’t require the length of the request body to be known up front. In practice, however, it’s common for servers not to bother supporting chunked requests, despite this being mandated by the HTTP/1.1 standard. As a result, it’s sensible to regard Content-Length as mandatory for requests with a body. HTTP/2 has its own methods of streaming data, so once that gains wide acceptance chunked encoding won’t be needed anyway — but given the rate of adoption so far, I wouldn’t hold your breath.

The urllib.parse module has some changes as well, with urlparse() now supporting IPv6 and urldefrag() returning a collections.namedtuple for convenience. The urlencode() function can also now accept both str and bytes for the query parameter.
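A few quick illustrations of these changes:

```python
from urllib.parse import urlparse, urldefrag, urlencode

# IPv6 literals in URLs now parse correctly.
parts = urlparse("http://[2001:db8::1]:8080/index.html")
print(parts.hostname, parts.port)   # 2001:db8::1 8080

# urldefrag() now returns a named tuple with url and fragment fields.
result = urldefrag("http://example.com/page#section2")
print(result.url, result.fragment)  # http://example.com/page section2

# urlencode() handles quoting of the query for you.
print(urlencode({"q": "a b"}))      # q=a+b
```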

Markup Languages

There have been some significant updates to the xml.etree.ElementTree package, including the addition of the following top-level functions:

fromstringlist()
A handy function which builds an XML document from a series of fragment strings. In particular this means you can open a filehandle in text mode and have it parsed one line at a time, since iterating the filehandle will yield one line at a time.
tostringlist()
The opposite of fromstringlist(), generates the XML output in chunks. It doesn’t make any guarantees except that joining them all together will yield the same as generating the output as a single string, but in my experience each chunk is around 8192 bytes plus whatever takes it up to the next tag boundary.
register_namespace()
Allows you to register a namespace prefix globally, which can be useful for parsing lots of XML documents which make heavy use of namespaces.

The Element class also has a few extra methods:

Element.extend()
Appends children to the current element from a sequence, which must itself contain Element instances.
Element.iterfind()
As Element.findall() but yields elements instead of returning a list.
Element.itertext()
As Element.findtext() but iterates over the current element and all child elements, as opposed to just returning the first match.

The TreeBuilder class also has acquired the end() method to end the current element and doctype() to handle a doctype declaration.

Finally, a couple of unnecessary methods have been deprecated. Instead of getchildren() you can just use list(elem), and instead of getiterator() just use Element.iter().
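A brief sketch pulling a few of these together:

```python
import xml.etree.ElementTree as ET

# Build a document from string fragments, then iterate matches lazily.
root = ET.fromstringlist(["<root><item>one</item>", "<item>two</item></root>"])
print([el.text for el in root.iterfind("item")])  # ['one', 'two']
print("".join(root.itertext()))                   # onetwo

# Element.extend() appends a whole sequence of elements in one call.
root.extend([ET.Element("item"), ET.Element("item")])
print(len(root.findall("item")))                  # 4
```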

Also in 3.2 there’s a new html module, but it only contains one function escape() so far which will do the obvious HTML-escaping.

>>> import html
>>> html.escape("<blink> & <marquee> tags are both deprecated")
'&lt;blink&gt; &amp; &lt;marquee&gt; tags are both deprecated'

Compression and Archiving

The gzip.GzipFile class now provides a peek() method which can read a number of bytes from the archive without advancing the read pointer. This can be very useful when implementing parsers which need to decide which function to branch into based on what’s next in the file, whilst still leaving those functions to read from the file themselves as a simpler interface.

The gzip module has also added the compress() and decompress() methods which simply perform in-memory compression/decompression without the need to construct a GzipFile instance. This has been a source of irritation for me in the past, so it’s great to see it finally addressed.
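A minimal sketch of both the one-shot functions and peek():

```python
import gzip
import io

data = b"hello world, " * 50

# One-shot, in-memory round trip without constructing a GzipFile.
blob = gzip.compress(data)
print(len(blob) < len(data))          # True
print(gzip.decompress(blob) == data)  # True

# peek() looks ahead without advancing the read pointer.
with gzip.GzipFile(fileobj=io.BytesIO(blob)) as f:
    print(f.peek(5)[:5])              # b'hello'
    print(f.read(5))                  # b'hello' again
```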

The zipfile module also had some improvements, with the ZipFile class now supporting use as a context manager. Also, the ZipExtFile object has had some performance improvements. This is the file-like object returned when you open a file within a ZIP archive using the ZipFile.open() method. You can also wrap it in io.BufferedReader for even better performance if you’re doing multiple smaller reads.

The tarfile module has changes, with tarfile.TarFile also supporting use as a context manager. Also, the add() method for adding files to the archive now supports a filter parameter which can modify attributes of the files as they’re added, or exclude them altogether. You pass a callable using this parameter, which is called on each file as it’s added. It’s passed a TarInfo structure which has the metainformation about the file, such as the permissions and owner. It can return a modified version of the structure (e.g. to squash all files to being owned by a specific user), or it can return None to block the file from being added.
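Here's a sketch of the filter parameter in action; the policy applied is just something I've made up for illustration:

```python
import os
import tarfile
import tempfile

def sanitise(info):
    # Hypothetical policy: drop editor backups, squash ownership to root.
    if info.name.endswith("~"):
        return None                      # excluded from the archive
    info.uid = info.gid = 0
    info.uname = info.gname = "root"
    return info

workdir = tempfile.mkdtemp()
for name in ("keep.txt", "skip.txt~"):
    open(os.path.join(workdir, name), "w").close()

archive = os.path.join(workdir, "out.tar")
with tarfile.open(archive, "w") as tar:
    for name in ("keep.txt", "skip.txt~"):
        tar.add(os.path.join(workdir, name), arcname=name, filter=sanitise)

with tarfile.open(archive) as tar:
    print(tar.getnames())                    # ['keep.txt']
    print(tar.getmember("keep.txt").uname)   # root
```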

Finally, the shutil module has also grown a couple of archive-related functions, make_archive() and unpack_archive(). These provide a convenient high-level interface to zipping up multiple files into an archive without having to mess around with the details of the individual compression modules. It also means that the format of your archives can be altered with minimal impact on your code by changing a parameter.

It supports the common archiving formats out of the box, but there’s also a register_archive_format() hook should you wish to add code to handle additional formats.
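A minimal sketch; swapping "gztar" for, say, "zip" is the only change needed to switch formats:

```python
import os
import pathlib
import shutil
import tempfile

src = tempfile.mkdtemp()
pathlib.Path(src, "notes.txt").write_text("some data")

# Archive everything under src; the format is just a string parameter.
base = os.path.join(tempfile.mkdtemp(), "backup")
archive = shutil.make_archive(base, "gztar", root_dir=src)
print(archive.endswith(".tar.gz"))   # True

dest = tempfile.mkdtemp()
shutil.unpack_archive(archive, dest)
print(os.listdir(dest))              # ['notes.txt']
```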

Math

There are some new functions in the math library, some of which look pretty handy.

isfinite()
Returns True iff the float argument is not a special value (e.g. NaN or infinity)
expm1()
Calculates e**x - 1 in a way which avoids the loss of precision that can occur when subtracting nearly equal values for small x.
erf() and erfc()
erf() is the Gaussian Error Function, which is useful for assessing how much of an outlier a data point is against a normal distribution. The erfc() function is simply the complement, where erfc(x) == 1 - erf(x).
gamma() and lgamma()
Implements the Gamma Function, which is an extension of factorial to cover continuous and complex numbers. I suspect for almost everyone math.factorial() will be what you’re looking for. Since the value grows so quickly, larger values will yield an OverflowError. To deal with this, the lgamma() function returns the natural logarithm of the value.
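A few quick illustrations of these functions:

```python
import math

print(math.isfinite(2.0), math.isfinite(float("inf")))  # True False
print(math.expm1(1e-10))     # accurate where math.exp(1e-10) - 1 loses digits
print(math.erf(0.0), math.erfc(0.0))   # 0.0 1.0
print(math.gamma(5))         # 24.0, i.e. 4!
print(math.lgamma(200.0))    # ~857.9; math.gamma(200.0) raises OverflowError
```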

Compiled Code

There have been a couple of changes to the way that both compiled bytecode and shared object files are stored on disk. More casual users of Python might want to skip over this section, although I would say it’s always helpful to know what’s going on under the hood, if only to help diagnose problems you might run into.

PYC Directories

The previous scheme of storing .pyc files in the same directory as the .py files didn’t play nicely when the same source files were being used by multiple different interpreters. The interpreter would note that the file was created by another one, and replace it with its own. As the files swap back and forth, it cancels out the benefits of caching in the first place.

As a result, the name of the interpreter is now added to the .pyc filename, and to stop these files cluttering things up too much they’ve all been moved to a __pycache__ directory.

I suspect many people will not need to care about this any further than it being another entry for the .gitignore file. However, sometimes there can be odd effects with these compiled files, so it’s worth being aware of. For example, if a module is installed and used and then deleted, it might leave the .pyc files behind, confusing programmers who were expecting an import error. If you do want to check for this, there’s a new __cached__ attribute of an imported module indicating the file that was loaded, in addition to the existing __file__ attribute which continues to refer to the source file. The imp module also has some new functions which are useful for scripts that need to correlate source and compiled files for some reason, as illustrated by the session below:

>>> import mylib
>>> print(mylib.__file__)
/tmp/mylib.py
>>> print(mylib.__cached__)
/tmp/__pycache__/mylib.cpython-32.pyc
>>> import imp
>>> imp.get_tag()
'cpython-32'
>>> imp.cache_from_source("/tmp/mylib.py")
'/tmp/__pycache__/mylib.cpython-32.pyc'
>>> imp.source_from_cache("/tmp/__pycache__/mylib.cpython-32.pyc")
'/tmp/mylib.py'

There are also some corresponding changes to the py_compile, compileall and importlib.abc modules which are a bit too esoteric to cover here; the documentation has you well covered. You can also find lots of details and a beautiful module loading flowchart in PEP 3147.

Shared Objects

Similar changes have been implemented for shared object files. These are compiled against a specific ABI (Application Binary Interface), which is sensitive not only to the major Python version but also to the compilation flags that were used to compile the interpreter. As a result, being able to support the same shared object compiled against multiple ABIs is useful.

The implementation is similar to that for compiled bytecode, where .so files acquire unique filenames based on the ABI and are collected into a shared directory pyshared. The suffix for the current interpreter can be queried using sysconfig:

>>> import sysconfig
>>> sysconfig.get_config_var("SOABI")
'cpython-32m-x86_64-linux-gnu'
>>> sysconfig.get_config_var("EXT_SUFFIX")
'.cpython-32m-x86_64-linux-gnu.so'

The interpreter is cpython, 32 is the version and the letters appended indicate the compilation flags. In this example, m corresponds to pymalloc.

If you want more details, PEP 3149 has a ton of interesting info.

Syntax Changes

The syntax of the language has been expanded to allow deletion of a variable that is free in a nested block. If that didn’t make any sense, it’s best explained with an example. The following code was legal in Python 2.x, but would raise a SyntaxError in Python 3.0 or 3.1. In Python 3.2, however, it is once again legal.

def outer_function(x):
    def inner():
        # Reference to x in a nested scope.
        return x
    inner()
    # Deleting variable referenced in nested scope.
    del x

So what happens if we were to call inner() again after the del x now? We get exactly the same result as if we hadn’t yet assigned the local, which is a NameError with the message free variable 'x' referenced before assignment in enclosing scope. The following example may make this message clearer.

def outer_function():
    def inner():
        return x
    # print(inner()) here would raise NameError
    x = 123
    print(inner())  # Prints 123
    x = 456
    print(inner())  # Prints 456
    del x
    # print(inner()) here would raise NameError

An important example of an implicit del is at the end of an except block, so the following code would have raised a SyntaxError in Python 3.0-3.1, but is now valid again:

import traceback

def func():
    def print_exception():
        traceback.print_exception(type(exc), exc, exc.__traceback__)
    try:
        do_something_here()
    except Exception as exc:
        print_exception()
        # There is an implicit `del exc` here

Diagnostics and Testing

A new ResourceWarning has been added to detect issues such as gc.garbage not being empty at interpreter shutdown, indicating finalisation problems with the code. It’s also raised if a file object is destroyed before being properly closed.

This warning is silenced by default, but can be enabled by the warnings module, or using an appropriate -W option on the command-line. The session shown below shows the warning being triggered by destroying an unclosed file object:

>>> import warnings
>>> warnings.filterwarnings("default")
>>> f = open("/etc/passwd", "rb")
>>> del f
<stdin>:1: ResourceWarning: unclosed file <_io.BufferedReader name='/etc/passwd'>

Note that as of Python 3.4 most of the cases that could cause garbage collection to fail have been resolved, but we have to pretend we don’t know that for now.

There have also been a range of improvements to the unittest module. There are two new assertions, assertWarns() and assertWarnsRegex(), to test whether code raises appropriate warnings (e.g. DeprecationWarning). Another new assertion assertCountEqual() can be used to perform an order-independent comparison of two iterables — functionally this is equivalent to feeding them both into collections.Counter() and comparing the results. There is also a new maxDiff attribute for limiting the size of diff output when logging assertion failures.
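A small sketch of the new assertions (run via TextTestRunner here just to keep the example self-contained):

```python
import unittest
import warnings

class ExampleTests(unittest.TestCase):
    def test_warns(self):
        # Fails unless the body issues a DeprecationWarning.
        with self.assertWarns(DeprecationWarning):
            warnings.warn("old API", DeprecationWarning)

    def test_order_independent(self):
        # Same elements with the same multiplicities, in any order.
        self.assertCountEqual([1, 2, 2, 3], [3, 2, 1, 2])

suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())   # True
```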

Some of the assertion names are being tidied up. Examples include assertRegex() being the new name for assertRegexpMatches() and assertTrue() replacing assert_(). The assertDictContainsSubset() assertion has also been deprecated because the arguments were in the wrong order, so it was never quite clear which argument was required to be a subset of which.

Finally, the command-line usage with python -m unittest has been made more flexible, so you can specify either module names or source file paths to indicate which tests to run. There are also additional options for python -m unittest discover for specifying which directory to search for tests, and a regex filter on the filenames to run.

Optimisations

Some performance tweaks are welcome to see. Firstly, the peephole optimizer is now smart enough to convert set literals consisting of constants to frozenset. This makes things faster in cases like this:

def is_archive(path):
    # os.path.splitext() leaves the dot on the extension.
    _, ext = os.path.splitext(path)
    return ext.lower() in {".zip", ".tgz", ".gz", ".tar", ".bz2"}

The Timsort algorithm used by list.sort() and sorted() is now faster and uses less memory when a key function is supplied by changing the way this case is handled internally. The performance and memory consumption of json decoding is also improved, particularly in the case where the same key is used repeatedly.

A faster substring search algorithm, which is based on the Boyer-Moore-Horspool algorithm, is used for a number of methods on str, bytes and bytearray objects such as split(), rsplit(), splitlines(), rfind() and rindex().

Finally, int to str conversions now process two digits at a time to reduce the number of arithmetic operations required.

Other Changes

There’s a whole host of little changes which didn’t sit nicely in their own section. Strap in and prepare for the data blast!

New WSGI Specification
As part of this release PEP 3333 is included as an update to the original PEP 333 which specifies the WSGI (Web Server Gateway Interface) specification. Primarily this tightens up the specifications around request/response header and body strings with regards to the types (str vs. bytes) and encodings to use. This is important reading for anyone building web apps conforming to WSGI.
range Improvements
range objects now support index() and count() methods, as well as slicing and negative indices, to make them more interoperable with list and other sequences.
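For example:

```python
r = range(0, 20, 2)
print(r.index(10))    # 5
print(r.count(4))     # 1
print(r[-1])          # 18 -- negative indexing now works
print(list(r[2:5]))   # [4, 6, 8] -- slicing yields another range
```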
csv Improvements
The csv module now supports a unix_dialect output mode where all fields are quoted and lines are terminated with \n. Also, csv.DictWriter has a writeheader() method which writes a row of column headers to the output file, using the key names you provided at construction.
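A minimal sketch writing to an in-memory buffer:

```python
import csv
import io

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "role"], dialect="unix")
writer.writeheader()                  # emits the column header row
writer.writerow({"name": "Alice", "role": "admin"})
print(buf.getvalue())
# "name","role"
# "Alice","admin"
```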
tempfile.TemporaryDirectory Added
The tempfile module now provides a TemporaryDirectory context manager for easy cleanup of temporary directories.
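For example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scratch.txt")
    open(path, "w").close()
    print(os.path.exists(path))   # True
print(os.path.exists(tmp))        # False -- removed on exit
```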
Popen() Context Managers
os.popen() and subprocess.Popen() can now act as context managers to automatically close any associated file descriptors.
configparser Always Uses Safe Parsing
configparser.SafeConfigParser has been renamed to ConfigParser to replace the old unsafe one. The default settings have also been updated to make things more predictable.
select.PIPE_BUF Added
The select module has added a PIPE_BUF constant which defines the minimum number of bytes which is guaranteed not to block when a select.select() has indicated that a pipe is ready for writing.
callable() Re-introduced
The callable() builtin from Python 2.x was re-added to the language, as it’s a more readable alternative to isinstance(x, collections.Callable).
ast.literal_eval() For Safer eval()
The ast module has a useful literal_eval() function which can be used to evaluate expressions more safely than the builtin eval().
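A quick illustration; the malicious-looking string is, of course, never executed:

```python
import ast

# Literal container syntax evaluates safely...
value = ast.literal_eval("{'a': (1, 2), 'b': [3.0, None]}")
print(value)   # {'a': (1, 2), 'b': [3.0, None]}

# ...but anything involving a call is rejected rather than executed.
try:
    ast.literal_eval("__import__('os').remove('precious-file')")
except ValueError:
    print("rejected")
```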
reprlib.recursive_repr() Added
When writing __repr__() special methods, it’s easy to forget to handle the case where a container can contain a reference to itself, which easily leads to __repr__() calling itself in an endlessly recursive loop. The reprlib module now provides a recursive_repr() decorator which will detect the recursive call and add ... to the string representation instead.
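A minimal sketch with a hypothetical Node class:

```python
import reprlib

class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    @reprlib.recursive_repr()
    def __repr__(self):
        return "Node(%s, %r)" % (self.name, self.children)

node = Node("root")
node.children.append(node)   # the container now contains itself
print(repr(node))            # Node(root, [...]) rather than infinite recursion
```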
Numeric Type Hash Equivalence
Hash values of the various different numeric types should now be equal whenever their actual values are equal, e.g hash(1) == hash(1.0) == hash(1+0j).
hashlib.algorithms_available Added
The hashlib module now provides the algorithms_available set which indicates the hashing algorithms available on the current platform, as well as algorithms_guaranteed which are the algorithms guaranteed to be available on all platforms.
hasattr() Improvements
Some undesirable behaviour in hasattr() has been fixed. This works by calling getattr() and checking whether an exception is thrown. This approach allows it to support the multiple ways in which an attribute may be provided, such as implementing __getattr__(). However, prior to this release hasattr() would catch any exception, which could mask genuine bugs. As of Python 3.2 it will only catch AttributeError, allowing any other exception to propagate out.
memoryview.release() Added
Bit of an esoteric one this, but memoryview objects now have a release() method and support use as a context manager. These objects allow a zero-copy view into any object that supports the buffer protocol, which includes the builtins bytes and bytearray. Some objects may need to allocate resources in order to provide this view, particularly those provided by C/C++ extension modules. The release() method allows these resources to be freed earlier than the memoryview object itself going out of scope.
structsequence Tool Improvements
The internal structsequence tool has been updated so that C structures returned by the likes of os.stat() and time.gmtime() now work like namedtuple and can be used anywhere where a tuple is expected.
Interpreter Quiet Mode
There’s a -q command-line option to the interpreter to enable “quiet” mode, which suppresses the copyright and version information being displayed in interactive mode. I struggle a little to think of cases where this would matter, I’ll be honest — perhaps if you’re embedding the interpreter as a feature in a larger application?

Conclusions

Well now, I must admit that I did not expect that to be double the size of the post covering Python 3.0! If you’ve got here having read the whole article in one go, I must say I’m impressed. Perhaps lay off the caffeine for a while…?

Overall it feels like a really massive release, this one. Admittedly I did cover a high proportion of the details, whereas in the first article I glossed over quite a lot as some of the changes were so massive I wanted to focus on them.

Out of all that, it’s really hard to pick only a few highlights, but I’ll give it a go. As I said at the outset I love argparse — anyone who writes command-line tools and cares about their usability should save a lot of hassle with this. Also, the concurrent.futures module is great — I’ve only really started using it recently, and I love how it makes it really convenient to add parallelism to applications in simple cases where the effort might otherwise be too high to justify.

The functools.lru_cache() and functools.total_ordering() decorators are both great additions because they offer significant advantages with minimal coding effort, and this is the sort of feature that a language like Python should really be focusing on. It’s never going to beat C or Rust in the performance stakes, but it has real strengths in time to market, as well as the concision and elegance of code.

It’s also great to see some updates to the suite of Internet-facing modules, as having high quality implementations of these in the standard library is another great strength of Python that needs to be maintained. SSL adding support for SNI is a key improvement that can’t come too soon, as it still seems a long way off that we’ll be saying goodbye to the limited address space of IPv4.

Finally, the GIL changes are great to see. Although we’d all love to see the GIL be deprecated entirely, this is clearly a very difficult problem or it would have been addressed by now. Until someone can come up with something clever to achieve this, at least things are significantly better than they were for multithreaded Python applications.

So there we go, my longest article yet. If you have any feedback on the amount of detail that I’m putting in (either too much or too little!) then I’d love to hear from you. I recently changed my commenting system from Disqus to Hyvor which is much more privacy-focused and doesn’t require you to register an account to comment, and also has one-click feedback buttons. I find writing these articles extremely helpful for myself anyway, but it’s always nice to know if anyone else is reading them! If you’re reading this on the front-page, you can jump to the comments section of the article view using the link at the end of the article at the bottom-right.

OK, so that’s it — before I even think of looking at the Python 3.3 release notes, I’m going to go lie down in a darkened room with a damp cloth on my forehead.


  1. In real production environments you should use many more iterations than this, a bigger salt and ideally a better key derivation function like scrypt, as defined in RFC 7914. Unfortunately that won’t be in Python until 3.6. 

  2. Maybe more due to hyperthreading, but my assumption was that it wouldn’t help much with a CPU-intensive task like password hashing. My results seemed to validate that assumption. 

  3. Spoiler alert: using my time machine I can tell you it’s not a lot else yet, at least as of 3.10.0a5. 

  4. And you know I love to get technical. 

Photo by David Clode on Unsplash