The State of Python Coroutines: Python 3.5

This is part 4 of the “The State of Python Coroutines” series which started with The State of Python Coroutines: yield from.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers additional syntax that was added in Python 3.5.

In the previous post in this series I went over an example of using coroutines to handle IO with asyncio and how it compared with the same example implemented using callbacks. This almost brings us up to date with coroutines in Python but there’s one more change yet to discuss — Python 3.5 contains some new keywords to make defining and using coroutines more convenient.

As usual for a Python release, 3.5 contains quite a few changes but probably the biggest, and certainly the most relevant to this article, are those proposed by PEP-492. These changes aim to raise coroutines from something that’s supported by libraries to the status of a core language feature supported by proper reserved keywords.

Sounds great — let’s run through the new features.

Declaring and awaiting coroutines

To declare a coroutine the syntax is the same as for a normal function, except that async def is used in place of the def keyword. This serves approximately the same purpose as the @asyncio.coroutine decorator did previously — indeed, I believe one purpose of that decorator, aside from documentation, was to allow generator-based coroutines to interoperate with native ones. Since coroutines are now a language mechanism and shouldn’t be intrinsically tied to a specific library, there’s now also a new decorator @types.coroutine that can be used for this purpose.

Previously coroutines were essentially a special case of generators — it’s important to note that this is no longer the case, they are a wholly separate language construct. They do still use the generator mechanisms under the hood, but my understanding is that’s primarily an implementation detail with which programmers shouldn’t need to concern themselves most of the time.

The distinction between a function and a generator is whether the yield keyword appears in its body, but the distinction between a function and a coroutine is whether it’s declared with async def. If you try to use yield in a coroutine declared with async def you’ll get a SyntaxError (i.e. a routine cannot be both a generator and a coroutine).
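
As a quick illustration, here’s a minimal sketch of both declaration styles side by side (the function names are just placeholders):

import types

async def native_coro():
    return 42    # a native coroutine; yield is forbidden in its body

@types.coroutine
def generator_coro():
    yield        # a generator-based coroutine, awaitable from async def

# async def broken():
#     yield      # SyntaxError: yield inside an async def coroutine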

So far so simple, but coroutines aren’t particularly useful until they can yield control to other code — that’s more or less the whole point. With generator-based coroutines this was achieved with yield from, and with the new syntax it’s achieved with the await keyword. This can be used to wait for the result from any object which is awaitable.

An awaitable object is one of:

  • A coroutine, as declared with async def.
  • A coroutine-compatible generator (i.e. decorated with @types.coroutine).
  • Any object that implements an appropriate __await__() method.
  • Objects defined in C/C++ extensions with a tp_as_async.am_await function — this is more or less the equivalent of __await__() in pure Python objects.

The last option is perhaps simpler than it sounds — any object that wishes to be awaitable needs to return an iterator from its __await__() method. This iterator is used to implement the fundamental wait operation — each value it yields is passed up to whatever is driving the coroutine (typically the event loop), and the value it returns on completion becomes the value of the await expression.
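
To make this concrete, here’s a minimal sketch of a hand-rolled awaitable; the class is hypothetical and completes immediately rather than doing any real waiting:

class Ready:
    """An awaitable which resolves immediately to a fixed value."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        return self.value
        yield  # never reached, but makes __await__() a generator function

async def example():
    result = await Ready(123)
    return result  # 123

The unreachable yield is the trick here: it makes __await__() a generator function, so calling it returns an iterator, and returning from the generator raises StopIteration carrying the result.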

It’s important to note that this definition of awaitable is what’s required of the argument to await, but the same conditions don’t apply to yield from. There are some things that both will accept (i.e. coroutines) but await won’t accept generic generators and yield from won’t accept the other forms of awaitable (e.g. an object with __await__()).
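
You can check this asymmetry for yourself; awaiting an undecorated generator fails at runtime:

def plain_generator():
    yield

async def attempt():
    # Raises TypeError: object generator can't be used in 'await' expression
    await plain_generator()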

It’s also equally important to note that a coroutine defined with async def can’t ever directly return control to the event loop — there simply isn’t the machinery to do so. Typically this isn’t much of a problem since most of the time you’ll be using asyncio functions to do this, such as asyncio.sleep() — however, if you wanted to implement something like asyncio.sleep() yourself then as far as I can tell you could only do so with generator-based coroutines.

OK, so let me be pedantic and contradict myself for a moment — you can indeed implement something like asyncio.sleep() yourself. Indeed, here’s a simple implementation:

async def my_sleep(delay, result=None):
    # Create a future, arrange for the event loop to set its result
    # after the specified delay, then simply wait on it.
    loop = asyncio.get_event_loop()
    future = loop.create_future()
    loop.call_later(delay, future.set_result, result)
    return (await future)

This has a lot of deficiencies as it doesn’t handle being cancelled or other corner cases, but you get the idea. However the key point here is that this depends on asyncio.Future and if you go look at the implementation for that then you’ll see that __await__() is just an alias for __iter__() and that method uses yield to return control to the event loop. As I said earlier, it’s all built on generators under the hood, and since yield isn’t permitted in an async def coroutine, there’s no way to achieve that (at least as far as I can tell).
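
That said, if you ever did need to build that lowest layer yourself, a generator-based coroutine is the way to do it. Here’s a hypothetical sketch of the bare-minimum version: a coroutine which yields to the event loop exactly once before resuming:

import types

@types.coroutine
def relinquish():
    """Give the event loop one chance to run other tasks."""
    yield

async def busy_work():
    while True:
        # ... do a chunk of CPU-bound work ...
        await relinquish()  # let other coroutines make progress

Yielding a bare None is interpreted by the asyncio task machinery as a request to be rescheduled on the next pass around the loop.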

In general, however, the number of places where you’d need to return control to the event loop yourself is very small — the vast majority of cases where you’re likely to do that are for a fixed delay or for IO, and asyncio already has you covered in both cases.

One final note is that there’s also an abstract base class for awaitable objects, collections.abc.Awaitable, in case you ever need to test the “awaitability” of something you’re passed.
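
Testing for it is a one-liner:

from collections.abc import Awaitable

async def coro():
    return 1

c = coro()
print(isinstance(c, Awaitable))  # True
c.close()  # tidy up, avoiding a "coroutine was never awaited" warning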

Coroutines example

As a quick example of await in action consider the script below which is used to ping several hosts in parallel to determine whether they’re alive. This example is quite contrived, but it illustrates the new syntax — it’s also an example of how to use the asyncio subprocess support.

import asyncio
import os
import sys


PING_PATH = "/sbin/ping"

async def ping(server, results):
    with open(os.devnull, "w") as fd:
        # -c1 -> perform a single ping request only
        # -t3 -> timeout of three seconds on response
        # -q  -> generate less output
        proc = await asyncio.create_subprocess_exec(
                PING_PATH, '-c1', '-q', '-t3', server, stdout=fd)

        # Wait for the ping process to exit and check exit code
        returncode = await proc.wait()
        results[server] = not bool(returncode)


async def progress_ticker(results, num_servers):
    while len(results) < num_servers:
        waiting = num_servers - len(results)
        msg = "Waiting for {0} response(s)".format(waiting)
        sys.stderr.write(msg)
        sys.stderr.flush()
        await asyncio.sleep(0.5)
        sys.stderr.write("\r" + " "*len(msg) + "\r")


def main(argv):
    results = {}
    tasks = [ping(server, results) for server in argv[1:]]
    tasks.append(progress_ticker(results, len(tasks)))
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    for server, pingable in sorted(results.items()):
        status = "alive" if pingable else "dead"
        print("{0} is {1}".format(server, status))


if __name__ == "__main__":
    sys.exit(main(sys.argv))

One point that’s worth noting is that since we’re using coroutines as opposed to threads to achieve concurrency within the script¹, we can safely access the results dictionary without any form of locking and be confident that only one coroutine will be accessing it at any one time.

Asynchronous context managers and iterators

As well as the simple await demonstrated above there’s also a new syntax for allowing context managers to be used in coroutines.

The issue with a standard context manager is that the __enter__() and __exit__() methods could take some time or perform blocking operations — how then can a coroutine use them whilst still yielding to the event loop during these operations?

The answer is support for asynchronous context managers. These work in a very similar manner but provide two new methods __aenter__() and __aexit__() — these are called instead of the regular versions when the caller invokes async with instead of the plain with statement. In both cases they are expected to return an awaitable object that does the actual work.

These are a natural extension to the syntax already described and allow coroutines to make use of any constructions which may perform blocking IO in their enter/exit routines — this could be database connections, distributed locks, socket connections, etc.
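
Here’s a minimal sketch of the shape of such a class; the sleeps are just hypothetical stand-ins for real blocking work such as opening a database connection:

import asyncio

class AsyncResource:

    async def __aenter__(self):
        await asyncio.sleep(0.1)  # stand-in for, e.g., opening a connection
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0.1)  # stand-in for tearing it down again

async def demo():
    async with AsyncResource() as resource:
        print("resource ready:", resource)

Since the methods are declared with async def, calling them yields the awaitables that async with expects.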

Another natural extension are asynchronous iterators. In this case objects that wish to be iterable implement an __aiter__() method which returns an asynchronous iterator which implements an __anext__() method. These two are directly analogous to __iter__() and __next__() for standard iterators, the difference being that __anext__() must return an awaitable object to obtain the value instead of the value directly.

Note that in Python 3.5.x prior to 3.5.2 the __aiter__() method was also expected to return an awaitable, but this changed in 3.5.2 so that it should return the iterator object directly. This makes it a little fiddly to write compatible code because earlier versions still expect an awaitable, but I strongly recommend writing code which caters for the later versions — the Python documentation has a workaround if necessary.

To wrap up this section let’s see an example of async for — with apologies in advance to anyone who cares even the slightest bit about the correctness of HTTP implementations I present a HTTP version of the cat utility.

import asyncio
import os
import sys
import urllib.parse


class HTTPCat:

    def __init__(self, urls):
        self.urls = urls
        self.url_reader = None

    class URLReader:

        def __init__(self, url):
            self.parsed_url = urllib.parse.urlparse(url)

        async def connect(self):
            port = 443 if self.parsed_url.scheme == "https" else 80
            connect = asyncio.open_connection(
                    self.parsed_url.netloc, port, ssl=(port==443))
            self.reader, writer = await connect
            query = ('GET {path} HTTP/1.0\r\n'
                     'Host: {host}\r\n'
                     '\r\n').format(path=self.parsed_url.path, 
                                    host=self.parsed_url.netloc)
            writer.write(query.encode('latin-1'))
            while True:
                line = await self.reader.readline()
                if not line.strip():
                    break

        async def readline(self):
            line = await self.reader.readline()
            return line.decode('latin1')


    def __aiter__(self):
        return self

    async def __anext__(self):
        while True:
            if self.url_reader is None:
                if not self.urls:
                    raise StopAsyncIteration
                self.url_reader = self.URLReader(self.urls.pop(0))
                await self.url_reader.connect()
            line = await self.url_reader.readline()
            if line:
                return line
            self.url_reader = None


async def http_cat(urls):
    async for line in HTTPCat(urls):
        print("Line: {0}".format(line.rstrip()))


def main(argv):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(http_cat(argv[1:]))
    loop.close()

if __name__ == "__main__":
    sys.exit(main(sys.argv))

This is a heavily over-simplified example with many shortcomings (e.g. it doesn’t even support redirections or chunked encoding) but it shows how the __aiter__() and __anext__() methods can be used to wrap up operations which may block for significant periods.

One nice property of this construction is that lines of output will flow down as soon as they arrive from the socket — many HTTP clients seem to want to block until the whole document is retrieved and return it as a string. This is terribly inconvenient if you’re fetching a file of many GB.

Coroutines make streaming the document back in chunks a much more natural affair, however, and I really like the ease of use for the client. Of course, in reality you’d use a library like aiohttp to avoid messing around with HTTP yourself.

Conclusions

That’s the end of this sequence of articles, and it brings us about bang up to date. Overall I really like the fact that the Python developers have focused on making coroutines a proper first-class concept within the language. The implementation is somewhat different to other languages, which often seem to try to hide the coroutines themselves and offer only futures as the language interface, but I do like knowing where my context switches are constrained to occur — especially if I’m relying on this mechanism to avoid locking that would otherwise be required.

The syntax is nice and the paradigm is pleasant to work with — but are there any downsides? Well, because the implementation is based on generators under the hood I do have my concerns around raw performance. One of the benefits of asynchronous IO should really be the performance boost and scalability vs. threads for dominantly IO-bound applications — while the scalability is probably there, I’m a little unconvinced about the performance for real-world cases.

I hunted around for some proper benchmarks but they seem to be few and far between. There’s this page which has a useful collection of links, although it hasn’t been updated for almost a year — I guess things are unlikely to have moved on significantly in that time. From looking over these results it’s clear that asyncio and aiohttp aren’t the cutting edge of performance, but then again they’re not terrible either.

When all’s said and done, if performance is the all-consuming overriding concern then you’re unlikely to be using Python anyway. If it’s important enough to warrant an impact on readability then you might want to at least investigate threads or gevent before making a decision. But if you’ve got what I would regard as a pretty typical set of concerns, where readability and maintainability are the top priority even though you don’t want performance to suffer too much, then take a serious look at coroutines — with a bit of practice I think you might learn to love them.

Or maybe at least dislike them less than the other options.


  1. I’m ignoring the fact that we’re also using subprocesses for concurrency in this example since it’s just an implementation detail of this particular case and not relevant to the point of safe access to data structures within the script. 

Wed 13 Jul 2016 at 07:00PM by Andy Pearce in Software tagged with python and coroutines

The State of Python Coroutines: asyncio - Callbacks vs. Coroutines

This is part 3 of the “The State of Python Coroutines” series which started with The State of Python Coroutines: yield from.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers more of the asyncio module that was added in Python 3.4.

In the preceding post in this series I introduced the asyncio module and its utility as an event loop for coroutines. However, this isn’t the only use of the module — its primary purpose is to act as an event loop for various forms of I/O such as network sockets and pipes to child processes. In this post, then, I’d like to compare the two main approaches to doing this: using callbacks and using coroutines.

A brief digression: handling multiple connections

Anyone that’s done a decent amount of non-blocking I/O can probably skim or skip this section — for anyone who’s not come across this problem in their coding experience, this might be useful.

There are quite a few occasions where you end up needing to handle multiple I/O streams simultaneously. An obvious one is something like a webserver, where you want to handle multiple network connections concurrently. There are other examples, though — one thing that crops up quite often for me is managing multiple child processes, where I want to stream output from them as soon as it’s generated. Another possibility is where you’re making multiple HTTP requests that you want to be fetched in parallel.

In all these cases you want your application to respond immediately to input received on any stream, but at the same time it’s clear you need to block and wait for input — endlessly looping and polling each stream would be a massive waste of system resources. Typically there are two main approaches to this: threads and non-blocking I/O¹.

These days threads seem to be the more popular solution — each I/O stream has a new thread allocated for it and the stack of this thread encapsulates its complete state. This makes it easy for programmers who aren’t used to dealing with event loops — they can continue to write simple sequential code that uses standard blocking I/O calls to yield as required. It has some downsides, however — cooperating with other threads requires the overhead of synchronisation and if the turnover of connections is high (consider, say, a busy DNS server) then it’s slightly wasteful to be continually creating and destroying thread stacks. If you want to solve the C10k problem, for example, I think you’d struggle to do it using a thread per connection.

The other alternative is to use a single thread and have it wait for activity on any stream, then process that input and go back to sleep again until another stream is ready. This is typically simpler in some ways — for example, you don’t need any locking between connections because you’re only processing one at any given time. It’s also perfectly performant in cases where you expect to be primarily IO-bound (i.e. handling connections won’t require significant CPU time) — indeed, depending on how the data structures associated with your connections are allocated this approach could improve performance by avoiding false sharing issues.

The downside to this method is that it’s rather less intuitive for many programmers. In general you’d like to write some straight-line code to handle a single connection, then have some magical means to extend that to multiple connections in parallel — that’s the lure of threading. But there is a way we can achieve, to some extent, the best of both worlds (spoiler alert: it’s coroutines).

The mainstays for implementing non-blocking I/O loops in the Unix world have long been select(), introduced by BSD in 1983, and the slightly later poll(), added to System V in 1986. There are some minor differences but in both cases the model is very similar:

  • Register a list of file descriptors to watch for activity.
  • Call the function to wait for activity on any of them.
  • Examine the returned value to discover which descriptors are active and process them.
  • Loop around to the beginning and wait again.

This is often known as the event loop — it’s a loop, it handles events. Implementing an event loop is quite straightforward, but the downside is that the programmer essentially has to find their own way to maintain the state associated with each connection. This often isn’t too tricky, but sometimes when the connection handling is very context-dependent it can make the code rather hard to follow. It often feels like scrabbling to implement some half-arsed version of closures, and it would be preferable to let language designers worry about that sort of thing.
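
To give a feel for the pattern, here’s a minimal sketch of a select()-based echo server; the port and buffer size are arbitrary choices:

import select
import socket

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("", 12345))
server.listen(5)
sockets = [server]

while True:
    # Block until at least one socket is readable.
    readable, _, _ = select.select(sockets, [], [])
    for sock in readable:
        if sock is server:
            # Activity on the listening socket means a new connection.
            conn, _ = sock.accept()
            sockets.append(conn)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)  # just echo it straight back
            else:
                # An empty read means the remote end closed the connection.
                sockets.remove(sock)
                sock.close()

Even in this toy version you can see the state management problem looming: anything more stateful than echoing would need a per-socket lookup table maintained by hand.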

The rest of this article will focus on how we can use asyncio to stop worrying so much about some of these details and write more natural code whilst still getting the benefits of the non-blocking I/O approach.

asyncio with callbacks

One problem with using the likes of select() is that it can encourage you to drive all your coding from one big loop. Without a bit of work, this tends to run counter to the design principle of separating concerns, so we’d like to move as much as possible out of this big loop. Ideally we’d also like to abstract it, implement it in a library somewhere and get the benefits of reusing well-tested code. This is particularly important for event loops where the potential for serious issues (such as getting into a busy loop) is rather higher than in a lot of areas of code.

The most common way to hook into a generic event loop is with callbacks. The application registers callback functions which are to be invoked when particular events occur, and then the application jumps into a wait function whose purpose is simply to loop until there are events and invoke the appropriate callbacks.

It’s unsurprising, then, that asyncio is designed to support the callback approach. To illustrate this I’ve turned to my usual example of a chat server — this is a really simple daemon that waits for socket connections (e.g. using netcat or telnet) then prompts for a username and allows connected users to talk to each other.

This implementation is, of course, exceedingly basic — it’s meant to be an example, not a fully-featured application. Here’s the code, I’ll touch on the highlights afterwards.

import asyncio
import sys

class ChatServer:

    class ChatProtocol(asyncio.Protocol):

        def __init__(self, chat_server):
            self.chat_server = chat_server
            self.username = None
            self.buffer = ""
            self.transport = None

        def connection_made(self, transport):
            # Callback: when connection is established, pass in transport.
            self.transport = transport
            welcome = "Welcome to " + self.chat_server.server_name
            self.send_msg(welcome + "\nUsername: ")

        def data_received(self, data):
            # Callback: whenever data is received - not necessarily buffered.
            data = data.decode("utf-8")
            self.buffer += data
            self.handle_lines()

        def connection_lost(self, exc):
            # Callback: client disconnected.
            if self.username is not None:
                self.chat_server.remove_user(self.username)

        def send_msg(self, msg):
            self.transport.write(msg.encode("utf-8"))

        def handle_lines(self):
            while "\n" in self.buffer:
                line, self.buffer = self.buffer.split("\n", 1)
                if self.username is None:
                    if self.chat_server.add_user(line, self.transport):
                        self.username = line
                    else:
                        self.send_msg("Sorry, that name is taken\nUsername: ")
                else:
                    self.chat_server.user_message(self.username, line)


    def __init__(self, server_name, port, loop):
        self.server_name = server_name
        self.connections = {}
        self.server = loop.create_server(
                lambda: self.ChatProtocol(self),
                host="", port=port)

    def broadcast(self, message):
        for transport in self.connections.values():
            transport.write((message + "\n").encode("utf-8"))

    def add_user(self, username, transport):
        if username in self.connections:
            return False
        self.connections[username] = transport
        self.broadcast("User " + username + " joined the room")
        return True

    def remove_user(self, username):
        del self.connections[username]
        self.broadcast("User " + username + " left the room")

    def get_users(self):
        return self.connections.keys()

    def user_message(self, username, msg):
        self.broadcast(username + ": " + msg)


def main(argv):

    loop = asyncio.get_event_loop()
    chat_server = ChatServer("Test Server", 4455, loop)
    loop.run_until_complete(chat_server.server)
    try:
        loop.run_forever()
    finally:
        loop.close()


if __name__ == "__main__":
    sys.exit(main(sys.argv))

The ChatServer class provides the main functionality of the application, tracking the users that are connected and providing methods to send messages. The interaction with asyncio, however, is provided by the nested ChatProtocol class. To explain what this is doing, I’ll summarise a little terminology.

The asyncio module splits IO handling into two areas of responsibility — transports take care of getting raw bytes from one place to another and protocols are responsible for interpreting those bytes into some more meaningful form. In the case of a HTTP request, for example, the transport would read and write from the TCP socket and the protocol would marshal up the request and parse the response to extract the headers and body.

This is something asyncio took from the Twisted networking framework and it’s one of the aspects I really appreciate. All too many HTTP client libraries, for example, jumble up the transport and protocol handling into one big mess such that changing one aspect but still making use of the rest is far too difficult.

The transports that asyncio provides cover TCP, UDP, SSL and pipes to a subprocess, which means that most people won’t need to roll their own. The interesting part, then, is asyncio.Protocol and that’s what ChatProtocol implements in the example above.

The first thing that happens is that the main() function instantiates the event loop — this occurs before anything else as it’s required for all the other operations. We then create a ChatServer instance whose constructor calls create_server() on the event loop. This opens a listening TCP socket on the specified port² and takes a protocol factory as a parameter. Every time there is a connection on the listening socket, the factory will be used to manufacture a protocol instance to handle it.

The main() function then calls run_until_complete(), passing the server object that was returned by create_server() — this will block until the listening socket is fully open and ready to accept connections. This probably isn’t strictly required because the next thing it does is call run_forever(), which causes the event loop to process IO endlessly until explicitly terminated.

The meat of the application is then how ChatProtocol is implemented. This implements several callback methods which are invoked by the asyncio framework in response to different events:

  • A ChatProtocol instance is constructed in response to an incoming connection on the listening socket. No parameters are passed by asyncio — because the protocol needs a reference to the ChatServer instance, this is passed via a closure using the lambda in the create_server() call.
  • Once the connection is ready, the connection_made() method is invoked which passes the transport that asyncio has allocated for the connection. This allows the protocol to store a reference to it for future writes, and also to trigger any actions required on a new connection — in this example, prompting the user for a username.
  • As data is received on the socket, data_received() is invoked to pass this to the protocol. In our example we only want line-oriented data (we don’t want to send a message to the chat room until the user presses return) so we buffer up data in a string and then process any complete lines found in it. Note that we also should take care of character encoding here — in our simplistic example we blindly assume UTF-8.
  • When we want to send data back to the user we invoke the write() method of the transport. Again, the transport expects raw bytes so we handle encoding to UTF-8 ourselves.
  • Finally, when the user terminates their connection then our connection_lost() method is invoked — in our example we use this to remove the user from the chatroom. Note that this is subtly different to the eof_received() callback which represents TCP half-close (i.e. the remote end called shutdown() with SHUT_WR) — this is important if you want to support protocols that indicate the end of a request in this manner.

That’s about all there is to it — with this in mind, the rest of the example should be quite straightforward to follow. The only other aspect to mention is that once the loop has been terminated, we go ahead and call its close() method — this clears out any queued data, closes listening sockets, etc.

asyncio with coroutines

Since we’ve seen how to implement the chat server with callbacks, I think it’s high time we got back to the theme of this post and now compare that with an implementation of the same server with coroutines. In usual fashion, let’s jump in and look at the code first:

import asyncio
import sys

class ChatServer:

    def __init__(self, server_name, port, loop):
        self.server_name = server_name
        self.connections = {}
        self.server = loop.run_until_complete(
                asyncio.start_server(
                    self.accept_connection, "", port, loop=loop))

    def broadcast(self, message):
        for reader, writer in self.connections.values():
            writer.write((message + "\n").encode("utf-8"))

    @asyncio.coroutine
    def prompt_username(self, reader, writer):
        while True:
            writer.write("Enter username: ".encode("utf-8"))
            data = (yield from reader.readline()).decode("utf-8")
            if not data:
                return None
            username = data.strip()
            if username and username not in self.connections:
                self.connections[username] = (reader, writer)
                return username
            writer.write("Sorry, that username is taken.\n".encode("utf-8"))

    @asyncio.coroutine
    def handle_connection(self, username, reader):
        while True:
            data = (yield from reader.readline()).decode("utf-8")
            if not data:
                del self.connections[username]
                return None
            self.broadcast(username + ": " + data.strip())

    @asyncio.coroutine
    def accept_connection(self, reader, writer):
        writer.write(("Welcome to " + self.server_name + "\n").encode("utf-8"))
        username = (yield from self.prompt_username(reader, writer))
        if username is not None:
            self.broadcast("User %r has joined the room" % (username,))
            yield from self.handle_connection(username, reader)
            self.broadcast("User %r has left the room" % (username,))
        yield from writer.drain()


def main(argv):

    loop = asyncio.get_event_loop()
    server = ChatServer("Test Server", 4455, loop)
    try:
        loop.run_forever()
    finally:
        loop.close()


if __name__ == "__main__":
    sys.exit(main(sys.argv))

As you can see, this version is written in quite a different style to the callback variant. This is because it’s using the streams API, which is essentially a set of wrappers around the callbacks version that adapts them for use with coroutines.

To use this API we call start_server() instead of create_server() — this wrapper changes the way the supplied callback is invoked and instead passes it two streams: StreamReader and StreamWriter instances. These represent the input and output sides of the socket, but importantly their methods are coroutines, so we can delegate to them with yield from.

On the subject of coroutines, you’ll notice that some of the methods have an @asyncio.coroutine decorator — this serves a practical function in Python 3.5 in that it enables you to delegate to the new style of coroutine that it defines. Pre-3.5 it’s therefore useful for future compatibility, but also serves as documentation that this method is being treated as a coroutine. You should always use it to decorate your coroutines, but this isn’t enforced anywhere.

Back to the code. Our accept_connection() method is the callback that we provided to the start_server() method and the lifetime of this method call is the same as the lifetime of the connection. We could implement the handling of a connection in a strictly linear fashion within this method — such is the flexibility of coroutines — but of course being good little software engineers we like to break things out into smaller functions.

In this case I’ve chosen to use a separate coroutine to handle prompting the user for their username, so accept_connection() delegates to prompt_username() with this line:

username = (yield from self.prompt_username(reader, writer))

Once delegated, this coroutine takes control for as long as it takes to obtain a unique username and then returns this value to the caller. It also handles storing the username and the writer in the connections member of the class — this is used by the broadcast() method to send messages to all users in the room.

The handle_connection() method is also implemented in quite a straightforward fashion, reading input and broadcasting it until it detects that the connection has been closed by an empty read. At this point it removes the user from the connections dictionary and returns control to accept_connection(). We finally call writer.drain() to send any last buffered output — this is rather pointless if the user’s connection was cut, but could still serve a purpose if they only half-closed or if the server is shutting down instead. After this we simply return and everything is cleaned up for us.

How does this version compare, then? It’s a little shorter for one thing — OK, that’s a little facile, what else? We’ve managed to lose the nested class, which seems to simplify the job somewhat — there’s less confusion about the division of responsibilities. We also don’t need to worry so much about where we store things — there’s no transport that we have to squirrel away somewhere while we wait for further callbacks. The reader and writer streams are just passed naturally through the call chain in an intuitive manner. Finally, we don’t have to engage in any messy buffering of data to obtain line-oriented input — the reader stream handles all that for us.

Conclusions

That about wraps it up for this post. Hopefully it’s been an interesting comparison — I know that I certainly feel like I understand the various layers of asyncio a little better having gone through this exercise.

It takes a bit of a shift in one’s thinking to use the coroutine approach, and I think it’s helpful to have a bit of a handle on both mechanisms to better understand what’s going on under the hood, but overall the more I use the coroutine style for IO the more I like it. It feels like a good compromise between the intuitive straight-line style of the thread-per-connection approach and the lock-free simplicity of non-blocking IO with callbacks.

In the next post I’m going to look at the new syntax for coroutines introduced in Python 3.5, which was the inspiration for writing this series of posts in the first place.


  1. Some people use the term asynchronous IO for what I’m discussing here, which is certainly the more general term, but I prefer to avoid it due to risk of confusion with the POSIX asynchronous IO interface. 

  2. In this example we use a hard-coded port of 4455 for simplicity. 

Tue 05 Jul 2016 at 07:45AM by Andy Pearce in Software tagged with python and coroutines

The State of Python Coroutines: Introducing asyncio

This is part 2 of the “The State of Python Coroutines” series which started with The State of Python Coroutines: yield from.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers parts of the asyncio module that was added in Python 3.4.

In the previous post I discussed the state of coroutines in Python 2.x and then the yield from enhancement added in Python 3.3. Since that release there’s been a succession of improvements for coroutines and in this post I’m going to discuss those that were added as part of the asyncio module.

It’s a pretty large module and covers quite a wide variety of functionality, so covering all that with in-depth discussion and examples is outside the scope of this series of articles. I’ll try to touch on the finer points, however — in this article I’ll discuss the elements that are relevant to coroutines directly and then in the following post I’ll talk about the IO aspects.

History of asyncio

Python 2 programmers may recall the venerable asyncore module, which was added way back in the prehistory of Python 1.5.2. Its purpose was to assist in writing endpoints that handle IO from sources such as sockets asynchronously. To create clients you derive your own class from asyncore.dispatcher and override methods to handle events.

This was a helpful module for basic use-cases but it wasn’t particularly flexible if what you wanted didn’t quite match its structure. Generally I found I just ended up rolling my own polling loop based on things from the select module as I needed them (although if I were using Python 3.4 or above then I’d prefer the selectors module).

If you’re wondering why talk of an old asynchronous IO module is relevant to a series on coroutines, bear with me.

The limitations of asyncore were well understood and several third party libraries sprang up as alternatives, one of the most popular being Twisted. However, it was always a little annoying that such a common use-case wasn’t well catered for within the standard library.

Back in 2011 PEP 3153 was created to address this deficiency. It didn’t really make a concrete proposal, however — it just defined the requirements. Guido addressed this in 2012 with PEP 3156 and the fledgling asyncio library was born.

The library went through some iterations under the codename Tulip and a couple of years later it was included in the standard library of Python 3.4. This was on a provisional basis — this means that it’s there, it’s not going away, but the core developers reserve the right to make incompatible changes prior to it being finalised.

OK, still not seeing the link with coroutines? Well, as well as handling IO asynchronously, asyncio also has a handy event loop for scheduling coroutines. This is because the entire library is designed for use in two different ways depending on your preferences — either a more traditional callback-based scheme, where callbacks are invoked on events; or with a set of coroutines which can each block until there’s IO activity for them to process. Even if you don’t need to do IO, the coroutine scheduler is a useful piece that you don’t need to build yourself.

asyncio as a scheduler

At this point it would be helpful to consider a quick example of what asyncio can do on the scheduling front without worrying too much about IO — we’ll cover that in the next post.

In the example below, therefore, I’ve implemented something like logrotate — mine is extremely simple¹ and doesn’t run off a configuration file, of course, because it’s just for demonstration purposes.

First here’s the code — see if you can work out what it does, then I’ll explain the finer points below.

import asyncio
import datetime
import errno
import os
import sys

def rotate_file(path, n_versions):
    """Create .1 .2 .3 etc. copies of the specified file."""

    if not os.path.exists(path):
        return
    for i in range(n_versions, 1, -1):
        old_path = "{0}.{1}".format(path, i - 1)
        if os.path.exists(old_path):
            os.rename(old_path, "{0}.{1}".format(path, i))
    os.rename(path, path + ".1")


@asyncio.coroutine
def rotate_by_interval(path, keep_versions, rotate_secs):
    """Rotate file every N seconds."""

    while True:
        yield from asyncio.sleep(rotate_secs)
        rotate_file(path, keep_versions)


@asyncio.coroutine
def rotate_daily(path, keep_versions):
    """Rotate file every midnight."""

    while True:
        now = datetime.datetime.now()
        last_midnight = now.replace(hour=0, minute=0, second=0)
        next_midnight = last_midnight + datetime.timedelta(1)
        yield from asyncio.sleep((next_midnight - now).total_seconds())
        rotate_file(path, keep_versions)


@asyncio.coroutine
def rotate_by_size(path, keep_versions, max_size, check_interval_secs):
    """Rotate file when it exceeds N bytes checking every M seconds."""

    while True:
        yield from asyncio.sleep(check_interval_secs)
        try:
            file_size = os.stat(path).st_size
            if file_size > max_size:
                rotate_file(path, keep_versions)
        except OSError as exc:
            if exc.errno != errno.ENOENT:
                raise


def main(argv):

    loop = asyncio.get_event_loop()
    # Would normally read this from a configuration file.
    rotate1 = loop.create_task(rotate_by_interval("/tmp/file1", 3, 30))
    rotate2 = loop.create_task(rotate_by_interval("/tmp/file2", 5, 20))
    rotate3 = loop.create_task(rotate_by_size("/tmp/file3", 3, 1024, 60))
    rotate4 = loop.create_task(rotate_daily("/tmp/file4", 5))
    loop.run_forever()


if __name__ == "__main__":
    sys.exit(main(sys.argv))

Each file rotation policy that I’ve implemented is its own coroutine. Each one operates independently of the others and the underlying rotate_file() function is just to refactor out the common task of actually rotating the files. In this case they all delegate their waiting to the asyncio.sleep() function as a convenience, but it would be equally possible to write a coroutine which does something more clever, like hook into inotify, for example.

You can see that main() just creates a bunch of tasks and plugs them into an event loop, then asyncio takes care of the scheduling. This script is designed to run until terminated so it uses the simple run_forever() method of the loop, but there are also methods to run until a particular coroutine completes or just wait for one or more specific futures.

Under the hood the @asyncio.coroutine decorator marks the function as a coroutine such that asyncio.iscoroutinefunction() returns True — this may be required for disambiguation in parts of asyncio where the code needs to handle coroutines differently from regular callback functions. The create_task() call then wraps the coroutine instance in a Task class — Task is a subclass of Future and this is where the coroutine and callback worlds meet.

An asyncio.Future represents the future result of an asynchronous process. Completion callbacks can be registered with it using the add_done_callback(). When the asynchronous result is ready then it’s passed to the Future with the set_result() method — at this point any registered completion callbacks are invoked. It’s easy to see, then, how the Task class is a simple wrapper which waits for the result of its wrapped coroutine to be ready and passes it to the parent Future class for invocation of callbacks. In this way, the coroutine and callback worlds can coexist quite happily — in fact in many ways the coroutine interface is a layer implemented on top of the callbacks. It’s a pretty crucial layer in making the whole thing cleaner and more manageable for the programmer, however.
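
Here’s a tiny sketch of that meeting point: a coroutine wrapped in a Task, with a plain callback attached to the same object through its Future interface:

import asyncio

@asyncio.coroutine
def compute():
    yield from asyncio.sleep(0.1)
    return 42

loop = asyncio.get_event_loop()
task = loop.create_task(compute())          # Task wraps the coroutine
task.add_done_callback(
        lambda fut: print(fut.result()))    # plain callback world
loop.run_until_complete(task)
loop.close()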

The part that links it all together is the event loop, which asyncio just gives you for free. There are a few details I’ve glossed over, however, since it’s not too important for a basic understanding. One thing to be aware of is that there are currently two event loop implementations — most people will be using SelectorEventLoop, but on Windows there’s also the ProactorEventLoop which uses different underlying primitives and has different tradeoffs.
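
If you ever need to select a loop implementation explicitly rather than relying on the default, it’s only a couple of lines; note that ProactorEventLoop is only defined on Windows builds of Python:

import asyncio
import sys

if sys.platform == "win32":
    loop = asyncio.ProactorEventLoop()  # Windows-only alternative
else:
    loop = asyncio.SelectorEventLoop()
asyncio.set_event_loop(loop)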

This scheduling may all seem simplistic, and it’s true that in this example asyncio isn’t doing anything hugely difficult. But building your own event loop isn’t quite as trivial as it sounds — there are quite a few gotchas that can trip you up and leave your code locked up or sleeping forever. This is particularly acute when you introduce IO into the equation, where there are some slightly surprising edge cases that people often miss such as handling sockets which have performed a remote shutdown. Also, this approach is quite modular and manages to produce single-threaded code where different asynchronous operations interoperate with little or no awareness of each other. This can also be achieved with threading, of course, but this way we don’t need locks and we can more or less rule out issues such as race conditions and deadlocks.

That wraps it up for this article. I’ll cover the IO aspects of asyncio in my next post, covering and comparing both the callback and coroutine based approaches to using it. This is particularly important because one area where coroutines really shine (vs threads) is where your application is primarily IO-bound and so there’s no need to explode over multiple cores.


  1. In just one example of many issues, for extra credit² you might like to consider what happens to the rotate_daily() implementation when it spans a DST change. 

  2. Where the only credit to which I’m referring are SmugPoints(tm): a currency that sadly only really has any traction inside the privacy of your own skull. 

Thu 16 Jun 2016 at 08:29AM by Andy Pearce in Software tagged with python and coroutines

The State of Python Coroutines: yield from

This is part 1 of the “The State of Python Coroutines” series.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers the facilities up to and including the yield from syntax added in Python 3.3.

I’ve always thought that coroutines are an underused paradigm.

Multithreading is great for easily expanding single threaded approaches to make better use of modern hardware with minimal changes; multiprocessing is great for enforcement of interfaces and also extending across multiple machines. In both cases, however, the emphasis is on performance at the expense of simplicity.

To my mind, coroutines offer the flip side of the coin — perhaps performance isn’t critical, but your approach is just more naturally expressed as a series of cooperative processes. You don’t want to wade through a sea of memory barriers to implement such things, you just want to divide up your responsibilities and let the data flow through.

In this short series of posts I’m going to explore what facilities we have available for implementing coroutines in Python 3, and in the process catch myself up on developments in that area.

Coroutines in Python 2

Before looking at Python 3 it’s worth having a quick refresher on the options for implementing coroutines in Python 2, not least because many programmers will still be constrained to use this version in many commercial environments.

The genesis of coroutines was when generators were added to the language in Python 2.2 — these are essentially lazily-evaluated lists. One defines what looks like a normal function but instead of a return statement yield is used. This has the effect of emitting a value from your generator but — and this is crucial — it also suspends execution of your generator in place and returns the flow of execution back to the calling code. This continues until the caller requests the next value from the generator at which point it resumes execution just after the yield statement.

For a real-world example consider the following implementation of the sieve of Eratosthenes:

# This is Python 2 code, since this section discusses Python 2.
# For Python 3 replace range(...) with list(range(...)) and
# replace xrange(...) with range(...).
def primes(limit):
    "Yield all primes <= limit."

    sqrt_limit = int(round(limit**0.5))
    limit += 1
    sieve = range(limit)
    sieve[1] = 0
    for i in xrange(2, sqrt_limit + 1):
        if sieve[i]:
            sieve[i*i:limit:i] = [0] * len(xrange(i*i, limit, i))
            yield i
    for i in xrange(sqrt_limit + 1, limit):
        if sieve[i]:
            yield i

Generators are, of course, fantastically useful on their own. In terms of coroutines, however, they’re only half the story — they can yield outputs, but they can only take their initial inputs; they can’t be updated during their execution.

To address this Python 2.5 extended generators in several ways which allow them to be turned into general purpose coroutines. A quick summary of these enhancements is:

  • yield, which was previously a statement, was redefined to be an expression.
  • Added a send() method to inject inputs during execution.
  • Added a throw() method to inject exceptions.
  • Added a close() method to allow the caller to terminate a generator early.

There are a few other tweaks, but those are the main points. The net result of these changes is that one could now write a generator where new values can be injected, via the send() method, and these are returned within the generator as the value of the yield expression.

As a simple example of this, consider the code below which implements a coroutine that accepts a number as a parameter and returns back the average of all the numbers up to that point.

import itertools

def averager():
    sum = float((yield))
    counter = itertools.count(start=1)
    while True:
        sum += (yield sum / next(counter))
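
A quick sketch of how you’d drive it; remember the initial next() call to advance the coroutine to its first yield before sending any values:

avg = averager()
next(avg)            # prime the coroutine to its first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(6))   # 12.0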

Python 3.3 adds “yield from”

The conversion of generators to true coroutines was the final development in this story in Python 2, and development of the language long ago moved on to Python 3. In this vein there was another advancement for coroutines added in Python 3.3, which was the yield from construction.

This stemmed from the observation that it was quite cumbersome to refactor generators into several smaller units. The complication is that a generator can only yield to its immediate caller — if you want to split generators up for reasons of code reuse and modularity, the calling generator would have to manually iterate the sub-generator and re-yield all the results. This is tedious and inefficient.

The solution was to add a yield from statement to delegate control entirely to another generator. The subgenerator is run to completion, with results being passed directly to the original caller without involvement from the calling generator. In the case of coroutines, sent values and thrown exceptions are also propagated directly to the currently executing subgenerator.

At its simplest this allows a more natural way to express solutions where generators are delegated. For a really simple example, compare these two sample¹ implementations of itertools.chain():

# Implementation in pre-3.3 Python
def chain(*generators):
    for generator in generators:
        for item in generator:
            yield item

# Implementation in post-3.3 Python
def chain(*generators):
    for generator in generators:
        yield from generator

Right now, of course, this looks somewhat handy but a fairly minor improvement. But when you consider general coroutines, it becomes a great mechanism for transferring control. I think of them a bit like a state machine where each state can have its own coroutine, so the concerns are kept separate, and where the whole thing just flows data through only as fast as required by the caller.

I’ve illustrated this below by writing a fairly simple parser for expressions in Polish Notation — this is just like Reverse Polish Notation only backwards. Or perhaps I mean forwards. Well, whichever way round it is, it really lends itself to simple parsing because the operators precede their arguments, which keeps the state machine nice and simple. As long as the arity of the operators is fixed, no brackets are required for an unambiguous representation.

First let’s see the code, then I’ll discuss its operation below:

import math
import operator

# Subgenerator for unary operators.
def parse_unary_operator(op):
    return op((yield from parse_argument((yield))))

# Subgenerator for binary operators.
def parse_binary_operator(op):
    values = []
    for i in (1, 2):
        values.append((yield from parse_argument((yield))))
    return op(*values)

OPERATORS = {
    'sqrt': (parse_unary_operator, math.sqrt),
    '~': (parse_unary_operator, operator.invert),
    '+': (parse_binary_operator, operator.add),
    '-': (parse_binary_operator, operator.sub),
    '*': (parse_binary_operator, operator.mul),
    '/': (parse_binary_operator, operator.truediv)
}

# Detect whether argument is an operator or number - for
# operators we delegate to the appropriate subgenerator.
def parse_argument(token):
    subgen, op = OPERATORS.get(token, (None, None))
    if subgen is None:
        return float(token)
    else:
        return (yield from subgen(op))

# Parent generator - send() tokens into this.    
def parse_expression():
    result = None
    while True:
        token = (yield result)
        result = yield from parse_argument(token)

The main entrypoint is the parse_expression() generator. In this case it’s necessary to have a single parent because we want the behaviour of the top-level expressions to be fundamentally different — in this case, we want it to yield the result of the expression, whereas intermediate values are instead consumed internally within the set of generators and not exposed to calling code.

We use the parse_argument() generator to calculate the result of an expression and return it — it can use a return value since it’s called as a subgenerator of parse_expression() (and others). This determines whether each token is an operator or numeric literal — in the latter case it just returns the literal as a float. In the former case it delegates to a subgenerator based on the operator type — here I just have unary and binary operators as simple illustrative cases. Note that one could easily implement an operator of variable arity here, however, since the delegate generator makes its own decision of when to relinquish control back to the caller — this is an important property when modularising code.

Hopefully this example is otherwise quite clear — the parse_expression() generator simply loops and yields the values of all the top-level expressions that it encounters. Note that because there’s no filtering of the results by the calling generator (since it’s just delegating) then it will yield lots of None results as it consumes inputs until the result of a top-level expression can be yielded — it’ll be up to the calling code to ignore these. This is just a consequence of the way send() on a generator always yields a value even if there isn’t a meaningful value.

The only other slight wrinkle is that you might see some excessive bracketing around the yield operators — this is typically a good idea. PEP 342 describes the parsing rules, but if you just remember to always bracket the expression then that’s one less thing to worry about.

One thing that’s worth noting is that this particular example is quite wasteful for deeply nested expressions in the same way that recursive functions can be. This is because it constructs two new generators for each nested expression — one for parse_argument() and one for whichever operator-specific subgenerator this delegates to. Whether this is acceptable depends on your use-cases and the extent which you want to trade off the code expressiveness against space and time complexity.

Below is an example of how you might use parse_expression():

def parse_pn_string(expr_str):
    parser = parse_expression()
    next(parser)
    for result in (parser.send(i) for i in expr_str.split()):
        if result is not None:
            yield result

results = parse_pn_string("* 2 9 * + 2 - sqrt 25 1 - 9 6")
print("\n".join(str(i) for i in results))

Here I’ve defined a convenience wrapper generator which accepts the expression as a whitespace-delimited string and strips out the intermediate None values that are yielded. If you run that you should see there’s two top-level expressions which yield the same result.

Coming up

That wraps it up for this post — I hope it’s been a useful summary of where things stand in terms of coroutines as far as Python 3.3. In future posts I’ll discuss the asyncio library that was added in Python 3.4, and the additional async keyword that was added in Python 3.5.


  1. Neither of these are anything to do with the official Python library, of course — they’re just implementations off the top of my head. I chose itertools.chain() purely because it’s very simple. 

Fri 10 Jun 2016 at 07:58AM by Andy Pearce in Software tagged with python and coroutines

Fighting Fonts on Mobile

I recently ran into some odd font sizing issues when viewing my website on my iPhone and discovered a few interesting tidbits about rendering on mobile browsers along the way.

Recently I found the time to finally get around to some tweaks to my website theme that I’d been meaning to make for some time — primarily these were related to syntax highlighting and code snippets.

Handling code snippets in HTML has some surprising subtleties, one of them being the overflow behaviour. As much as you try to wrap everything at a sensible number of columns, there will always be times when you absolutely need to represent a long line and it’s always rather unsatisfactory to have to explain that the line breaks were just added for readability. As a result, the styling needs a graceful fallback for these inevitable cases.

I considered using white-space: pre-wrap, which acts like <pre> except that it wraps text on overflow as well as explicit line breaks. One complication, however, is that I sometimes use line numbering for my longer code snippets:

1  This isn't actually very long.
2  So the line numbers are rather pointless.
3  But you get the idea.

To facilitate easy cut and paste this is actually a HTML table with a single row and the line numbers and file content in their own cells, contained within nested <pre> tags¹. This is handy, but it does lead to some odd formatting issues, since there’s nothing explicitly aligning the individual lines in the two elements except for the fact that they’re using a consistent font and line width.

One issue I’ve had in the past, for example, is when I used bold text for some of the syntax highlighting styles — I found that some browsers would adjust the line height when this happened such that the rows no longer quite lined up after that point. I tried various line-height fixes with limited success, but eventually it was just easiest to avoid bold text in code snippets.

Another issue concerns overflows — if you wrap text in the file content as I suggested earlier then you’d need to also somehow arrange a gap (or repeat) in the line numbering, or all the line numbers will be off by one for each wrapped line. There’s no practical way to arrange this except perhaps by putting each line of code in its own table row, and modifying third party code extensively to do that just didn’t appeal for a quick fix.

Instead, therefore, I opted for using overflow: auto, which inserts scrollbars as required, combined with judicious max-width: 100% here and there. I was pleasantly surprised to see this works on any sensible browser².

However, when I tested the site on my iPhone I discovered a new and puzzling issue: the overflow scrolling was working fine, but for some reason when the text overflowed the font size of the line numbers was oddly reduced and hence no longer aligned with the code.

I realised fairly early on that this was some odd resizing issue due to the content being too large, and hence presumably an attempt to fit more text into place getting confused with some of the CSS rules — but no amount of font-size, width, height or anything else seemed to fix it, even with !important littered all over the place.

This threatened to be one of those annoying issues that can be very tough to track down — but luckily for me I stumbled upon the solution fairly quickly. As it turns out, it’s all about how browsers on mobile devices render pages.

The issue with smartphone browsers is that most web content authors didn’t design their content and/or styles to cope with such a small viewport. All kinds of web pages don’t render correctly — the kind of hacks that are required to get cross-browser compatibility almost certainly don’t help, either. As a result of this the browsers use a bit of a trick to get things looking a little more natural.

What they do is render the page to a larger viewport (e.g. 1000 pixels wide) and then scale the page down to fit on the screen. This allows the styles to render more plausibly on most pages, but it does tend to make the text awfully hard to read. This would entail scrolling the screen left/right to read a full line of text which would be a trifle vexing to say the least.

To get around this issue the browsers inflate the text — they scale the font size up so that it becomes legible once again, whilst still leaving all the pixel-oriented sizes intact.

As it turns out, this was happening with my code snippets — but for some reason a different inflation ratio was applied to the line numbers element than the code. Once I knew that this was the issue it was quite easy to fix³ by adding -webkit-text-size-adjust: 100% to my assorted <pre> elements — apparently this did the trick as it now seems to work. Check out that linked page because it’s got a lot of additional useful details which aren’t Mozilla-specific.

There you are, then — who knew rendering pages on mobile was such a subtle business? Actually, given the usual state of all but the simplest CSS tasks I’m quite surprised it was as straightforward as it was.


  1. I can’t really take either credit or blame for this, it’s part of the functionality provided by the Pygments package that I use for syntax highlighting. The documentation indicates that it’s optional, but I think it’s useful. 

  2. sensible, adj.: Any popular web browser which isn’t IE

  3. Perhaps not fix optimally, but at least fix. Work around. Hack. Whatever you want to call it. 

Thu 08 Oct 2015 at 08:10PM by Andy Pearce in Software tagged with fonts and web
