# ☑ FTP Considered Painful

After many years of separation I was recently reunited with the venerable old FTP protocol. The years haven’t been kind to it.

Happy New Year!

Right, that’s the jollity out of the way.

I recently had cause to have some dealings with the File Transfer Protocol, which is something I honestly never thought I’d be using again. During the process I was reminded what an antiquated protocol it is. It’s old, it has a lot of wrinkles and, frankly, it’s really starting to smell.

But perhaps I’m getting ahead of myself, what is FTP anyway?

It’s been around for decades, but in these days of cloud storage and everything being web-based, it’s not something that people typically come across very often. This article is really an effort to convince you that this is something for which we should all be thankful.

Wikipedia says of FTP:

> The File Transfer Protocol (FTP) is the standard network protocol used for the transfer of computer files between a client and server on a computer network.

Twenty years ago, perhaps this was even true, but this Wikipedia page was last edited a month ago! Surely the world has moved on these days? At least I certainly hope so, and this post is my attempt to explain some of the reasons why.

## A Little Piece of History

“What’s so bad about FTP?” I hear you cry! Oh wait, my mistake, it wasn’t you but the painfully transparent literary device sitting two rows back. Well, regardless, I’m glad you asked. First, however, a tiny bit of history and a brief explanation of how FTP works.

FTP is a protocol with a long history, and even predates TCP/IP. It first put in an appearance in 1971 in RFC 114, a standard so old it wasn’t even put into machine-readable form until 2001. At this point it was built on the Network Control Program (NCP), which was a unidirectional precursor of TCP/IP. The simplex nature of NCP may well explain some of FTP’s quirks, but I’m getting ahead of myself.

In 1980 the first version of what we’d recognise as FTP today was defined in RFC 765. In this version the client opens a TCP connection (thereafter known as the command connection) to port 21 on a server. It then sends requests to transfer files across this connection, but the file data itself is transferred across a separate TCP connection, the data connection. This is the main aspect of FTP which doesn’t play well with modern network topologies as we’ll find out later.

Given that TCP connections are full-duplex, why didn’t they take the opportunity to remove the need for a second connection when they moved off NCP? Well, the clues are in RFC 327, from a time when people were still happy to burn RFC numbers for the minutes of random meetings. I won’t rehash it here, but suffice to say it was a different time and the designers of the protocol had very different considerations.

Whatever the reasons, once the command connection is open and a transfer is requested, the server connects back to the client machine on a TCP port specified by the FTP PORT command. This is known as active mode. Once this connection is established, the sending end can throw data down this connection.

Even back in 1980 they anticipated that this strategy might not always be ideal, however, so they also added a PASV command to use passive mode instead. In this mode, the server passively listens on a port and sends its IP address and port to the client. The client then makes a second outgoing connection to this point and thus the data connection is formed. This works a lot better than active mode when you’re behind a firewall, or especially a NAT gateway. As NAT gateways became more popular with the increasingly crowded IPv4 address space, this form of FTP transfer became more or less entirely dominant.
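To make the mechanics concrete, here’s a quick sketch in Python of decoding a PASV reply (the reply text is an illustrative example, not from any real server). The 227 response encodes the address and port as six comma-separated byte values, with the port split across the last two:

```python
import re

def parse_pasv(reply: str):
    """Extract (ip, port) from a 227 PASV reply such as
    '227 Entering Passive Mode (192,168,1,2,48,57)'.
    The port is encoded as two bytes: port = p1 * 256 + p2."""
    m = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if m is None:
        raise ValueError("not a PASV reply: %r" % reply)
    h1, h2, h3, h4, p1, p2 = (int(g) for g in m.groups())
    return ("%d.%d.%d.%d" % (h1, h2, h3, h4), p1 * 256 + p2)

print(parse_pasv("227 Entering Passive Mode (192,168,1,2,48,57)"))
# → ('192.168.1.2', 12345)
```

The client then simply opens a TCP connection to that address and port to form the data connection.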

There were a few later revisions of the RFC to tighten up some of the definitions and provide more clarity. The final change relevant to this article, however, came in 1998 when RFC 2428 added IPv6 support to the protocol. Among other things, this added the EPSV command to enter extended passive mode. The intent was to work around the fact that the original protocol was tied to 4-byte addresses, which couldn’t be changed without breaking existing clients. EPSV simply omits the IP address that the server would send in response to PASV; instead the client reuses the same address it used to create the command connection[1].
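For contrast, a similar sketch for the EPSV reply (format per RFC 2428; the example reply is again illustrative) shows why it sidesteps the address problem entirely: the reply carries only a port, never an address:

```python
import re

def parse_epsv(reply: str) -> int:
    """Extract the data port from a 229 EPSV reply such as
    '229 Entering Extended Passive Mode (|||12345|)'.
    Unlike PASV, no IP address is present: the client reuses the
    address it already used for the command connection."""
    m = re.search(r"\(\|\|\|(\d+)\|\)", reply)
    if m is None:
        raise ValueError("not an EPSV reply: %r" % reply)
    return int(m.group(1))

print(parse_epsv("229 Entering Extended Passive Mode (|||12345|)"))
# → 12345
```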

Not only is extended passive mode great for IPv6, it also works in the increasingly common case where the server is behind a NAT gateway. This causes problems with standard passive mode because the FTP server doesn’t necessarily know its own external IP address, and hence typically sends a response to the client asking it to connect to an address in a private range which, unsurprisingly, doesn’t work[2].

It’s important to note that EPSV mode isn’t the only solution to the NATed server problem—some FTP servers allow the external address they send to be configured instead of the server simply using the local address. There are still some pitfalls to this approach, which I’ll mention later.
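As an illustration, vsftpd (one common server) exposes options for both the advertised address and the passive port range; the option names below are vsftpd’s own, but the values are purely illustrative:

```
# vsftpd.conf (illustrative values; see the vsftpd man page)
pasv_enable=YES
# Restrict the data-connection port range so the firewall only
# needs this range open:
pasv_min_port=30000
pasv_max_port=30100
# External address to advertise in PASV replies when behind NAT:
pasv_address=203.0.113.10
```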

## Simple Enough?

Given all that, what, then, are the problems with FTP?

Well, some of them we’ve covered already, in that it’s quite awkward to run FTP through any kind of firewall or NAT gateway. Active mode requires the client to be able to accept incoming connections on an arbitrary port, which is typically painful as most gateways are built on the assumption of outbound connections only and require fiddly configuration to support inbound ones.

Passive mode makes life easier for the client, but for security-conscious administrators it can be frustrating to have to enable a large range of ports on which to allow outbound connections. It’s also more painful for the server due to the dynamic ports involved, as we’ve already touched on. The server can’t use only a single port for its data connections since that would only allow it to support a single client concurrently. This is because the port number is the only thing linking the command and data connections—if two clients opened data connections at the same time, the server would have no other way to tell them apart.

Extended passive mode makes life easier all round, as long as you can live with opening the largish range of ports required. But even given all this there’s still one major issue which I haven’t yet mentioned, which crops up with the way that modern networks tend to be architected.

## FTP = Forget Talking to Pools

Anyone who’s familiar with architecting resilient systems will know that servers are often organised into clusters. This makes it simple to tolerate failures of a particular system, and is also the only practical way to handle more load than a single server can tolerate.

When you have a cluster of servers, it’s important to find a way to direct incoming connections to the right machine in the cluster. One way to do this is with a hardware load balancer, but a simpler approach is simply use DNS. In this approach you have a domain name which resolves to multiple IP addresses, sorted into a random order each time, and each address represents one member of the pool. As clients connect they’ll typically use the first address and hence incoming connections will tend to be balanced across available servers.
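You can watch this happening from the client’s side with the standard resolver. A minimal Python sketch (query whichever pooled hostname you like):

```python
import socket

def resolve_all(hostname: str):
    """Return the set of IPv4 addresses a name resolves to.
    A DNS-load-balanced pool returns several, typically in a
    different order on each query."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET,
                               socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

print(resolve_all("localhost"))
```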

This works really well for protocols like HTTP which are stateless because every time the client connects back in it doesn’t matter which of the servers it gets connected to, any of them are equally capable of handling any request. If a server gets overloaded or gets taken down for maintenance, the DNS record is updated and no new connections go to it. Simple.

This approach works fine for making the FTP command connection. However, when it comes to something that requires a data connection (e.g. transferring a file), things are not necessarily so rosy. In some cases it might work fine, but it’s a lot more dependent on network topology.

Let’s illustrate a potential problem with an example. Let’s say there’s a public FTP site that’s served with a cluster of three machines, and those have public IP addresses 100.1.1.1, 100.2.2.2 and 100.3.3.3. These are hidden behind the hostname ftp.example.com which will resolve to all three addresses. This can either be in the form of returning multiple A records in one response, or returning different addresses each time. We can see examples of both of these if we look at the DNS records for Facebook and Twitter:

```
$ host -t A facebook.com
facebook.com has address 185.60.216.35
$ host -t A facebook.com
…
$ host -t A twitter.com
…
```
When the FTP client initiates a connection to ftp.example.com it first performs a DNS lookup—let’s say that it gets address 100.1.1.1. It then connects to 100.1.1.1:21 to form the command connection. Let’s say the FTP client and server are both well behaved and then negotiate the recommended EPSV mode, and the server returns port 12345 for the client to connect on.

At this point the client must make a new connection to the specified port. Since it needs to reuse the original address it connected to, let’s say that it repeats the DNS lookup and this time gets IP address 100.2.2.2 and so makes its outgoing data connection to that address. However, since that’s a physically separate server it won’t be listening on port 12345 and the data connection will fail.

OK, so you can argue that’s a broken FTP client—instead of repeating the DNS lookup it could just reconnect to the same address it got last time. However, in the case where you’re connecting through a proxy then this is much less clear cut—the proxy server is going to have no way to know that the two connections that the FTP client is making through it should go to the same IP address, and so it’s more than likely to repeat the DNS resolution and risk resolving to a different IP address as a result. This is particularly likely for sites using DNS for load-balancing since they’re very likely to have set a very short TTL to prevent DNS caches from spoiling the effect.

We could use regular passive mode to work around the inconsistent DNS problem, because the FTP server returns its IP address explicitly. However, this could still cause an issue with the proxy if it’s whitelisting outgoing connections—we would likely have just included the domain name in the whitelist, so the IP address would be blocked. Leaving that issue aside, there’s still another potential pitfall if the FTP server has had the public IP address to return configured by an administrator. If that administrator has configured this via a domain name, the FTP server itself could attempt to resolve the name and get the wrong IP address, and so instruct the client to connect back incorrectly. Each server could be configured with its external IP address directly, but this is going to make centralised configuration management quite painful.

## Insecurity

As well as all the potential connectivity issues, FTP also suffers from a pretty poor security model. This is fairly well known and there’s even an RFC discussing many of the issues.

One of the most fundamental weaknesses is that it involves sending the username and password in plaintext across the channel. One easy way to solve this is to tunnel the FTP connection over something more secure, such as an SSL connection. This setup, usually known as FTPS, works fairly well, but still suffers from the same issues around the separate data and command connections. Another alternative is to tunnel FTP connections over SSH.

None of these options should be confused with SFTP which, despite the similarity in name, is a completely different protocol developed by the IETF[3]. It’s also different from SCP, just for extra confusion[4]. This protocol assumes only a previously authenticated and secure channel, so it is applicable over SSH but more generally anywhere a secure connection has been created.

Overall, then, I strongly recommend sticking to SFTP wherever you can, as the world of FTP is, as we’ve seen, by and large a world of pain if you care about more or less any aspect of security at all, or indeed ability to work in any but the most trivial network architectures.

In conclusion, then, I think that far from FTP being the “standard” network protocol used for the transfer of computer files, we should instead be hammering the last few nails in its coffin and putting it out of our misery.

1. I guess by 1998 they’d given up on those crazy ideas from the ’80s of transferring between remote systems without taking a local copy—you know, the thing that absolutely nobody ever used ever. I wonder why they dropped it?

2. Even with extended passive mode NAT can still cause problems, as you also need to redirect the full port range that you plan to use for data connections to the right server. It solves part of the problem, however.

3. Interestingly there doesn’t appear to be any kind of RFC for SFTP, but instead just a draft. I find this rather odd considering how widely used it is!

4. Just for extra bonus confusion there’s a really old protocol called the Simple File Transfer Protocol defined in RFC 913 which could also reasonably be called “SFTP”. But it never really caught on, so it isn’t likely to cause confusion unless some pedantic sod reminds everyone about it in a blog post or similar.

7 Jan 2017 at 9:50AM by Andy Pearce in Software  | Photo by Rob Potter on Unsplash  | Tags: ftp

# ☑ Website Maintenance on the Move

I write most of my blog articles and make other changes to my site whilst on my daily commute. The limitations of poor network reception and different hardware have forced me to come up with a streamlined process, which I thought I’d share in case it’s helpful for anyone else.

I like writing. Since software is what I know, I tend to write about that. QED.

Like many people, however, my time is somewhat pressured these days — between a wife and energetic four-year-old daughter at home and my responsibilities at work, there isn’t a great deal of time left for me to pursue my own interests. When your time is squeezed, the moments that remain become a precious commodity that must be protected and maximised.

Most of my free time these days is spent on the train between Cambridge and London. While it doesn’t quite make it into my all-time top ten favourite places to be, it’s not actually too bad — I almost invariably get a seat, usually with a table, and there’s patchy mobile reception along the route. Plenty of opportunities for productivity, therefore, if you’re prepared to take them.

Since time is precious, the last thing I want to do when maintaining my blog, therefore, is spend ages churning out tedious boiler-plate HTML, or waiting for an SSH connection to catch up with the last twenty keypresses as I hit a reception blackspot. Fortunately it’s quite possible to set things up to avoid these issues and this post is a rather rambling discussion of things I’ve set up to mitigate them.

## Authoring

The first time-saving tool I use is Pelican. This is a static site generator which processes Markdown source files and generates HTML from them according to a series of Jinja templates.

When first resurrecting my blog from a cringeworthy earlier effort[1], the first thing I had to decide was whether to use some existing blogging platform (WordPress, Tumblr, Medium, etc.), either self-hosted or otherwise. The alternative I’d always chosen previously was to roll my own web app — the last one being in Python using CherryPy — but I quickly ruled out that option. If the point was to save time, writing my own CMS from scratch probably wasn’t quite the optimal way to go about it.

Also, the thought of chucking large amounts of text into some clunky old relational database always fills me with a mild sense of revulsion. It’s one of those solutions that only exists because if all you’ve got is a hosted MySQL instance, everything looks like a BLOB.

In the end I also rejected the hosted solutions. I’m sure they work very well, with all sorts of mobile apps and all such mod cons, but part of the point of all this for me has always been the opportunity to keep my web design skills, meagre as they might be, in some sort of barely functional state. I’m also enough of a control freak to want to keep ownership of my content and make my own arrangements for backing it up and such — who knows when these providers will disappear into the aether.

What I was really tempted to do for a while was build something like a wiki engine but which rendered with appropriate styling like a standard website — it was the allure of using some lightweight markup that really appealed to me. At that point I discovered Pelican and suddenly I realised that with this simple tool I could throw all my Markdown sources into a Git repository and run them through Pelican[2] to generate the site. Perhaps I’m crazy, but it felt like a tool for storing versioned text files might be a far more appropriate tool than a relational database for, you know, storing versioned text files. Just like a wiki, but without the online editing[3].

All there was to do then was build my own Pelican template, set up nginx to serve the whole lot and I was good to go. Simple enough.

## Updating the site

Except, of course, that getting the site generated was only half the battle. I could SSH into my little VPS, write some article in Markdown using Vim and then run Pelican to generate it. That’s great when I’m sitting at home on a nice, fast wifi connection — but when I’m sitting at home I’m generally either spending time with my family or wading through that massive list of things that are way lower on the fun scale than blogging, but significantly higher on the “your house will be an uninhabitable pit of utter filth and despair” scale[4].

When I’m sitting on a train where the mobile reception varies between non-existent and approximately equivalent to a damp piece of string, however, remote editing is a recipe for extreme frustration and a string of incoherently muttered expletives every few minutes. Since I don’t like to be a source of annoyance to other passengers, it was my civic duty to do better.

Fortunately this was quite easy to arrange. Since I was already using a Git repository to store my blog, I could just set up a cron job which updated the repo, checked for any new commits and invoked Pelican to update the site. This is quite a simple script to write and the cron job to invoke it is also quite simple:

```
*/5 * * * *     git -C /home/andy/www/blog-src pull; \
```
If you look at check-updates.py you’ll find it just uses git log -1 --pretty=oneline to grab the ID of the current commit and compares it to the last time it ran — if there’s any difference, it triggers a run of Pelican. It has a few other complicating details like allowing generation in a staging area and doing atomic updates of the destination directory using a symlink to avoid a brief outage during the update, but essentially it’s doing a very simple job.
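The script itself isn’t reproduced here, but the core idea can be sketched in a few lines of Python. The paths, the state-file location and the Pelican invocation below are my own illustrative guesses, not the author’s actual script:

```python
#!/usr/bin/env python3
"""Rebuild the site if the blog repository has a new HEAD commit.
Illustrative sketch only: REPO and STATE are assumed locations."""
import subprocess
from pathlib import Path

REPO = Path("/home/andy/www/blog-src")         # assumed repo location
STATE = Path("/home/andy/.last-built-commit")  # assumed state file

def commit_id(oneline: str) -> str:
    """The commit hash is the first whitespace-delimited field of
    'git log -1 --pretty=oneline' output."""
    return oneline.split()[0]

def head_commit() -> str:
    out = subprocess.run(
        ["git", "-C", str(REPO), "log", "-1", "--pretty=oneline"],
        capture_output=True, text=True, check=True).stdout
    return commit_id(out)

def main() -> None:
    current = head_commit()
    last = STATE.read_text().strip() if STATE.exists() else ""
    if current != last:
        # Regenerate the site; the real script also builds into a
        # staging area and swaps a symlink atomically.
        subprocess.run(["pelican", str(REPO / "content")], check=True)
        STATE.write_text(current)

if __name__ == "__main__":
    main()
```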

This was now great — I could clone my blog’s repo on to my laptop, perform local edits to the files, run a staging build with a local browser to confirm them and then push the changes back to the repo during brief periods of connectivity. Every five minutes my VPS would check for updates to the repo and regenerate the site as required. Perfect.

## There’s an app for that

Well, not quite perfect as it turns out. While travelling with a laptop it was easy to find a Git client, SSH client and text editor, but sometimes I travel with just my iPad and a small keyboard and things were a little trickier.

However, I’ve finally discovered a handful of apps that have streamlined this process:

Working Copy
Since I put Git at the heart of my workflow it was always disappointing that it took so long for a decent Git client to arrive on iOS. Fortunately we now have Working Copy and it was worth the wait. Whilst unsurprisingly lacking some of the more advanced functionality of command-line Git, it’s effective and quite polished and does the job rather nicely. It has a basic text editor built in, but one of its main benefits is that it exposes the working directory to other applications which allows me to choose something a little more full-featured.
Textastic
This is the editor I currently use on both iOS and Mac. It’s packed with features and can open files from Working Copy as well as supporting direct SFTP access and other mechanisms. I won’t go through its myriad features; suffice to say it’s very capable. I should give an honourable mention to Coda for iOS, Panic Inc.’s extremely polished, beautifully crafted text editor for iOS, which I used to use. Coda has a built-in SSH client and is heavily optimised for remote editing, so it’s a great alternative if you want to explore. The original reason I switched was that, with my unreliable uplink, Textastic’s more explicit download/edit/upload model worked a little better for me than Coda’s more implicit remote editing with caching. Now the fact that Textastic supports local editing within the Working Copy repo is also a factor. I’ll also be totally honest and point out that I haven’t played with Coda since they released a (free) major update a while back. I’ve nothing but praise for its presentation and overall quality, however.
Prompt 2
If Coda for iOS didn’t quite tempt me as much as Textastic, another of Panic’s offerings, Prompt 2, is exactly what I need. This is by far the most accomplished SSH client I’ve used on iOS bar none. It supports all the functionality you need with credentials, plus you can layer Touch ID on top if you want it to remember your passphrases. Its terminal emulation is pretty much perfect - I’ve never had any issues with curses or anything else. It runs multiple connections effortlessly and keeps them open in the background without issue. It can even pop up a notification reminder to swap it back to keep your connections alive if it’s idle for too long. As with any remote access on a less than perfect link I’d very strongly suggest using tmux, but Prompt 2 does about all it can to maintain the stability of your connections.

## Summary

That’s about the long and the short of it, then. I’ve been very happy with my Git-driven workflow and found it flexible enough to cope with changes in my demands and platforms. Any minor deficiencies I can work around with scripting on the server side.

The nice thing about Git, of course, is that its branching support means that if I ever wanted to set up, say, a staging area then I can do that with no changes at all. I just create another commit on the server which uses the staging branch instead of master, and I’m good to go — no code changes required, except perhaps some trivial configuration file updates.

Hopefully that’s provided a few useful pointers to someone interested in optimising their workflow for sporadic remote access. I was in two minds about whether to even write this article since so much of it is fairly obvious stuff, but sometimes it’s just useful to have the validation that someone else has made something work before you embark on it — I’ve done so and can confirm it works very well.

1. Not to be confused with a hilariously precocious first version that I created shortly after I graduated.

2. You may be more familiar with Jekyll, a tool written by GitHub co-founder Tom Preston-Werner which does the same job. The only reason I chose Pelican was the fact it was written in Python and hence I could easily extend it myself without needing to learn Ruby (not that I wouldn’t like to learn Ruby, given the spare time).

3. Of course, one could quite reasonably make the point that the online editing is more or less the defining characteristic of a wiki, so perhaps instead of “just like a wiki” I should be saying “almost wholly unlike a wiki but sharing a few minor traits that I happened to find useful, such as generating readable output from a simple markup that’s easier to maintain”, but I prefer to keep my asides small enough to fit in a tweet. Except when they’re talking about asides too large to fit in a tweet — then it gets challenging.

4. The SI unit of measurement is “chores”.

19 Sep 2016 at 1:45PM by Andy Pearce in Software  | Photo by rawpixel.com on Unsplash  | Tags: web

# ☑ Brexit, or Brexit Not — There Is No Try

I voted against Brexit as I feel the UK is significantly better within the EU. However, the looming uncertainty over whether the UK will follow through is much worse than either option.

On Thursday 23 June the United Kingdom held a referendum to decide whether to remain within the European Union, of which it has been a member since 1973. The vote was to leave by a majority of 52% with a fairly substantial turnout of almost 72%. Not the largest majority, but a difference of over a million people can’t be called ambiguous.

So that was it then — we were out. Time to start comparing people’s plans for making it happen to decide which was the best.

Except, of course, it turned out nobody really had any plans. The result seemed to have been a bit of a shock to everyone, including all the politicians who were campaigning for it. Nobody really seemed to know what to do next. Disappointing, but hardly surprising — we’re a rather impulsive nation, always jumping into things without really figuring out what our end game should be. Just look at the shambles that followed the Iraq war.

Fortunately for the Brexiteers there was a bit of a distraction in the form of David Cameron’s resignation — having campaigned to remain within the EU he felt that remaining as leader was untenable. Well, let’s face it, that’s probably disingenuous — what he most likely really felt was he didn’t want to go down in history as the Prime Minister who took the country out of the EU, just in case (as many people think quite likely) it’s a bit of a disaster, quite possibly resented by generations to come.

This triggered an immediate leadership contest within the Tory party which drew all eyes for a time, until former Home Secretary Theresa May was left as the only candidate and assumed leadership of the party. At this point everyone’s attention seems to be meandering its way back to thoughts of Brexit and all the questions it raises.

And a lot of questions there certainly are. There are immigration questions, NHS questions, questions for the Bank of England, questions for EU migrants, questions for Northern Ireland, profound questions for Scotland[1], questions for David Davis, and even a whopping great 94 questions on climate and energy policy, which frankly I think is rather hypocritical — they know full well that nobody has any use for so many questions and most of them will end up on landfill.

To my mind, however, there’s still one question that supersedes all these when talking about Brexit — namely, will Br actually exit?

You’d think this was a done deal — I mean, we had a referendum on it and everything. Usually clears these things right up. But in this case, even well over a month after the vote, there’s still talk about whether we’re going to go through with it.

Apparently the legal situation seems quite muddy but there are possible grounds for a second referendum — although Theresa May is on record as rejecting that possibility. I must say I can see her point — to reject the clearly stated opinion of the British public would need some pretty robust justification and “the leave campaigners lied through their teeth” probably doesn’t really cut it. It’s not like people aren’t used to dealing with politicians being economical with the truth in general elections.

Then we hear that the House of Lords might try to block the progress of Brexit — or at least delay it. Once again, it’s not yet at all clear to what extent this will happen; and if it happens, how effective it will be; and if it’s effective, how easily the government can bypass it. For example, the government could try to force it through with the Parliament Act.

What this all adds up to is very little clarity right now. We have a flurry of mixed messages coming out of government where they tell us that the one thing they are 100% certain of is that they’re definitely going to leave the EU, but not only can’t they give us a plan, they can’t even give us a rough approximation of when they’ll have a plan; we have a motley crew of different groups clamouring for increasingly desperate ways to delay, defer or cancel the whole thing, but very little certainty on whether they even have the theoretical grounds to do so let alone the public support to push it through; and we have an increasingly grumpy EU who are telling us that if we’re really going to leave then we should jolly well get on with it and don’t let Article 50 hit us in the bum on the way out.

Meanwhile the rest of the world doesn’t seem to know what to make of it, so it’s not clear that we’ve seen much of the possible impact, even assuming we do go ahead. But to think there hasn’t been any impact is misleading — even when things are uncertain we’ve already seen negative impacts on academics, education and morale in the public sector. Let’s be clear here, it hasn’t happened yet and it isn’t even a certainty that it will, and we’re already seeing a torrent of negative sentiment.

To be fair, though, we haven’t yet really had the chance to see any possible positive aspects of the decision filtering through. In fact, we probably won’t see any of those until the decision is finalised — or at least until Article 50 is triggered and there’s a deadline to chivvy everyone along.

That’s a big problem.

I think that the longer this “will they/won’t they” period of uncertainty carries on, the more we’ll start to see these negative impacts. Nobody wants to bank on the unlikely event that the UK will change course and remain in the EU, but neither can anyone count on the fact that we won’t. We’re stuck in an increasingly acrimonious relationship that we can’t quite bring ourselves to end yet. If they could find an actor with a sufficient lack of charisma to play Nigel Farage, they could turn it into a low budget BBC Three sitcom.

Don’t get me wrong, I voted firmly to remain in the EU. But whatever we do, I feel like we, as a nation — and by that I mean they as a government that we, as a nation, were daft enough to elect[2] — need to make a decision and act on it. This wasteland of uncertainty is worse than either option, and doesn’t benefit anyone except the lawyers and the journalists — frankly they can both find more worthwhile ways to earn their keep.

So come on Theresa, stop messing about. Stick on a Spotify[3] playlist called something like “100 Best Break Up Songs”, mutter some consoling nonsense to yourself about how there are plenty more nation states in the sea and pick up the phone. Then we can get on with making the best of wherever we find ourselves.

1. Although they’re asked by Michael Gove so I don’t know if they count — given his behaviour during the Tory leadership election I’m not sure he’s been allowed off the naughty step yet.

2. In the interests of balance I should point out that, in my opinion, more or less every party this country elected since 1950 has been a daft decision. Probably before that, too, but my history gets a little too rusty to be certain. The main problem is that the people elected have an unpleasant tendency to be politicians, and if there’s one group of people to whom the business of politics should never be entrusted, it’s politicians.

3. Assuming Spotify, being Swedish, are still allowed?

4 Aug 2016 at 7:45PM by Andy Pearce in Politics  | Photo by James Giddins on Unsplash  | Tags: uk  brexit politics

# ☑ Responsive Design

My website now looks hopefully very slightly less terrible on mobile devices, and I learned a few things getting there.

This website is more or less wholly my own work. I use a content generator to create the HTML from Markdown source files, but the credit (or blame) for the layout and design lies with me alone1.

Actually, I should hastily add that the one exception to this is that the current colour scheme comes straight from Ethan Schoonover’s excellent Solarized palette. But that’s not really germane to the topic of this post, which I note with dismay I haven’t yet broached despite this already being the end of the second paragraph.

Now I’m not going to try and claim that this site is brilliantly designed — I’m a software engineer, not a graphic designer. But I do strongly believe that developers should strive to be generalists where practical, and this site is my opportunity to tinker. Most recently my tinkering has been to try to improve the experience on mobile devices, and I thought I’d share my meagre discoveries for anyone else who’s looking for some basic discussion on the topic to get them started.

The first thing I’d like to make clear is that this stuff is surprisingly simple. I’ve known for a while that the experience of reading this site on my iPhone was not a hugely pleasant one compared to the desktop, but I’d put off dealing with it on the assumption that fixing it would require all sorts of hacky Javascript or browser-specific workarounds — it turns out that it doesn’t, really.

However, there are some gotchas and quirks due to the way that font rendering works and other odd issues. Without further ado let’s jump head-first into them.

## Any viewport in a storm

No matter how simple your site layout or styling, you might find that it doesn’t render optimally on mobile devices. One reason for this is that mobile browsers render to a much larger resolution than the actual screen, and then they zoom font sizes up to remain legible on the screen.

The reason they do this is because they’ve grown up having to deal with all sorts of awful web designers to whom it never occurred that someone might want to view their page at anything less than a maximised web browser on a 1600x1200 display. As a result they use all sorts of absolute positioning and pixel widths, and these cope horribly when rendered on a low resolution screen. Thus the browsers pretend that their screen is high resolution and render accordingly — but to keep the text readable they have to boost the font sizes. The net result is typically something that’s not necessarily all that pretty, but is often surprisingly usable.

This is quite annoying if you’ve gone to the trouble of making sure your site renders correctly at lower resolutions, however, and that’s pretty much the topic of this post. So how do you go about disabling this behaviour?

The short answer is to include a tag like this:

<meta name="viewport" content="initial-scale=1.0" />


The slightly longer answer is that the viewport meta tag specifies the viewport area to which you expect the browser to render. If your layout assumes a certain minimal resolution for desirable results then you can specify that width in pixels like this:

<meta name="viewport" content="width=500" />


This instructs the browser to render to a viewport of width 500 pixels and then perform any scaling required to fill the screen with this. It’s also possible to specify a height attribute but most web layouts don’t make too many assumptions about viewport height.

The first line I showed above didn’t set width, however, but instead the initial-scale. As you might expect, this instructs the browser to render the full width of the page to the pixel width of the window without applying any scaling. Note that this is probably the only value you’ll need if you’re confident you’ve designed your styles to adapt well to all device screens.

You can also use this tag to constrain user scaling to prevent the user zooming in or out, but personally I find this a rather gross affront to accessibility so I’m not going to discuss it further — in my opinion, you should never presume to know better than your users how they want to access your site.

## When is a pixel not a pixel?

This is all fine and dandy, but pixels aren’t everything. If I render to a viewport of 1000 pixels the resultant physical size is significantly different on a desktop monitor than on a retina display iPhone.

The correct solution to this would probably be some sort of way to query the physical size or dot pitch of the display, but that would require all web designers to do The Right Thing™ and that’s always been a bit too much to ask.

Instead, therefore, browser vendors have invented a layer of “virtual pixels” which is designed to be approximately the same physical size on all devices. When you specify a size in pixels anywhere within your CSS then these “virtual pixels” are used — the browser will have some mapping to real physical pixels according to the screen size and resolution, and also no doubt to some extent the whims of the browser writers and consistency with other browsers.

The CSS specifications even have a definition of a reference pixel which links pixel measurements to physical distances — broadly speaking it says that a pixel means a pixel on a 96 DPI standard monitor. I wouldn’t want to make any assumptions about this, however — I’m guessing these things still vary from browser to browser and it’ll be some time, if ever, before everyone converges on the same standard.

As a result of all this you might find your page being stretched around despite your best efforts — there doesn’t seem to be a good way around this except to have a fluid layout and use high resolution images in case they’re ever stretched (few things look more shoddy than a pixelated image).

If you want more details, Mozilla have a great article about the whole thing. You can also read all about the reference-pixel and this slightly old article about CSS units is also very useful.

## Responsive design

Now our site renders at the correct size and scale on the device. Great. If your layout is anything like mine, however, now you’re feeling the loss of all those grotty scaling hacks and things don’t look all that good at all. What you need to do is restyle your site so it degrades gracefully at smaller screen sizes.

As much as possible I really recommend doing this through fluid layouts — the browsers are very good at laying out text and inline elements in an appropriate manner for a given width so where possible just use that. However, there are things that aren’t possible with this approach — for example, if you’re using a mobile device to view this site, or you shrink your browser window enough, you’ll see that the side bar disappears and jumps to the top to free up some additional width to stop the text getting too squashed. There’s no way that a browser will be able to make this sort of decision automatically, it needs a little help from the designer.

This sort of thing can be achieved with media queries.

These offer a way to conditionally include bits of CSS based on your current media type2 and resolution. They don’t make it any easier to design a good mobile layout — making pretty websites is left as an exercise for the reader — but they do at least give you a simple way to swap between the styles you’ve created for different viewports.

There are two ways to use them — you can put conditional blocks in your CSS:

@media screen and (max-width: 450px) {
    body {
        font-size: 0.9em;
    }
}


Or alternatively you can switch to an entirely different stylesheet in your HTML:

<link rel="stylesheet" media="screen and (max-width: 750px)"
href="/styles/mobile.css" type="text/css" />


I use both of these approaches on my site — I find it easier to write a wholly different stylesheet for the mobile layout since it has so many small changes3, and then I make a few small tweaks within the CSS itself for more fine-grained viewport widths within that range.

For example, switching to the mobile stylesheet on my site converts the sidebar sections to inline-block and renders them above the content. Within that same stylesheet, however, there are some small tweaks to make these render centered and spaced out where there’s room to do so, but left-aligned for narrow displays where they’re likely to flow on to multiple lines.

However, it’s important to note that you could use either approach equally well on its own — I don’t believe there’s anything that can be achieved with one but not the other.

As an aside if you’re tempted to go for the <link> approach to save network traffic on the assumption that the browser will only fetch the sheets it needs, think again. The browser can’t really have any special foresight about how you’re going to resize the viewport so it fetches all CSS resources. The media queries then just control which rules get applied.

You can find a few more details in this Stack Overflow question. It does turn out that some browsers will avoid downloading image files they don’t need, but that would presumably apply regardless of whether the media query was located in a <link> tag or in a CSS file.

## Font scaling

One additional issue to be aware of when designing mobile layouts is that if you render to the device width you can get odd results when you turn a mobile device into landscape orientation. Intuitively I’d expect a wider, shorter viewport with text rendered to the same font size, but in actual fact what you can end up with is a zoomed page instead. I think what’s going on here is that the scale factor is determined for portrait orientation and then applied in both cases, but I must admit I’m not confident I fully appreciate the details.

Whatever the cause, one fix appears to be to disable the mobile browser font size adjustments with the text-size-adjust property — this should no longer be required, after all, because the layout is now designed to work equally well in any viewport.

Because this is a somewhat new property and not necessarily supported in its standard form by all browsers it’s wise to specify all the browser-specific versions as well:

-webkit-text-size-adjust: 100%;
-moz-text-size-adjust: 100%;
-ms-text-size-adjust: 100%;
text-size-adjust: 100%;


You might have better luck playing around with different values of this setting than I did — I must confess I didn’t experiment much once I realised that disabling the adjustment seemed to fix my issues.

## Hard copy

The discussion so far has been primarily about screens — but the same media selectors can also be used to specify an alternative stylesheet for printing.

<link rel="stylesheet" media="print"
href="/styles/print.css" type="text/css" />


This all works in exactly the same way except the sorts of things you do in these stylesheets are likely quite different. For example, I force all my colours to greyscale4 and remove the sidebar and other navigation elements entirely. I also remove decorations around links since these of course have no significance in a printed document.

If you want to remove something from the flow of the document you can do this in CSS by setting display to none:

@media print {
    div.sidebar {
        display: none;
    }
}


You might wonder how this differs from setting visibility: hidden — the difference is that display: none removes the element from the document completely, changing the layout; whereas visibility: hidden still reserves space for the element in the layout, it just doesn’t actually render it.

If you want to test out the print stylesheet of this or any site without wasting paper, you can do so with Google Chrome Developer Tools5. Open up the developer tools with the More Tools » Developer Tools menu option and then click the vertical ellipsis (⋮) and select More Tools » Rendering Settings to show a new window. Now you can tick the Emulate Media option and choose Print from the dropdown.

## Layout in CSS

One important thing to note about all the techniques discussed on this page is that they only allow you to swap in or out CSS in response to viewport width — the HTML document structure itself is unchanged. This isn’t generally much of a limitation in today’s browsers since modern CSS offers a huge degree of flexibility, but it’s certainly something to bear in mind when you’re writing your HTML. In general the closer you are to a clean structural HTML document where CSS supplies all of the layout controls, the easier you’re likely to find adapting your site for multiple devices.

If you really need to redirect users to a different page based on width then of course it’s possible with Javascript, but this is a pretty ugly solution — it’s the sort of thing that leads to a world of pain where you have a version of your site for every device under the sun. If you’re the sort of masochist to whom that appeals, go right ahead.

One final point that I should mention is that there are two schools of thought about laying things out for multiple devices — these are responsive and adaptive design, although it’s important to note that they’re not actually mutually exclusive.

Responsive design
This describes layouts that change continuously as the viewport width changes.

Adaptive design
This describes layouts that have several distinct options and “snap” between them as the viewport changes.

Responsive layouts are generally regarded as more graceful, I think, but adaptive layouts may be easier for more complex sites where it’s easier to implement and test a small number of distinct options. Personally I’ve used aspects of both, but I think I’d be comfortable describing my design as responsive overall since I try to use the full viewport width where I can, except in very wide cases where overlong text harms readability.

This is probably better illustrated than explained so I suggest checking out this set of GIFs that demonstrate the differences.

## Conclusions

Overall I found the whole process of making my site more mobile-friendly very painless. I have quite a simple layout (quite intentionally) which made it a lot less hassle than I can imagine a lot of image-heavy sites might find. Frankly, though, that’s the modern web for you — bitmap images are so passé, and for good reason.

1. Anyone who’s been curious enough to poke around in the HTML or CSS (as if anyone would do that…) might notice some references to Sphinx in the naming. This isn’t because I’ve pilfered anything from the Python documentation generator, but simply that this website theme started life as an ill-fated attempt to re-style Sphinx generated documentation — I soon realised, however, that it was significantly inferior in every respect to alternatives such as the Read The Docs style, so I gave up on that idea and used it solely for this site.

2. Where media type is essentially either screen or print in the vast majority of cases.

3. I do factor out some commonality on the backend with sass, however, so arguably I’d save a little network bandwidth by putting those into a common CSS file and using only the CSS-style media queries. However, I feel that such minute fiddling would be somewhat against the spirit of the title of this blog.

4. This might be a little irksome to someone with a colour printer but due to the way I’ve factored out my palette choices from the rest of the CSS it makes life a good deal easier for me. For example, if I change my colour scheme to light-on-dark then there’s no guarantee that the text colours will render legibly in the dark-on-light world of hard copy, whereas greyscale should always be consistently readable on any printer.

5. Other browsers are available — some of them may even have equivalent functionality. As well as a browser, Google are good enough to provide you a search engine to find out just as easily as I could.

2 Aug 2016 at 8:12AM by Andy Pearce in Software  | Photo by Ilya Pavlov on Unsplash  | Tags: web  |  See comments

# ☑ The State of Python Coroutines: Python 3.5

This is part 4 of the “State of Python Coroutines” series which started with The State of Python Coroutines: yield from.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers additional syntax that was added in Python 3.5.

In the previous post in this series I went over an example of using coroutines to handle IO with asyncio and how it compared with the same example implemented using callbacks. This almost brings us up to date with coroutines in Python but there’s one more change yet to discuss — Python 3.5 contains some new keywords to make defining and using coroutines more convenient.

As usual for a Python release, 3.5 contains quite a few changes but probably the biggest, and certainly the most relevant to this article, are those proposed by PEP-492. These changes aim to raise coroutines from something that’s supported by libraries to the status of a core language feature supported by proper reserved keywords.

Sounds great — let’s run through the new features.

## Declaring and awaiting coroutines

To declare a coroutine the syntax is the same as a normal function but with async def used instead of the def keyword. This serves approximately the same purpose as the @asyncio.coroutine decorator did previously — indeed, I believe one purpose of the decorator, aside from documentation, was to allow generator-based coroutines to be awaited from native ones. Since coroutines are now a language mechanism and shouldn’t be intrinsically tied to a specific library, there’s now also a new decorator @types.coroutine that can be used for this purpose.
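As a sketch of that bridging (with made-up names, not taken from any real library), a generator decorated with @types.coroutine can be awaited from an async def coroutine:

```python
import asyncio
import types

@types.coroutine
def legacy_step():
    # A generator-based coroutine: the bare yield returns control
    # to the event loop once before this function resumes.
    yield
    return "done"

async def modern():
    # Thanks to the decorator, the generator is awaitable here even
    # though it isn't declared with async def.
    return await legacy_step()

loop = asyncio.new_event_loop()
result = loop.run_until_complete(modern())
loop.close()
print(result)  # done
```

Without the decorator, awaiting a plain generator from async def code raises a TypeError.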

Previously coroutines were essentially a special case of generators — it’s important to note that this is no longer the case, they are a wholly separate language construct. They do still use the generator mechanisms under the hood, but my understanding is that’s primarily an implementation detail with which programmers shouldn’t need to concern themselves most of the time.

The distinction between a function and a generator is whether the yield keyword appears in its body, but the distinction between a function and a coroutine is whether it’s declared with async def. If you try to use yield in a coroutine declared with async def you’ll get a SyntaxError (i.e. a routine cannot be both a generator and a coroutine).

So far so simple, but coroutines aren’t particularly useful until they can yield control to other code — that’s more or less the whole point. With generator-based coroutines this was achieved with yield from, and with the new syntax it’s achieved with the await keyword. This can be used to wait for the result from any object which is awaitable.

An awaitable object is one of:

• A coroutine, as declared with async def.
• A coroutine-compatible generator (i.e. decorated with @types.coroutine).
• Any object that implements an appropriate __await__() method.
• Objects defined in C/C++ extensions with a tp_as_async.am_await method — this is more or less equivalent to __await__() in pure Python objects.

The last option is perhaps simpler than it sounds — any object that wishes to be awaitable needs to return an iterator from its __await__() method. This iterator is used to implement the fundamental wait operation — the iterator’s __next__() method is invoked and the value it yields is used as the value of the await expression.
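As a sketch of that protocol, here’s a hypothetical awaitable (the class name and value are illustrative, not from any real library) whose __await__() returns a generator iterator:

```python
import asyncio

class Ready:
    """A minimal awaitable that resolves immediately to a fixed value."""
    def __init__(self, value):
        self.value = value

    def __await__(self):
        # A generator function's body makes this return an iterator;
        # its return value becomes the result of the await expression
        # (delivered via StopIteration.value).
        if False:
            yield  # never executed; just makes this a generator
        return self.value

async def demo():
    return await Ready(42)

loop = asyncio.new_event_loop()
result = loop.run_until_complete(demo())
loop.close()
print(result)  # 42
```

A real awaitable would typically yield at least once from __await__() to hand control back to the event loop; this one resolves without ever suspending.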

It’s important to note that this definition of awaitable is what’s required of the argument to await, but the same conditions don’t apply to yield from. There are some things that both will accept (i.e. coroutines) but await won’t accept generic generators and yield from won’t accept the other forms of awaitable (e.g. an object with __await__()).

It’s also equally important to note that a coroutine defined with async def can’t ever directly return control to the event loop — there simply isn’t the machinery to do so. Typically this isn’t much of a problem since most of the time you’ll be using asyncio functions to do this, such as asyncio.sleep() — however, if you wanted to implement something like asyncio.sleep() yourself then as far as I can tell you could only do so with generator-based coroutines.

OK, so let me be pedantic and contradict myself for a moment — you can indeed implement something like asyncio.sleep() yourself. Indeed, here’s a simple implementation:

async def my_sleep(delay, result=None):
    loop = asyncio.get_event_loop()
    future = loop.create_future()
    loop.call_later(delay, future.set_result, result)
    return (await future)

This has a lot of deficiencies as it doesn’t handle being cancelled or other corner cases, but you get the idea. However the key point here is that this depends on asyncio.Future and if you go look at the implementation for that then you’ll see that __await__() is just an alias for __iter__() and that method uses yield to return control to the event loop. As I said earlier, it’s all built on generators under the hood, and since yield isn’t permitted in an async def coroutine, there’s no way to achieve that (at least as far as I can tell).

In general, however, the number of times you’ll need to return control to the event loop directly is very low — the vast majority of cases where you’re likely to do that are for a fixed delay or for IO, and asyncio already has you covered in both cases.

One final note is that there’s also an abstract base class for awaitable objects in case you ever need to test the “awaitability” of something you’re passed.

## Coroutines example

As a quick example of await in action consider the script below which is used to ping several hosts in parallel to determine whether they’re alive. This example is quite contrived, but it illustrates the new syntax — it’s also an example of how to use the asyncio subprocess support.

import asyncio
import os
import sys

PING_PATH = "/sbin/ping"

async def ping(server, results):
    with open(os.devnull, "w") as fd:
        # -c1 -> perform a single ping request only
        # -t3 -> timeout of three seconds on response
        # -q  -> generate less output
        proc = await asyncio.create_subprocess_exec(
            PING_PATH, '-c1', '-q', '-t3', server, stdout=fd)
        # Wait for the ping process to exit and check exit code
        returncode = await proc.wait()
        results[server] = not bool(returncode)

async def progress_ticker(results, num_servers):
    while len(results) < num_servers:
        waiting = num_servers - len(results)
        msg = "Waiting for {0} response(s)".format(waiting)
        sys.stderr.write(msg)
        sys.stderr.flush()
        await asyncio.sleep(0.5)
        sys.stderr.write("\r" + " "*len(msg) + "\r")

def main(argv):
    results = {}
    tasks = [ping(server, results) for server in argv[1:]]
    tasks.append(progress_ticker(results, len(tasks)))
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    for server, pingable in sorted(results.items()):
        status = "alive" if pingable else "dead"
        print("{0} is {1}".format(server, status))

if __name__ == "__main__":
    sys.exit(main(sys.argv))

One point that’s worth noting is that since we’re using coroutines as opposed to threads to achieve concurrency within the script1, we can safely access the results dictionary without any form of locking and be confident that only one coroutine will be accessing it at any one time.

## Asynchronous context manager and iterators

As well as the simple await demonstrated above there’s also a new syntax for allowing context managers to be used in coroutines.

The issue with a standard context manager is that the __enter__() and __exit__() methods could take some time or perform blocking operations — how then can a coroutine use them whilst still yielding to the event loop during these operations?

The answer is support for asynchronous context managers. These work in a very similar manner but provide two new methods __aenter__() and __aexit__() — these are called instead of the regular versions when the caller invokes async with instead of the plain with statement. In both cases they are expected to return an awaitable object that does the actual work.

These are a natural extension to the syntax already described and allow coroutines to make use of any constructions which may perform blocking IO in their enter/exit routines — this could be database connections, distributed locks, socket connections, etc.
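As an illustration, here’s a hypothetical asynchronous context manager — the class name is made up, and a short asyncio.sleep() stands in for real blocking set-up and tear-down work such as opening a connection:

```python
import asyncio

class SlowResource:
    """A made-up resource whose set-up and tear-down yield to the loop."""
    async def __aenter__(self):
        await asyncio.sleep(0.01)   # e.g. establishing a connection
        return "connected"

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0.01)   # e.g. flushing and closing cleanly
        return False                # don't suppress exceptions

async def use_resource():
    async with SlowResource() as res:
        return res

loop = asyncio.new_event_loop()
result = loop.run_until_complete(use_resource())
loop.close()
print(result)  # connected
```

While the set-up or tear-down is awaiting, other coroutines on the same event loop are free to run.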

Another natural extension is asynchronous iterators. In this case objects that wish to be iterable implement an __aiter__() method which returns an asynchronous iterator, which in turn implements an __anext__() method. These two are directly analogous to __iter__() and __next__() for standard iterators, the difference being that __anext__() must return an awaitable object to obtain the value instead of the value directly.

Note that in Python 3.5.x prior to 3.5.2 the __aiter__() method was also expected to return an awaitable, but this changed in 3.5.2 so that it should return the iterator object directly. This makes it a little fiddly to write compatible code because earlier versions still expect an awaitable, but I strongly recommend writing code which caters for the later versions — the Python documentation has a workaround if necessary.
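As a minimal sketch of the protocol (using the 3.5.2+ convention, with made-up names), here’s an asynchronous iterator that counts down:

```python
import asyncio

class Countdown:
    """A made-up async iterator yielding n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __aiter__(self):
        # Per 3.5.2+, return the iterator itself directly.
        return self

    async def __anext__(self):
        if self.n <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)      # yield to the event loop
        self.n -= 1
        return self.n + 1

async def collect():
    out = []
    async for value in Countdown(3):
        out.append(value)
    return out

loop = asyncio.new_event_loop()
values = loop.run_until_complete(collect())
loop.close()
print(values)  # [3, 2, 1]
```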

To wrap up this section let’s see an example of async for — with apologies in advance to anyone who cares even the slightest bit about the correctness of HTTP implementations, I present an HTTP version of the cat utility.

import asyncio
import os
import sys
import urllib.parse

class HTTPCat:
    def __init__(self, urls):
        self.urls = urls
        self.url_reader = None

    class URLReader:
        def __init__(self, url):
            self.parsed_url = urllib.parse.urlparse(url)

        async def connect(self):
            port = 443 if self.parsed_url.scheme == "https" else 80
            connect = asyncio.open_connection(
                self.parsed_url.netloc, port, ssl=(port==443))
            self.reader, writer = await connect
            query = ('GET {path} HTTP/1.0\r\n'
                     'Host: {host}\r\n'
                     '\r\n').format(path=self.parsed_url.path,
                                    host=self.parsed_url.netloc)
            writer.write(query.encode('latin-1'))
            while True:
                line = await self.reader.readline()
                if not line.strip():
                    break

        async def readline(self):
            line = await self.reader.readline()
            return line.decode('latin1')

    def __aiter__(self):
        return self

    async def __anext__(self):
        while True:
            if self.url_reader is None:
                if not self.urls:
                    raise StopAsyncIteration
                self.url_reader = self.URLReader(self.urls.pop(0))
                await self.url_reader.connect()
            line = await self.url_reader.readline()
            if line:
                return line
            self.url_reader = None

async def http_cat(urls):
    async for line in HTTPCat(urls):
        print("Line: {0}".format(line.rstrip()))

def main(argv):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(http_cat(argv[1:]))
    loop.close()

if __name__ == "__main__":
    sys.exit(main(sys.argv))

This is a heavily over-simplified example with many shortcomings (e.g. it doesn’t even support redirections or chunked encoding) but it shows how the __aiter__() and __anext__() methods can be used to wrap up operations which may block for significant periods.

One nice property of this construction is that lines of output will flow down as soon as they arrive from the socket — many HTTP clients seem to want to block until the whole document is retrieved and return it as a string. This is terribly inconvenient if you’re fetching a file of many GB.

Coroutines make streaming the document back in chunks a much more natural affair, however, and I really like the ease of use for the client. Of course, in reality you’d use a library like aiohttp to avoid messing around with HTTP yourself.

## Conclusions

That’s the end of this sequence of articles, and it’s brought us more or less bang up to date. Overall I really like the fact that the Python developers have focused on making coroutines a proper first-class concept within the language. The implementation is somewhat different to other languages, which often seem to try to hide the coroutines themselves and offer only futures as the language interface, but I do like knowing where my context switches are constrained to occur — especially if I’m relying on this mechanism to avoid locking that would otherwise be required.

The syntax is nice and the paradigm is pleasant to work with — but are there any downsides? Well, because the implementation is based on generators under the hood I do have my concerns around raw performance. One of the benefits of asynchronous IO should really be the performance boost and scalability vs. threads for dominantly IO-bound applications — while the scalability is probably there, I’m a little unconvinced about the performance for real-world cases.

I hunted around for some proper benchmarks but they seem few and far between. There’s this page which has a useful collection of links, although it hasn’t been updated for almost a year — I guess things are unlikely to have moved on significantly in that time. From looking over these results it’s clear that asyncio and aiohttp aren’t the cutting edge of performance, but then again they’re not terrible either.

When all’s said and done, if performance is the all-consuming overriding concern then you’re unlikely to be using Python anyway. If it’s important enough to warrant an impact on readability then you might want to at least investigate threads or gevent before making a decision. But if you’ve got what I would regard as a pretty typical set of concerns, where readability and maintainability are the top priority, even though you don’t want performance to suffer too much, then take a serious look at coroutines — with a bit of practice I think you might learn to love them.

Or maybe at least dislike them less than the other options.

1. I’m ignoring the fact that we’re also using subprocesses for concurrency in this example since it’s just an implementation detail of this particular case and not relevant to the point of safe access to data structures within the script.

13 Jul 2016 at 7:00PM by Andy Pearce in Software  | Photo by Andy Pearce  | Tags: python coroutines  |  See comments

# ☑ The State of Python Coroutines: asyncio - Callbacks vs. Coroutines

This is part 3 of the “State of Python Coroutines” series which started with The State of Python Coroutines: yield from.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers more of the asyncio module that was added in Python 3.4.

In the preceding post in this series I introduced the asyncio module and its utility as an event loop for coroutines. However, this isn’t the only use of the module — its primary purpose is to act as an event loop for various forms of I/O such as network sockets and pipes to child processes. In this post, then, I’d like to compare the two main approaches to doing this: using callbacks and using coroutines.

## A brief digression: handling multiple connections

Anyone that’s done a decent amount of non-blocking I/O can probably skim or skip this section — for anyone who’s not come across this problem in their coding experience, this might be useful.

There are quite a few occasions where you end up needing to handle multiple I/O streams simultaneously. An obvious one is something like a webserver, where you want to handle multiple network connections concurrently. There are other examples, though — one thing that crops up quite often for me is managing multiple child processes, where I want to stream output from them as soon as it’s generated. Another possibility is where you’re making multiple HTTP requests that you want to be fetched in parallel.

In all these cases you want your application to respond immediately to input received on any stream, but at the same time it’s clear you need to block and wait for input — endlessly looping and polling each stream would be a massive waste of system resources. Typically there are two main approaches to this: threads and non-blocking I/O1.

These days threads seem to be the more popular solution — each I/O stream has a new thread allocated for it and the stack of this thread encapsulates its complete state. This makes it easy for programmers who aren’t used to dealing with event loops — they can continue to write simple sequential code that uses standard blocking I/O calls to yield as required. It has some downsides, however — cooperating with other threads requires the overhead of synchronisation and if the turnover of connections is high (consider, say, a busy DNS server) then it’s slightly wasteful to be continually creating and destroying thread stacks. If you want to solve the C10k problem, for example, I think you’d struggle to do it using a thread per connection.

The other alternative is to use a single thread and have it wait for activity on any stream, then process that input and go back to sleep again until another stream is ready. This is typically simpler in some ways — for example, you don’t need any locking between connections because you’re only processing one at any given time. It’s also perfectly performant in cases where you expect to be primarily IO-bound (i.e. handling connections won’t require significant CPU time) — indeed, depending on how the data structures associated with your connections are allocated this approach could improve performance by avoiding false sharing issues.

The downside to this method is that it’s rather less intuitive for many programmers. In general you’d like to write some straight-line code to handle a single connection, then have some magical means to extend that to multiple connections in parallel — that’s the lure of threading. But there is a way we can achieve, to some extent, the best of both worlds (spoiler alert: it’s coroutines).

The mainstays for implementing non-blocking I/O loops in the Unix world have long been select(), introduced by BSD in 1983, and the slightly later poll(), added to System V in 1986. There are some minor differences but in both cases the model is very similar:

• Register a list of file descriptors to watch for activity.
• Call the function to wait for activity on any of them.
• Examine the returned value to discover which descriptors are active and process them.
• Loop around to the beginning and wait again.
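The skeleton above can be sketched in a few lines of Python — here I’ve used a socketpair to stand in for real network connections so the snippet is self-contained (the variable names are my own, purely for illustration):

```python
import select
import socket

# One end plays the "client"; the event loop watches the other end.
client, server = socket.socketpair()
client.sendall(b"hello")
client.shutdown(socket.SHUT_WR)  # half-close: signal end of input

received = []
watched = [server]               # 1. the descriptors to watch
while watched:
    # 2. block until at least one descriptor has activity
    readable, _, _ = select.select(watched, [], [])
    # 3. process the descriptors that are ready
    for sock in readable:
        data = sock.recv(4096)
        if data:
            received.append(data)
        else:
            # empty read: the peer closed, stop watching this socket
            watched.remove(sock)
            sock.close()
    # 4. loop around and wait again

print(b"".join(received))  # b'hello'
```

A real server would also watch a listening socket and add newly accepted connections to the watched list, which is exactly where the per-connection state bookkeeping discussed below starts to accumulate.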

This is often known as the event loop — it’s a loop, it handles events. Implementing an event loop is quite straightforward, but the downside is that the programmer essentially has to find their own way to maintain the state associated with each connection. This often isn’t too tricky, but sometimes when the connection handling is very context-dependent it can make the code rather hard to follow. It often feels like scrabbling to implement some half-arsed version of closures, and it would be preferable to let language designers worry about that sort of thing.

The rest of this article will focus on how we can use asyncio to stop worrying so much about some of these details and write more natural code whilst still getting the benefits of the non-blocking I/O approach.

## asyncio with callbacks

One problem with using the likes of select() is that it can encourage you to drive all your coding from one big loop. Without a bit of work, this tends to run counter to the design principle of separating concerns, so we’d like to move as much as possible out of this big loop. Ideally we’d also like to abstract it, implement it in a library somewhere and get the benefits of reusing well-tested code. This is particularly important for event loops, where the potential for serious issues (such as getting into a busy loop) is rather higher than in a lot of other areas of code.

The most common way to hook into a generic event loop is with callbacks. The application registers callback functions which are to be invoked when particular events occur, and then the application jumps into a wait function whose purpose is to simply loop until there are events and invoke the appropriate callbacks.

It’s unsurprising, then, that asyncio is designed to support the callback approach. To illustrate this I’ve turned to my usual example of a chat server — this is a really simple daemon that waits for socket connections (e.g. using netcat or telnet) then prompts for a username and allows connected users to talk to each other.

This implementation is, of course, exceedingly basic — it’s meant to be an example, not a fully-featured application. Here’s the code, I’ll touch on the highlights afterwards.

```python
import asyncio
import sys

class ChatServer:

    class ChatProtocol(asyncio.Protocol):

        def __init__(self, chat_server):
            self.chat_server = chat_server
            self.username = None
            self.buffer = ""
            self.transport = None

        def connection_made(self, transport):
            # Callback: when connection is established, pass in transport.
            self.transport = transport
            welcome = "Welcome to " + self.chat_server.server_name
            self.send_msg(welcome + "\nUsername: ")

        def data_received(self, data):
            # Callback: whenever data is received - not necessarily buffered.
            data = data.decode("utf-8")
            self.buffer += data
            self.handle_lines()

        def connection_lost(self, exc):
            # Callback: client disconnected.
            if self.username is not None:
                self.chat_server.remove_user(self.username)

        def send_msg(self, msg):
            self.transport.write(msg.encode("utf-8"))

        def handle_lines(self):
            while "\n" in self.buffer:
                line, self.buffer = self.buffer.split("\n", 1)
                if self.username is None:
                    if self.chat_server.add_user(line, self.transport):
                        self.username = line
                    else:
                        self.send_msg("Sorry, that name is taken\nUsername: ")
                else:
                    self.chat_server.user_message(self.username, line)

    def __init__(self, server_name, port, loop):
        self.server_name = server_name
        self.connections = {}
        self.server = loop.create_server(
                lambda: self.ChatProtocol(self),
                host="", port=port)

    def broadcast(self, message):
        for transport in self.connections.values():
            transport.write((message + "\n").encode("utf-8"))

    def add_user(self, username, transport):
        if username in self.connections:
            return False
        self.connections[username] = transport
        self.broadcast("User " + username + " joined the room")
        return True

    def remove_user(self, username):
        del self.connections[username]
        self.broadcast("User " + username + " left the room")

    def get_users(self):
        return self.connections.keys()

    def user_message(self, username, msg):
        self.broadcast(username + ": " + msg)

def main(argv):
    loop = asyncio.get_event_loop()
    chat_server = ChatServer("Test Server", 4455, loop)
    loop.run_until_complete(chat_server.server)
    try:
        loop.run_forever()
    finally:
        loop.close()

if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

The ChatServer class provides the main functionality of the application, tracking the users that are connected and providing methods to send messages. The interaction with asyncio, however, is provided by the nested ChatProtocol class. To explain what this is doing, I’ll summarise a little terminology.

The asyncio module splits IO handling into two areas of responsibility — transports take care of getting raw bytes from one place to another and protocols are responsible for interpreting those bytes into some more meaningful form. In the case of an HTTP request, for example, the transport would read and write from the TCP socket and the protocol would marshal up the request and parse the response to extract the headers and body.

This is something asyncio took from the Twisted networking framework and it’s one of the aspects I really appreciate. All too many HTTP client libraries, for example, jumble up the transport and protocol handling into one big mess such that changing one aspect but still making use of the rest is far too difficult.

The transports that asyncio provides cover TCP, UDP, SSL and pipes to a subprocess, which means that most people won’t need to roll their own. The interesting part, then, is asyncio.Protocol and that’s what ChatProtocol implements in the example above.

The first thing that happens is that the main() function instantiates the event loop — this occurs before anything else as it’s required for all the other operations. We then create a ChatServer instance whose constructor calls create_server() on the event loop. This opens a listening TCP socket on the specified port2 and takes a protocol factory as a parameter. Every time there is a connection on the listening socket, the factory will be used to manufacture a protocol instance to handle it.

The main() function then calls run_until_complete(), passing the server object that was returned by create_server() — this will block until the listening socket is fully open and ready to accept connections. This probably isn’t strictly required here because the next thing it does is call run_forever(), which causes the event loop to process IO endlessly until explicitly terminated.

The meat of the application is then how ChatProtocol is implemented. This implements several callback methods which are invoked by the asyncio framework in response to different events:

• A ChatProtocol instance is constructed in response to an incoming connection on the listening socket. No parameters are passed by asyncio — because the protocol needs a reference to the ChatServer instance, this is passed via a closure by the lambda in the create_server() call.
• Once the connection is ready, the connection_made() method is invoked which passes the transport that asyncio has allocated for the connection. This allows the protocol to store a reference to it for future writes, and also to trigger any actions required on a new connection — in this example, prompting the user for a username.
• As data is received on the socket, data_received() is invoked to pass this to the protocol. In our example we only want line-oriented data (we don’t want to send a message to the chat room until the user presses return) so we buffer up data in a string and then process any complete lines found in it. Note that we also should take care of character encoding here — in our simplistic example we blindly assume UTF-8.
• When we want to send data back to the user we invoke the write() method of the transport. Again, the transport expects raw bytes so we handle encoding to UTF-8 ourselves.
• Finally, when the user terminates their connection then our connection_lost() method is invoked — in our example we use this to remove the user from the chatroom. Note that this is subtly different to the eof_received() callback which represents TCP half-close (i.e. the remote end called shutdown() with SHUT_WR) — this is important if you want to support protocols that indicate the end of a request in this manner.

That’s about all there is to it — with this in mind, the rest of the example should be quite straightforward to follow. The only other aspect to mention is that once the loop has been terminated, we go ahead and call its close() method — this clears out any queued data, closes listening sockets, etc.

## asyncio with coroutines

Since we’ve seen how to implement the chat server with callbacks, I think it’s high time we got back to the theme of this post and now compare that with an implementation of the same server with coroutines. In usual fashion, let’s jump in and look at the code first:

```python
import asyncio
import sys

class ChatServer:

    def __init__(self, server_name, port, loop):
        self.server_name = server_name
        self.connections = {}
        self.server = loop.run_until_complete(
                asyncio.start_server(
                    self.accept_connection, "", port, loop=loop))

    def broadcast(self, message):
        for reader, writer in self.connections.values():
            writer.write((message + "\n").encode("utf-8"))

    @asyncio.coroutine
    def prompt_username(self, reader, writer):
        while True:
            writer.write("Enter username: ".encode("utf-8"))
            data = (yield from reader.readline()).decode("utf-8")
            if not data:
                return None
            username = data.strip()
            if username and username not in self.connections:
                self.connections[username] = (reader, writer)
                return username
            writer.write("Sorry, that username is taken.\n".encode("utf-8"))

    @asyncio.coroutine
    def handle_connection(self, username, reader):
        while True:
            data = (yield from reader.readline()).decode("utf-8")
            if not data:
                del self.connections[username]
                return None
            self.broadcast(username + ": " + data.strip())

    @asyncio.coroutine
    def accept_connection(self, reader, writer):
        writer.write(("Welcome to " + self.server_name + "\n").encode("utf-8"))
        username = (yield from self.prompt_username(reader, writer))
        if username is not None:
            self.broadcast("User %r has joined the room" % (username,))
            yield from self.handle_connection(username, reader)
            self.broadcast("User %r has left the room" % (username,))
        yield from writer.drain()

def main(argv):
    loop = asyncio.get_event_loop()
    server = ChatServer("Test Server", 4455, loop)
    try:
        loop.run_forever()
    finally:
        loop.close()

if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

As you can see, this version is written in quite a different style to the callback variant. This is because it’s using the streams API, which is essentially a set of wrappers around the callbacks version that adapts them for use with coroutines.

To use this API we call start_server() instead of create_server() — this wrapper changes the way the supplied callback is invoked and instead passes it two streams: StreamReader and StreamWriter instances. These represent the input and output sides of the socket, but importantly they’re also coroutines so that we can delegate to them with yield from.

On the subject of coroutines, you’ll notice that some of the methods have an @asyncio.coroutine decorator — this serves a practical function in Python 3.5 in that it enables you to delegate to the new style of coroutine that it defines. Pre-3.5 it’s therefore useful for future compatibility, but also serves as documentation that this method is being treated as a coroutine. You should always use it to decorate your coroutines, but this isn’t enforced anywhere.

Back to the code. Our accept_connection() method is the callback that we provided to the start_server() method and the lifetime of this method call is the same as the lifetime of the connection. We could implement the handling of a connection in a strictly linear fashion within this method — such is the flexibility of coroutines — but of course being good little software engineers we like to break things out into smaller functions.

In this case I’ve chosen to use a separate coroutine to handle prompting the user for their username, so accept_connection() delegates to prompt_username() with this line:

username = (yield from self.prompt_username(reader, writer))


Once delegated, this coroutine takes control for as long as it takes to obtain a unique username and then returns this value to the caller. It also handles storing the username and the writer in the connections member of the class — this is used by the broadcast() method to send messages to all users in the room.

The handle_connection() method is also implemented in quite a straightforward fashion, reading input and broadcasting it until it detects that the connection has been closed by an empty read. At this point it removes the user from the connections dictionary and returns control to accept_connection(). We finally call writer.drain() to send any last buffered output — this is rather pointless if the user’s connection was cut, but could still serve a purpose if they only half-closed or if the server is shutting down instead. After this we simply return and everything is cleaned up for us.

How does this version compare, then? It’s a little shorter for one thing — OK, that’s a little facile, what else? We’ve managed to lose the nested class, which seems to simplify the job somewhat — there’s less confusion about the division of responsibilities. We also don’t need to worry so much about where we store things — there’s no transport that we have to squirrel away somewhere while we wait for further callbacks. The reader and writer streams are just passed naturally through the call chain in an intuitive manner. Finally, we don’t have to engage in any messy buffering of data to obtain line-oriented input — the reader stream handles all that for us.

## Conclusions

That about wraps it up for this post. Hopefully it’s been an interesting comparison — I know that I certainly feel like I understand the various layers of asyncio a little better having gone through this exercise.

It takes a bit of a shift in one’s thinking to use the coroutine approach, and I think it’s helpful to have a handle on both mechanisms to better understand what’s going on under the hood, but overall the more I use the coroutine style for IO the more I like it. It feels like a good compromise between the intuitive straight-line style of thread-per-connection and the lock-free simplicity of non-blocking IO with callbacks.

In the next post I’m going to look at the new syntax for coroutines introduced in Python 3.5, which was the inspiration for writing this series of posts in the first place.

1. Some people use the term asynchronous IO for what I’m discussing here, which is certainly the more general term, but I prefer to avoid it due to risk of confusion with the POSIX asynchronous IO interface.

2. In this example we use a hard-coded port of 4455 for simplicity.

5 Jul 2016 at 7:45AM by Andy Pearce in Software  | Photo by Andy Pearce  | Tags: python coroutines  |  See comments

# ☑ The State of Python Coroutines: Introducing asyncio

This is part 2 of the “State of Python Coroutines” series which started with The State of Python Coroutines: yield from.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers parts of the asyncio module that was added in Python 3.4.

In the previous post I discussed the state of coroutines in Python 2.x and then the yield from enhancement added in Python 3.3. Since that release there’s been a succession of improvements for coroutines and in this post I’m going to discuss those that were added as part of the asyncio module.

It’s a pretty large module and covers quite a wide variety of functionality, so covering all that with in-depth discussion and examples is outside the scope of this series of articles. I’ll try to touch on the finer points, however — in this article I’ll discuss the elements that are relevant to coroutines directly and then in the following post I’ll talk about the IO aspects.

## History of asyncio

Python 2 programmers may recall the venerable asyncore module, which was added way back in the prehistory of Python 1.5.2. Its purpose was to assist in writing endpoints that handle IO from sources such as sockets asynchronously. To create clients you derive your own class from asyncore.dispatcher and override methods to handle events.

This was a helpful module for basic use-cases but it wasn’t particularly flexible if what you wanted didn’t quite match its structure. Generally I found I just ended up rolling my own polling loop based on things from the select module as I needed them (although if I were using Python 3.4 or above then I’d prefer the selectors module).

If you’re wondering why talk of an old asynchronous IO module is relevant to a series on coroutines, bear with me.

The limitations of asyncore were well understood and several third party libraries sprang up as alternatives, one of the most popular being Twisted. However, it was always a little annoying that such a common use-case wasn’t well catered for within the standard library.

Back in 2011 PEP 3153 was created to address this deficiency. It didn’t really contain a concrete proposal, however; it just defined the requirements. Guido addressed this in 2012 with PEP 3156 and the fledgling asyncio library was born.

The library went through some iterations under the codename Tulip and a couple of years later it was included in the standard library of Python 3.4. This was on a provisional basis — this means that it’s there, it’s not going away, but the core developers reserve the right to make incompatible changes prior to it being finalised.

OK, still not seeing the link with coroutines? Well, as well as handling IO asynchronously, asyncio also has a handy event loop for scheduling coroutines. This is because the entire library is designed for use in two different ways depending on your preferences — either a more traditional callback-based scheme, where callbacks are invoked on events; or with a set of coroutines which can each block until there’s IO activity for them to process. Even if you don’t need to do IO, the coroutine scheduler is a useful piece that you don’t need to build yourself.

## asyncio as a scheduler

At this point it would be helpful to consider a quick example of what asyncio can do on the scheduling front without worrying too much about IO — we’ll cover that in the next post.

In the example below, therefore, I’ve implemented something like logrotate — mine is extremely simple1 and doesn’t run off a configuration file, of course, because it’s just for demonstration purposes.

First here’s the code — see if you can work out what it does, then I’ll explain the finer points below.

```python
import asyncio
import datetime
import errno
import os
import sys

def rotate_file(path, n_versions):
    """Create .1 .2 .3 etc. copies of the specified file."""
    if not os.path.exists(path):
        return
    for i in range(n_versions, 1, -1):
        old_path = "{0}.{1}".format(path, i - 1)
        if os.path.exists(old_path):
            os.rename(old_path, "{0}.{1}".format(path, i))
    os.rename(path, path + ".1")

@asyncio.coroutine
def rotate_by_interval(path, keep_versions, rotate_secs):
    """Rotate file every N seconds."""
    while True:
        yield from asyncio.sleep(rotate_secs)
        rotate_file(path, keep_versions)

@asyncio.coroutine
def rotate_daily(path, keep_versions):
    """Rotate file every midnight."""
    while True:
        now = datetime.datetime.now()
        last_midnight = now.replace(hour=0, minute=0, second=0)
        next_midnight = last_midnight + datetime.timedelta(1)
        yield from asyncio.sleep((next_midnight - now).total_seconds())
        rotate_file(path, keep_versions)

@asyncio.coroutine
def rotate_by_size(path, keep_versions, max_size, check_interval_secs):
    """Rotate file when it exceeds N bytes checking every M seconds."""
    while True:
        yield from asyncio.sleep(check_interval_secs)
        try:
            file_size = os.stat(path).st_size
            if file_size > max_size:
                rotate_file(path, keep_versions)
        except OSError as exc:
            if exc.errno != errno.ENOENT:
                raise

def main(argv):
    loop = asyncio.get_event_loop()
    # Would normally read this from a configuration file.
    rotate1 = loop.create_task(rotate_by_interval("/tmp/file1", 3, 30))
    rotate2 = loop.create_task(rotate_by_interval("/tmp/file2", 5, 20))
    rotate3 = loop.create_task(rotate_by_size("/tmp/file3", 3, 1024, 60))
    rotate4 = loop.create_task(rotate_daily("/tmp/file4", 5))
    loop.run_forever()

if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Each file rotation policy that I’ve implemented is its own coroutine. Each one operates independently of the others and the underlying rotate_file() function is just to refactor out the common task of actually rotating the files. In this case they all delegate their waiting to the asyncio.sleep() function as a convenience, but it would be equally possible to write a coroutine which does something more clever, like hook into inotify, for example.

You can see that main() just creates a bunch of tasks and plugs them into an event loop, then asyncio takes care of the scheduling. This script is designed to run until terminated so it uses the simple run_forever() method of the loop, but there are also methods to run until a particular coroutine completes or just wait for one or more specific futures.

Under the hood the @asyncio.coroutine decorator marks the function as a coroutine such that asyncio.iscoroutinefunction() returns True — this may be required for disambiguation in parts of asyncio where the code needs to handle coroutines differently from regular callback functions. The create_task() call then wraps the coroutine instance in a Task class — Task is a subclass of Future and this is where the coroutine and callback worlds meet.

An asyncio.Future represents the future result of an asynchronous process. Completion callbacks can be registered with it using the add_done_callback() method. When the asynchronous result is ready, it’s passed to the Future with the set_result() method — at this point any registered completion callbacks are invoked. It’s easy to see, then, how the Task class is a simple wrapper which waits for the result of its wrapped coroutine to be ready and passes it to the parent Future class for invocation of callbacks. In this way, the coroutine and callback worlds can coexist quite happily — in fact in many ways the coroutine interface is a layer implemented on top of the callbacks. It’s a pretty crucial layer in making the whole thing cleaner and more manageable for the programmer, however.
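The Future mechanics described above can be seen in miniature in the sketch below — here I drive a bare Future directly rather than wrapping a coroutine, and I’ve used loop.create_future() (added in Python 3.5.2; on 3.4 you’d construct asyncio.Future(loop=loop) instead). The results list is just my own scaffolding to observe the callback firing:

```python
import asyncio

loop = asyncio.new_event_loop()
future = loop.create_future()

# Register a completion callback, as described above.
results = []
future.add_done_callback(lambda fut: results.append(fut.result()))

# Arrange for the "asynchronous result" to arrive on a later loop pass;
# set_result() is what triggers the registered callbacks.
loop.call_soon(future.set_result, 42)
loop.run_until_complete(future)
loop.close()

print(results)  # [42]
```

A Task does essentially this on your behalf: it resumes the wrapped coroutine as its awaited operations complete, and calls set_result() with the coroutine’s return value at the end.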

The part that links it all together is the event loop, which asyncio just gives you for free. There are a few details I’ve glossed over, however, since it’s not too important for a basic understanding. One thing to be aware of is that there are currently two event loop implementations — most people will be using SelectorEventLoop, but on Windows there’s also the ProactorEventLoop which uses different underlying primitives and has different tradeoffs.

This scheduling may all seem simplistic, and it’s true that in this example asyncio isn’t doing anything hugely difficult. But building your own event loop isn’t quite as trivial as it sounds — there are quite a few gotchas that can trip you up and leave your code locked up or sleeping forever. This is particularly acute when you introduce IO into the equation, where there are some slightly surprising edge cases that people often miss such as handling sockets which have performed a remote shutdown. Also, this approach is quite modular and manages to produce single-threaded code where different asynchronous operations interoperate with little or no awareness of each other. This can also be achieved with threading, of course, but this way we don’t need locks and we can more or less rule out issues such as race conditions and deadlocks.

That wraps it up for this article. I’ll cover the IO aspects of asyncio in my next post, comparing the callback and coroutine based approaches to using it. This is particularly important because one area where coroutines really shine (vs threads) is where your application is primarily IO-bound and so there’s no need to spread over multiple cores.

1. In just one example of many issues, for extra credit2 you might like to consider what happens to the rotate_daily() implementation when it spans a DST change.

2. Where the only credit to which I’m referring are SmugPoints(tm): a currency that sadly only really has any traction inside the privacy of your own skull.

16 Jun 2016 at 8:29AM by Andy Pearce in Software  | Photo by Andy Pearce  | Tags: python coroutines  |  See comments

# ☑ The State of Python Coroutines: yield from

This is part 1 of the “State of Python Coroutines” series.

I recently spotted that Python 3.5 has added yet more features to make coroutines more straightforward to implement and use. Since I’m well behind the curve I thought I’d bring myself back up to date over a series of blog posts, each going over some functionality added in successive Python versions — this one covers the facilities up to and including the yield from syntax added in Python 3.3.

I’ve always thought that coroutines are an underused paradigm.

Multithreading is great for easily expanding single threaded approaches to make better use of modern hardware with minimal changes; multiprocess is great for enforcement of interfaces and also extending across multiple machines. In both cases, however, the emphasis is on performance at the expense of simplicity.

To my mind, coroutines offer the flip side of the coin — perhaps performance isn’t critical, but your approach is just more naturally expressed as a series of cooperative processes. You don’t want to wade through a sea of memory barriers to implement such things, you just want to divide up your responsibilities and let the data flow through.

In this short series of posts I’m going to explore what facilities we have available for implementing coroutines in Python 3, and in the process catch myself up on developments in that area.

## Coroutines in Python 2

Before looking at Python 3 it’s worth having a quick refresher on the options for implementing coroutines in Python 2, not least because many programmers will still be constrained to use this version in many commercial environments.

The genesis of coroutines was when generators were added to the language in Python 2.2 — these are essentially lazily-evaluated lists. One defines what looks like a normal function but instead of a return statement yield is used. This has the effect of emitting a value from your generator but — and this is crucial — it also suspends execution of your generator in place and returns the flow of execution back to the calling code. This continues until the caller requests the next value from the generator at which point it resumes execution just after the yield statement.
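A toy generator of my own (not from the original text) makes the suspend/resume behaviour concrete:

```python
def countdown(n):
    """Yield n, n-1, ..., 1, suspending after each yield."""
    while n > 0:
        yield n    # emit a value and suspend here
        n -= 1     # execution resumes here on the next next() call

gen = countdown(3)
print(next(gen))   # 3
print(next(gen))   # 2
print(list(gen))   # [1]
```

Each call to next() runs the body only as far as the next yield, so values are produced lazily, one at a time.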

For a real-world example consider the following implementation of the sieve of Eratosthenes:

```python
# This is Python 2 code, since this section discusses Python 2.
# For Python 3 replace range(...) with list(range(...)) and
# replace xrange(...) with range(...).

def primes(limit):
    "Yield all primes <= limit."
    sqrt_limit = int(round(limit**0.5))
    limit += 1
    sieve = range(limit)
    sieve[1] = 0
    for i in xrange(2, sqrt_limit + 1):
        if sieve[i]:
            sieve[i*i:limit:i] = [0] * len(xrange(i*i, limit, i))
            yield i
    for i in xrange(sqrt_limit + 1, limit):
        if sieve[i]:
            yield i
```

Generators are, of course, fantastically useful on their own. In terms of coroutines, however, they’re only half the story — they can yield outputs but they can only take their initial inputs, they can’t be updated during their execution.

To address this, Python 2.5 extended generators in several ways which allow them to be turned into general purpose coroutines. A quick summary of these enhancements is:

• yield, which was previously a statement, was redefined to be an expression.
• Added a send() method to inject inputs during execution.
• Added a throw() method to inject exceptions.
• Added a close() method to allow the caller to terminate a generator early.
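These additions can be exercised with a tiny sketch of my own (not from the original enhancement proposal), showing yield as an expression plus send(), throw() and close():

```python
def echoer():
    """Yield back whatever is sent in."""
    item = None
    try:
        while True:
            item = yield item        # yield as an expression: send() lands here
    except ValueError:
        yield "caught"               # an exception injected via throw() lands here

gen = echoer()
print(next(gen))                 # None - advance to the first yield
print(gen.send("hello"))         # hello
print(gen.throw(ValueError))     # caught
gen.close()                      # raises GeneratorExit inside the generator
```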

There are a few other tweaks, but those are the main points. The net result of these changes is that one could now write a generator where new values can be injected, via the send() method, and these are returned within the generator as the value of the yield expression.

As a simple example of this, consider the code below which implements a coroutine that accepts a number as a parameter and returns back the average of all the numbers up to that point.

```python
import itertools

def averager():
    sum = float((yield))
    counter = itertools.count(start=1)
    while True:
        sum += (yield sum / next(counter))
```
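To drive it, you first have to prime the coroutine with next() so it runs to the first yield, then send() values in (the definition is repeated here so the snippet stands alone):

```python
import itertools

def averager():
    sum = float((yield))
    counter = itertools.count(start=1)
    while True:
        sum += (yield sum / next(counter))

avg = averager()
next(avg)            # prime the coroutine: run to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0 -- average of 10 and 20
print(avg.send(30))  # 20.0 -- average of 10, 20 and 30
```

Forgetting the priming next() call is a common mistake — send() on a just-created generator raises a TypeError unless the value is None.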

## Python 3.3 adds “yield from”

The conversion of generators into true coroutines was the end of this story in Python 2, and development of the language long ago moved on to Python 3. Python 3.3 brought the next advancement of coroutines: the yield from construct.

This stemmed from the observation that it was quite cumbersome to refactor generators into several smaller units. The complication is that a generator can only yield to its immediate caller — if you want to split generators up for reasons of code reuse and modularity, the calling generator would have to manually iterate the sub-generator and re-yield all the results. This is tedious and inefficient.

The solution was to add a yield from construct to delegate control entirely to another generator. The subgenerator is run to completion, with results being passed directly to the original caller without involvement from the calling generator. In the case of coroutines, sent values and thrown exceptions are also propagated directly to the currently executing subgenerator.
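A toy illustration of that transparency (these names are invented here, not from the original): while a yield from is in effect, send() calls on the outer generator are received by the subgenerator, not the delegating one.

```python
def accumulator():
    # Subgenerator: keep a running total of the values sent in.
    total = 0
    while True:
        total += (yield total)

def delegator():
    # While the yield from is active, send() calls from the caller
    # pass straight through to accumulator().
    yield from accumulator()

gen = delegator()
next(gen)            # prime: accumulator yields its initial total of 0
print(gen.send(5))   # 5 -- received by accumulator(), not delegator()
print(gen.send(3))   # 8
```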

At its simplest this allows a more natural way to express solutions where generators are delegated. For a really simple example, compare these two sample¹ implementations of itertools.chain():

```python
# Implementation in pre-3.3 Python
def chain(*generators):
    for generator in generators:
        for item in generator:
            yield item

# Implementation in post-3.3 Python
def chain(*generators):
    for generator in generators:
        yield from generator
```
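As a quick check that the post-3.3 version behaves as you'd expect — note in passing that yield from accepts any iterable, not just generators:

```python
def chain(*generators):
    for generator in generators:
        yield from generator

print(list(chain([1, 2], [3, 4], "ab")))  # [1, 2, 3, 4, 'a', 'b']
```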

Right now, of course, this looks somewhat handy but like a fairly minor improvement. When you consider general coroutines, however, it becomes a great mechanism for transferring control. I think of it a bit like a state machine where each state can have its own coroutine, so the concerns are kept separate, and where the whole thing flows data through only as fast as the caller requires.

I’ve illustrated this below by writing a fairly simple parser for expressions in Polish Notation — this is just like Reverse Polish Notation only backwards. Or perhaps I mean forwards. Well, whichever way round it is, it really lends itself to simple parsing because the operators precede their arguments which keeps the state machine nice and simple. As long as the arity of the operators is fixed, no brackets are required for an unambiguous representation.

First let’s see the code, then I’ll discuss its operation below:

```python
import math
import operator

# Subgenerator for unary operators.
def parse_unary_operator(op):
    return op((yield from parse_argument((yield))))

# Subgenerator for binary operators.
def parse_binary_operator(op):
    values = []
    for i in (1, 2):
        values.append((yield from parse_argument((yield))))
    return op(*values)

OPERATORS = {
    'sqrt': (parse_unary_operator, math.sqrt),
    '~': (parse_unary_operator, operator.invert),
    '+': (parse_binary_operator, operator.add),
    '-': (parse_binary_operator, operator.sub),
    '*': (parse_binary_operator, operator.mul),
    '/': (parse_binary_operator, operator.truediv)
}

# Detect whether argument is an operator or number - for
# operators we delegate to the appropriate subgenerator.
def parse_argument(token):
    subgen, op = OPERATORS.get(token, (None, None))
    if subgen is None:
        return float(token)
    else:
        return (yield from subgen(op))

# Parent generator - send() tokens into this.
def parse_expression():
    result = None
    while True:
        token = (yield result)
        result = yield from parse_argument(token)
```

The main entrypoint is the parse_expression() generator. In this case it’s necessary to have a single parent because we want the behaviour of the top-level expressions to be fundamentally different — in this case, we want it to yield the result of the expression, whereas intermediate values are instead consumed internally within the set of generators and not exposed to calling code.

We use the parse_argument() generator to calculate the result of an expression and return it — it can use a return value since it's invoked as a subgenerator of parse_expression() (and others). It determines whether each token is an operator or a numeric literal — in the latter case it just returns the literal as a float, and in the former it delegates to a subgenerator based on the operator type. Here I just have unary and binary operators as simple illustrative cases. Note that one could easily implement an operator of variable arity, however, since the delegate generator makes its own decision about when to relinquish control back to the caller — this is an important property when modularising code.

Hopefully this example is otherwise quite clear — the parse_expression() generator simply loops and yields the values of all the top-level expressions that it encounters. Note that because there's no filtering of the results by the calling generator (since it's just delegating), it will yield lots of None results as it consumes inputs, until the result of a top-level expression can be yielded — it's up to the calling code to ignore these. This is just a consequence of the fact that send() on a generator always yields a value, even when there isn't a meaningful one.

The only other slight wrinkle is that you might notice some seemingly excessive bracketing around the yield expressions — this is deliberate. PEP 342 describes the exact parsing rules, but if you just remember to always bracket the expression then that's one less thing to worry about.
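A minimal sketch of those rules (the function name here is invented): the parentheses around yield can be omitted when it's the sole expression on the right-hand side of an assignment, but are required when it forms part of a larger expression.

```python
def example():
    x = yield          # OK: yield is the sole expression on the right
    y = (yield) + 1    # parentheses required as part of a larger expression
    yield x + y

gen = example()
next(gen)              # run to the first yield
gen.send(2)            # x = 2; runs on to the second yield
print(gen.send(4))     # y = 4 + 1 = 5, so this prints 7
```

Without the parentheses on the second line, `yield + 1` would parse as yielding the value +1 rather than adding 1 to the received value.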

One thing that’s worth noting is that this particular example is quite wasteful for deeply nested expressions in the same way that recursive functions can be. This is because it constructs two new generators for each nested expression — one for parse_argument() and one for whichever operator-specific subgenerator this delegates to. Whether this is acceptable depends on your use-cases and the extent which you want to trade off the code expressiveness against space and time complexity.

Below is an example of how you might use parse_expression():

```python
def parse_pn_string(expr_str):
    parser = parse_expression()
    next(parser)
    for result in (parser.send(i) for i in expr_str.split()):
        if result is not None:
            yield result

results = parse_pn_string("* 2 9 * + 2 - sqrt 25 1 - 9 6")
print("\n".join(str(i) for i in results))
```

Here I’ve defined a convenience wrapper generator which accepts the expression as a whitespace-delimited string and strips out the intermediate None values that are yielded. If you run this you should see that the two top-level expressions yield the same result (18.0 in both cases).

## Coming up

That wraps it up for this post — I hope it’s been a useful summary of where things stand with coroutines as of Python 3.3. In future posts I’ll discuss the asyncio library that was added in Python 3.4, and the async and await keywords that were added in Python 3.5.

1. Neither of these has anything to do with the official Python library, of course — they’re just implementations off the top of my head. I chose itertools.chain() purely because it’s very simple.

10 Jun 2016 at 7:58AM by Andy Pearce in Software  | Photo by Andy Pearce  | Tags: python coroutines
