HTTP/2

8 May 2015 at 11:52PM in Software

After eighteen years there’s a new version of HTTP. Having heard comparatively little about it until now, I decided to take a quick look.


HTTP is the protocol that underpins the World Wide Web, and it’s been running more or less unchanged since 1997. After all this time there’s finally a new version of it, and it’s pretty different. This article delves into some of its details and their implications for the future of The Web.

Introduction

The Web is surely the most used service on the Internet by far — so much so, in fact, that I’m sure to many people the terms “Web” and “Internet” are wholly synonymous.

Perhaps one of the most remarkable things about the Web is that despite the revolutionary changes in the services offered over it, the protocol that underpins the whole thing, HTTP, has remained remarkably unchanged for almost two decades. The Web as we know it now runs more or less entirely over HTTP/1.1, the version standardised in 1997, receiving only minor tweaks since then.

However, the venerable old HTTP/1.1 is now starting to look a little creaky when delivering today’s highly dynamic and media-rich websites. In response to this, Google set a handful of its boffins to planning solutions to what it saw as some of the more serious issues — the result was a new protocol known as SPDY. This was intended to be more or less compatible with HTTP at the application level, but improve matters in the underlying layers.

In fact, SPDY has essentially formed the basis of the new HTTP/2 standard, the long-awaited replacement for HTTP/1.1 — so much so that Google has indicated that it plans to drop support for SPDY in favour of the new standard, and the other main browser vendors seem to be following suit. Since it seems to have fairly broad support, it’s probably useful to take a quick look at what it’s all about, then.

Why HTTP/2?

But before looking at HTTP/2 itself, what are the issues it’s trying to solve? The main one, as I see it, is latency — the amount of time between the first request for a page, and finally seeing the full thing loaded on screen.

The issue here is mainly that pages are typically composed of many disparate media: the main HTML file, one or more CSS files, Javascript files, images, videos, etc. With HTTP/1.1 these are all fetched with separate requests and since one connection can only handle one request at a time, this serialisation tends to hurt performance. Over conventional HTTP browsers can, and do, make multiple connections and fetch media over them concurrently — this helps, although it’s arguably rather wasteful. This is much less helpful over HTTPS, however, as the overhead of creating a secure connection is much higher — the initial connection triggers a cryptographic handshake which not only adds latency, it also puts a significant additional load on the server. Since HTTPS is increasingly becoming the standard for online services, even those which don’t deal in particularly sensitive data, this use-case is quite important.

There are other issues which add to the latency problem. HTTP headers need to be repeated with each request — this is quite wasteful as the ASCII representation of these headers is quite verbose compared to the data contained therein. Clients also need to make a good deal of progress through parsing the main HTML page to figure out which additional files need to be fetched and this adds yet more latency to the overall page load process.

Some of you may be thinking at this point that there’s an existing partial solution to some of these issues: pipelining. This is the process by which a client may send multiple requests without waiting for responses, leaving the server to respond to them in sequence as quickly as it can. Hence, as the client parses the response to the initial request, it can immediately fire off requests for the additional files it needs, and this helps reduce latency by avoiding a round-trip time for each request.

This is a useful technique but it still suffers from a significant problem known as head-of-line blocking. Essentially this is where a response for a large object holds up further responses on the connection — so for example, fetching a large image, without which the page could quite possibly still be usefully rendered, could hold up fetching of a CSS file which is rather more crucial.

Latency, then, is an annoying issue — so how does HTTP/2 go about solving it?

Negotiation

In essence what they’ve done is allow multiple request/response streams to be multiplexed onto a single connection. They’re still using TCP, but they’ve added a framing layer on top of that, so requests and responses are split up into frames of up to 16KB (by default). Each frame has a minimal header which indicates to which stream the frame belongs and so clients and servers can request and serve multiple resources concurrently over the same connection without suffering from the head-of-line blocking issue.

This is, of course, a pretty fundamental change from HTTP/1.1 and so they’ve also had to add a negotiation mechanism to allow clients and servers to agree to handle a connection over HTTP/2. The form this mechanism takes differs between HTTP and HTTPS. In the former case the existing HTTP/1.1 Upgrade header is supplied with the value h2c. There’s also a HTTP2-Settings header which must be supplied to specify the parameters of the new HTTP/2 connection, should the server accept — I’ll cover these settings in a moment. If the server supports HTTP/2 it can accept the upgrade by sending a 101 Switching Protocols response and then respond to the initial request using HTTP/2 — otherwise it can treat the request as if the header wasn’t present and respond as normal for HTTP/1.1.
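
As a concrete illustration (the host and path here are invented, and the HTTP2-Settings value, a base64url-encoded SETTINGS payload, is elided), the exchange looks roughly like this:

GET /index.html HTTP/1.1
Host: example.com
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: <base64url-encoded SETTINGS payload>

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: h2c

From this point on the connection speaks HTTP/2, and the response to the original request arrives as HTTP/2 frames.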

For HTTPS a different mechanism is used, Application Layer Protocol Negotiation. This is an extension to TLS that was requested by the HTTP/2 working group and was published as an RFC last year. It allows a client and server to negotiate an application layer protocol to run over a TLS connection without additional round trip times. Essentially the client includes a list of the protocols it supports in its ClientHello message from which the server selects its most preferred option and indicates this choice in its ServerHello response.
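
To make that concrete, here’s a minimal sketch using Python’s standard ssl module (assuming a build with ALPN support); the hostname is just a placeholder:

import socket
import ssl

# Offer h2 first, falling back to HTTP/1.1 if the server doesn't support it.
ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])

with socket.create_connection(("example.com", 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
        # The server's choice is known as soon as the handshake completes.
        print(tls.selected_alpn_protocol())  # "h2" if the server agreed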

However the connection was negotiated, both endpoints then send final confirmation of HTTP/2 in the form of a connection preface. For the client this takes the form of the following fixed string followed by a SETTINGS frame:

PRI * HTTP/2.0<CRLF>
<CRLF>
SM<CRLF>
<CRLF>

As an aside, you may be wondering why the client is obliged to send a HTTP2-Settings header with its upgrade request if it’s then required to send a SETTINGS frame anyway — I’ve no idea myself; it seems a little wasteful to me. The RFC does allow that initial SETTINGS frame to be empty, though.

The server’s connection preface is simply a SETTINGS frame, which may also be empty. Each endpoint must also acknowledge the other’s SETTINGS frame by sending back an empty SETTINGS frame with the ACK flag set. At this point the HTTP/2 connection is fully established.

The SETTINGS frames (or the HTTP2-Settings header) are used to set per-connection (not per-stream) parameters for the sending endpoint. For example, they can be used to set the maximum frame size that endpoint is prepared to accept, or disable server push (see later). A full list of the settings can be found in the RFC.
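
As a rough sketch of what the preface exchange looks like on the wire, using the 9-byte frame layout described in the next section (SETTINGS is frame type 0x4, the ACK flag is 0x1, and SETTINGS frames always travel on stream 0):

# A minimal sketch of the client's side of connection establishment.
CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

def settings_frame(payload: bytes = b"", ack: bool = False) -> bytes:
    """Build a SETTINGS frame: 3-byte length, type, flags, 4-byte stream ID."""
    flags = 0x1 if ack else 0x0
    header = (len(payload).to_bytes(3, "big")
              + bytes([0x4, flags])
              + (0).to_bytes(4, "big"))  # SETTINGS always uses stream 0
    return header + payload

# The client preface: the magic string plus a (possibly empty) SETTINGS frame.
client_preface = CLIENT_PREFACE + settings_frame()
# On receiving the peer's SETTINGS frame, acknowledge it with an empty ACK.
settings_ack = settings_frame(ack=True)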

Frames

As I’ve already mentioned, all HTTP/2 communication is packetised — although of course, since it’s running over a stream-based protocol, frames on a given connection are still transmitted one at a time, in sequence. The frame format includes a 9-byte header with the following fields (a minimal parsing sketch follows the field list):

Length [24 bits]
The size of the frame, not including the 9-byte header. This must not exceed 16KB unless otherwise negotiated with appropriate SETTINGS frames.
Type [8 bits]
An 8-bit integer indicating the type of this frame.
Flags [8 bits]
A bitfield of flags whose interpretation depends on the frame type.
Reserved [1 bit]
Protocol designers do love their reserved fields. This one should always be zero in the current version of the protocol.
Stream ID [31 bits]
A large part of the purpose of HTTP/2 is to multiplex multiple concurrent requests and this is done by means of streams. This ID indicates to which stream this frame applies. Stream ID 0 is special and reserved for frames which apply to the connection as a whole as opposed to a specific stream. Stream IDs are chosen by endpoints and namespaced by the simple expedient of the client always choosing odd-numbered IDs and the server limiting itself to even-numbered IDs. In the case of an upgraded HTTP connection the initial request is automatically assumed to be stream 1.
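
Here’s a minimal sketch of parsing that fixed header in Python, my own illustration rather than anything from a particular library:

def parse_frame_header(header: bytes):
    """Split the fixed 9-byte HTTP/2 frame header into its fields."""
    assert len(header) == 9
    length = int.from_bytes(header[0:3], "big")  # 24-bit payload length
    frame_type = header[3]                       # 8-bit frame type
    flags = header[4]                            # 8-bit flags bitfield
    # Top bit of the last four bytes is the reserved bit; mask it off.
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id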

The RFC specifies a complete state machine for streams which I won’t go into in detail here. Suffice to say that there are two main ways to initiate a new stream — client request and server push. We’ll look at these in turn.

Client Requests

When the client wishes to make a new request, it creates a new stream ID (odd-numbered) and sends a HEADERS frame on it. This replaces the list of headers in a conventional HTTP request. If the headers don’t fit in a single frame then one or more CONTINUATION frames can be used to send the full set, the final frame being marked with the END_HEADERS flag.

The format of the headers themselves has also changed in HTTP/2: a compression scheme known as HPACK is now used, combining a static dictionary of common headers with Huffman coding. This is actually covered in a separate RFC and I don’t want to go into it in too much detail — essentially it’s just a simple process for converting a set of headers and values into a compressed binary form. As you might expect, this scheme does allow for arbitrary headers not covered by the static dictionary, but it’s designed such that common headers take up very little space.
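
If you fancy experimenting, the scheme is easy to play with via the third-party hpack Python package (the API below is my understanding of it; check its documentation):

from hpack import Encoder, Decoder

encoder = Encoder()
# Headers are passed as (name, value) pairs; pseudo-headers work the same way.
block = encoder.encode([(":method", "GET"), (":path", "/"), ("accept", "text/html")])
print(block.hex())  # a compact binary form, much smaller than the ASCII headers

decoder = Decoder()
print(decoder.decode(block))  # round-trips back to the original pairs

Note that the encoder and decoder are stateful, a detail which matters below.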

The block of binary data that results from the encoding process is what’s split up into chunks and placed in the HEADERS and CONTINUATION frames sent as part of making a request. One thing that initially surprised me is that the sequence of frames making up a single header block must be transmitted contiguously — no other frames, not even from other streams, may be interleaved. The reason for this is pretty clear, however, when you realise that the process of decoding a header block is stateful — this restriction ensures that any given connection only needs one set of decoder state to be maintained, as opposed to one per stream, to keep resource requirements bounded. One consequence of this is that very large header blocks could lead to HTTP/1.1-style head-of-line blocking, but in practice I doubt such large blocks will be common.

There are a few other things to note about the transmission of headers other than the compression. The first is that all headers are forced to lowercase for transmission — this is just to assist the compression, and since HTTP headers have always been case-insensitive this shouldn’t affect operation. The second thing to note is that the Cookie header can be broken up into separate headers, one for each constituent cookie. This reduces the amount of state that needs to be retransmitted when cookies change.

The third and final item of note is slightly more profound — the information that used to be presented in the initial request and response lines has now moved into headers. Any request is therefore expected to contain at least the following pseudo-headers:

:method
The method of the request — e.g. GET or POST.
:scheme
The URL scheme of the requested resource — e.g. http or https.
:authority
The host (and optional port) of the target URI, which takes over the role of the HTTP/1.1 Host header.
:path
The path and (optionally) query parameters portion of the requested URL.

Similarly the response also contains a pseudo-header:

:status
The status code of the response as a bare integer — there is no longer a way to include a “reason” phrase.

At this point, therefore, we’ve got a way to send a request, including headers. Any request body is then sent via DATA frames. Due to the framing of HTTP/2 there’s no need for chunked encoding, of course, although the standard does still allow a Content-Length header. Whether or not the request contains a body, it is terminated by its final frame having the END_STREAM flag set — this puts the stream into a “half-closed” state where the request is finished and the client is awaiting a response.
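
Pulling those pieces together, a bodyless GET could go out as a single HEADERS frame. Here’s a sketch reusing the hpack encoder from earlier, where HEADERS is frame type 0x1, END_STREAM is flag 0x1 and END_HEADERS is flag 0x4:

from hpack import Encoder

def headers_frame(block: bytes, stream_id: int, end_stream: bool) -> bytes:
    """Build a HEADERS frame; assumes the block fits within a single frame."""
    flags = 0x4 | (0x1 if end_stream else 0x0)  # END_HEADERS, maybe END_STREAM
    header = (len(block).to_bytes(3, "big")
              + bytes([0x1, flags])
              + stream_id.to_bytes(4, "big"))
    return header + block

block = Encoder().encode([
    (":method", "GET"),
    (":scheme", "https"),
    (":path", "/index.html"),
])
# Stream 1 is the first client-initiated (odd-numbered) stream.
frame = headers_frame(block, stream_id=1, end_stream=True)  # GET has no body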

The response is sent using HEADERS, CONTINUATION and DATA frames in the same way as the request, and completion of the response is similarly indicated by the final frame having END_STREAM set. After this point the stream is closed and finished with — this has echoes of the old HTTP/1.0 approach of closing the connection to indicate the end of a response, but of course doesn’t have the overhead of closing and re-opening TCP connections associated with it.

Server Push

That’s covered a standard client request, so what about this “push” mechanism that’s been added?

Well, one of the causes of latency when opening a multimedia website is the fact that the client has to scan the originally requested page for links to other media files, and then send requests for these. In many cases, however, the server already knows that the client is very likely to, for example, request images that are linked within a HTML page. As a result, the overall page load time could be reduced if the server could just send these resources speculatively along with the main requested page — in essence, this is server push.

When the client makes a request the server can send PUSH_PROMISE frames on the same stream that’s being used for the DATA frames of the response. These frames (and potentially associated CONTINUATION frames) contain a header block which corresponds to a request that the server believes the client will need to make as a result of the currently requested resource. They also reference a new stream ID which is reserved by the server for sending the linked resource. At this point the client can decide if it really does want the offered resource — if so, all it has to do is wait until the server starts sending the response, consisting of the usual HEADERS, CONTINUATION and DATA frames. If it wishes to reject the push, it can do so by sending a RST_STREAM frame to close the stream.

That’s about it for server push. The only slightly fiddly detail is that the server has to send the PUSH_PROMISE frame prior to any DATA frames that reference the offered resource — this prevents races where the client might request a resource the server is about to push. The flip side is that a client is not permitted to request a resource whose push has been offered. This is all fairly common sense.
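
Declining a push is just a matter of resetting the promised stream. A sketch using the same frame layout as before (RST_STREAM is frame type 0x3 and its payload is a 32-bit error code, CANCEL being 0x8):

def rst_stream_frame(stream_id: int, error_code: int = 0x8) -> bytes:
    """Build a RST_STREAM frame; error code 0x8 is CANCEL."""
    payload = error_code.to_bytes(4, "big")
    header = (len(payload).to_bytes(3, "big")
              + bytes([0x3, 0x0])
              + stream_id.to_bytes(4, "big"))
    return header + payload

# Decline the resource promised on (even-numbered, server-chosen) stream 2.
reject = rst_stream_frame(2)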

Prioritisation and Flow Control

The only other details of HTTP/2 that I think are interesting to note are some details about prioritisation and flow control.

Flow control is intended to prevent starvation of streams or entire connections — since many requests could be in flight on a single connection, this could become quite an important issue. Flow control in HTTP/2 is somewhat similar to TCP’s: each receiver advertises an available window, which is the amount of data it has indicated it’s prepared to accept from the other end. This defaults to a modest 64KB. Windows can be set on a per-connection and a per-stream basis, and senders are required to ensure that they do not transmit any frames that would exceed either available window.

Each endpoint can update its advertised windows with WINDOW_UPDATE frames. An endpoint that has no real need for flow control can effectively disable it by advertising the maximum window size of \(2^{31}-1\). When deciding how to use flow control, it’s important to remember that advertising a window size which is too small, or not sending WINDOW_UPDATE frames in a timely manner, could significantly hurt performance, particularly on high latency connections — in these days of mobile connectivity, I suspect latency is still very much an issue.
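
A WINDOW_UPDATE is a tiny frame: type 0x8 with a 4-byte window increment as its payload, where stream 0 grows the connection-level window and any other ID grows that stream’s window. A sketch along the lines of the earlier frame builders:

def window_update_frame(stream_id: int, increment: int) -> bytes:
    """Build a WINDOW_UPDATE granting the peer `increment` more bytes."""
    assert 0 < increment <= 2**31 - 1
    payload = increment.to_bytes(4, "big")
    header = (len(payload).to_bytes(3, "big")
              + bytes([0x8, 0x0])
              + stream_id.to_bytes(4, "big"))
    return header + payload

# Grow the connection-level window (stream 0) by 1MB as data is consumed.
grant = window_update_frame(0, 1 << 20)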

As well as simple flow control, HTTP/2 also introduces a prioritisation mechanism whereby one stream can be made dependent on another by sending a PRIORITY frame with the appropriate details. Initially all streams are assumed to depend on the non-existent stream 0, which effectively forms the root of a dependency tree. The intent is that these dependencies are used to prioritise the streams — when deciding which stream to send frames on next, endpoints should give priority to streams on which others depend over the streams that depend on them.

As well as the dependencies themselves, there’s also a weighting scheme, where resources can be shared unevenly between the streams that depend on the same parent. For example, perhaps images used for navigation are regarded as more important than a background texture and hence can be given a higher weight such that they download faster over a connection with limited bandwidth.
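
Weights range from 1 to 256 and translate into proportional shares among streams hanging off the same parent. A trivial illustration with invented numbers:

# Hypothetical sibling streams sharing one parent: each gets a share of the
# available resources proportional to weight / sum-of-sibling-weights.
weights = [("navigation images", 32), ("background texture", 8)]
total = sum(w for _, w in weights)
for name, weight in weights:
    print("{}: {:.0%}".format(name, weight / total))  # 80% and 20%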

Conclusions

That’s about it for the protocol itself — so what does it all mean?

My thoughts on the new protocol are rather mixed. I think the multiplexing of streams is a nice idea and there’s definite potential for improvement over a single secure connection. Aside from the one-time cost of implementing the protocol, I think it also has the potential to make life easier for browser vendors since they won’t have to worry so much about using ugly hacks to improve performance. There’s also good potential for the prioritisation features to really improve things over high latency and/or low bandwidth connections, such as mobile devices in poor reception — to make best use of this browsers will need to use a bit of cleverness to improve their partial rendering of pages before all the resources are fetched, but at least the potential is there.

On the flip side, I have a number of concerns. Firstly, it’s considerably more complicated, and less transparent, than standard HTTP/1.1. Any old idiot can knock together a HTTP/1.1 client library1 but HTTP/2 definitely requires some more thought — the header compression needs someone who’s fairly clueful to implement it, and although the protocol allows the potential for improved performance, a decent implementation is also required to take advantage of this. With regards to transparency, it’s going to be considerably harder to debug problems — with HTTP/1.1 all you need is tcpdump or similar to grab a transcript of the connection and you can diagnose a significant number of issues. With HTTP/2 the multiplexing and header compression are going to make this much trickier, although I would expect more advanced tools like Wireshark to have decoders developed for them fairly quickly, if they haven’t already.

Of course, transparency of this sort isn’t really relevant if you’re already running over a TLS connection, and this reminds me of another point I’d neglected to mention — although the protocol includes negotiation methods over both HTTP and HTTPS, at least the Firefox and Chrome dev teams appear to be indicating they’ll only support HTTP/2 over TLS. I’m assuming they simply feel that there’s insufficient benefit over standard HTTP, where multiple connections are opened with much less overhead. I do find this stance a little disappointing, however, as I’m sure there are still efficiencies to be had in allowing clients more explicit prioritisation and flow control of their requests.

Speaking of prioritisation, that’s my second concern with the standard — the traffic management features are quite complicated and rely heavily on servers to use them effectively. A standard webserver sending out static HTML can probably have some effective logic to, for example, push linked images and stylesheets — but even this isn’t easy when you consider that the resources are linked in URL space, while the webserver has to locate the corresponding content in filesystem space. Life is especially complicated when you consider requests being served by some load-balanced cluster of servers.

But even aside from those complexities, today’s web is more dynamic than static. This means that web frameworks are likely going to need serious overhauls to give their application authors the chance to take advantage of HTTP/2’s more advanced features; and said authors are similarly going to have to put in some significant work on their applications as well. Some of these facilities, such as header compression, come “for free”, but by my estimation most of them do not.

My third annoyance with the standard is its continued reliance on TCP. Now I can see exactly why they’ve done this — moving to something like UDP or SCTP would be a much more major shift and have made it rather harder to maintain any kind of HTTP/1.1 compatibility. But the fact remains that implementing a packet-based protocol on top of a stream-based protocol on top of a packet-based protocol is anything but elegant. I have all sorts of vague concerns about how TCP and HTTP/2 features could interact badly — for example, what happens if a client consistently chooses frame sizes that are a few bytes bigger than the path MTU? How will the TCP and HTTP/2 window sizes interact on slow connections? I’m not saying these are critical failings, I can just see potential for annoyances which will need to be worked around there.2

My final issue with it is more in the nature of an opportunity cost. The standard resolves some very specific problems with HTTP/1.1 but totally missed the chance to, for example, add decent session management. Cookies are, and always have been, a rather ugly hack — something more natural, and more secure by design, would have been nice.

Still, don’t get me wrong — HTTP/2 definitely looks set to solve some of HTTP/1.1’s thorny performance problems, and assuming at least the major web services upgrade their sites intelligently, we should see some improvements once support is rolled out more widely.


  1. And a startlingly large number of people appear to have done so. 

  2. As an aside, it appears that Google have had this thought too, as they’re currently working on another protocol called QUIC, which replaces TCP with a UDP-based approximate equivalent designed to carry HTTP/2 with better performance. 
