HTTP/3 in Practice — HTTP/3

13 Apr 2023 at 9:30AM in Software
Photo by Jack Hunter on Unsplash

The second article covering my attempt to implement a HTTP/3 server from scratch in Rust. Having looked at the QUIC protocol at length in the previous article, this one sees how HTTP/3 is implemented atop it.

This is the 2nd of the 2 articles that currently make up the “HTTP/3 in Practice” series.


In the previous article, we took a fairly in-depth tour of QUIC, the transport protocol that HTTP/3 is built on top of instead of TCP, as with previous HTTP versions.

This article looks at how HTTP/3 uses QUIC to implement HTTP semantics. Since QUIC was itself designed with this goal in mind, I was expecting this to be a significantly shorter article, since it’s just mapping simple HTTP semantics on to a transport designed for it. It turned out quite a bit longer than I originally expected, however — partly that’s because of some brief(ish!) digressions into previous HTTP versions for context, but partly because HTTP/3 has some complexities, especially around the way it compresses request and response headers.

I am going to assume that you’ve at least got a good working understanding of HTTP/1.1 and at least a passing acquaintance with HTTP/2 — if you’re not familiar with HTTP/2 then you might like to read the article I wrote a few years ago about it.

In case you’re short on time, the tl;dr of this article is that HTTP/3 maintains the same request methods, status codes and header fields as previous versions, but maps these concepts to the underlying transport in a different way. If that’s enough detail for you then no need to thank me for giving you back a few minutes of your life you might otherwise have spent reading this article. If you’re interested in drilling into things in a little more detail then this is the article for you — read on!

A Brief History of HTTP

I believe it’s often helpful to know a little historical context of how a system has evolved to understand some of its behaviours today, and in this section I’ll run through a very brief summary of how HTTP has evolved since its first incarnation in 1991. You may already be quite familiar with this, or might simply disagree with me that this is useful, and if so I suggest you skip to the section on HTTP Semantics, which briefly outlines the high-level semantics of HTTP, which remain the same in HTTP/3, or skip that as well and jump straight to the Tour of HTTP/3 section.


In principle there have been four major versions of HTTP in common usage. The first version was HTTP as it was defined in the original 1991 proposal by Tim Berners-Lee — it consisted of making a TCP connection, sending a single GET /path/to/file request and then disconnecting. These days, this is generally known as HTTP/0.9. It was never described by any rigorous standard, so it’s questionable whether it even qualifies as a version.


The second version, HTTP/1.0, was the first to be properly defined in RFC 1945, which was published in 1996. It defines a much more recognisable request and response format, modelled after MIME, and it specifies multiple request methods: GET, HEAD and POST. It also added the HTTP version field, as well as specification of the content type and the status codes that we still use today.


This was quite quickly replaced by HTTP/1.1 in RFC 2068, released only a year later. This addressed some concerns with the original specification, and I think was regarded by many at the time as the proper HTTP/1 specification. This made the Host header mandatory, which was important to enable the now common practice of hosting multiple domains on a single IP address, and it also added persistent connections to address the increasing waste of requiring the TCP handshake for every resource on increasingly multimedia websites.

HTTP/1.1 also added a number of other features which were less used at the time, such as 100 Continue semantics and a slew of additional request methods, namely PUT, PATCH, DELETE, CONNECT, TRACE and OPTIONS. Some of these features were initially little-used, but most did find later use. Amazon S3, for example, makes good use of 100 Continue, and the additional methods were later quite useful for use in RESTful APIs.

HTTP/1.1 has been remarkably stable and is still in very wide use today. Partly this has been down to a solid initial design, but it has also been updated on a few occasions to address some minor issues:

  • RFC 2616, released in 1999, made a series of small changes, tidying up some wording and clarifying some ambiguous details. There were also enhancements in some cases, such as cache control and transfer encodings.
  • In 2014 the specification was broken down into six smaller RFCs, as well as some changes in each one:
    • RFC 7230 Message Syntax and Routing clarified some details and deprecated multiline headers.
    • RFC 7231 Semantics and Content made primarily minor editorial improvements, but did make some minor changes to the semantics of some headers.
    • RFC 7232 Conditional Requests made very few changes, such as clarifying a few points around ETag headers.
    • RFC 7233 Range Requests similarly made only a few very minor tweaks.
    • RFC 7234 Caching improved the clarity of some descriptions of what can be cached, and when caches should be invalidated.
    • RFC 7235 Authentication rolled RFC 2617, specifying Basic and Digest authentication methods, into the main HTTP/1.1 specification and added an IANA registry for new authentication schemes.
  • In 2022 the standards were reorganised again, which I’ll discuss in the HTTP/3 section below.


Despite HTTP/1.1’s longevity, there was eventually a newer version HTTP/2, published as RFC 7540 in 2015. As I mentioned earlier, I discussed this in a little more detail previously, so I’ll try to be brief. This standard doesn’t really mess with the request/response semantics, but what it does try to do is address some of the underlying transport performance issues. It does this in several ways:

Request multiplexing
Serialisation of responses can significantly slow down delivery of the key data required to render the page, particularly if one of them proves to be particularly large. Browsers have worked around this by using multiple TCP connections to a server, but HTTP/2 solves this more gracefully by multiplexing multiple streams on a single connection.
Request priority
HTTP/2 allows servers to set the priority of responses, so the key resources that browsers will need first can be delivered as quickly as possible.
Compression
In HTTP/1.1 servers had an optional means to support GZip compression, but only with client support — in HTTP/2 this can be done implicitly. In addition, request and response headers were also compressed using a binary representation called HPACK. HTTP/3 uses a similar but different approach known as QPACK, which we’ll see later in this article.
Server push
Servers can anticipate resources that the client may request next and actively push them to reduce the overall load time.

This brings us almost up to date. HTTP/2 adoption grew for several years, rising from around 16% of websites in August 2016 until plateauing at around 65-70% in November 2020. Adoption rose most quickly in the most popular websites, unsurprisingly, but there’s still a long tail of smaller websites which don’t support it, which is a testament to how effective HTTP/1 still is. There’s also been a small but puzzling drop of support in the last few years, and I’m not quite sure of the reason for that — perhaps site admins don’t see the point in maintaining HTTP/2 support once they’ve added HTTP/3, but that’s just a guess. The vast majority of browsers support HTTP/2, at least for those users who’ve updated them at some point since 2015.


HTTP/3 was released most recently, its very first draft being released at the end of 2016 under the title “HTTP over QUIC”, and being referred to as HTTP/3 from the end of 2018. It became a full Standards Track RFC in June 2022. As with HTTP/2, this version of the standard is primarily concerned with improving the transport performance by shifting the underlying transport from TCP to QUIC — as well as the performance benefits, this also enshrines TLS as a core and mandatory part of the standard. It has seen increasing support among websites since early 2020, reaching around 20% of websites at time of writing, and almost 30% among the top 1000 sites.

Since the improvements in page load latency should be more substantial than with HTTP/2, I’m hopeful that there’s more incentive for sites to update — but it’s still a very young standard in the scheme of things, and there’s probably a ton of proxies and other middleboxes that don’t support it properly yet, which may be a barrier for some time to come. Whilst updating of browsers is fairly fast, due to the automatic update process, updating websites is much more driven by the perceived benefits and these are potentially less compelling for smaller sites. Updating middleboxes is slowest of all — many companies may have hardware that’s outside its support contract and no longer receiving substantive firmware updates, and many vendors may prefer to use HTTP/3 support to incentivise customers to upgrade to newer hardware as opposed to adding it to existing device firmware. These costs must be justified by benefits, and frankly a lot of websites are “good enough”.

In any case, I appear to have drifted from the history to the future of HTTP, so it’s time to move on. Before I do so, I’ll just mention one more thing, which is that the HTTP/1.1 RFCs mentioned above have been once again obsoleted by a reorganisation of the standards which happened in 2022, at the same time that HTTP/3 became an RFC. There are now two standards which are independent of the HTTP version in use:

  • RFC 9110 defines the core HTTP semantics, which are consistent across versions.
  • RFC 9111 defines rules around caching.

Each specific version then has its own RFC for the version-specific aspects:

  • RFC 9112 covers HTTP/1.1.
  • RFC 9113 covers HTTP/2.
  • RFC 9114 covers HTTP/3.

The rest of the discussion in this article will be based on these standards, and not give any regard to the previous versions now obsoleted.

HTTP Semantics

Before getting into the aspects of HTTP which change with HTTP/3, I’ll run through a quick summary of the current state of HTTP semantics, as outlined in RFC 9110. These are independent of the HTTP version in use, and some understanding of them is important to comprehend how HTTP/3 applies them to QUIC. If you’re already very familiar with previous HTTP versions then I doubt there’s anything much for you to learn here, so you can skip ahead to the Tour of HTTP/3 if you like.

Requests and Responses

HTTP is a client/server protocol where a client requests resources from a server, where resources are identified by URLs — a resource is intentionally kept vague, but will typically either be a file or dynamically-generated data. A client forms a connection and makes requests, and the server responds to each request with some representation of the requested resource, or an error indicating why it cannot or will not do so.

HTTP is a stateless protocol, so each request message must contain the full context required for the server to process it, and each response must only depend on the specific request to which it’s responding, plus any server-side state that might also be relevant. No state is maintained between requests on the same connection, and indeed HTTP doesn’t generally assume that all requests on a single connection come from the same client — for example, an intermediate proxy might maintain a single connection to a server on behalf of multiple clients.

HTTP messages, both requests and responses, consist of:

  • Control data — this is the request or response line in HTTP/1.1
  • One or more header fields
  • Optionally, some content data
  • Optionally, trailing header fields

In HTTP/1.1 the control data was the initial line of the request or response and the header fields were on subsequent lines, but HTTP/2 and HTTP/3 have their own ways to include this information.

Header fields are name/value pairs, and the HTTP specification defines the meaning of many headers — going through them all is rather beyond the scope of this brief summary. Request headers typically specify things like the format of data expected in response, information about any content sent with the request, and some cache control information to save the server responding if the client already has the latest version of the resource cached. Response headers specify information about the returned content, redirections to other locations to find the resource, and information about how long the returned data may be cached.

Here’s a simple HTTP/1.1 request, which doesn’t contain any content — note the use of ␍␊ (\r\n) as line delimiters, which is specific to HTTP/1.1 and only used in the header section.

GET /blog/posts/2023/index.html HTTP/1.1␍␊
User-Agent: curl/7.79.1␍␊
Accept: */*␍␊

Here is a typical response to that request in HTTP/1.1:

HTTP/1.1 200 OK␍␊
Server: nginx/1.18.0 (Ubuntu)␍␊
Date: Thu, 06 Apr 2023 10:26:41 GMT␍␊
Content-Type: text/html␍␊
Content-Length: 27325␍␊
Last-Modified: Tue, 28 Mar 2023 07:53:03 GMT␍␊
Connection: keep-alive␍␊
ETag: "64229cdf-6abd"␍␊
Accept-Ranges: bytes␍␊
<!DOCTYPE html>
<html lang="en">

Status Codes

The 200 in the response above is the status code of the response, which is a general indication of the success or failure of the request. These status codes have the same semantics across all versions of HTTP, though representations of them may differ. The first digit of the code indicates the general type of status:

1xx — Informational
Request has been received, but processing is continuing. The only codes in this category defined in the RFC are 100 Continue, which indicates the client should continue to send the full request, and 101 Switching Protocols, which is used with an Upgrade header to switch to a later HTTP version or some other protocol entirely.
2xx — Success
Request has been successfully processed. The specific 2xx code depends on the type of request, but 200 OK is the most common response, typically with content included which represents the entire resource. Another common example is 206 Partial Content, which is used when the server responds to a range request and the content only includes the specified portion of the resource representation.
3xx — Redirection
The client should request this resource somewhere else. There are multiple possibilities here, but the common ones are either 301 Moved Permanently or 302 Found in conjunction with a Location header indicating the new URL of the resource. The difference between these is whether the client should use the new URL in future requests (for 301), or continue to use the old URL (for 302) as this new URL is only temporary.
4xx — Client Error
These codes are used when a client appears to have made an invalid request. Common examples are 404 Not Found, where a URL does not map to any resource the server recognises, and 400 Bad Request, where the client’s request syntax appears to be invalid.
5xx — Server Error
These codes indicate that the client request was valid and for a recognised resource, but the server is unable to fulfill it due to a server-side issue. Commonly this might be 500 Internal Server Error if, say, some component on the server has crashed, or 503 Service Unavailable, if the server would normally be able to respond but is currently overloaded or undergoing scheduled maintenance.

§15 of RFC 9110 has a full list of all 46 specific status codes defined at time of writing.

Content Representation

Whilst a resource is an abstract and generic concept in HTTP, it must have some sort of representation as a series of bytes to be sent down the connection to the client. How a resource is mapped to the bytes that represent it is a topic with some subtleties, which I’ll try to briefly summarise here.

Let’s start by running through the various response header fields which are relevant to the content in the response — this isn’t an exhaustive list, it just includes the basic options which affect how the response is processed.

Content-Type
Indicates the media type of the representation, as per the registry maintained by IANA. Examples would be things like text/html and image/png.
Content-Encoding
Indicates what additional encodings, if any, have been applied to the raw representation specified by Content-Type. A common example is gzip, which indicates that the server is sending a gzip-compressed version of the data.
Content-Length
Indicates the number of bytes in the representation attached. Note that this is after any Content-Encoding has been applied, so if the content is gzipped then this size will be the compressed size, for example.
Transfer-Encoding
This shouldn’t really be in this list, since it’s specific to HTTP/1.1, but I wanted to include it since that might not be immediately obvious. In HTTP/1.1 this header indicates encodings which have been applied to this specific message content and not the underlying representation of the resource. Intermediate proxies and the like are at liberty to change the transfer encoding, but not the content encoding. Common examples in HTTP/1.1 include chunked and gzip, and it’s also important to note that this header is mutually exclusive with Content-Length. The header has very limited use in HTTP/2 and is not permitted in HTTP/3 at all.
Finally, there are a few headers which indicate metadata about the resource or its representation, typically to help with caching. Common examples include Last-Modified, which specifies a timestamp at which the resource was last changed, and ETag, which specifies a checksum or similar opaque value which is expected to change if the resource is altered. This allows clients to make conditional requests with headers reflecting these values back, allowing servers to respond with 304 Not Modified if the client can safely continue to use a cached version of the resource — this improves performance whilst allowing cached copies to be invalidated promptly when required.

Features Not Discussed

These feel like the important aspects required to understand HTTP. Whilst there is, of course, significantly more detail in the RFC, there are also a number of additional topics I’ve chosen not to discuss at all. A rough list of these, and the section in RFC 9110 in which to find them, is shown below.

  • Message context fields (§10), such as the client Expect header.
  • HTTP authentication (§11).
  • Content negotiation (§12), to select appropriate languages and file formats for the client.
  • Conditional requests (§13), to allow clients to perform cache invalidation.
  • Range requests (§14), to allow clients to fetch subsets of large representations across multiple requests.

Tour of HTTP/3

So now to the main topic of this article, the HTTP/3 protocol itself. It borrows heavily from the changes that HTTP/2 already made, and essentially just maps those semantics to a QUIC transport instead of TCP.

Over the following sections, we’ll try to answer these questions:

  • How do clients discover that servers support HTTP/3?
  • How are HTTP messages structured in HTTP/3?
  • How are messages mapped to QUIC streams?

Connecting HTTP/3

Let’s start at the beginning. Presuming that a browser or other client supports HTTP/3, it can’t assume that every server does — and servers can’t possibly afford to drop HTTP/1 and HTTP/2 support until they’re certain every single browser supports HTTP/3. Given all this, how does a client discover whether a given URL should be fetched over HTTP/3 or not?

For HTTP/2 this situation was resolved by making an initial HTTP/1.1-style connection and then upgrading if the server supports it. This is done using the Upgrade header if using the http scheme, although I believe more or less all web browsers have decided to only support HTTP/2 if using https. In that case, a TLS extension Application Layer Protocol Negotiation (ALPN), specified in RFC 7301, is used. For HTTP/3, however, the use of QUIC means an inline upgrade won’t work, since QUIC is UDP-based rather than TCP so the existing connection is useless.

The first thing to note is that since QUIC always uses TLS, only the https scheme is valid for use with HTTP/3 — so any URL which uses http is only going to be TCP-based. Beyond this, there currently seem to be two ways to discover HTTP/3 support, and we’ll look at both below.

HTTP Alternative Services

The first approach relies on a mechanism specified in RFC 7838 called HTTP Alternative Services. This uses a HTTP header Alt-Svc to advertise an “alternative service” which the server would prefer the client use if possible. This can be used to direct to other hosts, ports and protocols — it is the client’s decision whether to open a new connection and make the request again on the alternative service.

You’ll need to read the RFC if you want the full details, but to illustrate the principle, here’s an example of a possible such header:

Alt-Svc: h3="other.example.com:8003"; ma=3600, h2=":8002"; ma=3600

This means that the server would prefer that the client reconnect using HTTP/3 to host other.example.com (an example hostname) on port 8003. Failing that, it would prefer the client connect using HTTP/2 to the same host as the current request but on port 8002. The ma=3600 indicates that both these preferences can be cached for an hour (3600 seconds). The protocol names h2 and h3 come from the IANA registry for TLS extensions as they’re the same names used for ALPN.

One downside of this approach is that it adds latency to the connection, since the client must set up TLS over TCP before making the HTTP/1.1 request whose response contains this header — since one of the prime benefits of QUIC is a reduction in latency, this feels quite counterproductive.

It’s also worth noting that the same approach can be used with a new frame type ALTSVC in HTTP/2, although since HTTP/2 is typically already upgraded from HTTP/1.1 anyway, I don’t see a great deal of value in this at present.
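To illustrate the structure of the header, here’s a rough sketch of parsing an Alt-Svc value into its component services. This is a simplification for illustration only — it ignores quoted-string edge cases and the special clear sentinel defined in RFC 7838:

```python
def parse_alt_svc(value: str) -> list[dict]:
    """Rough parse of an Alt-Svc header value into alternative services.

    Each comma-separated alternative is "protocol=authority" followed by
    optional ";key=value" parameters such as ma (max age in seconds).
    """
    services = []
    for alternative in value.split(","):
        parts = [p.strip() for p in alternative.split(";")]
        protocol, authority = parts[0].split("=", 1)
        entry = {"protocol": protocol, "authority": authority.strip('"')}
        for param in parts[1:]:
            key, _, val = param.partition("=")
            entry[key] = val
        services.append(entry)
    return services
```

Feeding this a value like 'h3=":8003"; ma=3600, h2=":8002"; ma=3600' yields one entry per alternative, with the empty host in ":8003" meaning “same host as the current request”.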


HTTPS DNS Records

This is where the second approach comes in, however, and it relies on a new HTTPS DNS record type. This approach is specified in Internet Draft draft-ietf-dnsop-svcb-https, which is at version 12 at time of writing.

The standard defines two new DNS record types, SVCB (“Service Binding”) and HTTPS — the latter is essentially a special case of the former, and since it’s the one you should use for HTTP services, that’s what we’re going to look at here. The principle is actually quite similar to the Alt-Svc header mentioned above, but it’s a lot faster for a client to do a quick check for a DNS record than it is to run through all the TLS handshake and make a request.

Each HTTPS record has the following attributes:

Name
As with all DNS records, they have a name which specifies the domain name to which they apply.
TTL
Similarly, all DNS records have a TTL which indicates for how long they can be cached. To avoid overheads, I imagine this will be fairly high for HTTPS records, which won’t be expected to change much.
Priority
This and the remaining fields are specific to HTTPS records. This one indicates the priority of this service, which allows multiple alternatives to be specified at different priorities.
Target
This is the hostname of the alternative service, or a full stop (.) to indicate that it is the same as the name of this record.
Parameters
An optional set of key/value pairs which specify other attributes of this alternative service. The important one of these is alpn, which specifies the protocols that can be used on this alternative service. The port parameter can also be used to specify a different port to use for this service.

Let’s see an example of this — I’m using the dnspython library in these snippets to look up and decode the HTTPS records. First let’s look up:

>>> import dns.resolver
>>> ans = dns.resolver.resolve("", "HTTPS")
>>> ans.rrset[0].to_text()
'1 . alpn="h2,h3"'

There’s only a single record of priority 1 which specifies . as the target — hence still the same hostname in this example — and the only parameter specified is alpn="h2,h3", which means that you should be able to make requests with HTTP/2 or HTTP/3 to port 443 of their hostname.

A slightly more interesting example comes courtesy of

>>> ans = dns.resolver.resolve("", "HTTPS")
>>> ans.rrset[0].to_text()
'1 . alpn="h3,h3-29,h2" ipv4hint="…" ipv6hint="…"'

(Output manually wrapped and indented for readability)

This is saying something similar, although you can see that h3-29 is included — this means the version of HTTP/3 specified by draft 29 of the specification, before it became an RFC. You can also see the ipv4hint and ipv6hint parameters, which are a time-saver so the client doesn’t need to do additional DNS lookups for the A and/or AAAA records for the domain.

This DNS-based approach does not appear to have caught on massively so far, mind you. I found a handy list of the top 1000 websites in CSV form (citation needed) and did a DNS lookup of HTTPS records on each of them. Only 127 of them had such records at all, and only 58 of them mentioned “h3” in their HTTPS record. Hopefully this approach will become more popular over time, because it’s a real shame to think the latency improvements of HTTP/3 will be largely nullified by having to make a HTTP/1.1 connection first.

Connection Establishment

Once the client has decided to set up a HTTP/3 connection, it follows the usual QUIC connection process as described in the previous article.

There are a couple of specific considerations when using HTTP/3 over QUIC. The first is that the server name must be sent to the server, as with the Host header in HTTP/1.1. This is done using Server Name Indication (SNI), which is specified in §3 of RFC 6066.

The second consideration is that the h3 token must be specified in the ALPN TLS extension, as mentioned earlier in this article in the context of HTTP/2. The client can also offer other protocols if it wishes.

Once the connection is made, both endpoints create a unidirectional control stream and send a SETTINGS frame down it to negotiate connection settings — streams are discussed in the next section, and frame types are discussed in the section after it.


Streams

Now that we’ve looked at the ways a client might decide to connect with HTTP/3, we’ll look at the different QUIC streams that it creates.

Bidirectional Streams

For requests and responses, HTTP/3 uses bidirectional QUIC streams, always created by the client. Since QUIC allows each endpoint to limit the number of concurrent streams, the HTTP/3 standard expects that each end configures at least 100 concurrent bidirectional streams at a time, to avoid reducing performance by reducing parallelism.

A HTTP message, either request or response, consists of up to three sections:

  • A HEADERS frame, containing control data and initial headers
  • Optionally, one or more DATA frames containing message content
  • Optionally, a single HEADERS frame containing trailing headers

The client initiates the stream and sends a single request — each request stream should only ever contain a single request. The server sends its response, and then the stream is closed.

We’ll look at the specifics of these frame types a little later in this article.

Unidirectional Streams

Unlike bidirectional streams, unidirectional streams may be created in either direction. Also, they are used for multiple different purposes, so the first thing sent on each stream is a single variable-length integer (encoded as per §16 of the QUIC RFC) indicating the stream type.

The initial standards define four stream types:

ID Type
0x00 Control
0x01 Push
0x02 Encoder
0x03 Decoder
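The stream type prefix here — like the frame headers we’ll meet shortly — uses QUIC’s variable-length integer encoding, which is simple enough to sketch in a few lines. This is a minimal illustrative decoder, not production code:

```python
def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a QUIC variable-length integer (RFC 9000 §16).

    The two high bits of the first byte select the total encoded length
    (00=1, 01=2, 10=4, 11=8 bytes); the remaining bits form a big-endian
    value. Returns (value, offset past the integer).
    """
    first = data[offset]
    length = 1 << (first >> 6)      # 1, 2, 4 or 8 bytes in total
    value = first & 0x3F            # mask off the two length bits
    for i in range(1, length):
        value = (value << 8) | data[offset + i]
    return value, offset + length
```

For example, the single byte 0x25 decodes to 37, and the four-byte sequence 0x9d7f3e7d decodes to 494878333 — both test vectors from Appendix A.1 of RFC 9000.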

The last two stream types are defined in a separate document, RFC 9204, which specifies the QPACK header compression technique used to save bandwidth on commonly repeated header text. The use of these streams is discussed as part of the Header Compression section later in this article.

Control streams are used for messages which apply to the connection as a whole, not just a single request or response. Both endpoints create a single such stream at the start of the connection, and the first frame on each should be a SETTINGS frame to negotiate connection-specific settings. There are additional frame types which can be sent on this stream, such as GOAWAY to initiate a graceful connection shutdown, and these are discussed later in this article.

Push streams can only be initiated by the server and are used for the optional “server push” feature that was introduced in HTTP/2, although HTTP/3 uses different mechanisms to implement the same principle. The use of this stream is discussed in more detail in the later Server Push section.


Frames

HTTP/3 uses frames to carry all information, where frames are serialised across QUIC streams. A HTTP frame header consists of simply a type and a length, both of which are variable-length integers as per §16 of the QUIC RFC, and these are immediately followed by the frame payload.

Different frames are only valid on some stream types. The frame types are:

ID Type Streams
0x00 DATA Request & Push
0x01 HEADERS Request & Push
0x03 CANCEL_PUSH Control
0x04 SETTINGS Control
0x05 PUSH_PROMISE Request
0x07 GOAWAY Control
0x0d MAX_PUSH_ID Control


HEADERS & DATA

As mentioned earlier, request streams use HEADERS and DATA frames to encapsulate requests and responses. The use of the HEADERS frame is tied in with the encoder and decoder streams, and specified in a different RFC, so I’ve discussed that all together in the Header Compression section a little later in this article.

The DATA frame is simply the frame header (type and length) followed by a series of bytes which form the data content. The earlier HEADERS frame contains all the context required to process the request or response, the DATA frames just contain raw content. A given response may span as many DATA frames as necessary to transfer it.

Note that neither HTTP/2 nor HTTP/3 allows the use of chunked encoding, as HTTP/1.1 does. It is not required, as each request/response consumes a single stream, so in the absence of a Content-Length header, closing the request stream can be used to indicate the end of the response. As such, the Content-Length header is really just advisory, to allow clients to implement download progress bars or similar — however, the RFC does say that endpoints SHOULD provide it if the size of the content is known in advance.
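As a sketch of how small the framing is, here’s what building a DATA frame might look like, with a varint encoder following §16 of the QUIC RFC. This is illustrative only — a real implementation would stream large payloads rather than buffering them:

```python
def encode_varint(value: int) -> bytes:
    """Encode a QUIC variable-length integer (RFC 9000 §16)."""
    for bits, prefix in ((6, 0x00), (14, 0x40), (30, 0x80), (62, 0xC0)):
        if value < (1 << bits):
            data = value.to_bytes(1 << (prefix >> 6), "big")
            # Fold the two-bit length marker into the first byte
            return bytes([data[0] | prefix]) + data[1:]
    raise ValueError("value too large for a varint")

def encode_data_frame(payload: bytes) -> bytes:
    """Build an HTTP/3 DATA frame: type 0x00, then length, then payload."""
    return encode_varint(0x00) + encode_varint(len(payload)) + payload
```

So encode_data_frame(b"hello") yields the bytes 00 05 followed by the payload — just two bytes of overhead for a small frame.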


SETTINGS

A SETTINGS frame consists of the usual type and length, and the remainder of the frame consists of pairs of variable-length integers. The first of the pair is a numeric identifier for the setting — these values are specified in a registry managed by IANA. The second integer is the value itself — all settings are numeric.

Every setting has a default, and endpoints should use those defaults initially until receiving the SETTINGS frame from their peer. Clients do not need to explicitly wait for SETTINGS from the server, but they should process all received traffic before sending anything, just to maximise their chances of seeing the SETTINGS frame first. Also, clients using 0-RTT QUIC traffic should use settings from the previous session rather than the defaults, although of course these should be updated by any SETTINGS frame subsequently received from the server.

The main RFC defines only a single setting, SETTINGS_MAX_FIELD_SECTION_SIZE, which allows an endpoint to specify an upper bound on the size of header it will accept on each HTTP message — by default there is no such limit.

The RFC covering how header fields are encoded also defines two more settings, QPACK_MAX_TABLE_CAPACITY and QPACK_BLOCKED_STREAMS, and these are discussed in the Header Compression section below.
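To make the wire format concrete, here’s a rough sketch of decoding a SETTINGS payload into a dictionary, using a varint decoder per §16 of the QUIC RFC — for instance, the three bytes 06 44 00 decode to setting identifier 0x06 (SETTINGS_MAX_FIELD_SECTION_SIZE) with value 1024:

```python
def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a QUIC variable-length integer (RFC 9000 §16)."""
    first = data[offset]
    length = 1 << (first >> 6)
    value = first & 0x3F
    for i in range(1, length):
        value = (value << 8) | data[offset + i]
    return value, offset + length

def parse_settings(payload: bytes) -> dict[int, int]:
    """Parse a SETTINGS frame payload: (identifier, value) varint pairs."""
    settings: dict[int, int] = {}
    offset = 0
    while offset < len(payload):
        ident, offset = decode_varint(payload, offset)
        value, offset = decode_varint(payload, offset)
        settings[ident] = value
    return settings
```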


CANCEL_PUSH, PUSH_PROMISE & MAX_PUSH_ID

These are discussed in the later Server Push section.


GOAWAY

This frame can be sent by either endpoint at any point to initiate a graceful shutdown of the connection. If sent by the server, it includes the client-initiated stream ID indicating the final stream that the server has handled, or still intends to handle. If sent by the client, the highest push ID is sent — the concept of push IDs will be explained later in the Server Push section, but suffice to say this indicates the final server-initiated push that the client intends to handle.

The sender of the GOAWAY then rejects any additional streams beyond this limit, and the receiver should not initiate any more such streams.

This approach aims to allow both endpoints to have a consistent idea of which streams were handled before the QUIC connection itself is torn down. That said, the endpoint isn’t under an obligation to tear the connection down after a GOAWAY — it can simply leave it to become idle and be closed later.

Putting It All Together

We now know enough to see what a simple connection and first couple of requests might look like in HTTP/3. The diagram below starts once the QUIC connection is established between a client and server, and shows it requesting an index.html and associated style.css.

HTTP/3 Simple Requests

This diagram is somewhat simplified, however, as it assumes no use of server push and ignores any additional communication on the streams used for header compression — we’ll look at both these mechanisms in the following sections.

Header Compression

In this section we’ll look at how header values are transmitted in HEADERS frames. A technique called QPACK is used for compressing these headers, which is specified in a separate document, RFC 9204. This is similar to the situation in HTTP/2 where a method called HPACK was used, specified in RFC 7541. Since HPACK relies on compressed field sections being transmitted in-order, and this can’t be guaranteed by QUIC across different streams, the QPACK method was developed instead.

Control Data

The HEADERS frames carry both control data and header fields. The control data is carried by mapping it into pseudo-headers, which start with a colon (:). For requests, the following pseudo-headers are defined:

:method
Mandatory for all requests, this specifies the request method, such as GET or POST.
:scheme
Mandatory for all requests except CONNECT. Specifies the scheme from the request URL (e.g. https).
:path
Mandatory for all requests except CONNECT. Contains the path and query parts of the request URL.
:authority
Optional, but should usually be specified. Contains the authority portion of the URL, and is loosely equivalent to the Host header in HTTP/1.1.

For responses, only a single pseudo-header is defined, which is mandatory:

:status
The status code of the response (e.g. 200), as per the core HTTP semantics in RFC 9110.

Mapping Headers to IDs

Header strings must be mapped to integer IDs to be used within the HEADERS frame to identify header fields. To do this mapping, QPACK uses two different tables:

  • A static table for common headers, which is fixed within the RFC.
  • A dynamic table for all other headers, which is constructed over the course of the connection as we’ll see below.

Entries in the tables can refer to just the name of the field, or a name/value pair, for maximal compression of common values. For example, the value of the :method pseudo-header is an enumeration with low cardinality, so each name/value pair is represented as a unique value in the static table.
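To illustrate, here's a tiny excerpt of the static table with a lookup helper — the indices are my reading of RFC 9204 Appendix A, so do check the RFC before relying on them:

```rust
/// A few entries from the QPACK static table, showing how each common
/// :method value gets its own dedicated name/value pair.
const STATIC_TABLE_EXCERPT: &[(u64, &str, &str)] = &[
    (0, ":authority", ""),
    (1, ":path", "/"),
    (15, ":method", "CONNECT"),
    (16, ":method", "DELETE"),
    (17, ":method", "GET"),
    (20, ":method", "POST"),
    (23, ":scheme", "https"),
    (25, ":status", "200"),
];

/// Look up a name/value pair, returning its static table index if present.
fn static_index(name: &str, value: &str) -> Option<u64> {
    STATIC_TABLE_EXCERPT
        .iter()
        .find(|&&(_, n, v)| n == name && v == value)
        .map(|&(i, _, _)| i)
}
```

So a `GET` request's method pseudo-header compresses to a single small index, with no literal text on the wire at all.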

The encoder is responsible for converting header names and values to their wire representations, which can be mapped IDs or plain literal representations. It is also responsible for maintaining the dynamic table, which is stored separately at each endpoint. The encoder uses the encoder stream to transmit instructions to the decoder in the other endpoint to add entries to the dynamic table, allowing the decoder to do its job.

The decoder component is responsible for taking the compressed representations that the encoder generates and converting them back to textual names and values for the application to consume. It does this by referring to both the static and dynamic table entries that have been built up over the connection. It uses a separate decoder stream to send back acknowledgements.

Since both endpoints need to send headers — request headers in one case, response headers in the other — they both have an encoder and decoder, and so there are two encoder streams and two decoder streams. In the next subsection we’ll look at how the tables are used to actually encode a HEADERS frame, and in the subsection after we’ll look at how the streams are used to synchronise the encoder/decoder pairs.

Encoding HEADERS

To populate a HEADERS frame, two basic data types are used:

Prefixed integers
These were specified in the §5.1 of the HPACK RFC 7541 for HTTP/2, and the same type is used for QPACK in HTTP/3. I won’t go into the details here as they’re a little involved, but it’s a way of encoding integers in a variable number of bytes such that smaller values require fewer bytes — you can read the RFC if you want the gory details. In the rest of this document, if I say an “N-bit prefix integer”, it means that this prefix encoding is used — if the value fits within those N bits, then that’s all the space it takes, but if not then additional bytes are used as required.
String literals
Once again, these are as specified in HPACK, this time in §5.2. These can be specified as characters, or using Huffman encoding — the leading bit of the first byte indicates which is used. The remainder of the first byte starts a 7-bit prefix integer indicating how many following bytes are used for the value. If the value is Huffman-encoded, a static lookup table is used which was generated by analysing lots of HTTP header values — this is specified as appendix B of RFC 7541.
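For the curious, the prefix integer scheme is simple enough to sketch in a few lines — this follows my reading of §5.1 of RFC 7541, with names of my own choosing. The caller would normally OR flag bits into the top of the first byte:

```rust
/// Encode `value` as an N-bit prefix integer, returning the bytes.
fn encode_prefix_int(mut value: u64, n: u8) -> Vec<u8> {
    let max_prefix = (1u64 << n) - 1;
    if value < max_prefix {
        return vec![value as u8]; // fits entirely within the N-bit prefix
    }
    // Otherwise fill the prefix, then emit 7-bit continuation groups.
    let mut out = vec![max_prefix as u8];
    value -= max_prefix;
    while value >= 128 {
        out.push((value % 128) as u8 | 0x80); // high bit = more bytes follow
        value /= 128;
    }
    out.push(value as u8);
    out
}

/// Decode an N-bit prefix integer, returning (value, bytes consumed).
fn decode_prefix_int(buf: &[u8], n: u8) -> Option<(u64, usize)> {
    let max_prefix = (1u64 << n) - 1;
    let mut value = (*buf.first()? & max_prefix as u8) as u64;
    if value < max_prefix {
        return Some((value, 1));
    }
    let mut shift = 0u32;
    for (i, &b) in buf[1..].iter().enumerate() {
        value += ((b & 0x7f) as u64) << shift;
        if b & 0x80 == 0 {
            return Some((value, i + 2));
        }
        shift += 7;
    }
    None // ran out of bytes mid-integer
}
```

This reproduces the worked example in appendix C of the RFC: the value 1337 with a 5-bit prefix encodes to the three bytes 31, 154, 10.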

The HEADERS frame itself consists of the usual frame header that we saw earlier (frame type and length) followed by a field section. The field section itself starts with a header containing two values, which are then followed by the fields themselves — the encoding of the fields use the static and dynamic tables mentioned earlier. The two values in the header are:

Required insert count
Specifies the state of the dynamic table as used by the encoder. Since the dynamic table changes over time, as the name implies, it’s important that the decoder is using a dynamic table which has all the entries the encoder may have used. If the dynamic table doesn’t match, the decoder must block until instructions arrive from the encoder to bring it up to date — the way the table is updated is described in the next section. The required insert count is encoded as an 8-bit prefix integer, using some slightly complicated modulo arithmetic to limit the size of the count, as it may become large on long-lived connections.
Relative index base
Entries in the dynamic table can be referenced by both absolute and relative index. To use relative indexing, a base for the relative offset is required, and this base index is provided by the second value in the header. To save space, the base is encoded as a delta to the required insert count value above — it consists of a single sign bit followed by a 7-bit prefix integer, which is added or subtracted from the required insert count as per the sign bit.
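As a sketch, recovering the base from the two header values might look like the following — the exact formula, including the off-by-one on the negative side, is my reading of RFC 9204 §4.5.1.2:

```rust
/// Recover the Base for relative indexing from the field section header.
/// Sign bit clear: Base = ReqInsertCount + DeltaBase.
/// Sign bit set:   Base = ReqInsertCount - DeltaBase - 1
/// (the extra -1 lets DeltaBase = 0 express Base = ReqInsertCount - 1).
fn decode_base(req_insert_count: u64, sign: bool, delta_base: u64) -> u64 {
    if sign {
        req_insert_count - delta_base - 1
    } else {
        req_insert_count + delta_base
    }
}
```

The delta encoding keeps the common case small: an encoder that references only entries already covered by the required insert count can send a sign bit and a zero.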

After this header, the remainder of the HEADERS frame consists of the header fields themselves. These can take any of several forms depending on how the header field has been encoded.

Indexed field line
An index of an entry specifying both name and value in either the static or dynamic table can be provided. For the static table this is interpreted as an absolute index into the table, but the dynamic table is always accessed via a relative index, which is interpreted against the base specified in the header.
Literal field value with name reference
Alternatively, a reference to an entry can be provided just to identify the name of the header, and this is followed by a string literal as described earlier in this section (length-prefixed and either plain or Huffman-encoded).
Literal field value with literal name
Finally, both the field name and the value can be specified as literal strings.

Putting all this together, the diagram below shows the format of the HEADERS frame as a whole, including the options for the field lines. You don’t need to worry too much about the specific layouts, unless you’re actually planning to implement a HTTP/3 library6 — but I think it’s often helpful to see something drawn out to get an idea of how it fits together.

HTTP/3 HEADERS frame format

Encoder/Decoder Streams

Since QPACK is a mandatory extension of HTTP/3, endpoints should set up the encoder and decoder streams at the same time as the control stream, which we discussed earlier. This means that each HTTP/3 connection will have six unidirectional streams in total, plus any used for server push (next section), and the bidirectional request streams. Each encoder/decoder pair works in a symmetric way, so this discussion isn’t specific to the client or server.

The encoder can send the following events down the encoder stream:

Set dynamic table capacity
The capacity of the dynamic table is its maximum size in bytes. This starts at zero, and the encoder must indicate its intention to use the dynamic table by increasing this size. The other endpoint can use the setting QPACK_MAX_TABLE_CAPACITY to specify an upper bound to this, and the setting defaults to zero which effectively disables use of the dynamic table.
Insert with name reference
The encoder can refer to an existing field name in the static or dynamic table to add an entry to the dynamic table, and provide an additional value for this field to use later.
Insert with literal name
Similarly a new literal name and literal value pair can be inserted, for headers which don’t yet exist in either table. The value may be empty, which is typically done if the encoder intends to refer to the field name by index but provide a literal value each time.
Duplicate
An existing entry in the dynamic table, both field name and value, can be duplicated by specifying just its index. I’m not quite sure why this is a duplication as opposed to simply refreshing the “last used” time of a previous entry to keep it in the table, but I’m guessing it’s something to do with the way that the two tables are used asynchronously, so perhaps old entries need to be immutable except for their expiry to avoid some class of race conditions.

Conversely, a decoder can send the following events on its own stream:

Section acknowledgement
After processing any encoded field section with a non-zero required insert count, the decoder should emit one of these instructions to confirm to the encoder that it was successfully decoded. This allows the encoder to know when it can safely expire entries from the dynamic table, for example.
Stream cancellation
When the decoder is resetting or ceasing to read the encoder stream, it sends one of these instructions to indicate that to the encoder.
Insert count increment
Whenever a dynamic table insertion (or duplication) is received from the encoder, the decoder sends back one of these instructions to confirm an increment to the insertion count — this can be an increment of more than one to acknowledge multiple insertions. This is important because of the use of insertion count as a means of synchronising dynamic table insertions with HEADERS frames that refer to these new entries.

To put this all together, let’s see the exchange of instructions required to open a connection and send the first request. To keep the diagram somewhat simple, I’ve just considered the request encoding, so we’re only looking at the client encoder and server decoder here — when the server sends the response, the converse process will happen with the server encoder and client decoder on their respective streams.

HTTP/3 HEADERS encoding example exchange

One interesting thing to note is that after the client encoder has sent the instructions to the server decoder, it doesn’t need to wait for any acknowledgement back before starting to use them to encode header fields. This is safe because of the Required insert count field in the field section header that we saw earlier, which indicates to the server’s decoder that it should wait until it’s received the corresponding number of updates before proceeding with the decode.

Server Push

The final feature of HTTP/3 we’ll look at here is server push. Essentially this is where a server predicts a request that the client is going to make and pre-emptively pushes it to the client to reduce latency of page loading. For example, if the client requests a particular HTML file, the server might quite reasonably assume that the client is also going to request the CSS and javascript files linked within it, so if the server has enough logic to figure this out (or is simply so configured by the administrator) then it can push these files and save a round trip for the client to request them after parsing the HTML.

Of course, this is something that should be used with caution — pushing a number of large images could take up a lot of network bandwidth, for example, and if a client is configured not to display images to the user then this would be totally wasted. But the RFC doesn’t talk about the logic servers should use to decide whether to use push, simply the mechanics of how it does so.

Every push is assigned a unique integer ID which is sequentially assigned starting with zero. These are capped by the value sent in the MAX_PUSH_ID frame by the client, and the server is not permitted to use server push until the client has sent a first MAX_PUSH_ID frame to allow some space to allocate IDs — this means that clients can choose not to support server push by simply not sending this frame. The client is free to send another MAX_PUSH_ID frame at any time to allow the server to send additional pushes.
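The push-ID bookkeeping described above is easy to sketch — this is my own illustrative structure, not anything from the RFC or a real library:

```rust
/// Server-side push ID allocation: IDs are assigned sequentially, and pushes
/// are only permitted once the client has sent a MAX_PUSH_ID frame.
struct PushAllocator {
    next_push_id: u64,
    max_push_id: Option<u64>, // None until the first MAX_PUSH_ID arrives
}

impl PushAllocator {
    fn new() -> Self {
        PushAllocator { next_push_id: 0, max_push_id: None }
    }

    /// The client sent (another) MAX_PUSH_ID frame; the limit only grows.
    fn on_max_push_id(&mut self, id: u64) {
        self.max_push_id = Some(self.max_push_id.map_or(id, |cur| cur.max(id)));
    }

    /// Allocate the next push ID, or None if the server may not push yet.
    fn allocate(&mut self) -> Option<u64> {
        match self.max_push_id {
            Some(max) if self.next_push_id <= max => {
                let id = self.next_push_id;
                self.next_push_id += 1;
                Some(id)
            }
            _ => None,
        }
    }
}
```

A client that never wants pushes simply never calls the server's `on_max_push_id`, so to speak: `allocate` returns None forever and the server can't push anything.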

A push is always triggered by another client request — for example, a request for /path/index.html might trigger a push of /path/style.css. The first step is that the server sends a PUSH_PROMISE frame on the request stream for /path/index.html, and this frame contains the ID that’s been allocated to the push. It also contains the field section that would normally be in the HEADERS frame of the request sent by the client for this resource.

Not all requests can be pushed. In particular, requests that require request content cannot be pushed (for hopefully obvious reasons) and also typically only cacheable resources would be pushed. Also, generally only request methods which are safe as defined in §9.2.1 of RFC 9110 — i.e. those which are read-only — can be pushed in this way, and safe methods don’t typically take request payloads anyway.

A client can reject the push by sending a CANCEL_PUSH frame on the control stream, specifying the push ID — this should abort any server push that is planned or in progress. Failing that, however, the server opens a new stream of type 0x01 and then sends the push ID as a variable length integer. It then proceeds to send response HEADERS and content DATA frames down this new push stream as if it was responding on a request stream. Once the push is complete, it closes the stream, also just like a request stream.

That’s about it, really — aside from the server generating the request headers instead of the client, and the use of a push stream as opposed to a request stream, the process is essentially the same as responding to a normal client request. One aspect which might not immediately be obvious is that since streams are not synchronised, it’s entirely possible that the push stream arrives at the client before the PUSH_PROMISE which corresponds to it — therefore, clients must be written to deal with this, and should probably buffer up the received data in expectation of receiving a PUSH_PROMISE for it shortly. As we saw in the previous article, QUIC offers flow control mechanisms that the client can use to limit the amount of data it must buffer in this way.

HTTP/3 Server Push exchange


So that’s it for our whirlwind tour of HTTP/3. My opinions of it are not entirely dissimilar to my views on HTTP/2, though rather more pronounced. There are definitely some clever aspects, but my main concerns remain complexity and inscrutability.

Let’s talk about the clever aspects first. HTTP/3 doesn’t really try to do all that much more than HTTP/2, it just leverages features of QUIC to do it in a way which is less prone to blocking interactions between streams. As with HTTP/2, I applaud the use of streams across a single connection, which promises worthwhile improvements to latency, as well as requiring fewer connections, which makes life easier for servers and all kinds of middleboxes. The server’s level of control over prioritisation of content is also potentially valuable, and hopefully web authors and developers don’t need to worry so much about where CSS and Javascript are introduced in their HTML if the server’s going to pre-emptively push them anyway. Another minor advantage is dropping the notions of transfer encodings and other connection-specific behaviour, which simplifies HTTP semantics somewhat.

That said, some of these aspects are only going to be of benefit if website administrators and/or webserver vendors put work into leveraging them, and clients indicate support for them. For example, a server would probably need to be configured to properly use server push because it relies on knowing relationships between request patterns of resources. Over time this is something that perhaps servers can handle more automatically, at least for static resources — it wouldn’t be hard to imagine a utility which would parse HTML and work out which other resources (CSS, Javascript, images, etc.) should be pushed to clients requesting it. This could be dumped out in some server-specific configuration file to configure the server push operation. Things are harder for dynamic content, but it might still be possible for a server to automatically detect recurring themes in request patterns and dynamically adjust its push strategy accordingly.

Which brings us on to the first of my other concerns: complexity. This protocol is, if anything, more complicated than HTTP/2, which was itself way more complicated than HTTP/1. The header compression in particular has lots of edge cases which will be quite hard to test systematically. Also, although the encoder and request streams are nominally independent, the fact that the dynamic header field table is required to decode requests creates a serialisation constraint between them — it wouldn’t be hard to see how this could negatively impact performance if both server and client authors aren’t quite careful. The fact that potentially multiple request streams could all be blocked on receiving some update on the encoder stream, which might have randomly been dropped or delayed, is something that I’d be wary of.

A good amount of this complexity is in the header compression, as it was with HTTP/2, and I still question whether this is justified, if I’m honest. HTTP connections commonly transfer many megabytes of data, once you consider chunky HTML files with lots of layout, sprawling CSS to cover the whole site, a myriad of Javascript files, and large image files and other media. Are a couple of KB of request and response headers really so much of an overhead in this process to warrant the difficulties of binary representations, dynamically managed lookup tables, and Huffman coding? I’m not suggesting these measures don’t save bandwidth, and I’m not suggesting that saving bandwidth isn’t a worthy goal in principle, but complexity is a cost and costs must be justified — whether the complexity of header compression is justified by the practical benefits it will bring is, in my view, open to debate.

That said, some of this complexity can be side-stepped completely by implementors wishing to keep things simple. Although the header compression must be supported to some extent, endpoints can refuse to support their peer using the dynamic table, and are of course under no obligation to use it themselves. Similarly, server push is disabled by default unless the client explicitly allows it.

My other concern is the inscrutability of the protocol, by which I mean the degree of difficulty inspecting traffic for debugging purposes. Firstly, QUIC itself is quite opaque, even if you disregard its use of TLS. On top of this, HTTP/3 layers its own framing over QUIC’s streams, which are themselves implemented by frames on top of packets on top of UDP datagrams. Taken all together, this is going to make it really hard to work out what’s going on without copious amounts of logging in the endpoints. With HTTP/1 and HTTP/2, at least you could disable TLS for your local development, which allowed tools like Wireshark to be used to see what’s going on. I’m sure traffic sniffers could, in principle, also help with QUIC (and, by extension, HTTP/3), but they’ll need some help getting hold of the TLS keys, and this is going to add friction to debugging.

If you take the complexity and the inscrutability together, I feel that this adds up to it being much, much harder to debug and test implementations than it was in the old HTTP/1 days. This is bad news for stability, because obscure bugs are also more likely with this complex protocol — race conditions between endpoints and streams, for example. This might mean that implementations end up riddled with all sorts of odd bugs for a long time, which is going to be quite frustrating, and also a barrier to adoption if HTTP/3 gets an unfair reputation for unreliability due to buggy implementations in some language or other.

On the flip side, if this raises the barrier to entry for anyone and their dog writing their own HTTP client implementations, perhaps there might actually be some counterintuitive upsides in focusing more attention on improving the fewer implementations that do exist, which hopefully become more stable, performant and flexible as a result. If a single solid implementation of a client exists for a language, it’s got a much better chance of being adopted into the standard library and making things better for a lot of developers, rather than fragmenting developer efforts across multiple competing libraries.

Still, these concerns are entirely unproven right now — it’s too early in HTTP/3’s lifecycle to be making these sorts of assertions with any degree of confidence. This is one reason why I’m planning to follow through with actually implementing it — it may prove simpler (or perhaps even more complicated!) than I’m estimating here.

Overall, I don’t want to come across as too negative on HTTP/3 — I don’t think any of the aspects are badly designed per se, and some of them have a lot of great potential. Given HTTP’s ubiquity across the web, it is worth looking for ways to improve things, and it’s not always clear up front whether any given feature will be pointless and over-engineered, or a game-changing enhancement that we all come to love. There’s an argument for being ambitious and throwing a lot at the web to see what sticks — rarely used parts of the protocol can always be ignored or trimmed back in later versions.

But I’m always wary of standards where a good chunk of the complexity is inherent, as opposed to in optional extensions, because of the risk that implementors end up ignoring the hard parts and diverging from the standards — non-compliant implementations end up hurting everyone if they see any kind of wide usage. I think that’s really my primary fear here, that we’re going to see a lot of rubbish client and server implementations which are riddled with issues, and if these become popular then everyone else starts to become obliged to work around them with their own hacks and deviations from the standards. It could become like the browser wars all over again.

But I would be only too pleased for these fears to be proven unfounded — that’s the direction in which I’m always happy to be wrong.

In any case, that’s all we have for this article. As usual, I hope it’s been interesting and/or helpful in some way. Also as usual, I’ll caution that there may be errors in my elaborations above — it’s based on pulling information together across a comparatively large number of RFCs and other sources, and due to the immature nature of the protocol, there are comparatively few sources with which to corroborate my understanding. If you do spot anything you think is a mistake, I’d very much appreciate you letting me know.

Next time I plan to shift my focus to Rust and implementing a simple UDP server and client, just to get the hang of things.

  1. You might notice the RFCs use the term URI, and I’m instead using the term URL. If you don’t already know the difference, and you just want to get things done (which is, after all, the very loosely linking theme of this blog) then I suggest you simply don’t worry about it. If not knowing pains you, then take a read of RFC 3986. My reasons for using URL are simply that I think it’s a more familiar term to most people and doesn’t prompt unnecessary confusion about what a URI is and how it differs. Anyone who understands the distinction will not be confused whichever term is used. 

  2. The standard does mention that server-initiated streams could be potentially added in a future extension, but unless such an extension is supported by the client and has been negotiated, the client should regard creation of such streams as an error. 

  3. Pausing for a moment just to set the context, that’s HTTP/3 frames over QUIC streams over QUIC frames over QUIC packets over UDP datagrams over IP packets over whatever type of packets the underlying transport uses. Network protocols love these sorts of layered abstractions, but to be fair to both HTTP/3 and QUIC they’ve put some effort into making sure the header overheads of these layers are smallish. 

  4. In particular, proxies trying to translate HTTP/2 or HTTP/3 from a client back to HTTP/1.1 to send to a server would probably be forced to buffer up an entire (potentially large) request before sending so that they can generate a Content-Length header — this is because server support for chunk-encoded requests has always been poor. Proxies going from server to client would be OK, they could just add a Transfer-Encoding: chunked and send each DATA frame as a separate chunk, as chunked encoding of responses is well supported in clients. 

  5. But who’d be crazy enough to come up with a plan like that when there are already perfectly good options out there? 
