Westonbirt

Last weekend Michelle had arranged to spend a long weekend with what Aurora now refers to as her “knitting friends”, so Aurora and I decided to use the opportunity to pop over and visit Mum and Dad.

Westonbirt

Mon 18 May 2015 at 12:46AM by Andy Pearce in Outings tagged with photos, michelle, aurora and mum-and-dad  |  See comments

HTTP/2

After eighteen years there’s a new version of HTTP. Having heard comparatively little about it until now, I decided to take a quick look.

The Web is surely the most used service on the Internet by far — so much so, in fact, that I’m sure to many people the terms “Web” and “Internet” are wholly synonymous.

Perhaps one of the most remarkable things about the Web is that despite the revolutionary changes in the services offered over it, the protocol that underpins the whole thing, HTTP, has remained largely unchanged for almost two decades. The Web as we know it now runs more or less entirely over HTTP/1.1, the version standardised in 1997, which has received only minor tweaks since then.

However, the venerable old HTTP/1.1 is now starting to look a little creaky when delivering today’s highly dynamic and media-rich websites. In response to this, Google set a handful of its boffins to planning solutions to what it saw as some of the more serious issues — the result was a new protocol known as SPDY. This was intended to be more or less compatible with HTTP at the application level, but improve matters in the underlying layers.

In fact, SPDY has essentially formed the basis of the new HTTP/2 standard, the long-awaited replacement for HTTP/1.1 — so much so that Google has indicated that it plans to drop support for SPDY in favour of the new standard, and the other main browser vendors seem to be following suit. Since it seems to have fairly broad support, it’s probably useful to take a quick look at what it’s all about, then.

Why HTTP/2?

But before looking at HTTP/2 itself, what are the issues it’s trying to solve? The main one, as I see it, is latency — the amount of time between the first request for a page, and finally seeing the full thing loaded on screen.

The issue here is mainly that pages are typically composed of many disparate media: the main HTML file, one or more CSS files, Javascript files, images, videos, etc. With HTTP/1.1 these are all fetched with separate requests, and since one connection can only handle one request at a time, this serialisation tends to hurt performance. Over conventional HTTP, browsers can, and do, make multiple connections and fetch media over them concurrently — this helps, although it’s arguably rather wasteful. This is much less helpful over HTTPS, however, as the overhead of creating a secure connection is much higher — the initial connection triggers a cryptographic handshake which not only adds latency, it also puts a significant additional load on the server. Since HTTPS is increasingly becoming the standard for online services, even those which don’t deal in particularly sensitive data, this use-case is quite important.

There are other issues which add to the latency problem. HTTP headers need to be repeated with each request — this is quite wasteful, as the ASCII representation of these headers is rather verbose compared to the data contained therein. Clients also need to make a good deal of progress through parsing the main HTML page to figure out which additional files need to be fetched, and this adds yet more latency to the overall page load process.

Some of you may be thinking at this point that there’s an existing partial solution to some of these issues: pipelining. This is the process by which a client may send multiple requests without waiting for responses, leaving the server to respond in sequence to them as quickly as it can. Hence, as the client parses the response to the initial request, it can immediately fire off requests for the additional files it needs, and this helps reduce latency by avoiding a round-trip time for each request.

This is a useful technique but it still suffers from a significant problem known as head of line blocking. Essentially this is where a response for a large object holds up further responses on the connection — so for example, fetching a large image, without which the page could quite possibly still be usefully rendered, could hold up fetching of a CSS file which is rather more crucial.

Latency, then, is an annoying issue — so how does HTTP/2 go about solving it?

Negotiation

In essence what they’ve done is allow multiple request/response streams to be multiplexed onto a single connection. They’re still using TCP, but they’ve added a framing layer on top of that, so requests and responses are split up into frames of up to 16KB (by default). Each frame has a minimal header which indicates to which stream the frame belongs and so clients and servers can request and serve multiple resources concurrently over the same connection without suffering from the head of line blocking issue.

This is, of course, a pretty fundamental change from HTTP/1.1 and so they’ve also had to add a negotiation mechanism to allow clients and servers to agree to handle a connection over HTTP/2. The form this mechanism takes differs between HTTP and HTTPS. In the former case the existing HTTP/1.1 Upgrade header is supplied with the value h2c. There’s also a HTTP2-Settings header which must be supplied to specify the parameters of the new HTTP/2 connection, should the server accept — I’ll cover these settings in a moment. If the server supports HTTP/2 it can accept the upgrade by sending a 101 Switching Protocols response and then respond to the initial request using HTTP/2 — otherwise it can treat the request as if the header wasn’t present and respond as normal for HTTP/1.1.
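To make that a little more concrete, here’s a sketch of the exchange based on the example in the RFC (the host and resource here are purely illustrative):

GET / HTTP/1.1
Host: example.com
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: <base64url encoding of HTTP/2 SETTINGS payload>

If the server supports HTTP/2 it replies along these lines, then continues in the new protocol:

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: h2c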

For HTTPS a different mechanism is used, Application Layer Protocol Negotiation. This is an extension to TLS that was requested by the HTTP/2 working group and was published as an RFC last year. It allows a client and server to negotiate an application layer protocol to run over a TLS connection without additional round trip times. Essentially the client includes a list of the protocols it supports in its ClientHello message from which the server selects its most preferred option and indicates this choice in its ServerHello response.

However the connection was negotiated, both endpoints then send final confirmation of HTTP/2 in the form of a connection preface. For the client this takes the form of the following fixed string followed by a SETTINGS frame:

PRI * HTTP/2.0<CRLF>
<CRLF>
SM<CRLF>
<CRLF>

As an aside you may be wondering why the client is obliged to send a HTTP2-Settings header with its upgrade request if it’s then required to send a SETTINGS frame anyway — I’ve no idea myself, it seems a little wasteful to me. The RFC does allow that initial SETTINGS frame to be empty, though.

The server’s connection preface is simply a SETTINGS frame, which may also be empty. Both endpoints must also acknowledge each other’s SETTINGS frames by sending back empty SETTINGS frames with the ACK flag set. At this point the HTTP/2 connection is fully established.

The SETTINGS frames (or the HTTP2-Settings header) are used to set per-connection (not per-stream) parameters for the sending endpoint. For example, they can be used to set the maximum frame size that endpoint is prepared to accept, or disable server push (see later). A full list of the settings can be found in the RFC.

Frames

As I’ve already mentioned, all HTTP/2 communication is packetised, although of course since it’s running over a stream-based protocol, the frames on a given connection are transmitted strictly one after another. The frame format includes a 9-byte header with the following fields:

Length [24 bits]

The size of the frame, not including the 9-byte header. This must not exceed 16KB unless otherwise negotiated with appropriate SETTINGS frames.

Type [8 bits]

An 8-bit integer indicating the type of this frame.

Flags [8 bits]

A bitfield of flags whose interpretation depends on the frame type.

Reserved [1 bit]

Protocol designers do love their reserved fields. Should always be zero in the current version of the protocol.

Stream ID [31 bits]

A large part of the purpose of HTTP/2 is to multiplex multiple concurrent requests and this is done by means of streams. This ID indicates to which stream this frame applies. Stream ID 0 is special and reserved for frames which apply to the connection as a whole as opposed to a specific stream. Stream IDs are chosen by endpoints and namespaced by the simple expedient of the client always choosing odd-numbered IDs and the server limiting itself to even-numbered IDs. In the case of an upgraded HTTP connection the initial request is automatically assumed to be stream 1.
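To illustrate the layout (this is purely my own sketch, not code from any real implementation — the struct and function names are invented), decoding the fixed 9-byte header from a received buffer might look something like this:

#include <cstdint>

// Decoded form of the fixed 9-byte HTTP/2 frame header.
struct FrameHeader
{
    std::uint32_t length;    // 24-bit payload length (header not included)
    std::uint8_t type;       // frame type (DATA, HEADERS, SETTINGS, ...)
    std::uint8_t flags;      // interpretation depends on the frame type
    std::uint32_t streamId;  // 31-bit stream ID; 0 means the whole connection
};

// Fields arrive in network (big-endian) byte order.
FrameHeader decodeFrameHeader(const unsigned char *buf)
{
    FrameHeader header;
    header.length = (buf[0] << 16) | (buf[1] << 8) | buf[2];
    header.type = buf[3];
    header.flags = buf[4];
    // The top bit of the stream ID field is the reserved bit, so mask it off.
    header.streamId = ((buf[5] & 0x7F) << 24) | (buf[6] << 16)
                      | (buf[7] << 8) | buf[8];
    return header;
}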

The RFC specifies a complete state machine for streams which I won’t go into in detail here. Suffice to say that there are two main ways to initiate a new stream — client request and server push. We’ll look at these in turn.

Client Requests

When the client wishes to make a new request, it creates a new stream ID (odd-numbered) and sends a HEADERS frame on it. This replaces the list of headers in a conventional HTTP request. If the headers don’t fit in a single frame then one or more CONTINUATION frames can be used to send the full set, with the END_HEADERS flag set on the final frame.

The format of the headers themselves has also changed in HTTP/2 as a form of Huffman Encoding with a static dictionary is now used to compress them. This is actually covered in a separate RFC and I don’t want to go into it in too much detail — essentially it’s just a simple process for converting a set of headers and values into a compressed binary form. As you might expect, this scheme does allow for arbitrary headers not covered by the static dictionary, but it’s designed such that common headers take up very little space.

The block of binary data that results from the encoding process is what’s split up into chunks and placed in the HEADERS and CONTINUATION frames sent as part of making a request. One thing that initially surprised me is that the sequence of frames making up a single header block must be transmitted contiguously — no other frames, not even from other streams, may be interleaved. The reason for this is pretty clear, however, when you realise that the process of decoding a header block is stateful — this restriction ensures that any given connection only needs one set of decoder state to be maintained, as opposed to one per stream, to keep resource requirements bounded. One consequence of this is that very large header blocks could lead to HTTP/1.1-style head of line blocking, but I very much doubt this is an issue in practice.

There are a few other things to note about the transmission of headers other than the compression. The first is that all headers are forced to lowercase for transmission — this is just to assist the compression, and since HTTP headers have always been case-insensitive this shouldn’t affect operation. The second thing to note is that the Cookie header can be broken up into separate headers, one for each constituent cookie. This reduces the amount of state that needs to be retransmitted when cookies change.
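As a sketch of that second point, with invented cookie names and values, a traditional combined header such as:

Cookie: session=abc123; theme=dark

could instead be transmitted as two separate headers:

cookie: session=abc123
cookie: theme=dark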

The third and final item of note is slightly more profound — the information that used to be presented in the initial request and response lines has now moved into headers. Any request is therefore expected to contain at least the following pseudo-headers:

:method

The method of the request — e.g. GET or POST.

:scheme

The URL scheme of the requested resource — e.g. http or https.

:path

The path and (optionally) query parameters portion of the requested URL.

Similarly the response also contains a pseudo-header:

:status

The status code of the response as a bare integer — there is no longer a way to include a “reason” phrase.
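Pulling that together, a sketch of a simple request (with invented values, shown in a readable form rather than the compressed binary encoding actually used on the wire) might carry:

:method = GET
:scheme = https
:path = /index.html

with the corresponding response carrying:

:status = 200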

At this point, therefore, we’ve got a way to send a request, including headers. After this point any request body is sent via DATA frames. Due to the framing of HTTP/2 there’s no need for chunked encoding, of course, although the standard does still allow a Content-Length header. Whether or not the request contains a body, it is terminated by its final frame having the END_STREAM flag set — this puts the stream into a “half-closed” state where the request is finished and the client is awaiting a response.

The response is sent using HEADERS, CONTINUATION and DATA frames in the same way as the request, and completion of the response is similarly indicated by the final frame having END_STREAM set. After this point the stream is closed and finished with — this has echoes of the old HTTP/1.0 approach of closing the connection to indicate the end of a response, but of course doesn’t have the overhead of closing and re-opening TCP connections associated with it.

Server Push

That’s covered a standard client request, so what about this “push” mechanism that’s been added?

Well, one of the causes of latency when opening a multimedia website is the fact that the client has to scan the original requested page for links to other media files, and then send requests for them. In many cases, however, the server already knows that the client is very likely to, for example, request images that are linked within a HTML page. As a result, the overall page load time could be reduced if the server could just send these resources speculatively along with the main requested page — in essence, this is server push.

When the client makes a request the server can send PUSH_PROMISE frames on the same stream that’s being used for the DATA frames of the response. These frames (and potentially associated CONTINUATION frames) contain a header block which corresponds to a request that the server believes the client will need to make as a result of the currently requested resource. They also reference a new stream ID which is reserved by the server for sending the linked resource. At this point the client can decide if it really does want the offered resource — if so, all it has to do is wait until the server starts sending the response, consisting of the usual HEADERS, CONTINUATION and DATA frames. If it wishes to reject the push, it can do so by sending a RST_STREAM frame to close the stream.

That’s about it for server push. The only slightly fiddly detail is that the server has to send the PUSH_PROMISE frame prior to any DATA frames that reference the offered resource — this prevents a race where the client requests a resource that the server is about to push. The flip side to this is that a client is not permitted to request a resource whose push has been offered. This is all fairly common sense.

Prioritisation and Flow Control

The only other details of HTTP/2 that I think are interesting to note are some details about prioritisation and flow control.

Flow control is intended to prevent starvation of streams or entire connections — since many requests could be in flight on a single connection, this could become quite an important issue. Flow control in HTTP/2 is somewhat similar to TCP’s, where each receiver has an available window: an amount of data it has indicated it’s prepared to accept from the other end. This defaults to a modest 64KB. Windows can be set on a per-connection and a per-stream basis, and senders are required to ensure that they do not transmit any frames that would exceed either available window.

Each endpoint can update its advertised windows with WINDOW_UPDATE frames. On systems serving a large number of resources, flow control can be effectively disabled by advertising the maximum window size of 2^31-1 bytes. When deciding whether to use flow control, it’s important to remember that advertising a window size which is too small, or not sending WINDOW_UPDATE frames in a timely manner, could significantly hurt performance, particularly on high latency connections — in these days of mobile connectivity, I suspect latency is still very much an issue.

As well as simple flow control, HTTP/2 also introduces a prioritisation mechanism where one stream can be made dependent on another by sending a PRIORITY frame with the appropriate details. Initially all streams are assumed to depend on the non-existent stream 0, which effectively forms the root of a dependency tree. The intent is that these dependencies are used to prioritise the streams — where there is a choice about which stream to send frames on next, endpoints should give priority to streams on which others depend over the streams that depend on them.

As well as the dependencies themselves, there’s also a weighting scheme, where resources can be shared unevenly between the dependencies of a stream. For example, perhaps images used for navigation are regarded as more important than a background texture and hence can be given a higher weight, so that they download faster over a connection with limited bandwidth.

Conclusions

That’s about it for the protocol itself — so what does it all mean?

My thoughts on the new protocol are rather mixed. I think the multiplexing of streams is a nice idea and there’s definite potential for improvement over a single secure connection. Aside from the one-time cost of implementing the protocol, I think it also has the potential to make life easier for browser vendors since they won’t have to worry so much about using ugly hacks to improve performance. There’s also good potential for the prioritisation features to really improve things over high latency and/or low bandwidth connections, such as mobile devices in poor reception — to make best use of this browsers will need to use a bit of cleverness to improve their partial rendering of pages before all the resources are fetched, but at least the potential is there.

On the flip side, I have a number of concerns. Firstly, it’s considerably more complicated, and less transparent, than standard HTTP/1.1. Any old idiot can knock together a HTTP/1.1 client library1 but HTTP/2 definitely requires some more thought — the header compression needs someone who’s fairly clueful to implement it, and although the protocol allows the potential for improved performance, a decent implementation is also required to take advantage of this. With regards to transparency, it’s going to make it considerably harder to debug problems — with HTTP/1.1 all you need is tcpdump or similar to grab a transcript of the connection and you can diagnose a significant number of issues. With HTTP/2 the multiplexing and header compression are going to make this much trickier, although I would expect more advanced tools like Wireshark to have decoders developed for them fairly quickly, if they haven’t already.

Of course, transparency of this sort isn’t really relevant if you’re already running over a TLS connection, and this reminds me of another point I’d neglected to mention — although the protocol includes negotiation methods over both HTTP and HTTPS, at least the Firefox and Chrome dev teams appear to be indicating they’ll only support HTTP/2 over TLS. I’m assuming they simply feel that there’s insufficient benefit over standard HTTP, where multiple connections are opened with much less overhead. I do find this stance a little disappointing, however, as I’m sure there are still efficiencies to be had in allowing clients more explicit prioritisation and flow control of their requests.

Speaking of prioritisation, that’s my second concern with the standard — the traffic management features are quite complicated and rely heavily on servers to use them effectively. A standard webserver sending out static HTML can probably have some effective logic to, for example, push linked images and stylesheets — but this isn’t at all easy when you consider that the resources are linked in the URL space but the webserver will have to check for that content in the filesystem space. Life is especially complicated when you consider requests being served by some load-balanced cluster of servers.

But even aside from those complexities, today’s web is more dynamic than static. This means that web frameworks are likely going to need serious overhauls to give their application authors the chance to take advantage of HTTP/2’s more advanced features; and said authors are similarly going to have to put in some significant work on their applications as well. Some of these facilities, such as header compression, come “for free”, but by my estimation most of them do not.

My third annoyance with the standard is its continued reliance on TCP. Now I can see exactly why they’ve done this — moving to something like UDP or SCTP would have been a much more major shift and made it rather harder to maintain any kind of HTTP/1.1 compatibility. But the fact remains that implementing a packet-based protocol on top of a stream-based protocol on top of a packet-based protocol is anything but elegant. I have all sorts of vague concerns about how TCP and HTTP/2 features could interact badly — for example, what happens if a client consistently chooses frame sizes that are a few bytes bigger than the path MTU? How will the TCP and HTTP/2 window sizes interact on slow connections? I’m not saying these are critical failings; I can just see potential for annoyances which will need to be worked around.2

My final issue with it is more in the nature of an opportunity cost. The standard resolves some very specific problems with HTTP/1.1 but totally missed the chance to, for example, add decent session management. Cookies are, and always have been, a rather ugly hack — something more natural, and more secure by design, would have been nice.

Still, don’t get me wrong, HTTP/2 definitely looks set to solve some of HTTP/1.1’s thorny performance problems and, assuming at least the major web services upgrade their sites intelligently, we should be seeing some improvements once support is rolled out more widely.


  1. And a startlingly large number of people appear to have done so. 

  2. As an aside it appears that Google have had this thought also, as they’re currently working on another protocol called QUIC which replaces TCP with a UDP-based approximate equivalent that supports HTTP/2 with better performance. 

Fri 08 May 2015 at 11:52PM by Andy Pearce in Software tagged with http  |  See comments

C++11: Library Changes: Pointers, Randoms, Wrappers and more

This is part 8 of the “C++11 Features” series which started with C++11: Move Semantics.

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this is the second of the final two that cover what I feel to be the most important changes to the standard template library.

Continuing on from the last post, here are the remaining library changes for C++11 which I think are particularly noteworthy.

Smart pointers

The venerable old std::auto_ptr has been in the STL for some time and works tolerably well for simple cases of an RAII pointer. It has, however, some significant drawbacks:

  • It can’t be stored in containers.
  • Changes of ownership aren’t always obvious and may be unexpected.
  • It will always call delete on its owned pointer, never delete[].

As a result of this, in C++11 std::auto_ptr is deprecated and has been replaced by the improved std::unique_ptr. This has similar semantics in that ownership can only exist with one instance at a time — if ownership is transferred, the pointer is set to nullptr in the old instance, as with auto_ptr.

The advantages of this class are essentially the removal of the three limitations mentioned above:

  • Can be stored in a container.
  • Move semantics are used to transfer ownership, so code will fail to compile without use of an explicit std::move, reducing the chances that the transfer could happen at times the programmer didn’t intend.
  • Support for arrays, so either delete or delete[] will be called as appropriate.
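To make those points concrete, here’s a minimal sketch of my own (not from the standard) showing the basic semantics:

#include <iostream>
#include <memory>
#include <utility>
#include <vector>

int main()
{
    std::unique_ptr<int> first(new int(42));

    // Ownership must be transferred with an explicit std::move()...
    std::unique_ptr<int> second(std::move(first));

    // ...leaving the old instance holding nullptr.
    std::cout << (first ? "first owns it" : "first is empty") << std::endl;

    // Unlike auto_ptr, unique_ptr can safely be stored in containers...
    std::vector<std::unique_ptr<int>> values;
    values.push_back(std::move(second));

    // ...and the array form calls delete[] on destruction.
    std::unique_ptr<int[]> array(new int[10]);

    return 0;
}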

This improved std::unique_ptr is a fairly thin wrapper around a native pointer and actually incurs pretty minimal overhead on dereference. On the other hand, its simple semantics make it unsuitable for multithreaded use-cases — that’s where the other new smart pointer std::shared_ptr comes in.

As the name implies, std::shared_ptr allows multiple references to the same underlying pointer. The pointer itself is wrapped in some reference counting structure and the destructor of each std::shared_ptr reduces the reference count by 1, the resource being deleted when the count reaches zero. Assignment of a different pointer to a std::shared_ptr reduces the reference count on the previous pointer as well, of course. In short, it’s the obvious reference-counting semantics you’d expect.

Since one of the common use-cases of std::shared_ptr is in multithreaded applications, it’s probably no surprise that the underlying reference counting structure is thread-safe — multiple threads can all safely own their own std::shared_ptr instances which point to the same underlying resource. It’s worth remembering that the non-const methods of the std::shared_ptr themselves are not thread-safe, however, so sharing instances of std::shared_ptr itself between threads could lead to races. The semantics of the class make it easy to simply assign to a new instance when the pointer passes to a new thread, however, so it’s not hard to use correctly.

The final new smart pointer in C++11 is the std::weak_ptr. This can share ownership of a resource with std::shared_ptr instances (not std::unique_ptr), but the reference so created is not sufficient to prevent the underlying resource from being deleted. This could be useful for secondary structures which, for example, keep track of all allocated instances of a particular class, but don’t want those references to keep the instances in memory once the primary owner has released them.

Because a std::weak_ptr can become nullptr at any point due to operations in other threads, some care has to be taken when dereferencing it. It’s possible to construct a std::shared_ptr from one, which will guarantee that either it’s still extant and will remain so for the lifetime of the std::shared_ptr, or it’s no longer extant and the std::shared_ptr constructor will throw an exception (std::bad_weak_ptr). std::weak_ptr itself also has a lock() method which returns a std::shared_ptr for the underlying pointer — the behaviour in this case is quite similar except that if the resource has been freed then an empty pointer object is returned instead of an exception being raised.
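A quick sketch of those semantics (again, my own illustrative example):

#include <iostream>
#include <memory>

int main()
{
    std::shared_ptr<int> owner = std::make_shared<int>(42);
    std::weak_ptr<int> watcher(owner);  // doesn't keep the resource alive

    // lock() yields a shared_ptr which is empty if the resource has gone.
    if (std::shared_ptr<int> strong = watcher.lock()) {
        std::cout << "still alive: " << *strong << std::endl;
    }

    owner.reset();  // last strong reference released, resource freed

    if (watcher.expired()) {
        std::cout << "resource has been freed" << std::endl;
    }
    return 0;
}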

All quite straightforward — exactly how language features should be.

Random Numbers

The core C and C++ libraries have never had hugely comprehensive facilities for random number generation. Well, let’s face it, they’ve had rand() and srand() and that’s about your lot. For many programmers I’m sure that’s plenty, and part of me thinks that more specialised use-cases in statistics and the like perhaps belong in their own third party libraries.

That said, if you’re writing code in a commercial context where things like licences and packaging become troublesome issues to deal with, there’s definite value in not having to resort to third party code. Comparing writing code in, say, Perl and Python, I vastly prefer Python. Now partly that’s just because I like the language better, of course, but it’s also the standard library — part of a language that’s too often discounted from comparisons in my opinion1.

Perl always seems to involve grabbing a whole raft of stuff from CPAN. Then you have the fun of figuring out where to install it so that your code can see it, whether anyone else in your organisation is already using it so that you don’t have to install multiple copies, how you handle sudden urgent security updates to the libraries while you’re in the midst of a release cycle, how you fix urgent bugs in your own copy whilst not losing the ability to merge in upstream fixes, whether your company allows you to contribute your fixes back to the upstream at all, and so on and so forth. All these issues have resolutions, of course, but the point is that it’s a pain when one of the main reasons you used a library was probably to reduce effort.

Taking all that2 into account, and getting back to the matter in hand, I’m therefore quite glad that the C++ standards committee has decided to include some much improved random number facilities in C++11 even if I don’t necessarily think I’ll have much cause to use them in the foreseeable future.

The new facilities split random number generation into two parts:

  • The engine is responsible for actually generating pseudorandom numbers. It contains the basic algorithm and current state.
  • The distribution takes values from an engine and transforms them such that the resultant set follows some specified probability distribution.

Looking at the engine first, this is a functor which, when called, yields the next item in whatever pseudorandom sequence it’s generating. It also has a seed() method to set the initial seed state, discard() to skip ahead a specified amount in the sequence and static min() and max() methods to query the potential range of generated values.

The library provides three basic options:

std::linear_congruential_engine

Fast and with small storage requirements.

std::mersenne_twister_engine

Not quite so fast, and larger storage requirements, but with a longer non-repeating sequence and (in some ways) more randomly distributed results.

std::subtract_with_carry_engine

Fast even on architectures with only simple arithmetic instruction sets, but with larger storage requirements.

Typically I suspect the architecture will determine the choice used — in the absence of strict performance and/or memory requirements, I’d suggest the old favourite Mersenne Twister.

There are also engine adapters such as the std::discard_block_engine which discards some of the output of an underlying base engine. Presumably the intention is that these adapters can be used to improve some randomness criteria of imperfect underlying engines, but I can’t help but think that in most cases where one is so concerned about the quality of the random numbers one should probably be using a true entropy source. Still, I’m hardly an expert and I’m sure there are good reasons why these were added.

Once the underlying engine has generated a raw stream of random numbers, the distribution transforms these in such a way that the resultant stream matches some probability distribution. There are over 20 options for this that I’m aware of, and presumably the potential for many more to be defined later. A handful for flavour:

std::uniform_int_distribution

Generates integers distributed evenly across a specified range.

std::uniform_real_distribution

Generates real numbers distributed evenly across a specified range.

std::normal_distribution

Generates values that are normally distributed around specified mean and standard deviation values.

std::binomial_distribution

Generates values that are binomially distributed given the number of trials and probability of success.

Overall it seems quite comprehensive, even if I’m still not quite sure for how many people this will remove the need for more specialist third party libraries.
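Putting an engine and a distribution together is pleasantly simple. Here’s a minimal sketch simulating dice rolls — std::mt19937 is the standard library’s predefined Mersenne Twister configuration:

#include <iostream>
#include <random>

int main()
{
    std::random_device seeder;                     // non-deterministic seed
    std::mt19937 engine(seeder());                 // Mersenne Twister engine
    std::uniform_int_distribution<int> die(1, 6);  // map output onto [1, 6]

    for (int i = 0; i < 10; ++i) {
        std::cout << die(engine) << " ";
    }
    std::cout << std::endl;
    return 0;
}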

Wrapper References

STL containers are both flexible and convenient, but to my mind they’ve always had two weaknesses: they involve an annoying amount of copying; and you can’t store references, only pointers. C++11 provides move semantics as a solution to the first problem — and wrapper references as a solution to the second.

A wrapper reference is really just a placeholder that can be used very much like a reference, although some minor hoops must be jumped through when performing assignment. The following code snippet shows how they can be used to assign to underlying values:

#include <functional>
#include <iostream>
#include <vector>

int main()
{
    int array[5] = {1, 2, 3, 4, 5};
    std::vector<std::reference_wrapper<int>> vec;
    vec.push_back(array[0]);
    vec.push_back(array[2]);
    for (int& it : vec) {
        it = 99;
    }
    for (unsigned int i = 0; i < 5; ++i) {
        std::cout << array[i] << std::endl;
    }
    return 0;
}

The result is that array[0] and array[2] are set to 99, whereas all the other values in array remain the same.

As well as the template std::reference_wrapper there’s also the std::ref and closely related std::cref helper functions which allow a slightly more convenient syntax for creating wrapper references by using type inference.

As well as storing references in containers there are a number of other potential uses for wrapper references, one of the main ones being to instantiate template functions such that they take references to their arguments:

#include <functional>
#include <iostream>

template <class T>
void function(T a, T b)
{
    a += b;
}

int main()
{
    int x = 3;
    int y = 4;
    // Will instantiate function<int>.
    function(x, y);
    std::cout << "x=" << x << ", y=" << y << std::endl;
    // Will instantiate function<std::reference_wrapper<int>>.
    function(std::ref(x), std::ref(y));
    std::cout << "x=" << x << ", y=" << y << std::endl;
    return 0;
}

Polymorphic Function Wrappers

C++ has lots of mechanisms to make things generic, such as polymorphism, overloading and templating. One thing that’s always been difficult to do, however, is have a generic notion of a callable — i.e. something which can be invoked in a function-like manner, but which may not necessarily be a function.

There are various ways to refer to callables — function pointers, member function pointers and functors are all examples. But C++’s type-safety makes it tricky to declare something that can wrap all these up.

No longer, however — now we have std::function for just this purpose. This can wrap up any callable which matches the return and argument types of the wrapper and otherwise acts like a regular functor.

One other benefit is that it has the property of a pointer in that it can be uninitialised — in this case calling it will throw the std::bad_function_call exception. Here’s an example that demonstrates this, as well as using the wrapper with both a function and functor:

#include <functional>
#include <iostream>
#include <string>

std::string multiplyStringFunc(std::string s, unsigned int num)
{
    std::string ret;
    for (/* blank */; num > 0; --num) {
        ret.append(s);
    }
    return ret;
}

class MultiplyStringFunctor
{
  public:
    std::string operator()(std::string s, unsigned int num)
    {
        return multiplyStringFunc(s, num);
    }
};

int main()
{
    std::function<std::string(std::string, unsigned int)> wrapper;
    try {
        std::cout << wrapper("bad", 1) << std::endl;
    } catch (const std::bad_function_call&) {
        std::cout << "Called invalid function" << std::endl;
    }
    wrapper = multiplyStringFunc;
    std::cout << wrapper("hello ", 2) << std::endl;
    wrapper = MultiplyStringFunctor();
    std::cout << wrapper("world ", 3) << std::endl;
    return 0;
}

While this is all undeniably convenient, you should bear in mind that there is, unsurprisingly, an overhead to using these wrappers. As a run-time mechanism there’s no way they’ll ever be as efficient as, say, a raw templated function, where the heavy lifting is done at compile time. The Boost documentation has some discussion of performance, but I think an issue that’s at least as critical as the raw overhead itself is the opportunity cost of the compiler being unable to inline or otherwise optimise your code at compile time.

Nonetheless, such performance concerns are often not dominant in a design compared to, say, extensibility and maintainability, so it’s definitely good to know there’s now a proper callable interface class available.

Type Traits

I’m only going to skim this one because it leans heavily in the direction of template metaprogramming, in whose waters I’ve dipped just the most tentative toe. However, I think the basic mechanism is simple enough to explain, so I’ll try to illustrate with some simplistic examples and leave it as an exercise to the reader to extrapolate into, say, an entire BitTorrent tracker that runs wholly at compile time3.

C++ has had templates for quite a long time — these are a great way to write generic code across multiple types. You make some basic assumptions about your type (e.g. it has a method foo() or supports operator +) and the compiler should let you know if your assumptions don’t hold at compile time — possibly in a fairly cryptic manner, but it should let you know.

Since it can be hard to write pleasant/optimal/correct code that works for a range of disparate types, there are also template specialisations. These are alternative implementations of the template for specific types which can override the generic implementation. So far so C++03.

This system is quite restrictive, however, in that it’s quite hard to write code that modifies its behaviour in more subtle ways based on the types it’s given. It’s also hard to write a system that will restrict the types on which you can instantiate your template. If you write code to convert between big- and little-endian, it would be nice if someone using it on a non-integral type would get a pleasant compile-time error as opposed to randomly scrambling bytes around at runtime.

This is where type traits come in handy. Essentially they’re just expressions that return information about a type, evaluated at compile-time. One of the simplest uses is in concert with C++11’s new static_assert feature, to constrain the types of templates.

template <typename T>
T swapEndian(T value)
{
    // Note: in C++11 static_assert requires the message argument.
    static_assert(std::is_integral<T>::value,
                  "swapEndian() requires an integral type");
    // Implementation left as an exercise
}

As well as std::is_integral there’s also std::is_floating_point, std::is_array, std::is_pointer and more. Things like std::is_rvalue_reference could come in handy when writing complex code that attempts to support move semantics efficiently, and perhaps std::is_trivially_copyable could be used to swap in some optimisations in common cases. I also think that std::is_base_of sounds very promising for enforcing interfaces.

Not all of the expressions are boolean in nature, although it has to be said that most of them are. The few exceptions include std::rank, which evaluates the number of dimensions in an array, or 0 for non-array types; and std::extent, which evaluates to the size of an array in a specified dimension, or 0 for unknown bounds or non-array types. I could imagine these could make it very convenient to implement efficient templated operations on fixed-size arrays without the hassle of tricks like null-terminating them.
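For example, a quick sketch of std::rank and std::extent in action:

#include <iostream>
#include <type_traits>

int main()
{
    typedef int Grid[3][4];
    std::cout << std::rank<Grid>::value << std::endl;       // 2 (dimensions)
    std::cout << std::extent<Grid, 0>::value << std::endl;  // 3 (first bound)
    std::cout << std::extent<Grid, 1>::value << std::endl;  // 4 (second bound)
    return 0;
}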

As I said, I’m just touching on the surface here, but I think it’s worth bearing these new facilities in mind even if you just plan to use fairly simplistic forms of templating.

Return Type Inference

With prolific use of templates and overloading, it can be quite difficult to implement generic classes. For example, if you want to implement map() where some functor is applied to every item in a container then you have to decide how you’re going to template it. You could just assume that the return type matches the parameter types, but this is a constraint it would be nice to do without.

C++11 has therefore introduced std::result_of which, when passed any callable, will evaluate to the return type of that callable. This is perhaps demonstrated best with an (extremely contrived!) example:

#include <type_traits>
#include <iostream>

struct PercentageConverter
{
    double operator()(int value);
    int operator()(double value);
};

double PercentageConverter::operator()(int value)
{
    return static_cast<double>(value) / 100.0;
}

int PercentageConverter::operator()(double value)
{
    return static_cast<int>(value * 100.0);
}

int main()
{
    PercentageConverter instance;
    std::result_of<PercentageConverter(int)>::type res1 =
            instance(50);
    std::cout << res1 << std::endl;
    std::result_of<PercentageConverter(double)>::type res2 =
            instance(0.15);
    std::cout << res2 << std::endl;
    return 0;
}

One of those things that’s probably incredibly useful in the few cases that you need it, but unless you’re using templates in a fairly heavy-duty manner, there’s a good chance that you can live in blissful ignorance of its existence.

Conclusion

So that’s (finally!) the end of my posts on C++11. Wow, it’s taken rather longer than I first anticipated, but hopefully it’s been useful — I know it has been for me, at least.

As a final note, if you have any more C++ questions in general, I can strongly recommend both the C++ Super-FAQ and cppreference.com as great resources. Personally I’d strongly suggest consulting them before resorting to the usual Stack Overflow search as their hit rate is great and the quality of the information is fantastic.


  1. For example, my mild dislike of Java has very little to do with the language, which I feel is quite adequate, and a lot to do with the standard library. I don’t like living in the kingdom of the nouns. 

  2. And let’s face it, there was rather too much of it. 

  3. Don’t worry, this is a joke. At least I hope so. 

Fri 24 Apr 2015 at 11:33AM by Andy Pearce in Software tagged with c++  |  See comments

Strolling Through The Glen

A couple of weeks ago Michelle and I took Jackie to Sherwood Forest, which she’d wanted to see for a long time. As per usual I snapped a few pictures while we were there.

Another Adobe Slate story - I’m really loving this workflow, so much less hassle than writing proper blog entries. Let’s hope Adobe don’t do a Google and decide to simply stop offering the service in a couple of years.

Strolling Through The Glen

Wed 22 Apr 2015 at 07:30AM by Andy Pearce in Outings tagged with photos, michelle, aurora and jackie  |  See comments

Off to Bassett

On Good Friday Michelle, Aurora, Jackie and I visited my parents for the day.

As a bit of an experiment I’ve written a blog entry with Adobe Slate and I’m seeing how well the link embeds in my blog. So without further ado, here it is.

Off to Bassett

Wed 15 Apr 2015 at 07:13PM by Andy Pearce in Outings tagged with photos, michelle, aurora, jackie and mum-and-dad  |  See comments
