☑ Netflix Neutrality

Well, after almost a year’s downtime I’ve finally found time to get my blog up and running again and even actually write an entry. Spurred by articles on Netflix’s rampant traffic growth last month, I’ve decided to lay down my thoughts on the related topic of Net Neutrality, which may be somewhat at odds with many in the technology community. This is a fairly old topic these days, but one that I think will be relevant for some time to come yet.

You’ve probably heard of the online video rental service Netflix — in case you haven’t, they’re a company who started as a flat rate DVD rental service in the US but are now best known for their online streaming offering, making their entire catalogue available for watching online for a flat monthly fee.

Well, Netflix have enjoyed some pretty significant growth. Last month, for example, it was announced that they comprised almost 35% of downstream traffic and even 10% of upstream ACKs — that’s a massive proportion of bandwidth for anyone, not least for a company whose market cap is only around 5% of Google’s. This growth in traffic speaks to Netflix’s rampant popularity, but this success has also brought them some pretty stern opponents — primarily ISPs1.

ISPs seem rather bitter about the success of companies such as Google and Netflix. This is because the ISPs feel that these companies are only able to make their profits because of the network infrastructure that the ISPs have built out at their own expense. This is a rather debatable position, which I’ll come to in a moment, but whether justified or not in recent years the ISPs have become increasingly desperate to somehow jump on the bandwagon and monetise the rampant success of online services.

Clearly the primary source of revenue for an ISP is subscription fees — could they just charge their customers a higher flat rate? Well, increasing fees is generally unpopular, especially in areas where there’s little competition, and would most likely attract a lot of negative attention from regulators. Another possibility for making additional money is to provide their own services, but in practice these are almost invariably significantly less compelling than existing players and gain very little market traction. It shouldn’t be a surprise, since the existing providers make this their sole business — in short, they know what they’re doing.

Faced with these avenues being frustrated, ISPs have instead made moves to try to leech some of the revenue streams away from service companies (e.g. Google and Netflix). They have a trump card to play to achieve this, which is to put roadblocks between their users and the servers used by these companies such that the service gets degraded or removed completely. End users typically don’t know the reasons for this and assume the service is poor, placing their pressure on the provider of the service to fix it. Indeed, exactly this has happened to Netflix where the ISP Comcast in the US started slowing down their data — the problem got so bad that some users were totally unable to watch movies and cancelled their Netflix subscriptions.

Understandably the likes of Google and Netflix are none too happy with what they view as unfair business practices. These fears were proved somewhat justified in Netflix’s case where they were forced to cave in and start paying ISPs for unthrottled access to their customers. This turn of events is a concern to a lot of the online services companies who feel that it’s the thin end of a wedge that’s going to leave them at the mercy of powerful ISPs. As a result, for years now they’ve been lobbying governments worldwide, but primarily in the US, to pass laws enforcing what’s typically known as Net Neutrality — put simply, the notion that nobody gets to artificially accelerate, slow down or block traffic from one particular source relative to others.

Under Net Neutrality, Comcast wouldn’t be allowed to slow down Netflix movie streaming, for example, although they would be able to, say, offer a uniformly slow service for a lower cost to consumers (i.e. “fair” throttling). As well as these companies, a lot of grassroots Internet advocates also support Net Neutrality, believing that it will help protect the Internet in its current fairly open form from undue corporate interference which could harm the quality of services now and in the future.

Now the actions of all of these parties are quite understandable from their own points of view, but frankly I believe they’re all rather misguided — for the remainder of this post, I’ll try to explain why.

Firstly, as much as the ISPs tend to be demonised in these debates, it can’t be denied that they have a potential problem. As public companies they have shareholders, and like any shareholders they want to see growth in the company and a return on their investment. If ISPs are blocked from innovating to achieve this growth then they’ll stop attracting investment; and that means they’ll be unable to afford to upgrade their infrastructure; and that means gradually degrading service for everyone — that’s a lose/lose scenario. Hence it’s totally natural to see why they’d oppose major legislative constraints on their business.

Secondly, and conversely, it is certainly a problem that ISPs are in the position where they can so easily inflict untold damage on the bottom line of companies who sell their services online. At its most extreme this could develop into a form of protection racket, where the ISPs can extract arbitrary sums of money from other companies in return for access — I’m not suggesting that it’s anything like this at present, but even the possible risk is clearly not an attractive state of affairs.

Thirdly, a lot of people seem to be putting a lot of faith into legislation to sort this out — they believe some strong Net Neutrality laws will block the ISPs from “abusing” their networks and interfering with other people’s business. But they apparently forget the atrocious record that governments have in passing laws that intersect with business, and especially technology. Those who make policy just do not understand the issues enough to make informed choices, so it becomes a battle of the lobbyists. Let us not forget that ISPs are typically well established companies with plenty of lobbyists of their own, so it’s not at all clear that some awful tangled up mess of a compromise won’t emerge at the end that doesn’t particularly please anyone except the lawyers who’ll get to wrangle over it in legal disputes for decades.

Finally, even if we could rely on government to pen some good legislation now which is fit for purpose and achieves the stated goals, how can we assess the potential for stifling future innovation? There are quite legitimate uses for traffic prioritisation — for example, a cable company might decide to deliver all of its TV services over its IP infrastructure and customers are clearly going to expect to receive this without interruption even if their neighbour is downloading a hundred movies over BitTorrent. Or perhaps a hospital decides to allow doctors to perform operations remotely via the Internet and requires the ISP to give this important traffic priority. Preventing ISPs from implementing this sort of mechanism risks harming both their service and customers in the future by preventing them from making best use of their infrastructure.

In summary, I accept there are problems to solve in the ISP business, but I don’t accept that Net Neutrality legislation is necessarily the best solution to them. So what is?

What about good old competition?

The root cause of these problems, in my mind, is not that the ISPs are able to make decisions about their own networks which affect other companies — the problem is that their consumers are, in many cases, lacking any viable alternative provider they can move to. This problem appears to be particularly acute in the US, where some areas really have no practical choice at all for high speed Internet access. This means the ISPs have more or less carte blanche to implement whatever draconian traffic management policies they like, and their customers can do very little but complain about it. Were there viable alternatives, customers could move away in droves and effectively force a reverse of the policy. For this to work adequately we could perhaps have some light legislation that forces ISPs to make public their traffic shaping measures, but that’s really just tightening up existing regulations on requiring products and services to be accurately described.

Let us not also forget that ISPs don’t just look to exploit other companies — they’ve shown a willingness to exploit their customers as well, admitting that data caps are more about extracting additional revenue than managing network congestion, which is the reason they typically cite publicly. Competition and free market economics handily resolve these sorts of issues as well, whereas Net Neutrality doesn’t say a lot about ISP charging structures, subscription fees or fairness to customers.

Of course, it’s undeniable that competition in this market is incredibly hard to engineer — a lot of basic infrastructure has to be built for even a basic service, and that’s ignoring all those difficult issues like getting permission to dig up thousands of roads. There are ways this can be done, however, such as “local loop unbundling” or LLU, where the government instead force the incumbents to open up their “last mile” infrastructure to the competition — that’s the connection to your house, and it’s the expensive part that prevents competing entities starting a business.

This might seem unfair to incumbent ISPs but let’s not forget that, for example, the US cable companies only got their local monopolies with a little help from the government in the first place. It’s also important to note that while ISPs may be jealous of the profits of service companies, they’re still large companies with a healthy bottom line — compare Comcast’s profits of $2.6bn last quarter with Netflix’s rather more modest $60m, so incurring some capex in the short term to unbundle their services isn’t going to send them bankrupt.

The use of LLU has been quite successful in countries like the UK, for example. Indeed, earlier this year the incumbent BT asked the regulator to allow it to charge more for LLU access, which is probably a pretty good sign that competition is working. Also, LLU is just one example of an approach that’s been shown to work — I’m sure there are other forms of government intervention which could encourage competition.

Overall, therefore, I would argue that although fostering competition may be the harder path, ultimately the market will be a lot healthier for everyone, not least of which the consumer, if fair market economics is allowed to assert itself instead of relying on the vagaries of government legislation to swing the balance of power around one way or another.

Whilst we’re on the subject I should also mention that this isn’t regarded as a purely legislative or policy issue by everyone — some people believe that a technical solution is a feasible alternative. The most likely candidate seems to be switching to some sort of P2P system, where users share video streams among themselves as well as from Netflix’s servers. This approach is used to implement the popular music service Spotify, for example. It’s not without its potential issues, but P2P company BitTorrent have said they think this is the best approach. Well, OK, so they’re a P2P company, of course they’d say that, right? But it’s actually not so crazy — it looks like Netflix has advertised for someone with the relevant skills already, so it seems as if they’re at least investigating this possibility.

Personally I think that’s a mixed bag. On the one hand, it would undoubtedly make it a lot harder for ISPs to selectively block or throttle Netflix’s traffic; on the other hand, ISPs like Comcast have shown in the past that they’re quite willing to take on the challenges of blocking P2P traffic and if Netflix decided to stop paying them then it’s quite possible they’d be up for the fight of doing so again. I think it’s also debatable whether a catalogue as large as Netflix’s would benefit from P2P’s swarm effect — the efficiencies will tend to come when a large number of people are concurrently watching a small amount of content. This might work with popular new releases, for example, but the long tail of Netflix’s content may mean that much video just gets streamed from its servers anyway due to lack of other peers. Finally, there are complex issues surrounding use of storage on customer machines and concerns from rights holders over storing their content for longish periods on computers outside of Netflix’s control. I’m sure all these issues are quite resolvable, but it’s certainly not a simple topic.

In conclusion, then, like many issues in life it’s complicated. Parties on both sides have legitimate concerns; and parties on both sides have, at times, been quite disingenuous with their petitions to the government and the public. I think it boils down to whether we can find a way to allow the tried and tested mechanisms of the free market to take control; or whether we’re going to roll the dice and let a handful of government policy makers come up with some sort of legislation that may help or hinder and whose long-term effects are hard to predict.

I know which I’d prefer, but I think there’s only one thing that we can all be certain of — this debate is likely to rumble on for a very long time yet before it’s decided one way or another.


  1. Internet Service Providers (ISPs) are those companies that provide Internet access to the public. 

Fri 19 Dec 2014 at 07:49AM by Andy Pearce in Software tagged with internet and net-neutrality  |  See comments

☑ C++11: Other language changes

This is part 6 of the “C++11 Features” series which started with C++11: Move Semantics.

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers a miscellaneous set of language improvements which I haven’t yet discussed.

This post contains a collection of smaller changes which I didn’t feel were a good fit into other posts, but which I wanted to cover nonetheless.

nullptr

C++11 has finally added a type-safe equivalent of C’s NULL macro for pointers so one no longer has to use 0 and risk all sorts of confusion where a function has overloads that take an integral type and a pointer. The new constant is nullptr and is implicitly convertible to any pointer type, including pointer-to-members. Its type is nullptr_t. To remain backwards-compatible, the old 0 constant will still work.
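To illustrate the sort of confusion this avoids, here’s a minimal sketch (the handle() overloads are hypothetical):

void handle(int value);
void handle(char *pointer);

handle(0);       // calls handle(int), even if a null pointer was intended
handle(nullptr); // unambiguously calls handle(char*)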

Nuff said on that one.

Type-safe enumerations

In C++03 enumerations seem like a wonderfully clear and safe way to specify arbitrary groupings of values. Unfortunately they suffer from a few issues which can quite easily bite you. The main problems stem from the fact that they’re just a teensy tiny dusting of syntactic sugar over plain old integers, and can be treated like them in most contexts. The compiler won’t implicitly convert between different enum types, but it will convert between them and integers quite happily. Worse still, the members of the enumeration aren’t scoped but are exposed directly in the outer scope — the programmer almost invariably ends up doing this scoping with crude name prefixing, which is ugly and prone to inconsistency.

Fortunately C++11 has remedied this lack by adding a new syntax for declaring type-safe enumerations:

enum class MyEnum
{
    First,
    Second,
    Third=103,
    Fourth=104
};

As can be seen, values can be assigned to enumeration members or the compiler can assign them. The identifiers here are scoped within the MyEnum namespace, such as MyEnum::First, so two different enumerations can freely use the same constants without concern. Also, these values can no longer be compared with integers directly, only with other members of the same enumeration.
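A contrived sketch of the improved type safety:

MyEnum value = MyEnum::First;

if (value == MyEnum::Second) { /* ... */ }  // fine: same enumeration
// if (value == 0) { /* ... */ }            // error: no implicit comparison with int
int n = static_cast<int>(MyEnum::Third);    // explicit conversion still works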

One of the more minor, but still occasionally annoying, problems with enumerations in C++03 was that the underlying integral type was implementation-specific, and could even vary according to the number of items in the enumeration, which could lead to portability problems. As of C++11 the underlying integral type of an enum class is always well-defined: it defaults to int in declarations such as that shown above, but can be explicitly specified like so:

enum class MyBigEnum : unsigned long { /* ... */ };

There’s also a transitional syntax to allow legacy enumeration declarations to benefit from just this change:

enum MyOldEnum : unsigned long { /* ... */ };

Finally, new-style enumerations can also be forward-declared, something that wasn’t possible in C++031, as long as the underlying type is known (either implicitly or explicitly):

enum MyEnum1;                 // Illegal: legacy syntax, no type
enum MyEnum2 : unsigned long; // Legal in C++11: type explicit
enum class MyEnum3;           // Legal in C++11: type implicit (int)
enum class MyEnum4 : short;   // Legal in C++11: type explicit
enum class MyEnum3 : short;   // Illegal: can't change type once declared

Of course, as the final example shows, it’s not legal to change the underlying type once it’s been declared, even if it was originally specified only implicitly.

Array “for” loops

Iterating with a for loop is a common occurrence in C++ code, but the syntax for it is still rooted in its C origins. It’s a flexible construct which has served us well, but there are times when it’s just unpleasantly verbose. As a result, C++11 has added a new version of for which works with a limited set of iterables:

  • C-style arrays
  • Initializer lists
  • Any type with begin() and end() methods (e.g. STL containers)

The new syntax is basically a clone of the same construct in Java:

int myArray[] = {0, 1, 2, 3, 4, 5, 6, 7};
for (int& x : myArray) {
    // ...
}

Note that within the loop x will be a reference to the real values in the array so they may be modified. I could also have declared x as a plain int instead of int& and, as you might expect, this would copy each value into x as the loop runs, so modifications wouldn’t be reflected in the original array.

This is particularly convenient for STL-style containers when combined with type inference:

std::map<std::string, std::string> myMap;
myMap["hello"] = "world";
myMap["foo"] = "bar";

// Old C++03 version
for (std::map<std::string, std::string>::iterator it = myMap.begin();
     it != myMap.end(); ++it) {
    std::cout << it->first << ": " << it->second << std::endl;
}

// New C++11 version
for (const auto& entry : myMap) {
    std::cout << entry.first << ": " << entry.second << std::endl;
}

Note how with the new syntax there’s no iterator to dereference: the loop yields each element directly (for a map, a std::pair). Declaring the loop variable as a const reference avoids copying each pair on every iteration.

Explicit Conversions

Operator overloading allows classes to work intuitively in similar ways to builtins and one application of this is value conversion — for example, overloading operator bool() allows a class instance to be evaluated in a boolean context. Unfortunately C++’s implicit type conversions mean that providing such operators also brings with it a slew of potentially unwanted other behaviour, which leads to ugly workarounds such as the safe bool idiom.

As a cleaner solution, C++11 has extended the possible uses of the explicit keyword to cover such conversion functions. Using this for the bool conversion, for example, allows the class to operate as a boolean but prevents it being further implicitly converted to, say, an integral type.

class Testable
{
  public:
    explicit operator bool() const { /* ... */ }
};
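A quick sketch of the effect:

Testable t;

if (t) { /* ... */ }            // fine: contextual conversion to bool
bool b = static_cast<bool>(t);  // fine: explicit conversion
// int i = t;                   // error: no further implicit conversion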

Unicode and Raw String Literals

In C++03 there are two types of string literal:

const char normal[] = "normal char literal";
const wchar_t wide[] = L"wide char literal";

Wide character support is of an unspecified size and encoding, however, which sometimes limits its usefulness. C++11 improves the situation significantly by adding support for the encodings UTF-8, UTF-16 and UTF-32:

const char utf8[] = u8"UTF-8 encoded string";
const char16_t utf16[] = u"UTF-16 encoded string";
const char32_t utf32[] = U"UTF-32 encoded string";

Within these types the escape \uXXXX can be used to specify a 16-bit Unicode code point in hex and \UXXXXXXXX a 32-bit one.

My hope is that wide character support can now quietly expire and be replaced by the standard UTF encodings that everyone should be using. Worst case, I would hope all platform vendors would be working towards wchar_t becoming simply an alias for one of the UTF types.

In addition to unicode strings, C++11 also adds a new syntax for reducing the need for escaping special characters such as quotes within strings:

const char old[] = "Quotes \"within\" strings must be escaped.";
const char raw[] = R"xyz(Little "escaping" \ "quoting" required)xyz";

The delimiter (xyz above) can be anything up to 16 characters, and can be chosen so as it doesn’t occur in the string itself. The delimiter can also be empty, making the literal R"(...)".
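Raw literals are especially handy for things like Windows paths or regular expressions, where backslashes abound. A couple of illustrative (made-up) examples:

const char path[] = R"(C:\temp\file.txt)";               // no doubled backslashes needed
const char pattern[] = R"rx(\d+ "items" \(approx\))rx";  // delimiter chosen to avoid )" in the content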

User-Defined Literals

I’ll outline this only briefly as I haven’t had much cause to play with it myself yet, but C++11 has added the ability to define new types of literal.

Going back to pure C, it’s been possible to clarify the type of a potentially ambiguous literal. For example, 1.23 is a double, but add the f suffix to form 1.23f and the literal is instead of type float. In C++11 the programmer can define new such suffixes to convert raw literals to specific types. These conversions take the form of functions which can accept either the raw form of the literal as a string:

long operator"" _squareint(const char *literal)
{
    long value = strtol(literal, NULL, 10); // Check errors, kids
    return value * value;
}

long foo = 12_squareint; // foo has value 144

Alternatively the code can rely on the compiler to convert the literal to a numeric or string type and use that instead:

// Allow literals in any time unit to be stored as seconds.
unsigned long long operator"" _hours(unsigned long long literal)
{
    return literal * 3600;
}
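Usage then reads quite naturally:

unsigned long long shift = 8_hours; // shift == 28800 (seconds)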

I must admit I suspect I’ll have limited use for this, but I suppose it’s potentially a nice idea for code which makes heavy use of a particular type - complex numbers spring to mind, for example.

Static Assertions

C/C++ provide the assert() facility for checking invariants at runtime and the #error directive for compile-time errors in preprocessor macros. However, templates can also benefit from compile-time checks and the new static_assert keyword allows this:

template <class TYPE>
class MyContainer
{
    static_assert(sizeof(TYPE) >= sizeof(int), "TYPE is too small");
};

Alignment

Finally, those targeting many architectures may rejoice that C++11 has added alignof and alignas to query and force the memory address alignment of variables. If you don’t know what alignment is, you probably don’t need to know. Seriously, don’t worry about it — go back and read about lambdas again.
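For those who are still reading, a tiny sketch of both keywords:

alignas(16) char buffer[64];    // buffer is placed on a 16-byte boundary
auto a = alignof(double);       // query a type's required alignment (commonly 8)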


  1. The reason is that the size of the enumeration type couldn’t be known before the full list of members was declared, because implementations were allowed to vary the underlying type based on the number of members.  

Sun 19 Jan 2014 at 11:09PM by Andy Pearce in Software tagged with c++  |  See comments

☑ C++11: Template changes

This is part 5 of the “C++11 Features” series which started with C++11: Move Semantics.

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers improvements to template declaration and instantiation.

Extern templates

As fancy as templates seem, they’re really just a type-safe version of pre-processor macros. That is, they do syntactic substitution instead of simple textual substitution. This is part of the reason they’re powerful, but also brings downsides.

One of the main drawbacks is related to compile time and code size, and this is really an interaction between the way templates work and the way C and C++ code is compiled. Because each source file is compiled into an independent object, and the compiler has no way to know in advance how these will be linked together, the compiler has to generate any template code which is used in that source file. Also, it’s possible to force instantiation by simply declaring each instantiated type:

template class std::vector<int>;
template class std::vector<double>;

This all works for a single object file, but what happens when you link these object files together? Well, if you’re lucky (read: using a halfway recent linker on a well-supported platform) then the code will be automatically collapsed together at link time. If you’re unlucky, you’ll get potentially many copies of each template instantiated — this will work, but leads to unnecessarily large compiled code, with resultant performance implications due to reduced caching potential.

Even if you’re lucky then compile times are still higher because the compiler has to generate those same template instantiations for each compilation unit, even if they’ll later be collapsed together during linking.

So, how does C++11 help here? Well, they’ve added a pretty simple construct which allows you to forward-declare template instantiations:

extern template class std::vector<int>;

This is similar to a function prototype — it tells the compiler to assume that a definition of the template instantiation will be provided elsewhere so it doesn’t need to generate one. Of course, if no such instantiation exists then a link error will result.

This means that, for example, code which defines templated classes can use the original C++ syntax above to create instantiations for common types in its source file, and then declare them extern in its header file, so clients of the class need not generate their own. This could provide definite compile time (and potentially code size) improvements for large projects when used with common classes such as containers.
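As a sketch of how that might look in practice (with hypothetical names):

// widget.h
template <typename T>
class Widget { /* ... */ };

extern template class Widget<int>; // clients shouldn't instantiate this themselves

// widget.cpp
template class Widget<int>;        // the single shared instantiation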

Alias templates

Templating is a great way to make code generically useful whilst maintaining the type safety which is one of the biggest advantages of a statically typed language. However, add a few levels of nesting and things can rapidly become cumbersome — consider the following loop declaration:

for (std::map<std::string, std::list<std::string> >::iterator it = myMap.begin();
     it != myMap.end(); ++it) {
    // ...
}

One way to cut down on this verbosity is to use typedef:

typedef std::map<std::string, std::list<std::string> > MyMapType;
for (MyMapType::iterator it = myMap.begin(); it != myMap.end(); ++it) {
    // ...
}

This works fine for template instantiations (i.e. with all template parameters specified) but it’s not legal to do this for templates which have not been instantiated in C++03. In C++11 this has now been allowed via a new use of the using keyword:

template <typename ListType>
using MapStringToListType = std::map<std::string, std::list<ListType>>;

In the example above the type MapStringToListType<int> can be used as an alias for std::map<std::string, std::list<int>>.
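Usage then looks like any other template instantiation:

MapStringToListType<int> myMap; // i.e. std::map<std::string, std::list<int>>
myMap["primes"].push_back(2);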

This corrects an omission which may have perhaps been only occasionally annoying, but in those cases it was particularly annoying. There are a few limitations, the major ones being that you can’t partially or fully specialise an alias template; and alias templates are never used as candidates for automatic template argument deduction.

Variadic templates

Variadic functions have existed since the early days of the C programming language. Probably the most well known example is the printf() function with the following prototype:

int printf(const char *fmt, ...);

The ellipsis in there denotes an arbitrary number of arguments follow and the compiler and standard library together provide runtime support for determining the number of arguments passed and iterating through them.

In C++11 this feature has been brought to templates:

template<typename... TYPES>
class Aggregate;

Instantiations of such templates can take any number of template arguments - for example:

Aggregate<int, float, std::map<std::string, int>> aggregateInstance;

The language doesn’t provide a mechanism to iterate over the template parameters, however, so often they’re used recursively. The example below shows how the ellipsis can be used to pass a variable number of template arguments into a function:

template<typename TYPE>
void print_stuff(TYPE arg)
{
  std::cout << "Arg: " << arg << "\nEND" << std::endl;
}


template<typename FIRST, typename... REST>
void print_stuff(FIRST first_arg, REST... rest_args)
{
  std::cout << "Arg: " << first_arg << "\n";
  print_stuff(rest_args...);
}
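Calling it with a mixed bag of arguments then peels them off one at a time:

print_stuff(1, 2.5, "three");
// Output:
// Arg: 1
// Arg: 2.5
// Arg: three
// END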

I suspect the legitimate use-cases for these are fairly limited, but at least there’s now a certain amount of type-safety involved, which is more than can be said for traditional varargs functions1.

Right angle bracket

Finally, a teensy little change to the parse rules to address a long-standing annoyance of mine. In C++03 nested template instantiations required an annoying extra space to prevent the parser identifying the double-angle bracket as a bitshift operator:

std::vector<std::vector<int>> myVector;     // Incorrect in C++03
std::vector<std::vector<int> > myVector;    // Corrected version

In C++11 the parsing rules have been modified so that both of the above examples would be valid and equivalent. If, for some reason, the old behaviour is required then it can be forced with appropriate bracketing (but that seems unlikely!).


  1. With the debatable exception of printf(), which is common enough that some compilers (e.g. gcc) allow a varargs function to be flagged as printf-style and will then type-check its arguments against the format string. This is the only such case of type-safety in traditional varargs of which I’m aware, however. 

Tue 26 Nov 2013 at 07:12AM by Andy Pearce in Software tagged with c++  |  See comments

☑ C++11: Function and method changes

This is part 4 of the “C++11 Features” series which started with C++11: Move Semantics.

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers various changes to function and method declaration and definition.

Defaulted and deleted methods

C++11 contains various changes which perhaps don’t expand what’s possible, but certainly improve the clarity with which they can be expressed. Among these changes are explicit defaulting and deleting of methods.

C++ has long had the ability for the compiler to automatically generate some special methods on demand — specifically, it will generate:

  • Default constructor: a constructor which takes no arguments.
  • Copy constructor: a constructor which takes only a const reference to its own type.
  • Copy assignment operator: assignment from a const reference to its own type.
  • Destructor: performs various low-level cleanup functions.

Incidentally, C++11 adds two more to this list:

  • Move constructor: as the copy constructor but using move semantics
  • Move assignment operator: as the copy assignment operator but using move semantics.

However, there are a number of limitations to this in C++03, where the compiler either generates a method that you don’t want or fails to generate a method that you do. For example, you may wish the compiler to provide a default constructor even if there are other constructors defined on the class (normally it only generates one if no other constructors are defined). Or perhaps your class semantics are such that copying it will lead to double-freeing some internal resource so you want to disable the copy constructor and assignment.

Some of this has always been possible with little tricks — for example, you could define a private copy constructor and fail to provide an implementation, leading to compile errors if anybody tries to use it. However, in C++11 the default and delete keywords provide a way to do this cleanly and explicitly in a manner consistent with the syntax for declaring a pure virtual method:

class Uncopyable
{
public:
    Uncopyable() = default;
    Uncopyable(const Uncopyable&) = delete;
    Uncopyable& operator=(const Uncopyable&) = delete;
};

This class first declares explicitly that it wants the compiler to generate a default constructor for it. Then it explicitly prevents the compiler from generating a copy constructor and assignment operator, in essence preventing any code from copying instances of the class.

When combined with appropriate templating, the delete keyword can also be used to restrict a function call to only taking one or more explicitly defined types:

class OnlyLongs
{
public:
    void method(long int arg);
    template<class T> void method(T) = delete;
};
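The effect is that anything other than a long int is rejected rather than silently converted:

OnlyLongs ol;
ol.method(123L);    // fine: calls method(long int)
// ol.method(42);   // error: method<int> is an exact match, but deleted
// ol.method(3.14); // error: method<double> is deleted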

I think of this as conceptually extending the notion of an explicit constructor to regular methods, although I’m sure C++ experts might wince at the comparison.

Explicit overrides and final

In a similar vein, it’s now possible to override base class methods more reliably in C++11. Consider the following inheritance:

class Base
{
public:
    virtual void method(double arg);
};

class Derived : public Base
{
public:
    virtual void method(int arg);
};

Perhaps the author of Derived intended to override the virtual method(), but has instead overloaded it with an alternative version taking an int instead of a double. This might seem an obvious mistake, but consider that perhaps Base::method() took an int when Derived was originally defined but has since been modified by someone who was unaware of the existence of Derived. Due to function overloading this is quite valid so the compilation won’t fail.

In C++11 this sort of problem can be avoided by marking a method with the override attribute, specifying that it must explicitly override a method with the same signature in a base class or a compilation error will result:

class Base
{
public:
    virtual void method1(double arg);
    virtual void method2(double arg);
};

class Derived : public Base
{
public:
    virtual void method1(double arg) override; // this is fine
    virtual void method2(int arg) override;    // fails to compile in C++11
};

Similar to override is final which is used in the same way but explicitly prevents a derived class from overriding a method. It can also be used on a base class to prevent any other classes deriving from it further:

class Base1 final { };
class Derived1 : public Base1 { }; // fails to compile, Base1 is final

class Base2
{
public:
    virtual void method(int arg) final;
};

class Derived2 : public Base2
{
public:
    virtual void method(int arg); // fails to compile, method() is final
};

I haven’t quite made up my mind what I think of this particular feature yet - it seems like it could be useful in some limited cases, but I worry a little that base class authors may overuse it and make future improvements difficult. On the other hand, you already have to make things virtual explicitly in C++ and I suppose that final isn’t much higher on the risk spectrum than that.

Constructor chaining and in-place initialisation

One aspect of C++03 that’s a bit of a pain is the fact that you can’t call one constructor from another. So, if you want any commonality of code between constructors then you need to move it into another method and call it from the relevant places:

class Person
{
public:
    Person(const char* name)
    {
        init(name);
    }

    Person(const std::string& name)
    {
        init(name.c_str());
    }

    void init(const char *name)
    {
        // Initialise from name here
    }
};

This works fine but it’s a little annoying having that extra method. Derived classes also must re-implement their own constructors even if a base class constructor would have done the job just as well, which is also a pain.

C++11 solves these issues by allowing constructors to delegate to other constructors (i.e. call them), where the syntax is the same as that used to invoke a base class constructor:

class Person
{
public:
    Person(const char* name)
    {
        // Initialise from name here
    }

    Person(const std::string& name) : Person(name.c_str()) { }
};

One slightly subtle point to note here concerns default arguments in library code. Consider the following constructor which we’ll assume is somewhere in a library:

class FileReader
{
public:
    FileReader(size_t bufferSize=4096);
    // ...
};

This is fine as far as it goes, but remember that the constant 4096 is in the header file and hence is incorporated into all code which links against this library — this means that changing the constant requires recompilation of all the code which uses the library, even if the library itself is built as a shared object. However, using delegation we can instead arrange that the constant is fixed in the library code without compromising the interface offered to clients of the library:

// In header file:
class FileReader
{
public:
    FileReader(size_t bufferSize);
    FileReader();
    // ...
};

// In cpp file:
FileReader::FileReader() : FileReader(4096)
{
}

This is perhaps not a major concern outside those building shared libraries (or people with very large projects worried about recompilation time) but it’s worth bearing in mind.

In C++11 it’s also possible to expose a base class’ constructors directly in the derived class with a slightly quirky new use of the using keyword:

class Base
{
public:
    Base(int value);
};

class Derived : public Base
{
public:
    using Base::Base;
};
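With that using declaration in place, the base class constructor can be invoked directly on the derived class:

Derived d(42); // uses Base::Base(int), no forwarding boilerplate required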

Finally with regard to construction, it’s now possible to provide non-const members with initial values like this:

class Point
{
public:
    Point() { }
    Point(int x, int y) : x(x), y(y) { }

private:
    int x = 10;
    int y = 20;
};

These initial values apply to any constructors which don’t explicitly assign a different value in their initialiser lists. So in the example above, the default constructor leaves x=10 and y=20, whereas the explicit one leaves the values as whatever was passed in by the caller.
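In other words (values shown as comments for illustration, since the members are private):

Point a;        // a.x == 10, a.y == 20
Point b(1, 2);  // b.x == 1,  b.y == 2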

Trailing return type

On the vaguely related subject of method declarations in general, one thing I forgot to mention in my previous post on type inference was that as of C++11, function declarations can employ an alternative syntax where the return type is specified after the argument list.

To illustrate why this is useful, consider the following template function:

template <class Left, class Right>
Left divide(const Left& lhs, const Right& rhs)
{
    return lhs / rhs;
}

This is all very generic apart from one problem — the programmer has to specify in advance the return type which constrains the interface for those using it. For example, the original coder has specified here that the return type must be Left, but what about an instantiation where Left is int and Right is float, and the return type is expected to be float?

Since the compiler already must know the result type of the expression at compile-time, C++11 allows type inference to be used. Given my previous post on the topic, what we’d like is something like this:

template <class Left, class Right>
decltype(lhs / rhs) divide(const Left& lhs, const Right& rhs)
{
    return lhs / rhs;
}

This is invalid, however, because the decltype expression refers to parameter names which are not yet defined. To solve this, the new C++11 syntax of trailing return type is used:

template <class Left, class Right>
auto divide(const Left& lhs, const Right& rhs) -> decltype(lhs / rhs)
{
    return lhs / rhs;
}

Note that auto here means something slightly different to its use in type inference since it doesn’t actually do any deduction. The decltype in this case is doing the deduction, but this syntax can just as easily be used for standard function definitions:

auto divide(int x, float y) -> double
{
    return x / y;
}

Personally I’d stick to the standard syntax unless you’re using type inference, however.

Lambdas

Last on the list of ever-increasingly-loosely-related topics in this post is the subject of anonymous functions, or lambdas as they’re often known. These allow functions to be defined inline with other code, taking arguments and returning a value. The basic syntax is:

[capture](parameters) -> return_type { body }

The parameters take the form of a parameter list just as a standard function and you’ll note that the return type is specified using the alternative syntax described above. In the case of a lambda, incidentally, the return type can be omitted and the compiler will infer it, provided that all return expressions evaluate to the same type — similarly the parameter list can be omitted if empty, although I’m not sure if this is good practice. The function body is also that of a standard function. The capture list is related to making closures and I’ll cover that in a second — first let’s see an example.

Consider a container implementing a vector of strings where the user wishes to create a filtered version based on some arbitrary criteria they specify. A flexible way to implement this is to allow the user to pass in a function which is passed each item in turn and have it return a bool to indicate whether the item should be included in the filtered list:

#include <functional>
#include <string>
#include <vector>

class MyStrings
{
public:
    // ...
    std::vector<std::string>
    filteredList(std::function<bool (const std::string&)> func);

private:
    std::vector<std::string> _strings; // the underlying storage used below
};

std::vector<std::string>
MyStrings::filteredList(std::function<bool (const std::string&)> func)
{
    std::vector<std::string> filtered;
    for (auto it = _strings.begin(); it != _strings.end(); ++it) {
        if (func(*it)) {
            filtered.push_back(*it);
        }
    }
    return filtered;
}

Note the use of std::function — this is another C++11 addition which stands for any type of function which has the specified prototype. In this case it takes a const std::string& and returns bool. Note that you need to include functional to use std::function.

In principle this approach would work fine in C++03 with an appropriate function pointer or functor, but both defining a named function and defining a class with the function call operator overloaded are heavier than required here. Also, including the filter function in the context of the calling code makes it easier to follow, rather than having to jump elsewhere in the code to find it.

So, how are things better in C++11? Enter lambdas:

std::vector<std::string> getStringsWithHello(MyStrings &strings)
{
    return strings.filteredList([](const std::string& str) {
        return str.find("hello") != std::string::npos;
    });
}

Please excuse the indentation — I’m still fairly new to these things so I haven’t settled on an appropriate coding style. Hopefully the intent of this (not particularly useful) function should be clear, however, and it’s entirely self-contained. Note that we’ve omitted the return type as described above.

To make this function remotely useful in the real world we need to be able to search for any string, not just "hello", and for this we can use the capture list. Effectively this specifies inputs to the lambda from the defining code (as opposed to the code which calls the lambda once it’s defined), but the form this takes is direct access to the local variables in the scope of the defining function. This is usually known as a closure. Let’s see the example above updated to use this:

std::vector<std::string>
getStringsWithSubstring(MyStrings &strings, const std::string& substring)
{
    return strings.filteredList([&substring](const std::string& str) {
        return str.find(substring) != std::string::npos;
    } );
}

In this example the substring parameter has been defined in the capture list of the lambda to make it available to the code in the body. In this case &substring means “capture the substring variable by reference”. It’s worth noting that because it’s captured by reference, it’s a reference to the real value on the stack in the context of getStringsWithSubstring() (which in turn is a reference to the value in the caller as it happens in this example) so any changes the lambda makes will be reflected in the outer scope.

Of course, like any reference to a local variable within a function it will no longer be valid once the function has returned. As a result, if you wish to return a lambda from a function then you’d better capture its locals by value instead of by reference — just omit the &:

std::function<bool (const std::string&)>
getFilterFunction(const std::string& str)
{
    return [str](const std::string& s) {
        return s.find(str) != std::string::npos;
    };
}
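The returned closure keeps its own copy of str, so it remains safe to call long after getFilterFunction() has returned:

auto hasFoo = getFilterFunction("foo");
bool found = hasFoo("seafood"); // true: "seafood" contains "foo"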

For methods it’s also possible to include this in the capture list, which will give the lambda the same implicit access to the members of the class that methods have (i.e. there’s no need to explicitly mention this). Also, a lone & can be used to allow the compiler to automatically determine which variables to capture by reference, and a lone = is equivalent except that values are captured by value. So, some examples of capture lists:

  • [] means capture nothing.
  • [foo,&bar] means capture foo by value and bar by reference.
  • [&] means capture everything by reference.
  • [=,&foo] means capture everything by value except foo, which is captured by reference.
  • [this,foo] means allow the lambda access to class members as well as capturing foo by value.

Bear in mind that both [&] and [=] also include this implicitly. This is very convenient, but it does mean that, inside lambdas which use these specifiers, you need to employ the same care when accessing member variables as you would in the methods themselves. Personally I’d suggest keeping your capture list as tightly specified as possible, since I have a great preference for making things explicit in code — that’s very much a personal view, however.

This might all seem a little strange to anybody not familiar with closures in other languages so it might help to not consider lambdas as functions per se, but instead as little class instantiations with the function call operator overloaded. These instances can maintain their own copies of the variables passed into the capture section and use them in the body of the lambda when it’s called. In fact, this is probably more or less how the compiler implements them, but the syntax is a good deal more convenient than attempting to do the same.

Anyway, that was a bit of a long post, but hopefully covered some interesting ground. Well, you’re still reading, aren’t you? What do you mean you just skipped to the end to see if there’d be an amusing picture of a cat?!

Tue 17 Sep 2013 at 07:50PM by Andy Pearce in Software tagged with c++  |  See comments

☑ C++11: Type inference and constant expressions

This is part 3 of the “C++11 Features” series which started with C++11: Move Semantics.

I’ve finally started to look into the new features in C++11 and I thought it would be useful to jot down the highlights, for myself or anyone else who’s curious. Since there’s a lot of ground to cover, I’m going to look at each item in its own post — this one covers automatic type inference and generalised constant expressions.

It’s been awhile since my previous C++11 post but I’m still planning to finish the series — eventually! In any case, this post examines new facilities for the compiler to infer types from context, to allow more compact expressions of type-safe code, and also the ability to specify a function or constructor as a compile-time constant.

It’s often a little cumbersome to figure out the type of a value in C++, especially when you’re dealing with a lot of templated and STL code. Even when the types are obvious they can still be rather verbose to specify without adding any particular clarity to the code. C++11 has addressed these issues by adding a new decltype keyword and also taking the venerable old auto keyword from C, dusting it off and giving it a new lease of life.

Even as a C programmer you may not be familiar with the auto keyword, which specifies that a variable has automatic storage. This basically means “a normal variable” since all variables are automatic by default unless explicitly declared otherwise with the register keyword, which I’ve only ever seen used in kernel and device driver code (and rarely even then).

Since it’s essentially entirely unused these days, C++11 has removed the old meaning of auto and instead allowed it to be specified in place of a type wherever the type can be ascertained from the initialiser. For example, the type of both these variables is long int in C++11:

long one;
auto two = 123L;

This may not appear particularly handy, but if you’re dealing with something like an opaque handle or context value returned from some third party library it’s quite nice not to have to be explicit about the type everywhere. Also, how many people enjoy writing this sort of thing:

for (std::map<std::string, std::string>::const_iterator ci = myStringMap.begin();
     ci != myStringMap.end(); ++ci) {
    // ...
}

Surely much more pleasant without harming readability one iota:

for (auto ci = myStringMap.begin(); ci != myStringMap.end(); ++ci) {
    // ...
}

Related to the new functionality of auto, the decltype keyword evaluates to the type of the expression within it — this is, of course, something the compiler must know but the programmer may find awkward to discover and often doesn’t care about. So, you could declare a container iterator like this:

const std::vector<int> myVector;
decltype(myVector.begin()) iterator;

This is more useful in combination with auto:

auto context = library::function(arg);
decltype(context) anotherContext;

Be aware, however, that auto and decltype can evaluate to different types for the same argument:

const std::vector<int> myVector(1);
auto v1 = myVector[0];         // has type int
decltype(myVector[0]) v2 = 2;  // has type const int&

Now, from type inference to the not-very-related-but-in-the-same-post-anyway topic of constant expressions. Among C++ programmers, unnecessary use of the C preprocessor is typically frowned upon, for a number of legitimate reasons that are beyond the scope of this post — the executive summary is that it makes it even easier than usual to shoot yourself in the foot.

In C, the preprocessor is typically used for constant values so the compile stage doesn’t need to go to the overhead of allocating storage. In C++, however, the const keyword is typically used instead — this has the advantage of including little niceties like type safety, and still allows the compiler to make all the same optimisations as it otherwise would with the preprocessor.

There are a number of limitations of constant expressions in C++03, however, such as the requirement that they be of essentially integral type (including enumerations). Several of these limitations have been lifted in C++11 with the introduction of the constexpr keyword which specifies that its attached type is a compile-time constant and can be assumed as such by compiler optimisations. So, now non-integral constants are available:

constexpr double pi = 3.141592;
constexpr double twoPi = 2 * pi;

The constexpr keyword can also be used with functions:

constexpr int factorial(int x)
{
    return (x > 0 ? x * factorial(x-1) : 1);
}

void someOtherFunction()
{
    int myArray[factorial(5)];
    // ...
}

In C++03 this would have been invalid as the return value of a function was never a constant expression. In C++11 the constexpr keyword allows the programmer to tell the compiler that the function’s return value will be the same at both compile and runtime. Of course, there are a lot of restrictions on what you can do in such a function — generally it must consist of only a return statement and it can only reference other constexpr functions and values.

Still, despite all the restrictions, hopefully this might discourage people from engaging in template metaprogramming which is, in general, obfuscatory obscurantism of the highest order.

Wed 04 Sep 2013 at 07:38PM by Andy Pearce in Software tagged with c++  |  See comments
