☑ The Dark Arts of C++ Streams

I’ve always avoided C++ streams as they never seemed as convenient as printf() and friends for formatted output, but I’ve decided I should finally take the plunge.

pagan ritual

I’ve never really delved into the details of C++ streams. For formatted output of builtin types they’ve always seemed less convenient than printf() and friends due to the way they mix together the format and the values. However, I recently decided it was time to figure out the basics for future reference, and I’m noting my conclusions in this blog post in case it proves a useful summary for anybody else.

A stream in C++ is a generic character-oriented serial interface to a resource. Streams may only accept input, only produce output or be available for both input and output. Reading and writing streams is achieved using the << operator, which is overloaded beyond its standard bit-shifting function to mean “write RHS to LHS stream”, and the >> operator, which is similarly overloaded to mean “read from LHS stream into RHS”.

To use streams, include the iostream header file for the basic functionality, and any additional headers for the specific streams to use — for example, file-oriented streams require the fstream header. Here’s an example of writing to a file using the stream interface1:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    myFile.open("/tmp/output.txt", std::ios::out | std::ios::trunc);
    myFile << "hello, world" << std::endl;
    myFile.close();

    return 0;
}

Most of this example is pretty standard. Since streams use the standard bit-shift operators, which are left-associative, the first operation performed in the third line of main() above is myFile << "hello, world". This expression also evaluates to a reference to the stream, allowing the operators to be chained to write multiple values in sequence. In this case, the std::endl manipulator pushes a newline into the output stream and also flushes it, as if the flush() method had been called.

So far so obvious. What about reading from a file? Reading into strings is fairly obvious:

#include <iostream>
#include <fstream>
#include <string>

int main()
{
    std::fstream myFile;
    std::string myString;
    myFile.open("/tmp/input.txt", std::ios::in);
    myFile >> myString;
    std::cout << "Read: " << myString << std::endl;

    return 0;
}

In this example the cout stream represents stdout, but otherwise this example seems quite straightforward. However, if you run it you’ll see that the string contains only the text up to the first whitespace in the file. It turns out that stopping at whitespace is the defined behaviour of the >> operator for strings, which strikes me as a little quirky but hey ho.

It’s quite possible to also read integer and other types — the example below demonstrates this as well as a file stream open for read/write and seeking:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    int intVar;
    float floatVar;
    bool boolVar;

    myFile.open("/tmp/testfile", std::ios::in | std::ios::out | std::ios::trunc);
    myFile << 123 << " " << "1.23" << " true" << std::endl;
    myFile.seekg(0);
    myFile >> intVar >> floatVar >> std::boolalpha >> boolVar;
    std::cout << "Int=" << intVar << " Float=" << floatVar
              << " Bool=" << boolVar << std::endl;
    myFile.close();

    return 0;
}

The file is opened for both read and write here, and also any existing file will be truncated to zero length upon opening. The output line is much the same as previous examples, but the input line demonstrates how input streams are overloaded based on the destination type to parse character input into the appropriate type. The seekg() method seeks the input (“get”) position within the stream, in a similar way to standard C file IO.

Also demonstrated here is an IO manipulator, in this case std::boolalpha, which reads and writes bool values as the strings "true" and "false". Manipulators like this can be used on both input and output streams. The important thing to remember is that they set persistent flags on the stream; they don’t just apply to the following value. For example, the following function will show the first bool as an integer, the next two as strings and the final one as an integer again:

void showbools(bool one, bool two, bool three, bool four)
{
    std::cout << one << ", " << std::boolalpha << two << ", " << three
              << ", " << std::noboolalpha << four << std::endl;
}

Other examples include std::setbase, which displays integers in other number bases; and std::fixed, which displays floating point values to a fixed number of decimal places, determined by the std::setprecision manipulator.

All these manipulators are really shorthand for methods such as setf() being called on the stream at the appropriate points. So, printf()-like formatting can be done, albeit in a slightly more verbose manner. Many of the manipulators require the iomanip header to be included.
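
To illustrate, here’s a minimal sketch combining a few of these manipulators to get printf()-style formatting:

#include <iostream>
#include <iomanip>

int main()
{
    // setbase(16) persists like other flag-setting manipulators.
    std::cout << std::setbase(16) << 255 << std::endl;    // prints "ff"
    // fixed + setprecision(2) is the equivalent of printf("%.2f").
    std::cout << std::fixed << std::setprecision(2)
              << 3.14159 << std::endl;                    // prints "3.14"
    return 0;
}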

So what about more basic file IO, such as reading an entire line into a string as opposed to a single word? To do this you need to avoid the stream operators and instead use appropriate methods — for example, std::getline() will read up to a newline into a string:

// Displays specified file to stdout with line numbers.
// Requires the <iomanip> header for std::setw().
void dumpFile(const char *filename)
{
    std::fstream myFile;
    myFile.open(filename, std::ios::in);
    unsigned int lineNum = 0;
    std::string line;
    while (std::getline(myFile, line)) {
        std::cout << std::setw(3) << ++lineNum << " " << line << std::endl;
    }
    myFile.close();
}

To instead read an arbitrary number of characters, the read() method can be used, which can read into a standard C char array. There are also overloads of the get() method which do the same thing, but since the read() method has only a single purpose it’s probably clearer to use that.
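
For example, a quick sketch (assuming myFile is an open input stream as in the earlier examples):

char buffer[17];
myFile.read(buffer, 16);                   // read up to 16 characters
std::streamsize count = myFile.gcount();   // how many were actually read
buffer[count] = '\0';                      // NUL-terminate for C-style use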

To read into a std::string, however, requires another concept of C++ streams — the streambuf. This is just a generalisation of a buffer of characters which can be sent to or received from streams. Existing streams use such a buffer to hold characters read from and written to the stream, and it can be accessed via the rdbuf() method. Using this and our own std::stringstream, which is a stream wrapper around a std::string, we can read an entire file into a std::string:

// Requires the <sstream> header.
std::string readFile(std::fstream &inFile)
{
    std::stringstream buffer;
    buffer << inFile.rdbuf();
    return buffer.str();
}

However, this still doesn’t address the issue of reading n characters from the stream directly into a std::string. I initially didn’t think this was possible without resorting to reading into a char array, but as a result of the Stack Overflow question which I asked just now I’ve realised that it can be done into a std::string safely. The trick is to call resize() to make sure the string has enough valid space to store the result of the read, and then use the non-const version of operator[] to get the address of the string’s character storage2. Crucially, you can use neither c_str() nor data(), since both return read-only pointers and modifying the memory behind them is undefined behaviour.
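
Here’s a minimal sketch of that trick, with error handling omitted as usual:

#include <fstream>
#include <string>

// Read up to n characters from a stream directly into a std::string.
std::string readChars(std::fstream &inFile, std::streamsize n)
{
    std::string result;
    result.resize(n);                 // ensure enough valid storage
    inFile.read(&result[0], n);       // non-const operator[] gives writable storage
    result.resize(inFile.gcount());   // shrink to what was actually read
    return result;
}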

Finally, I’ll very briefly cover the issue of customising a class so that it can be sent to and from streams like the built-in types. Actually this is just as simple as creating a new overload of the operator<< (or operator>>) function for the appropriate stream type. The example below shows a class which can output itself:

#include <iostream>

class MyIntWrapper
{
public:
  MyIntWrapper(int value) : value(value) { }
private:
  int value;

  friend std::ostream &operator<<(std::ostream &o, const MyIntWrapper &i);
};

std::ostream &operator<<(std::ostream &ost, const MyIntWrapper &instance)
{
  ost << "<MyIntWrapper: " << instance.value << ">";
  return ost;
}

int main()
{
  MyIntWrapper sample(123);
  std::cout << sample << std::endl;
  return 0;
}

The only thing to note is the friend declaration, which is required to allow the operator function to access the private data members of the class. Input operators can be overloaded in a similar way.
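
For example, a minimal input operator for the class above might look like the following (it would need a corresponding friend declaration adding to the class):

std::istream &operator>>(std::istream &ist, MyIntWrapper &instance)
{
    ist >> instance.value;   // parse an int straight into the wrapper
    return ist;
}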


  1. Note that error handling has been omitted for clarity in all examples. 

  2. Implementations which use copy-on-write, such as the GNU STL, are forced to perform any required copy operations when the non-const version of operator[] is used. Interestingly, C++11 effectively forbids copy-on-write implementations of std::string which makes the whole thing rather less tricky (but also potentially slower for some use-cases, although those cases should probably be using their own classes anyway). 

8 Apr 2013 at 12:28PM by Andy Pearce in Software  | Photo by freestocks.org on Unsplash  | Tags: c++ c++-stl  |  See comments

☑ Function Template Fun

Function template specialisation is a tricky beast — the short answer is, avoid it.

stencil lake

For my sins I’ve recently had to implement some fairly generic code in C++ using function templates, and I’ve realised something a little quirky about them that hadn’t occurred to me before.

For anybody who’s unaware of templating in C++ this post probably won’t be of much interest, but as a refresher of the syntax you can declare a class template like this:

template <typename Type>
class MyIncrementer
{
public:
    Type increment(Type value)
    {
        return value + 1;
    }
};
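
Using the template is then straightforward, with the compiler instantiating it for each type used:

MyIncrementer<int> incrementer;
int result = incrementer.increment(41);   // result == 42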

Aside from having to define these classes in the header file (because the compiler requires the implementation of each method to be available to instantiate the template) this works more or less as you’d expect. You can also specialise templates, providing an alternate implementation for particular types:

// Requires <cstdlib>, <sstream> and <string>.
template <>
class MyIncrementer<std::string>
{
public:
    std::string increment(std::string value)
    {
        unsigned long int v = std::strtoul(value.c_str(), NULL, 10);
        std::stringstream out;
        out << v + 1;
        return out.str();
    }
};

This works really well for classes, but you can also template functions and methods in the same way:

template <typename Type>
Type myIncrementFunction(Type value)
{
    return value + 1;
}

And finally, you can also specialise them in the same way1:

// Requires <cstdlib>, <sstream> and <string>.
template <>
std::string myIncrementFunction(std::string value)
{
    unsigned long int v = std::strtoul(value.c_str(), NULL, 10);
    std::stringstream out;
    out << v + 1;
    return out.str();
}

So, that’s all there is to it right? Well, not quite. You see, function templates are slightly different beasts to class templates. For example, you can still overload templated functions just like you can with regular functions. The following are all quite valid at the same time:

// Template on any type.
template <typename Type>
void myFunction(Type value)
{ /* ... */ }

// Overload previous template with a pointer version.
template <typename Type>
void myFunction(Type *value)
{ /* ... */ }

// A simple non-templated overload.
void myFunction(int* value)
{ /* ... */ }

// A template specialisation.
template <>
void myFunction<int*>(int *value)
{ /* ... */ }

Clearly there’s some potential for ambiguity here, and since this is all valid syntax the compiler can’t just give you a warning. The first rule is that non-templated functions (the third one above) always take precedence over templated ones. So, if myFunction() is passed an int* then the third function above will always be called.

Essentially this means that the fourth declaration will never be called, right? Well, almost — you can actually force it to be invoked by calling it as myFunction<int*>(...). But otherwise, the non-templated function is always counted as a better match.

However, there is still potential for more confusion here. Let’s omit the non-templated function and reorder the definitions slightly2:

// The same basic template as above.
template <typename Type>
void myFunction(Type value)
{ /* ... */ }

// The same specialisation as above, but moved earlier.
template <>
void myFunction<int*>(int* value)
{ /* ... */ }

// The same pointer template as above.
template <typename Type>
void myFunction(Type *value)
{ /* ... */ }

Now we call myFunction() with an int* again. Since we’ve removed the non-templated function, the specialisation will be called, right?

WRONG!3

Simply re-ordering the declarations means that the specialisation is now an explicit specialisation of the first template, since the second didn’t exist at the point it was defined. When the second template is added, it makes a better match for int* than the first template, so the specialisation of the first template isn’t even considered.

The general rule when overloading templates like this is that only the generic templates themselves overload on each other; specialisations aren’t considered until a base template has been chosen, even if there’s a specialisation which would have made a better match.

Confusing, eh? Still, at least there’s an easy moral lesson here — don’t specialise function templates. In general you shouldn’t have to, since you can just overload them with non-templated functions which will take precedence anyway, which is probably what you wanted in the first place.

Or just avoid templates entirely. And C++. Use Python. Or just take a walk in the park instead. Look, there’s a bird making a nest. Hopefully out of the shredded remains of the C++11 standard.


  1. Note that the <std::string> suffix to the function name can only be omitted if the compiler can unambiguously determine the template type. 

  2. This code example is essentially one presented by Peter Dimov and Dave Abrahams — I found it in an article originally written about this which I’ve fairly shamelessly plagiarised here. 

  3. I can’t really blame you for getting it wrong, however. Especially if you didn’t — it’s all a bit presumptuous of me, really. Let’s move on. 

3 Apr 2013 at 1:12PM by Andy Pearce in Software  | Photo by rawpixel.com on Unsplash  | Tags: c++ c++-templates  |  See comments

☑ Git in “almost does what you expect” shocker

Git’s filename filtering surprisingly works as you’d expect. To an extent.

index drawers

I recently had cause to find Git commits which affected files whose names matched a particular pattern. In a classic case of trying to be too clever, I hunted through all the documentation in a vain effort to find some option which would meet my requirements. You can filter by commit comments, by changes added or removed within the diff, by commit time or a bundle of other options, but I couldn’t find anything to do with the filename.

I was just about to ask a question on Stack Overflow when it suddenly occurred to me to simply try the standard commit filtering by filename but using a shell glob instead of a full filename, and I was surprised to see it simply worked — perhaps I shouldn’t have been.

So, if you want to find changes to, say, any file with a .txt extension in your repository you can use this:

git log -- '*.txt'

This behaviour is perhaps a little surprising, because that * is matching any character including slashes, which is why this pattern works recursively across the repository. Intrigued as to what was going on I did a little judicious ltracing and found that Git indeed calls fnmatch() to match filenames in this case. Further, it doesn’t pass any flags to the call - in particular it doesn’t pass FNM_PATHNAME, which would cause wildcards to fail to match path separators.

The slightly quirky thing I then observed is that Git appears to notice whether you’ve used any wildcards and decide whether or not to use fnmatch() on this basis, presumably to make operations not using wildcards faster. I tried digging around in the source code and believe I’ve located where the matching is done, the slightly fearsome-looking tree_entry_interesting() function. This calls into git_fnmatch(), which is a fairly thin wrapper around fnmatch(). Indeed, it’s easy to see that GFNM_PATHNAME is not passed into git_fnmatch() which would otherwise be converted into FNM_PATHNAME.

This glob matching behaviour is potentially quite useful, but it’s inconsistent with standard shell glob matches which behave as if FNM_PATHNAME is set. It also makes it difficult to express things such as “match all text files in the current directory only”.
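
One workaround I can suggest for the current-directory case is to leave the pattern unquoted, so the shell expands the glob before Git ever sees it:

git log -- *.txt

Since shell globs don’t match path separators, this only picks up text files in the current directory, although of course it only matches files which currently exist in the working tree, not ones which have since been deleted.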

I wonder how many people will find this behaviour confusing. Mind you, probably not a massive proportion of people once you’ve already removed everyone who finds everything else about Git confusing as well.

18 Mar 2013 at 12:01PM by Andy Pearce in Software  | Photo by Sanwal Deen on Unsplash  | Tags: git scm  |  See comments

☑ File Capabilities In Linux

File capabilities offer a secure alternative to SUID executables, but can be a little confusing at first.

mixing desk

We all know that SUID binaries are bad news from a security perspective. Fortunately, if your application requires some limited privileges then there is a better way known as capabilities.

To save you reading the article above in detail, in essence this allows processes which start as root, and hence have permission to do anything, to retain certain limited abilities chosen from this set when they drop privileges and run as a standard user. This means that if an attacker manages to compromise the process via a buffer overrun or similar exploit, they can’t take advantage of anything except the specific minimal privileges that the process actually needs.

This is great for services, which are typically always run as root, but what about command-line utilities? Fortunately this is catered for as well, provided you have the correct utilities installed. If you’re on Ubuntu you’ll need the libcap2-bin package, for example. You’ll also need to be running a non-archaic kernel (anything since 2.6.24).

These features allow you to associate capabilities with executables, in a similar way to setting the SUID bit but only for the specific capabilities set. The setcap utility is used to add and remove capabilities from a file.

The first step is to select the capabilities you need. For the purposes of this blog, I’ll assume there’s a network diagnostic tool called tracewalk which needs to be able to use raw sockets. This would typically require the application to be run as root, but consulting the list it turns out that only the CAP_NET_RAW capability is required.

Assuming you’re in the directory where the tracewalk binary is located, you can add this capability as follows:

sudo setcap cap_net_raw=eip tracewalk

For now ignore that =eip suffix on the capability; I’ll explain that in a second. Note that the capability name is lowercase. You can now verify that you’ve set the capabilities properly with:

setcap -v cap_net_raw=eip tracewalk

Or you can recover a list of all capabilities set on a given executable:

getcap tracewalk

For reference, you can also remove all capabilities from an executable with:

setcap -r tracewalk

At this point you should be able to run the executable as an unprivileged user and it should have the ability to deal with raw sockets, but none of the other privileges that the root user has.

So, what’s the meaning of the strange =eip suffix? This requires a brief digression into the nature of capabilities. Each process has three sets of capabilities — inheritable, permitted and effective:

  • Effective capabilities are those which define what a process can actually do. For example, it can’t deal with raw sockets unless CAP_NET_RAW is in the effective set.
  • Permitted capabilities are those which a process is allowed to have should it ask for them with the appropriate call. These don’t allow a process to actually do anything unless it’s been specially written to ask for the capability specified. This allows processes to be written to add particularly sensitive capabilities to the effective set only for the duration when they’re actually required (see the sketch just after this list).
  • Inheritable capabilities are those which can be inherited into the permitted set of a spawned child process. During a fork() or clone() operation the child process is always given a duplicate of the capabilities of the parent process, since at this point it’s still running the same executable. The inheritable set is used when an exec() (or similar) is called to replace the running executable with another. At this point the permitted set of the process is masked with the inheritable set to obtain the permitted set that will be used for the new process.
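
To illustrate the permitted/effective distinction, here’s a rough sketch using the libcap library to raise CAP_NET_RAW into the effective set only around the code which needs it (assuming it’s already in the permitted set):

#include <sys/capability.h>   /* link with -lcap */

/* Raise or lower CAP_NET_RAW in the effective set around sensitive code. */
int set_net_raw(int enable)
{
    cap_value_t cap_list[] = { CAP_NET_RAW };
    cap_t caps = cap_get_proc();
    if (caps == NULL)
        return -1;
    if (cap_set_flag(caps, CAP_EFFECTIVE, 1, cap_list,
                     enable ? CAP_SET : CAP_CLEAR) != 0 ||
        cap_set_proc(caps) != 0) {
        cap_free(caps);
        return -1;
    }
    return cap_free(caps);
}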

So, the setcap utility allows us to add capabilities to these three sets independently for a given executable. Note that the meaning of the sets is interpreted slightly differently for file capabilities, however:

  • Permitted file capabilities are those which are always available to the executable, even if the parent process which invoked it did not have them. These used to be called “forced” capabilities.
  • Inheritable file capabilities specify an additional mask which can be used to remove capabilities from the calling process’s inheritable set. It applies in conjunction with the calling process’s inheritable set, so a capability is only inherited if it exists in both sets.
  • Effective file capability is actually just a single bit rather than a set, and if set then it indicates that the entire permitted set is also copied to the effective set of the new process. This can be used to add capabilities to processes which weren’t specifically written to request them. Since it is a single bit, if you set it for any capability then it must be set for all capabilities. You can think of this as the “legacy” bit because it’s used to allow capabilities to be used for applications which don’t support them.

When specifying capabilities via setcap the three letters e, i and p refer to the effective, inheritable and permitted sets respectively. So the earlier specification:

sudo setcap cap_net_raw=eip tracewalk

… specifies that the CAP_NET_RAW capability should be added to the permitted and inheritable sets and that the effective bit should also be set. This will replace any previously set capabilities on the file. To set multiple capabilities, use a comma-separated list:

sudo setcap cap_net_admin,cap_net_raw=eip tracewalk

The capabilities man page discusses this all in more detail, but hopefully this post has demystified things slightly. The only remaining things to mention are a few caveats and gotchas.

Firstly, file capabilities don’t work with symlinks — you have to apply them to the binary itself (i.e. the target of the symlink).

Secondly, they don’t work with interpreted scripts. For example, if you have a Python script that you’d like to assign a capability to, you have to assign it to the Python interpreter itself. Obviously this is a potential security issue because then all scripts run with that interpreter will have the specified capability, although it’s still significantly better than making it SUID. The most common workaround appears to be to write a separate executable in C or similar which can perform the required operations and invoke that from within the script. This is similar to the approach used by Wireshark which uses the binary /usr/bin/dumpcap to perform privileged operations:

$ getcap /usr/bin/dumpcap 
/usr/bin/dumpcap = cap_net_admin,cap_net_raw+eip

Thirdly, file capabilities are disabled if you use the LD_LIBRARY_PATH environment variable for hopefully obvious security reasons1. The same also applies to LD_PRELOAD as far as I’m aware.


  1. Because an attacker could obviously subvert one of the standard libraries and use LD_LIBRARY_PATH to cause their subverted library to be invoked in preference to the system one, and hence have their own arbitrary code executed with the same privileges as the calling application. 

11 Mar 2013 at 12:07PM by Andy Pearce in Software  | Photo by Alexey Ruban on Unsplash  | Tags: linux capabilities  |  See comments

☑ (A)IEEE, maths!

Writing some notes on IEEE 754 led to discovering a nifty way to render formulae on the web.

maths blackboard

For my sins I recently had to do some work on IEEE 754 format floating point numbers. For the uninitiated1, IEEE 754 is a standard which specifies the underlying binary representation of floating point numbers, and the rules for their manipulation. Typically it’s not something that you need to worry about because compilers handle it for you, but in this case I couldn’t make any assumptions about the underlying hardware and the representation really had to be compliant.

As usual when I encounter something new, I set about pulling together my own little reference on the subject, and this led me to another issue — I needed to write some simple mathematical formulae on my wiki page, but HTML is really quite deficient in its ability to express such things. I did discover that a GTK text box can accept arbitrary unicode characters by holding down CTRL and SHIFT and typing U followed by four hex digits - this lasted all of about three seconds before I got sick and tired of continually2 looking up code points.

It was at this point a quick bit of Googling led me to MathJax, which is a great little Javascript-based solution to displaying formulae in browsers. Essentially you enter it in LaTeX format3 and the library handles rendering it elegantly in whatever browser is in use. Since I use Dokuwiki, I was also pleasantly surprised to see that there’s even a plugin for it already. You don’t even need to host the library yourself, since they have their own CDN for that, which means there’s a decent chance it’s already cached in a given browser anyway.
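
As a tiny example of the sort of thing I needed to write, the value of a normalised single-precision IEEE 754 number with sign bit s, biased exponent e and 23-bit mantissa m is expressed in LaTeX source as:

x = (-1)^{s} \times \left(1 + \frac{m}{2^{23}}\right) \times 2^{e - 127}

…and MathJax takes care of rendering it nicely in the page.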

So, I still had to wrap my head around strange floating point conversion issues, but at least I had a pleasant way to write about it, which almost4 made up for it.


  1. You lucky, lucky people. 

  2. For “continually” read “twice”. 

  3. Or MathML if you’re a total masochist. 

  4. For “almost” read “not even slightly”. 

25 Feb 2013 at 1:18PM by Andy Pearce in Software  | Photo by Roman Mager on Unsplash  | Tags: maths  web floatingpoint  |  See comments

☑ Herding children with a python

It’s possible to manage multiple subprocesses in Python, but there are a few gotchas.

herding cattle

Happy Valentine’s Day!

Right, now we’ve got that out of the way…

The subprocess module in Python is a handy interface to manage forked commands. If you’re still using os.popen() or os.system() give yourself a sharp rap on the knuckles1 and go and read that Python documentation page right now. No really, I’ll wait.

One of the things that’s slightly tricky with subprocess, however, is the management of multiple child processes running in parallel. This is quite possible, but there are a few quirks.

There are two basic approaches you can take — using the select module to watch the various file descriptors, or fork two separate threads for each process, one for stdout and one for stderr. I’ll discuss the former option here because it’s a little lighter on system resources, especially where many processes are concerned. It works fine on POSIX systems2, but on Windows you may need to use threads3.

The first point is that you can’t use any form of blocking I/O when interacting with the processes, or one of the other processes may fill the pipe it’s using to communicate with the main process and hence block. This means that your main process needs to be continually reading from both stdout and stderr of each process, lest they suddenly produce reams of output, but it shouldn’t read anything unless it knows it won’t block. This is why either select.select() or select.poll() are used.

The second issue is that you have to use the underlying operating system I/O primitives such as os.read(), because the builtin Python ones have a tendency to loop round reading until they’ve got as many bytes as requested or they reach EOF. You could read 1 byte at a time, but that’s dreadfully inefficient.

The third issue is that you have to manage closing down processes properly and removing their file descriptors from your watched set or they may continually flag themselves as being read-ready.

The fourth issue, which is a bit of an obscure one, is that you may wish to consider processes which close either or both of stdout and stderr before terminating — you may not need to care about this.
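
To make those issues concrete, here’s a minimal sketch of the select.poll() approach for a single child process (POSIX only, error handling omitted, and the ls command just a stand-in for something more interesting):

import os
import select
import subprocess

proc = subprocess.Popen(["ls", "-l"], stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
# Map each file descriptor to a label so we know where output came from.
streams = {proc.stdout.fileno(): "stdout", proc.stderr.fileno(): "stderr"}
poller = select.poll()
for fd in streams:
    poller.register(fd, select.POLLIN | select.POLLHUP)
while streams:
    for fd, event in poller.poll():
        data = os.read(fd, 4096)   # read only what's available, never block
        if data:
            print("%s: %r" % (streams[fd], data))
        else:
            # A zero-byte read means EOF: stop watching this descriptor.
            poller.unregister(fd)
            del streams[fd]
proc.wait()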

Fortunately for all you lucky people, I’ve already written the ProcPoller class which handles all this — I wrote this some time ago, but I only just dug it up out of an old backup.

Essentially it works in a similar way to select.poll() — you create an instance of it, and then you call the run_command() method to fork off processes in the background. Each command is also passed a context object, which can be anything hashable (i.e. immutable). This is used as the index to refer to that command in future.

Once everything is running that you want to watch, you call the poll() method, optionally with a timeout if you like. This will watch both stdout and stderr of each process and pass anything from them into the handle_output() method, which you can override in a derived class for your own purposes.

The base class will collect this output into two strings per process in the output dictionary, which may be fine for commands which produce little output. For more voluminous quantities, or long-running processes, you’re better off overriding handle_output() to parse the data as you go. You can also return True from handle_output() to terminate the process once you have the result you need.

Do drop a comment in if you find this useful at any point, or if you need any help using it.

But seriously, whether you use this class or not, stop using os.system(). Pretty please!


  1. Preferably from a respected artist like Antony Carmichael

  2. A portable implementation should support both select.select() and select.poll() as some platforms only provide one of these. The implementation I link to below uses poll() only for simplicity, but I only ever intended this code to support Linux systems. If you look at the implementation of communicate() on POSIX systems in the subprocess library you’ll see an example of supporting both. 

  3. If you want to see how this might be done on Windows, take a look at the implementation of communicate() on Windows systems in the subprocess library. 

14 Feb 2013 at 11:15AM by Andy Pearce in Software  | Photo by Daniel Burka on Unsplash  | Tags: python subprocess  |  See comments

☑ SO_REUSEADDR on Windows

The SO_REUSEADDR option has quite different functionality on Windows than it does on Unix.

boy spying

Anybody who’s done more than a little work at the sockets layer will have encountered the SO_REUSEADDR socket option. For anybody who hasn’t, first a little background: when a TCP socket is closed, it’s kept lingering around for a little while in a state called TIME_WAIT. This is essentially a safeguard to prevent the port number being reused for another service until there can be a fair degree of confidence that there aren’t any outdated packets from the connection bouncing around that might get erroneously delivered to the new service, royally confusing everyone.

This lingering is fine for client sockets because the pool of outgoing ports is huge enough that typically it’s not a big deal having some sockets kicking around the place (it can cause issues on extremely busy systems that are making connections at a rapid rate, but we’ll gloss over that). It’s a bit more annoying for servers which listen for connections, however, since they typically need to re-acquire the same port number when they get restarted.

To work around this issue, the SO_REUSEADDR socket option was created which allows the port number to be reused by a new socket immediately. On Unix systems this means that if the old socket is in TIME_WAIT (and perhaps some of the other idle states) then a new socket can come along and steal the port number. Since the TIME_WAIT state is really just a precaution then this is generally pretty safe.
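
For reference, enabling the option on Unix is a quick call before bind() (a fragment in C, assuming listen_fd is the listening socket):

#include <sys/socket.h>

/* Allow the listening port to be re-bound while old sockets sit in
 * TIME_WAIT; must be set before bind(). */
int reuse = 1;
setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));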

So far I’ve been discussing the behaviour of this option on Unix systems, but the same option also exists on Windows. However, I recently discovered that its operation is actually quite different. Instead of allowing a socket to “steal” the port number from an inactive previous one, it actually allows a socket to “steal” the port from any other socket, including actively listening ones. This could be fine if one socket is UDP and the other TCP, for example, but Windows will happily allow two or more active TCP sockets to share port numbers.

This MSDN page explains how the option works under Windows — the upshot is that if two sockets end up sharing a port then the behaviour of an incoming connection is undefined. Helpful.

As the article goes on to point out, this is a bit of a security problem — you can have a trusted service listening on a particular port and some other piece of malware can come along and silently steal the port, intercepting all the connections — this is a brilliant opportunity for a man-in-the-middle attack (admittedly requiring the ability to run code on the server machine).

Microsoft’s solution to this was not to change the operation of SO_REUSEADDR as you might expect, but to add a new option SO_EXCLUSIVEADDRUSE which explicitly prevents any other sockets listening on the same port, even if they use SO_REUSEADDR. Ah, why use one socket option when two will do?
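
For completeness, here’s a sketch of how a Windows service might claim its port exclusively (assuming sock is a freshly created SOCKET, and applied before bind()):

#include <winsock2.h>

/* Prevent any other socket from binding to our port, even one which
 * sets SO_REUSEADDR. */
BOOL exclusive = TRUE;
setsockopt(sock, SOL_SOCKET, SO_EXCLUSIVEADDRUSE,
           (const char *)&exclusive, sizeof(exclusive));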

Anyway, definitely something to bear in mind when writing Windows services (and, perhaps more importantly, when debugging them).

1 Feb 2013 at 12:42PM by Andy Pearce in Software  | Photo by Dmitry Ratushny on Unsplash  | Tags: windows sockets  |  See comments

☑ Sharing pthreads locks between processes

How to share pthreads primitives across processes.

share tomatoes

The POSIX threads library has some useful primitives for locking between multiple threads, primarily mutexes and condition variables.

Typically these are only effective to lock between threads within the same process. However, pthreads defines a PTHREAD_PROCESS_SHARED attribute for both of these primitives which can be used to specify that they should also be used between processes.

However, this attribute simply marks the primitives as safe for shared access — application code must still arrange to store them in shared memory. This is fairly easily arranged with an anonymous mmap(), or shmget() and shmat() if your processes don’t have a parent/child relationship.
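
By way of illustration, here’s a minimal sketch of creating a process-shared mutex in an anonymous mmap() region, with error handling mostly omitted:

#include <pthread.h>
#include <sys/mman.h>

pthread_mutex_t *make_shared_mutex(void)
{
    pthread_mutexattr_t attr;
    /* Anonymous shared mapping: visible to this process and to any
     * children subsequently created with fork(). */
    pthread_mutex_t *mutex = mmap(NULL, sizeof(*mutex),
                                  PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mutex == MAP_FAILED)
        return NULL;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return mutex;
}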

The use of shared memory makes things a little more complicated, which is disappointing, but it still seems to me that using these primitives to synchronise processes is probably still a little more elegant than using pipes or something similar (even if that’s probably a little more portable).

I’ve added a code example to my wiki which illustrates this. I’ve used an anonymous mmap() for shared memory — the previous revision of the page used System V shared memory, but since this isn’t cleaned up automatically on application exit then the mmap() approach is safer.

31 Jan 2013 at 1:02PM by Andy Pearce in Software  | Photo by Elaine Casap on Unsplash  | Tags: pthreads  linux  ipc posix  |  See comments
