The Dark Arts of C++ Streams

8 Apr 2013 at 12:28PM in Software

I’ve always avoided C++ streams as they never seemed as convenient as printf() and friends for formatted output, but I’ve decided I should finally take the plunge.


I’ve never really delved into the details of C++ streams. For formatted output of builtin types they’ve always seemed less convenient than printf() and friends due to the way they mix together the format and the values. However, I recently decided it was time to figure out the basics for future reference, and I’m noting my conclusions in this blog post in case it proves a useful summary for anybody else.

A stream in C++ is a generic character-oriented serial interface to a resource. Streams may only accept input, only produce output or be available for both input and output. Writing to a stream is achieved using the << operator, which is overloaded beyond its standard bit-shifting function to mean “write RHS to LHS stream”, and reading from one using the >> operator, which is similarly overloaded to mean “read from LHS stream into RHS”.

To use streams, include the iostream header file for the basic functionality, and any additional headers for the specific streams to use; for example, file-oriented streams require the fstream header. Here’s an example of writing to a file using the stream interface [1]:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    myFile.open("/tmp/output.txt", std::ios::out | std::ios::trunc);
    myFile << "hello, world" << std::endl;
    myFile.close();

    return 0;
}

Most of this example is pretty standard. Since streams use the standard bit-shift operators, which are left-associative, the first operation performed in the third line of main() above is myFile << "hello, world". This expression evaluates to a reference to the stream, allowing the operators to be chained to write multiple values in sequence. The std::endl manipulator pushes a newline into the output stream and then implicitly calls the flush() method.
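To make the chaining a little more concrete, here’s a minimal sketch which writes to a std::ostringstream (an in-memory stream from the sstream header, used here only so the result is easy to inspect). The function name chainedWrite is made up for illustration. The second statement spells out the grouping which left-associativity gives the first, and both use '\n', which writes a newline without the flush that std::endl adds.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes the same text twice to an in-memory stream,
// once with ordinary chaining and once with the grouping made explicit.
std::string chainedWrite()
{
    std::ostringstream out;
    out << "hello, " << "world" << '\n';      // usual chained form
    ((out << "hello, ") << "world") << '\n';  // identical, grouped by hand
    return out.str();
}
```

Both statements produce the same characters, because each << returns a reference to the stream it was applied to.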

So far so obvious. What about reading from a file? Reading into strings is fairly obvious:

#include <iostream>
#include <fstream>
#include <string>

int main()
{
    std::fstream myFile;
    std::string myString;
    myFile.open("/tmp/input.txt", std::ios::in);
    myFile >> myString;
    std::cout << "Read: " << myString << std::endl;

    return 0;
}

In this example the cout stream represents stdout, but otherwise this example seems quite straightforward. However, if you run it you’ll see that the string contains only the text up to the first whitespace in the file. It turns out that this is the defined behaviour for strings: operator>> skips any leading whitespace and then stops reading at the next whitespace character. This strikes me as a little quirky but hey ho.
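One consequence of this behaviour is that applying >> repeatedly walks through the input one word at a time. The sketch below assumes a hypothetical helper called readWords and reads from a std::istringstream rather than a file, purely so that it’s self-contained; the loop stops when the extraction fails at the end of the input.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text into whitespace-separated words by
// applying operator>> until the stream reports failure.
std::vector<std::string> readWords(const std::string &text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {   // skips leading whitespace, stops at the next
        words.push_back(word);
    }
    return words;
}
```

Note that runs of spaces and newlines are all treated alike, so the original layout of the text can’t be recovered from the words alone.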

It’s quite possible to also read integer and other types — the example below demonstrates this as well as a file stream open for read/write and seeking:

#include <iostream>
#include <fstream>

int main()
{
    std::fstream myFile;
    int intVar;
    float floatVar;
    bool boolVar;

    myFile.open("/tmp/testfile", std::ios::in | std::ios::out | std::ios::trunc);
    myFile << 123 << " " << "1.23" << " true" << std::endl;
    myFile.seekg(0);
    myFile >> intVar >> floatVar >> std::boolalpha >> boolVar;
    std::cout << "Int=" << intVar << " Float=" << floatVar
              << " Bool=" << boolVar << std::endl;
    myFile.close();

    return 0;
}

The file is opened for both read and write here, and also any existing file will be truncated to zero length upon opening. The output line is much the same as previous examples, but the input line demonstrates how input streams are overloaded based on the destination type to parse character input into the appropriate type. The seekg() method fairly obviously seeks within the stream in a similar way to standard C file IO.

Also demonstrated here is an IO manipulator, in this case std::boolalpha, which makes the stream read and write bool values as the strings "true" and "false". It can be applied to both input and output streams. The important thing to remember about manipulators like this is that they set flags on the stream which are persistent; they don’t just apply to the following value. For example, the following function will show the first bool as an integer, the next two as a string and the final one as an integer again:

void showbools(bool one, bool two, bool three, bool four)
{
    std::cout << one << ", " << std::boolalpha << two << ", " << three
              << ", " << std::noboolalpha << four << std::endl;
}

Other examples include std::setbase, which displays integers in other number bases; and std::fixed, which displays floating point values to a fixed number of decimal places, determined by the std::setprecision manipulator.
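As a rough illustration of these three, the sketch below formats an integer in hexadecimal and a floating point value to two decimal places. The function name formatNumbers is hypothetical, and a std::ostringstream stands in for a real output stream so that the result can be returned as a string.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: shows std::setbase, std::fixed and
// std::setprecision working together on an in-memory stream.
std::string formatNumbers(int i, double d)
{
    std::ostringstream out;
    out << std::setbase(16) << i << " "                  // integer in hex
        << std::fixed << std::setprecision(2) << d;      // two decimal places
    return out.str();
}
```

Calling formatNumbers(255, 3.14159) gives the string "ff 3.14".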

Most of these manipulators are really shorthand for the setf() method (or a related method such as precision()) being called at the appropriate point in the stream. So, printf()-like formatting can be done, albeit in a slightly more verbose manner. Those which take an argument, such as std::setbase and std::setprecision, require the iomanip header to be included.
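To show the correspondence, here’s a small sketch which gets the effect of std::boolalpha and std::noboolalpha by calling setf() and unsetf() directly. The name boolsViaSetf is made up for this example, and again an in-memory stream is used only for convenience.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: sets and clears the boolalpha flag through the
// stream's methods instead of using the manipulators.
std::string boolsViaSetf(bool first, bool second)
{
    std::ostringstream out;
    out.setf(std::ios::boolalpha);     // same effect as out << std::boolalpha
    out << first << " ";
    out.unsetf(std::ios::boolalpha);   // same effect as out << std::noboolalpha
    out << second;
    return out.str();
}
```

With both arguments true this returns "true 1", which is the same persistence-until-cleared behaviour as the manipulator version.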

So what about more basic file IO, such as reading an entire line into a string as opposed to a single word? To do this you need to avoid the stream operators and instead use the appropriate functions; for example, std::getline() will read up to a newline into a string:

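// Displays specified file to stdout with line numbers.
// Requires the <fstream>, <iomanip>, <iostream> and <string> headers.
void dumpFile(const char *filename)
{
    std::fstream myFile;
    myFile.open(filename, std::ios::in);
    unsigned int lineNum = 0;
    std::string line;
    // Test the result of getline() itself, so that the failed read at
    // end-of-file doesn't print a spurious extra line.
    while (std::getline(myFile, line)) {
        std::cout << std::setw(3) << ++lineNum << " " << line << std::endl;
    }
    myFile.close();
}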

To instead read an arbitrary number of characters, the read() method can be used, which reads into a standard C char array. There are also overloads of the get() method which do much the same thing (although they stop early at a delimiter, a newline by default), but since the read() method has only a single purpose it’s probably clearer to use that.
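A minimal sketch of read() in action follows. The helper name readChunk and the 15-character limit are arbitrary choices for illustration. Since read() doesn’t report how much it managed to fetch, the gcount() method is used afterwards to find out how many characters actually arrived, which may be fewer than requested if the stream ran out.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // only needed to try this out on an in-memory stream
#include <string>

// Hypothetical helper: reads up to 15 characters into a C char array and
// returns whatever was actually read.
std::string readChunk(std::istream &in)
{
    char buffer[15];
    in.read(buffer, sizeof(buffer));
    // gcount() reports how many characters the last read() stored.
    return std::string(buffer, static_cast<std::size_t>(in.gcount()));
}
```

Unlike >> and std::getline(), read() pays no attention to whitespace or newlines; it just copies characters until the count is reached or the input ends.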

To read into a std::string, however, requires another concept of C++ streams — the streambuf. This is just a generalisation of a buffer which holds characters which can be sent or received from streams. Existing streams use a buffer to hold characters read and written to the stream which can be accessed via the rdbuf() method. Using this and our own std::stringstream, which is a stream wrapper around a std::string, we can read an entire file into a std::string:

// Requires the <sstream> header.
std::string readFile(std::fstream &inFile)
{
    std::stringstream buffer;
    buffer << inFile.rdbuf();
    return buffer.str();
}

However, this still doesn’t address the issue of reading n characters from the stream directly into a std::string. I’ve looked into this and frankly I don’t think it’s possible without resorting to reading into a char array, although as a result of the Stack Overflow question which I asked just now I’ve realised that the string’s own storage can safely serve as that array. The trick is to call resize() to make sure the string has enough valid space to store the result of the read and then use the non-const version of operator[] to get the address of the string’s character storage [2]. Crucially you can use neither c_str() nor data(), which (as of C++11) both return read-only pointers; the result of modifying the characters they point to is undefined.
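Here’s how I’d sketch that trick, assuming the string’s storage is contiguous (which C++11 guarantees, and which holds in practice on earlier implementations). The function name readChars is my own invention. The second resize() trims the string back down if the stream held fewer characters than requested.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // only needed to try this out on an in-memory stream
#include <string>

// Hypothetical helper: reads up to count characters from the stream
// directly into a std::string's own storage.
std::string readChars(std::istream &in, std::size_t count)
{
    std::string result;
    if (count == 0) {
        return result;             // avoid taking &result[0] of an empty string
    }
    result.resize(count);          // make sure there's valid space to write into
    in.read(&result[0], static_cast<std::streamsize>(count));
    result.resize(static_cast<std::size_t>(in.gcount()));  // keep what was read
    return result;
}
```

The guard for a zero count is there because indexing an empty string with the non-const operator[] wasn’t well-defined before C++11.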

Finally, I’ll very briefly cover the issue of customising a class so that it can be sent to and from streams like the built-in types. Actually this is just as simple as creating a new overload of the operator<< function (or operator>> for input) with the appropriate stream type. The example below shows a class which can output itself:

#include <iostream>

class MyIntWrapper
{
public:
  MyIntWrapper(int value) : value(value) { }
private:
  int value;

  friend std::ostream &operator<<(std::ostream &o, const MyIntWrapper &i);
};

std::ostream &operator<<(std::ostream &ost, const MyIntWrapper &instance)
{
  ost << "<MyIntWrapper: " << instance.value << ">";
  return ost;
}

int main()
{
  MyIntWrapper sample(123);
  std::cout << sample << std::endl;
  return 0;
}

The only thing to note is the friend declaration, which is required to allow the operator function to access the private data members of the class. Input operators can be overloaded in a similar way.
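For completeness, here’s a rough sketch of what the input side might look like for the same class. The default constructor argument and the get() accessor are additions of mine so that the result can be inspected; they aren’t part of the class above. The value is only updated if the underlying integer extraction succeeds, so a failed read leaves the object untouched and the stream in its failed state.

```cpp
#include <istream>
#include <sstream>   // only needed to try this out on an in-memory stream

class MyIntWrapper
{
public:
  MyIntWrapper(int value = 0) : value(value) { }
  int get() const { return value; }   // hypothetical accessor for this demo
private:
  int value;

  friend std::istream &operator>>(std::istream &i, MyIntWrapper &w);
};

std::istream &operator>>(std::istream &ist, MyIntWrapper &instance)
{
  int parsed;
  if (ist >> parsed) {        // only store the value if parsing worked
    instance.value = parsed;
  }
  return ist;                 // returning the stream allows chaining
}
```

Note that the second parameter is a non-const reference this time, since the operator needs to modify the object it’s reading into.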


  1. Note that error handling has been omitted for clarity in all examples. 

  2. Implementations which use copy-on-write, such as the GNU STL, are forced to perform any required copy operations when the non-const version of operator[] is used. Interestingly, C++11 effectively forbids copy-on-write implementations of std::string which makes the whole thing rather less tricky (but also potentially slower for some use-cases, although those cases should probably be using their own classes anyway). 
