IOStream Is Hopelessly Broken


The <iostream> library in C++ is hopelessly broken. It’s not surprising, since C++ has so many problems, but <iostream> is on a whole other level. It’s like an advanced alien civilization has infiltrated Earth and given us this terrible API in an effort to sabotage our software developers. Even std::printf() is better.

// C style
std::printf("Hello, world!\n");

// C++ style
std::cout << "Hello, world!\n";

// Using {fmt}
fmt::print("Hello, world!\n");

“Now, what’s wrong with std::cout?” you ask. “It looks fine to me.”

America Is Not the Only Country in the World

Native English speakers are spoiled, since we can go our entire lives without having to read a piece of documentation written in another language. We forget that other languages even exist, so we write code like this:

std::cout << "There " << (n == 1 ? "is" : "are") << " " << n
          << " file" << (n == 1 ? "" : "s") << "." << '\n';

Clever, right? A native English speaker wrote code like that for a project I was localizing. Let’s pretend we didn’t see that.
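A less fragile approach is to choose between complete sentences instead of gluing fragments together. The following is a minimal sketch for English only; file_count_message is a hypothetical helper, not code from that project. Other languages have different plural rules, which is the problem gettext's ngettext() exists to handle.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: picks one of two complete sentences (English rules
// only) and formats the count into it.
std::string file_count_message(unsigned long n) {
  const char *format =
      (n == 1) ? "There is %lu file." : "There are %lu files.";
  char buffer[64];
  std::snprintf(buffer, sizeof buffer, format, n);
  return buffer;
}
```

Each sentence is now a whole string that a translator can see and replace in one piece.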

Here’s a more sensible piece of code, which displays an error message. I’ve taken this message from GNU Coreutils and rewritten it in C++.

// English
std::cout << "failed to clone " << target << " from " << src;

We send this line of code to our translators, who give us four new versions.

// French
std::cout << "impossible de cloner " << target << " depuis " << src;

// German
std::cout << target << " konnte nicht von " << src
          << " geklont werden";

// Dutch
std::cout << "kan " << src << " niet klonen naar " << target;

// Japanese
std::cout << src << " から " << target << " への複製に失敗しました";

Wait a moment! Isn’t target supposed to come first, and src supposed to come last? What happened to Dutch and Japanese? Do we have to write a separate piece of code for each language?

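No, we can just store format strings in a file. With std::printf(), we can reorder the format parameters by writing %1$s for the first parameter and %2$s for the second. This positional syntax is a POSIX extension rather than part of standard C, and a single format string must use it either for every parameter or for none.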

// English
std::printf("failed to clone %s from %s", target, src);

// French
std::printf("impossible de cloner %s depuis %s", target, src);

// German
std::printf("%s konnte nicht von %s geklont werden", target, src);

// Dutch
std::printf("kan %2$s niet klonen naar %1$s", target, src);

// Japanese
std::printf("%2$s から %1$s への複製に失敗しました", target, src);

Now we have a bunch of strings we can just store in a file and look up later. Better yet, that file will be full of complete sentences, which are much easier to translate than the fragments you see in code that uses iostream. If you use gettext, your code will just look like this:

std::printf(_("failed to clone %s from %s"), target, src);

So you can already see that if you ever want to translate your application, you are going to want to rip out all of the code that uses std::cout and replace it with std::printf() or a library like {fmt}.
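To make the lookup concrete, here is a minimal sketch of the idea. It assumes a POSIX C library that supports positional arguments; the in-memory catalog and the names clone_error_catalog and clone_error are hypothetical stand-ins for a real translation file.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical in-memory catalog. A real program would load these strings
// from a translation file, for example through gettext.
const std::map<std::string, std::string> &clone_error_catalog() {
  static const std::map<std::string, std::string> catalog = {
      {"en", "failed to clone %1$s from %2$s"},
      {"nl", "kan %2$s niet klonen naar %1$s"},
  };
  return catalog;
}

// Formats the message for `language`, falling back to English.
// Assumption: the C library accepts POSIX positional arguments (%1$s, %2$s).
std::string clone_error(const std::string &language, const char *target,
                        const char *src) {
  const auto &catalog = clone_error_catalog();
  const auto entry = catalog.find(language);
  const std::string &format =
      (entry != catalog.end()) ? entry->second : catalog.at("en");
  char buffer[256];
  std::snprintf(buffer, sizeof buffer, format.c_str(), target, src);
  return buffer;
}
```

The call site always passes target first and src second. Only the format string decides the order in which they appear, which is exactly what the iostream version cannot express.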

Thread Safety Is for Wimps

Did you know that std::printf() is thread-safe on POSIX systems? From flockfile, POSIX.1-2008:

All functions that reference (FILE *) objects, except those with names ending in _unlocked, shall behave as if they use flockfile() and funlockfile() internally to obtain ownership of these (FILE *) objects.

In plain English, that means that we can put as much text into a single std::printf() call as we want, and the text will always stay together. Here’s a multithreaded greeting program:

#include <cstdio>
#include <thread>
void greet(const char *name) {
  // One operation, always stays together.
  std::printf("Hello, %s!\n", name);
}
void greet_everyone() {
  // Each thread greets one person; the two calls may run in either order.
  std::thread t1{[]() { greet("Alice"); }};
  std::thread t2{[]() { greet("Bob"); }};
  t1.join();
  t2.join();
}

As we expect, the order in which it greets people is not deterministic, but it will always greet people correctly.

$ ./a.out
Hello, Alice!
Hello, Bob!
$ ./a.out
Hello, Bob!
Hello, Alice!

Our iostream version looks fine, but sometimes the output gets all messed up. That’s because each call to operator<< is a completely separate operation. C++ guarantees that we won’t see a data race, but that’s not much consolation.

void greet(const char *name) {
  // Three different operations, can get split apart.
  std::cout << "Hello, " << name << "!\n";
}
$ ./a.out
Hello, Bob!
Hello, Alice!
$ ./a.out
Hello, Hello, AliceBob!
!

We could use a temporary buffer, but even this is not guaranteed to work by the C++ standard.

void greet(const char *name) {
  std::stringstream ss;
  ss << "Hello, " << name << "!\n";
  // Maybe.  Not guaranteed by the standard!
  std::cout << ss.str();
}

This comes up when you add logging to your program and discover that some of the lines in the log file are corrupted. You should probably use a logging library, but if you don’t need advanced features, std::printf() is already good enough.

See Is cout synchronized/thread-safe? and stdout thread-safe in C on Linux?
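If the program has to stay on iostream, the usual workaround is to do the locking ourselves. This is a minimal sketch: output_mutex is a hypothetical helper, and the approach only works if every thread that writes to the stream goes through the same lock. The stream is a parameter so the function can be pointed at std::cout or at a log file.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>

// Hypothetical helper: one mutex shared by everything that writes output.
std::mutex &output_mutex() {
  static std::mutex m;
  return m;
}

// Formats into a private buffer first, then writes the finished line while
// holding the lock, so the three pieces cannot be split apart.
void greet(std::ostream &out, const char *name) {
  std::ostringstream line;
  line << "Hello, " << name << "!\n";
  std::lock_guard<std::mutex> lock(output_mutex());
  out << line.str();
}
```

That is a mutex, a buffer, and a convention everyone has to follow, all to get what std::printf() already does on POSIX systems.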

Operator Overloading Abuse

Yes, we all know that operator overloading is easily abused, but we’re adults and we can be trusted to do the right thing when we’re writing C++. Or can we?

#include <iostream>
int main() {
  int x = 5, y = 10;
  std::cout << "sum = " << x + y << '\n';
  unsigned data = 0xfeed0123;
  std::cout << "low byte = " << data & 0xff << '\n';
  return 0;
}
$ c++ test.cpp
test.cpp: In function ‘int main()’:
test.cpp:6:38: error: no match for ‘operator&’ (operand types are ‘std::basic_ostream<char>::__ostream_type {aka std::basic_ostream<char>}’ and ‘int’)
   std::cout << "low byte = " << data & 0xff << '\n';
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~

C++ is just full of surprises like this. That’s a lovely error message, too. The problem here is that << has lower precedence than + and *, but higher precedence than &. From another perspective, the problem here is that << was intended to be an arithmetic operator, and we’re using it for a completely unintended purpose.

The error even moves to a different place if you use std::endl instead of '\n'.

$ c++ test.cpp
test.cpp: In function ‘int main()’:
test.cpp:6:45: error: invalid operands of types ‘int’ and ‘<unresolved overloaded function type>’ to binary ‘operator<<’
   std::cout << "low byte = " << data & 0xff << std::endl;
                                        ~~~~~^~~~~~

None of this is a problem if you use function syntax.

#include <cstdio>
int main() {
  int x = 5, y = 10;
  std::printf("sum = %d\n", x + y);
  unsigned data = 0xfeed0123;
  std::printf("low byte = %u\n", data & 0xff);
  return 0;
}
$ c++ test.cpp
$ ./a.out
sum = 15
low byte = 35

The {fmt} library looks similar.

#include <fmt/format.h>
int main() {
  int x = 5, y = 10;
  fmt::print("sum = {}\n", x + y);
  unsigned data = 0xfeed0123;
  fmt::print("low byte = {}\n", data & 0xff);
  return 0;
}

The Curse of Statefulness

IO streams in C++ have formatting state, which you must remember to reset whenever you modify it.

#include <iomanip>
#include <iostream>
void func1() {
  // Should always print the same thing, right?
  std::cout << "1000 is " << 1000 << '\n';
}
void func2() {
  std::cout << "0xff is 0x" << std::hex << std::setw(8)
            << std::setfill('0') << 0xff << '\n';
}
int main() {
  func1();
  func2();
  func1();
}
$ c++ test.cpp
$ ./a.out
1000 is 1000
0xff is 0x000000ff
1000 is 3e8

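Here is the usual way to fix it, using only the standard C++ library. Strictly speaking, flags() does not cover the fill character that std::setfill changed, but nothing later in this program sets a width, so the leftover fill never shows. Of course, we could use the Boost I/O Stream-State Saver library to do this for us, but the fact that such a library even exists is ridiculous.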

void func2() {
  std::ios::fmtflags f(std::cout.flags());
  std::cout << "0xff is 0x" << std::hex << std::setw(8)
            << std::setfill('0') << 0xff << '\n';
  std::cout.flags(f);
}
$ c++ test.cpp
$ ./a.out
1000 is 1000
0xff is 0x000000ff
1000 is 1000

So we add two extra lines of code just so we can avoid messing up the state. This is another bit of complexity that was done so much better and simpler with std::printf(). You might not remember how to write %#010x, but we should have been able to fix it without making everything else awful. The std::printf() code is so short and clean, as usual.

#include <cstdio>
void func1() {
  std::printf("1000 is %d\n", 1000);
}
void func2() {
  std::printf("0xff is %#010x\n", 0xff);
}
$ c++ test.cpp
$ ./a.out
1000 is 1000
0xff is 0x000000ff
1000 is 1000

The {fmt} code is similar.

#include <fmt/format.h>
void func1() {
  fmt::print("1000 is {}\n", 1000);
}
void func2() {
  fmt::print("0xff is {:#010x}\n", 0xff);
}
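For code that has to stay on iostream, the save-and-restore dance can at least be wrapped in a small guard object. This is a minimal sketch in the spirit of the Boost savers, not their actual implementation; FormatGuard and print_hex are hypothetical names. It saves the fill character and precision as well as the flags, since flags() alone does not cover them.

```cpp
#include <iomanip>
#include <ios>
#include <ostream>

// Hypothetical RAII guard: records the formatting state on construction
// and puts it back when the guard goes out of scope.
class FormatGuard {
public:
  explicit FormatGuard(std::ostream &stream)
      : stream_(stream),
        flags_(stream.flags()),
        fill_(stream.fill()),
        precision_(stream.precision()) {}
  ~FormatGuard() {
    stream_.flags(flags_);
    stream_.fill(fill_);
    stream_.precision(precision_);
  }
  FormatGuard(const FormatGuard &) = delete;
  FormatGuard &operator=(const FormatGuard &) = delete;

private:
  std::ostream &stream_;
  std::ios::fmtflags flags_;
  char fill_;
  std::streamsize precision_;
};

// Prints a zero-padded hexadecimal value without leaking std::hex or the
// '0' fill character into later output.
void print_hex(std::ostream &out, unsigned value) {
  FormatGuard guard(out);
  out << "0x" << std::hex << std::setw(8) << std::setfill('0') << value
      << '\n';
}
```

After print_hex(std::cout, 0xff) returns, a later std::cout << 1000 prints 1000 again. Compare the length of this with the single format string above.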

Conclusions

C++ I/O streams aren’t just bad, they are horrible. They are less flexible and less capable than printf(), which they are supposed to replace. The only advantage is that you can overload operator<<, but that is a small benefit for such an enormous downside. I/O streams don’t even offer any additional type safety these days, since modern compilers will verify format arguments. It’s also telling that a few other languages copied printf(), but I can’t name a single language that copied the <iostream> API.
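The compiler checking mentioned above also extends to your own printf-style wrappers. This is a sketch assuming GCC or Clang, where the format attribute is a compiler extension; PRINTF_LIKE and format_message are hypothetical names.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Assumption: GCC or Clang. Other compilers get an empty macro and
// therefore no format checking for this wrapper.
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(format_index, first_arg) \
  __attribute__((format(printf, format_index, first_arg)))
#else
#define PRINTF_LIKE(format_index, first_arg)
#endif

// Hypothetical wrapper: formats like printf() but returns a std::string.
std::string format_message(const char *format, ...) PRINTF_LIKE(1, 2);

std::string format_message(const char *format, ...) {
  std::va_list args;
  va_start(args, format);
  std::va_list args_copy;
  va_copy(args_copy, args);
  // First pass measures the output, second pass writes it.
  const int length = std::vsnprintf(nullptr, 0, format, args);
  va_end(args);
  if (length < 0) {
    va_end(args_copy);
    return std::string();
  }
  std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
  std::vsnprintf(buffer.data(), buffer.size(), format, args_copy);
  va_end(args_copy);
  return std::string(buffer.data(), static_cast<std::size_t>(length));
}
```

With the attribute in place, a call such as format_message("%s", 42) should draw a -Wformat warning from those compilers, just as a direct printf() call would.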

This article doesn’t even mention some of the older flaws with I/O streams. If you are still stuck using older toolchains, you might have I/O streams that cast to void * when you check for errors, streams that are noticeably slower than their C counterparts, and problems with code size when you use streams (see printf vs cout in C++).

Fortunately, we don’t have to use I/O streams; alternatives such as std::printf() and the {fmt} library are available.

More Information

  1. C++ FQA: Input/output via <iostream> and <cstdio>
  2. Quora: When would you use fprintf instead of cerr/iostream in C++?
  3. Stack Overflow: printf vs cout in C++