C++11 Quine

By | 2013-07-02

A Quine is a program that takes no inputs and produces its source as its output. “No inputs” includes its source. They are an interesting little computational trick, and it takes a while to understand what’s happened. For a long time, C and C++ quines have been among the more complicated because they have never supported any kind of delimited string input. Producing a quine was still possible, but it wasn’t pretty because of all the escapes needed. With C++11, that limitation has been removed thanks to it’s support for raw strings.

auto const raw_string = R"(_RAW_STRING_GOES_HERE_)";

The key feature of a raw string is that it is delimited by more than just a quote. The delimiters can, in fact, be any arbitrary string of the forms:

R"_arbitrary_string_(

i.e. R" is always present, and ( is always present, whatever is in between may be chosen by you at will to not clash with the content you wish to delimit. The end delimiter is similar:

)_arbitrary_string_";

Inside a raw string, backslash escapes are not processed, and of course quotes are just quotes.

That makes it possible to make an educationally clear quine in C++11. I’d like to do more than just give you one though; I’d like to build up to one. Here’s what we’d like:

int main()
{
    cout << THIS_PROGRAM << endl;
}

This is obviously incomplete, since THIS_PROGRAM is undefined. It’s what we’re aiming for though. Hopefully you can see the difficulty, a naive definition of THIS_PROGRAM won’t work:

auto const THIS_PROGRAM = "int main() { cout << THIS_PROGRAM << endl;}";

int main()
{
    cout << THIS_PROGRAM << endl;
}

It doesn’t work because THIS_PROGRAM doesn’t contain the definition of THIS_PROGRAM. It should be obvious that continuing down this path will fail, everything we might add to THIS_PROGRAM will then be missing whatever we’ve just added to THIS_PROGRAM.

Let’s instead think about the general form of a quine. It must be this:

THIS_PROGRAM = START_RAW + QUINE + END_RAW + QUINE

This is the core trick of a quine. There is some part of the text of the program, that here we’ve name QUINE, that appears both inside a definition of a string, and outside. We are certain that this very line will be part of QUINE. That gives us somewhere to begin.

auto const QUINE = R"***(
// QUINE will be above this line
auto const THIS_PROGRAM = BEFORE_QUINE + QUINE + AFTER_QUINE + QUINE;
)***";
// QUINE will be above this line
auto const THIS_PROGRAM = BEFORE_QUINE + QUINE + AFTER_QUINE + QUINE;

This is obviously missing definitions of BEFORE_QUINE and AFTER_QUINE, but what goes in them is already defined, because we’ve implicitly written them. We can add them easily, by observation:

auto const QUINE = R"***(
// QUINE will be above this line
auto const BEFORE_QUINE = "auto const QUINE = R\"***(";
auto const AFTER_QUINE = ")***\";";
auto const THIS_PROGRAM = BEFORE_QUINE + QUINE + AFTER_QUINE + QUINE;
)***";
// QUINE will be above this line
auto const BEFORE_QUINE = "auto const QUINE = R\"***(";
auto const AFTER_QUINE = ")***\";";
auto const THIS_PROGRAM = BEFORE_QUINE + QUINE + AFTER_QUINE + QUINE;

Note that every line we add inside QUINE gets added outside as well, so that the program always includes its own source. Note that the only thing missing from the definition of QUINE is the definition of QUINE itself, but that’s fine because we’re adding that on the fly in THIS_PROGRAM;

It might not look it, but these few lines represent the job done, we have a program that stores its own text in the constant THIS_PROGRAM. It should be easy to see, that “storing” or “outputting” makes no major difference. Everything we do now is just to satisfy the language, and ensure all the newlines are correct.

#include <iostream>

auto const QUINE = R"***(
// QUINE will be above this line

auto const BEFORE_QUINE =
    "#include <iostream>\n\n"
    "auto const QUINE = R\"***(";
auto const AFTER_QUINE = ")***\";";

int main()
{
    std::cout << BEFORE_QUINE << QUINE << AFTER_QUINE << QUINE;
}
)***";
// QUINE will be above this line

auto const BEFORE_QUINE =
    "#include <iostream>\n\n"
    "auto const QUINE = R\"***(";
auto const AFTER_QUINE = ")***\";";

int main()
{
    std::cout << BEFORE_QUINE << QUINE << AFTER_QUINE << QUINE;
}

Proof?

$ clang++ -ggdb3 -O2 -Wall -Wextra -std=c++11 -o quine quine.cc
$ quine > quine2.cc
$ md5sum quine.cc quine2.cc
d6353ab8c00324f4e905ad2bc54a8668  quine.cc
d6353ab8c00324f4e905ad2bc54a8668  quine2.cc

Here’s a fun party trick with a quine like this (it’s not complicated, but it is strange):

$ clang++ -std=c++11 -o quine quine.cc
$ rm -f quine.cc
$ ./quine | clang++ -std=c++11 -xc++ -o quine -
$ ./quine | clang++ -std=c++11 -xc++ -o quine -
$ ./quine | clang++ -std=c++11 -xc++ -o quine -
$ ./quine | clang++ -std=c++11 -xc++ -o quine -
$ ./quine | clang++ -std=c++11 -xc++ -o quine -

Compiling the output of a binary to make a new copy of the binary. Weird.

This is actually the beginnings of a trick that Ken Thompson talked about when discussing the issue of trustworthy compilers.

Leave a Reply