Code Evaluating IRC Bots and the Quine Attack Vector

Background

It wasn't even four hours into the new year and I had already done something stupid. That's pretty impressive - even by my standards.

Over in some channel on some IRC network we have a very useful bot. It can be used for (wildcarded) API lookups, remembers a few important URLs, and it can also evaluate code. Executing arbitrary code sounds scary at first - and well, it actually is. However, there are many safety nets to prevent any harm: it's sandboxed, execution time is limited to a few milliseconds, it runs in headless mode, there is flood protection, it runs in a separate process, memory usage is capped, and output is queued in a Set and capped, which ensures that at most 3 distinct lines are printed.
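To make that last safety net a bit more concrete, here is a rough sketch of how such an output cap might look. The bot's real code isn't shown here, so the class name OutputCap, the method cap and the use of a LinkedHashSet are all assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the real bot's output queue may look quite different.
final class OutputCap {
    static final int MAX_LINES = 3; // "up to 3 different lines", as described above

    // Collects distinct lines in first-seen order and keeps at most MAX_LINES of them.
    static List<String> cap(List<String> rawLines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : rawLines) {
            if (seen.size() >= MAX_LINES) {
                break; // everything after the cap is dropped
            }
            seen.add(line); // duplicates are ignored by the Set
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(cap(List.of("a", "a", "b", "c", "d"))); // prints [a, b, c]
    }
}
```

Note that a cap like this only limits what a single evaluation can print. It says nothing about how often evaluations get triggered.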

Thanks to the headless mode, even creative attack routes are cut off, such as inlining an image that exploits a yet-to-be-discovered image parsing vulnerability (the known ones should all be patched by now) on that rather exotic operating system (BSD). Not to mention that it would be very difficult to fit all of that into less than 512 bytes.

So, what could possibly go wrong?

Famous last words, but there was nothing I could think of. We tried to break the bot in every possible way and didn't find a single issue. Every safety net works as intended and there doesn't seem to be a way to break out. I thought everything was covered - and for the isolated case (the model I built my assumptions on) that might actually be true.

What Went Wrong

That's where Murphy's law comes into play. I simply wasn't thinking outside the box. Due to network issues and/or human error we ended up with two instances of the bot in the channel. While a bot cannot trigger itself, it can trigger other bots. And that's where Quines enter the picture.

In a nutshell: a Quine is a program that prints its own complete source code as its only output. And I thought it would be a good idea to try that. Boy, was I wrong! I assumed the flood protection would kick in and stop it (after some flooding). Thanks to network and compilation latencies, that didn't happen though.
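The ping-pong effect can be played through with a small simulation. This is only a toy model: the evaluator is a stand-in that recognises nothing but the Quine from this post, flood protection and latencies are left out entirely, and the names QuinePingPong, evaluate and simulate are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the channel; not the real bot.
final class QuinePingPong {
    static final String TRIGGER = "~exec ";
    static final String S = "~exec String s=%c%s%c;out(String.format(s,34,s,34));";
    static final String QUINE = String.format(S, 34, S, 34); // the full line as typed

    // Stand-in evaluator: recognises only the Quine and returns what it would print.
    static String evaluate(String code) {
        return (TRIGGER + code).equals(QUINE) ? QUINE : "";
    }

    // Counts bot replies until the message budget is used up or the channel goes quiet.
    static int simulate(int botCount, int messageBudget) {
        Deque<String[]> channel = new ArrayDeque<>(); // each entry: {sender, line}
        channel.add(new String[] {"human", QUINE});
        int botReplies = 0;
        while (!channel.isEmpty() && messageBudget-- > 0) {
            String[] msg = channel.poll();
            for (int i = 0; i < botCount; i++) {
                String bot = "bot" + i;
                // A bot ignores its own lines and anything without the trigger.
                if (bot.equals(msg[0]) || !msg[1].startsWith(TRIGGER)) {
                    continue;
                }
                String out = evaluate(msg[1].substring(TRIGGER.length()));
                if (!out.isEmpty()) {
                    channel.add(new String[] {bot, out});
                    botReplies++;
                }
            }
        }
        return botReplies;
    }

    public static void main(String[] args) {
        System.out.println(simulate(1, 10)); // prints 1: a lone bot answers once, then silence
        System.out.println(simulate(2, 10)); // prints 11: two bots keep feeding each other
    }
}
```

With one bot the channel goes quiet after a single reply. With two bots the queue never empties, and only the message budget ends the simulation.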

After panicking for a bit (heh) I managed to starve out the bots' queues with gentle flooding. Phew! I'm really glad that that worked, otherwise they would have filled the logs with a few megabytes of junk before an operator could have intervened. After all it was New Year's Day and most people (myself included) were sorta wasted. On the flip side it's sort of interesting - who would have thought that Quines can be this malicious?

Solutions

Pointing out issues like that without offering a solution isn't really helpful, is it? For this reason I thought about it a bit before writing about it. The first awfully naïve solution which comes to mind is checking whether the output matches the input. That won't do the trick, because there are also alternating Quines. That is: program A outputs program B which outputs program A. There are even language alternating Quines. I think it was Markus Persson who wrote one of those which alternated between Java and BrainF*ck.
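To see why the naïve check falls short, here is a sketch of a two-step alternating Quine. It is a hypothetical variation on the Quine I used (a flag n flips between 0 and 1 on every run) and was never tried against the actual bot. The class AlternatingQuine and its methods are made up for illustration.

```java
// Hypothetical alternating variant of the Quine from this post.
final class AlternatingQuine {
    // Data string; the flag n flips on every run (0 -> 1 -> 0 ...).
    static final String S =
        "~exec int n=%d;String s=%c%s%c;out(String.format(s,1-n,34,s,34));";

    // Source text of the program whose flag is n.
    static String source(int n) {
        return String.format(S, n, 34, S, 34);
    }

    // What that program prints: the same text with the flag flipped.
    static String output(int n) {
        return String.format(S, 1 - n, 34, S, 34);
    }

    // The naive filter: block the output only if it equals the input line.
    static boolean naiveCheckBlocks(String input, String output) {
        return input.equals(output);
    }

    public static void main(String[] args) {
        String a = source(0);
        String b = output(0);
        System.out.println(naiveCheckBlocks(a, b)); // prints false: the filter lets it through
        System.out.println(b.equals(source(1)));    // prints true: A prints B
        System.out.println(output(1).equals(a));    // prints true: B prints A again
    }
}
```

Input and output differ in a single character, so an equality check never fires, yet the loop closes after two steps.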

So, that won't work. Trying to work out whether it's a Quine of some sort is also out of the question. It's too complicated, which means it would be too fragile.

A better starting point is the trigger mechanic. The trigger character has to be the first character in the line (typically '!', '.', '-', or '~' is used). All we need to do is to ensure that the output won't start with that kind of thing. Using a blacklist is error prone and prefixing everything with a space would be a bit ugly.

Fortunately there are invisible control characters which could be used as an output prefix. Possible options are the "normal" control character (U+000F), which was introduced by mIRC, the zero-width joiner aka ZWJ (U+200D), or the zero-width non-joiner aka ZWNJ (U+200C). The "normal" control character is most likely the safest choice, since it avoids encoding issues completely.
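A minimal sketch of that prefixing idea, assuming the bot funnels every output line through one place before sending it. OutputGuard, guard and wouldTrigger are hypothetical names, and the trigger list is just the four characters mentioned above:

```java
// Hypothetical sketch of the output prefix; not the bot's actual code.
final class OutputGuard {
    // U+000F: the "normal" formatting control character mentioned above.
    static final char NORMAL = '\u000F';

    // Prefixes a line so that its first character can never be a trigger character.
    static String guard(String line) {
        return NORMAL + line;
    }

    // How a receiving bot might decide whether a line is a command.
    static boolean wouldTrigger(String line) {
        return !line.isEmpty() && "!.-~".indexOf(line.charAt(0)) >= 0;
    }

    public static void main(String[] args) {
        String output = "~exec out(1);";
        System.out.println(wouldTrigger(output));        // prints true
        System.out.println(wouldTrigger(guard(output))); // prints false
    }
}
```

Since the prefix is added unconditionally, no blacklist of trigger characters has to be maintained on the sending side.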

Bonus Round: How the Quine I Used Works

The complete Quine:

~exec String s="~exec String s=%c%s%c;out(String.format(s,34,s,34));";out(String.format(s,34,s,34));

The pieces:

  • ~exec is the evaluation trigger.
  • String.format is a String formatting method, similar to C's printf.
  • The number 34 formatted as char is the quote (") character.
  • out is a print function.

The data:

String s="~exec String s=%c%s%c;out(String.format(s,34,s,34));"

What happens with that data (via String.format):
[Figure: string expansion diagram]

  • The first %c is replaced with ".
  • %s is replaced with the data string s.
  • The second %c is replaced with ".
  • Then this formatted string is printed.

And we end up with this Quine's source code again. :)
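If you want to check this without an IRC bot at hand, the expansion can be replayed in plain Java. The bot's out function is replaced by simply returning the formatted string. QuineCheck and run are names I made up for this sketch.

```java
// Replays the String.format expansion outside of the bot.
final class QuineCheck {
    // The complete line as typed into the channel.
    static final String SOURCE =
        "~exec String s=\"~exec String s=%c%s%c;out(String.format(s,34,s,34));\";"
        + "out(String.format(s,34,s,34));";

    // What the Quine hands to out(): the data string formatted with itself.
    static String run() {
        String s = "~exec String s=%c%s%c;out(String.format(s,34,s,34));";
        return String.format(s, 34, s, 34); // 34 formatted via %c is the quote character
    }

    public static void main(String[] args) {
        System.out.println(run());
        System.out.println(run().equals(SOURCE)); // prints true
    }
}
```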

The channel/network was intentionally omitted, and nicks and some details were also changed.

Comments

Coolest blog post title this year

No, seriously...quite definitely the coolest.

Heh

It's a bit too long if you ask me, but it's the shortest descriptive title I could think of. And uhm... isn't it a bit too early to throw these awards around? ;)

Ha!

Nice one Aho! :)
void256
