2014-04-17 23:25:43

What is this private key doing in my random pool?

As a consequence of the Heartbleed bug, the OpenBSD community has taken up the challenge of auditing the OpenSSL source code and improving its general quality. Given the recent events this is a very necessary and just cause and should by all means be supported.

A necessary prerequisite to editing OpenSSL code, however, should be either some knowledge of how cryptosystems work or, alternatively, guidance from people who have such knowledge. The underlying issue is that crypto source code tends to be extremely fragile, since a lot of the effort needs to go into preventing so-called “oracles”: code paths that reveal information about the success or failure of cryptographic processes, whether through differences in the error message, through differences in timing between the success and failure cases, or through differences in power consumption or heat radiation caused by different code paths being taken.

This post is not about that kind of oracle, though. Right now we're talking about a relatively simple matter: random pools.

Random Pools and Non-Random Data

The inner workings of random pools are relatively simple. Essentially, each pool has a given size and quality. Data which is used for encryption can only be withdrawn from the pool if the randomness of the data in the pool, the so-called entropy, has passed a given threshold. Until then, any attempt to withdraw random data from the pool will simply block until enough random data is available. This is the main reason why virtual machines sometimes take such a long time doing SSL handshakes.

The pool is usually fed from a variety of random sources. One of these sources can be key presses, which aren't very random per se but there is some amount of jitter in latency even if you are just typing away at a word. So this would feed into the pool as a source with a relatively low entropy value. You can also measure the timing differences between issuing read or write commands to the disk and getting a result back. Since this is not influenced by humans, the entropy of this source might be a lot higher. Finally, some people attach actual diodes to the serial or USB port of their machines and gather actual physical entropy, which would of course be the highest known level of entropy.

So essentially you would want an algorithm which credits the pool with perhaps 0.3 bytes of entropy for every byte read from the keyboard, 0.7 bytes for every byte from the hard disk, and a full byte for every byte from the diode. That's why the RAND_add function in OpenSSL takes a third parameter named “entropy”. It is an estimate, measured in bytes, of how much randomness the data passed into the pool contains, and therefore of how far the pool's entropy count may be advanced.
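The accounting described above can be sketched as a toy pool. This is an illustration only, not OpenSSL's implementation: the class ToyPool, its method names, and the 32-byte size and threshold are hypothetical, and ready() merely reports whether a read would be allowed where a real pool would block.

```python
class ToyPool:
    """Toy entropy pool: mixes data in by XOR and tracks an entropy estimate."""

    def __init__(self, size=32, threshold=32):
        self.state = bytearray(size)  # pool contents, initially all zero
        self.credit = 0.0             # estimated entropy in the pool, in bytes
        self.threshold = threshold    # credit required before reads are allowed
        self.pos = 0                  # next pool position to mix into

    def add(self, data, entropy):
        """Mix data in; entropy is the caller's estimate in bytes (0..len(data))."""
        if not 0 <= entropy <= len(data):
            raise ValueError("entropy estimate must be between 0 and len(data)")
        for byte in data:
            self.state[self.pos] ^= byte
            self.pos = (self.pos + 1) % len(self.state)
        # The estimate can never exceed the size of the pool itself.
        self.credit = min(self.credit + entropy, float(len(self.state)))

    def ready(self):
        """True once enough entropy has been credited to hand out data."""
        return self.credit >= self.threshold


pool = ToyPool()
pool.add(bytes(range(40)), entropy=12)  # keyboard-like source: 0.3 bytes per byte
print(pool.ready())
pool.add(bytes(range(40)), entropy=0)   # untrusted data: mixed in, but no credit
print(pool.ready())
pool.add(bytes(range(40)), entropy=28)  # disk-like source: 0.7 bytes per byte
print(pool.ready())
```

The point of the design is that mixing and crediting are separate steps: every input changes the pool contents, but only the caller's estimate moves the pool closer to the threshold.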

Typically this is achieved by using a XOR operation of the already-pooled random data with the data being passed in. As long as data is not XORed against itself (which yields 0), it will always come out at least as random as it went in.
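A minimal sketch of the XOR mixing step, assuming a pool and an input of equal length. The helper name xor_bytes is made up for this illustration and is not taken from OpenSSL.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


pool = secrets.token_bytes(16)      # stand-in for the already-pooled data
incoming = secrets.token_bytes(16)  # stand-in for the data being passed in
mixed = xor_bytes(pool, incoming)

# XOR is its own inverse: applying the same input again restores the pool,
# so mixing never destroys information that was already in the pool.
assert xor_bytes(mixed, incoming) == pool

# The one degenerate case: data XORed against itself is all zeroes.
assert xor_bytes(pool, pool) == bytes(16)
```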

Using entropy from untrusted sources

A nice game played by some nerds on IRC is called “random number exchange”. Essentially, you send someone a random number to be fed into the pool, but with an entropy of 0. This will not advance the other person's pool by a single bit. However, it has two possible outcomes:

  1. The data is not of good quality. The worst case of this would be a value that is all-ones or all-zeroes. Clearly, such a number is not random at all. In the case of all-zeroes, the contents of the random pool are unchanged. In the case of all-ones, the contents of the random pool end up being their bitwise complement. The bitwise complement of a random number is just as random as the original number.
  2. The data is actually random. In that case, the actual entropy in the random pool is increased, since the existing data is XORed against other random data. That makes it more unpredictable what will be returned next from the pool.

So as you can see, whatever data you feed into the random pool at entropy 0 can only increase the entropy of the pool data but never decrease it.
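The two worst cases from the list, and the general claim, can be checked exhaustively for a single pool byte. This is a sketch under the assumption that the incoming byte is chosen without knowledge of the pool contents; the function names are hypothetical.

```python
def mix_byte(pool_byte, incoming_byte):
    """Mix one incoming byte into one pool byte by XOR."""
    return pool_byte ^ incoming_byte


def is_permutation(incoming_byte):
    """True if XOR with incoming_byte maps the 256 pool values to 256 distinct values."""
    return len({mix_byte(p, incoming_byte) for p in range(256)}) == 256


# All-zeroes input: every possible pool byte is left unchanged.
assert all(mix_byte(p, 0x00) == p for p in range(256))

# All-ones input: every possible pool byte becomes its bitwise complement.
assert all(mix_byte(p, 0xFF) == (~p & 0xFF) for p in range(256))

# Any input byte at all only relabels the possible pool values one-to-one,
# so a pool byte that was equally likely to be any value stays that way.
assert all(is_permutation(b) for b in range(256))
```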

Feeding sensitive data into the random pool

What the OpenBSD audit found was that OpenSSL was feeding sensitive data (user passwords and cryptographic secret keys) into the random pool. The code doing so was deleted from the OpenBSD OpenSSL repository with a question in the commit message: “What were they thinking?!”

What they were thinking is actually a good question. As we learned already, any data that is fed into the random pool at entropy 0 cannot harm the quality of the random data. In the case of the code which was deleted, the entropy parameter of RAND_add was always 0. Also, the manual page of RAND_add clearly states:

RAND_add() may be called with sensitive data such as user entered passwords. The seed values cannot be recovered from the PRNG output.

Can we justify this claim?

Yes, we can. The important part here is that the data is fed into the pool at entropy 0, which means that no bytes will be released from the pool until enough entropy has been gathered for us to confidently state that the data returned is random. At that point, we have essentially XORed the sensitive data with high-entropy random data. This construction is known as a one-time pad. As long as the random data used in the XOR is guaranteed to be high-entropy and as long as it is never, ever re-used, whatever data is XORed against it becomes indistinguishable from random noise. This means that adding the sensitive data to the random pool may or may not have increased the entropy of the pool, but it also means that we can never tell what has been written into the pool.
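A toy illustration of the one-time pad argument, including the reason for the "never re-used" condition. The function otp and the sample secrets are invented for this sketch, and it models the argument as stated here, not OpenSSL's actual mixing code.

```python
import secrets


def otp(data, pad):
    """XOR data with a pad of the same length (one-time pad)."""
    if len(data) != len(pad):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, pad))


secret = b"hunter2"                     # stand-in for a password fed into the pool
pad = secrets.token_bytes(len(secret))  # stand-in for the high-entropy pool data
masked = otp(secret, pad)

# Only someone who knows the pad can undo the XOR.
assert otp(masked, pad) == secret

# Re-using the pad is what breaks the scheme: XORing two outputs cancels
# the pad and leaves the XOR of the two secrets.
other = b"letmein"
leak = otp(otp(secret, pad), otp(other, pad))
assert leak == otp(secret, other)
```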

So the patch which was committed by the OpenBSD people actually has the potential to weaken the entropy of the OpenSSL random pool, but it was never a security or privacy concern, so the patch doesn't fix anything. As such I would suggest that it should be backed out.

LibreSSL?

In addition to the technical side of things, the OpenBSD community is currently in the process of falling victim to a rather massive administrative fallacy in Open Source development. They are producing a rather large set of patches against OpenSSL. As recently outlined by Kristian Köhntopp (in German), this operation runs a high risk of the patches never making their way back into OpenSSL, because it would be too much effort to go through all of them. So essentially, OpenBSD is creating a fork of OpenSSL.

Given their minuscule manpower, I don't think they're up to that…