Impermanence in Linux – Exclusive (By Hari Iyer)
Impermanence, also called Anicca or Anitya, is one of the essential doctrines and a part of three marks of existence in Buddhism The doctrine asserts that all of conditioned existence, without exception, is “transient, evanescent, inconstant”
On Linux, the root of all randomness is something called the kernel entropy pool. This is a large (4,096 bit) number kept privately in the kernel’s memory. There are 24096 possibilities for this number so it can contain up to 4,096 bits of entropy. There is one caveat – the kernel needs to be able to fill that memory from a source with 4,096 bits of entropy. And that’s the hard part: finding that much randomness.
The entropy pool is used in two ways: random numbers are generated from it and it is replenished with entropy by the kernel. When random numbers are generated from the pool the entropy of the pool is diminished (because the person receiving the random number has some information about the pool itself). So as the pool’s entropy diminishes as random numbers are handed out, the pool must be replenished.
Replenishing the pool is called stirring: new sources of entropy are stirred into the mix of bits in the pool.
This is the key to how random number generation works on Linux. If randomness is needed, it’s derived from the entropy pool. When available, other sources of randomness are used to stir the entropy pool and make it less predictable. The details are a little mathematical, but it’s interesting to understand how the Linux random number generator works as the principles and techniques apply to random number generation in other software and systems.
The kernel keeps a rough estimate of the number of bits of entropy in the pool. You can check the value of this estimate through the following command:
A healthy Linux system with a lot of entropy available will have return close to the full 4,096 bits of entropy. If the value returned is less than 200, the system is running low on entropy.
The kernel is watching you
I mentioned that the system takes other sources of randomness and uses this to stir the entropy pool. This is achieved using something called a timestamp.
Most systems have precise internal clocks. Every time that a user interacts with a system, the value of the clock at that time is recorded as a timestamp. Even though the year, month, day and hour are generally guessable, the millisecond and microsecond are not and therefore the timestamp contains some entropy. Timestamps obtained from the user’s mouse and keyboard along with timing information from the network and disk each have different amount of entropy.
How does the entropy found in a timestamp get transferred to the entropy pool? Simple, use math to mix it in. Well, simple if you like math.
Just mix it in
A fundamental property of entropy is that it mixes well. If you take two unrelated random streams and combine them, the new stream cannot have less entropy. Taking a number of low entropy sources and combining them results in a high entropy source.
All that’s needed is the right combination function: a function that can be used to combine two sources of entropy. One of the simplest such functions is the logical exclusive or (XOR). This truth table shows how bits x and y coming from different random streams are combined by the XOR function.
Even if one source of bits does not have much entropy, there is no harm in XORing it into another source. Entropy always increases. In the Linux kernel, a combination of XORs is used to mix timestamps into the main entropy pool.
Generating random numbers
Cryptographic applications require very high entropy. If a 128 bit key is generated with only 64 bits of entropy then it can be guessed in 264 attempts instead of 2128 attempts. That is the difference between needing a thousand computers running for a few years to brute force the key versus needing all the computers ever created running for longer than the history of the universe to do so.
Cryptographic applications require close to one bit of entropy per bit. If the system’s pool has fewer than 4,096 bits of entropy, how does the system return a fully random number? One way to do this is to use a cryptographic hash function.
A cryptographic hash function takes an input of any size and outputs a fixed size number. Changing one bit of the input will change the output completely. Hash functions are good at mixing things together. This mixing property spreads the entropy from the input evenly through the output. If the input has more bits of entropy than the size of the output, the output will be highly random. This is how highly entropic random numbers are derived from the entropy pool.
The hash function used by the Linux kernel is the standard SHA-1 cryptographic hash. By hashing the entire pool and and some additional arithmetic, 160 random bits are created for use by the system. When this happens, the system lowers its estimate of the entropy in the pool accordingly.
Above I said that applying a hash like SHA-1 could be dangerous if there wasn’t enough entropy in the pool. That’s why it’s critical to keep an eye on the available system entropy: if it drops too low the output of the random number generator could have less entropy that it appears to have.
Running out of entropy
One of the dangers of a system is running out of entropy. When the system’s entropy estimate drops to around the 160 bit level, the length of a SHA-1 hash, things get tricky, and how they effect programs and performance depends on which of two Linux random number generators are used.
Linux exposes two interfaces for random data that behave differently when the entropy level is low. They are /dev/random and /dev/urandom. When the entropy pool becomes predictable, both interfaces for requesting random numbers become problematic.
When the entropy level is too low, /dev/random blocks and does not return until the level of entropy in the system is high enough. This guarantees high entropy random numbers. If /dev/random is used in a time-critical service and the system runs low on entropy, the delays could be detrimental to the quality of service.
On the other hand, /dev/urandom does not block. It continues to return the hashed value of its entropy pool even though there is little to no entropy in it. This low-entropy data is not suited for cryptographic use.
The solution to the problem is to simply add more entropy into the system.
Hardware random number generation to the rescue?
Intel’s Ivy Bridge family of processors have an interesting feature called “secure key.” These processors contain a special piece of hardware inside that generates random numbers. The single assembly instruction RDRAND returns allegedly high entropy random data derived on the chip.
It has been suggested that Intel’s hardware number generator may not be fully random. Since it is baked into the silicon, that assertion is hard to audit and verify. As it turns out, even if the numbers generated have some bias, it can still help as long as this is not the only source of randomness in the system. Even if the random number generator itself had a back door, the mixing property of randomness means that it cannot lower the amount of entropy in the pool.
On Linux, if a hardware random number generator is present, the Linux kernel will use the XOR function to mix the output of RDRAND into the hash of the entropy pool. This happens here in the Linux source code (the XOR operator is ^ in C).
Third party entropy generators
Hardware number generation is not available everywhere, and the sources of randomness polled by the Linux kernel itself are somewhat limited. For this situation, a number of third party random number generation tools exist. Examples of these are haveged, which relies on processor cache timing, audio-entropyd and video-entropyd which work by sampling the noise from an external audio or video input device. By mixing these additional sources of locally collected entropy into the Linux entropy pool, the entropy can only go up.