Does this work with a guaranteed random-bit output bandwidth?
These techniques (eg von neumann) all seem to suffer from that same problem, namely, they crank out uniformly distributed random bits from biased sources, but with no guarantee of how long you may have to wait for a bit to come out.
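For reference, here's the von Neumann trick as a quick Python sketch (the biased source and its 0.9 bias are made up for illustration):

```python
import random

def von_neumann(bits):
    """Von Neumann debiasing: read input bits in pairs, emit the first
    bit of each unequal pair, and discard equal pairs entirely."""
    it = iter(bits)
    for a, b in zip(it, it):
        if a != b:
            yield a

random.seed(0)  # deterministic demo
# Hypothetical biased source: P(1) = 0.9
biased = (1 if random.random() < 0.9 else 0 for _ in range(100_000))
out = list(von_neumann(biased))
ones = sum(out) / len(out)  # close to 0.5 despite the 0.9 input bias
```

Note there's no bound on how many equal pairs get discarded before the next output bit appears, which is exactly the waiting-time problem above.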
For certain input distributions, a guaranteed output bandwidth is impossible. For instance, suppose you had a function that, given a vector of N integers chosen uniformly from (0,1,2), returns a single unbiased output bit. Since all 3^N input vectors are equally likely, an unbiased output requires exactly half of them to map to each output value. But 3^N is always odd, so no such even split exists, and this (0,1,2)-function cannot exist for any fixed N.
Now, suppose you had a function that turned N input bits with a 1:2 bias into a single unbiased bit. Then you could use it to implement a (0,1,2)-function as above, by collapsing inputs 1 and 2 into the single input 1 (which yields exactly a 1:2-biased bit). Since such a (0,1,2)-function is impossible, the 1:2-biased function is likewise impossible.
This argument holds for any i.i.d. input source that can be generated from a uniform-integer source with an odd number of possible values.
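The counting step is small enough to sanity-check mechanically (a toy check, not part of the proof):

```python
from itertools import product

# With all 3**N inputs equally likely, an unbiased output bit needs
# exactly half of them mapped to each value -- impossible, since 3**N
# is odd for every N.
assert all((3 ** n) % 2 == 1 for n in range(1, 30))

# Brute force for N = 1: none of the 2**3 functions {0,1,2} -> {0,1}
# produces an unbiased bit.
for f in product((0, 1), repeat=3):
    ones = sum(f)            # number of inputs mapped to 1
    assert ones * 2 != 3     # unbiased would need ones == 3/2
```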
If the coin is biased so that it lands on heads only 1 in a million times, it will take a lot of coin tosses to get some randomness out of it. This isn't a limitation of the technique, just a consequence of the low-entropy source.
The latter (except that it doesn't always have to be long; the waiting time per bit follows a geometric distribution, so long waits are exponentially unlikely). If it were the former then randomness would be predictable in some sense. It's similar to the halting problem.
If you know the bias, then you can surely estimate the bit rate. Of course you could get very unlucky and wait a long time for bits, even with an unbiased underlying source. But that's the nature of these methods.
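Concretely, for the von Neumann scheme each pair of flips is accepted with probability 2p(1-p), so the wait is geometric and the expected cost is easy to compute (a sketch; `expected_flips_per_bit` is my name for it):

```python
def expected_flips_per_bit(p):
    """Expected number of input flips per von Neumann output bit.
    Each pair (2 flips) is accepted with probability 2*p*(1-p), so the
    number of pairs until acceptance is geometrically distributed with
    mean 1 / (2*p*(1-p))."""
    accept = 2 * p * (1 - p)
    return 2 / accept

expected_flips_per_bit(0.5)   # 4.0 flips per bit for a fair coin
expected_flips_per_bit(1e-6)  # ~1,000,001 flips per bit for the 1-in-a-million coin
```

Knowing p pins down the average rate, but the tail of the geometric distribution is what makes a hard guarantee impossible.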
Or you could pump the heavily biased bit stream into a modern cipher and get what is, for all intents and purposes, a super-high-quality, completely unbiased random stream with the same bandwidth characteristics as the input.
And, as a matter of fact, much faster than the input if required.
Things are not so simple: how do you key the cipher? With the same randomness as the input string? If so, that input distribution could be a bad one for the cipher, in that it might fail to produce a sufficiently scrambled output.
This can be mitigated by using an unkeyed hash function instead, but even for those we don't have any provable guarantees unless you assume the function is a random oracle, which is an ideal object that cannot exist.
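For concreteness, here's roughly what that hash-based conditioning looks like (a heuristic sketch only; the block size and the one-bit-per-byte packing are my choices, and nothing here is provably sound):

```python
import hashlib

def hash_condition(raw_bits, block=512):
    """Compress each block of 512 raw bits (stored one per byte here,
    for simplicity) through SHA-256. Output looks uniform, but this is
    heuristic: deterministic, and with no proof outside idealized
    random-oracle-style assumptions."""
    for i in range(0, len(raw_bits) - block + 1, block):
        chunk = bytes(raw_bits[i:i + block])
        yield from hashlib.sha256(chunk).digest()
```

Being deterministic, it can only ever reshuffle whatever entropy the input already carries.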
Yeah, if we’re starting with the “1 in a million” unfair coin then most of the input strings are going to be all 0 or all 1 which means the cipher is going to give the same sequence every time. Not useful!
That's a good question. You cannot get a non-statistical guarantee of an output rate from the Von Neumann biased coin toss, because you can potentially get a long run of the same toss result, which stalls output bit generation. That might be a generic limitation of all similar de-biasing approaches. Has that been proven or disproven?
See my sibling reply [0]: for many input distributions, you can never put a fixed upper bound on the number of inputs you need to get a single output bit.