## January 30, 2010## Random Seeds, Coded Hints, and QuintillionsHere is a seedable random number generator in Javascript that you can set up to produce a determinstic sequence of pseudorandom numbers. Browsers do not provide a built-in way to seed Math.random(), so this solution is handy both when you need a repeatable pseudorandom sequence that is
Many games that use weak random number generators have been cracked by exploiting their lack of randomness, and recently it has even been shown that it is possible to guess your 'random' Social Security Number given information about the time and location of your birth. To resist this type of attack, you want do better than a linear congruential PRNG seeded with the current time. Explanations below.
This script defines a function Math.seedrandom() that replaces Math.random with a seeded sequence of your choice. You use it by including seedrandom.js and then calling
Next time you call The code uses RC4 as the pseudorandom number generator, so the randomness is a bit better than what you get from most browsers' built-in Math.random. But since it does all the computation in javascript, it is also is 3-10x slower than the native function (though still plenty fast - less than 0.002 milliseconds per call for me). A minified version of seedrandom.js (using uglifyjs) is about 1K. The code also supports automatic seeding.
If you don't want to think of a seed but just need an arbitrary unpredictable seed, seedrandom.js can also do automatic seeding from local entropy that is good enough for non-adversarial use. If you call Much of this information is collected by seedrandom using a recursive traversal of selected dom objects: function flatten(obj, depth) { var result = [], typ = (typeof obj), prop; if (depth && typ == 'object') { for (prop in obj) { try { result.push(flatten(obj[prop], depth - 1)); } catch (e) {} } } return (result.length ? result : typ == 'string' ? obj : obj + '\0'); }
function autoseed() { try { var seed = new Uint8Array(width); global.crypto.getRandomValues(seed); return tostring(seed); } catch (e) { return [+new Date, window, window.navigator.plugins, window.screen, tostring(pool)]; } } seed = '' + flatten(autoseed(), 3); Since the underlying RC4 generator uses a 256-byte seed, any explicit and DOM-derived seeds are smeared into a 256-byte array like this: var smear = 0; for (var j = 0; j < seed.length; ++j) key[j & 255] = ((smear ^= key[j & 255] * 19) + seed.charCodeAt(j)) & 255; shortseed = ''; for (j in key) shortseed += String.fromCharCode(key[j]); return shortseed; The shortened seed string is returned from Math.seedrandom() so that you can record and replay an autogenerated seed if needed.
Unfortunately, while local seeding is pretty fast, the amount of unpredictability that can be collected quickly from the DOM ends up being pretty thin and not enough to protect you against a determined hacker who is trying to beat your online poker game by using brute force to deduce your seed. You can add entropy by calling Math.seedrandom with sources of unpredictable bits such as mouse positions over time. The script maintains an "entropy" variable that accumulates new bits of information every time Math.seedrandom() is called, so if you call Math.seedrandom(data) with explicit seeds several times and then finish with a call to Math.seedrandom() without arguments, all the previously-supplied explicit entropy will be mixed together with DOM state to build a now-more-unpredictable automatic seed. Available local browser entropy has been studied, and because of the coarseness of timers and limited access to physical devices, it is probably necessary to collect about a minute of user-interface event observations to build an attack-proof pool of local entropy. If you need robust unpredictability and you do not have a whole minute to wait, you need to find a source of entropy that is not local.
One option for quick robust entropy is to use an online source of random bits like random.org. Random.org provides a high-volume online stream of unpredictable bits that are derived from atmospheric noise detected by an array of radio receivers in Dublin and Copenhagen, all built and run by Trinity College professor Mads Haahr. His service will happily ship a few of these physically generated bits to you over https for free. Ideally after loading seedrandom.js, you would like be able to run some code like the following that supplies random strings that come from random.org: Math.seedrandom({'results':['pcPKzEptrHN3wkgTZJJe', '9uM6b5j5vKZzpVVds2ZT', 'zpHSar2s2a7kLnewGFFh']}); You can get very close to this by using YQL to reformat data from a random.org download page. The query you want is this: select * from html where url = "https://www.random.org/passwords/?num=50&len=9" and xpath = "//ul[@class='data']//p/text()" The YQL above can be invoked as a URL that returns the results as JSONP for convenient use as a cross-domain script include. The format of the random data can be cleaned up a bit by using YQL execute scripts, which I have done using this xml file. To make usage easy, I have posted a shortened url to http://bit.ly/srandom-256. So to make a random.org-seeded PRNG in Javascript, you can just write: <script src="http://davidbau.com/encode/seedrandom.min.js"></script> There are two main disadvantages when seeding using a network source. Frst, it introduces a series of intermediaries that you must trust to pass untampered random bits to you, including in this case, bit.ly, Yahoo, davidbau.com, and random.org. Second, there will be a delay of a few hundred milliseconds to load the data. You could use multiple sources (e.g., see HotBits), seed asynchronously, or create your own source of network randomness to avoid these issues. For example, I have posted a Google App Engine-hosted service at call.jsonlib.com that uses a Google-supplied urandom to provide faster network seeding in one roundtrip. It's faster than the random.org method and shares none of the same intermediaries. Use it alone for speed, or mix it with bits from random.org to satisfy your paranoia: <script src="https://jsonlib.appspot.com/urandom?callback=Math.seedrandom"></script>
Sometimes an unpredictable seed is not what you want - sometimes you want to specify an explicit nonrandom seed. What is an explicitly seeded PRNG good for? It is good for knowing the future with certainty while everybody else is left guessing! For example, you can use a PRNG for a really easy-to-implement javascript secret decoder ring. The one I have linked here is useful for pencil-and-paper written clues, the sort that you might want to hide in a kid's treasure hunt. In our family we have used it for exactly that purpose many times. The secret encoder encrypts letters as normal letters and preserves spaces, capitalization and so on, so you can easily transcribe the ciphertext. It is fully symmetric, by which I mean that the algorithm for going from ciphertext to cleartext is davidbau.com/encode your treasure hunt hints here.
The RC4 algorithm generates discrete random integers. But Math.random() is supposed to return a uniformly distributed floating-point value in the range [0, 1). What is the right way to get from integers to the floats? Most floating-point PRNG code that I have seen fails to do this correctly, in my opinion. For example, the widely-used 1982 Wichman-Hill algorithm (that used to be used by python) goes something like this: x = (171 * x) % 30269 y = (172 * y) % 30307 z = (170 * z) % 30323 return (x/30269.0 + y/30307.0 + z/30323.0) % 1.0 The three generators have period 30268, 30306, and 30322, which have LCM about 6.9 trillion. This corresponds to a gap between numbers that can be generated on the order of 10 In other words, Wichman-Hill can only generate about 0.1% of the 52-bit mantissas that are possible in a double-precision floating point value. Generating the whole set of 2
The more modern approach to generating a double precision floating point value (used by python since 2.3 and Boost C++ libraries) is to simply use a long-period integer generator to directly build an integer with 52 bits of randomness, and then divide the number by 2 function random() { return get_randint_up_to(1 << 52) / (double)(1 << 52); } But when we are using an underlying PRNG like RC4 that has a period around 10 A 64-bit floating-point value naturally cannot represent more than 2
The following code is a natural way to simulate a one-in-a-quintillion event: if (random() * (1.0e+18) < 1.0) { one_in_a_quintillion = true; } Unfortunately, using the current regime of random number generators, the code above will not do what it appears to promise, because the coarse 2 This is not just a one-in-a-quintillion issue. If a typical modern PRNG happens to generate a population of numbers smaller than one billionth (which it can do in a few seconds on a modern CPU) then more than half of the random-looking digits in the floating-point representation of those small random numbers are a deception. The very smallest digits are not random at all: they are determined by the leading digits. If you were to choose to use those least-significant digits for hashing or subsampling, you would run into trouble. IEEE 754 floating-point representations are specifically designed to provide telescoping high resolution near zero so that things like very small probabilities can be dealt with accurately. A double precision value has no problems representing accurate values with full precision down to about 10 I have not seen a previous PRNG that takes advantage of that power.
So to scratch this pet peeve, the uniform [0, 1) generator included in seedrandom.js has the following code to smoothly utilize all the available exponents in the floating point representation (note that in javascript, all those large integers bigger than 2 math['random'] = function random() { // Closure to return a random double: var n = arc4.g(6); // Start with a numerator <= 2 ^ 48 var d = denom; // and denominator = 2 ^ 48. var x = 0; // and no 'extra last byte'. while (n < significance) { // Fill up all significant digits by n = (n + x) * 256; // shifting numerator and d *= 256; // denominator and generating a x = arc4.g(1); // new least-significant-byte. } while (n >= overflow) { // To avoid rounding up, before adding n /= 2; // last byte, shift everything d /= 2; // right using integer math until x >>>= 1; // we have exactly the desired bits. } return (n + x) / d; // Form the number within [0, 1). } The code above scales up the exponent by increasing d as long as the mantissa does not fill the full 52 bits available in the floating-point representation ("significance" is set to 2 By adding the scaling loop for small numbers, this uniform variate generator uses not only the whole mantissa, but the whole exponent as well. One-in-quintillion simulations are now possible using the ordinary uniform distribution.
At the limits of resolution, there are a couple things to note. There are only 53 significant digits, so if you multiply 2 In Javascript IEEE 754 round-to-even mode is used, so we need to take care to avoid rounding at the limits of resolution by shifting off any bits beyond the least significant digit. This achieves a last digit that is not the result of rounding which is equally likely to be even or odd when you multiply by 2 Once you have gone to all this trouble, the other set of issues to recognize are limits of the underlying RC4 algorithm. For example, the period of RC4 is recognized to be about 10
Never assume you have enough randomness without looking at the numbers. In Las Vegas it is common to deal blackjack out of an eight-deck shoe to make it harder for card counters to beat the house. But there are about 10 With ordinary RC4, you will be limited to a subset of shuffles that represent less than one quintillionth quintillionth quintillionth quintillionth of the possible shuffles that you could do with a physical shoe of eight decks! In other words, after you have dealt out one full shoe of eight decks, you will have provided a distinctive fingerprint about your shuffling algorithm. To shuffle several decks together using a PRNG without revealing a lot of information about the shuffle algorithm you happen to be using, you would need to switch to a generator with a much larger state space than 8-bit RC4. For example, the 10-bit version of RC4 has 10 It is hard to be a good card dealer.
Here is the dieharder analysis of the sequence starting with Math.seedrandom(1). Note that the "WEAK" asessment is given for tests that result in a p that is near zero or near one, but because more than a hundred tests were run, we expect occasional tests to come in big or small - indeed if no tests were reported "WEAK", it would be an indication of a problem. Overall, the performance of this little javascript PRNG is pretty robust.
Comments
What is the licensing on your code: seedrandom.js There is no copyright or license information (e.g. MIT, BSD, LGPL, GPL, Public Domain, etc). I would like to use it, but without any licensing info its not very usable. Thanks I've posted a version under BSD license now. This version also: - adds an optional second argument, which, if set, lets you mix an explicit seed with the entropy pool without the slowness of other local entropy. On initializing without losing bits - a good way to initialize with a truly random string seed is to not limit yourself to printable characters but to use the full range of character values, i.e., Math.seedrandom('\xF9\x7F\x00\x1E...'). Posted by: David at March 11, 2010 06:40 AMVery interesting article David .. very. I have used the seeded random number principle extensively in my own "dabbling" code, however I bumped into a small issue that I wonder if you might be able to give me some advice on? Data is being randomly generated from a seed (a name). That data is useful and in the main what I want, Using the name as the seed means I only have to store the name to get back the added detail the randomness generates. Basically I am using the seed almost like a Hash. Very occasionally I want to alter the data (alter one or two of the random numbers generated, but not all of them). Can you think of a way to take the stream of random numbers (including the changes) and generate a seed, working backwards if you will? Thanks David Posted by: David Johnston at June 4, 2010 10:44 AMSince seedrandom.js uses RC4 which is a cryptographically strong generator, it is designed very difficult to reverse. In other words, even if you know the sequence of generated bits, it is hard to go backwards to figure out the seed, and it is hard to go forward to predict the next random bits. There are some cases where RC4 keys can be partially deduced from parts of the output sequence. The worst weaknesses are mostly eliminated by priming RC4 by discarding the first few values (seedrandom.js does this). If you discover any new cases where it is possible to deduce a key by analyzing the output, it could be publication-worthy. Posted by: David at June 6, 2010 11:31 AMThis code is wonderful. Having a repeatable series of numbers is so helpful for many areas, especially in making repeatable test cases. Anyways, when using the Closure Compiler, which I see you use too, I get this warning: WARNING - Suspicious code. This code lacks side-effects. Is there a bug? Hi David, thank you for this very handy set of functions. You suggest in your documentation: The problem with that is: rng1 is the whole function and so it cant be stored via JSON. So what I'd need would be something like where rng1 is typeof string,number,array... anything but a function. I'd be really gratefull for this functionality added. Greetings, I found out myself: // remember and restore PRNG state Sorry, better clone S on setting, too: Hi David, I'm working with some code that uses seedrandom.js. Seeds in this application are an increasing series of unpadded integers. I noticed that I got the same sequence for any seed that matches /^.+$/. For example, 1, 11, 111, 1111, etc all return the same sequence. I think this is caused by a bug in seedrandom. You state above that RC4 needs a 256-byte key, but mixkey does not always return a 256-byte array - instead it returns an array of size min(seed.length, 256). I think the key needs to be padded out to 256 bytes if the seed string is too small. You can do this by initializing key to be 256 bytes long, or by changing the loop in mixkey like this: -for (j = 0; j < seed.length; j++) { This seems to work for me, but I'd appreciate if you can confirm this fix makes sense. I don't really know anything about RC4/crypto beyond what I discovered while investigating this issue :) Posted by: Desmond Brand at April 2, 2011 01:05 AMHi Desmond, Thanks for the bug report! I am using the standard RC4 key scheduling algorithm, which has the property that for short keys, "A" is equivalent to "AA". However, I agree people should expect that Math.seedrandom(num) should treat the numeric seeds 1 and 11 differently, so I have changed the way seedrandom converts from non-strings to strings. I am releasing version 2.0 (rc1) of seedrandom.js today. In this version, I have changed Math.seedrandom(num) to be equivalent to Math.seedrandom(num + '\0') when num is not a string. The key "1\0" is not equivalent to "11\0", so you will not get key collisions for small numeric keys. The behavior for strings is unchanged, so you still need to be aware that Math.seedrandom('A') is equivalent to Math.seedrandom('AA'). If you are working with short variable-length strings and you want to avoid collisions, it is probably best to add a terminator yourself. Math.seedrandom will only do this automaticaly for you when you use a non-string seed. If you are using numeric seeds and want to keep the old behavior while using the latest seedrandom, then you can just convert to strings yourself by saying Math.seedrandom(num + ''). Posted by: David at April 2, 2011 07:03 AMHi David. I am very interested in this topic and I see that you know what your are talking about. I have a couple of questions. The option to use mouse positions, keyboard keys, and etc, I could see no hints of that in the code. Although I see the window object and flatten. The flatten method could be improved further. Are your intentions that we collect that data on our own and then pass that data to seedrandom as seed? You should also consider making things more extensible. The flatten method should be able to be replaced by outsiders. // Each time seedrandom('arg') is called, entropy from the passed seed What if you want to clear the old seed, because too much data is in it now? You should create your own namespace, or return an object as a class with all functions and properties in order for them to be able to override or add certain features. Thanks, and great job! Posted by: Moe at September 24, 2011 06:58 AMHi David. Thanks for the nice piece of code. Just wanted to clarify shuffling of 8-deck shoe. Generating every possible permutation is actually not the good way this is done in software. Better way is this: numerate all cards in first deck from 1 to 52, in second deck from 53 to 104 etc. Then when you need first card, you say "give me number between 1 and 416". Say for example you got number 28. Then when you need second card you say "give me number between 1 nad 416, but not 28". Say you got 199 this time. For N cards, simply repeat this operation N times. That way, you can "shuffle" very large number of decks with simple RNG. Best regards! Posted by: Ivan at December 22, 2011 12:10 PMWill this return identical results regardless of Javascript implementation (different browsers, Rhino etc)? Posted by: at March 26, 2012 08:08 AMYes, this returns perfectly identical results regardless of Javascript implementation. Posted by: David at April 13, 2012 04:18 AMJust wondering what the best way to get random ints out of this is. I'm looking at adding a Math.randomInt(n) feature where you get a random number between 0 (inclusive) and n (exclusive). From what I've been reading doing a simple %n isn't good because of potential uneven distribution. Posted by: Eric Dalquist at January 29, 2013 02:59 PMOk I think I figured something out for my Math.randomInt feature, this is based on the Java Random.nextInt implementation to ensure equal distribution within the constrained range: math['randomInt'] = function randomInt(n) { Thanks for providing seedrandom.js. Given: Math.seedrandom('hello.'); Is there some way to get IE browser to return the same answer as FireFox browser? FireFox : 0.9282578795792454 ( as described in source comments and same under Chrome, Opera and Safari IE : 0.6362644381117817 ( same under IIS asp Good health! Seedrandom has been tested to perform identically on IE as is does with other browsers. Can you post an example page that you are finding that works differently on IE? If there is a missing case or a bug, I will fix it. Here is a sample page that I have just retested. It prints 0.9282578795792454 on IE (version 11): http://david.pencilcode.net/home/testing/seedrandom.html Posted by: David at January 31, 2014 10:03 AMUpdate - I have followed up with Blair. He was seeing the problem on IE8, and I was able to see and fix the bug in that environment. The PRNG sequence has been incorrect on IE8 since seedrandom 2.1 (2.0 was correct). There will be a new release soon that will fix the IE8 problem. As a bonus, it is also 5 bytes smaller. It will also switch to the MIT license. Post a comment
| |

Copyright 2010 © David Bau. All Rights Reserved. |