[HN Gopher] Evolution of Random Number Generators
___________________________________________________________________
Evolution of Random Number Generators
Author : algui91
Score : 125 points
Date : 2021-04-30 09:11 UTC (2 days ago)
(HTM) web link (www.johndcook.com)
(TXT) w3m dump (www.johndcook.com)
| TekMol wrote:
| I am looking for a simple random number generator that fills a
| given amount of slots.
|
| Say it is initialized with "size=5" then it might output:
|
| 3,5,2,1,4
|
| Is there something like this?
|
| It does not need much statistical resemblence to randomness. Just
| look kind of random to the eye. And the code should be short. A
| few lines of Javascript or so.
|
| Maybe one approach might be to just loop through the sequence
| (1,2,3,4,5) and xor the number with some other number? Maybe with
| 0101010...?
| Someone wrote:
| Xoring won't work:
|
| - xoring will only produce permutations that have period two,
| and every element will be part of a 2-cycle.
|
| - if your n isn't a power of two, it may produce numbers larger
| than n
|
| The set of 2-cycles will have a lot of structure, too (for
| example, if your magic constant is even, all cycles will have
| either two even numbers or two odd numbers; if it is odd, all
| cycles will have an even and an odd number), but the above
| should already be bad enough to drop that idea.
| TekMol wrote:
| if your n isn't a power of two, it may produce
| numbers larger than n
|
| This is the biggest problem.
|
| The others are not such big problems as I don't need strong
| resemblence to real randomness.
| Someone wrote:
| I can't look into your use case, but I'm not sure you
| realize how spectacularly bad xoring with a fixed value is
| as a way to generate a 'random' permutation.
|
| For example, it will still alternate odd and even numbers.
|
| Also, an adversary can derive the xor key from a single
| sample, and predict the entire sequence from it.
| jchook wrote:
| The equivalence classes modulo some n form a partition of the
| integers so you can use that to create a very efficient
| solution with very little code.
|
| Here is a neat explanation:
|
| https://preshing.com/20121224/how-to-generate-a-sequence-of-...
|
| If you need even better properties (eg cryptographically
| secure) you can also look at PCG with k-dimensional
| equidistribution:
|
| http://www.pcg-random.org/index.html
| amitp wrote:
| Yes, xor can be used as part of this -- see the paper[1] and
| code[2]. The idea is that you're generating a random _shuffle_
| of a sequence, but you can do it in a way that _doesn 't_
| require storing the entire shuffled sequence. You can ask
| "what's in position 4" and it will return "item 1".
|
| [1]: http://pcgworkshop.com/archive/mawhorter2019anarchy.pdf
| [2]: https://github.com/solsword/anarchy
| jlouis wrote:
| If size is a prime, repeated multiplication on a generator
| element (any element) will work if you compute `mod p`. But
| it's not going to have nice random properties for the most
| part. It's just a "permutation".
|
| The reason it works is because the subgroup order generated by
| the element has to be divisible in the group order. There's
| only 1 and p. So unless it's trivial (i.e., the element 1),
| it's going to be p and thus it generates the whole group in p
| steps.
| nestorD wrote:
| From memory you don't need size to be a prime, what you need
| is that it is _coprime_ with your step.
|
| Thus, the easiest solution is to take a step-size that is
| prime and different from you size (that works for any size).
| Given the index of the current element, the next index would
| be `(current_index + step) % size`.
| thewakalix wrote:
| Are you talking about generating a random permutation?
| TekMol wrote:
| I don't think so. Maybe a "random sequence" could be a name
| for it. But not sure.
| thewakalix wrote:
| What I mean is, in your example, do you need to have
| precisely the numbers 1, 2, 3, 4, and 5, but in any order?
| Or is the constraint something else?
| TekMol wrote:
| Ah yes, you are right. In that sense, it is a random
| permutation of the numbers from 1 to max.
|
| Yes, when initialized with "size=5" the numbers in the
| output must be precisely 1,2,3,4,5 but in random looking
| order.
|
| And I don't want to store the whole sequence. I want to
| say gimmeTheNumberAtPosition(x) and it return just that
| number.
| e12e wrote:
| I guess you could predictably enumerate the permutations,
| so that you could lookup with parameters size/n,
| position/i, seed/r - where r is the permutation.
|
| For size=1, all parameters are 1, and only result is 1.
|
| For size 2, r can be 1 or 2, naming the permutations
| [1,2];[2,1] - and eg: rand_at(n=2,r=2,i=2) would return
| 2, but i=1, would return 2.
|
| I'm not sure how I'd implement this - I suppose it might
| be possible to generate a predictable "walk" based on
| n/r/i?
|
| But for sizes less than, say, a million i would think
| that a shuffled array would be easier?
|
| r would need to be in the range 1..n!
| resolvent wrote:
| So you want a mapping of [1, n] to a permutation of [1,
| n] without having to generate the list [1, n] and
| shuffling it. An affine transformation modular n should
| work. Instead of [1, n], look at [0, n). Find a value `a`
| in [0, n) such that gcd(a, n) = 1 and pick a random
| integer b in [0, n). Then the `random` number at each
| position is `a * x + b mod n`.
|
| This is simple and fast, but is not secure at all. You
| can solve for a, b by solving the linear congruence. It
| also does not generate every permutation of `n`. For n =
| 5, only 20 sequences can be found out of 5! = 120.
| johndough wrote:
| The value `a` has to fulfill some more properties: https:
| //en.wikipedia.org/wiki/Linear_congruential_generator#...
|
| Since the "randomness" of the permutation is not that
| great, I was looking for something better, but could not
| find anything. The closest I got was https://en.wikipedia
| .org/wiki/Xorshift#xoshiro_and_xoroshiro which only works
| for powers of two. A workaround would be to choose the
| next larger power of two and reject all random values
| which are smaller than `n`, but that introduces
| unpredictable latency and destroys the cool jump-ahead
| feature.
| resolvent wrote:
| The LCG is the improved extension which causes you to
| have to compute values x_0 = seed to x_n in order to
| calculate x_{n+1}. What I am talking about is just a
| mapping of index x -> f(x) which is what the OP seemed to
| have wanted. In that case, `a` only needs to be
| relatively prime. This cannot account for every
| permutation of n since the number of values relatively
| prime to n, the totient, has an upper bound of n, which
| is the possible values for a. The number of values for b
| is also n, so at most n^2 possible sequences are
| generated, which is less than n! for n > 3.
|
| For example, with n = 5: Let a = 3 and b = 2. x = [0, 1,
| 2, 3, 4], a * x = [0, 3, 6, 9, 12], a * x + b = [2, 5, 8,
| 11, 14], a * x + b mod n = [2, 0, 3, 1, 4]
| CipherThrowaway wrote:
| What you want is a cipher from 1..N to 1..N. This problem
| is often approached as an encryption problem known as
| Format Preserving Encryption. The Sometimes Recurse
| Shuffle linked elsewhere in this thread is one solution
| but the most foundational and accessible paper on this is
| Black and Rogaway's "Ciphers with Arbitrary Domains" The
| Generalized Feistel Network (Method 3) is fast and easy
| to implement.
|
| https://web.cs.ucdavis.edu/~rogaway/papers/subset.pdf
| FabHK wrote:
| Generating random permutations is quite a different
| problem from creating random numbers. The standard algo
| for random permutations is the Fisher-Yates Shuffle.
|
| https://en.wikipedia.org/wiki/Fisher-Yates_shuffle
|
| ETA: As you don't want to store the permutation, you
| might want to pick a number randomly from 1 to n!, and
| then generate the permutation on the fly up to the
| desired element, using the techniques outlined here:
|
| https://stackoverflow.com/questions/29236556/how-can-i-
| calcu...
| SuchAnonMuchWow wrote:
| Storing a number of the order of magnitude of n! will
| take the same amount of memory than storing a permutation
| of n elements, so what's the point ?
| HelloNurse wrote:
| If one really wants to do it the slow and hard way, it's
| possible to compute a random permutation of n elements
| one element at a time with n bits of storage (less with
| fairly generic compression techniques): just remember the
| set of numbers that have been produced so far and
| sequentially pick one of the remaining ones. You'll waste
| some combination of additional memory (for arithmetic
| decoding state) and random bits (for rejected samples),
| but materializing a large randomly permuted vector should
| be slower.
| FabHK wrote:
| Good point. Storing a number of magnitude n! will take n
| log n bits, and storing n numbers up to n will also take
| n log n bit.
|
| In other words, good old Fisher Yates, storing the
| resulting permutation, and a simple lookup is probably
| the way to go.
| kinos wrote:
| for JS you can probably just do
| `[...Array(n).keys()].sort(()=>Math.random() - 0.5)` ?
| matsemann wrote:
| Having an unstable comparator might wreak havoc and maybe
| not give a proper distributed space of possible results?
|
| Edit: yes, here is an example (in the comments) of how
| biased this is:
| https://stackoverflow.com/a/18650169/923847
| anderskaseorg wrote:
| That is exactly the algorithm that Microsoft used to bias
| its purportedly random "browser choice" screen. Don't use
| it.
|
| https://www.robweir.com/blog/2010/02/microsoft-random-
| browse...
|
| The Fisher-Yates shuffle is the right way to shuffle an
| array in an unbiased way.
|
| https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffl
| e
| isaac21259 wrote:
| If your value of size is a power of two you could just
| encrypt x with a block cipher where the block size is
| equal to size and the key is your seed.
| nightcracker wrote:
| You can construct arbitrarily-sized permutations with
| log(n) time random access using Sometimes-Recurse
| Shuffle: https://eprint.iacr.org/2013/560.pdf.
| vince14 wrote:
| https://news.ycombinator.com/item?id=25497357
| Ayesh wrote:
| Probably looking for this:
| https://en.m.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
| TekMol wrote:
| This seems to use an external random number generator.
|
| I want the code to be complete and reproducible. So
| gimmeTheNumberAtPosition(x) will always return the same for
| the same x.
|
| Also, FYShuffle is much more complex than I would like the
| algo to be.
| miked85 wrote:
| Are you saying you want a reproducible random number
| generator?
| jhgb wrote:
| Fisher-Yates is one of the simplest things out there IMO.
|
| Isn't the Lehmer code the thing that you want? Then you can
| specify your permutation with an integer and extract the
| individual digits from it repeatedly.
| TekMol wrote:
| Fisher-Yates is one of the simplest things out
| there IMO
|
| Just using "$n xor $something" would be simpler. I am
| just not sure yet, what a good "$something" is.
| jhgb wrote:
| How does this guarantee the desired property of the
| output being a permutation?
| jimktrains2 wrote:
| If you seed your random number generator, then it is
| reproducible.
| [deleted]
| jamespwilliams wrote:
| If you're using Go, you can use rand.Perm
| (https://golang.org/pkg/math/rand/#Perm), which uses a Fisher-
| Yates shuffle under the hood.
| NotCamelCase wrote:
| Have you looked up LFSRs yet?
| https://en.wikipedia.org/wiki/Linear-feedback_shift_register
|
| They are quite efficient for both HW and SW implementations.
| wizeman wrote:
| Look into linear feedback shift registers, they are very simple
| and can produce random-looking permutations.
| ascar wrote:
| If you want a simple pseudo random sequence of N that just
| looks random (pseudocode): int start =
| getRandomInt(seed) % N; double k = N / 1.618; k = k
| % 2 == 0 ? k - 1 : k; // k and N must be coprime int
| nextElement = k * start % N; start++;
|
| This will jump wildly across your sequence and visit all
| elements (as k and N are chosen coprime). No need to store a
| full array. Is it statistically random? Of course not, e.g. you
| will never see two values close to each other right after
| another.
| anderskaseorg wrote:
| Your algorithm fails to choose k and N coprime for about 19%
| of N, such as N = 6, 10, 14, 15, 25, 27, 35, 36, 45, 54, 55,
| 65, 66, 75, 84, 85, 90, 93, 95 ....
| jedimastert wrote:
| I'm reminded of Chris Wellon's "Prospecting for Hash Functions",
| where he randomly generates hash functions and then runs them
| through a couple of tests.
|
| https://nullprogram.com/blog/2018/07/31/
|
| Out of curiosity, is running the state through a hash a
| reasonable rand strategy?
| midjji wrote:
| Recently noted that mt19937, mt19937_64 are much faster than the
| standard c++ random generator. They are also possibly better with
| regards to number distribution, but the performance difference is
| gigantic on clang. The standard 32 bit
| std::default_random_generator is almost 40x slower than the 64
| bit mt19937_64 one across the board, from -o1 to -O3 march...
| etc.
|
| As all of them are also faster than old school rand, its worth
| upgrading for the performance increase if nothing else.
| FabHK wrote:
| I think both the PCG family and the xoroshiro family are faster
| still.
|
| https://prng.di.unimi.it
|
| https://www.pcg-random.org
|
| ETA: https://github.com/lemire/testingRNG
| ufo wrote:
| Does anyone know what's the current consensus about PCG vs
| xoroshiro. I remember that the xoroshiro author had several
| complaints about PCG but I don't know ifhe is right or if he
| was just being salty.
| aDfbrtVt wrote:
| There's a bunch of analysis written by the PCG folks
| bashing Xoroshiro as well [1]. I think that for most normal
| use cases with randomized state, Xoroshiro is quite good.
| It only requires a single 64b add which is amazingly
| efficient.
|
| [1] https://www.pcg-random.org/posts/xoshiro-repeat-
| flaws.html
| rurban wrote:
| MT is pretty slow. There are fast variants (SFMT), but why use
| that when can have much better and much faster rngs?
|
| https://rurban.github.io/dieharder/QUALITY.html
| ascar wrote:
| I don't know about all the other RNGs in this benchmark, but
| MT prepares a dense/contiguous array of random numbers. The
| whole computation can be SIMDified (often even by the
| compiler), but extraction is (unless changed explicitly) a
| single copy of a double.
|
| If RNG speed matters and you need a lot of random numbers in
| succession (and I assume these two assumptions correlate
| strongly) editing MT to directly pull 4 or even 8 random
| numbers at once (i.e. a `__m256 rand_m256()` interface) is a
| huge performance gain.
|
| wyrand (the top spot of the linked benchmark), doesn't have
| this benefit. The computation can't be SIMDified and the
| extraction is always a single value.
|
| So I would take these benchmark with a grain of salt and take
| a closer look at the specific situation for any application
| where RNG speed really matters.
| jhgb wrote:
| Wouldn't a SIMDified implementation of other RNGs speed
| them up by a comparable factor, making MT relatively slower
| again?
| ascar wrote:
| If the next number depends on the previous number (like
| with wyrand and many others) it can't be simdified.
|
| You could simdify computation with different seeds tho,
| which might be fine for many purposes. However at that
| point it's also a custom implementation with a different
| number sequence than the non-simd version, which isn't
| the case with a SIMDified MT.
| adgjlsfhk1 wrote:
| Also xoshiro generators are enough cheaper than MT that
| even if you couldn't simdify it, you could keep 4 and get
| run the 4 together in SIMD to get 4 outputs way cheaper
| than 1 SIMD MT.
| galangalalgol wrote:
| It looks like xorshiro was SIMDified in a paper linked in
| abother post in this tree.
| galangalalgol wrote:
| Very useful! Thanks! Aren't there AES related intrinsics for
| some architectures that can be really fast for rng? How do
| they compare?
| rainworld wrote:
| https://github.com/google/randen#performance
___________________________________________________________________
(page generated 2021-05-02 23:03 UTC)