[HN Gopher] Original source of `(seed * 9301 and 49297) % 233280...
___________________________________________________________________
Original source of `(seed * 9301 and 49297) % 233280` random
algorithm? (2014)
Author : thunderbong
Score : 142 points
Date : 2022-08-06 14:07 UTC (8 hours ago)
(HTM) web link (softwareengineering.stackexchange.com)
(TXT) w3m dump (softwareengineering.stackexchange.com)
| montroser wrote:
| As pointed out in the comments of the post, this is an LCG[1]
| PRNG. The Wikipedia article has a good discussion on how to
| choose parameters, with examples of use in the wild from various
| popular projects over the last many decades.
|
| [1]: https://en.wikipedia.org/wiki/Linear_congruential_generator
| Someone wrote:
| And its first reference is to what I consider the definitive
| reference on the subject, at least for the state of the art in
| 1968: the Art of Computer Programming. Vol. 2: Seminumerical
| Algorithms.
|
| I also think the only additions that might be needed to that
| are mentions of newer algorithms that are comparatively simple,
| but better. The subject itself is pretty well-trodden.
|
| Knuth does mention the extremely simple additive random number
| generator Xn = (Xn-24 + Xn-55) mod m
|
| (And says that this can be used directly with floating point
| values to produce floats in the range [0,1))
|
| It uses 55 words of state (initialized to not all be even),
| needs only about ten simple instructions to implement (in
| particular, no multiplications or divisions), has period of at
| least 2^55-1, and Knuth said it's very good in practice, isn't
| known to be bad, and ends with (in the 1980 second edition of
| volume 2, with the first edition being from 1968):
|
| _The only reason it is difficult to recommend sequence (7)
| wholeheartedly is that there is very little theory to prove
| that it does or does not have desirable randomness properties;
| essentially all we know for sure is that the period is very
| long, and this is not enough"_
|
| I can't find anything more recent about this RNG. Has it been
| forgotten, or was it shown to be bad in some important sense?
| hcs wrote:
| It's a lagged Fibonacci generator
| https://en.wikipedia.org/wiki/Lagged_Fibonacci_generator
|
| In the third edition (1998) there are a few more notes, most
| importantly:
|
| "Lagged Fibonacci generators have been used successfully in
| many situations since 1958, so it came as a shock to discover
| in the 1990s that they actually fail an extremely simple,
| non-contrived test for randomness" with reference to exercise
| 3.3.2-31 asking "prove that if we generate 79 consecutive
| random bits ... starting at a random point in the period, the
| probability is more than 51% that there are more 1s than 0s".
|
| The solution references this paper on the tests:
| https://arxiv.org/abs/cond-mat/9406054
|
| There's a recommendation to generate X numbers and only use
| the first Y an order of magnitude smaller, citing Luscher's
| RANLUX, but I don't know how much that holds up either.
| Someone wrote:
| Thanks! I focused too much on "additive" as a search term.
| I even looked at https://en.wikipedia.org/wiki/List_of_rand
| om_number_generato..., and would have found that better
| name if I had searched for "Mitchell" or "Moore" there, not
| "additive".
| hcs wrote:
| FYI if you only have the 2nd edition of volume 2, Knuth
| provides the full text of the changes from the 2nd to 3rd
| edition electronically: https://www-cs-
| faculty.stanford.edu/~knuth/taocp.html (search for
| "Errata for Volume 2 (2nd ed.)")
|
| Since it's too late to edit the earlier post, this is the
| specifics around discarding (p571):
|
| Luscher's discarding technique can be used to avoid the
| bias towards 1s. For example, with lags 55 and 24, no
| deviation for randomness is observed for random walks of
| length 1001 when the numbers are generated in batches of
| 165, if only the first 55 numbers of each batch are used.
| aaaaaaaaaaab wrote:
| Probably the NSA...
| russellbeattie wrote:
| In case people don't know this story, or worse, believe the
| conspiracy theories, here's what happened with the development
| of DES:
|
| > _" According to Steven Levy, IBM Watson researchers
| discovered differential cryptanalytic attacks in 1974 and were
| asked by the NSA to keep the technique secret. Coppersmith
| explains IBM's secrecy decision by saying, "that was because
| [differential cryptanalysis] can be a very powerful tool, used
| against many schemes, and there was concern that such
| information in the public domain could adversely affect
| national security." Levy quotes Walter Tuchman: "[t]hey asked
| us to stamp all our documents confidential... We actually put a
| number on each one and locked them up in safes, because they
| were considered U.S. government classified. They said do it. So
| I did it"._
|
| _Bruce Schneier observed that "It took the academic community
| two decades to figure out that the NSA 'tweaks' actually
| improved the security of DES."_
|
| https://en.m.wikipedia.org/wiki/Data_Encryption_Standard
| walterbell wrote:
| John Denker's 2013 paper/code for using sound cards as input to a
| HRNG, https://www.av8n.com/turbid/
|
| _> We discuss how to configure and use turbid, which is a
| Hardware Random Number Generator (HRNG), also called a True
| Random Generator (TRNG). It is suitable for a wide range of
| applications, from the simplest benign applications to the most
| demanding high-stakes adversarial applications, including
| cryptography and gaming. It relies on a combination of physical
| process and cryptological algorithms, rather than either of those
| separately. It harvests randomness from physical processes, and
| uses that randomness efficiently. The hash saturation principle
| is used to distill the data, so that the output is virtually 100%
| random for all practical purposes. This is calculated based on
| physical properties of the inputs, not merely estimated by
| looking at the statistics of the outputs. In contrast to a
| Pseudo-Random Generator, it has no internal state to worry about.
| In particular, we describe a low-cost high-performance
| implementation, using the computer's audio I /O system._
| hkgjjgjfjfjfjf wrote:
| sbf501 wrote:
| This is what bugs me about the stackexchange model. Most of the
| comments are guesses by people who really don't know. The
| "correct" answer doesn't explain anything, it just points to some
| sources. The highest ranked comment that comes closest, attempts
| to explain the nature of the numbers (the modulous is the period,
| and the primes stretch out the cycle) and fires off a link to
| wikipedia.
|
| Then it appears on HN, and more people half guessing, but better
| content in general.
|
| Based on yesterday's post, BYTE magazine would have actually
| given an answer with both why and how.
|
| EDIT: Granted, the link DOES go to Wikipedia which has a great
| explanation, but in general, providing a link and not answer is a
| poor model because links decay so quickly.
| svat wrote:
| It took me a while to understand your comment, until I realized
| there are _two_ interpretations of the question of where these
| numbers come from:
|
| * One is a _historical_ question: who first used these numbers
| and when? who copied from where?
|
| * The other is a _conceptual_ question: why might someone
| choose these numbers, what do they "mean", how do we interpret
| these constants?
|
| The former is purely a question of facts and there is nothing
| to "explain the nature of the numbers", and this is what the
| answerer on Stack Exchange seems to have interpreted it as.
| Under this interpretation saying things like "the modulus is
| the period" is not really germane to the question; what's
| required is to search sources and try to find the
| chronologically earliest one. (The answer doesn't seem to have
| done a definitive job, but it's a reasonable start.)
|
| This is like _synchronic_ vs _diachronic_ in linguistics: in
| the latter interpretation you mainly just want to _understand_
| the random-number generator.
| pizza234 wrote:
| > This is what bugs me about the stackexchange model.
|
| > providing a link and not answer is a poor model because links
| decay so quickly.
|
| Providing a link and not answer is not the stackexchange model.
| On the Stack Exchange network, such answers are often (not
| always, of course) commented with a request to expand the
| answer.
| nerdponx wrote:
| And it's not just accepted but encouraged to comment on old
| questions and answers that are either wrong or not up to
| current standards.
| phalangion wrote:
| > links decay quickly
|
| That might be true in many places, and I agree with the general
| sentiment that the answer should provide an answer and not just
| a link to an answer. But wikipedia links are generally pretty
| stable. It's not some random blog post.
| [deleted]
| spullara wrote:
| I'd be willing to bet that a Wikipedia link will outlast a
| StackExchange answer.
| matheusmoreira wrote:
| It absolutely will. Some of my favorite answers on Stack
| Overflow have been deleted and there seems to be no archive
| for such lost data.
| nerdponx wrote:
| Stackoverflow itself retains all deleted questions and
| answers. If you have enough points you can see them.
| kragen wrote:
| Which ones were they?
| cxcorp wrote:
| I usually save my bookmarks on the Wayback Machine just so
| there's at least one copy out there.
| loldk wrote:
| Insecure devs always obfuscate and try to hide the whole truth.
| It is a massive problem in tech.
| bluedino wrote:
| The title question was the original source. The answer was a
| list of places it was found, including the earliest.
| sbf501 wrote:
| Oh, good point. OP was looking for original source, papers
| referring to, etc. Too late to edit, but minus points for me
| for not comprehending the original question.
| bluedino wrote:
| Didn't help that he asked a couple more questions at the
| end
|
| _Who invented this algorithm and tested the distribution?
| Is there a paper or something to cite?_
|
| That would have made another question.
| Nition wrote:
| I don't think you were far off what the asker was looking
| for. In a comment, they say "I was kind of hoping to be
| able to point to some specific research as to why these
| numbers are special".
| 0xTJ wrote:
| The question isn't asking about how it works. It asks about
| where that algorithm comes from.
| a-dub wrote:
| NR has long been seen as a sort of necronomicon for scientific
| computer programming.
|
| i don't think it's SE's fault as much as it is a historical
| rough edge between two fields (computer programming and applied
| mathematics).
|
| the first constant is klaatu, the second is barada and the
| third is nikto.
|
| (NR also tends to be fairly rigorous in terms of citations, for
| the curious)
| ascar wrote:
| What's NR, what's a necronomicon and what's klaatu, barads
| and nikto.
|
| I googled all these things and I can't still make any sense
| of your comment except that it's intentionally confusing with
| some popular fiction references?
| svnpenn wrote:
| It's common hacker news trope for someone to do acronym
| diarrhea, even with rarely or never used abbreviations.
| It's quite annoying.
| Dylan16807 wrote:
| NR is in the answer in the link.
|
| I feel like SE being stack exchange is pretty clear here.
|
| The remaining reference is not an acronym or
| abbreviation, nor do I think it looks like one.
| hinkley wrote:
| "Klaatu barada nikto" is a code phrase from The Day the
| Earth Stood Still to stop the homicidal antagonists from
| killing you.
|
| It's doubly famous because Sam Raimi recycled it in "Army
| of Darkness" as an incantation to dispel the curse of the
| Necronomicon (book of the dead, portrayed as the arch book
| of black magic in any number of stories), which our hapless
| anti-hero Ash fucks up and thus unleashes the armies of the
| dead on a roughly medieval and wholly unprepared world.
|
| If you like campy horror movies, there is so much camp in
| Army of Darkness that there's barely any room for the
| horror. It is technically a sequel but it works as a
| standalone movie. I watched it years before I finally sat
| down and watched The Evil Dead.
| Natsu wrote:
| Wasn't it also used in Ghost Busters at some point?
| kragen wrote:
| In addition to what the other answers have said, the
| Necronomicon is the book of dark magic in the Lovecraft
| mythos, from which it has been copied into some other
| mythoi.
| webstrand wrote:
| NR is apparently Numerical Recipes, a book. I think they
| mean that the constants are not well understood, like the
| phrase "Klaatu barada nikto" was not understood, but still
| had an effect.
| adhesive_wombat wrote:
| I assume _Numerical Recipes_.
| overshard wrote:
| And this is when comments in code are important! Any random
| numbers without a source are immediate suspect to me, especially
| in something that needs to be secure. It will save your coworkers
| and peers time trying to figure out why it's there.
| JKCalhoun wrote:
| Security through obscurity?
|
| Maybe not.
| swatcoder wrote:
| That's true, but _this_ thing is not code any more.
|
| It's an incantation that's propagated for 50+ years because
| it's minimal and effective. Over time, it's been fully
| distilled to those properties.
|
| Since comments aren't essential to being minimal and effective,
| they don't survive the distillation.
|
| Think of it like a clever gist that got pasted and shared a
| hundred times. Even if the original source had explained every
| step in great detail, with inline comments and deep explanatory
| discourses and citations to prior art and etc, they'd
| eventually get trimmed away as fat as people repeatedly prune
| it down to some "important" bits pasted into their own copies
| and then later share those trimmed copies, ad infinitum.
|
| This is that, but 50 years out.
| AstralStorm wrote:
| Since it's an LCG, it does not matter how it was derived as
| long as the randomness properties are known. Such as cycle
| length, identical initial state set and dispersion
| properties. Perhaps also performance.
|
| These should be documented.
|
| It's a rather weak PRNG of short cycle, so the suspicion is
| that it's made for particular dispersion properties, such as
| for a hash table of particular data and size or other
| bucketing algorithm.
| ChrisLomont wrote:
| >These should be documented.
|
| They're trivial to look up, and any modern source would
| likely outlive the game of telephone of trying to keep such
| a comment intact correctly.
| krater23 wrote:
| When you find this algorithm with or without any comments from
| where the numbers come from in a point that should be secure,
| you should be more than suspect. In any way, this is not a
| secure PRNG.
| Waterluvian wrote:
| "No magic numbers."
| eru wrote:
| You'd want https://en.wikipedia.org/wiki/Nothing-up-my-
| sleeve_number
| Waterluvian wrote:
| That's clever!
| bee_rider wrote:
| It is interesting -- 5/9 of the books listed there are clearly
| numerics textbooks, so it isn't surprising they talked about a
| non-cryptographic PRNG. Curious about the other 4. But maybe
| this shows up as a "here's why you should be careful what type
| of RNG you are using" type example.
| pclmulqdq wrote:
| For PRNGs like this, constants are often chosen by guess and
| check. This algorithm (an LCG) has a bit more theory, so these
| might not have been chosen that way, but the author of the code
| probably didn't have any insight either.
| swayvil wrote:
| I think it fits the rng period to fit max 32 bit (4 bil).
| Somebody in the comments said similar so I feel good about this.
| jffry wrote:
| The person in the comments said the period is only ~233k, which
| is correct, because you are always taking modulo 233280. But
| it's also easy to verify experimentally: let
| state = 6, iters = 0; do { state = (state * 9301
| + 49297) % 233280; iters++; } while(state !== 6);
| console.log("period is", iters, "iters"); //period is
| 233280 iters
|
| If you instead took modulo 2^32, I believe the period would
| indeed be the maximal period, according to this article:
| https://en.wikipedia.org/wiki/Linear_congruential_generator#...
| bonzini wrote:
| He means that the constants are chosen so that the
| multiplication and addition can be done on a signed 32-bit
| integer without overflow.
| morelisp wrote:
| There's no reason to avoid overflow (nor any reason to make
| it signed since that would also be a concern these days)
| though.
| bonzini wrote:
| That depends on the language. For example BASIC used to
| have only 16-bit or 32-bit signed integers (depending on
| the processor; plus strings and floating point), so this
| particular RNG can be useful for its portability.
| morelisp wrote:
| Such dialects of BASIC also offered defined overflow
| behavior (it's really only C/C++ and other things
| targeting MCUs that don't - and even three also only
| relatively recently that "undefined overflow" meant
| something more practically dangerous than "implementation
| defined") so it was still not a concern.
| bonzini wrote:
| Sure but signed and unsigned division are different.
| [deleted]
| a1369209993 wrote:
| > so that the multiplication and addition can be done on a
| signed 32-bit integer without overflow
|
| Except they can't: any seed in the range 230888 thru 233279
| will produce a signed integer overflow when multiplied by
| 9301 (eg 9301*230888 = 0x80001608).
| bonzini wrote:
| You're right, that makes no sense. Also a-1 (9300) should
| not have factors in common with m (233280) but it has a
| lot...
| kloch wrote:
| Here's a good article about this type of PRNG, with other sample
| seeds:
|
| https://www.pcg-random.org/posts/does-it-beat-the-minimal-st...
| [deleted]
| snet0 wrote:
| If someone can access it, my university notes where I looked into
| LCGs such as this discusses that in Numerical Recipes, 1992 C
| Edition, a clever LCG can be used to generates random numbers
| between 0 and 1. Using a 32 bit int generated by a fast LCG
| modulo 2^32. The cleverness is that you treat this value as a
| float. It uses a bit mask to mask only the mantissa of a
| float-32, and then is OR'd with a value to set the exponent to
| 127. Subtracting 1 from this results in a random value between 0
| and 1, using only simple or bitwise operations!
| 10000truths wrote:
| The issue with this is approach that the interval [1,2) has
| much fewer representable values than [0,1) in IEEE-754 floats,
| so the range of generatable values is only a fraction of the
| representable values.
|
| On modern hardware, you should instead use a count-leading-
| zeroes or count-trailing-zeroes instruction on a uniform bit
| pattern to directly generate the exponent. This is what is done
| in the Zig standard library:
|
| https://github.com/ziglang/zig/pull/10428
| binarycoffee wrote:
| You are right of course, but I would contend that generating
| as well sub-normal floats is in 99.9% of the cases a needless
| cost. The reason is that, more often than not, all this extra
| precision will be immediately lost in subsequent operations.
|
| A very common situation is for instance to plug the random
| number in a log, in which case you need to use log(1-r)
| rather than log(r) to avoid an infinite at r=0. The problem
| is, by doing this simple subtraction you have already lost
| all the subnormal precision.
| ralphb wrote:
| This is well-known, and almost trivial. But because I am
| curious I looked this up in NR 3rd edition. Section 7.1.5 says
|
| > The steps above that convert a 64-bit integer to a double
| precision floating-point value involves both a non-trivial type
| conversion and a 64-bit floating multiply. They are performance
| bottlenecks. One can instead directly move the random bits into
| the right place in the double word with a union structure, a
| mask, and some 64-bit logical operations; _but in our
| experience this is not significantly faster_
|
| It goes on to describe a _lagged Fibonacci generator_ which
| generates values directly as floating point.
| binarycoffee wrote:
| This is a widely used method to map random integers to floating
| point numbers, but it has the disadvantage of wasting 1 bit of
| float mantissa precision.
|
| On modern CPUs, its computational advantage over full-precision
| mapping methods, such as multiplication by a float, is not
| always clear [1].
|
| [1] https://github.com/rust-random/rand/issues/416
| humanistbot wrote:
| Yeah, I'm pretty sure this isn't the same kind of thing as
| Carmack's (edit: not Carmack's) legendary inverse square root
| function [1], which has that 0x5F3759DF constant (edit: magic
| number):
|
| [1] https://en.wikipedia.org/wiki/Fast_inverse_square_root
| float Q_rsqrt( float number ) { long i; float
| x2, y; const float threehalfs = 1.5F; x2 =
| number \* 0.5F; y = number; i = \* ( long \* )
| &y; // evil floating point bit level
| hacking i = 0x5f3759df - ( i >> 1 );
| // what the fuck? y = \* ( float \* ) &i; y = y
| \* ( threehalfs - ( x2 \* y \* y ) ); // 1st iteration
| // y = y \* ( threehalfs - ( x2 \* y \* y ) ); // 2nd
| iteration, this can be removed return y; }
| silisili wrote:
| Not Carmack's, it says so in the link you provided.
|
| > The algorithm was originally attributed to John Carmack, but
| an investigation showed that the code had deeper roots in the
| hardware and software side of computer graphics. Adjustments
| and alterations passed through both Silicon Graphics and 3dfx
| Interactive, with the original constant being derived in a
| collaboration between Cleve Moler and Gregory Walsh, while
| Gregory worked for Ardent Computing in the late 1980s.[3] Walsh
| and Moler adapted their version from an unpublished paper by
| William Kahan and K.C. Ng circulated in May 1986.
| secondcoming wrote:
| What has this got to do with seeding a RNG?
| allan_s wrote:
| I think the comment author meant "a magical constant that has
| been found by someone by empirical method without a nice
| paper pointing why it's a good value"
|
| (even though I think now the constant above has a paper on
| why it's actually so good)
| humanistbot wrote:
| This was an example of a seemingly random magic number
| appearing in code for a very clever hack. I think the OP was
| assuming there was something similar at work with the PRNG
| magic numbers.
___________________________________________________________________
(page generated 2022-08-06 23:00 UTC)