[HN Gopher] Cracking Random Number Generators Using Machine Learning
___________________________________________________________________
Cracking Random Number Generators Using Machine Learning
Author : Hard_Space
Score : 136 points
Date : 2021-10-16 09:53 UTC (13 hours ago)
(HTM) web link (research.nccgroup.com)
(TXT) w3m dump (research.nccgroup.com)
| smitty1e wrote:
| If machine learning is ultimately pattern recognition, then is
| random number generation the antithesis of machine learning?
| willis936 wrote:
| Perhaps the machine is hallucinating.
|
| https://en.m.wikipedia.org/wiki/Pareidolia
| joe__f wrote:
| What are the potential implications of this?
| IncRnd wrote:
| None. This has been trivial to break for a long time. It is not
| considered a cryptographically secure random number generator.
| joe__f wrote:
| Is it potentially useful for something outside of
| cryptography? I was watching a video on speedrunning where
| they were gaming the RNG somehow to improve their times.
| upofadown wrote:
| Predictable bit streams that look random are useful for
| communications. Satellite navigation systems like GPS for
| instance use such streams where the whole idea is for a
| receiver to sync up on the bitstream.
|
| An example where no one even cares exactly what the
| bitstream is can be found on motherboards, where a bitstream
| is used to jitter the system clock to spread out any radio
| interference caused by the motherboard.
| _hl_ wrote:
| > to jitter the system clock to spread out any radio
| interference caused by the motherboard
|
| Fascinating! That seems very counterintuitive, where can
| I learn more about this?
| upofadown wrote:
| A hopefully relevant search:
|
| * https://www.google.com/search?q=motherboard+spread+spectrum
| swayvil wrote:
| They need to try this out on one of those natural RNGs, like
| electron noise or a lava lamp.
|
| Maybe they'll discover something.
| maweki wrote:
| The only thing you could discover is that it's not actually a
| natural RNG, but a PRNG.
| cblconfederate wrote:
| If a hidden variables model is found, it would be the
| discovery of the century
| tlholaday wrote:
| Strong evidence, perhaps dispositive evidence, for the
| Simulation Hypothesis.
| JackFr wrote:
| No it isn't.
| swayvil wrote:
| A fundamentally entropic phenomenon is discovered to be
| patterned. It would be like looking through a microscope
| and seeing voxels.
|
| How is this not strong evidence for simulation?
| chaboud wrote:
| It could just as easily be evidence of a previously non-
| understood physical phenomenon or mechanism being
| recognized by new methods.
|
| Of course, that's exactly the kind of thing a simulation
| would type!
| JackFr wrote:
| Intellectually the simulation hypothesis is creationism's
| hipster cousin.
|
| It's not a falsifiable hypothesis. It makes so many
| implicit assumptions it's scarcely worth talking about.
|
| It's fake smart and not interesting.
| swayvil wrote:
| Have you heard the one about the rat in the glove? Looks
| like a hand. Wiggles like a hand. All rat.
| rsfern wrote:
| Isn't the quantum nature of reality already like looking
| through a microscope and finding voxels?
|
| I don't see why finding underlying structure in what we
| expect to be pure noise has to mean our reality is a
| simulation. I guess it could be consistent with that, but
| couldn't it also be consistent with some previously
| undiscovered physics, much like the development of
| quantum mechanics?
| SavantIdiot wrote:
| I'd like to see a neural net have a go at a DRBG, such as one
| using an HMAC/SHA256 and PBKDF2, a cryptographically strong
| pseudo-random bit generator. Is it possible there's something
| that analytic methods of this RFC didn't catch that a neural net
| could?
| tptacek wrote:
| There's been over a decade of research into machine learning
| cryptanalysis. It's not a new idea.
| cblconfederate wrote:
| ANNs are function approximators, so it's trivial for them to learn
| a mathematical function like a prng. The seed is irrelevant to
| the ANN, it just learns the function's sequence, regardless of
| where it started. Which makes the interesting question: how many
| people are already cracking PRNGs with ANNs?
| CodesInChaos wrote:
| 1. For many functions ANNs will never converge to an
| approximation of that function given only (input, output)
| tuples, even if the function is simple enough that the ANN
| could represent it. Chaotic functions rarely have a useful
| gradient the NN can follow.
|
| 2. To break a PRF it'd need to learn its inverse. For many of
| them it's unlikely that a small circuit computing the inverse
| even exists, so a non-iterated NN couldn't even represent the
| desired function, let alone converge to it.
| cblconfederate wrote:
| I don't know much about cryptography, but why would it need to
| learn its inverse? I mean, a theoretically absurdly large/wide
| network can approximate almost any function. Are PRFs
| resistant to that?
| _hl_ wrote:
| The underlying assumption of all of modern cryptography is
| that one-way functions exist (implying P != NP), and
| further that the algorithms we use in practice are actually
| instances of these theoretically hard problems. So there
| provably always exists an algorithm that breaks any given
| PRNG, it will just always take you very long to compute
| and/or a lot of queries to the PRNG. ANNs are nothing
| special in that regard, they are just algorithms and have
| the same theoretical lower bound on their runtime.
|
| So yes, an absurdly large ANN will break any PRNG, but so
| will other absurdly long-running algorithms, some of which
| are quite trivial: Just try every possible seed until you
| get the same sequence as what you observe, repeat until
| only one candidate seed is left.
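|
| A minimal sketch of that brute-force search in Python (the
| names here are placeholders, and it's only feasible for toy
| seed spaces, which is exactly the point of the exponential
| cost):
|
|     def recover_seed(make_prng, observed, seed_space):
|         # keep every seed whose generator reproduces the
|         # observed outputs; observe more outputs to shrink
|         # the candidate list down to one
|         candidates = []
|         for seed in seed_space:
|             gen = make_prng(seed)
|             if all(next(gen) == o for o in observed):
|                 candidates.append(seed)
|         return candidates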
|
| EDIT: To add to this, you seem to be referring to the
| universal approximation theorems for ANNs. These theorems
| state that for any function (subject to some conditions not
| relevant here) and any arbitrary approximation ratio, there
| exists a finite ANN that approximates the function to
| within the desired approximation ratio. It says nothing
| about whether it's possible to train such an ANN, merely
| that it exists. Which in this case is a fairly trivial
| result: you could feasibly create a look-up table from PRNG
| sequences to seeds and encode that as an enormous ANN. But
| finding/training this ANN is prohibitively expensive.
| cblconfederate wrote:
| I'm not sure if there is any guarantee on the size of the
| network required. Problems like vision or language were
| also very complex and nonlinear, but ANNs were able to
| handle them. It is analytically difficult to find an
| inverse function, but the ANNs are not trying to do that,
| merely approximating the source code of the PRNG as
| piecewise-nonlinear functions. So I wouldn't think the
| former problem (the inverse) is informative about the
| difficulty of the ANN training. At least I don't know if
| there's theoretical work on designing ANN-hard
| cryptographic functions.
| _hl_ wrote:
| The argument is fairly straightforward: Let n be the
| number of seed bits of the CSPRNG. Assume there exists an
| ANN that approximates the CSPRNG so that the number of
| possibly wrong output bits is at most polylogarithmic in
| n (otherwise you haven't really gained much insight from
| the ANN). Assume further that the ANN is sufficiently
| small so that a forward pass through the ANN takes at
| most polynomial time in n. Then you can create a
| polynomial-time algorithm that (i) runs the ANN on the
| observed CSPRNG samples/by querying the oracle, and then
| (ii) brute-forces the possible error bits in polynomial
| time, thereby recovering the full seed. But this is a
| contradiction with the assumption that the CSPRNG is
| secure, i.e. that it admits no polynomial-time adversary,
| q.e.d.
| cblconfederate wrote:
| The difference is that ANNs make approximations. It could
| be manageably small and learn an approximation of the
| PRNG code that succeeds for a large number of outputs,
| even if running a fully accurate network is impossible. I
| think the complexity of the PRNG's recursive algorithm,
| when unrolled through time for a large number of steps, is
| the relevant complexity here. The ANN is not necessarily
| trying to devise a new function that recovers the seed.
| _hl_ wrote:
| The proof accounts for that. You can even allow for the
| ANN being totally wrong on "almost all" seeds except for
| a non-negligible fraction; the definition of a CSPRNG
| allows for that. If it is wrong for all but a negligible
| fraction of seeds, then you haven't gained much over
| random guessing. So the only ANNs that can exist are ones
| whose approximation is sufficiently bad that it doesn't
| give you any significant speedup in cracking the CSPRNG.
|
| Of course, this is assuming that practical CSPRNGs
| conform to the theory, but we're just debating theory
| here.
| MauranKilom wrote:
| > I mean a theoretically absurdly large/wide network can
| approximate almost any function
|
| Yes. But chances are that the network size scales
| exponentially with the number of e.g. input bits. Good luck
| building/training/storing a model with 2^128 neurons in the
| hidden layer.
| seanhunter wrote:
| I could see how they could learn to approximate the function if
| they could observe the internal state, but without being able
| to see the state, if it's a strong PRNG how would they be able
| to do it?
|
| Edit: Reading the paper, they're approximating xorshift,
| which as I understand it is fast and efficient but has very
| little internal state and is not a strong PRNG.
| cblconfederate wrote:
| Would it be called pseudorandom if it had an undecipherable
| internal state, such as temperature or something? Aren't all
| PRNGs deterministic functions by definition?
| adrian_b wrote:
| For some PRNGs it is intended that the output values must
| be unpredictable, e.g. for use in cryptographic
| applications.
|
| For such PRNGs, you can either view them as having a known
| state and an unknown output function or you can view them
| as having a known output function and a state composed of a
| known part and of an unknown, secret, part.
|
| The second point of view is more general, as the first case
| can always be reduced to the second, because the unknown
| output function must be derived using known functions from
| a secret key, so you can consider the secret key as the
| part of the state that is unknown.
|
| (There is a third possible structure for a PRNG, with known
| state, known output function and unknown transition
| function, but this is obsolete for making unpredictable
| PRNGs, as it has inferior properties).
|
| So yes, a pseudo RNG that has a part of its state that
| remains unknown may produce an output sequence from which
| it is computationally infeasible to determine its complete
| state.
|
| Nonetheless, an unpredictable PRNG remains a PRNG, not a
| true RNG. You can reproduce its output sequence at any time if
| you know the secret part of the state, e.g. when you are
| the recipient of an encrypted message.
|
| If a ML method would be able to recover the secret part of
| the state when given the output sequence, obviously that
| PRNG would be considered broken.
|
| Many PRNGs are designed for high-speed in non-cryptographic
| applications, e.g. simulations, and for those PRNGs it is
| usually very easy to determine the internal state from the
| output sequence, without the need of any ML.
| [deleted]
| joshuaissac wrote:
| There are CSPRNGs like Fortuna that contain an entropy
| accumulator, so even if the secret part of the state is
| compromised, it can recover.
| adrian_b wrote:
| True, but they belong to a different class, used for
| applications like key generation, where there is
| no need to ever reproduce a certain output sequence.
|
| Such RNGs are used instead of true RNGs, which might not
| be able to provide the required random numbers fast
| enough. They cannot be used instead of normal PRNGs in
| most of their applications.
|
| Unpredictable cryptographic PRNGs can be used instead of
| any other PRNG for simulations, games, Monte Carlo
| integration etc., and they are actually better than
| simpler predictable PRNGs, except that they might be too
| slow.
|
| On modern CPUs with hardware AES instructions, any
| traditional PRNG that is not faster than computing AES has
| become obsolete, as using AES with a counter would
| provide a random sequence of higher quality.
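|
| A minimal sketch of such an AES-CTR generator with the
| Python cryptography package (the key is the secret part
| of the state; encrypting zeros yields the raw keystream):
|
|     # pip install cryptography
|     import os
|     from cryptography.hazmat.primitives.ciphers import (
|         Cipher, algorithms, modes)
|
|     key, nonce = os.urandom(16), os.urandom(16)
|     enc = Cipher(algorithms.AES(key),
|                  modes.CTR(nonce)).encryptor()
|
|     def random_bytes(n):
|         # AES-CTR keystream as a pseudorandom byte stream
|         return enc.update(b"\x00" * n)
|
|     print(random_bytes(16).hex())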
| rurban wrote:
| Wrong, as we can see with x64's new rdrand, which relies
| on fast AES-NI but is terribly broken.
| saithound wrote:
| Generally, PRNGs have an internal state, a transition
| function that determines a new internal state based on the
| current internal state, and an output function that
| determines the output value based on the current internal
| state.
|
| In most cases, the output is determined by the output
| function based on a small part of the state (e.g. you may
| have 128 bits of state, feed the first 64 bits of it to an
| output function, and derive a 32-bit output as a result).
| This means that you don't learn much about the internal
| state of the generator merely by looking at an output
| value: you need to observe a lot of consecutive outputs to
| recover the internal state of the generator, and without
| knowing the internal state it's difficult (i.e. very hard
| to impossible) to use a relatively small number of known
| outputs to predict the next output.
|
| The xorshift128 generator tested in the article has a nigh-
| trivial output function (because it's designed to be fast,
| not hard-to-predict) and a very simple transition function,
| which is why 4 outputs can be used to recover the internal
| state and to train a NN to predict the next output.
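|
| For concreteness, a sketch in Python (assuming Marsaglia's
| xorshift128 variant; the seed words are arbitrary). The four
| most recent outputs are exactly the four state words, so
| predicting the next output is just running one more step:
|
|     MASK = 0xFFFFFFFF
|
|     def xorshift128(x, y, z, w):
|         # state: four 32-bit words; each step outputs new w
|         while True:
|             t = (x ^ (x << 11)) & MASK
|             t ^= t >> 8
|             x, y, z, w = y, z, w, (w ^ (w >> 19) ^ t) & MASK
|             yield w
|
|     def predict_next(o1, o2, o3, o4):
|         # after four outputs, the state IS (o1, o2, o3, o4)
|         t = (o1 ^ (o1 << 11)) & MASK
|         t ^= t >> 8
|         return (o4 ^ (o4 >> 19) ^ t) & MASK
|
|     g = xorshift128(123456789, 362436069, 521288629, 88675123)
|     outs = [next(g) for _ in range(5)]
|     assert predict_next(*outs[:4]) == outs[4]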
| cblconfederate wrote:
| but ANNs are universal approximators; they don't care if a
| function is recursive or anything, as long as it's
| deterministic
| saithound wrote:
| The point is precisely that predicting the next output
| based on a number of known outputs is not deterministic.
| Predicting the next _internal state_ from the current
| _internal state_ is deterministic, but that doesn't help,
| since you don't have access to the internal state.
|
| Here's a simple random number generator. You start with a
| 16 bit number x (your internal state). At each step you
| output the first digit of your internal state (your random
| number), then update the state to 75x + 74 (mod 65537).
|
| Here's a sequence of outputs: 6,5,3,3,3,5. Does this
| information allow you to predict the next output? No.
|
| For example, if the generator's starting state is
| x=60160, then the generator's internal state evolves as
| follows: [60160,55558,38093,38958,38296,54183,505]. So
| the seventh output will be 5. However, if your
| generator's starting state was instead x=6, then the
| generator's state evolves as follows:
| [6,524,39374,3959,34851,57956,21332]. So the seventh
| output will be 2. The relationship between the first six
| outputs and the seventh output is not deterministic.
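|
| A direct transcription of that toy generator in Python,
| reproducing the numbers above:
|
|     def toy_prng(x):
|         # emit the leading digit of the current state, then
|         # update the state to 75x + 74 (mod 65537)
|         while True:
|             yield int(str(x)[0])
|             x = (75 * x + 74) % 65537
|
|     for seed in (60160, 6):
|         g = toy_prng(seed)
|         print(seed, [next(g) for _ in range(7)])
|     # 60160 -> [6, 5, 3, 3, 3, 5, 5]
|     # 6     -> [6, 5, 3, 3, 3, 5, 2]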
| cblconfederate wrote:
| The function is still deterministic. ANNs can learn the
| source code of the function; they're not looking for an
| analytical solution to the forward or inverse problem.
| roywiggins wrote:
| The point is, you can be handed 1) the entire source code
| of the PRNG, and 2) six outputs from it, and it's
| impossible to know what the starting state of the PRNG
| was because six outputs could have come from many
| different starting states. You could brute force every
| single starting state and find every one that would
| produce those 6 outputs but it still doesn't let you know
| the next value.
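|
| With the toy generator above you can see this concretely by
| enumerating all 65537 possible starting states (a sketch,
| using the same update rule):
|
|     def outputs(x, n):
|         outs = []
|         for _ in range(n):
|             outs.append(int(str(x)[0]))
|             x = (75 * x + 74) % 65537
|         return outs
|
|     observed = [6, 5, 3, 3, 3, 5]
|     candidates = [s for s in range(65537)
|                   if outputs(s, 6) == observed]
|     # many states survive, and they disagree on output 7:
|     print(len(candidates),
|           {outputs(s, 7)[-1] for s in candidates})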
| ChrisLomont wrote:
| They're already trivial to crack, 100% of the time, with Z3.
| Using something fuzzy like a neural net is far from an
| optimal way to break it.
| visskiss wrote:
| Uhh yeah right.
| saithound wrote:
| cblconfederate happens to be right in this specific case
| (although wrong about using neural nets for predicting PRNGs
| in general): the author uses four consecutive non-truncated
| outputs of the xorshift128 generator as the network input,
| and learns to predict the next output. Since four consecutive
| outputs expose the entire internal state of the xorshift128
| generator, this amounts to little more than approximating a
| (very simple) four-variable function.
| [deleted]
| ChuckMcM wrote:
| Can't wait for the first time a couple of thousand people all
| pick the winning lottery numbers :-)
|
| This looks like excellent work.
| Hendrikto wrote:
| The underlying problem seems to be that the PRNG they are
| emulating is giving away its internal state. I wouldn't expect
| this to work for PCG, for example.
| ChrisLomont wrote:
| It would also work for PCG. You're simply modeling the PRNG and
| learning the internal state from outputs.
|
| It's even easier using Z3 or your own SAT solver. I've broken
| pretty much all non-crypto PRNGS using Z3, and it's trivial to
| do so. You simply model the PRNG in Z3, with unknown constants,
| plug in a few values from the output (or even parts of outputs,
| or even non-consecutive ones, anything you derive from them,
| etc. - just model how the outputs were formed), and ask Z3 to solve
| for the internal state. It works every time.
| tubby12345 wrote:
| How do you pick how many constants and their relationships?
| Someone wrote:
| I think that's the "You simply model the PRNG in Z3" part.
|
| Whether that's simple is debatable, but an example
| would be if somebody open-sourced the server side of their
| poker-dealing site.
| ChrisLomont wrote:
| Also, to break unknown/unseen PRNG, model a few popular
| classes of PRNG, a few common reductions (so for cards
| the naive val%52 and better rejection sampling), then
| solve across the states till you get one. Then watch a
| few more outputs, check that you guessed the source and
| reduction, and now you've cracked that too.
|
| I doubt any poker site uses anything other than a crypto
| rand for deals because the technique I just explained
| would eat them alive. They can use fast and insecure PRNG
| for doing Monte Carlo hand evaluation, though, since that
| doesn't give you any advantage.
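|
| On the reduction point, a quick sketch of the two options
| (naive mod versus rejection sampling); 2^32 is not a
| multiple of 52, so the naive version makes low cards
| slightly more likely, and that bias is part of what you
| model:
|
|     def card_naive(next_u32):
|         # biased: 2**32 % 52 != 0
|         return next_u32() % 52
|
|     def card_rejection(next_u32):
|         # unbiased: reject draws at or above the largest
|         # multiple of 52 that fits in 32 bits
|         limit = 2**32 - (2**32 % 52)
|         while True:
|             v = next_u32()
|             if v < limit:
|                 return v % 52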
| downandout wrote:
| So building on this...many physical casinos use a manual
| shuffle, especially at higher limits, as bigger players
| tend to distrust shuffle machines. Even some online
| casinos with live dealers will shuffle the previous shoe
| on camera, manually.
|
| Since getting a truly random shuffle would take too long,
| this is a poor method of getting random results. Would it
| be possible to, at least to some degree, predict the
| order of the cards using one of these models, given the
| order of the cards prior to shuffling? I'm thinking
| something along the lines of being able to say that the
| odds of a given card being the next one are higher than
| they should be, not outright predicting the full order of
| the cards.
| ChrisLomont wrote:
| If I recall, something like 7 hand shuffles become
| indistinguishable from random. I'd guess a pro could get
| this level of mixing in a few shuffles.
| ChrisLomont wrote:
| It's quite simple. Z3 supports every basic integer operation
| out of the box. You literally write the same code as the
| original prng. Instead of a uint32, or uint64, or
| float32, etc., you use the Z3 equivalent type. That's it.
| You define the state as unknown, but same Z3 type as in C
| or JavaScript. Then you run the function once. Set it to
| the output you see, repeat a few observed outputs, then
| tell Z3 to solve for the original unknown state.
|
| It's absolutely trivial, taking a few lines of code.
|
| I've broken all sorts of code using similar tools.
| tmoertel wrote:
| Do you have a worked example of finding a PRNG's internal
| state that you could share? (Most people are unfamiliar
| with Z3 and other SMT solvers. Seeing a relevant worked
| example would be very illustrative.)
| ChrisLomont wrote:
| Yeah, I was considering posting one. I've lately switched
| to Z3 under C# since it's so easy to do things... If I
| get around to it tonight I'll make a post.
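|
| In the meantime, a minimal sketch of the idea with Z3's
| Python bindings, against the article's xorshift128 (the
| "secret" state here is a stand-in; note LShR, because >>
| on Z3 bitvectors is the arithmetic shift):
|
|     # pip install z3-solver
|     from z3 import BitVec, Solver, LShR, sat
|
|     M = 0xFFFFFFFF
|
|     def step(x, y, z, w, shr):
|         # one xorshift128 step; the same code works on
|         # Python ints and Z3 bitvectors, parameterized on
|         # the logical-right-shift operation
|         t = (x ^ (x << 11)) & M
|         t ^= shr(t, 8)
|         return y, z, w, (w ^ shr(w, 19) ^ t) & M
|
|     # the attacker sees only these four outputs
|     st = (123456789, 362436069, 521288629, 88675123)
|     observed = []
|     for _ in range(4):
|         st = step(*st, shr=lambda v, n: v >> n)
|         observed.append(st[3])
|
|     # declare the initial state as unknown 32-bit words
|     # and constrain it to reproduce the observed outputs
|     state = tuple(BitVec(n, 32) for n in "xyzw")
|     s, cur = Solver(), state
|     for o in observed:
|         cur = step(*cur, shr=LShR)
|         s.add(cur[3] == o)
|
|     assert s.check() == sat
|     print([s.model()[v].as_long() for v in state])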
| tubby12345 wrote:
| Oh gotcha. I thought you meant that you were somehow
| modeling the PRNG using some reduced set of constraints.
| sillysaurusx wrote:
| This is pretty hilarious. I just noticed that this is published
| by NCC Group. (Former NCC Group pentester here, and now I happen
| to be an ML engineer.)
|
| I don't think HN's negativity is warranted. But there's also some
| confusion, which I'd like to help clear up.
|
| People here seem confused about "Why? Why do this?"
|
| The answer is simply that it's an interesting problem. That's it.
|
| It doesn't need to be important, or a promising research
| direction, or mention low bit precision, or quantized networks,
| or be performant, or tackle a complex PRNG.
|
| It just needs to be an interesting problem which you can solve.
| Love of ML is no more complicated than that.
|
| Knowing the fellas at NCC, I'm quite certain that they did this
| mostly out of intellectual curiosity rather than some grand plan
| to advance the state of the art of ML. And I think that's
| wonderful. More people should get into ML with exactly that
| mindset.
| [deleted]
| [deleted]
| _hl_ wrote:
| What they are doing is essentially encoding the binary circuit of
| the xorshift128 PRNG as a neural network. The fact that you can
| encode arbitrary binary circuits as neural networks is well-
| known, so it's not surprising that it is possible to do this.
|
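| To make "encoding gates as network blocks" concrete: over
| binary inputs, XOR(a, b) = relu(a + b) - 2*relu(a + b - 1),
| so a two-unit ReLU layer with fixed weights computes an
| exact XOR gate. A sketch (these particular weights are one
| choice of many, not taken from the article):
|
|     import numpy as np
|
|     W1 = np.array([[1.0, 1.0],
|                    [1.0, 1.0]])  # both hidden units sum a+b
|     b1 = np.array([0.0, -1.0])
|     W2 = np.array([1.0, -2.0])
|
|     def xor_gate(a, b):
|         h = np.maximum(np.array([a, b]) @ W1 + b1, 0.0)
|         return h @ W2
|
|     for a in (0, 1):
|         for b in (0, 1):
|             print(a, b, int(xor_gate(a, b)))  # 0, 1, 1, 0
|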
| The interesting and perhaps surprising result is the
| demonstration that it is possible to train this network using
| standard gradient methods, when choosing the proper loss function
| for the problem and designing the network with the required
| domain-knowledge of the algorithm it is supposed to model. In
| other words, this can be seen as an example of how well-educated
| network design and choice of loss function massively impact the
| performance of your fancy ML model.
|
| For more "real-world" problems, it is often very difficult to
| come up with a network design that encodes the implicit
| constraints of the problem, mostly because these constraints are
| not even known (that's why we use ML!), but results like the
| recent AlphaFold show that even for these problems, thoughtful
| design of the model architecture goes a very long way.
| downandout wrote:
| So would this model be able to predict _any_ imperfect PRNG
| with some degree of accuracy, or just xorshift128? For the
| purposes I'm thinking of, even 1% accuracy above purely random
| would suffice.
| IAmGraydon wrote:
| Financial market prediction is where my mind went too.
| SeanLuke wrote:
| > What they are doing is essentially encoding the binary
| circuit of the xorshift128 PRNG as a neural network. The fact
| that you can encode arbitrary binary circuits as neural
| networks is well-known, so it's not surprising that it is
| possible to do this.
|
| McCulloch-Pitts aside, I think this is unfair. What's
| interesting to me is that they can predict a recurrent
| algorithm with internal state using a trivial NON-
| recurrent feedforward network without any internal state.
| gus_massa wrote:
| They can see the last 4 results, not only the last result.
| That's somewhat equivalent to partially seeing the secret
| internal state.
|
| As a bad metaphor: let's suppose that you enter a lift and
| a NN without internal state must predict which button you
| will press, but now consider the case where the NN can see a
| video of everything you have done during the last year. Do
| you think it's possible?
| _hl_ wrote:
| > That's somewhat equivalent of partially seeing the secret
| internal state
|
| Not just partially, xorshift128 is designed such that the
| output depends only on the last 4 results. So this is
| "just" learning the output function of xorshift as a neural
| network.
| SeanLuke wrote:
| Perhaps you might consider how big that output function
| ideally is, then compare to the capacity of the neural
| network they used and the number of samples provided.
| _hl_ wrote:
| My reading of it is that it's essentially one-to-one (up
| to minor constant factors), they took the binary circuit
| for xorshift128 and replace the xor gates by functionally
| equivalent NN blocks, and then trained the weights.
| SeanLuke wrote:
| That's only ~365 values. XORshift32 has a period of 2^32-1,
| a good portion of that (I'd imagine) unique. If XORshift
| had really good randomness properties, at the extreme we'd
| expect that the NN would have to special case _all of
| them._ What this implies is that XORshift has easily
| cracked, nonrandom, repeatable patterns (which we suspect
| of course since it fails certain tests) and that the NN has
| identified them.
| rurban wrote:
| We knew that before the two ML attempts. You just need to
| look at the source. There are several other trivial
| PRNGs with the exact same properties. Never use a
| trivial PRNG for security or seeding.
| jameshart wrote:
| It looks like this specific algorithm can easily be
| deterministically reversed - the XORs and bitwise shifts mean
| that, given four output numbers, you can completely determine
| what state it was in before the numbers were generated - and,
| in fact, that there is probably a simple series of bit shifts
| and XORs you can perform on the last four outputs that
| produces the next number.
|
| Bit shifts and XORs are very much the kind of pattern neural
| networks should excel at learning - they mean that each
| output bit is a simple linear function of the input bits.
|
| Actually doing it is still a good demonstration! But the fact
| that the function in question is used as a PRNG doesn't imply
| that other PRNGs would be susceptible to a similar approach.
| 317070 wrote:
| XOR is not linear in the inputs. I mention this because
| Minsky's paper used that fact to show that perceptrons
| could never work as AI. This pretty much started the first
| AI winter in 1973.
|
| It's an interesting story:
| https://towardsdatascience.com/history-of-the-first-ai-winte...
| ajb wrote:
| Linearity is actually relative to the algebra you are
| using. XOR is not linear in real arithmetic, but it is in
| GF(2). This is useful because some of the algorithms work
| across many algebras and are useful for different
| problems, e.g. the tropical algebra.
|
| But you are quite correct about the AI winter.
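|
| To make that concrete for the generator in the article:
| xorshift128's output map is built only from shifts and
| XORs, so it is linear over GF(2), i.e. f(s1 xor s2) =
| f(s1) xor f(s2). A quick property check in Python (helper
| names are mine):
|
|     import random
|
|     MASK = 0xFFFFFFFF
|
|     def out(x, y, z, w):
|         # xorshift128's next output as a function of state
|         t = (x ^ (x << 11)) & MASK
|         t ^= t >> 8
|         return (w ^ (w >> 19) ^ t) & MASK
|
|     rnd = random.Random(0)
|     for _ in range(1000):
|         s1 = [rnd.getrandbits(32) for _ in range(4)]
|         s2 = [rnd.getrandbits(32) for _ in range(4)]
|         sx = [a ^ b for a, b in zip(s1, s2)]
|         assert out(*sx) == out(*s1) ^ out(*s2)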
| More-nitors wrote:
| hm can this be applied to AES?
| bawolff wrote:
| I'm pretty sure it can't
| cookiengineer wrote:
| This whole project reads as a beginner's guide to ML and what
| it can do.
|
| I mean, it's nice and all but at some point I would've expected
| at least a mention that this all was solvable with a quantized
| neural network with low bit precision as well.
|
| Most of the article was about LSTMs and the recurrent design of
| those is just a very inefficient way to solve the problem at
| hand.
|
| I would've expected an LSTM attempt at something like a
| seed-based randomizer like MT19937, with the unfolding layers
| trained on the seed itself, the idea being that they learn how
| to predict the seed's state for the next iteration.
|
| Something like this would be really important research,
| especially in times where time-based one-time passwords are
| used everywhere, and the cryptographic security of how their
| seeds are generated is important.
| sillysaurusx wrote:
| > I mean, it's nice and all but at some point I would've
| expected at least a mention that this all was solvable with a
| quantized neural network with low bit precision as well.
|
| This doesn't matter, so I'm not sure why you were expecting
| it.
|
| I'm struggling to be diplomatic, so I'll leave it at that.
| deegles wrote:
| Since SHA-1 is known to be broken, could this be adapted to
| create collisions on demand? How much training would be required
| to break SHA256?
| tptacek wrote:
| No, you can't use ML to break SHA256.
| phenkdo wrote:
| How is ML being used to generate more randomness in PRNGs?
| Wouldn't the opposite be a more interesting research idea?
| arendtio wrote:
| I think this is a very good idea: both topics (random
| number generators and AI) are hard to fully understand, and
| pitting one against the other could be a way to teach both
| topics with increasing difficulty.
| bionhoward wrote:
| Could this make a more performant PRNG for generative models,
| random simulations, etc?
___________________________________________________________________
(page generated 2021-10-16 23:00 UTC)