[HN Gopher] Benchmarking RSA Key Generation
___________________________________________________________________
Benchmarking RSA Key Generation
Author : mfrw
Score : 42 points
Date : 2024-12-31 12:57 UTC (3 days ago)
(HTM) web link (words.filippo.io)
(TXT) w3m dump (words.filippo.io)
| throw0101c wrote:
| For new deployments/protocols, is there any reason to use RSA? Is
| RSA (mostly?) about legacy support nowadays?
| bux93 wrote:
  | For new protocols, go with X25519MLKEM768; then it's
  | quantum-proof as well.
| daneel_w wrote:
  | To my understanding, even with Shor's algorithm, RSA can
  | still resist attack if you use sufficiently large keys, such
  | that the quantum computer needed to recover the primes must
  | itself be "sufficiently larger" - i.e. a cat-and-mouse game
  | of sorts until some major breakthrough comes along. I agree
  | with simply moving to Ed25519 and X25519/ML-KEM etc.
| honzaik wrote:
  | Well, it depends on the size of the quantum computer. Of
  | course you can make RSA keys large enough (depending on your
  | security margin/assumptions), but the problem is that the
  | size/computational increase is exponential whereas the
  | solving speed scales polynomially.
|
| https://eprint.iacr.org/2017/351
| maginx wrote:
  | The size/computational complexity of RSA usage also only
  | grows polynomially. But even with this in mind, a quantum
  | computer that can crack in polynomial time is still
  | problematic: while you can increase the key size, the
  | operator of the quantum computer could enlarge his computer,
  | a true cat-and-mouse game. That is unlike the current
  | situation, where usage complexity grows polynomially while
  | cracking grows exponentially. This difference is the reason
  | why we can pick key sizes where signing/decryption takes a
  | millisecond while cracking (presumably) takes billions of
  | years.
| wolf550e wrote:
  | Ed25519 has been recommended over RSA for many years; the
  | post-quantum stuff is recent. RSA should only be used to
  | support old protocols, like WebPKI certificates.
| upofadown wrote:
  | RSA is faster than elliptic curves for signature verification
  | and encryption. It is one of the oldest methods (about half a
  | century old), which suggests that it could be a good choice
  | when you need the lowest possible chance of a newly
  | discovered weakness. It can be used for both signing and
  | encryption (but not with the same key pair).
| maginx wrote:
  | I still see a lot of RSA, especially for encryption. While
  | ECC can be used for encryption, it is more convoluted and
  | much less widely supported in hardware and software than RSA
  | encryption/decryption.
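  |
  | For illustration only, a minimal Go sketch of RSA encryption
  | and decryption with the standard library's crypto/rsa
  | (RSA-OAEP with SHA-256); the key size and message are
  | arbitrary example values, not anything from the article:
  |
  |   package main
  |
  |   import (
  |       "crypto/rand"
  |       "crypto/rsa"
  |       "crypto/sha256"
  |       "fmt"
  |   )
  |
  |   func main() {
  |       // Generate a 2048-bit key pair (the slow, probabilistic
  |       // part discussed elsewhere in this thread).
  |       priv, err := rsa.GenerateKey(rand.Reader, 2048)
  |       if err != nil {
  |           panic(err)
  |       }
  |       msg := []byte("a short secret, e.g. a session key")
  |       // Encrypt with the public key (RSA-OAEP)...
  |       ct, err := rsa.EncryptOAEP(sha256.New(), rand.Reader,
  |           &priv.PublicKey, msg, nil)
  |       if err != nil {
  |           panic(err)
  |       }
  |       // ...and decrypt with the private key.
  |       pt, err := rsa.DecryptOAEP(sha256.New(), rand.Reader,
  |           priv, ct, nil)
  |       if err != nil {
  |           panic(err)
  |       }
  |       fmt.Printf("%s\n", pt)
  |   }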
| thayne wrote:
  | FIPS 140-2 compliance. FIPS 140-3 adds support for ECC, but
  | it is relatively new, and there aren't a lot of modules that
  | have been certified for it yet, so depending on your
  | environment and requirements you might still need to use RSA.
|
| Or you are doing a new deployment that needs to be compatible
| with something that only supports RSA.
| jdewerd wrote:
| > The prime-counting function approximation tells us there are
| Li(x) primes less than x, which works out[5] to one prime every
| 354 odd integers of 1024 bits.
|
| Rule of thumb: Want a 1024-bit prime? Try 1024 1024-bit
| candidates and you'll probably find one. Want a 4096-bit prime?
| Try 4096 4096-bit candidates and you'll probably find one.
|
  | The approximate spacing of primes around p is ln(p), so
  | ln(2^1024) = 1024*ln(2), and ln(2) = 0.693, so if you are
  | willing to absorb the 0.693 into your rule of thumb as a
  | safety margin you get the delightfully simple rule of thumb
  | above. Of course, you'll still want to use a sieve to quickly
  | reject numbers divisible by 2, 3, 5, 7, etc. (this easily
  | rejects 90% of numbers), then do a Fermat primality test on
  | the survivors (which, if you squint, is sort of like "try
  | RSA, see if it works"), and then do Miller-Rabin tests to
  | really smash down the probability that your candidate isn't
  | prime. The probabilities can be made absurdly small, but it
  | still feels a bit scandalous that the whole thing is
  | probabilistic.
|
| EDIT: updated rule of thumb to reflect random candidate choice
| rather than sequential candidate choice.
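  |
  | A minimal Go sketch of that pipeline, purely as illustration
  | (fresh random odd candidate, trial division by a handful of
  | small primes, then math/big's ProbablyPrime, which runs the
  | requested Miller-Rabin rounds plus a Baillie-PSW test); the
  | small-prime list and round count here are arbitrary choices:
  |
  |   package main
  |
  |   import (
  |       "crypto/rand"
  |       "fmt"
  |       "math/big"
  |   )
  |
  |   // Small primes for the cheap trial-division sieve.
  |   var smallPrimes = []int64{3, 5, 7, 11, 13, 17, 19, 23, 29,
  |       31, 37, 41, 43, 47}
  |
  |   func hasSmallFactor(p *big.Int) bool {
  |       m := new(big.Int)
  |       for _, q := range smallPrimes {
  |           if m.Mod(p, big.NewInt(q)).Sign() == 0 {
  |               return true
  |           }
  |       }
  |       return false
  |   }
  |
  |   func randomPrime(bits int) (*big.Int, error) {
  |       max := new(big.Int).Lsh(big.NewInt(1), uint(bits))
  |       for {
  |           // Fresh random candidate each time (rejection
  |           // sampling, not incrementing from a start point).
  |           p, err := rand.Int(rand.Reader, max)
  |           if err != nil {
  |               return nil, err
  |           }
  |           p.SetBit(p, bits-1, 1) // force full bit length
  |           p.SetBit(p, 0, 1)      // force odd
  |           if hasSmallFactor(p) {
  |               continue // cheaply rejected by the sieve
  |           }
  |           if p.ProbablyPrime(20) { // Miller-Rabin + BPSW
  |               return p, nil
  |           }
  |       }
  |   }
  |
  |   func main() {
  |       p, err := randomPrime(1024)
  |       if err != nil {
  |           panic(err)
  |       }
  |       fmt.Println(p.BitLen(), "bit probable prime")
  |   }
  |
  | Go's crypto/rand.Prime implements a similar loop (with a more
  | careful sieve), so real code would normally just call that.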
| rainsford wrote:
| It's been a while since I've looked at the literature on RSA
| prime generation, but I seem to remember that picking a random
| starting point and iterating until you find a prime is
| discouraged because primes aren't evenly distributed so key
| generation timing could reveal some information about your
| starting point and eventual prime choice.
|
| I'm not sure how realistic of an issue this is given the size
| of the primes involved. Even if an attacker can extract
| sensitive enough timing information to figure out exactly how
  | many iterations were required to find a 1024-bit prime from a
  | 1024-bit random starting point, I'm not aware of a good way to
| actually find either value. You do also introduce a bias since
| you're more likely to select prime numbers without a close
| neighbor in the direction you are iterating from, but again I'm
| not sure how practical an attack on this bias would be.
|
| Still, to avoid any potential risk there I seem to remember
| best practice being to just randomly generate numbers of the
| right size until you find a prime one. With the speed of modern
| RNGs, generating a fresh number each time vs iterating doesn't
| seem like a significant penalty.
| honzaik wrote:
  | It might be this: https://facthacks.cr.yp.to/fermat.html
  |
  | If N = p*q and |p - q| < sqrt(p), then it's easy to factor.
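  |
  | A rough Go sketch of Fermat's method, just to show how little
  | work it takes when |p - q| is tiny (the toy primes below are
  | picked for the example, not taken from anywhere):
  |
  |   package main
  |
  |   import (
  |       "fmt"
  |       "math/big"
  |   )
  |
  |   // fermatFactor tries to write n = a^2 - b^2 = (a+b)(a-b).
  |   // If p and q are close together, the first few values of a
  |   // just above sqrt(n) already work.
  |   func fermatFactor(n *big.Int, maxSteps int) (*big.Int, *big.Int, bool) {
  |       one := big.NewInt(1)
  |       a := new(big.Int).Sqrt(n) // floor(sqrt(n))
  |       a.Add(a, one)
  |       b2 := new(big.Int)
  |       b := new(big.Int)
  |       for i := 0; i < maxSteps; i++ {
  |           b2.Mul(a, a)
  |           b2.Sub(b2, n) // b^2 = a^2 - n
  |           b.Sqrt(b2)
  |           if new(big.Int).Mul(b, b).Cmp(b2) == 0 { // perfect square?
  |               return new(big.Int).Add(a, b), new(big.Int).Sub(a, b), true
  |           }
  |           a.Add(a, one)
  |       }
  |       return nil, nil, false
  |   }
  |
  |   func main() {
  |       // Toy modulus built from two deliberately close primes.
  |       n := new(big.Int).Mul(big.NewInt(10007), big.NewInt(10009))
  |       if p, q, ok := fermatFactor(n, 1000); ok {
  |           fmt.Println(p, q) // 10009 10007
  |       }
  |   }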
| maginx wrote:
  | Encountering this by chance is exceedingly unlikely, of
  | course, if p and q are randomly generated. In probability
  | terms it amounts to the top half of the difference p - q
  | being all zero bits, so roughly 2^(-n/4) where n is the bit
  | size of the modulus N. So for RSA-2048 the probability of
  | this happening is on the order of 2^-512, or in base-10 terms
  | 0.0000000...0000001, with roughly 150 zeros before the one!
| jdewerd wrote:
| Yes, excellent point! I originally omitted this detail for
| simplicity, but on reflection I don't think it actually
| achieved much in the way of simplifying the rule so I changed
| it to reflect reality. Thanks for pointing that out.
|
| EDIT: the rush of people offering up sieve optimizations is
| pushing me back towards formulating the rule of thumb on a
| consecutive block of numbers, since it makes it very clear
| that these are not included, rather than implicitly or
| explicitly including some subset of them (implicit is bad
| because opacity, explicit is bad because complexity).
| maginx wrote:
  | I've seen many implementations that relied on iterating (or
  | rather, they used a prime sieve, but it amounts to the same
  | thing in terms of the skewed distribution). While maybe not a
  | problem in practice, I always hated it - even for embedded
  | systems etc. I always used pure rejection sampling, with a
  | fresh random candidate in each iteration.
| af3d wrote:
  | Iterating over some huge search space in an essentially
  | sequential manner is generally not going to be nearly as
  | performant as simply selecting an odd number at random. You
  | could try using a generating polynomial instead, such as
  | f(x) = x^2 + x + 41, but even that isn't going to help much
  | in the long run. (There are Diophantine equations which may
  | one day prove useful for generating random primes; however,
  | AFAICT finding efficient solutions is still currently
  | considered a hard problem.)
| jdewerd wrote:
| Yes, but the more we mix sieve rejection into candidate
| selection the more we complicate the rule of thumb. "Reject
| even numbers as prime candidates" is probably OK to leave as
| an exercise for the reader, as is the equivalent "round every
| candidate to odd" optimization. The point about random vs
| sequential is well taken, though, and it doesn't complicate
| the rule of thumb, so I changed it.
| kevin_thibedeau wrote:
| Incrementing a bignum is faster than another PRNG cycle.
| loeg wrote:
  | Neither is a significant fraction of the time required to
  | reject a candidate. The _cheapest_ rejection test is "bignum
  | division by 3", and something like 2/3 of candidates will
  | need more expensive further tests.
|
| https://news.ycombinator.com/item?id=40093136
| FiloSottile wrote:
| > Rule of thumb: Want a 1024-bit prime? Try 1024 1024-bit
| candidates and you'll probably find one.
|
| Where probably is 76% [1], which is not that high depending on
| what you are doing. For example, you wouldn't be ok with
| GenerateKey failing 24% of the time.
|
| To get a better than even chance, 491 [2] 1024-bit candidates
| are enough.
|
| [1]:
| https://www.wolframalpha.com/input?i=1+-+%281+-+li%282%5E102...
  | (using li(x) as a slightly better approximation of π(x) than
  | x/ln(x), see [3])
|
| [2]:
| https://www.wolframalpha.com/input?i=1+-+%281+-+li%282%5E102...
|
| [3]: https://en.wikipedia.org/wiki/Prime-counting_function
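  |
  | (Writing out the quantity that [1] and [2] evaluate: if a
  | uniformly random 1024-bit candidate is prime with probability
  | roughly li(2^1024)/2^1024, i.e. about 1 in 709, then the
  | chance that at least one of k independent candidates is prime
  | is
  |
  |   1 - (1 - li(2^1024)/2^1024)^k
  |
  | which comes out to about 0.76 for k = 1024 and just over 1/2
  | for k = 491.)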
| JacobiX wrote:
| It is fascinating (to me at least) that almost all RSA
| implementations rely on a probabilistic primality test, even
| though the probability of picking a pseudoprime is extremely
| small.
| FiloSottile wrote:
  | There's extremely small (like 2^-20, the chance of winning
  | $50,000 with a $2 Powerball ticket [1]), and then there's
  | cryptographically negligible (like 2^-120, the false negative
  | chance of our primality test). The chance of something
  | cryptographically negligible happening is about the same as
  | the chance of the attacker guessing your key on the first
  | try, or of a cosmic ray flipping a CPU flag inverting the
  | result of "if !signature.verified { return err }".
|
| [1]: https://www.powerball.com/powerball-prize-chart
| mras0 wrote:
  | I happened to look at this recently, and while I understand
  | the argument (but not the math) for doing fewer Miller-Rabin
  | rounds, why would you do so in PRACTICAL settings? Unlike
  | ECC, you're likely only generating long-term keys, so shorter
  | key generation time seems like a bad tradeoff. Composite
  | candidates are going to be rejected early, so you're (with
  | high probability) not doing expensive calculations for most
  | candidates. My reading of [BSI B.5.2]
  | (https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publicat...)
  | confirms this.
|
| Of course random bit flips could interfere, but other
| measures should thwart this in high-stakes environments (at
| least to some degree).
| FiloSottile wrote:
  | The number of Miller-Rabin rounds has to be bounded, so if
  | you're not going to base your bound on reaching a
  | cryptographically negligible chance of false positives, what
  | are you going to base it on? Should we do 10? 15?
|
| The problem with "x should be enough, but why not do more?"
| arguments is that they can be applied recursively, and
| never answer the question "ok so when should we stop?"
| mras0 wrote:
| The BSI recommendation I linked says 60 times here
| (considering the worst case and the security level
| they're targeting). Just wondering why we'd want to do
| less for a (presumably rare) one-time operation.
| FiloSottile wrote:
| Why not do 120 then? We can show that the chance of false
| negative of 5 rounds is cryptographically negligible, so
| 5, 60, and 120 are all the same. If the only argument for
| 60 is that it's more and this is a really important rare
| operation, doesn't it also apply to 120?
|
| I'm not trying to be glib, there is no rational stopping
| point if we reject the objective threshold.
| mras0 wrote:
| I don't pretend to understand all the involved math, but
  | what I'm trying to say is that, as far as I understand, T
  | rounds give a 4^-T probability that we've chosen a "bad"
  | prime (really a composite) per normal Miller-Rabin in the
  | worst case. Doing ~5 rounds has been shown to be enough to
  | choose a good prime candidate under most (very probable!)
  | conditions when we pick it at random, and thus the argument
  | is that ~5 rounds is fine. We agree so far?
|
| I'm just asking, why not run the conservative 60 round
| test, rather than ~5 when you're doing a very rare, one
| time, key generation? I understand that it's very
| unlikely to reject any numbers, but at least BSI thinks
| it's worth it for important keys.
|
| If I understand the recommendation right, you wouldn't do
| 60 for a 2048 bit key and then 120 for 4096, rather 61
| rounds would be enough for 4096 if 120 is alright for
| 2048.
| maginx wrote:
| As I understand it, the number of rounds needed (for a
| given acceptable failure probability) goes down the
| larger the number is. Note (very importantly) that this
| is assuming you are testing a RANDOM integer for
  | primality. If you are given an integer from a potentially
  | malicious source, you need to do the full number of rounds
  | for the given security level.
| FiloSottile wrote:
| You're asking why not apply the formula for adversarially
| selected candidates even if we are randomly selecting
| candidates. There is simply no reason to, except "maybe
| we made a mistake" but then why would we not think we
| made a mistake also in calculating the 1/4 value, or in
| any other part of the code?
|
| Phrased another way, do you have an argument for _why_
| run the conservative 60 round test, instead of asking for
| an argument for _why not_ run it?
|
| Again, you are "very unlikely" to win the Powerball
| jackpot. Rounds 6-60 have a cryptographically negligible
| chance of rejecting a composite. It's different,
| otherwise we'd have to worry about the "very unlikely"
| chance of the attacker guessing an AES-128 key on the
| first try.
|
| (I don't follow you on the key sizes, if you apply the
| 1/4 probability, the candidate size is irrelevant.)
| maginx wrote:
| The performance of this can matter in some scenarios. In
| embedded systems, smart cards etc., generating the primes
  | can take a significant amount of time (15-20 seconds is
  | typical) even with the 'low' number of iterations. More
  | iterations mean the user will have to wait even longer.
  | Actually, in such systems it is not unusual to see occasional
  | time-out errors when the smart card is unlucky with finding
  | the primes, and the timeout value may be a fixed, general
  | value in a part of the stack that is difficult to change for
  | the specific call.
|
  | Another case is short-term keys/certificates, where a fresh
  | key pair is generated, and a cert issued, for each and every
  | signature. This setup makes revocation easier to handle (the
  | certificate typically has a lifetime of a few minutes, so it
  | is 'revoked' almost immediately).
|
  | There are also scenarios where the keys are generated on a
  | central system (HSMs etc.) to be injected into a production
  | line or similar. There, performance can matter as well. I
  | worked on a system where HSMs were used for this. They could
  | typically generate an RSA key in 1-2 seconds, so 12 HSMs were
  | needed to keep up with demand - and this was again with the
  | reduced number of rounds.
___________________________________________________________________
(page generated 2025-01-03 23:01 UTC)