[HN Gopher] 15,000 lines of verified cryptography now in Python
___________________________________________________________________
15,000 lines of verified cryptography now in Python
Author : todsacerdoti
Score : 172 points
Date : 2025-04-18 19:28 UTC (3 hours ago)
(HTM) web link (jonathan.protzenko.fr)
(TXT) w3m dump (jonathan.protzenko.fr)
| hall0ween wrote:
| I'm cryptographically ignorant. What does this mean for python?
| otterley wrote:
| https://en.wikipedia.org/wiki/Formal_verification
|
| Provable correctness is important in many different application
| contexts, especially cryptography. The benefit for Python (and
| for any other user of the library in question) is fewer bugs
| and thus greater security assurance.
|
| If you look at the original bug that spawned this work
| (https://github.com/python/cpython/issues/99108), the poster
| reported:
|
| """As evidenced by the recent SHA3 buffer overflow,
| cryptographic primitives are tricky to implement correctly.
| There might be issues with memory management, exceeding
| lengths, incorrect buffer management, or worse, incorrect
| implementations in corner cases.
|
| The HACL* project https://github.com/hacl-star/hacl-star
| provides verified implementations of cryptographic primitives.
| These implementations are mathematically shown to be:
| memory safe (no buffer overflows, no use-after-free)
| functionally correct (they always compute the right result)
| side-channel resistant (the most egregious variants of side-
| channels, such as memory and timing leaks, are ruled out by
| construction)."""
| jessekv wrote:
| More context here.
|
| https://github.com/python/cpython/issues/99108#issuecomment-...
| aseipp wrote:
| Extreme ELI5 TL;DR: Your Python programs using the cpython
| interpreter and its built in cryptographic modules will now be
| using safer and faster, with no need for you to do anything.
| odo1242 wrote:
| Is it faster? I'm pretty sure the main goal of this effort is
| just the "safer" part.
| aseipp wrote:
| The goal is to make things safer, yes, but speed is
| absolutely a major priority for the project and a
| requirement for production deployment, because the
| difference in speed for optimized designs vs naive ones
| might be an order of magnitude or more. It's quite speedy
| IME. To be fair to your point, though, it's also a moving
| target; "which is faster" can change as improvements
| trickle in. Maybe "shouldn't be much slower" is a better
| statement, but I was speaking generally :)
|
| (You could also make the argument that if previous
| implementations that were replaced were insecure that their
| relative performance is ultimately irrelevant, but I'm
| ignoring that since it hinges on a known unknown.)
| drewcoo wrote:
| And safer is often slower to avoid timing attacks.
| aseipp wrote:
| I mean, most if not all of the code they're replacing
| (e.g. the vendored and vectorized Blake2 code) is also
| going to be designed and optimized with timing attacks in
| mind and implemented to avoid them. CVE-2022-37454 was
| literally found in a component that was optimized and had
| timing attack resistance in mind (XKCP from the SHA-3
| authors). "Code that is already known to be wrong" is not
| really a meaningful baseline to compare against.
| protz wrote:
| the performance was pretty much identical, however, as an
| added benefit, Blake2 got quite a bit faster due a
| combination of 1) our code being slightly more optimized,
| and 2) python's blake2 integration not actually doing
| runtime cpu detection, meaning that unless you compiled
| with -march=native (like, Gentoo), at the time, you were
| not getting the AVX2 version of Blake2 within Python -- my
| PR fixed that and added code for CPU autodetection
|
| bear in mind that performance is tricky -- see my comment
| about NEON https://github.com/python/cpython/pull/119316#is
| suecomment-2...
| kccqzy wrote:
| Who uses its built-in cryptographic modules though? I wrote a
| fair bit of cryptographic code in Python more than five years
| ago and most people recommended the cryptography package
| https://pypi.org/project/cryptography/ over whatever that's
| built-in.
| Retr0id wrote:
| I'm a big fan of pyca/cryptography and I use it for any
| serious project, but if I _just_ need hashing I tend to use
| the standard library hashlib - it saves a somewhat heavy
| dependency, and the API is less verbose.
|
| Also, pyca/cryptography uses OpenSSL. OpenSSL is fine, but
| has the same "problem" as the previous python stdlib
| implementation. (Personally I think it's an acceptable
| risk. If anything, swapping in 15,000 lines of new code is
| the greater risk, even if it's "verified")
| hathawsh wrote:
| Here is how I read it: CPython has a hashlib module that has
| long been a wrapper around certain functions exported from
| OpenSSL, but since a SHA3 buffer overflow was discovered in
| OpenSSL, the CPython project has now decided to wrap a library
| called HACL*, which uses formal verification to ensure there
| are no buffer overflows and other similar bugs.
| thebeardisred wrote:
| Lines of code is a pretty poor measurement. Doubly so when you're
| boasting a large number in the context of cryptographic code.
| bbstats wrote:
| the 2nd sentence:
|
| "As of last week, this issue is now closed, and every single
| hash and HMAC algorithm exposed by default in Python is now
| provided by HACL*, the verified cryptographic library."
|
| it's describing coverage.
| bbeonx wrote:
| This isn't really a metric. It's a description to help the
| reader understand the magnitude of effort that went into this
| project. SLoC is a bad metric for plenty of things, but I think
| it's fine for quickly conveying the scope of a project to a
| blog reader.
| odyssey7 wrote:
| Lines of code is also a poor indicator for "magnitude of
| effort."
|
| Tangent: generally I'm more inclined to believe quality is
| improved when someone claims 1000s of lines reduced rather
| than 1000s of lines written.
| xeromal wrote:
| See: AI generating 1000s of lines
| chaitimes wrote:
| From my experience, pre LLMs, it was a valid proxy metric
| for effort
| aseipp wrote:
| It isn't boasting about anything, it's a straightforward
| description of the 2.5 years of effort they went through for
| this project, and some of the more annoying "production
| integration" bits.
| titaphraz wrote:
| > Lines of code is a pretty poor measurement
|
| But let's say at least that it's abused, exploited, perverted
| and a molested measurement before we call it a poor one. It not
| all that bad when you have context.
| Tarucho wrote:
| Not saying it isn't useful but with this change Python's crypto
| depends on Microsoft (https://fstar-lang.org/ and
| https://project-everest.github.io/)
| NavinF wrote:
| > HACL* is a formally verified library of modern cryptographic
| algorithms, where each primitive is verified for memory safety,
| functional correctness, and secret independence. HACL* provides
| efficient, readable, standalone C code for each algorithm that
| can be easily integrated into any C project.
|
| > All the code in this repository is released under an Apache
| 2.0 license. The generated C code from HACL* is also released
| under an MIT license.
|
| You have an unusual definition of "depends on Microsoft".
| Anyone worried about depending on Microsoft should be able to
| maintain 15k lines of C that are already formally verified.
| Python already vendored the code so who cares who wrote that
| code?
| Tarucho wrote:
| life is more than code quality my simpleton
| IlikeKitties wrote:
| Modern, ubiquitous cryptography is now practically unbreakable
| (even for the NSA) and widely used. The crypto wars of the 90s
| seem rather quaint. Any thoughts on the effects on society this
| now has?
| xyzzy123 wrote:
| It's cool that we can largely "strike out" link level
| wiretapping from our threat models but it just means attackers
| move on to the endpoints. I have a wonderfully private link to
| google, my bank and my crypto exchange but all of those are
| coerced to report everything I do.
| imiric wrote:
| Why are you so certain of this? The NSA has a long history of
| attempting to insert backdoors in cryptographic systems. Most
| recently they bribed RSA to make their compromised PRNG the
| default, which shipped in software as late as 2014, possibly
| even later.
|
| And these are just the attempts we know about. What makes you
| think that they haven't succeeded and we just don't know about
| it?
| IlikeKitties wrote:
| We know from the Snowden Leaks that they couldn't crack PGP
| at the time. There's been some talks about cracking "export
| grade" cryptography and how that is done. I'm pretty
| confident that the newer hardened crypto libraries are pretty
| secure and since even the NSA recommends signal encryption
| now because the infrastructure in the US has so many holes
| the Chinese are in the entire mobile internet infrastructure,
| that's a pretty strong lead that it's very hard if not
| impossible to crack signal, at least for the Chinese.
|
| It's also very likely that even if they can attack crypto in
| ways we don't know about, they can at best reduce the
| required time it takes to crack a given key. Chosing extra
| long keys and changing them frequently should protect you
| from lots of attacks.
| mmooss wrote:
| > What makes you think that they haven't succeeded and we
| just don't know about it?
|
| Yes, afaik they also have a history of not revealing exploits
| they discover. With a budget in the tens of billions and tens
| of thousands of employees, they have the resources to
| discover quite a bit.
| anon6362 wrote:
| This is so vague as to be meaningless because body of research
| (attacks and hw-accelerated implementations), details,
| implementations, uses, and adversary/ies budget(s) matter.
| Furthermore, the obsession with optimizing for "fast"
| cryptographic algorithms/implementations undermine themselves
| when it comes to the cost of CPU-bound brute force attack which
| become cheaper and more attainable over time.
| gpcz wrote:
| For now. If someone makes a practical quantum computer, pretty
| much every asymmetric primitive we use at the start of a
| cryptographic protocol to make a shared secret for symmetric
| encryption will be breakable.
| dist-epoch wrote:
| The switch to post-quantum encryption already started -
| Signal, Chrome, iMessage
| jakeogh wrote:
| I just got bit by device attestation. Tried to install the
| latest ebay app version via the Aurora Store (on GrapheneOS),
| and instead of presenting me with the option of using my ebay
| login, it wanted a google account at a play store login with no
| way to bypass. I was able to downgrade to the previous version
| which does not require the Integrity API, but the walls are
| closing in. Only took 7 months:
| https://news.ycombinator.com/item?id=41517159 (I did get ebay
| on the phone and filed an issue, hopefully others do the same)
| Retr0id wrote:
| Will this bring support for "streaming" output from SHAKE?
|
| Here's the (recently closed!) issue for pyca/cryptography
| regarding this feature, although I can't find an equivalent issue
| for the python stdlib:
| https://github.com/pyca/cryptography/issues/9185
|
| Edit: Found a relevant issue, "closed as not planned":
| https://github.com/python/cpython/issues/82198
| tsecurity wrote:
| How much of the development of this was verified, and what does
| that consist of?
|
| I worry a little when I read that things are verified and were
| hard.
| rtkwe wrote:
| The first two shouldn't matter because the entirety of the code
| is verified and anyone can check the verification. The last is
| an issue with any cryptography but verification doesn't try to
| address that only that the code does precisely and only what
| it's supposed to; ie it should guarantee that there are no
| exploits possible against that library.
| aseipp wrote:
| https://eprint.iacr.org/2017/536.pdf is the relevant paper that
| introduces the project and its broad design. Figure 1 on page 3
| is a good place to look.
| bgwalter wrote:
| Reading the article, they took a verified C library generated
| from F* from Microsoft, vendored the code in CPython and wrote a
| C extension.
|
| And during the process they discovered that the original library
| did not handle allocation failures?
|
| What is the big deal about Python here? It's just another wrapped
| C library.
| anon6362 wrote:
| In general, it's good practice to use best components
| available, regardless of source. Cryptographic components
| especially shouldn't be DIY.
|
| It would be cool if Ruby did something similar for
| stdlib/openssl, but it could be also be done in a gem too.
| quantumgarbage wrote:
| A seamless, drop-in replacement, for all python users is
| actually a pretty good feat tbh
| nine_k wrote:
| > _just another wrapped C library._
|
| I suppose you mean "just another top-notch library available in
| the Python ecosystem"? :)
| chrisrodrigue wrote:
| There's no mention of what Python version this is actually in.
|
| After some digging, it looks like the answer is 3.14 [0], so we
| won't be seeing this until October [1].
|
| One could argue that this is a security fix (just read the first
| sentence of the blog post) and should be included in all the
| currently supported versions (3.9+) of Python [2].
|
| [0]
| https://github.com/python/cpython/blob/main/Doc/whatsnew/3.1...
|
| [1] https://peps.python.org/pep-0745/
|
| [2] https://devguide.python.org/versions/
___________________________________________________________________
(page generated 2025-04-18 23:00 UTC)