[HN Gopher] 15,000 lines of verified cryptography now in Python
       ___________________________________________________________________
        
       15,000 lines of verified cryptography now in Python
        
       Author : todsacerdoti
       Score  : 172 points
       Date   : 2025-04-18 19:28 UTC (3 hours ago)
        
 (HTM) web link (jonathan.protzenko.fr)
 (TXT) w3m dump (jonathan.protzenko.fr)
        
       | hall0ween wrote:
       | I'm cryptographically ignorant. What does this mean for python?
        
         | otterley wrote:
         | https://en.wikipedia.org/wiki/Formal_verification
         | 
         | Provable correctness is important in many different application
         | contexts, especially cryptography. The benefit for Python (and
         | for any other user of the library in question) is fewer bugs
         | and thus greater security assurance.
         | 
         | If you look at the original bug that spawned this work
         | (https://github.com/python/cpython/issues/99108), the poster
         | reported:
         | 
         | """As evidenced by the recent SHA3 buffer overflow,
         | cryptographic primitives are tricky to implement correctly.
         | There might be issues with memory management, exceeding
         | lengths, incorrect buffer management, or worse, incorrect
         | implementations in corner cases.
         | 
         | The HACL* project https://github.com/hacl-star/hacl-star
         | provides verified implementations of cryptographic primitives.
         | These implementations are mathematically shown to be:
         | memory safe (no buffer overflows, no use-after-free)
         | functionally correct (they always compute the right result)
         | side-channel resistant (the most egregious variants of side-
         | channels, such as memory and timing leaks, are ruled out by
         | construction)."""
        
         | jessekv wrote:
         | More context here.
         | 
         | https://github.com/python/cpython/issues/99108#issuecomment-...
        
         | aseipp wrote:
         | Extreme ELI5 TL;DR: Your Python programs using the cpython
         | interpreter and its built in cryptographic modules will now be
         | using safer and faster, with no need for you to do anything.
        
           | odo1242 wrote:
           | Is it faster? I'm pretty sure the main goal of this effort is
           | just the "safer" part.
        
             | aseipp wrote:
             | The goal is to make things safer, yes, but speed is
             | absolutely a major priority for the project and a
             | requirement for production deployment, because the
             | difference in speed for optimized designs vs naive ones
             | might be an order of magnitude or more. It's quite speedy
             | IME. To be fair to your point, though, it's also a moving
             | target; "which is faster" can change as improvements
             | trickle in. Maybe "shouldn't be much slower" is a better
             | statement, but I was speaking generally :)
             | 
             | (You could also make the argument that if previous
             | implementations that were replaced were insecure that their
             | relative performance is ultimately irrelevant, but I'm
             | ignoring that since it hinges on a known unknown.)
        
             | drewcoo wrote:
             | And safer is often slower to avoid timing attacks.
        
               | aseipp wrote:
               | I mean, most if not all of the code they're replacing
               | (e.g. the vendored and vectorized Blake2 code) is also
               | going to be designed and optimized with timing attacks in
               | mind and implemented to avoid them. CVE-2022-37454 was
               | literally found in a component that was optimized and had
               | timing attack resistance in mind (XKCP from the SHA-3
               | authors). "Code that is already known to be wrong" is not
               | really a meaningful baseline to compare against.
        
             | protz wrote:
             | the performance was pretty much identical, however, as an
             | added benefit, Blake2 got quite a bit faster due a
             | combination of 1) our code being slightly more optimized,
             | and 2) python's blake2 integration not actually doing
             | runtime cpu detection, meaning that unless you compiled
             | with -march=native (like, Gentoo), at the time, you were
             | not getting the AVX2 version of Blake2 within Python -- my
             | PR fixed that and added code for CPU autodetection
             | 
             | bear in mind that performance is tricky -- see my comment
             | about NEON https://github.com/python/cpython/pull/119316#is
             | suecomment-2...
        
           | kccqzy wrote:
           | Who uses its built-in cryptographic modules though? I wrote a
           | fair bit of cryptographic code in Python more than five years
           | ago and most people recommended the cryptography package
           | https://pypi.org/project/cryptography/ over whatever that's
           | built-in.
        
             | Retr0id wrote:
             | I'm a big fan of pyca/cryptography and I use it for any
             | serious project, but if I _just_ need hashing I tend to use
             | the standard library hashlib - it saves a somewhat heavy
             | dependency, and the API is less verbose.
             | 
             | Also, pyca/cryptography uses OpenSSL. OpenSSL is fine, but
             | has the same "problem" as the previous python stdlib
             | implementation. (Personally I think it's an acceptable
             | risk. If anything, swapping in 15,000 lines of new code is
             | the greater risk, even if it's "verified")
        
         | hathawsh wrote:
         | Here is how I read it: CPython has a hashlib module that has
         | long been a wrapper around certain functions exported from
         | OpenSSL, but since a SHA3 buffer overflow was discovered in
         | OpenSSL, the CPython project has now decided to wrap a library
         | called HACL*, which uses formal verification to ensure there
         | are no buffer overflows and other similar bugs.
        
       | thebeardisred wrote:
       | Lines of code is a pretty poor measurement. Doubly so when you're
       | boasting a large number in the context of cryptographic code.
        
         | bbstats wrote:
         | the 2nd sentence:
         | 
         | "As of last week, this issue is now closed, and every single
         | hash and HMAC algorithm exposed by default in Python is now
         | provided by HACL*, the verified cryptographic library."
         | 
         | it's describing coverage.
        
         | bbeonx wrote:
         | This isn't really a metric. It's a description to help the
         | reader understand the magnitude of effort that went into this
         | project. SLoC is a bad metric for plenty of things, but I think
         | it's fine for quickly conveying the scope of a project to a
         | blog reader.
        
           | odyssey7 wrote:
           | Lines of code is also a poor indicator for "magnitude of
           | effort."
           | 
           | Tangent: generally I'm more inclined to believe quality is
           | improved when someone claims 1000s of lines reduced rather
           | than 1000s of lines written.
        
             | xeromal wrote:
             | See: AI generating 1000s of lines
        
             | chaitimes wrote:
             | From my experience, pre LLMs, it was a valid proxy metric
             | for effort
        
         | aseipp wrote:
         | It isn't boasting about anything, it's a straightforward
         | description of the 2.5 years of effort they went through for
         | this project, and some of the more annoying "production
         | integration" bits.
        
         | titaphraz wrote:
         | > Lines of code is a pretty poor measurement
         | 
         | But let's say at least that it's abused, exploited, perverted
         | and a molested measurement before we call it a poor one. It not
         | all that bad when you have context.
        
       | Tarucho wrote:
       | Not saying it isn't useful but with this change Python's crypto
       | depends on Microsoft (https://fstar-lang.org/ and
       | https://project-everest.github.io/)
        
         | NavinF wrote:
         | > HACL* is a formally verified library of modern cryptographic
         | algorithms, where each primitive is verified for memory safety,
         | functional correctness, and secret independence. HACL* provides
         | efficient, readable, standalone C code for each algorithm that
         | can be easily integrated into any C project.
         | 
         | > All the code in this repository is released under an Apache
         | 2.0 license. The generated C code from HACL* is also released
         | under an MIT license.
         | 
         | You have an unusual definition of "depends on Microsoft".
         | Anyone worried about depending on Microsoft should be able to
         | maintain 15k lines of C that are already formally verified.
         | Python already vendored the code so who cares who wrote that
         | code?
        
           | Tarucho wrote:
           | life is more than code quality my simpleton
        
       | IlikeKitties wrote:
       | Modern, ubiquitous cryptography is now practically unbreakable
       | (even for the NSA) and widely used. The crypto wars of the 90s
       | seem rather quaint. Any thoughts on the effects on society this
       | now has?
        
         | xyzzy123 wrote:
         | It's cool that we can largely "strike out" link level
         | wiretapping from our threat models but it just means attackers
         | move on to the endpoints. I have a wonderfully private link to
         | google, my bank and my crypto exchange but all of those are
         | coerced to report everything I do.
        
         | imiric wrote:
         | Why are you so certain of this? The NSA has a long history of
         | attempting to insert backdoors in cryptographic systems. Most
         | recently they bribed RSA to make their compromised PRNG the
         | default, which shipped in software as late as 2014, possibly
         | even later.
         | 
         | And these are just the attempts we know about. What makes you
         | think that they haven't succeeded and we just don't know about
         | it?
        
           | IlikeKitties wrote:
           | We know from the Snowden Leaks that they couldn't crack PGP
           | at the time. There's been some talks about cracking "export
           | grade" cryptography and how that is done. I'm pretty
           | confident that the newer hardened crypto libraries are pretty
           | secure and since even the NSA recommends signal encryption
           | now because the infrastructure in the US has so many holes
           | the Chinese are in the entire mobile internet infrastructure,
           | that's a pretty strong lead that it's very hard if not
           | impossible to crack signal, at least for the Chinese.
           | 
           | It's also very likely that even if they can attack crypto in
           | ways we don't know about, they can at best reduce the
           | required time it takes to crack a given key. Chosing extra
           | long keys and changing them frequently should protect you
           | from lots of attacks.
        
           | mmooss wrote:
           | > What makes you think that they haven't succeeded and we
           | just don't know about it?
           | 
           | Yes, afaik they also have a history of not revealing exploits
           | they discover. With a budget in the tens of billions and tens
           | of thousands of employees, they have the resources to
           | discover quite a bit.
        
         | anon6362 wrote:
         | This is so vague as to be meaningless because body of research
         | (attacks and hw-accelerated implementations), details,
         | implementations, uses, and adversary/ies budget(s) matter.
         | Furthermore, the obsession with optimizing for "fast"
         | cryptographic algorithms/implementations undermine themselves
         | when it comes to the cost of CPU-bound brute force attack which
         | become cheaper and more attainable over time.
        
         | gpcz wrote:
         | For now. If someone makes a practical quantum computer, pretty
         | much every asymmetric primitive we use at the start of a
         | cryptographic protocol to make a shared secret for symmetric
         | encryption will be breakable.
        
           | dist-epoch wrote:
           | The switch to post-quantum encryption already started -
           | Signal, Chrome, iMessage
        
         | jakeogh wrote:
         | I just got bit by device attestation. Tried to install the
         | latest ebay app version via the Aurora Store (on GrapheneOS),
         | and instead of presenting me with the option of using my ebay
         | login, it wanted a google account at a play store login with no
         | way to bypass. I was able to downgrade to the previous version
         | which does not require the Integrity API, but the walls are
         | closing in. Only took 7 months:
         | https://news.ycombinator.com/item?id=41517159 (I did get ebay
         | on the phone and filed an issue, hopefully others do the same)
        
       | Retr0id wrote:
       | Will this bring support for "streaming" output from SHAKE?
       | 
       | Here's the (recently closed!) issue for pyca/cryptography
       | regarding this feature, although I can't find an equivalent issue
       | for the python stdlib:
       | https://github.com/pyca/cryptography/issues/9185
       | 
       | Edit: Found a relevant issue, "closed as not planned":
       | https://github.com/python/cpython/issues/82198
        
       | tsecurity wrote:
       | How much of the development of this was verified, and what does
       | that consist of?
       | 
       | I worry a little when I read that things are verified and were
       | hard.
        
         | rtkwe wrote:
         | The first two shouldn't matter because the entirety of the code
         | is verified and anyone can check the verification. The last is
         | an issue with any cryptography but verification doesn't try to
         | address that only that the code does precisely and only what
         | it's supposed to; ie it should guarantee that there are no
         | exploits possible against that library.
        
         | aseipp wrote:
         | https://eprint.iacr.org/2017/536.pdf is the relevant paper that
         | introduces the project and its broad design. Figure 1 on page 3
         | is a good place to look.
        
       | bgwalter wrote:
       | Reading the article, they took a verified C library generated
       | from F* from Microsoft, vendored the code in CPython and wrote a
       | C extension.
       | 
       | And during the process they discovered that the original library
       | did not handle allocation failures?
       | 
       | What is the big deal about Python here? It's just another wrapped
       | C library.
        
         | anon6362 wrote:
         | In general, it's good practice to use best components
         | available, regardless of source. Cryptographic components
         | especially shouldn't be DIY.
         | 
         | It would be cool if Ruby did something similar for
         | stdlib/openssl, but it could be also be done in a gem too.
        
         | quantumgarbage wrote:
         | A seamless, drop-in replacement, for all python users is
         | actually a pretty good feat tbh
        
         | nine_k wrote:
         | > _just another wrapped C library._
         | 
         | I suppose you mean "just another top-notch library available in
         | the Python ecosystem"? :)
        
       | chrisrodrigue wrote:
       | There's no mention of what Python version this is actually in.
       | 
       | After some digging, it looks like the answer is 3.14 [0], so we
       | won't be seeing this until October [1].
       | 
       | One could argue that this is a security fix (just read the first
       | sentence of the blog post) and should be included in all the
       | currently supported versions (3.9+) of Python [2].
       | 
       | [0]
       | https://github.com/python/cpython/blob/main/Doc/whatsnew/3.1...
       | 
       | [1] https://peps.python.org/pep-0745/
       | 
       | [2] https://devguide.python.org/versions/
        
       ___________________________________________________________________
       (page generated 2025-04-18 23:00 UTC)