[HN Gopher] Removing PGP from PyPI
       ___________________________________________________________________
        
       Removing PGP from PyPI
        
       Author : harporoeder
       Score  : 39 points
       Date   : 2024-10-17 20:03 UTC (2 hours ago)
        
 (HTM) web link (blog.pypi.org)
 (TXT) w3m dump (blog.pypi.org)
        
       | jacques_chester wrote:
       | I performed a similar analysis on RubyGems and found that of the
       | top 10k most-downloaded gems, less than one percent had valid
       | signatures. That plus the general hassle of managing key material
       | means that this was a dead-end for large scale adoption.
       | 
       | I'm still hopeful that sigstore will see wide adoption and bring
       | authorial attestation (code signing) to the masses.
        
         | wnissen wrote:
         | I agree, where is the LetsEncrypt for signing? Something you
         | could download and get running in literally a minute.
        
           | arccy wrote:
           | sigstore https://docs.sigstore.dev/quickstart/quickstart-
           | cosign/#exam...
        
             | Diti wrote:
             | Specifically, the CA signing the code certificates (that
             | are valid for 10 minutes) is
             | https://github.com/sigstore/fulcio.
        
       | bjornsing wrote:
       | Does it matter much if the key can be verified? I mean it seems
       | like a pretty big step up security wise to know that a new
       | version of a package is signed with the same key as previous
       | versions.
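
A minimal sketch of the key-continuity (trust on first use) check described above. It assumes the signature itself has already been verified elsewhere and only compares key fingerprints across releases. `KeyPinStore` and the SHA-256 "fingerprint" are hypothetical stand-ins for illustration, not OpenPGP's real fingerprint format or anything PyPI ships.

```python
import hashlib


def fingerprint(public_key_bytes: bytes) -> str:
    """Stand-in fingerprint: SHA-256 of the raw key bytes (an assumption,
    not how OpenPGP fingerprints are actually computed)."""
    return hashlib.sha256(public_key_bytes).hexdigest()


class KeyPinStore:
    """Remembers the first signing key seen for each package."""

    def __init__(self) -> None:
        self._pins: dict[str, str] = {}

    def check(self, package: str, public_key_bytes: bytes) -> str:
        fp = fingerprint(public_key_bytes)
        pinned = self._pins.get(package)
        if pinned is None:
            # First release we have seen: pin the key without verifying identity.
            self._pins[package] = fp
            return "first-use"
        return "match" if pinned == fp else "mismatch"


store = KeyPinStore()
print(store.check("example-pkg", b"key-A"))  # first release: key gets pinned
print(store.check("example-pkg", b"key-A"))  # later release signed with the same key
print(store.check("example-pkg", b"key-B"))  # later release signed with a different key
```

A "mismatch" only says the key changed; whether that is a compromise or a legitimate key rotation still has to be decided out of band.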
        
       | opello wrote:
       | Wouldn't another very good answer be for PyPI to have a keyserver
       | and require keys be sent to it for them to be used in package
       | publishing?
        
         | zitterbewegung wrote:
         | From here: https://caremad.io/posts/2013/07/packaging-signing-
         | not-holy-... (which is linked from the article): since PyPI has
         | so many packages and anyone can sign up to add one, it would be
         | extremely unmanageable.
        
           | opello wrote:
           | That's fair and I appreciate that detail even without having
           | followed the link in the original article. But while not
           | being "the holy grail" why must the perfect be the enemy of
           | the good, if it was providing a value?
           | 
           | I certainly allow for the "if it was providing a value" to be
           | a gargantuan escape hatch through which any other perspective
           | may be removed.
           | 
           | But by highlighting the difficulty in verifying signatures
           | and saying it was because the keys were hard to find (or may
           | have been expired or other signing errors per the footnote) a
           | fairly straightforward solution presents itself: add
           | keyserver infrastructure, check it when signed packages are
           | posted, reject if key verification fails using that
           | keyserver.
           | 
           | All told it seems like it wasn't providing a value, so
           | throwing more resources at the effort was not done. But
           | something about highlighting how "keys being hard to find"
           | helped justify the action doesn't quite pass muster to my
           | mind.
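
The upload-time check proposed above could be sketched roughly as follows. Everything here is an assumption for illustration: `RegisteredKey`, `check_upload`, and the in-memory `registry` dict are hypothetical names, and the actual signature check is left as a pluggable `verify` callable where a real OpenPGP library would go.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class RegisteredKey:
    key_id: str
    public_key: bytes
    expires_at: Optional[datetime]  # None means the key never expires


# verify(public_key, artifact, signature) -> bool
# Placeholder type: a real system would call an OpenPGP implementation here.
Verifier = Callable[[bytes, bytes, bytes], bool]


def check_upload(
    registry: dict[str, RegisteredKey],
    key_id: str,
    artifact: bytes,
    signature: bytes,
    verify: Verifier,
    now: Optional[datetime] = None,
) -> str:
    """Accept a signed upload only if its key is registered, unexpired, and verifies."""
    now = now or datetime.now(timezone.utc)
    key = registry.get(key_id)
    if key is None:
        return "reject: unknown key"
    if key.expires_at is not None and key.expires_at <= now:
        return "reject: expired key"
    if not verify(key.public_key, artifact, signature):
        return "reject: bad signature"
    return "accept"


# Usage with a fake verifier (NOT real cryptography, just to exercise the flow).
def fake_verify(public_key: bytes, artifact: bytes, signature: bytes) -> bool:
    return signature == b"good"


registry = {"k1": RegisteredKey("k1", b"public-key-bytes", None)}
print(check_upload(registry, "k1", b"pkg-1.0.tar.gz", b"good", fake_verify))
print(check_upload(registry, "k9", b"pkg-1.0.tar.gz", b"good", fake_verify))
```

This only shows the gatekeeping logic; it says nothing about the cost of running and securing the key registry itself, which is the objection raised in the replies.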
        
         | wnissen wrote:
         | Wouldn't that make the maintenance burden worse? Now PyPI has
         | to host a keyserver, with its own attack surface. And
         | presumably 99.7% of the keys would be only for PyPI, so folks
         | would have no incentive to secure them. The two modes that work
         | are either no signing, or mandatory signing like many app
         | stores. Obviously the middle way is the worst of both worlds,
         | no security for 99+% of packages, but all the maintenance
         | headache. And mandatory signing raises the possibility that
         | PyPI would be replaced by an alternate repository that's easier
         | to contribute to. The open source world depends to a shocking
         | degree on the volunteer labor of people who have a lot of
         | things they could be doing with their time, and a "small" speed
         | bump for enhanced security can have knock-on effects that are
         | not small.
        
           | opello wrote:
           | Sure, it all hinges on whether the signatures provided any
           | value. And it seems to be the conclusion that it didn't.
           | 
           | Without something showing "keyservers present an untenable
           | risk" and Debian, Ubuntu, Launchpad, others have keyserver
           | infrastructure, it seems like too far of a conclusion to
           | reach casually. But of course, it adds attack surface for the
           | simple fact that a public facing thing was stood up where
           | once it was not. Though that isn't the kind of trivial
           | conclusion I imagine you had in mind.
           | 
           | I don't see why there's a binary choice between "signing is
           | no longer supported" and "signing is mandatory" when before
           | that wasn't the case. If it truly provided no value, or so
           | small a value with so high a maintenance burden that it
           | harmed the project that way, then it makes sense--but that
           | didn't seem to be the place from which the article argued.
        
       | bee_rider wrote:
       | I dunno, not all projects are equally important or popular, so it
       | seems to me that the number of _downloads_ which had keys is the
       | better metric to look at.
       | 
       | But, if there are fundamental issues with the key system anyway,
       | the percentages don't matter anyway.
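
To make the package-count versus download-count distinction concrete, here is a small sketch. The package names and download figures are made up for illustration and are not PyPI data.

```python
from dataclasses import dataclass


@dataclass
class Package:
    name: str
    downloads: int
    signed: bool


def share_of_packages_signed(packages: list[Package]) -> float:
    """Fraction of packages that carry a signature (each package counts once)."""
    return sum(p.signed for p in packages) / len(packages)


def share_of_downloads_signed(packages: list[Package]) -> float:
    """Fraction of all downloads that went to signed packages."""
    total = sum(p.downloads for p in packages)
    return sum(p.downloads for p in packages if p.signed) / total


# Hypothetical numbers: one popular signed package, three small unsigned ones.
packages = [
    Package("popular-lib", 9_000, True),
    Package("small-a", 500, False),
    Package("small-b", 300, False),
    Package("small-c", 200, False),
]

print(f"signed share by package count: {share_of_packages_signed(packages):.0%}")
print(f"signed share by downloads:     {share_of_downloads_signed(packages):.0%}")
```

With these invented figures the two metrics diverge sharply (25% of packages, 90% of downloads), which is the effect the comment is pointing at. Whether real PyPI data looks like this is a separate question.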
        
       | hiatus wrote:
       | Should be tagged 2023.
        
       | badgersnake wrote:
       | Most people do security badly so let's not do it at all.
       | 
       | Right.
        
         | otikik wrote:
         | Unfortunately we live in a world of limited time and resources,
         | and priorities need to be adjusted accordingly.
         | 
         | Honestly, I would put the blame on PGP. It has a ... special
         | UX. I tried to use it on 3 separate occasions, and ended up
         | doing something else (probably less secure) because I
         | couldn't manage to make the damn thing work. I might not be a
         | genius, but I am also not completely stupid.
        
       | politelemon wrote:
       | This feels like perfect being the enemy of good enough. There are
       | examples where the system falls over but that doesn't mean that
       | it completely negates the benefits.
       | 
       | It is very easy to get blinkered into thinking that the specific
       | problems they're citing absolutely need to be solved, and quite
       | possibly an element of trying to use that as an excuse to reduce
       | some maintenance overhead without understanding its benefits.
        
         | creatonez wrote:
         | Its benefits are very much completely negated in real-world
         | use. See https://blog.yossarian.net/2023/05/21/PGP-signatures-
         | on-PyPI... - the data suggests that nobody is verifying these
         | PGP signatures at all.
        
           | dig1 wrote:
           | I stopped reading after this: "PGP is an insecure [1] and
           | outdated [2] ecosystem that hasn't reflected cryptographic
           | best practices in decades [3]."
           | 
           | The first link [1] suggests avoiding encrypted email due to
           | potential plaintext CC issues and instead recommends Signal
           | or (check this) WhatsApp. However, with encrypted email, I
           | have (or can have) full control over the keys and
           | infrastructure, a level of security that Signal or WhatsApp
           | can't match.
           | 
           | The second link [2] is Moxie's rant, which I don't entirely
           | agree with. Yes, GPG has a learning curve. But instead of
           | teaching people how to use it, we're handed dumbed-down
           | products like Signal (I've been using it since its early days
           | as a simple sms encryption app, and I can tell you, it's gone
           | downhill), which has a brilliant solution: it forces you to
           | remember (better to say to write down) a huge random hex
           | monstrosity just to decrypt a database later. And no, you
           | can't change it.
           | 
           | Despite the ongoing criticisms of GPG, no suitable
           | alternative has been put forward and the likes of Signal,
           | Tarsnap, and others [1] simply don't cut it. Many other
           | projects running for years (with relatively good security
           | track records, like kernel, debian, or cpan) have no problem
           | with GPG. This is 5c.
           | 
           | [1] https://latacora.micro.blog/2019/07/16/the-pgp-
           | problem.html
           | 
           | [2] https://moxie.org/2015/02/24/gpg-and-me.html
           | 
           | [3]
           | https://blog.cryptographyengineering.com/2014/08/13/whats-
           | ma...
        
             | wkat4242 wrote:
             | Yeah I still use pgp a lot. Especially because of hardware
             | backed tokens (on yubikey and openpgp cards) which I use a
             | lot for file encryption. The good thing is that there's
             | toolchains for all desktop OSes and mobile (Android, with
             | openkeychain).
             | 
             | I'm sure there's better options but they're not as
             | ubiquitous. I use it for file encryption, password manager
             | (pass) and SSH login and everything works on all my stuff,
             | with hardware tokens. Even on my tablet where Samsung
             | cheaped out by not including NFC I can use the USB port.
             | 
             | Replacements like fido2 and age fall short by not
             | supporting all the usecases (file encryption for fido2,
             | hardware tokens for age) or not having a complete toolchain
             | on all platforms.
        
               | nick__m wrote:
               | I use PKCS#11 on my yubikeys, would I gain something by
               | using the PGP functionality instead?
        
             | rtmang wrote:
             | I agree. PGP's status is just elevated by the people who
             | get funded by Radio Free Asia and then write attacks
             | against PGP.
             | 
             | The Python cryptography "experts" will of course parrot the
             | Moxie party line. Given the general lack of attention to
             | detail, lack of testing and hate for people who demand
             | stricter procedures in the Python ecosystem, never use any
             | Python based cryptography.
        
           | Diti wrote:
           | I believe the article you linked to doesn't seem to say
           | anything about "nobody verifying PGP signatures". We would
           | need PyPI to publish their Datadog & Google Analytics data,
           | but I'd say the set of users who _actually_ verify OpenPGP
           | signatures intersects with the set of users faking
           | /scrambling telemetry.
        
             | woodruffw wrote:
             | I wrote the blog post in question. The claim that "nobody
             | is verifying PGP signatures (from PyPI)" comes from the
             | fact that around 1/3rd had no discoverable public keys on
             | what remains of the keyserver network.
             | 
             | Of the 2/3rd that did have discoverable keys, ~50% had no
             | valid binding signature at the time of my audit, meaning
             | that obtaining a living public key has worse-than-coin-toss
             | odds for recent (>2020) PGP signatures on PyPI.
             | 
             | Combined, these datapoints (and a lack of public noise
             | about signatures failing to verify) strongly suggest that
             | nobody was attempting to verify PGP signatures from PyPI at
             | any meaningful scale. This was more or less confirmed by
             | the near-zero amount of feedback PyPI got once it disabled
             | PGP uploads.
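
The arithmetic behind the "worse-than-coin-toss" remark can be worked through as a quick sketch. The inputs are the rounded figures from the comment above ("around 1/3rd" and "~50%"), so the result is approximate too.

```python
from fractions import Fraction

# Rounded figures taken from the comment above.
no_key_found = Fraction(1, 3)                         # keys not discoverable on keyservers
discoverable = 1 - no_key_found                       # keys that could be fetched at all
invalid_binding_given_discoverable = Fraction(1, 2)   # of those, no valid binding signature

# A key is usable only if it is discoverable AND has a valid binding signature.
usable = discoverable * (1 - invalid_binding_given_discoverable)
unusable = 1 - usable

print(f"usable keys:   {usable}")
print(f"unusable keys: {unusable}")
```

So on these rounded numbers roughly one key in three was both retrievable and valid, which is well below even odds.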
        
         | jacques_chester wrote:
         | Maintaining this capability isn't free, it is of dubious
         | benefit and there are much better alternatives.
         | 
         | On a cost benefit analysis this is a slam dunk.
        
           | nightfly wrote:
           | What are these "much better alternatives"?
        
             | arccy wrote:
             | https://www.sigstore.dev/
             | 
             | The emerging standard for verifying artifacts, e.g. in
             | container image signing, npm, maven, etc
             | 
             | https://blog.sigstore.dev/npm-public-beta/
             | https://www.sonatype.com/blog/maven-central-and-sigstore
        
       | rurban wrote:
       | On the other hand PGP keys were widely successful for cpan, the
       | perl5 repo. It's very simple to use, not as complicated as with
       | pypi.
        
         | 0xbadcafebee wrote:
         | I dunno. I mean, sure, it's a worldwide-mirrored,
         | cryptographically secure, curated, hierarchically and
         | categorically organized, simple set of flat files, with
         | multiple separate community projects, to test all packages on
         | all supported Perl versions and platforms, with multiple
         | different frontends, bug tracking, search engines,
         | documentation hubs, security groups, and an incredibly long
         | history of support and maintenance by the community.
         | 
         | But it's, like, _old_. You can't make something new be like
         | something _old_. That's not _cool_. If what we're doing isn't
         | new and cool, what is the point even?
        
       | woodruffw wrote:
       | This is slightly old news. For those curious, PGP support on the
       | modern PyPI (i.e. the new codebase that began to be used in
       | 2017-18) was always vestigial, and this change merely polished
       | off a component that was, empirically[1], doing very little to
       | improve the security of the packaging ecosystem.
       | 
       | Since then, PyPI has been working to adopt PEP 740[2], which both
       | enforces a more modern cryptographic suite and signature scheme
       | (built on Sigstore, although the design is adaptable) and is
       | bootstrapped on PyPI's support for Trusted Publishing[3], meaning
       | that it doesn't have the fundamental "identity" problem that
       | PyPI-hosted PGP signatures have.
       | 
       | The hard next step from there is putting verification in client
       | hands, which is the #1 thing that actually makes any signature
       | scheme actually useful.
       | 
       | [1]: https://blog.yossarian.net/2023/05/21/PGP-signatures-on-
       | PyPI...
       | 
       | [2]: https://peps.python.org/pep-0740/
       | 
       | [3]: https://docs.pypi.org/trusted-publishers/
        
       | nonameiguess wrote:
       | I feel like there is a broader issue being pushed aside here.
       | Verifying a signature means you have a cryptographic guarantee
       | that whoever generated an artifact possessed a private key
       | associated with a public key. That key doesn't necessarily need
       | to be published in a web-facing keystore to be useful. For
       | packages associated with an OS-approved app store or a Linux
       | distro's official repo, the store of trusted keys is baked into
       | the package manager.
       | 
       | What value does that provide? As the installer of something, you
       | almost never personally know the developer. You don't really
       | trust _them_. At best, you trust the operating system vendor to
       | sufficiently vet contributors to a blessed app store. Whoever
       | published package A is actually a maintainer of Arch Linux.
       | Whoever published app B went through whatever the heck hoops
       | Apple makes you go through. If malware gets through, some sort of
       | process failed that can potentially be mediated.
       | 
       | If you're downloading a package from PyPI or RubyGems or
       | crates.io or whatever, a web repository that does no vetting and
       | allows anyone to publish anything, what assurance is this giving?
       | Great, some package was legitimately published by a person who
       | also published a public key. Who are they exactly? A pseudonym on
       | Github with a cartoon avatar? Does that make them trustworthy? If
       | they publish malware, what process can be changed to prevent that
       | from happening again? As far as I can tell, nothing.
       | 
       | If you change the keystore provider to sigstore, what does that
       | give you? Fulcio just requires that you control an e-mail address
       | to issue you a signing key. They're not vetting you in any way or
       | requiring you to disclose a real-world identity that can be
       | pursued if you do something bad. It's a step up in a limited
       | scope of use cases in which packages are published by corporate
       | entities that control an e-mail domain and ideally use their own
       | private artifact registry. It does nothing for public
       | repositories in which anyone is allowed to publish anything.
       | 
       | Fundamentally, if a public repository allows anyone to publish
       | anything, does no vetting and requires no real identity
       | disclosure, what is the basis of trust? If you're going to say
       | something like "well I'm looking for .whl files but only from
       | Microsoft," then the answer is for Microsoft to host its own
       | repository that you can download from, not for Microsoft to
       | publish packages to PyPI.
       | 
       | There are examples of making this sort of simpler for the
       | consumer to get everything from a single place. Docker Hub, for
       | instance. You can choose to only ever pull official library
       | images and verify them against sigstore, but that works because
       | Docker is itself a well-funded corporate entity that restricts
       | who can publish official library images by vetting and verifying
       | real identities.
        
       | upofadown wrote:
       | > Of those 1069 unique keys, about 30% of them were not
       | discoverable on major public keyservers, making it difficult or
       | impossible to meaningfully verify those signatures. Of the
       | remaining 71%, nearly half of them were unable to be meaningfully
       | verified at the time of the audit (2023-05-19).
       | 
       | A PGP keyserver provides no identity verification. It is simply a
       | place to store keys. So I don't understand this statement. What
       | is the ultimate goal here? I thought that things like this mostly
       | provided a consistent identity for contributing entities with no
       | requirement to know who the people behind the identities actually
       | were in real life.
        
         | ploxiln wrote:
         | These keys could have related signatures from other keys, that
         | some users or maintainers may have reason to trust.
         | 
         | (But for 30% of keys this was not even theoretically possible,
         | while for another 40% of keys it was not practically possible,
         | according to the article.)
        
         | woodruffw wrote:
         | You're thinking one step past the failure state here: the
         | problem isn't that keyservers don't provide identity
         | verification, but that the PGP key distribution ecosystem isn't
         | effectively delivering keys anymore.
         | 
         | There are probably multiple reasons for this, but the two
         | biggest ones are likely (1) that nobody knows how to upload
         | keys to keyservers anymore, and (2) that keyservers don't
         | gossip/share keys anymore, following the SKS network's
         | implosion[1].
         | 
         | Or put another way: a necessary precondition of signature
         | verification is key retrieval, whether or not trust in a given
         | key identity (or claimant human identity) is established. One
         | of PGP's historic strengths was that kind of key retrieval, and
         | the data strongly suggests that that's no longer the case.
         | 
         | [1]:
         | https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d695...
        
       ___________________________________________________________________
       (page generated 2024-10-17 23:00 UTC)