[HN Gopher] Preparing to Issue 200M Certificates in 24 Hours
       ___________________________________________________________________
        
       Preparing to Issue 200M Certificates in 24 Hours
        
       Author : jaas
       Score  : 136 points
       Date   : 2021-02-10 14:45 UTC (8 hours ago)
        
 (HTM) web link (letsencrypt.org)
 (TXT) w3m dump (letsencrypt.org)
        
       | ivoras wrote:
       | Oh single points of failure, where art thou...
       | 
       | I hope everyone realises that Let's Encrypt is by now an
       | essential part of the Internet, somewhat like DNS is, just
       | massively centralised.
        
         | jaywalk wrote:
         | If Let's Encrypt went away tomorrow, I'd Google "free SSL
         | certificate" and find somebody else. Worst case, I'd either go
         | back to the old way and buy a cheap certificate or go without
         | SSL until a new free option comes along.
        
           | notafraudster wrote:
           | Honest question (I know just enough about SSL to be
           | dangerous): Does certificate pinning throw a wrench in this
           | plan?
        
             | Wowfunhappy wrote:
             | Please don't enforce certificate pinning in user-facing
             | software, at least without an opt-out. It becomes
             | impossible to inspect my own traffic.
        
               | cheeze wrote:
               | It also leads to a ton of clients who freak out when
               | their app suddenly goes offline because they forgot to
               | update the pinned cert. Seen that one countless times.
               | 
               | Pinning has uses, but if I'm running an app, I'm not
               | doing pinning unless I have to.
               | 
                | Pinning to the issuing or root CA is much safer from an
                | availability standpoint.
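The tradeoff described above can be sketched in a few lines: pinning the leaf certificate means the pin breaks on every renewal, while pinning a set of acceptable fingerprints (including a backup) leaves some slack. A minimal, hypothetical Python sketch, not taken from any particular client:

```python
import hashlib

def cert_fingerprint(der_bytes: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()

def matches_pin(der_bytes: bytes, pinned: set[str]) -> bool:
    """True if the presented certificate matches any pinned fingerprint.

    Pinning only the current leaf cert means this check starts failing
    the moment the cert is renewed; pinning several fingerprints
    (e.g. a backup key) is the usual mitigation.
    """
    return cert_fingerprint(der_bytes) in pinned
```

In practice the DER bytes would come from something like `ssl.SSLSocket.getpeercert(binary_form=True)`.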
        
             | tialaramex wrote:
             | Before choosing to use pinning you should have planned for
             | this situation. Rather than write it out here, I'll link a
             | post I wrote in a Let's Encrypt forum in response to
             | someone who'd just blown up their system due to a different
             | pinning mistake.
             | 
             | https://community.letsencrypt.org/t/certificate-failure-
             | due-...
        
             | rntksi wrote:
              | If proper revocation procedures are being followed, no.
             | 
             | You can start reading more info here:
             | https://developer.mozilla.org/en-
             | US/docs/Web/Security/Certif...
             | 
              | (HPKP is considered obsolete, if by cert pinning you meant
              | that. If you're confusing HSTS with HPKP, just know that
              | HSTS makes it much harder to mistakenly access the site via
              | http, while HPKP [now deprecated] is the practice of
              | ensuring some hashes are found in the cert chain that the
              | server sends, to mitigate some attacks)
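For contrast, the two headers look like this (illustrative values; HPKP is deprecated and ignored by modern browsers):

```
# HSTS: browsers remember to use HTTPS for this host
Strict-Transport-Security: max-age=63072000; includeSubDomains

# HPKP (deprecated, shown for contrast): SPKI hashes the chain had to contain
Public-Key-Pins: pin-sha256="base64primary=="; pin-sha256="base64backup=="; max-age=5184000
```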
        
             | NovemberWhiskey wrote:
             | At this point, I don't think certificate pinning in the
             | general internet/web environment is a thing.
        
               | LinuxBender wrote:
               | Agreed. I MITM all my own traffic and I only have to
               | exclude a handful of domains or A records on domains to
               | not MITM.
        
               | leesalminen wrote:
               | Honestly curious, why do you MITM all your own traffic?
               | For detailed logging?
        
               | LinuxBender wrote:
                | Logging, ACLs for some MIME types, overriding cache
                | controls, a shared cache for multiple devices, and
                | blocking some sites that I can't block using DNS.
        
               | leesalminen wrote:
               | > shared cache for multiple devices
               | 
                | Oh, that's a neat one and would be very valuable for me.
                | We have poor internet access at home and it would be cool
                | to reduce traffic going out to the net.
               | 
               | Thanks for the reply, appreciate it!
        
               | LinuxBender wrote:
               | No problem. You can find examples of how to set up Squid-
               | SSL-Bump or I can provide examples if you can't find any.
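For reference, a Squid SSL-Bump setup is mostly configuration. A heavily abridged squid.conf sketch (paths, ACL names, and the excluded domain are illustrative, and directive names vary a bit between Squid versions):

```
# Intercepting proxy port; mints per-host certs signed by a local CA
http_port 3128 ssl-bump \
    tls-cert=/etc/squid/ca.pem \
    generate-host-certificates=on \
    dynamic_cert_mem_cache_size=4MB

# Helper that generates the fake per-host certificates
sslcrtd_program /usr/lib/squid/security_file_certgen -s /var/lib/squid/ssl_db -M 4MB

acl step1 at_step SslBump1
acl no_bump ssl::server_name .bank.example

ssl_bump splice no_bump     # leave excluded domains untouched
ssl_bump peek step1         # peek at the TLS ClientHello first
ssl_bump bump all           # MITM everything else
```

Clients then need the local CA certificate installed in their trust store, which is exactly why cert-pinned apps break behind such a proxy.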
        
           | tlb wrote:
           | The hits for "free SSL certificate" certainly won't be able
           | to issue 200M certs in a day.
        
             | strken wrote:
             | AWS ACM is free-as-in-complimentary-peanuts, and I'd bet
             | they could handle a pretty big chunk, although 2000rps for
             | a request that's calling external services and
             | cryptographically signing things is a bit intimidating.
        
         | gregwebs wrote:
         | This is why they created the ACME protocol. There are actually
         | other providers of this protocol now. So there is a possible
         | future that is much more like DNS.
        
         | nodesocket wrote:
         | Sounds like they have read-replicas, why can't they fail over
         | to a read replica as the new master?
        
         | RL_Quine wrote:
         | They aren't the only no-cost distributor, their downtime
         | doesn't really matter so long as it's less than a month long.
         | 
         | What exactly is the issue with centralization here?
        
           | mvolfik wrote:
            | The thing is that we're not talking about downtime here, but
            | rather about the CA being compromised, which would mean that
            | pretty much ANY website could be impersonated, as the
            | attacker could issue a valid Let's Encrypt certificate for
            | it. That is mitigated by invalidating this CA, but that also
            | invalidates all legit certificates previously issued by them.
            | So they need to reissue them all.
        
           | cholmon wrote:
           | How would Netlify feel about not being able to issue or renew
           | any certs for any sites for a month? Plenty of platforms rely
           | on LE exclusively for one-click/automatic HTTPS for their
           | customers sites.
        
             | Denvercoder9 wrote:
             | Let's Encrypt isn't the only provider supporting ACME
             | either. Sectigo (under the ZeroSSL name) is a notable
             | alternative.
        
           | acct776 wrote:
           | > They aren't the only no-cost distributor
           | 
            | I couldn't name a second, which means the others are probably
            | not getting a huge %.
        
         | marcosdumay wrote:
          | The TLS CA system is one of those places where a single point
          | of failure is absolutely better than many distributed ones.
          | That is the case because any failure anywhere compromises the
          | entire system.
          | 
          | Of course, the other failure points haven't completely gone
          | away yet. But I do expect their number to shrink a lot in the
          | future.
        
         | toomuchtodo wrote:
         | A lesson in recognizing, supporting, and defending public
         | goods.
        
       | Dylan16807 wrote:
       | > the really interesting thing about these machines is that the
       | EPYC CPUs provide 128 PCIe4 lanes each
       | 
       | Not really. In a single socket setup, an EPYC gives you 128
       | lanes. In a dual socket setup, 64 lanes from each CPU are
       | repurposed to connect them together instead of doing PCIe. So
       | just like single socket, you end up with 128 lanes _total_.
        
         | virgulino wrote:
         | 128 or 160 total, configurable.
         | https://www.servethehome.com/dell-and-amd-showcase-future-of...
        
       | TorKlingberg wrote:
        | This seems like a really fun project. For some reason it makes
        | me happy that someone has a good reason to run their own servers
        | and networking, rather than rent everything from cloud providers.
        
       | cbhl wrote:
       | If there's one thing that always surprises me about the internet,
       | it's that vertical scaling (bigger/faster machines, as opposed to
       | horizontal scaling) can take a well-written service to "Internet
       | Scale".
        
       | 120bits wrote:
        | On a side note, it's always nice to see them include stuff like
        | internal networking and hardware specs of the servers. It shows
        | you how scalable they are and how they deal with large amounts
        | of data. I always enjoy reading these posts.
        
       | hinkley wrote:
        | Am I the only one having flashbacks to Rainbows End (Vernor
        | Vinge)?
        
       | tialaramex wrote:
       | > Normally ACME clients renew their certificates when one third
       | of their lifetime is remaining, and don't contact our servers
       | otherwise.
       | 
       | At least newer versions of Certbot, and I believe some other ACME
       | clients, will also try to discern if the certificate is Revoked
       | when considering it. So if you have a daily cron job running
       | Certbot, and your certificate with 60 days left on it has been
       | revoked since yesterday, Certbot ought to notice that and attempt
       | to replace it as if it had expired.
       | 
        |  _If_ you are doing OCSP stapling, and _if_ your stapling
        | implementation is good (sadly, last I looked, neither Apache's
        | nor nginx's was), this ought to be enough to make a mass
        | revocation event survivable for you. Your server will notice the
        | latest OCSP answers now say it's revoked and continue to hand
        | out the last GOOD answer it knew; some time later, before that
        | OCSP answer expires, your Certbot should replace the certificate
        | with a good one. Seamless.
       | 
       | The new ACME feature is welcome, not least because there are a
       | tremendous number of those bad Apache servers out there, but
       | (unless I misunderstand) I think it's already possible to survive
       | this sort of catastrophe without service interruption.
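The renewal rule described above (replace at one-third of lifetime remaining, or immediately if revoked) can be sketched as a tiny decision function. This is an illustrative sketch, not Certbot's actual internals:

```python
from datetime import datetime, timedelta

def should_renew(not_before: datetime, not_after: datetime,
                 now: datetime, revoked: bool) -> bool:
    """Renew when revoked, or when less than 1/3 of the lifetime remains."""
    if revoked:
        return True
    lifetime = not_after - not_before
    remaining = not_after - now
    return remaining < lifetime / 3
```

For a 90-day Let's Encrypt cert this means renewal normally kicks in with 30 days left, but a daily cron job that also checks revocation status will replace a revoked cert on its next run.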
        
         | phasmantistes wrote:
         | The new ACME feature isn't just about surviving the revocation
         | event itself. Suppose that the new API didn't exist, but every
         | client polled on a daily basis to check to see if their cert
         | was revoked. Then great -- within 24 hours, every server gets
         | the new replacement certificate.
         | 
         | And then 60 days later, every single client tries to renew that
         | certificate. That's another 200 million certs in 24 hours. And
         | that'll repeat every 60 days.
         | 
         | So the ACME draft is also about being able to pro-actively
         | smooth out that renewal spike. Some clients would be told to
         | renew again immediately, less than 24 hours after their
         | replacement. Others would be told to wait the whole 60 days.
         | And then after a couple months of managing that, things would
         | be back to normal.
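One simple way to smooth such a spike is to jitter each certificate's next renewal across the window. A hypothetical sketch of the idea (the real mechanism being discussed is the ACME renewal-info draft, where the server hands out the window, not this function):

```python
import random

def assign_renewal_delay(cert_id: str, window_days: int = 60,
                         seed: int = 0) -> int:
    """Deterministically spread renewals over the window to avoid a
    thundering herd: some certs renew almost immediately, others wait
    nearly the full window."""
    rng = random.Random(f"{seed}:{cert_id}")
    return rng.randrange(window_days)
```

After one pass through the jittered window, renewals are spread roughly uniformly instead of recurring as a single 200M-cert spike every 60 days.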
        
         | hannob wrote:
         | FWIW Apache has a new stapling implementation that is not
         | suffering from all the major problems the old one did. Can be
         | activated with "MDStapling on".
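For anyone looking for it, the relevant configuration is short (domain names are illustrative; mod_md must be loaded, and exact minimum httpd/mod_md versions vary):

```
# Let mod_md manage certificates and OCSP stapling for these hosts
MDomain example.com www.example.com
MDStapling on

<VirtualHost *:443>
    ServerName example.com
    SSLEngine on
</VirtualHost>
```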
        
           | tialaramex wrote:
           | Thanks, that's good to know. I also read their documentation
           | explaining why they couldn't (or at least didn't) fix the old
           | one. I will try to publicise this rather than simply saying
           | the Apache httpd OCSP stapling is garbage in future.
           | 
           | It explicitly mentions two big flaws with the old one, but
           | not the one most relevant here - out of the box Apache's old
           | OCSP stapling would merrily staple BAD answers simply because
           | they're newer, which makes no sense. I assume that's
           | corrected, but if you know this'd be a good place to say.
        
       | sandGorgon wrote:
       | > _There is no viable hardware RAID for NVME, so we've switched
       | to ZFS to provide the data protection we need._
       | 
        | This is Linux, right? Would this be the largest deployment of
        | ZFS-on-Linux then?
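For scale, a pool like the one described is essentially a one-liner. A sketch of a striped-mirror ("RAID10-like") layout over four NVMe drives (device names and options are illustrative):

```
zpool create -o ashift=12 tank \
    mirror /dev/nvme0n1 /dev/nvme1n1 \
    mirror /dev/nvme2n1 /dev/nvme3n1
zpool status tank
```

Mirrored vdevs are a common choice for databases on NVMe, since they avoid raidz's read-modify-write behavior for small random writes.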
        
         | linsomniac wrote:
         | A decade ago my backup cluster had >100TB of ZFS on Linux. I
         | mean, that predated ZoL, so it was using ZFS-fuse, but...
        
         | mnw21cam wrote:
         | Not by a long shot. I just assembled two servers with 168 12TB
         | drives each, giving a bit over 1.5PB available space on each
         | server. And I'm pretty confident that this is also not the
         | largest ZFS-on-Linux deployment either.
        
         | RL_Quine wrote:
         | I doubt it? 150TB of NVMe storage is big, but I've walked past
         | racks with many orders of magnitude more in it.
         | 
         | (edit: units)
        
           | dlkmp wrote:
           | It's TB though, per unit.
        
           | dragontamer wrote:
           | > 150GB of NVMe storage is big
           | 
           | Your age is showing :-)
           | 
           | Every few years, I gotta get used to the 100s MBs is big!! ->
           | 100s GBs is big!! -> 100s TBs is big!!
           | 
           | Seems like we're entering the age of PBs, and then we stopped
           | caring about capacity and more about the speed of our TB+
           | sized archives.
        
         | koolba wrote:
         | I don't see why anyone would ever want to use hardware RAID. It
         | invariably leads to the day when your hardware is busted,
         | there's no replacement parts, and you can't read your volumes
         | from any other machine. Use the kernel RAID and you can always
         | rip out disks, replace them, or just boot off a USB stick.
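For example, with Linux md RAID the array metadata lives on the disks themselves, so a mirror built on one machine can be assembled on any other (device names are illustrative):

```
# Create a RAID1 mirror and put a filesystem on it
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
mkfs.ext4 /dev/md0

# On a different machine later: scan the disks and reassemble the array
mdadm --assemble --scan
```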
        
           | namibj wrote:
           | Because of performance, especially regarding being able to
           | use a battery-backed write-back cache on the controller to
           | give a "safe in the event of powerfailure" confirmation to
           | the application before it actually hits disk/flash.
           | 
           | The "can't read from any other machine" is handled by making
           | sure (this includes testing) that the volumes are readable
           | with dmraid. At least that's for SAS/SATA applications. I'm
           | not sure about NVMe, as it uses different paths in the IO
           | subsystem.
        
           | seniorThrowaway wrote:
            | Totally agree, and I'll go one further: I don't want to use
            | RAID at all in a non-professional context. Maybe I'm too
            | simplistic, but for my personal stuff I don't use RAID, LVM,
            | or anything beyond plain ext4 file systems on whole disks.
            | For redundancy I use rsync, at whatever frequency makes
            | sense, to another disk of the same size. I've run like this
            | for 10 years and replaced many disks without losing data.
            | The one time I ran soft RAID I lost the whole array because
            | one disk failed and a SATA error happened at the same time.
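That approach boils down to a single cron entry (paths and schedule are illustrative; note that `--delete` also propagates deletions, so this is a mirror rather than a versioned backup):

```
# crontab: nightly mirror of /data onto a second disk at 03:00
0 3 * * * rsync -a --delete /data/ /mnt/backupdisk/data/
```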
        
             | tialaramex wrote:
             | LVM is very nice because it eliminates that problem where
             | you've got an almost full 2TB disk and you bought another
             | 2TB disk and now you need to figure out what moves where.
             | With LVM you just say nah, that's just 2TB more space for
             | my data, let the machine figure it out.
             | 
             | I mean, if you _enjoy_ sorting through going  "OK that's
             | photos of my kids, that goes in pile A, but these are
             | materials from the re-mortgage application and go in pile
             | B" then knock yourself out, but I have other things I want
             | to do with my life, leave it to the machine to store stuff.
             | 
             | If you lost everything that's because you lacked _backups_
             | and (repeat after me) RAID is not a backup. Everybody
             | should get into the habit of doing backups. Like any
             | survivalist learns, two is one and one is none.
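Concretely, absorbing a new 2TB disk into an existing LVM setup is three commands (volume group and logical volume names are illustrative):

```
pvcreate /dev/sdc                      # mark the new disk as a physical volume
vgextend vg0 /dev/sdc                  # add it to the volume group
lvextend -l +100%FREE -r /dev/vg0/data # grow the LV; -r resizes the filesystem too
```

No sorting files into piles; the existing filesystem just gets bigger.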
        
       | Proven wrote:
        | Again that nonsense about AMD CPUs and PCIe lanes. It's
        | completely unclear why the old h/w wasn't enough (and whether
        | the new servers are enough).
        | 
        | 200 million certs / 86,400 seconds means 2,315 certs per second.
        | A mid-range server with SSDs can do many thousands of TPS. Even
        | if they need 5 transactions per cert, that shouldn't be a
        | problem for 3-4 year old servers.
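The arithmetic in the comment checks out; a quick sanity check:

```python
certs = 200_000_000
seconds_per_day = 24 * 60 * 60      # 86,400 seconds in a day
rate = certs / seconds_per_day      # ~2,314.8 certs issued per second
db_tx_rate = rate * 5               # ~11,574 TPS at 5 transactions per cert
```

Of course, each issuance also involves external validation requests and cryptographic signing, not just database writes, which is why the raw TPS comparison only goes so far.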
        
       | jtchang wrote:
        | Always nice to see some private companies stepping up.
        | Specifically Cisco, Luna, Thales, and Fortinet. I'm sure there
        | are a bunch of others that donate their resources to Let's
        | Encrypt.
        
       | amaccuish wrote:
       | I hope all that traffic passing Cisco switches is encrypted...
        
       | zanecodes wrote:
       | Very probably I'm missing something, and I love Let's Encrypt and
       | the service that they provide, but... the point of Let's Encrypt
       | is to bring SSL/TLS to more websites, right (and not necessarily
       | to provide identity verification, since the automated renewal
       | process doesn't really require any proof of identity for the
       | entity requesting the certificate)? Why couldn't that have been
       | accomplished using self-signed certificates, and having browser
       | vendors remove their big scary warning pages for sites using
       | self-signed certificates for SSL/TLS? Do certificates from Let's
       | Encrypt provide any security benefits over a self-signed
       | certificate?
        
         | gsich wrote:
         | >Do certificates from Let's Encrypt provide any security
         | benefits over a self-signed certificate?
         | 
         | Depends. The encryption is the same and only dependent on your
         | client/server. Could have been solved by DNS ... maybe.
         | 
         | Self signed certs don't validate that you at least own the
         | domain.
        
           | zanecodes wrote:
           | Ah yes, that makes sense. Let's Encrypt requires proof of
           | domain ownership, which at least ensures that the entity
           | you're connecting to is the entity that owns the domain.
           | Encryption without authentication wouldn't be very helpful,
           | since a man-in-the-middle could just present their own self-
           | signed certificate during the handshake...
        
             | tialaramex wrote:
             | You'd be protected from a passive attack and thus you could
             | always (with enough effort) detect an attack. Someone who
             | is snooping (e.g. fibre taps) is potentially undetectable
             | (yes in theory there are quantum physics tricks you could
             | do to detect this, but nobody much is really doing that)
             | whereas an active attack is always potentially detectable.
             | 
             | So it's not nothing, but it isn't very much without the
             | Certificate Authority role.
        
         | VoidWhisperer wrote:
          | The issue with browsers just allowing self-signed certs is
          | that you can't verify their authenticity - i.e. were they
          | correctly issued for the domain, or is someone impersonating
          | the website using an invalidly issued cert? Having certs come
          | from a recognized certificate authority helps with this
          | because it provides a point for the certificate to be verified
          | for authenticity.
        
           | zanecodes wrote:
            | This makes me wonder about the feasibility of performing an
            | attack by:
            | 
            |   * man-in-the-middling Let's Encrypt and a particular
            |     domain (or DNS, depending on the domain validation
            |     challenge)
            |   * requesting a new certificate for that domain
            |   * spoofing the HTTP resource response (or DNS response, if
            |     applicable)
           | 
           | I suppose this is mitigated by the way Let's Encrypt
           | validates the agent software's public key on first use
           | though, at least for websites that are currently using Let's
           | Encrypt.
        
             | jcrawfordor wrote:
             | While LE is indeed vulnerable to this kind of (difficult)
             | attack, I wanted to make the point that LE still
             | represents, for the most part, an improvement over the
             | previous norms in the CA industry. ACME standardizes
             | automated domain ownership validation to a relatively small
             | number of options that have received relatively rigorous
             | security review (leading to one being dropped due to
             | security concerns, for example).
             | 
             | In contrast, incumbent low-budget CAs have often been a bit
             | of a wild west of automated validation methods, often based
             | on email, that can and do fall to much simpler attacks than
             | a large-scale on-path attack. While CA/B, Mozilla, and
             | others have worked to improve on that situation by
             | requiring CAs to implement more restrictive policies on how
             | domains can be validated, ACME still represents a much
             | better validated, higher-quality validation process than
             | that offered by a number of major CAs for DV certificates.
             | 
             | One approach to decentralized or at least compromised-CA-
             | tolerant TLS is something called "perspectives" (also
             | implemented as "convergence"). The basic concept is that
             | it's a common attack for someone to intercept _your_
             | traffic, but it 's very difficult for someone to intercept
             | _many people 's_ traffic on the internet. So, if the TLS
             | certificate and key you receive from a website is the same
             | as the one that many other people have received, it is most
             | likely genuine. If it's different, that's an indicator of a
             | potential on-path attack. This can be implemented by
             | establishing trust between your computer and various
             | "notaries" which basically just check the certificate from
             | their perspective and confirm that it matches yours.
             | 
             | I bring this up, because if you squint just right you can
             | view ACME as being a method of bolting the same concept
             | onto the existing CA infrastructure: before you connect to
             | a website, LetsEncrypt acts as a notary by connecting to
             | the domain from multiple perspectives and ensuring that the
             | same person evidently controls it from all of them. While
             | not perfect, this is a strong indicator that the person
             | requesting the cert is legitimate.
             | 
              | The on-path attack risk is almost always, but not
              | invariably, at a late stage of the network path to the
              | user (e.g. their local network). The big weakness of the
              | ACME approach is an interception at a late stage of the
              | network path to the server. This tends to be much better
              | secured, but hey, it's still something to worry about.
              | There is
             | reliance on DNS, but I would say that DNS has basically
             | always been the most critical single link in on-path attack
             | protection.
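The perspectives/notary idea above boils down to a majority vote over independently observed certificate fingerprints. A hypothetical sketch of the concept (not the actual Perspectives protocol):

```python
from collections import Counter

def consensus_fingerprint(observations: dict[str, str],
                          quorum: float = 0.5):
    """Perspectives-style check: accept the certificate fingerprint that
    a majority of notaries observed; return None when no fingerprint
    clears the quorum, signalling a possible on-path attack."""
    if not observations:
        return None
    counts = Counter(observations.values())
    fingerprint, seen = counts.most_common(1)[0]
    return fingerprint if seen / len(observations) > quorum else None
```

An attacker who can fool one vantage point shows up as a lone dissenting fingerprint; fooling enough vantage points to win the vote requires intercepting many independent network paths at once.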
        
             | renewiltord wrote:
             | Usually lower energy to exploit the server running the ACME
             | client or the infra around it than it is to subvert the
             | Internet infra that surrounds the LE infra.
             | 
             | For instance, you can subvert the internet infra around
             | some ccTLD if you're that country pretty easily but then
             | who really owns the domain? Probably you, the country,
             | since you can do anything with the DNS and then anything
             | with the traffic.
        
             | tialaramex wrote:
             | Yes, this could work, and has definitely been done,
             | sometimes, against other public CAs. We found convincing
             | evidence of this during work at one of my previous
             | employers.
             | 
             | But what tempers my concern over that finding is that we
             | found this by looking at cases where there's clearly a DNS
             | takeover - and actually it was rare to do certificate
              | issuance. In most cases it seems if you can MitM, say, an
              | Arab country's army headquarters group mail servers, you
             | can just offer no encryption or serve a self-signed
             | certificate and the users will accept that. So while the
             | Ten Blessed Methods are, as expected, not enough in the
             | face of a resourceful adversary (in this case perhaps the
             | Mossad, NSA or similar) they're also a padlock on a fence
             | with a huge gaping hole in it anyway, our first attention
             | should be on fixing the hole in the fence.
        
             | level3 wrote:
             | Let's Encrypt also mitigates this by validating from
             | multiple vantage points, so a single man-in-the-middle is
             | insufficient.
        
             | Denvercoder9 wrote:
             | > man-in-the-middling Let's Encrypt and a particular domain
             | (or DNS, depending on the domain validation challenge)
             | 
             | Let's Encrypt issues multiple verification requests from
             | multiple servers in different locations, both physically
              | and in the network topology. If you can MITM that, you've
              | pretty much taken over the domain, and the ability to get
              | a certificate isn't the worst of the operator's problems.
        
               | marcosdumay wrote:
                | > and the ability to get a certificate isn't the worst
                | of the operator's problems
                | 
                | That assumes a lot about the operator's goals and
                | values. It may very well be their worst problem. E.g. a
                | journalist in a dictatorship will very likely prefer
                | having no cloud service at all to uploading their data
                | into a compromised one.
               | 
               | It's just that, if it is their worst problem, TLS is
               | patently insufficient, so they must think about it when
               | setting the system up.
        
         | Jonnax wrote:
         | Those scary warning pages are an indication that someone is
         | intercepting your traffic.
         | 
          | What happens when you connect to a network and your phone
          | sends a username/password to a server, but the connection is
          | being intercepted?
          | 
          | And the user has no idea.
        
         | jaywalk wrote:
         | They verify domain ownership. If browsers accepted self-signed
         | certificates, then as long as I could intercept your DNS
         | requests (running a public Wi-Fi network next to a Starbucks,
         | for example) then I could bring you to my malicious google.com
         | without you knowing. That's no good.
        
         | acct776 wrote:
          | Highly recommend reading a good applied-essentials guide on
          | certs and the various methods of accomplishing SSL for self-
          | hosted stuff.
          | 
          | This stuff is much harder to understand in isolation from the
          | rest - the full picture is easiest.
        
           | zanecodes wrote:
           | I'm somewhat familiar with certificate handling in general, I
           | had just forgotten how Let's Encrypt performs domain
           | validation; it's been a few years since I used it and it's
           | worked so well that I haven't had to think about it since,
           | which is probably a testament to its stability!
           | 
           | To be sure, PKI and certificates in particular have a lot of
           | room for improvement in the UX department. Especially on
           | Windows, where one frequently has to deal with not just .pem
           | files but .cer, .pfx (with or without private keys), and
           | more.
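For the format-juggling mentioned, the usual conversions are one openssl invocation each (filenames are illustrative):

```
# DER (.cer) to PEM
openssl x509 -inform der -in cert.cer -out cert.pem

# PKCS#12 (.pfx) to PEM, keeping the private key unencrypted if present
openssl pkcs12 -in bundle.pfx -out bundle.pem -nodes

# PEM cert + key back to PKCS#12
openssl pkcs12 -export -in cert.pem -inkey key.pem -out bundle.pfx
```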
        
       ___________________________________________________________________
       (page generated 2021-02-10 23:01 UTC)