[HN Gopher] Preparing to Issue 200M Certificates in 24 Hours
___________________________________________________________________
Preparing to Issue 200M Certificates in 24 Hours
Author : jaas
Score : 136 points
Date : 2021-02-10 14:45 UTC (8 hours ago)
(HTM) web link (letsencrypt.org)
(TXT) w3m dump (letsencrypt.org)
| ivoras wrote:
| Oh single points of failure, where art thou...
|
| I hope everyone realises that Let's Encrypt is by now an
| essential part of the Internet, somewhat like DNS is, just
| massively centralised.
| jaywalk wrote:
| If Let's Encrypt went away tomorrow, I'd Google "free SSL
| certificate" and find somebody else. Worst case, I'd either go
| back to the old way and buy a cheap certificate or go without
| SSL until a new free option comes along.
| notafraudster wrote:
| Honest question (I know just enough about SSL to be
| dangerous): Does certificate pinning throw a wrench in this
| plan?
| Wowfunhappy wrote:
| Please don't enforce certificate pinning in user-facing
| software, at least without an opt-out. It becomes
| impossible to inspect my own traffic.
| cheeze wrote:
| It also leads to a ton of clients who freak out when
| their app suddenly goes offline because they forgot to
| update the pinned cert. Seen that one countless times.
|
| Pinning has uses, but if I'm running an app, I'm not
| doing pinning unless I have to.
|
| Pinning to the issuing CA or the root is much safer from an
| availability standpoint.
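|
| A rough sketch of pinning the issuing CA's public key rather
| than the leaf, in Python with the cryptography package (the CA
| file name is just a placeholder):
|
|     import base64
|     import hashlib
|
|     from cryptography import x509
|     from cryptography.hazmat.primitives import serialization
|
|     def spki_pin(pem_bytes: bytes) -> str:
|         # base64(SHA-256(SubjectPublicKeyInfo)), the usual form
|         # of a public-key pin.
|         cert = x509.load_pem_x509_certificate(pem_bytes)
|         spki = cert.public_key().public_bytes(
|             serialization.Encoding.DER,
|             serialization.PublicFormat.SubjectPublicKeyInfo,
|         )
|         return base64.b64encode(hashlib.sha256(spki).digest()).decode()
|
|     # Pinning the issuing CA's key survives routine leaf
|     # renewals; pinning the leaf breaks the moment the cert is
|     # replaced. "issuing-ca.pem" is a placeholder.
|     PINNED_ISSUERS = {spki_pin(open("issuing-ca.pem", "rb").read())}
|
|     def chain_is_pinned(chain_pems: list[bytes]) -> bool:
|         # Accept if any certificate in the presented chain
|         # matches a pinned issuer key.
|         return any(spki_pin(p) in PINNED_ISSUERS for p in chain_pems)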
| tialaramex wrote:
| Before choosing to use pinning you should have planned for
| this situation. Rather than write it out here, I'll link a
| post I wrote in a Let's Encrypt forum in response to
| someone who'd just blown up their system due to a different
| pinning mistake.
|
| https://community.letsencrypt.org/t/certificate-failure-
| due-...
| rntksi wrote:
| If proper revocation procedures are being followed, no.
|
| You can start reading more info here:
| https://developer.mozilla.org/en-
| US/docs/Web/Security/Certif...
|
| (HPKP is considered obsolete, if that's what you meant by cert
| pinning. If you're confusing HSTS with HPKP: HSTS makes it much
| harder to mistakenly access the site via plain HTTP, while HPKP
| [now deprecated] was the practice of ensuring that certain
| hashes appear among the certificates the server sends, to
| mitigate some attacks.)
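|
| For what it's worth, a minimal sketch of the HSTS half of that
| distinction, using Python's standard http.server (the header
| value is the commonly recommended one):
|
|     from http.server import BaseHTTPRequestHandler, HTTPServer
|
|     class HSTSHandler(BaseHTTPRequestHandler):
|         def do_GET(self):
|             self.send_response(200)
|             # HSTS: tell browsers to refuse plain-HTTP access to
|             # this host for a year. It pins the scheme, not any
|             # particular certificate.
|             self.send_header("Strict-Transport-Security",
|                              "max-age=31536000; includeSubDomains")
|             self.send_header("Content-Type", "text/plain")
|             self.end_headers()
|             self.wfile.write(b"hello over https\n")
|
|     if __name__ == "__main__":
|         # In practice this would sit behind TLS; bare HTTP here
|         # only to keep the sketch short.
|         HTTPServer(("127.0.0.1", 8443), HSTSHandler).serve_forever()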
| NovemberWhiskey wrote:
| At this point, I don't think certificate pinning in the
| general internet/web environment is a thing.
| LinuxBender wrote:
| Agreed. I MITM all my own traffic, and I only have to exclude a
| handful of domains (or specific A records on domains) from the
| MITM.
| leesalminen wrote:
| Honestly curious, why do you MITM all your own traffic?
| For detailed logging?
| LinuxBender wrote:
| Logging, ACLs for some MIME types, overriding cache controls, a
| shared cache for multiple devices, and blocking some sites that
| I can't block using DNS.
| leesalminen wrote:
| > shared cache for multiple devices
|
| Oh, that's a neat one and would be very valuable for me.
| We have poor internet access at home, and it would be cool to
| reduce traffic going out to the net.
|
| Thanks for the reply, appreciate it!
| LinuxBender wrote:
| No problem. You can find examples of how to set up Squid SSL-
| Bump, or I can provide some if you can't find any.
| tlb wrote:
| The hits for "free SSL certificate" certainly won't be able
| to issue 200M certs in a day.
| strken wrote:
| AWS ACM is free-as-in-complimentary-peanuts, and I'd bet
| they could handle a pretty big chunk, although 2000rps for
| a request that's calling external services and
| cryptographically signing things is a bit intimidating.
| gregwebs wrote:
| This is why they created the ACME protocol. There are now other
| providers that support the protocol, so a future that looks much
| more like DNS (many interchangeable providers) is possible.
| nodesocket wrote:
| Sounds like they have read replicas; why can't they fail over
| to a read replica as the new master?
| RL_Quine wrote:
| They aren't the only no-cost distributor, so their downtime
| doesn't really matter as long as it's less than a month long.
|
| What exactly is the issue with centralization here?
| mvolfik wrote:
| The thing is that we're not talking about downtime here, but
| rather about the CA being compromised, which would mean that
| pretty much ANY website could be impersonated, since the
| attacker could issue a valid Let's Encrypt certificate for it.
| That is mitigated by distrusting the compromised CA, but doing
| so also invalidates all of the legitimate certificates it
| previously issued, so they all need to be reissued.
| cholmon wrote:
| How would Netlify feel about not being able to issue or renew
| any certs for any sites for a month? Plenty of platforms rely
| on LE exclusively for one-click/automatic HTTPS for their
| customers' sites.
| Denvercoder9 wrote:
| Let's Encrypt isn't the only provider supporting ACME
| either. Sectigo (under the ZeroSSL name) is a notable
| alternative.
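|
| A rough sketch of why that interchangeability works: every ACME
| CA publishes the same RFC 8555 directory document, so a client
| only needs a different directory URL (the ZeroSSL URL below is
| an assumption; check the provider's docs):
|
|     import json
|     import urllib.request
|
|     # RFC 8555 directory URLs. Let's Encrypt's is its well-known
|     # production endpoint; the ZeroSSL one is illustrative.
|     DIRECTORIES = {
|         "letsencrypt": "https://acme-v02.api.letsencrypt.org/directory",
|         "zerossl": "https://acme.zerossl.com/v2/DV90",
|     }
|
|     with urllib.request.urlopen(DIRECTORIES["letsencrypt"]) as resp:
|         directory = json.load(resp)
|
|     # Every conforming CA exposes the same operations under these
|     # keys, so an ACME client can switch providers by changing
|     # nothing but the URL above.
|     for key in ("newNonce", "newAccount", "newOrder", "revokeCert"):
|         print(key, "->", directory[key])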
| acct776 wrote:
| > They aren't the only no-cost distributor
|
| I couldn't name a second, which means they're probably not
| getting a huge %.
| marcosdumay wrote:
| The TLS CA system is one of those places where a single point of
| failure is absolutely better than many distributed ones, because
| any failure anywhere compromises the entire system.
|
| Of course, the other failure points haven't completely gone away
| yet, but I do expect their number to shrink a lot in the future.
| toomuchtodo wrote:
| A lesson in recognizing, supporting, and defending public
| goods.
| Dylan16807 wrote:
| > the really interesting thing about these machines is that the
| EPYC CPUs provide 128 PCIe4 lanes each
|
| Not really. In a single socket setup, an EPYC gives you 128
| lanes. In a dual socket setup, 64 lanes from each CPU are
| repurposed to connect them together instead of doing PCIe. So
| just like single socket, you end up with 128 lanes _total_.
| virgulino wrote:
| 128 or 160 total, configurable.
| https://www.servethehome.com/dell-and-amd-showcase-future-of...
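|
| Roughly, the lane math behind that (per the linked article, with
| 64 or 48 lanes per CPU used for the socket interconnect):
|
|     lanes_per_cpu = 128
|     print(2 * (lanes_per_cpu - 64))  # 128 usable PCIe lanes (4-link interconnect)
|     print(2 * (lanes_per_cpu - 48))  # 160 usable PCIe lanes (3-link interconnect)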
| TorKlingberg wrote:
| This seems like a really fun project. For some reason it makes me
| happy that someone has good reason to run their own servers and
| networking, rather than rent everything from cloud providers.
| cbhl wrote:
| If there's one thing that always surprises me about the internet,
| it's that vertical scaling (bigger/faster machines, as opposed to
| horizontal scaling) can take a well-written service to "Internet
| Scale".
| 120bits wrote:
| On a side note, it's always nice to see them include things like
| internal networking and hardware specs of the servers. It shows
| how scalable they are and how they deal with large amounts of
| data. I always enjoy reading these posts.
| hinkley wrote:
| Am I the only one having flashbacks to Rainbows End (Vernor
| Vinge)?
| tialaramex wrote:
| > Normally ACME clients renew their certificates when one third
| of their lifetime is remaining, and don't contact our servers
| otherwise.
|
| At least newer versions of Certbot, and I believe some other ACME
| clients, will also try to discern whether the certificate has
| been revoked when deciding whether to replace it. So if you have
| a daily cron job running
| Certbot, and your certificate with 60 days left on it has been
| revoked since yesterday, Certbot ought to notice that and attempt
| to replace it as if it had expired.
|
| _If_ you are doing OCSP stapling, and _if_ your stapling
| implementation is good (sadly last I looked neither Apache nor
| nginx were) this ought to be enough to make a mass revocation
| event survivable for you. Your server will notice that the
| latest OCSP answers now say it's revoked and will continue to
| hand out the last GOOD answer it knew; some time later, before
| that OCSP answer expires, your Certbot should replace the
| certificate with a good one. Seamless.
|
| The new ACME feature is welcome, not least because there are a
| tremendous number of those bad Apache servers out there, but
| (unless I misunderstand) I think it's already possible to survive
| this sort of catastrophe without service interruption.
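|
| A minimal sketch of that kind of daily revocation check, in
| Python with the cryptography package, assuming cert.pem and
| chain.pem are the files your ACME client already writes (the
| names are placeholders):
|
|     import urllib.request
|
|     from cryptography import x509
|     from cryptography.x509 import ocsp
|     from cryptography.x509.oid import AuthorityInformationAccessOID
|     from cryptography.hazmat.primitives import hashes, serialization
|
|     cert = x509.load_pem_x509_certificate(open("cert.pem", "rb").read())
|     issuer = x509.load_pem_x509_certificate(open("chain.pem", "rb").read())
|
|     # Find the CA's OCSP responder URL in the cert's AIA extension.
|     aia = cert.extensions.get_extension_for_class(
|         x509.AuthorityInformationAccess).value
|     ocsp_url = next(d.access_location.value for d in aia
|                     if d.access_method == AuthorityInformationAccessOID.OCSP)
|
|     request = ocsp.OCSPRequestBuilder().add_certificate(
|         cert, issuer, hashes.SHA1()).build()
|     http_req = urllib.request.Request(
|         ocsp_url,
|         data=request.public_bytes(serialization.Encoding.DER),
|         headers={"Content-Type": "application/ocsp-request"},
|     )
|     response = ocsp.load_der_ocsp_response(
|         urllib.request.urlopen(http_req).read())
|
|     if (response.response_status == ocsp.OCSPResponseStatus.SUCCESSFUL
|             and response.certificate_status == ocsp.OCSPCertStatus.REVOKED):
|         print("certificate revoked - renew now instead of waiting")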
| phasmantistes wrote:
| The new ACME feature isn't just about surviving the revocation
| event itself. Suppose that the new API didn't exist, but every
| client polled on a daily basis to check to see if their cert
| was revoked. Then great -- within 24 hours, every server gets
| the new replacement certificate.
|
| And then 60 days later, every single client tries to renew that
| certificate. That's another 200 million certs in 24 hours. And
| that'll repeat every 60 days.
|
| So the ACME draft is also about being able to pro-actively
| smooth out that renewal spike. Some clients would be told to
| renew again immediately, less than 24 hours after their
| replacement. Others would be told to wait the whole 60 days.
| And then after a couple months of managing that, things would
| be back to normal.
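|
| A toy sketch of the smoothing idea, assuming the CA can hand
| each client a suggested renewal window (the window bounds here
| are made up):
|
|     import random
|     from datetime import datetime, timedelta
|
|     def pick_renewal_time(window_start: datetime,
|                           window_end: datetime) -> datetime:
|         # Instead of every client renewing the instant a third of
|         # the lifetime remains, each picks a uniformly random
|         # moment inside the CA's suggested window, so a one-day
|         # issuance spike decays instead of repeating every 60 days.
|         span = (window_end - window_start).total_seconds()
|         return window_start + timedelta(seconds=random.uniform(0, span))
|
|     now = datetime.utcnow()
|     print(pick_renewal_time(now, now + timedelta(days=30)))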
| hannob wrote:
| FWIW Apache has a new stapling implementation that doesn't
| suffer from the major problems the old one did. It can be
| activated with "MDStapling on".
| tialaramex wrote:
| Thanks, that's good to know. I also read their documentation
| explaining why they couldn't (or at least didn't) fix the old
| one. I will try to publicise this rather than simply saying
| the Apache httpd OCSP stapling is garbage in future.
|
| It explicitly mentions two big flaws with the old one, but not
| the one most relevant here: out of the box, Apache's old OCSP
| stapling would merrily staple BAD answers simply because they're
| newer, which makes no sense. I assume that's corrected, but if
| you know, this would be a good place to say so.
| sandGorgon wrote:
| > _There is no viable hardware RAID for NVME, so we've switched
| to ZFS to provide the data protection we need._
|
| This is Linux, right? Would this be the largest deployment of
| ZFS-on-Linux, then?
| linsomniac wrote:
| A decade ago my backup cluster had >100TB of ZFS on Linux. I
| mean, that predated ZoL, so it was using ZFS-fuse, but...
| mnw21cam wrote:
| Not by a long shot. I just assembled two servers with 168 12TB
| drives each, giving a bit over 1.5PB available space on each
| server. And I'm pretty confident that this is also not the
| largest ZFS-on-Linux deployment either.
| RL_Quine wrote:
| I doubt it? 150TB of NVMe storage is big, but I've walked past
| racks with many orders of magnitude more in it.
|
| (edit: units)
| dlkmp wrote:
| It's TB though, per unit.
| dragontamer wrote:
| > 150GB of NVMe storage is big
|
| Your age is showing :-)
|
| Every few years, I gotta get used to the 100s MBs is big!! ->
| 100s GBs is big!! -> 100s TBs is big!!
|
| Seems like we're entering the age of PBs, and then we'll stop
| caring about capacity and care more about the speed of our TB+
| sized archives.
| koolba wrote:
| I don't see why anyone would ever want to use hardware RAID. It
| invariably leads to the day when your hardware is busted,
| there are no replacement parts, and you can't read your volumes
| from any other machine. Use the kernel RAID and you can always
| rip out disks, replace them, or just boot off a USB stick.
| namibj wrote:
| Because of performance, especially being able to use a battery-
| backed write-back cache on the controller to give a "safe in the
| event of power failure" confirmation to the application before
| the data actually hits disk/flash.
|
| The "can't read from any other machine" is handled by making
| sure (this includes testing) that the volumes are readable
| with dmraid. At least that's for SAS/SATA applications. I'm
| not sure about NVMe, as it uses different paths in the IO
| subsystem.
| seniorThrowaway wrote:
| Totally agree, and I'll go one further: I don't want to use
| RAID at all in a non-professional context. Maybe I'm too
| simplistic but for my personal stuff I don't use RAID, LVM or
| anything beyond plain ext4 file systems on whole disks. For
| redundancy I use rsync at whatever frequency makes sense to
| another disk of the same size. I've run like this for 10
| years and replaced many disks without losing data. The time I
| ran soft RAID I lost the whole array because one disk failed
| and a SATA error happened at the same time.
| tialaramex wrote:
| LVM is very nice because it eliminates that problem where
| you've got an almost full 2TB disk and you bought another
| 2TB disk and now you need to figure out what moves where.
| With LVM you just say nah, that's just 2TB more space for
| my data, let the machine figure it out.
|
| I mean, if you _enjoy_ sorting through going "OK that's
| photos of my kids, that goes in pile A, but these are
| materials from the re-mortgage application and go in pile
| B" then knock yourself out, but I have other things I want
| to do with my life, leave it to the machine to store stuff.
|
| If you lost everything, that's because you lacked _backups_,
| and (repeat after me) RAID is not a backup. Everybody
| should get into the habit of doing backups. Like any
| survivalist learns, two is one and one is none.
| Proven wrote:
| Again that nonsense about AMD CPUs and PCI lanes. It's completely
| unclear why the old h/w wasn't enough (and whether the new
| servers are enough).
|
| 200 million certs / 86,400 seconds means about 2,315 certs per
| second. A mid-range server with SSDs can do many thousands of
| TPS. Even if they need 5 transactions per cert, that shouldn't
| be a problem for 3-4 year old servers.
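|
| For reference, the back-of-the-envelope math (taking the 5
| tx/cert figure above as an assumption):
|
|     certs = 200_000_000
|     seconds_per_day = 24 * 60 * 60       # 86,400
|
|     print(certs / seconds_per_day)       # ~2,315 certificates per second
|     print(5 * certs / seconds_per_day)   # ~11,574 DB transactions per second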
| jtchang wrote:
| Always nice to see some private companies stepping up.
| Specifically Cisco, Luna, Thales, and Fortinet. I'm sure there
| are a bunch of others that donate their resources to Let's
| Encrypt.
| amaccuish wrote:
| I hope all that traffic passing Cisco switches is encrypted...
| zanecodes wrote:
| Very probably I'm missing something, and I love Let's Encrypt and
| the service that they provide, but... the point of Let's Encrypt
| is to bring SSL/TLS to more websites, right (and not necessarily
| to provide identity verification, since the automated renewal
| process doesn't really require any proof of identity for the
| entity requesting the certificate)? Why couldn't that have been
| accomplished using self-signed certificates, and having browser
| vendors remove their big scary warning pages for sites using
| self-signed certificates for SSL/TLS? Do certificates from Let's
| Encrypt provide any security benefits over a self-signed
| certificate?
| gsich wrote:
| >Do certificates from Let's Encrypt provide any security
| benefits over a self-signed certificate?
|
| Depends. The encryption is the same and depends only on your
| client/server. Could have been solved by DNS ... maybe.
|
| Self-signed certs don't even validate that you own the domain.
| zanecodes wrote:
| Ah yes, that makes sense. Let's Encrypt requires proof of
| domain ownership, which at least ensures that the entity
| you're connecting to is the entity that owns the domain.
| Encryption without authentication wouldn't be very helpful,
| since a man-in-the-middle could just present their own self-
| signed certificate during the handshake...
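|
| A small sketch of that difference using Python's standard
| library: the default context authenticates the server, while
| turning verification off gives you encryption that any man-in-
| the-middle can terminate with its own self-signed cert:
|
|     import socket
|     import ssl
|
|     host = "example.com"
|
|     # Authenticated TLS: the chain must lead to a trusted CA and
|     # the name must match, so an on-path attacker presenting a
|     # self-signed certificate is rejected.
|     ctx = ssl.create_default_context()
|     with ctx.wrap_socket(socket.create_connection((host, 443)),
|                          server_hostname=host) as conn:
|         print(conn.getpeercert()["subject"])
|
|     # Encryption without authentication: any certificate is
|     # accepted, which is exactly the self-signed MITM scenario.
|     insecure = ssl.create_default_context()
|     insecure.check_hostname = False
|     insecure.verify_mode = ssl.CERT_NONE
|     with insecure.wrap_socket(socket.create_connection((host, 443)),
|                               server_hostname=host) as conn:
|         print(conn.version())  # still TLS, but the peer could be anyone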
| tialaramex wrote:
| You'd be protected from a passive attack, and thus you could
| always (with enough effort) detect an attack: someone who is
| merely snooping (e.g. via fibre taps) is potentially
| undetectable (yes, in theory there are quantum physics tricks
| you could do to detect this, but nobody much is really doing
| that), whereas an active attack is always potentially
| detectable.
|
| So it's not nothing, but it isn't very much without the
| Certificate Authority role.
| VoidWhisperer wrote:
| The issue with browsers just allowing self-signed certs is that
| you can't verify their authenticity - i.e. was the cert
| correctly issued for the domain, or is someone impersonating the
| website with an invalidly issued cert? Having certs come from a
| recognized certificate authority helps with this because it
| provides a point at which the certificate can be verified for
| authenticity.
| zanecodes wrote:
| This makes me wonder about the feasibility of performing an
| attack by:
|
| * man-in-the-middling Let's Encrypt and a particular domain (or
| DNS, depending on the domain validation challenge)
|
| * requesting a new certificate for that domain
|
| * spoofing the HTTP resource response (or DNS response, if
| applicable)
|
| I suppose this is mitigated by the way Let's Encrypt
| validates the agent software's public key on first use
| though, at least for websites that are currently using Let's
| Encrypt.
| jcrawfordor wrote:
| While LE is indeed vulnerable to this kind of (difficult)
| attack, I wanted to make the point that LE still
| represents, for the most part, an improvement over the
| previous norms in the CA industry. ACME standardizes
| automated domain ownership validation to a relatively small
| number of options that have received relatively rigorous
| security review (leading to one being dropped due to
| security concerns, for example).
|
| In contrast, incumbent low-budget CAs have often been a bit
| of a wild west of automated validation methods, often based
| on email, that can and do fall to much simpler attacks than
| a large-scale on-path attack. While CA/B, Mozilla, and
| others have worked to improve on that situation by
| requiring CAs to implement more restrictive policies on how
| domains can be validated, ACME still represents a much
| better validated, higher-quality validation process than
| that offered by a number of major CAs for DV certificates.
|
| One approach to decentralized or at least compromised-CA-
| tolerant TLS is something called "perspectives" (also
| implemented as "convergence"). The basic concept is that
| it's a common attack for someone to intercept _your_
| traffic, but it's very difficult for someone to intercept
| _many people's_ traffic on the internet. So, if the TLS
| certificate and key you receive from a website is the same
| as the one that many other people have received, it is most
| likely genuine. If it's different, that's an indicator of a
| potential on-path attack. This can be implemented by
| establishing trust between your computer and various
| "notaries" which basically just check the certificate from
| their perspective and confirm that it matches yours.
|
| I bring this up, because if you squint just right you can
| view ACME as being a method of bolting the same concept
| onto the existing CA infrastructure: before you connect to
| a website, LetsEncrypt acts as a notary by connecting to
| the domain from multiple perspectives and ensuring that the
| same person evidently controls it from all of them. While
| not perfect, this is a strong indicator that the person
| requesting the cert is legitimate.
|
| The on-path attack risk is usually, but not always, at a
| late stage of the network path to the user (e.g. their
| local network). The big weakness of the ACME approach is an
| interception of a late stage of the network path to the
| server. This tends to be much better secured, but hey, it's
| still something to worry about. There is obviously also a
| reliance on DNS, but I would say that DNS has basically
| always been the most critical single link in on-path attack
| protection.
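|
| A toy sketch of the "perspectives" idea: hash the certificate
| you see locally and compare it with what notaries in other
| network locations report seeing (how you reach the notaries is
| left out here):
|
|     import hashlib
|     import ssl
|
|     def leaf_fingerprint(host: str, port: int = 443) -> str:
|         # SHA-256 of the DER-encoded leaf certificate as seen
|         # from this vantage point.
|         pem = ssl.get_server_certificate((host, port))
|         der = ssl.PEM_cert_to_DER_cert(pem)
|         return hashlib.sha256(der).hexdigest()
|
|     def perspectives_agree(host: str,
|                            notary_fingerprints: list[str]) -> bool:
|         # If notaries elsewhere on the internet see the same
|         # certificate we do, a purely local man-in-the-middle
|         # shows up as a mismatch.
|         mine = leaf_fingerprint(host)
|         return all(fp == mine for fp in notary_fingerprints)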
| renewiltord wrote:
| It's usually lower energy to exploit the server running the ACME
| client, or the infra around it, than to subvert the internet
| infra that surrounds the LE infra.
|
| For instance, you can pretty easily subvert the internet infra
| around some ccTLD if you're that country, but then who really
| owns the domain? Probably you, the country, since you can do
| anything with the DNS and then anything with the traffic.
| tialaramex wrote:
| Yes, this could work, and has definitely been done,
| sometimes, against other public CAs. We found convincing
| evidence of this during work at one of my previous
| employers.
|
| But what tempers my concern over that finding is that we
| found this by looking at cases where there's clearly a DNS
| takeover - and actually certificate issuance was rare. In most
| cases, it seems that if you can MitM, say, an Arab country's
| army headquarters' group mail servers, you can just offer no
| encryption or serve a self-signed certificate and the users will
| accept that. So while the Ten Blessed Methods are, as expected,
| not enough in the face of a resourceful adversary (in this case
| perhaps the Mossad, NSA or similar), they're also a padlock on a
| fence with a huge gaping hole in it; our first attention should
| be on fixing the hole in the fence.
| level3 wrote:
| Let's Encrypt also mitigates this by validating from
| multiple vantage points, so a single man-in-the-middle is
| insufficient.
| Denvercoder9 wrote:
| > man-in-the-middling Let's Encrypt and a particular domain
| (or DNS, depending on the domain validation challenge)
|
| Let's Encrypt issues multiple verification requests from
| multiple servers in different locations, both physically
| and in the network topology. If you can MITM that, you've
| pretty much taken over the domain and the ability to get a
| certificate isn't the worst of the operator's problems.
| marcosdumay wrote:
| > and the ability to get a certificate isn't the worst of
| > the operator's problems
|
| That assumes a lot about the operator's goals and values.
| It may very well be their worst problem. E.g. a journalist
| in a dictatorship would very likely prefer to have no cloud
| service at all rather than upload their data to a
| compromised one.
|
| It's just that, if it is their worst problem, TLS is
| patently insufficient, so they must think about that when
| setting the system up.
| Jonnax wrote:
| Those scary warning pages are an indication that someone is
| intercepting your traffic.
|
| What do you think happens when you connect to a network and your
| phone sends a username/password to a server, but the traffic is
| instead being intercepted?
|
| And the user has no idea.
| jaywalk wrote:
| They verify domain ownership. If browsers accepted self-signed
| certificates, then as long as I could intercept your DNS
| requests (by running a public Wi-Fi network next to a Starbucks,
| for example) I could bring you to my malicious google.com
| without you knowing. That's no good.
| acct776 wrote:
| Highly recommend reading someone's applied-essentials guide on
| certs and the various methods of accomplishing SSL for self-
| hosted stuff.
|
| This stuff is much more complicated when looked at in isolation;
| the full picture is easiest to understand.
| zanecodes wrote:
| I'm somewhat familiar with certificate handling in general, I
| had just forgotten how Let's Encrypt performs domain
| validation; it's been a few years since I used it and it's
| worked so well that I haven't had to think about it since,
| which is probably a testament to its stability!
|
| To be sure, PKI and certificates in particular have a lot of
| room for improvement in the UX department. Especially on
| Windows, where one frequently has to deal with not just .pem
| files but .cer, .pfx (with or without private keys), and
| more.
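|
| For the .pfx pain specifically, a small sketch of converting a
| PKCS#12 bundle to PEM with the cryptography package (file names
| and password are placeholders):
|
|     from cryptography.hazmat.primitives.serialization import (
|         Encoding, NoEncryption, PrivateFormat, pkcs12,
|     )
|
|     with open("bundle.pfx", "rb") as f:
|         key, cert, extras = pkcs12.load_key_and_certificates(
|             f.read(), b"password")
|
|     with open("cert.pem", "wb") as f:
|         f.write(cert.public_bytes(Encoding.PEM))
|         for extra in extras:              # intermediates, if any
|             f.write(extra.public_bytes(Encoding.PEM))
|
|     with open("key.pem", "wb") as f:
|         f.write(key.private_bytes(Encoding.PEM, PrivateFormat.PKCS8,
|                                   NoEncryption()))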
___________________________________________________________________
(page generated 2021-02-10 23:01 UTC)