[HN Gopher] The browser's biggest TLS mistake
       ___________________________________________________________________
        
       The browser's biggest TLS mistake
        
       Author : greyface-
       Score  : 105 points
       Date   : 2024-01-07 16:17 UTC (1 day ago)
        
 (HTM) web link (blog.benjojo.co.uk)
 (TXT) w3m dump (blog.benjojo.co.uk)
        
       | dochne wrote:
        | It remains a mystery to me why browsers felt they should "fix"
        | this server misconfiguration.
        | 
        | It's particularly vexing to me, as the main reason people end
        | up with misconfigured servers at all is that after they've
        | configured their new cert (incorrectly), their web browser
        | gives them a tick and they think they've done it right - after
        | all, why wouldn't they?
        
         | CableNinja wrote:
          | In my hosting days, we relied on the ssl checker that
          | ssl-shopper has. The browser was never considered a valid
          | test for us. It was the final validation, but a proper ssl
          | checker was the real test.
        
         | chowells wrote:
         | It's Postel's Law being bad advice yet again. No, you should
         | not be liberal in what you accept, because being liberal in
         | what you accept causes even more malformed data to appear in
         | the ecosystem.
        
           | drdaeman wrote:
           | That battle is long lost.
           | 
            | For me the revelatory moment was in the mid-00s, when
            | everyone screamed anathema at XHTML, saying it was bad
            | because it required people to write well-formed documents,
            | when everyone just wanted to slap down random tags and
            | somehow have that steaming mess still work.
            | 
            | There must be some sort of law that says that in tech the
            | crudest pile of hacks wins over any formally elegant
            | solution every single time those hacks let one do
            | something that would otherwise require extra effort, even
            | if it works only by the wildest chance.
        
             | joshuaissac wrote:
             | > There must me some sort of law that says in tech the
             | crudest pile of hacks wins over any formally elegant
             | solution
             | 
             | This is called 'Worse is better'.
             | 
             | https://en.wikipedia.org/wiki/Worse_is_better
        
             | TylerE wrote:
              | The biggest objection I and many others had at the time
              | was that writing XHTML forced one to deal with the hell
              | that is XML namespaces, which many tools at the time
              | barely supported.
        
           | TedDoesntTalk wrote:
           | > bad advice ... being liberal in what you accept causes even
           | more malformed data to appear in the ecosystem.
           | 
           | This is one perspective. Another is to be robust and
           | resilient. Resiliency is a hallmark of good engineering. I
           | get the sense you have not worked on server-side software
           | that has thousands or millions of different clients.
        
             | chowells wrote:
             | I absolutely have. And I've never modified a server to
             | accept bullshit from an incorrect client. I have, on the
             | other hand, told several people how to fix their clients
             | when they complain it doesn't work with my service. I
             | actually rather enjoy improving the ecosystem, even if it's
             | not strictly my job. It's better for everyone.
        
             | dns_snek wrote:
              | Where do you draw the line? Usually there's exactly one
              | intended, standard way of communicating with another
              | system, while there are infinite opportunities to
              | deviate from that standard and infinite opportunities
              | for the other party to try to guess what you _really_
              | meant. This results in a combinatorial explosion of
              | unintended behaviors that lead to bugs and critical
              | security vulnerabilities.
        
             | SAI_Peregrinus wrote:
             | Postel's Law should be called the "Hardness Principle", not
             | the "Robustness Principle". Much like how hardening a metal
             | makes it take more force to break, but results in it being
             | brittle & failing catastrophically when it does, so
             | Postel's law makes systems harder to break initially, but
             | results in more damage when they do fail. It also makes the
             | system harder to maintain, thus adding a pun to the name.
        
         | kevincox wrote:
          | A common way these things happen is that one browser does
          | it, and then if the others don't copy it, they appear
          | "broken" to users.
         | 
          | IDK what happened in this case, but it is pretty easy to
          | imagine that Chrome accidentally allowed validation against
          | certificates in its local cache. Maybe it added some sort of
          | validation cache to avoid rechecking revocation lists, OCSP,
          | or similar, and it would use intermediates from other sites.
          | Then people tested their site in Chrome and it seemed to
          | work. Now Firefox seems broken if it doesn't support this.
          | So Firefox decided to implement it, and do something more
          | robust by preloading a fixed list rather than relying on
          | whatever happens to be in the cache.
         | 
         | Basically no browser wants to be the first to stop supporting
         | this hack.
        
           | jakub_g wrote:
           | The mechanism for caching seen certs dates back to Internet
           | Explorer / Netscape times
           | 
           | https://bugzilla.mozilla.org/show_bug.cgi?id=399324#c16
        
         | kevingadd wrote:
         | Maybe in the modem days the smaller certificate was considered
         | ideal for connection overhead?
        
         | hannob wrote:
          | This is ultimately an application of the "robustness
          | principle", or Postel's law, which was how people built
          | stuff in the early Internet.
         | 
         | Plenty of people believe these days that this was never a wise
         | guideline to begin with (see
         | https://www.ietf.org/archive/id/draft-iab-protocol-maintenan...
         | which unfortunately never made it to an RFC). However, one of
         | the problems is that once you started accepting
         | misconfigurations, it's hard to change your defaults.
        
           | ekr____ wrote:
           | It actually did end up as RFC 9413, albeit somewhat softened.
           | 
           | https://datatracker.ietf.org/doc/rfc9413/
        
         | stefan_ wrote:
          | Do browsers do this, or is this another OpenSSL Easter egg
          | we all have to live with?
         | 
         | I remember that OpenSSL also validates certificate chains with
         | duplicates, despite that obviously breaking the chain property.
          | That's wasteful, but also very annoying, because TLS
          | libraries like BearSSL don't (I guess you could hack around
          | it by remembering the previous hash and staying in fixed
          | space).
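The duplicate-certificate misconfiguration described above is easy to spot offline. A minimal sketch, comparing PEM blocks as raw text (real tooling would parse and normalize the certificates; the approach here is illustrative only):

```python
import hashlib
import re

# Match each PEM certificate block in a served chain file.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)

def duplicate_certs(pem_text: str) -> list[str]:
    """Return SHA-256 fingerprints of blocks that appear more than once."""
    seen: set[str] = set()
    dups: list[str] = []
    for block in PEM_BLOCK.findall(pem_text):
        fp = hashlib.sha256(block.encode()).hexdigest()
        if fp in seen and fp not in dups:
            dups.append(fp)
        seen.add(fp)
    return dups
```

Run against your configured chain file; a non-empty result means OpenSSL will tolerate the chain while stricter stacks like BearSSL may not.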
        
           | tialaramex wrote:
           | The chain "property" was never enforced anywhere of
           | consequence and is gone in TLS 1.3
           | 
           | In practice other than the position of the end entity's
           | certificate, the "chain" is just a set of documents which
           | might aid your client in verifying that this end entity
            | certificate is OK. If you receive, in addition to the end
            | entity certificate, certs A, B, C and D, it's completely
            | fine as far as you're concerned if certificate D has
            | expired, certificate B is malformed, and certificate A
            | doesn't relate to this end-entity certificate at all, so
            | long as you're able (perhaps with the aid of C) to
            | conclude that yes, this is the right end entity and it's
            | a trustworthy certificate.
           | 
           | Insisting on a chain imagines that the Web PKI's trust graph
           | is a DAG and it is not. So since the trust graph we're
           | excerpting has cycles and is generally a complete mess we
           | need to accept that we can't necessarily turn a section of
            | that graph (if it even was _one_ graph, which it isn't; each
           | client possibly has a slightly different trust set) into a
           | chain.
        
             | stefan_ wrote:
             | You are overthinking it. Some sysadmin copying the same
             | cert into the chain twice because AWS is confusing and
             | doesn't care and OpenSSL doesn't care isn't resolving the
             | grand problem of the trust graph, it's just a loss overall,
             | for everyone. Nobody wins here.
             | 
             | (Of course the 1.3 approach of throwing a bunch of
             | certificates and then asking to resolve over all of them
             | breaks BearSSL comprehensively)
        
               | toast0 wrote:
               | Yes, it's useless to include the CA cert, and to include
               | extra copies, and all those other things.
               | 
               | But requiring the cert chain to be exactly correct is
               | also useless if you need to address clients with
               | different root cert packages. If some clients have only
               | root A and some have only root B, but B did a cross-sign
                | for A, you're ok if you send: entity signed by
                | intermediate, intermediate signed by A, and A signed
                | by B. Clients with only A short-circuit after they
                | see an intermediate signed by A, and the clients with
                | only B should be fine too. Of course it gets real weird when the
               | B root has expired, and clients often have A and B, but
               | some don't check if their roots expired, and some won't
               | short circuit to validating with A, so they fail the cert
               | because B is expired.
               | 
               | Oh, and TLS handshakes in the wild don't give you
               | explicit information about what roots they have or what
               | client / version they are. Sometimes you can get a little
               | bit of information and return different cert chains to
               | different clients, but there's also not a lot of support
               | for that in most server stacks.
               | 
               | I don't necessarily like TLS 1.3's approach of end entity
               | cert comes first and then just try all the permutations
               | and accept any one that works, but at least it presents a
               | way to get to success given the reality we live in. I'd
               | also love to see some way to get your end entity cert
               | signed by multiple intermediates, but that's a whole
               | nother level of terrible.
               | 
               | #notbitter
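The "try paths over an unordered set until one works" behavior discussed above can be sketched as a graph search. This is a toy model, not real validation: certificates are reduced to `(subject, issuer)` name pairs, and the names, the cross-sign scenario, and the tuple encoding are all illustrative assumptions (a real client also checks signatures, expiry, and constraints):

```python
# Toy model: a certificate is a (subject, issuer) pair; the handshake
# delivers an unordered pool of extra certs. A client accepts the end
# entity if ANY path leads from it to a subject it already trusts.
def find_path(leaf, pool, trusted_subjects):
    subject, issuer = leaf
    if issuer in trusted_subjects:
        # Short circuit: we already trust the issuer directly.
        return [subject, issuer]
    for cert in pool:
        if cert[0] == issuer:
            # This pool cert could vouch for us; recurse with it
            # removed so duplicates or cycles cannot loop forever.
            rest = [c for c in pool if c is not cert]
            sub_path = find_path(cert, rest, trusted_subjects)
            if sub_path is not None:
                return [subject] + sub_path
    return None  # no acceptable path; reject the certificate

# Cross-sign scenario from the comment above: intermediate signed by
# root A, and A itself cross-signed by root B.
pool = [("Int", "A"), ("A", "B")]
leaf = ("site.example", "Int")
```

Here a client trusting A short-circuits to `['site.example', 'Int', 'A']`, while a client trusting only B walks through the cross-sign to `['site.example', 'Int', 'A', 'B']`; a stale or irrelevant cert in the pool simply never ends up on the chosen path.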
        
         | gregmac wrote:
         | It wasn't long ago when TLS was not the norm and many, many
         | sites were served over plain HTTP, even when they accepted
         | logins or contained other sensitive data. There's a good chance
         | this decision was a trade-off to make TLS simpler to get
         | working in order to get more sites using it.
         | 
         | Browsers have a long history of accepting bad data, including
         | malformed headers, invalid HTML, and maintaining workarounds
         | for long-since-fixed bugs. This isn't really that different.
        
           | samus wrote:
            | Really? You receive two files from your CA. One of them is
            | the leaf, the other one is the chain. You just have to
            | upload the latter (_not_ the former) into the server's
            | config directory. That doesn't sound that hard.
           | 
           | If it actually is, I am ready to eat my words, but the actual
           | blame would be on the webserver developers then. Default
           | settings should be boring, but secure; advanced configuration
           | should be approachable; and dangerous settings should require
           | the admin to jump through hoops.
        
         | samus wrote:
          | It's very difficult in practice to shift the blame to the
          | website. Even though the browser would be right in refusing
          | the connection, the net effect is that the user would just
          | use another browser to access that website. The proper
          | workaround (Firefox shipping intermediate certificates)
          | doesn't actually damage security. It just means more work
          | for the maintainers. That's a fair tradeoff for achieving
          | more market share.
         | 
         | It's the same reason why browsers must be able to robustly
         | digest HTML5 tagsoup instead of just blanking out, which is how
         | a conforming XML processor would have to react.
        
       | anabolic wrote:
        | Aside from the browsers, I don't know how many times I've had
        | to fix TLS handshake failures due to the server sending only
        | the leaf certificate, or argue with people who insist on
        | shoving the full CA chain (sometimes including the leaf) into
        | the ca-bundle/truststore, even after I link them to the TLS
        | RFC.
        | 
        | This really should be better documented and enforced.
        
       | advisedwang wrote:
       | I get that this feels un-pure, but what is the actual damage of
       | validating against cached intermediate certs? The only concrete
       | thing the author cites is harder debugging, but that's a pretty
       | weak objection.
        
         | jimmyl02 wrote:
          | I think it's mainly the change in behavior that could be
          | viewed as concerning. As a user, if the order of websites
          | you visit after browser startup determines whether you get
          | a successful connection or an invalid-certificate error,
          | that's a pretty confusing experience.
         | 
         | I definitely didn't know this before and if I saw this behavior
         | I would be pretty confused.
        
         | anonacct37 wrote:
          | The actual damage is that it's pretty common (my last team
          | had this happen) for a team to set up a cert, verify it
          | works, and then when they deploy it, it works only some of
          | the time or "works on my machine", so the failures seem
          | really random and are by definition hard to reproduce,
          | because you have to restart Chrome to reproduce them.
          | 
          | Probably the tl;dr is that validating against a persistent
          | cache, like Firefox does, is fine. Validating against an
          | ephemeral cache, like Chrome does, is likely to cause a lot
          | of breakage.
        
           | KAMSPioneer wrote:
           | Sort of a corollary to your point: if an admin sets up a
           | website and verifies with Firefox (or Chromium, whatever),
           | and then later the server needs to communicate
           | with...basically any tool that speaks HTTPS but isn't a web
           | browser, then there will be many tears shed by that admin.
           | 
           | For instance, you stand up a server, and then a user
           | complains their script using cURL, wget, etc. doesn't work,
           | and if you aren't paying attention you'll have no idea why.
           | 
           | Inb4 why can't the OS certificate store just do the same
           | thing: I suspect people will tend to install OS updates less
            | frequently than browser updates, so it will tend to be less
           | reliable.
        
             | anon4242 wrote:
              | This is why you should do `openssl s_client -connect
              | <your site>:443` to verify TLS when changing your
              | server's TLS certs.
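To make that concrete, here is a self-contained sketch: it builds a throwaway root → intermediate → leaf chain in a scratch directory and shows that `openssl verify` rejects the leaf unless the intermediate is supplied, which is exactly the misconfiguration browsers paper over. All filenames and subject names are throwaway; for a live server, inspect what is actually sent with `openssl s_client -connect yoursite:443 -showcerts`.

```shell
# Build a toy chain in a scratch dir (all files are throwaway).
cd "$(mktemp -d)"
printf 'basicConstraints=critical,CA:TRUE\n' > ca.ext

# Self-signed toy root.
openssl req -x509 -newkey rsa:2048 -nodes -keyout root.key \
    -out root.pem -subj "/CN=Toy Root" -days 1

# Intermediate signed by the root (CA:TRUE so it may issue certs).
openssl req -newkey rsa:2048 -nodes -keyout int.key -out int.csr \
    -subj "/CN=Toy Intermediate"
openssl x509 -req -in int.csr -CA root.pem -CAkey root.key \
    -CAcreateserial -extfile ca.ext -out int.pem -days 1

# Leaf signed by the intermediate.
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
    -subj "/CN=leaf.example"
openssl x509 -req -in leaf.csr -CA int.pem -CAkey int.key \
    -CAcreateserial -out leaf.pem -days 1

# Leaf alone does NOT verify against the root...
openssl verify -CAfile root.pem leaf.pem || echo "incomplete chain rejected"
# ...but succeeds once the served intermediate is supplied.
openssl verify -CAfile root.pem -untrusted int.pem leaf.pem
```

No browser (and no browser cache) is involved, so the result reflects only the certificate material you actually configured.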
        
         | minitech wrote:
         | Seems like a potential fingerprinting risk, but I haven't
         | checked if the actual implementation is built to guard against
         | that somehow.
        
         | anon4242 wrote:
          | It's a potential Heisenbug for (some of) your JavaScript
          | code. Sometimes things work on some machines and sometimes
          | they don't. Unless you have the cert-chain misconfig in your
         | brain-cache you'd probably spend hours debugging confusing bug
         | reports from customers that you fail to reproduce reliably. So
         | it's not just harder to debug, it causes bugs (and indirectly
         | bug-reports you'll need to investigate).
        
       | moduspol wrote:
       | It's a little off-topic, but I definitely like the throwback to
       | Microsoft FrontPage. I don't know if this page was laid out in
       | it, or it's just the theme, but I haven't seen it in decades.
       | Nice little trip down memory lane.
        
       | Jenk wrote:
        | My naive understanding is that all certs contain (at least)
        | the id/thumbprint of their issuing cert. If I am not
        | mistaken, how is TLS broken by sending only the leaf and/or
        | intermediary, if the client is able to correctly identify the
        | issuer as known/trusted via this thumbprint?
        
         | politelemon wrote:
         | If I'm understanding correctly, the post isn't calling TLS
         | broken, it's calling out the bad behavior of browsers.
         | 
         | By employing mitigations/workarounds, they encourage
         | misconfigured servers, and that in turn produces unexpected or
         | inconsistent behaviors when interacting with those servers
         | through different client types. eg you might see different
         | behavior in FF vs Chrome, or Chrome vs curl/python, etc.
        
           | Jenk wrote:
           | Apologies, I didn't mean breaking TLS itself but breaking the
           | trust that said protocol is providing.
           | 
           | Thanks for explaining.
        
         | tialaramex wrote:
          | You can do this, and historically some browsers did; it's
          | called AIA chasing (which is why AIA is mentioned briefly
          | in the blog post).
         | 
         | The problem with AIA chasing is that it's a privacy violation.
         | Not a _huge_ one but enough to be completely unacceptable for
         | say Mozilla.
         | 
         | In fetching the needed intermediates we reveal which
         | intermediates we needed, if you're Let's Encrypt you operate
         | only a few intermediates and you issue a truly astounding
         | number of certificates from each so that's barely information,
         | but if you're a small CA and you have say a dozen intermediates
         | you absolutely could arrange for say pro-Party A sites to all
         | use intermediate #6 while also pro-Party B sites used
         | intermediate #8 and then use the resulting data from AIA
         | chasing to measure who is going to "Party A" sites and direct
         | political advertising at those people...
        
       | phh wrote:
       | Gosh yes, count me in the "I hate this behavior" group.
       | 
        | I've hit such horrible issues, like "why does this work in
        | Chrome, but not in Firefox and Chromium?!" (ofc now we know
        | it's not really about the browser). It really made my heart
        | hurt. Also, if my memory is correct, when the intermediate is
        | in the cache and you look at the cert chain in the browser,
        | it'll show the chain including the cached cert, without
        | mentioning the cache. So you end up at "why the fuck does the
        | http server I just deployed play browser favoritism and
        | change the exposed cert chain based on the user agent...
        | which it doesn't even have???", which then turns into "okay,
        | the simple logical explanation is that it's http2 vs http1"
        | (because that was when http2 deployment had just started).
       | 
       | And of course at some point you hit the moment where you no
       | longer reproduce the issue because you visited the wrong website.
       | 
       | Thankfully I hit this issue only once, and it wasn't on a mission
       | critical service that I had to fix ASAP, so it wasn't too
       | horrible, but still.
        
       | aflukasz wrote:
        | Related: There is a 10+ year old Python issue about
        | implementing "AIA chasing" to handle server
        | misconfigurations as described in this article:
        | https://github.com/python/cpython/issues/62817. The article
        | mentions this approach in the last paragraph.
        | 
        | There is at least one 3rd-party Python lib that does this, if
        | you are interested in the details of how it works:
        | https://github.com/danilobellini/aia.
        
       | AtNightWeCode wrote:
       | Biggest? How about serving TLS certs when doing direct IP access?
       | Or how about leaking sub domains in TLS certs?
       | 
        | I, as a mediocre hacker, cough, security advisor, cough, use
        | certs to find vulnerable subdomains all the time. Or, at
        | least, I get to play around in your test envs.
       | 
       | Edit: Ok, the problem in the topic is also not good.
        
         | dylan604 wrote:
          | This strikes me as interesting, even though it's a field I
          | have only a light understanding of, and I feel like I might
          | fall victim to this. Asking for a friend, but if that
          | friend uses Let's Encrypt to create certs for subdomains on
          | a single vhost, what would that friend need to do to see
          | the information you are seeing?
        
           | oarsinsync wrote:
           | Certificate issuance transparency logs are public.
           | 
           | Every time a CA issues a new certificate, it gets logged in a
           | public log.
           | 
           | Every time someone sets up nextcloud.example.tld, and gets an
           | SSL cert issued by a CA, that gets logged in public. If
           | nextcloud.example.tld resolves, and responds on tcp/80 and/or
           | tcp/443, you've got yourself a potential target.
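As an illustrative sketch of mining those public logs for hostnames: crt.sh can return matching log entries as JSON, where each entry's `name_value` field holds newline-separated DNS names. The sample data below is fabricated, and the query URL and field layout are assumptions about crt.sh's output format, not guarantees:

```python
import json

def hostnames(ct_json: str) -> set[str]:
    """Collect unique DNS names from crt.sh-style JSON entries.
    One log entry may list several names, separated by newlines."""
    names: set[str] = set()
    for entry in json.loads(ct_json):
        names.update(entry["name_value"].splitlines())
    return names

# Sample of what https://crt.sh/?q=%25.example.tld&output=json might
# return (fabricated data for illustration):
sample = json.dumps([
    {"name_value": "nextcloud.example.tld\nwww.example.tld"},
    {"name_value": "*.example.tld"},
])
```

Here `hostnames(sample)` yields the three names above; in the scenario described, each one that resolves and answers on tcp/80 or tcp/443 is a potential target.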
        
         | MilaM wrote:
         | I was wondering recently if it's better to use wildcard certs
         | because of this. On the other hand, all sub-domains are
          | discoverable through DNS anyway. Does it then make a
         | difference if the sub-domains are logged in the certificate
         | transparency logs?
        
       | rgbrenner wrote:
        | I see nothing wrong with Firefox's behavior in this case.
        | Root CA certificates are just certificates that the issuer
        | paid and lobbied to have included in the browser's
        | certificate store, agreeing to rules about what they can and
        | cannot do (to maintain trustworthiness).
        | 
        | The root CA then just signs a certificate for the
        | intermediate (allowing it to issue certificates), and
        | requires the intermediate to contractually agree to the same
        | rules... because if the intermediate violates the rules, the
        | root CA and all its intermediates would be removed from the
        | browser.
        | 
        | Other than the contractual relationship between the parties,
        | there's no difference between an intermediate CA and a root
        | CA. Including an intermediate certificate in the store makes
        | it equivalent to a root certificate (which is a certificate
        | that is trusted only because it is included in the browser's
        | store).
        | 
        | The only downside to this I can see is that it creates more
        | work to maintain the certificate store. Firefox can just ask
        | the root CA to notify them when they issue an
        | intermediate... it isn't like this happens every day.
        
         | tialaramex wrote:
         | > Firefox can just ask the root CA to notify them when they
         | issue an intermediate
         | 
         | In fact notifying m.d.s.policy is already _required_ whenever
         | such an intermediate is created
         | 
         | 5.3.2 "The operator of a CA certificate included in Mozilla's
         | root store MUST publicly disclose in the CCADB all CA
         | certificates it issues that chain up to that CA certificate
         | trusted in Mozilla's root store that are technically capable of
         | issuing working server or email certificates,"
        
       | bighoss97 wrote:
        | This is a good behavior, since it means fewer bytes to
        | transfer per connection. Worst-case scenario, the browser
        | doesn't have, and can't get, the required intermediate certs,
        | and the connection fails.
        
         | minaguib wrote:
         | I agree with the author that the non-deterministic portion of
         | it is mildly insane.
         | 
          | Imagine a junior admin who installs a new certificate
          | (without the intermediate), tests in their Chrome, which
          | happens to have cached the intermediate, sees it validate
          | (LGTM), and moves on.
         | 
         | Meanwhile it gets deployed and some portion of the site's users
         | don't have the intermediate certificate cached = dungeon
         | collapse.
        
         | woodruffw wrote:
         | This doesn't really guarantee fewer bytes per connection. In
          | the worst case, the preloaded intermediate set is still
         | insufficient and the client has to resort to AIA chasing
         | instead (which is both slower _and_ leaks the client to the
         | issuing CA).
         | 
         | I understand this behavior from a "make the spurious user bug
         | reports go away" perspective, but it's still pretty gnarly :-)
        
       | shadowgovt wrote:
       | _sigh..._ Well, my days of being pissed off about some damn
       | detail of TLS or other certainly are coming to a middle.
        
       | asylteltine wrote:
        | Good article, but it's "methods", not "methodologies".
        | Methodology is the study of methods; 99.99% of the time you
        | want to say "method". Also "price", where you said "price
        | point".
        
       | donmcronald wrote:
        | > and a large number of government websites, not even just
        | limited to the United States government but other national
        | governments too
       | 
       | It's always eye opening to see governments and large businesses
       | that think expensive = good when they don't even know how to
       | properly configure what they're buying.
       | 
        | I've literally lost arguments against buying overpriced OV
        | certificates, and then had to spend time shoehorning them
        | into systems designed to be completely automated with LE /
        | ACME.
        
       | coffee-- wrote:
       | It was years in the making for Firefox to be able to do
       | "intermediate preloading" - we [0] had to make the policy changes
       | for all intermediates to be disclosed, and then let that take
       | effect [1].
       | 
       | Preloading like this shouldn't be necessary, I agree with the
       | author, but worse than this is any bug report of "Works in
       | Chrome, not in Firefox." Prior to this preloading behavior
       | shipping in Firefox 75, incorrectly-configured certificate chains
        | were a major source of those kinds of bugs [2].
       | 
       | [0] This was me (:jcj) and Dana [1]
       | https://wiki.mozilla.org/Security/CryptoEngineering/Intermed...
       | [2] https://blog.mozilla.org/security/2020/11/13/preloading-
       | inte...
        
       | gregmac wrote:
        | > Chrome will try to match intermediate certificates with
        | > what it has seen since the browser was started. This has
        | > the effect of meaning that a cold start of Chrome does not
        | > behave the same way as a Chrome that has been running for 4
        | > hours.
       | 
       | Holy crap. I have definitely run into this in the past, but had
       | no idea!
       | 
       | I was configuring a load balancer that served SSL certificates
       | for customer domains which included a mix of wildcard, client-
       | supplied and LetsEncrypt-obtained certificates, and was all
       | dynamically configured based on a backend admin app.
       | 
       | I was getting wildly inconsistent behavior where I'd randomly get
       | certificate validation errors, but then the problem would
       | disappear while diagnosing it. The problem would often (but not
       | always) re-occur on other systems or even on the same system days
       | later, and disappear while diagnosing. I never isolated it to
       | Chrome or the time-since-Chrome-was-restarted, but I do remember
        | figuring out it only affected certificates issued via an
        | intermediate CA. There was a pool of load balancers and I
        | remember us
       | spending a lot of time comparing them but never finding any
       | differences. The fix ended up being to always include the
       | complete certificate chain for everything, so I am pretty
       | confident this explains it.
       | 
       | This was several years ago, but maddening enough that reading
       | this triggered my memory of it.
        
       | MilaM wrote:
        | Thanks for submitting this article. I wasn't aware it was so
        | easy to misconfigure the TLS settings of web servers. It
        | might also explain some TLS errors I've encountered in the
        | past in Firefox.
       | 
       | It would be super helpful if someone could recommend easy ways to
       | check web servers for bad configurations like this.
        
       ___________________________________________________________________
       (page generated 2024-01-08 23:00 UTC)