[HN Gopher] Mediocre Engineer's Guide to HTTPS
       ___________________________________________________________________
        
       Mediocre Engineer's Guide to HTTPS
        
       Author : MediumD
       Score  : 135 points
       Date   : 2024-05-26 15:17 UTC (7 hours ago)
        
 (HTM) web link (devonperoutky.super.site)
 (TXT) w3m dump (devonperoutky.super.site)
        
       | jessriedel wrote:
       | Tangential question from a layman: when I lose access to a
       | particular website, or the internet as a whole, why is it so hard
       | to tell where in the chain the failure is occurring? Like it's
       | often unclear whether
       | 
       | * I've got a network misconfiguration on my local machine;
       | 
       | * My wifi connection to the router is down;
       | 
       | * The cable between my router and ISP is cut;
       | 
       | * My ISP is having large scale issues; or
       | 
       | * The website I'm trying to reach is down.
       | 
       | I've been given the vague impression that it has something to do
       | with a non-deterministic path by which requests are routed, but
       | this seems unconvincing. If some link on the path breaks, why
       | doesn't the last good link send a message backward that says
       | "Your message made it to me, but I tried to send it the next step
       | and it failed there."
        
         | cancerhacker wrote:
         | The browser reports the error closest to what it was doing at
         | the time - host not found? Well, the network was reliable
         | enough to reach a dns server that reported no address for the
         | name. But if the dns server itself can't be reached, it's some
         | sort of network error between you and that
         | server. The typical way to diagnose that kind of problem is to
         | perform all the steps yourself - can I ping the dns server
         | address? Can I resolve this host with that dns server? What
         | about a different dns server, maybe that particular name is
         | being excluded because of corporate policy. The command line
         | tools ping, traceroute and dig are useful if you want to get
         | into it.
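
The manual checks described above (resolve the name, then try to reach the host) can be sketched in a few lines. This is a minimal illustration, not a full troubleshooter: the function name `diagnose`, the default port 443, and the wording of the returned messages are all assumptions made for the example, and it only tries the first address the resolver returns.

```python
import socket


def diagnose(host: str, port: int = 443, timeout: float = 3.0) -> str:
    """Report the first stage that fails: DNS lookup, then TCP connect."""
    try:
        # Stage 1: name resolution (roughly what dig would check).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns failed: {exc}"

    family, socktype, proto, _canonname, sockaddr = infos[0]
    try:
        # Stage 2: TCP handshake with the first resolved address.
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except ConnectionRefusedError:
        return "tcp refused: host reachable, nothing listening on the port"
    except socket.timeout:
        return "tcp timeout: no reply, packets are probably being dropped"
    except OSError as exc:
        return f"tcp failed: {exc}"
    return "ok: dns and tcp both succeeded"
```

Calling `diagnose("example.com")` returns one of the strings above. A refusal and a timeout point in different directions: a refusal means something on the far end answered, while a timeout says only that nothing came back.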
        
         | harry_ord wrote:
         | Not a network person, only played with traceroute a long time
         | ago but I'm pretty sure that only really happens if you
         | explicitly ask for information about all the middle men.
         | 
         | Most of the time a lot of software kinda doesn't care about
         | what's happening just if it can do what it's told.
         | 
         | For Websites you often get more informative errors like 404,
         | 500 or something else.
        
           | recursive wrote:
           | If you're getting a status code like 404 or 500, it means
           | there's no problem between you and the web server. The status
           | codes come _from_ the server. The exception is when you get a
           | gateway/reverse proxy error. Usually 503 I think. That means
           | the web server is down, but there's another server in front
           | of it reporting that it's down.
        
             | harry_ord wrote:
             | True, I thought of those as they're just more informative
             | about why you're not getting what you're looking for.
        
             | YZF wrote:
             | 502 Bad Gateway.
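
The distinction drawn in this sub-thread can be written down as a small lookup. This is a rough sketch: the function name `who_answered` and the message strings are made up for illustration, and the grouping is only a heuristic, since an origin server can also return 503 on its own.

```python
from http import HTTPStatus

# Statuses commonly produced by a proxy or gateway on behalf of an origin
# server it could not use. Heuristic only: an origin can send 503 itself.
GATEWAY_STATUSES = {
    HTTPStatus.BAD_GATEWAY,          # 502
    HTTPStatus.SERVICE_UNAVAILABLE,  # 503
    HTTPStatus.GATEWAY_TIMEOUT,      # 504
}


def who_answered(status: int) -> str:
    """Guess which party produced an HTTP error status."""
    if status in GATEWAY_STATUSES:
        return "possibly a proxy or gateway in front of the origin"
    if 400 <= status < 600:
        return "an HTTP server answered, so the network path works"
    return "not an error status"
```

In every case, getting any status code at all means DNS, TCP, and (for HTTPS) TLS all worked well enough to carry an HTTP response.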
        
         | nurple wrote:
         | If ICMP is allowed into your network, your machine will most
         | likely receive a Destination Unreachable response from the host
         | that can't forward the packet further.
         | 
         | Your application won't see the ICMP message unless you
         | configure the socket to report them (these are considered
         | "transient" errors). On Linux this is done via the socket
         | option IP_RECVERR.
         | 
         | ETA: there's not a ton of value collecting errors at this layer
         | when you're working at L7. The errors that _do_ get surfaced
         | for DU at your layer will be appropriate for the failure
         | handling logic you'll inevitably have already. In this case I
         | think it'd be a timeout, as other layers implement retries in
         | the face of unreachable destinations.
         | 
         | I found these RFCs helpful re: how the TCP layer handles ICMP
         | errors: https://www.rfc-editor.org/rfc/rfc1122#page-103
         | 
         | Section 4.2.3.9:
         | 
         | > Since these Unreachable messages indicate soft error
         | conditions, TCP MUST NOT abort the connection, and it SHOULD
         | make the information available to the application.
         | 
         | > DISCUSSION: TCP could report the soft error condition to the
         | application layer with an upcall to the ERROR_REPORT routine,
         | or it could merely note the message and report it to the
         | application only when and if the TCP connection times out.
         | 
         | This one gets into the nitty gritty of how the stacks interact
         | in order to study ICMP as a vector for TCP attacks.
         | 
         | https://www.rfc-editor.org/rfc/rfc5927
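
For the curious, here is a minimal sketch of turning on the IP_RECVERR option mentioned above from Python. It assumes Linux and an IPv4 socket. The helper name `enable_icmp_error_reporting` is hypothetical, and the numeric fallback 11 is the value from Linux's `<linux/in.h>`, used because not every Python version exports the constant.

```python
import socket
import sys

# Linux value from <linux/in.h>; fall back to it when the running Python
# does not export socket.IP_RECVERR.
IP_RECVERR = getattr(socket, "IP_RECVERR", 11)


def enable_icmp_error_reporting(sock: socket.socket) -> None:
    """Ask the Linux kernel to queue ICMP errors for this IPv4 socket."""
    if not sys.platform.startswith("linux"):
        raise OSError("IP_RECVERR is a Linux-specific socket option")
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVERR, 1)
```

Once the option is set, queued errors are read from the socket's error queue, for example with `sock.recvmsg(...)` and the `socket.MSG_ERRQUEUE` flag. Parsing the returned ancillary data is left out of this sketch.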
        
         | arccy wrote:
         | http(s) is built on top of multiple layers (HTTP, TLS, TCP,
         | Ethernet...). A broken link in the lower layers can't really be
         | presented as a higher level message (because it has no access
         | to it).
        
         | YZF wrote:
         | For most people most issues would be in their home network. So
         | that's a good first guess for any connectivity problems. Rarely
         | it would be somewhere between your home and the ISP. If it's a
         | small rural ISP then it might be ISP->Internet though I'd think
         | that's rare. Most large scale ISPs have enough redundancy and
         | capacity.
         | 
         | As someone else mentioned ICMP addresses certain classes of
         | failures if enabled but I think the historical reason is more
         | along the lines of the Internet was meant to run over lossy
         | connections. For example, when a certain link is saturated
         | routers will just start dropping packets. Reporting each
         | dropped packet back to the sender is just not a good idea, it
         | adds load to a system already potentially operating at
         | capacity. TCP assumes packets can get lost and retransmits
         | them. When a link goes down routing protocols will potentially
         | send those retransmitted packets over a different link/path.
         | I.e. there's no real concept of "connection down" other than
         | the application layer or TCP eventually giving up (which can
         | take a very long time). The kind of ICMP message that will
         | immediately terminate a connection is when the server machine
         | doesn't have anything listening on the destination port.
        
         | treflop wrote:
         | It's possible to figure out exactly what failed if you know how
         | it all works.
         | 
         | But to write a tool that provides a useful description to the
         | user is near impossible because no two setups are the same,
         | it's not possible to know if something is intentional or not,
         | and it can be dangerous to just make an assumption based on
         | what the common causes are and just suggest to the user a
         | completely wrong answer.
         | 
         | For example, let's say you can't connect to a website because
         | the DNS server isn't responding and the host isn't responding.
         | You could tell the user that something is probably
         | misconfigured at your router or your ISP is having some issues.
         | 
         | However, it turns out that the actual reason was that your VPN
         | client updated your local routing tables and DNS server but
         | failed to remove the changes when you quit the client. How is a
         | troubleshooter supposed to know that the settings were
         | temporarily changed versus it being the permanent ones?
         | 
         | Once you try to start to write a troubleshooter that can
         | identify the actual cause, you realize that it's very difficult
         | due to the complexity and variation. At best you can write
         | something that usually spits out a correct answer but also
         | sometimes suggests something totally wrong and leads people
         | down a completely wrong path.
        
           | jessriedel wrote:
           | If Google dedicated 10 engineers full time to this problem
           | for 3 years, could they solve it?
        
         | AlienRobot wrote:
         | How are you trying to tell that?
         | 
         | If a web browser can't access a URL, it won't tell you why
         | exactly because there's a chance it diagnoses the reason wrong
         | and most users will be confused by that. I assume most
         | diagnosis tools work the same way. You need to make assumptions
         | about how the OS, hardware, and network are configured to be
         | able to say "the problem is here."
         | 
         | For example, when you access a website, the first thing that
         | needs to be done is check a domain name server (DNS) to get the
         | IP address of the web server. But where does the web browser
         | get the DNS IPs from? You can configure it in the browser. Or
         | in the OS. Or in your router. Or in your modem. And if you
         | don't, it gets them from the DHCP server the router connects
         | to, which could be your ISP's DHCP server (then you get your
         | ISP's default DNS) or it could also be some other router in an
         | organization's network.
         | 
         | If the DNS seems wrong it's easy to tell the IP is wrong but it
         | gets hard to say where that IP came from.
         | 
         | Even SSL could be a problem with the server having the wrong
         | certificates or it could be your computer having the wrong
         | certificates.
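
As one concrete instance of "where do the DNS IPs come from": on many Linux systems the resolver list ends up in `/etc/resolv.conf`, whichever of DHCP, a VPN client, or a local stub resolver wrote it. The sketch below parses text in that file's format. The function name `parse_nameservers` is an assumption for the example, and the file alone does not reveal who wrote the entries.

```python
from __future__ import annotations


def parse_nameservers(resolv_conf_text: str) -> list[str]:
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (resolv.conf allows '#' and ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers
```

For example, `parse_nameservers(open("/etc/resolv.conf").read())` lists the resolvers the system library will try. A browser with its own DNS-over-HTTPS setting may bypass them entirely, which is the kind of ambiguity the comment describes.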
        
       | _ache_ wrote:
       | Everything in that article is a little outdated, 30% of web
       | requests are in HTTP3 nowadays with CORS. There is no date of
       | publication.
        
         | recursive wrote:
         | 30% of requests are CORS? Surely this depends on what type of
         | development you're doing. I'm doing SaaS development for
         | systems generally deployed inside corporate networks. Very
         | close to 0% of requests are CORS. Same for HTTP3.
        
       | Snawoot wrote:
       | > The client generates a premaster secret, encrypts it with the
       | server's public key, and sends it to the server.
       | 
       | That hasn't been true for, like, ages.
        
         | Operyl wrote:
         | Down below it says this:
         | 
         | > Everything you've learned here is a lie.
         | 
         | > The process we just describe is for the original version of
         | TLS, which is outdated compared to the more modern version of
         | TLS 1.3.
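
For context on this exchange: TLS 1.3 dropped the RSA key transport step quoted above in favor of (EC)DHE key agreement. A client that wants to rule out the older handshake can refuse anything below TLS 1.3. The sketch below uses Python's standard `ssl` module. The helper name `tls13_only_context` is made up for illustration, and it assumes the underlying OpenSSL build supports TLS 1.3.

```python
import ssl


def tls13_only_context() -> ssl.SSLContext:
    """Client context that verifies certificates and requires TLS 1.3."""
    ctx = ssl.create_default_context()  # cert and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

Wrapping a socket with this context, for example `ctx.wrap_socket(sock, server_hostname=host)`, fails the handshake against servers that only speak TLS 1.2 or older. That is a design trade-off, not a free win.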
        
       | wonnage wrote:
       | This reads like an AI summary of an actual HTTPS explainer. Terms
       | get introduced with no context - no explanation of what a
       | certificate is or how the chain of trust works, assumes the
       | reader knows about public key cryptography, describes six out of
       | the seven OSI layers (RIP presentation layer) without mentioning
       | that term at all, etc.
       | 
       | TBF it is titled as mediocre!
        
         | MediumD wrote:
         | To be fair, I also didn't include the session layer!
         | 
         | My writing isn't a strength of mine, so I appreciate the
         | criticism. My writing going from "bad" -> "is it AI?" is
         | progress.
         | 
         | I struggled with where to "cut off" the explanation and public
         | key cryptography seemed like a good boundary and better
         | explained elsewhere, as did various OSI layers.
         | 
         | I probably should have gone over the cert and potentially the
         | full chain of trust, I'll give you that.
        
       | debo_ wrote:
       | > aka. Writing HTTP requests from San Francisco for $300K/year
       | 
       | Best part of the article!
        
       | jonwest wrote:
       | Does anyone have more examples of articles written in this
       | perspective? Regardless of my experience level I love diving
       | through "ELI(a mediocre engineer)" type explanations as I either
       | learn another piece that wasn't completely clear, or get
       | another set of examples to help explain it to other people.
       | Either way they're generally very helpful.
        
       | StrLght wrote:
       | Might be relevant: there's also a detailed and somewhat
       | interactive byte-by-byte example of TLS for TLSv1.2[0] and
       | TLSv1.3[1]. I absolutely love it and highly recommend checking it
       | out if you want to learn more about TLS.
       | 
       | [0]: https://tls12.xargs.org/
       | 
       | [1]: https://tls13.xargs.org/
        
       ___________________________________________________________________
       (page generated 2024-05-26 23:01 UTC)