[HN Gopher] HTTP/3 adoption is growing rapidly
___________________________________________________________________
HTTP/3 adoption is growing rapidly
Author : skilled
Score : 495 points
Date : 2023-10-05 10:59 UTC (12 hours ago)
(HTM) web link (blog.apnic.net)
(TXT) w3m dump (blog.apnic.net)
| [deleted]
| weird-eye-issue wrote:
| "In fact, the blog post you're reading right now was probably
| loaded over HTTP/3!"
|
| It actually isn't. But some of the third-party resources, like a
| captcha script, are loaded over HTTP/3.
| diggan wrote:
| Seeing the same thing both in Firefox and Chrome, every
| response from blog.apnic.net comes over HTTP/2, none of the
| HTTP/3 ones are from any apnic.net requests.
|
| Strange that a registry, which should have a really good idea about
| what kind of infrastructure it runs, would get which protocol
| it's using wrong.
| dgb23 wrote:
| Looks like they forgot that we are hitting a cache that isn't
| configured the same way.
| weird-eye-issue wrote:
| Appears to be Cloudflare's APO cache. I wonder why they
| haven't toggled HTTP/3 on in their Cloudflare dashboard
|
| (Just confirmed my sites with APO are being served with
| HTTP/3)
| krichprollsch wrote:
| The original blog post is served over HTTP/3 as far as I can
| see. https://pulse.internetsociety.org/blog/why-http-3-is-
| eating-...
| ReactiveJelly wrote:
| And I can't use HTTP/3 over Tor cause Tor only forwards TCP lol
|
| We'll get there
| MrThoughtful wrote:
| the blog post you're reading right now was probably
| loaded over HTTP/3!
|
| This made me curious. In the Firefox console, when I enable "raw"
| for the request headers, I see:
|
|       GET /2023/09/25/why-http-3-is-eating-the-world/ HTTP/2
|
| So I guess it was not?
| WJW wrote:
| Growing from 0 to ~27% of traffic in 2 years is great, but the
| graph from TFA shows that it was not a gradual increase but
| rather two big jumps in adoption:
|
| - One in mid 2021, to ~20%.
|
| - A second one in July 2022, to ~29%.
|
| Since then the traffic share has been pretty flat. I don't think
| that counts as "eating the world" at all tbh.
| sph wrote:
| Is the graph showing the number of requests served by HTTP/3,
| or the number of individual hosts that support it? I believe
| it's the former, and it's explainable by the fact that most
| people these days visit only the same handful of siloes:
| Google, Amazon, Facebook, Twitter and Youtube probably make
| 90+% of all anonymised requests seen by Mozilla. But I doubt
| even 20% of global webservers are HTTP/2 ready, let alone
| HTTP/3.
|
| So HTTP/3 is eating the world, if by world you only count FAANG
| websites, which sadly seems to be the case.
|
| Remains to be seen how many websites are hidden behind tech's
| favorite MITM, Cloudflare, which might make figures a little
| harder to discern.
| buro9 wrote:
| Right, and those big jumps are more to do with large Cloud
| providers or proxy providers switching it on, and less to do
| with web applications themselves switching it on.
|
| It's great that the upgrade path for the end clients is so
| easy, but it doesn't mean that the environment for web
| application developers has shifted towards HTTP/3.
| taf2 wrote:
| July 2022 may be linked to CloudFront supporting it:
| https://aws.amazon.com/blogs/aws/new-http-3-support-for-amaz...
| dchftcs wrote:
| There's a cookie I took out of a package for a nibble. You
| could say I'm eating that pack of cookies. Not justifying the
| sensationalist title though.
| ascar wrote:
| Interestingly (or rather, not surprisingly at all?), HTTP/3 is
| eating away at the HTTP/2 traffic, while HTTP/1 just stays on a
| very slow decline.
|
| Since HTTP/1 webservers are usually stuck there for a (more or
| less) good reason, they won't be converting to HTTP/3 anytime
| soon, while webservers that are already more modern are easier
| to upgrade from HTTP/2 to 3.
| oblio wrote:
| > Interestingly (or rather, not surprisingly at all?), HTTP/3 is
| eating away at the HTTP/2 traffic, while HTTP/1 just stays on a
| very slow decline.
|
| HTTP1: the OG, meant for a much smaller individual scale. So it's
| great even today for smaller websites -- where "smaller" right
| now, with modern hardware, can actually still mean hundreds
| of requests per second and millions per day...
|
| HTTP2/HTTP3: BigCorp optimization primarily for problems
| they're facing and not that many others are.
|
| So viewed through this lens it's clear what's happening.
|
| BigCorps are just moving from HTTP2 to HTTP3 and mom and pops
| don't give a crap, HTTP1 is good enough.
| charcircuit wrote:
| HTTP2 is useful even if you just want to show multiple
| images on a page. For HTTP1 you have to add all of your
| images to a sprite sheet to avoid sending a lot of
| requests.
| xg15 wrote:
| Let me guess, the first jump was Chrome rolling out QUIC
| support for Google properties, the second was Firefox following
| suit?
| 243423443 wrote:
| Frankly, I am a bit disappointed with the article. I fail to
| follow the argument that an encrypted header makes it easier to
| adopt HTTP/3 & QUIC because middleboxes cannot see the header.
| With HTTP/1.1 & TCP, the middleboxes should not be changing the
| TCP packets anyway, no? Also, the author does not point out that
| QUIC builds on UDP instead of directly building on IP like TCP
| does.
| lemagedurage wrote:
| Even though they arguably shouldn't, some middleboxes assume
| that protocols don't change. They have made it hard for
| protocols such as TCP and TLS to evolve without breaking
| things.
|
| Similarly, middleboxes have made it unviable to deploy
| protocols that aren't TCP or UDP based.
|
| https://en.wikipedia.org/wiki/Protocol_ossification
| 243423443 wrote:
| I did not know that protocol ossification was such a thing.
| Thanks for the link, it's an interesting article.
|
| It says that middleboxes near the edge are more likely to be
| the cause of ossification. Are there any stats about that?
| Such as some manufacturers or software companies "infringing"
| on the end-to-end principle more often than others?
| Karrot_Kream wrote:
| Not sure about stats but if you hang out in networking
| forums, you'll see netops complaining about bad
| implementations of protocols in their networking gear
| forever. This has been a huge problem in several protocols,
| everything from SCTP to IPSec to UDP RTSP.
| xg15 wrote:
| > _This means that metadata, such as [...] connection-close
| signals, which were visible to [...] all middleboxes in TCP, are
| now only available to the client and server in QUIC._
|
| Do I understand that correctly, that middleboxes are not supposed
| to know whether or not a connection still exists? Then how
| exactly is NAT going to work?
| tomalaci wrote:
| Will the spread of QUIC improve low-latency services such as
| video game servers? Last I checked they were using either gRPC or
| a homecooked protocol over UDP.
| plopz wrote:
| For games running in the browser, absolutely yes. They are all
| currently on TCP, and being able to switch to UDP is huge.
| KingMob wrote:
| Not a whole lot, but the connection identifier will probably
| make it much more seamless when your phone switches from wifi
| to mobile networks. Right now, that's somewhat disruptive; with
| QUIC, you might not even notice.
| tuetuopay wrote:
| Not really. Video game servers definitely don't want session-
| oriented protocols, and don't really care about dropped packets.
| Lost the packet with the player position? It comes at a regular
| interval, so we'll get it anyway, and more up to date! All
| they care about is the best latency possible. As for the non-
| time-sensitive stuff in games (scoreboard, nicknames, etc),
| there is no real benefit there either. It's not time sensitive;
| the connection gets opened once at the start and that's it.
|
| Where QUIC really shines is for fast connections and many
| streams, i.e. the web, where a page will need 500 resources spread
| across 100 hosts. In this case session establishment speed and
| parallel streams are paramount.
| [deleted]
| sschueller wrote:
| What? I just updated my nginx config to HTTP/2.
|
| V3 still seems to be experimental in nginx:
| https://nginx.org/en/docs/http/ngx_http_v3_module.html
| oblio wrote:
| And Apache?
|
| Edit:
|
| https://stackoverflow.com/questions/60324166/is-there-any-wa...
|
| No plans as of right now.
| olavgg wrote:
| Anything which involves more work than dnf/apt install nginx
| isn't worth wasting time on. So http2 for now.
| lost_tourist wrote:
| You'll be fine, there will be http2 for the next couple of
| decades at least. By then you'll be saying "Siri set up nginx
| for reverse proxying my new email server and open-facebook
| federated instance, oh and turn on the coffee maker, i'm gonna
| grab another 15 minutes of sleepy time"
| claar wrote:
| Yeah, and not available in nginx stable yet (currently nginx
| v1.24) - only available in nginx mainline (v1.25+).
|
| More info at https://quic.nginx.org/
| yakubin wrote:
| It's enabled by default in Caddy.
| ThePhysicist wrote:
| QUIC is a really nice protocol as well, I find. It basically
| gives you an end-to-end encrypted & authenticated channel over
| which you can transport multiple streams in parallel, as well as
| datagrams. A lost packet in a given stream won't block other
| streams, and the overhead is quite low. Both ends of the
| connection can open streams as well, so it's really easy to build
| bidirectional communication over QUIC. You can also do things
| like building a VPN tunnel using the datagram mechanism, there
| are protocols like MASQUE that aim to standardize this. Apple is
| using a custom MASQUE implementation for their private relay, for
| example.
|
| HTTP/3 is a protocol on top of QUIC that adds a few more really
| interesting things, like qpack header compression. If you e.g.
| send a "Content-type: text/html" header it will compress to 2
| bytes, as the protocol has a static table of the most commonly
| used header lines (and a dynamic table for repeated ones), plus
| Huffman coding for literal strings. I found that quite confusing
| when testing
| connections as I thought "It's impossible that I only get 2
| bytes, I sent a long header string..." until I found out about
| this.
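|
| A toy sketch of the idea in Python (not the real QPACK table or
| wire format; the table entries and byte layout here are made up
| purely for illustration):
|
|       # Common header lines map to a small index in a shared static
|       # table, so a long "name: value" pair can go out as one byte.
|       TOY_STATIC_TABLE = {
|           (":status", "200"): 1,
|           ("content-type", "text/html"): 2,
|           ("accept-encoding", "gzip, deflate, br"): 3,
|       }
|
|       def encode_header(name, value):
|           idx = TOY_STATIC_TABLE.get((name.lower(), value))
|           if idx is not None:
|               return bytes([0x80 | idx])          # "indexed field line"
|           line = f"{name}: {value}\r\n".encode()  # fall back to a literal
|           return bytes([len(line)]) + line
|
|       print(len(encode_header("Content-Type", "text/html")))   # tiny
|       print(len(encode_header("X-Custom-Header", "abcdef")))   # full length
|
| The real thing also keeps a dynamic table for header lines it has
| already seen on the connection, which is why repeated custom headers
| get cheap too.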
| Borg3 wrote:
| QUIC is junk.. You people all care about raw throughput but not
| about multiuser friendliness. Selfish. Problem is, QUIC is UDP
| and so it's hard to police/shape. I really want to play some FPS
| game while someone is watching/browsing the web. Also, I really
| want my corpo VPN to have a bit of priority over the web, but no,
| now I cannot police it easily. TCP is good for best-effort traffic,
| and that's where I classify web browsing, downloading, VoD
| streaming. UDP is good for gaming, voice/video conferencing,
| VPNs (because they encapsulate stuff, you put another layer
| somewhere else).
| cogman10 wrote:
| > now I cannot police it easly.
|
| Somewhat the point. The issue we've had is that multiple ISPs
| and governments have been "policing" TCP in unsavory ways. Security
| and QoS are just fundamentally at odds with each other.
| vlovich123 wrote:
| I feel like that's an ungenerous characterization. First,
| QUIC should contain some minimal connection info unencrypted
| that can be used by middleware to do some basic traffic shaping.
| It's also intentionally very careful to avoid showing too much,
| to avoid "smart" middleware that permanently ossifies the
| standard as has happened to TCP.
|
| Second, traffic shaping on a single machine is pretty easy,
| and most routers will prefer TCP traffic to UDP.
|
| Finally, the correct response to being overwhelmed is to drop
| packets. This is true for TCP and UDP, to trigger congestion
| control. Middleware has gotten way too clever by half and we
| have bufferbloat. To drop packets you don't need knowledge of
| streams -- just a non-skewed distribution applied to the drops,
| so that all traffic overwhelming you from a source is
| proportionally equally likely to be dropped. This ironically
| improves performance and latency, because well-behaving
| protocols like TCP and QUIC will throttle back their
| connections, and UDP protocols without throttling will just
| deal with elevated error rates.
| Borg3 wrote:
| So what? You drop packets, and they keep coming, eating BW
| and buckets. Because traditionally UDP did not have any flow
| control, you just treat it as kind of CBR traffic, and so you
| just want to let it leave the queues as fast as it can. If
| there was a lot of TCP traffic around, you just drop packets
| there and voila, congestion control kicks in and you have more
| room for important UDP traffic. Now, if you start to drop UDP
| packets your UX drops.. packet loss in FPS games is terrible,
| even worse than a bit of jitter. Thank you.
| vlovich123 wrote:
| I really don't follow your complaint. QUIC (and other
| similar UDP protocols like uTP used for BitTorrent)
| implement congestion control. If packets get dropped, the
| sender starts backing off which makes you a "fair" player
| on the public internet.
|
| As for gaming, that remains an unsolved problem, but QUIC
| being UDP based isn't any different than TCP. It's not
| like middleware boxes are trying to detect specific UDP
| applications and data flows to prioritize protecting
| gaming traffic from drops, which I think is what you're
| asking for.
| KMag wrote:
| But your complaint is about QUIC, not generic UDP. QUIC
| implements TCP-like flow control on top of UDP, designed
| to play well with TCP congestion control.
|
| QUIC does play well with others. It's just implemented in
| the userspace QUIC library instead of the network stack.
| Dylan16807 wrote:
| Is your complaint fundamentally that it's harder to tell the
| difference between games/voip and browser activity if you
| can't just sort TCP versus UDP?
|
| That's true, but it's not that big of a deal and definitely
| doesn't make QUIC "junk". Looking at the port will do 90% of
| the job, and from what I can tell it's easy to look at a few
| bytes of a new UDP stream to see if it's QUIC.
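|
| For what it's worth, a rough sketch of that check in Python, assuming
| you only care about QUIC v1 long-header packets (it ignores version
| negotiation, QUIC v2, and the short-header packets used after the
| handshake):
|
|       import struct
|
|       def looks_like_quic_v1_initial(payload):
|           # A QUIC long-header packet sets the top (header form) bit and
|           # the 0x40 fixed bit in the first byte, followed by a 4-byte
|           # version field; version 0x00000001 is QUIC v1 (RFC 9000).
|           if len(payload) < 5:
|               return False
|           if payload[0] & 0xC0 != 0xC0:
|               return False
|           (version,) = struct.unpack("!I", payload[1:5])
|           return version == 0x00000001
|
| A real classifier has to handle more cases, but the first packet of a
| connection is enough to decide how to treat the rest of the flow.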
| Borg3 wrote:
| Really? How.. Can you please tell me how I can detect QUIC
| by looking at bytes. Then I could drop only QUIC and not all
| of UDP/443.
| johnmaguire wrote:
| Upvoted because I think you bring up some interesting
| challenges, but you might consider a softer tone in the
| future. (Calling the OP "selfish" goes against site
| guidelines, and generally doesn't make people open to what
| you're saying.)
| adastra22 wrote:
| Agreed on tone, but I didn't read it as calling the OP
| selfish specifically.
| Borg3 wrote:
| That selfish was NOT aimed at the OP. It was for the general
| audience who prefer all the bandwidth for themselves. We
| know how most people behave. They do NOT really care what
| others are doing. For years, I was proud of my QoS because
| my entire home could utilize my (not so fast) Internet and
| I could always do gaming, because everything was QoS'd
| correctly. Nothing fancy, just separating TCP vs UDP and
| further, doing some tuning between TCP bulk vs interactive
| traffic. Same went for UDP, some separation for
| gaming/voip/interactive vs VPN (bulk). HTB is pretty decent
| for this.
| KMag wrote:
| QUIC has its own flow control. It's not raw UDP.
|
| Now, I wish ToS/QoS were more broadly usable for traffic
| prioritization.
|
| It sounds like you're using UDP vs. TCP as a proxy for
| ToS/QoS. At a minimum, you're still going to have a bad time
| with TCP streams getting encapsulated in UDP WireGuard VPN
| connections.
| kccqzy wrote:
| Shouldn't this be handled at a lower layer? QoS should be
| part of the IP layer, say using DSCP.
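|
| A minimal sketch of what that looks like from an application, assuming
| a Linux-style socket API as exposed by Python (whether routers on the
| path actually honour the marking is another question):
|
|       import socket
|
|       sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
|       # DSCP lives in the top 6 bits of the old IP TOS byte, so
|       # Expedited Forwarding (DSCP 46) becomes 46 << 2 = 0xB8.
|       sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 46 << 2)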
| d-z-m wrote:
| I believe that qpack (like hpack) still has some sharp edges. As
| in...low entropy headers are still vulnerable to leakage via a
| CRIME[0] style attack on the HTTP/3 header compression.
|
| In practice, high entropy headers aren't vulnerable, as an
| attacker has to match the entire header:value line in order to
| see a difference in the compression ratio[1].
|
| [0]: https://en.wikipedia.org/wiki/CRIME [1]:
| https://www.ietf.org/archive/id/draft-ietf-quic-qpack-20.htm...
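|
| A toy illustration of the underlying side channel using plain zlib
| (not qpack or TLS; the secret is made up, and with DEFLATE's output
| granularity the length difference may only show up once a guess
| matches several bytes of the secret):
|
|       import zlib
|
|       SECRET = "cookie: session=hunter2"   # hypothetical secret
|
|       def attacker_observed_length(guess):
|           # Attacker controls `guess` and can observe the size of the
|           # compressed message that also contains the secret.
|           return len(zlib.compress((guess + "\n" + SECRET).encode()))
|
|       for guess in ["cookie: session=a", "cookie: session=hunter"]:
|           print(guess, attacker_observed_length(guess))
|
| A guess that shares more of the secret compresses slightly better;
| with qpack's whole-line indexing the attacker only gets that signal
| by matching the entire header line, which is why high-entropy headers
| are safe in practice.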
| ComputerGuru wrote:
| I've been trying to wrap my mind around whether (and how much)
| QUIC would be better than TCP for video streams that rely on
| order-sensitive delivery of frames, especially for frames that
| have to be split into multiple packets (and losing one packet
| or receiving it out of order would lose the entire frame and
| any dependent frames). We used to use UDP for mpegts packets
| but, after switching to h264, found TCP with a reset when buffers
| are backlogged to be a much better option over lossy WAN
| uplinks (the scenario is even worse when you have B or P frames
| present).
|
| The problem would be a lot easier if there were a feedback loop
| to the compressor where you can dynamically reduce
| quality/bandwidth as the connection quality deteriorates but
| currently using the stock raspivid (or v4l2) interface makes
| that a bit difficult unless you're willing to explicitly stop
| and start the encoding all over again, which breaks the stream
| anyway.
| Karrot_Kream wrote:
| I can't wait to start implementing RTP atop QUIC so we can stop
| having to deal with the highly stateful SIP stack and open a
| media connection the same way we open any other Application
| layer connection.
| nly wrote:
| The problem is there's no standard equivalent of the BSD
| sockets API for writing programs that communicate over QUIC.
|
| It'll always be niche outside the browser until this exists.
| btown wrote:
| Since it lives on top of UDP, I believe all you need is
| SOCK_DGRAM, right? The rest of QUIC can be in a userspace
| library ergonomically designed for your programming language
| e.g. https://github.com/quinn-rs/quinn - and can interoperate
| with others who have made different choices.
|
| Alternately, if you need even higher performance, DPDK gives
| the abstractions you'd need; see e.g.
| https://dl.acm.org/doi/abs/10.1145/3565477.3569154 on
| performance characteristics.
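|
| Concretely, the kernel-facing part is just a datagram socket, with all
| of the QUIC state in the library. A sketch (the address and payload
| are placeholders, not a real QUIC handshake):
|
|       import socket
|
|       sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
|       sock.bind(("0.0.0.0", 0))   # any local port
|       # A userspace QUIC stack (quinn, quiche, msquic, ...) builds and
|       # parses the actual packets; the OS only shuttles datagrams.
|       sock.sendto(b"...client initial packet bytes...", ("192.0.2.10", 443))
|       data, addr = sock.recvfrom(65535)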
| rewmie wrote:
| > Since it lives on top of UDP, I believe all you need is
| SOCK_DGRAM, right? The rest of QUIC can be in a userspace
| library ergonomically designed for your programming
| language e.g. (...)
|
| I think that OP's point is that there's no standard
| equivalent of the BSD sockets API for writing programs that
| communicate over QUIC, which refers to the userspace
| library you've referred to.
|
| A random project hosted in GitHub is not the same as a
| standard API.
| pests wrote:
| > no standard equivalent of the BSD sockets API
|
| Did they not answer that question? It uses the BSD
| sockets API with SOCK_DGRAM?
|
| Right, that random project is not a standard API - it's
| built using a standard API. You wouldn't expect BSD
| sockets to have HTTP built in... so you can find third-
| party random projects for HTTP implemented with BSD
| sockets just like you can find QUIC implemented with BSD
| sockets.
| jlokier wrote:
| QUIC is roughly TCP-equivalent not HTTP-equivalent, and
| we do have a BSD sockets API for TCP. You might be
| thinking of HTTP/3 rather than QUIC; HTTP/3 actually is
| HTTP-equivalent.
|
| You can turn the OP's question around. Every modern OS
| kernel provides an efficient, shared TCP stack. It isn't
| normal to implement TCP separately in each application or
| as a userspace library, although this is done
| occasionally. Yet we currently expect QUIC to be
| implemented separately in each application, and the
| mechanisms which are in the OS kernel for TCP are
| implemented in the applications for QUIC.
|
| So why don't we implement TCP separately in each
| application, the way it's done with QUIC?
|
| Although there were some advantages to this while the
| protocol was experimental and being stabilised, and for
| compatibility when running new applications on older
| OSes, arguably QUIC should be moved into the OS kernel to
| sit alongide TCP now that it's stable. The benefit of
| having Chrome, Firefox et al stabilise HTTP/3 and QUIC
| were good, but that potentially changes when the protocol
| is stable but there are thousands of applications, each
| with their own QUIC implementation doing congestion
| control differently, scheduling etc, and no cooperation
| with each other the way the OS kernel does with TCP
| streams from concurrent applications. Currently we are
| trending towards a mix of good and poor QUIC
| implementations on the network (in terms of things like
| congestion control and packet flow timing), rather than a
| few good ones as happens with TCP because modern kernels
| all have good quality implementations of TCP.
| adwn wrote:
| Isn't the point of QUIC to offer high performance and
| flexibility at the same time? For these requirements, a
| one-size-fits-all API is rarely the way to go, so
| individual user-space implementations make sense. Compare
| this to file IO: For, many programs, the _open
| /read/write/close_ FD API is sufficient, but if you
| require more throughput or control, its better to use a
| lower-level kernel interface and implement the missing
| functionality in user-space, tailored to your particular
| needs.
| pests wrote:
| > QUIC is roughly TCP-equivalent not HTTP-equivalent, and
| we do have a BSD sockets API for TCP. You might be
| thinking of HTTP/3 rather than QUIC; HTTP/3 actually is
| HTTP-equivalent.
|
| No, I understand QUIC is a transport and HTTP/3 is the
| next HTTP protocol that runs over QUIC. I was saying QUIC
| can be userspace just like HTTP is userspace over kernel
| TCP API. We haven't moved HTTP handling into the kernel
| so what makes QUIC special?
|
| I think it is just too early to expect every operating
| system to have a standard API for this. We didn't have
| TCP api's built-in originally either.
| rewmie wrote:
| > I was saying QUIC can be userspace just like (...)
|
| I think you're too hung up on "can" when that's way
| beside the OP's point. The point is that providing access to
| fundamental features through a standard API is of
| critical importance.
|
| If QUIC is already massively adopted then there is no
| reason whatsoever to not provide a standard API.
|
| If QUIC was indeed developed to support changes then
| there are even fewer arguments for not providing a standard
| API.
| Dylan16807 wrote:
| > We haven't moved HTTP handling into the kernel so what
| makes QUIC special?
|
| I feel like they answered that, but I'll try rewording
| it.
|
| What makes TCP special that we put it in the kernel? A
| lot more of those answers apply to QUIC than to HTTP.
|
| > I think it is just too early to expect every operating
| system to have a standard API for this. We didn't have
| TCP api's built-in originally either.
|
| Okay, if you're comparing to TCP by saying it's too early
| then it sounds like you do already see the reasons in
| favor.
| btown wrote:
| It occurs to me that QUIC could benefit from a single
| kernel-level coordinator that can be plugged for
| cooperation - for instance, a dynamic bandwidth-
| throttling implementation a la https://tripmode.ch/ for
| slower connections where the coordinator can look at pre-
| encryption QUIC headers, not just the underlying
| (encrypted) UDP packets. So perhaps I was hasty to say
| that you just need SOCK_DGRAM after all!
| Animats wrote:
| But then Google couldn't prioritize ad content.
| rewmie wrote:
| > Did they not answer that question? It uses the BSD
| sockets API with SOCK_DGRAM?
|
| No, that does not answer the question, nor is it a valid
| answer to the question. Being able to send UDP datagrams
| is obviously not the same as establishing a QUIC
| connection. You're missing the whole point of the
| importance of having a standard API to establish network
| connections.
|
| > Right, that random project is not a standard API - its
| built using a standard API.
|
| Again, that's irrelevant and misses the whole point.
| Having access to a standard API to establish QUIC
| connections is as fundamental as having access to a
| standard API to establish a TCP connection.
|
| > so you can find third-party random projects for HTTP
| (...)
|
| The whole point of specifying standard APIs is to not
| have to search for and rely on random third-party
| projects to handle a fundamental aspect of your
| infrastructure.
| jeffbee wrote:
| Not having a system API is the entire point of QUIC. The only
| reason QUIC needs to exist is because the sockets API and the
| system TCP stacks are too ossified to be improved. If you
| move that boundary then QUIC will inevitably suffer from the
| same ossification that TCP displays today.
| rewmie wrote:
| > Not having a system API is the entire point of QUIC. The
| only reason QUIC needs to exist is because the sockets API
| and the system TCP stacks are too ossified to be improved.
|
| I don't think your take is correct.
|
| The entire point of QUIC was that you could not change TCP
| without introducing breaking changes, not that there were
| system APIs for TCP.
|
| Your point is also refuted by the fact that QUIC is built
| over UDP.
|
| As far as I can tell there is no real impediment to providing
| a system API for QUIC.
| tsimionescu wrote:
| No, the reason QUIC exists is that TCP is ossified at the
| level of middle boxes on the internet. If it had been
| possible to modify TCP with just some changes in the Linux,
| BSD and Windows kernels, it would have been done.
| lima wrote:
| Both, really. The OS-level ossification isn't quite as
| bad as the middleboxes, but is still glacially slow and
| bad enough to justify QUIC on its own.
|
| Case in point: the whole TCP Fast Open drama.
| Dylan16807 wrote:
| I don't know about that. Without middlebox problems we
| might have used _SCTP_ as a basis and upgraded it. But it's
| so different from TCP that I doubt we would have done it as
| a modification of TCP.
| gjvc wrote:
| worth defining "middlebox"
|
| """ A middlebox is a computer networking device that
| transforms, inspects, filters, and manipulates traffic
| for purposes other than packet forwarding. Examples of
| middleboxes include firewalls, network address
| translators (NATs), load balancers, and deep packet
| inspection (DPI) devices. """
|
| https://en.wikipedia.org/wiki/Middlebox
| mmis1000 wrote:
| I think the point of QUIC is 'if the implementation others are
| using is problematic, I can use my own. And no random
| middlebox will prevent me from doing so' instead of
| `everyone must bring their own QUIC implementation.`
|
| There is a slight difference here. It's the difference
| between 'the right to do' and 'the requirement to do'.
|
| Meanwhile, with TCP you must use the 'system TCP
| implementation' and you are not allowed to use a custom one.
| Because even if the system allows it (maybe requiring root
| permission or something), the middlebox won't.
| xg15 wrote:
| > _because the sockets API and the system TCP stacks are
| too ossified to be improved_
|
| What part of the sockets API specifically do you think is
| ossified? Also, that doesn't seem to have kept the kernel
| devs from introducing new IO APIs like io_uring.
| xg15 wrote:
| The alternative is that either browsers will be the only
| users of QUIC - or that each application is required to
| bring its own QUIC implementation embedded into the binary.
|
| If ossification was bad when every router and firewall had
| its own TCP stack, have fun in a world where every _app_
| has its own QUIC stack.
| aseipp wrote:
| User-space apps have a lot more avenues for timely
| updates than middleboxes or kernel-space implementations
| do though, and developers have lots of experience with
| it. If middleboxes actually received timely updates and
| bugfixes, there would be no ossification in the first
| place, and a lot of other experiments would have panned
| out much better, much sooner than QUIC has (e.g. TCP Fast
| Open might not have been DOA.)
|
| There's also a lot of work on interop testing for QUIC
| implementations; I think new implementations are strongly
| encouraged to join the effort:
| https://interop.seemann.io/
| jeffbee wrote:
| I am not seeing the problem with every participant
| linking in their own QUIC implementations. The problem of
| ossification is there is way too much policy hidden on
| the kernel side of the sockets API, and vanishingly few
| applications are actually trying to make contact with
| Mars, which is the use-case for which those policies are
| tuned.
| xg15 wrote:
| How would you make changes to the QUIC protocol then?
| jeffbee wrote:
| I wouldn't. I would write down the protocol in a good and
| extensible way, the first time. It's no good throwing
| something into the world with the assumption that you can
| fix the protocol later.
| sophacles wrote:
| Which policies are you speaking of? How are they hidden?
| What would you like to tweak that you can't?
| theptip wrote:
| Presumably all the sysctls?
|
| http://www.linux-admins.net/2010/09/linux-tcp-
| tuning.html?m=...
|
| & associated traffic shaping algorithms?
| jeffbee wrote:
| There are a billion timers inside the kernel, not all of
| which can be changed. Some of them are #defined even.
|
| In these days when machines are very large and always
| have several applications running, having an external
| network stack in the kernel violates the end-to-end
| principle. All of the policy about congestion, retrying,
| pacing, shaping, and flow control belong inside the
| application.
|
| https://en.wikipedia.org/wiki/End-to-end_principle
| sophacles wrote:
| Can you point me to an example of a timer in the kernel
| that is not settable/tunable that should be? My
| experience in looking at such things suggests that most
| of the #defined bits are because RFCs define the protocol
| that way.
|
| As for a network stack per application: you're more than
| welcome to do so in a myriad of ways - linux provides many
| different ways to pull raw IP, or raw ethernet into
| userspace (e.g. xdp, tun/tap devices, dpdk, and so on).
| It's not like you're being forced to use the kernel stack
| from lack of supported alternatives.
| Karrot_Kream wrote:
| > The alternative is that either browsers will be the
| only users of QUIC - or that each application is required
| to bring its own QUIC implementation embedded into the
| binary.
|
| This is already being done with event loops (libuv) and
| HTTP frameworks. I don't see why this would be a huge
| issue. It's also a boon for security and keeping software
| up-to-date because it's a lot easier to patch userspace
| apps than it is to roll out a new kernel patch across
| multiple kernels and force everyone to upgrade.
| promiseofbeans wrote:
| Woah that's actually pretty neat about header compression.
| Thanks for sharing!
| ithkuil wrote:
| it builds on the experience gathered with the HTTP/2 HPACK
| header compression.
| weird-eye-issue wrote:
| > If you e.g. send a "Content-type: text/html" header it will
| compress to 2 bytes as the protocol has a static table with
| the most commonly used header lines
|
| Reminds me of tokenization for LLMs. 48 dashes
| ("------------------------------------------------") is only a
| single token for GPT-3.5 / GPT-4 (they use the cl100k_base
| encoding). I suppose since that is used in Markdown. Also
| "professional illustration" is only two tokens despite being a
| long string. Whereas if you convert that to a language like
| Thai it is 17 tokens which sucks in some cases but I suppose
| tradeoffs had to be made
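|
| If you want to poke at it yourself, the tokenizer is published as the
| tiktoken library, so a quick check looks something like this (token
| counts depend on the encoding you pick):
|
|       import tiktoken  # pip install tiktoken
|
|       enc = tiktoken.get_encoding("cl100k_base")
|       for text in ["-" * 48, "professional illustration"]:
|           print(len(enc.encode(text)), repr(text))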
| culebron21 wrote:
| That makes sense -- if a packet is lost, and it affected just
| one asset, but you're on TCP, then everything has to wait till
| the packet is re-requested and resent.
| ReactiveJelly wrote:
| For the curious, this problem is called head-of-line blocking
| https://en.wikipedia.org/wiki/Head-of-line_blocking
|
| HTTP2 allowed multiple streams over one TCP stream, but that
| kinda made HoL blocking worse, because in the same scenario
| HTTP 1.1 would have just opened multiple TCP streams. QUIC,
| as GP said, basically gives you a VPN connection to the
| server. Open 100 streams for reliable downloads and uploads,
| send datagrams for a Quake match, all over one UDP
| 'connection' that can also reconnect even if you change IPs.
| afiori wrote:
| I wonder why the OS does not let the application peek at
| upcoming incomplete data.
| binwiederhier wrote:
| I dabbled with QUIC a few years ago and I couldn't agree more.
| It was pleasant to work with, and because it's UDP based,
| suddenly you can do NAT hole punching more easily.
|
| Funny that you mentioned a VPN, because I made a little
| experimental project back then to hole-punch between two
| behind-the-NAT machines and deliver traffic between them over
| QUIC. I was able to make my own L2 and L3 bridges across the
| WAN, or just port forward from one natted endpoint to an
| endpoint behind a different NAT.
|
| At one point I used it to L2-bridge my company's network (10.x)
| to my home network (192.168.x), and I was able to ping my home
| server from the bridging host, even though it was different
| networks, because it was essentially just connecting a cable
| between the networks. It was quite fun.
|
| Here's the project if anyone is interested:
| https://github.com/binwiederhier/natter -- it's probably
| defunct, but it was only experimental anyway.
| abhishekjha wrote:
| How much config did it require on your home local network and
| office network to do the ping on 192.168.x.x?
| binwiederhier wrote:
| I only tested one hop, e.g. A(10.x)---->B(192.x). All I had
| to do there was to adapt the routing tables: On A, route
| 192.x traffic to the "natter" tap/tun interface (I always
| forget which is L2), and on B, route traffic to 10.x
| accordingly. That's all.
|
| For it to be routable in the entire network, you'd need to
| obviously mess with a lot more :-D
| bullen wrote:
| I don't understand the arguments.
|
| 1) TCP needs to establish a connection. That is ZERO problem.
|
| 2) Encryption is needed in the bottom stack! WHY?
|
| Idiots are eating the world.
| adgjlsfhk1 wrote:
| The reason to put encryption at the bottom of the stack is that
| it helps with Hyrum's Law. Part of the reason TLS is so hard to
| change is that everyone can see all the data and therefore
| anyone on the path your packet takes might make decisions
| based on the data. This code will break if you try to update
| anything (even if the thing they are observing is something
| that they shouldn't have been observing). By encrypting
| everything possible, you remove the ability for everyone in the
| middle to see or depend on any details of the higher layers.
| bullen wrote:
| Enjoy debugging things when everything is encrypted... and
| then your certificate provider goes down (or removes you
| because they don't like you) and you can't even connect...
| cryptonector wrote:
| > 2) Encryption is needed in the bottom stack! WHY?
|
| One reason is that hardware (NICs) can offload encryption more
| easily when it's closer to the lowest end-to-end layer (i.e.,
| no lower than IP).
|
| So IPsec and QUIC are easy to offload, but TLS-over-TCP less
| so. It especially helps that there is no need to buffer up part
| of the TCP stream so as to get whole TLS records to decrypt.
| bullen wrote:
| You don't need hardware encryption if you don't need
| encryption.
|
| You are solving 1% of the problems with 99% of the energy.
| userbinator wrote:
| Look at the ones pushing this stuff, who they work for, and
| what interests they have. It's easy to see why a lot of things
| are the way they are, when you realise their true purpose.
| ishanjain28 wrote:
| I looked at netflow data for the last 10 days from my router. 75%
| of the HTTP traffic is HTTP3, 23% HTTP1.1/HTTP2(over TLS) and 2%
| is plain HTTP.
| culebron21 wrote:
| Read up until the author started confusing HTTP/1 with TCP, claiming
| that it's TCP's fault that we must make multiple connections to load
| a website.
|
| Actually, TCP allows a continuous connection and sending as much
| stuff as you want -- in other words, a stateful protocol. It was
| HTTP/1 that was decidedly stateless.
|
| Sessions in websites are a direct consequence of the need to keep
| state somewhere.
| taneq wrote:
| > It was HTTP/1 that was decidedly stateless.
|
| And this was a principled stance. HTTP/1 was for serving web
| _pages_, and web _pages_ are _documents_. Then we started
| generating web pages from databases, which was pretty useful,
| but also started the slippery slope to us hijacking the web and
| turning it into a network of applications running on a nested
| operating system.
| ignoramous wrote:
| > _...the slippery slope to us hijacking the web and turning
| it into a network of applications running on a nested
| operating system._
|
| Well if folks drop anything that's not 53, 443, or 80, then
| what other choices are left?
| taneq wrote:
| That's a slightly different question (ie. how do we
| pragmatically tunnel our data between applications) but
| yes, that evolutionary race to the bottom (or port 80/443
| as you say) also sucks.
| ericpauley wrote:
| The author obviously knows this. As others have noted, even
| though HTTP over TCP supports multiple requests they must be
| accomplished consecutively unless you use multiple TCP
| connections.
| oefrha wrote:
| TCP is a streaming protocol. You can build whatever
| multiplexing scheme on top, like h2 does, but you simply can't
| escape TCP head-of-line blocking, as it's just a single
| undelimited stream underneath.
|
| As an aside, I only truly grasped the stream nature of TCP when
| I started dissecting and reassembling packets. The first ever
| reassembly program I wrote was hopelessly wrong because I was
| treating TCP packet boundaries as data packet boundaries, but in
| fact higher-level protocol (HTTP/RTMP/etc.) data packet
| boundaries have nothing to do with TCP packet boundaries at
| all; it's a single continuous stream that your higher-level
| protocol has to delimit on its own.
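|
| A minimal sketch of what "delimit on its own" ends up meaning in
| practice -- length-prefixed framing on top of a TCP socket (the 4-byte
| prefix is just one common convention):
|
|       import struct
|
|       def send_message(sock, payload):
|           # TCP is a byte stream with no message boundaries, so the
|           # application adds its own framing: a 4-byte big-endian length.
|           sock.sendall(struct.pack("!I", len(payload)) + payload)
|
|       def recv_exact(sock, n):
|           # recv() may return fewer bytes than asked for, and the chunks
|           # have nothing to do with how the sender called send().
|           buf = b""
|           while len(buf) < n:
|               chunk = sock.recv(n - len(buf))
|               if not chunk:
|                   raise ConnectionError("peer closed mid-message")
|               buf += chunk
|           return buf
|
|       def recv_message(sock):
|           (length,) = struct.unpack("!I", recv_exact(sock, 4))
|           return recv_exact(sock, length)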
| loup-vaillant wrote:
| There is though a fundamental mismatch between TCP, and the
| problem HTTP (any version) needs to solve: TCP is for sending
| stream of bytes, reliably and in the same order they were sent.
| HTTP needs to transmit some amount of data, reliably.
|
| The only aspect of TCP HTTP really wants here is reliability.
| The order we don't really care about, we just want to load the
| whole web page and all associated data (images, scripts, style
| sheet, fonts...), in a way that can be reassembled at the other
| end. This makes it okay to send packets out of order, and doing
| so automatically solves head-of-line blocking.
|
| This is a little different when we start streaming audio or
| video assets: for those, it is often okay for _reliability_ to
| take a hit (though any gap must be detected). Losing a couple
| packets may introduce glitches and artefacts, but doesn't
| necessarily render the media unplayable. _(This applies more
| to live streams & chats though. For static content most would
| prefer to send a buffer in advance, and go back to sacrifice
| ordering in order to get a perfect (though delayed) playback.)_
|
| In both use cases, TCP is not a perfect match. The only reason
| it's so ubiquitous anyway is because it is so damn convenient.
| masklinn wrote:
| That's a different issue than what parent is talking about:
| HTTP definitely needs each individual resource to be ordered,
| what it does not need is for _different_ resources to be
| ordered relative to one another, which becomes an issue when
| you mux multiple requests concurrently over a single
| connection.
| crims0n wrote:
| > This is a little different when we start streaming audio or
| video assets: for those, it is often okay for reliability to
| take a hit (though any gap must be detected). Losing a couple
| packets may introduce glitches and artifacts, but doesn't
| necessarily render the media unplayable.
|
| This is exactly what UDP is for. There is nothing wrong with
| TCP and UDP at the transport layer, both do their job and do
| it well.
| fidotron wrote:
| The whole HTTP request/response cycle has led to a generation
| of developers that cannot conceive of how to handle continuous
| data streams, it's extraordinary.
|
| I have seen teams of experienced seniors using websockets and
| then just sending requests/responses over them as every
| architecture choice and design pattern they were familiar with
| required this.
|
| Then people project out from their view of the world and assume
| the problem is not with what they are doing but in the other
| parts of the stack they don't understand, such as blaming TCP
| for the problems with HTTP.
| grogenaut wrote:
| I know right. The kids these days. Most of them never learned
| to solder either so how can they assemble their own computers
| from ics? I caught one of my tech leads using a jet burner
| lighter and just scorching the whole board. And forget
| reading core dumps in hex!!! An intern was just putting the
| hex dump in the chat chibi and asking it what went wrong. Get
| off my lawn already.
| culebron21 wrote:
| I switched from web dev to data science some years ago, and
| surprisingly couldn't find a streaming parallelizer for
| Python -- every package assumes you loaded the whole dataset
| in memory. Had to write my own.
| elashri wrote:
| There are many packages that can do that, like Vaex [1] and
| Dask [2]. I don't know your exact workflow. But
| concurrency in Python is limited to multiprocessing, which
| is much more expensive than the threads that a typical
| streaming parallelizer would use outside the Python world.
|
| [1] https://vaex.io/docs/index.html
|
| [2] https://docs.dask.org/en/latest/
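|
| For example, a Dask dataframe processes a dataset in partitions rather
| than loading it all at once (the file pattern and column names here
| are made up):
|
|       import dask.dataframe as dd
|
|       # Reads lazily in chunks; only the small aggregation result is
|       # materialized, so the dataset never has to fit in memory.
|       df = dd.read_csv("events-*.csv")
|       result = df.groupby("user_id")["value"].mean().compute()
|       print(result.head())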
| bboygravity wrote:
| Just curious: why would you not load the entire dataset
| "into memory" ("into memory" from a Python perspective)?
|
| On an average desktop/server system the OS would
| automatically take care of putting whatever fits in RAM and
| the rest on disk.
|
| Or are the datasets large enough to fit on neither?
| thedougd wrote:
| Example: I'm querying a database, producing a file and
| storing it in object storage (S3). The dataset is 100
| gigabytes in size. I should not require 100 gigabytes of
| memory or disk space to handle this single operation. It
| would be slower to write it to disk first.
| ninkendo wrote:
| > On an average desktop/server system the OS would
| automatically take care of putting whatever fits in RAM
| and the rest on disk.
|
| This is not true, unless you're referring to swap (which
| is a configuration of the system and may not be big
| enough to actually fit it either, many people run with
| only a small amount of swap or disable it altogether.)
|
| You may be referring to mmap(2), which will map the on-
| disk dataset to a region of memory that is paged in on-
| demand, but somehow I doubt that's what OP was referring
| to either.
|
| If you just read() the file into memory and work on it,
| you're going to be using a ton of RAM. The OS will only
| put "the rest on disk" if it swaps, which is a degenerate
| performance case, and it may not even be the dataset
| itself that gets swapped (the kernel may opt to swap
| _everything else_ on the system to fit your dataset into
| RAM. All pages are equal in the eyes of the virtual
| memory layer, and the ranking algorithm is basically an
| LRU cache.)
| geocar wrote:
| read() doesn't read from the disk if the blocks are
| already in memory.
| ninkendo wrote:
| That's a really misleading thing to say. If the kernel
| already has the thing you're read()'ing _cached_, then
| yes the kernel can skip the disk read _as an
| optimization_. But by reading, you're taking those bytes
| and putting a _copy_ of them in the process's heap space,
| which makes it no longer just a "cache". You're now
| "using memory".
|
| read() is not mmap(). You can't just say "oh I'll read
| the file in and the OS will take care of it". It doesn't
| work that way.
| geocar wrote:
| > If the kernel already has the thing you're read()'ing
| cached
|
| which it would do, if you have just downloaded the file.
|
| > But by reading, you're taking those bytes and putting a
| copy of them in the process's heap space
|
| i mean, you just downloaded it from the network, so
| unless you think mmap() can hold buffers on the network
| card, there's definitely going to be a copy going on. now
| it's downloaded, you don't need to do it again, so we're
| only talking the one copy here.
|
| > You can't just say "oh I'll read the file in and the OS
| will take care of it". It doesn't work that way.
|
| i can and do. and you've already explained swap
| sufficiently that i believe you know you can do exactly
| that also.
| ninkendo wrote:
| Please keep in mind the context of the discussion. A
| prior poster made a claim that they can read a file into
| memory, and that it won't actually use any additional
| memory because the kernel will "automatically take care"
| of it somehow. This is plainly false.
|
| You come in and say something to the effect of "but it
| may not have to read from disk because it's cached",
| which... has nothing to do with what was being discussed.
| We're not talking about whether it incurs a disk read,
| we're talking about whether it will run your system out
| of memory trying to load it into RAM.
|
| > i mean, you just downloaded it from the network, so
| unless you think mmap() can hold buffers on the network
| card, there's definitely going to be a copy going on. now
| it's downloaded, you don't need to do it again, so we're
| only talking the one copy here.
|
| What in god's holy name are you blathering about? If I
| "just downloaded it from the network", it's on-disk. If I
| mmap() the disk contents, there's no copy going on, it's
| "mapped" to disk. If I _read()_ the contents, which is
| what you said I should do, then _another copy of the
| data_ is now sitting in a buffer in my process's heap.
| This extra copy is now "using" memory, and if I keep
| doing this, I will run the system out of RAM. This is
| characteristically different from mmap(), where a region
| of memory _maps_ to a file on-disk, and contents are
| _faulted_ into memory as I read them. The reason this is
| an _extremely_ important distinction, is that in the mmap
| scenario, the kernel is free to _free_ the read-in pages
| any time it wants, and they will be faulted back in again
| if I try to read them. Contrast this with using
| read(), which makes it so the kernel _can't_ free the
| pages, because they're buffers in my process's heap, and
| are not considered file-backed from the kernel's
| perspective.
|
| > i can and do. and you've already explained swap
| sufficiently that i believe you know you can do exactly
| that also.
|
| Swap is disabled on my system. Even if it wasn't, I'd
| only have so much of it. Even if I had a ton of it,
| read()'ing 100GB of data and relying on swap to save me
| is going to _grind the rest of the system to a halt_ as
| the kernel tries to make room for it (because the data is
| in my heap, and thus isn't file-backed, so the kernel
| can't just free the pages and read them back from the
| file I read it from.) read() is not mmap(). Please don't
| conflate them.
| SoftTalker wrote:
| > Swap is disabled on my system.
|
| Yep I do the same. If I have a server with hundreds of GB
| or even TB of RAM (not uncommon these days) I'm not
| setting up swap. If you're exhausting that much RAM, swap
| is only going to delay the inevitable. Fix your program.
| geocar wrote:
| > A prior poster made a claim that they can read a file
| into memory, and that it won't actually use any
| additional memory because the kernel will "automatically
| take care" of it somehow. This is plainly false.
|
| Nobody made that claim except in your head.
|
| Why don't you read it now:
|
| _Just curious: why would you not load the entire dataset
| "into memory" ("into memory" from a Python perspective)?_
|
| Look carefully: There's no mention of the word file. For
| all you or I know the programmer is imagining something
| like this:
|
|       >>> data = loaddata("https://...")
|
| Or perhaps it's an S3 bucket. There is no file, only the
| data set. That's more or less exactly what I do.
|
| _On an average desktop/server system the OS would
| automatically take care of putting whatever fits in RAM
| and the rest on disk._
|
| You know exactly that this is what is meant by swap: we just
| confirmed that. And you know it is enabled on every
| average desktop/server system, because you
|
| > Swap is disabled on my system
|
| are the sort of person who disables the average
| configuration! Can you not see you aren't arguing with
| anything but your own fantasies?
|
| > If I "just downloaded it from the network", it's on-
| disk.
|
| That's nonsense. It's in ram. That's the block cache you
| were just talking about.
|
| > If I mmap() the disk contents, there's no copy going
| on, it's "mapped" to disk
|
| Every word of that is nonsense. The disk is attached to a
| serial bus. Even if you're using fancy nvme "disks"
| there's a queue that operates in a (reasonably) serial
| fashion. The reason mmap() is referred to as zero-copy is
| because it can reuse the block cache if it has been
| recently downloaded -- but if the data is paged out,
| there is absolutely a copy and it's more expensive than
| just read() by a long way.
|
| > Even if it wasn't, I'd only have so much of it. Even if
| I had a ton of it, read()'ing 100GB of data and relying
| on swap to save me is going to grind the rest of the
| system to a halt as the kernel tries to make room for it
|
| You only have so much storage, this is life, but I can
| tell you as someone who _does_ operate a 1tb of ram
| machine that downloads 300gb of logfiles every day,
| read() and write() work just fine -- I just can't speak
| for python (or why python people don't do it) because i
| don't like python.
| ninkendo wrote:
| You're basically just gish galloping at this point and
| there's no need to respond to you any more. All your
| points about swap are irrelevant to the discussion. All
| your points about disk cache are irrelevant to the
| discussion. You have a very, very, very incorrect
| understanding of how operating system kernels work if you
| think mmap() is just a less efficient read():
|
| > Every word of that is nonsense. The disk is attached to
| a serial bus. Even if you're using fancy nvme "disks"
| there's a queue that operates in a (reasonably) serial
| fashion. The reason mmap() is referred to zero-copy is
| because it can reuse the block cache if it has been
| recently downloaded -- but if the data is paged out,
| there is absolutely a copy and it's more expensive than
| just read() by a long way.
|
| Please do a basic web search for what "virtual memory"
| means. You seem to think that handwavey nonsense about
| disk cache means that read() doesn't copy the data into
| your working set. You should look at the manpage for
| read() and maybe ponder why it requires you to _pass your
| own buffer._ This buffer would have to be something _you've
| malloc()'d ahead of time_. Hence why you're _using
| more memory_ by using read() than you would using mmap().
|
| > You only have so much storage, this is life, but I can
| tell you as someone who does operate a 1tb of ram machine
| that downloads 300gb of logfiles every day, read() and
| write() work just fine -- I just can't speak for python
| (or why python people don't do it) because i don't like
| python.
|
| You should _definitely_ learn what the mmap syscall does
| and why it exists. You _really_ don't need to use 300gb
| of RAM to read a 300gb log file. You should probably look
| up how text editors like vim _actually_ work, and why you
| can seek to the end of a 300gb log file without vim
| taking up a bunch of RAM. Maybe you've never been
| curious about this before, I dunno.
|
| Try making a 20gb file called "bigfile", and run this C
| program:
|
|       #include <sys/mman.h>
|       #include <sys/stat.h>
|       #include <fcntl.h>
|       #include <stdio.h>
|       #include <stdlib.h>
|       #include <inttypes.h>
|       #include <unistd.h>
|
|       int main(int argc, char *argv[]) {
|           char *mmap_buffer;
|           int fd;
|           struct stat sb;
|           ssize_t s;
|
|           fd = open("./bigfile", O_RDONLY);
|           fstat(fd, &sb);  // get the size
|
|           // do the mmap
|           mmap_buffer = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
|
|           uint8_t sum = 0;  // initialized so the checksum is well-defined
|           // Ensure the whole buffer is touched by doing a dumb math
|           // operation on every byte
|           for (size_t i = 0; i < sb.st_size; ++i) {
|               sum += mmap_buffer[i];  // overflow is fine, sorta a dumb checksum
|           }
|           fprintf(stderr, "done! sum=%d\n", sum);
|           sleep(1000);
|       }
|
| And wait for "done!" to show up in stderr. It will sleep
| 1000 seconds waiting for you to ctrl+c at this point. At
| this point we will have (1) mmap'd the entire file into
| memory, and (2) read the whole thing sequentially, adding
| each byte to a `sum` variable (with expected overflow.)
|
| While it's sleeping, check /proc/<pid>/status and take
| note of the memory stats. You'll see that VmSize is as
| big as the file you read in (for me, bigfile is more than
| 20GB):
|
|       VmSize:  20974160 kB
|
| But the actual _resident set_ is 968kb:
|
|       VmRSS:        968 kB
|
| So, my program is using 968 kB even though it has a 20GB
| in-memory buffer that just read the whole file in! My
| system only has 16GB of RAM and swap is disabled.
|
| How is this possible? Because _mmap lets you do this._
| The kernel will read in pages from bigfile _on demand_,
| but is also free to _free_ them at any point. There is no
| controversy here, every modern operating system has
| supported this for decades.
|
| Compare this to a similar program using read():
|
|       #include <sys/mman.h>
|       #include <sys/stat.h>
|       #include <fcntl.h>
|       #include <stdio.h>
|       #include <stdlib.h>
|       #include <inttypes.h>
|       #include <unistd.h>
|
|       int main(int argc, char *argv[]) {
|           int fd;
|           struct stat sb;
|           ssize_t s;
|           char *read_buffer;
|
|           fd = open("./bigfile", O_RDONLY);
|           fstat(fd, &sb);  // get the size
|
|           // do the read
|           read_buffer = malloc(sb.st_size);
|           read(fd, read_buffer, sb.st_size);
|
|           uint8_t sum = 0;  // initialized so the checksum is well-defined
|           // Ensure the whole buffer is touched by doing a dumb math
|           // operation on every byte
|           for (size_t i = 0; i < sb.st_size; ++i) {
|               sum += read_buffer[i];  // overflow is fine, sorta a dumb checksum
|           }
|           fprintf(stderr, "done! sum=%d\n", sum);
|           sleep(1000);
|       }
|
| And do the same thing (in this instance I shrunk bigfile
| to 2 GB because I don't have enough physical RAM to do
| this with 20GB.) You'll see this in /proc/<pid>/status:
|
|       VmPeak:   2099928 kB
|       VmSize:   2099928 kB
|       VmLck:          0 kB
|       VmPin:          0 kB
|       VmHWM:    2098056 kB
|       VmRSS:    2098056 kB
|
| Oops! I'm using 2 GB of resident set size. If I were to
| do this with a file that's bigger than RAM, I'd get OOM-
| killed.
|
| This is why you shouldn't read() in large datasets. Your
| strategy is to have servers with 1TB of ram and massive
| amounts of swap, and I'm telling you _you don't need
| this_ to process files this big. mmap() does so
| without requiring things to be read into RAM ahead of
| time.
|
| Oh, and guess what: Take out the `sleep(1000)`, and the
| mmap version is _faster_ than the read() version:
|       $ time ./mmap
|       done! sum=225
|       ./mmap  6.89s user 0.28s system 99% cpu 7.180 total
|       $ time ./read
|       done! sum=246
|       ./read  6.86s user 1.72s system 99% cpu 8.612 total
|
| Why is it faster? Because we don't have to needlessly
| copy the data into the process's heap. We can just read
| the blocks directly from the mmap'd address space, and
| let page faults read them in for us.
| geocar wrote:
| > Why is it faster?
|
| Why are the answers different?
|
| Do you not know how to benchmark? Or did you falsify your
| results on purpose?
| ninkendo wrote:
| Begone troll.
|
| (Edit: because I noticed this too, and it got me curious,
| the reason for the incorrect result is that I didn't
| check the result of the read() call, and it was actually
| reading fewer bytes, by a small handful. read() is
| allowed to do this, and it's up to callers to call it
| again. It was reading 2147479552 bytes when the file size
| was 2147483648 bytes. If anything this should have made
| the read implementation faster, but mmap still wins even
| though it read more bytes. A fixed version follows, and
| now produces the same "sum" as the mmap):
|
|       #include <sys/mman.h>
|       #include <sys/stat.h>
|       #include <fcntl.h>
|       #include <stdio.h>
|       #include <stdlib.h>
|       #include <inttypes.h>
|       #include <unistd.h>
|
|       int main(int argc, char *argv[]) {
|           int fd;
|           struct stat sb;
|           ssize_t s;
|           ssize_t read_result;
|           ssize_t bytes_read;
|           char *read_buffer;
|
|           fd = open("./bigfile", O_RDONLY);
|           fstat(fd, &sb);  // get the size
|
|           // do the read, retrying until read() reports end-of-file
|           read_buffer = malloc(sb.st_size);
|           bytes_read = 0;
|           do {
|               read_result = read(fd, read_buffer + bytes_read,
|                                  sb.st_size - bytes_read);
|               bytes_read += read_result;
|           } while (read_result != 0);
|
|           uint8_t sum = 0;  // initialized so the checksum is well-defined
|           // Ensure the whole buffer is touched by doing a dumb math
|           // operation on every byte
|           for (size_t i = 0; i < sb.st_size; ++i) {
|               sum += read_buffer[i];  // overflow is fine, sorta a dumb checksum
|           }
|           fprintf(stderr, "done! sum=%d", sum);
|       }
| Chabsff wrote:
| Why do you think OP isn't referring to mmap()? Its
| behavior is pretty much what they describe, and a common
| way it's used.
| ninkendo wrote:
| Fair enough, it's totally possible that's what they
| meant. But the complaint of "every package assumes you
| loaded the whole dataset in memory" seems to imply the
| package just naively reads the file in. I mean, if the
| package _was_ mmapping it, they probably wouldn't have run
| into enough memory trouble for it to become something worth
| complaining about. Also, you may not always
| have the luxury of mmap()'ing, if you're reading data
| from a socket (network connection, stdout from some other
| command, etc.)
|
| I don't do much python but I used to do a lot of ruby,
| and it was rare to see anyone mmap'ing anything, most
| people just did File.read(path) and called it a day. If
| the norm in the python ecosystem is to mmap things, then
| you're probably right.
| tsimionescu wrote:
| Depending on the source of the data, that is not as good
| as an actual streaming implementation. That is, if the
| data is coming from a network API, waiting for it to be
| "in memory" before processing it still means that you
| have to store the whole stream on the local machine
| before you even start. Even if we assume that you are
| storing it on disk and mmapping it into memory that's
| still not a good idea for many use cases.
|
| Not to mention that, if the code is not explicitly designed
| to work with a streaming approach, even for local data,
| early steps may accidentally end up touching the whole
| dataset (e.g. looking for a closing } in something like a
| 10 GB JSON document) in unexpected places, costing orders
| of magnitude more than they should.
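|
| To make that concrete, a minimal sketch of the streaming
| style (assuming local file input, and using a running byte
| sum as a stand-in for real incremental parsing; the file
| name and chunk size are arbitrary). Peak memory stays at
| the chunk size no matter how large the input is:
|       #include <fcntl.h>
|       #include <stdint.h>
|       #include <stdio.h>
|       #include <unistd.h>
|
|       int main(void) {
|           int fd = open("./bigfile", O_RDONLY);
|           uint8_t buf[64 * 1024]; // fixed 64 KiB window
|           uint8_t sum = 0;
|           ssize_t n;
|           // Process each chunk as it arrives instead of
|           // buffering the whole input first.
|           while ((n = read(fd, buf, sizeof buf)) > 0) {
|               for (ssize_t i = 0; i < n; ++i)
|                   sum += buf[i]; // incremental work per chunk
|           }
|           fprintf(stderr, "done! sum=%d\n", sum);
|           close(fd);
|       }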
| fidotron wrote:
| This is actually the old SAX vs DOM xml parsing
| discussion in disguise.
|
| SAX is harder but has at least two closely related benefits:
| 1. It can handle a continuous firehose.
| 2. Processing can start before the load is completed (because
| it might never be completed), so the time to first useful
| action can be greatly reduced.
| culebron21 wrote:
| I had a 5-10 GB geopackage, and either didn't have a spare
| 10-20 GB of RAM, or had to make sure my data processing
| routine would fit on a busy server.
| otteromkram wrote:
| Did you try dask?
|
| https://www.dask.org/
| [deleted]
| culebron21 wrote:
| I've looked into the samples and recall what the problem
| was: geopandas was in an experimental branch, and you had
| to lock yourself into dask -- plus, the geopandas code
| had to be rewritten completely for dask. So I wrote my own
| processor that applies the same function in map-and-reduce
| fashion, and keeps code compatible with Jupyter notebooks --
| you decorate functions to make them parallelizable, but can
| still import them and call them normally.
| https://github.com/culebron/erde
| goeiedaggoeie wrote:
| Same with video parsers and tooling: they frequently expect
| a whole mp4, or a whole video, to be there before parsing,
| yet the gstreamer/ffmpeg APIs deliver the content as a
| stream of buffers that you have to process one at a time.
| dcow wrote:
| I've found the Rust ecosystem to be very good about never
| assuming you have enough memory for anything and usually
| supporting streaming styles of widget use where possible.
| goeiedaggoeie wrote:
| ha! I was literally thinking of the libs for parsing
| h264/5 and mp4 in Rust (so not using unsafe
| gstreamer/ffmpeg code) when moaning a little here.
| Generally I find the Rust libraries and crates to be well
| designed around readers and writers.
| pipo234 wrote:
| Traditionally, ffmpeg would build the mp4 container while
| transcoded media is written to disk (in a single
| contiguous mdat box after ftyp) and then put the track
| description and samples in a moov at the end of the file.
| That's efficient because you can't precisely allocate the
| moov before you've processed the media (in one pass).
|
| But when you would load the file into a <video> element,
| it would of course need to buffer the entire file to find
| the moov box needed to decode the NAL units (in the case of
| avc1).
|
| A simple solution was then to repackage by simply moving
| the moov from the end of the file to before the mdat
| (adjusting the chunk offsets). Back in the day, that would
| make your video start instantly!
| goeiedaggoeie wrote:
| This is basically what CMAF is: the moov and ftyp get sent
| at the beginning (and frequently get written as an init
| segment), and then the rest of the stream is a continuous
| stream of moofs and mdats, chunked as per gstreamer/ffmpeg
| specifics.
| pipo234 wrote:
| I was thinking progressive MP4, with sample table in the
| moov. But yes, cmaf and other fragmented MP4 profiles
| have ftyp and moov at the front, too.
|
| Rather than putting the media in a contiguous blob, CMAF
| interleaves it with moofs that hold the sample byte
| ranges and timing. Moreover, while this interleaving
| allows _most_ of the CMAF file to be progressively
| streamed to disk as the media is created, it has the same
| catch-22 problem as the "progressive" MP4 file in that
| the index (sidx, in the case of CMAF) cannot be written at
| the start of the file unless _all_ the media it indexes
| has been processed.
|
| When writing CMAF, ffmpeg will usually omit the segment
| index, which makes fast seeking painful. To insert the
| `sidx` (after ftyp+moov but before the moof+mdat pairs) you
| need to repackage (but not re-encode).
|
| Same problem, same solution more or less.
| prox wrote:
| Is it just me, or are there not that many video specialists
| in general? It's just something I noticed here and there on
| GitHub.
| londons_explore wrote:
| In a way, that's good. The few hundred video encoding
| specialists who exist in the world have, per person, had
| a huge impact on the world.
|
| Compare that to web developers, who in total have had
| probably a larger impact on the world, but per head it is
| far lower.
|
| Part of engineering is to use the fewest people possible
| to have the biggest benefit for the most people. Video
| did that well - I suspect partly by being 'hard'.
| nullpilot wrote:
| My experience that played out over the last few weeks
| led me to a similar belief, somewhat. For rather
| uninteresting reasons I decided I wanted to create mp4
| videos of an animation programmatically.
|
| The first solution suggested when googling around is to
| just create all the frames, save them to disk, and then
| let ffmpeg do its thing from there. I would have just
| gone with that for a one-off task, but it's a pretty bad
| solution if the video is long, or high res, or both.
| Plus, what I really wanted was to build something more
| "scalable/flexible".
|
| Maybe I didn't know the right keywords to search for, but
| there really didn't seem to be many options for creating
| frames, piping them straight to an encoder, and writing
| just the final video file to disk. The only one I found
| that seemed like it could maybe do it the way I had in
| mind was VidGear[1] (Python). I had figured that with the
| popularity of streaming, and video in general on the web,
| there would be so much more tooling for these sorts of
| things.
|
| I ended up digging way deeper into this than I had
| intended, and built myself something on top of
| Membrane[2] (Elixir)
|
| [1] https://abhitronix.github.io/vidgear/ [2]
| https://membrane.stream/
| dylan604 wrote:
| It sounds like a misunderstanding of the MPEG concept.
| For an encode to be made efficiently, it needs to see
| more than one frame of video at a time. Sure, I-frame
| only encoding is possible, but it's not efficient and the
| result isn't really distributable. Encoding _wants_ to
| see multiple frames at a time so that the P and B frames
| can be used. Also, the best bang for the bandwidth buck
| comes from multipass encoding. Can't do that if all of the
| frames don't exist yet.
|
| You have to remember how old the technology you are
| trying to use is, and then consider the power of the
| computers available when they were made. MPEG-2 encoding
| used to require a dedicated expansion card because the
| CPUs didn't have decent instructions for the encoding. Now,
| that's all native to the CPU, which makes those old code
| bases look archaic.
| nullpilot wrote:
| No doubt that my limited understanding of these
| technologies came with some naive expectations of what's
| possible and how it should work.
|
| Looking into it, and working through it, part of my
| experience was a lack of resources at the level of
| abstraction that I was trying to work in. It felt like I
| was missing something, with video editors that power
| billion dollar industries on one end, directly embedding
| ffmpeg libs into your project and doing things in a way
| that requires full understanding of all the parts and how
| they fit together on the other end, and little to nothing
| in-between.
|
| Putting a glorified powerpoint in an mp4 to distribute
| doesn't feel to me like it is the kind of task where the
| prerequisite knowledge includes what the difference
| between yuv420 and yuv422 is or what Annex B or AVC are.
|
| My initial expectation was that there has to be some in-
| between solution. Before I set out, what I had thought
| would happen is that I `npm install` some module and then
| just create frames with node-canvas, stream them into
| this lib and get an mp4 out the other end that I can send
| to disk or S3 as I please.* Worrying about the nitty
| gritty details like how efficient it is, how many frames it
| buffers, or how optimized the output is, would come
| later.
|
| Going through this whole thing, I now wonder how
| Instagram/TikTok/Telegram and co. handle the initial
| rendering of their video stories/reels, because I doubt
| it's anywhere close to the process I ended up with.
|
| * That's roughly how my setup works now, just not in JS.
| I'm sure it could be another 10x faster at least, if done
| differently, but for now it works and lets me continue
| with what I was trying to do in the first place.
| dylan604 wrote:
| This sounds like "I don't know what a wheel is, but if I
| chisel this square to be more efficient it might work".
| Sometimes, it's better to not reinvent the wheel, but
| just use the wheel.
|
| Pretty much everyone serving video uses DASH or HLS so
| that there are many versions of the encoding at different
| bit rates, frame sizes, and audio settings. The player
| determines if it can play the streams and keeps stepping
| down until it finds one it can use.
|
| Edit: >Putting a glorified powerpoint in an mp4 to
| distribute doesn't feel to me like it is the kind of task
| where the prerequisite knowledge includes what the
| difference between yuv420 and yuv422 is or what Annex B
| or AVC are.
|
| This is the beauty of using mature software. You don't
| _need_ to know this any more. Encoders can now set the
| profile/level and bit depth to what is appropriate. I
| don't have the charts memorized for when to use what
| profile at what level. In the early days, the decoders
| were so immature that you absolutely needed to know the
| decoder's abilities to ensure a compatible encode was
| made. Now, the decoder is so mature and is even native to
| the CPU, that the only limitation is bandwidth.
|
| Of course, all of this is strictly talking about the
| video/audio. Most people are totally unaware that you
| can put programming inside of an MP4 container that
| allows for interaction similar to DVD menus to jump to
| different videos, select different audio tracks, etc.
| nullpilot wrote:
| > This sounds like "I don't know what a wheel is, but if
| I chisel this square to be more efficient it might work".
| Sometimes, it's better to not reinvent the wheel, but
| just use the wheel.
|
| I'm not sure I can follow. This isn't specific to MP4 as
| far as I can tell. MP4 is what I cared about, because
| it's specific to my use case, but it wasn't the source of
| my woes. If my target had been a more adaptive or
| streaming friendly format, the problem would have still
| been to get there at all. Getting raw, code-generated
| bitmaps into the pipeline was the tricky part I did not
| find a straightforward solution for. As far as I am able
| to tell, settling on a different format would have left
| me in the exact same problem space in that regard.
|
| The need to convert my raw bitmap from rgba to yuv420
| among other things (and figuring that out first) was an
| implementation detail that came with the stack I chose.
| My surprise lies only in the fact that this was the best
| option I could come up with, and a simpler solution like
| I described (that isn't using ffmpeg-cli, manually or via
| spawning a process from code) wasn't readily available.
|
| > You don't need to know this any more.
|
| To get to the point where an encoder could take over,
| pick a profile, and take care of the rest was the tricky
| part that required me to learn what these terms meant in
| the first place. If you have any suggestions of how I
| could have gone about this in a simpler way, I would be
| more than happy to learn more.
| dylan604 wrote:
| Using ffmpeg as the example: you can put -f in front of -i
| to describe what the incoming format is, so that your
| homebrew exporter can send frames to stdout and pipe them
| into ffmpeg, which reads from stdin with '-i -'. More
| specifically, '-f bmp -i -' would expect the incoming data
| stream to be in BMP format. You can pick any format for the
| codecs installed ('ffmpeg -codecs').
| jrpelkonen wrote:
| It is possible that this is not a fault of the parser or
| tooling. In some cases, specifically when the video file
| is not targeted for streaming, the moov atom is at the
| end of the mp4. The moov atom is required for playback.
| lazide wrote:
| Zip files are the same. At least it makes it easy to
| detect truncated files?
| regularfry wrote:
| That's intentional, and it can be very handy. Zip files
| were designed so that you can make an archive self-
| extracting. They made it so that you could strap a self-
| extraction binary to the front of the archive, which -
| rather obviously - could never have been done if the
| executable code followed the archive.
|
| But the thing is that the executable can be _anything_ ,
| so if what you want to do is to bundle an arbitrary
| application plus all its resources into a single file,
| all you need to do is zip up the resources and append the
| zipfile to the compiled executable. Then at runtime the
| application opens its own $0 as a zipfile. It Just Works.
| KMag wrote:
| Also, it makes it easier to append new files to an
| existing zip archive. No need to adjust an existing
| header (and potentially slide the whole archive around if
| the header size changes), just append the data and append
| a new footer.
| lazide wrote:
| Interestingly, a useful strategy for tape too, though zip
| is not generally considered tape friendly.
| deskamess wrote:
| What is the best approach to handling a continuous stream of
| data? Is it just a 'buffer till you have what you need and
| pass it off to be processed' approach? And then keep reading
| until end of stream/forever.
| allan_s wrote:
| I think what they mean is that most people think that only
| the following is possible on one connection:
|
| 1. Send Request 1
| 2. Wait for Response 1
| 3. Send Request 2
| 4. Wait for Response 2
|
| while you can actually do
|
| 1. Send Request 1, move on
| 2. Send Request 2, move on
|
| and have another process/routine handle the responses.
|
| Potentially you even have requests that can be sent without
| needing a response ("user is typing" in XMPP, for example).
|
| And, even wilder for people using only plain HTTP, you can
| receive a Response without a Request! (i.e. you don't need
| to implement a GET /messages request, you can directly have
| your server push messages)
| bmicraft wrote:
| Sounds like a worse way of writing async requests, while
| the last part is basically what websockets seem to be
| intended for
| allan_s wrote:
| I think you missed my point
|
| > Sounds like a worse way of writing async requests,
|
| It's just how it works under the hood; this complexity is
| quickly abstracted away, and it's actually how a lot of
| async requests are implemented -- it's just that here it's
| on one TCP connection.
|
| > , while the last part is basically what websockets seem
| to be intended for
|
| yes I was specifically answering that :
|
| > I have seen teams of experienced seniors using
| websockets and then just sending requests/responses over
| them as every architecture choice and design pattern they
| were familiar with required this.
|
| i.e people using websocket like normal http request.
| dgb23 wrote:
| In the case of websockets that is already handled for you.
|
| I think GP talks about how to think about communication
| (client server).
|
| Stateless request response cycles are much simpler to
| reason about because it synchronizes messages by default.
| The state is reflected via this back and forth
| communication. Even when you do it via JS: the request data
| is literally in the scope of the response callback.
|
| If you have bi-directional, asynchronous communication,
| then reading and writing messages is separate. You have to
| maintain state and go out of your way to connect incoming
| and outgoing messages semantically. That's not necessarily
| what you should be doing though.
| whizzter wrote:
| Pretty much, websocket implementations usually handle the
| buffering part, so you really only need to handle "full"
| events.
|
| Easiest example is probably games, events could be things
| such players moving to new locations, firing a weapon,
| explosions.
|
| In realtime games, as well as "request-response" scenarios,
| it's common to have some kind of response that acknowledges
| that data was received.
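|
| For the curious, a rough sketch of the "buffer until a full
| message arrives" loop that those implementations run for
| you, assuming a made-up framing with a 4-byte big-endian
| length prefix per message (names and buffer sizes are
| illustrative, error handling is minimal):
|       #include <stdint.h>
|       #include <stdio.h>
|       #include <string.h>
|       #include <unistd.h>
|
|       static void handle_message(const uint8_t *msg,
|                                  uint32_t len) {
|           (void)msg; // a real handler would parse msg here
|           fprintf(stderr, "full %u-byte message\n", len);
|       }
|
|       static void read_loop(int fd) {
|           uint8_t buf[65536];
|           size_t have = 0;
|           for (;;) {
|               ssize_t n = read(fd, buf + have,
|                                sizeof buf - have);
|               if (n <= 0) break; // EOF or error: stop
|               have += (size_t)n;
|               // Deliver every complete frame in the buffer.
|               while (have >= 4) {
|                   uint32_t len = ((uint32_t)buf[0] << 24) |
|                                  ((uint32_t)buf[1] << 16) |
|                                  ((uint32_t)buf[2] << 8)  |
|                                   (uint32_t)buf[3];
|                   if (len > sizeof buf - 4) break; // too big
|                   if (have < 4 + len) break; // incomplete
|                   handle_message(buf + 4, len);
|                   memmove(buf, buf + 4 + len, have - 4 - len);
|                   have -= 4 + len;
|               }
|           }
|       }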
| bjourne wrote:
| I could not find that claim in the article, but perhaps you are
| referring to an earlier version of the article that the author
| then updated? The performance problem with TCP is that it
| imposes a strict ordering of the data. The order in which
| clients receive data is the same as the order in which servers
| send data. So if some packet fails to reach the client, that
| packet has to be retransmitted before subsequent data can be
| delivered to the application (notwithstanding the relatively few
| non-ACKed packets that are allowed to be in flight). I think this
| problem is what the author is referring to. And it would be nigh
| impossible to get all the network equipment that sits between a
| client and a server to support a more latency-friendly version of
| TCP.
| ascar wrote:
| If I recall correctly, in TCP a packet loss will cause a
| significant slowdown in the whole data stream, since that packet
| needs to be retransmitted and you generally end up with a broken
| data stream until that happens (even though TCP can continue to
| send more data in the meantime, meaning it's more of a temporary
| hang than a block). Thus if you are sending multiple different
| data streams over the same TCP connection, a packet loss will
| temporarily hang the processing of all data streams. A constraint
| that QUIC doesn't have.
| dgb23 wrote:
| A somewhat decent analogy of thinking about TCP is a single
| lane queue with a security guy who's ordering people around
| and making sure the queue is orderly and doesn't overflow. He
| has pretty much no respect for families, groups or overall
| efficiency. He only cares about the queue itself.
| pjc50 wrote:
| Not _broken_ , but you end up with a hole in the TCP receive
| window while the packet is retransmitted.
|
| How does QUIC manage its stream reassembly window sizes? I
| suppose it's easier since it's all in userland.
| adgjlsfhk1 wrote:
| The key point is that if I want to load a webpage that
| needs to download 100 different files, I don't care about
| the order of those 100 files, I just want them all to be
| downloaded in order. TCP makes you specify an order which
| means that if one packet gets lost, you have to wait for
| it. QUIC lets you say "here are the 100 things I want"
| which means that if you lose a packet, that only stops 1 of
| the things so the other 99 can continue asking for more
| packets.
| tsimionescu wrote:
| > I suppose it's easier since it's all in userland.
|
| I doubt applications would provide more reliable
| information to the QUIC library than they would to the
| kernel.
|
| The main difference as I understand it is that QUIC allows
| multiple separate TCP-like streams to exist on the same
| negotiated and encrypted connection. It's not fundamentally
| different from simply establishing multiple TLS over TCP
| connections with separate windows, it just allows you to
| establish the equivalent of multiple TLS + TCP connections
| with a single handshake.
| culebron21 wrote:
| Yes, I just read about this, and it makes sense.
| pjc50 wrote:
| There's an additional layer of network administrators at fault
| which block other protocols. Saw this question the other day:
| https://softwareengineering.stackexchange.com/questions/4479...
|
| Administrator blocks _everything except TCP_. Sometimes you
| even get blocks on everything except 80 and 443.
| lazide wrote:
| That must really suck for DNS!
| LinuxBender wrote:
| It's common in restricted environments. Egress for 80/443
| allowed and DNS must use local recursive DNS servers. Those
| internal DNS servers probably pass through a few SIEM and
| other security devices and are permitted out, usually to
| minimize data exfiltration. Though in those cases 80 and
| 443 are often using a MITM proxy as well for deep packet
| inspection. There are both commercial and open source MITM
| proxies. Fans of HTTP/3 and QUIC would not like most of the
| MITM proxies as they would negotiate a specific protocol
| with the destination server and it may not involve QUIC.
| zoogeny wrote:
| I worked in an environment with similar setup. First step
| for all devices allowed to connect to the network was to
| install the company's custom CA root certificate. There
| are a lot of sharp edges in such a setup (like trying to
| get Charles or other debugging proxies to work reliably).
| But in highly sensitive environments it would seem the
| policy is to MiTM every packet that passes through the
| network.
|
| I wasn't involved but another team did some experimenting
| with HTTP/2 (which at the time was still very early in
| its rollout) and they were struggling with the network
| team to get it all working. Once they did get it to work
| it actually resulted in slightly less performant load
| times for our site and the work was de-prioritized. I
| recall (maybe incorrectly) that it was due to the
| inefficiency of forcing all site resources through a
| single connection. We got better results when the
| resources were served from multiple domains and the
| browser could keep open multiple connections. But I
| wasn't directly involved so I only overheard things and
| my memory might be fuzzy on the details of what I
| overheard.
|
| Funnily enough, we did have to keep a backup network
| without the full snooping proxy for things like IoT test
| devices (including some smart speakers and TVs) since
| installing certs on those devices was sometimes
| impossible. I assume they were still proxying as much as
| they could reasonably manage.
| ignoramous wrote:
| > _Actually, TCP allows continuous connection and sending as
| much stuff as you want_
|
| This is also what QUIC is, but implemented in _userspace_ and
| with some key properties to prevent ossification and analysis.
| rc_mob wrote:
| Why does this article read like propaganda. Its pretty technical
| and I don't understand all of it and it still reads like
| propaganda.
| kzrdude wrote:
| I was thinking the same thing. One item I thought about was
| "H/3 Adoption Grows Rapidly" is paired with a graph that shows
| that adoption is absolutely flat over the last year.
| bdd8f1df777b wrote:
| As a Chinese user who regularly breaches the GFW, QUIC is a god
| send. Tunneling traffic over a QUIC instead of TLS to breach the
| GFW has much lower latency and higher throughput (if you change
| the congestion control). In addition, for those foreign websites
| not blocked by GFW, the latency difference between QUIC and TCP
| based protocol is also visible to the naked eye, as the RTT from
| China to the rest of the world is often high.
| meowtimemania wrote:
| RTT?
| Dylan16807 wrote:
| Round-trip time.
| [deleted]
| [deleted]
| FuriouslyAdrift wrote:
| Yeah... that's an unsecured VPN. Blocked.
| iamcalledrob wrote:
| QUIC's promise is fantastic, and latency-wise it's great. And
| probably that's what matters the most for the web.
|
| However I have run into issues with it for high-throughput use-
| cases. Since QUIC is UDP based and runs in user-space, it ends up
| being more CPU bound than TCP, where processing often ends up
| being done in the kernel, or even hardware.
|
| In testing in a CPU constrained environment, QUIC (and other UDP-
| based protocols like tsunami) capped out at ~400Mbps, CPU pegged
| at 100%. Whereas TCP+TLS on the same hardware could push 3+Gbps.
|
| It'll be interesting to see how it plays out, since a goal of
| QUIC is to be an evolving spec that doesn't get frozen in time,
| yet baking it into the kernel/hardware might negate that.
| Karrot_Kream wrote:
| Is this mostly syscall overhead? Could io_uring and friends
| solve some of this?
| wofo wrote:
| Luckily, there are ways to reduce syscalls (like Generic
| Segmentation Offload and other tricks[1]). But I agree that not
| having things run in the kernel makes it more challenging for
| high-throughput scenarios.
|
| [1] https://blog.cloudflare.com/accelerating-udp-packet-
| transmis...
| tomohawk wrote:
| Since QUIC is UDP based, there are performance issues that become
| apparent at higher rates. In TCP, a single write can write a
| large amount of data. You can even use sendfile to directly send
| the contents of a file to a TCP socket, all in kernel space. In
| UDP, you need to write out data in chunks, and you need to be
| aware of the link MTU as IP fragments can be very bad news at
| higher rates. This means an app needs to context switch with the
| kernel multiple times to send a large amount of data, whereas
| only a single context switch is required for TCP. There are
| newish system calls such as sendmmsg which alleviate the strain a
| bit, but they are not as simple as a stream oriented interface.
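|
| As a rough, hedged illustration of the batching that
| sendmmsg enables (the destination, payload sizes, and helper
| name below are placeholders, not a real QUIC stack): one
| syscall submits a whole batch of datagrams instead of one
| syscall per packet.
|       #define _GNU_SOURCE
|       #include <sys/socket.h>
|       #include <sys/uio.h>
|       #include <netinet/in.h>
|       #include <string.h>
|
|       #define BATCH 16
|
|       // Queue BATCH equally sized datagrams to one
|       // destination with a single sendmmsg() call. Returns
|       // the number of datagrams actually sent, or -1.
|       int send_batch(int fd, struct sockaddr_in *dst,
|                      char payloads[BATCH][1200], size_t len) {
|           struct mmsghdr msgs[BATCH];
|           struct iovec iov[BATCH];
|           memset(msgs, 0, sizeof msgs);
|           for (int i = 0; i < BATCH; i++) {
|               iov[i].iov_base = payloads[i];
|               iov[i].iov_len = len; // stay under the path MTU
|               msgs[i].msg_hdr.msg_iov = &iov[i];
|               msgs[i].msg_hdr.msg_iovlen = 1;
|               msgs[i].msg_hdr.msg_name = dst;
|               msgs[i].msg_hdr.msg_namelen = sizeof *dst;
|           }
|           return sendmmsg(fd, msgs, BATCH, 0);
|       }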
|
| The QUIC implementations are primarily in user space, so you end
| up with multiple QUIC implementations in whatever language on any
| given system.
|
| Hopefully Linux will add a QUIC kernel implementation that will
| bypass and overcome the traditional UDP shortcomings in the
| current stack.
| goalieca wrote:
| I always thought GRPC would be an interesting candidate for
| http/3 but there's been no real movement in that direction.
| https://github.com/grpc/proposal/blob/master/G2-http3-protoc...
| jalino23 wrote:
| I just recently found out that even if the server supports up to
| http3, it's still up to the browser to decide which protocol to
| use, even if the browser supports http3 too. This was
| disheartening to find out: you have no way of forcing the browser
| to use http2 or http3, especially if you have features that only
| worked on http3 and were broken on http2. I guess I should have
| just fixed the implementation on http2.
| sassy_quat wrote:
| [dead]
| scrpl wrote:
| You don't have control because the browser might not support
| http3 at all. It's up to browser developers to decide when
| their support levels are mature enough to use by default.
| There's no other way of doing it.
| JakaJancar wrote:
| One thing where the semantics are not the same between HTTP/1.1
| and HTTP/2/3(?) is the `Host` header, which is often (always?)
| gone in the latter in favor of the `:authority` pseudo-header.
|
| Apps/scripts may rely on `Host`, but the underlying HTTP/2 server
| software might not normalize/expose it in a backwards-compatible
| way (e.g. Hyper in Rust).
|
| Technically I guess it's the browser's fault it doesn't set it.
|
| I would be curious to know if there are other such discrepancies
| that apps might run into when enabling HTTP/2/3.
| SanderNL wrote:
| Imagine you are holding a 200 requests/page bag of excrement and
| it's generating too much load on all systems involved.
|
| What do we do? Empty the bag? No, we do not. That doesn't scale.
|
| We create the most elaborate, over the top efficient bag-of-
| excrement-conveyer-belt you ever came across. We'll move these
| babies in no time flat.
|
| Don't worry about side-loading 24 fonts in 3 formats and whatnot.
| Especially do not worry about having to load three different
| trackers to "map" your "customer journeys".
|
| No sir, your customer journeys are fully mapped, enterprise-ready
| and surely to be ignored by everyone. We got you covered with
| QUIC. Reimagining "data transport", for your benefit.
|
| Edit: Of course I didn't read the article, but now I did and it's
| somehow worse than I thought. "Why TCP is not optimal for today's
| web": no answers. There are literally no answers here. Other
| than, TCP is old. Old things are difficult.
| rrdharan wrote:
| This is like arguing chip companies shouldn't make chips that
| are faster and more power efficient just because the web sucks
| and Electron is a pig.
|
| Sure, we should just make chip innovation illegal and mandate
| Gemini instead of HTML/JS, I'm sure that'll work out great..
| satellite2 wrote:
| That's actually an excellent idea you've got here. Let's make
| chip companies design weight-loss chips, where Electron
| apps are forced to run on a purpose-built slow core, like
| your doctor prescribing going to the gym.
| miohtama wrote:
| If you do not know the answer yet (though it sounds like you do,
| and there is a tinge of rant there), let me expand on what was
| said in the article:
|
| TCP/IP was built for the 80s/90s, when
|
| - Amount of data transferred was low
|
| - Data access patterns were different
|
| - Most TCP/IP was in university networks
|
| - There were no consumer users
|
| - Everyone on Internet could be trusted
|
| Today we have
|
| - 3B more people using Internet
|
| - 5B people living in authoritarian regimes where the
| government wants to jail them for saying the wrong thing on
| the Internet
|
| - Mobile users
|
| - Wireless users (subject to different packet loss conditions)
|
| - You can stream 4k movies in your pocket
|
| - You can search and find any piece of information online under
| one second
|
| TCP/IP can deliver the low latencies and high bandwidth that
| mobile links need, but QUIC and HTTP/3 do it much better. Plus
| it's always encrypted, making all kinds of middlebox attacks
| harder.
| peoplefromibiza wrote:
| > - 5B people living in authoritarian regimes
|
| it's actually 1.9B
|
| > where the government wants to jail them for saying the wrong
| thing on the Internet
|
| that happens across all the spectrum of types of systems of
| government
| Spivak wrote:
| And you put those two statements together and you're back
| to the 5B number. I don't really care if your system of
| choosing leaders is democratic elections if at the same
| time you're burning books.
| peoplefromibiza wrote:
| > and you're back to the 5B number
|
| no, if you include anybody in the World, you are back at
| 7B
|
| The point was that you contradicted yourself: if 3B
| people use the internet (as you said) only 3B people are
| in danger of being spied through the internet.
|
| > I don't really care if your system of choosing leaders
| is democratic elections
|
| Unfortunately the dictionary does.
|
| And one does not imply the other, I mean it's one thing
| to have cameras in the streets for (allegedly) safety
| purpose and another thing entirely to be hanged from a
| crane if you're gay.
|
| Let's not pretend that everything is the same everywhere
| just because we don't like or agree with a particular
| aspect of what happens in our own country.
| megous wrote:
| > no, if you include anybody in the World, you are back
| at 7B
|
| you should update your numbers
| Spivak wrote:
| I think you meant to reply to someone else, I said
| nothing about the internet only that the math of
|
| "5B people live under authoritarian regimes."
|
| "Actually it's only 1.9B if you use my definition of
| authoritarian, the rest are just _authoritarian
| behaviors_. "
|
| Okay so to everyone but you 5B live under authoritarian
| regimes then -- "Not technically authoritarianism" is not
| something you should feel the need to say about countries
| you're arguing aren't authoritarian.
|
| > another thing entirely to be hanged from a crane if
| you're gay
|
| I know right, we're so much more civilized. We just lock
| them up if they do gay stuff in public, require sexual
| education to teach that homosexuality is an "unacceptable
| lifestyle choice", define consensual sex between two
| homosexual teenagers as rape, and and have a standing
| legal theory that people just can't help it if they are
| thrown into a blind rage and attack a gay person for
| being gay that in 2023 is still a legal defense in 33
| states. And that's just the gays, the political punching
| bag de jour is trans people and they get it even worse.
| dark-star wrote:
| I think the problems of TCP (especially in a network with high
| latency spikes, non-negligible packet drop rates or when
| roaming through multiple networks while driving 160 km/h on the
| Autobahn) are pretty obvious, even if you leave the
| security/encryption aspect out of the picture... But maybe
| that's only me
| simiones wrote:
| While trimming fat from excessively complex web pages would be
| nice, HTTP over TLS over TCP has some very clear fundamental
| inefficiencies that need to be addressed sooner or later.
|
| The biggest one is that there is just no reason to negotiate
| multiple encrypted sessions between the same client and server
| just to achieve multiple parallel reliable streams of data. But
| since TLS is running over TCP, and a TCP connection has a
| single stream of data (per direction) you are forced to
| negotiate multiple TLS sessions if you want multiple parallel
| streams, which needlessly uses server resources and needlessly
| uses network resources.
|
| Additionally, because TCP and TLS are separate protocols,
| TCP+TLS connection negotiation is needlessly chatty: the TLS
| negotiation packets could very well serve as SYN/SYN-ACK/ACK on
| their own, but the design of TCP stacks prevents this.
|
| I believe it would be theoretically possible to simply set the
| SYN flags on the Client Hello and Server Hello TLS packets to
| achieve the TCP handshake at the same time as the TLS
| handshake, but I don't think any common TCP/IP stack
| implementation would actually allow that (you could probably do
| it with TCP Fast Open for the Client Hello, but I don't think
| there is any way to send the Server Hello payload in a SYN-
| ACK packet with the default stacks on Linux, Windows, or any of
| the BSDs).
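|
| (For illustration, a hedged sketch of the closest thing the
| default Linux stack does offer: client-side TCP Fast Open,
| where sendto() with MSG_FASTOPEN replaces connect() and the
| first payload, e.g. a ClientHello, may ride in the SYN if a
| TFO cookie for the server is already cached. The address,
| port, and helper name are placeholders; error handling is
| omitted.)
|       #define _GNU_SOURCE
|       #include <sys/socket.h>
|       #include <netinet/in.h>
|       #include <arpa/inet.h>
|       #include <string.h>
|
|       // Hypothetical helper: connect to an example address and
|       // hand the kernel the first payload up front. With
|       // MSG_FASTOPEN (Linux) the data can be carried in the
|       // SYN when a Fast Open cookie is cached; otherwise the
|       // kernel falls back to a normal handshake.
|       int tfo_connect_and_send(const char *payload, size_t len) {
|           int fd = socket(AF_INET, SOCK_STREAM, 0);
|           struct sockaddr_in srv;
|           memset(&srv, 0, sizeof srv);
|           srv.sin_family = AF_INET;
|           srv.sin_port = htons(443); // placeholder port
|           inet_pton(AF_INET, "192.0.2.1", &srv.sin_addr);
|           sendto(fd, payload, len, MSG_FASTOPEN,
|                  (struct sockaddr *)&srv, sizeof srv);
|           return fd;
|       }
| Note that the server side has to opt in as well (the
| TCP_FASTOPEN listen option), and even then it only saves the
| first round trip; it does not give you multiple independent
| streams.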
|
| And, since you'd still need to do this multiple times in order
| to have multiple independent data streams, it's probably not
| worth the change compared to solving both problems in a new
| transport protocol (the way QUIC has).
| xorcist wrote:
| The right way would probably be to implement a real TCP
| replacement. Look at SCTP for inspiration. There are
| certainly things that could be improved and would be even
| more useful than multiplexing, such as working congestion
| control and multipath.
|
| I understand that HTTP is the new TCP, but I don't have to
| like it.
| nightpool wrote:
| That's.... exactly what QUIC is? I don't understand your
| comment
| xorcist wrote:
| QUIC runs over UDP. In this sense it is more like a
| tunneling protocol than a layer 4 one.
| taway1237 wrote:
| isn't QUIC the new TCP in this case?
| [deleted]
| djha-skin wrote:
| This comment has a real grain of truth to it: reducing request
| load on your users has way more bang for the buck than changing
| protocols.
|
| At my company we did a test for HTTP2 and HTTP3 versus HTTP 1.1
| in South America on a 4G connection. We found that Netflix
| loaded in less than 2 and 1/2 seconds _regardless_ of what
| protocols the client supported.
|
| We determined that if we enabled HTTP2 or 3, we would save
| perhaps a second or two of load time, but that if we reduced the
| request load we could perhaps save on the order of _tens_ of
| seconds.
| redundantly wrote:
| <insert 'Why Don't We Have Both?' meme here>
|
| Joking aside, we should be happy for any improvement that can
| be obtained. A single second saved for a single connection on
| its own is insignificant, but dozens or hundreds of
| connections for a single user, multiplied by the thousands or
| possibly millions of users, that's a _huge_ amount of time
| saved.
| djha-skin wrote:
| Sure, yeah, it's all sunshine and roses for the client, but
| I'm a DevOps engineer. It is very difficult on the server
| side.
|
| You can enable it on CDNs pretty easily these days, but
| enabling it on the API side is more difficult. The nginx
| ingress controller for kubernetes doesn't support HTTP3
| yet. It supports HTTP2, but I've heard that you can get
| into trouble with timeouts or something if you don't tune
| HTTP2 right. It does look like it's doable though,
| certainly more doable than HTTP3.
|
| HTTP2 is a moderate lift from a systems administration
| standpoint, but it's also a risk. It means changing the
| infrastructure enough that outages can occur, so that I'd
| rather do this lift when things are more stable and the
| holiday season is over.
| BitPirate wrote:
| I'm pretty sure that Netflix operates a decent amount of
| servers in South America which would result in low latency
| connections.
|
| HTTP/3 really starts to shine once high latency and/or
| packetloss are involved.
| satellite2 wrote:
| Let's put it this way, imagine going to the doctor, complaining
| about how you're putting on a little extra weight, your legs
| are getting tired, and you're short of breath. The doctor,
| instead of suggesting the radical idea of a diet or exercise,
| goes, "Aha! What you need, my friend, is a super-fast,
| mechanized, all-terrain wheelchair that can drift corners at 60
| mph. That'll get you to the fridge and back in record time,
| ensuring your problematic lifestyle remains unscathed by
| reality!"
|
| When life gives you lemons, don't make lemonade. Construct an
| intricate, high-speed, international lemon delivery system,
| ensuring everyone gets to share in your citrusy misery, QUICly.
|
| So here's to QUIC, the glorious facilitator of our cybernetic
| decadence, ensuring that the digital freight train of
| unnecessary features, exhaustive trackers, and bandwidth-
| gulping page elements gets to you so fast, you won't have time
| to realize you didn't need any of it in the first place.
|
| Buckle up, folks, because the future of web browsing is here
| and it's screaming down the information superhighway like a
| minivan packed with screaming toddlers, on fire, yet somehow,
| inexplicably, punctual.
| pdimitar wrote:
| While I partially sympathize and agree with your pessimism,
| you underestimate the impact of regular users disengaging
| from various places of the internet exactly because it gets
| more and more meaningless, watered down, and doesn't provide
| any value over super brief validation.
|
| And while that super brief validation is still a very strong
| hook for metric tons of people out there, people do start to
| disengage. The sense of novelty and wonder is wearing off for
| many. I'm seeing it among my acquaintances at least (and they
| are not the least bit technical and don't want to be).
|
| That the future of the internet looks very grim is
| unequivocally true though. But there's a lot that can and
| will be done still.
| ToucanLoucan wrote:
| > While I partially sympathize and agree with your
| pessimism, you underestimate the impact of regular users
| disengaging from various places of the internet exactly
| because it gets more and more meaningless, watered down,
| and doesn't provide any value over super brief validation.
|
| Thank fuck. Burn it down. Take us all the way back to
| vBulletin, MSN Messenger and Limewire. I have seen the
| future and it sucks. The past sucked too but at least my
| grandparents weren't getting radicalized by
| HillaryForJail2024 on Facebook.
| lazide wrote:
| The market for electric wheelchairs/mobility scooters _is_
| doing really well, now that you mention it!
| epeus wrote:
| That reminds me of this classic 1990s internet description
| http://stuartcheshire.org/rants/InfoHighway.html
| yetanotherloss wrote:
| That pitch for the most metal Mad Max sequel was pure
| molten gold and still holds up.
| dmbche wrote:
| You made my week, I think this is my favorite piece of
| writing in a while.
|
| "A little kid on a tricycle with a squirtgun filled with
| HYDROCHLORIC ACID. "
| xorcist wrote:
| I was going to write something along the lines how happy I am
| that someone solved the problem of the time to first byte
| when at least two javascript frameworks and three telemetry
| packages, each including a dozen of resources, are required
| for just about any page. But you put it so much more
| eloquently.
| afavour wrote:
| What a contrast HN provides. The top post is some really
| interesting detail about the QUIC protocol, the things it can
| do and the efficiencies it enables. Immediately followed by
| this rant about web pages being too big these days.
|
| The duality of Hacker News.
| SanderNL wrote:
| Welcome to the desert of the real. That's some real diversity
| for you.
|
| Not everything that shines is gold my friend.
|
| But I concede my comments are really ranty. Sorry about that.
| It's frustration.
| jeroenhd wrote:
| Like a true HN centrist, I agree with both.
|
| QUIC is really cool technology. It can solve a wide range of
| problems that aren't necessarily HTTP based.
|
| However, HTTP/3 is using QUIC as a patch over badly designed
| websites and services.
|
| QUIC doesn't solve the problem, it provides another path
| around the source of the problems. This blog was written by
| Akamai, a CDN that makes profits off the problem of "my
| website is too fat to load quickly anymore". I don't blame
| them, they should seize the opportunity when they can, but
| their view is obviously biased.
|
| QUIC makes most sense for sites that can optimise it, like
| the Googles, Cloudflares, and Akamais of the world. For the
| average website, it's just another technology decision you
| now need to make when setting up a website.
| Karrot_Kream wrote:
| > However, HTTP/3 is using QUIC as a patch over badly
| designed websites and services.
|
| Most "modern bloated" SPAs are tiny. The default MSS for
| most TCP implementations is 1460 bytes, roughly 1.4 kB max
| per packet. Just the TCP handshake packets themselves (3 *
| 1460) can hold almost as much data as it takes to get a
| standard SPA. This says nothing about the TLS handshake
| itself which adds another 3 packets to connection
| establishment. Most SPAs send small and receive small
| payloads; a compressed JSON exchange can easily fit in a
| single packet (roughly 1.4 kB).
|
| The actual amount of bandwidth on the wire between a
| server-side rendered "fast" app and a "modern bloated" SPA
| isn't very different, the difference is in how they send
| and receive data. SSR pages send a thin request (HTTP GET)
| and receive a large payload; generally the payload is much
| larger than the connection establishment overhead, so they
| make good use of their connection. On the other hand, a
| naive SPA will involve opening and closing tens or even
| hundreds of connections which will lead to a huge waste (6
| packets per TLS+TCP connection) of overhead. QUIC makes it
| possible to have small connection-oriented request/response
| semantics on the internet with minimal fuss.
|
| That said the _problem_ with rants like GP 's comment is
| that they don't serve to illuminate but serve to push on
| us, the readers, the author's agitated emotional state.
| Instead of having a discussion about bandwidth and
| connection overheads, or even trying to frame the
| discussion in terms of what "waste" really is, we get
| emotional content that begs us to either agree with an
| upvote or disagree with a downvote with nothing of
| substance to discuss. Any discussion website is only as
| strong as its weakest content and if content like this
| continues to get lots of upvotes _shrug_
| cryptonector wrote:
| Translation: Making the network more efficient
| enbiggens the amount of excrement you can put in that
| bag. Efficiency bad!!
|
| ?
|
| > Edit: Of course I didn't read the article, but now I did and
| it's somehow worse than I thought. "Why TCP is not optimal for
| today's web": no answers. There are literally no answers here.
| Other than, TCP is old. Old things are difficult.
|
| I think you just didn't read the article period. That QUIC
| hides a lot of metadata that TCP doesn't was covered, for
| example, and there's stuff about multipath, etc. And clearly,
| yes, TCP is old, and a lot of the extensions done to it over
| the years (e.g., window scaling, SACK) are great but still not
| perfect (especially window scaling) and it's getting harder to
| improve the whole thing, and then there's congestion control.
| But TCP was never going to be replaced by SCTP because SCTP was
| a new protocol number above IP and that meant lots of
| middleboxes dropped it, so any new thing would have to be run
| over UDP, and it would have to be popular in the big tech set,
| else it wouldn't get deployed either.
| SanderNL wrote:
| "Efficiency" for whom and for what? Think deep before you all
| bow down before our overlords and ditch clear and simple
| protocols for "efficiency".
|
| I know what QUIC is for and I know its strengths. I just
| want the web to be simple and accessible. It's great Netflix
| and friends can stream content efficiently. It's another
| thing to push this to the "open web" and declare it The
| Protocol.
| Dylan16807 wrote:
| Sometimes a webpage just wants to have a bunch of images,
| which is not a bad thing, and something like this helps it
| load better.
|
| > It's great Netflix and friends can stream content
| efficiently.
|
| This barely affects streaming. Streams work fine with a
| single TCP connection.
| bastawhiz wrote:
| > Imagine you are holding a 200 requests/page bag of excrement
| and it's generating too much load on all systems involved.
|
| Sorry, this is a purely garbage comment. A "web 1.0" page with
| a photo gallery of a few hundred thumbnails can take a dozen
| seconds or more to load on my phone over 5G. Why? Because
| browsers limit you to six HTTP/1.1 connections each downloading
| a single image at a time, requested serially one after the
| other. A few big images? The rest of the page stops loading
| (regardless of the importance of the assets being loaded) until
| those images are done. It has nothing to do with how much
| bandwidth I have, it has everything to do with the
| insubstantial nature of HTTP/1.1 and TCP as protocols for
| downloading multiple files.
|
| For literally decades, we've been jumping through hoops to
| avoid these problems, even on "web 1.0" sites. In 2006 when I
| was building web pages at a cookie cutter company, rounded
| corners were done with "sliding doors" because loading each
| corner as its own image was too slow (and border-radius hadn't
| been invented). Multitudes of tools to build "sprite sheets" of
| images that are transformed and cropped to avoid loading lots
| of little icons, like the file type icons on web directories of
| old. The "old web" sites that HN adores tend to run
| astoundingly badly on HTTP/1.1.
|
| Not only do H2 and H3 fundamentally solve these _generational
| problems_, they've made the initial loads far faster by
| reducing the overhead of TLS handshakes (yet another sticking
| point of TLS holdouts!) and improved user privacy by encrypting
| more data. H2 and H3 would _absolutely_ have been welcomed with
| open arms fifteen years ago, long before the age of "24 fonts
| in 3 formats" because the problems were still present back
| then, regardless of whether you'd like to pretend they didn't.
|
| We should be celebrating that these protocols made _the whole
| internet_ faster, not just the bloated ones that you're upset
| about.
| agumonkey wrote:
| time to browse some gopher pages
| blauditore wrote:
| This is what always happens, because it's not the same
| people/teams/companies filling the bags and building the
| conveyor belts, and because it's easier that way.
|
| Another reason is that people are reluctant to give away any
| sort of convenience or vanity (a few nicer pixels in their
| website design) just for saving someone else's resources
| (bandwith/data/latency of server & and user). For the same
| reason people are driving SUVs in cities - it's a horrible
| misfit, but makes them feel good about very marginal benefits.
| antirez wrote:
| It's incredible that we have to witness this farce where it is
| pretended that thanks to the tech giants the web is becoming
| faster. The average web latency is terrible, and this because of
| the many useless framework/tracking layers imposed by such tech
| giants. And HTTP/3 is a terrible protocol: for minor improvements
| that are interesting only for the narrow-data-driven mentality of
| somebody in some Google office, the web lost its openness and
| simplicity. Google ruined the Internet piece by piece, with
| Adsense and AD-driven SERP results, with complexity, buying and
| destroying protocols one after the other, providing abandonware-
| alike free services stopping any potential competition and
| improvement.
| ReactiveJelly wrote:
| Google didn't force anyone to put ads on their blogspam sites.
| djha-skin wrote:
| I hate QUIC. I don't like that now there's an implementation of
| TCP in user space, and I find binary protocols nauseating,
| particularly when 90% of their value can be achieved using gzip
| over the wire. Not to mention corporate firewalls generally block
| UDP by default. UDP? Really?
|
| The design makes a lot of sense if you're talking about a remote
| client because now you can update the TCP stack without regard to
| when the user updates Android or whatever, but like GraphQL I
| feel like it's a technology that bends over backwards for the
| client. Maybe that's necessary and I get that, but whenever
| possible for other services that don't need to be sensitive to
| rural users I prefer to use things that make sense. REST and HTTP
| 1.1 over TCP continue to make sense.
| scrpl wrote:
| A few years ago I wrote an article on HTTP/3 that was briefly
| featured on HN as well:
| https://news.ycombinator.com/item?id=24834767
|
| I can't agree with author about it eating the world though. It
| seems like only internet giants can afford implementing and
| supporting protocol this complex, and they're the only ones who
| will get a measurable benefit from it. It is an upgrade for sure,
| but an expensive one.
| lemagedurage wrote:
| It's not all bad, you can start using Caddy for your new
| project and serve http3 right now.
| CharlieDigital wrote:
| ASP.NET also has support: https://learn.microsoft.com/en-
| us/aspnet/core/fundamentals/s... (minus macOS)
| Andrew018 wrote:
| [dead]
| anilshanbhag wrote:
| You can enable HTTP/3 with one click in Cloudflare. Just enabled
| it after reading this for Dictanote.
| ta1243 wrote:
| I have a corporate laptop which funnels all traffic through
| zscaler.
|
| I was somewhat surprised when I was getting a different IP
| address on ipinfo.io (my home IP) compared with whatsmyip.org (a
| zscaler datacentre IP)
|
| curl'ing ipinfo.io, though, came back with the zscaler address.
|
| Turns out they don't funnel UDP via zscaler, only TCP.
|
| Looking into zscaler, https://help.zscaler.com/zia/managing-quic-
| protocol
|
| > Zscaler best practice is to block QUIC. When it's blocked, QUIC
| has a failsafe to fall back to TCP. This enables SSL inspection
| without negatively impacting user experience.
|
| Seems corporate IT is resurging after a decade of defeat
| wkat4242 wrote:
| Zscaler is pure crap, we use it at work too. It's especially
| hard to configure docker containers for its proxy settings and
| ssl certificate.
|
| When I test something new in our lab I spend 10 minutes
| installing it and half a day configuring the proxy.
| fein wrote:
| It causes minor annoyances with ssl + maven as well, which
| can be fixed by -Dmaven.wagon.http.ssl.insecure=true.
|
| Well, at least they tried I guess.
| ta1243 wrote:
| No, setting any variable including the line
|
| "http.ssl.insecure=true"
|
| Is not a fix under any circumstance.
| ippi72 wrote:
| Which zscaler products does your company use? Do you have an
| idea of what better solutions are out there?
| wkat4242 wrote:
| The cloud service. I don't know what it's called exactly.
| It just says "Zscaler".
|
| In terms of better solutions, I would prefer a completely
| different approach. Securing the endpoint instead of the
| network. Basically the idea of Google's "BeyondCorp".
|
| What happens now is that people just turn off their VPN and
| Zscaler client to avoid issues, when they're working from a
| public hotspot or at home. In the office (our lab
| environment) we unfortunately don't have that option.
|
| But by doing so they leave themselves much more exposed
| than when we didn't have Zscaler at all.
| brewmarche wrote:
| Man, here I am reading this while fighting zScaler when
| connecting to our new package repository (it breaks because
| the inspection of the downloads takes too long). No one feels
| responsible for helping developers. Same with setting up
| containers, git, Python, and everything else that comes with
| its own trust store, you have to figure out everything by
| yourself.
|
| It also breaks a lot of web pages by redirecting HTTP
| requests in order to authenticate you (CSP broken). Teams
| GIFs and GitHub images have been broken for months now and no
| one cares.
| wkat4242 wrote:
| Ahhhh so _that 's_ why my teams gifs don't work. Thanks.
|
| We use an external auth provider which makes even more
| complex config yeah.
| brewmarche wrote:
| At least for me that's the problem. When I open the
| redirect url manually it also fixes the problem for some
| time.
|
| You can open the Teams developer tools to check this.
| Click the taskbar icon 7 times, then right click it. Use
| dev tools for select web contents, choose experience
| renderer AAD. Search for GIFs in Teams and monitor the
| network tab
| caerwy wrote:
| amen, brother!
| mongol wrote:
| It is the single most annoying impediment in corporate IT.
| And you are on your own when you need to work around the
| issues it causes. Is it really providing value, or is it just
| to feel better about security?
| steve_taylor wrote:
| It's not just an impediment. It's corporate spyware and
| possibly a prototype for Great Firewall 2.0.
| fsniper wrote:
| I hate corporate IT. Security with decades old arcane
| practices. Killing user experience with any way possible. MITM
| all around..
| Bluecobra wrote:
| Blame viruses, malware, phishing, ransomware, etc. IT has a
| responsibility to keep the network secure. Google is already
| experimenting with no Internet access for some employees, and
| that might be the endgame.
| fsniper wrote:
| There are really valuable practices that help security, and
| there are practices that just break security.
|
| The MITM practice in particular is a net negative. Rolling
| password resets and bad password requirements are also net
| negatives. Scanners that do not work as intended, that are not
| vetted at all, and that introduce slowness and feature breakage
| are further possible negatives.
|
| Also at some places they introduce predatory privacy
| nightmares like key loggers, screen recorders..
| dobin wrote:
| Full inspection of user traffic is required to implement:
|
| * Data leakage policy (DLP; insider threat, data
| exfiltration)
|
| * Malware scanning
|
| * Domain blocking (Gambling, Malware)
|
| * Other detection mechanisms (C2)
|
| * Logging and auditing for forensic investigations
|
| * Hunting generally
|
| I don't see how this breaks security, and of course you
| also didn't elaborate on why it should. That is assuming
| TLS MitM is implemented reasonably correctly.
|
| Don't worry though, zero trust will expose the company
| laptops again to all the malicious shit out there.
| acdha wrote:
| > I don't see how this breaks security
|
| You're training users to ignore certificate errors - yes,
| even if you think you're not - and you're putting in a
| critical piece of infrastructure which is now able to
| view or forge traffic everywhere. Every vendor has a
| history of security vulnerabilities and you also need to
| put in robust administrative controls very few places are
| actually competent enough to implement, or now you have
| the risk that your security operators are one phish or
| act of malice away from damaging the company (better hope
| nobody in security is ever part of a harassment claim).
|
| On the plus side, they're only marginally effective at
| the sales points you mentioned. They'll stop the sales
| guys from hitting sports betting sites, but attackers
| have been routinely bypassing these systems since the
| turn of the century so much of what you're doing is
| taking on one of the most expensive challenges in the
| field to stop the least sophisticated attackers.
|
| If you're concerned about things like DLP, you should be
| focused on things like sandboxing and fine-grained access
| control long before doing SSL interception.
| ta1243 wrote:
| A competent organisation will have a root certificate
| trusted on all machines so you won't be ignoring
| certificate errors. You are right however that you are
| funnelling your entire corporate traffic unencrypted
| through a single system, break into that and you have hit
| the goldmine.
| Bluecobra wrote:
| Correct, this is table stakes to get SSL Decryption
| working for any vendor. Typically we're talking about
| Windows PCs joined to Active Directory and they already
| trust the domain's CA. The firewall then gets its own CA
| cert issued by the domain CA, so when you go to
| www.facebook.com and inspect the certificate it says it
| is from the firewall.
|
| Most orgs don't inspect sensitive things like banking,
| healthcare, government sites, etc. Also it's very common
| to make exceptions to get certain applications working
| (like Dropbox).
| ngrilly wrote:
| Yes, if you want/need to do those things, then you need
| to inspect user traffic. But why do you want/need to do
| those things in the first place? What's your threat
| model?
|
| Doing this breaks the end-to-end encryption and mutual
| authentication that is the key benefit of modern
| cryptography. The security measures implemented in modern
| web browsers are significantly more advanced and up-to-
| date than what systems like Zscaler are offering, for
| example in terms of rejecting deprecated protocols, or
| enabling better and more secure protocols like QUIC. By
| using something like Zscaler, you're introducing a single
| point of failure and a high value target for hackers.
| Bluecobra wrote:
| > But why do you want/need to do those things in the
| first place? What's your threat model?
|
| Not everyone in a company is savvy or hard at work. Randy
| in accounting might spend an hour or more a day browsing
| the internet and scooping up ads, and be enticed to
| download something to help speed up their PC which turns
| out to be ransomware.
| midasuni wrote:
| In which case as Randy only has access to a few files you
| simply restore the snapshot of those files and away you
| go.
| fsniper wrote:
| * Data leaks are not prevented by a MITM attack. A
| sufficiently determined data leaker will easily find
| easier or more elaborate ways to circumvent it.
|
| * Malware scanning can be done very efficiently at the end
| user workstation. (But it is always done inefficiently.)
|
| * How does domain blocking require a MITM?
|
| * C2 scanning can be done efficiently at the end user
| workstation.
|
| * Audits do not require the "full contents of
| communication".
|
| Is MITM ever the answer?
|
| Stealing a valid communication channel and impersonating
| remote servers in fact breaks basic internet security
| practices.
| kccqzy wrote:
| Google disabling Internet access is very different from
| your typical company doing that. Watching a YouTube video?
| Intranet and not disabled. Checking your email on Gmail?
| Intranet and not disabled. Doing a web search? Intranet and
| not disabled. Clicking on a search result? Just use the
| search cache and it's intranet.
| ngrilly wrote:
| > IT has a responsibility to keep the network secure.
|
| Yes, but TLS inspection is not the solution.
|
| > Google is already experimenting with no Internet access
| for some employees, and that might be the endgame.
|
| Source? And I'm pretty sure they are not considering
| disconnecting most of their employees who actually need
| Internet for their job.
| Bluecobra wrote:
| Source: https://arstechnica.com/gadgets/2023/07/to-
| defeat-hackers-go...
|
| Eventually I think the endgame here is that you use your
| own personal BYOD device to browse the internet that is
| not able to connect to the corporate network.
| blkhawk wrote:
| This has nothing to do with security and more with
| ineffective practices based on security where nobody knows
| why it's done, just that it's done. Running MitM on
| connections basically breaks basic security mechanisms for
| some ineffective security theater. This is basically
| "90-day password change" 2.0.
| Bluecobra wrote:
| MitM can absolutely stop threats if done correctly. A
| properly configured Palo Alto firewall running SSL
| Decryption can stop a random user downloading a known
| zero-day package with Wildfire. Not saying MitM is an end
| all be all, but IMHO the more security layers you have
| the better.
|
| At the end of the day, it's not your network/computer.
| There's always going to be some unsavvy user duped into
| something. If you don't like corporate IT, you're free to
| become a contractor and work at home.
| fsniper wrote:
| "A properly configured Palo Alto firewall running SSL
| Decryption can stop a random user downloading a known
| zero-day package with Wildfire."
|
| Instead, that Corp IT should have put a transparently
| working antivirus/malware scanner on the workstation that
| would prevent that download from being run at all.
|
| DPS/MITM are not security layers but more of privacy
| nightmares.
| Bluecobra wrote:
| I disagree, I think you should have both, as an endpoint
| scanner (either heuristics or process execution) may not
| catch everything (for example, malicious JavaScript from
| an advertisement).
|
| Why do you care so much about your privacy while you're
| on company time using their computers, software, and
| network? If you don't like it, bring your own
| phone/tablet/laptop and use cellular data for your
| personal web browsing. FWIW, it's standard practice to
| exempt SSL decryption for banking, healthcare, government
| sites, etc.
| EvanAnderson wrote:
| > Instead that Corp IT should have put a transparently
| working antivirus/malware scanner on the workstation that
| would prevent that download to be run at all. ?
|
| Sure. Then come the complaints that this slows down
| endpoint devices and has compatibility issues. Somebody
| gets the idea to do this in the network. Rinse. Repeat.
| fsniper wrote:
| Our CorpIT has that and fine tuned it to perfection. No
| one complains now. So it's possible.
|
| Unfortunately they still do MITM which breaks connections
| regularly.
| EvanAnderson wrote:
| It's a knife's edge. One OS patch, or one vendor change
| in product roadmap, and you can be right back to endpoint
| security software performance and compatibility hell.
| Stuff has gotten better but it's still fraught with
| peril.
| Spivak wrote:
| > where nobody knows why its done just that its done
|
| Compliance. You think your IT dept _wants_ to deploy this
| crap? However painful you think it is as an end user,
| multiply it by having to support hundreds/thousands of
| endpoints.
|
| Look, I hate traffic inspection as much as the next
| person, but this is for security, it's just not the
| security you want it to be. This is so you have an _audit
| trail_ of data exfiltration and there's no way around
| it. You need the plaintext to do this, and the whole
| network stack is built around making this a huge giant
| pain in the ass. This is one situation where soulless
| enterprises and users should actually be aligned. Having
| the ability in your OS to inspect the plaintext of all
| incoming and outgoing traffic by forcing apps off raw
| sockets would be a massive win. People seem to
| understand how getting the plaintext for DNS requests is
| beneficial to the user, but not HTTP for some reason.
|
| Be happy your setup is at least opportunistic and not
| "block any traffic we can't get the plaintext for."
| steve_taylor wrote:
| > You think your IT dept _wants_ to deploy this crap?
|
| Yes, they do.
| betaby wrote:
| Compliance with exactly what? Their own rules?
| neon_electro wrote:
| Link for more info? That seems impossible to make work.
| Bluecobra wrote:
| Source: https://arstechnica.com/gadgets/2023/07/to-
| defeat-hackers-go...
| pmarreck wrote:
| I know that they have a gigantic intranet; that might
| make the lack of internet during the workday less painful.
| mschuster91 wrote:
| I read this recently for sysadmins at Google and
| Microsoft that have access to absolute core services like
| authentication, which it _does_ make sense to keep
| airgapped.
| eep_social wrote:
| This sounds like a misunderstanding of the model. Usually
| these companies have facilities that allow core teams to
| recover if prod gets completely fucked e.g. auth is
| broken so we need to bypass it. Those facilities are
| typically on separate, dedicated networks but that
| doesn't mean the people who would use them operate in
| that environment day to day.
| peoplefromibiza wrote:
| companies can be held liable for what people using their
| networks do, so they need a way to prove it's not their fault
| and provide the credentials of the malevolent actor.
|
| it's like call and message logs kept by phone companies.
|
| nobody likes to keep them but it's better than breaking
| the law and risking someone abusing your infrastructure.
|
| it would also be great if my colleagues did not use the
| company network to check the soccer stats every morning for 4
| hours straight, so the company had to put up some kind of
| domain blocking that prevents me from looking up some
| algorithm i cannot recall from the top of my mind on
| gamedev.net because it's considered "gaming"
| FuriouslyAdrift wrote:
| Blame the law. Companies are bound by it. Actually blame
| terrible programming practices and the reluctance to tie the
| long tail of software maintenance and compliance to the
| programmers and product managers that write them.
| jarym wrote:
| > without negatively impacting user experience
|
| I can't stop laughing.
| Faaak wrote:
| Hopefully, IT doesn't notice when I use my `kill-zscaler.sh`
| script. It's horrible to work around when you arrive at a new
| company using it.
| ta1243 wrote:
| The only reason I have one is so I can prove the problem is
| with zscaler and not my network.
|
| I remember someone complaining about the speeds writing to a
| UNC file share over a 10G network, using a Blackmagic file
| writing test tool.
|
| It was really slow (about 100MB/sec), but iperf was fine.
|
| I did a "while (true) pkill sophos" or similar. Speeds shot
| upto about 1GB/sec.
|
| Closed the while while loop, sophos returned in a few
| seconds, and speeds reduce to a crawl again.
|
| But who needs to write at a decent speed in a media
| production environment.
|
| And people still wonder why ShadowIT constantly wins with the
| business.
| PrimeMcFly wrote:
| Care to share?
| Faaak wrote:
| https://github.com/bkahlert/kill-zscaler
| PrimeMcFly wrote:
| Should have tried searching for it first I guess, thanks!
| rcstank wrote:
| What all is in that script? I'd love to have it. Sincerely,
| another dev abused by Zscaler.
| Faaak wrote:
| https://github.com/bkahlert/kill-zscaler
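|
| The general idea (a rough, hypothetical sketch of what scripts
| like that do on macOS, not necessarily the linked repo
| verbatim) is just to unload the Zscaler launchd jobs:
|     find /Library/LaunchAgents -name '*zscaler*' \
|         -exec launchctl unload {} \;
|     sudo find /Library/LaunchDaemons -name '*zscaler*' \
|         -exec launchctl unload {} \;
| and to load them again when you need the tunnel back.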
| betaby wrote:
| You can try to block the Zscaler (or Netskope) IPs on your
| home router. Most of the time IT laptops default to 'normal'
| web behaviour if Zscaler/Netskope is not available.
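|
| On a Linux-based router that's a one-liner, something like
| (the CIDR below is only illustrative; look up the ranges your
| client actually connects to):
|     iptables -I FORWARD -d 165.225.0.0/17 -j REJECT
| REJECT rather than DROP makes the client give up and fall back
| faster.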
| c7DJTLrn wrote:
| Simple HTTP/1.1 would've been perfectly adequate if the web
| wasn't collapsing under its own weight made up of megabytes of
| JavaScript firing off tens of requests a second.
| y04nn wrote:
| I'm not sure about that. I noticed that some websites (even
| simple blogs) felt instantaneous when loading (after pressing
| enter or clicking a link), way faster than other websites.
| So I opened the dev tools, and the first request was HTTP/3.
| But all resources were still loaded with HTTP/2! HTTP/3 really
| brings something non-negligible. Those accumulating TCP/TLS
| handshakes (+ high Wi-Fi latency) are really a burden and degrade
| the user experience when you have a fast internet connection.
| 1vuio0pswjnm7 wrote:
| "Why TCP is not optimal for today's web"
|
| Hmm, what is "today's web". Surveillance and advertising.
|
| Let's be reasonable. Not every web user is exactly the same. Some
| might have different needs than others. People might use the web
| in different ways. What's optimal for a so-called "tech" company
| or a CDN might not be "optimal" for every web user. The
| respective interests of each do not always align on every issue.
|
| "Over time, there have been many efforts to update TCP and
| resolve some of its inefficiencies - TCP still loads webpages as
| if they were single files instead of a collection of hundreds of
| individual files."
|
| To be fair, not all web pages are "collections of hundreds of
| individual files", besides the file/path that the web user
| actually requested, _that the web user actually wants to see_.
| For example, I use TCP clients and a browser (HTML reader) and I
| only (down)load^1 a single file, using HTTP /1.x. This allows me
| to read the webpage. Most of the time when I am using the web,
| that's all I'm trying to do. For example, I access all sites
| submitted to HN this way. I can read them just the same as
| someone using HTTP/3. Hence I can discuss this blog post in this
| comment using HTTP/1.1 and someone else can discuss the same blog
| post using HTTP/3. The text that we are discussing is contained
| in a single file.
|
| So what are these "collections of hundreds of individual files"?
| Well, they might come from other sites, i.e., other domains,
| other host IP addresses. Chances are, they are being used for
| tracking, advertising or other commercial purposes. And I am
| ignoring all the HTTP requests that do not retrieve files, e.g.,
| telemetry, (incorrectly) detecting whether internet connection
| exists, etc.
|
| IMHO, the best websites are not comprised of pages that each
| contain hundreds of individual files, sourced from servers
| to which I never indicated an intent to connect. The best
| websites are, IMHO, ones with hundreds of pages where each is a
| single file containing only the information I am interested in,
| nothing else. No tracking, no ads, no manipulative Javascripts,
| no annoying graphical layouts, no BS. HTTP/1.1 provides an
| elegant, efficient method to request those pages in sequence.
| It's called pipelining. Multiple HTTP requests sent over a single
| TCP connection. No multiplexing. The files come from a single
| source, in the same order as they were requested. Been using this
| for over 20 years. (If the web user wants "interactive" webpages,
| filled with behavioural tracking and advertising, then this is
| not optimal. But not every web user wants that. For information
| retrieval it is optimal.)
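|
| A minimal sketch of what that looks like from a shell, for
| servers that honour pipelining (example.com and the two paths
| are placeholders): two requests written back-to-back over one
| TCP connection, two responses read back in the same order.
|     { printf 'GET /a.html HTTP/1.1\r\nHost: example.com\r\n\r\n'
|       printf 'GET /b.html HTTP/1.1\r\nHost: example.com\r\n'
|       printf 'Connection: close\r\n\r\n'; } | nc example.com 80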
|
| Not every web user is interested in a commercialised web where
| the protocols are optimised for tracking and programmatic
| advertising. The early web had no such complexity nor annoyances.
| HTTP/3 can co-exist with other, simpler protocols, designed by
| academics not advertising service companies or CDNs. The other
| protocols may or may not be as suitable for advertising and
| commercial use.
|
| HN changed the title from "Why HTTP/3 is eating the world" to
| "HTTP/3 adoption is growing rapidly". Perhaps HTTP/3 is being
| hyped. If it is a protocol "optimised" for commercial uses that
| benefit the companies (advertising services, CDN services) who
| authored the specification, this would not be surprising.
|
| Interestingly, the commentary in this thread mainly focuses not
| on HTTP/3 but on QUIC. QUIC reminds me of CurveCP, an earlier
| UDP-based TCP replacement. I have run HTTP/1.1 over CurveCP on
| the home network. In theory shouldn't it be possible to use any
| protocol with QUIC, via ALPN. Something like
|     printf 'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n' \
|       | openssl s_client -quic -alpn http/1.1 \
|           -connect 93.184.216.34:443 -ign_eof
|
| 1. The terminology "load" is interesting. It's more than
| downloading. It's downloading code plus running it,
| automatically, without any user input. There is no opportunity to
| pause after the download step to choose whether or not to run
| someone else's code. That's "today's web". If one uses a
| "modern" web browser issued by an advertising-supported entity.
| The browser I use is not issued by an advertising-supported
| company, AFAIK it's originally written by university staff and
| students; it does not auto-load resources and it does not run
| Javascript. There is no waiting for a site to "load".
| superkuh wrote:
| HTTP/3 is becoming more popular because Microsoft and Google
| white-washed google's QUIC through the IETF. It's popular because
| what they call "HTTP/3" is perfectly designed for meeting the
| needs of a multi-national corporation serving web content to
| millions of people. But it's only megacorps using it for their
| walled gardens.
|
| It's a terrible protocol for the human person web. You can't even
| host a visitable website for people who use HTTP/3 without
| getting the permission of a third party corporation to
| temporarily lend you a TLS cert. Once Chrome drops HTTP/1.1
| support say goodbye to just hosting a website as a person free
| from any chains to corporate ass-covering. While LetsEncrypt is
| benign now, it won't always stay that way as more and more people
| use it. Just like with dot Org it'll go bad and there'll be
| nowhere to run.
|
| TLS-only HTTP/3 is just as bad for individual autonomy as web
| attestation, in its own way.
| dmazzoni wrote:
| You're against HTTP/3 because it makes security mandatory?
|
| Have you been completely asleep at the wheel over the past 20
| years as ISPs around the world have started injecting ads in
| people's unencrypted HTTP traffic?
|
| That's great that you want to host a simple website that other
| people around the world can visit using simple HTTP. Maybe if
| your website is completely benign and harmless, that's not
| unreasonable.
|
| But a lot of us want to share information that's important, or
| sensitive - and we want sites where people can interact
| privately. I don't want the content of my site manipulated by
| third-parties along the way, and I don't want them snooping on
| traffic as people interact with my site.
| dang wrote:
| > Have you been completely asleep at the wheel
|
| Can you please edit out swipes like that? Your comment would
| be just fine without that bit.
|
| This is in the site guidelines:
| https://news.ycombinator.com/newsguidelines.html.
| the8472 wrote:
| > Have you been completely asleep at the wheel over the past
| 20 years as ISPs around the world have started injecting ads
| in people's unencrypted HTTP traffic?
|
| ISPs are in my local jurisdiction, under regulators I may
| have voted for, and I have a contract with them which I could
| get checked by courts if it's important enough to me. And
| I've voted with my feet by switching to a different ISP when
| the previous one did even a hint of nefarious DNS things
| (they never injected something into HTTP ftws, "merely"
| NXDOMAIN hijacking).
|
| I can't say the same about google, cloudflare, let's encrypt
| and so on.
|
| Trading a cacophony of good and bad local ISPs for some
| quite dubious US-centric companies is hardly an improvement.
|
| Also, it's a general sign that things are in a low-trust
| equilibrium when all this extra complexity is needed. HTTP is
| perfectly fine if you live in a nicer pocket.
| superkuh wrote:
| Yes, because it's mandatory. If the HTTP/3 implementations
| allowed self-signed certs it'd be okay. But they don't. Or if
| HTTP/3 allowed the option of connections without TLS but
| defaulted to TLS, that'd be okay. But it doesn't.
| afiori wrote:
| > If the HTTP/3 implementations allowed self-signed certs
|
| Is it forbidden at the protocol level or by the
| implementations?
| tambre wrote:
| RFC 9114 § 3.1, paragraph 2 [0] requires TLS certificates, but I
| imagine you can easily modify existing implementations to
| remove this restriction. And even if the spec didn't
| require it I'd expect all major implementors (web
| browsers) to still impose this. I imagine something like
| curl has an opt-out for this (-k?).
|
| Note that the spec essentially says "if verification
| fails don't trust this connection". What not trusting
| means is up to the application. For browsers that's
| probably equivalent to not allowing the connection at
| all.
|
| [0] https://www.rfc-editor.org/rfc/rfc9114#section-3.1-2
| treesknees wrote:
| It's an implementation limit that at least Chrome (and
| possibly other browsers) are enforcing for QUIC
| connections. The TLS certificate must be signed by a
| trusted CA. [1]
|
| There does appear to be cli flags to change the behavior,
| but I don't see how you'd do that on mobile or embedded
| browsers.
|
| [1]https://www.smashingmagazine.com/2021/09/http3-practic
| al-dep...
| remram wrote:
| You can add trusted CAs to your system though...
| paulddraper wrote:
| HTTP/3 allows self-signed certs.
|
| Chrome does not. But that choice is orthogonal, could
| happen with any protocol.
| vinay_ys wrote:
| I didn't know chrome didn't allow self-signed
| certificates. Since when?
| AgentME wrote:
| The user can add self-signed certificates. Random
| websites using self-signed certificates won't work
| without that extra configuration from the user.
| dark-star wrote:
| I'm pretty sure the old "thisisunsafe" trick will still
| work ...
| [deleted]
| giantrobot wrote:
| > Chrome does not. But that choice is orthogonal to
| protocol.
|
| Which means HTTP/3 _de facto_ doesn 't support self-
| signed certificates. Once Chrome disables HTTP 1.1/2
| which it will at some point in the name of security or
| performance, you'll only be able to exist on the web with
| a CA signed certificate.
| superkuh wrote:
| Yeah, it's difficult for me to always add the qualifier,
| "HTTP/3 allows self-signed certs but no implementation
| that exists in any browser allows self signed certs".
| paulddraper wrote:
| The original comment was opposition to HTTP/3 because of
| mandatory secure connections.
|
| In reality, the opposition is misdirected; it is Chrome
| that requires secure connections, not HTTP/3.
| nightpool wrote:
| Plenty of browsers allow self-signed certs--Firefox and
| Safari, to the best of my knowledge, treat HTTP/3 certs
| exactly the same as they treat HTTP/2 and HTTP/1.1 certs.
| Chrome has taken the position that it will no longer
| allow self-signed _Root CAs_ for HTTP /3 websites, to
| prevent SSL interception by companies intercepting
| _all_ of your traffic. For personal use, you can always
| whitelist an individual _certificate_ using a CLI flag
| without allowing a trusted root CA
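|
| Roughly, for a desktop build (the exact switch names are
| Chromium-specific and can change between versions, so treat
| this as a sketch): hash the cert's public key and pass it on
| the command line.
|     openssl x509 -in cert.pem -pubkey -noout \
|       | openssl pkey -pubin -outform der \
|       | openssl dgst -sha256 -binary | base64
|     chromium --ignore-certificate-errors-spki-list=<that hash> \
|       --origin-to-force-quic-on=example.com:443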
| cortesoft wrote:
| You can use self signed certs, you just have to add your CA
| as a trusted CA.
| Dylan16807 wrote:
| "A self-signed certificate is one that is not signed by a
| CA at all - neither private nor public. In this case, the
| certificate is signed with its own private key, instead
| of requesting it from a public or a private CA."
|
| This definition sounds right to me. Do you disagree with
| it?
|
| I get what you're saying, that you can set up a
| certificate yourself. But you can't accept a certificate
| _someone else_ set up. (in Chrome)
| poizan42 wrote:
| That's definitely the wrong definition. A self-signed
| certificate is a certificate that is signed by the
| private key corresponding to the public key contained in
| that certificate - in other words it's signed by itself.
|
| In fact every CA root is by necessity self-signed and
| therefore signed by a CA (i.e. itself).
| Dylan16807 wrote:
| I don't think they were talking about certificates that
| were also CAs. It's not wrong, it's insufficiently
| precise.
|
| Anyway, your definition is _clearer_, but it still
| supports the point I was trying to make. You don't "add
| your CA" in order to use self-signed certs. You do that
| to use a private CA that will sign all your certs. And
| doing so only allows you to use websites you signed, not
| websites other people signed. It would be a terrible idea
| to add anyone else's CA, and you can't easily use your CA
| to slap your signature onto websites other people signed.
| Adding your own CA is a completely different situation
| from trying to visit a self-signed site.
| dharmab wrote:
| If you need certs for local dev, use a local CA. Tools like
| mkcert automate this.
|
| If you need certs on a private network, either use Let's
| Encrypt/ACME or run your own CA and add the CA cert to your
| machines.
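|
| For the local-dev case it's a couple of commands (the names
| below are just examples):
|     mkcert -install
|     mkcert myapp.test localhost 127.0.0.1 ::1
| The first one creates a local root CA and adds it to the
| trust stores; the second issues a cert/key pair for those
| names that you then point your dev server at.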
| PH95VuimJjqBqy wrote:
| For some reason I wasn't aware of that, but I agree with
| you.
|
| There's no reason to mandate TLS for everything just
| because of a protocol.
| mrighele wrote:
| > Have you been completely asleep at the wheel over the past
| 20 years as ISPs around the world have started injecting ads
| in people's unencrypted HTTP traffic?
|
| Now finally those that forced HTTPS on everybody have a
| monopoly on that :-) (joking, but only in part...)
| [deleted]
| jasonjayr wrote:
| The crux of people's objections is that to bring up a
| "secure" site with this mechanism, you need the blessing of
| an unaffiliated 3rd party.
|
| And yea, for the last 20 years ISPs have been doing stupid
| stuff with people's connections, but companies have been
| trying harder and harder to lock down the internet and put
| the genie back in the bottle.
|
| (This is the core of the objection to PassKeys, FWIW)
|
| If HTTP/3 let users run their own secure site, without any
| 3rd party parts, then we are good. Why not a trust-on-first-
| use mechanism, but with a clear UX explaining the
| implications of accepting the certificates?
| wredue wrote:
| Dude. Everyone is asleep.
|
| The largest, most frequent, and easiest to exploit but also
| easiest to fix issue continues to DOMINATE hacks:
|
| Not sanitizing user entered inputs.
|
| Not saying this is an excuse to get rid of TLS, but TLS is not
| really where the main focus of people hosting home blogs
| needs to be.
|
| Ah. I see the "it's easier to reason about" crowd of hype
| driven developers are here, as indicated by how the voting is
| going.
| lost_tourist wrote:
| Can't you sign your own certificates? I honestly haven't done
| any work with http/3 so it's a sincere question.
| superkuh wrote:
| I could. But that wouldn't help make my website visitable for
| anyone. I'm not trying to access my own hosted site. I want a
| random person on the other side of the world who I've never
| met and will never meet to be able to visit my website.
| That's the point of public websites.
|
| Generating my own root cert and using it does not help with
| this at all since no one else will have it installed nor
| install it unless I'm already good friends with them.
| Installing a root cert in the various trust stores is no
| simple task either.
|
| The easier self-signing with no root cert also doesn't help
| because no current browser implementation of HTTP/3
| supports self-signed certs.
| dark-star wrote:
| Let's Encrypt and similar services will allow you to do
| exactly that though...?
| lost_tourist wrote:
| I mean you have to do that with https now if you don't go
| with one of the big guys? How is that any more dangerous to
| the end user than just going to an http site that they
| don't know personally? The risks are the same as far as
| malware.
| superkuh wrote:
| >How is that any more dangerous to the end user than just
| going to an http site that they don't know personally?
|
| Installing a random root cert to your trust store is a
| very dangerous thing to do but you're looking at this
| from the wrong end. I'm not talking about the dangers of
| a random user installing a random TLS root cert from me I
| send them over, what, email? That's just not going to
| happen. It shouldn't happen.
|
| I'm talking about the ability for human persons to host
| visitable websites without corporate oversight and
| control. With HTTP/1.1 I can just host a website from my
| IP and it's good. Anyone can visit. With HTTP/3 before
| anyone can visit my website I first have to go ask
| permission every ~90 days from a third party corporation.
| And if I piss them off (or people who put pressure on
| them in certain political regions), say by hosting a pdf
| copy of a scientific journal article, or a list of local
| abortion clinics, they can revoke my ability to host a
| visitable website.
|
| With HTTP/3, as a human person, the cure is worse than
| the "disease".
| remexre wrote:
| I think they meant, can't a visitor just ignore that your
| cert is self-signed and view the page anyway, _without_
| adding it to the trust store. Firefox, at least, has this
| as the default "ignore the cert error" action.
| superkuh wrote:
| Setting up a root cert in a trust store and accepting a
| non-root self-sign cert are very different things.
|
| That said, "can't a visitor just ignore that your cert is
| self-signed and view the page anyway,"
|
| No. They cannot. Not when it's HTTP/3. Even if the
| browser ostensibly supports it the HTTP/3 connection will
| fail because the HTTP/3 lib Firefox uses does not allow
| it.
| paradite wrote:
| Would that actually help your cause, since it would push
| people to build and distribute their own user agents
| which accept self-signed certs/CA in a user-driven
| community without big corp oversight?
| PH95VuimJjqBqy wrote:
| You'd have to get all the major browsers to trust that
| CA, it's not possible to do that without "corporate
| oversight".
|
| That's the point the other poster is making, it adds a
| new level of control that becomes the de facto only way
| to do things if http 1.1 ever gets deprecated.
|
| I have no opinion on the likelihood of any such
| deprecation but I fully understand the concern.
| joshjje wrote:
| Thats mainly a problem with the browsers, no? Not saying
| it isn't an issue, obviously the big ones are going to
| drive a lot of this technology, but you could still use
| something like curl or whatever.
| Macha wrote:
| And where do you get that IP? From a corporate ISP? From
| a corporate host? From a politically pressurable entity
| like your local NIC?
| ta8645 wrote:
| Every visitor would have to manually trust them; provided
| they're using a browser that still allows them to do so.
| kevincox wrote:
| This is literally the same as HTTP except that browsers
| don't (by default yet) put up a scary warning for that. But
| with a self-signed cert you get protection from passive
| attackers and once you press yes the first time it verifies
| if someone _else_ tries to hijack your connection.
|
| I think almost all protocols should have always-on
| encryption. You can choose to validate the other end or not
| but it is simpler and safer to only have the encrypted
| option in the protocol.
|
| FWIW I have HTTPS-only mode enabled and I would prefer to
| be notified of insecure connections. To me a self-signed
| cert is actually better than HTTP.
|
| I'm sure it will be a while until HTTPS-only is the
| default, but it seems clear that browsers are moving in
| that direction.
| [deleted]
| jezzamon wrote:
| If you want a domain name, it was never possible to host your
| own webpage without involving someone else who owns the
| domain/name server side of things
| apichat wrote:
| That's why Gnu Name System (GNS) and GNUnet are here
|
| https://www.gnunet.org/en/
|
| GNUnet helps building a new Internet
|
| GNUnet is a network protocol stack for building secure,
| distributed, and privacy-preserving applications. With strong
| roots in academic research, our goal is to replace the old
| insecure Internet protocol stack.
|
| https://www.gnunet.org/en/applications.html
|
| The GNU Name System
|
| The GNU Name System (GNS) is a fully decentralized
| replacement for the Domain Name System (DNS). Instead of
| using a hierarchy, GNS uses a directed graph. Naming
| conventions are similar to DNS, but queries and replies are
| private even with respect to peers providing the answers. The
| integrity of records and privacy of look-ups are
| cryptographically secured.
| xorcist wrote:
| That's not a great analogy. Your registrar is bound by a
| strict contract, in some countries it may even be telecom
| legislation, and your domain is legally yours (again, within
| contract bounds). While they need to delegate it to you, they
| cannot arbitrarily suddenly give it to someone else.
|
| BBC.co.uk belongs with the Beeb, anything else would be
| considered an attack on Internet infrastructure and treated
| as such. You cannot compare that with the power Google has
| over Chrome. It is theirs to do what they wish with.
| pmlnr wrote:
| But you don't _need_ a domain name to host a webpage. It can
| be served over the IP address. You don't need a public IP
| either, a page can be for the local net.
|
| But indeed, if you want a traditional webpage that is
| accessible over the net and possible to remember its URL,
| then yes, you need a domain, and for that, you need (at some
| level, even if you're a registrar) the entity who runs the
| tld.
| paulddraper wrote:
| Hosting a HTTP/3 page for local net.....
|
| I think we've gotten very theoretical
| losteric wrote:
| Sci-Hub often needs to be accessed by a random IP and/or
| without a corporate-blessed TLS cert. The same goes for
| many counter-establishment sites across the world (China,
| Iran, ...).
|
| The premise of the Internet was distributed dissemination
| of information for the mass public. There is a real fear
| that we are walking through practical one-way doors, ever
| increasing the barrier of access to disruptive counter-
| corporate/counter-state information.
|
| It doesn't take a huge leap to relate these concerns to
| America's future political discourse.
| vruiz wrote:
| Security and accessibility/simplicity are almost always
| at odds with each other. It's a tradeoff that needs to
| be made. You are entitled to dislike the current trend
| and prefer making security optional. But you can't
| possibly be surprised if most people are happy to
| prioritize their privacy and security over "the barrier
| of access to disruptive counter-corporate/counter-state
| information".
| nofunsir wrote:
| This is disingenuous.
|
| The "most" in your strawman here is just companies like
| Google who want to a) bend to those who want to DRM the
| entire web b) hide and lock away their tracking traffic
| from those being tracked c) make ad blocking impossible.
|
| Please explain why OC "can't be surprised."
| losteric wrote:
| HTTP 1 is on a deprecation path and HTTP3 requires TLS,
| which would mean getting the blessing of a trusted
| (typically corporate) root cert every 90 days to continue
| letting random people access my website.
|
| In the US, states recently passed anti-abortion laws
| which also banned aiding and abetting people seeking the
| procedure. That would cover domain names and certs if any
| relevant tech companies had headquartered in those states
| - or if passed as federal law.
|
| Trans rights are actively heading in that direction, and
| supporters are the very same that lambasted NYT and
| others as "fake news" that needed to be banned while
| pushing narratives of contrived electoral processes.
|
| Fear of political regression is real in America, without
| even looking internationally.
|
| Societal and technical systems evolve together. With the
| deprecation of HTTP1, future cheap middleware boxes will
| likely effectively enforce using HTTP3 and consolidate
| the tech landscape around a system that is far more
| amenable to authoritarian control than the prior
| generation of protocols.
|
| It's fair and valid to call out such scenarios when
| discussing international technical standards. These
| protocols and the consequences will be around for decades
| in an ever evolving world.
| abortions4life wrote:
| [flagged]
| throwaway892238 wrote:
| If HTTP/1.1 is deprecated from all browsers, and HTTP/3
| eventually becomes the only way to view a web page, then
| it will be impossible to host a localnet web page (ex. a
| wifi router). The people pushing this standard through
| don't make routers, so they don't give a shit, and
| everyone on the planet will just be screwed. This is what
| happens when we let 1 or 2 companies run the internet for
| us
| paradite wrote:
| Pardon my knowledge. If we are to get very technical
| instead of for average people, surely you can self-sign a
| cert and setup your CA chain on the computers in your
| local network?
|
| Or is there something else that prevents you from hosting
| HTTP/3 locally?
| taway1237 wrote:
| As far as I know, browsers don't allow self-signed
| certificates on HTTP/3. This was mentioned by people in
| comments here, and quick google seems to confirm.
| throwjdn wrote:
| [dead]
| Majestic121 wrote:
| You cannot use a certificate that was not signed by a
| trusted CA, but nothing keeps you from creating your own
| CA, making it trusted locally, and using it to sign your
| cert
| koito17 wrote:
| > making it trusted locally
|
| That is precisely the problem. Most proprietary systems
| don't let you touch the trust store at all. Even "open"
| platforms like Android have been locking down the ability
| to do _anything_ to the trust store.[1]
|
| With that said, if we assume the user is only using
| Google Chrome and not an alternative browser, then typing
| "thisisunsafe" on the TLS error page should let one elide
| trust store modifications entirely. I cannot guarantee
| this is the case for HTTP/3 since the reverse proxies I
| deal with still use HTTP/2.
|
| [1] https://httptoolkit.com/blog/android-14-breaks-
| system-certif...
| mgaunard wrote:
| Nothing except convenience and compatibility with dozens
| of operating systems that might operate on the network.
|
| Can you even easily do it on Android? Without an Internet
| connection?
| jrmg wrote:
| Knowing how to do that is quite a high barrier to entry.
| throwjdn wrote:
| [dead]
| utybo wrote:
| Couldn't mkcert handle most of that process?
| cryptonector wrote:
| Impossible because you'd need a server certificate? But
| you can issue it yourself and add a trust anchor to your
| browsers.
| WorldMaker wrote:
| Routers _today_ are already dealing with this problem
| because Chrome throws major security warnings for any
| unencrypted HTTP traffic. The current solutions I 've
| seen are to use things like Let's Encrypt/ACME
| certificates for wildcard sub-domains
| *.routercompanylocal.tld and a bit of secure handshake
| magic to register temporary A/AAAA records for link-local
| IP addresses (DNS has no problem advertising link-local
| addresses) and pass the private parts of the certificate
| down to the local hardware. Several major consumer
| routers I've seen will auto-route users to something like
| https://some-guid.theircompanyrouterlocal.tld and
| everything just works including a CA chain usually up to
| Let's Encrypt.
|
| Doing Let's Encrypt/ACME for random localnet web pages is
| getting easier all the time and anyone can use that
| wildcard domain loophole if they want to build their own
| secure bootstrap protocols for localnet. It would be
| great if the ACME protocol more directly supported it
| than through the wildcard domain name loopholes currently
| in use, and that may come with time/demand. I imagine
| there are a lot of hard security questions that would
| need to be answered before a generally available
| "localnet ACME" could be devised (obviously every router
| manufacturer is currently keeping their secure handshakes
| proprietary because they can't afford to leak those
| certificates to would-be MITM attackers), but I'm sure a
| lot of smart minds exist to build it given enough time
| and priority.
| dark-star wrote:
| for routers there's a simple and easy workaround. Let's
| say your router answers on "router.lan". All they would
| have to do is redirect this name via the router's DNS
| resolver to, say, 192-168-0-1.company.com, which would be
| an officially-registered domain that resolves back to ...
| wait for it... 192.168.0.1!
|
| If you control company.com you can run wildcard DNS for
| any amount of "private" IP addresses, complete with an
| official and valid trusted certificate. For an internal
| IP address. Problem solved.
|
| (and no, this is not theoretical, there were appliances
| some 10+ years ago that did exactly that...)
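|
| A sketch of the moving parts, with company.com standing in
| for the vendor's real domain:
|     # public DNS carries e.g. 192-168-0-1.company.com -> 192.168.0.1
|     dig +short 192-168-0-1.company.com
|     # and the appliance ships a wildcard cert for *.company.com,
|     # e.g. obtained via a DNS-01 challenge:
|     certbot certonly --manual --preferred-challenges dns \
|         -d '*.company.com'
| so the device can serve plain old HTTPS on its private IP with
| a publicly trusted certificate.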
| WorldMaker wrote:
| Yeah, that is mostly what I was describing. There's some
| rough security benefit to using more transient A/AAAA
| records under a GUID or other pseudo-random DNS name than
| a DNS zone that just encodes common link local addresses.
| There are definite security benefits to a company using
| mycompanyrouters.com instead of their home company.com
| (XSS scripting attacks and the like). So some things have
| changed over 10+ years, but yes the basic principles at
| this point are pretty old and certainly working in
| practice.
| SoftTalker wrote:
| Look for routers that run an ssh server or have a serial
| console I guess.
| dybber wrote:
| Just use a proper browser
| cjblomqvist wrote:
| Google actually makes a router?
| paulddraper wrote:
| > If HTTP/1.1 is deprecated from all browsers
|
| If HTTP/1.1, HTTP/2, and HTTP/3 are deprecated from all
| browsers the World Wide Web would shut down.
|
| WWW is in danger! /s
| smcleod wrote:
| I have HTTP/3 on my local network using traefik + let's
| encrypt and an internal zone. I didn't actually go out of
| my way to set up HTTP/3 it was pretty much just a few
| extra lines of configuration so why not?
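|
| If you want to confirm what actually gets negotiated, a curl
| build with HTTP/3 support can hit the internal name directly
| (home.example.com standing in for my real internal hostname):
|     curl --http3 -sI https://home.example.com/ | head -n 1
| which prints an HTTP/3 status line when it worked.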
| dark-star wrote:
| But I assume you have an officially-registered domain for
| that? That's the main issue people are having, that
| without an official domain (i.e. with only "foo.local" or
| whatever) it's hard to use HTTP/3
|
| AFAIK Let's Encrypt won't sign certificates for internal
| domains?
| smcleod wrote:
| Yep, you can get any cheap or seemingly random domain and
| just have the zone set to your internet device(s).
|
| There's nothing stopping you pushing your own internal
| CA, though, if you're big enough to warrant that.
| dark-star wrote:
| I know, I run two domains at home on a Raspberry Pi, I
| know that it's easy, but I also don't have an issue with
| paying 10EUR/year for a domain name. I guess this is the
| thing people are angry about, that by buying a domain
| you're funding the very corporate greed that will one day
| destroy the internet, or something...
| cryptonector wrote:
| Not local net, more like samizdat net.
|
| Still, you can always add private trust anchors and still
| have a samizdat net.
| treesknees wrote:
| Just to be pedantic, you only need a registrar to globally
| register the domain name and associate it with DNS records.
| You could choose to point your system at a locally-
| controlled DNS server, or edit the local /etc/hosts file,
| to use user-friendly names without depending on registering
| the domain with any authority.
| AuthorizedCust wrote:
| > _It's a terrible protocol for the human person web. You
| can't even host a visitable website for people who use HTTP/3
| without getting the permission of a third party corporation to
| temporarily lend you a TLS cert._
|
| Um, so many parts of web hosting require willing actions of
| corporations. Network, servers, etc. Why are you singling out
| just one part?
| ReactiveJelly wrote:
| Your beef is with Google and the Chrome team, then. QUIC itself
| is a good protocol, and at the library level you can just tell
| it to trust any cert it sees.
|
| Google doesn't need to enable QUIC to disable self-signed certs
| in Chrome.
| aftbit wrote:
| This was an HTTP/2 issue as well. IMO it was a big miss to not
| specify some kind of non-signed mode with opportunistic
| encryption. That is, if you set up an HTTP/2 (or /3) connection
| to a server that did not have a signed cert, it would do an
| ECDH exchange and provide guarantees that nobody was passively
| snooping on the traffic. Of course this would not protect
| against an active MITM but would be better than falling back to
| pure plain text.
| KMag wrote:
| Exactly. Unauthenticated encryption increases the costs of
| pervasive dragnet surveillance at a minimal cost/hassle.
| cptskippy wrote:
| > Once Chrome drops HTTP/1.1 support say goodbye to just
| hosting a website as a person free from any chains to corporate
| ass-covering.
|
| Is that even possible? It will effectively brick billions of
| consumer electronics that people have today (e.g. Routers, NAS
| Appliances, etc).
| josefx wrote:
| What do you think happened when browsers killed
| Java, Silverlight, and Flash?
|
| Some of the hardware I have received a last-minute firmware
| update to replace Java applets with a minimal JavaScript
| fallback.
| mardifoufs wrote:
| As opposed to decades of stonewalling and sluggish progress
| just because a few big corporations didn't want to have a
| harder to manage IT system?
|
| Literally most of the complaints against http3 are from
| corporate network engineers that now have a harder time
| controlling their networks. And more importantly, harder to
| implement mitm and monitoring. Which sure, that sucks for them
| but that's a massive upside for the vast majority of other
| internet users.
| mgaunard wrote:
| it makes more sense than HTTP/2, which is retarded, but still
| widely adopted.
|
| It's incredible how many servers can't do pipelining when
| queried with HTTP/1, but do it fine with HTTP/2.
| wkat4242 wrote:
| I'm extremely sceptical of anything proposed by Google especially
| because they are building some truly evil stuff lately (like
| FLoC/Topics and WEI). I really view them as the enemy of the free
| internet now.
|
| But QUIC doesn't really seem to have their dirtbag shadow hanging
| over it. Perhaps I should try to turn it on.
| mixmastamyk wrote:
| Agree, but Topics sounded like an improvement to me.
| kondro wrote:
| I miss when diagnosing server issues meant opening telnet and
| typing English.
|
| But I guess I also appreciate why that couldn't last.
| KingMob wrote:
| "When ah was yur age, we had to push our bits 10 miles to tha
| nearest router in the snow... and it was uphill...both ways!!!"
| mnw21cam wrote:
| Most servers should still be able to service HTTP/1.0 requests
| (although you're right they'll probably just reply with a
| redirect to the HTTPS site).
| akmittal wrote:
| Just saw Hacker News uses HTTP/1.1 and it's plenty fast. I wonder
| if they explored moving to HTTP/2/3.
| jefftk wrote:
| I wouldn't expect HN to gain much by moving to HTTP/2 or
| HTTP/3. Loading this page with a cold cache is just 60kB,
| divided as 54kB of HTML, 2kB of CSS, and 2kB of JS, plus three
| tiny images. And on follow-up pageviews you only need new HTML.
|
| If the whole web was like this then we wouldn't need new
| versions of HTTP, but of course it isn't.
| netol wrote:
| But would it gain something? I'm wondering if HTTP/3 could be
| enabled in one of my websites, which is even much smaller,
| and which has no blocking requests. I don't mind if the gains
| are small, as long as the impact is not negative for most
| visitors. I'm mostly concerned about some of my visitors
| visiting the website through an unreliable connection
| (mobile).
| jefftk wrote:
| It should gain a small amount; it is a more efficient
| protocol, with better compression and including 0-RTT for
| repeat visitors. But I doubt it would be noticeable on such
| a light site.
| adgjlsfhk1 wrote:
| The main thing it would get is 2x faster loading on high
| latency networks (because the TLS handshake happens with the
| same packets as the QUIC handshake).
| y04nn wrote:
| I have exactly the opposite view! HN loads fast but the TTFB
| is quite slow compared to HTTP/3 websites. On blogs that are
| using HTTP/3, sometimes I don't see any loading time at all;
| it's instantaneous. On HN, just checking the dev tools, the
| TCP+TLS handshake is slower than the DNS request plus loading
| the page data. I think HN would really benefit from
| using HTTP/3.
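|
| You can see it from the command line too, with a curl build
| that supports HTTP/3 (the URL is a placeholder):
|     curl -o /dev/null -s --http3 https://example.com/ \
|       -w 'tls: %{time_appconnect}  ttfb: %{time_starttransfer}\n'
| and compare against the same command without --http3.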
| nmstoker wrote:
| I get that major participants have switched but what's the
| developer experience like? It's been a while since I read up on
| this, but previously it seemed relatively hard to get set up for
| any local usage of HTTP/3 - has that changed recently?
___________________________________________________________________
(page generated 2023-10-05 23:00 UTC)