[HN Gopher] Curl HTTP/3 Performance
       ___________________________________________________________________
        
       Curl HTTP/3 Performance
        
       Author : BitPirate
       Score  : 143 points
       Date   : 2024-01-28 09:31 UTC (13 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | hlandau wrote:
       | Author of the OpenSSL QUIC stack here. Great writeup.
       | 
       | TBQH, I'm actually really pleased with these performance figures
       | - we haven't had time yet to do this kind of profiling or make
       | any optimisations. So what we're seeing here is the performance
       | prior to any kind of concerted measurement or optimisation effort
       | on our part. In that context I'm actually very pleasantly
       | surprised at how close things are to existing, more mature
       | implementations in some of these benchmarks. Of course there's
       | now plenty of tuning and optimisation work to be done to close
       | this gap.
        
         | foofie wrote:
          | How awesome is that? Thank you for all your hard work. It's
         | thanks to people such as yourself that the whole world keeps on
         | working.
         | 
         | Obligatory:
         | 
         | https://xkcd.com/2347/
        
         | apitman wrote:
         | I'm curious if you've architected it in such a way that it
         | lends itself to optimization in the future? I'd love to hear
         | more about how these sorts of things are planned, especially in
         | large C projects.
        
           | hlandau wrote:
           | As much as possible, yes.
           | 
           | With something like QUIC "optimisation" breaks down into two
           | areas: performance tuning in terms of algorithms, and tuning
           | for throughput or latency in terms of how the protocol is
           | used.
           | 
            | The first part is actually not the major issue; at least in
           | our design everything is pretty efficient and designed to
           | avoid unnecessary copying. Most of the optimisation I'm
           | talking about above is not about things like CPU usage but
           | things like tuning loss detection, congestion control and how
           | to schedule different types of data into different packets.
           | In other words, a question of tuning to make more optimal
           | decisions in terms of how to use the network, as opposed to
           | reducing the execution time of some algorithm. These aren't
           | QUIC specific issues but largely intrinsic to the process of
           | developing a transport protocol implementation.
           | 
           | It is true that QUIC is intrinsically less efficient than
           | say, TCP+TLS in terms of CPU load. There are various reasons
           | for this, but one is that QUIC performs encryption per
           | packet, whereas TLS performs encryption per TLS record, where
           | one record can be larger than one packet (which is limited by
           | the MTU). I believe there's some discussion ongoing on
           | possible ways to improve on this.
           | 
           | There are also features which can be added to enhance
           | performance, like UDP GSO, or extensions like the currently
           | in development ACK frequency proposal.
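            | 
            | For illustration, a minimal sketch of what UDP GSO looks like
            | at the socket level on Linux (>= 4.18). The port and segment
            | size are placeholders; a real QUIC stack sets the segment
            | size per send based on the path MTU. One send() hands the
            | kernel a large buffer that it splits into 1200-byte
            | datagrams, replacing ten separate sendto() calls:
            | 
            |     /* udp_gso_sketch.c - illustrative only */
            |     #include <netinet/in.h>
            |     #include <stdio.h>
            |     #include <string.h>
            |     #include <sys/socket.h>
            |     #include <unistd.h>
            | 
            |     #ifndef UDP_SEGMENT
            |     #define UDP_SEGMENT 103   /* from <linux/udp.h> */
            |     #endif
            | 
            |     int main(void) {
            |         int fd = socket(AF_INET, SOCK_DGRAM, 0);
            |         struct sockaddr_in dst = { .sin_family = AF_INET,
            |                                    .sin_port = htons(4433) };
            |         dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
            |         if (connect(fd, (struct sockaddr *)&dst, sizeof dst) < 0) {
            |             perror("connect");
            |             return 1;
            |         }
            |         int gso = 1200;          /* bytes per datagram */
            |         if (setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT,
            |                        &gso, sizeof gso) < 0) {
            |             perror("UDP_SEGMENT");  /* old kernel: fall back */
            |             return 1;
            |         }
            |         char buf[12000];          /* ten packets' worth */
            |         memset(buf, 'x', sizeof buf);
            |         if (send(fd, buf, sizeof buf, 0) < 0)
            |             perror("send");
            |         close(fd);
            |         return 0;
            |     }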
        
             | Matthias247 wrote:
             | Actually the benchmarks just measure the first part (cpu
              | efficiency) since it's a localhost benchmark. The gap will
              | most likely be due to missing GSO, if it's not implemented.
              | It's such a huge difference, and pretty much the only thing
             | which can prevent QUIC from being totally inefficient.
        
         | benreesman wrote:
         | Thank you kindly for your work. These protocols are critically
          | important, and the more high-quality, open implementations
          | exist, the more likely they are to be free and inclusive.
         | 
         | Also, hat tip for such competitive performance on an untuned
         | implementation.
        
         | spullara wrote:
         | Are there good reasons to use HTTP3/QUIC that aren't based on
         | performance?
        
           | Matthias247 wrote:
           | We need to distinguish between performance (throughput over a
           | congested/lossy connection) and efficiency (cpu and memory
           | usage). Quic can achieve higher performance, but will always
           | be less efficient. The linked benchmark actually just
           | measures efficiency since it's about sending data over
           | loopback on the same host
        
             | jzwinck wrote:
             | What makes QUIC less efficient in CPU and memory usage?
        
               | Matthias247 wrote:
                | Among others: having to transmit 1200-1500 byte packets
                | individually to the kernel, each of which it will route and
                | filter (iptables, nftables, eBPF) separately, instead of
                | acting on much bigger data chunks as it does for TCP. With GSO
               | it gets a bit better, but it's still far off from what
               | can be done for TCP.
               | 
               | Then there's the userspace work for assembling and
               | encrypting all these tiny packets individually, and
               | looking up the right datastructures (connections,
               | streams).
               | 
                | And there are challenges in load balancing multiple Quic
                | connections or streams across CPU cores. If
               | only one core dequeues UDP datagrams for all connections
               | on an endpoint then those will be bottlenecked by that
               | core - whereas for TCP the kernel and drivers can already
               | do more work with multiple receive queues and threads.
               | And while one can run multiple sockets and threads with
               | port reuse, it poses other challenges if a packet for a
               | certain connection gets routed to the wrong thread due to
                | connection migration. There are also solutions for that - eg
               | in the form of sophisticated eBPF programs. But they
               | require a lot of work and are hard to apply for regular
               | users that just want to use QUIC as a library.
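                | 
                | A sketch of the port-reuse setup mentioned above
                | (Linux >= 3.9): one UDP socket per worker, all bound
                | to the same illustrative port via SO_REUSEPORT, so
                | the kernel spreads incoming datagrams across them.
                | The kernel hashes on the 4-tuple, so a migrating
                | QUIC connection can still land on the "wrong"
                | socket - which is where the eBPF steering comes in.
                | 
                |     #include <netinet/in.h>
                |     #include <stdio.h>
                |     #include <sys/socket.h>
                | 
                |     #ifndef SO_REUSEPORT
                |     #define SO_REUSEPORT 15   /* Linux value */
                |     #endif
                | 
                |     #define WORKERS 4
                | 
                |     int main(void) {
                |         int fds[WORKERS];
                |         struct sockaddr_in a = {
                |             .sin_family = AF_INET,
                |             .sin_port = htons(4433),
                |             .sin_addr.s_addr = htonl(INADDR_ANY)
                |         };
                |         for (int i = 0; i < WORKERS; i++) {
                |             int one = 1;
                |             fds[i] = socket(AF_INET, SOCK_DGRAM, 0);
                |             if (setsockopt(fds[i], SOL_SOCKET,
                |                            SO_REUSEPORT, &one,
                |                            sizeof one) < 0) {
                |                 perror("SO_REUSEPORT");
                |                 return 1;
                |             }
                |             if (bind(fds[i], (struct sockaddr *)&a,
                |                      sizeof a) < 0) {
                |                 perror("bind");
                |                 return 1;
                |             }
                |             /* hand fds[i] to its own worker thread */
                |         }
                |         printf("%d sockets share udp/4433\n", WORKERS);
                |         return 0;
                |     }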
        
               | drewg123 wrote:
               | Quic throws away roughly 40 years of performance
               | optimizations that operating systems and network card
               | vendors have done for TCP. For example (based on the
               | server side)
               | 
               | - sendfile() cannot be done with QUIC, since the QUIC
               | stack runs in userspace. That means that data must be
               | read into kernel memory, copied to the webserver's
               | memory, then copied back into the kernel, then sent down
               | to the NIC. Worse, if crypto is not offloaded, userspace
               | also needs to encrypt the data.
               | 
               | - LSO/LRO are (mostly) not implemented in hardware for
               | QUIC, meaning that the NIC is sent 1500b packets, rather
               | than being sent a 64K packet that it segments down to
               | 1500b.
               | 
               | - The crypto is designed to prevent MiTM attacks, which
               | also makes doing NIC crypto offload a lot harder. I'm not
                | currently aware of any mainstream NIC (eg, not an FPGA by
                | a startup) that can do inline TLS offload for QUIC.
               | 
               | There is work ongoing by a lot of folks to make this
               | better. But at least for now, on the server side, Quic is
               | roughly an order of magnitude less efficient than TCP.
               | 
               | I did some experiments last year for a talk I gave which
                | approximated losing the optimizations above.
               | https://people.freebsd.org/~gallatin/talks/euro2022.pdf
               | For a video CDN type workload with static content, we'd
                | go from being able to serve ~400Gb/s per single-socket AMD
               | "rome" based EPYC (with plenty of CPU idle) to less than
               | 100Gb/s per server with the CPU maxed out.
               | 
                | For workloads where the content is not static and already
                | has to be touched in userspace, things won't be so
                | comparatively bad.
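                | 
                | To make the sendfile() point concrete, here's a
                | function-level sketch of the TCP path (Linux
                | signature; assumes an already-connected socket):
                | the file data goes from the page cache to the NIC
                | without ever entering userspace, which is exactly
                | the shortcut a userspace QUIC stack cannot take.
                | 
                |     #include <fcntl.h>
                |     #include <sys/sendfile.h>
                |     #include <sys/stat.h>
                |     #include <sys/types.h>
                |     #include <unistd.h>
                | 
                |     /* Send a whole file over a connected TCP
                |      * socket with zero userspace copies. */
                |     long serve_file_tcp(int sock, const char *path)
                |     {
                |         int fd = open(path, O_RDONLY);
                |         if (fd < 0)
                |             return -1;
                |         struct stat st;
                |         fstat(fd, &st);
                |         off_t off = 0;
                |         long sent = 0;
                |         while (off < st.st_size) {
                |             ssize_t n = sendfile(sock, fd, &off,
                |                                  st.st_size - off);
                |             if (n <= 0)
                |                 break;   /* error or peer closed */
                |             sent += n;
                |         }
                |         close(fd);
                |         return sent;
                |     }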
        
               | tialaramex wrote:
               | - The crypto is designed to prevent MiTM attacks, which
               | also makes doing NIC crypto offload a lot harder.
               | 
               | Huh? Surely what you're doing in the accelerated path is
                | just AES encryption/decryption with a parameterised key
               | which can't be much different from TLS?
        
           | zamadatix wrote:
           | I suppose that depends on your definitions of "good" and what
           | counts as being "based on performance". For instance QUIC and
           | HTTP/3 support better reliability via things like FEC and
           | connection migration. You can resume a session on a different
           | network (think going from Wi-Fi to cellular or similar)
           | instead of recreating the session and FEC can make the
           | delivery of messages more reliable. At the same time you
           | could argue both of these ultimately just impact performance
           | depending on how you choose to measure them.
           | 
           | Something more agreeably not performance based is the
           | security is better. E.g. more of the conversation is enclosed
           | in encryption at the protocol layer. Whether that's a good
           | reason depends on who you ask though.
        
           | o11c wrote:
           | TCP has at least one _unfixable_ security exploit: literally
           | anybody on the network can reset your connection.
            | Availability is 1/3 of security, remember.
        
       | mgaunard wrote:
       | HTTP/1 remains the one with the highest bandwidth.
       | 
       | No surprise here.
        
         | BitPirate wrote:
         | It's a bit like drag racing. If all you care about is the
         | performance of a single transfer that doesn't have to deal with
         | packet loss etc, HTTP/1 will win.
        
           | varjag wrote:
           | It runs over TCP, you don't need to deal with packet loss.
        
             | vlovich123 wrote:
             | What they're suggesting is that under packet loss
              | conditions QUIC will outperform TCP due to TCP's head-of-line
             | blocking (particularly when there are multiple assets to
             | fetch). Both TCP and QUIC abstract away packet loss but
             | they have different performance characteristics under those
             | conditions.
        
               | mgaunard wrote:
               | HTTP/1 doesn't have head of line blocking, only HTTP/2
               | does.
        
               | dilyevsky wrote:
                | Most browsers only support a very limited number of
                | connections, so it kinda does
        
               | mgaunard wrote:
               | Limitations of certain implementations are irrelevant.
               | 
               | The protocol does not have any such limitation.
        
               | dilyevsky wrote:
               | Totally, after all it's not like we live in a real world
               | where these things matter
        
               | acdha wrote:
                | The only reading on which the first part of your sentence
                | is correct makes the second part wrong. HTTP 1 pipelining
               | suffers from head of line blocking just as badly
               | (arguably worse) and the only workaround was opening
               | multiple connections which HTTP/2 also supports.
        
               | vlovich123 wrote:
                | HTTP/1 has parallelism limitations due to the number of
                | simultaneous connections (both in terms of browser and
                | server). HTTP/2 lets you retrieve multiple resources over
                | the same connection, improving parallelism, but has head-
                | of-line problems when it does so. HTTP/3, based on Quic,
                | solves both parallelism and head-of-line blocking.
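                | 
                | As a concrete illustration, libcurl's multi
                | interface can fetch several resources over a single
                | multiplexed HTTP/2 connection (the URLs below are
                | placeholders, and easy-handle cleanup is omitted
                | for brevity). Swapping CURL_HTTP_VERSION_2TLS for
                | CURL_HTTP_VERSION_3 requests QUIC instead, if the
                | libcurl build supports it:
                | 
                |     #include <curl/curl.h>
                | 
                |     int main(void) {
                |         const char *urls[] = {
                |             "https://example.com/a.css",
                |             "https://example.com/b.js",
                |             "https://example.com/c.png"
                |         };
                |         curl_global_init(CURL_GLOBAL_DEFAULT);
                |         CURLM *m = curl_multi_init();
                |         /* prefer one connection per host */
                |         curl_multi_setopt(m,
                |             CURLMOPT_MAX_HOST_CONNECTIONS, 1L);
                |         for (int i = 0; i < 3; i++) {
                |             CURL *h = curl_easy_init();
                |             curl_easy_setopt(h, CURLOPT_URL, urls[i]);
                |             curl_easy_setopt(h, CURLOPT_HTTP_VERSION,
                |                 (long)CURL_HTTP_VERSION_2TLS);
                |             /* wait to reuse/multiplex a connection */
                |             curl_easy_setopt(h, CURLOPT_PIPEWAIT, 1L);
                |             curl_multi_add_handle(m, h);
                |         }
                |         int running = 1;
                |         while (running) {
                |             curl_multi_perform(m, &running);
                |             curl_multi_poll(m, NULL, 0, 1000, NULL);
                |         }
                |         curl_multi_cleanup(m);
                |         curl_global_cleanup();
                |         return 0;
                |     }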
        
           | vbezhenar wrote:
            | Yesterday I was trying to track down a weird bug. I moved a
            | website to Kubernetes and its performance was absolutely
            | terrible. It was loading in 3 seconds on the old infra and
            | now it consistently spends 12 seconds loading.
           | 
           | Google Chrome shows that 6 requests require 2-3 seconds to
           | complete simultaneously. 3 of those requests are tiny static
           | files served by nginx, 3 of those requests are very simple DB
            | queries. Each request completes in a few milliseconds using
            | curl, but takes a few seconds in Google Chrome.
           | 
            | Long story short: I wasn't able to track down the true source
            | of this obviously wrong behaviour. But I switched ingress-nginx
           | to disable HTTP 2 and with HTTP 1.1 it worked as expected,
           | instantly serving all requests.
           | 
           | I don't know if it's Google Chrome bug or if it's nginx bug.
           | But I learned my lesson: HTTP 1.1 is good enough and higher
           | versions are not stable yet. HTTP 3 is not even supported in
           | ingress-nginx.
        
             | xyzzy_plugh wrote:
             | nginx is notoriously bad at not-HTTP 1.1. I wouldn't even
             | bother trying.
             | 
             | Envoy is significantly better in this department.
        
               | doublepg23 wrote:
               | Huh TIL. I had always considered nginx the "fast" http
               | server.
        
               | dilyevsky wrote:
               | Nginx is fast enough for most applications. I wouldn't
                | bother switching if you don't need the power of "software
               | defined" proxy
        
             | apitman wrote:
             | What were your reasons for moving the site to kubernetes?
        
             | mgaunard wrote:
             | Kubernetes makes everything slow and complicated.
             | 
             | Why do you even need to have proxies or load balancers in
             | between? Another invention of the web cloud people.
        
             | dilyevsky wrote:
             | > I moved a website to Kubernetes and its performance was
             | absolutely terrible. It was loading for 3 seconds on old
             | infra and now it spends consistently 12 seconds loading.
             | 
             | My guess is it has more to do with resources you probably
             | allocated to your app (especially cpu limit) than any
             | networking overhead which should be negligible in such a
             | trivial setup if done correctly
        
         | 1vuio0pswjnm7 wrote:
          | It comes from a time before websites sucked because they are
          | overloaded with ads and tracking.
         | 
         | For non-interactively retrieving a single page of HTML, or some
         | other resource, such as a video, or retrieving 100 pages, or
         | 100 videos, in a single TCP connection, without any ads or
         | tracking, HTTP/3 is overkill. It's more complex and it's slower
         | than HTTP/1.0 and HTTP/1.1.
        
           | sylware wrote:
            | I run a domestic web server whose code I implemented myself,
            | and the most important thing was for HTTP1.[01] to be very
            | simple to implement, to lower the cost of writing my real-
            | life HTTP1.[01] alternative (we all know Big Tech does not
            | like that...).
           | 
           | The best would be to have something like SMTP: the core is
            | extremely simple and yet works everywhere in real life, and via
            | announced options/extensions it can _optionally_ grow in
           | complexity.
        
         | drowsspa wrote:
         | Honestly, one would think that the switch to a binary protocol
         | and then to a different transport layer protocol would be
         | justified by massive gains in performance...
        
           | vlovich123 wrote:
           | The website being tested probably isn't complicated enough to
           | demonstrate that difference.
        
             | drowsspa wrote:
             | Even then, I remember the sales pitches all mentioning
             | performance improvements in the order of about 10-20%
        
               | vlovich123 wrote:
                | In real world conditions, not loopback synthetic
               | benchmarks.
        
               | lttlrck wrote:
               | I believe his point is 10-20% gain is not massive.
               | 
               | FWIW I don't know if that is what was claimed.
        
               | vlovich123 wrote:
               | 20% is what real world data from Google suggested:
               | https://www.zdnet.com/article/google-speeds-up-the-web-
               | with-...
               | 
               | I interpreted his comment as saying "where's the 20%
               | speed up" which seems like a more reasonable
               | interpretation in context. A 20% speed up is actually
                | quite substantial because that's an aggregate - it must mean
                | there are situations where it's more as well as situations
                | where it's unchanged or worse (although unlikely to be
               | worse under internet conditions).
        
         | jiripospisil wrote:
         | It's a mystery, it's almost as if people have spent decades
         | optimizing it.
        
           | mgaunard wrote:
           | Or rather, it was simply designed correctly.
        
             | k8svet wrote:
             | I know you think you're coming off smarter than everyone
                | else, but that's not how it's landing. Turns out things
                | are not reducible to that extent in the real world.
        
               | mgaunard wrote:
               | Not everyone else, just Google and other HTTP/2
               | apologists.
        
               | nolok wrote:
               | I'm not sure what you mean by "apologist" or what you're
               | trying to say, I'm not the person you're answering to,
               | and I have no dog in this fight.
               | 
               | But you're talking in a very affirmative manner and
               | apparently trying to say that one is "correct" and the
               | other is not, while absolutely ignoring context and
               | usage.
               | 
               | I recommend you either run yourself, or find on the web,
               | a benchmark about not a single HTTP request, but an
               | actual web page, requesting the html, the css, the js,
               | and the images. Don't even need to go modern web, even
               | any old pre 2010 design with no font or anything else
               | fancy will do.
               | 
               | You will see that HTTP 1 and 1.1 are way, way worse at it
                | than HTTP 2. Which is why HTTP 2 was created, and why it
               | should be used. Also the sad joke that was domain rolling
               | to trick simultaneous request per host configurations.
               | 
               | Overall, your point of view doesn't make sense because
               | this is not a winner takes all game. Plenty of server
               | should and do also run HTTP 1 for their usage, notably
               | file servers and the likes. The question to ask is "how
               | many request in parallel do the user need, and how
               | important that they all finish as close to each other as
               | possible instead of one after the other".
               | 
                | Similarly, HTTP3 is mostly about latency.
        
               | otterley wrote:
               | You can transfer as many requests in parallel with
               | HTTP/1.1 as you like by simply establishing more TCP
               | connections to the server. The problem is that browsers
               | traditionally limited the number of concurrent
               | connections per server to 3. There's also a speed penalty
               | incurred with new connections to a host since initial TCP
               | window sizes start out small, but it's unclear whether
               | that initial speed penalty significantly degrades the
               | user experience.
               | 
               | The fact that anyone running wrk or hey can coerce a web
               | server to produce hundreds of thousands of RPS and
               | saturate 100Gb links with plain old HTTP/1.1 with
               | connection reuse and parallel threads (assuming of course
               | that your load tester, server, and network are powerful
               | enough) ought to be enough to convince anyone that the
               | protocol is more than capable.
               | 
               | But whether it's the best one for the real world of
               | thousands of different consumer device agents, flaky
               | networks with huge throughput and latency and error/drop
               | rates, etc. is a different question indeed, and these
               | newer protocols may in fact provide better overall user
               | experiences. Protocols that work well under perfect
               | conditions may not be the right ones for imperfect
               | conditions.
        
               | throwaway892238 wrote:
                | That's a lot of _may_s. One might imagine that before
               | this stuff becomes the latest version of an internet
               | standard, these theoretical qualifications might be
               | proven out, to estimate its impact on the world at large.
               | But it was useful to one massive corporation, so I guess
               | that makes it good enough to supplant what came before
               | for the whole web.
        
               | mgaunard wrote:
               | HTTP/2 or /3 were never about optimizing bandwidth, but
               | latency.
        
               | otterley wrote:
               | Google did a great deal of research on the question using
               | real-world telemetry before trying it in Chrome and
               | proposing it as a standard to the IETF's working group.
               | And others including Microsoft and Facebook gave
               | feedback; it wasn't iterated on in a vacuum. The history
               | is open and well documented and there are metrics that
               | support it. See e.g. https://www.chromium.org/spdy/spdy-
               | whitepaper/
        
               | kiitos wrote:
               | TCP connections are bottlenecked not just by the
               | browser/client, but also at the load-balancer/server.
               | Modulo SO_REUSEPORT, a server can maintain at most 64k
               | active connections, which is far below any reasonable
               | expectation for capacity of concurrent requests. You
               | _have_ to decouple application-level requests from
               | physical-level connections to get any kind of reasonable
               | performance out of a protocol. This has been pretty well
               | understood for decades.
        
         | foofie wrote:
         | > HTTP/1 remains the one with the highest bandwidth.
         | 
         | To be fair, HTTP/2 and HTTP/3 weren't exactly aimed at
         | maximizing bandwidth. They were focused on mitigating the
         | performance constraints of having to spawn dozens of
         | connections to perform the dozens of requests required to open
         | a single webpage.
        
           | eptcyka wrote:
           | HTTP3 also wants to minimize latency in bad network
            | environments, not just mitigate the issue of too many
            | requests.
        
           | Beldin wrote:
           | Too bad that the alternative option - not requiring dozens of
           | requests just for initial rendering of a single page - didn't
           | catch on.
        
             | foofie wrote:
             | I don't think it's realistic to expect a page load to not
             | make a bunch of requests, considering that you will always
             | have to support use cases involving downloading many small
             | images. Either you handle that by expecting your servers to
             | open a dedicated connection for each download request, or
             | you take steps for that not to be an issue. Even if you
             | presume horizontal scaling could mitigate that problem from
             | the server side, you cannot sidestep the fact that you
             | could simply reuse a single connection to get all your
             | resources, or not require a connection at all.
        
             | GuestHNUser wrote:
             | Couldn't agree more. So many performance problems could be
             | mitigated if people wrote their client/server code to make
             | as few requests as possible.
             | 
             | Consider the case of requesting a webpage with hundreds of
             | small images, one should embed all of the images into the
             | single webpage! Requiring each image to be fetched in a
             | different http request is ridiculous. It pains me to look
             | at the network tab of modern websites.
        
         | kiitos wrote:
         | Assuming SSL, HTTP/1 does not deliver better throughput than
         | HTTP/2 in general.
         | 
         | I'm not sure why you believe otherwise. Got any references?
        
       | BitPirate wrote:
       | The performance difference between H1/H2 and H3 in this test
       | doesn't really surprise me. The obvious part is the highly
       | optimised TCP stack. But I fear that the benchmark setup itself
       | might be a bit flawed.
       | 
       | The biggest factor is the caddy version used for the benchmark.
       | The quic-go library in caddy v2.6.2 lacks GSO support, which is
       | crucial to avoid high syscall overhead.
       | 
       | The quic-go version in caddy v2.6.2 also doesn't adjust UDP
       | buffer sizes.
       | 
       | The other thing that's not clear from the blog post is the
       | network path used. Running the benchmark over loopback only would
       | give TCP-based protocols an advantage if the QUIC library doesn't
       | support MTU discovery.
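        | 
        | For reference, "adjusting UDP buffer sizes" boils down to
        | something like the following at the socket level. The 7 MB
        | figure is only illustrative, and the kernel silently caps it at
        | net.core.rmem_max / wmem_max unless those sysctls are raised
        | too, which is why the getsockopt() readback matters:
        | 
        |     #include <netinet/in.h>
        |     #include <stdio.h>
        |     #include <sys/socket.h>
        | 
        |     int main(void) {
        |         int fd = socket(AF_INET, SOCK_DGRAM, 0);
        |         int want = 7 * 1024 * 1024;
        |         if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
        |                        &want, sizeof want) < 0)
        |             perror("SO_RCVBUF");
        |         if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
        |                        &want, sizeof want) < 0)
        |             perror("SO_SNDBUF");
        |         int got = 0;
        |         socklen_t len = sizeof got;
        |         getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
        |         printf("effective receive buffer: %d bytes\n", got);
        |         return 0;
        |     }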
        
         | Etheryte wrote:
          | I don't think taking shots at the Caddy version not being the
          | latest is a fair criticism, to be honest. Version 2.6.2 was
         | released roughly three months ago, so it's not like we're
         | talking about anything severely outdated, most servers you run
         | into in the wild will be running something older than that.
        
           | zamadatix wrote:
            | I think you mixed up what year we're in now :). Caddy 2.6.2
            | was released October 13, 2022, so it's been not 3 but 15
            | months since release.
           | 
           | Even more relevantly, HTTP/3 was first supported out of the
            | box in 2.6.0 - released Sep 20, 2022. Even if 2.6.2 had been
            | just 3 months old, the fact that it's from the first 22 days
            | of out-of-the-box HTTP/3 support, rather than from one of the
            | versions released in the following 3 months, would definitely
            | be relevant criticism to note.
           | 
           | https://github.com/caddyserver/caddy/releases?page=2
        
             | francislavoie wrote:
             | This is why I'm not a fan of debian. (I assume OP got that
             | version from debian because I can't think of any other
             | reason they wouldn't have used latest.) They packaged
             | Caddy, but they never update at the rate we would
             | reasonably expect. So users who don't pay attention to the
             | version number have a significantly worse product than is
             | currently available.
             | 
             | We have our own apt repo which always has the latest
             | version: https://caddyserver.com/docs/install#debian-
             | ubuntu-raspbian
        
               | diggan wrote:
               | Stable/tested but not latest version. Or
                | unstable/untested but latest version. Choose one.
               | 
                | The distribution you choose also makes you make that
                | choice. If you're using Debian Stable, it's because you
                | prefer stable over latest. If you use Debian
                | Testing/Unstable, you favor latest versions over stable
               | ones.
               | 
               | Can't really blame Debian as they even have two different
               | versions, for the folks who want to make the explicit
               | decision.
        
               | francislavoie wrote:
               | I don't call an old version with known bugs to be
               | "stable/tested". No actual fixes from upstream are being
               | applied to the debian version. There are known CVEs that
               | are unpatched in that version, and it performs worse.
               | There's really no reason at all to use such an old
               | version. The only patches debian applied are the removal
               | of features they decided they don't like and don't want
               | packaged. That's it.
        
               | diggan wrote:
               | By that definition, almost no software in Debian could be
               | called "stable", as most software has at least one known
               | bug.
               | 
               | When people talk about "stableness" and distributions,
               | we're usually referring to the stableness of interfaces
               | offered by the distribution together with the package.
               | 
               | > There's really no reason at all to use such an old
               | version
               | 
               | Sometimes though, there is. And for those people, they
               | can use the distribution they wish. If you want to get
               | the latest versions via a package repository, use a
               | package repository that fits with what you want.
               | 
               | But you cannot dictate what others need or want. That's
               | why there is a choice in the first place.
        
               | zamadatix wrote:
               | Stableness of interfaces is supposed to imply the
               | software version is still maintained though. E.g. how
               | stable kernel versions get backports of fixes from newer
               | versions without introducing major changes from the newer
               | versions. It's not meant to mean you e.g. get an old
               | version of the kernel which accumulates known bugs and
               | security issues. If you want the latter you can get that
               | on any distro, just disable updates.
               | 
               | But you're right people are free to choose. Every version
               | is still available on the Caddy GitHub releases page for
               | example. What's being talked about here is the default
               | behavior not aligning with the promise of being a
               | maintained release, instead being full of security holes
               | and known major bugs. It's unrelated to whether Debian is
                | a stable or rolling distro; rather, it's about the lack of
               | patches they carry for their version.
        
               | diggan wrote:
               | > Stableness of interfaces is supposed to imply the
               | software version is still maintained though. E.g. how
               | stable kernel versions get backports of fixes from newer
               | versions without introducing major changes from the newer
               | versions. It's not meant to mean you e.g. get an old
               | version of the kernel which accumulates known bugs and
               | security issues. If you want the latter you can get that
               | on any distro, just disable updates.
               | 
                | I'm sure the volunteers working on this stuff are doing
               | the best they can, but stuff like this isn't usually
               | "sexy" enough to attract a ton of attention and care,
               | compared to other "fun" FOSS work.
        
             | Etheryte wrote:
             | Oh, right you are, somehow I completely mixed that up.
             | Thanks for clarifying.
        
       | nezirus wrote:
        | Maybe a shout out to the HAProxy people: like many, they've
        | observed performance problems with the OpenSSL 3.x series. But
        | having good old OpenSSL with QUIC would be so convenient for
        | distro packages etc.
       | 
       | https://github.com/haproxy/wiki/wiki/SSL-Libraries-Support-S...
        
       | samueloph wrote:
       | Nice write-up.
       | 
       | I'm one of the Debian maintainers of curl and we are close to
       | enabling http3 on the gnutls libcurl we ship.
       | 
       | We have also started discussing the plan for enabling http3 on
       | the curl CLI in time for the next stable release.
       | 
       | Right now the only option is to switch the CLI to use the gnutls
        | libcurl, but it looks like it might be possible to stay with
       | openssl, depending on when non-experimental support lands and how
       | good openssl's implementation is.
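        | 
        | For anyone following along, a small sketch of how to check at
        | runtime which TLS backend a given libcurl was built with and
        | whether it advertises HTTP/3 (roughly what `curl --version`
        | prints; CURL_VERSION_HTTP3 needs curl >= 7.66 headers):
        | 
        |     #include <curl/curl.h>
        |     #include <stdio.h>
        | 
        |     int main(void) {
        |         curl_version_info_data *v =
        |             curl_version_info(CURLVERSION_NOW);
        |         printf("libcurl %s, TLS backend: %s\n",
        |                v->version, v->ssl_version);
        |         printf("HTTP/3 support: %s\n",
        |                (v->features & CURL_VERSION_HTTP3)
        |                    ? "yes" : "no");
        |         return 0;
        |     }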
        
         | mistrial9 wrote:
          | Maybe the right time to clean up the unexpected and awkward set
          | of libs that are currently installed, too?
        
       | jgalt212 wrote:
       | Does Curl performance really matter? i.e. if it's too performant,
       | doesn't that increase the odds your spider is blocked? Of course,
       | if you're sharding horizontally across targets, then any
       | performance increase is appreciated.
        
         | j16sdiz wrote:
          | libcurl is the backend for many (RESTful) API libraries.
         | 
         | Improving upload throughput to S3 bucket would be great, right?
        
         | zamadatix wrote:
         | What if you're not using curl as a spider? Even if you are I'd
         | recommend some other spider design which doesn't rely on the
         | performance of curl to set the crawling rate.
        
       | vitus wrote:
       | It is promising to see that openssl-quic serial throughput is
       | within 10-20% of more mature implementations such as quiche.
       | (Which quiche, though? Is this Google's quiche, written in C++,
       | or Cloudflare's quiche, written in Rust? It turns out that's
       | approximately the only word that starts with "quic" that isn't a
       | derivative of "quick".)
       | 
       | One of QUIC's weaknesses is that it's known to be much less CPU
       | efficient, largely due to the lack of things like HW offload for
       | TLS.
       | 
       | > Also, the HTTP/1.1 numbers are a bit unfair since they do run
       | 50 TCP connections against the server.
       | 
       | To emphasize this point: no modern browser will open 50
       | concurrent connections to the same server for 50 GET requests.
       | You'll see connection pooling of, uh, 6 (at least for Chrome and
       | Firefox), so the problems of head-of-line blocking that HTTP/2
       | and HTTP/3 attempt to solve would have manifested in more
       | realistic benchmarks.
       | 
       | Some questions I have:
       | 
       | - What kind of CPU is in use? How much actual hw parallelism do
       | you have in practice?
       | 
       | - Are these requests actually going over the network (even a
       | LAN)? What's the MTU?
       | 
       | - How many trials went into each of these graphs? What are the
       | error bars on these?
        
         | jsty wrote:
         | Looks like Cloudflare quiche:
         | 
         | https://github.com/curl/curl/blob/0f4c19b66ad5c646ebc3c4268a...
        
         | ndriscoll wrote:
         | > To emphasize this point: no modern browser will open 50
         | concurrent connections to the same server for 50 GET requests.
         | 
         | They will. You just need to go bump that number in the
         | settings. :-)
        
         | pclmulqdq wrote:
         | Hardware offload should be protocol-independent, but I suppose
         | most network cards assume some stuff about TLS and aren't set
         | up for QUIC?
        
           | Matthias247 wrote:
           | NICs assume stuff for TCP (segmentation offload) that they
           | can't do for UDP, or can only do in a very limited fashion
           | (GSO).
           | 
           | TLS offloads are very niche. There's barely anyone using them
            | in production, and the benchmarks are very likely run
            | without them.
        
         | secondcoming wrote:
         | Browsers aren't the only things that connect to servers that
         | speak HTTP.
        
       | londons_explore wrote:
       | Anyone else disappointed that the figures for _localhost_ are in
        | MB/s not GB/s?
       | 
       | The whole lot just seems an order of magnitude slower than I was
       | hoping to see.
        
         | zamadatix wrote:
         | A core of the 4770 (curl is single threaded) can't even manage
         | a full order of magnitude more plain AES encryption throughput
          | - ignoring that it also has to be done into small packets and
         | decrypted on the same machine.
        
       | superkuh wrote:
       | Can cURL's HTTP/3 implementation work with self signed certs?
        | Pretty much every other HTTP/3 lib used by major browsers does
        | not. And since HTTP/3 does not allow for null cipher or TLS-less
        | connections, this means that in order to establish an HTTP/3
        | connection a third party CA must be involved.
       | 
        | As it is right now, it is impossible to host an HTTP/3 server
        | visitable by a random person you've never met without a corporate
        | CA continually re-approving your ability to do so. HTTP/3 is great for
       | corporate needs but it'll be the death of the human web.
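        | 
        | For what it's worth, this is how one would try it with libcurl:
        | force HTTP/3 and turn off certificate verification, the library
        | equivalent of `curl -k --http3`. Whether this actually succeeds
        | against a self-signed cert depends on the QUIC/TLS backend the
        | library was built with (the URL below is a placeholder):
        | 
        |     #include <curl/curl.h>
        |     #include <stdio.h>
        | 
        |     int main(void) {
        |         curl_global_init(CURL_GLOBAL_DEFAULT);
        |         CURL *h = curl_easy_init();
        |         curl_easy_setopt(h, CURLOPT_URL,
        |                          "https://self-signed.example/");
        |         curl_easy_setopt(h, CURLOPT_HTTP_VERSION,
        |                          (long)CURL_HTTP_VERSION_3);
        |         /* accept any certificate, skip hostname check */
        |         curl_easy_setopt(h, CURLOPT_SSL_VERIFYPEER, 0L);
        |         curl_easy_setopt(h, CURLOPT_SSL_VERIFYHOST, 0L);
        |         CURLcode rc = curl_easy_perform(h);
        |         if (rc != CURLE_OK)
        |             fprintf(stderr, "transfer failed: %s\n",
        |                     curl_easy_strerror(rc));
        |         curl_easy_cleanup(h);
        |         curl_global_cleanup();
        |         return 0;
        |     }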
        
         | adobrawy wrote:
         | Given that browsers discourage HTTP traffic (warning that the
         | connection is insecure), given how easily free SSL certificates
         | are available, and given that HTTPS is already the standard on
          | small hobbyist sites, I don't expect the requirement for an SSL
          | certificate to have been a blocker in HTTP/3 adoption.
        
           | ndriscoll wrote:
           | Do browsers warn for http (beyond the address bar icon)? I
           | don't think they ever have for my personal site. I also don't
           | think you can really say there's a "standard" for how
           | hobbyists do things. I'm definitely in the bucket of people
           | who use http because browsers throw up scary warnings if you
           | use a self-signed cert, and scary warnings aren't grandma
           | friendly when I want to send photos of the kids. The benefit
           | of TLS isn't worth setting up publicly signed certs to me,
           | and I don't want to invite the extra traffic by appearing on
           | a CT log.
           | 
           | Like the other poster said, it all makes sense for the
           | corporate web. Not so much for the human web. For humans,
           | self-signed certs with automatic TOFU makes sense, but
           | browsers are controlled by and made for the corporate web.
        
       | jrpelkonen wrote:
       | I really don't want to criticize anyone or their hard work, and
       | appreciate both curl and OpenSSL as a long time user. That said,
       | I personally find it disappointing that in 2024 major new modules
       | are being written in C. Especially so given that a) existing Quic
       | modules written in Rust exist, and b) there's a precedent for
       | including Rust code in Curl.
       | 
       | Of course there are legacy reasons for maintaining existing
       | codebases, but what is it going to take to shift away from using
       | C for greenfield projects?
        
         | zinekeller wrote:
          | For something like curl (which is also used in embedded
          | systems): a legally-verified Rust compiler (compliant with ISO
          | and other standards, for better or worse) that targets common
          | microarchitectures is a definite first step.
         | Fortunately, the first half of it exists (Ferrocene,
         | https://ferrous-systems.com/ferrocene/). The second one is
         | harder: there are architectures even GCC does not target (these
         | architectures rely on other compilers like the Small Device C
         | Compiler (or a verified variant) or even a proprietary
          | compiler), and LLVM only targets a subset of what GCC does.
          | Even if there's a GCC Rust (fortunately currently being
          | developed), you are still leaving out a lot of architectures.
        
           | jrpelkonen wrote:
           | This is a good point: there are many niche architectures
           | where Rust is not a viable option. But in this specific case,
            | I don't see these systems benefiting from h3/Quic. HOL
           | blocking etc. will rarely, if ever, be a limiting factor for
           | the use cases involved.
        
         | apitman wrote:
         | Not saying you're wrong, but it's worth noting that switching
         | to Rust is not free. Binary sizes, language complexity, and
         | compile times are all significantly larger.
        
         | teunispeters wrote:
         | If rust could support all of C's processors and platforms and
         | produce equivalent sized binaries - especially for embedded ...
         | then it'd be interesting to switch to. (as a start, it also
         | needs a stable and secure ecosystem of tools and libraries)
         | 
         | Right now, it's mostly a special purpose language for a narrow
         | range of platforms.
        
         | secondcoming wrote:
         | I'm personally disappointed you're aware of this issue and have
         | done nothing about it.
        
       | throwaway892238 wrote:
       | Lol, wait, HTTP2 and HTTP1.1 both trounce HTTP3? Talk about
       | burying the lede. Wasn't performance the whole point behind
       | HTTP3?
       | 
       | This chart shows that HTTP2 is more than half as slow as HTTP1.1,
       | and HTTP3 is half as slow as HTTP2. Jesus christ. If these get
       | adopted across the whole web, the whole web's performance could
          | get up to 75% slower. That's insane. There should be giant red
       | flags on these protocols that say "warning: slows down the
       | internet"
        
         | CharlesW wrote:
         | > _Wasn 't performance the whole point behind HTTP3?_
         | 
         | Faster, more secure, and more reliable, yes. The numbers in
          | this article look terrible, but real-world testing [1] shows that
         | real-world HTTP/3 performance is quite good, even though
         | implementations are relatively young.
         | 
         |  _" ...we saw substantially higher throughput on HTTP/3
         | compared to HTTP/2. For example, we saw about 69% of HTTP/3
         | connections reach a throughput of 5 Mbps or more [...] compared
         | to only 56% of HTTP/2 connections. In practice, this means that
         | the video streams will be of a higher visual quality, and/or
         | have fewer stalls over HTTP/3."_
         | 
          | [1] https://pulse.internetsociety.org/blog/measuring-http-3-real...
        
         | zamadatix wrote:
         | If the last decade of web protocol development seems backwards
         | to you after reading one benchmark then why immediately assume
         | it's insane and deserves a warning label instead of asking why
         | your understanding doesn't match your expectations?
         | 
         | The benchmark meant to compare how resource efficient the new
         | backend for curl is by using localhost connectivity. By using
         | localhost connectivity any real world network considerations
         | (such as throughput discovery, loss, latency, jitter, or
         | buffering) are sidestepped to allow a direct measurement of how
         | fast the backend alone is. You can't then assume those numbers
         | have a meaningful direct extrapolation to the actual
         | performance of the web because you don't know how the
         | additional things the newer protocols do impact performance
          | once you add a real network. Ignoring that, you still have to
         | consider the notes like "Also, the HTTP/1.1 numbers are a bit
         | unfair since they do run 50 TCP connections against the
         | server." before making claims about HTTP2 being more than half
         | as slow as HTTP1.1.
        
       | apitman wrote:
       | Very nice. I would love to see some numbers including simulated
       | packet loss. That's theoretically an area h3 would have an
       | advantage.
        
       | jupp0r wrote:
       | Great writeup, but the diagrams are downright awful. I'd separate
        | the different facets visually to make it easier to see the
        | differences, rather than relying on those different colors.
        
       ___________________________________________________________________
       (page generated 2024-01-28 23:01 UTC)