[HN Gopher] 35M Hot Dogs: Benchmarking Caddy vs. Nginx
___________________________________________________________________
35M Hot Dogs: Benchmarking Caddy vs. Nginx
Author : EntICOnc
Score : 309 points
Date : 2022-09-16 12:58 UTC (10 hours ago)
(HTM) web link (blog.tjll.net)
(TXT) w3m dump (blog.tjll.net)
| petecooper wrote:
| I'm an Nginx guy, and I have been for some years, but I do love a
| little bit of Caddy jingoism[1] as the weekend approaches.
|
| This is a good write up. I was expecting Caddy to trounce Nginx,
| but that wasn't the case. I'll be back to re-read this with fresh
| eyes tomorrow.
|
| [1] For the avoidance of doubt, this is not meant as a snarky
| observation.
| mholt wrote:
| You were expecting Caddy to "trounce" nginx? Most people expect
| the opposite.
|
| But Caddy certainly does in some cases, especially with the
| upcoming 2.6 release.
| petecooper wrote:
| > You were expecting Caddy to "trounce" nginx? Most people
| expect the opposite.
|
| I absolutely was, yes. As an observer I see a lot of people
| saying positive things about Caddy around here, and how it's
| superior performance-wise to a variety of 'classic' httpd
| software. Lots of people love Caddy, and they're quite vocal,
| so it's not a stretch to assume there are reasons _why_ they
| love it. Nginx development has slowed since the events in
| Ukraine, unsurprisingly, so again it's not a leap to surmise
| Caddy is making good things happen in the meantime.
| mholt wrote:
| Ahh, right -- so there's a _lot_ more to performance than
| just req/sec and HTTP errors. And that's probably the
| love/hype you're hearing about. (Though Caddy's req/sec
| performance is quite good too, as you can see!)
|
| Caddy _scales_ better than NGINX, especially with regard to
| TLS/HTTPS. Our certificate automation code is the best in
| the industry, and works nicely in clusters to coordinate
| and share, automatically.
|
| Caddy performs better in terms of security overall. Go has
| stronger memory safety guarantees than C, so your server is
| basically impervious to a whole class of vulnerabilities.
|
| And if you consider failure modes, there are pros and cons
| to each, but it can definitely be argued that Caddy
| dropping fewer requests than nginx (if any at all!) is
| "superior performance".
|
| I'm actually quite pleased that Caddy can now, in general,
| perform competitively with nginx, and hopefully most people
| can stop worrying about that.
|
| And if you operate at Cloudflare-"nginx-is-now-too-slow-
| for-us"-scale, let's talk. (I have some ideas.)
| CoolCold wrote:
| Can you add details on _scales better_ - what do you mean?
| I've read a recent post from Cloudflare on their thread
| pool and it makes sense; do you mean things of that sort?
|
| I had a case where, after push notifications, mobile
| clients woke up and all of them did TLS handshakes to the
| load balancers (Nginx), hitting the CPU limit for a minute
| or so, but otherwise I had no problem with 5-15k rps and
| scaling.
| mholt wrote:
| Caddy does connection pooling (perhaps differently than
| what Cloudflare's proxy does; we'll have to see once they
| open source it), just as Go does. But what Caddy does
| especially well is scale with the number of
| certificates/sites.
|
| So we find lots of people using Caddy to serve tens to
| hundreds of thousands of sites with different domain
| names because Caddy can automate those certificates
| without falling over. (Huge deployments like this will
| require a little more config and planning, but nothing a
| wiki article [0] can't help with. You might also want
| sufficient hardware to keep a lot of certs in memory,
| etc.)
|
| Also note that rps is not a useful metric when TLS enters
| the picture, as it says nothing about the actual TLS
| impact (a TLS connection does not necessarily correspond
| to an HTTP request - and there are many modes for TLS
| connections that vary).
|
| [0]: https://caddy.community/t/serving-tens-of-thousands-
| of-domai...
| Kiro wrote:
| Why would any of those fail at a measly 10k clients? 10 billion
| clients maybe.
| EugeneOZ wrote:
| We don't have enough humans for that test.
| teknopaul wrote:
| The test is limited to 1024 clients, which (despite aversion
| to the term webscale) is not a lot, even on an intranet.
|
| I would say if you are not testing 10k concurrent connections
| you are not pushing the difference between nginx and Apache
| 1.3.
|
| As soon as you do push 10k concurrent connections, kernel TCP
| buffers and the amount of junk in your browser's HTTP headers
| start to matter more than server performance, just in the
| amount of data coming into the NIC.
| CoolCold wrote:
| It would be nice to have SO_REUSEPORT enabled for Nginx - if
| I read the configuration right, it was not used.
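|
| For reference, enabling it is a one-word change on the listen
| directive (illustrative config; names and cert paths are made
| up):
|
|     server {
|         # reuseport gives each worker its own listening socket,
|         # so the kernel spreads new connections (and their TLS
|         # handshakes) across workers instead of waking them all.
|         listen 443 ssl reuseport;
|         server_name example.com;
|         ssl_certificate     /etc/ssl/example.com.pem;
|         ssl_certificate_key /etc/ssl/example.com.key;
|     }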
| la_fayette wrote:
| These are interesting tests. Considering the energy cost of
| large software systems, it would also be interesting to know
| which of these two has the lower CO2 footprint.
| boberoni wrote:
| The killer feature of Caddy for me is that it handles TLS/HTTPS
| certificates automatically for me.
|
| I only ever use Caddy as a reverse proxy for web apps (think
| Flask, Ruby on Rails, Phoenix Framework). My projects have never
| needed high performance, but if my projects ever take off, it's
| nice to see that Caddy is already competitive with Nginx on
| resilience, latency, and throughput.
| skyde wrote:
| TLDR: "Nginx will fail by refusing or dropping connections, Caddy
| will fail by slowing everything down"
|
| To me it seems that Caddy suffers from bufferbloat. Under heavy
| congestion the goodput (useful throughput) will drop to 0 because
| clients will start timing out before the server gets a chance to
| respond.
|
| Caddy should use an algorithm similar to:
| https://github.com/Netflix/concurrency-limits
|
| Basically, track the best request latency seen and decrease the
| concurrency limit until latency stops improving.
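|
| A minimal sketch of that idea in Go (AIMD on a latency signal;
| this is an illustration only, not the Netflix library and not
| anything Caddy actually does):
|
|     package main
|
|     import (
|         "net/http"
|         "sync"
|         "time"
|     )
|
|     // adaptiveLimiter shrinks its concurrency limit when observed
|     // latency drifts far above the best latency seen (AIMD).
|     type adaptiveLimiter struct {
|         mu     sync.Mutex
|         limit  int           // max in-flight requests
|         inUse  int           // current in-flight requests
|         minRTT time.Duration // best latency observed so far
|     }
|
|     func (l *adaptiveLimiter) acquire() bool {
|         l.mu.Lock()
|         defer l.mu.Unlock()
|         if l.inUse >= l.limit {
|             return false // shed load instead of queueing forever
|         }
|         l.inUse++
|         return true
|     }
|
|     func (l *adaptiveLimiter) release(rtt time.Duration) {
|         l.mu.Lock()
|         defer l.mu.Unlock()
|         l.inUse--
|         if l.minRTT == 0 || rtt < l.minRTT {
|             l.minRTT = rtt
|         }
|         if rtt > 2*l.minRTT && l.limit > 1 {
|             l.limit /= 2 // latency degrading: multiplicative decrease
|         } else {
|             l.limit++ // latency healthy: additive increase
|         }
|     }
|
|     func main() {
|         lim := &adaptiveLimiter{limit: 64}
|         backend := http.HandlerFunc(
|             func(w http.ResponseWriter, r *http.Request) {
|                 w.Write([]byte("ok\n"))
|             })
|         http.ListenAndServe(":8080", http.HandlerFunc(
|             func(w http.ResponseWriter, r *http.Request) {
|                 if !lim.acquire() {
|                     // fail fast: a 503 beats a client-side timeout
|                     http.Error(w, "overloaded",
|                         http.StatusServiceUnavailable)
|                     return
|                 }
|                 start := time.Now()
|                 backend.ServeHTTP(w, r)
|                 lim.release(time.Since(start))
|             }))
|     }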
| mholt wrote:
| Thanks, we'll consider that, maybe as an option. Want to open
| an issue so we don't forget?
|
| I'd probably lean toward CUBIC:
| https://en.wikipedia.org/wiki/CUBIC_TCP
|
| (I implemented Reno in college, but times have changed)
|
| The nice thing about Caddy's failure mode is that the server
| won't give up; the server has no control over if or when a
| client will time out, so I never felt it made much sense to
| optimize for that.
| gordian_NOT wrote:
| I feel like we never see HAProxy in these reverse proxy
| comparisons. Lots of nginx, Apache, Caddy, Traefik, Envoy, etc.
|
| The HAProxy configuration is just as simple as Caddy for a
| reverse proxy setup. It's also written in C which is a comparison
| the author makes between nginx and Caddy. And it seems to be
| available on every *nix OS.
| snowwrestler wrote:
| I'm surprised Varnish is not mentioned much either. For a while
| there it had a reputation as the fastest reverse proxy. I think
| its popularity was harmed by complex config and refusal to
| handle TLS.
| pbowyer wrote:
| It's always been blisteringly fast when we've used it, and I
| like the power of the configuration (it has its quirks but so
| do most powerful systems). But the overhead of setting it up
| and maintaining it due to having to handle TLS termination
| separately puts me off using it when other software is 'good
| enough'. If Varnish Enterprise was cheaper I would have
| bought it, but at their enterprise prices no way.
|
| I'm keeping a watching brief on
| https://github.com/darkweak/souin and its Caddy integration
| to see if that can step up and replace Varnish for short-
| lived dynamic caching of web applications. Though I've lost
| track of its current status.
| darkweak wrote:
| Amazing that you're talking about Souin and its possible
| use as a replacement for Varnish. Let me know if you have
| questions about the configuration or implementation. ATM I'm
| working on the stabilization branch to get a more stable
| version and merge the improvements into Caddy's cache-handler
| module.
| tempest_ wrote:
| I am not sure I would agree with the assertion that config for
| HAProxy is just as easy.
|
| In fact I use HAProxy in production pretty regularly because
| it is solid, but its config is one of the main reasons I
| would choose something else.
|
| A basic HAProxy config is fine but it feels like after a little
| bit each line is just a collection of tokens in a random order
| that I have to sit and think about to parse.
| gunapologist99 wrote:
| For simple things, Caddy is nice and easy, but I've struggled
| with Caddy quite a bit, too, especially for more complex
| setups. I usually break out haproxy or nginx for really
| challenging setups, because caddy's documentation and
| examples are quite sparse (esp v2)
| mholt wrote:
| What do you struggle with about the documentation or "more
| complex setups"? I was just on the phone recently with
| Stripe who has a fairly complex, large-scale deployment,
| and they seemed to have figured it out with relative ease.
|
| I'm currently on a push to improve our docs, especially for
| beginners, so feel free to review the changes and leave
| your feedback:
| https://github.com/caddyserver/website/pull/263
| bmurphy1976 wrote:
| I feel the same way. I'm not a fan of haproxy's configuration
| system. It's really difficult for me to understand it,
| whereas I feel I can read most nginx/apache configs and
| immediately know what is supposed to be happening. I still
| maintain servers under load in production that use all three
| to this day and I always go back to nginx because of the
| configuration alone.
| kilburn wrote:
| I can't comment on haproxy because I haven't used it
| enough, but I think that the "nginx's config is easy to
| grasp" posture has a bit of Stockholm syndrome in it.
|
| - Do you want to add a header in this "location" block?
| Great, you'd better remember to re-apply all the security
| headers you've defined at a higher level (a server block,
| for instance), because of course adding a new header will
| reset those. (See the snippet at the end of this comment.)
|
| - Oh, you mixed prefix locations with exact locations and
| regex locations. Great, let's see if you can figure out
| which location block a request will end up being processed
| by. The docs "clearly" explain what the priority rules for
| those are, and they're easy to grasp [1].
|
| - I see you used a hostname in a proxy_pass directive
| (e.g.: http://internal.thing.com). Great, I will resolve it
| at startup and never check again, because this is the most
| sensible thing to do of course.
|
| - Oh... now you used a variable (e.g.:
| http://$internal_host). That fundamentally changes things
| (how?) so I'll respect the DNS's TTL now. Except you'll
| have to set up a DNS resolver in my config because I refuse
| to use the system's normal resolver because reasons.
|
| - Here's an `if` directive for the configuration. It sounds
| extremely useful, doesn't it? Well.. "if is evil" [2] and
| you should NOT use it. There be dragons, you've been
| warned.
|
| I could go on... but I think I've proved my point already.
| Note that these are not complaints, it's just me pointing
| out that nginx's configuration has its _very_ significant
| warts too.
|
| [1] https://nginx.org/en/docs/http/ngx_http_core_module.htm
| l#loc...
|
| [2] https://www.nginx.com/resources/wiki/start/topics/depth
| /ifis...
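|
| To make the first wart concrete, here's an illustrative
| config (the headers and paths are made up):
|
|     server {
|         add_header X-Frame-Options DENY;
|
|         location /api/ {
|             # This add_header does NOT append to the inherited
|             # set: it replaces it, so the server-level
|             # X-Frame-Options above silently disappears for
|             # /api/ responses unless it is repeated here.
|             add_header Cache-Control no-store;
|             add_header X-Frame-Options DENY;
|         }
|     }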
| bmurphy1976 wrote:
| To be clear I never said it was easy. I have a LOT of
| issues with Nginx's configuration, I just find it to be
| significantly less bad than the other options.
|
| The exception is Caddy: it has been great so far, but I have
| only used it for personal projects.
| hinkley wrote:
| All that may be true, but for a lot of us old timers we
| were coming from apache to nginx and apache's configs can
| eat a bag of dicks.
|
| Unfortunately it's likely I worked in the same building
| as one of the people responsible for either creating or
| at least maintaining that mess, but I didn't know at the
| time that he needed an intervention.
| TimWolla wrote:
| Exactly all of this. I've mentioned the first point about
| add_header redefining instead of appending in a previous
| HN comment of mine:
| https://news.ycombinator.com/item?id=27253579. As
| mentioned in that comment, HAProxy's configuration is
| much more obvious, because it's procedural. You can read
| it from top to bottom and know what's happening in which
| order.
|
| Disclosure: Community contributor to HAProxy, I help
| maintain HAProxy's issue tracker.
| slivanes wrote:
| Yes, I've experienced most of these with nginx and it can
| be a minefield. The best experience I've had configuring
| a webserver was lighttpd.
| fullstop wrote:
| I would also like to see benchmarks for reverse proxies with
| TLS termination.
| porker wrote:
| h2o [1] was excellent when I tried it for TLS termination,
| beating hitch in my unscientific tests. And it got http/2
| priorities right. It's a shame they don't make regular
| releases.
|
| 1. https://github.com/h2o/h2o/
| mholt wrote:
| I think one reason a lot of benchmarks don't include TLS
| termination is that it's often unrepresentative of the real
| world, where most clients reuse the connection and the TLS
| session for many requests, making the handshake negligible
| in the long run. And given hardware optimizations for
| cryptographic functions combined with network round trips,
| you end up benchmarking the network and the protocol more
| than their actual implementation, which is often upstream
| from the server itself anyway.
|
| Go's TLS stack is set to be more efficient and safer in
| coming versions thanks to continued work by Filippo and team.
| nerdponx wrote:
| Maybe it would be a useful benchmark to simulate a scenario
| like "my site got posted on HN and now I'm getting a huge
| number of unique page views."
| CoolCold wrote:
| Any idea on how much traffic could be from HN? I doubt
| more than 100 rps or any other noticeable load
| viraptor wrote:
| Around 100k/day with lots of requests concentrated around
| the start. Still mostly rpm rather than rps.
| mholt wrote:
| Sure, we've already done this very real test in
| production a number of times and Caddy doesn't even skip
| a beat. (IMO that's the best kind of benchmark right
| there. No need to simulate with pretend traffic!)
| capableweb wrote:
| Yeah, this tends to be (in my cases) where response times
| suffer the most, unless your bottleneck is I/O to/from the
| backend or further away
| gog wrote:
| HAProxy does not serve static files (AFAIK), so for some stacks
| you need to add nginx or caddy after haproxy as well to serve
| static files and forward to a fastcgi backend.
| tomohawk wrote:
| nginx started out as a web server and over time gained
| reverse proxy abilities.
|
| haproxy started out as a proxy and has gained some web server
| abilities, but is all about proxying.
|
| haproxy has fewer surprises as a reverse proxy than nginx
| does. Some of the defaults for nginx are appropriate for web
| serving, but not for proxying.
| RcouF1uZ4gsC wrote:
| > The HAProxy configuration is just as simple as Caddy for a
| reverse proxy setup.
|
| Does HAProxy have built in support for Let's Encrypt?
|
| That is one of my favorite features. Caddy just automatically
| manages the certificates for https.
| gordian_NOT wrote:
| It's not as turn-key as Caddy, that's for sure, but it's
| there: https://www.haproxy.com/blog/lets-encrypt-acme2-for-
| haproxy/
| ei8ths wrote:
| this is great, i'll implement this soon as my current cert
| is about to expire and have been wanting to get haproxy on
| lets encrypt.
| TimWolla wrote:
| It does not, because HAProxy does not perform any disk access
| at runtime and thus would be unable to persist the
| certificates anywhere. Disks accesses can be unpredictably
| slow and would block the entire thread which is not something
| you want when handling hundreds of thousands of requests per
| second.
|
| See this issue and especially the comment from Lukas Tribus:
| https://github.com/haproxy/haproxy/issues/1864
|
| Disclosure: Community contributor to HAProxy, I help maintain
| HAProxy's issue tracker.
| mholt wrote:
| That issue has some good explanation, thanks. I wonder if a
| disk-writing process could be spun out before dropping
| privileges?
|
| > Disks accesses can be unpredictably slow and would block
| the entire thread which is not something you want when
| handling hundreds of thousands of requests per second.
|
| This is not something I see mentioned in the issue, but I
| don't see why disk accesses need to block requests, or why
| they have to occur in the same thread as requests?
| TimWolla wrote:
| When reading along: Keep in mind that I'm not a core
| developer and thus am not directly involved in
| development, design decisions, or roadmap. I have some
| understanding of the internals and the associated
| challenges based on my contributions and discussions on
| the mailing list, but the following might not be entirely
| correct.
|
| > I wonder if a disk-writing process could be spun out
| before dropping privileges?
|
| I mean ... it sure can, and that appears to be the plan
| based on the last comment in that issue. However the "no
| disk access" policy is also useful for security. HAProxy
| can chroot itself to an empty directory to reduce the
| blast radius, and that is done in the default configuration
| on at least Debian.
|
| > but I don't see why disk accesses need to block
| requests
|
| My understanding is that historically Linux disk IO was
| inherently blocking. A non-blocking interface (io_uring)
| only became available fairly recently:
| https://stackoverflow.com/a/57451551/782822. And even
| then it's an operating-system-specific interface; for the
| BSDs you need a different solution.
|
| If your process is blocked for even one millisecond while
| handling two million requests per second
| (https://www.haproxy.com/de/blog/haproxy-forwards-
| over-2-mill...) then you drop 2k requests or increase
| latency.
|
| > or why they have to occur in the same thread as
| requests?
|
| "have" is a strong word, of course nothing "has" to be.
| One thing to keep in mind is that HAProxy is 20 years old
| and apart from possibly doing Let's Encrypt there was no
| real need for it to have disk access. HAProxy is a
| reverse proxy / load balancer, not a web server.
|
| Inter-thread communication comes with its own set of
| challenges and building something reliable for a narrow
| use case is not necessarily worth it, because you likely
| need to sacrifice something else.
|
| As an example, at scale you can't even let your operating
| system schedule out one of the worker threads to schedule
| in the "disk writer" thread, because that will
| effectively result in reduced processing capacity for
| some fraction of a second, which will result in dropped
| requests or increased latency. This becomes even worse if
| the worker holds an important lock.
| fullstop wrote:
| Built-in? Not exactly, but there is an acmev2 implemention
| from haproxytech: https://github.com/haproxytech/haproxy-lua-
| acme
| abdusco wrote:
| I use caddy mostly as a reverse proxy in front of an app.
| It's just one line in the Caddyfile:
|
|     sub.domain.com {
|         # transparent proxy + websocket support + letsencrypt TLS
|         reverse_proxy 127.0.0.1:2345
|     }
|
| It's a breath of fresh air to have a server with sensible
| defaults after dealing with apache and nginx (haproxy isn't
| much better in that regard).
| mholt wrote:
| If that's your whole Caddyfile, might as well not even use
| a config file:
|
|     caddy reverse-proxy --from sub.domain.com --to :2345
|
| Glad you like using Caddy!
| bmurphy1976 wrote:
| Personally I still recommend the config file. Even when
| they are simple, it gives you one single source of truth
| that you can refer to, it will grow as you need it, and
| it can be stored in source control.
|
| Where and how parameters are configured is a bit more of
| a wild card and dependent on the environment you are
| running in.
| francislavoie wrote:
| That's something Matt and I tend to disagree on - I agree
| that a config file is better almost always because it
| gives you a better starting point to experiment with
| other features.
| mholt wrote:
| Hey, I mean, I do agree that a config file is "better"
| most of the time -- but having the CLI is just so
| awesome! :D
| CoolCold wrote:
| I still cannot make myself try Caddy. Things like this look
| sweet, but that's maybe 5% of the functionality [I care
| about]. Not saying it's not possible, but with Nginx I
| already know how to do things like CORS allow-lists, OPTIONS
| handling, and per-location & cookie-name caching. Issuing
| certs is probably the simplest and last thing in a reverse
| proxy config setup.
| tylerjl wrote:
| FWIW I'm a big fan of HAProxy as well, but I was just
| constrained by the sheer volume of testing and how rigorous I
| intended to be. Maybe once my testing is a little more
| generalized I can fan out to additional proxies like HAProxy
| without too much hassle, as I'd love to know as well.
| tomohawk wrote:
| Would love to see this
| stefantalpalaru wrote:
| > I'll build hosts with Terraform (because That's What We Use
| These Days) in EC2
|
| > [...]
|
| > Create two EC2 instances - their default size is c5.xlarge
|
| When you're benchmarking, you want a stable platform between
| runs. Virtual private servers don't offer that, because the
| host's resources are shared between multiple guests, in
| unpredictable ways.
| zmxz wrote:
| Which platform would you suggest to use for this benchmark?
| bdcravens wrote:
| Ideally your own hardware with nothing else running on it.
| For convenience you could use a VM, assuming they were set
| up identically.
| 0x457 wrote:
| Well, AWS offers "metal" servers.
| stevewatson301 wrote:
| The c5 instances get dedicated cores, and thus should be exempt
| from resource contention due to shared cores.
| speedgoose wrote:
| Do you get dedicated IOs on these too? AWS tends to throttle
| heavily most instances after some time.
| stevewatson301 wrote:
| For dedicated disk IOPS you should take a look at the EBS
| provisioned IO volumes, or perhaps use the ephemeral stores
| that come with some of their more expensive instances.
| tylerjl wrote:
| This is hard because while, yes, some platform with less
| potential for jitter and/or noisy neighbors would help
| eliminate outside influence on the metrics, I think it's also
| valuable to benchmark these in a scenario that I would assume
| _most_ operators would run them in, which is a VPS situation
| and not bare-metal. FWIW, I did try really hard to eliminate
| some of the contamination in the results that would arise from
| running in a VPS by doing things like using the _same_ host
| reconfigured to avoid potential shifts in the underlying
| hypervisor, etc.
|
| But I would certainly agree that, for the most accurate
| results, a bare-metal setup would probably yield better
| numbers than what I have written.
| tylerjl wrote:
| Hey y'all, author here. Traffic/feedback/questions are coming in
| hot, as HN publicity tends to engender, but I'm happy to answer
| questions or discuss the findings generally here if you'd like
| (I'm looking through the comments, too, but I'm more likely to
| see it here).
| jacooper wrote:
| The black color for Caddy in the charts is very hard to read
| in dark mode. It would be great if you could change it to
| another color.
| Havoc wrote:
| Close enough to not matter in most use cases, i.e. pick
| whatever is convenient.
| 5d8767c68926 wrote:
| When would it matter? I write in Python, so performance was
| never a concern for me, but I am curious the scenarios in which
| this was likely to be the weakest link in real workloads.
|
| Given available options, I will take the network software
| written in a memory safe language every time.
| zivkovicp wrote:
| This is almost always the case, no matter the service we're
| talking about.
| no_time wrote:
| Is this how we ended up with electron for desktop
| applications and Java for backend?
| philipwhiuk wrote:
| Yes, because developers are expensive and so developer
| productivity dominates almost everything else.
| eddieroger wrote:
| Also, "pick what you know" applies here, too. If you know
| NGINX, then all you get from switching to Caddy is
| experience, and likewise, vice versa.
| mholt wrote:
| *and memory safety*
|
| This cannot be overstated. Caddy is not written in C! And
| it can even run your NGINX configs. :)
| https://github.com/caddyserver/nginx-adapter
| excitom wrote:
| A solution in search of a problem.
| 5d8767c68926 wrote:
| Nginx's security page [0] lists a non-zero number of
| exploitable issues rooted in manual memory management.
|
| [0] https://nginx.org/en/security_advisories.html
| shabbatt wrote:
| This is the answer I was looking for but sadly, this type of
| insignificance becomes ammunition for managers/founders who are
| obsessed with novelty
| anonymouse008 wrote:
| > Wew lad! Now we're cooking with gas.
|
| This is the new gold standard for benchmarks!
|
| OP / Author, stupendously tremendous job. The methodology is
| defensible and sensible. Thank you for doing this on behalf of
| the community.
| tylerjl wrote:
| That's very kind of you to say, thank you!
| mholt wrote:
| Yeah, Tyler did an amazing job.
| lelandfe wrote:
| Seconding!
|
| I am also in love with the friendliness and tone of the
| article. I'm a complete dummy when it comes to stuff like this
| and still understood most of it. Feynman would be proud.
| cies wrote:
| I like Caddy, but in prod I do not need SSL (it's off-loaded
| by the LB), so I'll stick with nginx after reading this.
|
| Guess I'm waiting for Cloudflare to FLOSS-release their proxy
| https://news.ycombinator.com/item?id=32864119 :)
| mholt wrote:
| This is a great writeup overall. I was happy to see Tyler's
| initial outreach before conducting his tests [0]. However, please
| note that these tests are also being revised shortly after some
| brief feedback [1]:
|
| - The sendfile tests at the end actually didn't use sendfile, so
| expect much greater performance there.
|
| - All the Caddy tests had metrics enabled, which are known[2] to
| be quite slow currently. Nginx does not emit metrics in its
| configuration, so in that sense the tests are a bit uneven. From
| my own tests, when I remove metrics code, Caddy is 10-20% faster.
| (We're working on addressing that [3].)
|
| - The tests in this article did not tune reverse proxy buffers,
| which are 4KB by default. I was able to see moderate performance
| improvements (depending on the size of payload) by reducing the
| buffer size to 1KB and 2KB.
|
| I want to thank Tyler for his considerate and careful approach,
| and for all the effort put into this!
|
| [0]: https://caddy.community/t/seeking-performance-suggestions-
| fo...
|
| [1]: https://twitter.com/mholt6/status/1570442275339239424
| (thread)
|
| [2]: https://github.com/caddyserver/caddy/issues/4644
|
| [3]: https://github.com/caddyserver/caddy/pull/5042
| tylerjl wrote:
| Thanks, Matt! I've pushed the revised section measuring
| sendfile and metrics changes, so those should be accurate now.
|
| Phew. Caches are purged, my errors are fixed. I can rest
| finally. If folks have questions about anything, I'm happy to
| answer.
| tomcam wrote:
| Just want to say your writing is the best quirky balance of
| fun and substance, reminiscent of Corey Quinn [1]. Thanks for
| doing so damn much work, and for the instantly relatable
| phrase "Nobody is allowed to make mistakes on the Internet".
|
| [1] https://www.lastweekinaws.com
| tylerjl wrote:
| Thank you, that's very kind! There's a reason I included
| Corey's name in my hastily cobbled-together skeleton meme
| [1]. Hopefully my writing achieves that level of technical
| approachability.
|
| [1]: https://blog.tjll.net/reverse-proxy-hot-dog-eating-
| contest-c...
| tomcam wrote:
| How did I miss that. Anyway, you succeeded.
| QuinnyPig wrote:
| That's very kind of you to say. Do I get to put this on my
| resume?
| tomcam wrote:
| THE MAN HIMSELF
|
| I can die happy
| fariszr wrote:
| > - All the tests had metrics enabled, which are known[1] to be
| quite slow. From my own tests, when I remove metrics code,
| Caddy is 10-20% faster.
|
| But disabling metrics is not supported in standard Caddy; you
| need to remove specific code and recompile to disable it.
|
| So maybe benchmarking with it isn't fair to Nginx?
| tialaramex wrote:
| Yeah, I think fair comparisons are:
|
| * How do these things perform by default. This is how they're
| going to perform for many users, because if it's adequate
| nobody will tune them, why bother.
|
| * How do these things perform with performance configuration
| as often recommended online. This is how they'll perform for
| people who think they need performance but don't tune or
| don't know how to tune. This _might be worse than default_,
| but that's actually useful information.
|
| * How do these things perform when their authors get to tune
| them for our test workload. This is how they'll perform for
| users who squeeze every drop and can afford to get somebody
| to do real work to facilitate, possibly even hiring the same
| authors to do it.
|
| In some cases I would also really want to see:
|
| * How do these things perform with _recommended security_. A
| benchmark mode with great scores but lousy security can
| promote a race to the bottom where everybody ships insecure
| garbage by default, then has a mode which is never measured
| and has lousy performance yet is mandatory if you don't
| think Hunter2 is a great password.
| mholt wrote:
| > How do these things perform by default.
|
| Agreed on this one -- today I'm looking at how to disable
| metrics by default and make them opt-in. At least until the
| performance regression can be addressed.
|
| Update: PR opened:
| https://github.com/caddyserver/caddy/pull/5042 - hoping to
| land that before 2.6.
| dQw4w9WgXcQ wrote:
| Good stuff dude. Listens to users, sees a problem,
| doesn't take it personally, makes a fix. Caddy's going
| places.
| mholt wrote:
| I'm also grateful that Dave, the original contributor of
| the metrics feature, isn't taking it personally. We love
| the functionality! Just gotta refine it...
| mholt wrote:
| > But disabling metrics is not supported in standard Caddy,
| you need to remove specific code and recompile to disable it.
|
| We're addressing that quite soon. Unfortunately the original
| contributor of the feature has been too busy lately to work
| on it, so we might just have to go the simple route and make
| it opt-in instead. Expect to see a way to toggle metrics
| soon!
|
| Update: PR opened:
| https://github.com/caddyserver/caddy/pull/5042
| Bilal_io wrote:
| That was fast. I love it!
| philipwhiuk wrote:
| There's always going to be some cost to metrics; going
| forward, you probably just want to document it and then
| update the figure as you tune it. Higher-performance opt-in
| metrics are the sort of thing a company using it at scale
| ought to be able to help with/sponsor work on.
| mholt wrote:
| Absolutely. The plan is to make it opt-in for now, and
| then having a company sponsor the performance tuning
| would be very welcome. Otherwise it'll probably sit
| until someone with the right skills/know-how and time
| comes along.
| hinkley wrote:
| > All the Caddy tests had metrics enabled
|
| One of the great mysteries in (my) life is why people think
| that measuring things is free. It always slows things down a
| bit and the more precisely you try to measure speed, the slower
| things go.
|
| I just finished reducing the telemetry overhead for our app by
| a bit more than half, by cleaning up data handling. Now it's
| ~5% of response time instead of 10%. I could probably halve
| that again if I could sort out some stupidity in the
| configuration logic, but that still leaves around 2-3% for
| intrinsic complexity instead of accidental.
| asb wrote:
| I wrote up a few notes on my Caddy setup here
| https://muxup.com/2022q3/muxup-implementation-notes#serving-...
| which may be a useful reference if you have a static site and
| wanted to tick off a few items likely on your list (brotli,
| http3, cache-control, more fine-grained control on redirects).
|
| I don't think performance is ever going to matter for my use
| case, but one thing I think is worth highlighting is the quality
| of the community and maintainership. In a thread I started asking
| for feedback on my Caddyfile
| (https://caddy.community/t/suggestions-for-simplifying-my-
| cad...), mholt determined I'd found a bug and rapidly fixed it. I
| followed up with a PR
| (https://github.com/caddyserver/website/pull/264) for the docs to
| clarify something related to this bug which was reviewed and
| merged within 30 minutes.
| mholt wrote:
| Thanks for the comments and participation!
|
| I'm still thinking about that `./`-pattern-matching problem.
| Will probably have to be addressed after 2.6...
| fariszr wrote:
| A very helpful post, thanks for sharing it!
| jiripospisil wrote:
| I'm impressed with Caddy's performance. I was expecting it to
| fall behind mainly due to the fact it's written in Go but
| apparently not. It's a bit disappointing that it's slower in
| reverse proxying, as that's one of the most important use cases,
| but now that it's identified maybe they can make some
| improvements. Finally, there really should be a max memory / max
| connections setting (maybe there is?).
| pjmlp wrote:
| I am not a big fan of Go's design; however, that is exactly
| one reason I tend to argue for it.
|
| There is enough juice in compiled managed languages that
| expose value types and low-level features; it is a matter of
| learning how to use the tools in the toolbox instead of
| always reaching for the hammer.
| zekica wrote:
| Goroutines are efficient enough, and Go compiles to native
| code. I'm sure that Rust/Tokio or handcrafted C can be faster,
| but I think Go is fast enough for 99% of use cases.
|
| I'm building a service manager a la systemd in Go as a side
| project, and I really like it - it's not as low level as Rust
| and has a huge runtime but it is impressively fast.
| shabbatt wrote:
| The only reason for me to consider Caddy was reverse proxy. Now
| that reason is gone and I'm happy with nginx
| teknopaul wrote:
| worker_connections 1024;
|
| Hello?
|
| http://xtomp.tp23.org/book/100kcc.html
|
| Try worker_connections 1000000;
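|
| Something like this (illustrative numbers; the right values
| depend on your workload and file-descriptor limits):
|
|     # main context: each connection consumes a file descriptor,
|     # so raise the per-worker limit to match
|     worker_rlimit_nofile 1048576;
|
|     events {
|         worker_connections 1000000;
|     }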
| mordornginx wrote:
| People who still like nginx are making money on it.
|
| nginx is awful to use and only makes it easy to accidentally
| shoot yourself in the foot.
| samcrawford wrote:
| Great write-up! One question I had was around the use of
| keepalives. There's no mention in the article of whether
| keepalives were used between the client and reverse proxy, and no
| mention of whether it was used between the reverse proxy and
| backend.
|
| I know Nginx doesn't use keepalives to backends by default (and
| I see it wasn't set up in the optimised Nginx proxy config), but
| it looks like Caddy does have keepalives enabled by default.
|
| Perhaps that could explain the delta in failure rates, at least
| for one case?
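|
| For context, enabling backend keepalives in nginx takes
| roughly this shape (addresses are illustrative):
|
|     upstream backend {
|         server 127.0.0.1:2345;
|         keepalive 32;                        # idle connection pool
|     }
|
|     server {
|         listen 80;
|         location / {
|             proxy_pass http://backend;
|             proxy_http_version 1.1;          # keepalive needs HTTP/1.1
|             proxy_set_header Connection "";  # drop "Connection: close"
|         }
|     }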
| mholt wrote:
| Are you talking about HTTP keepalive or TCP keepalive?
|
| Keepalives can actually reduce the performance of a server with
| many concurrent clients (i.e. a benchmark test), and have other
| weird effects on benchmarks: https://www.nginx.com/blog/http-
| keepalives-and-web-performan...
| teknopaul wrote:
| Same thing. HTTP has no keep-alive feature; you don't send
| HTTP keep-alive requests. If HTTP/1.1 asks for keepalives,
| it's a TCP thing.
| mholt wrote:
| They are distinct in Go. The standard library uses "HTTP
| keep-alive" to mark connections as idle based on most
| recent HTTP request, whereas TCP keep-alive checks only
| ACKs.
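|
| A small sketch of the two knobs in Go's standard library
| (values are illustrative, not Caddy's defaults):
|
|     package main
|
|     import (
|         "context"
|         "log"
|         "net"
|         "net/http"
|         "time"
|     )
|
|     func main() {
|         // HTTP keep-alive: the server tracks when the last request
|         // on a connection finished, and closes connections that sit
|         // idle (no request in flight) longer than IdleTimeout.
|         srv := &http.Server{
|             Addr:        ":8080",
|             IdleTimeout: 90 * time.Second,
|             Handler: http.HandlerFunc(
|                 func(w http.ResponseWriter, r *http.Request) {
|                     w.Write([]byte("ok\n"))
|                 }),
|         }
|
|         // TCP keep-alive: kernel-level probes that only detect dead
|         // peers via ACKs; they know nothing about HTTP activity.
|         lc := net.ListenConfig{KeepAlive: 3 * time.Minute}
|         ln, err := lc.Listen(context.Background(), "tcp", srv.Addr)
|         if err != nil {
|             log.Fatal(err)
|         }
|         log.Fatal(srv.Serve(ln))
|     }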
| davidjfelix wrote:
| FYI to author (who is in the comments): you may want to prevent
| the graphs from allowing scroll-to-zoom; I was scrolling the
| page and the graphs were zooming in and out.
| bagels wrote:
| I think I don't get the joke. What does the X-Hotdogs header do?
| Arnavion wrote:
| The header does nothing. As the article says, the author sent
| the header in every request, made a total of 35M requests, and
| thus gained a reason to use 35M hot dogs in the article title.
| tylerjl wrote:
| Correct. Maybe it blows my credibility out of the water and
| I'll be shamed for life, who knows
| fullstop wrote:
| The author's writing style reminds me of Andy Weir's Project Hail
| Mary or The Martian.
| pdhborges wrote:
| The author linked to wrk2 but I think he ended up using a k6
| executor that exhibits the problem wrk2 was designed to solve.
| tylerjl wrote:
| Damn. This is probably worth swapping out k6 for if I manage to
| pull off a second set of benchmarks. Thanks for the heads-up.
| hassy wrote:
| Yep, k6 suffers from coordinated omission [1] with its default
| settings.
|
| A tool that can send requests at a constant rate, e.g. wrk2 or
| Vegeta [2], is a much better fit for this type of performance
| test.
|
| 1. https://www.scylladb.com/2021/04/22/on-coordinated-omission/
|
| 2. https://github.com/tsenart/vegeta
| imiric wrote:
| With its default settings, yes, but k6 can be configured to
| use an executor that implements the open model[1].
|
| See more discussion here[2].
|
| [1]: https://k6.io/docs/using-k6/scenarios/arrival-rate/
|
| [2]: https://community.k6.io/t/is-k6-safe-from-the-
| coordinated-om...
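|
| For illustration, an open-model scenario can be passed to k6
| as JSON options (run via `k6 run --config options.json
| script.js`; the scenario name and numbers here are arbitrary).
| The constant-arrival-rate executor fires requests on schedule
| regardless of response latency, so slow responses can't
| suppress the offered load:
|
|     {
|       "scenarios": {
|         "open_model": {
|           "executor": "constant-arrival-rate",
|           "rate": 1000,
|           "timeUnit": "1s",
|           "duration": "1m",
|           "preAllocatedVUs": 200
|         }
|       }
|     }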
| nickjj wrote:
| I'm surprised no benchmarks were done with logging turned on.
|
| I get wanting to isolate things but this is the problem with
| micro benchmarks, it doesn't test "real world" usage patterns.
| Chances are your real production server will be logging to at
| least syslog so logging performance is worth looking into.
|
| If one of them can write logs with 500 microseconds added to
| each request but the other takes 5 milliseconds, that could be
| a huge difference in the end.
| tylerjl wrote:
| This is - along with some reverse proxy settings tweaks - one
| of the variables I'd be keen to test in the future, since it's
| probably _the_ most common delta between my tests and real-
| world applications.
| mholt wrote:
| Caddy's logger (uber/zap) is zero-allocation. We've found that
| the _writing_ of the logs is often much slower, e.g. printing
| to the terminal or writing to a file. And that's a system
| problem more than a Caddy one. But the actual emission of logs
| is quite fast, last time I checked!
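|
| A tiny illustration of what that hot path looks like with zap
| (values are made up; this is a sketch, not Caddy's actual
| logging setup):
|
|     package main
|
|     import (
|         "time"
|
|         "go.uber.org/zap"
|     )
|
|     func main() {
|         logger, err := zap.NewProduction() // JSON encoder to stderr
|         if err != nil {
|             panic(err)
|         }
|         defer logger.Sync() // flushing to the writer is the slow part
|
|         // Strongly typed fields let zap encode without reflection
|         // or per-call allocations on the hot path.
|         logger.Info("request served",
|             zap.String("method", "GET"),
|             zap.Int("status", 200),
|             zap.Duration("latency", 1500*time.Microsecond),
|         )
|     }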
| nickjj wrote:
| I think your statement is exactly why logging should have
| been turned on, at least for 1 of the benchmarks. If it's a
| system problem then it's a problem that both tools need to
| deal with.
|
| If one of them can do 100,000 requests per second but the
| other can do 80,000 requests per second but you're both
| capped at 30,000 requests per second because of system level
| limitations then you could make a strong case that both
| products perform equally in the end.
| metaltyphoon wrote:
| I wonder how this compares to YARP.
| kijin wrote:
| If you tell nginx to limit itself to 4 workers x 1024 connections
| per worker = 4096 connections, and hurl 10k connections at it, of
| course it's going to throw errors. It's doing exactly what you
| told it to do.
|
| That's just one example of how OP's "optimized" nginx config is
| barely even optimized. There are lots of other variables that
| you can tweak to get even better performance and blow Caddy out
| of the water, but those tweaks are going to depend on the
| specific workload you expect to handle. There isn't a single,
| perfectly optimized set of values that's going to work for
| everyone.
|
| The beauty of Caddy is that you get most of that performance
| without having to tweak anything.
| teknopaul wrote:
| Nginx scales to 1,000,000 connections per VM in my tests, but
| the bandwidth is silly.
|
| I got those results by seriously limiting the junk in the HTTP
| headers - not with real browsers.
|
| If you have that demand for any commercial service, you have
| money to distribute your load globally across more than one
| nginx instance.
___________________________________________________________________
(page generated 2022-09-16 23:00 UTC)