[HN Gopher] Comparing Nginx performance in bare metal and virtua...
       ___________________________________________________________________
        
       Comparing Nginx performance in bare metal and virtual environments
        
       Author : el_duderino
       Score  : 78 points
       Date   : 2021-10-29 18:16 UTC (4 hours ago)
        
 (HTM) web link (www.nginx.com)
 (TXT) w3m dump (www.nginx.com)
        
       | BiteCode_dev wrote:
        | Yes, performance is better on bare metal, and it's cheaper too.
        | In fact, even a cheap VPS will perform better than the cloud for
        | much less money, and scale vertically to incredible heights.
       | 
        | You do have to do a lot of things manually, and even doing that,
        | you won't get the high availability and elasticity you have the
        | potential to get from a cloud offering. Potential being the key
        | word, though.
       | 
       | But honestly, most projects don't need it.
       | 
        | Crashing used to terrify me. Then one day I started working for
        | a client whose service went down once a month.
       | 
        | Guess what? Nothing happened. No customers ever complained. No
        | money was lost; in fact, the cash flow kept growing.
       | 
       | Most services are not so important they can't get switched off
       | once in a while.
       | 
        | Not to mention, monoliths are more robust than they are given
        | credit for.
        
         | Nextgrid wrote:
         | > you won't get the high availability
         | 
         | Debatable.
         | 
         | I've seen a fair amount of outages caused by the extra
         | complexity brought on by making a system distributed for the
         | purposes of high availability.
         | 
          | Hardware is actually quite reliable nowadays, and I'll trust a
          | single hardware point of failure running a monolithic
          | application more than a distributed microservice-based system
          | with lots of moving parts.
         | 
          | Sure, _in theory_, the distributed system should win, but in
         | practice, it fails more often (due to operator error or
         | unforeseen bugs) than hardware.
        
           | KronisLV wrote:
           | > Sure, in theory, the distributed system should win, but in
           | practice, it fails more often (due to operator error or
           | unforeseen bugs) than hardware.
           | 
           | Isn't this because of the rampant introduction of accidental
           | complexity whenever you attempt to make a system horizontally
           | scalable - e.g. for whatever reason the developers or the
           | people in charge suddenly try to cram in as many
           | technological solutions as possible because apparently that's
           | what the large companies are doing?
           | 
           | There's no reason why you couldn't think about which data can
           | or cannot be shared, and develop your system as one that's
           | scalable, possibly modular, but with the codebase still being
           | largely monolithic in nature. I'd argue that there's a large
            | gray area between the opposite ends of that spectrum -
           | monoliths vs microservices.
           | 
           | I actually wrote down some thoughts in a blog post of mine,
           | called "Moduliths: because we need to scale, but we also
           | cannot afford microservices":
           | https://blog.kronis.dev/articles/modulith-because-we-need-
           | to...
        
             | earleybird wrote:
              | Not to say that some systems don't introduce accidental
              | complexity; however, getting distributed systems right is
              | necessarily complicated. Design for
             | failure and you will get a good picture of what tools you
             | need to employ.
        
             | Nextgrid wrote:
             | I think another issue is that there are plenty of robust,
             | battle-tested building blocks such as databases for
             | monolithic, vertically-scalable applications, but much
             | fewer for horizontally-scalable ones, meaning you'll need
             | to roll your own (at the application layer) and most likely
             | screw it up.
        
               | KronisLV wrote:
               | I do agree with you in that regard, however, that's also
               | a dangerous line of thinking.
               | 
               | There are attempts to provide horizontal scalability for
               | RDBMSes in a transparent way, like TiDB
               | https://pingcap.com/ (which is compatible with the MySQL
               | 5.7 drivers), however, the list of functionality that's
               | sacrificed to achieve easily extensible clusters is a
               | long one: https://docs.pingcap.com/tidb/stable/mysql-
               | compatibility
               | 
                | There are other technologies, like MongoDB, that are
                | sometimes more successful in a clustered configuration.
                | However, most of the traditional RDBMSes work best in a
                | leader-follower type of replication scenario, because
                | even the aforementioned systems oftentimes have data
                | consistency issues that may eventually pop up.
               | 
               | Essentially, my argument is that the lack of good
               | horizontally scalable databases or other data storage
               | solutions is easily explainable by the fact that the
               | problem itself isn't solvable in any easy way, apart from
                | adopting eventual consistency, which is probably going to
                | create more problems than it solves for any pre-existing
                | code that makes assumptions about how it will be able to
                | access data and operate on it: https://e
               | n.wikipedia.org/wiki/Fallacies_of_distributed_compu...
               | 
                | To that end, I'd perhaps like to suggest an alternative:
               | use a single vertically scalable RDBMS instance when
               | possible, with a hot standby if you have the resources
               | for that. Let the architecture around it be horizontally
               | scalable instead, and let it deal with the complexities
               | of balancing the load and dealing with backpressure -
               | introduce a message queue if you must, maybe even an in-
               | memory one for simplicity's sake, or consider an event
               | based architecture where "what needs to be done" is
               | encapsulated within a data structure that can be passed
               | around and applied whenever possible. In my eyes, such
               | solutions can in many cases be better than losing the
               | many benefits of having a single source of truth.
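                | 
                | To make that concrete, here's a minimal sketch of the
                | "in-memory queue in front of a single database" idea
                | (purely illustrative; sqlite3 and all the names are
                | stand-ins picked for the example). One writer thread
                | drains a bounded queue, so the lone RDBMS never sees
                | more concurrency than it can handle:
                | 
                |     import queue, sqlite3, threading
                | 
                |     # sqlite3 stands in for the single,
                |     # vertically scaled RDBMS.
                |     db = sqlite3.connect(
                |         "example.db", check_same_thread=False)
                |     db.execute(
                |         "CREATE TABLE IF NOT EXISTS events (p TEXT)")
                | 
                |     # A bounded queue gives crude backpressure:
                |     # producers block when it is full.
                |     work = queue.Queue(maxsize=1000)
                | 
                |     def writer():
                |         while True:
                |             payload = work.get()
                |             db.execute(
                |                 "INSERT INTO events VALUES (?)",
                |                 (payload,))
                |             db.commit()
                |             work.task_done()
                | 
                |     threading.Thread(
                |         target=writer, daemon=True).start()
                | 
                |     # Horizontally scaled producers only enqueue.
                |     for i in range(10):
                |         work.put("event-%d" % i)
                |     work.join()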
               | 
               | Alternatively, when you actually hit issues with the
               | above approach (and only then), consider sharding as a
                | possibility, or do some domain-driven design, figure out
                | where to draw some boundaries, and split your service
                | into multiple ones that cover the domain you need to
                | work with. Then you have
               | one DB for sales, one for account management, one for
               | reports and so on, all with services that are separated
               | by something as simple as REST interfaces and with rate
               | limits or any of the other mechanisms.
               | 
                | If, however, neither of those two groups of approaches
                | seems suitable for the loads that you're
               | dealing with, then you probably have a team of very smart
               | people and a large amount of resources to figure out what
               | will work best.
               | 
               | To sum up, if there are no good solutions in the space,
               | perhaps that's because the problems themselves haven't
               | been solved yet. Thus, sooner or later, they'll need to
               | be sidestepped and their impact mitigated in whatever
               | capacity is possible. Not all components can or should
               | scale horizontally.
        
           | BiteCode_dev wrote:
            | Yeah, I edited my answer because that's fair. Even AS/400
            | mainframes with crazy uptimes are not unheard of, after all.
        
         | FpUser wrote:
         | >"You do have to do a lot of things manually, and even doing
         | that, you won't get the high availability and elasticity you
         | get from a cloud offer."
         | 
          | I run my things on bare metal. In case of hardware failure it
          | takes less than an hour to restore servers from backup, and
          | there have been exactly zero hardware failures over their
          | entire lifetime anyway. I also have configurations with standby
          | servers, but I am questioning that now, as again there has not
          | been a single time (bar testing) when the standby was needed.
         | 
          | As for performance - I use native C++ backends and modern
          | multicore CPUs with lots of RAM. Those babies can process
          | thousands of requests per second sustainably without ever
          | breaking a sweat. That is more than enough for any reasonable
          | business, all while costing a fraction of the cloudy stuff.
        
           | Nextgrid wrote:
           | With ZFS or even RAID, there should ideally never be a need
           | to "restore from backup" because of a conventional hardware
           | failure; storage drive malfunctions nowadays can and (IMO)
           | should be resolved online.
           | 
           | This is of course not a reason to avoid backups, but nowadays
           | "restoring from backups" should be because of operator error
           | or large-scale disaster (fire, etc), not because of storage
           | drive failure.
           | 
           | Nowadays I'd be more worried about _compute_ hardware failure
           | - think CPU, RAM or the system board. Storage redundancy is
            | IMO a long-solved problem provided you don't cheap out.
        
             | BiteCode_dev wrote:
             | Can you educate me on the typical setup you would use to
             | achieve that goal?
        
               | Nextgrid wrote:
               | Either ZFS or some form of RAID with redundancy cranked
               | up to the max, so that it can tolerate a high number of
               | drives failing and still continue operating. ZFS
               | configured in RAIDZ3 mode for example can tolerate 3
               | drives failing and still be able to operate (and rebuild
               | itself once the failed drives are replaced).
               | 
               | You are very unlikely to have 3 drives fail at the same
               | time, so you're pretty much immune to storage hardware
               | failures. Again, this is not an argument against backups
               | (they're useful for other reasons), but with this amount
               | of redundancy on the storage level, I'd expect something
               | else in the system to die before the storage layer fails
               | catastrophically.
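                | 
                | For a rough sense of the trade-off, a back-of-the-
                | envelope sketch (my own illustration; it assumes
                | equal-sized drives and ignores ZFS's variable stripe
                | width and metadata overhead):
                | 
                |     def raidz_summary(drives, level, size_tb):
                |         # raidz1/2/3 spend 'level' drives' worth
                |         # of parity and survive 'level' failures.
                |         usable_tb = (drives - level) * size_tb
                |         return usable_tb, level
                | 
                |     # e.g. 8 x 4 TB drives in raidz3
                |     usable, tolerated = raidz_summary(8, 3, 4)
                |     print(usable, "TB usable, survives",
                |           tolerated, "drive failures")
                |     # -> 20 TB usable, survives 3 drive failures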
        
               | toast0 wrote:
               | > You are very unlikely to have 3 drives fail at the same
               | time, so you're pretty much immune to storage hardware
               | failures.
               | 
               | You have to be careful. If you take three storage devices
               | off the production line and use them as a mirrored array
               | and one fails, there's a good chance they all fail,
               | because you're not just dealing with storage hardware,
               | but storage software too.
               | 
                | HP had two batches of SSDs that failed when the power-on
                | time hit some rollover point. And I don't think they're
               | the only one. Fabrication issues are also likely to turn
               | up simultaneously if given equal write loads.
               | 
                | If failures were independent events, you'd be spot on,
                | but they may not be.
        
               | Nextgrid wrote:
               | Yes that is correct; I assumed it was somewhat common
               | knowledge if you're going for (and have the budget for)
               | such levels of redundancy.
        
               | BiteCode_dev wrote:
               | Would you happen to be freelancing a bit?
               | 
               | I'd like to hire somebody for a day to teach me those
               | things. Remote is ok.
        
               | Nextgrid wrote:
               | Feel free to send me an email (address in my profile) and
               | we can discuss. I don't think I'm the right person to
               | teach system administration, but we can always have an
               | informal conversation and I can hopefully point you to
               | better resources.
        
         | hinkley wrote:
         | I have a service at work that only needs to be up during a
         | couple of 4 hour intervals every week. It's one of the least
         | stressful elements of my job.
         | 
          | There's a cost to always-on systems, and I don't think we're
          | accounting for those costs properly (externalities). It's
          | likely that in many cases the benefits don't outweigh the
          | costs.
         | 
         | I think it comes down to your ability to route around a
         | problem. If the budgeting or ordering system is down, there are
         | any number of other tasks most of your people can be doing.
         | They can make a note, and stay productive for hours or days.
         | 
         | If you put all of your software on one server or one provider,
         | and that goes down, then your people are pretty much locked out
         | of everything. Partial availability needs to be prioritized
         | over total availability, because of the Precautionary
         | Principle. The cost of everything being down is orders of
         | magnitude higher than the cost of having some things down.
        
         | PragmaticPulp wrote:
         | > ...and it's cheaper too...
         | 
         | > You do have to do a lot of things manually...
         | 
         | Doing those manual things has a cost, though.
         | 
         | In most cases, it's only cheaper if you value engineering time
         | at $0. Fine for a personal project, but the math changes when
         | you're paying a fully-loaded cost of $100-200/hr for engineers
         | and you have a long backlog of more valuable things for them to
         | work on.
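          | 
          | As a toy illustration of that math (all numbers here are my
          | own assumptions, not from the article or this thread):
          | 
          |     # Break-even between a managed/cloud premium and
          |     # the engineer hours it saves each month.
          |     hourly_rate = 150      # $/hr, fully loaded
          |     hours_saved = 10       # hrs/month of manual ops avoided
          |     cloud_premium = 1400   # extra $/month vs bare metal
          | 
          |     saved = hourly_rate * hours_saved   # $1500
          |     print("cloud pays off" if saved > cloud_premium
          |           else "bare metal pays off")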
         | 
         | That's the real reason companies use cloud services: Not
         | because it's cheaper or faster, but because it lets the
         | engineers focus their efforts on the things that differentiate
         | the company.
         | 
         | From another perspective: I can change the oil in my car
         | cheaper and faster than driving to a shop and having someone
         | else do it. However, if my time is better spent doing an hour
         | of freelancing work then I'll gladly pay the shop to change my
         | oil while I work.
        
           | BiteCode_dev wrote:
            | Well, not really. If you are using a cloud solution, you
            | usually need an engineer who knows that particular solution.
            | Outside of the HN bubble, that's a rare breed, and it costs a
            | lot more than your traditional Linux admin, whom you probably
            | already have anyway.
           | 
           | Then you need to design, maintain and debug the distributed
           | cloud system, which is more complex.
           | 
           | So you'll have a dedicated person or team for that in both
           | cases.
           | 
            | On the other hand, setting up a Linux box for the common
            | tasks (web server, cache, db, etc.) never takes me more than
            | a day.
        
           | colechristensen wrote:
           | Eh, everything can be automated just as well on bare metal.
           | "The cloud" tends to add complexity not remove it, or at best
           | replace one kind of complexity with another. Bare metal
           | tooling is less familiar and could use some advancement, but
           | basically anything that a cloud compute provider can do can
           | be done on bare metal.
           | 
           | A lot of orgs just never bothered to learn how to automate
           | bare metal and ended up doing a lot of manual work.
        
         | atoav wrote:
          | Oh, like that one system I saw once with an uptime of 10 years
          | that was happily chugging away at data (not web-facing,
          | though).
          | 
          | Bare metal servers with a proper fallback and a working
          | monitoring/notification system can be incredibly reliable and,
          | for most purposes, definitely enough.
        
       | DSingularity wrote:
       | I wonder if any of the CPU mitigations are behind this. They have
       | a pretty significant cost in virtualized settings.
        
         | LinuxBender wrote:
         | Another thing to check is how nginx was compiled. Using generic
          | optimizations vs. x86_64 can do _interesting_ things on VMs
          | vs bare metal. nginx and haproxy specifically should be
          | compiled generic for VMs. I don't have any links, just my own
         | performance testing in the past.
        
           | theevilsharpie wrote:
           | A binary running in a VM is still executing native machine
           | code, so compiler optimizations should have the same effect
           | whether running on bare metal or a VM.
        
             | LinuxBender wrote:
             | Should being the key word. In truth the implementations of
             | each hypervisor vary. Try it on each hypervisor that you
                | use. I found KVM to have the closest parity with bare
                | metal performance.
        
               | theevilsharpie wrote:
               | I'm struggling to think of a situation when running
               | virtualized vs. bare metal where compiler optimizations
               | would matter.
               | 
               | Certain hypervisors have the ability to disable features
               | on the virtual CPU to enable live migration between
               | different generation physical CPUs, in which case a
               | binary that depends on a disabled virtual CPU feature
                | (e.g., AVX-512) will simply crash (or otherwise fail)
               | when it executes an unsupported instruction.
               | 
               | Other than that, I'm drawing a blank.
               | 
               | Hypervisor performance will vary, but I can't envision
               | any scenario where a binary optimized for the processor's
               | architecture would perform worse than one without any
               | optimizations when running on a VM vs bare metal.
        
               | LinuxBender wrote:
                | That's the point. It shouldn't matter, but in fact it
                | does. You can see this for yourself if you use the x86_64
                | optimizations on VMs in benchmark tests, and you will
                | see varying results depending on the hypervisor used and
                | what application is used. This will even change over time
                | as updates are made to each hypervisor. What I am
                | describing is exactly what is not supposed to happen,
                | which is why you are struggling to think of a situation
                | where it should matter. You are being entirely logical.
        
       | [deleted]
        
       | blowski wrote:
       | I recently moved a smallish business application from two bare-
       | metal servers onto Azure VMs. It's a standard PHP 7.4
       | application, MySQL, Redis, nginx. Despite the VMs costing more
        | and having twice the spec of the bare-metal servers, it has
        | consistently performed more slowly and less reliably throughout
        | the stack. The client's budget is too small to spend much time
        | looking into it. Instead, they upped the spec even further, and
        | what used to cost £600 per month now costs £2000.
        
         | hhw wrote:
          | (Disclaimer: I'm a bare metal provider.) I hope more people
          | become aware of what I've been saying for years: cloud is
          | great for scaling down, but not that great for scaling up. If
          | you want lots of VMs that don't warrant their own hardware and
          | that you can conveniently turn up and down, then cloud is
          | fantastic. If you have a big application that needs to scale,
          | you can get further vertically with bare metal. And if you
          | need to scale horizontally, you need to optimize higher up in
          | the stack anyway, so the much lower cost for equivalent
          | resources (without even taking any virtualization performance
          | hit into account), plus more flexibility and thus better-
          | fitted performance, should give bare metal the clear advantage.
        
           | laumars wrote:
            | > _what I've been saying for years: cloud is great for
            | scaling down, but not that great for scaling up._
            | 
            | Yes and no. The cloud isn't cheap for running any lift-and-
            | shift type project. Where the cloud comes into its own is
           | serverless and SaaS. If you have an application that's a
           | typical VM farm and you want to host it in the cloud, then
           | you should at least first pause and identify how, if at all,
           | your application can be re-architected to be more "cloudy".
           | Not doing this is usually the first mistake people make when
           | deploying to the cloud.
        
           | stingraycharles wrote:
           | FYI I clicked on "get a dedicated server" on your site and
           | ended up getting a 404
           | 
           | https://astuteinternet.com/services/dedicated-servers-order
           | 
           | {"error":"URI Not Found"}
        
             | grepfru_it wrote:
             | >Starting at $199/mo
             | 
             | Sounds pricey :)
        
               | duskwuff wrote:
               | Canadian dollars, so ~$160 USD. But that's still
               | extremely high for a quad-core CPU from 2013, 8 GB RAM,
               | no solid-state storage, and capped bandwidth.
        
               | RNCTX wrote:
               | Not really.
               | 
               | Before there was a "cloud" in the early 2000s, we were
               | paying anywhere from $100 (cheap, low spec machines) to
               | $199 (new hardware, more ram, faster disks) per month for
               | rented bare metal from places like Servermatrix,
               | Softlayer, etc.
               | 
                | The going rate also typically included anywhere from 1TB
                | to 2TB of egress.
               | 
               | Another way of looking at it for the low end: How many
               | things do you run on a single bare metal host versus an
               | equivalent amount of discrete "services"?
        
               | FpUser wrote:
                | Look at Hetzner / OVH. I got incredibly good deals from
                | them on dedicated servers. I think I am paying around
                | $150 Canadian for a 16-core AMD with 128GB ECC RAM; I
                | don't remember the storage.
                | 
                | Update: I also run some things right from my home office
                | since I have symmetric 1Gbps fiber. For that I paid
                | around $4000 CAD for an off-lease 20-core 3.4GHz, 512GB
                | RAM server from HP.
        
         | speedgoose wrote:
         | Did you throw money at the storage layer? The IOs are
         | ridiculously slow by default.
        
         | stingraycharles wrote:
         | I think "the cloud" never claimed to be cheaper, though. Its
         | promise is mainly that you'll offset a lot of risks with it
         | against a higher price. And of course the workflow with the
         | cloud is very different, with virtual machine images and
         | whatnot.
         | 
          | Whether that's worth the price is up for debate, though. I hope
         | we'll get more mature bare metal management software, perhaps
         | even standardized in some remote management hardware so you can
         | flash entire bare metal systems.
         | 
         | Right now I'm mostly running Debian stable and everything
         | inside Docker on the host machine, making it effectively a
         | Docker "hypervisor". It can get you quite far, without leaking
         | any application state to the host machine.
        
           | blowski wrote:
           | Oh I agree, it is easier to mitigate some risks in a cloud
           | solution. But this client - and they're not unusual in their
           | thinking - believes the cloud is somehow automatically
           | mitigating those risks, when in fact it's doing nothing
           | because they're not paying for it.
        
         | RedShift1 wrote:
         | Try Google Cloud instead. I've used both Azure and GCP and
         | Azure VMs always feel sluggish, a problem I don't have with
         | GCP.
        
           | blowski wrote:
            | In this specific case, they chose Azure. They had a
            | consultant in a fancy suit tell them Azure would be safer,
            | who proposed an architecture that ended up not working at
            | all. But they still went with Azure, and it's difficult to
           | point to any gains they've got for the 200% price increase.
        
       | ALittleLight wrote:
       | Am I reading correctly that there is a huge difference in http
       | versus SSL requests per second? e.g. in the Bare Metal 1 CPU case
       | it's 48k http to 800 SSL? I had no idea the performance impact of
       | SSL was that huge, if this is correct.
        
         | kevin_thibedeau wrote:
          | You need to break out full handshakes from reconnects with a
          | session ticket.
        
         | vorpalhex wrote:
          | They were measuring SSL handshakes.
        
         | jcims wrote:
         | >The Ixia client sent a series of HTTPS requests, each on a new
         | connection. The Ixia client and NGINX performed a TLS handshake
         | to establish a secure connection, then NGINX proxied the
         | request to the backend. The connection was closed after the
         | request was satisfied.
         | 
         | I honestly struggle to understand why they didn't incorporate
         | keepalives in the testing. Reusing an existing TLS connection,
         | something done far more often than not in the wild, will have a
         | dramatic positive effect on throughput.
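          | 
          | As a rough illustration of that effect (my own sketch; the URL
          | and request count are placeholders), compare one-shot requests
          | against a keep-alive session in Python's requests library,
          | which reuses the pooled TLS connection:
          | 
          |     import time, requests
          | 
          |     URL = "https://example.com/"   # stand-in endpoint
          |     N = 50
          | 
          |     t0 = time.time()
          |     for _ in range(N):
          |         requests.get(URL)   # new TLS handshake each time
          |     fresh = time.time() - t0
          | 
          |     t0 = time.time()
          |     with requests.Session() as s:
          |         for _ in range(N):
          |             s.get(URL)      # handshake once, then reuse
          |     reused = time.time() - t0
          | 
          |     print("fresh connections: %.2fs" % fresh)
          |     print("keep-alive reuse:  %.2fs" % reused)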
        
           | mlyle wrote:
           | Because, why bother? The symmetric encryption itself is
           | cheap. We have datapoints for
           | 
           | A) 0% SSL handshake workload per connection
           | 
           | and B) 100% SSL handshake workload per connection.
           | 
            | A reasonable step is just to solve the linear system (s =
            | handshake cost, r = per-request cost, in core-seconds)
            | 
            |     800 * (s + r) = 1
            |     48000 * r = 1
            | 
           | and go from there for preliminary sizing. And, when you need
           | detailed sizing, you should be using the real data from your
           | real application.
           | 
           | E.g., if it's reused 6 times, we'd expect around 4400
           | reqs/second from one core.
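            | 
            | Working that through (a quick sanity-check sketch of the
            | arithmetic above, with k = requests served per connection):
            | 
            |     # 48000 * r = 1 and 800 * (s + r) = 1
            |     r = 1 / 48000          # per-request cost
            |     s = 1 / 800 - r        # per-handshake cost
            | 
            |     k = 6                  # requests per TLS connection
            |     conns_per_sec = 1 / (s + k * r)
            |     reqs_per_sec = k * conns_per_sec
            |     print(round(reqs_per_sec))   # ~4430 per core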
        
       | marginalia_nu wrote:
        | I saw quite a significant performance and resource usage benefit
        | from migrating away from a virtualized Kubernetes environment to
        | bare-metal Debian, so their findings align well with my anecdata.
        
       | [deleted]
        
       | Thaxll wrote:
        | Who actually runs CentOS 7 (kernel 3.10) for benchmarks in 2021?
        | Run something recent like Ubuntu 20 + KVM and you will see a big
        | difference. I don't believe modern virtualization has ~20%
        | overhead (it should be less than 5%).
        
         | tln wrote:
          | Yeah, that is weird. CentOS 7 is from 2014 (with a 2013-era
          | 3.10 kernel); CentOS 8 is from 2019.
         | 
         | Neither supports io_uring (although I don't think nginx does
         | either)
        
         | marginalia_nu wrote:
          | Can you find sources for those "less than 5%" numbers from
          | people who aren't selling Kubernetes or cloud-related services?
          | 
          | It's generally pretty easy to construct benchmarks that make
          | whatever you're selling look favorable. It's why there are
          | constantly blog posts to the effect of
          | "$INTERPRETED_HIGH_LEVEL_LANGUAGE is faster than C++".
        
           | jiggawatts wrote:
           | In my experience 20% is about right for the minimum overhead
           | for general workloads.
           | 
           | 5% is achievable only for pure user-mode CPU loads with
           | minimal I/O.
        
             | xxpor wrote:
             | You can get much lower with modern SR-IOV devices.
        
           | grepfru_it wrote:
            | I mean, one could say Kubernetes has virtualized components
            | and requires VT-d extensions to operate at an accelerated
            | speed, but I don't think containers are truly virtualized. So
            | you can probably get away with a less-than-5% benchmark if
            | the stars align.
           | 
            | With a hypervisor you're looking at 10-15% overhead,
            | typically, maybe getting down to 7-12% using tricks
            | (paravirtualization, PCI passthrough, etc). In my
            | environment I am around 12% overhead on a good day.
        
       | secondcoming wrote:
        | I don't quite understand. The 'bare-metal' webservers returned
        | 128-byte responses, but the Kubernetes version returned 1KB
        | responses.
        
         | phillipseamore wrote:
          | Both configuration files only reference 1KB data files, so it
          | might be an error in the article.
        
       ___________________________________________________________________
       (page generated 2021-10-29 23:01 UTC)