[HN Gopher] Load balancing and its different types
___________________________________________________________________
Load balancing and its different types
Author : saranshk
Score : 65 points
Date : 2021-01-26 16:47 UTC (6 hours ago)
(HTM) web link (www.wisdomgeek.com)
(TXT) w3m dump (www.wisdomgeek.com)
| kvhdude wrote:
| i would have liked to see details of layer 4 vs layer 7 load
| balancing. The latter involves terminating a TCP session and
| initiating a new one to the backend.
| toast0 wrote:
| Layer 4 load balancing can be a huge reduction in work for the
| load balancer; especially if it's in a Direct Server Return
| configuration, where the load balancer only sees incoming
| packets, and response packets go directly from the server.
|
| The downside is you lose any ability to balance based on
| details of the application protocol, it requires some specific
| network setup, and it's hard to find a DSR load balancer in
| managed hosting or cloud. I'm not sure if there's off the shelf
| software to manage DSR either (the basic pieces are there in
| most firewalls, but management isn't)
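A minimal sketch of the layer 4 idea described above (the function and backend addresses are illustrative, not from any particular balancer): an L4 balancer can choose a backend from packet headers alone, e.g. by hashing the connection 4-tuple, so every packet of a flow lands on the same server without ever parsing the application payload.

```python
import hashlib

def pick_backend(src_ip, src_port, dst_ip, dst_port, backends):
    """Layer-4 style selection: hash the connection 4-tuple so all
    packets of one TCP flow map to the same backend, without ever
    inspecting the application protocol."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return backends[digest % len(backends)]

backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# The same flow always hits the same backend:
a = pick_backend("203.0.113.7", 50000, "198.51.100.1", 443, backends)
b = pick_backend("203.0.113.7", 50000, "198.51.100.1", 443, backends)
assert a == b
```

Because the decision needs only the headers, a DSR setup can let the server answer the client directly, exactly as described above.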
| r0mdau wrote:
| Software load balancing solutions now offer more algorithms, such
| as least time (nginx+ for example). And yes, some ISPs cache DNS
| entries for a long time... DNS load balancing should really only
| be used to mitigate disaster scenarios.
| toast0 wrote:
| DNS load balancing works well enough if you have smart enough
| clients (not web browsers), and your pool of server IPs is
| fairly static. If you can select randomly from a list of names,
| and then try several of the A/AAAA records from that result,
| then you may have some delay if you pull a dead server from a
| cached record, but it won't be too bad. SRV records and really
| smart clients should work pretty well too, but not a lot of
| people have really smart clients.
|
| The vast majority of ISP caches won't keep your low TTL records
| in cache for years, but some do; that becomes a problem if you
| ever have to move your load balancers, too.
|
| Depends on how stable your servers are vs your load balancers,
| and how many connections you need; and if you have enough IP
| addresses to give public IPs to your servers. Also, if you
| really absolutely need to control the load precisely, DNS isn't
| going to ever give you that.
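The "smart enough client" behavior described above can be sketched as follows (the helper names are mine; a real client would resolve A/AAAA records and open sockets): shuffle the resolved addresses and try each until one accepts, so a dead server in a cached record costs only a retry.

```python
import random

def connect_any(addresses, try_connect):
    """DNS-style client balancing: shuffle the resolved A/AAAA
    records and try each address until one accepts. try_connect is
    injected so a dead server just means a retry, not a failure."""
    addrs = list(addresses)
    random.shuffle(addrs)
    for addr in addrs:
        conn = try_connect(addr)
        if conn is not None:
            return conn
    raise ConnectionError("no server in the record set responded")

# Simulated record set where one address points at a dead server:
alive = {"192.0.2.1", "192.0.2.3"}
fake_connect = lambda addr: addr if addr in alive else None
conn = connect_any(["192.0.2.1", "192.0.2.2", "192.0.2.3"], fake_connect)
assert conn in alive
```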
| saranshk wrote:
| Added least time to the post as well. Thank you!
| Yhippa wrote:
| I was explaining the DNS system and load balancing today and I
| kind of mixed it all up based on this wonderful link. Thanks, I
| will share this out with that person to undo the damage I might
| have done.
| saranshk wrote:
| We all learn every day. I am learning from the comments here as
| well. The best we can do is accept we were wrong and correct
| it.
| dragontamer wrote:
| Load balancing as a strategy is used in far more than just web-
| applications!
|
| This article only discusses web-based load balancing, which is
| absolutely important, but doesn't discuss supercomputer
| scheduling load-balancing. It's arguably a different subject...
| but the concept is the same.
|
| When you have 4000 nodes on a supercomputer, how do you
| distribute the problem such that all the nodes have something to
| do? Supercomputer problems are sometimes predictable (i.e. matrix
| multiplications), and you can sometimes "perfectly load balance
| without communication".
|
| But in the case of web-applications, there's probably no way to
| really predict the "cost of performance" before you start
| processing the service request (what if it's a Facebook request to
| a really old photograph? Facebook may have to pull it out of
| long-term storage before it can service that request. There's no
| real way to know at the load-balancer whether a picture-request
| would be in the cache or not... at least, not before you process
| the request to begin with!)
|
| -----------
|
| In any case, I think adding "Predict the computational cost,
| calculate the costs you distributed to different nodes, and then
| distribute the new load to the node with lowest computational
| cost given so far" is a good method that works in some
| applications. (All blocks in a dense matrix multiplication have
| the same cost, so just keep passing out subblocks to all nodes as
| you're working on the problem)
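The greedy strategy described above ("distribute the new load to the node with lowest cost so far") can be sketched with a heap; this is my illustration, not a supercomputer scheduler's actual code:

```python
import heapq

def distribute(costs, n_nodes):
    """Greedy cost balancing: give each incoming work item to the
    node with the lowest total assigned cost so far."""
    heap = [(0.0, node) for node in range(n_nodes)]  # (total_cost, node)
    heapq.heapify(heap)
    assignment = []
    for cost in costs:
        total, node = heapq.heappop(heap)
        assignment.append(node)
        heapq.heappush(heap, (total + cost, node))
    return assignment

# Equal-cost blocks (e.g. dense matrix subblocks) spread out evenly:
plan = distribute([1.0] * 8, 4)
assert sorted(plan.count(n) for n in range(4)) == [2, 2, 2, 2]
```

With unequal costs, a node that received one expensive block is skipped until the others catch up, which is the whole point of tracking accumulated cost.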
| snoshy wrote:
| Arguably, one of the most important characteristics of a load
| balancer is to have extremely low latency. If you're balancing
| loads, you want to be very quick about making a decision.
| Predicting the computational cost of a request is itself a
| computation, and that prediction overhead can quickly become
| non-negligible.
|
| Inherently, the idea that you're talking about boils down to
| having a way to characterize the nature of the request flows in
| such a manner that they can be evenly distributed. The ideal
| way to characterize them then, would be to know this
| information beforehand such that it does not require any
| computation at all to normalize the costs. As such, the best
| strategy would be to actually segregate traffic flows such that
| they're forwarded to "dumb" load balancers that use one of the
| strategies from TFA like weighted round robin.
|
| Of course, there are many such optimizations available, but TFA
| seems to be targeting a beginner level introduction to a rather
| complex topic. As you describe, load balancing and scheduling
| algorithms have a pretty high overlap in terms of their
| theoretical foundations, and these concepts manifest themselves
| throughout any large scale system.
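A "dumb" weighted round robin of the kind mentioned above can be sketched in a few lines (this naive version emits a server's turns in bursts; production balancers such as nginx use a smoothed variant that interleaves them):

```python
from itertools import cycle

def weighted_round_robin(weights):
    """Naive weighted round robin: yield each server in proportion
    to its weight, with no per-request cost estimation at all."""
    schedule = [server for server, w in weights.items() for _ in range(w)]
    return cycle(schedule)

rr = weighted_round_robin({"a": 2, "b": 1})
picks = [next(rr) for _ in range(6)]
assert picks.count("a") == 4 and picks.count("b") == 2
```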
| toast0 wrote:
| > But in the case of web-applications, there's probably no way
| to really predict the "cost of performance" before you start
| processing the service request (what if its a Facebook request
| to a really old photograph? Facebook may have to pull it out of
| long-term storage before it can service that request. There's
| no real way to know at the load-balancer whether a picture-
| request would be in the cache or not... at least, not before
| you process the request to begin with!)
|
| I worked at Facebook, but didn't touch anything related to
| this; so this is all conjecture.
|
| The load balancer can't (or shouldn't) predict the cost to
| service a request; but the thing that generated the url for the
| image could; and that prediction could be passed in the url for
| the balancer to act on.
|
| If you really need to balance by performance, it's probably
| simpler and accurate enough to provide frequent load feedback
| to the balancer. As long as you have a lot of requests, simple
| things work pretty well.
| saranshk wrote:
| Concepts are definitely transferable across domains, and then
| adapted according to the desired outputs. Cost of performance is
| interesting and somewhat similar to the least-time approach but
| adapted to the supercomputers domain.
| Terretta wrote:
| I like this intro!
|
| Good to see 'random' redefined as pick random two then assign to
| one with least connections, which works strictly better than
| either random or least connections alone.
|
| Misses a couple categories that may be relevant: least hops or
| best transit type network-mapped balancing to get to the ideal
| set of servers globally, as well as a technique that not-so-
| simply connects the user to the geography with the fastest
| response for them right then.
|
| While the article notes value of fewer connections for a stream,
| you can take a bit longer in setup for a stream to get it right,
| as you and the viewer will pay the price longer if you get it
| wrong.
|
| All this gets much more complicated when balancing very large
| objects, as you have to consider content availability and cache
| bin packing among the servers you balance to.
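The "pick random two, assign to the one with least connections" strategy praised above is the classic power-of-two-choices technique; a minimal sketch (server names and connection counts are illustrative):

```python
import random

def two_choices(connections):
    """Power-of-two-choices: sample two distinct servers at random
    and send the request to whichever has fewer open connections."""
    a, b = random.sample(list(connections), 2)
    return a if connections[a] <= connections[b] else b

conns = {"s1": 10, "s2": 3, "s3": 7}
pick = two_choices(conns)
assert pick in conns
```

Sampling only two servers avoids the herd effect of global least-connections (every balancer piling onto the one emptiest server) while still strongly biasing load toward less busy machines.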
| saranshk wrote:
| Thank you. I am not sure how to introduce transit and the
| concept of hops in an introductory post without explaining in
| depth about the networking side of it. Maybe you could help me
| out with that?
| Terretta wrote:
| I'd think it's fine to be hand-wavy:
|
| Load balancing isn't just about _server_ load or congestion,
| it's also about _network_ load and congestion. If a web page
| or video takes longer for a user to download, it ties up the
| server longer too.[1]
|
| Load balancing algorithms can also consider network paths or
| round trip times between the user and a server to give users
| a faster web download or video stream. To do this, they may
| use information from network routing topology, such as how
| many "hops" or routers between the user and the server, or
| may even triangulate actual network performance by assessing
| measurements from multiple data centers and load balancing to
| the most responsive.
|
| _1. See "snoshy" comment on latency in these
| comments:https://news.ycombinator.com/item?id=25920284 --
| roughly, you aim to avoid queuing or connection creep, as you
| mentioned in the intro, and speed of opening, transmitting
| data over, then closing the connection, can make a huge
| difference._
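The "triangulate actual network performance" idea above can be sketched as picking the region with the best recent round-trip times for a given user (region names and sample values are made up for illustration):

```python
from statistics import median

def pick_region(rtt_samples):
    """Latency-mapped balancing: given recent round-trip-time
    samples (ms) per region for this user, route them to the
    region that has been most responsive."""
    return min(rtt_samples, key=lambda region: median(rtt_samples[region]))

samples = {
    "us-east": [40, 42, 39],
    "eu-west": [95, 90, 101],
    "ap-south": [180, 175, 190],
}
assert pick_region(samples) == "us-east"
```

Using the median rather than the mean keeps a single slow probe from flipping the decision.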
| sparrc wrote:
| Decent summary but a little out-dated on DNS load balancing.
|
| Major cloud services like AWS support health/status checks
| through DNS these days:
| https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/re...
|
| It's also trivial to get around the client caching issue: just
| set a low TTL. Perhaps in the olden days providers had stricter
| limits on the minimum TTL you can set, but these days you can set
| it practically as low as you want.
|
| EDIT: as a few commenters have fairly pointed out, TTL can easily
| be ignored by poorly-behaved ISPs and clients, so I'll admit
| calling it "trivial" to get around is not exactly accurate.
| notabee wrote:
| Many applications do not refresh their DNS with every
| connection either. Take for example an Apache reverse proxy
| that's reusing long lived connections. So updating DNS may
| still require restarting/reloading many upstream services.
|
| https://stackoverflow.com/questions/52032150/apache-force-dn...
| saranshk wrote:
| I know the caching issue is a little trivial, but it was worth
| mentioning. Though I should have mentioned the low TTL
| piece. I will add that to the post. And also will add the
| health check part too. Reading up a bit about it. Thanks for
| the information!
| jorblumesea wrote:
| TTL is difficult in practice due to client implementations and
| other issues like that. Be careful using DNS for anything like
| this; DNS was not designed to propagate changes immediately.
| That's why IPs are mostly used instead.
| dilyevsky wrote:
| Except it's not trivial at all, because ISP resolvers will just
| disregard your low TTL
| samprotas wrote:
| Having executed several "no-downtime" cutovers between systems
| via DNS updates, I will warn you that a surprising number of
| clients never re-resolve DNS, so the TTL is effectively
| "forever" from their point of view.
|
| For the rare case of lift-and-shift-ing for a system upgrade I
| felt morally okay about eventually pulling the plug on them,
| but I'd hesitate to design a system that relied on well-behaved
| DNS clients if I had a reasonable alternative.
| tyldum wrote:
| Another gotcha would be UDP based services. Since it is
| packet oriented and not connection oriented, when should it
| re-resolve? Most will not until the application is restarted.
| gary_0 wrote:
| When I last updated a domain most clients saw the change
| within the TTL (1 hour)... except for my cable ISP at home.
| It took them the better part of a week.
| nwsm wrote:
| Nice succinct writeup. Here's a great deep dive on "Practical
| Load Balancing with Consistent Hashing" from a Vimeo engineer
| (2017) [0].
|
| [0] https://www.youtube.com/watch?v=jk6oiBJxcaA
| supernova87a wrote:
| Speaking of, seems like wisdomgeek.com needs some load balancing
| right now...
| saranshk wrote:
| Ha, I know. It's a single VM instance right now. I have been
| thinking of migrating to Gatsby for quite some time now. This
| unexpected traffic and the server limitations give me a reason
| to get working on it!
| crazypython wrote:
| Or you could serve a fully static site with vanilla JS.
| saranshk wrote:
| vanilla JS might be too much to maintain. I have already
| started working on the Gatsby version and will try and
| accelerate that.
___________________________________________________________________
(page generated 2021-01-26 23:01 UTC)