[HN Gopher] Understanding Round Robin DNS
___________________________________________________________________
Understanding Round Robin DNS
Author : hyperknot
Score : 190 points
Date : 2024-10-26 16:46 UTC (6 hours ago)
(HTM) web link (blog.hyperknot.com)
(TXT) w3m dump (blog.hyperknot.com)
| latchkey wrote:
| > "It's an amazingly simple and elegant solution that avoids
| using Load Balancers."
|
| When a server is down, you have a globally distributed / cached
| IP address that you can't prevent people from hitting.
|
| https://www.cloudflare.com/learning/dns/glossary/round-robin...
| arrty88 wrote:
| The standard today is to use a relatively low TTL and to health
| check the members of the pool from the dns server.
| latchkey wrote:
| That's like saying there are traffic rules in Saigon.
|
| Exact implementation of TTL is a suggestion.
| wongarsu wrote:
| All clients tested in the article behaved correctly and chose
| one of the reachable servers instead.
|
| Of course somebody will inevitably misconfigure their local DNS
| or use a bad client. Either you accept an outage for people
| with broken setups or you reassign the IP to a different server
| in the same DC.
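|
| If the servers share a layer-2 segment, that reassignment can
| be as simple as attaching the failed node's address to a
| standby host. A rough sketch with made-up addresses (interface
| name and the ARP announcement depend on your setup):
|
|     # on the standby server, take over the failed node's IP
|     ip addr add 203.0.113.10/32 dev eth0
|     # announce the move so neighbors update their ARP caches
|     # (iputils arping; -U sends unsolicited/gratuitous ARP)
|     arping -c 3 -U -I eth0 203.0.113.10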
| latchkey wrote:
| If you know all of your clients, then you don't even need
| DNS. But, you don't know all of your clients. Nor do you
| always know your upstream DNS provider.
|
| Design for failure. Don't fabricate failure.
| zamadatix wrote:
| Why would knowing your clients change whether or not you
| want to use DNS? Even when you _control_ all of the clients,
| you'll almost always want to keep using DNS.
|
| A large number of services successfully achieve their
| failure tolerances via these kinds of DNS methods. That
| doesn't mean all services would or that it's always the
| best answer, it just means it's a path you can consider
| when designing for the needs of a system.
| latchkey wrote:
| I'm replying to the comment above. If the article picks a
| few clients and it happens to work, that is effectively
| "knowing your clients". At which point, it means you have
| control over the client/server relationship and if we are
| trying to simplify by not using load balancers, we might
| as well simplify things even further, and not use DNS.
|
| It is an absurd train of thought that nobody in their
| right mind would consider... just like using DNS-RR as a
| replacement for load balancing.
| zamadatix wrote:
| I must be having trouble following your train of thought
| here - many large web services like Cloudflare and Akamai
| serve large volumes of content through round robin DNS
| balancing, what's absurd about their success? They
| certainly don't know every client that'll ever connect to
| a CDN on the internet... it just happens to work almost
| every time anyways. That a very small number of clients might
| not instantly flip over isn't always a failure worth deploying
| full load balancers over. I'm also still not following why the
| decision of whether or not you need a load balancer is
| supposed to be in any way equivalent to the decision of when
| using DNS makes sense.
| latchkey wrote:
| We are not talking about "large web services", we are
| talking about small end users spinning up their own DNS-
| RR "solution".
|
| LWS get away with it because of Anycast...
|
| https://www.cloudflare.com/en-
| gb/learning/cdn/glossary/anyca...
| zamadatix wrote:
| Anycast is certainly a nice layer to add but it's not a
| requirement for DNS round robin to work reliably. It does
| save some of the concern around relying on selection of
| an efficiently close choice by the client though and can
| be a good option for failover.
|
| More directly - is there some set of common web clients I've
| been missing for many years that just don't follow DNS TTLs or
| try alternate records? I think the article gets it right with
| the wish list at the end containing an Amazon Route 53-like
| "pull dead entries automatically" note, but maybe I'm missing
| something else? I've used this
| approach (pull the dead server entries from DNS, wait for
| TTL) and never caught any unexpected failures during
| outages but maybe I haven't been looking in the right
| places?
|
| If you mean it's possible to design something with round-
| robin DNS in a way that more clients than you expect will
| fail then absolutely, you can do things the wrong way
| with most any solution. Sometimes you can be fine with a
| subset of clients not always working during an outage or
| you can be fine with a solution which provides slower
| failover than an active load balancer. What I'm trying to
| find is why round-robin DNS must always be the wrong
| answer in all design cases.
| latchkey wrote:
| > _is there some set of common web clients I've been missing
| for many years that just don't follow DNS TTLs or try
| alternate records?_
|
| Yes. There are tons of people with outdated and/or buggy
| software still using the internet today.
| zamadatix wrote:
| What % did you find to be "tons" with these specific
| bugs? I'm assuming it was quite a significant number (at
| least 10%?) that broke badly quite often, given the certainty
| that it's the wrong decision for all solutions. Any idea how
| to help me identify which clients I've been missing or might
| run into? DNS TTLs are also pretty
| necessary for most web systems to work reliably,
| regardless of load balancer or not, so what ways do you
| work around having large numbers of clients which don't
| obey them (beyond hoping to permanently occupy the same
| set of IPs for the life of the service of course)?
| latchkey wrote:
| The percentage is kind of irrelevant. The issue is that
| if you're running something like an e-commerce site and
| any percentage of people can't hit your site because of a
| TTL issue with one of your down servers, you're likely to
| never know how much lost revenue you've had. Site is
| down, go to another store to buy what you need. You also
| have no control over fixing the issue, other than to get
| the server back up and running. This has downstream effects:
| how do you cycle the server for upgrades or maintenance?
|
| I don't understand why anyone would argue for this as a
| solution when there are near-zero-effort better ways of doing
| this that don't have any of the downsides.
| buzer wrote:
| > More directly - is there some set of common web client
| I've been missing for many years that just doesn't follow
| DNS TTLs or try alternate records?
|
| I don't know if there is such a list but older versions
| of Java are pretty famous for caching the DNS responses
| indefinitely. I don't hear much about it these days so I
| assume it was probably fixed around Java 8.
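|
| For anyone stuck on an affected JVM, the cache length is
| tunable. A hedged example (the jar name is hypothetical, and
| the java.security file location varies by JDK version):
|
|     # cap the JVM's positive DNS cache at 30 seconds for one run
|     java -Dsun.net.inetaddr.ttl=30 -jar app.jar
|
|     # or set it persistently via the security property
|     #   networkaddress.cache.ttl=30
|     # in the JDK's java.security file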
| toast0 wrote:
| Skipping an unnecessary intermediary is worth considering.
|
| Load balancing isn't without cost, and load balancers subtly
| (or unsubtly) messing up connections is an issue. I've also
| used providers where their load balancers had worse
| availability than our hosts.
|
| If you control the clients, it's reasonable to call the
| platform DNS API to get a list of IPs and shuffle and iterate
| through them in an appropriate way. Even better if you have a
| few stably allocated IPs you can distribute in client binaries
| for _when_ DNS is broken; but DNS is often not broken and it's
| nice to use for operational changes without having to push new
| configuration/binaries every time you update the cluster.
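|
| As a rough sketch of that client logic with stock tools
| (hostname, port and health path are made up here):
|
|     # resolve all A records, shuffle, try each until one answers
|     for ip in $(dig +short chat.example.com A | shuf); do
|       if curl -fsS --connect-timeout 2 \
|            --resolve chat.example.com:443:"$ip" \
|            https://chat.example.com/health >/dev/null; then
|         echo "connected via $ip"; break
|       fi
|     done
|
| A real client would also keep the hardcoded fallback IPs
| around for the case where DNS itself is unreachable.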
|
| If your clients are browsers, default behavior is ok; they
| usually use IPs in order, which can be problematic [1], but
| otherwise, they have good retry behavior: on connection refused
| they try another IP right away; in case of timeout, they try at
| least a few different IPs. It's not ideal, and I'd use a load
| balancer for browsers, at least to serve the initial page load
| if feasible, and maybe DNS RR and semi-smart client logic in JS
| for websockets/etc; but DNS RR is workable for a whole site
| too.
|
| If your clients are not browsers and not controlled by you,
| best of luck?
|
| I will 100% admit that sometimes you have to assume someone
| built their DNS caching resolver to interpret the TTL field as
| a number of days, rather than number of seconds. And that
| clients behind those resolvers will have trouble when you
| update DNS, but if your load balancer is behind a DNS name,
| _when_ it needs to change addresses, you'll deal with that
| then, and you won't have the experience.
|
| [1] one of the RFCs suggests that OS APIs should sort responses
| by prefix match, which might make sense if IP prefixes were
| hierarchical, as a proxy for least network distance to a
| server. But in the real world, numerically adjacent /24s are
| often not network adjacent, so if your servers have widely
| disparate addresses, you may see traffic from some client IPs
| gravitate towards numerically similar server IPs.
| ectospheno wrote:
| > I will 100% admit that sometimes you have to assume someone
| built their DNS caching resolver to interpret the TTL field
| as a number of days, rather than number of seconds.
|
| I've run a min ttl of 3600 on my home network for over a
| year. No one has complained yet.
| toast0 wrote:
| That's only because there's no way for service operators to
| effectively complain when your clients continue to hit service
| IPs for 55 minutes longer than they should. And if there were,
| we'd first yell at all the people who continue to hit service
| IPs for weeks and months after a change... by the time we get
| to complaining about one home using an hour TTL, it's not a
| big deal.
| ectospheno wrote:
| I take the point of view that if me not honoring your 60
| second ttl breaks your site for me then I want to know so
| I stop going there.
| easylion wrote:
| https://www.cloudflare.com/en-gb/learning/cdn/glossary/anyca...
| easylion wrote:
| Did you try running a simple bash curl loop instead of manually
| printing? The data and statistics would become much clearer. I
| want to understand how to ensure my clients get the nearest
| edge data center.
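|
| (Something like the loop below, with the hostname as a
| placeholder, would tally which server each request lands on;
| curl's %{remote_ip} write-out variable reports the IP it
| actually connected to. Resolver caching means you may keep
| hitting the same address until the TTL expires.)
|
|     for i in $(seq 1 50); do
|       curl -s -o /dev/null -w '%{remote_ip}\n' https://example.com/
|     done | sort | uniq -c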
| tetha wrote:
| > As you can see, all clients correctly detect it and choose an
| alternative server.
|
| This is the nasty key point. The reliability is decided client-
| side.
|
| For example, systemd-resolved at times enacted maximum technical
| correctness by always returning the lowest IP address. After all,
| DNS-RR is not well-defined, so always returning the lowest IPs is
| not wrong. It got changed after some riots, but as far as I know,
| Debian 11 is stuck with that behavior, or was for a long time.
|
| Or, I deal with many applications with shitty or no retry
| behavior. They go "Oh no, I have one connection refused, gotta
| cancel everything, shutdown, never try again". So now 20% - 30%
| of all requests die in a fire.
|
| It's an acceptable solution if you have nothing else. As the
| article notes, if you have quality HTTP clients with a few
| retries configured on them (like browsers), DNS-RR is fine to
| find an actual load balancer with health checks and everything,
| which can provide a 100% success rate.
|
| But DNS-RR is no load balancer, and load balancers are better.
| latchkey wrote:
| > _It's an acceptable solution if you have nothing else._
|
| I'd argue it isn't acceptable at all in this day and age and
| that there are other solutions one should pick today long
| before you get to the "nothing else" choice.
| toast0 wrote:
| Anycast is nice, but it's not something you can do yourself
| well unless you have large scale. You need to have a large
| number of PoPs, and direct connectivity to many/most transit
| providers, or you'll get weird routing.
|
| You also need to find yourself some IP ranges. And learn BGP
| and find providers where you can use it.
|
| DNS round robin works as long as you can manage to find two
| boxes to run your stuff on, and it scales pretty high too.
| When I was at WhatsApp, we used DNS round robin until we
| moved into Facebook's hosting where it was infeasible due to
| servers not having public addresses. Yes, mostly not
| browsers, but not completely browserless.
| latchkey wrote:
| Back in 2013, that might have been the best solution for
| you. But there were still plenty of headlines...
| https://www.wamda.com/2013/11/whatsapp-goes-down
|
| We're talking about today.
|
| The reason why I said Anycast is because the vast majority of
| people trying to solve the need for multiple servers in
| multiple locations will just use CF or any one of the various
| anycast-based CDN providers available today.
| toast0 wrote:
| Oh sure, we had many outages. More outages on the one
| service where we tried using load balancers because the load
| balancers would take a one-hour break every 30 days
| (which is pretty shitty, but that was the load balancer
| available, unless we wanted to run a software load
| balancer, which didn't make any sense).
|
| We didn't have many outages due to DNS, because we had
| fallback ips to contact chat in our clients. Usage was
| down in the 24 hours after our domain was briefly
| hijacked (thanks Network Solutions), and I think we lost
| some usage when our DNS provider was DDoSed by 'angry
| gamers'. But when FB broke most of their load balancers,
| that was a much bigger outage. BGP based outages broke
| everything, DNS and load balancers, so no wins there.
| latchkey wrote:
| > We didn't have many outages due to DNS, because we had
| fallback ips to contact chat in our clients.
|
| Exactly! When you control the client, you don't even need
| DNS. Things are actually even more secure when you don't
| use it: nothing to DDoS or hijack. When FB broke one set of
| LBs, the clients should have just routed to another set of
| LBs, by IP.
| toast0 wrote:
| FB likes to break everything all at once anyway... And
| health checking the load balancers wasn't working either.
| So DNS to regional balancers was sending people to the
| wrong place, and the anycast ips might have worked if you
| were lucky, but you might have gotten a PoP that was
| broken.
|
| The servers behind it were fine, if you could get to one.
| You could push broken DNS responses, I suppose, but it's
| harder than breaking a load balancer.
| nerdile wrote:
| It's putting reliability in the hands of the client, or
| whatever random caching DNS resolver they're sitting behind.
|
| It also puts failover in those same hands. If one of your
| regions goes down, do you want the traffic to spread evenly to
| your other regions? Or pile on to the next nearest neighbor? If
| you care what happens, then you want to retain control of your
| traffic management and not cede it to others.
| meindnoch wrote:
| So half of your content is served from another server? Sounds
| like a recipe for inconsistent states.
| ChocolateGod wrote:
| You can easily use something like an object store or shared
| database to keep data consistent.
| jgrahamc wrote:
| Hmm. I've asked the authoritative DNS team to explain what's
| happening here. I'll let HN know when I get an authoritative
| answer. It's been a few years since I looked at the code and a
| whole bunch of people keep changing it :-)
|
| My suspicion is that this is to do with the fact that we want to
| keep affinity between the client IP and a backend server (which
| OP mentions in their blog). And the question is "do you break
| that affinity if the backend server goes down?" But I'll reply to
| my own comment when I know more.
| delusional wrote:
| > I'll let HN know when I get an authoritative answer
|
| Please remember to include a TTL so I know how long I can cache
| that answer.
| jgrahamc wrote:
| Thank you for appreciating my lame joke.
| mlhpdx wrote:
| So many sins have been committed in the name of session
| affinity.
| jgrahamc wrote:
| Looks like this has nothing to do with session affinity. I
| was wrong. Apparently, this is a difference between our paid
| and free plans. Getting the details, and finding out why
| there's a difference, and will post.
| asmor wrote:
| Well, CEO said there is none, get on it engineering :)
| cybice wrote:
| Cloudflare results with worker as a reverse proxy can be much
| better.
| easylion wrote:
| But won't it add an additional hop hence additional latency to
| every single request ?
| rodcodes wrote:
| Nah, because the Cloudflare Workers run at the closest edge
| location and are real fast.
|
| The real solution with Cloudflare is to use their Load
| Balancing (https://developers.cloudflare.com/load-balancing)
| which is a paid feature.
| specto wrote:
| Chrome and Firefox use the OS DNS server by default, which in
| most OSes has caching as well.
| urbandw311er wrote:
| What a great article! It's often easy to forget just how flexible
| and self-correcting the "official" network protocols are. Thanks
| to the author for putting in the legwork.
| zamalek wrote:
| Take a look at SRV records instead - they are very intentionally
| designed for this, and behave vaguely similarly to MX. Creating a
| DNS server (or a CoreDNS/whatever module) that dynamically
| updates weights based on backend metrics has been a pending pet
| project of mine for some time now.
| jeroenhd wrote:
| Until the HTTP spec gets updated to include SRV records, using
| SRV records for HTTP(S) is technically spec-incompliant and
| practically useless.
|
| However, as is common with web tech, the old SRV record has
| been reinvented as the SVCB record with a smidge of DANE for
| good measure.
| teddyh wrote:
| One of the early proposed solutions for this was the SRV DNS
| record, which was similar to the MX record, but for every
| service, not just e-mail. With MX and SRV records, you can
| specify a list of servers with associated priority for clients to
| try. SRV also had an extra "weight" parameter to facilitate load
| balancing. However, the SRV designers did not want the
| political fight of effectively hijacking every standard
| protocol to force all
| clients of every protocol to also check SRV records, so they
| specified that SRV should _only_ be used by a client if the
| standard for that protocol explicitly specifies the use of SRV
| records. This technically prohibited HTTP clients from using SRV.
| Also, when the HTTP/2 (and later) HTTP standards were being
| written, bogus arguments from Google (and others) prevented the
| new HTTP protocols from specifying SRV. SRV seems to be
| effectively dead for new development, only used by some older
| standards.
|
| The new solution for load balancing seems to be the new HTTPS and
| SVCB DNS records. As I understand it, they are standardized by
| people wanting to add extra parameters to the DNS in order to
| jump-start the TLS1.3 handshake, thereby making fewer roundtrips.
| (The SVCB record type is the same as HTTPS, but generalized like
| SRV.) The HTTPS and SVCB DNS record types both have the priority
| parameter from the SRV and MX record types, but HTTPS/SVCB lack
| the weight parameter from SRV. The standards have been published,
| and support seems to have been implemented in some browsers,
| but not all have enabled it. We will see what browsers
| actually do in
| the near future.
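|
| Both record types can already be queried with dig; whether
| anything useful comes back depends on the zone (cloudflare.com
| publishes HTTPS records at the time of writing, the SRV name
| below just shows the generic shape, and older dig versions may
| need TYPE65 instead of HTTPS):
|
|     # SRV: _service._proto.name -> priority weight port target
|     dig +short _xmpp-client._tcp.example.com SRV
|
|     # HTTPS/SVCB: priority target alpn=... ipv4hint=... etc.
|     dig +short cloudflare.com HTTPS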
| jsheard wrote:
| > The new solution for load balancing seems to be the new HTTPS
| and SVCB DNS records. As I understand it, they are standardized
| by people wanting to add extra parameters to the DNS in order
| to to jump-start the TLS1.3 handshake, thereby making fewer
| roundtrips.
|
| The other big advantage of the HTTPS record is that it allows
| for proper CNAME-like delegation at the domain apex, rather
| than requiring CNAME flattening hacks that can cause routing
| issues on CDNs which use GeoDNS in addition to or instead of
| anycast. If you've ever seen a platform recommend using a www
| subdomain instead of an apex domain, that's why, and it's part
| of why Akamai pushed for HTTPS records to be standardized since
| they use GeoDNS.
| teddyh wrote:
| Oh yes [1]. This is an advantage shared by all of MX, SRV and
| HTTPS/SVCB, though.
|
| [1] <https://news.ycombinator.com/item?id=38420555>
| metadat wrote:
| _> This allows you to share the load between multiple servers, as
| well as to automatically detect which servers are offline and
| choose the online ones._
|
| To [hesitantly] offer a pedantic clarification regarding "DNS automatic
| offline detection":
|
| Out of the box, RR-DNS is only good for load balancing.
|
| Nothing automatic happens on the availability state detection
| front unless you build smarts into the client. TFA introduction
| does sort of mention this, but it took me several re-reads of the
| intro to get their meaning (which to be fair could be a PEBKAC).
| Then I read the rest of TFA, which is all about the smarts.
|
| If the 1/N server record selected by your browser ends up being
| unavailable, no automatic recovery / retry occurs at the protocol
| level.
|
| p.s. "Related fun": Don't forget about Java's DNS TTL [1] and
| `.equals()' [2] behaviors.
|
| [1] https://stackoverflow.com/questions/1256556/how-to-make-
| java...
|
| [2] https://news.ycombinator.com/item?id=21765788 (5y ago, 168
| comments)
| encoderer wrote:
| We accomplish this on Route53 by having it pull servers out of
| the DNS response if they are not healthy, and serving all
| responses with a very low TTL. A few clients out there ignore
| TTL, but it's pretty rare.
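|
| Roughly, with the AWS CLI, that means attaching a health check
| to each record and keeping the TTL small (zone ID, IP and
| health check ID below are placeholders; a sketch, not the
| exact setup):
|
|     aws route53 change-resource-record-sets \
|       --hosted-zone-id Z123EXAMPLE \
|       --change-batch '{"Changes":[{"Action":"UPSERT",
|         "ResourceRecordSet":{"Name":"www.example.com","Type":"A",
|         "SetIdentifier":"server-a","MultiValueAnswer":true,"TTL":30,
|         "ResourceRecords":[{"Value":"192.0.2.10"}],
|         "HealthCheckId":"11111111-2222-3333-4444-555555555555"}}]}'
|
| Route53 then stops returning any record whose health check is
| failing, which is the "pull dead entries automatically"
| behavior mentioned elsewhere in the thread.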
| ChocolateGod wrote:
| I once achieved something similar with PowerDNS, where you can
| use Lua rules to do health checks on a pool of servers and
| only return healthy servers as part of the DNS record, but I
| found odd occurrences of clients not respecting the TTL on DNS
| records and caching too long.
| tetha wrote:
| You usually do this with servers that should be rock-solid
| and stateless. HAProxy, Traefik, F5. That way, you can pull
| the DNS record for maintenance 24 - 48 hours in advance. If
| something overrides DNS TTLs that much, there is probably
| some reason.
| d_k_f wrote:
| Honest question to somebody who seems to have a bit of
| knowledge about this in the real world: several (German, if
| relevant) providers default to a TTL of ~4 hours. Lovely if
| everything is more or less finally set up, but usually our
| first step is to decrease pretty much everything down to 60
| seconds so we can change things around in emergencies.
|
| On average, does this really matter/make sense?
| stackskipton wrote:
| Lower TTLs are cheap insurance so you can move hostnames
| around.
|
| However, you should understand that not ALL clients will
| respect those TTLs. There are resolvers that enforce a minimum
| TTL threshold, where if TTL < threshold then TTL == threshold.
| This is common with some ISPs, and there may also be cases
| where browsers and operating systems will ignore TTLs or fudge
| them.
| hypeatei wrote:
| The browser behavior is really nice, good to know that it falls
| back quickly and smoothly. Round robin DNS has always been
| referred to as a "poor man's load balancer", which it seems to
| be living up to.
|
| > Curl also works correctly. First time it might not, but if you
| run the command twice, it always corrects to the nearest server.
|
| This took two tries for me, which raises the question of how
| curl is keeping track of RTT (round-trip times). Interesting.
| V__ wrote:
| This seems like a nice solution for zero-downtime updates. Clone
| the server, add the specified IP, deny access to the main one,
| upgrade, and turn the cloned server off.
| unilynx wrote:
| > So what happens when one of the servers is offline? Say I stop
| the US server:
|
| > service nginx stop
|
| But that's not how you should test this. A client will see the
| connection being refused, and go on to the next IP. But in
| practice, a server may not respond at all, or accept the
| connection and then go silent.
|
| Now you're dependent on client timeouts, and round robin DNS will
| suddenly look a whole lot less attractive as a way to increase
| reliability.
| Joe_Cool wrote:
| Yeah, SIGSTOP or just an iptables/nftables DROP would be a much
| more realistic test.
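|
| For example (assuming nginx serving HTTPS on port 443; adjust
| to your setup; both variants make the client hang instead of
| getting a quick connection refused):
|
|     # silently drop traffic to the web port (legacy iptables)
|     iptables -A INPUT -p tcp --dport 443 -j DROP
|
|     # or freeze the process: the kernel still accepts the TCP
|     # connection, but nothing ever answers
|     kill -STOP "$(pidof nginx)"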
| rebelde wrote:
| I have used round robin for years.
|
| Wish I could add instructions like:
|
| - random choice #round robin, like now
|
| - first response # usually connects to closest server
|
| - weights (1.0.0.1:40%; 2.0.0.2:60%)
|
| - failover: (quick | never)
|
| - etc: naming countries, continents
| edm0nd wrote:
| The dark remix version of this is fast flux hosting, which is
| what a lot of the bulletproof hosting providers use.
|
| https://unit42.paloaltonetworks.com/fast-flux-101/
| stackskipton wrote:
| As an SRE, I get a chuckle out of this article and some of the
| responses. Devs mess this up constantly.
|
| DNS has one job. Hostname -> IP. Nothing further. You can mess
| with it on the server side, like checking to see if the HTTP
| server is up before delivering the IP, but once the IP is
| given, the client takes over and DNS can do nothing further,
| so behavior will be wildly inconsistent IME.
|
| Assuming DNS RR is the standard setup where a hostname returns
| multiple IPs, it's only useful for load balancing across
| similar-latency datacenters. If you want fancy stuff like
| geographic load balancing or health checks, you need a fancy
| DNS server, but at the end of the day you should only return a
| single IP so the client will target the endpoint you want it
| to connect to.
| lysace wrote:
| I've never ever come up with a scenario where RR DNS is useful
| for achieving high availability. I'm similarly
| mystified.
|
| What can be useful: dynamically adjusting DNS responses
| depending on what DC is up. But at this point shouldn't you be
| doing something via BGP instead? (This is where my knowledge
| breaks down.)
| stackskipton wrote:
| Yea, Anycast IP like what Cloudflare does is the best.
|
| If you want cheaper load balancing and are ok with some
| downtime while DNS reconfigures, a DNS system that returns an
| IP based on which datacenter is up works. Examples of this are
| Route53, Azure Traffic Manager, and I assume Google has a
| solution; I just don't know what it is.
| lysace wrote:
| Worked on implementing a distributed-consensus driven DNS
| thing like 15 years ago. We had 3 DCs around the world for
| a very compute-intense but not very stateful service. It
| actually just worked without any meaningful testing on the
| first single DC outage. In retrospect I'm amazed.
| realchaika wrote:
| May be worth mentioning that Zero Downtime Failover is a Pro or
| higher feature, I believe; that's how it was documented before
| as well, back when the "protect your origin server" docs were
| split by plan level. So you may see different behavior/retries.
| nielsole wrote:
| > Curl also works correctly. First time it might not, but if you
| run the command twice, it always corrects to the nearest server.
|
| I always assumed curl was stateless between invocations. What's
| going on here?
| barrkel wrote:
| My hypothesis: he's running on macOS and he's seeing the same
| behavior from Safari as from curl because they're both using
| OS-provided name resolution which is doing the lowest-latency
| selection.
|
| Firefox and Chrome use DNS over HTTPS by default I believe,
| which may mean they use a different name resolution path.
|
| The above is entirely conjecture on my part, but the guess is
| heavily informed by the surprise of curl's behavior.
| hyperknot wrote:
| Correct. I'm on macOS and I tried turning off DoH in Firefox
| and then it worked like Safari.
| jkrauska wrote:
| Check out what happens when you use IPv6 addresses. RFC 6724 is
| awkward about ordering with IPv6.
|
| How your OS sorts DNS responses also comes into play. It
| depends on how your browser makes DNS requests.
| mlhpdx wrote:
| Interesting topic for me, and I've been looking at anycast IP
| services and latency based DNS resolvers as well. I even made a
| repo[1] for anyone interested in a quick start for setting up AWS
| global accelerator.
|
| [1] https://github.com/mlhpdx/cloudformation-
| examples/tree/maste...
| why-el wrote:
| Hm, I thought Happy Eyeballs (HE) was mainly concerned with IPv6
| issues and falling back to IPv4. I didn't think it was this RFC
| in which finally some words were said about round-robin
| specifically, but it looks like it was (from this article).
|
| Is it true then that before HE, most round-robin implementations
| simply cycled and no one considered latency? That's a very
| surprising finding.
| freitasm wrote:
| Interesting. The author starts by discussing DNS round robin but
| then briefly touches on Cloudflare Load Balancing.
|
| I use this feature, and there are options to control Affinity,
| Geolocation and others. I don't see this discussed in the
| article, so I'm not sure why Cloudflare load balancing is
| mentioned if the author does not test the whole thing.
|
| Their Cloudflare wishlist includes "Offline servers should be
| detected."
|
| This is also interesting because when creating a Cloudflare load
| balancing configuration, you create monitors, and if one is down,
| Cloudflare will automatically switch to other origin servers.
|
| These screenshots show what I see on my Load Balancing
| configuration options:
|
| https://cdn.geekzone.co.nz/imagessubs/62250c035c074a1ee6e986...
|
| https://cdn.geekzone.co.nz/imagessubs/04654d4cdda2d6d1976f86...
| hyperknot wrote:
| I briefly mention that I don't go into L7 Load Balancing
| because it'd be cost prohibitive for my use case (millions of
| requests).
|
| Also, the article is about DNS-RR, not the L7 solution.
| bar000n wrote:
| Hey! So I have a CDN for video made of 4 bare-metal servers,
| and 2 are newer and more powerful, so I give them each 2 IP
| addresses out of the 6 addresses returned by DNS for the
| respective A record. But from a very diverse pool of devices
| (proprietary set-top boxes, smart TV sets, iOS and Android
| mobile clients, web browsers, etc.) I still get ~40% of
| traffic on the older servers instead of the expected 33%,
| given 2 out of 6 IP addresses resolve to these hosts in the
| DNS A records. Why?
| kawsper wrote:
| 37signals/Basecamp wrote about something similar on their blog;
| they saw traffic switching almost immediately:
| https://signalvnoise.com/posts/3857-when-disaster-strikes and in
| the comments it was hinted that it was just a DNS update with
| low TTLs.
___________________________________________________________________
(page generated 2024-10-26 23:00 UTC)