[HN Gopher] Akamai Edge DNS was down
___________________________________________________________________
Akamai Edge DNS was down
Author : vhab
Score : 444 points
Date : 2021-07-22 16:12 UTC (6 hours ago)
(HTM) web link (edgedns.status.akamai.com)
(TXT) w3m dump (edgedns.status.akamai.com)
| swarnie_ wrote:
| I love seeing these issues reverberate around the internet.
|
| This time I think /r/sysadmin pegged the issue first, great sub.
| geocrasher wrote:
| People don't believe me when I say how much DNS matters. So I
| wrote a song about it.
|
| https://soundcloud.com/ryan-flowers-916961339/dns-to-the-tun...
| wpasc wrote:
| dns, DNS, dns, dns. The start of every process, dns.
|
| Love this.
| geocrasher wrote:
| Thank you! I'm glad that landed where I wanted it to. It was
| a lot of fun to put together. I keep threatening to make a
| video. I need a collection of DNS memes so that I can just
| slideshow them.
| wpasc wrote:
| Haha please do, a video would be great. Your song reminded
| me of the song: "Find the Longest Path" which you may get a
| kick out of:
|
| https://www.youtube.com/watch?v=a3ww0gwEszo
| mvanbaak wrote:
| Awesome! Thank you.
| patleeman wrote:
| I just teared up
| geocrasher wrote:
| LOL! great comment thank you!
| ricardo81 wrote:
| Brilliant. NOERROR for this.
| brianjking wrote:
| lol, thanks for the laugh.
| zyberzero wrote:
| Thank you!
| southerntofu wrote:
| Sounds amazing! Do you maybe have a direct link? Soundcloud
| doesn't want us privacy-conscious users browsing their website
| :(
| Frost1x wrote:
| This made my day, thanks!
| geocrasher wrote:
| And this, mine! Thanks!
| [deleted]
| kevando wrote:
| lol
| pololee wrote:
| Thank you!! lol
| dolni wrote:
| > People don't believe me when I say how much DNS matters.
|
| That's weird to me. I have been working in sysadmin/DevOps for
| over a decade, but it did not take me very long to learn that
| DNS outages cause massive problems.
| geocrasher wrote:
| Right, but everybody has to learn that at some point. And I
| happen to be somebody who teaches such things. The importance
| of DNS is hard to overstate, but I go to great lengths to do
| exactly that, to make a point ;)
| throwawaysha wrote:
| I ran DNS servers, among other things, in the late 90s with
| better uptime than these "multi-DC/AZ/geo redundant" services
| everyone uses these days.
| memco wrote:
| Was just browsing a website where the first page of a query
| worked, but visiting page 2 of the results was returning a DNS
| error. Was curious how and why only part of the site was down,
| but it looks like this was the problem as now the whole site is
| down.
| katbyte wrote:
| aren't short DNS TTLs great?
| sebmellen wrote:
| Is this a serious argument for long TTLs? Always wondered why
| they exist... How interesting.
| slim wrote:
| Yes it is. The longer the TTL the longer you stay
| independent from third parties. It's what makes the
| internet stable.
| remram wrote:
| Long TTL makes you independent from DNS third parties, in
| that your name is still known by clients if DNS is down.
|
| Short TTL makes you independent from hosting third
| parties, in that you can quickly change which hosting
| provider your domain name points to.
|
| You can't win this one by only changing your TTL. The
| best solution is to use short TTLs and multiple
| nameservers on different providers.
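|
| A minimal sketch of the trade-off above. The function names and
| the sample TTLs are illustrative assumptions, and real resolvers
| may cap or ignore TTLs:

```python
def seconds_still_resolvable(ttl, cached_at, outage_start):
    """Seconds a cached record keeps answering after the authoritative DNS fails."""
    expires_at = cached_at + ttl
    return max(0, expires_at - outage_start)


def worst_case_repoint_delay(ttl):
    """Worst case before every client sees a changed record: one full TTL."""
    return ttl


# Compare a short, a medium and a long TTL (values chosen for illustration).
for ttl in (60, 300, 86400):
    survives = seconds_still_resolvable(ttl, cached_at=0, outage_start=30)
    print(ttl, survives, worst_case_repoint_delay(ttl))
```

| The same TTL appears in both numbers, which is why changing it
| alone only moves risk from one side to the other.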
| lowbloodsugar wrote:
| So many sites being reported as down, but change your DNS to
| something else (e.g. Google 8.8.8.8 and 8.8.4.4) and, after
| flushing your DNS cache, the sites are available. I was unable to
| get to ups.com or newegg.com (why yes, I am expecting a new toy),
| but after switching DNS and flushing DNS cache, I was able to get
| to both.
|
| Specifically, 1.1.1.1 provided bad addresses (as opposed to no
| addresses), and removing 1.1.1.1 fixed my problem. By then it had
| returned a bunch of bad addresses and I had to flush my DNS
| cache.
| didjathinkmess wrote:
| Cyberpolygon already? Thought we had at least a month or two
| penultimatebro wrote:
| Shh, normies are not ready for that.
|
| It's just a completely random DNS outage, nothing more.
| gianpaj wrote:
| https://www.interactivebrokers.co.uk/, a trading platform, is
| down as well :(
|
| How am I going to sell my AMC stock...
| swarnie_ wrote:
| You don't, you hold the dumb, overpriced stock as a reminder
| for future, better informed investing.
| thunfisch wrote:
| Yep, all our EdgeDNS zones as well as DSD edgekeys are just
| returning SERVFAILs. Many big German websites are down right now.
| zhdc1 wrote:
| Several unrelated websites I was trying to visit are down. I
| figured I would find the answer on HN : )
| mariusseufzer wrote:
| Same haha
| cbeley wrote:
| I wonder if this is why LastPass is down. It has completely
| locked me out of my vault. You'd think it'd continue to work
| offline in a case like this. :/
| zxcvbn4038 wrote:
| When it comes to password managers, 1password is the one to
| beat. Much better experience in every regard.
| eunai wrote:
| I switched to BitWarden and haven't looked back. You can use it
| on the phone and pc (browser). As well as a desktop client.
| fredski42 wrote:
| And with vaultwarden you can go self hosted with a very
| lightweight server written in rust.
| AnIdiotOnTheNet wrote:
| Switched to vaultwarden at work for password management,
| only have minor gripes so can recommend.
| benburleson wrote:
| Yeah, my path was LastPass -> Bitwarden -> 1Password.
|
| Both Bitwarden and 1Password are great.
| decrypt wrote:
| Same path. It'll be very hard to move away from 1Password.
| App experience, sync, security features like key in
| addition to master password, family organizer-based
| recovery of an account, these are a few things that stand
| out.
| raffraffraff wrote:
| I prefer the browser addon for bitwarden over 1Password.
| Try editing a site in 1Password. It forces you to log
| into the full site, whereas bitwarden can do almost
| everything right there in the addon.
| judge2020 wrote:
| This is also possible with the 1Password X extension,
| however there's a lot of feature segmentation and unclear
| messaging between the Desktop app-based version and
| 1Password X so I don't blame you for using the old one.
| macintux wrote:
| Yeah, I use 1Password for every critical bit of
| information (SSN numbers, physical access codes) and a
| whole lot of less-critical stuff. I expect to be a
| customer for life.
| revscat wrote:
| Can you explain what family organizer-based recovery
| means? It sounds like dad or mom could recover a kid's
| password?
| eddieroger wrote:
| That's about right for what it is, or at least how I
| think about it. There's no magic "unlock vault" button
| (by design), but an Organizer can kick off a workflow to
| reset a vault if need be. I have a few of the more tech-
| savvy family members set as organizers in my family in
| case something ever happens to me.
| chewmieser wrote:
| https://support.1password.com/recovery/
| chewmieser wrote:
| My favorite feature personally is the built-in 2FA
| support. Click and it logs into your account and copies
| the 2fa code to clipboard so just paste on next screen.
|
| Multiple vaults too is nice but I know others have ways
| to limit exposure of passwords in similar manners.
| arnado wrote:
| Bitwarden offers this as well, but I don't really
| understand why you would want it. If someone compromises
| your password manager, 2FA is now worthless. Or am I
| misunderstanding how it works?
| decrypt wrote:
| Your understanding is correct. 1Password requires a key
| in addition to the master password. And finally,
| 1Password can have 2FA for itself, which is stored on my
| Authy. These are reasons why I am comfortable storing my
| 2FA codes on it.
|
| Bitwarden has 2FA support too, but does not have the
| unique key feature that 1Password has.
| JonathanMerklin wrote:
| Then what was the impetus to switch off of Bitwarden?
| nowahe wrote:
| I'm in the middle of a migration from Akamai to Cloudfront, time
| to take a break I guess
| blondie9x wrote:
| Looks like it is fixed now!
| [deleted]
| fredski42 wrote:
| I thought DNS was supposed to be resilient
| topspin wrote:
| DNS is _designed_ to be fault tolerant. Such a design, however,
| is often not leveraged correctly; the implementation of DNS can
| be and frequently is subject to SPOFs.
| realSaddy wrote:
| This is affecting Steam as well
| ssully wrote:
| It is impacting a lot of things: https://downdetector.com/
| 00deadbeef wrote:
| Well it's been an hour now since I first noticed the effects and
| their service status still has no useful information or ETA for a
| fix. It's just an "emerging issue".
| jonnyone wrote:
| The affected sites that I use are now working. Check again.
| bpye wrote:
| This is apparently why I can't book my COVID vaccine
| appointment...
| _joel wrote:
| Yes, was trying to do the same. Getting this 2nd jab has been a
| nightmare. Places listed as walk-ins with Moderna don't have it,
| and they ran out of it when I went to get my scheduled jab.
| Ringing 119 just ends up in a dead line, then this outage. Fun.
| Scoundreller wrote:
| All yuor data are belong to us
| cbono1 wrote:
| Why would Google and Amazon be on the downdetector list or
| experiencing issues? Don't they have their own DNS / nameservers
| separate from Akamai?
| sathackr wrote:
| because the way downdetector works is it just basically counts
| how many people are searching/visiting for <site> down and if
| it's much higher than typical it flags the site as down.
|
| So if everyone searched "is google down" and visited the link
| on downdetector that was returned in the search, that would add
| to the downdetector count for that site.
|
| Downdetector doesn't actually know if the site is up or down.
| brentm wrote:
| A more proper name might be PeopleThinkItsDownDetector.com
| cbono1 wrote:
| Not nearly as SEO friendly
| k1t wrote:
| I found this hard to believe, but it's correct.
|
| _Downdetector only reports an issue if a significant number
| of users are impacted. To that end, Downdetector calculates a
| baseline volume of typical problem reports for each service
| monitored, based on the average number of reports for that
| given time of day over the last year. Downdetector's incident
| detection system compares the current number of problem
| reports to this baseline and only reports an issue if the
| current volume significantly exceeds the typical volume of
| reports._
|
| https://www.speedtest.net/insights/blog/how-downdetector-
| wor...
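|
| The baseline comparison in that quote can be sketched roughly as
| below. The threshold factor of 3 and the function name are
| assumptions for illustration; the quote does not say what
| "significantly exceeds" means in numbers:

```python
from statistics import mean


def is_incident(current_reports, historical_reports, factor=3.0):
    """Flag an incident when current reports far exceed the historical baseline.

    historical_reports: report counts for the same time of day on past days.
    factor: hypothetical multiplier standing in for "significantly exceeds".
    """
    if not historical_reports:
        raise ValueError("need at least one historical sample")
    baseline = mean(historical_reports)
    return current_reports > factor * baseline


# Typical day: about 10 reports at this hour. During an outage: hundreds.
print(is_incident(15, [10, 12, 8]))
print(is_incident(500, [10, 12, 8]))
```

| Note that nothing here probes the site itself; only the volume
| of user reports is compared.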
| mc32 wrote:
| So how do they reset status? The number of queries going down
| signifies return to normal status?
| dylan604 wrote:
| Some CEO calls another CEO and makes a deal?
| Eikon wrote:
| This is affecting apple as well
|
| https://www.apple.com/go/
| iruoy wrote:
| For some reason that url doesn't work for me, but
| https://www.apple.com/ and https://www.apple.com/nl/ do.
| remram wrote:
| That fails with a 404 for me, which is probably not related to
| DNS at all?
|
| archive.org seems to indicate there was never anything there...
| mvanaltvorst wrote:
| What role does Akamai Edge DNS play in normal internet traffic?
| DNS responses usually get cached, as far as I understand
| correctly. And it is usually possible to change your DNS server
| to e.g. Google's and circumvent the outage. Does Akamai Edge DNS
| play a role on the server side?
| carlsborg wrote:
| Looks like this: the affected subdomains are CNAMEd to the
| akamai CDN, and the Nameserver for those are/were down.
|
| So for example:
|
| Top level domain for nvidia resolved fine..
|
| dig @1.1.1.1 nvidia.com => status: NOERROR, Nameservers are
| ns6.dnsmadeeasy.com
|
| But the website didn't. dig @1.1.1.1 www.nvidia.com => status:
| SERVFAIL,
|
| The name www.nvidia.com pointed at an Akamai nameserver,
| which had a problem..
|
| dig @1.1.1.1 www.nvidia.com NS => CNAME
| e33907.a.akamaiedge.net.
| NeckBeardPrince wrote:
| > What role does Akamai Edge DNS play in normal internet
| traffic?
|
| Clearly a big one.
| r1ch wrote:
| The trend these days is DNS TTLs of 60 - 300 seconds, to allow
| "Cloud agility" or something, so sites are exposed to a much
| larger risk of authoritative nameservers going down.
| jameshart wrote:
| You say that like it's a bad idea.
|
| Services like Akamai use short TTLs for their edge services
| for a variety of reasons, not least because if one of their
| edge servers goes offline (for planned or unplanned reasons)
| it lets them sub in a new one and have it receive traffic
| immediately, rather than have a bunch of clients continue
| trying to talk to a dead node. So sure, you can increase
| those TTLs to trade 'what if the DNS server goes down?' risk
| with 'what if the edge server goes down?' risk...
|
| But keeping the edge servers up and running is probably a lot
| harder - they need to scale more to handle traffic load, they
| have to actually handle client data, TLS termination, much
| more complex configuration.... so if I'm placing bets on
| which of those things is more likely to die on me, it's the
| edge node, not the DNS server.
| uncertainrhymes wrote:
| If you use a CDN to front your traffic, you need the CNAME for
| www (or whatever) to be pointing at their DNS infrastructure,
| so they can return whichever closest POP is going to serve your
| traffic.
|
| e.g. dig @1.1.1.1 www.nvidia.com +trace
|
| ... various things from the root ...
|
| www.nvidia.com. 7200 IN CNAME www.nvidia.com.edgekey.net.
| ;; Received 83 bytes from 208.94.148.13#53(ns5.dnsmadeeasy.com) in 35 ms
|
| So the main DNS is fine, but it'll never get an A record
| because the last link in the chain is toast -- edgekey being
| Akamai in this case, but all CDNs do this so they can route
| traffic. Normally, this is a good thing so they can shift
| traffic within 30 seconds on their side. Unfortunately, it also
| means it would take nvidia two hours (the 7200 s TTL on that
| CNAME) to point away from Akamai.
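|
| A toy model of that chain, assuming an in-memory record table
| instead of real DNS. All names and the 192.0.2.10 address are
| placeholders, and a missing entry stands in for a nameserver
| that is not answering:

```python
def resolve(name, records, max_hops=8):
    """Follow CNAMEs until an A record is found; any broken link fails the lookup."""
    for _ in range(max_hops):
        rtype, value = records.get(name, ("SERVFAIL", None))
        if rtype == "A":
            return ("NOERROR", value)
        if rtype == "CNAME":
            name = value  # hand off to the next zone in the chain
            continue
        return ("SERVFAIL", None)
    return ("SERVFAIL", None)  # too many hops, treat as a failure


healthy = {
    "www.example.com.": ("CNAME", "www.example.com.cdn.example."),
    "www.example.com.cdn.example.": ("A", "192.0.2.10"),
}
# Same first link, but the CDN's zone no longer answers.
cdn_down = {
    "www.example.com.": ("CNAME", "www.example.com.cdn.example."),
}

print(resolve("www.example.com.", healthy))
print(resolve("www.example.com.", cdn_down))
```

| The site's own nameserver answers in both cases; only the last
| hop differs.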
| _joel wrote:
| So that's why the NHS website is down
| jamespwilliams wrote:
| Back up now by the looks of it
| [deleted]
| tyingq wrote:
| You can see this on a lot of sites right now. You get the Akamai
| style error with something like: Reference:
| #11.453a2f17.1393u44848484.3aee33433
|
| At the bottom of a very bland looking error page.
| halfmatthalfcat wrote:
| You could argue Akamai is the blandest of the CDN bunch; their
| UIs are atrocious.
| chrisweekly wrote:
| Their APIs are (or, were, last I suffered their use a few
| years ago) also terrible, eg blanket policy of refusing to
| cache any resource in the presence of "Vary" header,
| regardless of its value, and failure to honor standard HTTP
| headers... thankfully there are many other options for CDN,
| which are SO MUCH BETTER.
| youngtaff wrote:
| Surely it depends what you vary on?
|
| Content-Encoding should be well supported, User-Agent less
| so and for very good reasons (there's too much variation in
| UA strings)
| acdha wrote:
| It wasn't that simple -- IIRC, for a while Vary meant
| "don't cache anything, ever, under any circumstances"
| unless you made some custom configuration changes. Over
| time they _added_ support for just "Vary: Accept-
| Encoding" (IIRC less than a decade ago) and that was
| fragile. They improved that over time but it was painful
| for a number of years because there were various failure
| modes which meant things wouldn't be cached, or (IIRC)
| compression would be disabled for certain URLs
| sporadically if the first request for the option did not
| request transfer compression.
| judge2020 wrote:
| https://learn.akamai.com/en-us/webhelp/adaptive-media-
| delive...
|
| > AMD automatically strips these headers out of requests
| to support caching for faster delivery.
|
| > I need the Vary HTTP headers: AMD can cache the
| associated object if the Vary HTTP header contains only
| "Accept-Encoding" and "Gzip" is present in the Content-
| Encoding header
|
| (AMD in this case standing for Akamai Media Delivery)
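|
| Read literally, the quoted rule could be sketched like this.
| The function name, the plain-dict header format, and treating a
| response without any Vary header as cacheable are assumptions,
| not Akamai's documented behavior:

```python
def cacheable_under_quoted_rule(headers):
    """Apply the quoted rule: Vary may only name Accept-Encoding, and gzip must be used.

    headers: dict of response headers with exact-case keys (an assumption).
    """
    vary = {v.strip().lower() for v in headers.get("Vary", "").split(",") if v.strip()}
    if not vary:
        return True  # assumption: no Vary header means nothing blocks caching
    encodings = {e.strip().lower() for e in headers.get("Content-Encoding", "").split(",")}
    return vary == {"accept-encoding"} and "gzip" in encodings


print(cacheable_under_quoted_rule({"Vary": "Accept-Encoding", "Content-Encoding": "gzip"}))
print(cacheable_under_quoted_rule({"Vary": "User-Agent"}))
```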
| zxcvbn4038 wrote:
| Akamai is their own worst enemy most of the time. Their
| prices are the highest, they trail on features, their
| documentation is opaque, it takes an hour to propagate
| changes, etc. Only a few years ago you could only use SSL
| if you purchased their ridiculously expensive pci-dss plan
| - I thought they would defend that to their grave.
|
| Better alternatives are Cloudflare, Fastly, AWS CloudFront.
|
| Google Cloud CDN always seems to have very good latency but
| a very bare bones feature set and no edge compute I can
| identify. Support is always a huge red mark for Google
| anything.
| dylan604 wrote:
| yeah, but only tech nerds see it, so it's okay. maybe it's a
| ploy to get the users to go to the real command set via CLI.
| make it so shitty nobody wants the UI, and goes back to the
| terminal. "if you're not a CLI ninja, then you shouldn't be
| using our product anyways!"
| lowbloodsugar wrote:
| What's frustrating is that DNS is returning an address, instead
| of just failing, and so macos is caching that value (though it
| might be cloudflare doing that).
| space_ghost wrote:
| Wildcard DNS should be a prosecutable crime, punishable by no
| less than 20 years of hard labor. (Edit: Probably should have
| made it clear that this was a joke)
| gokhan wrote:
| Wildcard DNS helps me to handle multitenancy easily. What's
| wrong with it?
| dylan604 wrote:
| When did congress members start posting to HN?
| adamdoran wrote:
| Presumably you're referring to the practice of answering
| queries for nonexistent records with an A record belonging
| to an advertisement page? (instead of doing the right thing
| answering NXDOMAIN, presuming no records of another type
| also exist for the queried name.)
|
| dnsmasq has a really useful feature for dealing with this:
| --bogus-nxdomain
| breakingcups wrote:
| I don't see how wildcard DNS is related to this? Nor how
| it's bad?
| LeoPanthera wrote:
| To empty the macOS DNS cache:
|
| sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder
| dbsmith83 wrote:
| https://downdetector.com/archive
|
| So many sites down... and unfortunately not one of them is
| Twitter
| cpgeier wrote:
| Amazing that down detector manages to stay up during these
| kinds of outages. Noticed it has been a little slow but they
| really have done a good job keeping it up even though large
| portions of the internet are down right now.
| mindcrime wrote:
| Who detects if Down Detector is down? Is there an
| isdowndetectordown.com site?
| cube00 wrote:
| "I dunno. Coast Guard?"
| SahAssar wrote:
| Sounds like when Fuckedcompany put itself on Fuckedcompany.
| mcintyre1994 wrote:
| It's parked by GoDaddy, but unfortunately their website is
| fubar by this outage if you try to click through to see how
| much they want for it :)
| ksec wrote:
| I guess the mother of all Network Downtime checker is HN.
| dheera wrote:
| Is there a way to tell your system to fall back to the last
| known IP address if DNS server isn't reachable?
|
| Basically soft-invalidate your local DNS cache but bring it back
| from the cache graveyard if DNS is down.
| elithrar wrote:
| You could run a local resolver like dnsmasq or Unbound that
| can "serve stale" on upstream failures, but that assumes the
| DNS failure is a client-facing resolver one.
|
| From what I observed here, it was more internal DNS related:
| Newegg was serving an opaque "DNS failure" error page _from
| Akamai's front-end_ which is likely because their infra was
| failing to resolve names internally.
| bombcar wrote:
| It should be possible to set your cache so it lives forever
| but still checks for a new IP at normal expiring time.
| [deleted]
| TimWolla wrote:
| Unbound has a 'serve-expired' option: https://nlnetlabs.nl/do
| cumentation/unbound/unbound.conf/#ser...
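|
| The "serve stale" idea from this subthread, as a rough sketch.
| This is not how Unbound or dnsmasq implement it; the class, the
| lookup callback, and the use of OSError for upstream failure are
| all assumptions for illustration:

```python
import time


class StaleServingCache:
    """Cache that honors TTLs but falls back to expired answers when upstream fails."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup    # hypothetical callable: name -> (address, ttl_seconds)
        self._clock = clock
        self._entries = {}       # name -> (address, expires_at)

    def resolve(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry and entry[1] > now:
            return entry[0]      # still fresh, no upstream query needed
        try:
            address, ttl = self._lookup(name)
        except OSError:
            if entry:
                return entry[0]  # expired, but better than nothing
            raise                # never seen this name, nothing to fall back to
        self._entries[name] = (address, now + ttl)
        return address
```

| As noted above, this only helps when the failure is between the
| resolver and the authoritative servers, and only for names that
| were looked up before the outage.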
| mcintyre1994 wrote:
| It's interesting that they report an AWS outage but there don't
| seem to be any issues there. Looks like their methodology is a
| bit too reliant on those speculative tweets from the first 5
| minutes of all these sites going down.
| https://downdetector.com/status/aws-amazon-web-services/
|
| > So many websites are down, are AWS servers down or something?
|
| > Amazon web services is down which is affecting a lot of
| company web sites and services. Not sure what is going on.
|
| > Miss us? @aldotcom and a whole bunch of other folks have been
| knocked off the internet by what appears to be an AWS
| attack/system failure. We'll be back. ?
| mandelbrotwurst wrote:
| It's just based on user reports, so this is people
| mischaracterizing it as an AWS outage.
| jacob019 wrote:
| cloudfront was down too
| mcintyre1994 wrote:
| Yep that's my point. I'm guessing that for a lot of sites
| they can verify if there's an outage pretty easily when
| they see a spike in reports, but for something like AWS
| unless they updated their status page (lol) or downdetector
| ran a bunch of stuff on there just to check with, I guess
| they don't have a good way to verify it.
| mandelbrotwurst wrote:
| Gotcha, yeah I guess I always just considered that out of
| scope for their service and that it's just a report
| aggregator but I suppose you would expect it to be at
| least a little bit clever based on the "detector" name
| 1f60c wrote:
| > Unfortunately not one of them is Twitter
|
| Please keep comments like this off HN
| grawprog wrote:
| You got your wish, looks like Twitter's on the list now too.
| jdlyga wrote:
| Oops, someone unplugged the DNS machine
| mvelie wrote:
| Akamai believes they have it fixed. We've seen our traffic return
| to normal. https://twitter.com/Akamai/status/1418251400660889603
| roody15 wrote:
| hmmm does not appear fixed here in the Midwest
| 00deadbeef wrote:
| Figured this out almost 30 minutes before they bothered to update
| their status page.
| SjorsVG wrote:
| Many bank systems are disrupted by this in the Netherlands
| ricardo81 wrote:
| My UK bank (HBOS) seemed to have 'online banking unavailable'
| though their site was up. No doubt related.
| SjorsVG wrote:
| Many banks in the Netherlands are affected by this.
| schemathings wrote:
| Possibly related .. Verizon peering issues / ASN701 at Equinix
| NY2 in Secaucus NJ
| rvz wrote:
| Probably Akamai needs to use Kubernetes.
|
| EDIT: So HN can't even take a joke after this? [0]
|
| [0] https://news.ycombinator.com/item?id=27893482
| mdtancsa wrote:
| Sheesh, So yesterday! :)
| unemphysbro wrote:
| come on, this is funny. HN needs to lighten-up.
| whitepoplar wrote:
| Probably caused by Kubernetes
| rvz wrote:
| That's even worse if true; despite HNers creating a storm in
| a tea cup on DOSing a blog of a service not using K8s when
| having a blog is not their main service. [0].
|
| Either way, the joke is now on the HNers in that thread.
|
| [0] https://news.ycombinator.com/item?id=27893482
| knaik94 wrote:
| I am surprised financial institutions don't have any regulation
| for redundancy. The one that stuck out to me is the Navy Federal
| Credit Union website being down. I have not had any issues
| logging into mobile though for some of the reported sites.
| toomuchtodo wrote:
| Commercial banks are held to a different operational resiliency
| standard than financial infrastructure.
|
| (a component of my consulting work is reporting to financial
| regulators for institutions)
| deckard1 wrote:
| this is prime shit Hacker News says right here. Wait until you
| learn banks close on Sunday. Or have maintenance windows for
| their website, ATM, etc.
| christophilus wrote:
| I'm not sure how easy it would be to regulate. But yeah. I've
| got a few short term trades in my brokerage account, and
| outages really throw a wrench into those.
| xyzzy21 wrote:
| The way to regulate is like anything else: if they fail to meet
| QoS uptimes, they get fined in 6-8 figures for every minute
| of loss.
| cryptoz wrote:
| All major Canadian banks were down.
| Terretta wrote:
| > _financial institutions don't have any regulation for
| redundancy_
|
| As CTO of a bank, I wasn't aware of this. So either we wasted a
| ton of money and time constantly upgrading redundancy and
| business continuity technologies to satisfy our regulators...
| or this statement could be mistaken.
| brentm wrote:
| CapitalOne has a broken login which is pretty surprising to me.
| [deleted]
| tjpnz wrote:
| Just got booted out of Netflix on the PS4 because the console
| could no longer connect to Sony's license server. Netflix was
| working just fine by the way.
| vmception wrote:
| Ah, that's what's going on. Happened to me as well, I just assumed
| that Sony is neglecting PS4 performance with its new system,
| while bogging it down with bloatware.
| lxgr wrote:
| Was the app installed/running using a secondary PSN account by
| any chance? This shouldn't be happening on a primary
| account/console pair.
| tjpnz wrote:
| It should be my primary although I've often seen it revert
| back after setting it. I did try setting it as my primary
| again but you know.
| hackerbrother wrote:
| Yup, I learned Hulu on Xbox One relies heavily on some
| Microsoft authentication during a recent Office 365 or Azure
| outage (not sure which).
| [deleted]
| simonswords82 wrote:
| I'm sick and tired of these types of services (I'm looking at you
| too Cloudflare) going down and taking otherwise healthy websites
| down with them.
| ceejayoz wrote:
| Most websites using Akamai aren't gonna be "otherwise healthy"
| without the CDN handling most of the load.
| tootie wrote:
| It was fastly last time.
| simonswords82 wrote:
| True but cloudflare have been guilty of downtime too.
| ceejayoz wrote:
| There aren't many sites that aren't, including "otherwise
| healthy websites" hosted without a CDN.
| TheSwordsman wrote:
| I think this is a factually true statement if your business
| uses any computers. ;)
| sammy2244 wrote:
| Cloudflare hasn't had an outage in a long time. And when they do
| they are upfront about it, and post a detailed post-mortem.
| davidjgraph wrote:
| Serious question, has anyone properly solved the issue of DNS as
| a single point of failure?
| tyingq wrote:
| It's an interesting question, as it's always been solved on the
| server side. All of the current problem is client side. That
| is, client resolvers that aren't using diverse providers, and
| only do things like round-robin with long timeouts.
| kokey wrote:
| Anycast for the DNS IPs deals with most of the problems of
| clients not failing over elegantly when their primary DNS
| server is broken.
| citrin_ru wrote:
| From a client (DNS recursor) point of view there is no
| primary server. There are just multiple NS records, which are
| equal. If one of them is down it can introduce resolving
| delays, but they are usually small, at least if something
| like Unbound or BIND is used. Unbound, e.g., maintains an
| infra-cache where it tracks RTT and errors for each server
| and avoids servers which are down.
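|
| A simplified illustration of per-server tracking. This is not
| Unbound's actual algorithm; the class, the penalty value, and
| the "last measurement wins" rule are assumptions:

```python
class ServerTracker:
    """Remember the latest round-trip time per nameserver and prefer the fastest."""

    def __init__(self, servers, timeout_penalty_ms=10_000.0):
        self._penalty = timeout_penalty_ms
        self._rtt_ms = {server: 0.0 for server in servers}  # untried servers sort first

    def record_success(self, server, rtt_ms):
        self._rtt_ms[server] = rtt_ms

    def record_timeout(self, server):
        # A server that timed out is pushed to the back of the line.
        self._rtt_ms[server] = self._penalty

    def pick(self):
        return min(self._rtt_ms, key=self._rtt_ms.get)


tracker = ServerTracker(["ns1.example.", "ns2.example."])
tracker.record_success("ns1.example.", 20.0)
tracker.record_success("ns2.example.", 35.0)
tracker.record_timeout("ns1.example.")
print(tracker.pick())
```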
| hk1337 wrote:
| The problem isn't DNS though, is it? The problem is that people
| don't necessarily use the redundancies on DNS?
|
| The whole reason it takes a domain 24h to fully work with DNS
| is because it propagates the information to other DNS servers,
| thus making it not a centralized service.
| unilynx wrote:
| That differs per TLD though. In .nl updates are usually fully
| processed within the hour (they update the zone file twice
| per hour)
| jameshart wrote:
| DNS doesn't 'propagate' except in the very limited case of
| zone-transfer publication, which... nobody really relies on
| these days. Registrars tell you it takes 24 hours to
| propagate to stop you from complaining to them about your
| ISP's DNS caching policy. The reality is: recursing DNS
| servers have caches, they respect TTLs, and for the most part
| that means that DNS changes should fully wash through within
| an hour for most changes (less if you keep your TTLs
| shorter).
| arberx wrote:
| Yes: https://ens.domains/
| jakeschaeffer wrote:
| https://handshake.org is the only project I've seen that
| actually solves the issue with a decentralized root zone file.
|
| https://namebase.io is a "registrar" for it.
| airstrike wrote:
| Why does this need to have the whole NFT / crypto / auction
| angle?
|
| https://learn.namebase.io/starting-from-zero/how-to-get-a-
| na...
|
| This is so convoluted it actually makes the whole thing a
| non-starter
| fwip wrote:
| Decentralized control of a centralized finite resource
| (domain names) requires consensus. For example, Joe Smith
| and Joe Blow both want joe.com.
|
| You want a protocol that gives consistent "global" state
| without any centralized / trusted users -
| blockchain/bitcoin is one of the only technical solutions
| to provide that.
|
| I agree that it's a garbage solution in practice, but
| that's why it's got cryptoshit bundled in.
|
| A potential different solution to DNS monopoly, if that is
| a problem that needs solving, is multiple name-resolution
| providers that have differing records on what name points
| where. (The tradeoff is that an owner may need to register
| their name with multiple different providers).
| sakisv wrote:
| Depending on what point you draw the line of "single point of
| failure" you could use multiple providers for your dns.
|
| GOV.UK for example uses both aws and gcp for DNS
| davidjgraph wrote:
| So, NS entries pointing to both? But then take the example
| your domain was in Route53 and AWS goes down. You can't
| configure the NS entries to avoid AWS DNS servers. Is the
| idea that child DNS servers detect the outage and cache the
| values in the name server(s) that remain up?
|
| But then, the cached values from AWS take a while to clear,
| TTL never seems to be applied properly. It always feels like
| the worst case in such a scenario is you can point everyone
| at the right thing within 24 hours.
| corobo wrote:
| Have them all hot and live rather than any sort of failover
| system. Keep everything in sync with OctoDNS or similar
|
| https://github.com/octodns/octodns
|
| DNS is fastest first* rather than main/failover. If AWS DNS
| was down your GCP DNS would have replied (if all is well)
| sooner than {timeout} so your visitor would still have a
| response
|
| * Sort of. I think if the client doesn't get a reply from
| the server it picked randomly in 1s they move on to the
| next server, repeat until all fail
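|
| The retry behavior described in that footnote, sketched under
| the assumption that send_query is a hypothetical callable that
| raises TimeoutError after its own timeout (the 1s mentioned
| above would live inside it). Real resolvers differ in the
| details:

```python
import random


def query_with_failover(nameservers, send_query, rng=random):
    """Try the nameservers in random order and return the first answer."""
    order = list(nameservers)
    rng.shuffle(order)
    for server in order:
        try:
            return send_query(server)
        except TimeoutError:
            continue  # this provider is not answering, move on to the next
    raise TimeoutError("all nameservers failed")


def fake_send_query(server):
    # Pretend the first provider is down and the second one answers.
    if server == "ns.provider-a.example.":
        raise TimeoutError(server)
    return "192.0.2.10"


print(query_with_failover(
    ["ns.provider-a.example.", "ns.provider-b.example."], fake_send_query))
```

| With both providers listed in the NS set there is no failover
| step to trigger; the healthy one just ends up answering.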
| NotEvil wrote:
| I think if Route53 was down, your DNS provider wouldn't be
| able to go there. So it would go to the root, which would give
| the GCP one too. So your DNS provider might try that.
|
| (I don't know if this is how it works, but I think that's
| how it's supposed to work)
| zxcvbn4038 wrote:
| You typically have four name servers for a domain, but
| they don't all have to be hosted with the same company.
| Very handy when your DNS provider decides to brag they
| are unhackable and the hackers reply by immediately
| hacking them followed by DDoSing them to death.
| tpetry wrote:
| You set both services in your ns records. So every day they
| share the load for dns resolution. If one day one of them
| is down the client can/will use a different nameserver from
| your configuration.
| wongarsu wrote:
| Configuring two NS entries is pretty standard, so surely
| most resolvers try one of the two, and if it's down try the
| other one? What else would be the point of having multiple
| nameservers? Then you just have to get two nameserver
| providers and make sure their settings stay synced, and
| point your domain to one nameserver from each.
|
| Of course that requires the server to properly fail, i.e.
| stop responding to requests. That doesn't seem to be the
| case here
| paradite wrote:
| Last time I tried setting NS to both cloudflare and digital
| ocean in my domain registry, cloudflare sent me an email
| saying the configuration is invalid and asked me to revert.
| Am I doing something wrong?
| gregsadetsky wrote:
| gov.uk's traffic seems to be handled by Fastly, a well known
| CDN.
|
| What I'm a bit surprised / unsure of is what happens when I
| run "dig ns gov.uk". The results are:
|
|   gov.uk. 21559 IN NS ns1.surfnet.nl.
|   gov.uk. 21559 IN NS auth50.ns.de.uu.net.
|   gov.uk. 21559 IN NS ns3.ja.net.
|   gov.uk. 21559 IN NS ns2.ja.net.
|   gov.uk. 21559 IN NS ns0.ja.net.
|   gov.uk. 21559 IN NS auth00.ns.de.uu.net.
|   gov.uk. 21559 IN NS ns4.ja.net.
|
| Who is ja.net , uu.net and surfnet.nl ..?
|
| EDIT: I see that ja.net i.e. jisc.ac.uk "manages the second
| level domain .gov.uk" -- https://www.jisc.ac.uk/domain-registry .
| I imagine that uu.net and surfnet.nl are there for redundancy
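|
| To see at a glance how many independent operators serve a
| zone, the dig answer above can be grouped by the nameserver's
| parent domain. A rough sketch: `group_nameservers` is a
| made-up helper, and taking the last two labels as "the
| operator" is a naive assumption that breaks for suffixes like
| co.uk.

```python
from collections import defaultdict

# The NS answer quoted in the comment above, one record per line.
DIG_OUTPUT = """\
gov.uk. 21559 IN NS ns1.surfnet.nl.
gov.uk. 21559 IN NS auth50.ns.de.uu.net.
gov.uk. 21559 IN NS ns3.ja.net.
gov.uk. 21559 IN NS ns2.ja.net.
gov.uk. 21559 IN NS ns0.ja.net.
gov.uk. 21559 IN NS auth00.ns.de.uu.net.
gov.uk. 21559 IN NS ns4.ja.net.
"""


def group_nameservers(dig_text):
    """Group NS targets by a naive guess at the operating domain."""
    groups = defaultdict(list)
    for line in dig_text.splitlines():
        fields = line.split()
        # Expect: name, TTL, class, type, target
        if len(fields) != 5 or fields[3] != "NS":
            continue
        host = fields[4].rstrip(".")
        operator = ".".join(host.split(".")[-2:])  # naive: last two labels
        groups[operator].append(host)
    return dict(groups)


for operator, hosts in group_nameservers(DIG_OUTPUT).items():
    print(f"{operator}: {len(hosts)} nameserver(s)")
```

| Three distinct operators show up for gov.uk, which is the
| multi-provider setup being discussed in this thread.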
| PaywallBuster wrote:
| whois ja.net
|   Domain Name: JA.NET
|   Registry Domain ID: 499794_DOMAIN_NET-VRSN
|   Registrar WHOIS Server: whois.demys.com
|   Registrar URL: http://www.demys.com
|
| "Demys is a leading provider of corporate domain name
| management and an ICANN accredited registrar"
|
| whois uu.net
|   Domain Name: UU.NET
|   Registry Domain ID: 5486163_DOMAIN_NET-VRSN
|   Registrar WHOIS Server: whois.markmonitor.com
|
| surfnet is just an ISP in Netherlands
|
| https://www.surf.nl/
| gregsadetsky wrote:
| Thanks
|
| Is it possible to see if/where is gov.uk using GCP or AWS
| for its domain zones? From what I can see -- that's not
| the case? Or am I looking in the wrong place?
| PaywallBuster wrote:
| I think you did the right query, maybe they're using it
| for different domain names?
| grishka wrote:
| And then there are Cloudflare and other Centralized Downtime
| Networks as another point of failure.
| andoma wrote:
| Loled at this.
| toddh wrote:
| You can still hardcode IP addresses. Not sure most people
| realize DNS isn't actually needed, you know, except for
| convenience and all that.
| tyingq wrote:
| The "Host:" header in http[s] pretty much killed that. Half
| the internet would be a Cloudflare error page if we moved
| back to ip addresses :)
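|
| A toy illustration of why the Host header matters: many sites
| share one IP address, and the server picks the site from the
| header. `SITES`, `build_request` and `route` are made up for
| this sketch and do not reflect any real CDN's logic.

```python
# Hypothetical sites that all live behind the same IP address.
SITES = {"a.example": "site A", "b.example": "site B"}


def build_request(hostname, path="/"):
    """Build a minimal HTTP/1.1 request; Host carries the site name."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def route(raw_request):
    """Pick a site from the Host header, as a virtual-hosting server would."""
    head = raw_request.decode("ascii").split("\r\n\r\n", 1)[0]
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return SITES.get(headers.get("host"), "error: unknown host")


print(route(build_request("a.example")))     # request made by name
print(route(build_request("203.0.113.10")))  # request made by bare IP
```

| The request by bare IP reaches the same machine, but the
| server has no way to tell which site was wanted.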
| citrin_ru wrote:
| It is relatively easy to make DNS highly redundant: just put
| multiple DNS servers in data centers which are as independent as
| possible (different geo locations, different ISPs). You can
| also use different DNS software and different OSes (say
| BSD+Linux) to exclude correlated bugs. Root DNS servers AFAIK
| use different software for this reason.
|
| Problems start when you want to make frequent changes easily
| and introduce complex software to manage DNS zones (and
| complexity usually comes with bugs).
| foobarbazetc wrote:
| Absolutely amazing how many billion $+ companies are single homed
| for DNS.
|
| I wonder how much they spend on multi-AZ redundant
| architectures...
| zxcvbn4038 wrote:
| Most CDNs offer huge incentives for sending them more traffic;
| a lot of the time you end up in a contract obligated to handle X
| requests and Y gigabytes of traffic per month. But personally I
| believe you should never have a single provider for anything -
| particularly when it's acceptable for a company to cut you off
| with no warning or recourse.
| nexuist wrote:
| Might be survivorship bias. Multi-AZ arch protects against all
| other failures, so the only one that remains visible to the
| outside world is DNS.
| toast0 wrote:
| Using multiple providers for mostly static DNS is easy, pick
| one as primary and AXFR to the other and notifications and
| whatever. Or it's not too hard to keep a zone file in source
| control and sync it to the providers.
|
| Using multiple providers for fancy DNS, like only providing IPs
| that pass healthchecks or geotargetting users to datacenters
| gets pretty hard, because the different providers have similar
| capabilities, but no uniform interface, so you've either got to
| do it manually, or you have to build out your own abstraction
| that is probably limiting.
|
| If possible, insourcing DNS makes the most sense to me, because
| if you can't keep your service online, it's not the worst if
| your DNS is offline; and if you can keep your service online,
| you probably won't mess up your DNS too badly.
| jfoutz wrote:
| So much this. Keeping feature by feature parity is the tricky
| part.
| orblivion wrote:
| So here's a weird question: Supposing companies multi-home for
| DNS, or whatever other essential service, via multiple service
| providers.
|
| Whatever multi-home means, why can't there just be one service
| provider that does _that_? And are we sure that these service
| providers aren't already doing that as best we might hope for?
| (For instance, Amazon already has multiple zones, etc.)
|
| I suppose the one thing this can't protect against is some sort
| of political (broadly defined) threat related to the company
| itself.
| lxgr wrote:
| > Whatever multi-home means, why can't there just be one
| service provider that does that?
|
| Many of these outages are due to pushing broken artifacts or
| configuration to production.
|
| A single provider can pretty easily offer geographic or
| network topological redundancy, but administrative and/or
| technological independence is pretty hard to achieve in a
| single company.
| orblivion wrote:
| I mean, I guess what I'm saying is that in theory a single
| provider could purposely keep two different departments
| that manage their own artifacts independently.
| knute wrote:
| I believe EasyDNS can automatically push DNS settings to
| Route53 to host DNS in AWS. Doesn't protect you from fat-
| fingering a change, but you should be resilient to either
| EasyDNS or Route53 going down.
|
| https://kb.easydns.com/knowledge/easyroute53/
| tru3_power wrote:
| Any idea on cause? Ddos or hardware failure?
| MrRadar wrote:
| Widespread issues like this on major CDNs tend to be
| configuration errors.
| tootie wrote:
| Cloudflare seems to be struggling too. Not sure if they have
| some dependency on Akamai or if this portends something much
| worse
| aliswe wrote:
| Not only that, their support telephone line (in Sweden) was down
| as well
| twalichiewicz wrote:
| Posted this in the thread about the travel websites being down,
| but it seems Fidelity is entirely impossible to sign in to / trade
| on right now.
| sebyx07 wrote:
| The good parts of centralisation
| [deleted]
| conqrr wrote:
| Affecting Airbnb search
| SandroG wrote:
| Is this related to:
|
| Multiple websites including DraftKings, Airbnb, FedEx, Delta and
| others appear to be experiencing issues.
|
| https://www.bloomberg.com/news/articles/2021-07-22/multiple-...
| testplzignore wrote:
| Strange thing about the duration of this outage... From logs I
| have, it seems to have lasted _exactly_ one hour, from 15:38 to
| 16:38. Their Twitter account also said "disruption lasted up to
| an hour", though they incorrectly said it started at 15:46 (did
| it take 8 minutes for their monitoring to notice?).
|
| That makes me think that whatever the fix was, it had to wait for
| some one-hour cache to expire before it took effect. I'm very
| interested to find out what the cache issue was, more so than
| what the original bug was.
| xyzzy21 wrote:
| And people wonder why I try to avoid depending on online
| anything...
| delgaudm wrote:
| Lastpass is down, so if you use lastpass the effect is
| significantly compounded.
| mcintyre1994 wrote:
| Do they not cache everything locally? I'd have thought a
| password manager/secure data store would work offline.
| stusmall wrote:
| They do.
| sammy2244 wrote:
| Having your passwords only accessible by internet is a stupid
| idea anyway
| nonfamous wrote:
| It still works in offline mode. You can't update passwords, but
| you can retrieve them.
| compscistd wrote:
| To enable offline mode, I had to turn on airplane mode on my
| phone before logging in.
| soheil wrote:
| App Store on MacOS is down!
___________________________________________________________________
(page generated 2021-07-22 23:02 UTC)