[HN Gopher] Netlify Status - CDN Issues
       ___________________________________________________________________
        
       Netlify Status - CDN Issues
        
       Author : yabones
       Score  : 76 points
       Date   : 2021-03-25 14:54 UTC (8 hours ago)
        
 (HTM) web link (www.netlifystatus.com)
 (TXT) w3m dump (www.netlifystatus.com)
        
       | bobfunk wrote:
       | Netlify CEO here. I'll try to answer the questions from the
       | thread so far:
       | 
       | Some of our customers are affected by an outage of Google's Load
       | Balancer.
       | 
       | These customers are not taking advantage of our DNS management,
       | or they are not using a DNS provider that supports CNAME
       | flattening and are using their root domain name for their website
       | (ie, no www prefix).
       | 
       | While we don't recommend the setup, we do provide a single IP
       | address to bind an A record to, for customers that want it.
       | 
       | In general we run our edge infrastructure as a large multicloud
       | setup spanning several different network providers, and offer two
       | separate networks, one for free/self-serve customers that will
       | get newer features faster and one for enterprise customers
       | running mission critical projects where we guarantee very high
       | uptime and reliability through formal SLAs.
       | 
       | The single IP mentioned above however corresponds to a Google
       | Load Balancer, and they are unfortunately currently having an
       | outage for all load balancers in the relevant region. Read more
       | on https://status.cloud.google.com/
       | 
       | Again, while we generally don't recommend using the A record
       | setup for anything mission critical, we are currently doing
       | everything we possibly can to help enterprise customers that have
       | chosen this setup to change their configuration.
       | 
       | Really sorry for all the trouble this is causing for our users,
       | full RCA will be forthcoming.
        
         | WORMS_EAT_WORMS wrote:
         | Are you all having shake ups within the company? I'm not going
         | to deep dive, but I heard some rumors about some higher ups
         | leaving.
         | 
         | After the Cloudflare Pages release, I'd be curious of what your
         | future road map looks like and how you all plan to compete and
         | grow.
         | 
         | Thanks for all you and your team do. What you have done for
         | front-end development and the community has been nothing but
         | awesome and inspiring.
        
         | sbr464 wrote:
         | Depending on your config, another DNS related issue with
         | Netlify is the way NS1.com (their vendor) handles domain names.
         | A domain can only be added to one NS1 account. So if Netlify
         | adds to their account internally, you can't use NS1 and vice
         | versa.
        
         | ukulele wrote:
         | Honestly, "not taking advantage of our DNS management" is a
         | garbage response. We use AWS for our DNS management. If you
         | offer a configuration, you should support it fully.
         | 
         | Our sites have been down for 3 hours now, and you're blaming
         | someone else? We have 5 properties on Netlify now and will have
         | 0 this time next week.
        
           | gregsadetsky wrote:
           | Point your apex domain to 75.2.60.5, Netlify recommends it
           | here [0] and in their documentation now [1].
           | 
           | I just did for a site that's hosted by Netlify and it solved
           | the issue. Thankfully I had a short TTL, I hope you do too.
           | 
           | [0] https://www.netlifystatus.com/
           | 
           | [1] https://docs.netlify.com/domains-https/custom-
           | domains/config...
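
Editor's sketch: one way to confirm that an apex domain has actually been
moved to the workaround address mentioned above. This is a minimal Python
example under stated assumptions, not anything published by Netlify. The
helper names (resolve_a_records, points_only_at) are made up for
illustration; only the address 75.2.60.5 comes from the thread.

```python
import socket
from typing import Iterable, List

# The workaround address quoted in the thread.
WORKAROUND_IP = "75.2.60.5"


def resolve_a_records(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname.

    The answer may come from a cache, so it can lag behind a record change
    until the old record's TTL has expired.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def points_only_at(addresses: Iterable[str], expected_ip: str = WORKAROUND_IP) -> bool:
    """True when at least one address was returned and all of them equal expected_ip."""
    found = set(addresses)
    return bool(found) and found == {expected_ip}
```

Passing the result of resolve_a_records for your own apex domain to
points_only_at performs a live lookup, so a False result shortly after a
change may only mean a cached answer has not expired yet.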
        
           | WORMS_EAT_WORMS wrote:
           | I'm not sure your organization's setup with Netlify but isn't
           | the whole point of Serverless to be... "serverless"? I could
           | migrate twice the amount of properties you have to another
           | provider in less than 3 hours...
           | 
           | I get your frustration but maybe cut them some slack. If
           | anything is mission critical, you should have had a backup
           | plan in case Netlify, Vercel, Cloudflare, or something else
           | goes down.
        
             | ukulele wrote:
             | We use(d) Netlify for the frontend. I agree, our mistake
             | was believing Netlify could be used for more than toy
             | websites and took care of backup plans for us. Clearly they
             | do not.
        
               | WORMS_EAT_WORMS wrote:
               | I do believe you to be trolling now by saying that. If
               | not, congrats on the valuable lesson!
        
               | ukulele wrote:
               | Not trolling, just very frustrated. But yes a valuable
               | lesson.
        
               | sanedigital wrote:
               | What's keeping you from migrating your frontends?
               | Shouldn't that take a couple of hours at worst?
        
               | huy-nguyen wrote:
               | It's not just migrating the front-end if they're also
               | using other functionalities like Netlify functions,
               | forms, authentications etc. Netlify is not just static
               | file hosting.
        
           | oji0hub wrote:
           | > Our sites have been down for 3 hours now, and you're
           | blaming someone else?
           | 
           | Well if the issue is at Google then maybe "blaming" isn't
           | really the right word. No need to be rude.
           | 
           | I might as well make the same argument for your sites.
           | 
           | - Your sites have been down for 3 hours now, and you're
           | blaming someone else?
        
             | ukulele wrote:
             | Yes, it is our fault for believing Netlify had contingency
             | plans as hosting is their core business. We're fixing this
             | mistake now so that our customers don't have the same
             | experience.
        
               | oji0hub wrote:
               | By the same line of reasoning, your customers could be
               | faulted for believing you had a contingency plan.
        
               | evrydayhustling wrote:
               | Nobody is telling parent's customers how to feel. But the
               | OP suggests that Netlify customers should be faulted for
               | choosing the wrong setup. Broken trust goes all the
               | way down the chain, which is why the middle links have
               | every reason to get ticked off.
        
               | 1123581321 wrote:
               | The difference is that Netlify communicated the risks to
               | its customers, something other parts of the chain
               | apparently did not do, in addition to not evaluating the
               | risks presented to them by Netlify.
        
               | evrydayhustling wrote:
               | Did you read the docs [1] before writing this? Putting a
               | "(recommended)" on one branch of configuration
               | instructions isn't the same as saying that the other
               | option has a single point of failure. Also, people on
               | both sides of a service don't have the same
               | responsibilities - that's the whole point of the service.
               | 
               | Communicating about risks OR outages are both hard, and
               | every company has both. I'm actually a happy (though
               | impacted) Netlify customer. But it's completely bizarre
               | to me to try to invalidate this customer's complaint.
               | 
               | [1] https://web.archive.org/web/20200303050851/https://do
               | cs.netl... (search "flattening")
        
               | 1123581321 wrote:
               | Yes, I've visited that page before today. I admit my
               | familiarity with these DNS setups may have made the
               | tradeoff jump out at me. No problem invalidating the
               | complaint.
        
         | foxbarrington wrote:
         | I switched to using your DNS to resolve this issue, but
         | https://js.la is still busted and because I'm using your DNS, I
         | can't manually set the A record to go to the workaround IP
         | address.
        
         | instakill wrote:
         | "These customers are not taking advantage of our DNS
         | management"
         | 
         | You're right. I'm using Cloudflare's DNS. I trust them more
         | than I trust Netlify and that's just a function of their size
         | vs Netlify's size. This response needed better wording.
        
           | bobfunk wrote:
           | Cloudflare DNS supports CNAME flattening and you won't be
           | needing the fixed IP address if setting up DNS with them.
        
             | _fool wrote:
             | More details for folks who are curious about optimal config
             | using Cloudflare's DNS hosting, can be found here:
             | https://answers.netlify.com/t/support-guide-which-are-
             | some-g...
        
         | martin_bech wrote:
         | Hi Bob, just want to say, I like your service a lot.
        
           | bobfunk wrote:
           | Thanks! Appreciate the kind words!
        
             | StavrosK wrote:
             | Seconded, I use it for all my static hosting. Great
             | service.
        
         | nrmitchi wrote:
         | > These customers are not taking advantage of our DNS
         | management
         | 
         | I think I understand the point you are trying to make, that
         | customers who are utilizing Netlify DNS Management are
         | unaffected because _reasons_ , but this is phrased in a way
         | that implies that it is your users' fault for this downtime
         | because they didn't choose to use your related service.
        
           | foxbarrington wrote:
           | Sadly, even after switching to their DNS I am still affected.
        
             | _fool wrote:
             | This should not be the case; if you'd like, Netlify's
             | Support team will be happy to review your settings to help
             | discover why it didn't help you out (start from
             | https://netlify.com/support) and ensure that you are
             | "futureproofed"!
        
             | [deleted]
        
           | ohadpr wrote:
           | Phrasing can always be better but the point is that there's a
           | way to map your DNS to Netlify which is risky and Netlify
           | hasn't made the aggressive decision of blocking it. They
           | outline in their docs all the reasons why you shouldn't do
           | it, provide instructions for how to avoid it and also offer
           | (but do not require) a hosted DNS setup which avoids this
           | pitfall by design.
           | 
           | Some folks still choose to use this way, some have no other
           | choice for various reasons and some don't care/comprehend the
           | potential pitfalls. I do believe most users avoid using a
           | root domain name for their website.
        
             | gtirloni wrote:
             | _> I do believe most users avoid using a root domain name
             | for their website._
             | 
             | This is where you're definitely wrong.
        
               | ohadpr wrote:
               | I could be. Are you saying this based on data or
               | intuition?
        
           | bobfunk wrote:
           | Full RCA with the steps the team has taken to improve this
           | setup will be coming soon. The main issue with AWS's DNS
           | solution, in this context, is that they don't support ALIAS
           | records or similar techniques (CNAME flattening, etc) for A
           | records pointing to any external provider. That limits our
           | options a lot in terms of what we can do, since anyone using
           | this setup needs to point all their traffic to one or more
           | fixed IP addresses.
           | 
           | Our current solution for the free/self-serve tier of Netlify
           | has been to rely on Google's load balancer product to give
           | people a stable IP pointing to a highly available solution.
           | In light of recent issues, our team has set up a new permanent
           | IP for A records (75.2.60.5) backed by a different solution,
           | but due to the way DNS providers with no ALIAS record support
           | work, it does require our customers to manually change their
           | A records.
           | 
           | I totally get that moving DNS providers is a big deal and we
           | want to give the best experience we can regardless of what
           | provider you're on, but we have to work within the technical
           | limitations of those providers and it's the nature of things
           | that we do have more options to deliver a completely seamless
           | experience when we operate both the DNS and the edge layer
           | for customers.
        
         | gcbirzan wrote:
         | This could've been avoided with an HTTP LB, vs a L4 one...
        
       | tyingq wrote:
       | _" To make sure you can minimize the impact of our single-homed
       | loadbalancer being down"_[1]
       | 
       | Interesting. I'm surprised that's how their CDN works.
       | 
       | [1] https://answers.netlify.com/t/support-guide-minimizing-
       | impac...
        
       | juliansimioni wrote:
       | This has been the third major outage for Netlify in the last few
       | weeks.
       | 
       | I like the company, they have good people on their team, and
       | their interface and functionality are great (deploy previews are
       | so nice!).
       | 
       | But this is probably the last straw, as the static portion of our
       | company's website has been down for 45 minutes now.
       | 
       | Fortunately, the beauty of a static site is they're quite easy to
       | host anywhere.
       | 
       | We're already on AWS, and it's easy enough to set up CloudFront.
       | It won't be _quite_ as quick to deploy but it will probably
       | rarely if ever break. Guess that's my task for the day :(
        
         | avianlyric wrote:
         | You should check out Cloudflare Pages. For static stuff it's a
         | dream to set up, and you get previews out of the box.
        
           | riffic wrote:
           | Cloudflare Pages is in beta and right now lacks many features
           | that Netlify has, while also containing some showstopping
           | issues
           | (https://developers.cloudflare.com/pages/platform/known-
           | issue...)
        
           | juliansimioni wrote:
           | The main advantage that AWS will have for us over anything
           | else is that, since AWS already manages our DNS, we are going
           | to be able to offer our visitors the best performance by
           | using geo-specific IP addresses.
           | 
           | The static site in question for us lives at the apex record
           | (mywebsite.com), so it's generally not possible for other
           | providers to do this without having them manage our entire
           | DNS infrastructure, which we aren't willing to do.
           | 
           | In fact I think this is part of why we've had so many issues
           | with Netlify. It's clear their preferred way to host apex
           | domain sites is to manage the DNS completely.
        
         | antihero wrote:
         | Why wouldn't CloudFront be quite as quick, out of interest?
        
           | nicoburns wrote:
           | I think they're talking about the setup/configuration time.
           | Netlify is pretty much one click.
        
         | ukulele wrote:
         | We're in the same boat, and when Netlify blames an "upstream
         | provider", what I hear is that they don't have a backup plan.
         | 
         | They're giving us plenty of time to research alternatives while
         | our site is down though.
        
           | celsoazevedo wrote:
           | It seems that a lot of people didn't have backup plans
           | either. I'm not sure if it's a good idea to rely on just one
           | provider for something critical.
        
           | michaelmior wrote:
           | Based on the comment from bobfunk, it seems they do have a
           | backup plan but not all customers use a configuration which
           | takes advantage of it.
        
             | notyourday wrote:
             | A backup plan that does not service apex domains is not
             | really a backup plan.
        
               | michaelmior wrote:
               | It sounds like it does handle apex domains, but only if
               | you're using Netlify DNS or a provider which supports
               | CNAME flattening. Assuming the potential problems with
               | not doing so are disclosed during setup (not sure if they
               | were) that actually seems pretty reasonable to me.
        
               | _fool wrote:
               | I run the Netlify Support team, and this statement from
               | @michaelmior is correct: apex domains are served using
               | redundant, global CDN if you use Netlify's DNS hosting,
               | or Flattened CNAMEs from Cloudflare.
        
         | pscanf wrote:
         | Shameless plug, I can suggest StaticDeploy
         | (https://staticdeploy.io/) as an open source, self-hosted
         | alternative to Netlify, which can give you a similar deployment
         | workflow.
         | 
         | It's definitely possible to just host directly on
         | S3/CloudFront, but StaticDeploy sets you up quickly with a
         | workflow and a dedicated management interface.
         | 
         | Disclaimer: I'm the main developer of the project.
        
           | jgrahamc wrote:
           | Honestly, don't do this. Netlify is having a bad day and it's
           | not fun for them. The great wheel of karma turns around
           | slowly and one day it'll be your turn to have a bad day.
           | 
           | Submit StaticDeploy to HN some other day and tell us about
           | it. Sounds cool.
        
             | pscanf wrote:
             | Thanks for pointing this out, I admit I did not consider
             | their point of view (the plug was shameless, but in the
             | self-promotion-is-inherently-shameful sense), and I agree
             | it's in bad taste.
        
             | bigsparky wrote:
             | Cloudflare CTO telling off the open source project guy,
             | while ignoring all the other comments suggesting
             | cloudflare?
        
               | jgrahamc wrote:
               | There's a difference between self-promotion and other
               | people making a suggestion.
        
       | riffic wrote:
       | oof, I got bit by this issue this morning. if you're using
       | cloudflare, set your domain's apex (`@`) as a CNAME pointing to
       | the default subdomain (sitename.netlify.app) and use CNAME
       | Flattening. It's the A record pointing to the CDN IP address
       | that's broken.
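
Editor's sketch: a toy model of what CNAME flattening does, to make the
advice above concrete. The DNS provider follows the CNAME chain itself and
answers queries for the apex with the A values found at the end, so the
apex tracks whatever addresses the target currently publishes. This is a
simplified Python illustration, not Cloudflare's implementation; the
Records type, the flatten function, and the 192.0.2.x documentation
addresses are assumptions for the example.

```python
from typing import Dict, List, Tuple

# (name, record type) -> values, e.g. ("example.com", "CNAME") -> ["sitename.netlify.app"]
Records = Dict[Tuple[str, str], List[str]]


def flatten(name: str, records: Records, max_depth: int = 8) -> List[str]:
    """Follow CNAMEs starting at name and return the A values at the end of the chain.

    Returns an empty list when the chain ends without any A record, and raises
    ValueError when the chain is longer than max_depth hops (for example a loop).
    """
    current = name
    for _ in range(max_depth):
        a_values = records.get((current, "A"))
        if a_values:
            return list(a_values)
        targets = records.get((current, "CNAME"))
        if not targets:
            return []
        current = targets[0]
    raise ValueError(f"CNAME chain from {name} exceeds {max_depth} hops")


# Hypothetical zone data: the apex is a CNAME to the site's default subdomain.
example_records: Records = {
    ("example.com", "CNAME"): ["sitename.netlify.app"],
    ("sitename.netlify.app", "A"): ["192.0.2.10", "192.0.2.11"],
}
apex_answer = flatten("example.com", example_records)
```

With a hard-coded A record, by contrast, the apex keeps pointing at the
same address until someone edits the record by hand, which is the failure
mode described in this thread.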
        
       | danr4 wrote:
       | Hmmm, lots of recent outages.
       | 
       | Has anyone else noticed erratic response times in recent months?
       | My web vitals score sometimes dips heavily because "response time
       | from server" (or whatever it's called).
       | 
       | Anyone can recommend another place where I can host? (except
       | Vercel, which has similar results)
        
         | jpbow wrote:
         | Cloudflare Pages, begin.com, fly.io, and surge.sh are a few
         | I've come across as alternatives (depending on what features
         | you're after).
        
         | indigodaddy wrote:
         | Firebase?
        
       | gregsadetsky wrote:
       | Update on the Netlify Status page [0] -- TLDR anyone experiencing
       | this issue should point their apex domain to 75.2.60.5
       | 
       | ---
       | 
       | Full announcement:
       | 
       | Our team have created a new load balancer instance which is not
       | associated with the upstream provider who is currently
       | experiencing issues. Please update A record values for your
       | site(s) bare domain to 75.2.60.5 to mitigate against this outage.
       | 
       | ---
       | 
       | Their documentation page [1] now includes the same IP.
       | 
       | [0] https://www.netlifystatus.com/
       | 
       | [1] https://docs.netlify.com/domains-https/custom-
       | domains/config...
        
         | swyx wrote:
         | thanks for the TLDR - i pointed it over and can verify that it
         | fixes the problem. annoying to wait out the caching on mobile
         | devices though, i am not sure how to clear DNS cache on mobile
         | but am not too bothered.
        
       | GaltMidas wrote:
       | This got me today. Probably could be characterized as the classic
       | case of: Company says a certain use case is unsupported, but
       | tries hard to accommodate users who are stuck with the
       | unsupported use case, so they hack up a decent work around under
       | the hood. Then the technically unsupported use case blows up, so
       | they then have to scramble to support it with a quick work
       | around...for the workaround.
       | 
       | The workaround worked. I think at this point it makes me more
       | likely to keep using Netlify. I love the product. And I think I
       | love the support for unsupported un-recommended feature that they
       | supported today.
       | 
       | Thanks Netlify Ops!
        
       | notyourday wrote:
       | Whom are they using? In 2021 for static sites behind a CDN being
       | down is... odd. Pretty much all CDNs by now should support
       | equivalent of serve stale.
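
Editor's sketch: what "serve stale" means in practice. When an entry has
expired and the origin cannot be reached, the cache returns the last copy
it has instead of an error. This is a minimal in-memory Python
illustration with a hypothetical ServeStaleCache class; it does not
describe how any particular CDN implements the feature.

```python
import time
from typing import Callable, Dict, Tuple


class ServeStaleCache:
    """Cache that falls back to an expired copy when the origin fetch fails."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (stored_at, body)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh, no origin request needed
        try:
            body = self._fetch(key)
        except Exception:
            if entry is not None:
                return entry[1]  # origin unreachable: serve the stale copy
            raise  # nothing cached for this key, so the error propagates
        self._entries[key] = (now, body)
        return body
```

Note that this only helps when requests still reach the cache. It does
nothing if the address clients connect to is itself unreachable.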
        
         | swyx wrote:
         | think this is a DNS issue, not CDN. bobfunk noted it is Google,
         | scroll up
        
           | notyourday wrote:
           | So they implemented their own CDN-ish thing on top of GCP
           | without doing anycast and serve stale and they have a
           | non-trivial number of non-mom-and-pop customers?!
        
       | jayp wrote:
       | They and Vercel both seem to be going thru a lot of growing
       | pains. Quite a few outages over the past year.
       | 
       | PS: we use both (for different sites). Probably should
       | consolidate to one.
        
         | [deleted]
        
         | WORMS_EAT_WORMS wrote:
         | I'm not sure about the future in this space, but I kind of just wish
         | Cloudflare and Vercel would join forces. It would make sense to
         | me.
        
           | jpbow wrote:
           | Cloudflare is already on the way to building out all of the
           | features Vercel has which is exciting. Eventually their Pages
           | product (static hosting) will integrate Workers [1] for
           | serverside APIs.
           | 
           | [1] https://blog.cloudflare.com/cloudflare-pages/#oh-and-one-
           | mor...
        
             | WORMS_EAT_WORMS wrote:
             | They lack a framework and crystal clear way to start
             | developing apps. Nothing exists that taps their features
             | beyond maybe Flarereact.
        
         | parhamn wrote:
         | I always assumed there is an 'underwriter' (like fastly or
         | akamai) for these CDNs. Is that not the case?
        
           | [deleted]
        
           | benatkin wrote:
           | Perhaps they're using CloudFront and providing an end user
           | CDN on top of it. Both are partners with AWS. Both were
           | multicloud, and it seems, aren't anymore.
        
           | scrose wrote:
           | I know Vercel uses AWS Lambdas behind the scenes to process
           | web requests at least. I'd assume caching is also handled
           | through CloudFront by default. The place I currently work
           | uses Fastly for caching and Vercel for hosting and it's
           | definitely caused some issues (and much finger pointing on
           | both of their sides) when one of those services makes a
           | breaking change.
        
       | bombcar wrote:
       | What is Netlify?
       | 
       | "An intuitive Git-based workflow and powerful serverless platform
       | to build, deploy, and collaborate on web apps"
        
         | Pfhreak wrote:
         | A quick infrastructure service (from what I understand.) If you
         | are building a JS single page app and want to be able to deploy
         | it and some backing cloud functions without really worrying
         | about CDNs, gateways, etc. Git push, code runs tests, code is
         | deployed, done.
        
       | corytheboyd wrote:
       | I wish Netlify all the best! In the meantime, I just hopped on
       | to Cloudflare and saw their Pages product is in public beta.
       | Seems to work the same as Netlify for static pages, just tried it
       | out for my personal site and it worked great! I was already using
       | Cloudflare as my CDN and to manage DNS, it's actually really nice
       | to have my entire website configuration live there.
        
       | hikerclimb wrote:
       | Hopefully they are never solved
        
       | colinchartier wrote:
       | It must be very stressful to run a production hosting company -
       | DigitalOcean and Cloudflare both went through similar issues in
       | the early days.
        
         | jgrahamc wrote:
         | No comment.
        
           | colinchartier wrote:
           | Glad you overcame it all in the end - we're happy users,
           | especially given the outstanding uptime for the past few
           | years.
        
       | mrzool wrote:
       | "We have identified the issue and it is attributed to an upstream
       | provider."
       | 
       | The upstream issue is probably at Google:
       | 
       | "We are experiencing an issue with L4 load balancers in us-
       | west1-c. Multiple managed services relying on LB and located in
       | this zone might be affected."
       | 
       | https://status.cloud.google.com
       | 
       | This has been going on for at least an hour.
        
       | MaggiD wrote:
       | My startup has also suffered from the recent Netlify outages with
       | our main landing page.
       | 
       | Last time I already did the research for potential alternatives:
       | 
       | - Cloudflare Pages is now available in public beta
       | 
       | - even more interesting seemed this offering by PerfOps to put my
       | CDN behind a Load Balancer that can monitor uptime and
       | dynamically shift traffic between multiple CDN sources:
       | https://perfops.net/flexbalancer
       | 
       | What do you think?
       | 
       | - it seems like the multi cloud approach to CDN
       | 
       | - but at the same time I'll have a problem if this Load Balancer
       | fails (single point of failure)
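
Editor's sketch: the failover idea described above, reduced to its core.
Given several origins in priority order and a health check, traffic goes
to the first one that is currently healthy. This is a minimal Python
illustration and does not describe how the PerfOps product works; the
function name pick_origin and the cdn-*.example.net hostnames are
hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_origin(
    origins: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first origin, in priority order, that passes the health check.

    Returns None when every origin is unhealthy, so the caller has to decide
    what to do in that case (for example, keep the last known good choice).
    """
    for origin in origins:
        if is_healthy(origin):
            return origin
    return None


# Hypothetical state: the primary CDN is down, the secondary is up.
status = {"cdn-a.example.net": False, "cdn-b.example.net": True}
chosen = pick_origin(["cdn-a.example.net", "cdn-b.example.net"], lambda o: status[o])
```

The component that runs this selection is itself a dependency, which is
the single-point-of-failure concern raised in the comment.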
        
         | SahAssar wrote:
         | > CDN behind a Load Balancer
         | 
         | This sounds crazy to me. Besides the obvious superfluous
         | network/layer hops, complexity and points of failure it would
         | also partition the cache, right? So working against the very
         | thing CDNs optimize for.
        
       | charlesworth91 wrote:
       | This hit me, as someone using A records to point to the Netlify
       | IPs. If you are using GitHub, I found switching over to GitHub
       | Pages quite easy for my static site. I used this guide:
       | https://docs.github.com/en/github/working-with-github-pages/...
        
       ___________________________________________________________________
       (page generated 2021-03-25 23:02 UTC)