[HN Gopher] Tell HN: AWS appears to be down again
___________________________________________________________________
Tell HN: AWS appears to be down again
Anyone else seeing this?
Author : thadjo
Score : 821 points
Date : 2021-12-15 15:26 UTC (7 hours ago)
| yellowsir wrote:
| npmjs has problems too :(
| yellowsir wrote:
| seems to be up again
| zedpm wrote:
| Wow, yeah, us-west-1 AND us-west-2 are reporting connectivity
| issues. I'm guessing this is related to the Auth0 outage that's
| currently going on too.
| [deleted]
| dhruvarora013 wrote:
| Looks like it's taken down SendGrid, NPM, Twitch, Auth0 so far
| hericium wrote:
| PlayStation Network went down at the same time.
| ents wrote:
| Notion as well
| cyral wrote:
| Stripe as well
| pjf wrote:
| Kentik data on the outage:
| https://twitter.com/DougMadory/status/1471162450649223173
| streetcat1 wrote:
| Remember, every 12 secs take one 9.
| skj wrote:
| eh?
| edoceo wrote:
| It's about calculating the 9s in your uptime. But 365 * 24 *
| 60 * 60 * 0.000001 == 31s (did I get that right?)
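The arithmetic above checks out: 0.000001 is the unavailable fraction at six nines (99.9999%), which works out to roughly 31.5 seconds of downtime per year. A minimal sketch of the same calculation for any number of nines follows; the function name is made up for illustration, and it assumes a 365-day year.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def allowed_downtime_seconds(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines.

    Three nines means 99.9% availability, i.e. an unavailable
    fraction of 10**-3.
    """
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return SECONDS_PER_YEAR * 10.0 ** -nines


if __name__ == "__main__":
    # Print the yearly budget for two through six nines.
    for n in range(2, 7):
        print(f"{n} nines: {allowed_downtime_seconds(n):,.1f} s/year")
```

Each extra nine divides the budget by ten, so a multi-hour outage uses up the yearly allowance of anything stricter than three nines.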
| NicoJuicy wrote:
| Related to Amazon's SLA
| rexreed wrote:
| It's not just AWS - check the down
| reports: https://downdetector.com/
|
| Cloudflare having some significant issues as well on certain
| domains.
| the_pwner224 wrote:
| HN was also (briefly) down around that same time (roughly 1
| hour ago from now).
| PragmaticPulp wrote:
| DownDetector is showing everything down during that period,
| including Google.
|
| I suspect DownDetector itself suffered some outages during this
| period, which it shows as outages of every service it monitors.
| zaltekk wrote:
| That's not how DownDetector works. It just relies on reports
| from users. The real failure case is users not understanding
| why they can't access whatever end service. Maybe they blame
| that service, maybe they blame their ISP, maybe they blame
| something else.
| forgotpwd16 wrote:
| This looks weird. At the same time all those services had a
| spike in outage reports.
| jgrahamc wrote:
| No, we are not. But customers who use AWS are having trouble.
| rexreed wrote:
| Thanks for clarifying! Things seem to have settled down.
| yabones wrote:
| It's possible people are reporting the issue as CloudFlare
| because that's whose error page they see when a box on EC2 is
| unreachable.
| [deleted]
| chasd00 wrote:
| can confirm i have multiple salesforce instances down.
| nerdjon wrote:
| The list of affected services is a bit all over the place,
| especially since I highly doubt Xbox Live or Halo is running on
| AWS.
| s_fischer wrote:
| For the core services? Definitely. But do we really know that
| some 3rd party API which doesn't fail gracefully isn't
| causing this?
| iamricks wrote:
| lol imagine if azure was just AWS in the backend
| nerdjon wrote:
| Is it bad that I can _almost_ see that being a quick and
| dirty MVP to get out the door while you built your own
| cloud solution? Raises serious migration and cost issues,
| but... would be interesting.
| vidarh wrote:
| I think for some targeted things there might well be
| "value added" services you could offer to transparently
| wrap AWS. E.g. a "write-through" S3 wrapper was something
| I was actually looking at because some clients when I was
| contracting were very reluctant to trust anything but AWS
| for durability but at the same time AWS bandwidth costs
| were so extortionate that renting our own servers from
| somewhere like Hetzner and then proxying writes both to a
| local disk and to S3 and serve up from local disk with a
| fallback to pull a fresh copy from S3 if missing broke
| even at a quite small number of terabytes transferred
| each month.
|
| The nice part about something like that is that properly
| wrapped you can change your durable storage as needed,
| and can easily even selectively pick "cheaper but less
| trusted" options for less critical data. It also allows
| you to leverage AWS features to ride closer to the wire.
| E.g. to take another example than storage, I've used this
| to cut the cost of managed hosting by being able to spill over
| onto EC2 instances in the past, allowing you to run at a
| much higher utilisation rate than you safely can on
| managed / colo / on-prem servers alone - as a result,
| ironically the _ability_ to spill over onto EC2 makes EC2
| far less competitive in terms of cost to _actually_ run
| stuff on most of the time.
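The write-through wrapper described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the commenter's actual implementation: `DurableStore`, `InMemoryStore` and `WriteThroughCache` are hypothetical names, and the in-memory store merely stands in for S3 so the example runs without network access or credentials.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class DurableStore(Protocol):
    """Hypothetical interface for the durable backend (e.g. S3)."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for the durable backend so the sketch runs offline."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]


class WriteThroughCache:
    """Writes go to the durable store and local disk; reads prefer disk."""

    def __init__(self, cache_dir: Path, durable: DurableStore) -> None:
        self.cache_dir = cache_dir
        self.durable = durable

    def _path(self, key: str) -> Path:
        # Flat keys only, to keep the sketch short.
        return self.cache_dir / key

    def put(self, key: str, data: bytes) -> None:
        # Durable copy first, so a failed upload never leaves a
        # local-only object that looks safely stored.
        self.durable.put(key, data)
        self._path(key).write_bytes(data)

    def get(self, key: str) -> bytes:
        path = self._path(key)
        if path.exists():
            return path.read_bytes()  # served locally, no egress cost
        data = self.durable.get(key)  # fallback: pull a fresh copy
        path.write_bytes(data)        # repopulate the local cache
        return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = WriteThroughCache(Path(tmp), InMemoryStore())
        cache.put("report.txt", b"hello")
        print(cache.get("report.txt"))
```

Because callers only see `put` and `get`, the durable backend can be swapped per bucket or per data class without touching the serving path, which is the flexibility the comment points at.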
| jon-wood wrote:
| Down Detector doesn't really detect anything other than
| people saying "Is [service X] down?" on Twitter, which does
| mean that Xbox Live appears to be permanently offline if you
| believe them because the typical user for Xbox Live will
| declare anything from tripping over their ethernet cable to a
| tornado levelling their house preventing a connection to mean
| Xbox Live is down.
| subandi wrote:
| If that were true, the line should be flat-ish, but it and
| playstation's show the same extreme spike at the same time
| as aws etc.
| Uehreka wrote:
| It's still useful if you remove units from the graph and
| treat it as a sparkline. If there are reliably ~100 Xbox
| Live complaints on Twitter per hour, then suddenly there
| are 3000, that's an outage.
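The "ignore the units, watch for a jump" idea above can be sketched as a comparison against a trailing baseline. The function name, the 24-hour window and the 5x threshold are all assumptions chosen for illustration; nothing here reflects how Down Detector actually works.

```python
from statistics import median
from typing import List, Sequence


def find_spikes(
    counts: Sequence[int], window: int = 24, factor: float = 5.0
) -> List[int]:
    """Return indices whose count exceeds `factor` times the trailing median.

    `counts` holds complaints per hour. The median of the previous
    `window` hours serves as the "normal" level, so a noisy but
    steady stream of complaints does not trigger anything.
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = median(counts[i - window:i])
        if baseline > 0 and counts[i] > factor * baseline:
            spikes.append(i)
    return spikes


if __name__ == "__main__":
    # A day of ~100 complaints per hour, then a sudden 3000.
    hourly = [100] * 24 + [3000, 110]
    print(find_spikes(hourly))
```

Using the median rather than the mean keeps a single earlier spike from inflating the baseline and hiding the next one.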
| ren_engineer wrote:
| some sort of widescale attack would be the only explanation
| right?
| buryat wrote:
| downdetector.com uses user complaints so it's unreliable as
| people can blame anything
| ramesh31 wrote:
| Auth0 down as well, right at the same time. There goes any sort
| of productivity today. Whole company in firefighting mode.
| ceejayoz wrote:
| We're having troubles in us-west-2.
|
| Discourse is reporting trouble, too.
| https://twitter.com/DiscourseStatus/status/14711403698992906...
| supermathie wrote:
| us-west-1 also seems offline, but us-east-1 (ironically) seems
| fine
| branon wrote:
| Yup. Having issues with IT Glue and Duo here.
| rd0 wrote:
| Duo issues here as well.
| myth_drannon wrote:
| That's the price of PIP culture and burning out your devs. Now
| no one wants to work at Amazon and they can only hire new grads.
| Throwawayaerlei wrote:
| I hear they do get people who want to be able to get experience
| at AWS's scale; there are only a few places for that.
|
| The thing that really gets me is the reports from the last
| major outage a few days ago about how pervasive lying _inside_
| the company is. This really doesn't work well for engineering
| and we're possibly seeing the results of that. We should
| certainly expect to see that becoming visible the more time
| goes on without a major cultural shift. Which given that the
| guy who ran AWS now runs all of Amazon.com....
| swaraj wrote:
| Our IaaS vendor, Aptible, reports us-west-1 is down / throwing
| errors
| evilhackerdude wrote:
| 4 hours in, our AWS IoT endpoint (not ATS, Symantec) in us-west-2
| is still down according to monitoring, PHD and support.
| nic_wilson wrote:
| We are seeing issues with requests to Auth0, which I believe is
| hosted on AWS and has historically gone down when AWS has had
| issues
| ramesh31 wrote:
| Auth0 went down for us as well right when AWS did. At least
| it's not like those two systems run our entire company...
| romanhotsiy wrote:
| We see issues with Auth0 too. Other AWS services we use seem to
| be working fine so far (us-east-1)
| heartbreak wrote:
| AWS is reporting an issue in us-west-2 on their status page.
| [deleted]
| [deleted]
| earthboundkid wrote:
| HOST THE GODDAMN STATUS PAGE ON AZURE FOR FUCKS SAKE.
|
| There is zero excuse for this shit. Be professional. Acknowledge
| reality. It is logically impossible to run your own status page.
| Trying to do so just wastes everyone else on the internet's time
| when you have an outage.
| tommek4077 wrote:
| They should automatically update as well. Currently it is a
| static "all green" page and might be manually changed if a
| manager would give his go-ahead. Insane.
| aaronharnly wrote:
| Seriously.
| jeroenhd wrote:
| They should host their status page on IPFS instead. If you're
| never going to change the contents of your status page, you
| might as well put it into immutable storage!
| Aldipower wrote:
| If the status page is down, you know the system is down.
| Mission accomplished. Go ahead.
| boopboopbadoop wrote:
| You don't even know what the problem is yet. Stop shouting
| solutions.
| OneLeggedCat wrote:
| I kind of think everyone else here understands this very
| particular problem of a status page running on the same
| equipment that it's supposed to be monitoring if that
| equipment goes down, and for whatever reason, you don't.
| boopboopbadoop wrote:
| I understand that. What I'm questioning is whether that is
| the problem here. Is it? Do you know? I heard it might be
| an internet provider issue, in which case the status page
| is not the problem here.
| yunwal wrote:
| The problem is that AWS can't update their status page to
| reflect that there's a problem. This happens during every AWS
| incident without fail.
| boopboopbadoop wrote:
| My point is that you're not even sure that it's AWS's
| problem. I heard that other providers might be affected,
| perhaps meaning it's a network issue.
| hatware wrote:
| Status page looks like it was updated. Seems more like we
| have a lot of impatience on this board.
| twistedpair wrote:
| "Can't" and "won't" are different things.
|
| See discussions from the last outage about the VP signoff
| needed to admit, I mean announce, an outage.
| slig wrote:
| The problem is very clear: the status page is not working as
| it should.
| boopboopbadoop wrote:
| What if the problem is not an AWS problem? My point is that
| you don't know what the problem is, you're assuming.
| mark-r wrote:
| Given the legal liabilities Amazon has with their SLAs, it
| may be working exactly as Amazon thinks it should. Whether
| anybody would agree with that assessment should be obvious.
| r3trohack3r wrote:
| I don't understand, are folks looking at a different status
| page than me?
|
| This morning we saw some weird behavior in us-west-2, our
| traffic just _vanished_. I thought: there is no way this is us.
|
| Went to https://status.aws.amazon.com/
|
| Top of the board showed "Internet Connectivity Issues (Oregon)"
|
| And that was that. The board worked exactly as it should - it
| immediately explained my missing traffic and kept me up-to-date
| with the status of the outage on their side.
| bnt wrote:
| Isn't it on S3 or something? And a few years ago we had that
| whole S3 is down situation and the status page was also down?
| xD
| kwertyoowiyop wrote:
| No, just change it to a statically-rendered page on
| CloudFlare with all green lights. :-)
|
| And that, ladies and gentlemen, is how I passed my system
| architect interview!
| [deleted]
| kp195_ wrote:
| We're having issues connecting to our EC2 bastions and accessing
| the us-west-1 dashboard too
|
| EDIT: Cognito auth seems down for us too
|
| EDIT2: our ALBs are timing out as well
|
| EDIT3: us-west-1 looks like it's working now!
| 300bps wrote:
| I'm on us-east-1 and everything is fine for me including:
|
| * EC2 instances
|
| * AWS Workspaces
|
| * FSx for Windows
|
| * AWS Directory Service
|
| * S3 Buckets
| joelbondurant wrote:
| AWS is the McDonald's of computer hardware. Billions ov
| hamburgers served, so dey must have da best hamburger cooks n da
| world. Decades of corporate insistence on outsourcing every shred
| of hardware talent has left the software industry filled with
| imbeciles.
| samgranieri wrote:
| At least this still works: https://livemap.pingdom.com/
| fy20 wrote:
| Partially, the stats on the right are wrong. For me it shows:
|
| Website outages in the past hour 86,967
|
| Lowest 16,208
|
| Average 16,208
|
| Highest 16,209
| andrew_ wrote:
| Root logins are suffering some kind of "captcha outage." The buzz
| has just begun
| https://twitter.com/search?q=aws%20captcha&src=typed_query
| [deleted]
| waynecochran wrote:
| There was a brief period of time back in the early 90's where I
| felt I understood how Linux worked -- the kernel, startup
| scripts, drivers, processors, boot tools, etc... I could actually
| work on all levels of the system to some degree. Those days are
| long gone. I am far removed from many details of the systems I
| use today. I used to do a lot of assembly programming on multiple
| systems. Today I am not sure how most of the systems work in
| much detail.
| cle wrote:
| To an extent, this is one of the goals, to free up engineers to
| work on higher level things. Whether it meets that goal in some
| cases is debatable, and it's certainly not ideal for us
| engineers who like to get to the bottom of things.
| 10000truths wrote:
| Funny, I feel the exact opposite way. The low level stuff is
| where all the magic happens, where performance improvements
| can scale by orders of magnitude rather than linearly with a
| CTO's budget. I'd much rather figure out how to condense some
| over-engineered distributed solution down to one machine with
| resources to spare.
| someguydave wrote:
| "working on higher level things" currently implies that
| depending on many layers of opaque and unreliable lower level
| hardware and software abstractions is a good idea. I think it
| is a mistake.
| cle wrote:
| The best conclusion I can come to is "sometimes it works,
| sometimes it doesn't". Depends on the context. I've seen
| cases where it works great and other times where it's a
| huge hassle.
| johnisgood wrote:
| And I kept getting "We're having some trouble serving your
| request. Sorry!" on HN for the past 10 minutes or something.
| edoceo wrote:
| Traffic flood to this site for status reports on AWS
| qwertyuiop_ wrote:
| Log4jammed ?
| cebert wrote:
| This outage is extremely frustrating to me. My company hosts all
| our apps in gov cloud. Gov Cloud West 1 is also down, but the AWS
| Gov Cloud status page indicates that everything is healthy and
| green. I thought AWS's incident response to the East outage last
| week was that they'd update the status page to better reflect
| reality.
|
| Gov Cloud Status Page: https://status.aws.amazon.com/govcloud
| texasviking wrote:
| We are in the same boat. Finally updated "We are investigating
| Internet connectivity issues to the US-GOV-WEST-1 Region"
| chasd00 wrote:
| i had multiple govcloud hosted salesforce instances down but
| they appear to be coming back up now.
| account758 wrote:
| AWS Global Accelerator not working correctly anymore as well,
| connections dropped worldwide. Seems like it is managed from
| us-west-2 and not redundant.
| tmvnty wrote:
| Some npmjs.com pages are returning 503 Service Unavailable for us
| rychco wrote:
| Tsheets is also down so I can't clock my hours LOL
| clavicat wrote:
| We are barbarians occupying a city built by an advanced
| civilization, marveling at the hot baths but knowing nothing about
| how their builders keep them running. One day, the baths will
| drain and anyone who remembers how to fill them up will have
| died.
| Waterluvian wrote:
| This has been true for a long time and it is not a bad thing.
|
| It's an easy target to romanticize but realistically, any
| alternative is basically a way of saying: "let's stop
| evolving."
| [deleted]
| tata71 wrote:
| Disagree.
|
| Wanting to evolve differently doesn't mean halting.
|
| It's a call not to devolve.
| Waterluvian wrote:
| I certainly think this is a subjective, "no one right
| answer" discussion topic.
|
| From my perspective, we require abstractions in order to
| free our intellectual capacity up for the next layer of
| complexity.
| tata71 wrote:
| P.S. That Paw Patrol shit hit me right in the emotions.
| Awesome site.
| rhacker wrote:
| I don't buy this. I've written some pretty complicated
| codebases at previous companies that no one knew how to operate
| except for me. After I left those companies they didn't fold or
| lose all their customers. They adapted and everything is fine.
| For whatever reason humans find simplicity through complex
| processes.
| selfhoster11 wrote:
| That's a very dramatic interpretation. In reality, as long as
| Unix greybeards are around, we are safe enough on the "can we
| rebuild it" question.
| meitros wrote:
| There's this classic article on someone's quest to make a
| toaster from scratch https://gizmodo.com/one-mans-nearly-impossible-quest-to-make....
| giardini wrote:
| He was working too hard:
|
| https://www.bing.com/images/search?q=toasting+bread+over+cam...
| schnevets wrote:
| And a bunch of barbarians will just accept being smelly. And a
| few barbarians will figure out a different way to fill a bath
| tub. And life will go on.
| dznodes wrote:
| Well said,... but why is the water slowly getting hotter?
| sebringj wrote:
| I liked that. It might be even weirder though for us in the new
| age. We'll have robots (AI) running everything and why things
| are happening will degrade into unknown unknowns. Engineering
| and critical thinking may be a lost art.
| [deleted]
| JKCalhoun wrote:
| I'm reminded instead of the supply chain issue our economy has
| gotten itself into. As more people pile on AWS it becomes the
| weak link....
|
| Maybe we need more baskets to distribute our eggs amongst?
| sriram_sun wrote:
| For the past couple of weeks, I've been a beginner-intermediate
| mechanic trying to breathe life into an aging car.
|
| Sometime in the next few months, I've to troubleshoot and fix
| the broken 2 yr. old refrigerator. Someone came and fixed it
| once, now it's out of warranty and fixing it would cost about
| 50% of its cost. Meanwhile I'm glad I didn't throw away the 10
| yr old refrigerator and just moved it to the garage. We just
| have to keep going to the garage.
|
| I also have to play the accountant for my consulting business
| pretty soon. This is a task I had outsourced for years and have
| now started doing myself.
|
| As stuff gets more specialized, I've started noticing that I'm
| able to do moderately complicated things better than
| professionals paid at the 50th - 70th percentile. If I want to
| get a really good job done, my rule of thumb is to be ready to
| shell out money in the 90th percentile range and look for
| references.
|
| In case of AWS, I guess the Greasemonkey scripts are getting
| too complicated ;)?
| wly_cdgr wrote:
| It's ok, we can just rebuild everything from scratch if we need
| to. We know we can cos we already do it every five years anyway
| without needing to
| JadoJodo wrote:
| Tangentially related: If you enjoy this sort of idea in
| fiction-form, I can't recommend Josiah Bancroft's The Tower of
| Babel series (beginning with Senlin Ascends) enough.
| tzs wrote:
| Many years ago I stood at the window of my comfortable
| apartment, watching wind and cold rain rage outside.
|
| I thought about my cave men ancestors who during such a storm
| if they needed water would have to go out and get it, getting
| themselves soaked.
|
| If I wanted water, the tap in the kitchen would give it to me,
| in a nice controlled fashion. If I did feel like having water
| rain down upon me, my shower would do that, again in a
| controlled fashion, and I could select the water temperature.
|
| If they wanted the cave to be warmer, they had to burn
| something and deal with the smoke. And they might have to work
| hard to obtain whatever it is they burn.
|
| If I wanted my apartment warmer, I just had to turn the knob on
| the thermostat.
|
| They were at the mercy of their environment. My environment is
| mine to command. I was feeling pretty superior to my cave man
| ancestors.
|
| Then I realized that _I_ don't know how to build the systems
| that I was relying on for my supposed superiority, or even how
| some of them work.
|
| I'm really just a cave man that found a nicer cave.
| fasquoika wrote:
| Well, you didn't just find a cave, it was made for you by
| other people. Interdependence is a hallmark of social species
| such as Homo Sapiens. Even your caveman ancestors were
| probably reliant on one another in many ways.
|
| >It seems that someone asked the great anthropologist,
| Margaret Mead, "What is the first sign you look for to tell
| of an ancient civilization?" The interviewer had in mind a
| tool or article of clothing. Ms. Mead surprised him by
| answering, "a healed femur (thigh bone)". When someone breaks
| a femur, they can't survive to hunt, fish or escape enemies
| unless they have help from someone else. Thus, a healed femur
| indicates that someone else helped that person, rather than
| abandoning them and saving only themselves.
| cloverich wrote:
| At a very general level once you move past subsistence
| farming you become reliant on society to provide your needs.
| And in turn provide some value that can only come from
| spending your time on things other than farming. And that is
| I suppose how civilization advances. It's kind of funny to
| work backwards though, because even subsistence farmers are
| reliant on society for protection -- they are farmers not
| soldiers after all. I think about this a lot, how important
| trust is to going anywhere in modern life. And how little
| choice there is anyways. I also think about how most people
| don't think about it at all, or very much, and wonder if
| knowing how fragile we are makes me happier and more
| productive, or less so.
| [deleted]
| nefitty wrote:
| There's an interesting misconception that humans developed
| agricultural societies because they achieved better
| outcomes as individuals. Research shows that hunter-
| gatherers were healthier and better nourished than humans in
| early agricultural settlements.
|
| What's probably closer to truth is that many humans were
| forced to join farming communities. Stronger individuals or
| tribes probably enslaved others, and then forced them to
| build and produce.
|
| The patterns of inequity and the march toward hyper-
| specialization we still see today make sense in that
| context.
|
| As a tangent, if anyone is interested in that "cavemanness"
| deep in our DNA, check out the idea of primitive camping.
| That was my first experience camping, and I expected an
| idealized tv-ad experience. The trip was not framed as
| "primitive camping" to me.
|
| I was dealing with intense burnout, stress, ADHD symptoms,
| immune problems, trouble sleeping... And I was thrown into
| the desert in the summer with a tent and some beer. It
| fucking sucked sooo bad. It fucking sucked sooo bad that I
| forgot every stupid problem I had, because I spent the
| entire time in survival mode. Setting up camp. Hauling
| equipment up and down dunes. Staying hydrated in the 100F+
| heat. Making food. Making sure my wife and friends were ok.
| Strategizing how to defend our camp from bugs and psychos.
|
| I really have not had such an existentially-dense
| experience as that one. And no, I didn't take any
| mushrooms, as the rest of the group did. I wanted to be
| lookout. Maybe I come from a long line of hyperaware
| sentries.
| danielheath wrote:
| Forced labor was absolutely the norm for the pre-modern
| state, and provided the bulk of the workforce [1].
|
| AFAIK humanity is yet to produce a society where the
| majority of farm laborers are fully free to leave the
| land they work on (whether via having their papers
| confiscated, their wages held until the season ends, by
| having transport provided to a remote farm but the trip
| back withheld etc). We've seen improvements in the degree
| of freedom, particularly over the past century and
| especially the past 50 years, but it's still very low
| compared to urban dwellers.
|
| 1: "Against the Grain", James C Scott
| cameronh90 wrote:
| The most impressive thing to me is toilets. Just click a
| button and your waste disappears. Don't know where it goes or
| how it gets there and pay almost nothing for the privilege.
|
| Toilets are amazing and I feel privileged every time I use
| one. Girlfriend thinks I'm nuts.
| hotpotamus wrote:
| Not to mention that when the power goes out, the illusion
| fades fairly quickly. Learned that lesson myself in the Texas
| snowstorm early this year.
| pier25 wrote:
| Absolutely. My wife and I lived for a year in an off-the-
| grid cabin in some mountains in Mexico.
|
| We had solar panels and a generator we used only when
| absolutely necessary. We were never without power, but we
| lived with the constant anxiety of optimizing our energy
| consumption. Some stuff we could only do during the day and
| at night we only used devices with batteries.
|
| For a couple of weeks we didn't have running water in the
| cabin because we were rebuilding our water deposit tower.
| We used buckets for everything.
|
| That was almost a decade ago and I still feel grateful at
| having unlimited energy or running water on demand.
|
| I also feel guilty at times when doing power hungry stuff
| like playing video games, knowing electricity production is
| by far the biggest driver of climate change.
| mikestew wrote:
| _Absolutely. My wife and I lived for a year in an off-
| the-grid cabin in some mountains in Mexico._
|
| I think everyone ought to do a week in an RV with no
| connections to utilities. Not to take away from your
| story, but a similar scenario comes up when we "dry camp"
| (no water or electrical connections): resources are not
| unlimited. We have solar panels, big-ass inverter and
| big-ass battery to go with it. But if we want lights at
| night, best not run that 1100W microwave for _too_ long,
| because the panels won't keep up and the battery isn't
| _that_ big. We have a built-in generator, but unlike most
| RV owners, we are loath to use it. It's almost like a
| game, and if that generator fires up then we've lost.
|
| You want to let the water run while you brush your teeth?
| Go right ahead, our water tank is plenty big...oh, wait,
| but the holding tanks aren't. Shut that tap off before
| there's dirty water coming up through the shower.
| Speaking of showers, use the outside shower, as the
| holding tanks won't hold enough for your 30 minute,
| piping-hot shower.
|
| Point of it all is that one quickly learns that it all
| has to come from somewhere, and it has to go somewhere
| after you've dirtied it. I'd like to think that it has
| made the both of us more conscious of our usage.
| mediaman wrote:
| Very similar experience with sailboats.
|
| There's nothing like being at sea, 100+ miles from
| civilization, reliant on the limited capacity systems on
| your vessel. You manage your food, you manage your water
| consumption, fuel, electrical usage, you're closely
| attuned to the weather, the sea state, the charts. There
| are no other visible people or people-made objects out to
| the horizon in all directions. If something breaks, you'd
| better know how it works and be able to fix it, or go
| without. It feels very freeing, but also provides a "back
| to basics" accountability.
|
| Standing under a hot water shower with unlimited water in
| a spacious home shower afterward feels luxurious.
| aNoob7000 wrote:
| I lived in Miami during hurricane Wilma and spent like a
| week without electricity. You realize how quickly things go
| south without electricity flowing.
| 2143 wrote:
| Third world country here.
|
| Commit and push often.
| tiborsaas wrote:
| Great for you, I hope you enjoy your cave.
|
| > Then I realized that I don't know how to build the systems
| that I was relying on for my supposed superiority, or even
| how some of them work.
|
| I'm sure if you just sat down with a pen and paper you could
| come up with a DIY solution.
| the_af wrote:
| > _Then I realized that I don't know how to build the
| systems that I was relying on for my supposed superiority, or
| even how some of them work._
|
| I used to have this joke(?) with my friends: remember Mark
| Twain's "A Connecticut Yankee in King Arthur's Court"? The
| titular Yankee basically upends the (faux) medieval society
| he gets transported to, "inventing" all sorts of
| technological miracles.
|
| Well, I'm a software developer but don't come from an
| engineering background (I mean actual engineering, not
| programming). I don't even understand how electricity or the
| telephone work (I mean, old fashioned telephones, let alone
| current mobile networks). If I was transported 2 or 3
| centuries into the past, I wouldn't be able to explain modern
| technology to other people, let alone actually build it.
|
| I sort of understand how steam machines work, and I could
| "invent" the printing press. I guess. But anything related to
| circuitry, electricity, chemistry, engineering of any sort, I
| wouldn't be able to even begin explaining them to King
| Arthur.
|
| My introduction to the knights of the round table would go
| something like this:
|
| "We are questing for the Holy Grail, oh noble stranger from a
| far away land! How can you help?"
|
| "Depends, which version of Python are you running?"
| kabdib wrote:
| A light, enjoyable read along these lines is Leo
| Frankowski's "high tech knight" series, starting with _The
| Cross-time Engineer_. The main character -- a _real_
| engineer -- gets transported back to medieval Poland, and
| he knows that he's got ten years either to bug out, or
| help Poland defend itself from the coming Mongol invasion.
|
| [I only liked the first four books, but that's enough to
| cover the original story arc]
| g051051 wrote:
| "Deathworld 2" by Harry Harrison has a plot along those
| lines. Apparently the original name was "The Ethical
| Engineer".
| bee_rider wrote:
| Even if you knew what to do, convincing the naturally
| suspicious people back then to trust a strange outsider
| would be tricky. Then you have to get the right materials.
|
| If I were a bit more clever, or maybe if I was 50 years
| older and had played with this kind of stuff growing up,
| I'd probably try to make a spark-gap transmitter. That
| seems to be in a sweet spot of not requiring too many super
| clever bits, and having obvious applications.
| ff317 wrote:
| Also on a similar theme:
| https://en.wikisource.org/wiki/I,_Pencil (it's intended to
| be about free market economies, but you can also read it as
| something about knowing how even simple modern marvels work
| at all).
| prh21 wrote:
| Inventing the printing press was more difficult than it
| seems at first. In addition to the idea of using movable
| type, significant development of the correct alloys for the
| types was necessary. The alloy needs to be able to be cast
| easily and at the same time be durable enough to be reused
| for a large number of print runs. In addition, the proper
| ink needs to be developed...
| the_af wrote:
| Right, let me amend my statement: I understand how the
| printing press with movable type works and I would be
| able to explain it to King Arthur, but I probably
| wouldn't be able to actually craft the types, inks, etc,
| and so the annoyed King would have me beheaded.
| dredmorbius wrote:
| Cheap durable paper also helps.
|
| Fun fact: printing rates increased from about 120
| sheets/hour to over 1 million over the course of the 19th
| century. Those began with wooden screw presses that
| differed little from Gutenberg's to cast iron, rotary,
| steam and later electric powered, and web (continuous
| paper feed) presses, and from matrix plates (with
| individual type set in blocks) to offset Linotype (in
| which the entire print block was cast as a single sheet
| through multiple stages from the original matrix
| characters).
|
| Thought just occurs: the falling characters of the iconic
| Matrix screen somewhat resemble the individual type
| elements flowing and falling through a Linotype machine.
| I don't know if that is a deliberate or incidental
| reference, but it's an interesting one.
| twic wrote:
| Do I have the T-shirt for you:
|
| https://topatoco.com/products/qw-cheatsheet
|
| And the T-shirt's companion bandana and spin-off book:
|
| https://www.popularmechanics.com/culture/a23286104/how-to-in...
| bee_rider wrote:
| This shirt annoys me. I get that it is a joke, but the
| explanations are just so woefully over-simplified, and
| don't get at the main problem -- materials and
| manufacturing technology in the past were poor enough that
| even if you knew the basic physics you'd have no chance
| of getting, like, material to build a wing out of.
| lstodd wrote:
| What, not even pinewood and gelatin for ribs and
| stringers, and some linen cloth plus pine resin and
| alcohol for doping? Seriously, that's like 1000BC tech
| level.
|
| Wing is no problem as long as one can calculate how to
| make it stiff enough and of a right shape.
| herval wrote:
| > I'm really just a cave man that found a nicer cave.
|
| You aren't really - most cavemen didn't even understand that
| fire is possible, and wouldn't be able to consistently
| operate a lighter if they found one (it'd probably be put on
| an altar and worshipped instead, as it should). You might not
| be able to build your entire cave, but your education alone
| is a _huge_ advantage!
| osense wrote:
| Not to mention that many of the skills needed by the original
| cavemen to survive are gone in today's society. In other
| words, if we were to compete with the original cavemen in
| their environment, we would most likely fare rather poorly,
| at least in the short term.
|
| Not trying to glorify off-the-grid living or anything, but I
| think it's interesting to think that in some (very specific)
| ways, the cavemen were actually superior to us.
| nopenopenopeno wrote:
| >if we were to compete with the original cavemen in their
| environment, we would most likely fare rather poorly
|
| The understatement of the epoch
| wrycoder wrote:
| At least the cave man could go out and get water. Or have
| some reasonable expectation of finding food. Good luck!
| anderspitman wrote:
| Anyone who finds this concept interesting to think about and
| hasn't seen this video may enjoy it:
|
| https://www.youtube.com/watch?v=ZSRHeXYDLko
|
| See also Foundation by Asimov.
| xwdv wrote:
| Or worse, we will be sitting in the hot baths and the hot
| spring where the water comes from will get hotter and hotter
| and we won't notice until suddenly a rapid change in
| temperature boils the water and burns us to death.
| stjohnswarts wrote:
| This simply isn't true. There are lots of full stack devs out
| here, we will rebuild from the AWShes :P
| yongjik wrote:
| _We_ are the advanced civilization that has built those hot
| baths. Being an advanced civilization, it's safe to assume
| that no single person knows all the knowledge necessary to
| build another hot bath, because it has long surpassed how much
| one person can learn in a lifetime.
|
| But somehow there are multiple organizations that "know" how to
| build another hot bath, and newer and bigger baths are
| continuously being built all across the Empire.
|
| And occasionally one of them stops working and thousands of
| citizens are angry, because they feel, being honest citizens of
| the Empire, they are entitled to enjoy these hot baths.
| Sometimes their very livelihood depends on the baths running.
|
| Then the bath is fixed, and all is well again.
| 41209 wrote:
| I wouldn't be that negative, as long as certain things need to
| be on prem, we'll always have some people who can get the
| internet running again.
|
| Most of these people just happen to be employed at AWS or azure
| right now.
| Iv wrote:
| More like, they will fill with blood and locusts.
| giardini wrote:
| At least you'll have something to eat.
| tibbar wrote:
| The remarkable thing is that today no one knows how to "fill up
| the baths", or to do more than a small part of the job. Teams
| exist with extremely narrow expertise. But if anything, there
| are more options today for DIY infrastructure - way easier to
| be more advanced than "run the Apache on the server box."
| willob33 wrote:
| Fear.
|
| The Greek philosophers wrote of their fear some day the people
| might climb Olympus and find there are no Gods.
|
| Economic uncertainty versus certainty of a paycheck.
|
| No one asked any of us to build specifically these things. We
| get paid to.
| pfortuny wrote:
| So long as you live in a city you are probably forgetting most
| of the ways to survive.
|
| I could not even try to discover which berries are edible
| without killing myself.
|
| However, I can teach advanced maths to a largish group of
| students without much trouble.
| maxwell wrote:
| > I could not even try to discover which berries are edible
| without killing myself.
|
| Cluster berries, from raspberries to pineapples, are never
| poisonous. Avoid berries that resemble blueberries or
| currants unless you're able to identify the plant: we grew up
| with blueberries and know the leaves, but we avoid anything
| currant-like because we'd have no idea if they're actually,
| say, chokeberries. Avoid anything that looks like
| baneberries.
|
| Here in Maine, we forage for raspberries, blackberries, wild
| strawberries, and (mostly low bush) blueberries, but don't
| risk others.
| kiklion wrote:
| > Cluster berries, from raspberries to pineapples, are
| never poisonous.
|
| I know nothing, but that still seems too general not to
| have an exception somewhere in the world.
| willcipriano wrote:
| You won't find enough berries, never mind edible berries
| if everyone in your area suddenly shifted to foraging.
| Game animals would be exhausted quickly or they would
| migrate further out from human settlements. Even if you
| had the skills hunting or foraging isn't all that useful
| anywhere around a city, especially if everyone else is
| doing it.
| maxwell wrote:
| True, fruit and nut-based food forests, like those in the
| Pacific Northwest [1], seem to provide a significant,
| sustainable food source. Berries make a nice dessert +
| vitamins a few times a year.
|
| Historically here in Maine, the core diet seems to have
| been seafood, freshwater fish, maize, Capreolinae, game
| birds, eggs, honey, roots, and greens. While only a tiny
| fraction of fish/seafood remain, deer are over-populated
| and make a fine sustainable food source, the limitation
| mostly being the contemporary appetite for venison.
|
| 1. https://www.smithsonianmag.com/smart-news/indigenous-
| peoples...
| Supermancho wrote:
| The edible himalayan blackberry infestation that plagues
| all of the coastal PNW is widely available. It's almost
| impossible to kill and it fruits for long periods of
| time.
| yottalove wrote:
| Our Hoopa lived virtually exclusively on shellfish for
| hundreds of years.
| DebtDeflation wrote:
| Blueberries are easy, they have a star shaped "opening" on
| the bottom. Native Americans called them starberries. No
| other berry is blue and has that. It's the only berry I
| trust myself to eat while I'm hiking.
| grumple wrote:
| But you could probably devise a scheme by which you feed all
| of your students an assortment of berries and figure out
| which ones are safe based on which students get sick or die.
| derekp7 wrote:
| You are close. You first rub the berry on your skin (or
| leaf, or whatever). Wait 24 hours to see if a rash
| develops. Then you taste it, wait another 24 hours. Then
| you eat one, and see if you get sick after another 24
| hours. Now you can eat several, and build up from there.
|
| Yes, that is a lot of time to go hungry and testing just
| one item. And then you still don't know what actually gives
| you nutrition vs just not killing you (for example leaves
| that you can break down such as leaf lettuce, vs eating
| grass).
| zikduruqe wrote:
| That is the Universal Edibility Test and gets repeated ad
| nauseam in all the survival circles. You would miss out
| on some fine choice foods if you did that. Stinging
| Nettles (Urtica dioica) is one of them. Pokeweed
| (Phytolacca americana) too.
|
| Source - used to teach these skills before it was cool to
| be a "survivalist" on TV and social media.
| BenjiWiebe wrote:
| Yes I don't know how someone figured out that if you cook
| pokeweed and change the water multiple times then you can
| finally eat it without it killing you.
| DebtDeflation wrote:
| Alternatively, let the animals worry about all that and
| then just eat the animals.
| whimsicalism wrote:
| Assuming this is in the context of some apocalyptic event
| requiring you to do this, relying on animal husbandry
| seems obviously wrong. Plus you only get ~10% of the
| energy from the lower trophic level.
|
| The prevalence of meat comes from a society of abundance.
| grumple wrote:
| Evidence of early hominids and other less advanced proto-
| human or human groups shows a pretty significant amount
| of calories came from meat. Some suggest 60-80% of
| calories came from proteins, largely meat, at various
| times in history.
|
| Example source:
|
| https://onlinelibrary.wiley.com/doi/10.1002/ajpa.24247
|
| A contradictory source says meat was less prevalent but
| still at 40-50%:
|
| https://asu.pure.elsevier.com/en/publications/the-diet-
| body-...
|
| Both of these estimates are way higher than what we know
| people eat today, where meat and dairy are 18% of
| worldwide calorie consumption (27% in the US).
|
| So I think the abundance we see today is actually due to
| the availability of non-animal dietary sources.
| bwi4 wrote:
| > I could not even try to discover which berries are edible
| without killing myself.
|
| Eat a small amount and see if you get sick? Science in the
| wild...
| bell-cot wrote:
| Assume that you'll be in a group - some other members of
| which will be more optimistic than you.
| js2 wrote:
| https://en.wikipedia.org/wiki/Chris_McCandless#Theories_of_
| m...
| jdavis703 wrote:
| I would recommend you buy a local book on foraging. Keep in
| case of emergency. But give it a read (at least the first few
| chapters) so you can get a basic understanding of how to
| forage without killing yourself. I also recommend keeping
| _viable_ seeds and a camping shovel around as an insurance
| policy.
|
| These items aren't in my earthquake bag (I have enough energy
| bars to last until the National Guard shows up). Instead
| these are for a Carrington Event type of solar storm, civil
| war or some sort of other long-term disaster.
| FooHentai wrote:
| On the seeds front, you really have to be practicing
| growing food from seed for several years before depending
| on them for basic caloric needs - after a few years of
| providing a fraction of our household calories on the
| property I can see the pitfalls, effort and planting
| diversity needed were we to need to scale it to that level.
| The previous me would have had some seeds and a dream, and
| have died real quick. Even now I give myself 50/50 that
| water, weather, pests, poor soil, or something unexpected
| would lead to starvation.
| jdavis703 wrote:
| Yes we provide a few hundred annual calories from seed.
| Not nearly enough to survive. But hopefully enough to
| learn from while foraging or enough to link up with
| actual experts who might just be lacking in seeds or
| labor.
| jareklupinski wrote:
| as an NYC-born, growing up with bi-monthly boy scout meetings
| and yearly "wilderness camps" (pitching tents in open fields,
| pit latrines, war games/survival, etc.) really helped fill in
| that gap :)
|
| i wonder if there's anything like that for adults
| thatguy0900 wrote:
| There's prepper and survivalist camps and classes
| LambdaTrain wrote:
| That sounds like the setup of Asimov's Foundation series
| lordnacho wrote:
| Surely we are way past the point where someone knows how the
| whole thing works, all the way down.
|
| I doubt even a very skilled engineer would know how his own
| machine works all the way down. What I think happens mostly is
| the skilled dev can use his experience to know where to
| investigate and where to look for solutions.
|
| The question is organisational. Might it be that certain orgs
| have gotten so convoluted that they cannot do this
| investigation on an org level? Essentially, letting the right
| people look in the right places, unhindered by politics,
| legitimate security concerns, and practicality?
|
| You'd think there'd be a limit to scale at some point. A bit of
| redundancy makes sense. There's probably a lot of people with
| multicloud setups patting themselves on the back at the moment.
| gitfan86 wrote:
| I do contracting dev work and my specialty is being able to
| drill down into any part of the engineering assets, ops, sec,
| dev. People think someone like me is slow and expensive until
| they have a problem that no one else wants to touch.
| hiptobecubic wrote:
| I also specialize in being great at everything.
| sulam wrote:
| Not at all diminishing what you do, but surely you have a
| limit past which you say "that's outside of my expertise,
| or what's reasonable for me to gain expertise given the
| scope of this issue"?
|
| For instance, I manage a team that does "full stack"
| development, where full stack means I regularly interact
| with mechanical and manufacturing, operations, electrical
| engineers, battery and radio people, embedded developers,
| mobile, and most aspects of backend engineering. We had an
| issue where one of our chip suppliers changed their FW,
| didn't tell us, and we literally were taking apart units to
| get to the bottom of why units off the line weren't working
| properly. We go pretty deep. Still, at some point we throw
| our hands in the air and say "Hardware is hard, it's in the
| name."
| gitfan86 wrote:
| This was meant to be in the context of hosting software
| services on AWS. Certainly there is a limit. If an MBP gets
| a crack in the case, I'm not going to figure out how to
| machine a piece of aluminum into a new case, I'll replace
| the laptop.
| whimsicalism wrote:
| > Surely we are way past the point where someone knows how
| the whole thing works, all the way down.
|
| So you are disagreeing with this statement and saying that,
| in fact, you are the person who knows how the whole thing
| works?
|
| I knew tech work produced some large egos, but sheesh.
| rch wrote:
| This type of person exists, and while rare, not as rare
| as some seem to assume.
| whimsicalism wrote:
| There is nobody who exists who would be able to recreate
| a modern computer from scratch.
| gitfan86 wrote:
| When you say 'modern' do you mean with photolithography?
| whimsicalism wrote:
| Yes, I do, keeping with the spirit of re-building the
| metaphorical "baths" of modern civilization.
|
| But even if I didn't mean that, there still is nobody who
| could do it.
| 8note wrote:
| No one person could recreate the pyramids from scratch
| either, and that's a pile of rocks
| whimsicalism wrote:
| But they could teach others the principle and have them
| follow their directions to recreate it.
|
| I'm saying no such person, even one who built up a team
| that they taught, exists for the modern computer.
| opportune wrote:
| You know how to debug kernel issues? You know how AWS
| virtualization works and how to diagnose a problem with the
| AWS networking stack?
| dijit wrote:
| These are not as hard things as you make them sound. They
| are the things traditional sysadmins spent time
| understanding.
|
| But I am also incredulous at the parent's claim.
|
| I wouldn't be able to diagnose traces on a motherboard or
| a defective (but partially functioning) CPU.
|
| I wouldn't be able to diagnose irregular voltage
| conditions or drop offs.
|
| The amount of stuff that I know I couldn't diagnose is
| absurdly high, but the amount I don't even know that I
| can't diagnose is higher still.
|
| And my job, like the parent's, is to drill down and spend
| time in specifics.
| gitfan86 wrote:
| You know how to debug kernel issues? Yes
|
| You know how AWS virtualization works and how to diagnose
| a problem with the AWS networking stack? Yes, and yes
| assuming that everything AWS is responsible for is
| operating within spec. Obviously, I don't have access to
| their switches, and cannot see anything at layer 1 or 2.
| SteveNuts wrote:
| > There's probably a lot of people with multicloud setups
| patting themselves on the back at the moment.
|
| And there's probably an equal number troubleshooting why it
| didn't failover the way it should, while their upper
| management starts questioning what they're paying for.
| com2kid wrote:
| > Surely we are way past the point where someone knows how
| the whole thing works, all the way down.
|
| I've met a few people who can rightfully lay claim, but yeah,
| an incredibly rare set of skills.
|
| That said, there is a recent revival in building systems from
| the ground up. While you can't manufacture your own
| transistors, it is quite possible to understand everything
| from simple logic gates to ALUs to older style CPUs and
| memory buses.
| cecilpl2 wrote:
| I built a toy CPU in software once as an exercise. I
| started with "class Transistor" (wrapping an AND op) and
| "class Wire" (wrapping a boolean), and wired them together
| incrementally to make gates, flipflops, registers, etc.
|
| I eventually got a fully-functioning 32-bit cpu with
| instruction pipelining, two levels of cache, DMA
| input/output, an asynchronous bus, a custom assembly
| language with an assembler written in python, and got the
| Game of Life running on it.
|
| It ran about 2kHz with 8kb of memory or so.
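|
| A minimal sketch of the same bottom-up idea, not the code
| described above. It assumes a NAND primitive rather than a
| Transistor class wrapping AND, because NAND is functionally
| complete on its own. Every name here is hypothetical.

```python
# Toy gate-level simulation built from one assumed primitive: NAND.
# (The comment above wraps an AND op; NAND is swapped in for this sketch.)

def nand(a: bool, b: bool) -> bool:
    return not (a and b)

def not_(a: bool) -> bool:
    return nand(a, a)

def and_(a: bool, b: bool) -> bool:
    return not_(nand(a, b))

def or_(a: bool, b: bool) -> bool:
    return nand(not_(a), not_(b))

def xor_(a: bool, b: bool) -> bool:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def full_adder(a: bool, b: bool, carry_in: bool):
    """Return (sum_bit, carry_out) for one bit position."""
    partial = xor_(a, b)
    total = xor_(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out

def add_bits(x_bits, y_bits):
    """Ripple-carry add of two equal-length, little-endian bit lists."""
    carry = False
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

def to_bits(n: int, width: int):
    """Little-endian list of booleans for the low `width` bits of n."""
    return [bool((n >> i) & 1) for i in range(width)]

def from_bits(bits) -> int:
    return sum(1 << i for i, bit in enumerate(bits) if bit)
```

| Registers, caches and a pipeline would be layered on top in
| the same way; this only goes as far as a 4-bit-style adder.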
| swiftcoder wrote:
| > I doubt even a very skilled engineer would know how his own
| machine works all the way down
|
| Knowing how it works, and being able to build a new one, are
| also two very different problems. For example, there are
| plenty of Computer Science folks who learned how to design
| chips (layout the circuits, write the microcode, etc) - but
| you need a whole extra background in EE and Physics to be
| able to fab said chip...
| olooney wrote:
| Many programmers complete some kind of nand2tetris[1] style
| course where they go from basic hardware primitives (the
| NAND gate) all the way up through a small von Neumann
| architecture computer that can be programmed with a simple
| homebrew machine code. Even if they don't, a good CS
| undergraduate program should cover a lot of it, and since
| most EEs can program at least a little they probably get a
| pretty good top-to-bottom understanding as well.
|
| The problem is that this is really only possible with a toy
| model of a computer and very simple programs. Modern chips
| with their branch prediction and caching and threading and
| advanced vectorized operations and so on are vastly more
| complex. The 6502[2] was perhaps the last chip that one
| person could fully grok. Maybe a chip designer at Intel or
| AMD could understand the whole circuit in detail but no one
| else has the time - it would literally be a full time job.
| The same thing is true for operating systems - even if
| you're Raymond Chen, you can know a lot about Windows, but
| you can't know everything.
|
| We learn just enough about the other parts of the system to
| convince ourselves that we understand the principles. We
| build the basic mental model we need to interact with other
| systems but all we can really do is focus on our own
| specialized areas and hope that everyone else is doing
| their job. This works well enough until something like
| Spectre[3] or Meltdown[4] crops up and that's when we
| realize that we've been building castles in the sand.
|
| [1]: https://www.nand2tetris.org/
|
| [2]: https://en.wikipedia.org/wiki/MOS_Technology_6502
|
| [3]: https://en.wikipedia.org/wiki/Spectre_(security_vulner
| abilit...
|
| [4]: https://en.wikipedia.org/wiki/Meltdown_(security_vulne
| rabili...
| FooHentai wrote:
| I mean you can (and I have) etch your own circuit boards,
| but that's obviously not at the scale you need for anything
| other than primitive processing (70s 8-bit at MHz scale),
| and even then you're just looking at the next layer down
| (how to make your own chemical wash and get copper onto a
| board) as a barrier if we're truly talking about 'from
| scratch'.
|
| We really depend on three things - knowledge (stored
| collectively and in various media eg books), materials
| (tools and manufactured precursor goods, available via
| active supply chains or existing stores), and most
| importantly having our basic needs met trivially so that
| all our time is not sucked up addressing them.
|
| A scenario where someone has a 'wasteland' to pick over for
| their basic needs, knowledge and materials looks quite
| different to a return to primitive living where what nature
| provides is all there is to work with. 'if you want to bake
| an apple pie, first you must invent the universe' or
| however it goes...
|
| Then of course there's the question of why someone would
| have any interest in obtaining computing power were either
| of those scenarios to occur. Much like the 'how do we warn
| future civilisations about our nuclear waste' problem
| perhaps it is acceptable to not bother, they'll figure it
| out again on their own eventually given enough time.
|
| This stuff is fun to think about at 6am when hay fever is
| preventing my sleep :)
| nic_wilson wrote:
| Is this a fifth season reference?
| bawolff wrote:
| On-prem is much more rare, but it's hardly non-existent. Plenty
| of people know how to do this sort of thing.
| vidarh wrote:
| On-prem, maybe, but if you include co-located equipment and
| managed hosting I don't even think it's more rare in absolute
| terms. Just smaller as a percentage of overall hosting.
| starfallg wrote:
| There's (still?) a lot of on-prem and managed hosting. It's
| probably the majority of hosted services. Otherwise VMWare
| wouldn't be doing as well as it is.
| aswinmohanme wrote:
| Couldn't access Notion, so came to check HN, and boom here is the
| answer.
| redety wrote:
| wow
| alvis wrote:
| Oh man, not again!
| alvis wrote:
| Oh man. Not again!
| alecr95 wrote:
| Yep, we're also having issues. Hosted on us-west-2
| navidkhn1 wrote:
| My personal health dashboard on AWS shows "InternetConnectivity
| operational issue us-west-2"
|
| [07:42 AM PST] We are investigating Internet connectivity issues
| to the US-WEST-2 Region.
| iJohnDoe wrote:
| Probably a silly question, but what are you using to get this
| info?
| pbalau wrote:
| A browser most likely... this is the "Personal Health
| Dashboard" one gets for each AWS account
|
| /edit: https://phd.aws.amazon.com/phd/home#/dashboard/open-
| issues
| iJohnDoe wrote:
| Thanks. Didn't know if it was a custom dashboard or
| something provided by AWS.
| rpadovani wrote:
| Systems manager in eu-central-1 is giving us some issues now, but
| I am not sure about their internal architecture for it, so maybe
| needs some us resources?
| yottalove wrote:
| Even as a software engineer, I think I could build from primitive
| materials a couple of battery operated transceivers to replace
| the signal flags or horsemen for critical communications. A
| little basic physics and materials science goes a long way.
| dannyw wrote:
| Prime video down for me. Australia.
| rwalk wrote:
| Yup, trouble in us-west-2 for us.
| wirelesspotat wrote:
| AWS status page shows an update:
|
| > AWS Internet Connectivity (Oregon): 7:42 AM PST We are
| investigating Internet connectivity issues to the US-WEST-2
| Region.
|
| Source: https://status.aws.amazon.com
| alvis wrote:
| Oh. not again...
| NicoJuicy wrote:
| I get the feeling that havoc will ensue when a tornado
| reaches us-east-1
| curtisblaine wrote:
| The npm registry is down too.
| [deleted]
| robthebrew wrote:
| https://nolandda.org/images/memes/nuke_from_orbit.gif
| nickjj wrote:
| I'm seeing outages on us-west-2 too. Customer facing traffic
| being served through Route53 -> ALB -> EC2 is down and CLI tools
| are failing to connect to AWS too.
| stevenhubertron wrote:
| Yeah. It's inconsistent but a number of my production servers
| appear to be down. Along with my New Relic logging.
| phgn wrote:
| This also seems to affect NPM, I can't install packages locally
| :/
| sheepdog wrote:
| I can't log on to the console for us-east-1. But our api gateway
| seems to be working, so I guess production is still up...for
| now...
| menmob wrote:
| 7:42 AM PST We are investigating Internet connectivity issues to
| the US-WEST-2 Region.
| RunOutOfMemory wrote:
| out of memory again. ;<
| sam0x17 wrote:
| They really need to stop requiring SVPs or higher to show non-
| green status on the status page, as other HNers have revealed in
| last week's AWS post. It's effectively not a status page, and
| they could probably be sued if it can be demonstrated that X
| service was down but the status page showed green (since the SLA
| is based on status page). Should be automated and based on sample
| deployments running in every region and every service. And they
| should use non-AWS instances to do the sampling, so they can
| actually sample when, say, we experience the obligatory black
| friday us-east-1 outage every year.
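|
| A rough sketch of what "automated and based on sample
| deployments" could look like: probe results collected from
| outside the provider are rolled up into a colour per region
| and service. The ProbeResult type, the summarize function
| and the thresholds are all assumptions for illustration, not
| anything AWS is known to use.

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    region: str    # e.g. "us-west-2"
    service: str   # e.g. "s3"
    ok: bool       # did the sample deployment respond correctly?

def summarize(results, yellow_at=0.1, red_at=0.5):
    """Map (region, service) to 'green', 'yellow' or 'red'.

    The colour depends on the fraction of failed probes; both
    thresholds are hypothetical defaults.
    """
    totals = {}
    for r in results:
        key = (r.region, r.service)
        passed, failed = totals.get(key, (0, 0))
        if r.ok:
            totals[key] = (passed + 1, failed)
        else:
            totals[key] = (passed, failed + 1)

    status = {}
    for key, (passed, failed) in totals.items():
        ratio = failed / (passed + failed)
        if ratio >= red_at:
            status[key] = "red"
        elif ratio >= yellow_at:
            status[key] = "yellow"
        else:
            status[key] = "green"
    return status
```

| The probes themselves would have to run on non-AWS hosts for
| this to keep reporting during a connectivity outage.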
| vineyardmike wrote:
| > we experience the obligatory black friday us-east-1 outage
| every year.
|
| Is this a thing?
| lljk_kennedy wrote:
| I think SVP / GM approval is only needed for yellow / red
| status. From my time in AWS Support, the Support Oncall and
| Call Leader / GM delegate worked to approve green-i posts.
| sam0x17 wrote:
| If my app won't run for reasons that are not my fault for
| longer than the SLA guarantees, the affected services should
| be at least yellow status and I should be accumulating free
| AWS credits.
| ceejayoz wrote:
| They were much faster than usual about updating the AWS Status
| page.
| JshWright wrote:
| Our ~four person ops team shouldn't be able to have our
| status page updated 15 minutes before the upstream status
| page...
| Isthatablackgsd wrote:
| I thought status pages or health pages were designed to
| report and check status automatically. That was my
| impression when I came across those status pages.
| Apparently they are not automated and are only updated
| manually. What is the point of having a status page if it
| cannot be automated? I'm sure FAANG and tech conglomerates
| don't want it to be automated because of SLAs.
|
| I'm surprised that FAANG companies host their stuff in their
| competitors' cloud services without a fallback cloud
| service if the primary service is down. Sure, it costs money,
| but it would be more effective than putting all the eggs in
| one basket.
| erhk wrote:
| Any public communication is handled by people not machines.
| No one wants to make an automated status page because
| there's a shit ton of real noise that users don't need to
| hear about, and there's a lot of outages that automation
| won't accurately catch.
| hatware wrote:
| As stated earlier, AWS has financial incentive to not
| update the status page. Nobody is willing to call them on
| the conflict of interest in a meaningful, market-changing
| way.
| philistine wrote:
| Perhaps someone could produce an alternate, Patreon-
| supported status page that accurately reports on the
| status of AWS services.
| sam0x17 wrote:
| Would love to see them called out via new regulations or
| a lawsuit, however :)
| stefan_ wrote:
| With some lame ass tiny blue "connectivity issues"
| informational text. Surely broken routing to two entire DCs
| is full red for all services available therein?
|
| Like what, the networking is broken but if you could send
| packets, the services would still work so they are green?
| Hamuko wrote:
| I was still able to reach our service running in us-west-1
| when the connectivity issue was still on-going, so I don't
| know if it was a full interruption.
| adnauseum wrote:
| Seems like ever since Microsoft bought AWS, it's been going down
| an awful lot.
| exdsq wrote:
| Haha wtf?
| endisneigh wrote:
| > Seems like ever since Microsoft bought AWS, it's been going
| down an awful lot.
|
| What?
| rfoo wrote:
| Satire.
|
| Every time Github went down multiple people post on HN saying
| "ever since they were bought by Microsoft, ...". As annoying
| as those Rust evangelists on every single memory corruption
| bug.
| staticassertion wrote:
| > As annoying as those Rust evangelists on every single
| memory corruption bug.
|
| First of all, how dare you!
|
| Second, shoulda used rust ¯\_(ツ)_/¯
| metaltyphoon wrote:
| Obviously while using Arch btw
| [deleted]
| masterof0 wrote:
| Didn't know Tim Dillon is hanging on here in HN.
| iamricks wrote:
| How much do you guys think these frequent outages will affect
| their market share in cloud products?
|
| Is this enough of a push for organizations to actually move over
| their infrastructure to other providers?
| ceejayoz wrote:
| Not at all.
|
| The other cloud providers have had their own outages.
| bravetraveler wrote:
| Sadly this, people are entrenched with AWS and the... "We're
| not the only ones down" thing truly has some effect
|
| Organizations can more easily swallow an AWS failure when
| they aren't the only ones hit. If they move elsewhere, those
| outages look more unique
|
| Folks may think multi cloud is a good idea... But you're just
| as likely to suffer from the extra points of failure as you
| are to benefit
| tyingq wrote:
| Multi-cloud is such an odd idea to me. You're either
| building abstractions on top of things like cloud-provider
| specific implementations of CDNs, K8S, S3, Postgres,
| etc...or using the cloud just for VMs. The latter would be
| cheaper with just old-school hosting from Equinix,
| Rackspace, etc. The former feels like a losing battle.
| pm90 wrote:
| It's prompted discussions of building multi regional services
| in my org but not multi cloud. They would have to really really
| really screw up for that to happen... maybe be down for like a
| week or something.
| markbnj wrote:
| Our systems that talk to S3 in CA and OR are timing out trying to
| open SSL connections. AWS lists outages in these regions on their
| status page.
| belter wrote:
| AWS Outage Analysis - December 15, 2021:
|
| https://www.thousandeyes.com/blog/aws-outage-analysis-decemb...
|
| https://azycqgvwjz.share.thousandeyes.com/view/tests/?roundI...
| prakashqwerty wrote:
| leetcode.com is also down
| [deleted]
| hdjjhhvvhga wrote:
| An honest question. Why do you guys use AWS instead of dedicated
| servers? It's terribly expensive in comparison, nowadays equally
| complex, scalability is not magic and you need proper
| configuration either way, plus now the outages become more and
| more common. Frankly, I see no reason.
| dsr_ wrote:
| Once you have committed to a certain way of doing things, the
| transition costs can be very high.
|
| Let's consider RockCo and CloudCo. They both provide a B2B SAAS
| that is mostly used interactively during the working day, and
| mostly used via API calls for the rest of the working week.
| Demand is very much lower on weekends. Both RockCo and CloudCo
| were founded with a team of six people: a CEO who does sales, a
| CTO who can do lots of technology things, three general
| software developers, and one person who manages cloud services
| (for CloudCo) or wrangles systems and hosting (for RockCo).
|
| In the first year, CloudCo spends less on computing than RockCo
| does, because CloudCo can buy spot instances of VMs in a few
| minutes and then stop paying for them when the job is done.
| RockCo needs a month to significantly change capacity, but once
| they've bought it, it is relatively cheap to maintain.
|
| In the second year, they are both growing. CloudCo buys more
| average capacity, but is still seeing lots of dynamic changes.
| RockCo keeps growing capacity.
|
| In the third year, they're still growing. CloudCo is noticing
| that their bills are really high, but all of their
| infrastructure is oriented to dynamic allocation. They start
| finding places where it makes sense to keep more VMs around all
| the time, which cuts the costs a little. RockCo can't absorb a
| dynamic swing, but their bills are now significantly lower
| every month than CloudCo's bills, and the machines that they
| bought two years ago are still quite competitive. A four year
| replacement cycle is deemed reasonable, with capacity still
| growing. And bandwidth for RockCo is much cheaper than the same
| bandwidth for CloudCo.
|
| Who's going to win?
|
| Well, you can't tell. If they both got unexpectedly sudden
| growth surges, RockCo might not have been able to keep up. If
| they both got unexpected lulls, CloudCo might have been able to
| reduce spending temporarily. RockCo spent more up front but
| much less over the long term. CloudCo could have avoided hiring
| their cloud administrator for several months at the beginning.
| RockCo's systems and network engineer is not cheap. And so on,
| and so forth.
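|
| A rough sketch of the RockCo/CloudCo trade-off described above. Every
| number here (per-server cloud price, purchase price, hosting fee) is a
| made-up placeholder, not a real AWS or colo quote, and the function
| names are hypothetical. It only illustrates why the answer depends on
| how far average demand sits below peak demand.

```python
# Toy cost model. All prices are illustrative assumptions, not real quotes.

def cloud_cost(avg_servers: float, months: int,
               per_server_month: float = 300.0) -> float:
    """Pay-as-you-go: spend tracks average usage, nothing up front."""
    return avg_servers * per_server_month * months


def owned_cost(peak_servers: int, months: int,
               purchase_price: float = 6000.0,
               hosting_per_server_month: float = 60.0) -> float:
    """Owned hardware: buy for peak up front, then pay a monthly hosting fee."""
    return peak_servers * (purchase_price + hosting_per_server_month * months)


def break_even_month(avg_servers: float, peak_servers: int, horizon: int = 48):
    """First month where cumulative cloud spend exceeds owned spend, else None."""
    for month in range(1, horizon + 1):
        if cloud_cost(avg_servers, month) > owned_cost(peak_servers, month):
            return month
    return None


if __name__ == "__main__":
    # Flat demand: average equals peak, so owning pays off within the horizon.
    print(break_even_month(avg_servers=10, peak_servers=10))
    # Spiky demand: peak is 4x average, so owning never catches up in 4 years.
    print(break_even_month(avg_servers=10, peak_servers=40))
```

| With these placeholder prices, flat demand favours owned hardware and
| spiky demand favours the cloud, which is the "you can't tell" point.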
| jrs235 wrote:
| I checked their health status page. All is good. /s
|
| https://downdetector.com/status/aws-amazon-web-services/
| drcongo wrote:
| What does downdetector run on?
| NobodyNada wrote:
| User reports -- i.e. the number of people who google "is X
| down" and then click a Down Detector link.
|
| It's a clever way of getting reasonably accurate data very
| quickly and easily, though it does have its flaws -- the
| data is pretty noisy and users often attribute outages to the
| wrong service (e.g. blaming their ISP or Microsoft or
| something when YouTube is down, or vice versa).
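|
| A minimal sketch of how noisy report counts could be turned into an
| outage signal. This is an assumption for illustration only; it is not
| Down Detector's actual method, and `is_spike` and its threshold are
| hypothetical names and values.

```python
from statistics import mean, stdev


def is_spike(history: list[int], current: int, threshold: float = 3.0) -> bool:
    """Flag `current` if it is more than `threshold` standard deviations
    above the mean of recent report counts."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    # Treat the spread as at least one report so a flat baseline
    # does not turn every tiny bump into an alert.
    spread = max(stdev(history), 1.0)
    return current > baseline + threshold * spread


if __name__ == "__main__":
    recent = [4, 6, 5, 7, 5, 6]      # reports per interval on a normal day
    print(is_spike(recent, 8))       # ordinary noise
    print(is_spike(recent, 120))     # everyone googling "is X down" at once
```

| Misattributed reports still count toward the wrong service, so a
| threshold like this only separates big spikes from background noise.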
| tyingq wrote:
| They did add an update, faster than last time:
|
| _" 7:42 AM PST We are investigating Internet connectivity
| issues to the US-WEST-2 Region."_
|
| https://status.aws.amazon.com/
|
| Edit: They added US-WEST-1:
|
| _" 7:52 AM PST We are investigating Internet connectivity
| issues to the US-WEST-1 Region."_
|
| Edit: Found root cause, maybe?
|
| _" 8:01 AM PST We have identified the root cause of the
| Internet connectivity to the US-WEST-1 Region and have taken
| steps to restore connectivity. We have seen some improvement to
| Internet connectivity in the last few minutes but continue to
| work towards full recovery."_
|
| _" 8:01 AM PST We have identified the root cause of the
| Internet connectivity to the US-WEST-2 Region and have taken
| steps to restore connectivity. We have seen some improvement to
| Internet connectivity in the last few minutes but continue to
| work towards full recovery."_
| jrs235 wrote:
| Seems to be resolved now. And seems they hid / took away any
| mentioning of possible issues. Sigh.
| zaltekk wrote:
| It's still there now, on the top of the page, just marked
| resolved:
|
| us-west-1:
|
| 7:52 AM PST We are investigating Internet connectivity
| issues to the US-WEST-1 Region.
|
| 8:01 AM PST We have identified the root cause of the
| Internet connectivity to the US-WEST-1 Region and have
| taken steps to restore connectivity. We have seen some
| improvement to Internet connectivity in the last few
| minutes but continue to work towards full recovery.
|
| 8:10 AM PST We have resolved the issue affecting Internet
| connectivity to the US-WEST-1 Region. Connectivity within
| the region was not affected by this event. The issue has
| been resolved and the service is operating normally.
|
| us-west-2:
|
| 7:43 AM PST We are investigating Internet connectivity
| issues to the US-WEST-2 Region.
|
| 8:01 AM PST We have identified the root cause of the
| Internet connectivity to the US-WEST-2 Region and have
| taken steps to restore connectivity. We have seen some
| improvement to Internet connectivity in the last few
| minutes but continue to work towards full recovery.
|
| 8:14 AM PST We have resolved the issue affecting Internet
| connectivity to the US-WEST-2 Region. Connectivity within
| the region was not affected by this event. The issue has
| been resolved and the service is operating normally.
| iJohnDoe wrote:
| That is a shame. Anyone coming in after the fact to
| investigate an outage or glitch with their systems will
| need to look harder to find a known AWS outage. We can't
| assume everyone looks at HN.
| alvis wrote:
| So it is down again.
| savant_penguin wrote:
| Practice makes perfect
| ziddoap wrote:
| Too bad I am unable to load the status page due to connection
| timeouts, so I can't see the updates.
| chasd00 wrote:
| someone tripped over the fiber run i bet. Or, a cleaning
| person unplugged a router to plugin a vacuum (that actually
| happened but to a minicomputer iirc)
| darepublic wrote:
| Unfortunately the vacuum, a shiny IoT connected appliance,
| didn't work because AWS was down
| belter wrote:
| Mandatory...
|
| "The Cloud" https://xkcd.com/908/
| Hamuko wrote:
| Usually the problem is "an idiot with a digger".
| jve wrote:
| No way a cleaning person can do that in a datacenter.
| erhk wrote:
| I hope that their infra is not that unstable
| JshWright wrote:
| It's interesting that west-2 was quicker to create the
| incident (despite the issue starting a bit later there, at
| least by our experience), and while they both "identified" at
| the same time, west-2 also waited longer to call it resolved.
|
| I assume there are different teams responsible for each, is
| the west-2 team just more on top of things?
| tyingq wrote:
| West-2 also launched many years after us-east-1, so less
| legacy to deal with.
| vineyardmike wrote:
| 1. US-East-1 wasn't involved today.
|
| 2. They don't really have much "legacy" stuff to deal
| with since they likely turn over racks quickly across
| their whole fleet and software deployments should be
| standardized, so any US-east-1 flakiness has to do with
| the fact that it's where Amazon houses their control
| planes often.
| kulikalov wrote:
| The issue is not specific to the US, same issues in Europe.
| Also, it seems not only AWS experiencing issues. Unless
| Google is hosted on AWS haha...
| tyingq wrote:
| Yes, it could be network peering related. But there's
| definitely a lot of us-west-1 and us-west-2 users
| complaining and people saying that us-east-1 seems fine.
| ornornor wrote:
| Yep, when it loads, it's all green. "nine nines!!!"
| buitreVirtual wrote:
| emphasis on _when_
| adamisom wrote:
| 60% of the time, it's all-green 100% of the time
| Cort3z wrote:
| Down detector is just a statistical page, it does not actually
| detect downtime, and is in no way aws's status page.
| sgt wrote:
| Ok, so it can't be down then. This is proof!
| commandlinefan wrote:
| "Hey boss, that thing that took down us-east-1... that can't take
| down us-west-1 next week, can it?"
|
| "No, no, of course not"
|
| "Should I check?"
|
| "No, don't waste time checking, get back to your TPS reports"
| niks2112 wrote:
| we are having issue with us-west-1 and us-west-2
| wenbin wrote:
| ListenNotes.com has servers running on us-west-2.
|
| One issue is that outbound requests from our servers us-west-2
| timeout. Other than that, it seems that we are running ok so far.
| cblconfederate wrote:
| Reminder that the internet was literally invented to avoid this
| kind of nuclear attack. But i guess people are herdish animals
| and prefer to die as a group
| throw_m239339 wrote:
| More like ultimately all these companies buy into a certain
| form of vendor lock-in and they have no competence or
| willingness to migrate or even consider the competition. It
| starts with "oh I'm just renting a remote virtual server" and
| in no time it's "Oh, all my stack is tied to AWS proprietary
| products" because convenience. That's what Amazon wants.
| doublepg23 wrote:
| Seems like the Internet level networking is quite robust at
| this point.
| lukeqsee wrote:
| We lost all public IPv6 in the Linode Newark DC.
|
| This appears to be cross-provider.
|
| Edit: We have IPv6 back.
| iJohnDoe wrote:
| Yes, seeing it too.
|
| Seems to be down in a major way. Lots of various AWS services are
| down. However, so many things depend on AWS that it could just be
| EC2 is down and it is causing a ripple effect.
| Zelphyr wrote:
| I think it's time to face the fact that we all have too many of
| our eggs in the AWS basket.
| mattjaynes wrote:
| Tangentially related: On Friday Backblaze and B2 were down for
| 10+ hours to update their systems for the log4j2 vulnerability.
| Seemed noteworthy for the HN crowd and I posted a link to their
| announcement when the outage began. However, the post was quickly
| flagged and disappeared. Genuinely curious, why is announcing
| some outages ok and others not?
| qaq wrote:
| What would be the ratio of HNers who are Backblaze customers vs
| those who are AWS customers. I bet Backblaze number is small
| enough where Backblaze employees on HN can downvote you enough
| for it to matter.
| aaronharnly wrote:
| Everyone who spent the past week migrating from us-east-1 to us-
| west-2: this joke is on you. :)
| DarthNebo wrote:
| "US-EAST-1 or bust" being manifested right now.
| theverything wrote:
| Slack seems to be having issues too.
| mrsuprawsm wrote:
| Seems like this is affecting Dropbox paper, at least for me.
| [deleted]
| gz5 wrote:
| looks specific to certain (possibly AWS hosted or partially
| dependent) services such as Auth0:
|
| https://status.auth0.com/
|
| e.g. our services running on AWS are fine right now, but new
| sessions dependent on Auth0 are not.
| tuzemec wrote:
| Is that related to the current NPM status
| (https://status.npmjs.org/)?
| Graffur wrote:
| I thought the whole point of AWS was that you could fail over to
| a different location?
| monkeybutton wrote:
| I really appreciate seeing these threads. Lets me know I haven't
| lost it.
| moneywoes wrote:
| Back up
| [deleted]
| mtschopp wrote:
| Could it be related to a Log4j issue?
| FoLeyy wrote:
| npmjs returning 503
| zonkd1234 wrote:
| yes. Having issues as of few mins ago reaching us-west-2 ec2.
| zonkd1234 wrote:
| us-west-2 EC2 looks like just came back online.
| chejazi wrote:
| wohoo! ssh'd back in. ty
| rakem wrote:
| proof:
| https://twitter.com/thedrunkneteng/status/147114428947652608...
| devin wrote:
| Can someone please update the title to be broader than AWS?
| TekMol wrote:
| It is surprising that their status page is down too:
|
| https://status.aws.amazon.com
|
| Their CDN, CloudFront, always works reliably for me. Couldn't
| they put the status page on CloudFront?
| [deleted]
| drcongo wrote:
| Not working for me either in the UK.
| mhitza wrote:
| Takes minutes to update a CloudFront distribution (they say
| around 5 minutes in their blog post from last year when speed
| was improved [1]). I think they might want to be able to change
| it to "everything's back to normal" in an instant, based on the
| SLA argument I've seen thrown around last time an AWS region
| was down.
|
| [1] https://aws.amazon.com/blogs/networking-and-content-
| delivery...
| ceejayoz wrote:
| It's minutes to update the distribution _settings_ , but that
| doesn't have to be the case for the content itself. A much
| lower cache time can be used.
| electroly wrote:
| The status page is working great for me. Did they make it
| multi-region after the last failure? I'm on the east coast.
| Sebb767 wrote:
| Central EU here, appears to be down.
| Hamuko wrote:
| Northern EU, down as well. AWS Management Console in eu-
| west-1 opens up just fine though.
|
| Edit: Hitting refresh a bunch finally got it open.
| oneplane wrote:
| Western EU here, appears to be up for me. Maybe a peering
| issue?
| Sebb767 wrote:
| It's back up for me, too, right now. Rather slow, though,
| and traceroute shows 25 hops. So it might really be
| peering.
| hericium wrote:
| Works for me. It's the usual static page with everything green.
| johnisgood wrote:
| Maybe it is just a static website. Do they even have CSS for
| red? :D
| clavicat wrote:
| Down for me, as well.
| tomlagier wrote:
| I wonder if AWS will make more or less money from these outages?
|
| Will large players flee because of excessive instability? Or will
| smaller players go from single-AZ to more expensive multi-AZ?
|
| My guess is that no-one will leave and lots of single-AZ tenants
| who should be multi-AZ will use this as the impetus to do it.
|
| Honestly, having events like this is probably good for the
| overall resilience of distributed systems. It's like an immune
| system, you don't usually fail in the same way repeatedly.
| gjvr wrote:
| I would not go multiple Availability Zone within the same
| Infra/Cloud provider...
| andy_ppp wrote:
| * Free chaos monkey installed in every AZ
| jjav wrote:
| > * Free chaos monkey installed in every AZ
|
| Only during this beta period, AWS will start charging for
| this feature soon enough.
| jedberg wrote:
| We (Netflix) begged them for years to create a Chaos Monkey
| that we could pay for. There were things we just couldn't
| do ourselves, like simulate a power pull or just drop all
| network packets on the bare metal. I guess not enough
| people asked.
| kenhwang wrote:
| If my company is any indication, they're going to make more
| money since everyone will simply check the multi-AZ or multi-
| region checkboxes they didn't before and throw more money at
| the problem instead of doing proper resiliency engineering
| themselves.
| gizmodo59 wrote:
| It doesn't matter how much of resiliency engineering you do.
| Having everything in a single AZ is a risk. If this is
| acceptable then it's fine if not you need to think of multi
| az from day 1.
| urthor wrote:
| The actual answer?
|
| In the next 5 calendar years the bottom line will still grow.
|
| However, the brand damage means they permanently lose market
| share. Which impacts their growth ceiling.
| ransom1538 wrote:
| "Or will smaller players go from single-AZ to more expensive
| multi-AZ"
|
| Yes! When you have a service interruption pay 2x more! With a
| region down I am sure other regions won't have any interruptions
| either! /s
| jorblumesea wrote:
| No one just "moves off" AWS. Once your apps are spaghetti coded
| with lambdas, buckets and all sorts of stuff, it's basically
| impossible to get off. More than likely, as you noticed, it
| will increase spending since multi-AZ/multi-region will become
| the norm.
| s_dev wrote:
| >I wonder if AWS will make more or less money from these
| outages?
|
| There is no possibility that outages are good for AWS. Nor is
| there more money to be made from "publicity" of the outages.
| moralestapia wrote:
| I think GP has a point with,
|
| >Or will smaller players go from single-AZ to more expensive
| multi-AZ?
| s_dev wrote:
| No -- if they needed to they already would have migrated to
| a multi-region. If they don't need it -- they won't have.
| The reason is simple -- it's expensive as you say. I'm not
| a fanboi or evangelist of AWS either -- I do have pet
| theories they named their products with shit names in order
| to make more money by making AWS skills less transferable
| to Google Cloud etc. S3 should be Amazon FTP, RDS should be
| Amazon SQL etc.
| Cederfjard wrote:
| You're saying businesses always make the right decisions
| and never put them off?
| jedberg wrote:
| Not at all the case. It was a regional outage that got
| Netflix to more than double our AWS spend going multi-
| region, so that outage netted them millions of extra
| dollars per year just from Netflix.
| dilyevsky wrote:
| You're underestimating the ability of eng leadership to
| not take these issues seriously. Only when there's
| sufficient pressure from the very top or even the
| customers it takes a priority.
| llbeansandrice wrote:
| S3 is nothing like FTP? RDS stands for Relational
| Database Service. You have a valid point but picked the
| worst examples.
| hagbarddenstore wrote:
| S3 is Simple Storage Service
| RDS is Relational Database Service
| EC2 is Elastic Compute Cloud
|
| All of these make sense.
|
| If you're gonna complain about names, at least pick the
| really sucky ones, like Athena, Snowball, etc.
| apetresc wrote:
| > S3 should be Amazon FTP
|
| I... don't think you know what S3 is. Or maybe what FTP
| is.
|
| (Also S3, EC2, RDS, etc. were named long before GCP had
| competing services)
| ketzo wrote:
| I mean, _lots_ of people put off doing something
| expensive but safer just because it's expensive, but
| rethink after the consequences show.
| yawnxyz wrote:
| Vercel is down too.
|
| My sites run on Cloudflare and Vercel, and I can't even log in to
| those right now.
|
| I'm curious -- what does Hacker News run on? It seems impervious
| to any kind of downtime...
| qeternity wrote:
| > I'm curious -- what does Hacker News run on? It seems
| impervious to any kind of downtime...
|
| On a dirty, disgusting dedicated server.
| yawnxyz wrote:
| > On a dirty, disgusting dedicated server.
|
| I'm adding "reliable" into that mix. Too bad they're too
| expensive and hard to setup for side projects, but HN is
| probably one of the most stable sites I frequently visit, and
| I don't even think about it.
| Nextgrid wrote:
| I disagree that they're expensive. Expensive to _own_
| maybe, but you can rent them on a monthly basis from
| something like Hetzner or OVH for a fraction of the cost of
| AWS (especially when you include bandwidth which is free
| and unmetered in this case) and they handle hardware
| maintenance for you.
|
| Hard to setup is relative. It all depends on what you're
| doing and how much reliability you need. For a side project
| or a dev server you can just start with Debian, stick to
| packaged software (most language runtimes and services such
| as Postgres or Redis are available) as much as possible and
| call it a day. You can even enable auto-updates on such a
| stable distro.
|
| The knowledge you'll gain by dealing with bare-metal is
| also going to be useful in the cloud even in container
| environments.
| jjav wrote:
| > I'm adding "reliable" into that mix. Too bad they're too
| expensive and hard to setup for side projects
|
| I mean they're not particularly. Unless the use is
| extremely minimal, it'll always be cheaper to buy a small
| server even for a side project.
|
| I use cloud VMs for projects that can live on $5/mo VMs
| because at that usage rate I'll never break even to buy a
| machine.
|
| But as soon as your AWS bill is even like $50/mo, it's worth
| looking at alternatives.
| [deleted]
| hn_throwaway_69 wrote:
| DNS A record suggests a dedicated server from this company:
|
| https://www.m5hosting.com/
| ceejayoz wrote:
| HN definitely gets overloaded at times, including during big
| outages when everyone stampedes here. I got a bunch of "sorry,
| we can't serve your request" a little while back.
|
| Pobody's nerfect.
| iso1631 wrote:
| Must be a Y in the day.
|
| It amazes me how many projects exist that don't even have multi-
| region capability, let alone no single point of failure
| ahallock wrote:
| You're saying that as if it's a walk in the park to set up and
| not cost prohibitive, in terms of opportunity cost and budget,
| especially for smaller companies.
| tylerrobinson wrote:
| Right. Downtime (or perception of downtime) is bad for
| business, so AWS is surely working to improve reliability to
| avoid more black eyes on their uptime. But at the same time,
| an AWS customer might be considering multi-region
| functionality in AWS to protect themselves ... from AWS
| making a mistake.
|
| As a customer, it's unclear what the right approach is.
| Invest more with your vendor who caused the problem in the
| first place, or trust that they'll improve uptime?
| electroly wrote:
| Multi-region is difficult and expensive, and a lot of projects
| aren't that important. Most of our infrastructure just isn't
| that vital; we'd rather take the occasional outage than spend
| the time and money implementing the sort of active-active
| multi-region infrastructure that a "correct" implementation
| would use. We took the recent 8 hour us-east-1 outage on the
| nose and have not reconsidered this plan. It was a calculated
| risk that we still believe we're on the right side of. Multi-AZ
| but single-region is a reasonable balance of cost, difficulty,
| and reliability for us.
| dilyevsky wrote:
| Curious if you tell your customers you're totally ok with
| having lower than 99.9 availability
| electroly wrote:
| We don't have any external customers; they are all
| internal. We're all on the same side of the table.
| dilyevsky wrote:
| Sounds like even worse deal for the customer since there
| is no refund
| zimmerfrei wrote:
| How many 9s can you get from a single-region multi-AZ
| deployment not on us-east-1 and which only uses basic
| services (EC2, IAM, S3, DynamoDB, etc)?
|
| Really only 3?
| dilyevsky wrote:
| Depends on how critical they are to your stack. Ime if
| you use more than a few products and either one of them
| can take you down yeah it's less than 3. Just something
| to ponder but if s3 didn't meet 99.9 for the month you
| get a whopping 10% back. Other cloud vendors aren't much
| better at this (actually worse). Not even to mention that
| you need to leave some room for your own fuckups
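|
| To make the "any one of them can take you down" arithmetic concrete:
| a sketch assuming each dependency fails independently and that losing
| any one of them takes the whole stack down. The 99.9% figures are
| illustrative inputs, not measured AWS numbers, and
| `combined_availability` is a hypothetical helper.

```python
def combined_availability(*availabilities: float) -> float:
    """Availability of a stack in which every dependency must be up.

    Assumes failures are independent, so the individual
    availabilities simply multiply.
    """
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


if __name__ == "__main__":
    # Five critical services, each at "three nines".
    stack = combined_availability(0.999, 0.999, 0.999, 0.999, 0.999)
    print(f"{stack:.5f}")   # already below 0.999 before your own mistakes
```

| Real outages are often correlated (one network event hits several
| services at once), so treat the product as a rough guide only.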
| Spivak wrote:
| This might be a multi-region problem. Auth0 as an example has
| three US regions and two of them are down.
| staticassertion wrote:
| IDK, don't you end up with a bunch of extra costs? Like you're
| going to literally pay more money because now you have cross
| region replication charges, and then you're going to pay a
| latency cost, and then you may end up needing to overprovision
| your compute, etc.
|
| All to go from, idk, 99.9% uptime to 99.95% (throwing out these
| numbers)? The thing is when AWS goes down so much of the
| internet goes down that companies don't really get called out
| individually.
| dilyevsky wrote:
| If you just sat there and took that 8 hour outage you're
| barely even 99.9 for the year
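|
| The arithmetic behind that claim, as a small sketch. It assumes a
| 365-day year and that the 8-hour outage is the only downtime counted;
| the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def availability_pct(downtime_hours: float,
                     period_hours: float = HOURS_PER_YEAR) -> float:
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


def downtime_budget_hours(target_pct: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime allowed in the period while still meeting the target."""
    return period_hours * (1.0 - target_pct / 100.0)


if __name__ == "__main__":
    print(f"{availability_pct(8):.3f}%")          # one 8-hour outage in a year
    print(f"{downtime_budget_hours(99.9):.2f} h")  # yearly budget at 99.9%
```

| A 99.9% yearly target allows about 8.76 hours of downtime, so a single
| 8-hour outage uses up nearly all of it.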
| QuiiBz wrote:
| I tried to monitor services status using
| https://stop.lying.cloud, but they are also hosted on AWS, and
| down too.
| saganus wrote:
| How does this service work?
|
| It seems to have all the look and feel of AWS, and somehow has
| more up to date info than the official AWS status page?
| fishtoaster wrote:
| It's the same info - it just changes all blues to yellows and
| all yellows to reds. :)
| saganus wrote:
| I had no idea!
|
| Pretty funny actually.
| mishftw wrote:
| Funny I didn't know that and assumed it was okay
| taormina wrote:
| I mean, sounds like it's working as intended then?
| synergy20 wrote:
| AWS should monitor itself from Azure or GCP, even DO or Linode
| makes more sense.
|
| Eat your own dog food shows confidence, but monitoring it is a
| different dimension, you need use anything but your own dog
| food there.
| mrslave wrote:
| A similar reason drives businesses to host
| `status.product.bigcorp` on a different server. And if your
| product is a cloud then your suggestion makes sense.
| skeeter2020 wrote:
| It's the only realistic multi-cloud provider scenario I can
| ever come up with that I would consider actually
| implementing...
| itisit wrote:
| AWS wouldn't monitor itself from a competitor, of course, but
| they could just as well silo a team and isolate DCs to do
| independent self-auditing.
| rozenmd wrote:
| I don't know about AWS, but I know a lot of us uptime
| monitoring makers use (and pay for) competitors' products
| to know if we're down.
| itisit wrote:
| Rightly so. My point is a company can self-audit without
| having to pay a competitor.
| lostlogin wrote:
| Absolutely. And even if it's cheaper to use the
| competition, an expensive custom solution will be found.
| whimsicalism wrote:
| I think that is inherently riskier because you never know
| on what axis you will have a failure and it is difficult
| to exclude all shared axes.
| 8note wrote:
| That applies when you use competitors too.
|
| They could have a related outage, or even a
| coincidentally timed one
| jethro_tell wrote:
| But we're talking about a status page which should be
| basically static. In its simplest form you need a rack
| in 2+ random colos and a few people to manage the page
| update framework. Then you make teams submit the tests
| that are used to validate SLA. Run the tests from a few
| DCs and rebuild the status page every minute or two.
|
| Maybe add a CDN. This shit isn't rocket science and being
| able to accurately monitor your own systems from off
| infrastructure is the one time you should really be
| separate.
| reaperducer wrote:
| _AWS wouldn't monitor itself from a competitor, of course_
|
| Why not? The big tech companies use each other all the
| time.
|
| For example, set up a new firewall on macOS and you can see
| how many times Apple pulls data from Amazon or Azure or
| other competitors' APIs and services.
| jethro_tell wrote:
| Apple is not a competitor to AWS or Azure in any way.
| They offer no infrastructure/platform as a service that
| I am aware of.
| reaperducer wrote:
| Apple and Amazon are competitors. Apple and Microsoft are
| competitors.
|
| The postulation was that Apple and Amazon weren't
| competitors. Not that they're not competitors in a
| specific niche.
| throwaway81523 wrote:
| They have a bazillion alexa and kindle devices out there
| that they could monitor from, heh heh. At least let that
| phone-home behaviour do something useful, like notice AWS
| is down.
| QuinnyPig wrote:
| Yeah, I homed https://stop.lying.cloud out of us-west-2. Oops.
| mrslave wrote:
| Considering the sea of bright green circles, reds might stand
| out but blues get lost in a fast scroll. Perhaps fade or mute
| the green icon to improve visibility of non-green which is
| the interesting information?
| yonig wrote:
| The brand is strong if you're really the owner
| moneywoes wrote:
| That's hilarious
| tinco wrote:
| Now that they're back up they're not reporting any problems,
| how is it supposed to work? It looks like it is just repeating
| the status reported on the Amazon status page.
| fishtoaster wrote:
| It is. It's just the AWS status page run through a
| transformation function to:
|
| 1. Remove all the thousand green services that no one cares
| about when looking at AWS status
|
| 2. Upgrade all yellows to reds because Amazon refuses to list
| anything as "down" no matter how bad the outage is.
|
| 3. Insert a snarky legend
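|
| The first two steps above could be sketched roughly like this. The
| input format (a mapping of service name to colour) and the colour
| names are assumptions for illustration; the real site's code and the
| AWS status page's actual data format are not shown in this thread.

```python
# One-level severity escalation, following the description:
# blues become yellows, yellows become reds.
SEVERITY_UPGRADE = {"blue": "yellow", "yellow": "red"}


def transform(statuses: dict[str, str]) -> dict[str, str]:
    """Drop green services and escalate every remaining status one level."""
    return {
        service: SEVERITY_UPGRADE.get(colour, colour)  # reds stay red
        for service, colour in statuses.items()
        if colour != "green"
    }


if __name__ == "__main__":
    page = {"ec2": "yellow", "s3": "green", "rds": "blue", "iam": "red"}
    print(transform(page))
```

| When everything is green the result is empty, which matches the
| observation that the page shows nothing once AWS reports no problems.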
| civilized wrote:
| If they're monitoring AWS downtime they might want to rethink
| this.
| cinntaile wrote:
| How come? It's accurate.
| johnisgood wrote:
| True, if it is down, then that means AWS is down (not
| necessarily, obviously). :D But honestly, if they want to
| monitor AWS, they gotta pick something else for this
| reason, something that is not down when AWS is.
| jeanlucas wrote:
| Well... Yes. Hahahah
| civilized wrote:
| I guess it depends on whether you like your FALSE's encoded
| as timeouts :)
| [deleted]
| Sholmesy wrote:
| Yup, seeing this on us-west-1
| wirelesspotat wrote:
| We're seeing AWS issues with us-west-2 at [medium-sized tech
| company]
| 1123581321 wrote:
| Yes, all our stuff in west-2 went down at 7:15 PT.
| 8K832d7tNmiQ wrote:
| Twitch video streaming is also down right now:
|
| HTTP Error 500 internal server error
| justinc8687 wrote:
| us-west-2 stuff is down for me too
| blueside wrote:
| The vehement defenders of AWS are starting to remind me of the
| cryptobros
| [deleted]
| Jamie9912 wrote:
| Twitch seems to have recovered, is it back now for everyone?
| bloaf wrote:
| Still getting errors in Houston
|
| edit: some streams back up, chat still buggy as of 09:55 local
| time
|
| edit2: appears to be back ~10:00 local time
| [deleted]
| turtlebits wrote:
| That was fun. Badges weren't working (daily checkin required) so
| the front desk had to manually activate them.
|
| Slack wasn't sending messages and Pagerduty was throwing 500's.
| api wrote:
| ... because you need to contact a server 1000 miles away to
| issue badges in your building.
|
| This cloud-for-everything-even-local-devices thing is both
| hilarious and sad.
|
| I wonder if anyone had trouble doing their dishes or laundry
| today, because I'm sure someone thought dish washers and
| washing machines needed cloud.
| owlbynight wrote:
| Yes, everyone but you is wrong.
|
| Many logical people have decided to abstract away their soul-
| crushing anxieties and legal gray area during outages to
| incredibly stable and well-staffed cloud infrastructure
| providers.
|
| If you and your team are better at taking care of hardware
| than an entire building full of highly paid engineering
| specialists, then that's cool for you, but also, no you're
| not.
|
| That's not to say you're not capable of running on-prem
| hardware that is stable.
|
| I'm just saying that the high-handed swiping away of everyone
| else who's made an incredibly safe and logical decision to
| host their stuff in the cloud makes me question your general
| vibe.
| Nextgrid wrote:
| > but also, no you're not.
|
| If you plan to replicate all of AWS I'd agree with you. But
| if all you need is a handful of servers, you could end up
| with better uptime doing it in-house just because you don't
| have all the moving parts that make AWS tick, reducing the
| chance for something to go wrong.
|
| My bare-metal servers stayed up during both of the recent
| outages, not because I'm some kind of genius that's better
| than the AWS engineers but just because it's a dead simple
| stack that has zero moving parts and my project doesn't
| require anything more complex.
| user3939382 wrote:
| > If you and your team are better at taking care of
| hardware than an entire building full of highly paid
| engineering specialists
|
| The trade offs aren't quite that simple. Those specialists
| are necessary because they're building and maintaining
| infrastructure that's extremely complex since it has a
| crazy scale and has to be all things to all people. When
| you're running in-house, your infrastructure is simpler
| because it's custom tailored to your specific requirements
| and scale.
|
| There are tradeoffs that make cloud vs local make sense in
| different contexts and there's no one right answer.
| jjav wrote:
| There is absolutely no reason for a local device (like a
| door lock or dishwasher as per OP) to depend on any
| external connectivity. Not to the company on-prem hardware,
| not to AWS.
| ec109685 wrote:
| I don't know if you can say an on-premise badge hosting
| service would be more reliable than the cloud.
| kazen44 wrote:
| well, at least you have the agency to do something about it
| yourself.
|
| also, building access systems should be hosted in the
| building they reside in for security reasons anyways.
| marcosdumay wrote:
| This creates some really fun failure cases of the form
| "I need to enter the building so anybody can enter the
| building".
|
| Depending on the cloud is certainly a very stupid
| decision. Keeping everything inside the building is
| better, but still not ideal.
| jjav wrote:
| Any electronic access system like this requires manual
| backup. As in, some doors with regular locks using
| physical keys.
| kazen44 wrote:
| it requires an override anyways in case of emergencies
| like a fire.
| reaperducer wrote:
| Taking badges out of the cloud reduces points of failure by
| several orders of magnitude.
|
| Cloud-based badges make sense if you have locations with
| small staffs and no HR people or managers. Like if you're
| controlling access to a microwave tower on the top of a
| mountain.
|
| But badges-in-the-cloud for an office building full of
| people who are being supervised by supposedly trusted
| managers, and all of whom have been vetted for security and
| by HR, is just being cheap.
|
| Like the 1980's AT&T commercials used to say: "You get what
| you pay for."
| orourkek wrote:
| > Taking badges out of the cloud reduces points of
| failure by several orders of magnitude.
|
| I'm not convinced that's true, or at least certainly not
| an order of magnitude. Wouldn't a badge system hosted on-
| prem also need a user management system (database), a
| hosted management interface, have a dependency on the
| LAN, and need most of the same hardware? Such a system
| would also need to be running on a local server(s), which
| introduces points of failure around power
| continuity/surges, physical security, ongoing
| maintenance, etc.
| jjav wrote:
| The remote solution requires all of those same things,
| plus in addition it requires internet connectivity to be
| up and reliable, the cloud provider be available and the
| third party company be up and still in business.
|
| Adding complexity and moving parts never reduces points
| of failure. It can reduce daily operating worries as long
| as everything works, but it can't reduce points of
| failure. It also means that someday when it breaks, the
| root causes will be more opaque.
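|
| A rough sketch of the argument above, assuming every component
| fails independently and the system works only if all of them
| work. The component names and availability figures are made up
| for illustration; they are not measurements of any real badge
| system.

```python
from math import prod


def series_availability(components):
    """Availability of a chain where every component must be up.

    Assumes independent failures, so the result is the product of
    the individual availabilities (each between 0 and 1).
    """
    return prod(components.values())


# Hypothetical figures, chosen only to illustrate the shape of the argument.
ON_PREM = {
    "badge_database": 0.999,
    "management_ui": 0.999,
    "lan": 0.999,
    "server_power": 0.999,
}

# The remote setup keeps a comparable chain and adds more links in series.
CLOUD = {
    **ON_PREM,
    "internet_link": 0.999,
    "cloud_provider": 0.999,
    "vendor_still_operating": 0.999,
}

if __name__ == "__main__":
    print("on-prem:", series_availability(ON_PREM))
    print("cloud:  ", series_availability(CLOUD))
```

| Under these assumptions, adding links in series can only lower
| the product. The sketch does not model the counterpoint that a
| cloud provider may run each shared component more reliably than
| a local server closet would.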
| reaperducer wrote:
| All of those things would also be needed by the cloud
| provider, too. Just because it's on-prem doesn't mean it
| doesn't need servers, power conditioning, physical
| security, etc. "Cloud" isn't magic fairies. It's just
| renting someone else's points of failure.
|
| In addition, you're forgetting the thousands of points of
| failure between the building and the cloud provider.
| Everything from routers being DDOSed by script kiddies to
| ransomware gangs attacking infrastructure to Phil
| McCracken slicing a fiber line with his new post hole
| digger.
| thadjo wrote:
| obligatory comment about status page showing seas of green:
| https://status.aws.amazon.com
| tubignaaso wrote:
| The status page appears to be down now as well.
| BoiledCabbage wrote:
| Maybe they got so much flak last time for it being worthless,
| that they just decided to pull it this time??
| thadjo wrote:
| yep I'm seeing that too - wow.
| gregmfoster wrote:
| Love to see the manually updated status page not updating
| arsome wrote:
| For this kind of thing it's usually better to just use a
| user-driven site like:
| https://downdetector.ca/status/aws-amazon-web-services/
|
| Some users are clueless, but the clueless users average out
| over time and the spikes make it clear when there are actual
| issues.
| rpadovani wrote:
| For me, it's down as well
| [deleted]
| TheFragenTaken wrote:
| At least Twitch.tv (Amazon subsidiary) and npmjs.com seem to be
| affected.
| AustinDev wrote:
| Yeah, I'm getting 2000 player errors in the Twitch video
| player.
| yabones wrote:
| Yep, it's broken again. I was trying to install some Thunderbird
| extensions, and stuff started breaking halfway through. Never
| thought of an AWS outage borking my mail client I guess...
| the_iceman wrote:
| Confirmed experiencing significant issues in US-WEST-1 as well
| CoastalCoder wrote:
| Asking as a non-cloud-developer: why would Crunchyroll's recovery
| [0] lag so much behind AWS's recovery [1]?
|
| [0] https://downdetector.com/status/crunchyroll/
|
| [1] https://downdetector.com/status/aws-amazon-web-services/
| spenczar5 wrote:
| I don't know for sure, but this is generally common because
| caches get cold.
|
| A lot of websites use a cache in front of databases (or
| template rendering engines, or many other systems). That cache
| might evict entries based on time - after 5 minutes, the entry
| is considered invalid.
|
| But that means that if you have no traffic for 10 minutes, the
| cache completely empties. Then when traffic returns, it all
| skips the cache and actually triggers a real hit to the backend
| - which is now overwhelmed with traffic. The cache protects the
| backend in normal behavior, but now it's not doing its job, so
| the backend has many more requests than usual.
|
| In the worst case, those requests are enqueued in a big serial
| sequence... but the ones at the back of the queue may time out.
| The client may do something like say "it's taken me 5 seconds
| and I still don't have a response - I'll abort and retry!" and
| now you have even _more_ traffic to deal with.
|
| So cold caches and retries can conspire to keep a service down
| for a long time even after the root cause is fixed.
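|
| A minimal sketch of the two effects described above, assuming a
| purely time-based cache and a backend with a fixed capacity per
| tick. The class, function names, and numbers are hypothetical
| and are not taken from any real service.

```python
class TTLCache:
    """Cache that treats entries older than `ttl` seconds as invalid."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # key -> (value, stored_at)

    def get(self, key, now):
        hit = self.entries.get(key)
        if hit is None:
            return None
        value, stored_at = hit
        if now - stored_at >= self.ttl:
            del self.entries[key]  # expired purely because of age
            return None
        return value

    def put(self, key, value, now):
        self.entries[key] = (value, now)


def backend_load(cache, batch, now):
    """Count backend hits for a batch of concurrent requests.

    All lookups happen before any refill, so every miss in the
    batch becomes its own request to the backend.
    """
    misses = [key for key in batch if cache.get(key, now) is None]
    for key in misses:
        cache.put(key, "rendered:" + key, now)
    return len(misses)


def retry_storm(new_per_tick, capacity, ticks):
    """Offered load per tick when timed-out requests are retried.

    Simplification: assumes the cache stays cold, so every request
    reaches the backend, and anything beyond `capacity` times out
    and comes back on the next tick on top of the new arrivals.
    """
    backlog = 0
    offered_history = []
    for _ in range(ticks):
        offered = new_per_tick + backlog
        backlog = max(0, offered - capacity)
        offered_history.append(offered)
    return offered_history


# 1000 concurrent requests spread over 50 popular pages.
PAGES = ["page-%d" % (i % 50) for i in range(1000)]

cache = TTLCache(ttl=300)           # entries expire after 5 minutes
backend_load(cache, PAGES, now=0)   # initial fill
warm = backend_load(cache, PAGES, now=60)    # normal traffic, cache is warm
cold = backend_load(cache, PAGES, now=660)   # first batch after a long gap
```

| Here `warm` is 0 and `cold` is 1000: once every entry has aged
| out, the whole batch lands on the backend at once. With the
| simplification above, `retry_storm(1000, 800, 3)` shows offered
| load climbing each tick instead of draining.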
| CoastalCoder wrote:
| I'm accustomed to cache-eviction policies based on LRU,
| age, etc. But in my systems, eviction happens only when (a)
| the content is known to be invalid, or (b) there's
| competition for cache space.
|
| IIUC the parent comment, it's describing a policy that evicts
| entries even when (a) and (b) are false. Is that common in the
| web-hosting / CDN world? Or is age considered a proxy for
| stale?
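|
| To make the distinction concrete, here is a small sketch of an
| LRU cache with an optional age limit. It is a hypothetical
| illustration, not how any particular CDN or web cache is
| implemented. With `ttl=None` entries leave only under space
| pressure; with a `ttl` set, age alone is treated as a proxy for
| staleness.

```python
from collections import OrderedDict


class LRUCache:
    """LRU cache with an optional time-to-live (in seconds)."""

    def __init__(self, capacity, ttl=None):
        self.capacity = capacity
        self.ttl = ttl
        self.entries = OrderedDict()  # key -> (value, stored_at)

    def get(self, key, now):
        if key not in self.entries:
            return None
        value, stored_at = self.entries[key]
        if self.ttl is not None and now - stored_at >= self.ttl:
            # Age-based eviction: dropped even with free space and
            # no signal that the content actually changed.
            del self.entries[key]
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value, now):
        self.entries[key] = (value, now)
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


# Space-pressure eviction only: the entry survives a long idle period.
pure_lru = LRUCache(capacity=2)
pure_lru.put("a", "A", now=0)
survives_idle = pure_lru.get("a", now=10_000)

# Age limit added: the same idle period empties the cache.
aged = LRUCache(capacity=2, ttl=300)
aged.put("a", "A", now=0)
after_idle = aged.get("a", now=600)
```

| In this sketch `survives_idle` is "A" while `after_idle` is
| None, which is the difference between the two policies during
| a quiet period.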
| Nexxxeh wrote:
| Crunchyroll seems to barely work at the best of times, and when
| it does, it's still a mess.
|
| All sorts of issues still unresolved for years, including the
| ridiculously annoying "Finishes playing season English sub,
| autoplays first season of German dub, which then gets stuck".
| Still no profiles (nerfing their super-premium offering). Auto-
| resume points are unreliable, the Android app is hot garbage at
| dealing with network disruption...
|
| I can only imagine their back-end is mostly Visual Basic
| running on a single AWS-powered VM.
| the_iceman wrote:
| Experiencing significant issues in US-WEST-1
| mysql wrote:
| It's bad that I come here first to see if I am crazy or AWS is
| actually down.
| jcoder wrote:
| This is new... Siri hasn't been able to connect for me since this
| began
| ghawkescs wrote:
| Same thing here.
| mgbmtl wrote:
| QuickBooks Online seems to be down, and they seem to be hosted on
| AWS.
| BTCOG wrote:
| Can't use MFA right now to get into multiple instances due to
| this outage.
| xondono wrote:
| At what point are these outages a sign that something inside AWS
| is deeply broken and pretty much unfixable?
| gregmfoster wrote:
| Down for us (graphite.dev) as well, running on us-west-2
| tubignaaso wrote:
| Seeing this on us-west-1. us-east-1 appears to be functioning for
| us.
| dijit wrote:
| Is it AWS or could it be an ISP?
|
| AWS seems to be working for me, but I've worked with clients in
| the US and spectrum internet tended to drop connections to us
| sporadically, which looks like an outage to our clients but is
| something we obviously can't control.
| albatross13 wrote:
| Currently we're seeing 40k ms response times from CloudFront
| distributions, we can't hit PagerDuty (probably runs on AWS),
| etc.
|
| I guess it could be an ISP thing but I guess we're all assuming
| 80/20.
| avs733 wrote:
| I wonder if you really dug into most companies' tech stacks,
| how many of their support tools (e.g., PagerDuty) are reliant
| on overlapping cloud providers.
| albatross13 wrote:
| Oh man, it is insane. During the aws incident last week we
| couldn't build software because bitbucket pipelines were
| all down, due to them running lambdas in us-east-1 only
| haha.
|
| We've taken a massive turn away from a "decentralized"
| internet.
| avs733 wrote:
| it's still decentralized...it's just a centralized
| version of it right?
|
| just like Cavendish bananas are grown in multiple
| places...
| simcop2387 wrote:
| Yea a number of people got hit by that, Louis Rossmann
| found out that every form of contact to his business was
| reliant on AWS east 1.
| https://www.youtube.com/watch?v=DE05jXUZ-FY
| treis wrote:
| It was an AWS networking issue: 90%+ packet loss pinging to
| Google & Facebook.
| adriand wrote:
| I'm wondering the same thing. We have stuff hosted in us-west-2
| and multiple people across the US are reporting that our
| systems are down, however our system is working fine for me
| here, which is near Toronto.
| iamricks wrote:
| When us-east was down recently, our apps were not affected
| and we host on east. Maybe a similar issue?
| judge2020 wrote:
| The east-1 downtime was the interconnection between AWS
| hosted services, including the control plane, so most
| resources not dependent on AWS APIs stayed up (eg. non-
| autoscaled EC2 instances).
|
| https://news.ycombinator.com/item?id=29516482
| banana_giraffe wrote:
| Things were working during the event, but connectivity was
| pretty messed up
|
| https://imgur.com/a/VsrS0JZ
|
| (This is two similarly spec'd boxes on us-east-2 and
| us-west-2). Looking at GeoIP of connecting clients, the only
| pattern I can see is the region itself.
| ukyrgf wrote:
| I have an outage way over in the southeast, looks to be
| affecting the major monopoly ISP. Can't get a tech to our data
| center until 2PM.
| tw04 wrote:
| If it's a network issue, it's on their side. I've verified from
| centurylink, comcast, cogent, he.net, at&t, and verizon - all
| of them are having issues. This isn't like: Cox is having an
| outage and just can't get to AWS.
| CodinM wrote:
| I fucking swear to God.
| codercotton wrote:
| "Everything is fine." - https://status.aws.amazon.com
| rytrix wrote:
| Everything *is* fine now. The status page previously reflected
| an issue much quicker than last time.
| alberth wrote:
| It appears AWS Status Page is hosted at AWS [0].
|
| Seems like a really bad idea.
|
| [0] https://hostingchecker.com/
| anpat wrote:
| My monitoring is on fire, flipping red to green every minute
| because of connectivity issues with every single LB in us-west-2.
| gitfan86 wrote:
| I'm so glad that I'm not still the CTO of a startup. I would be
| getting dozens of e-mails from people without engineering
| backgrounds asking "Are we multi-cloud", "why didn't you make us
| multi-cloud"?
| necovek wrote:
| Well, why didn't you? :)
|
| The response is that this actually works well enough, so the
| investment required has not pushed anyone to do it (with that
| meaning building the core infrastructure to make that easy).
| baskethead wrote:
| It sounds like their systems design interviews aren't rigorous
| enough.
| pm90 wrote:
| At this point they should hire specifically for config
| management and rollout.
|
| Mostly /s; I wish the aws engineers the best of luck through
| this.
| lordnacho wrote:
| How about an ad for a "Status Page Engineer"?
| tyingq wrote:
| I'm guessing lots of people fled us-east-1 for us-west-2, after
| the last outage, and overwhelmed something there.
| jsight wrote:
| "Nobody ever got fired for picking [x]" as applied to cloud
| zones? Sadly, you are probably right.
| tyingq wrote:
| I wonder how many are now furiously headed back to
| us-east-1, building the conditions for the third event :)
| [deleted]
___________________________________________________________________
(page generated 2021-12-15 23:01 UTC)