[HN Gopher] Simplicity - Google SRE Handbook (2017)
       ___________________________________________________________________
        
       Simplicity - Google SRE Handbook (2017)
        
       Author : nateb2022
       Score  : 160 points
       Date   : 2024-05-25 23:01 UTC (23 hours ago)
        
 (HTM) web link (sre.google)
 (TXT) w3m dump (sre.google)
        
       | OutOfHere wrote:
       | Does this include instructions on accidentally deleting a
       | customer's account? Because that's what Google does. I don't
       | think I want to take any advice from Google on anything.
        
         | postepowanieadm wrote:
         | It's from 2016 when google was less trash.
        
         | infinityplus1 wrote:
         | Cloud computers are just someone else's computer. Amazon and
         | Microsoft engineers can make the same mistake too. Take backups
         | and test them regularly and you'll be OK.
        
         | dieortin wrote:
         | I challenge you to find an organization that has never made a
         | mistake. Truth is the uptime and reliability of Google services
         | is very good, while operating at huge scale. And I have no
         | association with Google whatsoever.
        
         | gtirloni wrote:
         | _> Because that's what Google does_
         | 
         | Your argument would be stronger if you could list a few cases
         | like that latest high profile one where GCP deleted some
         | enterprise customer's account. A single one won't cut it for
         | "that's what Google does".
        
           | OutOfHere wrote:
           | With Google, the deletions almost always are intentional, not
           | accidental, and this is a huge problem with it. Google (not
           | GCP) deleted ten years of my data without warning or
           | notification or remorse or recourse even though I was doing
           | nothing illegal. Amazon would never do something like it. To
           | Google, once a customer or service becomes just 1%
           | inconvenient, it's time to get rid of the customer or
           | service. It's a very valid concern.
        
       | elktown wrote:
       | Just remember that what google writes in these kinds of things is
       | not universal. It's written from their very unusual
       | circumstances. You can certainly pick nuggets that are more
       | universal than others but, like in many other instances, too much
       | unnecessary work is spent trying to imitate Google and others
       | when it's not really needed. And no, you won't turn into Google
       | overnight, you will have time to adapt if fortune hits you. Some
       | things are not even necessarily good advice at all, but rather a
       | product of incentives within Google (and perhaps most tech corps)
       | rewarding the aesthetics of "innovation".
        
         | fmbb wrote:
         | I read the whole text (granted, a bit quickly) looking for
         | weird or unnecessary advice but I cannot see any.
         | 
         | This is a great text about considerations everyone operating
         | software services should take to heart.
         | 
         | It applies regardless of if you deploy a monolith or several
         | smaller servers.
         | 
         | If you are only one developer, it might apply in a smaller
         | context.
        
           | elktown wrote:
           | "If you are only one developer" suggests zero interest in
           | being nuanced.
           | 
           | To be clear, this linked specifically to simplicity which I'm
           | certainly in favor of emphasizing the importance of. But IME
           | the exact opposite happens when people try to imitate Google
           | overall in a smaller setting, where instead too many
           | resources are spent on meta-issues instead of the product
           | being developed.
        
             | zbentley wrote:
             | I think you're arguing with someone who isn't here.
             | 
             | Nobody is endorsing the practices in TFA "because it's
             | Google"/in order to be like Google. Sure, people elsewhere
             | make those claims all the time, and they're wrong, but
             | that's not in evidence here that I can see.
             | 
             | The article does seem to come pretty close to universally
             | applicable good ideas. Not because of where its author
             | works, but because of the content.
        
               | elktown wrote:
               | > Nobody is endorsing the practices in TFA "because it's
               | Google"/in order to be like Google. Sure, people
               | elsewhere make those claims all the time, and they're
               | wrong, but that's not in evidence here that I can see.
               | 
               | I disagree, I think we can see this time and time
               | again. YMMV I guess. It's an encouragement to be vigilant
               | for over-engineering when you don't need it because
               | you're not google. I'm not saying that the content is
               | bad, it's a worthy read. Just don't get overeager like
               | the OOP craze phase where people would attempt to bend
               | everything into a maze of design patterns because they
               | took whatever books they read way too far. Most of the
               | chapters have YAGNI parts for smaller settings, but it's
               | still worth knowing about what the next steps are.
        
         | nvarsj wrote:
         | Even within Google, this is not universal. I doubt the majority
         | of SREs at Google have even read the "Google SRE book".
         | 
         | On the other hand, the book has some nuggets that make it worth
         | reading. But it should be treated as a collection of essays
         | from some very senior SREs rather than a manual.
        
           | elktown wrote:
           | > On the other hand, the book has some nuggets that make it
           | worth reading
           | 
           | Definitely! Mostly just a word of caution to not get
           | overeager hoping to apply this everywhere, because "here be
           | dragons".
        
       | userbinator wrote:
       | A lot of preaching but bears little resemblance to what Google is
       | actually doing in reality. IMHO those who actually understand
       | what "simplicity" means in software are only those who have tried
       | to do anything in highly-resource-constrained environments.
        
         | hiAndrewQuinn wrote:
         | A taxonomy of what we mean when we talk about "resource-
         | constrained" might be helpful for those seeking to gain this
         | knowledge. Limited CPU, RAM, etc are the obvious contenders -
         | but then there's also "resource-constrained" as in "I'm the
         | solo dev of this project and have 5 hours in a good week to
         | work on it", or "this runs in a weird place without Internet
         | that I only get access to twice a year". I've been in all of
         | these situations, sometimes multiple at the same time, and
         | they've been great forcing functions to find new paths towards
         | simplicity.
        
         | gtirloni wrote:
         | You also have to keep in mind the scope and timeline of where
         | these principles apply. I'm sure someone would be able to apply
         | them to their own work most of the time but if you look at a
         | company as a whole, unless someone at the top is really pushing
         | for global simplicity, things are pretty messy most of the
         | time.
         | 
         | I'm just saying this because Google might be doing this in
         | little islands, not as a company strategy. I don't really know
         | and can only guess from the outside.
        
         | davidcbc wrote:
         | > bears little resemblance to what Google is actually doing in
         | reality
         | 
         | What is Google doing in reality?
        
       | quintes wrote:
       | Yeah look. This may be the throw it over the wall problem. SRE
       | says No.
       | 
       | You build it you run it but may work at their scale
        
       | burakemir wrote:
       | While the text touches on many points I would immediately sign,
       | the paragraph starting with "Because engineers are human beings
       | who often form an emotional attachment to their creations, ..."
       | is really out of place.
       | 
       | The cause of complexity is not emotional attachment, these are
       | decisions being made. The decision to add feature after feature
       | and punt on maintenance for example is something that has little
       | to do with emotions. There is a lot of agency that engineers, SWE
       | and SRE alike have in shaping how things are. However there can
       | be good reasons to abandon simplicity. The real trouble here is
       | not psychology but that as a profession we are really bad at
       | measuring and estimating the effective cost of maintenance. Part
       | of that is considering measures to improve simplicity and
       | maintainability as cost that comes without gain and somehow less
       | important than features, and then just accepting a giant rewrite a few
       | years later. A continuous portion of upkeep would likely be more
       | economical and real engineering has always included an aspect of
       | economy - cost vs benefit.
       | 
       | IMHO the loaded accusation of emotional attachment might be
       | rooted in an "us vs them" attitude (SRE vs software engineering)
       | that should have no place in a sober discussion on the value of
       | simplicity and it diminishes an otherwise great text.
        
         | CraigJPerry wrote:
         | >> Because engineers are human beings who often form an
         | emotional attachment to their creations, confrontations over
         | large-scale purges of the source tree are not uncommon. Some
         | might protest, "What if we need that code later?"
         | 
         | > the paragraph starting with "Because engineers are human
         | beings who often form an emotional attachment to their
         | creations, ..." is really out of place.
         | 
         | FWIW I've definitely encountered developers clinging to things
         | when the business context has completely changed. I totally
         | recognise the scenario in the original text.
        
           | burakemir wrote:
           | Sure, but if we argue that these values and principles should
           | be applicable, then it should also be possible to make an
           | argument _why_ and not blame the irrationality on emotions.
           | 
           | It seems more likely that bounded rationality is at play
           | here, where different parties only know part of the picture
           | (and fail to bring these together and find out what would be
           | best globally.)
        
             | philosopher1234 wrote:
             | I don't follow why we shouldn't blame the irrationality on
             | emotions. Emotions are massively important, and people do
             | irrational things because of them all the time. Why pretend
             | that's not true?
        
               | burakemir wrote:
               | The question is not whether emotions can cause people to
               | be irrational. They can!
               | 
               | Not every case of irrational behavior is caused by
               | emotions though. And when we are making an argument that
               | people are acting against their own interests, it may
               | help to ponder what makes them do so. All the more when
               | we are claiming principles and values that should be
               | accepted by everyone.
               | 
               | "If you don't believe me you are acting irrational / it's
               | because you are emotionally attached" does not seem to be
               | an attitude that gets closer to real causes in a
               | discussion on how to best seek simplicity, but rather a
               | recipe for avoiding discussion or a "thought-terminating
               | cliche."
               | 
               | There must be a better argument for convincing people to
               | let go of code / clean up etc.
        
               | kortilla wrote:
               | You're assuming the conclusion. What may appear to be
               | irrational emotional behavior can be completely rational
               | under a different set of information.
        
         | arccy wrote:
         | But people do get attached to their creations, they don't want
         | their things deprecated/removed, since to them it may feel like
         | their thing is thrown away or wasted work down the drain. While
         | they may not obviously state it as such, it can be the
         | underlying reason driving their arguments (e.g. sunk cost
         | fallacy).
        
           | burakemir wrote:
           | Maybe this is also about the desire to create, which of
           | course is also common in engineering. It does not contradict
           | my argument that the cost of maintenance and operations is
           | being ignored eg when one creates things all the time and
           | never removes stuff. And it should be possible to measure or
           | estimate that cost.
        
         | jimmySixDOF wrote:
         | When containers got going there was a phrase used in devops to
         | think of servers as "cattle not pets" for just this reason.
        
           | XorNot wrote:
           | I never took that as dealing with emotional attachment, it
           | was just a shorthand to express that at any moment you would
           | kill cattle so don't do things you can't easily replicate.
        
           | kortilla wrote:
           | No, that had nothing to do with emotional attachment. It's a
           | short phrase to remind people that they can't make each
           | device special with one-off because it needs to be
           | repeated/destroyed all of the time.
           | 
           | Separately, cattle vs pets is much older than containers. It
           | got popular with ephemeral EC2 instances when people were
           | first forced to grapple with lifetimes of VMs measured in
           | hours and the ability to scale massively as needed.
        
         | scott_w wrote:
         | I think the examples the paragraph gives more than backs up the
         | statement. I've met people who comment out code instead of
         | deleting it (luckily not in a long time!) and I feel the
         | authors speak from experience here.
        
           | burakemir wrote:
           | Curious what examples do you see there. I don't doubt the
           | experience.
           | 
           | When I draw analogies of my past experiences to present
           | situations, that does not mean that my past experiences are
           | the best way to convince people of what is the right thing to
           | do. I still need to do the hard work of pointing out what it
           | is that is in the common interest and why eg deleting stuff
           | and simplifying is good.
           | 
           | In such a discussion it won't help me to say people who
           | disagree with me are generally just emotional, does it? Even
           | if I may have encountered people with such emotional
           | reactions.
        
             | scott_w wrote:
             | > Curious what examples do you see there. I don't doubt the
             | experience.
             | 
             | I'm not going to just copy and paste the article for you.
             | It's literally right there.
        
         | intelVISA wrote:
         | > Because engineers are human beings who often form an
         | emotional attachment to their creations
         | 
         | Because engineers are human beings who often form an emotional
         | attachment to their job security
         | 
         | It's understandably very unwise to admit that Very Complex
         | Solution that cost A Lot Of Money was A Bad Thing
        
           | ozim wrote:
           | Unfortunately, the complex solution we have accumulated over time
           | is usually there because the business did not want to spend a bit
           | more up front to come up with a cleaner solution.
           | 
           | In the same way business is also very reluctant to spend
           | time/money on cleaning up stuff.
           | 
           | I never ever had to make up complex stuff on my own. It
           | always happens on its own.
        
         | mrbungie wrote:
         | I think that is being transparent with what actually happens in
         | the real world (engineers, at least in part, being human and
         | emotional in their decisions), rather than just talking about
         | impossible ideals (engineers thinking about tradeoffs in a
         | purely objective manner).
         | 
         | NIH, CV based development, preference for shiny/new things and
         | a myriad of other "engineer/organizational diseases" exist, you
         | know. And there are even SaaS/PaaS/XaaS marketing teams
         | exploiting such human qualities when making software sales.
        
         | oooyay wrote:
         | I'm a SRE and I disagree too, though, I think you're giving
         | SREs too much credit in the category of our hegemony for an "us
         | vs them" debate. Maybe at Google SWEs having relationships with
         | their code base is a well studied thing. It could also just be
         | someone's opinion that made its way unchallenged into the
         | book. That's to say, Google SRE wasn't the best or last
         | iteration of SRE.
         | 
         | I personally think systems evolve the way you describe because
         | of a system of incentives. There are more incentives for
         | features than there exist for refactor and non top priority
         | defect fixes. This comes from the people who hold power to
         | shape incentives and they often do so with conflicting
         | priorities and superficial understandings of the existing
         | incentive structure.
         | 
         | I'd also like to say that it's my own personal theory that
         | systemic issues can only be caused by systemic forces.
         | Individual mindsets cannot be to blame then; if a mindset has
         | become systemic (example: SWEs overly attached to code and
         | features) then your next question should be "why?". There's a
         | system that enforces that, and if you don't look beyond
         | personal obsession then you'll never find it.
        
           | burakemir wrote:
           | I like this way of saying it. I don't think anything here is
           | well studied at all. It is not like we are all fishing in the
           | dark but the organizational structures that determine the
           | conditions in which software development and operations
           | happen are not well understood. I found Herb Simon's writings
           | and his concept of bounded rationality very lucid.
           | 
           | When we shift from "reliability" to "safety" we also need to
           | shift from the individual to the system.
        
       | kryptonomist wrote:
       | Glad to see those valuable principles written, even if it seems
       | we are heading in the complete opposite direction. At least we can try to
       | apply them on our side business.
       | 
       | These were also true in the early ages of aviation:
       | 
       | "Perfection is achieved, not when there is nothing more to add,
       | but when there is nothing left to take away."
       | 
       | -- Antoine de Saint-Exupery
        
       | kubb wrote:
       | SRE has got to be one of the organisations that have done the
       | most damage in the big G. They were given a license to mandate
       | things based on philosophical musings backed with no science, and
       | they can decide what's best and should be done without any data,
       | just based on feels. They also have a culture of misanthropy,
       | patronization and contempt towards devs. From what I can tell
       | anyway.
        
         | bru wrote:
         | [citation needed]
        
         | sgarland wrote:
         | > culture of misanthropy, patronization and contempt towards
         | devs.
         | 
         | When you're being paged for the Nth time because of an idiotic
         | problem that you've pointed out repeatedly, you too might
         | exhibit these traits.
        
         | alienchow wrote:
         | Why don't you give Mission Control a try for 6 months?
        
         | makerofthings wrote:
         | Be google SRE. Elite software engineer. Cool under pressure.
         | 
         | Pager goes off! Grab pixel. Press finger print reader until it
         | lets me enter my passcode. Ack page. Put down whisky. Shake
         | self. 5 minutes to be logged in and dealing with the problem.
         | 
         | Password. gnubby. password. gnubby. gnubby. gnubby.
         | 
         | Check alert, see playbook, ignore playbook. Check which cell
         | the problem is in. Correlate with rollouts. See a match. Roll
         | back poorly tested dev promo project. Charts recover. Alert not
         | firing.
         | 
         | Log out. Back to whisky.
        
           | kubb wrote:
           | The drinking culture is also pretty weird.
        
             | joshuamorton wrote:
             | It's also outdated. I don't think I've seen real SRE
             | drinking culture since like 2018.
        
           | VirusNewbie wrote:
           | I think you picked the least interesting part of SRE to poke
           | at.
           | 
           | It's the equivalent of saying "SWE just adds new stubby
           | endpoints to a service, so simple".
        
         | gtirloni wrote:
         | _> they can decide what's best and should be done without any
         | data, just based on feels._
         | 
         | The book is exactly the opposite of this. The Principles
         | chapter alone talks about many things that involve actually
         | dealing with numbers (SLO, measuring complexity, etc).
        
           | kortilla wrote:
           | "It has numbers attached" does not mean something is backed
           | by data.
           | 
           | Unless they are testing correlations between these target
           | metrics and business success or some other external cost
           | metric, it's still "just feels".
           | 
           | I've seen internal crusades against cyclomatic complexity
           | that resulted in massive engineering waste to reduce and
           | reliability saw no improvement.
        
             | kubb wrote:
             | Yeah, but to be fair I don't think Googlers can tell the
             | difference between a decision backed by data, and a random
             | ass measurement.
        
       | ChrisArchitect wrote:
       | Some more recent discussion:
       | 
       | https://news.ycombinator.com/item?id=39580346
        
       | YokoZar wrote:
       | See also the Simplicity chapter in the followup Google SRE
       | Workbook: https://sre.google/workbook/simplicity/
        
       | randmeerkat wrote:
       | Google's "best practices" led them to deleting an entire
       | customer's $135 billion pension account [1]. I'm surprised anyone
       | is still reading anything Google writes.
       | 
       | 1. https://arstechnica.com/gadgets/2024/05/google-cloud-acciden...
        
         | klabb3 wrote:
         | You're assuming that those systems were all implemented to the
         | letter of that guide. That's never the case. Often these types
         | of guidelines are written to address recurring problems found
         | in an organization.
        
         | thirteenfingers wrote:
         | That was seven years later. Maybe the problem is that Google
         | stopped reading what Google wrote.
        
           | passion__desire wrote:
           | Google : We will breach the rules we preach.
        
             | wiseowise wrote:
             | Right, because every line of code written by tens of
             | thousands of Google engineers is being validated against
             | every guidebook.
        
           | randmeerkat wrote:
           | > That was seven years later. Maybe the problem is that
           | Google stopped reading what Google wrote.
           | 
           | The problem is that it was never that good. Anyone who has
           | used K8s at scale will tell you at length how it doesn't
           | scale. People should stop focusing on tech companies like
           | celebrities and focus instead on domain problems related to
           | their business.
        
             | lima wrote:
             | The funny thing with k8s is that Google doesn't use it
             | (except GKE, and there's a reason it's one cluster per
             | customer).
             | 
             | Their internal tooling scales just fine, but all it shares
             | with k8s is some of the underlying concepts. Unlike, say,
             | Bazel, gVisor or Gerrit, which are the real thing (minus
             | some secret sauce tied to internal infra). k8s is good
             | software, and best-in-class when it comes to open source
             | options, but the idea that it is "open source Borg" is
             | silly.
        
         | dieortin wrote:
         | If we should only read things written by organizations that
         | make no mistakes, then we will never read anything.
        
           | randmeerkat wrote:
           | > If we should only read things written by organizations that
           | make no mistakes, then we will never read anything.
           | 
           | That was a "mistake" that should not have even been possible.
           | If the pension fund had not used a multi cloud strategy the
           | entire business would have been lost. A mistake is not
           | configuring Kafka correctly and losing some data, deleting an
           | entire account should not be given a pass.
        
             | joshuamorton wrote:
             | The recent postmortem says they were able to recover from
             | backups on gcp, so I don't think this is true.
        
               | randmeerkat wrote:
               | > The recent postmortem says they were able to recover
               | from backups on gcp, so I don't think this is true.
               | 
               | "UniSuper, an Australian pension fund that manages $135
               | billion worth of funds and has 647,000 members, had its
               | entire account wiped out at Google Cloud, including all
               | its backups that were stored on the service. UniSuper
               | thankfully had some backups with a different provider and
               | was able to recover its data, but according to UniSuper's
               | incident log, downtime started May 2, and a full
               | restoration of services didn't happen until May 15."
               | 
               | Google didn't recover the data, the customer recovered
               | their data from a different cloud provider.
        
               | joshuamorton wrote:
               | From
               | https://cloud.google.com/blog/products/infrastructure/detail...
               | 
               | > This incident did not impact:
               | 
               | > Any other Google Cloud service.
               | 
               | > Any other customer using GCVE or any other Google Cloud
               | service.
               | 
               | > The customer's other GCVE Private Clouds, Google
               | Account, Orgs, Folders, or Projects.
               | 
               | > _The customer's data backups stored in Google Cloud
               | Storage (GCS) in the same region._
               | 
               | ...
               | 
               | > _Data backups that were stored in Google Cloud Storage
               | in the same region were not impacted by the deletion, and
               | ... were instrumental in aiding the rapid restoration._
               | 
               | Emphasis mine.
               | 
               | You're quoting, as far as I can tell, an ArsTechnica
               | article that makes unsourced claims about backups being
               | deleted, neither UniSuper's nor Google's previous
               | statements ever mentioned anything about backups being
               | deleted.
        
         | gtirloni wrote:
         | Oh, completely ignoring anything anyone from Google ever writes
         | again? This is akin to the cancel culture which we all know is
         | how society should work. /s
        
           | randmeerkat wrote:
           | > Oh, completely ignoring anything anyone from Google ever
           | writes again? This is akin to the cancel culture which we all
           | know is how society should work. /s
           | 
           | Maybe if Google focused on doing actual work instead of
           | writing feel good engineering pieces, they wouldn't have the
           | Google graveyard and an unstable cloud offering that may
           | spontaneously delete multi-billion dollar accounts.
        
             | wiseowise wrote:
             | Alphabet has 2T market cap, get your head out of your
             | sitting place, lol.
        
               | randmeerkat wrote:
               | > Alphabet has 2T market cap, get your head out of your
               | sitting place, lol.
               | 
               | That same sort of thinking is what led to the downfall of
               | yahoo.
        
               | wiseowise wrote:
               | Maybe. Times are different now.
        
               | ikrenji wrote:
               | The fall of yahoo was caused by out of touch C suite not
               | engineering handbooks.
        
       | maximinus_thrax wrote:
       | Maybe an unpopular opinion, but this type of content is useless
       | and serves no other purpose than feeding the already bloated
       | Google cargo-culting machine.
        
         | dieortin wrote:
         | Could you elaborate on why you think that? Just stating that
         | does not really add to the conversation
        
           | maximinus_thrax wrote:
           | Because it is just useless. I mean seriously, what valuable
           | insight does anyone get from that? It's some sort of truism
           | wrapped in a word sandwich, ready for linkedin lunatics to
           | pat themselves on the back sharing it. Do you feel you've
           | gained something by reading it? Is this a valuable piece of
           | intelligence which would guide your future decisions? Will
           | you bring this to the team during an argument to push your
           | agenda? This feels like the same type of 'feel good' content
           | which people read and then feel like they did something
           | productive. But I would argue that every piece of insight
           | coming from a mega corp, valuable inside the mega corp is
           | actually dangerous outside when people take it as dogma and
           | try to apply it. SRE in general is something which, IMHO after
           | working in the industry for decades, has poisoned the industry
           | with half assed cargo cult implementations. But it has Google
           | branding, so it must be valuable, hits hard for the fanboys
           | and obviously can and should be applied in every company and
           | every context.
           | 
           |  _I also find it ironic to see 'Simplicity' touted by the
           | same people who let Kubernetes loose in the wild, but that's a
           | different story for a different time_
        
             | gtirloni wrote:
             | That's circular reasoning ("it's useless because it's
             | useless").
             | 
             | If you haven't gained any insights from reading that
             | content, maybe it doesn't apply to you or you don't know
             | what you don't know.
             | 
             |  _> valuable inside the mega corp is actually dangerous
             | outside when people take it as dogma and try to apply it._
             | 
             | Mega corp or not, dogmatic principles are usually bad
             | coming from anywhere. The SRE book contains insights that
             | apply to startups, medium-sized companies and mega corps.
             | It's not prescriptive for a reason.
        
             | zbentley wrote:
             | I've honestly never worked on software in an environment
             | where the advice in the article _wasn't_ important to keep
             | in mind. Personal projects, single digit employee count
             | startups, growth stage, ancient and slow moving Perl
             | monolith shops ... they all needed to keep in mind the
             | principles of simplicity, boringness (boring.tech is a great
             | reiteration of this) and continual self-auditing to reduce
             | inherited complexity.
             | 
             | Whether or not Google interprets this advice in a sane way
             | or whether they actually follow it are separate issues, but
             | I think the advice is timely and (at least in my
             | experience) important for many people to hear, regardless
             | of where its author works.
        
               | zbentley wrote:
               | Er, not boring.tech; boringtechnology.club is what I
               | meant.
        
           | stackskipton wrote:
           | Not OP but I'll give my take because I mostly agree. Because
           | Google lives in a world few of us do. I'm SRE/DevOps and our
           | lives are nothing like Google SREs. We have almost zero
           | control over software that is chucked our way. Any attempt to
           | try and control them fails with management telling us "Just
           | fucking ship it". Finally, something I realized after working
           | with various FAANG SRE types: they don't understand what bad
           | development practices look like; they can't imagine it.
        
             | VirusNewbie wrote:
             | But then that's not SRE, that's just Ops being called
             | SRE...
        
       | zbentley wrote:
       | Many commenters here are rightly pointing out Google's hypocrisy
       | when it comes to actually following the principles in this
       | article. Fair enough. But others are throwing the baby out with
       | the bathwater: it's a little silly to read comment after comment
       | saying that the advice in TFA must be bad because Google does
       | dumb/bad stuff on the regular. Companies aren't homogeneous. Even
       | misguided companies may employ people who can teach others
       | important things.
       | 
       | Boeing is a perfect example of this. I would _absolutely_ read an
       | article proposing principles of engineering reliability from a
       | Boeing eng/QA greybeard. Even as the rest of the company
       | spiraled due to horrible leadership and management practices,
       | many people in engineering and quality control did their
       | damnedest to keep those failures from causing even _more_ harm
       | and loss of life. Those people probably have very valuable
       | lessons to share about how to maintain what quality you can in a
       | deeply hostile environment.
        
         | dangus wrote:
         | I don't see any validity to the alleged hypocrisy.
         | 
         | End users making that criticism are confusing the products with
         | the reliability practices.
        
           | Tao3300 wrote:
           | Indeed, allegations of hypocrisy are a class of ad hominem.
           | They don't necessarily weigh in on the validity. It just...
           | feels good? I guess? People _LOVE_ to feel like they caught a
           | hypocrite. It's probably in the Top 5 most sought after
           | dopamine kicks.
        
             | tbrownaw wrote:
             | >here's why doing <thing> is dumb
             | 
             | > _regularly does <thing>_
             | 
             | I think that might be a reason to suspect that the person
             | doing that is hiding some holes in their argument.
        
       | wouldbecouldbe wrote:
       | "'Why don't we gate the code with a flag instead of deleting it?'
       | These are all terrible suggestions. Source control systems make
       | it easy to reverse changes, whereas hundreds of lines of
       | commented code create distractions and confusion."
       | 
       | In most cases deleting code is a good idea, but it's a stretch
       | to say that source control systems make reverting easy. After a
       | few months most developers will have forgotten about those
       | lines, and at times commenting out code & explaining it
       | explicitly might be a better way to preserve knowledge than
       | relying on digging through Git.
        
         | lloydatkinson wrote:
         | First time I'm hearing that feature flags and commented out
         | code are the same thing.
        
           | wouldbecouldbe wrote:
           | I've seen it be a company culture thing where every
           | discussion was resolved with "we'll put it behind a
           | config/flag". It's an easy way to avoid hard choices. It's
           | probably something like that the author refers to.
        
         | wiseowise wrote:
         | Nothing like supporting dead code forever just because you
         | might need it some day.
        
       | gnuser wrote:
       | At the last "real" job I tried to help implement this, first as
       | part of and later as the manager of the ops team. It's a great
       | start, but in that case management wanted the idea of devops/sre
       | but didn't actually support it, and it really was a shit show.
       | If you have a bad CTO and leadership on the board level, no
       | amount of re-tooling will paper over their lack of support for
       | the real principles.
        
       | YZF wrote:
       | This reads to me as a reflection of Google politics/org
       | structure. The SRE org positioning itself as the guardian of
       | system design vs. the SWEs who are agents of complexity. Doesn't
       | feel healthy to me. The principles are fine but it's the SWEs
       | that should be talking and applying them because they are
       | "closer" to the decisions.
        
       ___________________________________________________________________
       (page generated 2024-05-26 23:01 UTC)