[HN Gopher] On the Term "Blameless Postmortem"
       ___________________________________________________________________
        
       On the Term "Blameless Postmortem"
        
       Author : thcipriani
       Score  : 20 points
       Date   : 2022-01-27 19:51 UTC (3 hours ago)
        
 (HTM) web link (tylercipriani.com)
        
       | SpicyLemonZest wrote:
       | I like the idea here, but in my experience the biggest practical
       | issue with postmortems is getting people to actually do them. A
       | heavy term serves as a reminder that it's an important
       | investigation, it has to get done, we can't just put it off until
       | it fits conveniently into the schedule. I worry whether a
       | lighter-sounding term would make it easier for people to work on
       | their projects first and delay post-incident investigations
       | indefinitely.
        
       | schmatz wrote:
       | I like the term "Incident report".
        
       | marcosdumay wrote:
       | Oh, God. If you believe "disquisition" carries fewer negative
       | connotations than "blameless postmortem", you completely failed
       | at reading the audience.
       | 
       | Aviation uses the word "investigation", by the way. But they can
       | only omit the "blameless" part because there are very strong
       | guarantees that it will be blameless.
        
         | isleyaardvark wrote:
         | I prefer "retrospective", which doesn't sound like a police
         | investigation or bring to mind airplane crashes.
         | 
         | It's also easier to do those on a weekly basis, so there's less
         | of a Pavlovian association of "bad thing happens" then "synonym
         | for postmortem happens".
        
       | rfreiberger wrote:
       | The name needs to change, but so does the attitude that we as
       | engineers can build complex systems and assume everyone knows
       | how to use them. A few worldwide outages I've been a part of
       | were caused by a task runner that didn't lint the command and
       | allowed a broken bash one-liner to be executed across every
       | system in parallel.
       | 
       | Yes, it's a simple mistake, but how was a system given access
       | to our global environment without this edge case ever being
       | considered? In many of these meetings, the common issue is
       | communication, even between co-workers on the same team, and
       | between internal platform providers. One case was an outage on
       | the storage backend: we realized after a long meeting that the
       | internal SLA was much greater than we expected (long enough
       | that our systems would time out). It had only worked for so
       | long because storage utilization was extremely low.
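        
       The failure mode described in this comment (a task runner that
       fans a command out to every host without linting it) can be
       sketched as a pre-flight check. This is only an illustration,
       not the commenter's actual tooling: `lint_bash`, `fan_out`, and
       the `run_on_host` callback are hypothetical names. It relies on
       bash's noexec option (`bash -n`), which parses a command without
       running it, so it catches syntax errors only, not commands that
       are well-formed but destructive.

```python
import shutil
import subprocess


def lint_bash(command: str) -> tuple[bool, str]:
    """Syntax-check `command` with `bash -n` (noexec); nothing is executed."""
    bash = shutil.which("bash")
    if bash is None:
        # Assumption: fail closed when the check cannot be performed.
        return False, "bash not found; refusing to approve an unchecked command"
    result = subprocess.run(
        [bash, "-n", "-c", command],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.returncode == 0, result.stderr.strip()


def fan_out(command, hosts, run_on_host):
    """Run `command` on every host, but only after it passes the lint.

    `run_on_host(host, command)` is a hypothetical callback standing in
    for whatever transport the task runner really uses.
    """
    ok, err = lint_bash(command)
    if not ok:
        raise ValueError(f"command rejected before fan-out: {err}")
    return [run_on_host(host, command) for host in hosts]
```

       A real runner would likely pair a check like this with a staged
       rollout (one host, then a few, then all), since a syntax check
       says nothing about what a valid command will do.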
        
       | zippergz wrote:
       | You can't debate the value of the term without considering the
       | conditions that led to it. Why would someone call a postmortem
       | "blameless"? Because (in some companies) there absolutely was a
       | culture of blame, which made people less forthcoming and
       | thoughtful about the causes of incidents, which limited the
       | potential learnings. This term was not pulled out of thin air, or
       | built upon some imaginary possible blame. It was designed to
       | explicitly remove blame that was already present in the culture.
        
         | fragmede wrote:
         | In particular, firing the dev that wrote the buggy code or the
         | SRE that pushed the bad change is an obvious management
         | reaction that "blameless postmortem" seeks to redress. I'm
         | happy for OP that they've never worked somewhere that toxic but
         | those places absolutely exist.
        
       ___________________________________________________________________
       (page generated 2022-01-27 23:02 UTC)