hngopher.com

       [HN Gopher] Operations is not Developer IT
       ___________________________________________________________________
        
       Operations is not Developer IT
        
       Author : todsacerdoti
       Score  : 287 points
       Date   : 2021-09-03 10:42 UTC (12 hours ago)
        
 (HTM) web link (matduggan.com)
 (TXT) w3m dump (matduggan.com)
        
       | jon_north wrote:
       | When developers see ops as IT for them, it's because ops (and
       | overall management) is doing a poor job laying out the actual
       | responsibilities of each role in the org.
        
         | cbushko wrote:
         | I'd also add that Ops is not automating these problems away.
        
       | markus_zhang wrote:
       | One thing that I realized as a Business Analyst and then Data
       | Engineer is that communication is the key but pretty much no one
       | can do it properly. I also realized that communication should be
       | part of the job and leads and managers should hire people who can
       | communicate things effectively.
        
       | danielovichdk wrote:
       | This is a great showcase of the silo mentality and split.
       | 
       | One should not build silos where experts sit.
       | 
       | One should participate in a team of many different experts.
       | 
       | If you still have to call "whatever department" for fixing your
       | slow SQL query, your disk space, your repo access, production
       | deployments etc. Then in most cases I feel you should get out and
       | find a place where silos are not being exercised.
       | 
       | This is a thing of the past.
        
         | greedo wrote:
         | Try avoiding silos if you work in any financial industry, or
         | have any regulations around your business.
        
         | SpaceL10n wrote:
         | > This is a thing of the past.
         | 
         | Is it? In highly regulated environments, silos are practically
         | a requirement. Security access controls are intentionally put
         | in place to limit access to systems and their respective
         | pieces. If you need repo access, or access to the CI pipeline,
         | or access to the database, you have to go to the appropriate
         | channel.
        
           | dsr_ wrote:
           | The scale of security necessities in organizations runs from:
           | 
           | - no need for anything but minimal security, perhaps because
           | the business is trying to surf the margin between their AWS
           | bill and their Google ad revenue.
           | 
           | - there is only a need for security in the part of the
           | business that deals with money
           | 
           | - some stuff that the users do, they would prefer to maintain
           | integrity but they don't care a lot about confidentiality
           | 
           | - the users want reasonable confidentiality, too
           | 
           | - everything about the business is money or secrets
           | 
           | Where your business is on that scale determines how much you
           | are regulated and how many internal gatekeepers are
           | necessary.
        
       | WesolyKubeczek wrote:
       | So I remember the times when programmers were power users and
       | were poking fun at "lamers" and "lusers" who always went like "I
       | clicked something and a message appeared, what do I do?"
       | 
       | And now people who call themselves programmers are like this
       | themselves.
       | 
       | Now can I have some other kind of future please?
        
         | [deleted]
        
       | sgarland wrote:
       | My biggest bugbears for anyone in tech are an incurious nature,
       | and an inability to search documentation/Google.
       | 
       | Your code threw an error? Did you read it? Did you search Google?
       | 
       | A monitored metric dramatically changed with the last deploy?
       | Have you investigated why that might be?
       | 
       | I am more than happy to help troubleshoot a tricky problem if you
       | tell me what you've already tried. If you truly don't know where
       | to begin, I'm also happy to teach you. What I am not going to do
       | is fix your problem for you, with you retaining nothing.
        
       | szundi wrote:
       | Be happy that your colleagues bring you in to these
       | conversations.
        
       | darkwater wrote:
       | I'm writing this as a DevOps/SysEng/whatever you prefer: I think
       | there are two big types of developer groups in the industry, and
       | two ways to think about these issues. The first group advocates
       | for an ultimate "NoOps" world where every team, composed only by
       | SWE, is responsible for the whole code lifecycle, from inception
       | to deployment to maintenance to decommission. This is
       | aspirational in many, many companies and probably true in a bunch
       | of really good companies. Anyway I still wonder if there aren't
       | tensions there between product/business asking for developer
       | power to bring new features/changes in and developers doing the
       | Operations work beside writing business code. The second group is
       | formed by the developers described in the article, which just
       | focus on their code and need a "developer helpdesk" for
       | everything else related to actually deploying and operating the
       | software written. IME this is how the vast majority of companies
       | work, especially the "normal" ones. Some developers step "up" and
       | try to understand/do this extra work and they usually rise to the
       | top because they are the "good" engineers.
        
         | robertlagrant wrote:
         | > every team, composed only by SWE, is responsible for the
         | whole code lifecycle, from inception to deployment to
         | maintenance to decommission
         | 
         | I think this is the original thesis for DevOps.
        
           | nonameiguess wrote:
           | This isn't true at all, though. Many organizations,
           | especially smaller ones, seem to have taken it this way.
           | Everyone is responsible for everything. But that wasn't the
           | DevOps thesis. The idea was to eliminate silos in the
           | sense that dev and ops would talk to each other continuously
           | throughout the software lifecycle, with developers
           | understanding operational challenges and creating software
           | that didn't just meet functional requirements, but was easier
           | and faster to deploy, change, rollback, and troubleshoot. Ops
           | would understand the pressures of development and create
           | automation systems catered to the specifics of their software
           | stack.
           | 
           | It wasn't supposed to mean no more division of labor.
           | Division of labor is a key innovation in human society that
           | enables civilizations to exist. It was supposed to mean the
           | teams in different categories of labor interact throughout
           | and consider each other's needs, and not throw shit at each
           | other over a wall and only ever interact through a ticketing
           | system.
        
             | twic wrote:
             | I believe that it _was_ supposed to mean no more division
             | of labour, and i believe that based on conversations with
             | people who were early adopters of it ten years ago.
             | 
             | This waffly "breaking down the silos" stuff is a later
             | redefinition, i believe reacting to the fact that the
             | original meaning was extremely unpalatable to existing
             | organisations, with existing employees and hierarchies who
             | would be severely disrupted by it.
        
               | vajrabum wrote:
               | Here's a link to an interview with Patrick Debois who is
               | one of the two guys who came up with the concept of
               | DevOps.
               | https://blogs.oracle.com/javamagazine/how-dev-versus-ops-bec...
               | 
               | From Patrick Debois in the interview "Later, I saw a talk
               | by Jean-Paul Sergent about developer (Dev) teams and
               | operations (Ops) teams working together."
               | 
               | So no, breaking down the silos was baked in from the
               | beginning.
        
               | darkwater wrote:
               | But I still challenge the fact that it can really work
               | well for normal performers and not only top notch
               | developers. Taking care of all the phases of a software
               | lifecycle is not an easy task, it has a big enough
               | cognitive load. I understand that this is supposed to be
               | done in (small) teams, but within those teams you still
               | need some degree of separation of labour, you cannot have
               | everybody having wide and deep knowledge of everything
               | software. Well, you can do it, if you are Google and the
               | like and set the bar for hiring really, really high. But
               | that doesn't apply to most organizations and engineers.
               | Not all of us are wicked smart.
        
             | robertlagrant wrote:
             | You're missing the fact that a team can have different
             | specialisms. It doesn't have to be each individual can do
             | everything.
        
           | m1keil wrote:
           | The original thesis was better cooperation between developers
           | and sysadmins (ops). It didn't focus on trying to make
           | operations redundant or transfer every sysadmin into a SWE.
        
           | thesuperbigfrog wrote:
           | It is.
           | 
           | The team that writes the code should also deploy the code and
           | get paged in if there are problems in production.
           | 
           | That creates a tight feedback loop that requires developers
           | to learn and manage the whole stack, code defensively, and
           | test enough to be confident to deploy to production.
           | 
           | Didn't test your code enough? You will be paged in the middle
           | of the night to fix it. It creates a strong incentive to make
           | good decisions because you will be living in the mess you
           | create.
        
             | cestith wrote:
             | In those cases, though, it's usually a team with some Dev
             | people who know or can learn some Ops and some Ops people
             | who know or can learn some Dev. It's not just laying off
             | all the sysadmins, network admins, stack architects, and
             | all then letting the developers freefall until they find a
             | way to right themselves.
        
             | cbushko wrote:
             | And this only works if your Ops team provides good tooling
             | for deploying, logging, monitoring and alerting.
             | 
             | I have heard of companies deciding to do 'devops' and it
             | turns into a free for all of dev teams having to
             | handle/build things end to end. Everyone loses in that
             | scenario.
        
           | darkwater wrote:
           | Yes, absolutely, but it's not actually widely implemented.
           | Orgs do "DevOps" but they are just automating/writing as
           | code some things that previously were done manually by
           | Ops/SysAdmins. Now we have DevOps roles doing that same work
           | but with other tools (Terraform, CloudFormation etc.), much
           | more automation, less gruntwork and toil but STILL used as
           | "developers' IT" nonetheless.
        
             | robertlagrant wrote:
             | Not disagreeing :)
        
         | lbhdc wrote:
         | > every team, composed only by SWE, is responsible for the
         | whole code lifecycle, from inception to deployment to
         | maintenance to decommission
         | 
         | This is how it is at my startup. All of the engineers are
         | involved in managing the infrastructure for everything we
         | build. I find it gives me much better insight into my app, and
         | the feedback loop is much tighter since I am in control of
         | everything.
        
         | closeparen wrote:
         | To put some color on the "NoOps" world: it is not that product
         | engineers are directly touching cloud or metal. We have an
         | infrastructure group. It ships a PaaS. It doesn't get involved
         | with a specific tenant of that PaaS unless a product engineer
         | has evidence that there's something wrong with it & escalates a
         | page. Product engineers click the deploy and rollback buttons
         | for themselves.
        
           | darkwater wrote:
           | Thanks for stating this, I think it's exactly what a DevOps
           | oriented org should ultimately achieve. Or at least try to.
        
       | syspec wrote:
       | "I'm good at my field, people in other fields are terrible."
        
       | tester34 wrote:
       | Is this $BIG_Company problem?
       | 
       | I feel like devs at small companies are doing everything -
       | 
       | coding, testing, supporting customer, deployments,
       | troubleshooting and of course straight over ssh+winscp cuz
       | vps/bare metal are cheaper
        
         | makach wrote:
         | well, they have to. Does not mean they do all that. They just
         | skip the hard parts and focus on what is important in order to
         | survive.
         | 
         | large organisations usually have a much bigger responsibility
         | and are held to higher standards, frequently audited and
         | controlled to stay within rules and legal compliance
        
           | greedo wrote:
           | 100%. Over the last two years, a majority of my "sysadmin"
           | work has been devoted to audit and compliance tasks. Mostly
           | validating and working with auditors, but also making
           | significant changes to work processes.
        
         | michaelt wrote:
         | It's a medium-sized company problem.
         | 
         | If you've got 2 developers, they're both doing everything and
         | on call 24/7 and all have read/write access to everything on
         | demand.
         | 
         | If you've got 200 developers, you're going to start wanting a
         | team of shift workers keeping an eye on the systems, and maybe
         | you won't want every developer to have read/write access to
         | production data.
         | 
         | If you've got 20,000 developers your working practices and
         | infrastructure are almost completely cemented in place, and
         | anyone who doesn't like them has already left because it's
         | easier to change jobs than to get 20k people to change their
         | behaviour.
        
       | tomrod wrote:
       | I agree with the author's stresses. What they miss in their
       | recollection of the halcyon days of large teams of experts is
       | that it's extraordinarily expensive and (in many cases!)
       | potentially wasteful of business resources to have experts on
       | staff to maintain stable equipment.
       | 
       | Don't get me wrong, I appreciate well operating systems and the
       | people who make that happen. There is going to be a beancounter
       | wondering whether the very expensive, trained engineer can
       | operate faster/cheaper.
        
         | nijave wrote:
         | The sad truth in many cases: increased errors are cheaper than
         | the cost to prevent them
        
       | sgt wrote:
       | I laughed when he mentioned the Node developers. Come on guys,
       | 90% of Node developers couldn't even code their way out of a
       | paper bag. The quality of developers is shocking.
        
       | joedoejr wrote:
       | Haha nice one, I love being a pure "cloud" engineer just because
       | I can send an idiot dev to idiot Azure support and enjoy watching
       | them take months to solve a trivial "I don't read docs" issue.
        
       | mrintegrity wrote:
       | Having worked in Operations in some form or another for the past
       | ~20 years this articulates so well the feelings I have been
       | increasingly having over the past few of those 20 years. Now I
       | manage a small operations team and we experience pretty much all
       | of the issues highlighted in the article.
       | 
       | There needs to be a rethink of how infrastructure, development
       | and deployment is handled.. maybe the solution is to slow things
       | down and insert a little carefully thought out bureaucracy
       | between the layers (can't believe I'm advocating for more
       | bureaucracy!)
        
         | jimmySixDOF wrote:
         | You will probably never get that past the Change Management
         | Board ;}
        
       | datavirtue wrote:
       | This has been my experience as well. I used to work between infra
       | and development and I saw first hand a constant stream of
       | clueless devs that don't read documentation starting shit with
       | infra and networking because they don't know what's wrong and
       | just assumed it was the monkey-brained infra people who had
       | something screwed up. The infra people were equally disdainful of
       | the "stupid devs." Honestly, no one worked together effectively
       | but the devs would just pull in a new framework or language, roll
       | it out because some blog posts said it was cool, and then gripe
       | at infra about perceived problems.
       | 
       | Now I'm in a dev ops team (as a dev) and we spend a very large
       | swath of our time---troubleshooting infra issues. It's all AWS
       | and our problem now.
        
       | uvesten wrote:
       | From the article: "Often they have not even bothered to do basic
       | troubleshooting, things like read the documentation on what the
       | error message is attempting to tell you."
       | 
       | This has been the bane of my work happiness for a while now. I
       | keep having to tell junior devs to actually _read_ the fine error
       | message, just in case it actually _contains information about the
       | error_, you know. Not that it seems to help much, it's like they
       | can't get the concept into their heads.
       | 
       | This is 100% a problem with younger, bootcamp-"educated" devs, in
       | my experience. I know the common wisdom on social media is "no
       | one reads text anymore", but if that includes aspiring
       | developers, it might be tough to replace the current workforce
       | when that day comes...
        
         | AnIdiotOnTheNet wrote:
         | I don't think it's just because of bootcamp education, I think
         | people are growing up in a world where error messages are
         | either never displayed, or displayed in the form of "The
         | program has did a sad :( try again later."
         | 
         | They're not used to reading error messages because they've been
         | brought up seeing nothing but completely useless error
         | messages.
        
         | spaetzleesser wrote:
         | I always preach to people that they have to make it easy for
         | others to help them. Don't just say "it doesn't work" and
         | expect them to analyze the issue and take care of things.
         | Instead provide information what you did, send log files,
         | screenshots and whatever other information you may have.
         | 
         | I think the people most likely to fall into the "it doesn't
         | work" category are people who don't have much experience
         | troubleshooting difficult problems.
         | 
         | In the end it's about compassion and understanding of each
         | other. Unfortunately in a lot of companies the only direction
         | people are getting is "get it done on time". It's rare that
         | management asks people to have empathy for each other.
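
One way to make "what you did, log files, and whatever other information you may have" routine is to bundle it automatically. The sketch below is only an illustration: `failure_report` and its parameters are hypothetical names, not something from the article or the thread, and it assumes the failure surfaces as a Python exception.

```python
import platform
import sys
import traceback
from typing import Sequence


def failure_report(action: str, exc: BaseException, log_tail: Sequence[str] = ()) -> str:
    """Bundle what was attempted, the full error, and recent log lines into one message.

    Hypothetical helper for illustration only.
    """
    # Keep the complete traceback so the reader sees the actual error text.
    trace = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    lines = [
        f"What I did: {action}",
        f"Environment: Python {platform.python_version()} on {sys.platform}",
        "Error:",
        trace.rstrip(),
    ]
    if log_tail:
        lines.append("Last log lines:")
        lines.extend(f"  {line}" for line in log_tail)
    return "\n".join(lines)
```

Calling it inside an `except` block and pasting the result into the ticket gives the helper the action, the environment, and the unedited error in one place.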
        
         | bottled_poe wrote:
         | Why are software devs responsible for first level support?
         | Software devs are expensive, support staff are not. Seems like
         | bad business to me.
        
           | spinningslate wrote:
           | Big fat "it depends" on that. It might be superficially
           | correct (depending on the scope/skills of "support" staff).
           | Even if there is a meaningful financial difference between
           | the day rates, it doesn't necessarily follow when the process
           | is viewed end to end:
           | 
           | 1. If ops staff have limited expertise/authority, it's less
           | likely they can resolve problems. They might acknowledge (so
           | maintaining some aspect of client SLA), or have a limited set
           | of pre-defined remedial actions (reset button). Anything
           | beyond that, though, and it needs the dev team. So it's
           | arguable whether the ops staff provide much value in the
           | equation.
           | 
           | 2. As a dev, there's nothing quite like the prospect of being
           | paged at 2am on a Sunday to incentivize more robust code.
           | 
           | End to end dev accountability isn't a panacea either - but
           | the problem is more nuanced than just pay rates.
        
             | kazen44 wrote:
             | also, devs seem to really underestimate the pay support
             | staff makes, especially those capable of troubleshooting
             | deep, low-level and complex issues. this might be less
             | visible on the Dev side, but troubleshooting infrastructure
             | is not an easy to find skill, especially if dozens of
             | moving parts are involved.
        
               | LilBytes wrote:
               | More power to them. The rest of us 'Ops' guys are slowly
               | transitioning to DevOps Engineers and SREs and are
               | gladly taking handfuls of cash because we both know how to
               | read hex dumps and are also smart enough to know your IP
               | address isn't going to change because we replaced your
               | ethernet cable.
        
             | acdha wrote:
             | One of the best arguments for shared responsibility is that
             | it avoids "not my job" thinking. I've seen large
             | organizations burn resources and downtime because
             | developers and ops are in an adversarial relationship where
             | nobody has a stronger sense of responsibility for an
             | application working than they do for shifting the blame to
             | their counterparts. If everyone is getting things escalated
             | to them it tends to cut through that cycle.
        
         | 908B64B197 wrote:
         | > This is 100% a problem with younger, bootcamp-"educated"
         | devs, in my experience.
         | 
         | You'll get a lot of pushback here but it's definitely true.
         | 
         | That doesn't mean it doesn't happen with CS grads as well, but
         | it's quite rampant among bootcamp devs. I think the reason for
         | that is that, since the bootcamps are so short, they "stay on
         | rails" and mostly work on simple projects (that will give out
         | something they can push to a github repo and use as a
         | portfolio).
         | 
         | It's the same with git. Every bootcamp will use git and claim
         | to teach it to their grads, but then watch them do anything on
         | a repo with multiple users. A lot of them just rote memorized
         | commands to pull and push to main and that's it. Branching?
         | Rebase? Using the commit history? Never heard of.
         | 
         | For new hires from a serious Engineering or CS degree, they
         | should have had at least a few classes dedicated to projects
         | where they built something non-trivial. On top of theoretical
         | classes teaching the fundamentals.
        
         | arvindamirtaa wrote:
         | > This is 100% a problem with younger, bootcamp-"educated"
         | devs, in my experience.
         | 
         | I resent this. Not because I'm a bootcamp-"educated" dev. I'm
         | not. But it suggests somehow that devs with CS degrees are
         | somehow better in this aspect. If anything, they're arguably
         | worse (obligatory, not everyone disclaimer).
        
         | nikanj wrote:
         | "We'll just rewrite that microservice from scratch and the
         | error probably goes away. Preferably with the newest framework
         | du jour"
        
           | bdavis__ wrote:
           | 2021, State of the Practice.
           | 
           | well except for using Rust, and a bunch of dependencies from
           | the web, version determined when downloaded at compile time.
        
             | Macha wrote:
             | To be fair, Cargo defaults to using lockfiles, so you'd
             | have to go out of your way to hit both your points at the
             | same time.
        
               | bdavis__ wrote:
               | very true, some hyperbole to make a point.
               | 
               | "lack of newness" is a characteristic many will expend
               | untold hours to extinguish. to my perspective, the
               | "rewrite it in Rust" crowd is the peak; all non-Rust code
               | is soiled, and worthy of replacement.
               | 
               | (it is very possible that the "rewrite in Rust" movement
               | is just a guerrilla marketing project)
        
               | selfhoster11 wrote:
               | I don't really care for Rust, but all non-memory safe
               | code could benefit from being replaced. This does, for
               | some people, mean rewriting it in Rust because C and C++
               | make it harder to achieve the goal of memory safety.
        
           | politician wrote:
           | Not to totally discount your point, but consider that
           | rewriting is a form of study.
        
         | birdyrooster wrote:
         | It's like script kiddies showing you all of the innocuous stuff
         | they found with Google, they are illiterate and so their
         | imaginations run wild.
        
           | dijit wrote:
           | The script kiddies of yore, however, did have a desire to get
           | stuff working.
           | 
           | This led to a lot of them learning common failure modes of
           | the software they ran.
           | 
           | Ironically the people who were script kiddies in their teens
           | have been some of the best troubleshooters I know.
        
             | snovv_crash wrote:
             | Selective memory, nobody cares about Igor who never got
             | further than being a script kiddie.
        
         | dkersten wrote:
         | Force them to write heavily templated C++ code, they either
         | learn how to read pages of comprehensible error messages to
         | diagnose one little typo, or they can't do their job and are
         | forced to look for a new one.
         | 
         | It's usually not hard to figure out what's wrong from the
         | messages, but man do they look scary and hard to understand
         | when they appear. Yes I've been writing C++17 lately using some
         | very template heavy libraries.
        
           | spaetzleesser wrote:
           | "Force them to write heavily templated C++ code, they either
           | learn how to read pages of comprehensible error messages to
           | diagnose one little typo, or they can't do their job and are
           | forced to look for a new one."
           | 
           | When I did C++ we sometimes made little competitions for the
           | smallest change that can produce the craziest error messages.
           | On the other hand I always found it extremely satisfying to
           | make one little change that removed thousands of errors and
           | warnings.
        
             | dkersten wrote:
             | I had a nice one today where it complained about some thing
             | not being invokable deep in some std code somewhere. Lots
             | of crazy template instantiation errors. It turned out I
             | forgot to pass the variant parameter to std::visit.
        
             | shoo wrote:
             | I've avoided C++ for most of my career, but one idiotic
             | mistake I do remember making was accidentally leaving an
             | open curlybrace at the end of one of my source files. The
             | C++ compiler ran and reported 1000s of compilation errors
             | all through every single _other_ file -- in my code,
             | throughout all the library code I included.
             | 
             | Easily diagnosed if you're working incrementally, one small
             | change at a time, and making checkpoints with version
             | control: `git diff`, carefully review the diff of what you
             | changed since the last checkpoint where things were more or
             | less working. I must have not been disciplined enough to
             | work like that at the time.
             | 
             | Troubleshooting systems integration failures is also
             | character building for getting better at diagnosis from
             | errors. Sure, it's failing, but let's try to figure out the
             | immediate layer of failure from the logs, error messages,
             | symptoms: name resolution? tcp? tls? http proxy?
             | authentication? authorisation? api spec misalignment? error
             | in our application code or the system we're directly
             | talking to? unexpected data? error in some other system
             | that we depend upon transitively? each time you hit a new
             | novel failure mode, or fail at one level deeper, you're
             | making progress!
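
The layer-by-layer questions above (name resolution? tcp? tls?) can be sketched as an ordered list of checks that stops at the first failure. This is a minimal illustration, not a tool from the thread: `first_failing_layer` and `network_checks` are hypothetical names, only the first three layers are covered, and the default port and timeout are assumptions.

```python
import socket
import ssl
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], None]]


def first_failing_layer(checks: List[Check]) -> Optional[Tuple[str, str]]:
    """Run checks in order; return (layer, error text) for the first failure, else None."""
    for layer, check in checks:
        try:
            check()
        except Exception as exc:  # report whatever that layer raised, unedited
            return layer, f"{type(exc).__name__}: {exc}"
    return None


def network_checks(host: str, port: int = 443, timeout: float = 5.0) -> List[Check]:
    """Build DNS -> TCP -> TLS checks for host:port (hypothetical helper)."""

    def dns() -> None:
        socket.getaddrinfo(host, port)

    def tcp() -> None:
        socket.create_connection((host, port), timeout=timeout).close()

    def tls() -> None:
        context = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # The TLS handshake (including certificate checks) runs in wrap_socket.
            with context.wrap_socket(raw, server_hostname=host):
                pass

    return [("name resolution", dns), ("tcp", tcp), ("tls", tls)]
```

For example, `first_failing_layer(network_checks("example.com"))` returns `None` when all three layers succeed, or the name of the shallowest layer that failed together with its error message. Checks for HTTP, authentication, and application-level errors could be appended to the same list.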
        
         | dsr_ wrote:
         | It was about 27 years ago that the senior sysadmin at my
         | first real job told me: "Most young people don't have any
         | hacker spirit at all, they just give up the first time they see
         | an error message. The world is not going to survive that."
         | 
         | I was doubtful that this was a universal truth then, and I
         | think it's the same now: there are a lot of people who do
         | mediocre work, and they are and were supported by a smaller
         | group of people who do really good work. And the world keeps
         | turning.
         | 
         | One of the joys of the Internet and of open source is the
         | increased ease of sharing ideas and solutions.
        
           | ByteWelder wrote:
           | > they are and were supported by a smaller group of people
           | who do really good work
           | 
           | I think the only solution for this is when "support" includes
           | education. In the simplest form, we can give support by
           | helping that colleague to find the issue himself, rather than
           | giving the solution directly. In a more advanced form, you're
           | making structural changes to your company. Like in how you
           | share knowledge with the team.
        
             | datavirtue wrote:
             | Reality check: they run off and nag someone else...very
             | discreetly...until someone caves and fixes it for them. If
             | they get enough pushback they will just quit.
        
           | indigodaddy wrote:
           | Salient point and makes a ton of sense. I think this is
           | accurate, totally coincides with my entire career/work
           | experience.
        
         | blowski wrote:
         | Some seniors remind me of street preachers shouting "Hell and
         | damnation awaits you!", and they get much the same response
         | from people. Not saying this is you, but I'd be interested to
         | know what ways you've tried and what's worked or not.
        
         | GlennS wrote:
         | Keep at it: some crusty guy telling me to just read the sodding
         | error 12 years ago opened my eyes.
         | 
         | I think there's something about certain kinds of tools that are
         | cryptic, unpredictable, and frustrating that can teach you to
         | be helpless - to just Google and hope. It's fixable though.
        
         | acdha wrote:
         | > This is 100% a problem with younger, bootcamp-"educated"
         | devs, in my experience.
         | 
         | In my experience it's also common with older and college-
         | educated ones; contractors trying to avoid extra hours; senior
         | architects; and especially anyone who thinks ops is someone
         | else's job. It's definitely not specific to age or training
         | mode.
         | 
         | There are a few contributing factors I see: tunnel-vision
         | focused on the particular detail they think they're working on,
         | causing them to ignore anything they "know" isn't related;
         | shoddy tools like much of the Java ecosystem where poor culture
         | around logging trains every user that it's normal to have huge
         | amounts of log spew; etc. but the biggest problem I have seen
         | is ego -- either unwillingness to believe that the product of
         | their staggering intellect could be less than perfect or that
         | the mundane task of getting their grand vision to actually work
         | is for the little people.
         | 
         | I'm thinking of a "senior architect" who was quite surprised to
         | learn that networks are neither perfect nor instantaneous, and
         | that his app might have some issues due to needing thousands of
         | XHR calls to load the UI. It was so much easier to ignore the
         | error messages and say the problem was Chrome. He had a CS
         | degree - the problem was the wrong mindset and having been
         | enabled to avoid good troubleshooting skills.
        
           | BizarroLand wrote:
           | It could be argued that there is a technological Maya that
           | has to be overcome before you mature as a developer. Most
           | people grow up spoon fed the "it just works" ideal, even old
           | school computer people never really had to doubt that their
           | floppy drive would read a disk or that their PC BIOS would
           | accurately boot their computer.
           | 
           | It's only after the technological illusion of Maya breaks
           | that you realize floppies have read heads, hard drives have
           | moving parts, CPUs have conductive traces and all of these
           | are vulnerable to breakdown, entropy exists in the system and
           | cannot be expelled, that the previous "ideal" state of your
           | system was temporary, an illusion, that nothing always works
           | the way it is supposed to and that your options boil down to
           | "burn it to the ground and start over" or "leap into Hades
           | both feet first to rescue the soul of what you love".
           | 
           | Most people go the first route. Buy a new one. Replace what
           | is broken with something else. That way the illusions are
           | never broken. The technology didn't fail, only its current &
           | easily replaceable avatar.
           | 
           | As the Son of God once did, after its death it will rise
           | again, immortally replaceable.
           | 
           | However, it is only after you have faced that 2nd trial by
           | fire and returned with your elixir that you as a changed
           | being can peer through the veil. The meme about "CPUs being
           | rocks we filled with lightning & tricked into thinking" rings
           | differently to you now.
           | 
           | You've touched the bones of the God and found that they
           | crumble. There is no God here, only a beautiful shambling
           | nightmare that has eaten the minds and souls of millions,
           | built by mad scientists and engineers in a vain attempt to
           | create the God whose physical absence they find themselves
           | longing for the same way a neglected child longs for the
           | embrace of their mother.
           | 
           | https://psychology.wikia.org/wiki/Maya_(illusion)
        
         | jamil7 wrote:
         | > I keep having to tell junior devs to actually _read_ the fine
         | error message
         | 
         | I wonder if it's also to do with the environment in which they
         | learn. When I was learning to program, like probably others
         | here, I didn't have anyone around me who knew anything about
         | computers so was generally on my own until my first job and had
         | to dig through stack traces and read error messages and had to
         | try and figure out what was wrong. Kind of a blessing and a
         | curse as I imagine my progress would have been quicker and I
         | wouldn't have hit so many brick walls but I learned to debug
         | independently.
        
         | duped wrote:
         | I have the opposite experience, boomer devs that see red or
         | yellow in their terminal and send me an email or jira ticket
         | before trying to parse it. The younger generation at least can
         | make judgement calls on warnings.
        
         | giantg2 wrote:
         | I do have one anecdote/counterpoint. At my company we do
         | DevOps, but we have a central group that creates the templates
         | we are supposed to use for various AWS resources, build plans,
         | and deploy plans. It makes it extremely frustrating to
         | troubleshoot issues specific to the templates or how you
         | configured them because they usually aren't something you can
         | google and the documentation for them is extremely basic.
         | Sometimes you have to ask the people who created them since
         | they have the deep understanding and experience.
        
           | le-mark wrote:
           | I share this pain, but believe it or not, it's actually much
           | better than everyone doing their own thing all willy nilly.
        
             | giantg2 wrote:
             | I agree that it has to be done to have consistency. I'm
             | just pointing out that when using IaC designed by another
             | team where the internal workings are largely hidden, us
             | primarily dev guys are going to need some help with issues
             | that appear to be ops related. (And yeah, occasionally root
             | cause will be that I'm just an idiot)
        
           | nickjj wrote:
           | I've been contracting at a company for a few years and things
           | progressed from manually setting up servers (before I was
           | involved) to using Ansible to provision multiple servers and
           | now it's transitioning to a Kubernetes cluster and
           | infrastructure as code from day 1 along with tomes of
           | documentation to go with it.
           | 
           | It really is worth it to go the extra mile and write
           | comprehensive docs, even going as far as writing them in a
           | conversational tone as if it's a blog post or a book. I'm
           | really happy I found a company who treats documentation and
           | workflows as first class resources.
           | 
           | For a small team where only 1 person is working on this it
           | helps eliminate the bus factor and it also makes it easier to
           | have non-hardcore ops folks do code reviews on your IaC.
           | Having them be able to get the gist of it with a little bit
           | of background knowledge is so much better than nothing. All
           | of this results in higher reliability of the services your
           | company offers.
        
         | callesgg wrote:
         | I don't know, most of the time error messages on computers are
         | garbage. People build up some type of fear for the error
         | message format.
         | 
         | I have an example:
         | 
         | I built a logistics and invoicing tool a few years back with
         | error messages that were human readable, with clear proper
         | messages that told the user exactly what they did wrong and it
         | even proposed how they might fix the problem.
         | 
         | I don't know how many times I had to go to the users'
         | workstations, read the text out loud for them like they were a
         | 5 year old and ask them what they thought it meant. They always
         | knew what it meant but I had to read it for them; it was
         | embarrassing.
         | 
         | And these were university educated accountants that were
         | using the software.
         | 
         | After a lifetime of garbage error messages like "error code
         | 4513" people just zone out.
        
           | BizarroLand wrote:
           | For real. How difficult would it be to have the computer tell
           | us what the error code means instead of a single sentence.
           | 
           | I get codes back in the day when storage for a whole book was
           | costly, but that isn't the case anymore. Just tell us the
           | error, show us the pointers, and then tell us what typical
           | fixes are instead of expecting us to go to the internet for a
           | solution.
        
             | blacksmith_tb wrote:
             | My take is that cryptic error messages have always been a
             | cross between 'protecting company secrets' and 'never
             | really admit to a mistake'. Sure, MS or Apple could just
             | throw up a dialog that said "Sorry, we trashed the file you
             | were working on, here are the last 1024 chars" but people
             | would actually be more angry then, instead of just "Error
             | 1234 occurred" (or at least they'd have to go look it up to
             | be mad).
             | 
             | As developers, we also have to be used to a lot of
             | completely unhelpful errors. Yeah, couldn't connect to the
             | DB, sure... oh, but actually because my code ate all of
             | memory, why didn't you say that in the first place?
        
         | ziml77 wrote:
         | I have never met these kinds of people and it boggles my mind
         | that they exist. Actually reading error messages is the most
         | basic thing I'd expect from a developer considering that you're
         | going to encounter them regularly during the development
         | process.
        
         | iovrthoughtthis wrote:
         | this feels like one of those things that needs to be tackled
         | from both ends
         | 
         | teach people to read error messages and simultaneously improve
         | the readability (and utility) of error messages
         | 
         | i don't know why we put up with such bad error messages
         | anymore. i imagine it's a function of stockholm syndrome and
         | the difficulty in getting messages changed
        
           | zaphar wrote:
           | I sort of blame exceptions here. They make it really easy to
           | just let the error bubble up to the top. But often times the
           | place where the error gets thrown doesn't have all the
           | necessary context to have a really good error message. If you
           | want a good readable error message you have to trap the
           | exception at the appropriate place and then wrap it with the
           | appropriate amount of context in the message. But the easy
           | path is to pretend there is no error and let the very top
           | layer surface anything that went wrong to the user.
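
A small sketch of the "trap it at the right place and wrap it with context" idea, assuming a hypothetical `load_settings` helper and a hypothetical `ConfigError` class (neither comes from the thread). It uses Python's `raise ... from ...` so the original exception stays attached for debugging.

```python
import json


class ConfigError(Exception):
    """Hypothetical application-level error carrying human-readable context."""


def load_settings(path: str, service: str) -> dict:
    """Load JSON settings, re-raising low-level failures with context attached."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except OSError as e:
        # 'from e' keeps the original exception available as __cause__.
        raise ConfigError(
            f"could not read settings for {service!r} from {path!r}: {e.strerror}"
        ) from e
    except json.JSONDecodeError as e:
        raise ConfigError(
            f"settings for {service!r} in {path!r} are not valid JSON "
            f"(line {e.lineno}, column {e.colno}): {e.msg}"
        ) from e
```

The top layer can still just print the exception, but now the message says which service and which file were involved instead of only what the lowest layer saw.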
        
         | tompazourek wrote:
         | I have experienced this same issue many times. But from my
         | experience, I cannot simply pinpoint it to young developers or
         | their education. I have seen this behavior with several older
         | people, university educated, with many years of professional
         | experience. So not just an age/generational thing, in my
         | opinion.
         | 
         | Maybe they have seen so many red herrings that they don't even
         | trust that the error message could contain something useful and
         | relevant? Or maybe they just learned to skim through
         | everything, and don't actually read stuff.
        
           | vladvasiliu wrote:
           | I have the same experience.
           | 
           | I also don't understand why, when they ask for help, they can
           | never be bothered to say what they're trying to do, what
           | error message they got, etc. It feels like they're doing me a
           | favor when I try to help them fix something.
        
             | politician wrote:
             | Then there's the StackOverflow effect: where you ask a
             | reasonable question X and get a bunch of upvoted
             | condemnations along with directions to do Y instead.
        
             | waylandsmithers wrote:
             | Right. When I was learning to code a more seasoned dev told
             | me "The code doesn't do what you want it to do, only what
             | you _tell_ it to do."
             | 
             | Really helped me gain the mindset that not only was it my
             | mistake that resulted in code not running, but that it was
             | fixable. Like a game of ping pong. You hit the ball,
             | sometimes the compiler hits it back.
        
           | cbushko wrote:
           | I think it comes down to passion and curiosity. Those are two
           | things that can happen at any age and education. It is also
           | something that ebbs and flows based on energy levels.
        
             | Noujin wrote:
             | I think the same. It's so typical everyone here tries to
             | pin characteristics to exact professional groups. The
             | amount of anecdotal evidence here is too damn high.
        
         | mrtksn wrote:
         | I think people don't read error messages if fixing the issue is
         | beyond their immediate capabilities.
         | 
         | Computers are scary things that fail in counterintuitive ways.
         | When the handle of your tea cup breaks, the issue is intuitive
         | and most people will be able to understand why it is happening
         | and how to work around it (handle it carefully from the top end
         | and enjoy your tea?).
         | 
         | But when it comes to computers, often you need deep
         | understanding of its inner workings to make sense of your
         | observations of problems. Why would Xcode say that it failed to
         | compile my project because usefulExtensions.swift already
         | exists? What is it supposed to mean? I see only one file with
         | that name. That information gives an intuitive idea about the
         | issue only if you know how the compiling process works.
         | 
         | Why would I know why the package couldn't be found? Unless of
         | course I know how that package manager works. Then I can check
         | if the package manager is configured to look at the correct
         | places.
         | 
         | Most error messages are like that. They instantly make intuitive
         | sense if you know how everything is glued together and make no
         | sense and need study if it's outside of your domain of
         | expertise. No one reads error messages unless they can
         | recognise the pattern instantly and there's data (like the
         | name of the variable) guiding you to the fix.
        
           | zaphar wrote:
           | First of all I agree with you. But I would like to note one
           | aspect of this that has always been true. Many times the
           | error message itself is terrible. They say that an error
           | happened and then they give you absolutely no context for the
           | error. My favorite example is when an application happily
           | bubbles up the error from the operating system when a File
           | read/write operation fails. The OS will tell you what kind of
           | operation failed. It won't tell you which file you were
           | trying to read or write. It won't tell you what location you
           | were trying to read or write. Basically none of the context
           | you need to really understand what went wrong. You'll have to
           | go read the code in order to mentally reconstruct what the
           | context is. It's silly. If you are writing a file and don't
           | at a minimum catch and then wrap the error with an error of
           | your own that adds the necessary context then you are
           | contributing to the problem here. And that's just one common
           | example. There are many more.
        
             | sgarland wrote:
             | I'm an SRE, so my programs are generally Python scripts <
             | 1K LOC; maybe this isn't scalable, but I write verbose log
             | statements (if it's launched with --verbose, of course).
             | It's not that much effort to change `except OSError as e:
             | log.error(e)` to `except OSError as e: log.error(f"Error
             | accessing {file} - {e}")`
             | 
             | If I know typical causes of errors (forgot to connect to
             | the VPN, etc.), I'll include them in the log message as
             | well as things to check.
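
A sketch of that habit, assuming a hypothetical hint table keyed by `errno` (the `HINTS` contents, `describe_os_error` and `read_text` are made up for illustration). The idea is only that the log line names the action, the target, and something to check first.

```python
import errno
import logging

log = logging.getLogger("sketch")

# Hypothetical hint table: errno -> what is usually worth checking first.
HINTS = {
    errno.ENOENT: "check the path and the working directory",
    errno.EACCES: "check file permissions and which user the script runs as",
    errno.ENOSPC: "check free disk space",
}


def describe_os_error(action: str, target: str, e: OSError) -> str:
    """Build a log line saying what was attempted, on what, and what to check."""
    hint = HINTS.get(e.errno)
    message = f"Error {action} {target}: {e}"
    return f"{message} (hint: {hint})" if hint else message


def read_text(path):
    """Return the file's text, or None after logging a contextual error."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except OSError as e:
        log.error(describe_os_error("reading", path, e))
        return None
```

Environment-specific causes (the VPN example above) would go in the same table; they are left out here because they depend entirely on the setup.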
        
               | zaphar wrote:
               | That might at least in part be because you are an SRE and
               | at some point you hit your limit of inscrutable error
               | messages happening in production.
        
           | spaetzleesser wrote:
           | "I think people don't read error messages if fixing the issue
           | is beyond their immediate capabilities."
           | 
           | At least capture them so somebody who knows that area can
           | make sense of them.
        
           | Lutger wrote:
           | This is a large and ongoing part of becoming a developer. It
           | also happens again every time you try to learn a sufficiently
           | new or alien technology. You know you start to make progress
           | when the error messages begin to make sense.
           | 
           | Often you need to know a lot of context before you're even
           | able to determine what the error message is! One error
           | message can lead to a cascade of other error messages, or
           | it's something breaking down as a result of multiple layers
           | of indirection, requiring the developer to carefully track the
           | trail of what went wrong and led to another thing failing,
           | which broke down the next thing and ultimately, decided to
           | stop the program and mention only the very last thing falling
           | apart to the user. There might be a directly sensible
           | connection with the original error, but often it's quite
           | unrelated. An experienced developer often immediately
           | recognizes: this is not the actual error message, that other
           | thing is! But for a junior it's all equally incomprehensible.
           | 
           | It is detective work with many false leads, and being very
           | new at something it can be so overwhelming you don't know
           | where to begin and immediately assume you will not succeed
           | finding out 'whodunnit', asking your senior co-worker for
           | help.
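
To make the "the last error is not the actual error" point concrete, here is a rough sketch that walks a Python exception chain back to where it started. The helper names `error_chain` and `root_cause` are illustrative, and this only helps when the layers in between actually chained their exceptions.

```python
def error_chain(exc: BaseException) -> list:
    """Return exceptions from the last one reported back to the original one."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        chain.append(exc)
        seen.add(id(exc))
        # __cause__ is set by 'raise ... from ...'; __context__ is set implicitly
        # when an exception is raised while another one is being handled.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


def root_cause(exc: BaseException) -> BaseException:
    """Return the earliest exception in the chain."""
    return error_chain(exc)[-1]
```

Printing `type(e).__name__` and `str(e)` for each entry gives the trail from "last thing that fell apart" to "first thing that went wrong", which is usually the one worth reading.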
        
           | nanis wrote:
           | > Computers are scary things that fail in counterintuitive
           | ways.
           | 
           | Not being scared of the tool and believing in one's inherent
           | supremacy over it must be the most basic criterion for
           | practicing this craft, but these days this fear is nursed, at
           | times encouraged, at times even exalted (corollary of the
           | failure fetish) especially by those who publicly place
           | themselves as ambassadors.
           | 
           | Any introduction to computers _must_ start with the statement
           | that they are all heaps of plastic and sand and the only
           | things they are able to do are because some mortal sat down
           | and spent time figuring it out.
           | 
           | People starting out now are at a disadvantage because their
           | first encounters happen mostly through extremely polished
           | looking apps and it is hard to see at the outset how one
           | could go from weird incantations in a text editor to _that_.
        
             | rjknight wrote:
             | I think it's harder now than it used to be. When I started
             | programming, any error message you saw would almost
             | certainly originate from the beige box on the desk in front
             | of you, which had only a single CPU core and often didn't
             | really do multi-tasking. Over time, our computers,
             | operating systems, applications and networks have become
             | vastly more complex, with the effect that you can't easily
             | build an intuition about which component is responsible for
             | a failure and why.
             | 
             | Personally, I _love_ debugging things. I have a very good
             | "theory of mind" for dealing with computer failures, and
             | figuring out why the computer isn't doing what I might
             | naively expect it to do is a lot of fun. However, it's only
             | fun because I've been able to stay on top of the curve as
             | the systems I work with have become more complex. Starting
             | from zero today sounds a lot more daunting.
        
               | jimbokun wrote:
               | It gets a lot less fun when there's a customer visible
               | outage and you need to know why things are failing NOW
               | because dollars are on the line.
        
               | mtVessel wrote:
               | True, the stacks are more complex today, but the
               | resources are greater. Back in the days when the error
               | was in the beige box, if the books/CDs/DVDs you had on
               | your shelf didn't address the exact problem, you had to
               | roll up your sleeves or you were SOL.
               | 
               | Nowadays, research skills are more important, but I see a
               | lot of devs who just don't have them. Can't find the
               | answer on the first page of your first (poorly formed)
               | search? Run get the senior dev. To me it reads like
               | incuriosity and laziness, or lack of training.
               | 
               | I don't mind doing some coaching, but if you're a dev,
               | and you can't even be bothered to read the error message,
               | what does that say about your effectiveness?
               | 
               | /rant
        
             | colonelpopcorn wrote:
             | This is the correct comment.
        
             | mrtksn wrote:
             | > Any introduction to computers must start with the
             | statement that they are all heaps of plastic
             | 
             | This scales to everything IMHO, everything is simple once
             | you understand it. Levels of abstraction are what make it
             | scary and complex. I.e. electricity or fire is also not
             | scary once you know how to handle it.
        
               | cestith wrote:
               | One should not have an unreasonable fear of fire, but
               | "unreasonable" is a quite key word there. One should have
               | a healthy respect for it and feel some fear if those
               | around you don't.
        
           | burnished wrote:
           | My own perspective on this was that it felt like my brain
           | would turn to goop when I got an error message, my eyes would
           | cross and I would start frantically googling literally
           | anything, skimming stack overflow, and getting nothing done.
           | In order to progress I had to learn to slow down and start
           | reading error messages and learning what they meant.
           | Sometimes this meant I had to look up a bunch of words, one
           | after another, to understand something incredibly dense.
           | 
           | So yeah, I think I largely agree with your assessment, and
           | would only go on to state that the path forward is slowing
           | down to learn vocabulary and think critically. You really
           | speed up after that.
        
           | Scarblac wrote:
           | > But when it comes to computers, often you need deep
           | understanding of its inner workings to make sense of your
           | observations of problems.
           | 
           | They're supposed to have that knowledge, or at least not be
           | afraid to dive in and get that knowledge.
           | 
           | There's only one way to build an intuition of what kind of
           | problem probably causes some error (most famously, if the
           | error is completely incomprehensible, you missed a closing
           | thingy on the previous line), and that's by doing the work a
           | lot.
        
             | politician wrote:
             | Unfortunately, the way we've organized this industry, a
             | junior developer stuck on an Agile(TM) hamster wheel has no
             | time to dive in and figure it out.
        
               | datavirtue wrote:
               | Very very true. When something causes a 1 point story to
               | take three days; bring on the hacks and compromises and
               | ignore anything that doesn't need dealt with to get it
               | out the door.
        
             | mrtksn wrote:
             | > They're supposed to have that knowledge
             | 
             | I don't think so. We can do so many amazing things with the
             | computers precisely because we don't have to know how
             | things work. Computers are so many levels of abstractions
             | over printed metal on melted sand.
             | 
             | People who know what they are doing will understand the
             | errors of their own creations and will learn the workings
             | of the tools they use to some degree and will be able to
             | understand the failure modes of these tools with experience
             | over time. No one starts with complete knowledge before
             | they start building things.
             | 
             | > or at least not be afraid to dive in and get that
             | knowledge.
             | 
             | Of course they should have the drive but people's first
             | instinct would be to make the error go away so that they
             | can do their actual work. People have limited time and
             | energy, you can't expect a JS developer, for example, to
             | study inner workings of a Linux box to understand all
             | errors. It's cool when they do and gives them superpowers
             | but it also makes them less productive as JS developers.
             | Sometimes you simply need to implement that button to
             | render on the server without studying the server.
        
           | rudasn wrote:
           | > I think people don't read error messages if fixing the
           | issue is beyond their immediate capabilities.
           | 
           | How would they obtain those abilities, though, if not by
           | spending time on the issues brought up and learning how to
           | learn?
           | 
           | I think sometimes people are just bored and can't be bothered
           | to find the cause and solution to their issues, and over a
           | long period of time that mentality sticks and becomes second
           | nature resulting in phrases like "this software sucks, I need
           | to read the docs to use it".
        
             | mrtksn wrote:
             | Sure, if they study it then they learn it.
             | 
             | The problem is, learning is taxing and many times you
             | encounter these errors when you have more important things
             | to do.
             | 
             | When you want to develop your game and the IDE is
             | complaining about something about locating some files, do
             | you think that it is a good idea to learn how that IDE
             | organises dependencies?
             | 
             | Sometimes you suck it up and learn it and you know next
             | time. However, your first instinct would be to look for
             | ways to make the error go away so that you can immediately
             | start working on the task that you are supposed to work on.
             | That's why we have abstractions and when things work fine
             | we don't know how things work.
             | 
             | You shouldn't be expected to have complete knowledge
             | of all computer systems, tools and frameworks before you
             | can make a ball image bounce on the screen.
        
               | datavirtue wrote:
               | Taxing as hell. I have personal projects well underway
               | that go unfinished because of tooling complexity or some
               | other issue causing me to completely derail and spend
               | days figuring out some type of issue that has
               | absolutely nothing to do with what I'm trying to
               | accomplish. Granted, since I have gotten away from Visual
               | Studio it is much better but it still happens. If it
               | isn't the IDE or package upgrades it's AWS or Azure
               | issues.
        
           | Aeolun wrote:
           | But the moment you _get_ these error messages they have to
           | become your domain of study.
        
       | cies wrote:
       | To me the line between dev and ops is where we put it. And I like
       | to make it very explicit.
       | 
       | In shared hosting times, Ops maintained Puppet definitions,
       | created new "deploy environments" (using Puppet), gave/revoked
       | server access to employees, monitored servers. Devs maintained
       | the source repos and deployed to the environments provided by
       | ops.
       | 
       | Now we live in virtual machine times (docker). Ops does the cloud
       | infra (terraform), monitors the services, gives/revokes access to
       | cloud services. Devs maintain the source repos and deploy to
       | the cloud clusters provided by ops.
        
       | Jenk wrote:
       | This blog perfectly articulates the strife that inspired and
       | drove the DevOps trend.
       | 
       | I am always saddened when I hear "our organisation has a DevOps
       | team" - immediately this demonstrates the fundamental lack of
       | understanding of the very premise of what DevOps set out to solve:
       | Bringing Development and Operations _together_.
       | 
       | Even the very name "DevOps" was constructed to symbolise the
       | combining of the two domains into one. But no. Now we just have a
       | new cool title to throw on people who will be ringfenced just as
       | they were before.
        
         | dogleash wrote:
         | I'm more and more convinced that "DevOps" (and "Agile" before
         | it) are just buzzwords that can be leveraged to make whatever
         | change the person implementing it wanted to do all along, with
         | zero regard for what the buzzword actually means. If real
         | devops was going to happen, it wouldn't take a brand to sell
         | cross functional collaboration. It would have just been yet
         | another one of the constant stream of incremental improvements
         | we make by folding lessons learned in industry into our own
         | orgs.
         | 
         | "DevOps" today is just codified Shadow IT.
        
         | mrweasel wrote:
         | > Bringing Development and Operations together
         | 
         | Yep, because developers might not know what ops needs in terms
         | of traceability, logs and so on, to be able to run their code
         | in production without having to wake them up at 2AM. Similarly
         | Ops knows a lot about what can be done with existing
         | infrastructure, or off the shelf components, which can save a
         | huge amount of work, while providing a more stable system.
         | 
         | I do mostly operations now, and I'm lucky enough to work with
         | really talented developers, who care to listen to input, before
         | writing 5000 lines of code. I also work with customers, who
         | have their own developers, with their own weird ideas about the
         | world.
         | 
         | The biggest problem I see right now, except for the occasional
         | cowboy pretending to be a professional developer, is developers
         | picking technologies without understanding them. We work with
         | customers who picked technologies because they're interesting,
         | not because it's what they need. When performance is terrible
         | it becomes an operations issue and being told "Kafka is not
         | actually a database and shouldn't be used as one" often isn't
         | the answer they want. Or try telling a developer that the code
         | he worked on for three months can be done by the existing load
         | balancer in a few hours or that the ORM is actually writing
         | terrible queries.
         | 
         | DevOps team, as in: "We use the shared knowledge of both
         | parties" is fantastic, but operations is frequently an
         | afterthought and not involved in the design phase.
         | 
         | If we're to take "DevOps" as developers doing operations, I'd
         | prefer that we do the opposite and let operations do
         | development. I think we'd get better results.
        
       | criticas wrote:
       | Another root cause in our environment is alluded to in the
       | article. With the rise of test frameworks, devs seem to test to
       | prove the API is correct, not to find problems.
       | 
       | Another symptom of this is that when the QA/Staging function went
       | away, load testing became perfunctory. Many of the performance
       | problems we see should have been caught in QA. Devs are anxious
       | to ship and get on to the next sprint, leaving app support and
       | operations on the hook.
        
         | nijave wrote:
         | >Devs are anxious to ship and get on to the next sprint
         | 
         | I think it goes a step farther back to product. PMs and
         | analysts put constant pressure on developer teams to complete
         | work quickly and that time pressure shows up on the next guy's
         | plate, etc
         | 
         | "Trickle down software engineering"
        
       | forinti wrote:
       | A lot of the animosity between teams comes from the fact that IT
       | departments are being pushed harder and harder.
       | 
       | Programmers have to push out an endless stream of features; DBAs
       | have to deal with ever greater amounts of data; network people
       | have to deal with an enormous number of endpoints (and now the
       | network extends itself beyond the firm, so security concerns have
       | grown exponentially).
       | 
       | The real challenge is to make your IT departments realise that
       | they are not each other's obstacles.
        
         | relax88 wrote:
         | The problem is often political. I have been at many orgs where
         | the management of the dev team overpromises something to the
         | executive, and then when ops finds out about it and realizes
         | the project is going to be a giant dumpster fire whose failure
         | they will likely be blamed for, it becomes really hard for
         | people to foster a "we're on the same team" mindset.
         | 
         | As with most organizational dysfunction, middle management
         | fiefdoms are to blame.
         | 
         | It always helps when the executive can see through this
         | bullshit and ask the right questions, but often by the time
         | this happens millions of dollars have been wasted.
        
       | diNgUrAndI wrote:
       | Learning new abstractions is not a good argument though. Did
       | hardware engineers from the pre-OS era complain that Linux / the
       | OS was hard to understand?
       | 
       | In a sense, kubernetes is the new Linux / Bash of our time.
       | 
       | If it's painful, maybe the abstraction just isn't done right;
       | that's not the fault of abstraction itself.
        
       | hosh wrote:
       | I have done both application development and ops. In my current
       | job, I am doing both.
       | 
       | There is a big difference between the mindset of what makes a
       | good application developer and what makes a good ops person.
       | 
       | Application developers, by and large, have a sort of "sandbox"
       | within which features are developed. This sandbox results from
       | working with abstractions, each with some kind of guarantee. For
       | example, most application developers assume that what you write
       | into memory will be what you get out. That is, hardware is
       | abstracted. The idea that the memory chips themselves can have
       | defects or can sometimes fail, even if one gets error-correcting
       | memory chips, is a violation of guarantees. Memory that just
       | works is taken for granted. Another example is assuming that the
       | system clock is monotonic.
       | 
       | This extends to things like networks, storage, operating
       | characteristics, and so forth. Very few application developers
       | get into that nitty gritty, let alone all the plumbing and
       | interactions among different systems.
       | 
       | I've seen application developers get incredibly frustrated and
       | angry when those underlying guarantees are violated in some ways.
       | I've been like that when I put on my application developer "hat".
       | The main reason is that the developer is holding as much of the
       | state and logic as they can, and they do this by excluding things
       | through abstractions. They want their tooling and platform to
       | just work so they can focus on writing good software.
       | 
       | The thing is that, for a good ops person, all that nitty gritty
       | and plumbing _is_ the focus of their job. It takes a very
       | different mindset to troubleshoot: you start looking at those
       | "guarantees" and find out what they are actually doing.
       | 
       | I once interviewed at a place which has this amazing way of
       | figuring out if someone has the mindset and tenacity to be a good
       | ops person. It was not writing algorithms on a white board. It
       | was a deceptively simple task of installing a piece of software.
       | And even though there is documentation for installing that
       | software, there is no documentation for installing that
       | software for every single environment and requirements and its
       | interaction with other systems. When adding in a time crunch, and
       | the scrutiny of an observer, that simulates a pretty typical day
       | pretty well. You have to have enough emotional intelligence to
       | keep working through it until it works. Documentation is always
       | sparse and can't be guaranteed to be correct. Runbook? Good idea,
       | but there is no way even a meticulously crafted runbook for one
       | component is going to be able to describe how systems interact
       | with each other. Someone, somewhere has to figure that out.
       | (Well, they don't have to. We can just let the system fail).
       | 
       | And sometimes, as an ops person, you have to open that "black
       | box" and read code. Just like sometimes, an application developer
       | needs to pop open the abstraction layer and pull out netcat or
       | sysdig.
       | 
       | In the end, I'm not lamenting that DevOps blurs who owns what.
       | Maybe this is because I've mostly worked on small, early-stage
       | startup teams. Complexity has to live somewhere. I like working
       | on the teams where people talk to each other to figure things
       | out.
        
       | cbushko wrote:
       | I am in Ops and I can understand this point of view. The author
       | is probably overwhelmed with issues that are 'not his problem'
       | and this is his rant on it.
       | 
       | To be fair, Developers are getting slammed with their
       | responsibilities too. At one time it used to be that they could
       | just know one programming language really well, like Java,
       | compile their code and hand it off to QA.
       | 
       | Now they have to know a dozen languages, frameworks, do their own
       | testing, deploy the service, monitor it and troubleshoot
       | everything in production in some 'cloud'.
       | 
       | Or they are just being lazy and this guy is sick of it. That is
       | when you do your best to train people up and get them to put in
       | the leg work. Ask pointed questions about if they Googled the
       | error and help them work through the problem. Then add some
       | things to the docs to help others out in the future.
        
       | bsedlm wrote:
       | I've seen a larger trend of developers (modern app programmers)
       | not knowing how to use (in a deep advanced way) their own
       | computers.
       | 
       | I guess these developers will end up writing code directly on
       | github online editors...
        
       | _wldu wrote:
       | Years ago, everyone working in tech was an IT generalist. They
       | did everything (DB design, systems, applications, algorithms,
       | code, networking, etc.). Today, the field has matured and people
       | are able to specialize.
       | 
       | Sometimes, when old IT generalists work with new IT specialists,
       | these sorts of misunderstandings occur.
        
         | WJW wrote:
         | Years before everyone was apparently a generalist, you would
         | have a separate DBA, a separate architect, sometimes even
         | separate teams for implementing algorithms, etc. The Mythical
         | Man-Month has a very nice section on splitting up the work over
         | the various teams and that's a book written in the early 70s.
         | 
         | I think the actual boundary lies more in big vs small
         | companies: small companies do not have the resources to hire
         | specialists for every little subproblem, while big companies
         | typically have enough employees that specialisation becomes a
         | possibility.
        
         | makach wrote:
         | it is called full-stack today?
        
           | _wldu wrote:
           | I'm not sure. I take full-stack to mean the front-end,
           | middleware and back-end of a webapp. I would still call it
           | being a "generalist" when applying the concept to computer
           | technology in general. For example, the CTO of an org should
           | be a technology generalist, not a full-stack web dev.
           | 
           | Of course, this is just my personal opinion based on what I
           | have experienced over the last 40 years.
        
       | k__ wrote:
       | It should be.
       | 
       | But in the cruft of legacy systems it probably won't be for a
       | long time, probably never.
       | 
       | I'm a freelancer. I "use" the project managers of my clients, I
       | don't hire them myself.
       | 
       | Same goes for my applications. I use the cloud, managed services
       | and such. The providers hire operations people, I don't.
        
       | rsyring wrote:
       | I think some of the article is framed wrong. It's written as
       | competent ops guy/team vs. incompetent developers. But I don't
       | think that's actually what's going on. I'm sure there are
       | competent developers saying the exact same thing about their
       | relatively incompetent ops people.
       | 
       | We built a couple relatively simple applications for an
       | enterprise client. It took their ops teams months to get both
       | applications running in K8s, even though our deliverable was a
       | fully functioning container. They were largely incompetent as far
       | as we could tell.
       | 
       | But, I don't think it's worth being unkind or judging them. Every
       | time they asked us a question we made an effort to point them in
       | the right direction. When it was a problem we couldn't help
       | with, we kindly let them know that.
       | 
       | I think the reality is that the demand for competent IT and
       | developers outpaced supply a long time ago and it's not getting
       | better. Those of us who know and care about the difference should
       | make competent co-workers and executives part of the job
       | evaluation. Or, accept incompetence around you as a reality, help
       | and avoid as wisdom dictates.
       | 
       | But, complaining that it exists and framing it as competent ops
       | vs incompetent developers is both untrue and unhelpful IMO.
       | 
       | The latter part of the article that talks about the pace of
       | features, complexity, and the lack of time is spot on though IMO.
       | I think the article would have been better focusing here and
       | avoiding the IT vs devs angle.
        
         | lowercased wrote:
         | I started to write something, but you encapsulated it much more
         | concisely. I've been on all sides of this over the past... 25
         | years, mostly dev, sometimes having to handle
         | server/network/etc (before it was 'devops'), and worked on
         | large and small teams.
         | 
         | Competent and incompetent people exist in all areas. Some of
         | those incompetent ones can get better with time and support,
         | and some can't/don't.
        
       | mankypro wrote:
       | Cannot even count the number of times that my Ops teams had to
       | write wrappers and scripts to ameliorate issues suffered by
       | apps rushed to production by shoddy dev teams. Glad someone has
       | written about it. Anyone in the business knows this has been
       | going on forever.
       | 
       | The problem is the Ops teams get ZERO credit for enabling the
       | shoddy work done by devs, the devs meanwhile get patted on the
       | back, and frankly continue to be romanticized.
       | 
       | "move fast and break shit (and let Ops fix it silently)"
        
       | g051051 wrote:
       | 100% agreement, but from the developer side. DevOps has been
       | nothing short of a disaster for software development, on par with
       | Agile.
        
         | cbushko wrote:
         | I am sincerely curious why you feel this way about DevOps.
        
           | g051051 wrote:
           | Because it treats Dev and Ops as having identical, equivalent
           | skill sets. Attempting to make this true invariably leads to
           | disaster. In over 30 years of professional software
           | development, I've never seen the problems it purports to
           | solve.
        
             | cbushko wrote:
             | They are similar skill sets but the domains are different.
             | I am not expected to understand every javascript framework
             | that is thrown at developers and they are not expected to
             | know the inner workings of our networking, kubernetes or
             | service meshes. I will concede that every company wanting
             | an Ops person to be a full fledged software developer is a
             | little ridiculous.
             | 
             | I have been at this for 25+ years and I remember the days
             | of silos. I remember developers passing off code to QA and
             | it coming back days later with bugs. I also remember a lot
             | of 'not my problem' coming back from developers. Sometimes
             | it wasn't their problem; often it was.
             | 
             | Either way, the person with intimate knowledge of how the
             | code works should be the first person that looks at the
             | problem. In SaaS, that is production and should be the
             | developers (within reason).
             | 
             | As Ops, my goal is to make that as easy as possible for
             | developers. That means automating everything I can so that
             | the right tools for deploying, monitoring and alerting are
             | in place. It means that I have to automate spinning up and
             | destroying infrastructure as quickly and easily as possible
             | so that we can meet your needs and also keep costs down.
             | 
             | I have also seen companies fail at becoming 'devops' in the
             | most terrible way. They took developers and made them own
             | everything from code to deployment to VMs. The developers
             | had so many pieces to understand that the only guarantee
             | was failure. That was a terrible startup to work at.
        
               | g051051 wrote:
               | > They are similar skill sets but the domains are
               | different.
               | 
               | Exactly. They're specializations, like heart surgeon vs.
               | orthopedic surgeon.
               | 
               | > I have been at this for 25+ years and I remember the
               | days of silos.
               | 
               | 32 years for me. Silos evolved out of the wild west of
               | the 90's and early 2000's. Which evolved out of the
               | strict controls of early computers run by a cult of
               | Operators where the devs couldn't even access the machine
               | directly. It's a cycle, where management tries to remove
               | people, only to have to put them back later. I've seen it
               | over and over.
               | 
               | > I remember developers passing off code to QA and it
               | coming back days later with bugs.
               | 
               | It is literally QA's job to find bugs that developers
               | missed.
               | 
               | > In SaaS, that is production and should be the
               | developers
               | 
               | SaaS or not doesn't have anything to do with it.
               | 
               | > As Ops, my goal is to make that as easy as possible
               | for developers.
               | 
               | As a developer, my goal is to deliver high quality code
               | that meets the requirements for performance, stability,
               | monitoring, security, and functionality.
               | 
               | > They took developers and made them own everything from
               | code to deployment to VMs. The developers had so many
               | pieces to understand that the only guarantee was failure.
               | 
               | I've never seen DevOps done any other way. Hence my
               | original comment.
        
       | UK-Al05 wrote:
       | Large companies force devs to go through ops for so many things.
       | There isn't much of a choice.
        
       | Aeolun wrote:
       | I disagree with pretty much everything in this article, apart
       | from the fact that people often come up to me with a question of
       | the "I tried nothing, and I'm all out of ideas!" kind.
       | 
       | The author seems to want it both ways. They want the devs to fix
       | their own problems, but at the same time give them zero control
       | of the stack (we have to provide them with guide rails to prevent
       | them from hurting themselves indeed).
        
       | prepend wrote:
       | I've found ESR's "how to ask questions the smart way" [0] to be
       | really helpful in these situations on both the asking and
       | answering.
       | 
       | If I'm asking a question I explain what I'm trying to figure out,
       | what I've tried, what I expect, what I've researched. Basically
       | helping the answerer not waste as much time covering the same
       | ground.
       | 
       | If I'm answering questions and don't get this info, I ask it. And
       | establish the expectation that this info helps me answer their
       | question.
       | 
       | About 70% of the time, the asker adds in more info. 25% of the
       | time I don't hear back. 5% of the time I get a complaint that
       | they are too busy or can't answer the questions.
       | 
       | [0] http://www.catb.org/~esr/faqs/smart-questions.html
        
       | pietromenna wrote:
       | To the author of this article: Really great job on it, had great
       | fun reading and also lots of truths in there. But this part:
       | 
       | "Often they have not even bothered to do basic troubleshooting,
       | things like read the documentation on what the error message is
       | attempting to tell you."
       | 
       | This happens, but this just means that your Development Team
       | needs some coaching or to improve their quality.
       | 
       | This says more about the quality of the development team you
       | have been working with. You have to pass along this feedback and
       | ensure that the Development team also works with the same
       | professionalism as everybody else.
       | 
       | DevOps would mean that "Dev & Ops" look into issues together
       | (yes, he will be blocked as well, WORKING with you). If you find
       | that it was the developer's fault, tell them: "Hey, this is on
       | your side. You saw how we troubleshot together. Now each of us
       | has new tricks to use in the future".
       | 
       | If you don't do that, you are the shortest path to get THEIR
       | problem solved. And it is too easy to go that path.
        
       | brador wrote:
       | On mobile this webpage has a thin hovering black bar at the top
       | that fills to the right as you scroll further into the article.
       | Very nice feature that I have not seen before.
        
         | nonameiguess wrote:
         | Not just mobile. It has the same progress bar in a full-size
         | browser as well.
        
         | dsr_ wrote:
         | We used to have this browser-supplied thing which would tell
         | you how far you were in the document, what percentage of the
         | document you were currently looking at, and afforded you the
         | ability to quickly change your position.
         | 
         | It was called a scrollbar.
        
           | nikau wrote:
           | You and your dinosaur technology, next thing you will want
           | clickable interactive text to be in a different font or
           | colour to differentiate from regular text.
        
       | Waterluvian wrote:
       | Everyone needs to be a bit of everything to mitigate the cases
       | where one team doesn't understand another team's domain and
       | Applications begins blaming IT or Operations admits to not
       | understanding the applications they facilitate.
        
       | zenron wrote:
       | I mostly agree with the overall tone of the article but I do have
       | to point something out:
       | 
       | > It is baffling on many levels to me. First, I am not an
       | application developer and never have been. I enjoy writing code,
       | mostly scripting in Python, as a way to reliably solve problems
       | in my own field. I have very little context on what your
       | application may even do, as I deal with many application demands
       | every week. I'm not in your retros or part of your sprint
       | planning. I likely don't even know what "working" means in the
       | context of your app.
       | 
       | The point about not being in retros or part of sprint planning...
       | I take up arms against that. I've worked for companies that have
       | gone from waterfall to hybrid agile because we cannot get buy in
       | from Ops to actually... you know... come to our retros, sprint
       | planning and scrums.
       | 
       | Some things in this article are just pointing out the obvious...
       | mediocre developers who push their problems and/or lacks on other
       | teams. However, on that quote the Author needs to look in the
       | mirror. They exist only because the products offered by the
       | Company need resources. They have a responsibility to be business
       | partners in that. If they aren't, the company needs to re-align
       | some priorities and it could start with Ops. Ops doesn't get a
       | pass in an agile organization. The whole point of agile is to
       | destroy them ivory towers. And if they were in those planning
       | sessions, the developer might have already gone over the type of
       | destructive testing that would have emerged from that
       | collaboration and their DevOps relationship would be even richer.
        
       | dogman144 wrote:
       | There is a reason SRE/DevOps Eng jobs are taking off in number
       | and comp, and entities like GitHub are (slowly) figuring out how
       | to automate dev work.
       | 
       | Running code at scale turned into a very challenging comp sci
       | problem, and uptime vs code slickness is getting prioritized by
       | clients.
       | 
       | The career support and innovation in that corner of the world
       | (ops eng jobs) reflects it. Sort of gets after what software
       | architects do, but the requirements to know that come way earlier
       | in the career for Ops. Ops Engs with cloud knowledge, Python, and
       | IaC tend to go far.
        
         | gfiorav wrote:
         | Add to the list: "Running code generated by ML which is not
         | trusted"
         | 
         | Similar in nature to "Running arbitrary containers" but without
         | the human trust-to-do-no-evil policy in place.
        
       | eljimmy wrote:
       | This isn't just specific to operations, I experience this amongst
       | other developer teams as well.
       | 
       | I've had previous coworkers approach me about API "bugs" because
       | they didn't bother to troubleshoot their app code and just
       | immediately assumed it was a server-side issue.
       | 
       | Then I spend 10 minutes debugging the issue only to point them
       | to the error in their own code. I don't know if it's laziness or
       | inability to troubleshoot, or both.
        
       | exdsq wrote:
       | If your QA team is a "thing" that gets features at the end of a
       | sprint and churns out bugs or releases you're doing it wrong.
       | They should be involved on a feature by feature basis working
       | alongside the developer with QA time incorporated into every
       | task. All unit/integration/system tests should be automated
       | during the cycle so there is no "hand-off to QA". There should be
       | less latency because you have a test expert speeding up
       | implementing tests or being a force multiplier to developers by
       | acting as an internal consultant who can advise on bits where
       | needed.
       | 
       | QA as a discipline has evolved but from the sounds of it, it's
       | not been widespread enough.
        
       | criticas wrote:
       | This happens weekly.
       | 
       | Developer: Host XYZ is very busy.
       | 
       | Sysadmin: Yes, Yes it is. The top 10 processes are your Java App.
       | 
       | Developer: Fix it.
       | 
       | Sysadmin: ???? You can request a larger virtual machine, you can
       | try these options to the JVM, or you can fix your code.
       | 
       | Developer: Can you do it?
        
         | mrweasel wrote:
         | That's oddly familiar... I frequently get: The server is slow.
         | Well, no it's not really doing anything, but your application
         | is responding remarkably slowly.
         | 
         | Or: Can I get a bigger server... Yes, but you have 32 cores and
         | 256GB of RAM, and your application isn't that complex.
        
       | PeterisP wrote:
       | Perhaps all this description is showing is that in many
       | organizations there simply is a genuine need for a "Developer IT"
       | support function with appropriate skills and resources, and
       | because there isn't one, it's being done haphazardly by teams who
       | aren't a good fit for it, as the author describes. If there are
       | _a few_ niche issues then that's solvable by e.g. dev training,
       | but if the issues are _systematic_ as the article asserts, then
       | that's an organizational problem that needs an organizational
       | solution. If your company can't ensure that devs are capable
       | and/or motivated to troubleshoot issues that work on their laptop
       | but don't in a real deployment, then your company needs some
       | "internal consultations" mechanism to connect them to someone who
       | does have this capability and can explain and/or fix the issues
       | for them.
       | 
       | Responding to "Someone will always have to own that gap and
       | nobody wants to, because there is no incentive to. Who wants to
       | own the outage, the fuck up or the slow down?" with "Not me." is
       | not sufficient, it's a very valid question for which any
       | organization definitely needs an answer pointing at some specific
       | people - if it's not going to be pure ops people, it's IMHO not
       | going to be the feature-developing devs as well, that would
       | likely need separate 'site reliability engineer' teams as some
       | major companies do.
        
         | stayfrosty420 wrote:
         | I disagree, it seems laughable that devs are coming to him
         | with those kinds of questions.
        
           | tilolebo wrote:
           | They shouldn't have to come to him that often, if they had
           | skilled senior SWE mentoring them.
           | 
           | Seems like OP works for a shitty company.
        
         | twic wrote:
         | I agree that something needs to change at the organisation
         | level in your case, but i think it's hiring and promotion. This
         | "developer IT" stuff is part of a developer's job. Juniors
         | won't join you knowing it all, but they can learn it from
         | seniors on their team who do. If you are recruiting seniors who
         | don't know this stuff, stop, and if you are promoting juniors
         | to senior before they've learned this stuff, stop.
        
         | [deleted]
        
       | d--b wrote:
       | I am sorry but as an application developer, I think this is all
       | wrong. I'll thank my infra team today for not being assholes like
       | this guy.
       | 
       | 1. Application developers are your users. If we application
       | developers took offense every time a user tells us that things
       | are not working, we'd be pretty pissed off all the time.
       | Educating and empathizing with your users is part of your job.
       | 
       | 2. Talking about how it was better before: QA teams sure do
       | buffer a lot of crap. They also cost a bunch and slow down time
       | to release. Yes agile is causing problems. The bureaucracy and
       | stiffness of organizations before agile was no nirvana either.
       | 
       | 3. By your own affirmation you treat applications as black boxes
       | that should be deployed using a runbook that should just work.
       | This is ridiculous. Application's ownership is shared between
       | everyone who works on it.
       | 
       | 4. And yes, as developers, networking or physical drive space are
       | things that we tend to abstract away. Maybe if the infrastructure
       | people were involved in development discussion earlier, they'd be
       | able to raise their hands and say: wait a minute, you're going to
       | blow up our logs.
       | 
       | This all feels like someone who used to not do anything suddenly
       | being asked to take part in what's happening...
       | 
       | EDIT: apologies for the strong language and sounding like an
       | asshole myself, but I certainly feel irritated when someone takes
       | the time to write a 5000-word article complaining about whiny
       | developers who thought they could own ops but actually don't know
       | anything and scream for help when they themselves are the cause
       | of all evil.
        
         | sgarland wrote:
         | > And yes, as developers, networking or physical drive space
         | are things that we tend to abstract away.
         | 
         | Why is it Ops' job to guess at your application requirements?
         | You have the best understanding of what setting LOG_LEVEL=DEBUG
         | is going to do to disk requirements.
        
           | jimbokun wrote:
           | In theory, but pragmatically it's irresponsible to assume
           | without running in staging or on a subset of production
           | resources to monitor and see what actually happens.
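
The disk-space question in this exchange can be sketched as a back-of-the-envelope estimate. This is only an illustration: the function names and the sample numbers below are hypothetical, and the line rate and average line size would have to come from measuring the application in staging, as the reply suggests, rather than from guessing.

```python
SECONDS_PER_DAY = 86_400


def estimate_daily_log_bytes(lines_per_second, avg_line_bytes):
    """Rough daily log volume for a steady logging rate (hypothetical helper)."""
    return lines_per_second * avg_line_bytes * SECONDS_PER_DAY


def days_until_full(free_bytes, lines_per_second, avg_line_bytes):
    """How long the free disk space lasts at that rate, ignoring rotation."""
    daily = estimate_daily_log_bytes(lines_per_second, avg_line_bytes)
    if daily == 0:
        return float("inf")
    return free_bytes / daily


if __name__ == "__main__":
    # Made-up numbers: 100 lines/s at DEBUG, about 200 bytes per line.
    print(estimate_daily_log_bytes(100, 200))
    print(days_until_full(17_280_000_000, 100, 200))
```

Running the same estimate with the rates measured at INFO and at DEBUG gives both sides a concrete number to discuss before the change ships.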
        
         | _jal wrote:
         | I currently manage an infra ops team. I was a developer for
         | about 10 years.
         | 
         | I agree with point 1, nearly completely. A lot of developers
         | could take a lot more responsibility for understanding the
         | environments their applications operate in, but I get it.
         | 
         | Point 3, at least in my shop, you're just wrong. I don't know
         | anything about what you're writing. I probably don't even know
         | what problem it is supposed to solve. You are mistaking the
         | highway road crew for mechanics.
         | 
         | Point 4, in my shop, we provide a lot of documentation and
         | guidelines for this sort of thing. Developers are responsible
         | for knowing if their stuff is going to fall outside of those,
         | and come to us to work something out. Again with the road
         | metaphor, if you drive a semi into a single car garage, you're
         | the idiot, not the person who built the garage.
         | 
         | On some of this, I'm taking a hard line. I do, in fact, end up
         | doing a lot of troubleshooting with developers. But most of my
         | team does not write code. If you want more senior ops folks who
         | also have a coding background, come on over! There aren't that
         | many of us who are any good, and I would love to hire more.
        
           | jameshart wrote:
           | > You are mistaking the highway road crew for mechanics
           | 
           | The highway road crew know what a car is, though, right? They
           | know that the road needs to be clear and flat and drained of
           | water, and the markings need to be clear, so that cars can
           | drive on it.
           | 
           | When the devs come to you complaining about flat tires, you
           | can't turn round and say 'this is a mechanic issue, I don't
           | know how tires are meant to work. They go on the bottom,
           | right?' - you're meant to help check for rusty nails or bits
           | of metal in the road that are causing all these flats.
           | 
           | 'Oh, I didn't realize that was something that could cause
           | trouble for cars'
           | 
           | Well then you're a pretty crappy highway maintenance guy.
        
           | [deleted]
        
           | cogman10 wrote:
           | > Point 4, in my shop, we provide a lot of documentation and
           | guidelines for this sort of thing. Developers are responsible
           | for knowing if their stuff is going to fall outside of those,
           | and come to us to work something out. Again with the road
           | metaphor, if you drive a semi into a single car garage,
           | you're the idiot, not the person who built the garage.
           | 
           | With the road metaphor, one issue I've seen is ops will
           | create a rope bridge and get mad when devs need to drive a
           | car over it. "You shouldn't do that! You idiot! Just walk
           | over the bridge like we expect!"
           | 
           | Example: We have about 500 different applications in our
           | company and the ops team maintains a single rabbit cluster
           | for all apps (and everyone is supposed to use that one
           | cluster). If an app gets too chatty on that cluster "Oh you
           | idiot, why are you so chatty! You just sunk the
           | organization!" Which, in turn, discourages the usage of
           | rabbit (maybe that's the intention?)
           | 
           | > But most of my team does not write code.
           | 
           | I actually prefer this ( :D ), our ops team was a bunch of
           | converted devs that decided the best way to do things was
           | making a giant ops framework for all devs to follow. That
           | ended up costing WAY more money than if they'd just used
           | tools that were available. They fetishized trying to make
           | everything "just one line!" which ended up breaking anytime
           | you had a slightly different need (trying to take control
           | right up to managing how version bumps happen).
           | 
           | Overly trying to force a single method of implementation has
           | a lot of negative consequences. I prefer instead to have
           | guidebooks and examples with the freedom to be an idiot and
           | walk off the beaten path when needed.
        
             | kcb wrote:
             | It pains me. Just add this magic line to your pipeline and
             | everything will "Just Work (tm)"
        
             | kazen44 wrote:
             | > With the road metaphor, one issue I've seen is ops will
             | create a rope bridge and get mad when devs need to drive a
             | car over it. "You shouldn't do that! You idiot! Just walk
             | over the bridge like we expect!"
             | 
             | Well, the main problem with the "bridge mismatch" is
             | usually that resources required for an environment are not
             | free. It's usually the opposite: most infrastructure is
             | rather expensive, and running multiple systems side by side
             | because multiple developers require slightly different
             | versions of the same thing tends to explode cost.
        
           | kcb wrote:
           | > Point 3, at least in my shop, you're just wrong. I don't
           | know anything about what you're writing. I probably don't
           | even know what problem it is supposed to solve. You are
           | mistaking the highway road crew for mechanics.
           | 
           | How? Honest Question. If you know nothing of the application
           | how are you able to offer any input into the infrastructure
           | it runs on.
        
             | Plasmoid wrote:
             | Because an ops team will have between dozens and hundreds
             | of apps to support. You do a survey of needs and build out
             | something that gets to the most common use cases.
             | 
             | You try to respond to what people need and add things when
             | there is enough demand. But I can't know what your business
             | goals are, what your uptime metrics are, or who your users
             | are.
             | 
             | At some point, your app becomes a black box that takes in
             | requests, accesses DB/storage, and emits logs/metrics. I
             | just don't have the brain space to be intimately familiar
             | with each service.
        
           | Aeolun wrote:
           | > Point 3, at least in my shop, you're just wrong. I don't
           | know anything about what you're writing. I probably don't
           | even know what problem it is supposed to solve. You are
           | mistaking the highway road crew for mechanics.
           | 
           | I don't follow this. Developers are responsible for learning
           | what kind of environment their application runs in, but ops
           | is not responsible for having some clue about what they're
           | running? That cuts both ways, and it'll help everyone out.
           | 
           | > Developers are responsible for knowing if their stuff is
           | going to fall outside of those, and come to us to work
           | something out.
           | 
           | I find this attitude fairly common amongst ops people. They
           | just build something that is totally inappropriate for actual
           | usage, and then dump the responsibility for figuring that out
           | on the developers.
        
             | philbo wrote:
             | > Developers are responsible for learning what kind of
             | environment their application runs in, but ops is not
             | responsible for having some clue about what they're
             | running?
             | 
             | I don't think it's as cut-and-dried as your question frames
             | it, but I do think there are fundamental differences
             | between the two positions that justify some of the tension
             | there.
             | 
             | The problem is the difference between domain knowledge and
             | general systems knowledge. The former varies wildly from
             | org to org, team to team or even within individual teams.
             | The latter is more consistent across wider applications and
             | over longer timeframes.
             | 
             | Developers usually need a lot of domain knowledge to do
             | their job, which can leave less space for systems stuff.
             | But the systems stuff they do learn tends to be more widely
             | applicable.
             | 
             | Ops folk often service many teams where the domain
             | knowledge differs between them. The best of them might be
             | able to internalise all of those differences but it's a big
             | ask. And there's rarely any crossover.
             | 
             | This difference is also why developers tend to have a
             | slower ramp-up time than ops engineers do on joining a new
             | team. It's just the nature of the work.
             | 
             | I say all this as someone from the developer side of the
             | fence. I'm fortunate to have some years in the bank now
             | that the systems stuff comes more easily. The domain stuff
             | remains really hard.
        
             | BurritoAlPastor wrote:
             | Developers have more responsibility than ops for knowing
             | their apps, for the simple reason that each developer owns
             | a small number of apps, but ops owns the infrastructure for
             | all the apps.
        
               | kcb wrote:
               | I don't follow. Why compare a developer to the entire ops
               | organization?
        
               | tadpole172 wrote:
               | Because the ops org doesn't concentrate on just the one
               | application. They have broad knowledge of the entire
               | stack and therefore don't have as deep of an
               | understanding on any single piece.
        
               | kcb wrote:
               | The dev org also doesn't concentrate on just one
               | application. I've not seen this situation where every Ops
               | personnel is assigned to the entire stack. Each Ops
               | employee or team in a larger organization is generally
               | responsible for a subset of the environments.
        
               | jameshart wrote:
               | Why has your organization built a one-size-fits-all ops
               | organization if it doesn't have a one-size-fits-all dev
               | organization? Sounds like a failure of ops organization
               | to recognize that the needs of the email hosting guys are
               | different from the website team or the billing team.
               | Maybe you should build a set of smaller, more focused ops
               | teams focused on meeting the needs of those different
               | groups?
        
               | kazen44 wrote:
               | Smaller, more focused ops teams already exist, but are
               | not bound by application boundaries but by system
               | boundaries. (mostly, storage, compute and networking).
               | The reason is because each of these is a completely
               | different environment on its own.
        
               | cogman10 wrote:
               | I completely agree. Far too many devs are clueless about
               | how their apps perform or interact with the ecosystem.
               | That tunnel vision has a LOT of negative consequences on
               | infrastructure.
        
             | [deleted]
        
         | protomyth wrote:
         | _QA teams sure do buffer a lot of crap. They also cost a bunch
         | and slow down time to release._
         | 
         | If your QA team is slowing down releases then that is the
         | developer's fault not the QA team. Frankly, this move fast,
         | don't do proper QA is irresponsible and a danger to users.
        
           | marcosdumay wrote:
           | They add latency, there's no way around it. Even if there are
           | no software problems and their verification is instantaneous,
           | QA by itself adds an extra hand-off to a team with an
           | independent task queue.
        
             | exdsq wrote:
             | If your QA team is a "thing" that gets features at the end
             | of a sprint and churns out bugs or releases you're doing it
             | wrong. They should be involved on a feature by feature
             | basis working alongside the developer with QA time
             | incorporated into every task. All unit/integration/system
             | tests should be automated during the cycle so there is no
             | "hand-off to QA". There should be _less_ latency because
             | you have a test expert speeding up implementing tests or
             | being a force multiplier to developers by acting as an
             | internal consultant who can advise on bits where needed.
        
           | icedchai wrote:
           | Seriously, I haven't worked at a company with a QA team in
           | almost 10 years. Do these actually exist anymore? It would
           | certainly be nice to have.
        
             | Aeolun wrote:
             | They do! They're really good at their job but _definitely_
             | slow down releases.
             | 
             | Then again, the entire point is to release after all the
             | bugs are fixed, not to get all the bugs into production as
             | quickly as possible :)
        
               | protomyth wrote:
               | I guess it depends how you count a release. I think these
               | fast moving teams spend more time in production debugging
               | than the QA team adds. Shipping it should not be the
               | final determination of release time.
               | 
               | I wish more companies valued QA teams, then maybe I
               | wouldn't get so many notices of security breaches and
               | need to keep checks on my credit.
        
               | mateo411 wrote:
               | Security breaches are the responsibility of the InfoSec
               | team. The QA team usually won't have the skillset to find
               | security issues.
        
               | icedchai wrote:
               | Or maybe you still would. Are most QA folks actively
               | looking for security issues?
        
               | Aeolun wrote:
               | Not really. QA is functional. We have a product security
               | team doing pentests on new and updated applications.
        
               | protomyth wrote:
               | Some of the bonehead stuff will be caught by QA, but
               | there are folks on some QA teams that get security.
               | Sadly, developers talk down about QA so much that the
               | people we need on QA teams are not going to go there.
        
             | _AzMoo wrote:
             | We have a fantastic QA team, and they test everything that
             | goes to prod. Definitely slows things down (by about 1/3)
             | but our user experience is significantly improved because
             | of it. IMO a good QA/test team is critical to delivering an
             | excellent user experience.
        
         | jodrellblank wrote:
         | Pet hate: they're not operations' logs, they're developer logs.
         | Developers write the code to create log messages on the
         | principle "more is better". Logs are another example of the
         | systemic hoarding problem with people and computers.
         | 
         | They're a ratchet pattern, adding more is easy but once they
         | exist it's very difficult to find someone with the authority to
         | authorise removing them and the willingness to stick their neck
         | out and declare that they aren't required and the willingness
         | to spend time on low-importance maintenance. As a consequence
         | logs build up until something gives and they become high
         | importance urgent failure. The middle bit where they "aren't
         | important" but they still waste storage space and networking
         | bandwidth and processing power (and money) and when there is
         | something to debug they waste people's time because the
         | important details are needle-in-haystack among tons of low-
         | value filler, all gets ignored.
         | 
         | At the limit, it isn't sustainable to print the complete
         | internal state of a system at every clock cycle. It "should" be
         | possible to do a lot better troubleshooting_power-to-log_weight
         | ratio than "print every state change which feels important at
         | the time in whatever semi-English message format is
         | convenient", shouldn't it?
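
One partial mitigation for the "logs build up until something gives" pattern described above is to cap how much disk a log can ever occupy. The following is a minimal sketch using the standard library's `logging.handlers.RotatingFileHandler`; the logger name, file name, and size limits are illustrative assumptions, and it addresses only the storage cost, not the signal-to-noise problem.

```python
import logging
import logging.handlers
import os
import tempfile


def make_capped_logger(log_dir, max_bytes=1024, backups=2):
    """Return a logger whose files never exceed (backups + 1) * max_bytes in total."""
    logger = logging.getLogger("capped_example")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "app.log"),  # hypothetical file name
        maxBytes=max_bytes,
        backupCount=backups,
    )
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.handlers = [handler]
    return logger


def total_log_bytes(log_dir):
    """Sum the sizes of every file in the log directory."""
    return sum(
        os.path.getsize(os.path.join(log_dir, name))
        for name in os.listdir(log_dir)
    )


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    log = make_capped_logger(log_dir)
    for i in range(1000):
        log.debug("state change %d", i)
    print(sorted(os.listdir(log_dir)), total_log_bytes(log_dir))
```

The oldest lines are silently discarded once the cap is reached, so this trades retention for predictability; whether that is acceptable is exactly the kind of decision devs and ops would need to make together.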
        
         | jasonlotito wrote:
         | I am sorry but as someone who has been on both sides of this, I
         | think this is all wrong. And I thank god both my app developers
         | and operations people aren't assholes like you.
         | 
         | Hey, that's a pretty shitty way to start off a comment, don't
         | you think? With a personal attack?
         | 
         | 1. Yes. But it's not operations' problem if you are whining that
         | your PS5 game isn't running on the XBox. There is personal
         | responsibility in this, too, and it's not operations' job to
         | hold your hand and explain how to do your job. If you aren't
         | reaching out to operations to make requests, they aren't going
         | to know what to do. Your entire comment shows that you think
         | they are subservient to you, rather than you actually being an
         | honest user. Tell them what you want, and work with them to get
         | it.
         | 
         | 2. QA teams do not slow down time to better quality releases.
         | They do slow down time to half-baked or buggy releases.
         | Regardless, the number of app developers to operations people
         | is generally a very bad imbalance. I promise you, the good ones
         | are working with the people that reach out to them.
         | 
         | 3. Maybe if you invited the operations people earlier, they'd
         | have some ownership in the product. But usually they release it
         | without operations even knowing, and suddenly there is
         | something in production that is half-working. They had no hand
         | in it. They literally did not work on the project, so they
         | can't know.
         | 
         | 4. You can't abstract away things if you don't know how they
         | work or account for them. Again, inviting operations people to
         | earlier discussion is incredibly easy. You know what projects
         | you are working on, they tend to not because there are far
         | fewer of them than there are application developers. So, it's
         | on you to reach out to them to get input. Yes, they have to
         | make themselves available, but you have to invite. And guess
         | what? When you do that, you get a wealth of information and it
         | makes the product better.
         | 
         | Your comment feels like someone who is used to expecting
         | perfection from others while accepting their own mediocrity.
         | 
         | Wow... ending a comment with an insult is rather shitty, too.
         | Why did you decide to go the route of writing a comment that
         | starts off shitty and ends up that way?
         | 
         | Personally, I did it to hold up a mirror to you.
        
           | cf499 wrote:
           | "Maybe if the infrastructure people were involved in
           | development discussion earlier, they'd be able to raise their
           | hands and say: wait a minute, you're going to blow up our
           | logs."
           | 
           | "Maybe if you invited the operations people earlier, they'd
           | have some ownership in the product."
           | 
           | Awww... You like each other but none of you dare to make the
           | first move :D
        
             | happymellon wrote:
             | Ops/Infra teams don't usually start software projects and
             | aren't aware that there is a project for them to offer their
             | help with.
             | 
             | My experience has been that they can be very accommodating
             | and supportive if you do talk to them.
        
               | Aeolun wrote:
               | Yeah, all those devs are just sitting there at their
               | desks clacking away on their novels.
               | 
               | There is _always_ a project for them to offer help with,
               | since the business will not suffer devs to be idle.
        
             | ozim wrote:
             | Whole thread reads like bunch of guys shouting at each
             | other "but but ... I know better!".
             | 
             | IMO this is main topic of the thread and of the article.
             | 
             | There are groups of people who instead of spending time to
             | figure out how to work together and understand what other
             | side has to say, they just throw shit over the fence.
             | 
             | Maybe some could start by reading points at least couple of
             | times and try to understand instead of trying to write
             | personal experiences as fast as they can in reply to other
             | comment that hurts their ego.
        
           | izacus wrote:
           | > And I thank god both my app developers and operations
           | people aren't assholes like you.
           | 
           | Uhh... can we chill with the personal insults a bit?
        
           | d--b wrote:
           | Cause the whole article reads like "developers are whiny
           | assholes who don't know shit about computers". And yes, it
           | starts with an attack and ends with an attack too.
           | 
           | 1. It's not operations' problem for sure, but I certainly
           | don't bash people for not knowing things I am the expert of.
           | 
           | 2. Fine
           | 
           | 3. The OP's saying he doesn't want to know!
           | 
           | 4. Well, writing applications is sitting atop a stack of
           | technologies more and more abstract. A developer not knowing
           | what happens in an IP packet is the same as an infrastructure
           | guy not knowing what happens in an NP junction.
        
             | civilized wrote:
             | > Cause the whole article reads like "developers are whiny
             | assholes who don't know shit about computers". And yes, it
             | starts with an attack and ends with an attack too.
             | 
             | There is no attack in the text. There is a complaint that
             | issues presented to operations often lack the basic level
             | of detail and due diligence that they should have. You are
             | free to disagree with the author's expected level of due
             | diligence on issues; I think you'd be wrong to, but you
             | can. However, it isn't an attack.
             | 
             | You perceive a non-attack as an attack, and respond with an
             | explicit attack and name-calling. That actually makes _you_
             | the aggressor.
             | 
             | Hmm, who is the asshole here?
        
             | burnished wrote:
             | It read more like "these developers are asking poorly
             | formed, difficult to answer questions", and frankly
             | reminded me of a LOT of r/CodingHelp problems I've seen
             | lately. Aside from that the author seems to repeatedly have
             | empathy and admiration for developers but thinks that there
             | is a systematic dysfunction. There is definitely a little
             | "old man shouts at clouds" too, but at least to me this
             | article read as a legitimate discussion of some pain
             | points, certainly not a hit piece.
        
               | Aeolun wrote:
               | Hmm, it sounds like the opposite to me. I find it really
               | hard to read because of the constant 'devs are stupid'
               | comments.
               | 
               | There is a legitimate point buried there, but I just kept
               | seeing red reading it.
        
             | sgarland wrote:
             | > A developer not knowing what happens in an IP packet
             | 
             | I don't care if devs understand IP packets, TCP congestion
             | control algorithms, or anything similarly low-level. If
             | they do, that's awesome, but it's not expected. I do expect
             | them to have a basic understanding of expected latencies
             | for intra-DC vs. internet, why running Flask in production
             | isn't a good idea, and if they're really sharp, an inkling
             | of how Kubernetes networking works.
        
               | arwineap wrote:
               | I think I understand your sentiment, but what's wrong
               | with flask??
        
               | kazen44 wrote:
               | i assume the poster means running flask in production
               | without something like nginx in front to serve as the
               | webserver.
               | 
               | the flask built-in webserver is not production grade
               | software in my opinion.
        
               | Cyphus wrote:
               | It is also the opinion of the people who wrote the built-in
               | webserver. If you try to run it in production mode, it'll
               | emit this warning on startup:
               | 
               | > WARNING: This is a development server. Do not use it in
               | a production deployment. Use a production WSGI server
               | instead.
               | 
               | I don't expect junior devs to have a sense for what is
               | production-grade and what is not, but if they try to ship
               | software that explicitly warns against being used in
               | production, you've got a real liability on your hands.
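
To make the "use a production WSGI server instead" advice concrete, here is a minimal sketch of the interface such a server expects. It is written without Flask so it is self-contained; a Flask application object is itself a WSGI callable with the same interface. The names `app` and `call_app` are illustrative, not taken from the thread.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI application (PEP 3333): takes environ, returns body chunks."""
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(wsgi_app, path="/"):
    """Invoke a WSGI app in-process, roughly the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required CGI/WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(app))
```

Assuming the file is saved as `hello.py` (a hypothetical name) and gunicorn is installed, `gunicorn hello:app` would serve the same callable from a production-oriented server, typically behind a reverse proxy such as the nginx setup mentioned above.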
        
         | mdekkers wrote:
         | > wait a minute, you're going to blow up our logs.
         | 
         | You really need to have that pointed out to you?
        
         | igetspam wrote:
         | I believe your assessment of the agreement is flawed.
         | Application developers are not our users. You're our tenants.
         | We provide highly available housing for your projects. We keep
         | the lights on, we keep walls standing and we make sure the roof
         | doesn't leak. We also provide APIs for you to interact with.
         | When those things fail, we are responsible. When your code
         | doesn't run in the test environment where everyone else's does,
         | that's not our job. I'll help you but at my convenience because
         | I have other things to do. If your app fails in the middle of
         | the night, that's your responsibility. If it's an infra
         | problem, then it's on me. We don't ask you to tune the network
         | or balance the cluster or ask you why the daemon sets are
         | failing, right? If this was a shared responsibility, you'd be
         | helping with the core too but I can almost guarantee that's not
         | happening. (Some of my eng peers do but the vast majority think
         | of it as a black box.)
        
         | draw_down wrote:
         | Jeez. I think it's really despicable to read the behavior this
         | person is describing and decide they're the asshole.
         | 
         | It isn't surprising though, this is par for the course in tech
         | workplaces it seems. The problem isn't that I shit all over
         | your doorstep, the problem is you pointing it out instead of
         | just cleaning it up silently.
        
         | tucosan wrote:
         | Wow. Starting your argument with an ad hominem attack qualifies
         | you as one of those people I will never ever want to work with.
        
         | waylandsmithers wrote:
         | On point 3: > I likely don't even know what "working" means in
         | the context of your app.
         | 
         | I think both sides can do more to reach into the domain of the
         | other. I get it- we don't want to deal with blinking lights and
         | they don't want to deal with a missing semicolon breaking
         | everything.
         | 
         | Honestly I think "that's not my problem" is one of the worst
         | attitudes you can have as part of an organization with common
         | goals.
        
         | time0ut wrote:
         | It sounds like someone who is frustrated because the process or
         | culture in their organization has led to point 3. I tend to
         | involve ops before I write a single line of code and definitely
         | before deploying to a stage environment. Over the course of a
         | project, they help me write the runbook, create dashboards, and
         | alerts. After all, we are all on the hook when things go
         | sideways at 3AM. I want them to know as much as possible about
         | how things work.
        
           | generalk wrote:
           | This is the way.
           | 
           | My previous company had a HUGE problem with Devs cowboying
           | off and doing whatever and dumping it on the Ops team at the
           | last minute.
           | 
           | One of the biggest (but for damn sure not the last) issues
           | was a dev who designed and built an entire new product around
           | a MongoDB database, which wasn't something we had in
           | production, and something he didn't mention during the months
           | of development and demos to stakeholders. Week before the
           | launch date he hits up our Ops folks to get production set
           | up.
           | 
           | Ops was calm and collected about the whole thing. "We don't
           | have MongoDB in production. Are you volunteering to learn how
           | to correctly install it, write monitors for alerting, be
           | paged with issues, figure out backups and how to ensure our
           | data stays safe, secure, and available? You're not? Then get
           | the [redacted] out and rewrite your app. Yes it will affect
           | the ship date, and yes it's your fault."
           | 
           | I'd love to say we used that opportunity to shore up our
           | processes involving kicking off new applications and
           | including Ops folks in from day one, but that took years
           | more.
        
             | time0ut wrote:
             | Something similar happened at my company like 5 years ago.
             | 
             | A developer was tasked with adding a major new feature to
             | one of our older monoliths. He added MongoDB as a
             | dependency. The application already had a well managed
             | Oracle database. Nothing about the feature required
             | MongoDB.
             | 
             | When it came time to go to production, the DBA and ops
             | teams responded similarly to how you did. I wish I could
             | say sanity prevailed, but the business mumbled something
             | about contractually obligated release dates and forced it
             | through to production. Pretty sure it is still there
             | rotting away.
             | 
             | I've worked mostly on the app side of things and this sort
             | of thing just makes me shake my head.
        
               | random_kris wrote:
               | well at the end of the day you managed to ship it? Did it
               | cause any big problems down the line? It seems the
               | biggest problem is that it is rotting away somewhere,
               | which to me means that it is working without need to do
               | much care on it.
               | 
               | If they listened to your DBA/ops guys no value would be
               | getting shipped ;)
        
               | time0ut wrote:
               | I don't know of any big problems other than the
               | unnecessary cost. I agree meeting the needs of the
               | company is king, but it was just a lot of unnecessary
               | complexity because a dev wanted to put MongoDB on their
               | resume. Could have been avoided by talking to the rest of
               | the team early on. Of course, they would not have liked
               | the answer of just creating a new table in boring old
               | Oracle.
        
               | Aeolun wrote:
               | To be fair, when forced to choose between Oracle and
               | MongoDB I'd also have a serious dilemma.
        
             | oblio wrote:
             | > Ops was calm and collected about the whole thing. "We
             | don't have MongoDB in production. Are you volunteering to
             | learn how to correctly install it, write monitors for
             | alerting, be paged with issues, figure out backups and how
             | to ensure our data stays safe, secure, and available?
             | You're not? Then get the [redacted] out and rewrite your
             | app. Yes it will affect the ship date, and yes it's your
             | fault."
             | 
             | Love the shoot-down!
        
             | davidgerard wrote:
             | Ops here. The threat of 3am phone calls does wonders, in my
             | experience.
             | 
             | If it turns out it was product owner pressure, the product
             | owner gets a call too. Possibly first.
        
             | Aeolun wrote:
             | So, you could have delayed the app by the same amount but
             | now have a mongo environment for production as well?
             | 
             | Seems a bit of a waste to rewrite the app instead.
             | 
             | Not that I would recommend Mongo anywhere, production or
             | dev, but it would apply for any other technology for which
             | this happened.
        
               | generalk wrote:
               | > So, you could have delayed the app by the same amount
               | > but now have a mongo environment for production as
               | well?
               | 
               | No, we couldn't have. Not just because we didn't want
               | MongoDB, which at the time was notorious for data loss,
               | but because our ops team didn't have the capacity at that
               | point in their schedule or team size to handle it. Maybe
               | had we discussed it at the beginning of the project, plans
               | could have been made or altered, but we didn't and so
               | they couldn't.
               | 
               | > Seems a bit of a waste to rewrite the app instead.
               | 
               | The responsible dev took the time necessary to rewrite
               | the data layer to better reflect the needs of the
               | application.
               | 
               | Is what I wish had happened. Instead the developer jammed
               | the huge JSON blobs into a column on an MSSQL table and
               | changed a few lines. lolsob.
        
               | jimbokun wrote:
               | > Instead the developer jammed the huge JSON blobs into a
               | column on an MSSQL table and changed a few lines.
               | 
               | Sounds like quickest way to deliver value to the
               | customer. As described, was far too late in the process
               | to worry about deploying with a clean, extensible
               | architecture.
               | 
               | A reasonable amount of technical debt in order to ship in
               | the timeframe available.
        
               | kazen44 wrote:
               | except that shipping something with semi-broken
               | infrastructure leads to losses down the line.
               | 
               | What if your mongodb database drops its data and now you
               | have production impact? Are those losses calculated while
               | making these decisions during development?
        
               | Aeolun wrote:
               | > because our ops team didn't have the capacity at that
               | point in their schedule or team size to handle it
               | 
               | Lol, I get your point, but that was also true for the dev
               | organisation. Hence what you ended up with.
               | 
               | I doubt the needs of the application included a rewrite
               | in MSSQL.
        
         | czep wrote:
         | > Application developers are your users.
         | 
         | No. Equating internal teams with paying customers is the very
         | attitude that is causing these problems. Encouraging teams to
         | think about their "internal customers" leads those customers to
         | become entitled. We work together in the same company, our
         | relationship is not the same as with actual external paying
         | customers. I can't tell a paying customer that they're being
         | unreasonable or lazy or unrealistic. We absolutely should be
         | able to have that conversation with other internal teams when
         | appropriate.
         | 
         | The post is describing the situation that has evolved as a
         | result of QA being phased out. Telling Ops to suck up that
         | extra work because "Dev are your users" is exactly why the post
         | was written.
        
           | gravypod wrote:
           | At most large companies things are organized in such a way
           | that internal teams are your "paying users". Some internal
           | teams at some companies even say "If you want X feature and Y
           | support you need to request $$$ funding and N people for our
           | team".
        
             | LambdaComplex wrote:
             | ...Isn't that how Sears went bankrupt?
        
               | gravypod wrote:
               | I don't know much about Sears. I've mostly worked as a
               | Software Engineer and know other Software Engineers.
        
         | 0n34n7 wrote:
         | Agreed. Good application code often contains edge case
         | handling, build time checks, unit tests and defensive flows
         | that handle the unexpected so that users don't wake you up at
         | night. Why can Ops not do the same? Why can Dockerfiles /
         | Orchestrators / CI / playbooks not also implement sanity checks
         | on deployments?
         | 
         | "Ooops... deployment failed. While deploying your artifact we
         | found the following:
         | 
         | - Nothing is listening on the nominated port
         | 
         | - Your deployment is utilizing 100% CPU while idling
         | 
         | - We detected an abnormal volume of write operations to the
         | mount
         | 
         | Please fix these issues and re-trigger the pipeline at your
         | earliest convenience.
         | 
         | Regards, Ops."
        
           | jensensbutton wrote:
           | > Why can Ops not do the same? Why can Dockerfiles /
           | Orchestrators / CI / playbooks not also implement sanity
           | checks on deployments?
           | 
           | All of those things were written by developers.
        
           | clipradiowallet wrote:
           | > - Nothing is listening on the nominated port
           | 
           | Now that just shouldn't happen... i.e., we (ops) aren't going to
           | deploy something that doesn't come with healthcheck(s). The
           | healthcheck never passing (port isn't listening) is going to
           | stop the deployment from ever completing. Ops' job is to push
           | back on developers if they try to hand us something like this
           | to build a pipeline for. In my company, to hand Ops the name
           | of a repo and say "build a pipeline"...there are a lot of
           | requirements, and the biggest one is a list of SLAs. That
           | list of SLAs is how we build monitoring for your application,
           | and one of those should _always_ be a list of port(s) and
           | protocol(s) that are exposed; we build monitors against
           | those.
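           | 
           | A minimal sketch of the "is anything listening on the declared
           | port" check described above, assuming plain TCP and a
           | hypothetical list of ports handed over by the dev team. The
           | function names are illustrative, not any real pipeline's API.

```python
import socket


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failed_checks(host, ports, timeout=1.0):
    """Return the declared ports that nothing is listening on."""
    return [p for p in ports if not port_is_listening(host, p, timeout)]
```

           | A deploy step could refuse to complete while failed_checks()
           | returns a non-empty list. A check for other protocols would
           | need its own probe.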
        
           | seniorThrowaway wrote:
           | "Oh those are normal errors" - Every developer I've ever
           | worked with
        
         | greedo wrote:
         | I think if your take away is that the author is an asshole, you
         | might want to reflect on specifically why you feel that way. In
         | my experience as a sysadmin, in a large company that's been
         | trying to become a user of "cool" IT in the last decade, the
         | article is spot on.
         | 
         | I think for point 1, he's trying to say that application
         | developers aren't doing their role as both dev and QA. I've
         | witnessed the same issue where a DBA had trouble installing
         | Maxscale on two identical servers. He was convinced that there
         | must be something different between the two servers despite
         | them being created from the same template, and only differing
         | in IP/hostname. He had done no research, opened no tickets with
         | the vendor, but instead wasted 30 minutes of my time arguing
         | that it's not his fault. And this is common with many of the
         | developers I've worked with in the last decade.
         | 
         | For #3, I don't own the application you develop. We provide you
         | with a platform that YOUR application runs on, based on
         | requirements you provide. If you don't do an adequate job of
         | providing accurate requirements, that's on you, not my team.
         | 
         | And #4, developers don't abstract all those things away, they
         | often fundamentally don't understand how they work at all, so
         | they ignore them. This ignorance has damning consequences when
         | they make blind assumptions about how things work.
        
           | greedo wrote:
           | I used "mine/yours" to denote where the responsibility lies.
           | In a small org you can have the entire IT team troubleshoot
           | an issue. In a large org, that's unfeasible.
           | 
           | I'm willing to help troubleshoot and provide guidance based
           | on my experience, assuming the application developer has
           | performed their due diligence. I have no insight into what
           | their application is expected to do, or its failure modes. I
           | have no input into the coding methods, the test harnesses,
           | the deployment process. But when that shit breaks because the
           | dev doesn't understand the difference between `rm -rf ./*`
           | and `rm -rf /` that's his problem.
           | 
           | Now of course this is an org problem, not a team problem. As
           | in parenting, setting boundaries and responsibilities is the
           | key to success. Too many leaders in IT simply think that
           | "DevOps" will be cheaper and faster and leave it at that.
        
           | EastSmith wrote:
           | Talking in terms of mine and yours means we are not on the
           | same side. And this is the problem.
           | 
           | If there is a problem with the deploy let's meet, fix the
           | issue and most importantly learn from the problem, and
           | document the incident for future reference.
           | 
           | And then move on without finger-pointing.
        
             | CodeMage wrote:
             | Just because I have my responsibilities and you have yours,
             | it doesn't mean we're not on the same side.
             | 
             | I've come to dread cute management phrases like "everyone
             | should pull on the rope". I agree with the sentiment, but
             | software development is not as simple as pulling on a rope.
             | There are lots of moving parts and lots of things to
             | specialize in. And I say this as a generalist dev, not as
             | an ops engineer.
             | 
             | I agree with TFA completely. I was interviewing for a job
             | recently, and one of the questions I would ask when the
             | interviewer signaled it was time for me to ask questions
             | was "how do you handle QA?" On some occasions, this got me
             | weird looks, because "QA" seems to be an antiquated
             | concept.
             | 
             | In a similar vein, my stint at Amazon taught me that one of
             | the questions to ask my interviewers is to tell me about
             | their on-call rotation. Is there any? How often are you on
             | call and for how long? Who gets paged first?
             | 
             | Yeah, we're all on the same side, but there needs to be
             | some structure and order. Otherwise, you end up with
             | something like this:
             | 
             |  _"Twenty-seven people were got out of bed in quick
             | succession and they got another fifty-three out of bed,
             | because if there is one thing a man wants to know when he's
             | woken up in a panic at 4:00 A.M., it's that he's not
             | alone."_
             | 
             | -- from "Good Omens", by Sir Terry Pratchett and Neil
             | Gaiman
        
         | emmelaich wrote:
         | I totally agree with TFA; except it was ever thus. (And agile
         | has helped reduce the problem if anything)
         | 
         | As an ops person I've had to explain the devs' own architecture
         | to them; they didn't know how it sent mail -- nothing to do
         | with SMTP; they just hadn't shared the knowledge among
         | themselves of the db/java app interaction.
         | 
         | I once had a developer tell me ridiculous things like "my java
         | app can't write to java.tmpdir". They couldn't even tell me
         | what file they were trying to write. I had to dive into apache
         | docs and send it to them. It turned out to be a bug in an apache
         | project code, nothing to do with tmpdir writeability.
         | 
         | The lack of basic responsibility and ownership was appalling.
        
         | generalk wrote:
         | I find this response surprising, as I fully agreed with TFA.
         | 
         | I've had an Ops team that had a similar attitude, and they did
         | a _lot_ to help me become a good developer. Part of that was
         | requiring that I come to them with identified problems.  "Hey
         | I'm getting this error, can you take a look at a stack trace in
         | a language you've never used and tell me what's wrong?" would
         | have gotten me booed/laughed out of the office, and for good
         | reason.
         | 
         | It's not at all unreasonable to expect the developer to come
         | around instead with "hey my application can't write to this NFS
         | mount like I expected. It's running as $user, the permissions
         | look right but I'm still getting permission denied. Any
         | thoughts?" (A real situation I ran into, turns out SELinux had
         | further permissions I was unaware of, and my Ops lead Chip was
         | happy to show me what was what.)
         | 
         | Yeah, we're all on the same team, and that cuts both ways --
         | Ops should ensure Dev has what it needs, and Dev should make
         | some actual effort to understand the landscape their production
         | applications run in. Which seemed to me to be the entire point
         | of TFA.
        
           | emeraldd wrote:
           | > Part of that was requiring that I come to them with
           | identified problems. "Hey I'm getting this error, can you
           | take a look at a stack trace in a language you've never used
           | and tell me what's wrong?" would have gotten me booed/laughed
           | out of the office, and for good reason.
           | 
           | This a thousand times over ... If you can train your users to
           | do this any customer relationship will be better off!
        
           | xorcist wrote:
           | I've always tried to encourage the following format for all
           | professional questions of that sort:
           | 
           | "a) I do exactly this, b) expected this outcome, c) but got
           | this instead"
           | 
           | Short and to the point, it's remarkable how much easier it
           | makes things for everyone. I think I got it off usenet at
           | some time.
        
           | igetspam wrote:
           | You, sir or madame, are a good job. I like working with
           | people like you. I want to help but some things just don't
           | fall into my wheelhouse but when they do, we're on it. This
           | is how teamwork should be defined.
        
           | indigodaddy wrote:
           | Thank you for being one of the minority that do this!
        
         | civilized wrote:
         | If the author is an asshole, you certainly also are one by the
         | same standard.
         | 
         | Developers are not just "users", they're fellow software
         | professionals who can reasonably be expected to work harder on
         | troubleshooting than reporting "it works on my machine but not
         | in the test environment :(" without even reading the error
         | message or including it in the report.
         | 
         | As a general rule, when you have most of the control or
         | knowledge of a technical process and you want someone else to
         | help you with it, you need to give that other person as much
         | transparency and info as possible. Because they don't control
         | the process and will have to slowly, laboriously ask you
         | questions, or ask you to do things, rather than just probing
         | the system themselves.
         | 
         | They're taking time out of their day to work in a relatively
         | inefficient and frustrating mode just to help you out, so jeez,
         | have some respect and try to make their jobs a little easier.
         | 
         | If you don't and prefer to wear this entitled attitude, fine,
         | but you're just as much an asshole as he is.
        
           | seniorThrowaway wrote:
           | my favorite response ever to "it works fine on my laptop/dev
           | machine" is "let's connect the prod load balancer to your
           | workstation and get you a pager, problem solved!"
        
       | NotSammyHagar wrote:
       | At many companies there is no one to help developers.
        
       | bob1029 wrote:
       | What I am seeing is a need for more vertical integration. Teams
       | need to be made to own the entire product stack. If you do this,
       | they will be incentivized to make it simple and stable.
       | 
       | No one should ever get to play "not my job" while simultaneously
       | throwing complexity grenades over to another team.
        
       | dragonwriter wrote:
       | This is why teams should be cross-functional and product-
       | organized, divided, if further is necessary, by product
       | _component_ not function, instead of function-organized.
       | 
       | Function-organized teams encourage knowledge siloes, and
       | it's-some-other-team's-problem-ism.
        
         | mpitt wrote:
         | That sounds great in theory, but what happens if your dev/ops
         | ratio is something like 15/1? How do you put an ops person in
         | every team? I think it's the right answer but it seems
         | impossible to put in practice.
        
           | lbhdc wrote:
           | An alternative is that all or some of the devs share the load
           | of ops work.
        
       | manuelabeledo wrote:
       | This reads like a guy trying to take complete ownership, while
       | renouncing any accountability.
       | 
       | I have been in this industry for 15+ years, and as a developer, I
       | have a surprising amount of experience dealing with customers. Of
       | course, when a customer complains about some feature not working,
       | I would not just take their word for it. Customers mess up too.
       | 
       | What I _would not do_ is brush their complaints off. "This is a
       | systemic issue". "They are causing problems". "They don't know
       | better". "They don't have the correct incentives". Try telling
       | that to a customer, or to your boss.
       | 
       | The obvious disconnect from his own team _is_ the problem.
        
       | jen20 wrote:
       | This entire article is written with such profound
       | misunderstanding of DevOps - perhaps one induced by vendor
       | marketing - that it's effectively meaningless.
       | 
       | Yes, developers should understand the operational environment a
       | system runs in, and should be capable of advanced
       | troubleshooting. But the rest of the post is simply tired screed
       | about how "the old days" were better, despite the fact that they
       | manifestly were not.
        
       | 123pie123 wrote:
       | >Application teams attempting to assign ownership of a bug to a
       | networking team because they didn't account for timeouts.
       | 
       | I had to chuckle - everyone (not just developers) seems to blame
       | the network first! (including blame the firewall rules)
        
         | aNoob7000 wrote:
         | First blame the network then the database but never the code.
         | :)
        
           | tssva wrote:
           | This just means the network gets blamed twice because
           | inevitably the DBAs will also blame the network once the
           | issue gets to them.
        
         | bennyp101 wrote:
         | I mean, it /is/ always DNS :)
        
           | tyingq wrote:
           | I see a fair amount of DNS problems that trace back to "app
           | resolves a DNS name at startup, and never does the lookup
           | again".
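           | 
           | One way to avoid that failure mode, as a rough sketch: resolve
           | the name every time a connection is opened instead of caching
           | the address at startup. connect_fresh and its resolve/connect
           | parameters are hypothetical names used here for illustration;
           | the defaults are the standard library calls.

```python
import socket


def connect_fresh(host, port, timeout=3.0, resolve=socket.getaddrinfo,
                  connect=socket.create_connection):
    """Open a TCP connection, resolving host at call time, not at startup."""
    last_error = None
    # Look the name up on every call so a changed DNS record is picked up
    # the next time the app reconnects.
    for _family, _type, _proto, _canon, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        try:
            # sockaddr[:2] is (address, port) for both IPv4 and IPv6 results.
            return connect(sockaddr[:2], timeout=timeout)
        except OSError as exc:
            last_error = exc  # try the next address, if any
    raise last_error or OSError(f"no addresses found for {host}")
```

           | Pair this with reconnect-on-failure so a long-lived process
           | actually gets a chance to look the name up again.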
        
           | gfiorav wrote:
           | Or if you work in a big company, it's always the proxy
        
         | johngalt wrote:
         | At one point running wireshark and reviewing network traces
         | with developers was a full time job. Guess what percentage of
         | time it was actually a network problem?
        
         | Foobar8568 wrote:
         | I had to escalate to basically a CIO of a fortune 500 company
         | for someone to take a look at the network performance of our
         | system, all teams were blaming applications despite the
         | evidence. It ended up being a bug in a VMware driver that was
         | impacting their whole infrastructure.
        
         | tyingq wrote:
         | I would add reasonable retry logic also. I've seen quite a lot
         | of outages that would not have been noticed if there were
         | decent retry logic with backoff, etc.
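         | 
         | A minimal sketch of retry with exponential backoff and jitter,
         | assuming the failures worth retrying surface as ConnectionError
         | or TimeoutError. retry_with_backoff and its parameters are
         | illustrative names, not taken from any particular library.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with full jitter so
            # many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

         | Only wrap operations that are safe to repeat; retrying a
         | non-idempotent write can turn a blip into duplicate data.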
        
       | reacharavindh wrote:
       | Oh boy, it feels like someone is ringing bells in my head because
       | of aligned thoughts.
       | 
       | Let me share an experience. In 2010, I worked on a project for a
       | large business in the US (Fortune 100). The process was set so
       | rigidly that it worked well, but I was among the group of people
       | who were mad at it saying "why is this so rigid? Trust us and let
       | us do things faster!!". Context : There were change management
       | rules in place. The software was to be released only on a regular
       | cadence of about 6 months, only after thorough integration tests,
       | and approval from the change mgmt board. Should anything go wrong
       | in "move to prod" there will be representation from dev, QA, Ops,
       | change mgmt, and Mgmt orgs to immediately decide on actions until
       | the release to prod is successful. There will be thorough
       | documentation of what to do (run books) on what changes occurred,
       | what their impact could be and how to rollback if something
       | unexpected occurs. It was always a party after a successful
       | release :-)
       | 
       | Trust me there were a lot of bugs, but they were mostly found and
       | fixed during the laborious QA and integration tests by people
       | whose job it was.
       | 
       | Fast forward to now, I am a "Cloud Engineer" in a small team that
       | does everything from app development to building CI pipelines to
       | running services on AWS to being on-call to keep them running.
       | 
       | I must say, I wish for the old days back. Sure, it was slow and
       | laborious, but it resulted in better outcomes and manageability.
       | IMHO, it also resulted in better reliability of software due to
       | the diligence done by several layers.
       | 
       | It is easy to say do the same just faster in your small team.
       | But, in practicality it just doesn't happen. I work on setting up
       | Observability one week, then onto designing infra for a new
       | service, then onto some development and so on. I feel like my
       | scope would have been limited, and I would have had an easier
       | time becoming an expert at something than becoming so broad
       | skilled like I am today.
       | 
       | Sometimes, old, slow, and mature is not so bad. Not everyone
       | needs to follow the FAANG SV companies to be successful.
        
         | ByteWelder wrote:
         | > I must say, I wish for the old days back. Sure, it was slow
         | and laborious, but it resulted in better outcomes and
         | manageability. IMHO, it also resulted in better reliability of
         | software due to the diligence done by several layers.
         | 
         | Those were also the days where it took many years to go from
         | Java 6 to Java 8. Or perhaps to try out Kotlin.
         | 
         | They were the days where legacy code was the norm, and we kept
         | supporting it because nobody dared to change anything for the
         | better. In practice, that's just not something you can maintain
         | in a competitive market, because your competitors _will_ use
         | new technologies and faster/better development processes.
         | 
         | "it just works" might be good enough for maintaining your
         | application, but will it be good enough to find people willing
         | to work in that code base or that environment?
         | 
         | I work for a large business where both the old and new
         | practices are in place (mostly the new ones, though). Focusing
         | on "going fast" is definitely not a good idea, but I believe
         | there's a sweet spot in between.
        
           | datavirtue wrote:
           | All code is legacy code. As soon as you start changing the
           | existing code...it is legacy.
           | 
           | I'm on a project now that has not released to prod. It has a
           | lot of new legacy code.
        
           | reacharavindh wrote:
           | Sure, mature processes encourage tech stagnation, and
           | discourage even beneficial changes as collateral damage. But,
           | as you say, there is a line somewhere at which a project should
           | move on from "Go fast, ship often, change much and get
           | feature-rich" to "focus on correctness, stable releases,
           | actually maintain our existing features". Perhaps it is
           | really a cycle of both and missing one for the other leads to
           | problems.
        
         | wayoutthere wrote:
         | Hear, hear. If the bulk of your "products" are for internal
         | consumers, you likely aren't paying enough to attract talent
         | who know how to operate in the FAANG model.
         | 
         | I like to distinguish between "product developers" (i.e.
         | building products for consumers with guaranteed scale, so do it
         | right the first time) and "project developers" (get it done
         | ASAP and cut the corners you need to do so).
         | 
         | In the "project developer" world, 50-75% of your requirements
         | gathering happens before a line of code is written. There is
         | usually a "right way" to implement a process of which
         | technology is only one component and figuring that out as you
         | go will actually slow down the project due to the maker /
         | manager schedule conflict. True "agile" in this environment
         | just leads to scope creep as there usually aren't dedicated
         | product owners to say no to every little request.
         | 
         | I've stopped pushing agile as hard because the corporates
         | simply can't afford the kind of engineers to make it work
         | correctly, and they don't have the roles required to gather and
         | feed requirements to a dev team in an agile format. Sprints are
         | a good way to time-box feature development, but most business
         | projects work better with a more waterfall approach. Your
         | customers and project plan operate under waterfall so there's
         | less downside to begin with.
        
           | reacharavindh wrote:
           | Great comment about "Project developers" and "Product
           | developers". It is almost an entirely different art to get
           | the requirements right by iterating on a project, and
           | bringing out a solution to life versus engineering a
           | scalable, maintainable product that evolves after a good
           | start. I never had to think of such distinction.
           | 
           | Waterfall model has its downsides in extracting the
           | requirements out properly whereas the Agile approach (the
           | little I have seen of it) seems to lose the layered stability
           | of a waterfall based approach.
        
           | bdavis__ wrote:
           | excellent comments. instead of straight waterfall, i would
           | suggest a time boxed requirements phase, followed by
           | incremental development with a reasonable cadence (dictated
           | by the product; web might be 2 weeks, more serious domains
           | might be 90 days). you need iteration, but having a solid
           | grounding on what you are going to build eliminates churn.
        
       | elfrinjo wrote:
       | I sent a link to that post to a senior developer with similar
       | habits. He answered: "didn't read, don't understand english."
        
       | rmetzler wrote:
       | Just today I saw another "works on my machine" issue. The dev
       | didn't complain for 3 weeks that his latest code isn't deployed.
       | QA found out today (on a Friday) about it and the dev has his day
       | off. The issues were not hard to fix, but it's not the DevOps
       | job.
       | 
       | Especially when the dev wanted to migrate from Java 8 to Java 11
       | and didn't even attempt to look up our documentation on how to
       | change JVM parameters.
        
       | giantg2 wrote:
       | "Operations is not Developer IT"
       | 
       | It seems more and more places want it to be. DevOps is all the
       | rage.
        
       | ineedasername wrote:
       | I'm not trying to be glib, it honestly sounds like a lot of the
       | people this person worked with needed a strong lesson in LMGTFY.
        
       | dgb23 wrote:
       | > Nobody gets promoted for maintenance or passing a security
       | audit.
       | 
       | This is a huge problem. Working on reliability and security is
       | hard, shipping broken features is easy.
        
         | nijave wrote:
         | Not only that, fixing those issues generally adds work that
         | sucks up time that could be devoted to shipping new features.
         | 
         | In that regard, those roles are slowing things down and costing
         | money
        
       | lucasyvas wrote:
       | The problem often lies with the entire Organization and not the
       | Development team. I've had roles where Development was empowered
       | to code the product and deploy the code, which necessitates
       | certain access. At that point, troubleshooting is trivial and we
       | can solve our own problems. It's amazing.
       | 
       | Throw in some red tape where I can't have access to logs myself?
       | Then I don't care to fix it at all - chasing another team, that
       | has diverging priorities, is a complete waste of my time.
       | 
       | If your Developers are tossing shit over a wall, I'd bet top
       | dollar you work in organization B. In which case they are
       | behaving accordingly. Don't empower me to identify and fix
       | issues? Then I won't (and I won't lose sleep over it either).
        
       | plebianRube wrote:
       | >developers are not incentivized or even encouraged to gain
       | broader knowledge of how their systems work
       | 
       | This is the crux of the problem. Coding in isolation. Replies of
       | 'It's java, it should work anywhere' etc.
       | 
       | The other gear-grinding common theme is not even doing basic
       | troubleshooting. To the point of not even googling the error
       | message or the symptoms, and being 'blocked' because they are
       | waiting on a ticket they opened with the 'other' team.
        
         | reportgunner wrote:
         | _Works on my machine_
        
         | bennyp101 wrote:
         | We're a small company so we sometimes do many things, but it's
         | taught me a lot of networking fault finding.
         | 
         | There's some very clever ppl that know all about how
         | networks/vm stuff work, and I've learnt enough from them that I
         | can fix most of my own infra related things - or at least give
         | them a run down of what I've done first to save them some time.
         | 
         | It got me back into hardware and networky stuff, so now I've
         | got a MikroTik at home, some proxmox machines, Tailscale
         | network etc - more fun than just spin up a box on DO and be
         | done with it.
         | 
         | A lot of ppl just aren't interested though, they just want to
         | code (and maybe learn a new language) but because a lot of
         | stuff is now PaaS and it's super easy, there is no need to
         | learn it (in their eyes)
        
         | debarshri wrote:
         | I think the incentive for developers is to be relevant. If you don't
         | do it, someone else will. And that becomes the new norm. Like
         | how DevOps has become the new norm.
        
       | arminiusreturns wrote:
       | I've dealt with most of the issues in TFA and in comments here.
       | Without engaging too much in the technicalities, I would offer
       | that most of these issues actually stem from leadership, or lack
       | thereof, and most often, at the middle management level, but
       | sometimes middle management issues are just covering up upper
       | management issues. Generally, I see these kinds of issues more
       | often in non-technical management presiding over technical teams,
       | because they have learned all the correct propitiations to upper
       | management and all the buzzword bingo for their teams, but lack a
       | real understanding, and more importantly, lack the ability to
       | form a coherent and actionable _vision_ to correct these kinds of
       | issues. (pet peeve issue with middle management is when they push
       | others out of the interview process, and suddenly you have hires
       | that don't belong _at all_.)
       | 
       | As for me, I'm currently watching a good devops team go down the
       | drain because of a bad manager, so I'm seriously considering
       | trying to move to management so I can help my employer do better.
        
       ___________________________________________________________________
       (page generated 2021-09-03 23:02 UTC)