_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
 (HTM) Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
 (HTM)   The future of software engineering is SRE
       
       
        pjmlp wrote 6 min ago:
        Except the small detail that as proven by all the people that lost
        their jobs to factory robots, the number of required SREs is relatively
        small in proportion to existing demographics of SWEs.
        
        Also this doesn't cover most of the jobs, which are actually in
        consulting, and not product development.
       
        joe_91 wrote 9 min ago:
        True, but also need to know the basics well of what constitutes good
        code and how it should scale vs just working code. Too many people
        relying on LLMs to produce stuff which just about works but gives users
        a terrible experience as it barely works.
       
        metasim wrote 56 min ago:
        What’s an “SRE”?
       
          netdevphoenix wrote 10 min ago:
          Site Reliability Engineering. It is the role that, among other
          things, ensures that a service's uptime is optimal. It's the closest
          thing we have nowadays to the system admin role.
       
            metasim wrote 8 min ago:
            Thank you!
       
        mexicocitinluez wrote 1 hour 6 min ago:
        > All he wanted was to make his job easier and now he's shackled to
        this stupid system.
        
        What people failed to grasp about low-code/no-code tools (and what I
        believe the author ultimately says) is that it was never about
        technical ability. It was about time.
        
        The people who were "supposed" to be the targets of these tools didn't
        have the time to begin with, let alone the technical experience to
        round out the rough edges. It's a chore maintaining these types of
        things.
        
        These tools don't change that equation. I truly believe that we'll see
        a new golden age of targeted, bespoke software that can now be
        developed cheaper instead of small/medium businesses utilizing
        off-the-shelf, one-size-fits-all solutions.
       
        petetnt wrote 3 hours 43 min ago:
        Again there's a cognitive dissonance in play here where the future of
        coding is somehow LLMs, but at the same time the LLMs would not
        evolve to handle the operations as well, even if we disregard
        pipe dreams about AGI being just around the corner. Especially when
        markdown files for AI are essentially glorified runbooks.
       
        stared wrote 4 hours 14 min ago:
        Yet, AI is not there yet. Even the top models struggle at the simplest SRE
        tasks.
        
        We just created a benchmark on adding distributed logs (OpenTelemetry
        instrumentation) to small services, around 300 lines of code.
        
        Claude Opus 4.5 succeeded at 29%, GPT 5.2 at 26%, Gemini 3 Pro at 16%.
        
 (HTM)  [1]: https://quesma.com/blog/introducing-otel-bench/
       
        tasuki wrote 4 hours 14 min ago:
        > Writing code was always the easy part of this job. The hard part was
        keeping your code running for the long time.
        
        Spoken like a true SRE. I'm mostly writing code, rather than working on
        keeping it in production, but I've had websites up since 2006 (hope
        that counts as a long time in this corner of the internet) with very
        little down time and frankly not much effort.
        
        My experience with SREs was largely that they're glorified SSH: they
        tell me I'm the programmer and I should know what to type into their
        shell to debug the problem (despite them SREing those services for
        years, while I joined two months ago and haven't even seen the
        particular service). But no I can't have shell access, and yes I should
        be the one spelling out what needs to be typed in.
       
        ivan_gammel wrote 6 hours 8 min ago:
        Operational excellence was always part of the job, regardless of what
        fancy term described it, be it DevOps, SRE or something else. The
        future of software engineering is software engineering, with emphasis
        on engineering.
       
        silisili wrote 6 hours 12 min ago:
        I was an old school SRE before the days of containerization and such.
        Today, we have one who is a YAML wizard and I won't even pretend to
        begin to understand the entire architecture between all the moving
        pieces (kube, flux, helm, etc).
        
        That said, Claude has absolutely no problem not only answering
        questions, but finding bugs and adding new features to it.
        
        In short, I feel they're just as screwed as us devs.
       
        nbevans wrote 6 hours 19 min ago:
        Surely SRE is just a .md file like everything else? :upside-down-face:
       
        alexgotoi wrote 6 hours 23 min ago:
        There were several cheaper-than-programmers options to automate things,
        Robotic Process Automation being probably the best known, but they never
        got the expected traction.
        
        Why (imo)? Senior leaders still like to say: I run a 500 headcount
        finance EMEA organization for Siemens, I am the Chief People Officer of
        Meta and I lead an org of 1000 smart HR pros. Most of their status is
        still tied to the org headcount.
       
        solatic wrote 6 hours 24 min ago:
        I think there's two kinds of software-producing-organizations:
        
        There's the small shops where you're running some kind of monolith
        generally open to the Internet, maybe you have a database hooked up to
        it. These shops do not need dedicated DevOps/SRE. Throw it into a
        container platform (e.g. AWS ECS/Fargate, GCP Cloud Run, fly.io, the
        market is broad enough that it's basically getting commoditized), hook
        up observability/alerting, maybe pay a consultant to review it and make
        sure you didn't do anything stupid. Then just pay the bill every month,
        and don't over-think it.
        
        Then you have large shops: the ones where you're running at the scale
        where the cost premium of container platforms is higher than the salary
        of an engineer to move you off it, the ones where you have to figure
        out how to get the systems from different companies pre-M&A to talk to
        each other, where you have N development teams organizationally far
        away from the sales and legal teams signing SLAs yet need to be
        constrained by said SLAs, where you have some system that was
        architected to handle X scale and the business has now sold 100X and
        you have to figure out what band-aids to throw at the failing system
        while telling the devs they need to re-architect, where you need to
        build your Alertmanager routing tree configuration dynamically because
        YAML is garbage and the routing rules change based on whether or not
        SRE decided to return the pager, plus ensuring that devs have the
        ability to self-service create new services, plus progressive rollout
        of new alerts across the organization, etc., so even Alertmanager
        config needs to be owned by an engineer.
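        
        To make that last point concrete, here is a minimal sketch of
        generating an Alertmanager routing tree from a service inventory
        instead of hand-editing YAML. The Service fields, the receiver
        names and the "SRE returned the pager" flag are hypothetical; only
        the route/receiver/group_by/matchers/routes keys are taken from
        Alertmanager's config format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str          # hypothetical service name
    team: str          # owning dev team
    sre_on_call: bool  # False once SRE has handed the pager back to the devs


def build_route_tree(services, default_receiver="sre-pager"):
    """Build the `route` section of an Alertmanager config as a plain dict."""
    routes = []
    # Sort so the generated config is stable between runs (clean diffs).
    for svc in sorted(services, key=lambda s: s.name):
        # Assumed receiver naming scheme: "<team>-pager" for dev-owned pages.
        receiver = "sre-pager" if svc.sre_on_call else f"{svc.team}-pager"
        routes.append({
            "matchers": [f'service="{svc.name}"'],
            "receiver": receiver,
        })
    return {
        "route": {
            "receiver": default_receiver,  # fallback for unmatched alerts
            "group_by": ["alertname", "service"],
            "routes": routes,
        }
    }


if __name__ == "__main__":
    inventory = [
        Service("search", "discovery", sre_on_call=True),
        Service("checkout", "payments", sre_on_call=False),
    ]
    # Printed as JSON for illustration; rendering to the final config file
    # (and reloading Alertmanager) is left out of this sketch.
    print(json.dumps(build_route_tree(inventory), indent=2))
```

        Self-service then amounts to adding an entry to the inventory, and
        a pager handback is a one-field change that can be reviewed like
        any other code change.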
        
        I really can't imagine LLMs replacing SREs in large shops. SREs
        debugging production outages to find a proximate "root" technical cause
        is a small fraction of the SRE function.
       
          weitendorf wrote 2 hours 55 min ago:
          Having worked on Cloud Run/Cloud Functions, I think almost every
          company that isn't itself a cloud provider could be in category 1,
          with moderately more featureful implementations that actually
          competed with K8s.
          
          Kubernetes is a huge problem, it's IMO a shitty prototype that
          industry ran away with (because Google tried to throw a wrench at
          Docker/AWS when Containers and Cloud were the hot new things,
          pretending Kubernetes is basically the same as Borg), then the
          community calcified around the prototype state and bought all this
          SAAS/structured their production environments around it, and now all
          these SAAS providers and Platform Engineers/Devops people who make a
          living off of milking money out of Kubernetes users are guarding
          their gold mines.
          
          Part of the K8s marketing push was rebranding Infrastructure
          Engineering = building atop Kubernetes (vs operating at the layers at
          and beneath it), and K8s leaks abstractions/exposes an enormous
          configuration surface area, so you just get K8s But More
          Configuration/Leaks. Also, You Need A Platform, so do Platform
          Engineering too, for your totally unique use case of connecting git
          to CI to slackbot/email/2FA to our release scripts.
          
          At my new company we're working on fixing this but it'll probably be
          1-2 more years until we can open source it (mostly because it's not
          generalized enough yet and I don't want to make the same mistake as
          Kubernetes. But we will open source it). The problem is mostly
          multitenancy, better primitives, modeling the whole user story in the
          platform itself, and getting rid of false dichotomies/bad
          abstractions regarding scaling and state (including the entire
          control plane). Also, more official tooling and you have to put on a
          dunce cap if YAML gets within 2 network hops of any zone.
          
          In your example, I think
          
          1. you shouldn't have to think about scaling and provisioning at this
          level of granularity, it should always be at the multitenant zonal
          level, this is one of the cardinal sins Kubernetes made that Borg
          handled much better
          
          2. YAML is indeed garbage but availability reporting and alerting
          need better official support, it doesn't make sense for every
          ecommerce shop and bank to be building this stuff
          
          3. a huge amount of alerts and configs could actually be expressed in
          business logic if cloud platforms exposed synchronous/real-time
          billing with the scaling speed of Cloud Run.
          
          If you think about it, so so so many problems devops teams deal with
          are literally just
          
          1. We need to be able to handle scaling events
          
          2. We need to control costs
          
          3. Sometimes these conflict and we struggle to translate between the
          two.
          
          4. Nobody lets me set hard billing limits/enforcement at the platform
          level.
          
          (I implemented enforcement for something close to this for
          Run/Appengine/Functions, it truly is a very difficult problem, but I
          do think it's possible. Real time usage->billing->balance debits was
          one of the first things we implemented on our platform).
          
          5. For some reason scaling and provisioning are different things
          (partly because the cloud provider is slow, partly because Kubernetes
          is single-tenant)
          
          6. Our ops team's job is to translate between business logic and
          resource logic, and half our alerts are basically asking a human to
          manually make some cost/scaling analysis or tradeoff, because we
          can't automate that, because the underlying resource model/platform
          makes it impossible.
          
          You gotta go under the hood to fix this stuff.
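          
          For illustration only, a minimal sketch of what "usage -> billing
          -> balance debits" with a hard limit could look like if the
          platform exposed it. This is not the implementation described
          above; TenantAccount, BudgetExceeded and may_scale_up are
          hypothetical names, and a real system would need atomic,
          distributed accounting rather than an in-memory object.

```python
from dataclasses import dataclass
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a debit would push the balance below the hard limit."""


@dataclass
class TenantAccount:
    balance: Decimal
    hard_limit: Decimal = Decimal("0")  # balance may never drop below this

    def debit(self, units: int, unit_price: Decimal) -> Decimal:
        """Charge for metered usage, refusing the charge at the hard limit."""
        cost = unit_price * units
        if self.balance - cost < self.hard_limit:
            raise BudgetExceeded(f"debit of {cost} would breach the limit")
        self.balance -= cost
        return self.balance


def may_scale_up(account: TenantAccount, extra_instances: int,
                 price_per_instance_second: Decimal,
                 horizon_seconds: int) -> bool:
    """Scaling decision expressed in money: can we afford the extra
    instances for the next `horizon_seconds` without breaching the limit?"""
    projected = price_per_instance_second * extra_instances * horizon_seconds
    return account.balance - projected >= account.hard_limit


if __name__ == "__main__":
    acct = TenantAccount(balance=Decimal("10.00"))
    acct.debit(100, Decimal("0.05"))  # 100 units of usage at 0.05 each
    print(acct.balance, may_scale_up(acct, 2, Decimal("0.01"), 300))
```

          With something like this in the platform, the scaling-versus-cost
          tradeoff becomes a function call instead of an alert that asks a
          human to do the arithmetic.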
       
            linuxftw wrote 3 min ago:
            There are plenty of PaaS components that run on k8s if you want to
            use them.  I'm not a fan, because I think giving developers direct
            access to k8s is the better pattern.
            
            Managed k8s services like EKS have been super reliable the last few
            years.
            
            YAML is fine, it's just a configuration language.
            
            > you shouldn't have to think about scaling and provisioning at
            this level of granularity, it should always be at the multitenant
            zonal level, this is one of the cardinal sins Kubernetes made that
            Borg handled much better
            
            I'm not sure what you mean here.  Managed k8s services, and even k8s
            clusters you deploy yourself, can autoscale across AZ's.  This has
            been a feature for many years now.  You just set a topology key on
            your pod template spec, your pods will spread across the AZ's,
            easy.
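            
            A minimal sketch of one way to do that, assuming the
            topologySpreadConstraints field and the well-known
            topology.kubernetes.io/zone node label (pod anti-affinity with
            a topologyKey is another reading of "topology key"). The
            helper name, app name and image are made up; the template is
            built as a plain dict that would go under a Deployment's
            spec.template.

```python
import json

# Well-known label that Kubernetes sets on nodes to identify their zone.
ZONE_KEY = "topology.kubernetes.io/zone"


def pod_template_with_zone_spread(app: str, image: str, max_skew: int = 1) -> dict:
    """Return a pod template whose replicas are spread evenly across zones."""
    labels = {"app": app}
    return {
        "metadata": {"labels": labels},
        "spec": {
            "topologySpreadConstraints": [{
                # Max allowed difference in matching pod count between zones.
                "maxSkew": max_skew,
                "topologyKey": ZONE_KEY,
                # Prefer spreading, but still schedule if a zone is full.
                # "DoNotSchedule" would make this a hard requirement.
                "whenUnsatisfiable": "ScheduleAnyway",
                # Which pods are counted when computing the skew.
                "labelSelector": {"matchLabels": labels},
            }],
            "containers": [{"name": app, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical app name and image, printed as JSON for inspection.
    template = pod_template_with_zone_spread("web", "example.invalid/web:1.0")
    print(json.dumps(template, indent=2))
```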
            
            Most tasks you would want to do to deploy an application, there's
            an out of the box solution for k8s that already exists.  There have
            been millions of labor-hours poured into k8s as a platform, unless
            you have some extremely niche use case, you are wasting your time
            building an alternative.
       
            spockz wrote 10 min ago:
            Since you are developing in this domain: our challenge with both
            lambdas and Cloud Run type managed solutions is that they seem
            incompatible with our service mesh. Cloud Run and lambdas cannot
            be incorporated into a service mesh other than the GCP service mesh,
            and then only if it is managed through GCP as well. Anything custom is
            out of the question. Since we require end to end mTLS in our setup we
            cannot use Cloud Run.
            
            To me this shows that cloud run is more of an end product than a
            building block and it hinders the adoption as basically we need to
            replicate most of cloud run ourselves just to add that tiny bit of
            also running our Sidecar.
            
            How do you see this going in your new solution?
       
            firesteelrain wrote 29 min ago:
            Lots to unpack here.
            
            I will just say based on recent experience the fix is not
            "Kubernetes bad"; it’s that Kubernetes is not a product platform,
            it’s a substrate, and most orgs actually want a platform.
            
            We recently ripped out a barebones Kubernetes product (like Rancher
            but not Rancher). It was hosting a lot of our software development
            apps like GitLab, Nexus, KeyCloak, etc
            
            But in order to run those things, you have to build an entire
            platform and wire it all together. This is on premises running on
            vxRail.
            
            We ended up discovering that our company had an internal software
            development platform based on EKS-A and it comes with auto
            installers with all the apps and includes ArgoCD to maintain state
            and orchestrate new deployments.
            
            The previous team did a shitty job DIY-ing the prior platform. So
            we switched to something more maintainable.
            
            If someone made a product like that then I am sure a lot of people
            would buy it.
       
            vrosas wrote 2 hours 36 min ago:
            Every time I’ve pushed for cloud run at jobs that were on or
            leaning towards k8s I was looked at as a very unserious person.
            Like you can’t be a “real” engineer if you’re not battling
            yaml configs and argoCD all day (and all night).
       
              weitendorf wrote 1 hour 52 min ago:
              It does have real tradeoffs/flaws/limitations, chief among them,
              Run isn't allowed to "become" Kubernetes, you're expected to
              "graduate". There's been an immense marketing push for Kubernetes
              and Platform Engineering and all the associated SAAS sending the
              same message (also, notice how much less praise you hear about it
              now that the marketing has died down?).
              
              The incentives are just really messed up all around. Think about
              all the actual people working in devops who have their
              careers/job tied to Kubernetes, and how many developers get drawn
              in by the allure and marketing because it lets them work on more
              fun problems than their actual job, and all the provisioned
              instances and vendor software and certs and conferences, and all
              the money that represents.
       
          ffsm8 wrote 4 hours 17 min ago:
          > SREs debugging production outages to find a proximate "root"
          technical cause is a small fraction of the SRE function.
          
          According to the specified goals of SRE, this is actually not just a
          small fraction - but something that shouldn't happen. 
          To be clear, I'm fully aware that this will always be necessary - but
          whenever it happened - it's because the site reliability engineer
          (SRE) overlooked something.
          
          Hence if that's considered a large part of the job.. then you're just
          not an SRE as Google defined that role [1]. That has very little
          connection to the blog post we're commenting on though - at least as
          far as I can tell.
          
          At least I didn't find any focus on debugging. It put forward that
          the capability to produce reliable software is what will distinguish
          in the future, and I think this holds up and is in line with the
          official definition of SRE
          
 (HTM)    [1]: https://sre.google/sre-book/table-of-contents/
       
            bigDinosaur wrote 1 hour 32 min ago:
            This makes sense - as an analogy the flight crash investigator is
            presumably a very different role to the engineer designing flight
            safety systems.
       
              arcbyte wrote 54 min ago:
              I think you've identified analogous functions, but I don't think
              your analogy holds as you've written it. A more faithful analogy
              to OP is that there is no better flight crash investigator than
              the aviation engineer designing the plane, but flight crash
              investigation is an actual failure of his primary duty of
              engineering safe planes.
              
              Still not a great rendition of this thought, but closer.
       
        zahlman wrote 7 hours 12 min ago:
        > And you definitely don't care how a payments network point of sale
        terminal and your bank talk to each other... Good software is
        invisible.
        
        > ...
        
        > Are you keeping up with security updates? Will you leak all my data?
        Do I trust you? Can I rely on you?
        
        IMO, if the answers to those questions matter to you, then you damn
        well should care how it works. Because even if you aren't sufficiently
        technically minded to audit the system, having someone be able to
        describe it to you coherently is an important starting point in
        building that trust and having reason to believe that security and
        privacy will work as advertised.
       
        pcj-github wrote 7 hours 34 min ago:
        If the agent swarm is collectively smarter and better than the SRE,
        they'll be replaced just like other types of workers.  There is no
        domain that has special protection.
       
          bronlund wrote 7 hours 19 min ago:
          My thoughts exactly. This is just some guy grasping at straws before
          he understands that he will have to bow to our new overlords sooner
          or later.
          
          Edit: Or maybe he is fully aware and just needs to push some books
          before it's too late.
       
            TeMPOraL wrote 1 hour 32 min ago:
            Or, most charitably, maybe they're not sure and trying to
            Cunningham's Law their way through the conundrum.
       
          measurablefunc wrote 7 hours 20 min ago:
          What about C-suite executives & shareholders? Are they safe from
          automation?
       
            netdevphoenix wrote 33 min ago:
            You can probably automate the full economy. Both production and
            consumption
       
            oytis wrote 45 min ago:
            You can only replace someone who was useful. If one is useless, but
            is still there, it means they are not there for their contribution
            and you can't replace them by automating whatever it might have
            been.
       
            TeMPOraL wrote 1 hour 19 min ago:
            Ultimately, no. But when we get to this point - once we have AI
            deciding on its own what needs to be done in the world in general
            - then the bottom falls out, and we'll all be watching a new global
            economy, in which humans won't partake anymore. At best, we'll
            become pets to our new AI overlords; more likely, resources to
            exploit.
       
            meindnoch wrote 4 hours 14 min ago:
            A uniquely important thing that a CEO brings to the table is
            accountability. You can't automate accounta- ...sorry, I can't
            continue this with a straight face :DDD
       
            rcbdev wrote 4 hours 53 min ago:
            Yes. The AI cannot be the child/other type of beneficiary of a
            well-connected person, yet.
       
            vkou wrote 5 hours 46 min ago:
            Automating away shareholders can't come soon enough.
       
            vjvjvjvjghv wrote 5 hours 47 min ago:
            They make the decisions so I doubt they will allow themselves to be
            automated away. Their main risk will be that nobody can buy their
            products once everything is automated.
            
            I wonder if capitalism and democracy will be just a short chapter
            in history that will be replaced by something else. Autocratic
            governments seem to be the most prevalent form of government in
            history.
       
            p_v_doom wrote 6 hours 26 min ago:
            Generally yes. The more power one holds in an organization the more
            safe they are from automation.
       
            bjt12345 wrote 7 hours 10 min ago:
            The thing about C-suite executives is they usually have short
            tenures, however the management levels below them are often cozy in
            their bureaucracy, resist change, often trying to outlast the new
            management.
            
            I actually argue that AI will therefore impact these levels of
            management the most.
            
            Think about it, if you were employed as a transformational CEO
            would you risk trying to fight existing managers or just replace
            them with AI?
       
              joe_mamba wrote 6 hours 48 min ago:
              >I actually argue that AI will therefore impact these levels of
              management the most.
              
              Not AI but bad economy and mass layoffs tend to wipe out
              management positions the most. As a decent IC, in case of layoffs
              in bad economy, you'll always find some place to work at if
              you're flexible with location and salary because everyone still
              needs people who know how to actually build shit, but nobody
              needs to add more managers in their ranks to consume payroll and
              add no value.
       
                bjt12345 wrote 6 hours 19 min ago:
                A lot of large companies lay off swags of technical staff
                regularly (or watch them leave), and rotate CEOs but their
                middle management have jobs for life - as the Peter Principle
                states, they are promoted to their highest respective
                incompetence and stay there because no CEO has time to replace
                them.
                
                AI will transform this.
       
                  joe_mamba wrote 6 hours 9 min ago:
                  Disagree with the "jobs for life" part for management. Only
                  managers who are there thanks to connection, nepotism or
                  cronyism, are there for life as long as those shielding them
                  also stay in place. Those who got in or got promoted to
                  management meritocratically don't have that protection and
                  are the first to be let go.
                  
                  At all large MNCs I worked at, management got hired and fired
                  mostly on their (or lack thereof) connections and less on
                  what they actually did. Once they got let go, they had near
                  impossible time finding another management position elsewhere
                  without connections in other places.
       
                mraza007 wrote 6 hours 22 min ago:
                This is so true.
                Especially with middle managers: they are the ones that are
                hit the hardest.
       
                  joe_mamba wrote 5 hours 45 min ago:
                  Yes I was talking about middle managers mostly. Upper
                  management, C-suite, execs are mostly protected from firing
                  unless they F-up big time like sexual assault, hate speech,
                  etc.
       
        joshuaisaact wrote 7 hours 59 min ago:
        Couldn't disagree with this article more. I think the future of
        software engineering is more T-shaped.
        
        Look at the 'Product Engineer' roles we are seeing spreading in
        forward-thinking startups and scaleups.
        
        That's the future of SWE I think. SWEs take on more PM and design
        responsibilities as part of the existing role.
       
          pjmlp wrote 3 min ago:
          Or architects, someone has to draw the nice diagrams and spec files
          for the robots.
          
          However, like in automated factories, only a small percentage is
          required to stay around.
       
          reeredfdfdf wrote 5 hours 51 min ago:
          I agree. In many cases it's probably easier for a developer to become
          more of a product person, than for a product person to become a dev.
          Even with LLM's you still need to have some technical skills & be
          able to read code to handle technical tasks effectively.
          
          Of course things might look different when the product is something
          that requires really deep domain knowledge.
       
        chubot wrote 8 hours 6 min ago:
        Yeah, I think that when writing code becomes cheap, then all the
        COMPLEMENTS become more valuable:
        
            - testing
            - reviewing, and reading/understanding/explaining
            - operations / SRE
       
        willtemperley wrote 8 hours 45 min ago:
        This may be true about SaaS. Not all software is SaaS, thankfully.
       
        hahahahhaah wrote 8 hours 48 min ago:
        Operational excellence will always be needed but part of that is
        writing good code. If the slop machine has made bad decisions it could
        be more efficient to rewrite using human expertise and deploy that.
       
        dionian wrote 9 hours 35 min ago:
        But there is bad code and good code and SREs can't tell you which is
        which, nor fix it.
       
          VirusNewbie wrote 7 hours 23 min ago:
          Why not? I'm a SWE SRE and I'm arguably better at telling good code
          from bad code than many of the pure devs I've worked with.
       
          bionsystem wrote 8 hours 51 min ago:
          My take (I'm an SRE) is that SRE should work pre-emptively to provide
          reproducible prod-like environments so that QA can test DEV code
          closer to real-life conditions. Most prod platforms I've seen are
          nowhere near that level of automation, which makes it really hard to
          detect or even reproduce production issues.
          
          And no, as an SRE I won't read DEV code, but I can help my team test
          it.
       
            dmoy wrote 6 hours 24 min ago:
            > And no, as an SRE I won't read DEV code, but I can help my team
            test it.
            
            I mean to each their own.  Sometimes if I catch a page and the
            rabbit hole leads to the devs code, I look under the covers.
            
            And sometimes it's a bug I can identify and fix pretty quickly. 
            Sometimes faster than the dev team because I just saw another dev
            team make the same mistake a month prior.
            
            You gotta know when to cut your losses and stop searching the
            rabbit hole though, that's true.
       
              bionsystem wrote 5 hours 58 min ago:
              I agree with your nuance, but that's not my default mode, unless
              I know the language and the domain well I am not going to write
              an MR. I'm going to read the stack trace to see if it's a conf
              issue though.
       
        deadbabe wrote 10 hours 39 min ago:
        CRE - Code Reliability Engineering
        
        AI will not get much better than what we have today, and what we have
        today is not enough to totally transform software engineering. It is a
        little easier to be a software engineer now, but that’s it. You can
        still fuck everything up.
       
          falcor84 wrote 10 hours 23 min ago:
          > AI will not get much better than what we have today
          
          Wow, where did this come from?
          
          From what just comes to my mind based on recent research, I'd expect
          at least the following this or next year:
          
          * Continuous learning via an architectural change like Titans or
          TTT-E2E.
          
          * Advancement in World Models (many labs focusing on them now)
          
          * Longer-running agentic systems, with Gas Town being a recent proof
          of concept.
          
          * Advances in computer and browser usage - tons of money being poured
          into this, and RL with self-play is straightforward
          
          * AI integration into robotics, especially when coupled with world
          models
       
            jayd16 wrote 7 hours 18 min ago:
            What does robotics have to do with writing better code?  Is this
            just a random AI wishlist?
       
        Sparkyte wrote 10 hours 48 min ago:
        As an SRE I can tell you AI can't do everything. I have done a little
        software development, even AI can't do everything. What we are likely
        to see is operational engineering become the consolidated role between
        the two. Knows enough about software development and knows enough about
        site reliability... blamo operational engineer.
       
          squidbeak wrote 2 hours 50 min ago:
          Paraphrase: "As an SRE I can tell you that the undetermined and
          unknowable potential of AI definitely won't involve my job being
          replaced."
       
          mellosouls wrote 3 hours 30 min ago:
          "As an SRE I can tell you AI can't do everything."
          
          That's what they used to say about software engineering and yet this
          is becoming less and less obvious as capabilities increase.
          
          There are no hiding places for any of us.
       
            TuxSH wrote 1 hour 22 min ago:
            Not the person you are replying to but, even if the technical
            skills of AI increase (and stuff like Codex and Claude Code is
            indeed insanely good), you still need someone to make risky
            decisions that could take down prod.
            
            Not sure management is eager to give software owned by other
            companies (inference providers) permission to delete prod DBs.
            
            Also these roles usually involve talking to other teams and
            stakeholders more often than a traditional SWE role does.
            
            Though
            
            > There are no hiding places for any of us.
            
            I agree with this statement. While the timeline is unclear (LLM use
            is heavily subsidized), I think this will translate into less
            demand for engineers, overall.
       
              pjmlp wrote 2 min ago:
              Indeed, however the amount of "someone" is going to be way less.
       
        ks2048 wrote 10 hours 52 min ago:
        This says nothing about why, if AI can write software, AI cannot do
        these other things.
       
        augusteo wrote 11 hours 20 min ago:
        stackskipton makes a good point about authority. SRE works at Google
        because SREs can block launches and demand fixes. Without that
        organizational power, you're just an on-call engineer who also writes
        tooling.
        
        The article's premise (AI makes code cheap, so operations becomes the
        differentiator) has some truth to it. But I'd frame it differently: the
        bottleneck was never really "writing code." It was understanding what
        to build and keeping it running. AI helps with one of those. Maybe.
       
          nasretdinov wrote 5 hours 5 min ago:
          > because SREs can block launches and demand fixes
          
          I didn't find that particularly true during my tenure, but obviously
          Google is huge, so there probably exist teams that actually can
          afford to behave this way...
       
        giancarlostoro wrote 11 hours 44 min ago:
        What? Maybe OP's future. SWE is just going to replace QA and maybe
        architects if the industry adopts AI more, but there are a lot of
        holdouts. There are plenty of projects out there that are 'boring' and
        will not bother.
       
        stackskipton wrote 11 hours 49 min ago:
        As someone who works in an Ops role (SRE/DevOps/Sysadmin), SRE is
        something that only works at Google, mainly because for Devs to do SRE,
        they need the ability to reject or demand code fixes, which means you
        need someone acting as a prompt engineer who understands the code, and
        now they're back to being a developer.
        
        As for the more dedicated Ops side, it's garbage in, garbage out. I've
        already had too many outages caused by AI slop being fed into
        production; calling all developers SREs won't change the fact that AI
        can't program now without highly experienced people controlling it.
       
          bionsystem wrote 8 hours 43 min ago:
          Most devs can't do SRE, in fact the best devs I've met know they
          can't do SRE (and vice versa). If I may get a bit philosophical, SRE
          must be conservative by nature and I feel that devs are often
          innovative by nature. Another argument is that they simply focus on
          different problems. One sets up an IDE and clicks play, has some
          ephemeral devcontainer environment that "just works", and the hard
          part is to craft the software. The other has the software ready and
          sometimes very few instructions on how to run it, + your typical
          production issues, security, scaling, etc. The brain of each gets
          wired differently over time to solve those very different issues
          effectively.
       
            zinodaur wrote 7 hours 9 min ago:
            I don’t understand this take - if all engineers go on call, they
            learn real quick what happens when their coworkers are too
            innovative. It is a good feedback loop that teaches them not to
            make unreliable software.
            
            SREs are great when the problem is “the network is down” or
            “kubernetes won’t run my pods”, but expecting a random
            engineer to know all the failure modes of software they didn’t
            build and don’t have context on never seems to work out well.
       
            rincebrain wrote 8 hours 6 min ago:
            It's possible to do both, you just need to be cognizant of what
            you're doing in both positions.
            
            The tricky part comes when you don't have both roles for something,
            like SRE-developed tools that are maintained by the ones writing
            them, and you need to strike the balance yourselves until/unless
            you wind up with that split. If you're not aware of both hats and
            juggling wearing them intentionally, in that case, you can wind up
            with tools out of SRE that are worse than any SWE-only tool might
            ever be, because the SREs sometimes think they won't make the same
            mistakes, but all the same feature-focused things apply for
            SRE-written tools too...
       
        almosthere wrote 11 hours 54 min ago:
        Until you find out there are 40 - 80 startups writing agents in the SRE
        space :/
       
          cl0ckt0wer wrote 1 hour 30 min ago:
          Reliable ai agents would make you a trillionaire.
       
          ozim wrote 6 hours 36 min ago:
          Basically that’s what people are doing with YOLO mode letting
          Claude do everything in the system.
       
          ikiris wrote 9 hours 3 min ago:
          And I wish them luck, because the thought of current ai bots doing
          SRE work effectively is laughable.
       
          Nextgrid wrote 11 hours 25 min ago:
          It only matters if any of those can promise reliability and either
          put their own money where their mouth is or convince a bigger player
          to insure them (and actually get that player to pay up).
          
          Ultimately hardware, software, QA, etc is all about delivering a
          system that produces certain outputs for certain inputs, with certain
          penalties if it doesn’t. If you can, great, if you can’t, good
          luck. Whether you achieve the “can” with human development or LLM
          is of little concern as long as you can pay out the penalties of
          “can’t”.
       
        adelmotsjr wrote 12 hours 13 min ago:
        For those who were oblivious to what SRE means, just like me: SRE is
        _site reliability engineering_
       
          arionmiles wrote 7 hours 21 min ago:
          Servers, Ready to Eat
       
          ares623 wrote 10 hours 4 min ago:
          Seemingly Random Engineering
       
            samyar wrote 27 min ago:
            Super Ready Engineer
       
            ithkuil wrote 1 hour 14 min ago:
            Sysadmin Really Expensive
       
            bronlund wrote 7 hours 16 min ago:
            Stuckup Retro Engineer
       
            bravetraveler wrote 9 hours 17 min ago:
            Sales Recovery Engineering
       
          F7F7F7 wrote 10 hours 50 min ago:
          I knew what an SRE was and found the article somewhat interesting
          with a slightly novel (throwaway), more realistic take on the "why
          need Salesforce when you can vibe your own Salesforce" convo.
          
          But not defining what an SRE is feels like a glaring, almost
          suffocating, omission.
       
       
 (DIR) <- back to front page