[HN Gopher] Beware anti patterns in event driven architecture
       ___________________________________________________________________
        
       Beware anti patterns in event driven architecture
        
       Author : indentit
       Score  : 57 points
       Date   : 2024-06-08 18:56 UTC (4 hours ago)
        
 (HTM) web link (codeopinion.com)
 (TXT) w3m dump (codeopinion.com)
        
       | gnat wrote:
       | https://web.archive.org/web/20240608190815/https://codeopini...
        
       | candiddevmike wrote:
       | Can someone share some long term event driven success stories?
       | Almost everything you see online is written by consultants or
       | is about brand new, greenfield implementations. I'm curious how
       | long these systems last.
        
         | cjk2 wrote:
         | No. We have a complete fucking disaster on our hands.
        
           | macintux wrote:
           | How old of a system? Do you feel it's the implementation, the
           | design, or the concept itself that went wrong? Is your system
           | a good fit?
           | 
           | (No stake in this one way or another, just curious.)
        
             | cjk2 wrote:
             | Less than 5 years. Vanity project. Built and maintained by
             | astronaut architects. Entirely unnecessary. Poorly
             | implemented down to the level of wire contracts being
             | inconsistent. Overheads are insane both from engineering
             | and operational POV.
        
               | moandcompany wrote:
               | Resume driven development never goes out of style
        
               | analognoise wrote:
               | Hail, RDD, my favorite development style.
        
               | cjk2 wrote:
               | I call this one FDD: Fuckwit Driven Development. Because
               | if it was resume-driven I'd expect it to be something
               | that they would want to put on their resume. But this is
               | unmentionable.
        
               | moandcompany wrote:
               | There are sayings along the lines of "Victory/success has
               | a thousand fathers, but defeat/failure is an orphan."
               | 
               | Chances are that system and its outcomes are described
               | very differently on a resume.
        
         | turkey99 wrote:
         | Yes, it's a great tool for integration. We have a product suite
         | and it's our chosen way to connect products.
        
         | blowski wrote:
         | I was the lead developer on one for an insurance company a few
         | years back, and it's still in active use. Insurance is a
         | heavily regulated domain, where an audit trail is more
         | important than performance. There was a natural pattern for it
         | to follow, as we were mapping a stable industry standard.
         | 
         | I also tried doing it in a property setting, where profit
         | margins were tight. The effort needed wasn't worth the cost,
         | and clients didn't really care about the value proposition
         | anyway. We pretty much replaced the whole layer with a more
         | traditional crud system.
        
         | devdude1337 wrote:
         | When I did game dev I often went for an event driven approach or
         | messaging based systems combined with oop and state machines to
         | prevent eventual consistency locally. It works great in that
         | domain, albeit not being the most performant solution.
         | 
         | In web or business systems it works well for some(!) parts. You
         | just shouldn't do everything that way - but often people get
         | too excited about a solution and then they tend to overdo it and
         | apply it everywhere, even when not appropriate.
         | 
         | Always choose the golden middle path and apply patterns where
         | they fit well.
        
         | Salgat wrote:
         | https://www.eventstore.com/case-studies/insureon
         | 
         | I can attest to this case study being 100% true. Our platform
         | has been using EventStore as our primary database for 9 years
         | going strong, and I'm still very happy with it. The key thing
         | is that it needs to be done right from the very beginning; you
         | can't do major architecture reworks later on and you need an
         | architect who really knows what they're doing. Also, you can't
         | half-ass it; event sourcing, CQRS, etc. all had to be embraced
         | the entire time, no shortcuts.
         | 
         | I will say though, the biggest downside is that scaling is
         | difficult since you can't always rely on snapshots of data,
         | sometimes you need to event source the entire model and that
         | can get data heavy. If you're standing up a new projector, you
         | could be going through tens of millions of events before it is
         | caught up which requires planning. It is incredible though
         | being able to have every single state change ever made on the
         | platform available, the data guys love it and it makes
         | troubleshooting way easier since there are no secrets on what
         | happened. The biggest con is that most people don't really
         | understand it intuitively, since it's a very different way of
         | doing things, which is why so many companies end up fucking it
         | up.
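
The catch-up cost described above can be illustrated with a minimal sketch. It does not use EventStore's API; `Event`, `ActivePolicies`, and `catch_up` are hypothetical names, and the log is a plain in-memory list. The point is only that a brand-new projector has to read every event, while one restored from a snapshot resumes after its checkpoint.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    position: int   # monotonically increasing position in the log
    kind: str       # hypothetical event type, e.g. "PolicyIssued"
    policy_id: str


@dataclass
class ActivePolicies:
    """Hypothetical projector: a read model of currently active policies."""
    checkpoint: int = -1
    active: set = field(default_factory=set)

    def apply(self, event):
        if event.position <= self.checkpoint:
            return  # already processed, so re-delivery is harmless
        if event.kind == "PolicyIssued":
            self.active.add(event.policy_id)
        elif event.kind == "PolicyCancelled":
            self.active.discard(event.policy_id)
        self.checkpoint = event.position


def catch_up(projector, log):
    """Apply every event after the checkpoint; return how many were read."""
    pending = [e for e in log if e.position > projector.checkpoint]
    for event in pending:
        projector.apply(event)
    return len(pending)


log = [
    Event(0, "PolicyIssued", "p1"),
    Event(1, "PolicyIssued", "p2"),
    Event(2, "PolicyCancelled", "p1"),
    Event(3, "PolicyIssued", "p3"),
]

# A new projector has no checkpoint and must read the whole log.
fresh = ActivePolicies()
read_fresh = catch_up(fresh, log)

# A projector restored from a snapshot taken at position 1 reads only the tail.
from_snapshot = ActivePolicies(checkpoint=1, active={"p1", "p2"})
read_snapshot = catch_up(from_snapshot, log)
```

Both projectors end in the same state; the difference is how many events each had to read, which is what needs planning when the log holds tens of millions of events.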
        
           | Spivak wrote:
           | Am I dumb or is this basically the binlog of your database
           | but without the tooling to let you do efficient querying?
           | 
           | Like I get the "message bus" architecture when you have a
           | bunch of services emitting events and consumers for differing
           | purposes but I don't think I would feel comfortable using it
           | for state tracking. Especially when it seems really hard to
           | enforce a schema / do migrations. CQRS also makes sense for
           | this but only when it functions as a WAL and isn't meant to
           | be stored forever but persisted by everyone who's interested
           | in it and then eventually discarded.
        
             | ffsm8 wrote:
             | > _Especially when it seems really hard to enforce a schema
             | / do migrations_
             | 
             | Enforcing the schema isn't too hard ime. But every
             | migration needs to be bi-directionally compatible. That's
             | likely what they meant with "you need an architect and
             | can't make major changes later on"
             | 
             | It's the same issue you've had with nosql, even though you
             | technically do have a schema
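
One common way to get that two-way compatibility is a tolerant reader: new fields get defaults so old events still load, and unknown fields are ignored so newer events do not break older readers. The sketch below assumes JSON payloads and a hypothetical `CustomerRegistered` event; the field names are illustrative only.

```python
import json


def read_customer_registered(raw):
    """Tolerant reader for a hypothetical CustomerRegistered event."""
    data = json.loads(raw)
    return {
        "customer_id": data["customer_id"],
        "email": data["email"],
        # Field added in a hypothetical v2; old events fall back to a default.
        "country": data.get("country", "unknown"),
        # Any other keys in the payload are deliberately ignored.
    }


old_event = '{"customer_id": "c1", "email": "a@example.com"}'
new_event = (
    '{"customer_id": "c2", "email": "b@example.com",'
    ' "country": "NO", "referrer": "ad"}'
)

from_old = read_customer_registered(old_event)
from_new = read_customer_registered(new_event)
```

This only covers additive changes. Renaming or removing a field that stored events already use still requires translating old events on read or rewriting the log.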
        
         | vmaurin wrote:
         | I have been doing this kind of stuff both in ad tech and trust
         | & safety industry, mainly to handle scalability. Something that
         | looks like "Event-carried state transfer" here
         | https://martinfowler.com/articles/201701-event-driven.html
         | 
         | These systems are working fine, but maybe they share some
         | common ground:
         | 
         | * very few services
         | * the main throughput is "fact" events (so something that did
         |   happen)
         | * what you get as "Event-carried state transfer" is basically
         |   the configuration. One service owns it, with a classical DB
         |   and a UI, but then exposes the configuration to the whole
         |   system with this kind of event (and all the consumers
         |   consume these read-only)
         | * usually you have to deal with eventual consistency a lot in
         |   this kind of setup (so it scales well, but there is a
         |   tradeoff)
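
A minimal sketch of that configuration pattern, assuming an in-memory queue in place of a real broker. `ConfigOwner`, `Consumer`, and the `max_bid` key are hypothetical names used only for illustration. The consumer never queries the owner; it keeps a local read-only copy built from events, so it can be briefly stale.

```python
from collections import deque


class ConfigOwner:
    """The one service that owns the configuration and publishes changes."""

    def __init__(self, topic):
        self._config = {}
        self._topic = topic

    def set(self, key, value):
        self._config[key] = value
        # The event carries the new state itself, not just "something changed".
        self._topic.append({"type": "ConfigChanged", "key": key, "value": value})


class Consumer:
    """Keeps a local copy of the configuration, updated only from events."""

    def __init__(self):
        self.local_config = {}

    def drain(self, topic):
        while topic:
            event = topic.popleft()
            self.local_config[event["key"]] = event["value"]


topic = deque()
owner = ConfigOwner(topic)
consumer = Consumer()

owner.set("max_bid", 5)
stale = consumer.local_config.get("max_bid")   # event not consumed yet
consumer.drain(topic)
fresh = consumer.local_config.get("max_bid")   # local copy has caught up
```

The gap between `stale` and `fresh` is the eventual consistency tradeoff: reads are local and cheap, but a consumer may act on old configuration until it processes the event.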
        
         | jgraettinger1 wrote:
         | PostgreSQL.
         | 
         | The WAL is an event log, and when you squint at its internal
         | architecture, you'll see plenty of overlap with distributed
         | event sourcing.
        
           | mrkeen wrote:
           | Likewise with git. There's the "top-level events" that you
           | see (commits). But even when you're doing 'unsafe'
           | operations, you're working with the lower-level reflog
           | events.
        
         | ninkendo wrote:
         | Chiming in with another "no" here. We adopted a message
         | bus/event-driven architecture when moving a very popular piece
         | of software from the cloud, to directly on the user's device...
         | it was a disaster IMO.
         | 
         | The core orchestration of the system was done via events on the
         | bus, and nobody had any idea what was happening when a bug
         | occurred. People would pass bugs around, "my code did the right
         | thing given the event it got", "well, my code did the right
         | thing too", and nobody understood the full picture because
         | everyone was stuck in their own silo. Event driven
         | architectures encourage this: events decouple systems such that
         | you don't know or care what happens when you emit a message,
         | until one day it's emitted with slightly different timing or
         | ordering or different semantics, and things are broken and
         | nobody knows why.
         | 
         | The worst part is that software is basically "take user input,
         | do process A on it, then do process B on that, then do process
         | C on that." It could have _so easily_ been a simple imperative
         | function that called C(B(A(input))), but instead we made events
         | for "inputWasEmitted", "Aoutput", "Boutput", etc.
         | 
         | What happens when system C needs one more piece of metadata
         | about the user input? 3 PR's into 3 repos to plumb the
         | information around. Coordinating the release of 3 libraries.
         | All around just awful to work with.
         | 
         | Oh and this is a _very high profile_ piece of software with a
         | user base in the 9 figure range.
         | 
         | (Wild tangent: holy _shit_ is it hard to get iOS to accept "do
         | process" in a sentence. I edited that paragraph at least 30
         | times, no joke, trying every trick I could to stop it
         | correcting it to "due process". I almost gave up. I used to
         | defend autocorrect but holy shit that was a nightmare.)
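
The contrast described above can be sketched side by side. The stage functions `a`, `b`, `c` and the tiny `Bus` class are hypothetical stand-ins; only the event names "inputWasEmitted", "Aoutput", and "Boutput" come from the comment. Both versions compute the same result, but in the evented one the A -> B -> C order exists only in the subscription wiring.

```python
from collections import defaultdict


def a(text):
    return text.strip()


def b(text):
    return text.lower()


def c(text):
    return text.split()


def pipeline(user_input):
    # The whole control flow is visible in one expression.
    return c(b(a(user_input)))


class Bus:
    """Minimal synchronous in-process event bus (illustrative only)."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def emit(self, topic, payload):
        for handler in self.handlers[topic]:
            handler(payload)


def run_evented(user_input):
    bus = Bus()
    results = []
    # Each stage only knows which event it reacts to and which it emits.
    bus.subscribe("inputWasEmitted", lambda p: bus.emit("Aoutput", a(p)))
    bus.subscribe("Aoutput", lambda p: bus.emit("Boutput", b(p)))
    bus.subscribe("Boutput", lambda p: results.append(c(p)))
    bus.emit("inputWasEmitted", user_input)
    return results[0]


direct = pipeline("  Hello World ")
evented = run_evented("  Hello World ")
```

In `pipeline`, giving `c` one more piece of metadata is a change to one call chain. In `run_evented`, the same change has to pass through every intermediate payload, which is the plumbing across three repos described above.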
        
         | mrkeen wrote:
         | We've had mistakes that we've been able to course-correct from.
         | 
         | Our users are small businesses with organisation numbers, and
         | we mostly think of those numbers as unique. But strictly they
         | aren't, so we 'overwrote' some companies with other companies.
         | 
         | Once we detected and fixed the bug, we just replayed the events
         | with the fixed code, and we hadn't lost any data.
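
A minimal sketch of that kind of recovery, assuming the events were stored untouched and only the projection code was wrong. The comment does not say what the actual fix was; keying by `(country, org_number)` is an assumption made purely for illustration, as are the event fields.

```python
events = [
    {"org_number": "123", "country": "NO", "name": "Fjord Bakery"},
    {"org_number": "123", "country": "SE", "name": "Lake Plumbing"},
]


def project_buggy(log):
    companies = {}
    for event in log:
        # Bug: treats the organisation number alone as a unique key,
        # so the second company overwrites the first.
        companies[event["org_number"]] = event["name"]
    return companies


def project_fixed(log):
    companies = {}
    for event in log:
        # Hypothetical fix: include the country in the key.
        companies[(event["country"], event["org_number"])] = event["name"]
    return companies


before = project_buggy(events)
after = project_fixed(events)   # replaying the same log with fixed code
```

The overwrite happened only in the derived state. Because the log still holds both original events, rebuilding with the corrected projection restores both companies.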
        
         | simonbw wrote:
         | It's been an incredibly useful pattern for me in game
         | development. I have a hard time imagining making a game with
         | any level of complexity without it. You can definitely go
         | overboard with it, but I have a hard time even imagining how
         | some systems like collision detection/a physics engine could
         | even work without it.
        
         | cweld510 wrote:
         | I work on an event-based architecture that I think is
         | successful, but that's because our core primitives are event-
         | based, so there is no impedance mismatch in the way that there
         | can be if you migrate from a request-response architecture to
         | an evented one. Specifically, we aren't trying to deal with
         | databases and HTTP (both of which are largely synchronous
         | primitives). Instead, I work on a platform for somewhat
         | arbitrary code execution; and the code we are executing depends
         | on our code rather than vice versa. In general, the code we
         | execute on the platform can run for an indeterminate amount of
         | time, and it generally has control and calls back into our code
         | rather than our code calling into it. So our control flow is
         | naturally callback-based rather than request/response; as a
         | result, our system is fundamentally event-based.
        
         | tkiolp4 wrote:
         | No. The usual pains are:
         | 
         | - Producer and consumer are decoupled. That's a good thing,
         | right? Good luck finding the consumer when you need to modify
         | the producer (the payload). People usually don't document these
         | things
         | 
         | - Let's use SNS/SQS because why not. Good luck reproducing
         | producers and consumers locally in your machine. Third party
         | infra in local env is usually an afterthought
         | 
         | - Observability. Or rather the lack of it. It's never out of
         | the box, and so usually nobody cares about it until an incident
         | happens
        
       | arwhatever wrote:
       | I often hear the argument in favor of event-driven architecture
       | that you can work on one part of a system in isolation without
       | having to consider the other parts, and then I get assigned some
       | task which requires me to consider the entire system operation,
       | now with events that are harder to trace than function calls
       | would have been.
       | 
       | Now when people argue "because decoupling," I hear, "You don't
       | get as much notification that you just broke a downstream
       | system."
        
         | hobs wrote:
         | I think generally a lot of these types of problems were
         | actually had by folks who grew out of single node systems and
         | had a lot of interesting ideas to solve problems that were new
         | in those domains, GIVEN they've already solved the stateful
         | domain problems as well.
         | 
         | When you've never grown out of a single node domain but you do
         | event driven "because scaling" or whatever, you've shot
         | yourself in the foot amazingly hard.
        
         | moandcompany wrote:
         | Integration tests?
        
         | scubbo wrote:
         | Amen. Event-driven architecture makes it easier to bury your
         | head in the sand, and harder to implement an actually-working
         | feature.
        
       | pudwallabee wrote:
       | I have seen Kafka pulled out by its hairs and replaced with
       | request based architecture.
       | 
       | Event driven architecture, to me, is itself an antipattern.
       | 
       | It seems like a replacement for batch processing. Replayable
       | messages are AWESOME. Until you encounter the complexity for a
       | system to actually replay them consistently.
       | 
       | As far as the author's video goes, while there was some truth in
       | there, it was a little thin compared to the complexity of these
       | architectures. I believe that even though Kafka acts the part of
       | "dumb pipe", it doesn't stay dumb for long, and the n
       | distributions of Kafka logs in your organization could be 1000x
       | more expensive than a monolithic DB and a monolithic API to
       | maintain.
       | 
       | Yes, it appears auditable, but is it? The big argument for
       | replayability is that, unlike an API that falls over, there's no
       | data loss. If you work with Kafka long enough you'll realize that
       | data loss will become a problem you didn't think you had. You'll
       | have to hire people to "look into" data loss problems constantly
       | with Kafka. It's just too much infrastructure to even care about.
       | 
       | There's also something ergonomically wrong with event driven
       | architecture. People don't like it. And it also turns people into
       | robots who are "not responsible" for their product. There's so
       | much infrastructure to maintain that people just punt everything
       | back to the "enterprise kafka team".
       | 
       | The whole point of microservices was to enable flexibility, smart
       | services and dumb pipes, and effective CI/CD and devops.
       | 
       | We are nearing the end of microservices adoption whether it be
       | event or request driven. In mature organizations it seems to me
       | that request driven is winning by a large margin over event
       | driven.
       | 
       | It may be counterintuitive, but the time to market of request
       | driven architecture and cost to maintain is way way lower.
        
       | astrea wrote:
       | I've got an anti-pattern to avoid: "Here's my YouTube"
        
       ___________________________________________________________________
       (page generated 2024-06-08 23:00 UTC)