[HN Gopher] The Dual Nature of Events in Event-Driven Architecture
___________________________________________________________________
The Dual Nature of Events in Event-Driven Architecture
Author : lutzh
Score : 61 points
Date : 2024-10-31 08:15 UTC (1 days ago)
(HTM) web link (www.reactivesystems.eu)
(TXT) w3m dump (www.reactivesystems.eu)
| lutzh wrote:
| Article observing that events in event-driven architecture are
| both triggers of actions and carriers of data, and that these
| roles may conflict in the event design. Submitted by author.
| gunnarmorling wrote:
| Nice one. I wrote about this a while ago, from a slightly
| different perspective, focusing on data change events [1].
| Making a similar differentiation there between id-only events
| (which you describe as triggers of action; from a data change
| feed perspective, that action typically would be a re-select of
| the current state of the represented record), full events (your
| carriers of data) and patch events (carriers of data with only
| the subset of attributes whose value has changed).
|
| [1] https://www.decodable.co/blog/taxonomy-of-data-change-events
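A minimal sketch of the three event shapes described in this comment (id-only, full, patch). The class names, the `customer_id` key, and the `diff` helper are assumptions for illustration and are not taken from the linked post.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class IdOnlyEvent:
    """Trigger only: the consumer re-selects the current state by key."""
    customer_id: int


@dataclass(frozen=True)
class FullEvent:
    """Carrier of data: the complete record as of the change."""
    customer_id: int
    state: dict[str, Any]


@dataclass(frozen=True)
class PatchEvent:
    """Carrier of data: only the attributes whose value changed."""
    customer_id: int
    changed: dict[str, Any]


def diff(before: dict[str, Any], after: dict[str, Any]) -> dict[str, Any]:
    """Return the attributes of `after` that differ from `before`.

    Removed attributes are not represented in this simplified sketch.
    """
    return {k: v for k, v in after.items() if before.get(k) != v}


before = {"name": "Ada", "email": "ada@example.com", "tier": "basic"}
after = {"name": "Ada", "email": "ada@example.com", "tier": "gold"}

id_only = IdOnlyEvent(customer_id=123)
full = FullEvent(customer_id=123, state=after)
patch = PatchEvent(customer_id=123, changed=diff(before, after))
```

The id-only event forces a lookup, the full event is self-contained, and the patch event is only useful to a consumer that already holds the previous state.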
| bob1029 wrote:
| The triggering of the action is a direct consequence of the
| information an event contains. Whether or not an action is
| triggered should not be the responsibility of the event.
|
| If you are writing events with the intention of having them
| invoke some specific actions, then you should prefer to invoke
| those things directly. You should be describing a space of things
| that have occurred, not commands to be carried out.
|
| By default I would only include business keys in my event data.
| This gets you out of traffic on having to make the event serve as
| an aggregate view for many consumers. If you provide the keys of
| the affected items, each consumer can perform their own targeted
| lookups as needed. Making assumptions about what views each will
| need is where things get super nasty in my experience (i.e.
| modifying events every time you add consumers).
| withinboredom wrote:
| Events shouldn't carry any data in my opinion, except
| parameterized data. In the context of a booking, for example,
| it would be SeatBooked {41A} instead of 41ABooked, though the
| latter is a better event, but harder to program for. The entire
| flow might look like this:
|
| SeatTimeLimitedReserved {41A, 15m}
|
| SeatAssignedTo {UserA}
|
| SeatBooked {41A}
|
| If a consumer needs more data, there should be a new event.
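The flow above could be sketched as three small parameterized event types folded into a booking state. The field names and the `apply` reducer are hypothetical; the comment only gives the event names and their parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatTimeLimitedReserved:
    seat: str
    hold_minutes: int


@dataclass(frozen=True)
class SeatAssignedTo:
    user: str


@dataclass(frozen=True)
class SeatBooked:
    seat: str


def apply(state: dict, event) -> dict:
    """Fold one event into a (hypothetical) booking state."""
    if isinstance(event, SeatTimeLimitedReserved):
        return {**state, "seat": event.seat, "status": "held",
                "hold_minutes": event.hold_minutes}
    if isinstance(event, SeatAssignedTo):
        return {**state, "user": event.user}
    if isinstance(event, SeatBooked):
        return {**state, "seat": event.seat, "status": "booked"}
    return state  # unknown events are ignored


flow = [
    SeatTimeLimitedReserved("41A", 15),
    SeatAssignedTo("UserA"),
    SeatBooked("41A"),
]

state: dict = {}
for event in flow:
    state = apply(state, event)
```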
| lutzh wrote:
| Thanks for your feedback, I appreciate it!
|
| > The triggering of the action is a direct consequence of the
| information an event contains. Whether or not an action is
| triggered should not be the responsibility of the event.
|
| I agree, but still for different consumers events will have
| different consequences - in some consumers it'll trigger an
| action that is part of a higher-level process (and possibly
| further events), in others it'll only lead to data being
| updated.
|
| > If you are writing events with the intention of having them
| invoke some specific actions, then you should prefer to invoke
| those things directly. You should be describing a space of
| things that have occurred, not commands to be carried out.
|
| With this I don't agree. I think that's the core of event-
| driven architecture that events drive the process, i.e. will
| trigger certain actions. That's not contradicting them
| describing what has occurred, and doesn't make them commands.
|
| > By default I would only include business keys in my event
| data. This gets you out of traffic on having to make the event
| serve as an aggregate view for many consumers. If you provide
| the keys of the affected items, each consumer can perform their
| own targeted lookups as needed. Making assumptions about what
| views each will need is where things get super nasty in my
| experience (i.e. modifying events every time you add
| consumers).
|
| This is feedback I got multiple times, the "notification plus
| callback" seems to be a popular pattern. It has its own
| problems though, both conceptual (event representing an
| immutable set of facts) and technical (high volume of events).
| I think digging into the pros and cons of that pattern will be
| one of my next blog posts! Stay tuned!
| williamdclt wrote:
| > each consumer can perform their own targeted lookups as
| needed
|
| that puts you into tricky race condition territory, the data
| targeted by an event might have changed (or be deleted) between
| the time it was emitted and the time you're processing it. It's
| not always a problem, but you have to analyse if it could be
| every time.
|
| It also means that you're losing information on what this event
| actually represents: looking at an old event you wouldn't know
| what it actually did, as the data has changed since then.
|
| It also introduces a synchronous dependency between services:
| your consumer has to query the service that dispatched the
| event for additional information (which is complexity and extra
| load).
|
| Ideally you'd design your event so that downstream consumers
| don't need extra information, or at least the information they
| need is independent from the data described by the event: eg a
| consumer needs the user name to format an email in reaction to
| the "user_changed_password" event? No problem to query the
| service for the name, these are independent concepts, updates
| to these things (password & name) can happen concurrently but
| it doesn't really matter if a race condition happens.
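A toy illustration of the race described in this comment, assuming an in-memory `source` dict stands in for the emitting service and two lists stand in for topics. All names here are made up for the example.

```python
source: dict[int, dict] = {}
id_only_queue: list[dict] = []
full_queue: list[dict] = []


def update_email(customer_id: int, email: str) -> None:
    """Change the record and publish both an id-only and a full event."""
    source[customer_id] = {"email": email}
    id_only_queue.append({"customer_id": customer_id})
    full_queue.append({"customer_id": customer_id, "email": email})


update_email(123, "first@example.com")
update_email(123, "second@example.com")  # happens before the consumer runs

# Consumer of id-only events looks the record up when it processes them.
seen_via_lookup = [source[e["customer_id"]]["email"] for e in id_only_queue]

# Consumer of full events reads what the event itself carried.
seen_via_payload = [e["email"] for e in full_queue]
```

The lookup-based consumer never observes the first address, which is fine if it only cares about current state and a problem if it needs the history.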
| candiddevmike wrote:
| There should be some law that says a strictly serialized
| process should never be broken into discrete services.
| Distributed locks and transactions are hell.
| withinboredom wrote:
| The best way to avoid distributed locks and transactions is
| to manually do the work. For example, instead of doing a
| distributed lock on two accounts when transferring funds,
| you might do this (which is the same as a distributed
| transaction, without the lock):
|
| 1. take money from account A
|
| 2. if failed, put money back into account A
|
| 3. put money into account B
|
| 4. if failed, put money back into account A
|
| In other words, perform compensating actions instead of
| doing transactions.
|
| This also requires that you have some kind of mechanism to
| handle an application crash between 2 and 3, but that is
| something else entirely. I've been working on this for a
| couple of years now and getting close to something really
| interesting ... but not quite there yet.
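A minimal in-process sketch of the compensating-action steps listed above. The `accounts` dict, the `closed` set and the exception types are assumptions for illustration. Because `withdraw` here either fully succeeds or changes nothing, step 2 has nothing to undo in this sketch, and crash recovery between steps is deliberately not handled.

```python
class InsufficientFunds(Exception):
    pass


class AccountClosed(Exception):
    pass


accounts = {"A": 100, "B": 0}
closed: set[str] = set()


def withdraw(name: str, amount: int) -> None:
    if accounts[name] < amount:
        raise InsufficientFunds(name)
    accounts[name] -= amount


def deposit(name: str, amount: int) -> None:
    if name in closed:
        raise AccountClosed(name)
    accounts[name] += amount


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move money without a lock, undoing the withdrawal if the deposit fails."""
    try:
        withdraw(src, amount)      # step 1: take money from the source
    except InsufficientFunds:
        return False               # step 2: nothing was changed, nothing to undo
    try:
        deposit(dst, amount)       # step 3: put money into the destination
    except AccountClosed:
        deposit(src, amount)       # step 4: compensate by refunding the source
        return False
    return True


first = transfer("A", "B", 30)
after_first = dict(accounts)
closed.add("B")
second = transfer("A", "B", 30)
after_second = dict(accounts)
```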
| candiddevmike wrote:
| > This also requires that you have some kind of mechanism
| to handle an application crash between 2 and 3, but that
| is something else entirely
|
| Like a distributed transaction or lock. This is the
| entire problem space, your example above is very naive.
| singron wrote:
| The better version of this is sagas, which is a kind of a
| simplified distributed transaction. If you do this
| without actually using sagas, you can really mess this
| up.
|
| E.g. you perform step 2, but fail to record it. When
| resuming from crash, you perform step 2 again. Now A has
| too much money in their account.
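The double-refund failure described here is usually avoided by making each step idempotent. A hedged sketch, assuming every transfer has an id and that a real system commits the balance change and the step record atomically (for example in one database transaction); the in-memory set below does not survive a crash and only shows the check.

```python
balances = {"A": 70}
applied_steps: set[tuple[str, str]] = set()


def refund_once(transfer_id: str, account: str, amount: int) -> bool:
    """Apply the refund for a transfer at most once.

    Returns False when the step was already recorded, so a retry after
    a crash does not credit the account a second time.
    """
    key = (transfer_id, "refund")
    if key in applied_steps:
        return False
    balances[account] += amount
    applied_steps.add(key)
    return True


first_attempt = refund_once("t1", "A", 30)
retry_after_crash = refund_once("t1", "A", 30)
```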
| marcosdumay wrote:
| Your event data must not be mutable.
|
| That's kind of the first rule of any event-based system. It
| doesn't really matter the architecture, if you decide to name
| the things "event", everybody's head will break if you make
| them mutable.
|
| If you decide to add mutation there in some way, you will
| need to rewrite the event stream, replacing entire events.
| gunnarmorling wrote:
| It's not about mutability of events, but about mutating the
| underlying data itself. If the event only says "customer
| 123 has been updated", and a consumer of that event goes
| back to the source of the event to query the full state of
| that customer 123, it may have been updated again (or even
| deleted) since the event was emitted. Depending on the use
| case, this may or may not be a problem. If the consumer is
| only interested in the current state of the data, this
| typically is acceptable, but if it needs the complete
| history of changes, it is not.
| marcosdumay wrote:
| Making a wacky 2-steps announcement protocol doesn't
| change the nature of your events.
|
| If the consumer goes to your database and asks "what's
| the data for customer 123 at event F52A?" it better
| always get back the same data or "that event doesn't
| exist, everything you know is wrong".
| gunnarmorling wrote:
| > ... at event F52A
|
| Sure, if the database supports this sort of temporal
| query, then you're good with such id-only events. But
| that's not exactly the default for most databases / data
| models.
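A sketch of what "data for customer 123 at event F52A" could look like if the source keeps every version. The `history` structure, the `version` field and both function names are assumptions for illustration, not a feature of any particular database.

```python
# customer_id -> list of (version, state), oldest first
history: dict[int, list[tuple[int, dict]]] = {}


def write(customer_id: int, state: dict) -> dict:
    """Store a new version and return an id-only event that pins it."""
    versions = history.setdefault(customer_id, [])
    version = len(versions) + 1
    versions.append((version, dict(state)))
    return {"customer_id": customer_id, "version": version}


def read_as_of(event: dict) -> dict:
    """Return the state exactly as it was when `event` was emitted."""
    for version, state in history.get(event["customer_id"], []):
        if version == event["version"]:
            return state
    raise LookupError("that event doesn't exist")


e1 = write(123, {"tier": "basic"})
e2 = write(123, {"tier": "gold"})

state_at_e1 = read_as_of(e1)
state_at_e2 = read_as_of(e2)
```

With a pinned version the lookup is repeatable; without one, the id-only event can only ever return whatever the current state happens to be.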
| marcosdumay wrote:
| I'm understanding what you have isn't really "events",
| but some kind of "notifications".
|
| Events are part of a stream that define your data. The
| stream doesn't have to be complete, but if it doesn't
| make sense to do things like buffer or edit it, it's
| probably something else and using that name will mislead
| people.
| bob1029 wrote:
| > an event might have changed (or be deleted) between the
| time it was emitted
|
| Then I would argue it isn't a meaningful event. If some
| attributes of the event could become "out of date" such that
| the logical event risks invalidation in the future, you have
| probably put too much data into the event.
|
| For example, including a user's preferences (e.g., display
| name) in a logon event - while convenient - means that if
| those preferences ever change, the event is invalid to
| reference for those facts. If you only include the user's id,
| your event should be valid forever (for most rational
| systems).
|
| > your consumer has to query the service that dispatched the
| event
|
| An unfortunate but necessary consequence of integrating
| multiple systems together. You can't take out a global
| database lock for every event emitted.
| jwarden wrote:
| Another architecture might be that the service responsible for
| Seat Selection emits a `SeatSelected` event, and another
| service responsible for updating bookings emits a
| `BookingUpdated(Reason: SeatSelected)` "fat" event. Same for
| `PaymentReceived` and `TicketIssued`.
|
| Both events would "describe a space of things that occurred" as
| @bob1029 suggests.
|
| The seat selection process for an actual airline probably needs
| to be more involved. @withinboredom recommends:
| - SeatTimeLimitedReserved {41A, 15m}
| - SeatAssignedTo {UserA}
| - SeatBooked {41A}
|
| In which case, only SeatBooked would trigger a BookingUpdated
| event.
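One way to picture the thin/fat split suggested in this comment: a booking service consumes the thin `SeatSelected` event and emits a `BookingUpdated` event carrying a snapshot. The `bookings` store and the field names are hypothetical.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatSelected:
    """Thin event from the seat selection service."""
    booking_id: str
    seat: str


@dataclass(frozen=True)
class BookingUpdated:
    """Fat event from the booking service."""
    reason: str
    booking: dict  # full snapshot of the booking after the change


bookings = {"b1": {"booking_id": "b1", "seat": None, "paid": False}}


def on_seat_selected(event: SeatSelected) -> BookingUpdated:
    """Apply the thin event and publish the resulting booking state."""
    booking = bookings[event.booking_id]
    booking["seat"] = event.seat
    # Copy so later changes to the store cannot alter the emitted event.
    return BookingUpdated(reason="SeatSelected", booking=copy.deepcopy(booking))


fat = on_seat_selected(SeatSelected("b1", "41A"))
bookings["b1"]["paid"] = True  # a later change must not affect `fat`
```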
| lutzh wrote:
| Thanks for your feedback. I realize I should have elaborated
| the example a bit more, it's too vague. So, as I wrote in
| some other reply as well, please don't over-interpret it. The
| point was only to say that in order to differentiate the
| events, we don't necessarily need distinct types (which would
| result in multiple schemas on a topic), but can instead
| encode it in one type/schema. Like mapping in ORM - instead
| of "table per subclass", you can use "table per class
| hierarchy".
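A consumer-side sketch of the "one type/schema with a discriminator" idea, assuming a JSON payload with a `reason` field. The handler names and payload layout are made up for the example. Some reasons trigger an action, all others just carry data.

```python
import json


def hold_seat(booking: dict) -> str:
    return f"hold seat {booking['seat']}"


def send_receipt(booking: dict) -> str:
    return f"send receipt for {booking['id']}"


# Only the reasons this consumer reacts to; everything else is data only.
HANDLERS = {
    "SeatSelected": hold_seat,
    "PaymentReceived": send_receipt,
}


def consume(raw: str) -> str:
    """Parse one BookingUpdated message and dispatch on its reason."""
    event = json.loads(raw)
    handler = HANDLERS.get(event["reason"])
    if handler is None:
        return "state updated only"
    return handler(event["booking"])


seat_msg = json.dumps({"type": "BookingUpdated", "reason": "SeatSelected",
                       "booking": {"id": "b1", "seat": "41A"}})
ticket_msg = json.dumps({"type": "BookingUpdated", "reason": "TicketIssued",
                         "booking": {"id": "b1", "seat": "41A"}})

seat_result = consume(seat_msg)
ticket_result = consume(ticket_msg)
```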
| svilen_dobrev wrote:
| like, events vs documents, ~2007:
|
| https://learn.microsoft.com/en-gb/archive/blogs/nickmalik/ki...
| lutzh wrote:
| Good catch! Indeed, without me realizing it, this trigger/data
| duality is pretty much what Event Message and Document Message
| are in "Enterprise Integration Patterns" (which is what the
| post you linked refers to). As it happens, in the book the
| authors also speak about "a combined document/event message",
| which is how we mostly use events in EDA today, I think.
| revskill wrote:
| Event is a point in time. State is a range in time.
|
| Geometrically speaking.
|
| So, what should be in an event ? To me, it's the minimum but
| sufficient data on its own to be understandable.
| exabrial wrote:
| Well put!
|
| We do a lot of event driven architecture with ActiveMQ. We try to
| stick to messaging-as-signalling rather than
| messaging-as-data-transfer. These are the terms we came up with,
| I'm sure Martin Fowler or someone else has described it better!
|
| So we have SystemA that is completing some processing of
| something. It's going to toss a message onto a queue that SystemB
| is listening on. We use an XA transaction to make sure the
| database and broker commit together _1. SystemB then receives the
| event from the queue and can begin its little tiny bit of
| business processing.
|
| If one divides their "things" up into logical business units of
| "things that must all happen, or none happen" you end up with a
| pretty minimalistic architecture that's easy to understand but
| also offers retry capabilities if a particular system errors out
| on a single message.
|
| It also allows you to take SystemB offline and let its work pile
| up, then resume it later. Or you can kick off arbitrary events to
| test parts of the system.
|
| _1: although if this didn't happen, say during a database failure
| at just the right time, the right usage of row locking,
| transactions, and indexes on the database prevent duplicates.
| This is so rare in practice but we protect against it anyway.
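The duplicate protection mentioned in the footnote can be sketched with a unique key on the message id. This uses SQLite from the Python standard library purely as a stand-in; the table names and the `handle` function are assumptions, and the XA coordination with the broker is not shown.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE work_items (message_id TEXT, payload TEXT)")
db.commit()


def handle(message_id: str, payload: str) -> bool:
    """Process a message at most once.

    The dedupe row and the business write share one transaction, so a
    redelivered message hits the primary key and the whole unit rolls back.
    """
    try:
        with db:  # commits on success, rolls back on exception
            db.execute("INSERT INTO processed VALUES (?)", (message_id,))
            db.execute("INSERT INTO work_items VALUES (?, ?)",
                       (message_id, payload))
        return True
    except sqlite3.IntegrityError:
        return False


first_delivery = handle("m1", "do the thing")
redelivery = handle("m1", "do the thing")
work_count = db.execute("SELECT COUNT(*) FROM work_items").fetchone()[0]
```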
| kaba0 wrote:
| I am going on a bit of a tangent here, but I always wondered:
| those of you who use absolutely huge event-driven architectures,
| have you ever got yourself into a loop? I can't help but worry
| about that, as event systems are fundamentally Turing-complete,
| and with a complex enough system it doesn't seem too hard to
| accidentally send an event because of A which, multiple other
| events later, eventually causes A again.
|
| Is it a common occurrence, and if it happens, is it hard to
| debug/fix? Do Kafka and other popular event systems have
| something to defend against it?
| lutzh wrote:
| Reg. the "technical" question: Kafka or any log-based message
| broker (or any message queue) would not prevent you from that.
| Any service can publish/send and/or subscribe/receive.
|
| Regarding if it's a problem or a regular occurrence: No, really
| not. I have never seen this being a problem, I think that fear
| is unfounded.
| dkarl wrote:
| Events are published observations of facts. If you want to be
| able to use them as triggers, or to build state, then you have to
| choose the events and design the system to make that possible,
| but you always have to ensure that systems only publish facts
| that they have observed.
|
| Most issues I've seen with events are caused by giving events
| imprecise names, names that mean more or less than what the
| events attest to.
|
| For example, a UI should not emit a SetCreditLimitToAGazillion
| event because of a user interaction. Downstream programmers are
| likely to get confused and think that the state of the user's
| credit limit has been set to a gazillion, or needs to be set to a
| gazillion. Instead, the event should be
| UserRequestedCreditLimitSetToAGazillion. That accurately
| describes what the UI observed and is attesting to, and it is
| more likely to be interpreted correctly by downstream systems.
|
| In the article's example, SeatSelected sounds ambiguous to me.
| Does it only mean the user saw that the seat was available and
| attempted to reserve it? Or does it mean that the system has
| successfully reserved the seat for that passenger? Is the update
| finalized, or is the user partway through a multistep process
| that they might cancel before confirming? Depending on the
| answer, we might need to release the user's prior seat for other
| passengers, or we might need to reserve both seats for a few
| minutes, pending a confirmation of the change or a timeout of
| their hold on the new seat. The state of the reservation may or
| may not need to be updated. (There's nothing wrong with using a
| name like that in a toy example like the article does, but I want
| to make the point that event names in real systems need to be
| much more precise.)
|
| Naming events accurately is the best protection against a
| downstream programmer misinterpreting them. But you still need to
| design the system and the events to make sure they can be used as
| intended, both for triggering behavior and for reporting the
| state and history of the system. You don't get anything
| automatically. You can't design a set of events for triggering
| behavior and expect that you'll be able to tell the state of the
| system from them, or vice-versa.
| lutzh wrote:
| Thanks for your feedback! Very good point on the naming. fwiw
| the idea was if you buy a cinema ticket, you are usually
| presented with some sort of seating plan and select the seat
| (basically putting them into the shopping cart). So
| SeatSelected would be the equivalent of "ItemAdded" to the
| shopping cart in an e-commerce application I guess. Please
| don't over-interpret the example. There isn't even a definition of
| what that booking aggregate contains. The point was really only
| to say that in order to differentiate the events, we don't
| necessarily need distinct types (which would result in multiple
| schemas on a topic), but can instead encode it in one
| type/schema. Think of it like mapping in ORM - instead of
| "table per subclass", you can use "table per class hierarchy".
___________________________________________________________________
(page generated 2024-11-01 23:00 UTC)