[HN Gopher] Versioned finite-state machines with Postgres (2019)
       ___________________________________________________________________
        
       Versioned finite-state machines with Postgres (2019)
        
       Author : mirzap
       Score  : 84 points
       Date   : 2024-07-25 08:40 UTC (14 hours ago)
        
 (HTM) web link (raphael.medaer.me)
 (TXT) w3m dump (raphael.medaer.me)
        
       | onion-soup wrote:
       | Too much alilen syntax, why complicate things when I have perfect
       | and flexible NoSQL?
        
         | maxbond wrote:
         | Flexibility and constraint are both useful. (Constraint is one
         | of the reasons to use a finite state machine in the first
         | place.) An overly constrained system can't function, an overly
         | flexible system can't be relied upon. You'll answer your
         | question the day you encounter a problem that NoSQL is a bad
         | fit for, and after you learn SQL it won't feel so alien or
         | complicated. Or maybe that day will never come and you'll never
         | feel the need to learn SQL.
         | 
         | For a motivating example, consider a billing or accounting
         | system. This domain is well known, we probably don't need the
         | ability to rapidly evolve our schema. If we violate the
         | constraints of this system, there may be serious consequences,
         | like spending money we don't have or billing for the service
         | twice.
         | 
         | We could build this system with either an NoSQL or a SQL
         | database, but SQL would seem to me to be the natural choice.
        
           | cricketlover wrote:
           | I understand what you're saying, but in the example for
           | accounting, we can also solve this problem using NoSQL.
           | Because the most important feature we're talking about there
           | is transaction support. Similarly, schema-on-write can be
           | provided by a library.
           | 
           | To me it seems like NoSQL works better when there is less to
           | normalize, which is the case with microservices. Those
           | services struggle with the support for a distributed
           | transaction when they have to make a distributed transaction.
           | This problem will be very easily solved in SQL (assuming its
           | not shared to completely denormalize everything for
           | performance).
           | 
           | Note that this normalization problem also shows up in schema-
           | on-write. If multiple people are contributing to a schema
           | from different teams, then it will become hard to maintain a
           | schema-on-write.
        
         | pratio wrote:
         | NoSQL will bring it's own set of issues right? Too flexible of
         | a system with no built in Validations would mean that they need
         | to be handled somewhere else. If we take the example in the
         | post, refund should not precede awaiting payment, If a new
         | status gets added, it becomes easy to know where the migrations
         | have to be run and in NoSQL, either we write something custom
         | or handle it each time the document is called.
        
           | mtkd wrote:
           | Surely that logic should not sit in persistance layer though?
        
         | klysm wrote:
         | Alien syntax is not a valid critique
        
         | mjhay wrote:
         | Looks like normal SQL syntax to me. That's not alien to most
         | practitioners.
        
       | jpnc wrote:
       | One thing I'm missing when modeling FSM like this is different
       | states having different set of constraints, even if only being
       | concerned with nullity. It's a shame having to make the field
       | optional just because you do not have the appropriate value in
       | the initial state of the entity.
        
       | aargh_aargh wrote:
       | For any use-case I can think of, which are all "small data"
       | (under 1B rows), I'd rather use DB migrations to take care of
       | this problem rather than forcing myself to live with my previous
       | design decisions indefinitely.
       | 
       | I am attracted to clever but in practice, simple always beats
       | clever.
        
         | ragebol wrote:
         | Wouldn't that risk some FSM getting modified while it's
         | 'running'? That may or may not be desirable I guess. Eg. a
         | customer not knowing that the process changed since they placed
         | an order and not understanding the process anymore
        
       | klysm wrote:
       | I definitely like materializing the transitions into a table
       | instead of the switch statement
        
       | zinclozenge wrote:
       | I always wondered about this, a while ago I saw a library similar
       | to this that modeled FSMs in Postgres, and had a comment saying
       | "todo: cycle detection". And made me wonder if it's even possible
       | to do this.
        
         | wchargin wrote:
         | Hmm... consider the following. Your FSM is acyclic iff you can
         | assign each state an integer _depth_ such that a state at depth
         | _d_ only ever transitions to states at depths strictly greater
         | than _d_. So consider the following tables:
         | CREATE TABLE states (             state order_state PRIMARY
         | KEY,             depth int NOT NULL,             UNIQUE(state,
         | depth)         );         CREATE TABLE transitions (
         | start_state order_state NOT NULL,             event order_event
         | NOT NULL,             end_state order_state NOT NULL,
         | start_depth int NOT NULL,             end_depth int NOT NULL,
         | CONSTRAINT transitions_start_depth_correct
         | FOREIGN KEY(start_state, start_depth) REFERENCES states(state,
         | depth)                 ON UPDATE CASCADE,
         | CONSTRAINT transitions_end_depth_correct
         | FOREIGN KEY(end_state, end_depth) REFERENCES states(state,
         | depth)                 ON UPDATE CASCADE,
         | CONSTRAINT transitions_depth_increases CHECK(end_depth >
         | start_depth),             PRIMARY KEY(start_state, event)
         | );
         | 
         | Let's bang on it for a quick test. You can define a state
         | machine; here's one that roughly matches the regex `^(AB|BA)$`
         | (I know I'm being a bit sloppy):                   fsm=> CREATE
         | DOMAIN order_state AS text;         CREATE DOMAIN         fsm=>
         | CREATE DOMAIN order_event AS text;         CREATE DOMAIN
         | fsm=> -- "CREATE TABLE"s as above, then:         fsm=> INSERT
         | INTO states(state, depth) VALUES('start', 1), ('a', 2), ('b',
         | 2), ('done', 3), ('error', 3);         INSERT 0 5         fsm=>
         | INSERT INTO transitions(start_state, event, end_state,
         | start_depth, end_depth) VALUES('start', 'A', 'a', 1, 2),
         | ('start', 'B', 'b', 1, 2), ('a', 'B', 'done', 2, 3), ('b', 'A',
         | 'done', 2, 3), ('a', 'A', 'error', 2, 3), ('b', 'B', 'error',
         | 2, 3);         INSERT 0 6
         | 
         | And, as you need to modify it, you can increase a node's depth
         | to make room for intervening nodes:                   fsm=>
         | UPDATE states SET depth = 4 WHERE state = 'error';
         | UPDATE 1         fsm=> TABLE transitions;          start_state
         | | event | end_state | start_depth | end_depth
         | -------------+-------+-----------+-------------+-----------
         | start       | A     | a         |           1 |         2
         | start       | B     | b         |           1 |         2
         | a           | B     | done      |           2 |         3
         | b           | A     | done      |           2 |         3
         | a           | A     | error     |           2 |         4
         | b           | B     | error     |           2 |         4
         | (6 rows)
         | 
         | But you can't decrease a node's depth too far:
         | fsm=> UPDATE states SET depth = 2 WHERE state = 'error';
         | ERROR:  new row for relation "transitions" violates check
         | constraint "transitions_depth_increases"         DETAIL:
         | Failing row contains (a, A, error, 2, 2).         CONTEXT:  SQL
         | statement "UPDATE ONLY "public"."transitions" SET "end_state" =
         | $1, "end_depth" = $2 WHERE $3::pg_catalog.text
         | OPERATOR(pg_catalog.=) "end_state"::pg_catalog.text AND $4
         | OPERATOR(pg_catalog.=) "end_depth""
         | 
         | And you can't introduce transitions that don't increase depth:
         | fsm=> INSERT INTO transitions(start_state, event, end_state,
         | start_depth, end_depth) VALUES('done', 'AGAIN!', 'start', 3,
         | 1);         ERROR:  new row for relation "transitions" violates
         | check constraint "transitions_depth_increases"         DETAIL:
         | Failing row contains (done, AGAIN!, start, 3, 1).
         | 
         | Now, I don't know that I would immediately recommend this for
         | high-throughput production use. You're storing "unnecessary"
         | state not once but many times (each state's depth appears `1 +
         | \deg(v)` times), plus additional indices and lookups. But I do
         | think it meets the desired consistency goals!
        
       | michelpp wrote:
       | This is a nice blog and I like the versioning feature, if you're
       | looking to explore this idea more I made a very simple
       | unversioned FSM extension for Postgres years ago for fun and no
       | profit:
       | 
       | https://github.com/michelp/pgfsm
        
       ___________________________________________________________________
       (page generated 2024-07-25 23:08 UTC)