         _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
       
       
       COMMENT PAGE FOR:
 (HTM)   Using PostgreSQL as a Dead Letter Queue for Event-Driven Systems
       
       
        jeeybee wrote 2 hours 56 min ago:
        I maintain a small Postgres-native job queue for Python called
        PGQueuer: [1] It uses the same core primitives people are discussing
        here (FOR UPDATE SKIP LOCKED for claiming work; LISTEN/NOTIFY to wake
        workers), plus priorities, scheduled jobs, retries,
        heartbeats/visibility timeouts, and SQL-friendly observability. If
        you’re already on Postgres and want a pragmatic “just use
        Postgres” queue, it might be a useful reference / drop-in.
        
 (HTM)  [1]: https://github.com/janbjorge/pgqueuer
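        
         For anyone curious what that claim step looks like in raw SQL, here is
         a minimal sketch (generic table/column names, not PGQueuer's actual
         schema):
        
           -- atomically claim one pending job; concurrent workers skip rows
           -- that another transaction has already locked
           UPDATE jobs
              SET status = 'processing', started_at = now()
            WHERE id = (
                    SELECT id
                      FROM jobs
                     WHERE status = 'pending'
                     ORDER BY id
                     LIMIT 1
                       FOR UPDATE SKIP LOCKED
                  )
            RETURNING *;
        
           -- enqueuers (or a trigger) can wake sleeping workers with:
           NOTIFY job_queue;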
       
        nottorp wrote 5 hours 18 min ago:
         Hmm, that raises a question for me.
         
         I haven't done a project that uses a database (be it SQL or NoSQL)
         where the number of deletes is comparable to the number of inserts
         (and far larger than, say, tens per day).
         
         How does your average DB server handle that, performance-wise?
         Intuitively I'd think it's optimized more for inserts than for
         deletes, but of course I may be wrong.
       
        Andys wrote 6 hours 1 min ago:
         We did this at Chargify, but with MySQL. If Redis was unavailable, it
         would dump the job as a JSON blob to a MySQL table. A cron job would
         periodically clean it out by re-enqueuing the jobs, and it worked well.
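        
         A rough sketch of that fallback pattern (my own guess at the shape,
         not Chargify's actual schema):
        
           -- fallback table for jobs that couldn't reach Redis
           CREATE TABLE fallback_jobs (
             id         BIGINT AUTO_INCREMENT PRIMARY KEY,
             payload    JSON      NOT NULL,  -- the serialized job
             created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
           );
        
           -- the cron job grabs a batch, pushes each job back to Redis in
           -- app code, then deletes the rows it successfully re-enqueued
           SELECT id, payload FROM fallback_jobs ORDER BY id LIMIT 100;
           DELETE FROM fallback_jobs WHERE id = 42;  -- per re-enqueued row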
       
        cmgriffing wrote 7 hours 57 min ago:
         Only slightly related, but I have been using Oban as a Postgres-native
         message queue in the Elixir ecosystem and loving it.
         For my use case, it’s so much simpler than spinning up another piece
         of infrastructure like Kafka or RabbitMQ.
       
        branko_d wrote 10 hours 28 min ago:
         Why use a string as the status instead of a boolean? That just wastes
         space for no discernible benefit, especially since the status is
         indexed. Also, consider turning event_type into an integer if
         possible, for similar reasons.
        
        Furthermore, why have two indexes with the same leading field (status)?
       
          hans_castorp wrote 1 hour 55 min ago:
          Postgres does index de-duplication. So it's likely that even if you
          change the strings to enums, the index won't be that much smaller.
          
          > Furthermore, why have two indexes with the same leading field
          (status)?
          
          That indeed is a valid question.
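          
           For reference, switching to an enum is a one-off migration, roughly
           like this (type and column names invented for the example, and it
           assumes every existing string value matches an enum label):
          
             CREATE TYPE dlq_status AS ENUM
               ('pending', 'processing', 'retrying', 'failed', 'resolved');
          
             ALTER TABLE dead_letters
               ALTER COLUMN status TYPE dlq_status
               USING status::dlq_status;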
       
          storystarling wrote 4 hours 59 min ago:
          Boolean is rarely enough for real production workloads. You need a
          'processing' state to handle visibility timeouts and prevent
          double-execution, especially if tasks take more than a few
          milliseconds. I also find it crucial to distinguish between
          'retrying' for transient errors and 'failed' for dead letters. Saving
          a few bytes on the index isn't worth losing that observability.
       
            branko_d wrote 36 min ago:
            > Boolean is rarely enough for real production workloads. You need
            a 'processing' ... 'retrying'... 'failed' ...
            
             If you have more than two states, then just use an integer instead
             of a boolean.
            
            > Saving a few bytes on the index isn't worth losing that
            observability.
            
            Not sure why having a few well-known string values is more
            "observable" than having a few well-known integer values.
            
            Also, it might be worth having better write performance. When
            PostgreSQL updates a row, it actually creates a new physical row
            version (for MVCC), so the less it has to copy the better.
       
        tantalor wrote 11 hours 58 min ago:
        This is logging.
       
          bdangubic wrote 11 hours 55 min ago:
           Care to elaborate? I do not understand how this is logging; it is
           quite the opposite of logging, since once the retry works the DLQ
           gets wiped out. I would assume you would want logging to be
           persistent, with at least a little bit of retention?
       
        with wrote 15 hours 1 min ago:
         Great application of first principles. I think it's totally
         reasonable, even at most production loads. (Example: my last workplace
         had a service that constantly roared at 30k events per second, and our
         DLQs would at most have on the order of hundreds of messages in them.)
         We would get paged if a message sat in the queue for more than an
         hour.
         
         The idea is that if your DLQ has consistently high volume, there is
         something wrong with your upstream data or data-handling logic, not
         the architecture.
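        
         (If the DLQ is a Postgres table, as in the article, that age check is
         just a query to hang an alert off; a rough example with an invented
         schema:)
        
           -- oldest unresolved dead letter plus backlog size, for the
           -- "page if older than an hour" alert
           SELECT max(now() - created_at) AS oldest_age,
                  count(*)                AS backlog
             FROM dead_letters
            WHERE status <> 'resolved';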
       
          microlatency wrote 3 hours 30 min ago:
          What did you use for the DLQ monitoring? And how did you fix the
          issues?
       
        awesome_dude wrote 15 hours 51 min ago:
        I think that using Postgres as the message/event broker is valid, and
        having a DLQ on that Postgres system is also valid, and usable.
        
        Having SEPARATE DLQ and Event/Message broker systems is not (IMO) valid
        - because a new point of failure is being introduced into the
        architecture.
       
        shoo wrote 17 hours 11 min ago:
         re: SKIP LOCKED, introduced in Postgres 9.5: here's an archived copy
         [†] of the excellent 2016 2ndQuadrant post discussing it [1], and the
         corresponding HN discussion thread from 2016 [2].
         
         [†] It seems that all the old 2ndquadrant.com blog post links have
         been broken since their acquisition by EnterpriseDB.
        
 (HTM)  [1]: https://web.archive.org/web/20240309030618/https://www.2ndquad...
 (HTM)  [2]: https://news.ycombinator.com/item?id=14676859
       
          upmostly wrote 16 hours 52 min ago:
          We just published a detailed walkthrough of this exact pattern with
          concrete examples and failure modes:
          
           PostgreSQL FOR UPDATE SKIP LOCKED: The One-Liner Job Queue [1]
           
           It covers the race condition, the atomic claim behaviour, worker
           crashes, and how priorities and retries are usually layered on top.
           It's very much the same approach described in the old 2ndQuadrant
           post, but with a modern end-to-end example.
          
 (HTM)    [1]: https://www.dbpro.app/blog/postgresql-skip-locked
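          
           For anyone who wants the shape of it, the priorities/retries
           layering usually ends up looking roughly like this (schema invented
           for illustration, not the exact query from the post):
          
             -- claim one job, highest priority first, honouring retry backoff
             UPDATE jobs
                SET status = 'processing', attempts = attempts + 1
              WHERE id = (
                      SELECT id
                        FROM jobs
                       WHERE status = 'pending'
                         AND run_after <= now()  -- backoff pushes run_after out
                       ORDER BY priority DESC, run_after
                       LIMIT 1
                         FOR UPDATE SKIP LOCKED
                    )
              RETURNING *;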
       
            victor106 wrote 16 hours 5 min ago:
             Love your product. Will you ever provide support for
             DuckDB/MotherDuck? I wish there were a generic way to add any
             database type.
       
              upmostly wrote 15 hours 31 min ago:
              Thanks, glad you like it.
              
              DuckDB is on our radar. In practice each database still needs
              some engine-specific work to feel good, so a fully generic plugin
              system is harder than it sounds. We are thinking about how to do
              this in a scalable way.
       
        kristov wrote 17 hours 59 min ago:
         Why use ShedLock and SELECT ... FOR UPDATE SKIP LOCKED together?
         ShedLock stops things running in parallel (sort of), but the other
         makes parallel processing possible.
       
        tonymet wrote 18 hours 5 min ago:
         Postgres is essentially a B-tree with a remote interface. Would you
         use a B-tree to store a dead letter queue? What is the big O of insert
         and delete? What happens when it grows?
        
        Postgres has a query interface, replication, backup and many other
        great utilities.  And it’s well supported, so it will work for
        low-demand applications.
        
         Regardless, you’re using the wrong data structure with the wrong
         performance profile, and at the margins you will spend a lot more
         money and time than necessary running it, and service will suffer.
       
          quibono wrote 14 hours 34 min ago:
          What would you use?
       
        gytisgreitai wrote 18 hours 45 min ago:
        Would be interesting to see the numbers this system processes. My bet
        is that they are not that high.
       
        cpursley wrote 18 hours 55 min ago:
        
        
 (HTM)  [1]: https://github.com/pgmq/pgmq
       
          nextaccountic wrote 5 hours 54 min ago:
           Indeed, pgmq is exactly the Postgres queueing system that you would
           build from scratch (FOR UPDATE SKIP LOCKED and all that), except
           it's already built. Cloud providers should install this extension by
           default - it's in a really sweet spot for when you don't want or
           need a separate queue.
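          
           For anyone who hasn't tried it, the API is just SQL functions;
           roughly along these lines (from memory - check the pgmq docs for the
           exact signatures):
          
             SELECT pgmq.create('orders_dlq');               -- create a queue
             SELECT pgmq.send('orders_dlq', '{"id": 1}');    -- enqueue a message
             SELECT * FROM pgmq.read('orders_dlq', 30, 10);  -- read up to 10,
                                                             -- 30s visibility
             SELECT pgmq.archive('orders_dlq', 1);           -- keep for audit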
       
        nicoritschel wrote 19 hours 4 min ago:
        lol a FOR UPDATE SKIP LOCKED post hits the HN homepage every few months
        it feels like
       
          whateveracct wrote 19 hours 0 min ago:
          and another CTO will use this meme as a reason to "just use Postgres"
          for far longer than they should lmao
       
            throw_away_623 wrote 14 hours 21 min ago:
            I’ll take “just use Postgres” over “prematurely add three
            new systems” any day. Complexity has a cost too.
            
             Using Postgres too long is probably less harmful than adding
             unnecessary complexity too early.
       
              whateveracct wrote 14 hours 1 min ago:
              It probably is, but I don't like to operate as if I will
              inevitably make giant mistakes. Sometimes there isn't a trade off
              - you can just be good lolol.
              
              Both are pretty bad.
       
        TexanFeller wrote 19 hours 55 min ago:
         Ofc I wouldn't use it for extremely high-scale event processing, but
         it's a great default for a message/task queue for 90% of business
         apps. If you're processing under a few hundred million events/tasks
         per day with fewer than ~10k concurrent processes dequeuing from it,
         it's what I'd default to.
        
         I work on apps that use such a PG-based queue system, and it provides
         indispensable features for us that we couldn't achieve easily/cleanly
         with a normal queue system, such as being able to dynamically adjust
         the priority/order of tasks being processed and easily query/report on
         the content of the queue. We have many other interesting features
         built into it that are more specific to our needs as well, which I'm
         more hesitant to describe in detail here.
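        
         (Both of those are one-liners once the queue is a table; rough
         examples against an invented schema, not our actual one:)
        
           -- bump one customer's pending tasks to the front of the line
           UPDATE tasks SET priority = 100
            WHERE status = 'pending' AND customer_id = 42;
        
           -- report on what's sitting in the queue right now
           SELECT task_type, status, count(*) AS n, min(created_at) AS oldest
             FROM tasks
            GROUP BY task_type, status
            ORDER BY n DESC;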
       
          j45 wrote 17 hours 0 min ago:
           Very few things start at extremely high-scale event processing.
           
           There’s also an order of magnitude more events when doing
           event-based processing.
           
           This seems like a perfectly reasonable starting point and gateway
           that keeps things organized for when the time comes.
           
           Most things don’t scale that big.
       
            gytisgreitai wrote 3 hours 18 min ago:
             So perhaps don’t use Kafka at all? E.g. Adyen used PostgreSQL [1]
             as a queue until they outgrew it.
             In this case it seems there are a lot of things that can go south
             in the case of a major issue on the event pipeline. Unless the
             throughput is low... but then why Kafka?
            
 (HTM)      [1]: https://www.adyen.com/knowledge-hub/design-to-duty-adyen-a...
       
        renewiltord wrote 20 hours 21 min ago:
         Segment uses MySQL as a queue, not even just as a DLQ. It works at
         their scale. So there are many (not all) systems that can tolerate
         this as a queue.
         
         I have a simple flow: tasks on the order of thousands an hour. I just
         use PostgreSQL. High visibility, easy requeue, durable store. With an
         appropriate index, it’s perfectly fine. An LLM will write the SKIP
         LOCKED code right the first time. Easy local dev. I always reach for
         Postgres for the event bus in a low-volume system.
       
        reactordev wrote 20 hours 27 min ago:
        Another day, another “Using PostgreSQL for…” thing it wasn’t
        designed for. This isn’t a good idea. What happens when the queue
        goes down and all messages are dead lettered? What happens when you end
        up with competing messages? This is not the way.
       
          tlb wrote 1 hour 12 min ago:
          I think the PG designers would be surprised by the claim that it
          wasn't designed for this. Database designers try very hard to support
          the widest possible range of uses.
          
          If all queue actions are failing instantly, you probably want a
          separate throttle to not remove them from the Kafka queue, since
          you'd rather keep them there and resume processing them normally
          instead of from the DLQ when queue processing is working again. In
          fact, the rate limit implicitly enforced by adding failure records to
          the DLQ helps with this.
       
          trympet wrote 17 hours 2 min ago:
          I prefer using MS Exchange mailboxes for my message queue.
       
          direwolf20 wrote 19 hours 15 min ago:
          The other system you're using that isn't Postgres can also go down.
          
           Many developers overcomplicate systems. In the pursuit of 100%
           uptime, if you're not extremely careful, you remove more 9s with
           complexity than you add with redundancy. And although hyperscalers
           pride themselves on their uptime (Amazon even achieved three nines
           last year!), in reality most customers of most businesses are fine
           if your system is down for ten minutes a month. It's not ideal and
           you should probably fix that, but it's not catastrophic either.
       
            reactordev wrote 16 hours 9 min ago:
            >The other system you're using that isn't Postgres can also go
            down.
            
            Only if DC gets nuked.
            
            Many developers overcomplicate systems and throw a database at the
            problem.
       
              direwolf20 wrote 13 hours 6 min ago:
              Which system is immune to all downtime except the DC getting
              nuked?
       
                reactordev wrote 12 hours 32 min ago:
                 Properly designed distributed systems.
                 
                 Challenge: design a fault-tolerant event-driven architecture.
                 Only rule: you aren’t allowed to use a database. At all. This
                 is actually an interview question at a top employer. Answer
                 this right and you get a salary that will change your life.
       
                  direwolf20 wrote 2 hours 29 min ago:
                  No, those go down all the time. AWS had three nines last
                  year. Bitcoin had the value overflow incident.
       
              mwigdahl wrote 14 hours 27 min ago:
              Wow, TIL there was an atomic attack on the capitol in October!
       
                reactordev wrote 14 hours 6 min ago:
                DC=Data Center
                
                DC!=Washington, DC
       
                  mwigdahl wrote 8 hours 58 min ago:
                  I wondered, but the lack of "the" before "DC" tipped me
                  toward interpreting it as the place name, especially as AWS
                  us-east-1 is in Northern Virginia.  Thanks for clarifying!
       
            hinkley wrote 17 hours 51 min ago:
            What I’ve found is that, particularly with internal customers,
            they’re fine with an hour a month, possibly several, as long as
            not all of your eggs are in one basket.
            
             Centralization pushes create a situation where, if I have a task
             that needs three tools to accomplish and one of them goes down,
             they’re all down. So all I can do is go for coffee or an early
             lunch, because I can’t sub another task into this time slot.
             They’re all blocked by The System being down, instead of a
             system being down.
            
            If CI is borked I can work on docs and catch up on emails. If the
            network is down or NAS is down and everything is on that NAS, then
            things are dire.
       
              plaguuuuuu wrote 11 hours 40 min ago:
              good luck doing anything if kafka is down though
       
          fcarraldo wrote 19 hours 38 min ago:
           There are a ton of job/queue systems out there that are based on SQL
           DBs. GoodJob and Supabase Queues are two examples.
           
           It’s not usable for high-scale processing, but most applications
           just need a simple queue with low depth and low complexity. If
           you’re already managing PostgreSQL and don’t want to add more
           management to your stack (and managed services aren’t an option),
           this pattern works just fine. Go back 10-15 years and it was more
           common, especially in Ruby shops, as teams willing to adopt
           Kafka/Cassandra/etc. were rarer.
       
            reactordev wrote 16 hours 12 min ago:
            And there are a ton that aren’t.
       
          senbrow wrote 19 hours 48 min ago:
          Criticism without a better solution is only so valuable.
          
          How would you do this instead, and why?
       
            reactordev wrote 16 hours 11 min ago:
            Watching a carpenter try to weld is equally only so valuable. I
            think the explanation is clear.
       
          hnguyen14 wrote 19 hours 54 min ago:
           How so? There are queues that use SQL (or NoSQL) databases as the
           persistence layer. Your question is more about the specific
           implementation, not the database-as-persistence-layer itself. And
           there are ways to address it.
       
          odie5533 wrote 20 hours 11 min ago:
          You wouldn't ack the message if you're not up to process it.
       
        exabrial wrote 20 hours 36 min ago:
        > FOR UPDATE SKIP LOCKED
        
         Learned something new today. I knew what FOR UPDATE did, but somehow
         I've never RTFM'd hard enough to know about the SKIP LOCKED directive.
         That's pretty cool.
       
          scresswell wrote 17 hours 18 min ago:
          Yes, SKIP LOCKED is great. In practice you nearly always want LIMIT,
          which the article did not mention. Be careful if your selection spans
          multiple tables: only the relations you explicitly lock are protected
          (see SELECT … FOR UPDATE OF t1, t2). ORDER BY matters because it
          controls fairness and retry behaviour. Also watch ANALYZE:
           autoanalyze only runs once the dead-to-live tuple threshold is
           crossed, and on large or append-heavy tables with lots of old rows
          this can lag, leading to poor plans and bad SKIP LOCKED performance.
          Finally, think about deletion and lifecycle: deleting on success,
          scheduled cleanup (consider pg_cron), or partitioning old data all
          help keep it efficient.
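          
           To make a couple of those points concrete (table and column names
           invented; adjust to your schema):
          
             -- batch claim: LIMIT bounds the batch, ORDER BY keeps it fair,
             -- and FOR UPDATE OF locks only the jobs rows, not job_types
             SELECT j.id, j.payload
               FROM jobs j
               JOIN job_types t ON t.id = j.type_id
              WHERE j.status = 'pending'
              ORDER BY j.created_at
              LIMIT 50
                FOR UPDATE OF j SKIP LOCKED;
          
             -- lifecycle: nightly cleanup of finished rows via pg_cron
             SELECT cron.schedule('purge-done-jobs', '0 3 * * *',
               $$DELETE FROM jobs
                  WHERE status = 'done'
                    AND finished_at < now() - interval '7 days'$$);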
       
            exabrial wrote 9 hours 54 min ago:
            I can see how that'd be extremely useful with LIMIT, especially
            with XA. Take a stride, complete it, or put it back for someone
            else.
            
            Something I've still not mastered is how to prevent lock escalation
            into table-locks, which could torpedo all of this.
       
          metanonsense wrote 18 hours 44 min ago:
           I only learned about SKIP LOCKED because ChatGPT suggested it to
           solve a concurrency problem I had. Great tool for learning such
           things.
       
            indigo945 wrote 18 hours 20 min ago:
            Great tool that wrote the blog post in the OP also, so it's quite
            versatile.
       
        rbranson wrote 21 hours 8 min ago:
         The biggest thing to watch out for with this approach is that you will
         inevitably have some failure or bug that will 10x, 100x, or 1000x the
         rate of dead messages, and that will overload your DLQ database. You
         need a circuit breaker or rate limit on it.
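        
         (Even a crude guard on the insert side helps as a safety valve;
         something like this, with an invented schema - it races under
         concurrency and wants an index on created_at, but it caps the damage:)
        
           -- refuse DLQ writes once the last minute already holds 10k rows
           INSERT INTO dead_letters (event_id, payload, error)
           SELECT 'evt-123', '{"k": "v"}', 'timeout calling billing API'
            WHERE (SELECT count(*)
                     FROM dead_letters
                    WHERE created_at > now() - interval '1 minute') < 10000;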
       
          plaguuuuuu wrote 12 hours 15 min ago:
           Could one put the DLQ messages on a queue and have a consumer ingest
           them into PG?
           
           (The queue probably isn't down if you've just pulled a message off
           it.)
       
          with wrote 15 hours 7 min ago:
          This is the same risk with any DLQ.
          
          The idea behind a DLQ is it will retry (with some backoff)
          eventually, and if it fails enough, it will stay there. You need
          monitoring to observe the messages that can't escape DLQ. Ideally,
          nothing should ever stay in DLQ, and if it does, it's something that
          should be fixed.
       
            microlatency wrote 3 hours 33 min ago:
            What do you use for the monitoring of DLQs?
       
          j45 wrote 16 hours 59 min ago:
           It will happen eventually in any system.
           
           No need to look down on PG because it makes this more approachable
           and no longer a specialized skill.
       
          rr808 wrote 19 hours 48 min ago:
          I worked on an app that sent an internal email with stack trace
          whenever an unhandled exception occurred. Worked great until the day
          when there was an OOM in a tight loop on a box in Asia that sent a
          few hundred emails per second and saturated the company WAN backbone
          and mailboxes of the whole team. Good times.
       
          pletnes wrote 20 hours 46 min ago:
          If you can’t deliver to the DLQ, then what? Then you’re missing
          messages either way. Who cares if it’s down this way or the other?
       
            RedShift1 wrote 20 hours 35 min ago:
            The point is to not take the whole server down with it. Keeps the
            other applications working.
       
            xyzzy_plugh wrote 20 hours 39 min ago:
            Not necessarily. If you can't deliver the message somewhere you
            don't ACK it, and the sender can choose what to do (retry, backoff,
            etc.)
            
            Sure, it's unavailability of course, but it's not data loss.
       
              konart wrote 20 hours 14 min ago:
                 If you are reading from Kafka (for example) and you can't do
                 anything with a message (broken JSON, say) and you can't put
                 it into a DLQ, you have no other option but to skip it or stop
                 on it, no?
       
                awesome_dude wrote 16 hours 24 min ago:
                 Sorry, but what's stopping the DLQ from being a different
                 topic on that Kafka cluster? I get that the consumer(s) might
                 be dead, preventing them from moving the message to the DLQ
                 topic, but if that's the case then no messages are being
                 consumed at all.
                 
                 If the problem is that the consumers themselves cannot write
                 to the DLQ, then that feels like either Kafka is dying (no
                 more writes allowed) or the consumers have been misconfigured.
                 
                 Edit: In fact there seems to be a self-inflicted problem being
                 created here - having the DLQ on a different system, whether
                 it be another instance of Kafka, or Postgres, or what have
                 you, is really just creating another point of failure.
       
                  majormajor wrote 12 hours 19 min ago:
                  > Edit: In fact there seems to be a self inflicted problem
                  being created here - having the DLQ on a different system,
                  whether it be another instance of Kafka, or Postgres, or what
                  have you, is really just creating another point of failure.
                  
                   There's a balance. Do you want to have your Kafka cluster
                   provisioned for double your normal event intake rate just in
                   case you hit the worst-case failure to produce elsewhere
                   that causes 100% of events to get DLQ'd? (Since now you've
                   doubled your writes to the shared cluster, which could cause
                   failures to produce to the original topic.)
                  
                  In that sort of system, failing to produce to the original
                  topic is probably what you want to avoid most. If your
                  retention period isn't shorter than your time to recover from
                  an incident like that, then priority 1 is often "make sure
                  the events are recorded so they can be processed later."
                  
                  IMO a good architecture here cleanly separates transient
                  failures (don't DLQ; retry with backoff, don't advance
                  consumer group) from "permanently cannot process" (DLQ only
                  these), unlike in the linked article. That greatly reduces
                  the odds of "everything is being DLQ'd!" causing cascading
                  failures from overloading seldom-stressed parts of the
                  system. Makes it much easier to keep your DLQ in one place,
                  and you can solve some of the visibility problems from the
                  article from a consumer that puts summary info elsewhere or
                  such. There's still a chance for a bug that results in
                  everything being wrongly rejected, but it makes you
                  potentially much more robust against transient downstream
                  deps having a high blast radius. (One nasty case here is if
                  different messages have wildly different sets of downstream
                  deps, do you want some blocking all the others then? IMO they
                  should then be partitioned in a way so that you can still
                  move forward on the others.)
       
                    awesome_dude wrote 11 hours 36 min ago:
                     I think you're right that if the DLQ is overused, it
                     potentially cripples the whole event broker, but I don't
                     think that having a second system that could fall over for
                     the same reason AND a host of other reasons is a good
                     plan. FTR I think doubling Kafka's provisioned capacity is
                     a simpler, easier, cheaper, and more reliable approach.
                     
                     BUT, you are 100% right to point to what I think is the
                     proper solution, and that is to treat the DLQ with some
                     respect, not as a bit bucket where things get dumped
                     because the wind isn't blowing in the right direction.
       
                Misdicorl wrote 18 hours 56 min ago:
                 Your place of last resort with Kafka is simply to replay the
                 message back to the same Kafka topic, since you know it's up.
                 In a simple single-consumer setup, just throw a retry count on
                 the message and increment it to get monitoring/alerting/etc.
                 Multi-consumer? Put an enqueue-source tag on the message and
                 only process the messages tagged for you. This won't scale to
                 infinity, but it scales really, really far for really, really
                 cheap.
       
                singron wrote 19 hours 51 min ago:
                Generally yes, but if you use e.g. the parallel consumer, you
                can potentially keep processing in that partition to avoid
                head-of-line blocking. There are some downsides to having a
                very old unprocessed record since it won't advance the consumer
                group's offset past that record, and it instead keeps track of
                the individual offsets it has completed beyond it, so you don't
                want to be in that state indefinitely, but you hope your DLQ
                eventually succeeds.
                
                But if your DLQ is overloaded, you probably want to slow down
                or stop since sending a large fraction of your traffic to DLQ
                is counter productive. E.g. if you are sending 100% of messages
                to DLQ due to a bug, you should stop processing, fix the bug,
                and then resume from your normal queue.
       
            rbranson wrote 20 hours 40 min ago:
            Sure, but you still need to design around this problem. It’s
            going to be a happy accident that everything turns out fine if you
            don’t.
       
          shayonj wrote 20 hours 48 min ago:
           This! The only thing worse than your main queue backing up is
           dropping items on their way into the DLQ because it can’t stay up.
       
       