[HN Gopher] Postgres as queue
       ___________________________________________________________________
        
       Postgres as queue
        
       Author : leontrolski
       Score  : 191 points
       Date   : 2024-02-09 15:37 UTC (1 days ago)
        
 (HTM) web link (leontrolski.github.io)
 (TXT) w3m dump (leontrolski.github.io)
        
       | andrewstuart wrote:
       | Postgres isn't the only way to have fun.
       | 
       | Other things just as good at being a queue:
       | 
       | SQLite
       | 
       | SQL server
       | 
       | MySQL
       | 
       | Plain old files on the file system
        
         | bhaney wrote:
         | std::queue and a big swap file
        
         | cha42 wrote:
          | You can certainly encode a queue in many ways; mkfifo just
          | works. But integrating the queue into the DB isn't a bad idea
          | if you want the queue and the DB to share state.
          | 
          | I am happy that my queue inherits ACID properties.
          | 
          | SQLite simply doesn't allow concurrent writes, so it is a no-go
          | for a queue.
          | 
          | I don't know much about SQL Server and MySQL, but I wouldn't
          | favor a lock-in, closed-source product or anything remotely
          | connected to Oracle.
          | 
          | In the end, only PostgreSQL remains, I guess. Also, PostgreSQL
          | is super solid and the closest to the SQL standard.
        
           | andrewstuart wrote:
           | You're wrong that I'm wrong.
        
           | tarruda wrote:
           | > mkfifo just work
           | 
           | Are you certain about that?
        
         | tussa wrote:
          | Don't forget plain old Post-it notes.
        
           | geon wrote:
           | I thought those were write-only?
        
             | dkersten wrote:
             | Use a pencil so you can erase (or an erasable pen like the
             | frixion erasable pens)
        
             | tussa wrote:
             | I believe the right terminology is immutable.
        
         | crizzlenizzle wrote:
         | > Plain old files on the file system
         | 
         | ...and accessed over SFTP.
         | 
          | I worked for a company in the health industry, and one of the
          | labs we integrated refused to call an HTTPS endpoint whenever a
          | result was ready, so we had to poll every _n_ mins to fetch
          | results. That worked well until covid happened and there were
          | so many test results that we hit all sorts of issues, like
          | reading empty files (because they were still being written to
          | disk) and things of that nature.
        
       | adaboese wrote:
        | Postgres is extremely expensive to scale. Why on Earth would you
        | try to put a queue there?
        
         | kromem wrote:
         | The mantra about premature optimization applies to
         | infrastructure too.
        
         | valenterry wrote:
         | Maintaining extra infrastructure is expensive. Working around
         | missing ACID is expensive. Depending on how many messages we
         | are talking about, the cost of scaling postgres a bit more
         | might be much lower.
        
         | simonw wrote:
         | If you're running less than a million tasks a day through a
         | queue and you already have PostgreSQL in your stack why add
         | anything else?
        
           | adaboese wrote:
            | Just use a dedicated tool. It is not that hard. If you want a
            | higher-level abstraction, you have a whole spectrum of
            | next-gen queues, like Temporal, Trigger.dev, Inngest, Defer,
            | etc.
        
             | jitl wrote:
             | Personally the next-gen-ness of an infrastructure component
             | is inversely proportional to my trust in it.
        
               | bakchodi wrote:
               | Especially for something handling data. I want an old,
               | battle tested solution that won't disappear when the VC
               | capital dries up
        
               | simonw wrote:
               | Right, use boring technology!
               | 
               | https://boringtechnology.club/
        
             | simonw wrote:
             | Why use a dedicated tool if you have something in your
             | stack that can solve the problem already?
             | 
              | The fewer separate pieces of infrastructure you run, the
              | less likely something will break that you don't know how to
              | easily fix.
             | 
             | The article touched on this in the list of things to avoid
             | when it said "I estimate each line of terraform to be an
             | order of magnitude more risk/maintenance/faff than each
             | line of Python" and "The need for expertise in anything
             | beyond Python + Postgres"
        
           | CyberDildonics wrote:
           | Even a million tasks a day is less than 12 a second. Most
           | queues are going to have surges since that's part of the
           | point of a queue, but it's still a few orders of magnitude
           | away from what should overwhelm a database.
        
         | throwaway2990 wrote:
         | lol no it's not.
        
         | cwbriscoe wrote:
         | Any DB is fine as a queue depending on requirements, design and
         | usage.
        
         | CubsFan1060 wrote:
         | https://microservices.io/patterns/data/transactional-outbox....
         | 
          | It allows you to wrap it all in a transaction. If you separate
          | the database update from the insertion into a queue, whichever
          | happens second may fail while the first succeeds.
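          | 
          | A minimal sketch of an outbox-style write in Python/psycopg2,
          | assuming a hypothetical `orders` table and an `outbox` table
          | (names are illustrative, not from the linked pattern page):
          | 
          |     import psycopg2
          | 
          |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
          |     with conn:  # commits on success, rolls back on error
          |         with conn.cursor() as cur:
          |             # the business write and the queue/outbox insert
          |             # share one transaction: both happen or neither
          |             cur.execute(
          |                 "UPDATE orders SET status = 'paid' "
          |                 "WHERE id = %s", (42,))
          |             cur.execute(
          |                 "INSERT INTO outbox (topic, payload) "
          |                 "VALUES (%s, %s)",
          |                 ("order_paid", '{"order_id": 42}'))
          |     conn.close()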
        
           | illusive4080 wrote:
            | The alternative is a saga pattern to implement a distributed
            | two-phase commit.
            | 
            | Or actual XA, but that is cursed.
        
         | beeboobaa wrote:
         | because you're not expecting to have to scale beyond 1 instance
         | for the next few years & are already using postgres & now
         | everything is trivially transactional. KISS
        
         | wavemode wrote:
         | If the choice is between using a dedicated queue and postgres
         | for your data vs. using postgres for both, using postgres for
         | both makes perfect sense.
         | 
         | The scale at which you would outgrow using postgres for the
         | queue, you would also outgrow using postgres for the data.
        
           | halfcat wrote:
           | At what point does one outgrow Postgres for the data?
        
         | cha42 wrote:
         | If you have an infra that need to scale so much then Postgresql
         | isn't the right tool indeed. The right tools for your use case
         | probably doesn't even exists and you will have to build one.
         | 
         | It is not a mystery why all webscale companies endup designing
         | their own DB technology.
         | 
         | That being said, most of the DB in the wild are not remotely at
         | those scale. I have seen my share of Postgresql/ElasticSearch
         | combo to handle below TB data and just collapsing because of
         | the overeng of administrating two DB in one app.
        
         | alerighi wrote:
         | If you need scaling. Not all applications need scaling (e.g.
         | I'm doing an internal tool for a company that has 1000
         | employees, it's unlikely that from one day to another that
         | number of employee will double!), and for most applications a
         | single PSQL server either locally or in the cloud is enough.
        
       | dharmab wrote:
       | Postgres as a queue is one of the worst decisions I've seen made
       | and I and others I've worked with have spent years of our lives
       | unwinding the consequences.
        
         | dralley wrote:
         | Completely disagree for our use case. If your messages aren't
         | send-and-forget but rather represent work that needs to be
         | tracked, it is incredibly difficult to manage the state when
         | your database and queue are separate. Using postgresql as the
         | queue and leveraging the same transactions as your business
         | logic solves many many issues.
        
           | nostrebored wrote:
           | How is it incredibly difficult? I've seen this work very
           | well.
           | 
           | Is your code handling different varieties of event states
           | stored in your database?
           | 
           | Why implement this as a monolithic data source? Was it not
           | possible to separate different states by additional queues?
           | 
           | Curious to hear the decision making process here. Maybe I can
           | see it for low volume?
        
             | dralley wrote:
             | Admittedly we're not using it for anything high volume.
        
             | jvans wrote:
              | dralley is saying your queue push and database write aren't
              | transactional. You have to be ok with suffering some small
              | % of message loss or messages sent without a database
              | commit.
        
               | nostrebored wrote:
                | Depends on your queue semantics, i.e. is it just
                | at-least-once delivery or is there an ack back to the
                | queue service.
                | 
                | If there is a two-phase process then you just need
                | idempotence and can safely transact with the db.
        
               | jvans wrote:
               | It doesn't matter what your queue semantics are. If it's
               | not part of the transaction, delivery and push cannot be
               | guaranteed to be atomic.
        
             | arcbyte wrote:
              | At the end of the day, the ONLY way to reliably do this is
              | to hook into the database's native journal/log anyway.
              | Postgres gives you better primitives to work with than
              | installing Qlik, for example.
        
               | nostrebored wrote:
               | Why do you think so?
        
           | nurettin wrote:
            | As a person who has implemented worker farms at scale, I
            | don't understand which part is "incredibly difficult", or
            | what having a dedicated queue server prevents you from
            | updating database state within "the same transactions". If
            | your worker process has to update some kind of task state and
            | calculate some business logic, it can still do so whether you
            | use rmq, redis or whatever.
        
             | dkersten wrote:
             | I assume they mean the dual write problem
        
               | nurettin wrote:
               | So atomic updates are incredibly difficult?
        
         | canadiantim wrote:
         | years seems like hyperbole
        
         | mikercampbell wrote:
         | It's almost as if we were in an ordered line, all waiting to
         | learn the same thing, and had the lesson delivered no more than
         | once.
        
         | mistrial9 wrote:
         | saying so without context or requirements at all reduces the
         | effectiveness of the insight
        
           | dharmab wrote:
           | Sadly my work is subject to regulatory requirements that
           | prevent me from discussing specific details :(
        
             | sbuttgereit wrote:
             | Surely if your complaint is general enough to give such
             | advice to everyone here, there must be a way to express
             | generalized technical details of that complaint without
             | disclosing anything more privileged than you already have.
             | If that kind of abstraction and generalization isn't
             | possible, then it sounds to me like something specific to
             | your work's implementation of the ideas and not something
             | that applies broadly.
        
             | cqqxo4zV46cp wrote:
             | Oh, okay. It's great that you've let us all know that you
             | hold The Secret. Thank you.
        
         | wavemode wrote:
         | Care to elaborate?
        
       | nomdep wrote:
       | Why? Because memory is expensive and disk space is cheap
        
       | notnmeyer wrote:
       | > I'd estimate each line of terraform to be an order of magnitude
       | more risk/maintenance/faff than each line of Python.
       | 
        | ehhhhhhhh, i don't know about that. different strokes and all of
        | that, but TF describing infrequently changing resources requires
        | next to no maintenance. i'm not sure i understand what they mean
        | by "risk" in this context though.
        | 
        | if you don't know $technology well then anything you do with it
        | is likely to be worse/riskier than using the thing that you do
        | know.
       | 
       | > The need for expertise in anything beyond Python + Postgres.
       | 
       | this should be the first point because it explains 99% of the
       | motivation for all of this.
        
         | smashed wrote:
         | Yes, I think the point being made is that the infrastructure
         | described in terraform is a much more difficult thing to change
         | and evolve over time.
         | 
         | Changing a source file line and shipping a new version of the
         | app is much easier.
         | 
         | They are not really comparable things at all. Even though we
         | moved to infrastructure as code, that does not mean that the
         | infra code is like functional application code.
         | 
          | Come to think of it, I guess it was a damn good lie to sell
          | infrastructure as code as so easy it's like shipping a new app
          | version, when it's anything but...
        
         | benlivengood wrote:
         | Terraform providers and modules and the cloud resources evolve
         | fairly rapidly (especially anything kubernetes-related). If
         | you're not keeping up with the latest versions you can end up
         | somewhat stranded when a new resource feature needs to be used
         | and you discover unexpected tech debt from the world moving out
         | from under you.
        
         | Too wrote:
         | It's not that the TF code in itself is more complex, it's the
         | resources you spin up using TF that are. They are stateful,
         | expose network interfaces, require backups, updates, cert
         | renewals and all other kinds of operational burden.
        
       | jnsaff2 wrote:
       | In Elixir land Oban[0] uses Postgres as queue and seems to work
       | quite well.
       | 
       | [0] - https://github.com/sorentwo/oban
        
         | thibaut_barrere wrote:
          | Confirmed, it works fairly nicely (although we've had to be a
          | bit careful with very chatty job configurations with regard to
          | DB overload).
        
         | jpb0104 wrote:
         | In the world of Ruby, GoodJob [0] has been doing a _good job_
         | so far.
         | 
         | [0] - https://github.com/bensheldon/good_job
        
         | glenngillen wrote:
         | And in Go you've got River: https://riverqueue.com/
        
           | parthdesai wrote:
           | FWIW, River was directly inspired by Oban:
           | https://news.ycombinator.com/item?id=38351067
        
       | nemo44x wrote:
       | I'm still waiting for the "Postgres as a Web Browser" thread.
        
         | nurettin wrote:
         | Postgres rust rewrite will be a huge thread.
        
           | jesperwe wrote:
           | It has already started. https://docs.rs/pgx/latest/pgx/
           | 
            | Not hard to see a path ahead of us similar to Linux's gradual
            | steps towards Rust in the kernel.
        
             | tpetry wrote:
              | Postgres Rust extensions are a completely different topic
              | from rewriting PG in Rust. The problem is that a crash in
              | any extension will take down the full PG worker, so any
              | extension can wreak havoc on the robustness of your PG
              | database. But with a memory-safe language like Rust, which
              | forces you to handle all error cases, this can't happen
              | anymore.
        
         | RedShift1 wrote:
         | Our scientists are not stopping to think if they should.
        
       | falsandtru wrote:
       | FDB, a distributed DB, would be superior if complete order is not
       | required.
       | 
       | https://apple.github.io/foundationdb/queues.html
        
       | tomas789 wrote:
        | It was used in my previous job and it worked wonders. It was an
        | internal tool which stored lots of data but did not see high
        | traffic, and as such the load was not an issue. In the world of
        | that large corporate it was almost the only viable solution. The
        | DB already existed, so it could be used. Other solutions would
        | require "architecture" documents, security meetings and perhaps
        | even hiring someone to care for the new tech in the stack.
        
       | diek wrote:
        | Postgres is great as a queue, but this post doesn't really get
        | into the features that differentiate it from just polling, say,
        | SQL Server for tasks.
       | 
        | For me, the best features are:
        | 
        |     * use LISTEN to be notified of rows that have changed that
        |       the backend needs to take action on (so you're not actively
        |       polling for new work)
        |     * use NOTIFY from a trigger so all you need to do is
        |       INSERT/UPDATE a table to send an event to listeners
        |     * you can select using SKIP LOCKED (as the article points
        |       out)
        |     * you can use partial indexes to efficiently select rows in a
        |       particular state
        | 
        | So when a backend worker wakes up, it can:
        | 
        |     * LISTEN for changes to the active working set it cares about
        |     * "select all things in status 'X'" (using a partial index
        |       predicate, so it's not churning through low cardinality
        |       'active' statuses)
        |     * atomically update the status to 'processing' (using SKIP
        |       LOCKED to avoid contention/lock escalation)
        |     * do the work
        |     * update to a new status (which another worker may trigger
        |       on)
       | 
       | So you end up with a pretty decent state machine where each
       | worker is responsible for transitioning units of work from status
       | X to status Y, and it's getting that from the source of truth.
       | You also usually want to have some sort of a per-task
       | 'lease_expire' column so if a worker fails/goes away, other
       | workers will pick up their task when they periodically scan for
       | work.
       | 
       | This works for millions of units of work an hour with a
       | moderately spec'd database server, and if the alternative is
       | setting up SQS/SNS/ActiveMQ/etc and then _still_ having to track
       | status in the database/manage a dead-letter-queue, etc -- it's
       | not a hard choice at all.
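        | 
        | A minimal sketch of the "claim" step in Python/psycopg2, assuming
        | a hypothetical `task` table with `id`, `status` and
        | `lease_expire` columns (illustrative, not the exact schema
        | described above):
        | 
        |     import psycopg2
        | 
        |     # partial index that supports this:
        |     #   CREATE INDEX ON task (id) WHERE status = 'new';
        |     CLAIM_SQL = """
        |         UPDATE task
        |            SET status = 'processing',
        |                lease_expire = now() + interval '5 minutes'
        |          WHERE id = (SELECT id
        |                        FROM task
        |                       WHERE status = 'new'
        |                       ORDER BY id
        |                       LIMIT 1
        |                       FOR UPDATE SKIP LOCKED)
        |      RETURNING id;
        |     """
        | 
        |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
        |     with conn, conn.cursor() as cur:
        |         cur.execute(CLAIM_SQL)
        |         row = cur.fetchone()  # None means no work available
        |     if row:
        |         task_id = row[0]
        |         # work happens outside the claim transaction; a later
        |         # transaction moves the row to its next status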
        
         | eddd-ddde wrote:
          | There's actually an extension called "tcn", or trigger change
          | notification, that provides such a trigger out of the box.
        
         | foofie wrote:
         | Excellent comment. Thank you for taking the time to write it.
        
         | leontrolski wrote:
         | Thanks for the comprehensive reply, does the following argument
         | stand up at all? (Going on the assumption that LISTEN is one
         | more concept and one less concept is a good thing).
         | 
          | If I have, say, 50 workers polling the db, either it's quiet
          | and there are no tasks to do - in which case I don't
          | particularly care about the polling load. Or, it's busy and
          | when they query for work, there's always a task ready to
          | process - in this case the LISTEN is constantly pinging, which
          | is equivalent to constantly polling and finding work.
          | 
          | Regardless, is there a resource (blog or otherwise) you'd
          | recommend for integrating LISTEN with the backend?
        
           | diek wrote:
            | In a large application you may have dozens of tables that
            | different backends may be operating on. Each worker pool
            | polling the tables it may be interested in every couple of
            | seconds can add up, and it's really not necessary.
           | 
           | Another factor is polling frequency and processing latency.
           | All things equal, the delay from when a new task lands in a
           | table to the time a backend is working on it should be as
           | small as possible. Single digit milliseconds, ideally.
           | 
           | A NOTIFY event is sent from the server-side as the
           | transaction commits, and you can have a thread blocking
           | waiting on that message to process it as soon as it arrives
           | on the worker side.
           | 
           | So with NOTIFY you reduce polling load and also reduce
           | latency. The only time you need to actually query for tasks
           | is to take over any expired leases, and since there is a
           | 'lease_expire' column you know when that's going to happen so
           | you don't have to continually check in.
           | 
           | As far as documentation, I got a simple java LISTEN/NOTIFY
           | implementation working initially (2013?-ish) just from the
           | excellent postgres docs:
           | https://www.postgresql.org/docs/current/sql-notify.html
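            | 
            | For reference, a bare-bones psycopg2 LISTEN loop might look
            | roughly like this (channel name and timeout are made up):
            | 
            |     import select
            |     import psycopg2
            |     import psycopg2.extensions
            | 
            |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
            |     conn.set_isolation_level(
            |         psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
            |     with conn.cursor() as cur:
            |         cur.execute("LISTEN task_events;")
            | 
            |     while True:
            |         # block until a NOTIFY arrives or the timeout passes
            |         select.select([conn], [], [], 30)
            |         conn.poll()
            |         while conn.notifies:
            |             note = conn.notifies.pop(0)
            |             print("got", note.channel, note.payload)
            |             # ... fetch/claim the referenced row and process
            |         # a timeout with no notifies is the cue to sweep for
            |         # expired leases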
        
         | jakjak123 wrote:
          | Using the INSERTs/UPDATEs is kind of limiting for your events.
          | Usually you will want richer events (higher-level information)
          | than the raw structure of a single table. Use this feature very
          | sparingly. Keep in mind that LISTEN should also ONLY be used to
          | reduce the active polling; it is not a failsafe delivery
          | system, and you will not get notified of things that happened
          | while you were gone.
        
           | diek wrote:
           | For my use cases the aim is really to not deal with events,
           | but deal with the rows in the tables themselves.
           | 
           | Say you have a `thing` table, and backend workers that know
           | how to process a `thing` in status 'new', put it in status
           | 'pending' while it's being worked on, and when it's done put
           | it in status 'active'.
           | 
           | The only thing the backend needs to know is "thing id:7 is
           | now in status:'new'", and it knows what to do from there.
           | 
           | The way I generally build the backends, the first thing they
           | do is LISTEN to the relevant channels they care about, then
           | they can query/build whatever understanding they need for the
           | current state. If the connection drops for whatever reason,
           | you have to start from scratch with the new connection
           | (LISTEN, rebuild state, etc).
        
           | halfcat wrote:
            | > Usually you will want richer events (higher-level
            | information) than the raw structure of a single table.
           | 
           | JSONB fields in Postgres are pretty awesome for this. You can
           | query the JSON fields, index them, and all that.
           | 
           | Is that what you mean?
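            | 
            | For instance (table and field names invented), the payload
            | can be matched and indexed directly:
            | 
            |     import psycopg2
            | 
            |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
            |     with conn, conn.cursor() as cur:
            |         # GIN index makes the @> containment query cheap
            |         cur.execute(
            |             "CREATE INDEX IF NOT EXISTS task_payload_gin "
            |             "ON task USING gin (payload);")
            |         cur.execute(
            |             "SELECT id, payload->>'customer_id' FROM task "
            |             "WHERE payload @> %s::jsonb",
            |             ('{"customer_id": "c-123"}',))
            |         print(cur.fetchall())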
        
         | Deukhoofd wrote:
         | > * use LISTEN to be notified of rows that have changed that
         | the backend needs to take action on (so you're not actively
         | polling for new work)
         | 
         | > * use NOTIFY from a trigger so all you need to do is
         | INSERT/UPDATE a table to send an event to listeners
         | 
         | Could you explain how that is better than just setting up Event
         | Notifications inside a trigger in SQL Server? Or for that
         | matter just using the Event Notifications system as a queue.
         | 
         | https://learn.microsoft.com/en-us/sql/relational-databases/s...
         | 
         | > * you can select using SKIP LOCKED (as the article points
         | out)
         | 
         | SQL Server can do that as well, using the READPAST table hint.
         | 
         | https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...
         | 
         | > * you can use partial indexes to efficiently select rows in a
         | particular state
         | 
         | SQL Server has filtered indexes, are those not the same?
         | 
         | https://learn.microsoft.com/en-us/sql/relational-databases/i...
        
           | taspeotis wrote:
           | I agree that SQL Server has similar functionality but Service
           | Broker is pretty clunky compared to LISTEN.
        
           | go_prodev wrote:
           | Thanks for the links, I was wondering if SQL Server supports
           | similar features.
        
           | diek wrote:
           | Admittedly I used SQL Server pretty heavily in the mid-to-
           | late-2000s but haven't kept up with it in recent years so my
           | dig may have been a little unfair.
           | 
            | Agree on READPAST being similar to SKIP LOCKED, and filtered
            | indexes are equivalent to partial indexes (I remember
            | filtered indexes being in SQL Server 2008 when I used it).
            | 
            | Reading through the docs on Event Notifications, they seem to
            | be a little heavier and have different delivery semantics.
           | Correct me if I'm wrong, but Event Notifications seem to be
           | more similar to a consumable queue (where a consumer calling
           | RECEIVE removes events in the queue), whereas LISTEN/NOTIFY
           | is more pubsub, where every client LISTENing to a channel
           | gets every NOTIFY message.
        
           | killingtime74 wrote:
           | It's better because it's not SQL Server
        
         | gavanm wrote:
         | SQL Server has included a message queue capability for a while
         | now. It's called SQL Server Service Broker:
         | 
         | https://learn.microsoft.com/en-us/sql/database-engine/config...
         | 
         | I haven't had the opportunity to use it in production yet - but
         | it's worth keeping in mind.
         | 
          | I've helped fix poor attempts at "table as queue" before - once
          | you get the locking hints right, polling performs well enough
          | for small volumes - from your list above, the only thing I
          | can't recall there being in SQL Server is a LISTEN - but I'm
          | not really an expert on it.
        
           | EvanAnderson wrote:
           | Came here to mention Service Broker. I've used it in
           | production in multi-server configurations for a number of
           | years. It works really well but it's terribly obscure. Nobody
           | seems to know it's even there.
           | 
           | The learning curve is steep and there are some easy anti-
           | patterns you can fall into. Once you grok it, though, it
           | really is very good.
           | 
           | The LISTEN functionality is absolutely there. Your activation
           | procedure is invoked by the server upon receipt of records
           | into the queue. It's very slick. No polling at all.
        
           | halfcat wrote:
           | Also Azure is adding SQL Server trigger support
           | 
           | https://learn.microsoft.com/en-us/azure/azure-
           | functions/func...
        
             | dagss wrote:
              | This stuff has a latency measured in minutes though,
              | limiting the use cases a lot.
        
         | mulmen wrote:
         | What happens when the backend worker dies while processing?
        
           | 0cf8612b2e1e wrote:
            | The usual way is to update the table with a timestamp when
            | the task was taken. Have one periodic job which queries the
            | table looking for tasks that have outlived the maximum
            | allowed processing time and resets their status so the task
            | is available to be requeued.
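            | 
            | Roughly, with made-up column names, the periodic sweep is a
            | single statement:
            | 
            |     import psycopg2
            | 
            |     REQUEUE_SQL = """
            |         UPDATE task
            |            SET status = 'new', started_at = NULL
            |          WHERE status = 'processing'
            |            AND started_at < now() - interval '15 minutes'
            |      RETURNING id;
            |     """
            | 
            |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
            |     with conn, conn.cursor() as cur:
            |         cur.execute(REQUEUE_SQL)
            |         print("requeued:", [r[0] for r in cur.fetchall()])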
        
         | Rapzid wrote:
         | I use a generic subsystem modeled loosely after SQS and Golang
         | River.
         | 
         | I have a visible_at field which indicates when the "message"
         | will show up in checkout commands. When checked out or during a
         | heartbeat from the worker this gets bumped up by a certain
         | amount of time.
         | 
          | When a message is checked out, or re-checked out, a key (GUID)
          | is generated and assigned. To delete the message this key must
          | match.
          | 
          | A message can be checked out if it exists and the visible_at
          | field is older than or equal to NOW.
          | 
          | That's about it for semantics. Any further complexity, such as
          | workflows and states, is modeled in higher-level services.
          | 
          | If I felt it mattered for perf and was worth the effort I might
          | model this in a more append-only fashion, taking advantage of
          | HOT updates and so on. Maybe partition the table by day and
          | drop partitions older than the longest supported process. Use
          | the sparse index to indicate deleted... Hard to say though with
          | SSDs, HOT, and the new btree anti-split features...
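          | 
          | A rough sketch of that checkout in Python/psycopg2, with
          | made-up table and column names (gen_random_uuid() assumes
          | PG 13+ or pgcrypto):
          | 
          |     import psycopg2
          | 
          |     CHECKOUT_SQL = """
          |         UPDATE message
          |            SET visible_at = now() + interval '30 seconds',
          |                receipt_key = gen_random_uuid()
          |          WHERE id = (SELECT id FROM message
          |                       WHERE visible_at <= now()
          |                       ORDER BY visible_at
          |                       LIMIT 1
          |                       FOR UPDATE SKIP LOCKED)
          |      RETURNING id, receipt_key, body;
          |     """
          |     DELETE_SQL = ("DELETE FROM message "
          |                   "WHERE id = %s AND receipt_key = %s;")
          | 
          |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
          |     with conn, conn.cursor() as cur:
          |         cur.execute(CHECKOUT_SQL)
          |         msg = cur.fetchone()  # None: nothing visible right now
          |     # ... process, heartbeat by bumping visible_at, then
          |     # delete with the matching receipt_key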
        
       | osigurdson wrote:
        | It is all about numbers. What latency is acceptable in this
        | hypothetical? How many workers are there? I'm not sure how long x
        | is in "time.sleep(x)", but unless it's very long this will use
        | one connection per worker. That is probably fine in this
        | situation but it isn't fine for all projects.
        
       | marcusbuffett wrote:
        | I looked at some solutions that were more specifically meant for
        | queues, for my site, but decided to keep the infra simple by
        | using my Postgres DB for the worker queues. There are 10,000 jobs
        | a day; at that rate there's virtually zero additional load. It
        | runs fast, I know exactly how it works, I can extend it easily,
        | local dev doesn't get any more complicated, and there's not
        | another point of failure in the form of a new piece of infra.
        | 
        | If I get millions of users I'll swap it out; in the meantime it
        | took like a day to implement and I haven't looked back.
        
       | brycelarkin wrote:
       | Big fan of Graphile Worker to handle this job.
       | https://github.com/graphile/worker
        
         | jawngee wrote:
          | +1 graphile worker. We use it to coordinate all of our
          | background tasks and events (we use slayQ,
          | https://slayq-docs.vercel.app, which is kind of a wrapper
          | around it).
        
       | andrew-v wrote:
       | SKIP LOCKED is a good idea to make sure that the tasks can be
       | processed independently by the workers, but what about a case
       | where you have to guarantee the order of events?
       | 
       | I guess this Postgres queue is a very handy tool for cases like
       | background jobs, but will it work with event sourcing?
        
         | ffsm8 wrote:
         | You're responsible for writing the query that selects the tasks
         | to process, so yes? You can obviously group by a common id,
         | order by inserted and use limit 1. And if every event has to be
         | processed in order, then you've created a system that can't be
         | parallel, so you'll be effectively limited to a single worker.
        
           | dkersten wrote:
           | You can have multiple workers, but they all need to process
           | all events. So it's not suitable for parallelisation of work
           | (how can it be, if total ordering is required), but it can be
           | used where different workers consume the queue in different
           | ways (eg for synchronising different services).
        
         | Rapzid wrote:
          | Order by insert time on some group key (or otherwise figure out
         | the order) and select the top item. Then select that item for
         | update with skip locked. If it's locked you will get zero
         | results.
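          | 
          | Roughly, with an illustrative `event` table, the two steps look
          | like:
          | 
          |     import psycopg2
          | 
          |     HEAD_SQL = """
          |         SELECT id FROM event
          |          WHERE group_id = %s AND status = 'new'
          |          ORDER BY inserted_at
          |          LIMIT 1;
          |     """
          |     CLAIM_SQL = ("SELECT id, payload FROM event "
          |                  "WHERE id = %s FOR UPDATE SKIP LOCKED;")
          | 
          |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
          |     with conn, conn.cursor() as cur:
          |         cur.execute(HEAD_SQL, ("group-7",))
          |         head = cur.fetchone()
          |         if head:
          |             cur.execute(CLAIM_SQL, (head[0],))
          |             row = cur.fetchone()
          |             # None means the head of the group is locked by
          |             # another worker: back off instead of jumping
          |             # ahead in the order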
        
       | rubenfiszel wrote:
        | If you need a job queue on Postgres, https://windmill.dev
        | provides an all-integrated developer platform with a Pg queue at
        | its core that supports jobs defined in python/typescript/sql
        
         | BozeWolf wrote:
          | Came here to say this as well. I'm using it in a side project
          | and it works well. I chose Windmill because it allows me to
          | upgrade to other backends when necessary.
        
       | mattbillenstein wrote:
       | I wrote one of these not long ago:
       | https://github.com/mattbillenstein/pg-queue
        
       | tuananh wrote:
       | There's also Riverqueue https://github.com/riverqueue/river
        
       | stavros wrote:
       | There's a Dramatiq backend to do this
       | (https://pypi.org/project/dramatiq-pg/) but I don't think it has
       | seen as wide adoption as it deserves. I haven't tried it, but I
       | don't see why it wouldn't work well.
        
         | bbkane wrote:
         | Do you know if there's a MySQL dramatiq backend?
        
       | cpursley wrote:
       | Related post from a few days ago:
       | 
       | "Postgres Is Enough":
       | https://news.ycombinator.com/item?id=39273954
       | 
       | Queue section in the gist:
       | https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...
        
       | jmmv wrote:
       | I'm also using PostgreSQL as a queue of tasks for a side project
       | and it has been OK---although it has barely enough load for me to
       | say that it's "really OK" or whether it will turn out to be a
       | problem in the future. I'm convinced I can go this way for a
       | really long time without having to rearchitect the system or add
       | more bloat to it.
       | 
        | Anyhow, in my case, PostgreSQL is just there to persist task
        | state, and the database does very little magic. In fact, I also
        | have a SQLite-backed implementation that allows service unit
        | tests to be fast. Most of the work around managing such a queue
        | is in-process (and in Rust) because I wanted to keep deployments
        | and operations simple. If interested, I wrote this thing a few
        | months ago on how it works:
        | https://jmmv.dev/2023/06/iii-iv-task-queue.html
        
       | throw__away7391 wrote:
       | I am always surprised at how infrequently logical replication is
       | mentioned when people are trying to implement these types of
       | systems. I've used Debezium with some success in the past, but
       | recently implemented my own application level reader to
       | facilitate low latency event sourcing projections where I didn't
       | want to involve Kafka. This seems to be quite a good solution in
       | many scenarios that typically involve undesirable performance
       | tradeoffs, as it doesn't introduce polling or much additional
       | load on your DB, allows better isolation of the consumer
       | workload, allows simple recovery if the consumer restarts or gets
       | disconnected (e.g. you pick up at the LSN you left off and aren't
        | worried that you missed a NOTIFY while offline), and watching the
        | status of the replication slot gives a nice external metric for
        | monitoring the health of the consumer.
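        | 
        | For anyone curious, a bare-bones consumer along these lines can
        | be sketched with psycopg2's replication support (slot name and
        | output plugin are placeholders, not the setup described above):
        | 
        |     import psycopg2
        |     from psycopg2.extras import LogicalReplicationConnection
        | 
        |     conn = psycopg2.connect(
        |         "dbname=app",  # hypothetical DSN
        |         connection_factory=LogicalReplicationConnection)
        |     cur = conn.cursor()
        | 
        |     # the slot remembers the confirmed LSN, so a restarted
        |     # consumer resumes where it left off
        |     # cur.create_replication_slot(
        |     #     "my_slot", output_plugin="test_decoding")
        |     cur.start_replication(slot_name="my_slot", decode=True)
        | 
        |     def consume(msg):
        |         print(msg.data_start, msg.payload)  # apply projection
        |         msg.cursor.send_feedback(flush_lsn=msg.data_start)
        | 
        |     cur.consume_stream(consume)  # blocks, one call per change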
        
       | bodantogat wrote:
       | I like this approach, especially for systems that don't
       | immediately require scaling to millions of users. I use MySQL
       | with the 'SKIP LOCKED' option for my queues. I recognize that
       | this method may not scale indefinitely, but it negates the need
       | to add new infrastructure across all environments, saving on
       | costs and devops effort. I've intentionally designed the event
       | sending and receiving code to be abstract, ensuring that if the
       | need to switch systems arises, it should be a reasonably
       | manageable task.
        
       | swaptr wrote:
        | I think it is a great idea if scaling is not an issue. I have
        | been using the Postgres backend in my side projects with celery
        | and it works really well. Really handy if you wish to start off
        | quickly, and at scale all you need to do is replace the
        | backend/broker.
        
       | throwawaaarrgh wrote:
       | The idea that "performance" is the big problem, is the same as
       | saying "performance" is why you don't use an IMAP server as a
       | queue. Hey, you've already got an email server _somewhere_ , and
       | that means it's simpler, so that means it's a good idea. Based on
       | the logic in this article you should be using your email server
       | as your application's general purpose queue.
       | 
       | I feel so bad for the poor engineers that will believe this crap
       | and later regret it. If you're going to advocate doing something
        | like this, you should be forced to be honest and explain all the
        | reasons it's a bad idea. The fact is that people are lying about
       | the problems of using Postgres as a queue, because they just
       | don't want to believe that the world is really more complex than
       | that.
        
         | sseagull wrote:
         | But you aren't storing application data in the IMAP server. I
         | hope.
         | 
         | The idea (for me anyway) is that if you have a postgres db with
         | data, and need a queue for tasks dealing with that data,
         | building it into the existing db can make more sense than
         | adding additional infrastructure and complexity. Especially if
         | you are familiar with postgres already.
         | 
         | If all you need is a queue, then I agree that dedicated queue
         | software will be better.
        
       | BrickTamblan wrote:
       | I'm not sure if this is the same as pg queues but I just started
       | using pg_notify with channels, and it's so good.
       | 
        | The row update happens as a normal express api call, but at the
        | end I call pg_notify on some channel name. I pick up the message
        | in a new thread via polling, and perform aggregate queries into
        | an agg table. It's like materialized views, but with no wasteful
        | refreshes. Then I push the updated aggs back to the frontend via
        | websockets.
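        | 
        | The producing side is just a function call inside the same
        | transaction as the row update; a sketch in Python with made-up
        | names (the notification only goes out if the transaction
        | commits):
        | 
        |     import json
        |     import psycopg2
        | 
        |     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
        |     with conn, conn.cursor() as cur:
        |         cur.execute(
        |             "UPDATE widget SET count = count + 1 "
        |             "WHERE id = %s", (7,))
        |         # listeners on the channel get this after commit
        |         cur.execute(
        |             "SELECT pg_notify(%s, %s);",
        |             ("widget_changed", json.dumps({"id": 7})))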
        
       | 698969 wrote:
       | Does the advice only apply to self-hosted/bare-metal VMs? Would
       | sharing our Azure Flexible Postgres for a queue be a bad idea?
        
       | KronisLV wrote:
       | Offtopic, but I added 1rem of left/right padding for paragraph
       | and header elements in the stylesheet and personally it feels
       | more readable that way.
        
       ___________________________________________________________________
       (page generated 2024-02-10 23:01 UTC)