[HN Gopher] Choose Postgres queue technology
___________________________________________________________________
Choose Postgres queue technology
Author : bo0tzz
Score : 174 points
Date : 2023-09-24 20:30 UTC (2 hours ago)
(HTM) web link (adriano.fyi)
(TXT) w3m dump (adriano.fyi)
| andrewstuart wrote:
| You don't even need a database to make a message queue. The Linux
| file system makes a perfectly good basis for a message queue
| since file moves are atomic.
|
| My guess is that many people are implementing queuing mechanisms
| just for sending email.
|
| You can see how this works in Arnie SMTP buffer server, a super
| simple queue just for emails, no database at all, just the file
| system.
|
| https://github.com/bootrino/arniesmtpbufferserver
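|
| A minimal sketch of the rename-as-claim idea (directory layout and
| names invented here; assumes POSIX rename semantics on a single
| filesystem):
|
|     import os
|     import uuid
|
|     QUEUE = 'queue'  # contains tmp/, new/ and claimed/ subdirectories
|
|     def enqueue(payload: bytes) -> str:
|         name = uuid.uuid4().hex
|         tmp = os.path.join(QUEUE, 'tmp', name)
|         with open(tmp, 'wb') as f:
|             f.write(payload)
|             f.flush()
|             os.fsync(f.fileno())  # make it durable before publishing
|         os.rename(tmp, os.path.join(QUEUE, 'new', name))  # atomic publish
|         return name
|
|     def claim():
|         # rename() is atomic, so exactly one worker wins each file.
|         for name in sorted(os.listdir(os.path.join(QUEUE, 'new'))):
|             src = os.path.join(QUEUE, 'new', name)
|             dst = os.path.join(QUEUE, 'claimed', name)
|             try:
|                 os.rename(src, dst)
|             except FileNotFoundError:  # another worker got there first
|                 continue
|             with open(dst, 'rb') as f:
|                 return name, f.read()
|         return None  # queue is empty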
| repiret wrote:
| That's a key property leveraged in the Maildir mailbox format.
| doctor_eval wrote:
| This is true, and I've worked on systems that use this, but
| it's a lot more work than just a rename.
|
| I'd recommend that, if you have a Postgres database already,
| definitely use that instead. Your queues will be transactional
| and they will get backed up when the rest of your database
| does.
| andrewstuart wrote:
| >> but it's a lot more work than just a rename
|
| Such as?
| whartung wrote:
| For those using simple SELECTs, what kind of WHERE clause are you
| using that works well with lots of qualified pending messages and
| (somewhat) guarantees the most appropriate (oldest?) message?
| levkk wrote:
| Running this exact implementation with 47M jobs processed and
| counting. SKIP LOCKED is great for VACUUM, and having durable
| storage with indexes makes otherwise expensive patterns like
| delayed jobs, retries, status updates, "at least once", etc.
| really easy to implement.
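|
| A sketch of what such a claim query can look like (schema and
| column names here are assumptions, not levkk's actual setup):
|
|     import psycopg2
|
|     conn = psycopg2.connect('dbname=jobs')
|
|     def claim_one(cur):
|         # Claim the oldest runnable job; locked rows are skipped, so
|         # concurrent workers neither block nor double-claim.
|         cur.execute("""
|             UPDATE jobs SET status = 'running', attempts = attempts + 1
|             WHERE id = (
|                 SELECT id FROM jobs
|                 WHERE status = 'queued'
|                   AND run_at <= now()   -- delayed jobs: run_at in the future
|                   AND attempts < 5      -- retries: give up after five tries
|                 ORDER BY run_at
|                 FOR UPDATE SKIP LOCKED
|                 LIMIT 1)
|             RETURNING id, payload""")
|         return cur.fetchone()           # None when nothing is runnable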
| dlisboa wrote:
| Do you have some idea of how many jobs per minute or hour you
| process? Just want to compare with what we have on Redis at
| work.
|
| Do you also have any idea of the concurrency? How many workers
| do you have pulling from Postgres?
|
| I've used this approach before (ages ago) when Redis wasn't
| even a thing, though not at high throughput requirements.
| devoutsalsa wrote:
| I'm sure Redis is much faster than an RDBMS w/ all the ACID
| features turned on. The biggest concern I always have with
| Redis is simply overwhelming the in-memory storage limits
| when someone wants to process a large number of good-sized
| messages at an inconvenient time. #tradeoffs
| rubenfiszel wrote:
| We use exactly this for Windmill (OSS Retool alternative + modern
| Airflow) and run benchmarks every day. On a modest GitHub CI
| instance where one Windmill worker and Postgres run as
| containers, our benchmarks run at 1200 jobs/s. Workers can be
| added and it will scale gracefully up to 5000 jobs/s. We are
| exploring using Citus to cross the barrier of 5000 jobs/s on our
| multi-tenant instance.
|
| https://github.com/windmill-labs/windmill/tree/benchmarks
| simonw wrote:
| One of my favourite pieces of writing about worker queues is this
| by Brandur Leach:
|
| Transactionally Staged Job Drains in Postgres -
| https://brandur.org/job-drain
|
| It's about the challenge of matching up transactions with queues
| - where you want a queue to be populated reliably if a
| transaction completes, and also reliably NOT be populated if it
| doesn't.
|
| Brandur's pattern is to have an outgoing queue in a database
| table that gets updated as part of that transaction, and can then
| be separately drained to whatever queue system you like.
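|
| A rough sketch of the shape of the pattern (names invented; the
| drain is at-least-once, so downstream consumers should be
| idempotent):
|
|     import json
|
|     def signup(cur, email):
|         # Stage the job in the same transaction as the data change:
|         # COMMIT publishes both, ROLLBACK discards both.
|         cur.execute("INSERT INTO users (email) VALUES (%s) RETURNING id",
|                     (email,))
|         user_id = cur.fetchone()[0]
|         cur.execute("INSERT INTO staged_jobs (queue, payload) VALUES (%s, %s)",
|                     ('welcome_email', json.dumps({'user_id': user_id})))
|
|     def drain(conn, publish):
|         # Runs separately: moves staged jobs to the real queue system.
|         with conn:  # one transaction per batch
|             cur = conn.cursor()
|             cur.execute("""DELETE FROM staged_jobs WHERE id IN (
|                                SELECT id FROM staged_jobs ORDER BY id
|                                FOR UPDATE SKIP LOCKED LIMIT 100)
|                            RETURNING queue, payload""")
|             for queue, payload in cur.fetchall():
|                 publish(queue, payload)  # crash before COMMIT => re-sent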
| paulddraper wrote:
| USE. ADVISORY. LOCKS.
|
| Do not use SKIP LOCKED unless it is a toy/low-throughput system.
|
| Row locks require transactions and disk writes.
|
| Advisory locks require neither. (However, you do have to stay
| inside the configurable memory budget.)
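|
| A sketch of that approach (the key space is arbitrary, and a
| production implementation needs more care around candidate
| selection):
|
|     QUEUE_KEY = 42  # arbitrary int namespace for this queue's locks
|
|     def try_claim(cur, job_id):
|         # Non-blocking and memory-only: no row is dirtied and nothing
|         # is written to disk just to mark the job as claimed.
|         cur.execute("SELECT pg_try_advisory_lock(%s, %s)",
|                     (QUEUE_KEY, job_id))
|         return cur.fetchone()[0]  # True = ours; False = someone else's
|
|     def release(cur, job_id):
|         cur.execute("SELECT pg_advisory_unlock(%s, %s)",
|                     (QUEUE_KEY, job_id))
|
| Workers would first SELECT candidate ids with a plain query, then
| call try_claim on each until one succeeds.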
| ukd1 wrote:
| Maybe it's changed in the last year or so, but from
| benchmarking and writing/running queue software for Postgres,
| SKIP LOCKED was/is significantly faster. Is that different
| for MySQL?
| mikeocool wrote:
| Pretty common advice for scaling Postgres is to deploy
| pgbouncer in transaction mode in front of it to handle
| connection pooling.
|
| Advisory locks don't work in this setup (and will start
| behaving in strange ways if you do try to use them). Something
| to consider if you go this route.
| ris wrote:
| Transaction-scoped advisory locks are very much a thing too.
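|
| For example (a sketch), pg_try_advisory_xact_lock is released
| automatically at COMMIT/ROLLBACK, so there is no unlock to forget
| and the lock's lifetime stays inside a single pooled transaction:
|
|     # cur: an open psycopg2 cursor; 42: arbitrary key namespace
|     cur.execute("SELECT pg_try_advisory_xact_lock(%s, %s)", (42, job_id))
|     claimed = cur.fetchone()[0]  # held until this transaction ends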
| 5id wrote:
| One of the biggest benefits IMO of using Postgres as your
| application queue is that any async work you schedule benefits
| from transactionality.
|
| That is, say you have a relatively complex backend mutation that
| needs to schedule some async work (e.g. sending an email after
| signup). With a Postgres queue, if you insert the job to send the
| email and then a later part of the transaction fails and rolls
| back, the email is never queued to be sent.
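|
| Sketched out (table and function names are made up):
|
|     import json
|
|     with conn:  # one transaction
|         cur = conn.cursor()
|         cur.execute("INSERT INTO users (email) VALUES (%s) RETURNING id",
|                     (email,))
|         user_id = cur.fetchone()[0]
|         cur.execute("INSERT INTO jobs (kind, payload) VALUES (%s, %s)",
|                     ('signup_email', json.dumps({'user_id': user_id})))
|         finish_signup(cur, user_id)  # hypothetical: if this raises, the
|                                      # rollback removes the job row too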
| laurencerowe wrote:
| > One of the biggest benefits IMO of using Postgres as your
| application queue is that any async work you schedule benefits
| from transactionality.
|
| This is a really important point. I often end up using a
| combination of Postgres and SQS since SQS makes it easy to
| autoscale the job processing cluster.
|
| In Postgres I have a transaction log table that includes
| columns for the triggered events and the pg_current_xact_id() of
| the transaction. (You can also use the built-in xmin of the row,
| but then you have to worry about transaction wraparound.)
| Inserting into this table triggers a NOTIFY.
|
| A background process runs in a loop: it selects all rows in the
| transaction table with a transaction id between the last run's
| xmin and the current pg_snapshot_xmin(pg_current_snapshot()),
| maps those events to jobs, submits them to SQS, records the
| xmin, and LISTENs to await the next NOTIFY.
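|
| A rough sketch of that loop (table/column names invented; assumes
| Postgres 13+ for xid8 and pg_current_snapshot, and some SQS client
| wrapped as send_to_sqs):
|
|     import select
|     import psycopg2, psycopg2.extensions
|
|     conn = psycopg2.connect('dbname=app')
|     conn.set_isolation_level(
|         psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
|     cur = conn.cursor()
|     cur.execute('LISTEN tx_log')
|
|     last_xmin = load_checkpoint()  # hypothetical: xmin saved last run
|     while True:
|         # Nothing below this horizon can still be in flight, so no
|         # committed event is ever skipped.
|         cur.execute("SELECT pg_snapshot_xmin(pg_current_snapshot())::text")
|         upper = cur.fetchone()[0]
|         cur.execute("""SELECT event, payload FROM tx_log
|                        WHERE xact_id >= %s::xid8 AND xact_id < %s::xid8
|                        ORDER BY xact_id""", (last_xmin, upper))
|         for event, payload in cur.fetchall():
|             send_to_sqs(event, payload)  # at-least-once: may resend
|         last_xmin = upper
|         save_checkpoint(last_xmin)       # hypothetical persistence
|         select.select([conn], [], [], 60)  # wait for NOTIFY (or 60s)
|         conn.poll()
|         del conn.notifies[:]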
| matsemann wrote:
| Good point. We alleviate that a bit by scheduling our queue
| adds to not run until after commit. But then we still have some
| unsafety, and if the connection to Rabbit is down we're in trouble.
| theptip wrote:
| Worth being clear that bridging to another non-idempotent
| system necessarily requires you to pick at-least-once or at-
| most-once semantics. So for emails, if you fail awaiting
| confirmation of your email you still need to pick between
| failing your transaction and potentially duplicating the email,
| or continuing and potentially dropping it.
|
| The big advantage is for code paths which async modify your DB;
| these can be done fully transactionally with exactly-once
| semantics since the Job consumption and DB update are in the
| same transaction.
| skybrian wrote:
| It seems like LISTEN/NOTIFY doesn't play well with a serverless
| architecture. Would it make sense for Postgres to make a web
| request when there's work in the queue? Is that a thing?
| dools wrote:
| I'm always surprised that when I see people talk about queues I
| never see anyone mention beanstalkd. I've been using it for
| basically everything for 10 years and it's solid as a rock,
| incredibly simple and requires basically no maintenance. It Just
| Works(tm)
| jjice wrote:
| I've implemented queues with tables in RDBMSs a few times and
| it's always great and usually all you need. Worried about future
| scale? Make a class to wrap the queue with a decent interface
| and swap it for RabbitMQ or whatever you want down the road.
| Implementation stays opaque and you have an easy upgrade path
| later on.
| tiagod wrote:
| Temporal, which AFAIK was made by the Uber Cadence team, which
| was also involved in SQS, uses Postgres as a backend.
|
| I used it for a web automation system for an accounting client
| (automatically read files from a network share, lookup the
| clients on a database, submit the documents to government
| websites, using headless browsers, and put the resulting files in
| the directory). It allows for completely effortless deterministic
| programs that call workers running the non-deterministic code,
| with built-in configurable retries (react to a certain exception
| type, exponential backoff), so you can write code that works
| almost as if there were no issues with API connections, the
| filesystem, etc.
|
| This code has been running for 5 or more years, with barely any
| maintenance and zero issues so far. It keeps everything in
| Postgres, so even full reboots and crashes have no impact; it
| will just move the work back to the queue and run it when
| there's an available worker.
| leoqa wrote:
| Temporal is a pretty complicated system. It has sharding built
| in, stores the entire activity history and runs multiple queues
| for timers and events. I'm a big fan (worked at Uber) but it's
| definitely not just Postgres with a few indices.
| aduffy wrote:
| For several projects I've opted for the even dumber approach
| that works out of the box with every ORM/query-DSL framework in
| every language: using a normal table with SELECT FOR UPDATE SKIP
| LOCKED
|
| https://www.pgcasts.com/episodes/the-skip-locked-feature-in-...
|
| It's not "web scale" but it easily extends to several thousand
| background jobs in my experience
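|
| A minimal worker loop for this pattern (schema assumed):
|
|     import time
|     import psycopg2
|
|     conn = psycopg2.connect('dbname=app')
|
|     while True:
|         with conn:  # one transaction per job
|             cur = conn.cursor()
|             cur.execute("""SELECT id, payload FROM jobs
|                            ORDER BY created_at
|                            FOR UPDATE SKIP LOCKED LIMIT 1""")
|             row = cur.fetchone()
|             if row is None:
|                 time.sleep(1)  # queue empty; back off and poll again
|                 continue
|             handle(row[1])     # hypothetical; a crash here rolls back,
|                                # releasing the lock so the job is retried
|             cur.execute("DELETE FROM jobs WHERE id = %s", (row[0],))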
| qaq wrote:
| Batch the inserts and process tasks in batches, and it is
| pretty much web scale.
| matsemann wrote:
| I've done something even simpler, without locks (no transaction
| logic): select a row, then try to update a field marking it as
| taken. If 1 row is affected, it's mine. If 0, someone else got
| there first and I select a new row.
|
| I've used this for tasks at big organizations without issue. No
| need for any special deployments or new infra. Just spin up a
| few worker threads in your app. Perhaps a thread to reset
| abandoned tasks. But in three years that never actually
| happened, as everything was contained in a try/catch that would
| add it back to the queue, and our Java app was damn stable.
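|
| The claim step, sketched (this assumes the default READ COMMITTED
| isolation, where the UPDATE re-checks taken = 0 against the latest
| row version, so the loser simply sees zero affected rows):
|
|     cur.execute("SELECT id FROM tasks WHERE taken = 0 LIMIT 1")
|     row = cur.fetchone()
|     if row is not None:
|         cur.execute("UPDATE tasks SET taken = 1 WHERE id = %s AND taken = 0",
|                     (row[0],))
|         if cur.rowcount == 1:
|             pass  # the task is ours
|         else:
|             pass  # lost the race; select another row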
| klysm wrote:
| With what transaction isolation level?
| mbb70 wrote:
| I've done the same with MongoDB using findAndModify; simple
| and solid.
| andrelaszlo wrote:
| I guess you update it with the assigned worker id, where the
| "taken by" field is currently null? Does it mean that workers
| have persistent identities, something like an index? How do
| you deal with workers being replaced, scaled down, etc?
|
| Just curious. We maintained a custom background processing
| system for years but recently replaced it with off the shelf
| stuff, so I'm really interested in how others are doing
| similar stuff.
| matsemann wrote:
| No, just UPDATE ... SET taken = 1. If it changed the row, you
| claimed it. If it didn't, someone updated it before you.
|
| Our tasks were quick enough that all fetched tasks could
| complete before a scale-down / new deploy etc., but we stopped
| fetching new ones when the signal came, so each worker just
| finished what it had. As I noted above, we did have logic to
| monitor whether a task got taken but never reached a finished
| status, but I can't remember it ever actually reporting
| anything.
| fsniper wrote:
| You can combine this "update" with a "where taken = 0" to
| directly skip taken rows.
| SahAssar wrote:
| That is the sort of thing that bites you hard when it finally
| does. It might run perfectly for years, but one period of flappy
| downtime at a third party or a slightly misconfigured DNS entry
| will hurt.
| matsemann wrote:
| But compared to our Rabbit setup where I work now, it was dead
| stable. No lost tasks or extra engineering effort spent
| maintaining yet another piece of tech. Our Rabbit cluster
| acting up has led to multiple disasters lately.
| SahAssar wrote:
| Agreed, I've had my own Rabbit nightmares. But setting up a
| more robust queue on PostgreSQL is easy, so you can gain a
| lot more guarantees without much more complexity.
| fbdab103 wrote:
| I would set the taken field to a timestamp. Then you could
| have a cleanup job that looks for any lingering jobs aged
| past a reasonable timeout and nulls out the field.
| tylergetsay wrote:
| It won't work with a timestamp, because each write will report
| an affected row of 1, since the writes happen at different
| times; setting a boolean is static.
| jayd16 wrote:
| You can do something like UPDATE row SET timeout = NOW()
| WHERE NOW() - taskTimeout > row.timestamp. You're not
| stuck with comparing bools.
| twic wrote:
| update tasks set taken_timestamp = now() where task_id =
| ? and taken_timestamp is null
| bushbaba wrote:
| You could even use a timestamp to handle the case where this
| task was never finished by the worker who locked the row.
| ricardobeat wrote:
| That's what's in the article.
| somsak2 wrote:
| Fourth paragraph of the post:
|
| >Applied to job records, this feature enables simple queue
| processing queries, e.g. SELECT * FROM jobs ORDER BY created_at
| FOR UPDATE SKIP LOCKED LIMIT 1.
| surprisetalk wrote:
| I recently published a manifesto and code snippets for exactly
| this in Postgres!
|
| [1] https://taylor.town/pg-task
| orangepanda wrote:
| As I understand it, with SKIP LOCKED rows would no longer be
| processed in order?
| klysm wrote:
| Depends on how many consumers you have. If you need order
| guarantees, then something like the outbox pattern is
| probably a better fit.
| riku_iki wrote:
| article says he also uses "order by" clause, but I am
| wondering if it will severely limit throughput since all
| messages will need to be sorted on each lookup, but this
| probably can be solved by introducing index.
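|
| For example (hypothetical), an index on the sort key lets the
| planner read rows already in order instead of sorting per lookup:
|
|     cur.execute("CREATE INDEX IF NOT EXISTS jobs_created_at_idx "
|                 "ON jobs (created_at)")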
| jpgvm wrote:
| A few things.
|
| 1. The main downside to using PostgreSQL as a pub/sub bus with
| LISTEN/NOTIFY is that LISTEN is a session feature, making it
| incompatible with statement-level connection pooling.
|
| 2. If you are going to do this, use advisory locks [0]. Other
| forms of explicit locking put more pressure on the database,
| while advisory locks are deliberately very lightweight.
|
| My favorite example implementation is que [1] which is ported to
| several languages.
|
| [0] https://www.postgresql.org/docs/current/explicit-
| locking.htm...
|
| [1] https://github.com/que-rb/que
| andrelaszlo wrote:
| One issue with Redis as a queue backend seems to be that
| persistence is quite expensive, at least for managed Redis
| instances. Using PG seems like it could be much cheaper,
| especially if you already have an instance with room to spare.
|
| I thought it was an interesting article, and I'd love to hear
| more from people using PG for queues in production (my intuition
| would say you'd get a lot of table bloat and/or vacuum latency,
| but I haven't tested it myself), but when it comes to the
| conclusion - "choosing boring technology should be one's default
| choice" - I can't think of anything more boring (in a good sense,
| mostly) than Sidekiq + Redis for a Rails app.
| xdanger wrote:
| I do enjoy using https://github.com/graphile/worker for my
| PostgreSQL queuing needs. Very scalable (the next release, 0.14,
| even more so), and easy to use.
| andrewstuart wrote:
| MS SQL Server, Postgres and MySQL all support SKIP LOCKED, which
| means they are all suitable for running queues.
|
| I built a complete implementation in Python designed to work the
| same as SQS but simpler:
|
| https://github.com/starqueue/starqueue
|
| Alternatively, if you just want to quickly hack something into
| your application, here is a complete solution in one Python
| function with retries (ask ChatGPT to tell you what the table
| structure is):
|
|     import psycopg2
|     import psycopg2.extras
|     import random
|
|     db_params = {
|         'database': 'jobs',
|         'user': 'jobsuser',
|         'password': 'superSecret',
|         'host': '127.0.0.1',
|         'port': '5432',
|     }
|
|     conn = psycopg2.connect(**db_params)
|     cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
|
|     def do_some_work(job_data):
|         # Simulate work that fails half the time.
|         if random.choice([True, False]):
|             print('do_some_work FAILED')
|             raise Exception('job failed')
|         print('do_some_work SUCCESS')
|
|     def process_job():
|         # Atomically claim and remove the oldest new queue item;
|         # SKIP LOCKED keeps concurrent workers from colliding.
|         sql = """DELETE FROM message_queue WHERE id = (
|                      SELECT id FROM message_queue
|                      WHERE status = 'new'
|                      ORDER BY created ASC
|                      FOR UPDATE SKIP LOCKED
|                      LIMIT 1)
|                  RETURNING *;"""
|         cur.execute(sql)
|         queue_item = cur.fetchone()
|         if queue_item is None:
|             conn.commit()  # nothing to do
|             return
|         print('message_queue says to process job id:',
|               queue_item['target_id'])
|         sql = """SELECT * FROM jobs
|                  WHERE id = %s AND status = 'new_waiting'
|                  AND attempts <= 3 FOR UPDATE;"""
|         cur.execute(sql, (queue_item['target_id'],))
|         job_data = cur.fetchone()
|         if job_data:
|             try:
|                 do_some_work(job_data)
|                 sql = "UPDATE jobs SET status = 'complete' WHERE id = %s;"
|                 cur.execute(sql, (queue_item['target_id'],))
|             except Exception:
|                 sql = """UPDATE jobs SET status = 'failed',
|                          attempts = attempts + 1 WHERE id = %s;"""
|                 # To run the job again, insert a new item into the
|                 # message queue with this job id.
|                 cur.execute(sql, (queue_item['target_id'],))
|         else:
|             print('no job found, did not get job id:',
|                   queue_item['target_id'])
|         conn.commit()
|
|     process_job()
|     cur.close()
|     conn.close()
| jarofgreen wrote:
| > I'd love to see more neoq-like libraries for languages other
| than Go.
|
| Python has Celery, but maybe the author is looking for more
| choice between brokers.
| https://docs.celeryq.dev/en/stable/index.html
| scraplab wrote:
| Yep, we process hundreds of thousands and sometimes a few million
| jobs daily inside Postgres, using Oban in Elixir.
|
| Having transactional semantics around background jobs is
| incredibly convenient for things like scheduling email only if
| the transaction is successful, and so on.
|
| You do need to do a little bit of autovacuum tuning, but once
| sorted it's been great for us.
| nicoburns wrote:
| For running queues on Postgres with Node.js backend(s), I highly
| recommend https://github.com/timgit/pg-boss. I'm sure it has its
| scale limits. But if you're one of the 90% of apps that never
| need more scale than a modern server can easily handle, it's
| fantastic. You get transactional queueing of jobs, and it
| automatically handles syncing across multiple job-processing
| servers using Postgres locks.
| kissgyorgy wrote:
| Here is a Python example of how to use it:
| https://gist.github.com/kissgyorgy/beccba1291de962702ea9c237...
| mildavw wrote:
| For Rails apps, you can do this using the ActiveJob interface via
|
| https://github.com/bensheldon/good_job
|
| Had it in production for about a quarter and it's worked well.
| pphysch wrote:
| Another point is that task queue technology is highly fungible.
| There's nothing stopping you from starting with cron, adding in
| Postgres/Redis, then graduating to Kafka or something as _need_
| arises. And running all three in parallel, with different jobs in
| each. I would be surprised if the average Kafka shop didn't also
| have a bunch of random cron jobs doing things that _could_ be
| implemented on Kafka or vice versa.
|
| At some point you may want to refactor things to reduce tech
| debt, but it really is a "and" rather than "or" decision.
| Scubabear68 wrote:
| I agree with all of this except for the part about "cron". Cron
| jobs in my experience quickly become hard to manage and
| effectively invisible over time.
|
| Use almost anything else to manage job scheduling....
| dharmab wrote:
| I'm not sure if they literally mean crond, or something
| vaguely cron-like but easier to manage like systemd timers.
| evil-olive wrote:
| there's an important dimension of scalability that I think gets
| overlooked in a lot of these discussions about database-as-a-
| queue vs queue-system-as-a-queue:
|
| are you queuing _jobs_, or are you queuing _messages_?
|
| that's a fuzzy distinction, so somewhat equivalently, what's the
| expected time it takes for a worker to process a given queue
| item?
|
| at one end, an item on the queue may take several seconds to a
| minute or longer to process. at the other end, an item might take
| only a few milliseconds to process. in that latter case, it's
| often useful to do micro-batching, where a single worker pulls
| 100 or 1000 items off the queue at once, and processes them as a
| batch (such as by writing them to a separate datastore)
|
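| A micro-batched claim, sketched (schema assumed; one round trip
| claims up to 100 items, amortizing the per-transaction cost):
|
|     with conn:  # claim, process and delete as one transaction
|         cur = conn.cursor()
|         cur.execute("""DELETE FROM messages WHERE id IN (
|                            SELECT id FROM messages ORDER BY id
|                            FOR UPDATE SKIP LOCKED LIMIT 100)
|                        RETURNING payload""")
|         batch = cur.fetchall()
|         write_to_datastore(batch)  # hypothetical: process as one unit
|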
| the "larger" the items are (in terms of wall-clock processing
| time, not necessarily in terms of size in bytes of the serialized
| item payload) the more effective the database-as-a-queue solution
| is, in my experience.
|
| as queue items get smaller / shorter to process, and start to
| feel more like "messages" rather than discrete "jobs", that's
| when I tend to reach for a queue system over a database.
|
| for example, there's a RabbitMQ blog post [0] on cluster sizing
| where their recommendations _start_ at 1000 messages/second.
| that same message volume on a database-as-a-queue would require,
| generally speaking, 3000 write transactions per second (if we
| assume one transaction to enqueue the message, one for a worker
| to claim it, and one for a worker to mark it as complete / delete
| it).
|
| can Postgres and other relational databases be scaled & tuned to
| handle that write volume? yes, absolutely. however, how much
| write volume are you expecting from your queue workload, compared
| to the write volume from its "normal database" workload? [1]
|
| I think that ends up being a useful heuristic when deciding
| whether or not to use a database-as-a-queue - will you have a
| relational database with a "side gig" of acting like a queue, or
| will you have a relational database that in terms of data volume
| is primarily acting like a queue, with "normal database" work
| relegated to "side gig" status?
|
| 0: https://blog.rabbitmq.com/posts/2020/06/cluster-sizing-
| and-o...
|
| 1: there's also a Postgres-specific consideration here where a
| lot of very short-lived "queue item" database rows can put
| excessive pressure on the autovacuum system.
| doctor_eval wrote:
| I've used PG as a message queue, actually it was used as a
| transactional front end to Kafka; we'd push messages to a PG
| table during a transaction, which would then be snarfed up to
| Kafka by a separate process after the transaction completed.
|
| I've seen very high transaction rates from this arrangement,
| more than 20k messages/second.
| ukd1 wrote:
| I maintain QueueClassic
| (https://github.com/QueueClassic/queue_classic) for Rails/Ruby
| folks, which is basically what you're talking about - a queuing
| system for Postgres. A bonus reason, and why I originally wanted
| this, was the ability to use transactions fully - i.e. I can
| start one, do some stuff, add a job to the queue (to send an
| email), ...and either commit, or roll back - avoiding sending
| the email. With Resque, I found that sometimes you either can't
| see the record yet (the transaction is still doing other stuff
| and hasn't committed), or it's not there at all (rolled back) -
| so either way you had to deal with it.
|
| QC (and equivalents) use the same DB and the same connection,
| so the same transaction. Saves quite a bit of cruft.
___________________________________________________________________
(page generated 2023-09-24 23:00 UTC)