[HN Gopher] Show HN: PgQueuer - Transform PostgreSQL into a Job ...
___________________________________________________________________
Show HN: PgQueuer - Transform PostgreSQL into a Job Queue
PgQueuer is a minimalist, high-performance job queue library for
Python, leveraging the robustness of PostgreSQL. Designed for
simplicity and efficiency, PgQueuer uses PostgreSQL's LISTEN/NOTIFY
to manage job queues effortlessly.
Author : jeeybee
Score : 84 points
Date : 2024-08-18 19:22 UTC (3 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| samwillis wrote:
| This looks like a great task queue, I'm a massive proponent of
| "Postgres is all you need" [0] and doubling down on it with my
| project that takes it to the extreme.
|
| What I would love is a Postgres task queue that does multi-step
| pipelines, with fan out and accumulation. In my view a structured
| relational database is a particularly good backend for that as it
| inherently can model the structure. Is that something you have
| considered exploring?
|
| The one thing with listen/notify that I find lacking is the max
| payload size of 8k, it somewhat limits its capability without
| having to start saving stuff to tables. What I would really like
| is a streaming table, with a schema and all the rich type
| support... maybe one day.
|
| 0: https://www.amazingcto.com/postgres-for-everything/
| jeeybee wrote:
| Thanks for your insights!
|
| Regarding multi-step pipelines and fan-out capabilities: It's a
| great suggestion, and while PgQueuer doesn't currently support
| this, it's something I'm considering for future updates.
|
| As for the LISTEN/NOTIFY payload limit, PgQueuer uses these
| signals just to indicate changes in the queue table, avoiding
| the size constraint by not transmitting substantial data
| through this channel.
| halfcat wrote:
| Is multi-step (fan out, etc) typically something a queue or
| message bus would handle?
|
| I've always handled this with an orchestrator solution like
| (think Airflow and similar).
|
| Or is this a matter of use case? Like for a real-time scenario
| where you need a series of things to happen (user registration,
| etc) maybe a queue handling this makes sense? Whereas with
| longer running tasks (ETL pipelines, etc) the orchestrator is
| beneficial?
| CowOfKrakatoa wrote:
| How does LISTEN/NOTIFY compare to using select for update skip
| locked? I thought listen/notify can lose queue items when the
| process crashes? Is that true? Do you need to code for those
| cases in some manner?
| jeeybee wrote:
| LISTEN/NOTIFY and SELECT FOR UPDATE SKIP LOCKED serve different
| purposes in PgQueuer. LISTEN/NOTIFY notifies consumers about
| changes in the queue table, prompting them to check for new
| jobs. This method doesn't inherently lose messages if a process
| crashes, because it simply triggers a check rather than
| transmitting data. The actual job handling and locking are
| managed by SELECT FOR UPDATE SKIP LOCKED, which safely
| processes each job even when multiple workers are involved.
| severino wrote:
| I think the usage of listen/notify is just a mechanism to save
| you from querying the database every X seconds looking for new
| tasks (polling). That has some drawbacks, because if the
| timeout is too small, you are making too much queries that
| usually may not return any new tasks, and if it's too big, then
| you may start processing the task long after it was submitted.
| This way, it just notifies you that new tasks are ready so you
| can query the database.
| cklee wrote:
| I've been thinking about the potential for PostgreSQL-backed job
| queue libraries to share a common schema. For instance, I'm a big
| fan of Oban in Elixir: https://github.com/sorentwo/oban
|
| Given that there are many Sidekiq-compatible libraries across
| various languages, it might be beneficial to have a similar
| approach for PostgreSQL-based job queues. This could allow for
| job processing in different languages while maintaining
| compatibility.
|
| Alternatively, we could consider developing a core job queue
| library in Rust, with language-specific bindings. This would
| provide a robust, cross-language solution while leveraging the
| performance and safety benefits of Rust.
| memset wrote:
| I am building an SQS compatible queue for exactly that reason.
| Use with any language or framework.
| https://github.com/poundifdef/smoothmq
|
| It is based on SQLite, but it's written in a modular way. It
| would be easy to add Postgres as a backend (in fact, it might
| "just work" if I switch the ORM connection string.)
| GordonS wrote:
| Does SmoothMQ support running multiple nodes for high
| availability? (I didn't see anything in the docs, but they
| seem unfinished)
| earthnail wrote:
| This would be so immensely useful. I'd estimate that there are
| so many cases where the producer is Node or Rails and the
| consumer is Python.
| stephenr wrote:
| Qless "solves" this problem (in redis) by having all core logic
| written as lua and executed in redis.
|
| You could take a similar approach for pg: define a series of
| procedures that provide all the required functionality, and
| then language bindings are all just thin wrappers (to handle
| language native stuff) around calls to execute a given
| procedure with the correct arguments.
| rileymichael wrote:
| If you want a generic queue that can be consumed in any
| runtime, you can just build it directly into postgres via
| extensions like https://github.com/tembo-io/pgmq.
| redskyluan wrote:
| there seems to be a big hype to adapt pg into any infra. I love
| PG but this seems not be right thing.
| mlnj wrote:
| I use it as a job queue. Yes, it has it's cons, but not dealing
| with another moving piece in the big picture is totally worth
| it.
| sgarland wrote:
| At low-medium scale, this will be fine. Even at higher scale,
| so long as you monitor autovacuum performance on the queue
| table.
|
| At some point it may become practical to bring a dedicated
| queue system into the stack, sure, but this can massively
| simplify things when you don't need or want the additional
| complexity.
| jeeybee wrote:
| I agree, there is no need for FANG level infrastructure. Imo.
| in most cases, the simplicity / performance tradeoff for
| small/medium is worth it. There is also a statistics tooling
| that helps you monitor throughput and failure rats
| (aggregated on a per second basis)
| eknkc wrote:
| Instead of SQS, I recently created a basic abstraction on PG
| that mimics the SQS apis. The intention was to use it during
| development and we would simply switch to SQS later.
|
| Never did. The production code still uses PG based queue (which
| has been improved since) and pg just works perfectly fine.
| Might still need to go with a dedicated queue service at some
| point but it has been perfectly fine so far.
| westurner wrote:
| Does the celery SQLAlchemy broker support PostgreSQL's
| LISTEN/NOTIFY features?
|
| Similar support in SQLite would simplify testing applications
| built with celery.
|
| How to add table event messages to SQLite so that the SQLite
| broker has the same features as AMQP? Could a vtable facade send
| messages on tablet events?
|
| Are there sqlite Triggers?
|
| Celery > Backends and Brokers:
| https://docs.celeryq.dev/en/stable/getting-started/backends-...
|
| /? sqlalchemy listen notify:
| https://www.google.com/search?q=sqlalchemy+listen+notify :
|
| asyncpg.Connection.add_listener
|
| sqlalchemy.event.listen, @listen_for
|
| psychopg2 conn.poll(), while connection.notifies
|
| psychopg2 > docs > advanced > Advanced notifications:
| https://www.psycopg.org/docs/advanced.html#asynchronous-noti...
|
| PgQueuer.db, PgQueuer.listeners.add_listener; asyncpg
| add_listener:
| https://github.com/janbjorge/PgQueuer/blob/main/src/PgQueuer...
|
| asyncpg/tests/test_listeners.py:
| https://github.com/MagicStack/asyncpg/blob/master/tests/test...
|
| /? sqlite LISTEN NOTIFY:
| https://www.google.com/search?q=sqlite+listen+notify
|
| sqlite3 update_hook:
| https://www.sqlite.org/c3ref/update_hook.html
| aflukasz wrote:
| BTW: Good PostgresFM episode on implementing queues in Postgres,
| various caveats etc: https://www.youtube.com/watch?v=mW5z5NYpGeA
| .
| jeeybee wrote:
| thanks for sharing, added to my to watch list.
| fijiaarone wrote:
| You can make anything that stores data into a job queue.
| kaoD wrote:
| But can you make a _decent_ job queue with anything that stores
| data? Not easily. E.g. you need atomicity if multiple consumers
| can take jobs, and I think you need CAS for that, not just any
| storage will do, right?
|
| You probably need ACI and also D if you want your jobs to
| persist.
| _medihack_ wrote:
| There is also Procrastinate:
| https://procrastinate.readthedocs.io/en/stable/index.html
|
| Procrastinate also uses PostgreSQL's LISTEN/NOTIFY (but can
| optionally be turned off and use polling). It also supports many
| features (and more are planned), like sync and async jobs (it
| uses asyncio under the hood), periodic tasks, retries, task
| locks, priorities, job cancellation/aborting, Django integration
| (optional).
|
| DISCLAIMER: I am one of the co-maintainers of Procrastinate.
| martinald wrote:
| Any suggestions for something like this for dotnet?
| eknkc wrote:
| Hangfire with PostgreSQL driver.
| rgbrgb wrote:
| Cool, congrats on releasing. Have you seen graphile worker?
| Wondering how this compares or if you're building for different
| use-cases.
| ijustlovemath wrote:
| You could even layer in PostgREST for a nice HTTP API that is
| available from any language!
___________________________________________________________________
(page generated 2024-08-18 23:00 UTC)