[HN Gopher] Postgres as queue
___________________________________________________________________
Postgres as queue
Author : leontrolski
Score : 191 points
Date : 2024-02-09 15:37 UTC (1 day ago)
(HTM) web link (leontrolski.github.io)
(TXT) w3m dump (leontrolski.github.io)
| andrewstuart wrote:
| Postgres isn't the only way to have fun.
|
| Other things just as good at being a queue:
|
| SQLite
|
| SQL Server
|
| MySQL
|
| Plain old files on the file system
| bhaney wrote:
| std::queue and a big swap file
| cha42 wrote:
| You can certainly encode a queue in many ways; mkfifo just
| works. But integrating the queue in the DB isn't a bad idea if
| you want to have both the queue and the DB in a shared state.
|
| I am happy that my queue inherits ACID properties.
|
| SQLite simply doesn't allow concurrent writes, so it is a no-go
| for a queue.
|
| I don't know much about SQL Server and MySQL, but I wouldn't
| favor lock-in closed-source software or anything remotely
| connected to Oracle.
|
| In the end, only PostgreSQL remains, I guess. Also, PostgreSQL
| is super solid and the closest to the SQL standard.
| andrewstuart wrote:
| You're wrong that I'm wrong.
| tarruda wrote:
| > mkfifo just works
|
| Are you certain about that?
| tussa wrote:
| Don't forget the plain old Post-it notes.
| geon wrote:
| I thought those were write-only?
| dkersten wrote:
| Use a pencil so you can erase (or an erasable pen like the
| frixion erasable pens)
| tussa wrote:
| I believe the right terminology is immutable.
| crizzlenizzle wrote:
| > Plain old files on the file system
|
| ...and accessed over SFTP.
|
| I worked for a company in the health industry, and one of the
| labs we integrated refused to call an HTTPS endpoint whenever a
| result was ready, so we had to poll every _n_ mins to fetch
| results. That worked well until covid happened and there were
| so many test results that it caused all sorts of issues, like
| reading empty files (because they were still being written to
| disk) and things of that nature.
| adaboese wrote:
| Postgres is extremely expensive to scale. Why on Earth would you
| try to put a queue there?
| kromem wrote:
| The mantra about premature optimization applies to
| infrastructure too.
| valenterry wrote:
| Maintaining extra infrastructure is expensive. Working around
| missing ACID is expensive. Depending on how many messages we
| are talking about, the cost of scaling postgres a bit more
| might be much lower.
| simonw wrote:
| If you're running less than a million tasks a day through a
| queue and you already have PostgreSQL in your stack why add
| anything else?
| adaboese wrote:
| Just use a dedicated tool. It is not that hard. If you want a
| higher-level abstraction, you have a whole spectrum of next-gen
| queues, like Temporal, Trigger.dev, Inngest, Defer, etc.
| jitl wrote:
| Personally the next-gen-ness of an infrastructure component
| is inversely proportional to my trust in it.
| bakchodi wrote:
| Especially for something handling data. I want an old,
| battle tested solution that won't disappear when the VC
| capital dries up
| simonw wrote:
| Right, use boring technology!
|
| https://boringtechnology.club/
| simonw wrote:
| Why use a dedicated tool if you have something in your
| stack that can solve the problem already?
|
| The fewer separate pieces of infrastructure you run, the
| less likely it is that something will break that you don't
| know how to easily fix.
|
| The article touched on this in the list of things to avoid
| when it said "I estimate each line of terraform to be an
| order of magnitude more risk/maintenance/faff than each
| line of Python" and "The need for expertise in anything
| beyond Python + Postgres"
| CyberDildonics wrote:
| Even a million tasks a day is less than 12 a second. Most
| queues are going to have surges since that's part of the
| point of a queue, but it's still a few orders of magnitude
| away from what should overwhelm a database.
| throwaway2990 wrote:
| lol no it's not.
| cwbriscoe wrote:
| Any DB is fine as a queue depending on requirements, design and
| usage.
| CubsFan1060 wrote:
| https://microservices.io/patterns/data/transactional-outbox....
|
| It allows you to wrap it all in a transaction. If you separate
| the database update from the insertion into a queue, whichever
| happens second may fail while the first succeeds.
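|
| A minimal sketch of the outbox idea in Python/psycopg2 (the
| "orders" and "task_queue" names here are just illustrative): the
| business write and the queue insert share one transaction, so
| they commit or roll back together.
|
|     import psycopg2
|
|     order_id = 42  # example value
|
|     conn = psycopg2.connect("dbname=app")
|     with conn:  # one transaction: both rows commit, or neither does
|         with conn.cursor() as cur:
|             cur.execute(
|                 "UPDATE orders SET status = 'paid' WHERE id = %s",
|                 (order_id,),
|             )
|             cur.execute(
|                 "INSERT INTO task_queue (kind, payload)"
|                 " VALUES ('send_receipt', %s)",
|                 (str(order_id),),
|             )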
| illusive4080 wrote:
| The alternative is a saga pattern to implement a
| distributed two-phase commit.
|
| Or actual XA, but that is cursed.
| beeboobaa wrote:
| because you're not expecting to have to scale beyond 1 instance
| for the next few years & are already using postgres & now
| everything is trivially transactional. KISS
| wavemode wrote:
| If the choice is between using a dedicated queue and postgres
| for your data vs. using postgres for both, using postgres for
| both makes perfect sense.
|
| At the scale where you would outgrow using postgres for the
| queue, you would also outgrow using postgres for the data.
| halfcat wrote:
| At what point does one outgrow Postgres for the data?
| cha42 wrote:
| If you have an infra that needs to scale that much, then
| PostgreSQL indeed isn't the right tool. The right tool for your
| use case probably doesn't even exist and you will have to build
| one.
|
| It is not a mystery why all webscale companies end up designing
| their own DB technology.
|
| That being said, most of the DBs in the wild are not remotely at
| that scale. I have seen my share of PostgreSQL/Elasticsearch
| combos handling less than a TB of data and collapsing just
| because of the over-engineering of administering two databases
| in one app.
| alerighi wrote:
| If you need scaling. Not all applications need scaling (e.g.
| I'm doing an internal tool for a company that has 1000
| employees; it's unlikely that the number of employees will
| double from one day to the next!), and for most applications a
| single PSQL server, either locally or in the cloud, is enough.
| dharmab wrote:
| Postgres as a queue is one of the worst decisions I've seen made
| and I and others I've worked with have spent years of our lives
| unwinding the consequences.
| dralley wrote:
| Completely disagree for our use case. If your messages aren't
| send-and-forget but rather represent work that needs to be
| tracked, it is incredibly difficult to manage the state when
| your database and queue are separate. Using postgresql as the
| queue and leveraging the same transactions as your business
| logic solves many many issues.
| nostrebored wrote:
| How is it incredibly difficult? I've seen this work very
| well.
|
| Is your code handling different varieties of event states
| stored in your database?
|
| Why implement this as a monolithic data source? Was it not
| possible to separate different states by additional queues?
|
| Curious to hear the decision making process here. Maybe I can
| see it for low volume?
| dralley wrote:
| Admittedly we're not using it for anything high volume.
| jvans wrote:
| dralley is saying your queue push and database write aren't
| transactional. You have to be OK with suffering some small % of
| message loss, or messages sent without a database commit.
| nostrebored wrote:
| Depends on your queue semantics, i.e. is it just
| at-least-once delivery, or is there an ack back to the queue
| service?
|
| If there is a two-phase process then you just need
| idempotence and can safely transact with the db.
| jvans wrote:
| It doesn't matter what your queue semantics are. If it's
| not part of the transaction, delivery and push cannot be
| guaranteed to be atomic.
| arcbyte wrote:
| At the end of the day, the ONLY way to reliably do this is
| to hook into the database's native journal/log anyway.
| Postgres gives you better primitives to work with than
| installing Qlik, for example.
| nostrebored wrote:
| Why do you think so?
| nurettin wrote:
| As a person who has implemented worker farms at scale, I
| don't understand which part is "incredibly difficult", or
| what about having a dedicated queue server prevents you from
| updating database state within "the same transactions". If
| your worker process has to update some kind of task state and
| calculate some business logic, it can still do so whether
| you use RMQ, Redis, or whatever.
| dkersten wrote:
| I assume they mean the dual write problem
| nurettin wrote:
| So atomic updates are incredibly difficult?
| canadiantim wrote:
| years seems like hyperbole
| mikercampbell wrote:
| It's almost as if we were in an ordered line, all waiting to
| learn the same thing, and had the lesson delivered no more than
| once.
| mistrial9 wrote:
| saying so without context or requirements at all reduces the
| effectiveness of the insight
| dharmab wrote:
| Sadly my work is subject to regulatory requirements that
| prevent me from discussing specific details :(
| sbuttgereit wrote:
| Surely if your complaint is general enough to give such
| advice to everyone here, there must be a way to express
| generalized technical details of that complaint without
| disclosing anything more privileged than you already have.
| If that kind of abstraction and generalization isn't
| possible, then it sounds to me like something specific to
| your work's implementation of the ideas and not something
| that applies broadly.
| cqqxo4zV46cp wrote:
| Oh, okay. It's great that you've let us all know that you
| hold The Secret. Thank you.
| wavemode wrote:
| Care to elaborate?
| nomdep wrote:
| Why? Because memory is expensive and disk space is cheap
| notnmeyer wrote:
| > I'd estimate each line of terraform to be an order of magnitude
| more risk/maintenance/faff than each line of Python.
|
| ehhhhhhhh, i don't know about that. different strokes and all of
| that, but TF describing infrequently changing resources requires
| next to no maintenance. i'm not sure i understand what they mean
| by "risk" in this context though.
|
| if you don't know $technology well then anything you do with it
| is likely to be worse/riskier than using the thing that you do
| know.
|
| > The need for expertise in anything beyond Python + Postgres.
|
| this should be the first point because it explains 99% of the
| motivation for all of this.
| smashed wrote:
| Yes, I think the point being made is that the infrastructure
| described in terraform is a much more difficult thing to change
| and evolve over time.
|
| Changing a source file line and shipping a new version of the
| app is much easier.
|
| They are not really comparable things at all. Even though we
| moved to infrastructure as code, that does not mean that the
| infra code is like functional application code.
|
| Come to think of it, I guess it was a damn good lie to sell
| infrastructure as code as being so easy it's like shipping a new
| app version, when it's anything but.
| benlivengood wrote:
| Terraform providers and modules and the cloud resources evolve
| fairly rapidly (especially anything kubernetes-related). If
| you're not keeping up with the latest versions you can end up
| somewhat stranded when a new resource feature needs to be used
| and you discover unexpected tech debt from the world moving out
| from under you.
| Too wrote:
| It's not that the TF code in itself is more complex, it's the
| resources you spin up using TF that are. They are stateful,
| expose network interfaces, require backups, updates, cert
| renewals and all other kinds of operational burden.
| jnsaff2 wrote:
| In Elixir land Oban[0] uses Postgres as queue and seems to work
| quite well.
|
| [0] - https://github.com/sorentwo/oban
| thibaut_barrere wrote:
| Confirmed, it works fairly nicely (although we've had to be a
| bit careful with very chatty jobs configurations with regard to
| DB overload).
| jpb0104 wrote:
| In the world of Ruby, GoodJob [0] has been doing a _good job_
| so far.
|
| [0] - https://github.com/bensheldon/good_job
| glenngillen wrote:
| And in Go you've got River: https://riverqueue.com/
| parthdesai wrote:
| FWIW, River was directly inspired by Oban:
| https://news.ycombinator.com/item?id=38351067
| nemo44x wrote:
| I'm still waiting for the "Postgres as a Web Browser" thread.
| nurettin wrote:
| Postgres rust rewrite will be a huge thread.
| jesperwe wrote:
| It has already started. https://docs.rs/pgx/latest/pgx/
|
| Not hard to see a path ahead of us similar to Linux's gradual
| steps towards Rust in the kernel.
| tpetry wrote:
| Postgres Rust extensions are a completely different topic
| from rewriting PG in Rust. The problem is that a crash in
| any extension will take down the whole PG worker, so any
| extension can wreak havoc on the robustness of your PG
| database. But with a memory-safe language like Rust, which
| forces you to handle all error cases, this can't happen
| anymore.
| RedShift1 wrote:
| Our scientists are not stopping to think if they should.
| falsandtru wrote:
| FDB, a distributed DB, would be superior if a total ordering is
| not required.
|
| https://apple.github.io/foundationdb/queues.html
| tomas789 wrote:
| It was used in my previous job and it worked wonders. It was an
| internal tool which stored lots of data but did not see high
| traffic, and as such the load was not an issue. In the world of
| that large corporation it was almost the only viable solution:
| the DB already existed, so it could be used. Other solutions
| would require "architecture" documents, security meetings, and
| perhaps even hiring someone to look after the new tech in the
| stack.
| diek wrote:
| Postgres is great as a queue, but this post doesn't really get
| into the features that differentiate it from just polling, say
| SQL Server for tasks.
|
| For me, the best features are:
|
|   * use LISTEN to be notified of rows that have changed that the
|     backend needs to take action on (so you're not actively
|     polling for new work)
|   * use NOTIFY from a trigger so all you need to do is
|     INSERT/UPDATE a table to send an event to listeners
|   * you can select using SKIP LOCKED (as the article points out)
|   * you can use partial indexes to efficiently select rows in a
|     particular state
|
| So when a backend worker wakes up, it can:
|
|   * LISTEN for changes to the active working set it cares about
|   * "select all things in status 'X'" (using a partial index
|     predicate, so it's not churning through low cardinality
|     'active' statuses)
|   * atomically update the status to 'processing' (using SKIP
|     LOCKED to avoid contention/lock escalation)
|   * do the work
|   * update to a new status (which another worker may trigger on)
|
| So you end up with a pretty decent state machine where each
| worker is responsible for transitioning units of work from status
| X to status Y, and it's getting that from the source of truth.
| You also usually want to have some sort of a per-task
| 'lease_expire' column so if a worker fails/goes away, other
| workers will pick up their task when they periodically scan for
| work.
|
| This works for millions of units of work an hour with a
| moderately spec'd database server, and if the alternative is
| setting up SQS/SNS/ActiveMQ/etc and then _still_ having to track
| status in the database/manage a dead-letter-queue, etc -- it's
| not a hard choice at all.
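|
| A minimal sketch of that claim/process/transition loop in
| Python/psycopg2 (table, column and status names are
| illustrative, do_work stands in for the real handler, and the
| UPDATE-with-subselect is just one way to do the atomic claim):
|
|     import time
|     import psycopg2
|
|     CLAIM_SQL = """
|         UPDATE task
|            SET status = 'processing',
|                lease_expire = now() + interval '5 minutes'
|          WHERE id = (
|                SELECT id FROM task
|                 WHERE status = 'new'  -- partial index on this predicate
|                 ORDER BY id
|                 LIMIT 1
|                 FOR UPDATE SKIP LOCKED  -- don't fight other workers
|          )
|          RETURNING id, payload
|     """
|
|     def run_worker(conn):
|         while True:
|             with conn, conn.cursor() as cur:
|                 cur.execute(CLAIM_SQL)
|                 row = cur.fetchone()
|             if row is None:
|                 time.sleep(1)  # or block on LISTEN instead of sleeping
|                 continue
|             task_id, payload = row
|             do_work(payload)  # hypothetical application function
|             with conn, conn.cursor() as cur:
|                 cur.execute(
|                     "UPDATE task SET status = 'done' WHERE id = %s",
|                     (task_id,),
|                 )
|
|     run_worker(psycopg2.connect("dbname=app"))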
| eddd-ddde wrote:
| There's actually an extension called "tcn", or trigger change
| notification, that provides such a trigger out of the box.
| foofie wrote:
| Excellent comment. Thank you for taking the time to write it.
| leontrolski wrote:
| Thanks for the comprehensive reply; does the following argument
| stand up at all? (Going on the assumption that LISTEN is one
| more concept, and one less concept is a good thing.)
|
| If I have, say, 50 workers polling the db, either it's quiet and
| there are no tasks to do - in which case I don't particularly
| care about the polling load. Or it's busy and when they query
| for work there's always a task ready to process - in this case
| the LISTEN is constantly pinging, which is equivalent to
| constantly polling and finding work.
|
| Regardless, is there a resource (blog or otherwise) you'd
| recommend for integrating LISTEN with the backend?
| diek wrote:
| In a large application you may have dozens of tables that
| different backends may be operating on. Each worker pool
| polling the tables it may be interested in every couple of
| seconds can add up, and it's really not necessary.
|
| Another factor is polling frequency and processing latency.
| All things equal, the delay from when a new task lands in a
| table to the time a backend is working on it should be as
| small as possible. Single digit milliseconds, ideally.
|
| A NOTIFY event is sent from the server-side as the
| transaction commits, and you can have a thread blocking
| waiting on that message to process it as soon as it arrives
| on the worker side.
|
| So with NOTIFY you reduce polling load and also reduce
| latency. The only time you need to actually query for tasks
| is to take over any expired leases, and since there is a
| 'lease_expire' column you know when that's going to happen so
| you don't have to continually check in.
|
| As far as documentation goes, I got a simple Java LISTEN/NOTIFY
| implementation working initially (2013?-ish) just from the
| excellent Postgres docs:
| https://www.postgresql.org/docs/current/sql-notify.html
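|
| For reference, a minimal blocking LISTEN loop in Python with
| psycopg2 looks roughly like this (the channel and handler names
| are made up; the shape follows psycopg2's async-notification
| example):
|
|     import select
|     import psycopg2
|     import psycopg2.extensions
|
|     conn = psycopg2.connect("dbname=app")
|     conn.set_isolation_level(
|         psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
|
|     cur = conn.cursor()
|     cur.execute("LISTEN task_created;")
|
|     while True:
|         # Block until the socket has data, or 60s passes -- the
|         # timeout doubles as the periodic lease-expiry check.
|         if select.select([conn], [], [], 60) == ([], [], []):
|             check_expired_leases()  # hypothetical fallback scan
|             continue
|         conn.poll()
|         while conn.notifies:
|             notify = conn.notifies.pop(0)
|             handle_task(notify.payload)  # payload is a string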
| jakjak123 wrote:
| Using the INSERT/UPDATEs is kind of limiting for your events.
| Usually you will want a richer event (higher-level information)
| than the raw structure of a single table, so use this feature
| very sparingly. Keep in mind that LISTEN should also ONLY be
| used to reduce the active polling; it is not a failsafe delivery
| system, and you will not get notified of things that happened
| while you were gone.
| diek wrote:
| For my use cases the aim is really to not deal with events,
| but deal with the rows in the tables themselves.
|
| Say you have a `thing` table, and backend workers that know
| how to process a `thing` in status 'new', put it in status
| 'pending' while it's being worked on, and when it's done put
| it in status 'active'.
|
| The only thing the backend needs to know is "thing id:7 is
| now in status:'new'", and it knows what to do from there.
|
| The way I generally build the backends, the first thing they
| do is LISTEN to the relevant channels they care about, then
| they can query/build whatever understanding they need for the
| current state. If the connection drops for whatever reason,
| you have to start from scratch with the new connection
| (LISTEN, rebuild state, etc).
| halfcat wrote:
| > Usually you will want richer event (higher level
| information) than the raw structure of a single table.
|
| JSONB fields in Postgres are pretty awesome for this. You can
| query the JSON fields, index them, and all that.
|
| Is that what you mean?
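|
| For example (illustrative names), a JSONB payload column can be
| filtered and indexed directly; a sketch with psycopg2:
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")
|     with conn, conn.cursor() as cur:
|         # A GIN index keeps containment queries on the payload fast.
|         cur.execute(
|             "CREATE INDEX IF NOT EXISTS task_payload_gin"
|             " ON task USING gin (payload)")
|         # Queued tasks whose payload contains {"kind": "email"}.
|         cur.execute(
|             "SELECT id, payload FROM task"
|             " WHERE status = 'new' AND payload @> %s::jsonb",
|             ('{"kind": "email"}',),
|         )
|         rows = cur.fetchall()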
| Deukhoofd wrote:
| > * use LISTEN to be notified of rows that have changed that
| the backend needs to take action on (so you're not actively
| polling for new work)
|
| > * use NOTIFY from a trigger so all you need to do is
| INSERT/UPDATE a table to send an event to listeners
|
| Could you explain how that is better than just setting up Event
| Notifications inside a trigger in SQL Server? Or for that
| matter just using the Event Notifications system as a queue.
|
| https://learn.microsoft.com/en-us/sql/relational-databases/s...
|
| > * you can select using SKIP LOCKED (as the article points
| out)
|
| SQL Server can do that as well, using the READPAST table hint.
|
| https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...
|
| > * you can use partial indexes to efficiently select rows in a
| particular state
|
| SQL Server has filtered indexes, are those not the same?
|
| https://learn.microsoft.com/en-us/sql/relational-databases/i...
| taspeotis wrote:
| I agree that SQL Server has similar functionality but Service
| Broker is pretty clunky compared to LISTEN.
| go_prodev wrote:
| Thanks for the links, I was wondering if SQL Server supports
| similar features.
| diek wrote:
| Admittedly I used SQL Server pretty heavily in the mid-to-
| late-2000s but haven't kept up with it in recent years so my
| dig may have been a little unfair.
|
| Agree on READPAST being similar to SKIP LOCKED, and filtered
| indexes are equivalent to partial indexes (I remember
| filtered indexes being in SQL Server 2008 when I used it).
|
| Reading through the docs on Event Notifications, they seem to
| be a little heavier and have different delivery semantics.
| Correct me if I'm wrong, but Event Notifications seem to be
| more similar to a consumable queue (where a consumer calling
| RECEIVE removes events from the queue), whereas LISTEN/NOTIFY
| is more pub/sub, where every client LISTENing on a channel
| gets every NOTIFY message.
| killingtime74 wrote:
| It's better because it's not SQL Server
| gavanm wrote:
| SQL Server has included a message queue capability for a while
| now. It's called SQL Server Service Broker:
|
| https://learn.microsoft.com/en-us/sql/database-engine/config...
|
| I haven't had the opportunity to use it in production yet - but
| it's worth keeping in mind.
|
| I've helped fix poor attempts at "table as queue" before - once
| you get the locking hints right, polling performs well enough
| for small volumes. From your list above, the only thing I
| can't recall there being in SQL Server is a LISTEN equivalent -
| but I'm not really an expert on it.
| EvanAnderson wrote:
| Came here to mention Service Broker. I've used it in
| production in multi-server configurations for a number of
| years. It works really well but it's terribly obscure. Nobody
| seems to know it's even there.
|
| The learning curve is steep and there are some easy anti-
| patterns you can fall into. Once you grok it, though, it
| really is very good.
|
| The LISTEN functionality is absolutely there. Your activation
| procedure is invoked by the server upon receipt of records
| into the queue. It's very slick. No polling at all.
| halfcat wrote:
| Also Azure is adding SQL Server trigger support
|
| https://learn.microsoft.com/en-us/azure/azure-
| functions/func...
| dagss wrote:
| This stuff has a latency measured in minutes though, limiting
| the use cases a lot.
| mulmen wrote:
| What happens when the backend worker dies while processing?
| 0cf8612b2e1e wrote:
| The usual way is to update the table with a timestamp when the
| task is taken. Have one periodic job which queries the table
| looking for tasks that have outlived the maximum allowed
| processing time and resets the status so the task is available
| to be requeued.
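|
| A minimal sketch of that reaper in Python/psycopg2 (names are
| illustrative); SKIP LOCKED keeps it from colliding with rows
| that live workers still hold locked:
|
|     REQUEUE_SQL = """
|         UPDATE task
|            SET status = 'new', taken_at = NULL
|          WHERE id IN (
|                SELECT id FROM task
|                 WHERE status = 'processing'
|                   AND taken_at < now() - interval '10 minutes'
|                 FOR UPDATE SKIP LOCKED
|          )
|     """
|
|     def requeue_stale_tasks(conn):
|         with conn, conn.cursor() as cur:
|             cur.execute(REQUEUE_SQL)
|             return cur.rowcount  # tasks handed back to the pool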
| Rapzid wrote:
| I use a generic subsystem modeled loosely after SQS and Golang
| River.
|
| I have a visible_at field which indicates when the "message"
| will show up in checkout commands. When checked out, or during a
| heartbeat from the worker, this gets bumped up by a certain
| amount of time.
|
| When a message is checked out, or re-checked out, a key (GUID)
| is generated and assigned. To delete the message, this key must
| match.
|
| A message can be checked out if it exists and the visible_at
| field is older than or equal to NOW().
|
| That's about it for semantics. Any further complexity, such as
| workflows and states, is modeled in higher-level services.
|
| If I felt it mattered for perf and was worth the effort, I might
| model this in a more append-only fashion, taking advantage of
| HOT updates and so on. Maybe partition the table by day and drop
| partitions older than the longest supported process. Use the
| sparse index to indicate deleted. Hard to say, though, with
| SSDs, HOT, and the new B-tree anti-split features.
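|
| A sketch of that checkout/delete pair in Python/psycopg2
| (visible_at and the key are as described above, everything else
| is illustrative; the locked-row skip is one way to handle
| concurrent checkouts):
|
|     import uuid
|
|     CHECKOUT_SQL = """
|         UPDATE message
|            SET visible_at = now() + interval '30 seconds',
|                checkout_key = %s
|          WHERE id = (
|                SELECT id FROM message
|                 WHERE visible_at <= now()
|                 ORDER BY visible_at
|                 LIMIT 1
|                 FOR UPDATE SKIP LOCKED
|          )
|          RETURNING id, body
|     """
|
|     def checkout(conn):
|         key = str(uuid.uuid4())
|         with conn, conn.cursor() as cur:
|             cur.execute(CHECKOUT_SQL, (key,))
|             return key, cur.fetchone()  # (key, None) if nothing visible
|
|     def delete(conn, msg_id, key):
|         with conn, conn.cursor() as cur:
|             cur.execute(
|                 "DELETE FROM message WHERE id = %s AND checkout_key = %s",
|                 (msg_id, key))
|             return cur.rowcount == 1  # False: lease expired, re-taken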
| osigurdson wrote:
| It is all about numbers. What latency is acceptable in this
| hypothetical? How many workers are there? I'm not sure how long
| x is in "time.sleep(x)", but unless it's very long this will use
| one connection per worker. That is probably fine in this
| situation, but it isn't fine for all projects.
| marcusbuffett wrote:
| I looked at some solutions that were more specifically meant for
| queues, for my site, but decided to keep the infra simple by
| using my Postgres DB for the worker queues. There are 10,000 jobs
| a day; at that rate there's virtually zero additional load. It runs
| fast, I know exactly how it works, can extend it easily, local
| dev doesn't get any more complicated, and there's not another
| point of failure in the form of a new piece of infra.
|
| If I get millions of users I'll swap it out, in the meantime it
| took like a day to implement and I haven't looked back.
| brycelarkin wrote:
| Big fan of Graphile Worker to handle this job.
| https://github.com/graphile/worker
| jawngee wrote:
| +1 graphile worker. we use it to coordinate all of our
| background tasks and events (use slayQ https://slayq-
| docs.vercel.app which is kind of a wrapper around it)
| andrew-v wrote:
| SKIP LOCKED is a good idea to make sure that the tasks can be
| processed independently by the workers, but what about a case
| where you have to guarantee the order of events?
|
| I guess this Postgres queue is a very handy tool for cases like
| background jobs, but will it work with event sourcing?
| ffsm8 wrote:
| You're responsible for writing the query that selects the tasks
| to process, so yes? You can obviously group by a common id,
| order by inserted and use limit 1. And if every event has to be
| processed in order, then you've created a system that can't be
| parallel, so you'll be effectively limited to a single worker.
| dkersten wrote:
| You can have multiple workers, but they all need to process
| all events. So it's not suitable for parallelisation of work
| (how can it be, if total ordering is required), but it can be
| used where different workers consume the queue in different
| ways (eg for synchronising different services).
| Rapzid wrote:
| Order by insert time on some group key (or otherwise figure out
| the order) and select the top item. Then select that item FOR
| UPDATE with SKIP LOCKED. If it's locked you will get zero
| results.
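|
| A sketch of that per-group ordering in one query (column names
| are illustrative, and it collapses the two steps above into a
| single statement): only the oldest unprocessed event of each
| group qualifies, so a group whose head is locked simply yields
| nothing to other workers.
|
|     CLAIM_NEXT_SQL = """
|         SELECT id, group_id, payload
|           FROM event e
|          WHERE NOT processed
|            AND inserted_at = (
|                SELECT min(inserted_at) FROM event
|                 WHERE group_id = e.group_id AND NOT processed
|            )
|          ORDER BY inserted_at
|          LIMIT 1
|          FOR UPDATE SKIP LOCKED
|     """
|
|     def claim_next(conn):
|         # The caller should process the row and commit while this
|         # transaction is still open, so the row stays locked.
|         with conn.cursor() as cur:
|             cur.execute(CLAIM_NEXT_SQL)
|             return cur.fetchone()  # None when all groups are busy/empty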
| rubenfiszel wrote:
| If you need a job queue on Postgres, https://windmill.dev
| provides an all-integrated developer platform with a Pg queue at
| its core that supports jobs defined in Python/TypeScript/SQL.
| BozeWolf wrote:
| Came here to say this as well. Using it in a side project and
| it works well. Chose windmill because it allows me to upgrade
| to other backends when necessary.
| mattbillenstein wrote:
| I wrote one of these not long ago:
| https://github.com/mattbillenstein/pg-queue
| tuananh wrote:
| There's also Riverqueue https://github.com/riverqueue/river
| stavros wrote:
| There's a Dramatiq backend to do this
| (https://pypi.org/project/dramatiq-pg/) but I don't think it has
| seen as wide adoption as it deserves. I haven't tried it, but I
| don't see why it wouldn't work well.
| bbkane wrote:
| Do you know if there's a MySQL dramatiq backend?
| cpursley wrote:
| Related post from a few days ago:
|
| "Postgres Is Enough":
| https://news.ycombinator.com/item?id=39273954
|
| Queue section in the gist:
| https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...
| jmmv wrote:
| I'm also using PostgreSQL as a queue of tasks for a side project
| and it has been OK---although it barely has enough load for me
| to say whether it's "really OK" or whether it will turn out to
| be a problem in the future. I'm convinced I can go this way for
| a really long time without having to rearchitect the system or
| add more bloat to it.
|
| Anyhow, in my case, PostgreSQL is just there to persist task
| state, and the database does very little magic. In fact, I also
| have a SQLite-backed implementation that allows service unit
| tests to be fast. Most of the work around managing such a queue
| is in-process (and in Rust) because I wanted to keep deployments
| and operations simple. If interested, I wrote this thing a few
| months ago on how it works:
| https://jmmv.dev/2023/06/iii-iv-task-queue.html
| throw__away7391 wrote:
| I am always surprised at how infrequently logical replication is
| mentioned when people are trying to implement these types of
| systems. I've used Debezium with some success in the past, but
| recently implemented my own application-level reader to
| facilitate low-latency event sourcing projections where I didn't
| want to involve Kafka.
|
| This seems to be quite a good solution in many scenarios that
| typically involve undesirable performance tradeoffs: it doesn't
| introduce polling or much additional load on your DB, it allows
| better isolation of the consumer workload, and it allows simple
| recovery if the consumer restarts or gets disconnected (e.g. you
| pick up at the LSN you left off at and aren't worried that you
| missed a NOTIFY while offline). Watching the status of the
| replication slot also gives you a nice external metric to
| monitor the health of the consumer.
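|
| For what it's worth, a minimal reader of this kind can be built
| on psycopg2's replication support (slot name, plugin and the
| project() handler are illustrative; the shape follows psycopg2's
| documented replication example):
|
|     import psycopg2
|     from psycopg2.extras import LogicalReplicationConnection
|
|     conn = psycopg2.connect(
|         "dbname=app", connection_factory=LogicalReplicationConnection)
|     cur = conn.cursor()
|
|     # The slot remembers the LSN you have confirmed, so a restarted
|     # consumer resumes where it left off instead of missing events.
|     cur.create_replication_slot("events_slot",
|                                 output_plugin="test_decoding")
|     cur.start_replication(slot_name="events_slot", decode=True)
|
|     def consume(msg):
|         project(msg.payload)  # update the projection from the change
|         msg.cursor.send_feedback(flush_lsn=msg.data_start)
|
|     cur.consume_stream(consume)  # blocks, streaming committed changes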
| bodantogat wrote:
| I like this approach, especially for systems that don't
| immediately require scaling to millions of users. I use MySQL
| with the 'SKIP LOCKED' option for my queues. I recognize that
| this method may not scale indefinitely, but it negates the need
| to add new infrastructure across all environments, saving on
| costs and devops effort. I've intentionally designed the event
| sending and receiving code to be abstract, ensuring that if the
| need to switch systems arises, it should be a reasonably
| manageable task.
| swaptr wrote:
| I think it is a great idea if scaling is not an issue. I have
| been using the Postgres backend with Celery on my side projects
| and it works really well. Really handy if you wish to start off
| quickly, and when you need to scale, all you need to do is
| replace the backend/broker.
| throwawaaarrgh wrote:
| The idea that "performance" is the big problem is the same as
| saying "performance" is why you don't use an IMAP server as a
| queue. Hey, you've already got an email server _somewhere_, and
| that means it's simpler, so that means it's a good idea. Based
| on the logic in this article you should be using your email
| server as your application's general-purpose queue.
|
| I feel so bad for the poor engineers who will believe this crap
| and later regret it. If you're going to advocate doing something
| like this, you should be forced to be honest and explain all the
| reasons it's a bad idea. The fact is that people are lying about
| the problems of using Postgres as a queue, because they just
| don't want to believe that the world is really more complex than
| that.
| sseagull wrote:
| But you aren't storing application data in the IMAP server. I
| hope.
|
| The idea (for me anyway) is that if you have a postgres db with
| data, and need a queue for tasks dealing with that data,
| building it into the existing db can make more sense than
| adding additional infrastructure and complexity. Especially if
| you are familiar with postgres already.
|
| If all you need is a queue, then I agree that dedicated queue
| software will be better.
| BrickTamblan wrote:
| I'm not sure if this is the same as pg queues but I just started
| using pg_notify with channels, and it's so good.
|
| The row update happens through the normal Express API, but at
| the end I call pg_notify on some channel name. I pick up the
| message in a new thread via polling, and perform aggregate
| queries into an agg table. It's like materialized views, but
| with no wasteful refreshes. Then I push the updated aggs back to
| the frontend via websockets.
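|
| The sending side of that pattern is tiny; a sketch with psycopg2
| (table, channel and payload are illustrative, and it assumes an
| open connection `conn`):
|
|     with conn, conn.cursor() as cur:
|         cur.execute(
|             "UPDATE orders SET total = %s WHERE id = %s",
|             (total, order_id))
|         # Delivered when the transaction commits; listeners get the id.
|         cur.execute(
|             "SELECT pg_notify('orders_changed', %s)", (str(order_id),))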
| 698969 wrote:
| Does the advice only apply to self-hosted/bare-metal VMs? Would
| sharing our Azure Flexible Postgres for a queue be a bad idea?
| KronisLV wrote:
| Offtopic, but I added 1rem of left/right padding for paragraph
| and header elements in the stylesheet and personally it feels
| more readable that way.
___________________________________________________________________
(page generated 2024-02-10 23:01 UTC)