[HN Gopher] Do you really need Redis? How to get away with just ...
___________________________________________________________________
Do you really need Redis? How to get away with just PostgreSQL
Author : hyzyla
Score : 689 points
Date : 2021-06-12 06:57 UTC (16 hours ago)
(HTM) web link (spin.atomicobject.com)
(TXT) w3m dump (spin.atomicobject.com)
| markus_zhang wrote:
| I think it more or less depends on the mindset of the developer.
| For example, from my observation, developers from telecom or
| other big companies tend to turn every program into a "solution",
| i.e. it must have the full weaponry and other bells and whistles.
| Me? I'm the opposite and won't add anything or fix anything
| unless it is really useful.
| manishsharan wrote:
| Sometimes we need network-based data structures other than
| relational tables. Redis delivers quite a few of those.
| You can always implement them in a SQL database, but Redis is
| just a better solution for those cases.
| beforeolives wrote:
| Can anyone explain the need for redis in a clear way for someone
| who knows how databases work but isn't a backend developer? What
| alternatives are there? What did people do to solve the same
| problem before redis existed?
| mixmastamyk wrote:
| Generally used for caching expensive lookups, including job
| queues, and other nifty patterns. Before that memcached was
| most popular, but it didn't save to disk, where redis
| eventually does. Before that, it was roll your own or ram disk.
| snarkypixel wrote:
| ELI5:
|
| Redis is really a mixed bag of many useful commands, see
| [https://redis.io/commands]. It has been referred to as the
| "swiss army knife" because you can build many custom solutions
| from these building blocks. Still, its most common use-case is
| for in-memory key/value caching for performance reasons. For
| instance, if you have one database query that takes a
| particularly long time, you can execute it once and store the
| result in redis, and then retrieve that value extremely fast.
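The cache-aside pattern described above can be sketched in a few lines. This is a hypothetical illustration: a plain dict stands in for the Redis client, and `slow_query` for the expensive database call; with a real client you would use GET/SET (plus a TTL) instead.

```python
import time

# Stand-in for a Redis client: a plain dict with get/set-style use.
# A real deployment would use e.g. redis-py and SET with a TTL.
cache = {}

def slow_query(user_id):
    """Pretend this is an expensive database query."""
    time.sleep(0.01)  # simulate query latency
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    """Cache-aside: check the cache first, fall back to the database."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]          # fast path: cache hit
    value = slow_query(user_id)    # slow path: run the query once...
    cache[key] = value             # ...and remember the result
    return value
```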
| ibraheemdev wrote:
| Okay, but why? Redis is perfect for job queues and pub/sub. Use
| the right tool for the job.
| encoderer wrote:
| Just because one can does not mean one should.
| fasteo wrote:
| For use case 1 (job queue) I can only recommend beanstalkd[1].
| Simple plain-text TCP protocol. Install and forget. Plus you get
| some really useful features like deferred jobs, burying, kicking,
| job priorities, etc.
|
| We have literally processed tens of billions of jobs without a
| single failure.
|
| Old and simple technology that just works
|
| [1] https://beanstalkd.github.io/
| mianos wrote:
| It is funny, every time I have used beanstalk over the years I
| have put it in thinking I'll replace it later. I never have.
| Multiple startups I have left years ago are still running it,
| untouched years later. No one mentions it probably because it
| didn't do anything but run. I wish the same could be said for
| RabbitMQ.
| chx wrote:
| You really don't need anything fancy to implement a queue using
| SQL. You need a table with a primary id and a "status" field. An
| "expired" field can be used instead of the "status". We used the
| latter because it allows easy retries.
|
| 1. SELECT item_id WHERE expire = 0. If this is empty, no items
| are available.
|
| 2. UPDATE SET expire = some_future_time WHERE item_id =
| $selected_item_id AND expire = 0. Then check whether UPDATE
| affected any rows. If it did, item_id is yours. If not, loop. If
| the database has a sane optimizer it'll note at most one document
| needs locking as the primary id is given.
|
| All this needs is a very weak property: document level atomic
| UPDATE which can return whether it changed anything. (How weak?
| MongoDB could do that in 2009.)
|
| Source code at
| https://git.drupalcode.org/project/drupal/-/blob/9.2.x/core/...
| (We cooked this up for Drupal in 2009 but I am reasonably sure we
| didn't invent anything new.)
|
| Of course, this is not the fastest job queue there is but it is
| quite often good enough.
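A minimal runnable sketch of the claim loop described above, with sqlite3 standing in for the SQL database. The table and column names are illustrative; the linked Drupal implementation differs in detail.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE queue (
    item_id INTEGER PRIMARY KEY,
    expire  INTEGER NOT NULL DEFAULT 0,
    payload TEXT)""")
conn.executemany("INSERT INTO queue (payload) VALUES (?)", [("a",), ("b",)])

def claim_item(conn, lease_seconds=30):
    """Claim one item: find a candidate, then atomically take it.
    The UPDATE's rowcount tells us whether we won the race."""
    while True:
        row = conn.execute(
            "SELECT item_id FROM queue WHERE expire = 0 "
            "ORDER BY item_id LIMIT 1").fetchone()
        if row is None:
            return None  # no items available
        item_id = row[0]
        cur = conn.execute(
            "UPDATE queue SET expire = ? WHERE item_id = ? AND expire = 0",
            (int(time.time()) + lease_seconds, item_id))
        if cur.rowcount == 1:
            return item_id  # the item is ours until the lease expires
        # someone else claimed it first; loop and try the next candidate
```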
| ryanthedev wrote:
| I believe something like this would have issues with multiple
| consumers.
| viraptor wrote:
| This sounds dangerous. Without some good way to prevent workers
| stepping on each other, that will result in lots of useless
| queries and failed updates if a lot of jobs get scheduled at
| the same time.
|
| In the Drupal case, the database returns a single row which is
| pretty much guaranteed to be the same for all workers in
| between updates. You really don't want that in the same
| database your page views hit. At least selecting N rows at a
| time and then claiming a random item from them would be
| slightly better.
| viraptor wrote:
| I find the downvotes weird without any explanation why the
| raised issue doesn't sound important.
| zapstar wrote:
| I have done something similar to this to implement queues in
| Postgres. And at quite a large scale with lots of workers all
| pulling items off the queue at once.
|
| One huge advantage over, say, SQS, is that you also get easy
| visibility! You can query and see what's been completed, what's
| still not worked on, etc.!
|
| Oh, and do you want a priority queue? Just add a priority field
| and sort before picking an item! Do you need to change the
| priority of an item while it's already in the queue? Go right
| ahead. Do you want to enforce some constraints so that
| duplicates don't get added to the queue? It's a database -- add
| the constraints.
|
| If you're already using a relational database, and now you need
| a queue, start with this approach. Odds are, it'll work just
| fine, and it'll give you the perks above.
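A hedged sketch of the priority and dedup points above, again with sqlite3 standing in for the database and invented table/column names: a UNIQUE constraint rejects duplicates, a priority column drives the pick order, and re-prioritizing an already-queued item is a plain UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE jobs (
    id       INTEGER PRIMARY KEY,
    task     TEXT UNIQUE,               -- the constraint that blocks duplicates
    priority INTEGER NOT NULL DEFAULT 0,
    done     INTEGER NOT NULL DEFAULT 0)""")

def enqueue(task, priority=0):
    """Returns False instead of inserting a duplicate."""
    try:
        conn.execute("INSERT INTO jobs (task, priority) VALUES (?, ?)",
                     (task, priority))
        return True
    except sqlite3.IntegrityError:
        return False

def pick():
    """Highest priority first; ties broken by insertion order."""
    row = conn.execute(
        "SELECT id, task FROM jobs WHERE done = 0 "
        "ORDER BY priority DESC, id LIMIT 1").fetchone()
    if row:
        conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (row[0],))
    return row[1] if row else None

enqueue("send-email")
enqueue("rebuild-index", priority=5)
# Change the priority of an item that's already queued:
conn.execute("UPDATE jobs SET priority = 9 WHERE task = 'send-email'")
```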
| GordonS wrote:
| Interesting! What kind of scale did you run it at?
|
| Also keen to know if you saw any disadvantages with this
| approach?
| cerved wrote:
| no you don't but it's a lot better to do an atomic
| select/update IMHO.
|
| Basically you have status = 0,1,2 for queued, processing, done.
| You have a filtering CTE that returns the queued rows, sort by
| id and update status = status +1 of the top one.
|
| You can even get fancy and disallow updates on the status
| column for other users to make it immutable.
|
| I've done this in SQL Server in a very similar way to
| what's explained in the post.
| Twisol wrote:
| For other readers, the `UPDATE` step is an exact analogue of the
| "compare-and-set" atomic instruction [0]. It's really cool to
| see how you've realized it in SQL!
|
| As a capability, compare-and-swap has an infinite consensus
| number [1], meaning it's sufficient to implement wait-free
| consensus algorithms with an arbitrary number of participants.
| That makes it a perfect fit for managing a scalable pool of
| workers that need to coordinate on consuming from a queue!
|
| [0] https://en.wikipedia.org/wiki/Compare-and-swap
|
| [1]
| https://en.wikipedia.org/wiki/Consensus_(computer_science)#C...
| chx wrote:
| Yes, the UPDATE command is the exact equivalent of LOCK
| CMPXCHG (the SELECT can be seen as computing the memory
| address). So the discussion about that in the comments of
| https://stackoverflow.com/a/59022356/308851 totally applies:
| if two queue-runner threads pick the same item, exactly one
| will succeed, so it can't happen that both keep retrying the
| same item. And there's no busy wait (reminder: a busy wait is
| where a thread does nothing but test a condition); each thread
| just goes over the candidates until one claim succeeds.
| abhishekjha wrote:
| I have been asked in interviews how to prevent two
| people from booking the same seat. Interviewers don't
| seem satisfied by the answer that I will let the query
| go to the database and see who ends up booking. They
| want some advanced locking mechanism using Redis, which
| can serialize the two bookings by executing a Lua script
| atomically.
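Letting the database pick the winner needs no extra locking layer: a single conditional UPDATE decides. A runnable sketch with sqlite3 and an invented `seats` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (seat TEXT PRIMARY KEY, booked_by TEXT)")
conn.execute("INSERT INTO seats (seat) VALUES ('12A')")

def book(seat, passenger):
    """Only one UPDATE can flip booked_by from NULL to a name;
    the database serializes the writes and picks the winner."""
    cur = conn.execute(
        "UPDATE seats SET booked_by = ? WHERE seat = ? AND booked_by IS NULL",
        (passenger, seat))
    return cur.rowcount == 1  # True only for the winner
```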
| kccqzy wrote:
| I had a similar question asked of me in an interview. The
| interviewer was also surprised by my solution of letting
| the database decide the winner.
| bryanrasmussen wrote:
| based on experience arriving very last minute to flight,
| nobody has problem with letting two people book the same
| seat.
| chx wrote:
| mmmm no
|
| They allow overbooking the flight but you can't book the
| same seat twice.
|
| Overbooking can make financial sense -- even if you need
| to hand out a little compensation from time to time, it's
| still better than flying with empty seats. Of course it
| sucks big time for the travelers, especially if there are
| no volunteers; involuntary denied boarding (IDB) is miserable.
| https://www.transportation.gov/individuals/aviation-
| consumer...
| mianos wrote:
| They obviously don't have the transaction semantics
| worked out. I have been at the gate and the person in
| front of me had the same seat number. They read it out.
| When I presented my pass, the agent politely took my
| ticket and printed another one. If the person hadn't been
| right in front of me, I would never have known why. It is
| not unusual.
| kyleee wrote:
| Let the check-in attendant handle it; that's what I call
| a no-code solution! You're hired
| Tostino wrote:
| That's solid middle management thinking right there!
| agf wrote:
| I think this being a good answer hinges on the resulting
| user experience. If one person spends several minutes
| going through the process of putting in their payment and
| other info, only to be told they don't have a ticket,
| they're going to be pissed. So you need a good answer on
| how to avoid that experience, however you implement the
| locking.
|
| Saying this as someone who has been on both sides of that
| interview question, and also evaluated performance in it
| as a hiring manager.
| lstamour wrote:
| Send bookings to the database early, with an expiry time
| and a counter visible on the screen?
|
| Thing is, because purchasing is often shared across
| third-parties, even then you can't guarantee that you'll
| have the reservation until it's paid and entered into the
| shared booking backend... unless the third party backend
| has a locking mechanism, at which point you're probably
| not using your own Postgres to do this, but their
| mainframe instead.
| spockz wrote:
| Is there any other solution other than going full STM on
| this?
| halifaxbeard wrote:
| Postgres's transactional semantics are really useful when
| building a queue, because of how it interacts with the
| various pieces.
|
| Connection 1:
|     LISTEN job_update;
|
| Connection 2:
|     BEGIN;
|     INSERT INTO jobs (id, ...) VALUES ('a-uuid', ...);
|     SELECT PG_NOTIFY('job_update', 'json blob containing
|       uuid and state change info');
|     COMMIT;
|
| Connection 3 (used when Connection 1 is notified):
|     BEGIN;
|     SELECT id, ... FROM jobs WHERE id = 'a-uuid'
|       FOR UPDATE SKIP LOCKED;
|     UPDATE jobs SET state = 'step1_completed'
|       WHERE id = 'a-uuid';
|     SELECT PG_NOTIFY('job_update', 'json blob containing
|       uuid and state change info');
|     -- do the thing here: computation, calling an
|     -- external API, etc. If it fails, ROLLBACK.
|     COMMIT;
|
| Because notify has transactional semantics, the notify only
| goes out at transaction commit time. You want to use a
| dedicated connection for the notify.
|
| The only downsides I can immediately think of: every
| worker will contend to lock that row, and you'll need
| periodic jobs to clean up and retry failures.
| jayd16 wrote:
| A single Rabbit node seems easier than rolling this yourself
| and doing all the work to test it.
| jasonwatkinspdx wrote:
| There's off the shelf libraries to do this in nearly every
| language.
|
| It's always about tradeoffs in context, but I have seen
| plenty of instances of someone setting up a rabbit cluster
| for something that could be done trivially in their existing
| db cluster via the above.
| z77dj3kl wrote:
| It's a tradeoff between software and infra complexity.
|
| I would argue that if you've built a system up from scratch,
| this is much easier to debug and maintain than a foreign
| piece of software. Rabbit is just way overkill if you have a
| couple of jobs per hour, for example.
| solipsism wrote:
| But if we always do what's easiest, then we will make a bunch
| of horrible choices.
|
| Other things that matter, often more than what's easiest:
|
| * Flexibility. If you aren't 100% sure what you'll need, but
| need something in place today, it can make sense to choose an
| unopinionated, generic solution.
|
| * Familiarity. Maybe you've never used Rabbit.
|
| * Ops simplicity. Maybe you already have a bunch of postgres
| deployed. Adding a new type of system means you need new ways
| of monitoring, new dependencies, different deployment.
| throwaway894345 wrote:
| Of these, I think the last is the most compelling. Why
| introduce something new if you already have all of the
| infra and tooling configured for managing postgres? The
| parent mentioned testing the postgres implementation, but
| that seems like quite a lot less work than configuring the
| infra-as-code, monitoring, alerting, etc required for a new
| type of thing.
| joelbluminator wrote:
| So you need to keep polling the db?
| lolinder wrote:
| Postgres has LISTEN/NOTIFY for this.
| ukd1 wrote:
| You really, really want to do everything (locking, working,
| unlocking, deleting or completing) in a transaction, and use
| SKIP LOCKED to find work. It's actually pretty high-performance
| when you do this.
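The transactional shape being described might look like the sketch below. `FOR UPDATE SKIP LOCKED` is Postgres-specific, so it appears only as a comment; sqlite3 stands in to show how claim, work, and completion commit or roll back together.

```python
import sqlite3

# In Postgres the whole claim would be:
#   BEGIN;
#   SELECT id, payload FROM jobs WHERE done = 0
#     ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1;
#   -- ... do the work ...
#   UPDATE jobs SET done = 1 WHERE id = ?;
#   COMMIT;  -- ROLLBACK on failure releases the row for another worker

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, "
             "payload TEXT, done INTEGER NOT NULL DEFAULT 0)")
conn.execute("INSERT INTO jobs (payload) VALUES ('send-invoice')")

def run_one(work):
    """Claim, work, and complete inside one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        row = conn.execute("SELECT id, payload FROM jobs "
                           "WHERE done = 0 ORDER BY id LIMIT 1").fetchone()
        if row is None:
            conn.execute("ROLLBACK")
            return None
        work(row[1])  # if this raises, the claim rolls back too
        conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise
```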
| phumberdroz wrote:
| do you have a more detailed example of how this could
| look?
| subleq wrote:
| https://github.com/gavinwahl/django-postgres-queue
| taspeotis wrote:
| The article starts by discussing SKIP LOCKED
| habibur wrote:
| Right. And this article reminds me of people trying to use a
| SQL database like a filesystem. "Lock file before update
| otherwise it will get overwritten".
|
| Don't use a record lock as if it were an attribute of the
| row. Add a column called "status" to the table and use it.
| 0x0nyandesu wrote:
| I recently switched from rolling my own table to SQS.
|
| The main issue is making a job-queue dispatcher and
| runner, which most people aren't familiar with.
| kalleboo wrote:
| The way we did it (in MYSQL) was "UPDATE tasks SET
| processid=<unique id for the queue consumer task> WHERE
| processid IS NULL ORDER BY taskid LIMIT 1" and then look for
| processid to find your task. I guess LIMIT on UPDATE might
| be a MySQL extension? Then if your task crashes or dies, you
| can look for invalid processids and do cleanup.
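The MySQL pattern above can be approximated portably with a subquery instead of `UPDATE ... LIMIT`. A sketch with sqlite3 and illustrative names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, processid TEXT)")
for _ in range(3):
    conn.execute("INSERT INTO tasks DEFAULT VALUES")

def claim(worker_id):
    """Tag the oldest unclaimed task with our id, then read it back.
    UPDATE ... LIMIT is a MySQL extension; a subquery works everywhere."""
    conn.execute(
        """UPDATE tasks SET processid = ?
           WHERE taskid = (SELECT taskid FROM tasks
                           WHERE processid IS NULL
                           ORDER BY taskid LIMIT 1)""",
        (worker_id,))
    row = conn.execute("SELECT taskid FROM tasks WHERE processid = ?",
                       (worker_id,)).fetchone()
    return row[0] if row else None
```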
| chx wrote:
| Our initial version did that too, but generating a unique id
| was removed soon after: we had expire in there anyway for
| retries, as I mentioned, and that is enough.
| hughrr wrote:
| Oh please stop using databases as queues. I spent a
| disproportionate amount of time in the last 20 years undoing
| that decision. It doesn't scale at all well.
| blowski wrote:
| You're dealing with survivorship bias - those companies whose
| products were successful enough that they needed your
| services. You don't speak to the teams who went bankrupt
| because they spent too much time fiddling with RabbitMQ, when
| Postgres would have been fine.
|
| As always, there are trade offs, no silver bullets.
| paxys wrote:
| You are wayyy overestimating how complex something like
| RabbitMQ or Redis is. You don't need to hire entire teams
| of engineers or go bankrupt setting up an instance. It is
| less work than implementing a production-level queue in
| Postgres for sure.
| zomgwat wrote:
| I agree. I'll also add that Redis is a level easier to
| operate than RabbitMQ.
| aetherson wrote:
| Have worked at multiple companies that successfully used
| redis or rabbit with teams of less than five engineers.
|
| It's a little insane that a highly rated thread on HN is
| telling people to use postgres as their queuing solution.
| The world is wide; I'm sure that somewhere out there
| there is a situation where using postgres as a queue
| makes sense, but in 99% of all cases, this is a terrible
| idea.
|
| Also, SQS and nsq are simple.
| politician wrote:
| If your data storage pattern involves a database write
| followed by a message queue insert, then the ability to
| wrap those in a transaction can be a good trade-off to
| avoid consistency failures between the writes.
|
| Avoid consuming that queue directly though -- which is
| probably what you're thinking when saying this is a dumb
| idea and I tend to agree. Typically, you want to have a
| worker that loads entries into a more optimal queue for
| your application.
|
| Bottom line though, there is not a single best "queue"
| product. There are lots of queues offering wildly
| different semantics that directly impact use cases.
| GordonS wrote:
| For years I've been using RabbitMQ in small teams, and me
| as a one-man team too.
|
| As long as you don't need clustering (even a single node
| can handle some pretty heavy load), it's actually
| _really_ simple to setup and use - way easier than
| Postgres itself, for example.
|
| My biggest beefs with it have historically been the
| Erlang-style config files (which are basically
| unreadable), and the ridiculous spiel of error messages
| you get if you have an invalid configuration. But
| thankfully RabbitMQ switched to a much simpler config
| file format one or two years back, and I understand the
| Erlang/OTP team is working on better error messages and
| stack traces.
| fauigerzigerk wrote:
| The issue is not necessarily the complexity of RabbitMQ
| or Redis. The complexity comes from having to manage
| another (stateful) process that has to be available at
| all times.
| hughrr wrote:
| Yes. Pay Amazon to do it and use all that saved time
| to add business value instead.
|
| They'll also manage the consumers of the queue and scale
| them too! Serverless is bliss.
| mbreese wrote:
| "Production-level" means different things to different
| groups. Many (most?) groups don't operate at the scale
| where they need a specialized queue in a dedicated
| message broker. Simple queues in a DB will work fine.
| Even if it isn't very complex, why not use the
| infrastructure you probably already have set up -- an
| RDBMS?
|
| Now, if you're using a nosql approach for data storage,
| then you already know your answer.
| gravypod wrote:
| My main concern would be monitoring. Most queue systems
| connect to alerting systems and can page you if you
| suddenly stop processing things or retry the same
| query many, many times. For a DB, since the scope of
| access is much larger, you don't get these sort of
| guarantees for access patterns and you essentially need
| to reinvent the wheel for monitoring.
|
| All to save the 2 to 3 hours of Googling for the best
| queue for your use case and finding a library for your
| language.
|
| It makes sense if you don't care about reliability and
| just need something easy for many many people to deploy
| (ex: Drupal).
| bcrosby95 wrote:
| We use pingdom that hits a page that gives it the health
| of various systems. Queue included.
|
| > All to save 2 to 3 hr of Googling for the best queue
| for your use case and finding a library for your
| language.
|
| The cost of using a new piece of tech in production is
| not just 2 or 3 hours.
| gravypod wrote:
| If you're at the scale where your postgres db can be used
| as a queue and no one on your team has experience running
| these systems (most startups), then pretty much anything
| will work to begin with and as long as you have a clear
| interface that separates your code from your deps for the
| queue it'll be easy to swap out.
|
| At $JOB-1 I wrote a `Queue` and `QueueWorker` abstraction
| that used an environment variable to switch between
| different queue backends while providing the same
| interface. Because of this I got everyone up and running
| with Redis lists as a queue backend and then could add in
| things like MQTT or RabbitMQ as things evolved. I also
| defined very loose constraints for the queue interface
| that kept things simple for the implementer. Essentially there was a
| `push()` which added something into the queue or failed
| and returned an error. Then there was an `onWork()` which
| was called whenever there was work "at least once"
| meaning your system had to handle multiple instances
| being delivered the same work item. We would only ack the
| queue message after `onWork()` completed successfully.
|
| There's not really anything preventing a team from doing
| this, putting a pin into it, and coming back to it when
| there's a scalability or reliability concern.
| cortesoft wrote:
| Sure, but I also feel like this comment is based on an
| assumption that everyone works at startups. A lot of us
| work for larger, already established companies that are
| past that will we or won't we survival stage. We are trying
| to design systems to solve needs for already validated
| products.
|
| It might seem like everyone works for a startup, but big
| companies have a lot of engineers, and we still have to
| solve new problems there even though the company is not
| new.
| robwwilliams wrote:
| Great comment: Been reading Taleb's Black Swan this morning
| on ascertainment bias, so this resonated well.
| skytreader wrote:
| > You're dealing with survivorship bias - those companies
| whose products were successful enough that they needed your
| services
|
| In fairness, Redis has only been available for 12 years so
| someone who has been in the industry longer has probably
| encountered systems using DBs as queues for no better
| reason than an alternative was not available when the
| system was made. (Rabbit just a couple years older I
| think.)
|
| But in this day and age, you have better options to start
| with c'mon.
|
| > they spent too much time fiddling with RabbitMQ, when
| Postgres would have been fine.
|
| Rather ironic claim in a thread full of tips, tricks, and
| gotchas just to make PG behave like a queue no?
|
| I can't speak for RabbitMQ personally but Redis has got to
| be among the simplest drop-in dependencies I've ever
| encountered. For a simple queue, the defaults work out of
| the box.
|
| Gut feel also says Redis's defaults for queuing would
| serve you further than PG's defaults, so you could leave
| it untouched for far longer to focus on your product, get
| to profitability, etc. Feel free to debunk with data.
| boardwaalk wrote:
| You (or whoever is doing ops) will have to pay with your
| time to operate Redis until the product dies though.
|
| I don't think adding another component (especially one
| with storage/backup requirements/more complex failover
| procedures) should be taken lightly.
|
| Paying attention to the tips/tricks/gotchas is something
| you'd pay once and hopefully document in source
| code/documentation.
|
| If you piggyback off someone else's work (another team at
| your job, a cloud-based solution that handles everything,
| etc) the calculus would change, of course.
| skytreader wrote:
| > You (or whoever is doing ops) will have to pay with
| your time to operate Redis until the product dies though.
|
| Repeating my other comment: do you mean to say using an
| in-DB queue will not need special monitoring other than
| what's already standard metrics for DBs?
|
| Since a queue is a different structure, I would want to
| know metrics that might be unfamiliar for standard DB
| monitoring. Queue length, queue throughput, etc. Can I
| get that from my DB queue with negligible cost?
|
| Another thing, there might be standard practices and
| patterns for in-DB queues, often encapsulated in a nice
| library/middleware. But as _databases are not queues_
| there is no assurance from the vendor that those patterns
| will remain applicable in future versions.
|
| > Paying attention to the tips/tricks/gotchas is
| something you'd pay once
|
| Actually, I disagree with this. Every time someone
| connects to that database, they have to remember that one
| special table which is used as a queue, and remember to
| tread differently where it is concerned.
|
| Setting up a queue? You do that once and everyone who
| interacts with it will not forget it is a queue. Heck you
| can't even do your usual DB operations there. How can you
| not remember it is a queue?
| sh87 wrote:
| Using a tool without knowing reasonable bounds of the
| domain, current requirements, and how the tool (redis in
| this case) solves the problem isn't good advice.
|
| Case in point: our team went with Redis, just the
| defaults, used it blindly without fully understanding our
| requirements and how Redis helps scale.
|
| 2 years later we spent 2 sprints, holding back the
| release: trying to understand RDF vs AOF, and why we were
| seeing massive spikes in consumption and performance
| triggering pod eviction; running comparison tests to
| prove which works better and explaining why; running QA
| tests (regression, performance, load); introducing
| postgres for queuing; redoing our code to bypass the sync
| mechanism for data flowing between redis and postgres;
| updating dependencies; migrating existing customer data
| (various on-site locations); explaining this to team
| members, managers and their managers, installation
| technicians, and support engineers; and presenting it at
| engineering townhalls as a case study in bad decisions.
|
| Not worth it.
| skytreader wrote:
| Well, by your admission, you used Redis for a problem
| domain it wasn't suited for in the first place. How is
| this an argument for using in-database queues?
|
| > use it blindly without fully understanding our
| requirements and how redis helps scale
|
| I'm sorry I don't get how I could come across as
| advocating the use of Redis blindly. My point is if your
| data flow looks like a queue, then use a queue, don't
| hack a relational DB to become a queue. I think that's a
| reasonable rule of thumb, not going in blind.
| sateesh wrote:
| I believe you meant "RDB vs AOF".
|
| Also, reading your reply I get the impression that the
| "sync mechanism between redis and postgres" was the
| bottleneck. Wondering if you can add some details around
| it, and whether this was something that couldn't have
| been fixed by fine-tuning the redis config rather than
| completely removing it from your stack.
| sh87 wrote:
| Yes, I did mean RDB. Thanks for pointing that out.
|
| There were many problems, but at the core of it was
| redis writing huge amounts of data to disk very
| frequently.
|
| We could not reduce the frequency (product would not
| allow) and we couldn't find a way to make the writes
| reliably fast.
|
| I like to believe there exists a way of handling
| this, but point being, our team had no time to find out
| how redis works internally and gain confidence that the
| new way wouldn't bring up new surprises.
| [deleted]
| derefr wrote:
| > Rather ironic claim in a thread full of tips, tricks,
| and gotchas just to make PG behave like a queue no?
|
| There are libraries in many languages written
| specifically to accomplish the "make PG behave like a
| queue" part. All these people are explaining the
| principle of how it would work, but in practice, you can
| just pull in one of those libraries and be done. (Or,
| more accurately, you use an abstract external-job-queuing
| library and tell it to use its Postgres adapter, which in
| turn uses that library.)
|
| In exchange, you get the ability to not have to set up /
| maintain any more infra than you already have. Which can
| be important if you're already "at scale", where every
| component you set up needs to be duplicated per
| geographic-region, clustered for High Availability, etc.
| If you've already done all that for Postgres, it'd be
| nice to not _also_ have to do all that _again,
| differently_ for Redis, RabbitMQ /Kafka/NATS, etc.
|
| > Gut feel also says, Redis defaults for queuing would
| also serve more than PG defaults for queuing
|
| Don't know about what you've got your queues doing, but
| our service only does (persistent, out-of-process)
| queuing for two purposes:
|
| 1. Global scheduled background-job queuing. (Think "cron
| jobs.") There are only a static number of these; they
| don't scale up with load.
|
| 2. Lifecycle email dispatch. This scales up with MAU, but
| with very very low coefficients.
|
| For those cases, Postgres would totally be enough. (
| _Anything_ would be enough!)
|
| Most "queues" needed in the real world are requirements-
| light like this. It's a rare company whose product is
| doing something Apache BEAM-alike with its data, where a
| high-throughput reliable MQ is a fundamental part of the
| data workflow. (And those companies, I hope, know that
| they need an actual MQ, not a piece of software that does
| queuing in its spare time.)
|
| Our company doesn't use Postgres queuing; we do, in fact,
| use Redis for it instead. But we only ended up doing
| that, because we already needed Redis for other stuff;
| and if you _already have Redis in play_ (including an ops
| plan for scaling it), _then_ it 's the better tool for
| the job.
| skytreader wrote:
| > In exchange, you get the ability to not have to set up
| / maintain any more infra than you already have.
|
| And, you mean, a PGQ will not need any special monitoring
| other than the common DB metrics?
|
| For instance, if I ever run a queue, it's just due ops
| diligence to know the length of the queue, average time
| an item spends in queue, throughput over time, among
| others. Are there standard monitoring modules that would
| check this for a PGQ? Because in exchange for setting up
| a proper queue cluster, compatibility and documentation
| for common stuff like this is also what you get.
|
| The first one is particularly a sticking point for me. If
| you don't do it right, you will end up issuing a COUNT on
| a table periodically. You might say it's acceptable for
| most companies, because they don't need high-performance
| queues, but I specifically need my monitoring to be
| reliable during times of unexpected high load. Also,
| ideally, there is close to zero chance that my telemetry
| is the one that ends up causing me trouble.
|
| Also, why does no one consider the added "surface for
| human error" that using a DB for a queue introduces? For
| instance, anyone can drop into the DB and perform a query
| you were not supposed to run. A malicious actor can update
| all outgoing emails in the queue to another recipient. If
| these are sensitive emails like password reset or OTP,
| good luck. A dedicated queue process does not allow such
| operations.
|
| I have to say, it's very perplexing to me that people
| (especially in HN) would avoid using the right structure
| just because they have to invest a bit more work.
| derefr wrote:
| > I have to say, it's very perplexing to me that people
| (especially in HN) would avoid using the right structure
| just because they have to invest a bit more work.
|
| You're imagining this as if it's just the CapEx (the work
| of setting up Redis), but in reality it's the OpEx (the
| work of _maintaining_ a Redis cluster) that kills you.
| Think of it in terms of the number of ops salaries you
| have to pay. At scale, this number goes up in proportion
| to the complexity of your stack.
| skytreader wrote:
| Actually, if you read the rest of my previous reply (as
| opposed to an offhand comment at the end) you will see
| that I'm considering the logistics of operating an in-DB
| queue as well.
|
| Using an in-DB queue doesn't give you zero OpEx does it?
| Maybe you can make the argument that it's cheaper but
| it's cheaper for a reason: the job wasn't done properly,
| so to speak.
|
| Both options introduce new failure modes and op costs
| into your system. Might as well do it properly if (and
| this is a huge IF in my opinion) it only costs slightly
| more. When you run into a failure it's standard; maybe
| the solution is even one Google search away, as opposed
| to realizing, one hour into a late-night debugging
| session, that, shit, that table should be treated as a
| queue!
| derefr wrote:
| I read your whole reply. I didn't see any mention of
| anything I'd consider an OpEx cost.
|
| Re: Monitoring
|
| Presuming you're already monitoring anything at all,
| adding monitoring for a new piece of infrastructure is a
| pure CapEx cost. You presumably already have monitoring
| infrastructure running. Its ingestion, by design, will be
| 99% idle -- i.e. it won't need to be scaled horizontally
| proportionally to the number of components. The only
| thing needed, then, will be careful up-front design for
| the monitoring.
|
| Which is usually _also_ a cost already paid for you in
| advance when you use an abstract job-queue library. They
| all know how to expose Prometheus metrics endpoints, and
| they do the accounting to serve those endpoints
| efficiently (usually using process-local per-job-producer
| and per-job-consumer counters, which you must then roll
| up yourself at the PromQL level, taking the irate() to
| find spikes.)
|
| Re: AAA
|
| > Also, why does no one consider the added "surface for
| human error" that using a DB for a queue introduces.
|
| Because the type of person who reaches for their DB first
| to solve a novel problem, is the same type of person who
| understands and makes use of their DB's security
| features, to the point that doing _more_ DB-security-
| config isn't a complex one-off problem for them, but
| something they can whip off in a few minutes.
|
| For a simple option, you can create two service-account
| DB roles: a queue_producer, and a queue_consumer. You can
| put all the job-queue implementation tables in a schema
| owned by the queue_producer; and then grant the
| queue_consumer SELECT privileges on all the tables, and
| UPDATE privileges on some of them. Then, nobody but the
| queue_producer (or a DB superuser) can create or destroy
| jobs; and nobody but a queue_producer, a queue_consumer, or a
| superuser, can read or modify jobs. (Your job-queue
| abstraction library within your service usually maintains
| its own DB connection pool anyway, so it's no sweat to
| have those connections use their own credentials specific
| to their job-queue role.)
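| The two-role scheme above can be sketched as DDL. This is
| an illustrative sketch, not any library's actual setup:
| the role, schema, and table names are assumptions.

```python
# Illustrative DDL for the producer/consumer role split described above.
# All names (queue_producer, queue_consumer, jobs, jobs.job_queue) are
# made up for the example.
GRANT_DDL = [
    "CREATE ROLE queue_producer LOGIN",
    "CREATE ROLE queue_consumer LOGIN",
    # The producer owns the schema, so only it (or a superuser)
    # can create or destroy job tables and rows.
    "CREATE SCHEMA jobs AUTHORIZATION queue_producer",
    # The consumer may read all job tables and update claim/status
    # columns, but cannot INSERT or DELETE jobs.
    "GRANT USAGE ON SCHEMA jobs TO queue_consumer",
    "GRANT SELECT ON ALL TABLES IN SCHEMA jobs TO queue_consumer",
    "GRANT UPDATE ON jobs.job_queue TO queue_consumer",
]

def apply_grants(conn):
    """Run once at deploy time over a superuser connection."""
    with conn.cursor() as cur:
        for statement in GRANT_DDL:
            cur.execute(statement)
```

| The service's job-queue connection pool would then log in
| as queue_producer or queue_consumer rather than as the
| app's main DB user.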
|
| For a more complex -- but perhaps useful? -- option, the
| tables themselves can be "hidden" behind stored
| procedures (DDLed into existence by the abstract job-
| queue library), where nobody has any privileges (incl.
| SELECT) on the tables, only EXECUTE rights on the sprocs.
| And the sprocs are carefully designed to never do
| anything that could have an unbounded CPU time. Then
| anyone can "check up on" or even "insert into" the job-
| queue, but nobody can do anything "funny" to it. (My god,
| it's an encapsulated API!)
|
| Once again -- the libraries that abstract this away,
| already think about concerns like this, and choose one or
| the other of these options. That's why libraries like
| this exist, when the "core premise" is so simple: it's so
| there's a place to put all the fine details derived from
| man-years of thought on how to make this approach
| _robust_.
|
| -----
|
| On a completely different track, though: having a queue
| in the DB can sometimes _be_ the optimal (i.e. the "if
| we had infinite man-hours for design + implementation")
| engineering decision. This case comes when the thing the
| queue is operating upon is _data in the DB_. In such a
| case, the DB data modification and the job's completion
| can succeed or fail together _atomically_, as part of a
| single DB transaction.
|
| To accomplish the same thing when your queue lives
| outside the DB, you usually end up either needing some
| really gnarly distributed-locking logic that both your DB
| and your app layer need to know everything about (leaky
| abstraction!); _or_ you need to completely remodel your
| data _and_ your job queue into an event-streaming
| paradigm, so that you can "rewind" one side when the
| other side fails.
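| A minimal sketch of that atomic coupling, using sqlite3
| as a stand-in for the shared DB (schema and names are
| invented for the example):

```python
import sqlite3

# When the queue lives in the same DB as the data it operates on,
# the data change and the job's completion commit (or roll back)
# together, in one transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("INSERT INTO jobs VALUES (42, 'pending')")
conn.commit()

def run_job(conn, job_id, account_id, delta):
    try:
        with conn:  # one transaction: both statements commit, or neither does
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (delta, account_id))
            conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    except sqlite3.Error:
        pass  # the rollback already happened; the job stays 'pending' for retry

run_job(conn, 42, 1, -30)
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
status = conn.execute("SELECT status FROM jobs WHERE id = 42").fetchone()[0]
```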
| skytreader wrote:
| > For a simple suggestion, the job queue can live in a
| schema that is owned by a job-queue service-account role.
| Nobody other than that user (or a DB superuser) can issue
| any query in there.
|
| Yes service accounts are fine and dandy but [a] isn't
| that more hoops borne by choosing the wrong tool for the
| job (gotta stay mindful of those permissions, you can
| have an Ansible typo, etc) and [b] as I said, proper
| queue services would not even allow tampering of enqueued
| data. For no extra cost other than installing the
| service. In contrast, your service account falling into a
| malicious actor's hands is still a threat vector.
|
| > For a more complex -- but perhaps useful? -- option ...
|
| Now this is just a queue service with extra steps running
| in a relational DB instead of natively as an OS process.
| You did cite it as just an option but I don't see why
| this is an _attractive_ option.
|
| > I didn't see any mention of anything I'd consider an
| OpEx cost.
|
| It seems to me we have differing definitions of OpEx and,
| judging by your previous comment, you value this almost
| exclusively in terms of "the number of ops salaries you
| have to pay". Even if I play by that definition, I could
| tell from experience (and also corroborated by other
| commenters here, in other subthreads of this discussion),
| that operating a Redis cluster does not need extra warm
| bodies. The people who monitor that your app responds,
| your servers aren't running out of memory, could take on
| the task with little to no additional training.
|
| The experience I want to cite, bearing in mind of course
| that everyone's mileage varies: in my previous job,
| customer base of 2M and growing across two countries, we
| operated and monitored a redundant Redis cluster with an
| ops team of 3. In my first job I talked about in another
| subthread here, we managed three high-traffic Redis
| queues (and a few other smaller-traffic special cases)
| with a team of 7. In both jobs, we also did active
| across-the-stack development; we weren't dedicated to
| monitoring, nor to the components related to the queues.
|
| In fact I would argue running a DB (queue or no queue) is
| just more complex than running a queue service. Rather
| telling is the need for "DB Administrators/Specialists"
| but no such equivalent for Redis or other proper queueing
| solutions.
|
| > Which is usually also a cost already paid for you in
| advance when you use an abstract job-queue library. They
| all know how to expose Prometheus metrics endpoints...
|
| Honestly, this sounds fine but I'm rather wary of the
| caveats that might be present. How do they do it
| efficiently? What do I need to keep in mind to keep
| things efficient? As such, unless you can point me to
| your preferred job-queue library which does all these
| wonders, I have to reserve judgment.
|
| Edit: I just remembered mentioning this in another
| subthread. Even if you have a _fantastic_
| library/middleware abstracting all those queue concerns, that
| ties you in with that library. If someone wants to build
| integrations into your queue, they have to play by that
| library. If the majority of your app is in Java (and so
| is the library), and Data Science (who, in this example,
| uses Python - not improbable) wants to share your data
| pipeline, and that library isn't available for them,
| tough luck I guess? More dev time for someone.
|
| And also, whatever underlying DB features or semantics
| your library might rely on in order to enforce a queue
| structure, you can't be assured that future versions of
| your DB would support those features because, you know,
| your DB isn't really in the business of queues. It opens
| you up to running an outdated DB version just to keep
| that queue functionality.
| derefr wrote:
| > Now this is just a queue service with extra steps
| running in a relational DB instead of natively as an OS
| process. You did cite it as just an option but I don't
| see why this is an attractive option.
|
| Your DB then shares a WAL log with your queue. Meaning a
| single managed physical replication pipeline for them
| both. Meaning only one set of leader-election issues to
| debug, not two. Meaning one canonical way to do
| geographic high-latency async replication. Meaning
| disaster recovery brings back a whole-system consistent
| snapshot state. Etc.
|
| Honestly, if I had my way, every stateful component in
| the stack would all share a single WAL log. That's what
| FoundationDB and the like get you.
|
| > In fact I would argue running a DB (queue or no queue)
| is just more complex than running a queue service.
|
| Well, yeah, but you usually _need_ a DB. So, if you're
| going to be paying the OpEx costs of the DB either way,
| then you may as well understand it deeply in order to
| wring the most productive use you can out of each OpEx
| dollar/man-hour spent.
|
| (I feel the same way about Redis, as it happens: if you
| need it, and are locking your code into its model
| _anyway_, then you may as well take advantage of its
| more arcane features, like Redis Streams, Lua scripting,
| etc.)
|
| However, maybe our company is uncommon in how much our
| service literally _is_ doing fancy complex DB queries
| that use tons of DB features. We're a data analytics
| company. Even the frontend people know arcane SQL here :)
|
| > that ties you in with that library
|
| The difference between what you / apps / abstracting
| libraries do in Redis, and what they do in an SQL DB, is
| that in the DB, the shape of everything has to be
| explained in a vendor-neutral manner: SQL DDL.
|
| _Sometimes_ Redis-based solutions converge on
| conventional schemas; see e.g. Sidekiq's informal schema,
| which several other queuing systems are implemented in
| terms of. But when they don't, there's nothing you can
| really do -- beyond hacking on the libraries involved --
| to bring them into sync.
|
| In an SQL DB, anything can be adapted into the expected
| shape of anything else, by defining SQL views. (Heck, in
| an SQL DB with Redis support, like Postgres with
| redis_fdw, the _Redis_ data can be adapted into any shape
| you like using SQL views.)
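| As a toy illustration of that view-based adaptation
| (sqlite3 here, with made-up names):

```python
import sqlite3

# Adapting one queue schema to another's expected shape with a view.
# The names are invented; the point is that the adaptation lives in
# the DB itself, not in any one app or library.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE email_jobs
                (job_id INTEGER PRIMARY KEY, recipient TEXT, state TEXT)""")
conn.execute("INSERT INTO email_jobs VALUES (1, 'a@example.com', 'queued')")

# A second consumer expects columns (id, payload, status):
# expose that shape via a view over the same rows.
conn.execute("""CREATE VIEW generic_queue AS
                SELECT job_id AS id, recipient AS payload, state AS status
                FROM email_jobs""")

row = conn.execute("SELECT id, payload, status FROM generic_queue").fetchone()
```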
|
| And that's further enabled by the fact that the DB had
| received from the app, through DDL, a schema, that you
| can examine, manipulate, and refactor; or even synthesize
| together with other schemas.
|
| > you can't be assured that future versions of your DB
| would support those features
|
| You can if those features are in the SQL standard. I've
| never heard of a DBMS _regressing_ on its level of SQL
| standard support.
| skytreader wrote:
| > Your DB then shares a WAL log with your queue...
|
| I feel like the understanding we can come to here is that
| we have differing definitions of necessary complexity.
|
| You did mention you work in data analytics and I have
| worked and am working in more traditional product-
| oriented gigs. Everything you mentioned are nice,
| impressive even, but to me they are toys. I don't need a
| queue with a WAL stream, or PITR backups. Queues, to me,
| are inter-process communication mechanisms, or a means to
| concurrency. In fact, worst case, you can delete the
| queues or restart Redis even without triggering a disk
| write (though I note this is another feature that comes
| free with Redis); it would inconvenience our customers
| but they can always just retry.
|
| Of all the benefits you mentioned, leader-election is the
| only one I could make a good case for.
|
| > then you may as well understand it deeply in order to
| wring the most productive use you can out of each OpEx
| dollar/man-hour spent.
|
| Understanding it is one thing but using that
| understanding to create/develop (CapEx) and then maintain
| (OpEx) an in-DB queue feels like unnecessary costs. CapEx
| to install Redis is practically nil, and you seem to
| agree that running it is far simpler OpEx than PG too
| ("Well, yeah, but...").
|
| Of course, I keep in mind your point about libraries
| doing the dirty work. But if it all boils down to a third
| party in the end, the point I'm replying to above is
| rather moot, no? And if it comes to a third-party dependency
| anyway, I've already reasoned out my preference for an
| actual queue solution.
|
| > explained in a vendor-neutral manner: SQL DDL
|
| > You can if those features are in the SQL standard. I've
| never heard of a DBMS regressing on its level of SQL
| standard support.
|
| While this is an ideal case, my experience so far shows
| that every major DB package relies on their own tweaks to
| the SQL standard, and so "If those features are in the
| SQL standard" turns out to be a pretty big "if" to ask. I
| don't worry about vendors regressing on SQL standard
| support but rather that the libraries for DB queues are
| relying on the non-SQL-standard features/behavior to mock
| out a full queue functionality---non standard behavior
| that has no guarantee of consistency across versions.
|
| I mean, if there's a DB queue middleware/library that
| works across DB vendors, be it Postgres, Oracle, Maria,
| MySQL (with a reasonable storage engine, perhaps) then
| that's a convincing argument that the SQL standard can
| support queuing sufficiently. But otherwise, this
| counterpoint is a hard sell for me.
| hughrr wrote:
| I've got about 6 years operational experience with Redis.
| I would not use it for a queue. I struggle to consider
| using it as a data store these days as well with other
| options on the table. But that's a whole new thread so
| I'm not going into details here.
|
| If you want a queue, get a queue.
|
| If you want a relational database, get a relational
| database.
|
| If you want a key-value store, get a key-value store.
|
| Don't cross the streams.
| mikeklaas wrote:
| What would you use redis for, if anything?
| yawaramin wrote:
| I assume a key-value store.
| stickfigure wrote:
| There's one incredibly valuable feature you get with PG
| that you will never get with Redis or RabbitMQ or any
| other task queue: Enqueueing tasks transactionally with
| your unit of work.
|
| If your load requirements fit using your database as a
| queue, it can radically simplify your system.
|
| I've personally done a fair bit of migration from formal
| queue systems _into_ an RDBMS. You simply get a lot more
| control and visibility that way. But I use task queues
| heavily too. It's all very application-specific and I
| don't think you can generalize.
| GordonS wrote:
| You can actually do this with RabbitMQ - it has a feature
| called "Publisher Confirmations", where you send a
| message, and then wait for an "ack" in return.
| blowski wrote:
| If you're using the same database for your queue as your
| business entities, then you can wrap both in the same
| transaction - I think that's what the parent post is
| getting at. For example, save a customer entity and add a
| job to send them a confirmation email. As far as I know,
| that's something you can't do natively if you're using
| RabbitMQ and PostgreSQL. Of course you can work around
| it, but when your workloads are low you don't need to,
| which is great for early stage products.
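| A minimal sketch of that pattern with sqlite3 standing in
| for PostgreSQL (the schema is invented): the customer row
| and its queued email commit together, or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("CREATE TABLE email_queue (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.commit()

def sign_up(conn, email):
    with conn:  # both rows commit together, or neither does
        cur = conn.execute("INSERT INTO customers (email) VALUES (?)", (email,))
        conn.execute("INSERT INTO email_queue (customer_id) VALUES (?)",
                     (cur.lastrowid,))

sign_up(conn, "new@example.com")
try:
    sign_up(conn, None)  # violates NOT NULL: the entire signup rolls back
except sqlite3.IntegrityError:
    pass

customers = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
queued = conn.execute("SELECT COUNT(*) FROM email_queue").fetchone()[0]
```

| With RabbitMQ as the queue, the failed signup could leave
| a published message with no matching customer row.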
| spullara wrote:
| In that case it doesn't help you that much, since you
| can't transactionally send an email, but you would get
| at least one attempt.
| blowski wrote:
| It doesn't help guarantee the email is sent, but it makes
| it easier to find and handle unhappy edge cases, which is
| where a lot of time is spent in an early-stage startup.
| politician wrote:
| Not sure why you were downvoted, but perhaps they were
| thinking about Redis transactions (lua scripts that can
| do multiple operations at once). If Redis is your
| datastore (_groan_), then you can transactionally
| enqueue tasks with your unit of work.
| arthurcolle wrote:
| Real time trading systems can totally get away with only
| using Redis, outside of historical data (which obviously
| you'd stick into Postgres or your favorite flavor of DB)
| 2muchcoffeeman wrote:
| In my experience proper abstraction is the key. If you have
| a clear abstraction and structure you can replace the
| implementation when you need to without too much
| disruption. If it's too leaky you're screwed.
| hughrr wrote:
| Yes. Use a queue abstraction for which there are numerous
| queues available off the shelf!
| aparsons wrote:
| In my experience, great abstractions without a good data
| migration blueprint are about as useful as no
| abstraction at all.
|
| In this example - great, you have a "queue" abstraction.
| When you switch to RabbitMQ or Postgres, how do you move
| your existing data without a quality-of-service hit or
| data loss? It can be difficult with production datastores even
| if the abstraction within your service is beautiful.
| lolinder wrote:
| Isn't migrating a queue super simple? Have a window in
| which you only add to the new system and you listen to
| both. When the old queue is empty, delete it.
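| The cut-over described above can be sketched in a few
| lines (in-memory deques stand in for the old DB queue and
| the new queue system):

```python
from collections import deque

# Queue migration without downtime or data loss:
# producers write only to the new queue, while a consumer
# drains both until the old one is empty.
old_queue = deque(["job-1", "job-2"])   # stand-in for the old Postgres queue
new_queue = deque()                     # stand-in for e.g. Redis

def enqueue(job):
    new_queue.append(job)               # step 1: all new writes go to the new system

def dequeue():
    # Step 2: prefer the old queue so in-flight jobs drain first.
    if old_queue:
        return old_queue.popleft()
    if new_queue:
        return new_queue.popleft()
    return None

enqueue("job-3")
drained = [dequeue() for _ in range(4)]
# Step 3: once the old queue is empty, it can be deleted.
old_done = not old_queue
```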
|
| If you need to keep track of old data, then yes,
| migration is hard. But queues don't save old data--that's
| the point of a queue.
| goodoldneon wrote:
| 100%! Create a good enough solution and put it behind an
| abstraction that remains stable when you later create an
| ideal solution
| pyuser583 wrote:
| I've watched companies maintain their non-scaling DBs all
| the way to the grave.
|
| It's a killer.
|
| If it's "just a cache" OK. Fine. If you need it for
| something important like queuing or certain caching
| scenarios - be aware it won't scale well.
| ikiris wrote:
| You say that as if rabbitmq takes more than an 8th grader
| to set up in a few hours?
| GordonS wrote:
| To be fair, you can pull and run the Docker image in a
| few minutes, and the defaults will work out-of-the-box
| for a lot of folks.
|
| If you don't need clustering, RabbitMQ really is pretty
| simple to setup.
| dorfsmay wrote:
| There is a lot of space between Postgres and RabbitMQ. If
| performance is a factor, it takes a lot of work to make a
| Postgres table fast enough. These days, and for the past
| at least 5 years, you're better off using a service like
| SQS, or setting up Redis yourself.
| rantwasp wrote:
| no. he/she actually has a point. a database is not a queue
|
| https://mikehadlow.blogspot.com/2012/04/database-as-queue-
| an...
| piokoch wrote:
| Not sure. All messaging brokers walked away from
| relational-database-based storage a long time ago (I've
| seen how badly JBoss 4 messaging with persistence sucked),
| and for good reason: either the database was occasionally
| killed by messaging traffic, or messaging traffic was
| crawling because the database was too slow.
|
| These days, with Docker and ready cloud solutions,
| getting a message broker is so simple that there is no
| reason to use the database as a broker. If we know that
| our traffic will be small, just install the DB and
| message broker on the same VM to avoid spending more on
| hardware or hosting.
| alunchbox wrote:
| You're not wrong, but if anyone on the team has already
| worked with a queue system they'd need to manage, then
| utilize something that scales. Otherwise, there's no
| reason not to use the cloud solution from the provider
| you're hosting with, which will hand-hold you to make it
| work quickly with your integration.
| crazygringo wrote:
| Can you explain exactly what doesn't scale well?
|
| Databases scale well in general, it's one of the things
| they're built for.
|
| A single database server with an indexed table (b-tree) and
| an SSD is _extremely_ performant. (And yes there are other
| indexes that could be even _more_ performant, but b-trees are
| already extremely fast.)
|
| But the huge advantage of putting a queue in your database is
| being able to create transactions across both the queue and
| data in other tables that needs to change at the same time.
|
| Not to mention it's one less tool to maintain; existing
| database tools for backups, replication, etc.
| automatically include the queue as well.
|
| As well as the fact that it's incredibly easy to make your
| queue behave according to whatever kind of custom business
| logic you might need. (Which is where additional indexes and
| b-trees can turn out to be necessary, and which is precisely
| what databases are great at.)
|
| So to the contrary -- unless you're at Facebook-level scale,
| using modern databases for queues generally seems like an
| _excellent_ choice. (And in the special cases where it's
| not, that's usually quite obvious.)
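| As a sketch of such a DB-backed queue: in Postgres the
| claim step is usually SELECT ... FOR UPDATE SKIP LOCKED;
| the portable version below (sqlite3, with an invented
| schema) uses an optimistic UPDATE guarded on status
| instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE jobs (id INTEGER PRIMARY KEY,
                                   status TEXT DEFAULT 'pending',
                                   payload TEXT)""")
conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                 [("a",), ("b",), ("c",)])
conn.commit()

def claim_one(conn):
    """Claim the oldest pending job, or return None if the queue is empty."""
    row = conn.execute("""SELECT id, payload FROM jobs
                          WHERE status = 'pending'
                          ORDER BY id LIMIT 1""").fetchone()
    if row is None:
        return None
    # Guard on status so a concurrent worker's claim makes rowcount 0.
    cur = conn.execute("""UPDATE jobs SET status = 'running'
                          WHERE id = ? AND status = 'pending'""", (row[0],))
    conn.commit()
    return row if cur.rowcount else claim_one(conn)

first = claim_one(conn)
second = claim_one(conn)
```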
| cerved wrote:
| In your experience, roughly at which point did it stop
| scaling and why?
| rmetzler wrote:
| I also have a lot of issues with people using tables and
| cronjobs instead of just using queues.
|
| I once was the one implementing this stupid idea. It's very
| easy to shoot yourself in the foot. It seems to be ok at
| first, but it is a stupid idea and will give nightmares to
| the operations team.
|
| It works as long as the table is small. It gets especially
| problematic when there are sudden spikes in the workload,
| so that the table is growing rapidly. People don't test
| these scenarios enough.
|
| Often it's important to do the jobs in the order they came
| in. This means you need to have the index set up this way
| or you'll end up with full table scans. You also need to
| remove jobs which are done, but this will also mess with
| the table's performance.
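| The index point can be demonstrated directly: without an
| index matching the dequeue query's filter and order, the
| planner falls back to a full table scan (sqlite3 here,
| with an invented schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE jobs (id INTEGER PRIMARY KEY,
                                   status TEXT, created_at INTEGER)""")
# A typical dequeue query: oldest pending job first.
query = """SELECT id FROM jobs WHERE status = 'pending'
           ORDER BY created_at LIMIT 1"""

plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
# An index matching both the filter and the sort order.
conn.execute("CREATE INDEX jobs_pending ON jobs (status, created_at)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Before: a full scan of the jobs table; after: an index search
# that also satisfies the ORDER BY, so no separate sort step.
scan_before = any("SCAN" in row[-1] for row in plan_before)
uses_index = any("jobs_pending" in row[-1] for row in plan_after)
```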
|
| There are tricks with databases to help scaling this, but
| at the end of the day it's much simpler to just use a
| queue.
| hughrr wrote:
| It stopped scaling when we put in the PO for a $150k x86-64
| box to run it on and finance said WTF.
| cerved wrote:
| What about in terms of number of requests and requests
| per second?
| sonthonax wrote:
| It can absolutely scale well if you design the database
| properly. However, it'll probably be fairly high latency
| compared to a Redis or Kafka stream.
|
| The things that often make databases appear slow is that
| they're typically configured for large random reads rather
| than small uniform writes. And they have insane initial
| response times because of all the IPC they typically do when
| starting a session.
|
| But aside from that, if you use a heap table without an
| index you can probably get similar throughput to Kafka.
| And if you turn fsync off you might even get up to Redis
| speeds.
| oneplane wrote:
| It's possible at scale, but the engineers who can do that
| are usually outnumbered 1:20 within most organisations
| (excluding FAANG and co). Asking a developer to design a
| database is not even a normal process today; unless the
| ORM does it for you, very little actually is done with
| the database. It might as well be a flat file...
| api wrote:
| That's sad. I learned relational DB design before I even
| really got into systems programming and it was really a
| valuable thing to learn. You can be so much more
| efficient and less error prone with normalized data
| structures.
|
| There's a reason SQL is still around after all these
| years. It's basically math and while the syntax is a
| little moldy the principles are eternal.
| skytreader wrote:
| This please. I feel like "How to Get Away with Just
| PostgreSQL" and the GP comment fall squarely under being
| so preoccupied with whether you could that you didn't
| stop to think if you should.
|
| Whatever happened to use the proper data structures for the
| job? PostgreSQL and MySQL are, at the end of the day, b-trees
| with indices. Throw in relational properties and/or ACID too.
| Those aren't properties you need or want in a queue
| structure.
|
| I know I don't have a solid argument against not doing it;
| it's just experience (and dare I say, common sense) telling
| me not to. Not quite like parent but I spent the first two
| years of my professional career in a team that had the
| brilliant idea to use DBs as queues. The big task I partook
| in for that stint was moving them off that v2 into a v3 which
| used---wait for it---Redis. Everyone's quality of life
| improved with every migration, proportional to the size of
| the v2 cluster we retired.
| zigzag312 wrote:
| > Whatever happened to use the proper data structures for
| the job?
|
| This so much. People too often treat databases as magical
| black-boxes that should handle anything. The database is
| most often the bottleneck, and choosing the proper storage
| engine with appropriate data structures can be 100x more
| efficient than just using the defaults. 1 server vs 100 can
| definitely make a noticeable difference in costs and system
| complexity.
|
| While premature optimization is bad, choosing the right
| tool for the job is still somewhat important and will
| usually pay off in the long run.
| mirekrusin wrote:
| I think your "most often" is more like 0.01%, I'd say the
| inverse is true, that _most_ would be fine with a single
| sqlite host or something like rqlite.
| otoolep wrote:
| rqlite author here. Happy to answer any questions about
| it.
|
| https://github.com/rqlite/rqlite
| mirekrusin wrote:
| Are you planning on adding websockets or something
| similar in the near future to support things like ie.
| data change notifications [0]?
|
| [0] https://www.sqlite.org/c3ref/update_hook.html
| zigzag312 wrote:
| What would you then consider to be the most common
| bottleneck?
|
| I agree that there are many cases with low workload where
| that would be plenty.
| solipsism wrote:
| Money. The most common bottleneck is money and customers.
| Use whatever helps you get new customers faster.
|
| Don't be scared of having to make changes in the future.
| Do the small amount of work it takes today to make sure
| your transition in the future is easy.
|
| Transitioning from a SQL queue to Redis is only
| difficult if you have a bunch of SQL throughout your
| code. If you have that, you did it wrong.
| mirekrusin wrote:
| Most common bottleneck is lack of competence.
|
| Direct visible effects are wrong decisions entangled in
| spaghetti-like complexity.
|
| It's hard to reach a technical bottleneck in well-designed
| systems. Computers are really fast nowadays. Bottlenecks
| will vary greatly depending on what kind of system it is.
| Of the resources - CPU, memory, network, disk I/O - the
| weakest of them, usually the network, will likely be
| saturated first. But that's not a rule; it's easy to have
| a system which will saturate, e.g., the CPU first.
| hughrr wrote:
| This is the best post in this thread.
|
| A lot of people don't see the effects of their decisions.
| They leave a company after 3-5 years and go and work
| somewhere else where they get to make the same mistake
| again. The bottleneck indeed is lack of competence.
|
| As for technical bottlenecks, it's quite easy to hit a
| wall. Be it through layers of stupid or unexpected
| success. We have unexpectedly reached the limit of what
| is possible with x86-64 on a couple of occasions due to
| stupid decisions made over 10 years previously for which
| there is now no longer the budget or attention to fix.
| zigzag312 wrote:
| Competence is expensive :) While I mostly agree, even
| well designed systems have (sometimes considerable)
| tradeoffs.
|
| > It's hard to reach a technical bottleneck in well-
| designed systems. Computers are really fast nowadays.
|
| I have been listening to how fast modern computers are
| for the better part of the past two decades, yet as a
| user I still have to deal daily with too much slow
| software and slow web services.
| mirekrusin wrote:
| Somebody once said "cheap things are expensive". This
| idea applies to developers as well. Cheap developers will
| drive company through bumpy roads towards uninteresting
| plains. Good developers not only pay for themselves but
| bring orders of magnitude more cash in. Only thing that
| touches on this that I can find is "software
| craftsmanship manifesto".
| zigzag312 wrote:
| True
| Seattle3503 wrote:
| Your comment sort of explains why you would use your DB
| as a queue: it is a big task to migrate to a new system.
| If you already have Postgres or MySQL integrated and
| deployed, using it as a queue may be the simplest option.
| lolinder wrote:
| What has me wanting to stick with postgres is that I work
| on a small team (two developers) and adding more
| technologies to our stack is extra overhead that's hard to
| justify. At our peak we're currently handling one request
| per second, and postgres for a queue is more than
| sufficient for that. Is there any good reason for us to
| add, learn, and maintain a technology neither of us yet
| knows? Or would we do just as well to abstract away the
| queue in the code so that we can switch to redis when we do
| run into scaling problems?
| skytreader wrote:
| > Is there any good reason for us to add, learn, and
| maintain a technology neither of us yet knows?
|
| Absolutely and that reason is, you are still a small
| team, with a small user base to boot. That's a fantastic
| opportunity to learn a new technology and build on it
| properly! Remember everything is easier in software
| engineering if you assume you have no users[1] and your
| situation is as close as it gets to this ideal. Leverage
| it.
|
| Plus, as me and others keep saying, Redis (and other
| proper queues) isn't a complex addition to your infra.
| This isn't Hadoop, or Kafka, which is a very special type
| of queue (one way to put it, at least).
|
| > one request per second, and postgres for a queue is
| more than sufficient for that
|
| Yes I agree but...
|
| > Or would we do just as well to abstract away the queue
| in the code so that we can switch to redis when we do run
| into scaling problems?
|
| What I read when I see such statements is this mythical
| software engineering ideal that with enough abstraction,
| a migration is just a matter of writing a new class that
| implements some interface and then changing a config. For
| a sufficiently complex app infra, that happens almost
| never because you could never keep the abstraction leaks
| to an acceptable level.
|
| Another thing, abstraction does not solve all your
| problems if the underlying implementation is a poor fit to
| begin with. Let me paint you a plausible scenario:
|
| Once you are large enough, you might find your PGQ acting
| weird and you realize it's because someone in the team
| wrote code that accesses your queue table like it's an
| actual table of records, not a queue. So you think, okay
| let's prevent that from happening. Maybe you add users
| and permissions to distinguish connections that access
| the tables proper from those that access the queue.
| Maybe you start
| writing stored procs to check and enforce queue
| invariants periodically.
|
| Well, guess what, all those problems would've been solved
| for free if you invested maybe one work day getting a
| Redis server running when you were a two-person op
| serving one request per second.
|
| Lastly, scaling a relational DB is an entirely different
| beast from scaling a queue. Scaling anything does not
| ever come painless but you can reduce the suffering when
| it comes. Would you rather scale PG so it can keep acting
| as a queue or scale a queue that's, you know, really a
| queue in the first place? Heck the latter might even be
| solvable by throwing money at the problem (i.e., give it
| more compute).
|
| [1] Except for the part where you need to make money, of
| course.
| yongjik wrote:
| > Absolutely and that reason is, you are still a small
| team, with a small user base to boot. That's fantastic
| opportunity to learn a new technology and build on it
| properly! Remember everything is easier in software
| engineering if you assume you have no users[1] and your
| situation is as close as it gets to this ideal. Leverage
| it.
|
| I have to disagree. Of course code quality is important,
| but building things "properly" because "we may need it
| later" is a great way to kill a project with complexity.
| KISS, YAGNI. An early startup is, IMHO, not a good place
| to learn about new frameworks while getting paid - you're
| on borrowed time.
|
| Make a back-of-the-envelope calculation about how much
| throughput you need. E.g., if you expect to have 10,000
| users, and each may make one request per hour, you're
| dealing with 3 qps. Anybody who wants to bring in a new
| dependency for this, needs some talking to.
|
| (If you already need Redis anyway and it's a better fit
| than Postgresql, then sure, go ahead.)
| skytreader wrote:
| > but building things "properly" because "we may need it
| later" is a great way to kill a project with _complexity_
|
| Emphasis added because I feel like I addressed this in
| the paragraph immediately after the one you quoted:
|
| > Plus, as me and others keep saying, Redis (and other
| proper queues) _isn't a complex addition to your infra_
|
| I'm speaking from experience and, as I already pointed
| out in another subthread, Postgres is far more complex
| than Redis. Consider the presence of "DB
| Admins/Specialists" and the lack of a counterpart
| for Redis and other queuing solutions.
|
| Of course, if queues are not central to how your platform
| operates, you _might_ be able to get away with Postgres.
| I still advise using Redis as a reasonable hedge against
| someone famous tweeting organically about your service
| because in this case, you don't want your DB to go down
| because some queue table had a surplus of transactions
| (or vice versa).
|
| Not to mention, at an early stage, your tech decisions
| set precedents for the team. Maybe you have 10K users
| with a low qps but soon you are sending marketing emails
| to them and your system has periodic bursts of queue
| activity for all 10K users at once. When discussing this
| marketing "feature", rarely does anyone think, "Hey we
| can't do that with our Postgres queue", rather "Yeah I saw
| functions in our codebase for queuing---this is doable".
| This is a small effort but a huge technical investment
| for later on.
| [deleted]
| onlyrealcuzzo wrote:
| There are A LOT of use cases for a queue that aren't
| particularly computer intensive and you aren't worried about
| ever needing to scale - especially larger than your Postgres
| database.
|
| So while this might be good advice if you will need to scale,
| it's certainly not blanket advice.
|
| In the first case, I would much rather just have Postgres and
| not a separate queue to manage.
| basilgohar wrote:
| For the scale of most tasks, the existing RDBMS that a small
| app is using for its data is perfectly sufficient for the
| queuing needs of said app. This reduces complexity in the app
| and removes the need to maintain additional tools.
|
| Frameworks like Laravel make this easy because they allow you
| choose different backends for your DB, queue, and cache,
| among other things, all or some of which can be an RDBMS like
| Postgres or MySQL/MariaDB.
|
| When your app's need outscale what the RDBMS can rationally
| handle, then you can dedicate the resources to switch to a
| more specific tool as your growth has indicated you should.
|
| I can't say what percentage of apps can handle things
| this way, but the anecdotes mentioned in this thread are
| clearly the cases where growth outstripped what a
| traditional RDBMS like Postgres is well-suited for, and so
| migrating was the right thing. The problem was likely in
| identifying the need, or in delaying the transition until
| it was too painful. I don't think there's a magic bullet here, but I
| also don't think that starting with a dedicated queuing store
| tool is also always the right thing to do.
| baq wrote:
| the other side of the coin is that queue servers are
| databases, _just_ optimized for the use case.
| atombender wrote:
| You're not wrong, but every solution involves compromises.
| More broadly, I would argue, though, that almost nobody
| actually _needs_ queues.
|
| Sure, there are cases where you actually need strict ordering,
| but they are relatively few, in my opinion, involving things
| like transaction processing or other areas where the order of
| operations change the effect of the operations and must be
| strictly sequential.
|
| Ordering itself brings significant technical challenges. For
| example: With a queue, you can only have strict ordering as
| long as you only process one item at a time, with no
| concurrency. Ordering also complicates handling of failures
| and retrying, because it means every newer item has to wait
| until the failed item has been dealt with; if you use things
| like dead letter queues, order is violated.
|
| In almost all use cases -- anything from image processing to
| newsletter delivery -- a better system is to have an
| unordered list of items that anyone can take items out of, in
| any order. Then you throw as many workers as you like at
| this system and design it so that minimal locking is
| needed to prevent workers from stepping on each other's toes.
|
| There are queue systems that have a purpose in a stack,
| specifically pub/sub brokers, which can be used to _schedule_
| work along several dimensions. You use the pub/sub system to
| _signal_ work to be done, but each payload just refers to
| some state that's stored elsewhere.
| hughrr wrote:
| I think you're missing something important here by applying
| the worker pool simplification to the idea. Order _is_
| important in that case as well. It forces fair scheduling
| of the work, and the metrics from the queue can easily
| describe the overall latency of the work being done
| and the capacity and utilisation. The latter can be used to
| scale the consumers of the queue and/or alert on when
| things aren't working properly.
|
| If you throw a proper messaging broker on the table like
| RabbitMQ then you're getting message acknowledgement as
| well which is important if you _must_ do the work. A failed
| consumer will result in the work being rescheduled on
| another consumer transparently. Of course there are metrics
| you can see to get a heads-up when your consumers are not
| working.
|
| Ergo, don't simplify something too soon. Even reliable
| delivery of newsletters is rather important otherwise why
| would all the tracking pixels exist? (they are a form of
| message acknowledgement)
| atombender wrote:
| If latency is important, I would argue that ordering
| doesn't help you.
|
| That is, if you have the computational resources to
| service your entire queue within the required latency,
| then ordering doesn't matter, because workers could
| simply claim the first available random task, and it
| doesn't matter who finishes first or last. If you _don't_
| have the resources, then ordering still doesn't
| matter, because you are blowing your latency budget
| anyway.
|
| For example, say users are uploading photos which go into
| a queue for scaling, cropping, etc. Let's say 10 users
| per second are uploading a photo. In the best-case
| scenario, the queue is empty, so all users should see
| their photo fully uploaded quickly. Worst-case scenario,
| your queue has a million tasks pending, which benefits a
| handful of early users who see their photos completed,
| whereas most users will simply wait forever. Prioritizing
| early users (i.e. queue order) doesn't benefit anyone
| when everyone will be miserable, so it's an arbitrary
| choice.
|
| The ordering imposed by queues can create an illusion of
| fairness when task _execution time_ (i.e. end-to-end
| latency) is not taken into account. If one user uploads a
| 1 gigapixel photo, it's most likely going to take up
| more processing time than other users, and again, there's
| no reason to prioritize _that_ user over any other user
| just because their request arrived earlier.
|
| If you have random-order delivery, you can still measure
| latency, capacity, and utilization, as well as have
| retries and acking, and utilization-based scaling.
| gravypod wrote:
| I use Queues for retries, buffering, and parallelization.
| It's pretty easy to write most programs to support at-
| least-once delivery and to run multiple queue workers that
| you autoscale.
| draw_down wrote:
| I bet a bunch of them love the phrase "use the right tool for
| the job" too. Just shows how hollow/meaningless that phrase
| can often be
| ngrilly wrote:
| Look at SKIP LOCKED in PostgreSQL and MySQL.
| hughrr wrote:
| It's not that. It usually ends up in the same storage
| engine as the main business functionality which leads to IO
| contention. Then some sick and twisted bastard does a join
| to it from somewhere abhorrent which means it requires some
| serious heavy lifting to fix it.
|
| This is unfortunately the status quo in a universe
| slightly less ideal than "seemed like a good idea at the time"
|
| Use SQS or RabbitMQ or something.
|
| Edit: Also if you're using something SQL for queues you're
| going to have to build out your monitoring stack yourself
| based on your implementation rather than suck up a metrics
| endpoint with queue stats or pull from cloudwatch.
| jbverschoor wrote:
| I dunno. If it's usually the same problem, it should be
| easy to identify and fix.
|
| Unless we're talking about a high load, there should be
| no problem doing this.
| hughrr wrote:
| It's only the same problem until someone dumps 25,000
| entries in your queue.
| kasey_junk wrote:
| I've seen all manner of hokey queue implementations in
| sql going back about 20 years and all of them could
| handle 25k enqueue bursts. That wasn't a problem for a
| Sybase database on a commodity host circa 2000.
|
| I think if I were going to argue against using DBs as
| queues it would be around: heavy parallel write use
| cases, latency concerns of both reads/writes and scaling
| to millions of events per second.
|
| If you don't have those concerns using a properly
| normalized and protected schema (which you are doing
| anyway right? Cause if not you are already shooting your
| toes off) for queues goes a very long way and removes a
| very big operational burden and tons of failure modes.
| justsomeuser wrote:
| I agree.
|
| Going from a single database to a (database + queue)
| means two server processes to manage, maintain, observe,
| test etc.
|
| I actually start with SQLite to reduce as much
| distributed state as possible, then move to something
| else once it's proven it will not work.
| hughrr wrote:
| It wasn't a problem for Sybase on a commodity host circa
| 2000 because clearly that host wasn't doing a whole lot
| of other stuff. It's a big problem for our 48 core nodes
| with 2TiB of RAM and a metric shit ton of DAS NVMe. Ergo
| anecdotes don't scale either.
|
| To clarify we just moved this entire problem to SQS.
| crazygringo wrote:
| What database can't handle an extra 25,000 entries?
|
| That's... nothing. Databases handle billions of rows
| effortlessly using a b-tree index. So not really sure
| what point you're trying to make?
| glogla wrote:
| The benefit of having the queue in the same engine as the
| main business functionality is that you can do stuff in a
| single transaction.
|
| Having the queue on its own means you have to handle
| exceptions like "job is done but queue didn't get
| updated" or get into distributed transaction which is
| very different challenge.
|
| Also, operationally having one inherently stateful
| component (db) versus two (db, queue) makes things a lot
| simpler to manage, deploy, update, etc.
|
| Not claiming this is a perfect solution, just that the
| tradeoff isn't obvious.
| hughrr wrote:
| The point of queues is sequencing, storage and decoupling
| of events for the sake of scalability and durability.
|
| By putting it in the same transactional store with the
| same transactional boundaries you're instantly shooting
| the whole point.
|
| Not only that, most queues define boundaries between
| different logical systems where transactional boundaries
| do exist. At which point your database's transaction
| scope is extremely limited in capability.
|
| In the real world of messaging transactions span more
| than just the database engine.
|
| It's just bad architecture. Full stop.
| victor106 wrote:
| Having a separate queue makes sense across application
| boundaries. If you need queuing mechanisms in your own
| application you should be able to get by with a dB for
| 99% of use cases
| dragonwriter wrote:
| > The point of queues is sequencing, storage and
| decoupling of events for the sake of scalability and
| durability.
|
| The point of queues in any particular application is some
| subset of that; the calculus of implementation approaches
| that make sense depends not on the abstract point of
| queues _in general_ but on the concrete point of queues
| _in your use case_.
| flefto wrote:
| So use a separate database.
|
| Is contention with the application your only objection?
| That's pretty weak.
|
| What's the real compelling reason not to use a database?
| You haven't said.
| hughrr wrote:
| Contention, architectural separation, you have to build
| your own monitoring stack, not transactional with
| concerns outside the database without introducing
| distributed transactions and risk, no routing or
| distribution capability, you have to build it yourself,
| point in time message durability is somewhat dubious
| depending on your acknowledgement process which of course
| you had to invent yourself as well.
|
| Not to mention numerous database administrators crying
| into their cups of coffee.
|
| Enough?
|
| Like I said 20 years of people fucking this up is my
| experience.
| deskamess wrote:
| > you have to build your own monitoring stack,
|
| How is this done in Redis automatically? At some point
| you are writing queries (albeit Redis ones) to pull
| metrics. Other queueing systems may expose an API but
| there is always some level of integration. With
| Postgres/other db I would write SQL which is their API
| and a powerful one at that.
|
| I can couple events with triggers to auto-generate other
| events etc, have built in audit capability, roll up
| reporting etc, all with standard SQL. (all in the context
| of monitoring and reporting)
| hughrr wrote:
| Um, I'm not suggesting using Redis. In actual fact I said
| elsewhere I wouldn't use Redis for queues.
|
| As for triggers, reporting, audit, I'm laughing now
| because you miss the point. That's another bunch of IO in
| your monolithic black box, and you're still building it.
|
| Start here: https://www.rabbitmq.com/prometheus.html
| Jarwain wrote:
| Currently at work we use a postgresdb for queues for
| long-running persistent jobs, but I'd like to move away
| from that model since I have doubts about how it'll
| scale.
|
| I've thought about using RabbitMQ, but have a few
| questions: it appears that when a job is consumed, it's
| gone. How would I maintain a record of the jobs & status
| of the job in rabbitmq? If I wanted to display job
| updates how would I handle that? I didn't think you could
| update a message once it's already in the queue
|
| Or am I missing the mark? Do I want to separate the
| business entity "job" that maintains a status and updates
| and such from the "thing that dispatches jobs to
| workers"? And when a worker has the job it just updates
| the business entity in the database?
| doktorhladnjak wrote:
| You might want to look at something like SWF or Temporal
| flefto wrote:
| Meh. All your objections are hand-wavy and not factual.
|
| Databases actually work fine as a queue but emotionally
| you don't like it. That's fine it's just not real strong
| objections.
|
| What you have not said is "it physically does not work",
| and that's because it does work fine.
| phildawes wrote:
| I don't think he's saying 'it cannot be physically made
| to work (given enough effort and discipline)'. The
| impression I got was more like 'if you go down this route
| then the incremental path of least resistance leads
| somewhere bad'.
| rmetzler wrote:
| The range where databases work fine is pretty small.
| Sudden spikes in job creation will kill the performance
| of the database. Often enough the job creations come from
| external systems you can't control.
| hughrr wrote:
| I'm not saying it doesn't work. I'm saying you're
| shooting your toes off down the line.
| mianos wrote:
| Skype at its peak used a postgres queue extension, pgqueue,
| at its core. Seemed to go OK for them. The extension did work
| around some limitations when running a pure sql queue in
| postgres. These days I would run Pulsar but if you only have
| a million users and already have postgres you will be fine.
| conradfr wrote:
| I worked at a place using pg_queue (the person who created
| it worked there before going to Skype IIRC) and maybe it
| was working well operationally but it was not a great
| experience to develop on it. It was hard to generate events
| and debug/observe them (it was eight years ago so the
| details are a bit fuzzy).
|
| MS also moved away from it afterwards when they acquired Skype
| but I guess it was more because of the shift to SQL Server.
| maxrev17 wrote:
| Lean startup begs to differ :p
| vincentmarle wrote:
| Exactly, GP does not sound like someone who has worked on
| anything with decent scale. Because if you did, you would
| know that it is a really, really bad idea.
| manigandham wrote:
| Very few things have "decent scale". 99% of apps can run on
| a single server with modern hardware.
| lolinder wrote:
| Or maybe GP is someone who has worked on projects from
| their infancy. If your queue will be handling less than a
| request per second for the foreseeable future, why _not_
| just use the technologies already in your stack and plan a
| migration path now? What do I gain from adding Redis (and
| thereby making my system even more distributed, with all
| the hazards that carries) if my project isn't even
| guaranteed to make it to 10 rps before going bankrupt?
| aidos wrote:
| The solution I use now is to have the queue in the dB and
| then have a single process pushing to another queue better
| suited to dealing with workers.
|
| I drank from the "keep your queue out of the dB" koolaid
| once, but then you end up worrying about transaction
| boundaries and atomic writes to the queue and the dB. It
| totally depends on your workload, but in my case, I've found
| the halfway solution of the queue in the dB so you get to
| write to it within your transactions, with a dedicated queue
| for the workers to be a much better solution.
| hughrr wrote:
| Are you sure it works properly when something fails? That's
| a wonderful question I like to ask people because the
| answer is usually no. Then someone gets billed twice for
| something and then the guy who said this was a good idea
| gets shot.
| aidos wrote:
| Well, I only ever enqueue idempotent work, so again, in
| my scenario it can fail and be rerun, but I'm also unsure
| about what you think is a bad idea here.
|
| Honestly, I had more code managing two phase commits so I
| would only push to the queue when I was sure (ish) the dB
| commit was ok.
|
| Say you need to trigger something to send an email, but
| let's say that there's an error condition after and the
| transaction is rolled back. How do you handle that
| failure? In my scenario the queue is rolled back too, no
| harm no foul.
| AtNightWeCode wrote:
| The outbox pattern. Can in many cases be solved by using
| topics.
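| The outbox pattern named here is worth a sketch: write the
| outgoing event in the same transaction as the business
| data, then have a separate relay ship it to the real queue.
| The table and column names below are purely illustrative:

```sql
-- Write the business row and the event atomically:
BEGIN;
INSERT INTO orders (customer_id, total) VALUES (42, 99.50);
INSERT INTO outbox (topic, payload)
VALUES ('order_created', '{"order_id": 1}');
COMMIT;

-- A separate relay process polls the outbox, publishes each
-- row to the broker, and deletes it only after the broker
-- acknowledges (:published_ids is a client-side placeholder):
SELECT id, topic, payload FROM outbox ORDER BY id LIMIT 100;
-- ... publish to broker ...
DELETE FROM outbox WHERE id = ANY (:published_ids);
```

| If the transaction rolls back, the event never existed; if
| the relay crashes after publishing but before deleting, the
| event is delivered twice, so consumers should be idempotent.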
| aidos wrote:
| Ah ha. Yup, that's exactly it. Thanks for the pointer.
|
| Edit: can you explain by what you mean by solving it with
| topics? I thought topics were a broadcast mechanism?
| sorokod wrote:
| What happens if the process that performed 2. crashes before it
| was able to complete whatever processing it was supposed to do?
| jayd16 wrote:
| I think they want to select expire < now instead of expire =
| 0. Delete the row on completion.
| jffry wrote:
| The idea, I think, is you wouldn't delete the job from the
| queue until the processing was done.
|
| Of course, this relies on the jobs being something that can
| be retried.
| sorokod wrote:
| Looking at the code, some steps are missing in the
| description:
|
| 3. When processing is complete the matching table entry is
| deleted.
|
| 4. There is a "reaper" process that checks for stale jobs
| (based on time in the future) and resets them to 0.
|
| This of course raises more questions.
| shawnz wrote:
| But you will have marked it as in progress by setting the
| "expire" to a non-zero value, preventing any other workers
| from trying to work on it. How will they know that the
| worker which marked it actually crashed and will never
| finish?
|
| By using SELECT ... FOR UPDATE SKIP LOCKED, the record will
| automatically get unlocked if the worker crashes.
| sorokod wrote:
| This approach has its own gotchas.
| jffry wrote:
| Not sure. Maybe something else is responsible for
| identifying "stuck" jobs and kicking them out to humans
| to decide what to do (assuming worker crashes are rare)
|
| If we're talking PostgreSQL specifically, and newer-ish
| Postgres (9.5+ I think), then you can leverage its
| abilities to do all this in one atomic query:
|
|     UPDATE jobs SET status = 'working'
|     WHERE id = (SELECT id FROM jobs
|                 WHERE status IS NULL
|                 LIMIT 1
|                 FOR UPDATE SKIP LOCKED)
|     RETURNING id;
| shawnz wrote:
| If you are willing to use SELECT ... FOR UPDATE SKIP
| LOCKED, then you can implement the queue without any
| UPDATE query or status column at all. Just lock the
| record to mark it as in progress, that is how you get the
| benefit that when the worker crashes, it will
| automatically be returned to the queue.
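| A minimal sketch of that lock-only variant, assuming an
| illustrative "jobs" table with a serial id:

```sql
BEGIN;

-- Claim: the row lock itself marks the job as in progress.
-- SKIP LOCKED makes other workers pass over claimed rows
-- instead of blocking on them.
SELECT id, payload
FROM jobs
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ... process the job here, while the transaction is open ...

DELETE FROM jobs WHERE id = :claimed_id;  -- only on success
COMMIT;
-- A crash before COMMIT releases the lock, and the row
-- becomes visible to other workers again.
```

| The trade-off is that the job holds a transaction open for
| its whole duration, which can hurt for long-running work.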
| hardwaresofton wrote:
| Don't forget SKIP LOCKED[0]!
|
| [0]: https://blog.2ndquadrant.com/what-is-select-skip-locked-
| for-...
| masklinn wrote:
| `FOR UPDATE` is also a good idea.
| Koffiepoeder wrote:
| The article mentions that.. The comment by chx was rather a
| more generalized idea for implementing a queue that is not
| database-specific.
| sigil wrote:
| Problems with this approach:
|
| 1. Recovery.
|
| 2. Concurrency.
|
| 3. Polling.
|
| In more detail:
|
| 1. _Recovery_. Suppose a worker dies while processing a job.
| The job should be retried, but how do you know this, since
| expire > 0? You can impose a timeout, but that has drawbacks
| -- there might be long jobs you repeatedly start but never
| finish. To recover failed jobs without imposing a timeout,
| you'd have to run both the UPDATE and the job processing inside
| a transaction. That way, a failed job can result in an implicit
| ROLLBACK, freeing the job for future worker attempts.
|
| 2. _Concurrency_. So now recoverable workers are trying to
| claim jobs inside a transaction. If there are 10 jobs and 10
| workers, we want all 10 jobs processing concurrently. However,
| each worker SELECTs the first item_id and tries to UPDATE that
| same row. 1 worker wins. The other 9 workers block while that
| transaction completes, then update zero rows. Concurrency will
| hover around 1.
|
| 3. _Polling_. In the transactional UPDATE approach, there's no
| way to tell whether there are free jobs short of trying to
| claim them via UPDATE, which blocks. So you must poll, and
| either burn cpu on quiet queues, or introduce an artificial
| delay into job processing times. The beauty of the SKIP LOCKED
| approach is that it _can_ tell whether there are free (not
| currently processing) jobs. Even if you wake all 10 workers up
| via LISTEN/NOTIFY for a lone job, 9 workers in the
| thundering herd will fail to claim the job, and go back to
| sleep.
|
| I blame TFA for not sufficiently explaining the SKIP LOCKED
| approach. A much better explanation is here:
| https://www.2ndquadrant.com/en/blog/what-is-select-skip-lock...
| dillondoyle wrote:
| Thanks for posting this! It is relevant to me right now and
| great example. Especially since I'm using a new to me database,
| Spanner.
|
| Specifically I similarly pull rows into a 'queue' like that
| linked sql by WHERE primary key, primary key, and queue_status
| ORDER BY timestamp, uuid LIMIT 10. I was worried what would
| happen with many requests around the same time if it would
| accidentally pull duplicates into 'queue.'
|
| I think hopefully I've implemented Spanner locking read write
| transaction correctly and that this won't cause problems at
| scale (we're not even close to google scale lol).
| wodenokoto wrote:
| > SELECT item_id WHERE expire = 0
|
| I come across a lot of SQL databases that uses 0 instead of
| false.
|
| Is there any good reason for this?
| masklinn wrote:
| So that you don't need a separate column for an associated
| datum?
| chx wrote:
| false would be "expired", this is called "expire" because it
| is a timestamp, it's the epoch time when the item expires.
| The worker presumably finishes before that time and if it
| does then the item is deleted. So, if any item has a nonzero
| expires in the past then it needs to be reset to 0 so it can
| be retried.
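| A rough sketch of that pattern, assuming epoch-second
| integers in "expire", a 5-minute lease, and an illustrative
| "queue" table (:item_id is a client-side placeholder):

```sql
-- Claim one free item by stamping a deadline on it. The
-- outer "expire = 0" check guards against two workers racing
-- for the same item_id: the loser's UPDATE matches no rows.
UPDATE queue
SET expire = EXTRACT(EPOCH FROM now()) + 300  -- 5-min lease
WHERE expire = 0
  AND item_id = (SELECT item_id FROM queue
                 WHERE expire = 0
                 LIMIT 1)
RETURNING item_id;

-- On success, the worker deletes the item before the lease
-- runs out:
DELETE FROM queue WHERE item_id = :item_id;

-- A periodic reaper frees items whose lease lapsed (the
-- worker died or stalled), so the next worker can retry them:
UPDATE queue
SET expire = 0
WHERE expire <> 0
  AND expire < EXTRACT(EPOCH FROM now());
```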
| xupybd wrote:
| Only in databases that have no Boolean type.
|
| Sometimes it's an artifact of the orm mapping.
| dragonwriter wrote:
| > You need a table with a primary id and a "status" field.
|
| For a queue, you need a table with an insertion order and a
| status indicator. "Primary id" is not enough because "primary
| id" may not be ordered (e.g., a UUID primary key is a primary
| id.)
|
| OTOH, while a DB can do this, there are plenty of well-tested
| open source purpose-built message queueing solutions, so the
| set of scenarios where "roll your own with a DB" isn't a misuse
| of effort is limited.
| lolinder wrote:
| Why on earth would you make a queue table with a non-serial
| primary key? What benefit do you gain from a UUID here?
| [deleted]
| solipsism wrote:
| What am I missing? With your description, nothing ever sets
| expire to 0. Can't look at the code comfortably as I'm on
| mobile.
| pjmlp wrote:
| Yeah, most of the NoSQL solutions boil down to not learning how
| powerful SQL and its programming extensions are.
| rad_gruchalski wrote:
| How so?
| kasey_junk wrote:
| That's a bit of rewriting history. NoSql, particularly
| document style stores came about at least as much as a
| reaction to the overweight policies & orms the sql world has
| made rampant at the time.
|
| You'd be trying to store a tiny bit of simple state and all
| the books/articles would have you standing up read only views
| & stored procedures for all your crud ops. The document
| stores came along with a fresh perspective and easy scaling.
|
| Then there were the columnar stores and time-series stores
| that really did solve the newer scale problems in ways the
| existing sql stores didn't.
|
| I'm a sql guy through and through but it's important to
| recognize the nosql movement was a reaction to real pain
| points. Also it made the sql databases better.
| pjmlp wrote:
| What history rewriting?
|
| I have been using RDBMSes all along since 1996.
|
| Nokia NetAct was scaling GB of data across multiple
| clusters with OLAP reporting engine in 2005 with no
| hiccups.
|
| The experience I had with DynamoDB kind of proved to me I
| haven't lost anything by staying in the SQL path.
|
| Most NoSQL deployments I have seen, could have been easily
| done in Oracle or SQL Server, provided they actually had a
| DBA on the team.
| kasey_junk wrote:
| But that's sort of the point. You are saying "if you have
| a DBA and use proprietary products and have the skill to
| understand the trade offs of running a db with another
| data warehouse product layered in you can handle GB of
| data".
|
| Mongo said "Throw data at me and I'll scale with very
| little work".
|
| Now, I've always largely believed that's penny wise and
| pound foolish but it's certainly a good pitch.
| pjmlp wrote:
| Which is what I originally stated "boil down to not
| learning how powerful SQL and its programming extensions
| are".
|
| One can only have bought into Mongo's story, by lacking
| the skills to understand how fake it was.
| [deleted]
| sorokod wrote:
| "mongo said" - the obligatory mongo DB is webscale video:
|
| https://m.youtube.com/watch?v=b2F-DItXtZs
| nemothekid wrote:
| DynamoDB was designed to solve a very different problem
| than a traditional SQL database. When DynamoDB (and the
| other Cassandra flavors) were released there were no
| databases doing multi-master failover with high write
| throughput - we are talking about TBs, not GBs. It's not
| a coincidence that Google, Facebook, and Amazon all had
| to write to their own database at around the same time
| (BigTable, Cassandra, Dynamo).
|
| With those new tools you had other companies building on
| top of those databases for far cheaper than a license to
| MSSQL or any other OLAP of choice would give you.
| edwinyzh wrote:
| Great! So I learned this technique. But one thing that's not
| clear to me - how a "expired" datetime field can allow easy
| retries? Anyone give me an example? Thanks!
| chx wrote:
| Successful items are deleted from the queue before they
| expire. So an item with expired > current time needs to be
| retried because it failed processing. So, a cronjob resets
| these to expire 0 and the next worker can pick them up.
| edwinyzh wrote:
| Thanks for the explanation; some time is needed in order to
| digest it... I guess one might start with the simple
| "status" field approach ;)
| DangitBobby wrote:
| The status doesn't help if the worker died during
| processing. If you've got something that's been "in
| progress" for 20 minutes of a job that takes 2, your job
| might not be getting done. That's what expired is for.
| layoutIfNeeded wrote:
| Doesn't sound like a robust solution. What if the job takes
| longer than your estimated expire time? Does the worker
| keep updating the expiry date on a background thread while
| doing the processing?
| cbo100 wrote:
| Same for any queuing system. You need to set the expiry
| time long enough for the expected task duration.
|
| In SQS, for example, you use a visibility timeout set
| high enough that you will have time to finish the job and
| delete the message before SQS hands it off to another
| reader.
|
| You won't always finish in time though, so ideally jobs
| are idempotent.
| ptx wrote:
| You mean _expired < current time_, right?
| johbjo wrote:
| Could the process not delete the row, and not commit until
| it has finished processing? Otherwise, the transaction is
| rolled back when the db connection ends. This would not
| require timeout or clean-up.
|
| One would need the appropriate transaction isolation, but
| shouldn't it work?
| papito wrote:
| The world would be a simpler place if people actually knew their
| storage engine like Postgres. I mean, _knew_ it. All the features
| and how to leverage them in a performant way. Would eat less
| cloud energy too.
| sgt wrote:
| As much as I love Postgres, I would rather use Redis for this. I
| haven't used Redis much though, and on our project we actually
| decided on using Kafka. Admittedly it's much heavier and more
| maintenance-intensive, but it seems to do the job very well.
|
| Any opinions and experiences here with Kafka vs Redis as a queue?
| ickyforce wrote:
| Don't use Kafka as a queue.
|
| (I haven't used Kafka for a while, if anything below is
| outdated let me know)
|
| The main issue I've experienced was balancing work across
| consumers - if one consumer is slower than others (e.g. running
| on a machine with some other heavy workload). In such a case, when
| a larger number of jobs is enqueued and expected to process
| quickly it's possible that everything will complete except jobs
| which landed on the partition belonging to the slow consumer.
| There can be only one consumer per partition so others can't
| help.
|
| One could consider using a single partition to avoid this
| imbalance but this means that there could be only a single
| consumer - not good.
|
| In other queue systems consumers fetch up to N jobs from a
| shared pool when they need more work. This means that in the
| case of a slow consumer the lag is limited by the time
| required to process N jobs.
|
| This situation also arises if consumers are equally fast but
| some kinds of jobs are considerably slower. Random distribution
| across partitions mitigates it to a degree.
|
| My case was precomputing caches after some event; the goal was
| to do this within 10 seconds. The first implementation with
| Kafka worked, but very often some consumer couldn't use as much
| CPU time as the others, and we didn't want dedicated nodes for
| this because the event was happening at most a few times a day.
| As a result, jobs from a single partition were processed 2x
| slower - we had to overprovision.
|
| Replacing Kafka with a queue based on Redis solved this right
| away, all consumers were finishing at the same moment. Although
| it wasn't a large scale operation.
|
| The second complaint about using Kafka as a job queue is not
| being able to modify the queue, e.g. to reorder, postpone or
| cancel jobs.
| sgt wrote:
| That is very interesting, thanks. I will look into a possible
| migration to Redis, which in any case is so lightweight and
| simple to run, it seems.
| arpa wrote:
| It's apples versus oranges with rabbits in the middle.
| des1nderlase wrote:
| Ha ha. So true. In theory though you can make any Turing-
| complete piece of software do anything, so rabbits can be
| turned into oranges...
| dandarie wrote:
| How about tags/sets?
| flefto wrote:
| MySQL and Microsoft SQL server also support SKIP LOCKED.
| taf2 wrote:
| Redis to me is the magic solution to so many complex
| multi-process syncing issues: a global lock, rate limiter,
| uniqueness with a set, stasher of all the temporary things... no
| migration, quick solution to complex problems - streams of events
| to organize pending tasks, etc... to me it's all about the
| temporal / temporary state my application needs to keep track of
| in order to work in a multiuser, multiprocess and multi-service
| environment... just a different tool for a different job than
| what I would use a database for... but really it's computer
| science, to each their own
| x7777 wrote:
| Sometimes I wonder what kind of apps you are all developing where
| using postgres instead of redis makes sense? Do you have 3 people
| using your app? Is 800ms latency ok for your users? Or is it that
| you don't care? Honest question.
| jayd16 wrote:
| Postgres is plenty fast. If you think it can only do 3 users or
| 800ms, you're doing something wrong.
| x7777 wrote:
| It is the blind claim of it being "plenty fast", echoed
| throughout this thread, that I find problematic. The article
| describes how to implement queueing/app locks/pub sub
| without providing any benchmarks on how the same
| implementations compare to redis under different loads.
|
| If you are using redis and you don't care about perf you are
| probably doing it wrong. If you are using postgres and you
| care about perf you are probably doing it wrong. "Probably"
| because context is key.
| mixmastamyk wrote:
| Pg is very fast. If it is slower it is because it does more
| and does it more safely. When not needed those things can
| be disabled.
|
| There are a number of tips about that in the discussion
| above, such as turning off logging on a job table and
| partitioning it.
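| The logging tip, for instance, is a table-creation option; a
| sketch assuming a hypothetical job table:

```sql
-- UNLOGGED skips write-ahead logging: much less I/O per job, at
-- the cost of the table being truncated after a crash -- often
-- an acceptable trade for transient queue data.
CREATE UNLOGGED TABLE jobs (
    id         bigserial PRIMARY KEY,
    payload    jsonb NOT NULL,
    status     text NOT NULL DEFAULT 'pending',
    created_at timestamptz NOT NULL DEFAULT now()
);
```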
| cs02rm0 wrote:
| I tend to ask the opposite question more often.
|
| Do you really need Postgres?
| gchamonlive wrote:
| Get your requirements in check.
|
| How sensitive is your app to latency? How much data and request
| volume do you need to handle?
|
| Do proof of concepts, write thorough load tests and go for what
| makes sense.
|
| Either way, no matter which tech you choose, make sure monitoring
| and alarms are in place and that you do regular maintenance
| exercises.
| MichaelMoser123 wrote:
| It's cool that they added LISTEN/NOTIFY to Postgres; I wonder
| when they will add many-to-many relationships. My problems with
| Postgres start when I have to do a many-to-many relationship.
| For those you need an extra table for the link entries; now
| performance tanks, and you need to denormalise this for read
| requests. That's where the fun starts...
| mixmastamyk wrote:
| M2M tables shouldn't be a problem except at the very high end,
| in which case you'll have a number of mitigations at your
| disposal.
| [deleted]
| jdlyga wrote:
| It's kind of like buying all sorts of running outfits and a
| camelbak to go on a 3 mile jog once a week. It's about overeager
| optimism of doing races and marathons in the future. Where in
| reality, you can get away with just running in the track pants
| and tshirt you already have on.
| osigurdson wrote:
| This seems to come up on HN at least once a year. Sure it can
| work but LISTEN ties up a connection which limits scalability as
| connections are limited and expensive. Also, mitigation
| strategies like PgBouncer cannot be used with this approach
| (nor, I think, can scale-out solutions like CitusDB).
|
| Of course, if scalability is not a concern (or the connection
| limitations are eventually fixed in postgres), this would be a
| very viable approach.
| axiosgunnar wrote:
| The point the author is missing is that most people are not
| deliberately choosing Redis.
|
| They use packages such as Celery which happen to use Redis under
| the hood, but the user doesn't have to deal with these details
| other than pasting some Redis connection URL somewhere.
| Nicksil wrote:
| > They use packages such as Celery which happen to use Redis
| under the hood, but the user doesn't have to deal with these
| details other than pasting some Redis connection URL somewhere.
|
| Celery doesn't use Redis under the hood.
|
| However, you may use Redis as a message broker and/or result
| store.
|
| https://docs.celeryproject.org/en/stable/getting-started/bac...
| antihero wrote:
| I think one of the draws of redis was, back when it came out, how
| simple it was to set up and maintain in comparison to an RDBMS.
| sometimesshit wrote:
| I need an article how to get away with just MangoDB
| simonw wrote:
| The best reason I know of to use a relational database as a queue
| is that it lets you trigger queue operations from within a
| transaction - so a queue item is guaranteed to be created if the
| transaction succeeds, and guaranteed not to be created if the
| transaction fails.
|
| Brandur wrote a great article about that here:
| https://brandur.org/postgres-queues
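| A sketch of what that buys you, with hypothetical `orders` and
| `jobs` tables (the names are illustrative, not from the
| article):

```sql
BEGIN;

-- The business write and its follow-up work succeed or fail
-- together: there is no window where the order exists but the
-- receipt job was lost, or vice versa.
INSERT INTO orders (customer_id, total)
VALUES (42, 99.00);

INSERT INTO jobs (payload)
VALUES (jsonb_build_object('type', 'send_receipt',
                           'customer_id', 42));

COMMIT;
```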
| Aeolun wrote:
| This seems a bit silly. We might be able to use only postgres,
| but this seems harder than using Postgres _and_ Redis.
|
| Any time someone says 'this works for small workloads' I hear,
| 'we'll eventually have to migrate this'.
|
| That's not a problem if you are talking about introducing a
| hugely complex piece of infrastructure, but Redis is hardly that.
| gizdan wrote:
| > Any time someone says 'this works for small workloads' I
| hear, 'we'll eventually have to migrate this'.
|
| Personally I aim to develop with future migrations in mind.
| Certain abstractions are worth it. Put an interface around a
| cacher, db store, etc. The day-to-day dev overhead is low, but
| your future self will thank you should you need to migrate away.
|
| Sure this isn't always possible, but do it as much as possible.
| manigandham wrote:
| Most workloads are small workloads. Modern hardware is
| incredibly powerful.
| ryanjshaw wrote:
| Or don't use Postgres to begin with. The reason you need to
| SELECT FOR UPDATE is because Postgres doesn't implement READ
| UNCOMMITTED. If you're implementing an ordered queue, you have
| no way of reliably getting the next page of queue items without
| taking a lock using SELECT FOR UPDATE because there may be
| pending uncommitted transactions that you need to block on. On
| MySQL and SQL Server you can compare counts computed with READ
| UNCOMMITTED and READ COMMITTED (on MSSQL, with snapshot
| isolation) to quickly figure out whether there are any
| uncommitted transactions in the range without taking a lock.
| That being said, SELECT FOR UPDATE and UPDLOCK equivalents are
| the easiest to get right, and if you need higher throughput you
| probably need a message queue or Kafka-equivalent.
| kristopolous wrote:
| I've worked at plenty of businesses with annual revenue in the
| tens of millions using "non-scaled" solutions without any
| issue.
|
| If we rephrase the question to the material reality of "What
| will we do when we're pulling in $20,000,000 a month?" then you
| can see how silly it is.
|
| I dunno, just buy a company then and use their stuff? It's
| literally not a problem that a pre/early-revenue company should
| or realistically can solve. It's part of "founders delusion" -
| the same kind that make them get a huge office that's mostly
| empty or all that unused computer hardware sitting in the
| closet. It's just gambling.
|
| It'd be like saying "we have $500,000 in the bank at this new
| restaurant and need to be ready to service 5,000 national
| locations with our staff of 5."
|
| I mean no, by the time that's no longer just a mental exercise
| (as in you have, say 4,700 locations) you'll be a different
| kind of institution better able to solve the problem. Literally
| just forget about it. It's so profoundly irrelevant...
|
| Solve the problems that are actually real, you've got enough of
| them, I promise.
| simonjgreen wrote:
| I feel like I've spent half my career telling people this.
|
| My general recommendation is to engineer for only 1 order of
| magnitude greater than you need today, but even that stops
| applying eventually.
| kristopolous wrote:
| The response is easy "find me an otherwise successful
| company that actually collapsed from failing to scale
| things like a database and we'll use it as a case study."
|
| Security and PII stuff is unfortunately something you _do_
| have to be super anal about these days, but scalability? No.
| anaganisk wrote:
| Of the thousands of applications deployed, most will probably
| never need to scale.
| zomgwat wrote:
| Security is another factor in choosing not to use a database for
| things like queues. I like to keep access to the database as
| tight as possible. I don't want to start dishing out network and
| account access to services just because they need a queue or a
| distributed lock. I could run a separate database instance but
| that's worse than just running Redis.
| ram_rar wrote:
| Why do you need Postgres, if you can get away with Redis? For
| all the 3 points mentioned in the article, I would rather just
| use Redis as opposed to Postgres - unless you really need a
| relational database for the project.
| dkarras wrote:
| I think the article is assuming you already have Postgres for
| data storage, and arguing against the addition of Redis when
| that is the case, for the common requirements that typically
| call for adding Redis into the mix.
| cpursley wrote:
| With Elixir you can replace both with something like Quantum or
| backed by Postgres with Oban. Or even just a hand rolled
| genserver.
|
| You can also listen to Postgres WAL changes to build an Event
| like system with Elixir. Supabase is doing this with their
| Realtime app.
| Rygu wrote:
| Don't forget to use Partial Indexes on jobs/tasks tables with
| queries like WHERE status = 'pending'
|
| More: https://use-the-index-luke.com/sql/where-clause/partial-
| and-...
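| Assuming the usual jobs table with a `status` column, such an
| index only contains the rows workers actually poll, so it stays
| small and hot no matter how much completed history accumulates:

```sql
-- Completed and failed jobs never enter this index; only the
-- 'pending' rows scanned by the dequeue query are indexed.
CREATE INDEX jobs_pending_idx
ON jobs (created_at)
WHERE status = 'pending';
```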
| BiteCode_dev wrote:
| Oh but redis is much more than that.
|
| It's so simple any little script can interact with it in seconds,
| instead of having to craft complex SQL or import your wrapper.
| You can call redis from the weirdest places, like from the inside
| of an nginx config file. You don't even need a compiled driver,
| if that matters.
|
| It's easier to get it to perform. It's possible to get great
| perf with PostgreSQL, but you just get it for free with redis.
| Very hard to screw up. Reads, writes? N+1? Batching? Who cares,
| it will be fast no matter the concurrency or the load.
|
| Sure you can expire with Postgres, but having done so in
| practice, it's way harder than it looks to get right. With redis,
| you don't have to care. Set a cache, set a lock with a timeout,
| store data you are not sure you need. It will all disappear.
|
| Redis is not just key/value. You have sets, so ensuring
| uniqueness is a piece of cake: no constraints to define, no
| insertion to check against. And of course, you have sorted sets,
| which are kept ordered at insertion by any number you pair the
| value with, such as a timestamp/score/index, and which you can
| truncate by range for extra magic.
|
| And then you have bonuses like HyperLogLog, which you need an
| extension for in Postgres. Super handy for stats.
|
| Finally, you have streams, which for most apps will fill the gap
| for a time-series database or a way to store your app logs. All
| that with 2ms latency at 10k requests/s. None of my projects
| ever needed more than 1k/s though, even one with 1M users/month.
|
| You have all of that with a dead simple install and basically no
| maintenance.
|
| In fact, redis by itself consumes almost no resources; it's not
| even worth it to not have it in your toolset. I just install it
| by default on all my projects: I've never regretted it, there is
| always something it can do for you. If not now, just wait a bit;
| it's costing you nothing, and something will come up.
|
| So no, let's use Postgres for what it's great at, which is being
| a robust all-purpose database. Redis is a fantastic complement to
| it, not a competitor - just use both.
|
| Unless you are Google-sized, there is little chance you will
| reach a stage where you need to migrate off either of them.
|
| It's part of those tech that are just too good to be true, like
| SQLite or Python.
|
| My only regret is that it doesn't exist on windows.
|
| P.S.: if you need caching and can't afford redis for a small
| Python script, use DiskCache, it's awesome:
| http://www.grantjenks.com/docs/diskcache/index.html
| junon wrote:
| Not to mention I've never personally experienced redis go down
| by itself - it's always some stupid user error that could have
| happened to any part of the infrastructure.
|
| Redis is, far and away, _the_ most robust piece of software
| I've ever had the pleasure of deploying to production.
| snicky wrote:
| That + the best documentation in the industry. It is a joy to
| work with.
| bob1029 wrote:
| I feel I could write a similar article titled: "Do you really
| need PostgreSQL? How to get away with just SQLite"
|
| How many apps are running pg on a single node and don't really
| have plans to scale beyond this?
|
| Exclusive ownership of a SQLite database by the application can
| deliver better performance than exclusive ownership over a single
| node postgres database. This also extends to SQL Server, DB2,
| Oracle, or any other hosted solution.
| aeoleonn wrote:
| Yep. Or even
|
| "Do you really need SQLite? How to get away with a dozen
| secretaries, pencil, paper, and a fax machine"
|
| I'm with ya. Why not use the right tool for the right job,
| given the right tool exists?
| mosselman wrote:
| For rails I've used que in the past, which is a job queue adapter
| for Postgres. It supports activejob.
|
| https://github.com/que-rb/que
| shaicoleman wrote:
| Another alternative is GoodJob:
|
| https://island94.org/2020/07/introducing-goodjob-1-0
|
| https://github.com/bensheldon/good_job
| pablodavila wrote:
| For Elixir, Oban is a great library that allows the same, in
| case anyone was looking for one.
|
| https://github.com/sorentwo/oban
| void_mint wrote:
| "Do you really need < insert well-tested, verified technology > ?
| Why not just hand-roll your own in < different technology > ?"
|
| Hopefully the emphasis is clear why this is silly.
| meitham wrote:
| Indeed this is silly, to me this sounded like "Do you really
| need a drill? Just hammer this screw down!".
| void_mint wrote:
| As with all things, if the goal is exploration, hell yes try
| this out.
|
| If you're actually trying to solve the problems Redis is
| meant to solve, just use Redis. Otherwise you become the sole
| maintainer of a glued together, suboptimal version of Redis
| based in Postgres. For most people, they'd rather build
| whatever caused them to need Redis in the first place, rather
| than building their own internal worse-version-of Redis.
| iamgopal wrote:
| Does groupcache for Golang work with Cloud Run, as a Redis
| replacement?
| nameless912 wrote:
| It's always wild when an article on HN has an answer to the exact
| problem i wanted to solve this weekend. Kudos!
| nickjj wrote:
| I think one of the biggest advantages of using Redis for job
| queuing vs Postgres comes down to library support.
|
| For example Python has Celery and Ruby has Sidekiq. As far as I
| know there's no libraries in either language that has something
| as battle hardened with comparable features for background tasks
| using Postgres as a backend.
|
| There's a big difference between getting something to work in a
| demo (achievable by skimming PG's docs and rolling your own job
| queue) vs using something that has tens of thousands of hours of
| dev time and tons of real world usage.
|
| I'm all for using PG for things like full text search when I can
| because it drastically reduces operation complexity if you can
| avoid needing to run Elasticsearch, but Redis on the other hand
| is a swiss army knife of awesome. It's often used for caching or
| as a session back-end so you probably have it as part of your
| stack already. It's also really easy to run, uses almost no
| resources and is in the same tier as nginx in terms of how crazy
| efficient it is and how reliable it is. I don't see not using
| Redis for a job queue as that big of a win.
| kortex wrote:
| Totally agree. The number of job queues which use or can use
| Redis as the backing store is legion. Celery, rq, arq, Golang
| has asynq (inspired by sidekiq iirc), and that's off the top of
| my head. IMHO, it's just a better interface for implementing a
| job queue than a relational database.
|
| It's also probably one of the easiest services to deploy and
| manage; often a one-liner.
|
| Plus like you said, swiss army knife. It has so many uses. It's
| inevitable my stack will include a redis at some point, and my
| reaction is almost always "I should have just started with
| redis in the first place."
|
| Is redis prone to golden hammer syndrome? Of course. But as
| long as you aren't too ridiculous, I've found you can stretch
| it pretty far.
| adventured wrote:
| Redis and I have a golden hammer agreement. I keep finding
| new ways to use it and it just keeps working.
| derethanhausen wrote:
| The elixir world has Oban[0] which implements quite a lot of
| advanced job features on top of PG. Admittedly it doesn't quite
| have the usage of Celery and Sidekiq but most queueing
| libraries don't.
|
| [0] https://github.com/sorentwo/oban
| trulyrandom wrote:
| In the case of Python, Celery does support SQLAlchemy as a
| broker if I remember correctly. So in theory, you _could_ still
| use PostgreSQL and also have a solid queue library.
| nickjj wrote:
| It is supported but it's classified as experimental and not
| officially maintained by the core devs based on their
| documentation.
| RedShift1 wrote:
| But in Postgres you could write functions in your schema to
| handle job queueing/dequeueing with the additional benefit of
| being able to use it in any language that can connect to
| Postgres and being able to reuse the same SQL/interface across
| all languages.
| nickjj wrote:
| I'm totally with you in this regard and I'd like to see that
| too but the reality of the situation is a job queue is more
| than slapping together a few SQL queries.
|
| A good job queue will have all or most of these features:
|
| - Prioritize jobs (queues, weights for certain jobs, etc.)
| - Scheduled jobs (running them once but X time in the future)
| - Periodic jobs (running them every 2nd Tuesday at 3:33am)
| - Static and dynamic configuration (CRUD'ing jobs at runtime,
|   like adding new scheduled tasks)
| - Retry jobs (customizable strategy, such as exponential back
|   off)
| - Rate limit jobs
| - Expire jobs
| - Cancel jobs
| - Unique jobs
| - Batch executing
| - Handling graceful shutdown (integration with your app server)
| - Get metrics (status, health, progress, etc.)
| - Browse job history
| - Web UI (nice to have)
|
| And I'm sure I'm missing things too. These are only off the
| top of my head based on features I tend to use in most
| applications. In other words, this isn't a laundry list of
| "nice to haves in theory", most of these are core or
| essential features IMO. I use them in nearly every web app.
|
| Rolling all of these things on your own would be a massive
| undertaking. Tools like Celery and Sidekiq have been actively
| developed for ~10 years now and have likely processed
| hundreds of billions of jobs through them to iron out the
| kinks.
|
| Even if you managed to do all of that and created it as a
| Postgres extension (which I think is doable on paper), that's
| only half the story. Now you'd have to write language
| specific clients to interface with that so you can create
| jobs in your application using a nice API. This would be a
| very welcome project but I think we're talking a year+ of
| full time development time to release something useful that
| supports a few languages, assuming you're already an expert
| at writing pg extensions, have extensive knowledge of
| job queues and know a few popular programming languages to
| release the initial clients.
| c-cube wrote:
| You can extend redis with lua to perform atomic operations,
| too. And it has bindings in a lot of languages by virtue of
| its protocol being so simple.
| rantwasp wrote:
| please don't. source control/versioning/deployment become a
| nightmare
| taffer wrote:
| I never understood this point:
|
| - You have a separate schema for your procs
|
| - You define your procs in files
|
| - You write tests for your procs
|
| - You put your files into git
|
| - Then, in a transaction, you drop your old schema and
| deploy the new one
| fiznool wrote:
| We've seen success in our current node.js application using
| pgboss [1] for job queueing. Since the primary database is
| Postgres, it's nice to not introduce another dependency into the
| system, and take advantage of the infrastructure that we already
| have. The library supports most of what you'd expect from a job
| queue, and is crucially well maintained!
|
| That being said, I agree with other comments that this is
| somewhat swimming against the tide. Redis is much more commonly
| used, and so if you don't mind adding another service into the
| mix, I'd probably recommend going with Redis instead.
|
| [1] https://github.com/timgit/pg-boss
| queicherius wrote:
| Out of curiosity, how many jobs are you queuing on average and
| how did it perform for you at that level?
|
| I have been thinking about using the library, mainly because
| it's less operational hassle than running another service that
| has to be sticky to one host with persistent storage and
| backups.
| skunkworker wrote:
| Hopefully one day projects like que[1][2] will become stable and
| battle-tested enough to use in a production environment. Until
| then I'll be using something like Sidekiq (if you're going for a
| client-side job queue, e.g. the clients don't really know about
| each other and only have rate limiting, not true throttling).
|
| With Postgres you also need to worry about high churn, especially
| since you are creating/locking/deleting rows constantly. This can
| be alleviated through a variety of means, of which personally I
| would use per-day table partitioning and truncate older
| partitions on a cron, not to mention the sharp increase in
| database connections to the host now required.
|
| Ignoring the literal elephant in the room of synced writes to the
| store. Redis can be used quite effectively in a blocking manner
| with RPOPLPUSH/LMOVE(6.2+) for a reliable queue, allowing an item
| to not be lost because atomically the pop and push from two
| different lists are done together.
|
| [1] https://github.com/que-rb/que
| [2] https://gist.github.com/chanks/7585810
| dutchbrit wrote:
| Why use Redis when you have more suitable solutions like RabbitMQ
| or Kafka?
|
| Obviously depends on the scale and needs of a project, Postgres
| etc is fine for simple queues.
|
| I often see people waste unnecessary time by writing their own
| complex solutions, resulting in increasing technical debt, when
| you already have perfectly suitable open source options available
| that do a better job..
| brentjanderson wrote:
| I have had two separate benefits from using just Postgres in
| apps:
|
| 1. It's one tool to learn. No need to upskill on topics,
| partitions, consumer groups, routers, brokers, etc.
|
| 2. It's one transaction boundary. You get a lot of "magic" for
| free when you know that a rollback will not only revert
| application state but also NOTIFY events published and queue
| jobs in flight. This alone makes Postgres a powerful option.
| jugg1es wrote:
| Why would you use a DB or Redis for job queuing when there are
| extremely inexpensive and highly optimized queuing systems in
| every major cloud provider?
|
| I've had so many horrible experiences with DB-based queuing over
| my career once you get to a certain scale. Just use a message
| bus, bro!
| mixmastamyk wrote:
| Some folks are not using the cloud for everything. Portability
| may be an issue too.
| deckard1 wrote:
| I imagine most people using Redis as a queue were already using
| it as a cache and just needed some limited queuing ability. Much
| like how places end up using a DB as a queue.
|
| Using a DB as a queue has been a thing for a very long time.
| Every billing system I've seen is a form of a queue: at a certain
| point in the month a process kicks off that scans the DB and
| bills customers, marking their record as "current".
|
| The challenge is always going to be: what if the worker dies.
| What if the worker dies, the job is re-ran, and the customer is
| billed twice. Thank god it's been many years since I've had to
| touch cron batch jobs or queue workers. The thought of leaving
| the office knowing some batch job is going to run at 3am and the
| next morning might be total chaos... shudder.
| cerved wrote:
| How would double billing occur if the worker dies. The way I
| would design this, the billing request and bill would be
| committed atomically such that a request can only be completed
| with exactly one associated bill. If the worker dies, no bill
| is created.
|
| Also I'd detect a worker has died by recording the start-time
| and using a timeout. Furthermore I'd requeue requests as
| distinct new entities. A requeued entity would have a self-
| referencing nullable FK to reference its parent request.
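| The atomicity described above is just transaction scoping; a
| sketch with hypothetical `requests` and `bills` tables, where
| `:id` and `:amount` are bind parameters supplied by the worker:

```sql
-- The bill and the status change commit together. If the worker
-- dies mid-way, the transaction rolls back and neither exists:
-- no double billing, no orphaned bill.
BEGIN;

INSERT INTO bills (request_id, amount)
VALUES (:id, :amount);

UPDATE requests
SET status = 'completed'
WHERE id = :id AND status = 'pending';

COMMIT;
```

| A UNIQUE constraint on bills.request_id additionally makes a
| retried completion fail loudly instead of billing twice.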
| deckard1 wrote:
| Murphy's law says that you're going to screw this up any
| number of ways. Maybe not you, specifically, but perhaps your
| coworker.
|
| > committed atomically
|
| Complex billing systems don't work that way. Worker processes
| are not always in these neat boxes of "done" or "not done".
| Much like rollbacks are a developer myth. If a process was
| that trivial then you wouldn't need a queue and workers in
| the first place!
|
| > Also I'd detect a worker has died by recording the start-
| time and using a timeout.
|
| There are many ways to solve this and many ways to get it
| wrong. Not working in UTC? Oops, better hope nothing runs
| during daylight savings changeover. Parent process died but
| worker finished job? Let's hope the parent isn't responsible
| for updating the job completion status. Large job is
| borderline on the timeout? Better hope parent process doesn't
| restart the job while the worker is still working on it.
| Network partition? Uh oh. CAP theorem says you're out of luck
| there (and typically there is at least one network hop
| between the DB server/controlling process and the system
| running the workers).
|
| Probably the more straightforward solution is to give each
| worker an ID and let them update the database with the job
| they pick up. Then, something robust like systemd, would
| monitor and restart workers if they fail. When a worker
| starts, they find any jobs where table.worker_id = myId and
| then start back on those. But you still have network
| partitions to worry about. Again, not at all trivial.
| cerved wrote:
| I can't speak to billing since that's not my area but in
| general I disagree. A process doesn't need a non-trivial
| flow to need a queue and workers. I've done this with route
| optimization requests. They have a priority and need to be
| run in discrete processes. Once they are done, they are
| done.
|
| The database records the start time, in the databases
| system time, in UTC.
|
| The parent process can't update the job status, it can only
| cancel jobs.
|
| There's a pool of workers. Each worker may fetch a pending
| request or wait. Once it has fetched a request, which sets
| it as pending, no other worker is able to fetch this
| request. Only this worker may alter the status of its
| request, it may only set it to completed, and it may
| only do so with its results.
|
| The worker periodically writes its status to a separate
| table. If such status has not been written in a timely
| manner, the worker is assumed to be stalled and a new child
| request is created, available to other workers.
|
| Putting this logic into the database, instead of a parent
| process managing a shared concurrent queue, has been by far
| the more pleasant solution. Granted, we're not talking
| large numbers of concurrent requests. Ten as opposed to ten
| thousand. If we're talking hundreds of concurrent, short
| requests, I can see this starting to become problematic.
|
| I think there's huge difference in the needs of different
| implementations of queues and workers so I'm interested to
| hear more about workloads where this approach fails and
| specifically in which regards. In my experience, it's been
| pleasant.
| jffry wrote:
| For many billing activities, "days" is a small amount of time
| delay. One very viable option to "the worker died" is to just
| look for stuck jobs once the run is complete and kick out an
| alert to let a human decide what to do.
| larodi wrote:
| Yes, do you really need PDF converters when you can have them as
| PostgreSQL extensions?
|
| The point (ok, one point of many) of Redis is that it is not the
| main DB, so you get a sense of security and decoupling in the
| architecture. Besides, there is no silver bullet for all things.
| While you can have your app do everything with PostgreSQL (and
| much more with Oracle, something people dislike about it), that
| fact by itself does not mean it is a good design decision or a
| more stable one.
|
| Because when you have Redis for sessions, Kafka for event
| streams, Postgres (or another RDBMS) for data storage, you have
| components that can fail separately and thus the system degrades
| gracefully.
| xupybd wrote:
| Yes but have you built a tool for the job or a Rube Goldberg
| machine?
|
| Complexity comes at a huge cost. Only add it when the benefits
| outweigh the costs.
|
| You could start out building a product that could scale to
| millions of users overnight. But if you do that, you've spent
| months building something with no users. You could have been in
| the market already, building revenue already. Your requirements
| will change as you find your market fit and you'll need to
| change things. The less you have built, the easier it is to
| change. Why not leave the million-user capabilities until you
| actually need them?
| berkes wrote:
| > Yes but have you built a tool for the job or a Rube
| Goldberg machine?
|
| > Complexity comes at a huge cost. Only add it when the
| benefits outweigh the costs.
|
| I'm honestly unsure if you mean this as opposing "doing
| everything in Postgres" or as opposing "throw more services
| on the stack".
|
| Because both readings are true. You have that
| complexity, regardless of where you implement it. Either you
| are building the rube-goldberg machine inside of postgres out
| of modules, config and SQL or outside of postgres with
| additional services.
|
| The only way to really solve this is to avoid building that
| RG machine in the first place. In this case: don't have
| queues. In practice that probably means introducing
| complexity elsewhere, though.
| xupybd wrote:
| Most web apps I've worked on have had queues in the
| database. The operational simplicity of only having a
| database has far outweighed the code complexity of using
| the relational database as a queue. However the performance
| would not have scaled. Luckily the performance was well
| above peak load of the actual systems.
| chmod775 wrote:
| Funny. My approach is usually the other way around: Can I get
| away with _just_ Redis?
| siscia wrote:
| It was the reason why https://zeesql.com, formerly known as
| https://RediSQL.com, was written.
|
| My clients seem rather happy with the project.
| grenoire wrote:
| My arsenal is Redis to SQLite to pg.
| Jemaclus wrote:
| I sorta do this, but my approach is more Redis-first than
| _just_ Redis. I try to see if I can use Redis for 99.999% of my
| operations and have a more durable store (like Postgres or
| something) as a "backup". The nature of Redis is such that even
| with some persistence features, we kinda have to assume that
| the data could go away at any minute, so I always build some
| way to rebuild Redis as fast as possible.
|
| But I've run billions of queries per day against a single Redis
| instance with zero failures serving up traffic to large,
| enterprise-level customers with no downtime and no major issues
| (knock on wood). The only minor issues we've run into were some
| high-concurrency writes that caused save events and primaries
| to fail over to replicas, resulting in a few minutes of
| downtime at the beginning of our experiments with Redis-first
| approaches, but that was easily mitigated once we realized what
| was happening and we haven't had a problem since.
|
| Huge fan of a Redis-first approach, and while the haters have
| _some_ valid points, I think they're overstated and are missing
| out on a cool way to solve problems with this piece of tech.
| [deleted]
| wokwokwok wrote:
| Why would you choose to use a system that doesn't scale by
| default?
|
| Single user local applications? Fair.
|
| Web applications? Very strange choice imo.
|
| Redis is great, but it is *not* a database, and it's
| thoroughly rubbish at high-load concurrency without clustering,
| which is (still) a massive pain in the ass to set up manually.
|
| Of course, you can just use a hosted version off a cloud
| provider... but, it's generally about 10x more expensive than
| just a plain old database.
|
| /shrug
|
| I mean, sure, it's (arguably...) a step up from just using
| sqlite, but... really, it's _easy_, and that's good... but it
| isn't good enough as a general replacement for having a real
| database.
|
| (To be fair, sqlite has got some pretty sophisticated
| functionality too, even some support for concurrency; it's
| probably a step _up_ from redis in many circumstances).
| Hendrikto wrote:
| > sure, it's (arguably...) a step up from just using sqlite
|
| How so? What's wrong with SQLite?
| wokwokwok wrote:
| I suppose it's a bit more suitable to networked services
| than sqlite is, since it's natively a network service, and
| sqlite is natively a local-only solution.
|
| ...but, I started writing about clustering and the network
| API, but, I can't really articulate why those are actually
| superior in any meaningful way to simply using sqlite, and
| given the irritation I've had in maintaining them in
| production in the past...
|
| I guess you're probably right. If I had to pick, I'd
| probably use sqlite.
| tehbeard wrote:
| > Why would you choose to use a system that doesn't scale by
| default?
|
| By all accounts Postgres seems to be a pain to scale beyond
| a single machine, much more so than redis.
| cbsmith wrote:
| Which PostgreSQL scaling pain point would you be referring
| to? Citus?
| hardwaresofton wrote:
| Postgres is not as automatic as other tools, but that is
| mostly an artifact of it being around so long, with the
| focus being on other things. Few projects have been around
| and stayed as relevant as Postgres.
|
| Most of the time, you really _don't_ need to scale
| postgres more than vertically (outside of the usual read
| replicas), and if you have tons of reads (that aren't
| hitting cache, I guess), then you can scale reads
| relatively easily. The problem is that the guarantees that
| postgres gives you around your data are research-level hard
| -- you either run a quorum protocol or you do 2PC.
|
| Once you start looking into solutions that scale easily, if
| they don't ding you on performance, things get murky
| _really_ quick and all of a sudden you hear a lot of
| "read-your-writes" or "eventual consistency" -- they're
| weakening the problem so it can be solved easily.
|
| All that said -- Citus and PostgresXL do exist. They're not
| perfect by any means, but you also have solutions that
| scale at the table-level like TimescaleDB and others. You
| can literally use Postgres for something it was never
| designed for and still be in a manageable situation -- try
| that with other tools.
|
| All _that_ said, KeyDB[0] looks pretty awesome.
| Multithreaded, easy clustering, and flash-as-memory in a
| pinch; I'm way more excited to roll that out than I am
| Redis these days.
|
| [0]: https://github.com/EQ-Alpha/KeyDB
| victor106 wrote:
| KeyDB is really good. We use it in production to achieve
| millisecond response times on millions of requests per
| second.
| killingtime74 wrote:
| Are you Google search? How do you have millions of
| requests per second?
| manigandham wrote:
| Lots of industries and applications can get to that
| scale. My last few companies were in adtech where that is
| common.
| qeternity wrote:
| It's likely millions of internal requests, which as
| another comment mentions, is common in a number of
| industries.
| hardwaresofton wrote:
| It really looks absolutely amazing, I feel guilty because
| I want to run a service on it, there's almost no downside
| to running it everywhere you'd normally run Redis.
|
| Also in the cool-redis-stuff category:
|
| https://github.com/twitter/pelikan
|
| Doesn't have the feature set that KeyDB has but both of
| these pieces of software feel like they could be the basis
| of a cloud redis product that would be _really_ efficient
| and fast. I've got some plans to do just that.
| arpa wrote:
| redis is not a database. It's a key-value based cache. If
| you're using it as a database, you're gonna have a bad time.
| cube2222 wrote:
| Why so? It has persistence and I'm not aware of any
| reported data loss happening with it.
|
| It's also got loads of complex and useful instructions.
| arpa wrote:
| Data loss can occur between flushes to disk, for example
| (by default every 2 seconds / every I_FORGOT megabytes).
| Perhaps (most likely) it is possible to fine-tune the
| configuration to have redis as a very reliable data
| store, but it doesn't come with such settings by default,
| unlike most RDBMSes.
| miohtama wrote:
| Not all use cases require reliable data storage and it is
| ok to lose a few seconds of data. Think simple discussion
| forums, internal chat applications. There are some
| scenarios where ease of use and single-server scalability
| pay off in faster development and lower devops cost.
| charrondev wrote:
| I work on a SaaS community forums service and I can
| assure you data loss is not acceptable to our clients.
|
| As a result we use MySQL w/ memcached, although we are
| considering a swap to redis for the caching layer.
| [deleted]
| EvilEy3 wrote:
| > Not all use cases require reliable data storage and it is
| ok to lose a few seconds of data. Think simple discussion
| forums, internal chat applications.
|
| That is definitely not ok. I'd be really pissed as a user
| if I wrote a huge comment and it suddenly disappeared.
| miohtama wrote:
| It only disappears if there is a catastrophic failure.
| The likelihood of that happening while you write a huge
| comment is lower than hitting a jackpot in Las Vegas - a
| sensible risk tradeoff for better development experience
| and cost.
| TheCoelacanth wrote:
| How is that a sensible tradeoff compared to just using
| something that was actually designed to be a database
| when you need a database?
| obstacle1 wrote:
| > a sensible risk tradeoff
|
| Note the tradeoff doesn't make sense as soon as you're
| operating at a meaningful scale. A small likelihood of
| failure at small scale translates to "I expect a failure
| a million years from now", whereas at large scale it's
| more like "a month from now". Accepting the same percent
| risk of data loss in the former case might be OK, but in
| the latter case is irresponsible. Provided whatever
| you're storing is not transient data.
| arpa wrote:
| GP was asking why redis is not a reliable storage
| solution/database. Redis is great as an unreliable (not
| source-of-truth) storage.
| kwdc wrote:
| For that temporary use case, how does it compare to
| memcached?
| jasonwatkinspdx wrote:
| Mostly boils down to Redis having a richer API, and
| memcached being faster / more efficient. The new extstore
| feature allows you to leverage fast SSDs to cache
| stupid huge datasets. Memcached is also one of the most
| battle tested things out there in open source. I've used
| them both plenty over the years, but tend to lean towards
| memcached now unless I really need some Redis API
| feature.
| jasonwatkinspdx wrote:
| Redis is inherently lossy as a matter of basic design,
| and that's not even touching on the many other issues
| born of NIH solutions rampant within it. You may not hit
| the behavior until you push real loads through it. If you
| talk to anyone who has, I'm confident they'll agree with
| the criticism that while it may be an excellent cache, it
| should never be treated as a ground-truth database. It's
| excellent as a slower memcached with richer features. It's
| not a database. You can also read Aphyr's reports over
| the years, which to be utterly frank, bent over backwards
| to be charitable.
| zigzag312 wrote:
| I would say Redis with RediSearch is a database.
| dionian wrote:
| well... how much uptime do you need and how many resources do
| you have to devote towards achieving your desired uptime?
| himinlomax wrote:
| If your data has no value whatsoever, sure.
| cheald wrote:
| Redis can be made durable. The WAIT command allows you to
| guarantee writes to a quorum of nodes, and it can be
| configured for on-disk persistence rather easily.
|
| That said, due to its single-threaded nature, blocking on
| quorum writes is likely to bottleneck your application under
| any kind of significant load. It really shines at volatile
| data, and while it can work for valuable data, there are
| better tools for the job.
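|
| As a sketch, the persistence and quorum settings described
| above look roughly like this (values are illustrative, not
| recommendations):

```
# redis.conf: enable the append-only file, fsync once per second
appendonly yes
appendfsync everysec

# From a client, after a write, block until at least 2 replicas
# have acknowledged it, or 1000 ms elapse:
#   WAIT 2 1000
```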
| himinlomax wrote:
| > Redis can be made durable
|
| Postgres, SQLite and many others are durable by default.
| Almost all so-called databases are like that. When you need
| a database, 90% of the time, you want durable. People make
| mistakes, developers are people, developers make mistakes,
| and one such mistake is assuming that Redis is like other
| databases in being durable by default when it's not. It's
| not conjecture, I've seen it done _in production_.
| YetAnotherNick wrote:
| Why does "by default" matter so much to you? Redis has
| persistence support and it can be easily turned on.
| himinlomax wrote:
| vi has persistence support
| gizdan wrote:
| Why? Redis has persistent data stores, and doesn't
| necessarily need to be memory only.
| himinlomax wrote:
| When you commit to Postgres and the server acknowledges it,
| you know for sure that it's been written to disk and that
| it will survive anything but a hardware disk loss (or,
| obviously, system/FS bug). When clustering is enabled with
| synchronous writes, you can also be confident that the data
| has been recorded to another node as well.
|
| With redis clustering, there's no guarantee the data has
| been replicated. I'm not even sure there's any guarantee
| that the data you just asked to be recorded will be stored
| even once if a power outage happens immediately after the
| request.
| mianos wrote:
| If you want to use transactions for multi table updates you are
| probably best to use a proper rdbms. Not to mention read
| consistency. If you only have one table Redis may be fine. I
| usually find my work grows beyond one table. Redis does make an
| unbeatable cache.
| hardwaresofton wrote:
| Hey I assume this is like a joke and not too serious, and we'd
| all switch off when things got a bit hairy, but I sure hope
| other readers can tell.
|
| I am literally in the middle of digging a company out of this
| mistake (keeping Redis too long) right now. If your
| software/data is worth something, take a week or a month and
| figure out a reasonable schema, use an auto-generation tool,
| ORM, or hire a DBA for a little bit to do something for you.
| Even MongoDB is better than redis if you're gonna do something
| like this.
| tluyben2 wrote:
| > take a week or a month and figure out a reasonable schema,
| use an auto-generation tool, ORM, or hire a DBA for a little
| bit to do something for you.
|
| Sorry but am I the only one who is very worried about the
| state of software? There are people who drank so much of the
| schemaless kool-aid (schemas were not an actual issue for any
| dev worth her salt to begin with) that you have to dispense
| this kind of advice? I find it borderline criminal if someone
| who carries the title of programmer did that to you.
|
| Again, maybe that is just me.
|
| Edit: not an attack on the parent: good advice. Just didn't
| know it was that bad. And sad.
| chrisdinn wrote:
| If you store protos in your Redis keys (like most people
| using "NoSQL" for data storage), this comment doesn't have
| much punch. Pretty sure we all can think of some pretty high
| profile examples of NoSQL + structured data working very very
| well at scale.
| hardwaresofton wrote:
| I'm not trying to get on people who are using redis as a
| cache (for photos, or any other ephemeral data).
|
| The idea I was trying to get at was using redis to store
| data traditionally reserved for OLTP workloads.
|
| > Pretty sure we all can think of some pretty high profile
| examples of NoSQL + structured data working very very well
| at scale.
|
| Well that's the thing, you very rarely hear of companies
| who cursed their decision early on to use NoSQL when they
| realized that their data was structured but in 20 different
| ways over the lifetime of the product. Some datasets only
| need light structure (key/value, a loosely defined
| document, schema-included documents), and other things
| should probably have a schema and be stored in a database
| with a tight grip on that schema and data
| consistency/correctness. Please don't use redis in that
| latter case.
| chrisdinn wrote:
| I mean, Google was built on protos in a "NoSQL" database
| (BigTable). I think maybe you are overindexing on
| personal experience.
| hardwaresofton wrote:
| Sure, but:
|
| 1) 99.9% of internet-facing/adjacent businesses are not
| Google and will never reach even 1% of Google's scale
|
| 2) Proto + BigTable is very different from just throwing
| stuff in redis/mongo. Proto schemas are compile-time
| enforced, which is great for some teams and might slow
| others down. Google enforces more discipline than your
| average engineering team -- this is overkill for most
| engineering teams.
| dionian wrote:
| Operations aside, the big problem in my experience
| dealing with these systems is you are extremely limited
| (on purpose) and can't do much
| sorting/filtering/aggregation/querying. That's what
| really makes true DBs powerful. I love redis for what it
| does, I just don't think it replaces a DB well in many
| cases where it's non-transient data.
| gsvclass wrote:
| I just did a thread on the topic myself
| https://twitter.com/dosco/status/1402909104012623873
| truth_seeker wrote:
| While you are at it, don't forget to use UNLOGGED tables.
| UNLOGGED == In Memory.
|
| But if you must use disk based table for Job queueing, set
| fillfactor = 50. This takes care of heavy updates.
|
| Indexes are helpful but cost memory and CPU, so always make
| sure you partition the table based on job_type for a
| performant pending-jobs query.
|
| I wouldn't recommend using LISTEN/NOTIFY unless you are okay with
| "at most once" semantics. I have used a disk-table-based approach
| for PUB/SUB to replace Kafka. A more fine-tuned approach will
| also allow (job_type, consumer_group, publisher) as a partition
| key.
|
| Ref - https://www.postgresql.org/docs/current/sql-
| createtable.html
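|
| As a concrete sketch of the above (table and column names are
| illustrative, not from any particular system):

```sql
-- Disk-based job-queue table; fillfactor = 50 leaves half of each
-- heap page free so frequent status updates can stay on-page (HOT).
CREATE TABLE job_queue (
    id         bigserial PRIMARY KEY,
    job_type   text        NOT NULL,
    payload    jsonb,
    status     text        NOT NULL DEFAULT 'pending',
    created_at timestamptz NOT NULL DEFAULT now()
) WITH (fillfactor = 50);

-- Or, if losing queued jobs after a crash is acceptable, skip the
-- WAL entirely (unlogged, which is not the same as in-memory):
-- CREATE UNLOGGED TABLE job_queue (...) WITH (fillfactor = 50);
```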
| osigurdson wrote:
| My understanding is, UNLOGGED means that changes are not
| written to the WAL and data can be lost in the event of an
| unscheduled shutdown. It doesn't mean that the table only
| exists in memory however - the data is still eventually
| persisted to disk.
| truth_seeker wrote:
| From the official doc link i shared already:
|
| UNLOGGED
|
| If specified, the table is created as an unlogged table. Data
| written to unlogged tables is not written to the write-ahead
| log (see Chapter 29), which makes them considerably faster
| than ordinary tables. However, they are not crash-safe: an
| unlogged table is automatically truncated after a crash or
| unclean shutdown. The contents of an unlogged table are also
| not replicated to standby servers. Any indexes created on an
| unlogged table are automatically unlogged as well.
| andrewflnr wrote:
| Uh, yeah, that confirms what osigurdson said, not that
| they're in-memory. For that IIRC you need to mount a
| ramdisk in your OS and put the table on that. Definitely
| also make it UNLOGGED, though.
| BenoitEssiambre wrote:
| I came here to post this. CREATE UNLOGGED TABLE is basically a
| redis in your postgres (with periodic dumps to disk), but with
| transactional joins into more persistent data.
| kureikain wrote:
| Thanks so much for this. I have been looking for this in
| Postgres. MySQL has memory-backed tables, which helped us a lot
| in the past.
| tpetry wrote:
| Unlogged tables ARE NOT in-memory tables. They are written to
| disk like every other table, but unlogged tables don't use the
| WAL and are therefore much lighter.
| [deleted]
| gsvclass wrote:
| And others like this thread on building a MongoDB like JSON
| database in Postgres in one line.
| https://twitter.com/dosco/status/1401413712842346501
| petepete wrote:
| I've used PostgreSQL in the first two scenarios and would love an
| opportunity to in the third. It's worked really well and hasn't
| caused any problems under decent loads.
|
| The one feature of Redis I'd love to have supported in PostgreSQL
| is to be able to set a TTL on a record. On a recent project where
| we could only save personal data for so long, it was a must-have
| feature, so we had to use Redis for that purpose instead.
| halayli wrote:
| you can do that using postgres partitioning and a trigger to
| drop old partitions from the table.
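|
| A rough sketch of that approach (all names illustrative):

```sql
-- Range-partition the data by creation time...
CREATE TABLE user_data (
    email      text NOT NULL,
    payload    jsonb,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE user_data_2021_06_12 PARTITION OF user_data
    FOR VALUES FROM ('2021-06-12') TO ('2021-06-13');

-- ...then a scheduled job expires a whole day at once; dropping a
-- partition is instant, unlike a bulk DELETE:
DROP TABLE user_data_2021_06_12;
```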
| phiresky wrote:
| Redis EXPIRE doesn't actually delete any data after it expires
| though. Active deletion happens at random, so you can easily
| still have expired values in memory months later:
|
| > Redis keys are expired in two ways: a passive way, and an
| active way.
|
| > A key is passively expired simply when some client tries to
| access it, and the key is found to be timed out.
|
| > Of course this is not enough as there are expired keys that
| will never be accessed again. These keys should be expired
| anyway, so periodically Redis tests a few keys at random among
| keys with an expire set. All the keys that are already expired
| are deleted from the keyspace.
|
| > Specifically this is what Redis does 10 times per second:
|
| 1. Test 20 random keys from the set of keys with an
| associated expire.
| 2. Delete all the keys found expired.
| 3. If more than 25% of keys were expired, start again from
| step 1.
|
| So really it's not much better than doing `SELECT value from
| keys where key=? and expires > now()` with manual deletion.
| Though I agree that it can be more convenient.
| mateuszf wrote:
| > Redis EXPIRE doesn't actually delete any data after it
| expires though.
|
| I guess OP likes the simplicity that built-in expiration
| provides. In your example - all selects reading the value
| will need to have this expiration check. And also some
| scheduled process will have to be written to actually delete
| the values.
| jaggederest wrote:
| I would access the table through a view that had that query
| built into it.
|
| create table all_items(id integer, value text, expires
| timestamp);
|
| create index all_item_expiry on all_items(expires);
|
| create view items as (select id, value, expires from
| all_items where expires > now());
|
| Then you can treat items as your base table and postgres
| neatly allows INSERT/UPDATE/DELETE from it. You'll need a
| job to clean up expires < now() items but it can be done at
| whatever arbitrary interval you like, could even be a
| trigger in PG if you were feeling spicy.
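|
| The cleanup job could be sketched like this (assuming the
| pg_cron extension is available; the job name and interval are
| made up):

```sql
-- Delete expired rows every five minutes via pg_cron:
SELECT cron.schedule(
    'expire-items',    -- job name
    '*/5 * * * *',     -- every 5 minutes
    $$DELETE FROM all_items WHERE expires <= now()$$
);
```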
| yukinon wrote:
| I would contend that it really depends on what one would
| prioritize the most in that scenario. In my experience, Redis
| EXPIRE means it is not selectable. That is the primary
| requirement for a lot of development around EXPIRE/TTL. It is
| OK if it is still in memory in some form, it still won't be
| accessible by applications SDK or CLI. Since Redis 2.6 the
| expire error is from 0 to 1 milliseconds which is accurate
| enough for many use cases. Not to mention, Redis will handle
| that deletion for you. You don't need to run a deletion job
| and/or include an additional condition on a query.
|
| Additionally, the expire/ttl/get/set in Redis is incredibly
| easy to use (and abuse, hence the OP article). Some teams'
| criterion is limiting the number of moving parts - and that's
| great: don't use Redis, use a relational database for
| everything as you mentioned - as a queue, a cache, a message
| broker, etc.
|
| Other teams may care less about an extra moving part if it
| means their code will look simpler, and they leverage
| relational databases for their more common use cases.
| nicoburns wrote:
| You could very easily create a database view that applies
| the where query, and even prevent your db user from
| selecting from the underlying table.
|
| You could also use a library like pg-boss to handle the
| cleanup task.
| petepete wrote:
| The fewer moving parts bit is key.
|
| It was a government project, written by one team (us) to be
| maintained by another.
|
| The data that needed to be expunged was user signup data,
| upon completion the record was sent to a CRM and the Redis
| record destroyed. If the signup wasn't finished it's
| automatically removed after 12 hours.
|
| Referential integrity wasn't really a problem, emails are
| unique and if we clash the two records are auto-merged by
| the CRM.
|
| Setting up scheduled tasks, triggers, partitioning, cron,
| etc, is just more things that can go wrong. If they go
| wrong _and_ go unnoticed we end up with piles of data we
| shouldn't have. That would be many different kinds of bad.
|
| Doing `redis.set(k, v, ex: 12.hours)` or whatever is just
| easier.
| arpa wrote:
| hmmm, I disagree that it's not better. A SELECT most likely
| implies an index scan at O(log n), while a GET operation is
| essentially O(1). And you also have to run a
| DELETE in SQL to remove the expired keys.
|
| Oh, and I'm not entirely sure about the part about redis
| active expiry (disabled by default; the default is to remove
| expired keys lazily, on lookup); you're talking about key
| eviction, which applies to all deleted keys and AFAIR happens
| only when certain watermarks are hit. Since it happens in RAM,
| it's also faaaast, unlike SQL DELETE, which will definitely
| involve disk...
| kureikain wrote:
| Exactly this. Even MongoDB has a TTL index, which is
| amazingly nice.
| sologoub wrote:
| Wouldn't a simple scheduled batch job be enough to go through
| periodically and drop records older than N days?
|
| This would also give you audit logs, etc. As well as
| flexibility to adjust business logic without updating the TTL
| on all records.
| petepete wrote:
| The requirement for the project was that the deletion should
| be handled in as fail-safe a manner as possible. Relying on
| external jobs, cron, other scheduled tasks etc were deemed a
| risk.
|
| We were already using Redis for other things, it was the
| logical choice for this scenario.
| tbarbugli wrote:
| Deleting large amounts of data on Postgresql is expensive.
| First you need an index on a column to select the expired
| data, then you actually need to delete the rows, which
| creates a lot of garbage and is heavy and slow.
|
| Creating and deleting a lot of data on PG is a pain because
| of MVCC and vacuum. One useful trick is to partition data
| into tables and to truncate/drop entire tables;
| drop/truncate is instant and reclaims space immediately.
| squiggleblaz wrote:
| > Creating and deleting a lot of data on PG is a pain
| because of MVCC and vacuum. One useful trick is to
| partition data into tables and to truncate/drop entire
| tables, drop/truncate is instant and reclaims space
| immediately.
|
| In this case, the requirement is that user data must only
| be kept for a certain time and not longer.
|
| If that time is a property of the record's creation and
| we're allowed to keep the data with an error of a day, I
| guess it's easy: we partition the table on a daily basis and
| delete any partition that is older than 28 days.
| Sometimes, a record will be closer to 29 days old when it's
| deleted, but we accepted that, and it's easy enough to
| write queries so that it's unavailable to the application
| if it's more than exactly 28*24*60*60 seconds old if that's
| our constraint.
|
| If the requirement is to keep it based on the last of a
| certain kind of use, we'd need to move it from one
| partition to another if we need to keep it. For instance,
| if we can keep data for 28 days after the user last logs
| in, we can't just drop the partition - unless we've moved
| the user each day they log in.
|
| If we have that kind of a constraint, where data lifespan
| is based on properties that change over the lifetime of the
| data, is partitioning + drop actually a useful approach?
| The drop will still be instant, I guess, but it's the
| movement of data over its lifetime that concerns me here.
| sologoub wrote:
| You'd probably already have indices by user then. There are
| also other ways to store the data, for example not as a
| time series of user actions, but as another data
| structure. It just depends on your requirements and data
| model.
|
| The gist of the original article is asking whether you
| could reduce tech stack complexity and use a single set
| of technologies for more use cases, allowing to
| understand the tool you are using better.
|
| Also, note that a traditional database may or may not be
| the right tool for the job - there are different
| storage/behavior needs for writing lots of data,
| processing lots of data and serving the results.
| squiggleblaz wrote:
| > You'd probably already have indices by user then.
|
| Do you mean "you'd probably already have indices by user
| then, so you won't be able to take advantage of quick
| drops"?
|
| > There are also other ways to store the data, for example
| not as a time series of user actions, but as another data
| structure. It just depends on your requirements and data
| model.
|
| I suppose I want to be a bit more specific. Till now, if
| I wanted to deal with this issue, I would just have used
| a cron job and some indexes and taken on the load. But
| what is an example of a nice way of dealing with this?
| You get to make up plausible requirements and data model
| details, perhaps drawing on particular requirements you
| had when you faced a similar issue.
|
| > The gist of the original article is asking whether you
| could reduce tech stack complexity and use a single set
| of technologies for more use cases, allowing to
| understand the tool you are using better.
|
| The legitimacy of a question in a free-form conversation
| doesn't depend on the original inspiration, several
| iterations ago. But even if it did, the question is here
| is exactly about understanding the tools better and
| whether we can use one tool instead of two, so by your
| summary it's perfectly on topic.
|
| > Also, note that a traditional database may or may not
| be the right tool for the job - there are different
| storage/behavior needs for writing lots of data,
| processing lots of data and serving the results.
|
| The subquestion here is precisely if we can get away with
| just using Postgres for data with mandatory lifespan
| requirements that vary over the lifetime of the data.
|
| Extra tools come at a cost - that is the presumption of
| this article and the thread it has spawned.
|
| If we have to use Postgres, we need to assess the cost of
| doing this in Postgres before we can decide whether or
| not to pay the cost of other tools. Waving in the general
| direction of other tools isn't helpful; it probably takes
| as much work to enumerate candidates as it does to
| calculate the cost of doing it in Postgres.
| plank_time wrote:
| You can do the equivalent by adding a timestamp column with
| an index on your table and adding "where now - timestamp <=
| TTL". (Or some computationally easier way that doesn't
| require math in the query.)
|
| TTL on systems like Cassandra is pretty ugly and deleting
| data is hard as you scale no matter how you do it. I don't
| think Postgres would be able to implement a TTL that is
| magically better than the rest.
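|
| In SQL, the check looks roughly like this (table and column
| names are illustrative; precomputing the cutoff once per query
| avoids per-row date math):

```sql
-- Only rows younger than the TTL are visible:
SELECT *
FROM events
WHERE created_at > now() - interval '12 hours';
```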
| nezirus wrote:
| Deleting whole partitions is a generally useful strategy.
| It's like the difference between single inserts and batch
| inserts (often a huge performance difference, and much
| lower IO).
|
| Since you mentioned Cassandra and TTL, I'll mention
| ClickHouse, which has very nice TTL options; splitting into
| smaller partitions and using "ttl_only_drop_parts=1" has
| proven itself in production with big data ingestion rates.
|
| Last, but not the least, I almost always prefer Postgres
| for data storage needs, one can trust it to be safe and
| fast enough. Only some specific situations warrant other
| solutions, but it's a long way until that point (if
| ever), and it's better not to optimize too early.
| cbsmith wrote:
| I was going to say... the standard approach is to simply
| partition the data by time, at which point it gets really
| easy to manage.
| sologoub wrote:
| Sure, there are different design patterns for different
| data sizes. This also adds querying complexity, so just
| depends on what's needed.
|
| Also, most applications have peak and low periods of load
| that are predictable (e.g. users concentrated in a given
| set of time zones), which make for good times to run
| otherwise disruptive functions, etc.
| radicalbyte wrote:
| It would, and you'd also get nice features like proper
| transactions and referential integrity.
|
| BUT that comes at a cost. Doing that in Redis is just so
| easy, and the vast majority of the time no-one is going to
| notice a few concurrency errors, YOLO!
|
| ;-)
|
| To be serious: it's a nice tool to have in the box and is
| amazing for inexperienced/starting developers because you
| can do a lot with very little time investment or knowledge.
| zigzag312 wrote:
| What's wrong with Redis' transactions?
| [deleted]
| mxyzptlk wrote:
| We tried it with just PostgreSQL and struggled with missed jobs.
| We tried it with Redis + PostgreSQL and haven't looked back. I'll
| take the blame for not engineering the first version adequately
| (this was before upsert), but Redis has been useful in so
| many other ways that I'm glad it ended up in our architecture.
| gsvclass wrote:
| And a search engine in Postgres instead of Elasticsearch
| https://twitter.com/dosco/status/1400643969030127620
| chrisallick wrote:
| Interesting... I'm in the opposite camp. Do you really need
| anything other than a key/value store and a data structure?
| I've used Redis exclusively for close to a decade.
| munro wrote:
| I feel like sanity is being restored. Maybe I'm lazy, but yeah I
| use PostgreSQL for everything I can. Zookeeper is great, but I've
| used PostgreSQL for distributed locking & queueing. Sometimes
| it's quicker for me to write a 10-20 line algorithm than install
| a new piece of infrastructure.
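| As a sketch, the locking half can be as small as Postgres's
| built-in advisory locks (the key 42 below is an arbitrary,
| application-chosen bigint):

```sql
-- Session-level: held until unlocked or the connection closes
SELECT pg_advisory_lock(42);     -- blocks until the lock is acquired
SELECT pg_advisory_unlock(42);

-- Transaction-level, non-blocking: returns true/false immediately
-- and is released automatically at commit or rollback
BEGIN;
SELECT pg_try_advisory_xact_lock(42);
COMMIT;
```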
| pilif wrote:
| For pub/sub, I would recommend against using PostgreSQL if you're
| doing it at any kind of scale because LISTEN ties up one
| connection completely and Postgres connections are very expensive
| compared to a Redis connection.
| postgressomethi wrote:
| > LISTEN ties up one connection completely
|
| I've seen this twice in this thread, but I don't know what that
| means. Can you explain a bit?
| sa46 wrote:
| When a client connects to Postgres, Postgres creates a new
| process just for that client. Separate processes are great
| for isolation but it means Postgres connections are a bit
| expensive (this is an active area of improvement). Postgres
| really starts to struggle once you have a few thousand
| connections, even if those connections aren't doing anything.
|
| The common workaround is to use a connection pooler like
| PGBouncer so that clients reuse connections. This approach
| doesn't work for LISTEN because typically a client will
| listen for its entire lifecycle so you can't share
| connections in a pool.
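| For context, the feature itself is tiny; it's only the
| connection it occupies that's costly. A minimal sketch
| (channel name and payload here are made up):

```sql
-- Consumer session: this connection now waits for notifications
LISTEN job_created;

-- Producer session:
NOTIFY job_created, 'job:123';
-- equivalently, with a payload computed at runtime:
SELECT pg_notify('job_created', 'job:' || 123);
```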
| taffer wrote:
| You need one connection per worker thread, and
| realistically you would only have one worker per cpu core.
| So how many LISTENing connections do you really need?
| sa46 wrote:
| You generally have more than one database connection per
| thread. As an example, consider Node.js which acts like a
| single threaded process. You'd probably want to be able
| to handle more than 1 database query concurrently since
| network latency tends to dominate OLTP requests.
|
| How you set up LISTEN and NOTIFY is app-dependent. In a
| multi-tenant database, you could have 1 NOTIFY channel
| per tenant.
|
| As you scale, you probably do something that listens in a
| smarter way, maybe 1 Go channel per client with a single
| LISTEN instead of 1 database connection per client. The
| downside is that now the app code is responsible for
| tenant isolation instead of the database.
| dvfjsdhgfv wrote:
| I was looking for your comment here, thank you. I hope they
| solve this problem somehow in a future release.
| gunnarmorling wrote:
| The other issue with LISTEN/NOTIFY is that a client will miss
| any notifications sent while it is not running (e.g. due to
| crash, update, etc.). An alternative would be logical
| replication and change data capture (CDC), as provided for
| Postgres by Debezium for instance. That way, any consumers
| won't miss events during downtimes, as they'll continue to read
| from the replication slot from the last offset they had
| processed.
|
| Disclaimer: I work on Debezium
| jbverschoor wrote:
| This is by design and very common with pub/sub.
| supermatt wrote:
| Isn't this also the case for Redis pub/sub?
| Nick-Craver wrote:
| For pub/sub yes that's correct. For full info though: Redis
| later added streams (in 5.x) for the don't-want-to-miss
| case: https://redis.io/topics/streams-intro
| philjohn wrote:
| I'd need to check, but I think so.
|
| Which is probably why if you don't want loss if consumers
| go down a proper queue system (RabbitMQ, ActiveMQ, Amazon
| SQS, Kafka if you don't care about ordering between
| partitions) is the way I'd go.
| hardwaresofton wrote:
| Could you explain a little more about this (Debezium is
| awesome btw)? So after creating the proper replication slots,
| you set up change data capture via Debezium, then you listen
| to the CDC stream _from the original DB you are connected
| to_, correct?
|
| This sounds like it might be vulnerable to dead consumers
| causing WAL to pile up without the right settings/periodic
| checks.
| gunnarmorling wrote:
| > then you listen to the CDC stream from the original DB
| you are connected to, correct?
|
| Yes, exactly.
|
| > This sounds like it might be vulnerable to dead consumers
| causing WAL to pile up without the right settings/periodic
| checks.
|
| There are some subtleties around this indeed. You should
| monitor the backlog of replication slots, so as to identify
| inactive consumers, as you say. Also, there are some corner
| cases you need to take care of: when listening to changes
| from a low-traffic database on the same PG host that also
| has a high-traffic database, the replication slot may not
| be acknowledged often enough. Debezium mitigates this by
| optionally writing changes to a dummy heartbeat table in
| the low-traffic database, so as to advance the replication
| slot.
|
| In Postgres 13 there's also a new option
| "max_slot_wal_keep_size" which limits WAL size retained by
| a replication slot. This prevents unbounded WAL growth, at
| the risk of consumers missing events if they are down for
| too long.
|
| All in all, proper monitoring and alerting is key.
| MichaelMoser123 wrote:
| then what's the point of this feature as part of the db, if
| it doesn't do persistence?
| BiteCode_dev wrote:
| Then use streams.
| dvfjsdhgfv wrote:
| Has anyone tested how listen/notify in pg (for in-memory
| databases) compares to pub/sub in Redis?
| posharma wrote:
| What happened to RabbitMQ? Is that not used as a queue any more?
| lwn wrote:
| Other than its absence from this thread, Rabbit still seems
| to be going strong.
| didip wrote:
| The issue with Redis is that its distributed story is not great.
| I wonder if their Raft experiment is finally GA or not.
| https://github.com/RedisLabs/redisraft
| qatanah wrote:
| For simple task queues, yes pg is ok. For high loads, redis is
| still better as PG generates a lot of WAL & connection overhead.
| phendrenad2 wrote:
| WordPress historically didn't need a cache and just used page
| visits and/or a cron job to kick off automated processes, backed
| by MySQL. Is it fast? Yes. Is it nearly as fast as Redis? No. Do
| you need it to be? Not for WordPress lol
| [deleted]
| avinassh wrote:
| Most of the time Redis's distributed lock works fine;
| however, one should know that it's not failproof and you
| might run into really weird bugs
|
| references:
|
| 0 - https://aphyr.com/posts/283-jepsen-redis
|
| 1 - https://martin.kleppmann.com/2016/02/08/how-to-do-
| distribute...
| antirez wrote:
| 0 - unrelated to RedLock algorithm, since RedLock has its own
| replication in the protocol.
|
| 1 - I replied in a blog post refuting most of the arguments,
| and then there was maybe another reply. You may want to read
| the full thing to form an idea about the safety of RedLock.
| leetrout wrote:
| Watch out for transaction ID and sequence exhaustion if you have
| a lot of things rolling through a table as a queue.
|
| Postgres is awesome but logging and queuing in a table can cause
| gotchas you won't have with redis.
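| For monitoring that wraparound risk, a standard catalog query
| shows how much transaction ID headroom each database has left:

```sql
-- Transaction IDs are 32-bit; autovacuum must freeze old rows
-- before age(datfrozenxid) approaches the ~2 billion hard limit,
-- so alert well before that for hot queue tables:
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```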
| yukinon wrote:
| This is a great article because it outlines an alternative to
| using Redis for any given use cases. If we don't constantly re-
| evaluate our toolsets and their capabilities, it can lead to poor
| decision making.
|
| That being said, I've found Redis largely a pleasure to work with
| for these use cases and don't really see a real incentive to
| changing my current approach.
| arpa wrote:
| As German engineers are known to say: "Why make things simple
| when complicated will do?"
| Fannon wrote:
| If you already have Postgres in your project and now you get
| the requirement for an additional cache store, I think it will
| be less complicated to reuse what you already have instead of
| adding another DB to your stack.
|
| Of course, at some point of scaling a dedicated cache
| store will make sense anyway.
|
| (Just some justifications from a german engineer :) )
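| A common sketch of that reuse is an UNLOGGED table (the table
| and key names below are illustrative):

```sql
-- UNLOGGED skips the WAL: much faster writes, but the table is
-- truncated after a crash, which is usually fine for a cache
CREATE UNLOGGED TABLE cache (
    key        text PRIMARY KEY,
    value      jsonb,
    expires_at timestamptz NOT NULL
);

-- Upsert an entry with a five-minute TTL
INSERT INTO cache (key, value, expires_at)
VALUES ('user:1', '{"name": "Ada"}', now() + interval '5 minutes')
ON CONFLICT (key) DO UPDATE
    SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at;

-- Read, treating expired rows as misses
SELECT value FROM cache WHERE key = 'user:1' AND expires_at > now();
```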
| arpa wrote:
| True. However, since cache is usually transient, adding this
| key-value store/cache is as easy as "docker run redis". No
| need to provision block storage, and it's really lightweight
| in comparison.
|
| (that being said, I try really hard not to judge; after all,
| i'm not without fault: it's 2021 and i'm using bash over cgi-
| bin to serve web pages for my own hobby projects :))) )
| rualca wrote:
| > True. However, since cache is usually transient, adding
| this key-value store/cache is as easy as "docker run
| redis". No need to provision block storage, and it's really
| lightweight in comparison.
|
| If you're using postgres for caching only, as you do Redis,
| then you also do not need to provision block storage.
|
| If you happen to already have Postgres running for other
| uses, you also do not need to provision block storage.
|
| Finally, I would add that Redis clients such as Redisson
| are resource hogs that cause performance problems on apps,
| while pg clients are barely noticeable.
| mosselman wrote:
| Redis as a cache server requires different settings from
| Redis as a job queue, so you can't reuse the same server
| anyway.
| himinlomax wrote:
| Redis is not a DB. Only idiots think Redis is a DB. To be
| fair, I've met plenty of such idiots, so if you happen to be
| one, you're not alone.
|
| I'm not calling people names to be mean here; I'm trying to
| save your lives.
| cube2222 wrote:
| Could you explain why Redis is not a DB instead of just
| calling people who disagree with you idiots?
| himinlomax wrote:
| It offers no guarantees commonly afforded by databases. I
| have seen several instances, real life instances of
| people incorrectly assuming those guarantees when using
| Redis. So sure, if you know exactly what you're doing,
| you _can_ use Redis as a database, but the reality is
| that it's much safer to assume you can't, as a first
| approximation.
|
| Calling people "idiots" here is a public service. I'm
| sure there are non-idiots who can prove me wrong, but
| they're the minority among production users of Redis who
| rely on it for non-perishable data.
| kube-system wrote:
| Non-ACID databases are still databases. Everyone should
| use the right tool for the job, and ACID is not a
| requirement for every job.
| squiggleblaz wrote:
| > Only idiots think [x]... I'm not calling people names to
| be mean here; I'm trying to save your lives.
|
| If you don't mean to call people names, my advice is to not
| call them names.
|
| For instance, in this particular case, a perfectly
| intelligent person whose expertise is in another area of
| software development might have been misinformed. In this
| case, you're going to achieve your goal (saving "lives")
| far more effectively if you state (a) what you mean by "a
| DB" and (b) why it is always a worse decision to use Redis
| as a DB than to use some other tool for that end.
| himinlomax wrote:
| I mean it in the same way you'd put a "not to be operated
| by morons" sticker on dangerous industrial equipment.
| Redis offers little to no guarantees, particularly in a
| cluster, and unless your data is perishable (session
| state, cache ...) and you know _exactly_ what you're
| doing, that's less than what most users expect. I've seen
| it happen several times already.
| squiggleblaz wrote:
| > I mean it in the same way you'd put a "not to be
| operated by morons" sticker on dangerous industrial
| equipment.
|
| I've never seen a sticker like that on dangerous
| industrial equipment. I'm pretty sure if you tried to put
| a sticker like that on dangerous industrial equipment,
| you would not limit your liability one iota. You would
| certainly receive at least this much pushback, and
| probably more. Normally, for a warning to be effective,
| it needs to say what it is warning about.
|
| >Redis offers little to no guarantees, particularly in a
| cluster, and unless your data is perishable (session
| state, cache ...) and you know exactly what you're doing,
| that's less than what most users expect. I've seen it
| happen several times already.
|
| It reads like you're making a different claim now than
| you did before. Before, you said something like "you're
| an idiot if you use Redis as a database", and we were
| concerned by what you mean by "a database" and how
| useless it is to just call someone an idiot. Now it seems
| like you mean "you'd have to be an expert or an idiot if
| you decide to use Redis at all for any purpose".
|
| Now, I think it's still useless to call someone an idiot
| - it would be more productive to play with a cat than
| call someone an idiot for using a widely regarded tool.
|
| In this form we don't need to worry about what you mean
| by a database - it's the tool that is the signal, and you
| need to take into consideration its properties before you
| decide to use it, rather than reading marketing copy or
| looking for a point on your resume.
|
| And I think trustworthy, expert, well-trained people can
| make judgements that turn out to be wrong. A person can
| make a very good case to use Redis which turns into a
| nightmare by future requirements changes or by an
| incorrect weighting of the pros and cons.
|
| I think you're just trying to say something which I would
| express as "The burden of proof lies on the person who
| wants to add Redis to a system, and on any additional use
| of Redis in a system which already has it. If you're
| considering simplifying a system which uses Redis and
| something else so that it uses only one of them, you
| almost certainly want to remove Redis rather than the
| other tool." If so, I think that's fair (even though the
| word "idiot" will make it harder to get your message
| across, and the mention of "as a database" is a
| distraction - you can only use Redis as a database and
| the question is how much persistence do you need and how
| much do you get), but I think you should have added the
| why to the original message: "Redis offers little to no
| guarantees, particularly in a cluster, and unless your
| data is perishable (session state, cache ...) and you know
| exactly what you're doing, that's less than what most
| users expect." Like this, it emphasises the risk of
| "accidental creep" as well as the persistence limitations
| you're worried about.
|
| After all, any idiot who reads your original text is
| going to ignore it. If they're willing to decide based on
| reputation, Redis is a well known tool used by many
| businesses and you're a pseudonymous commenter on social
| media. If they're willing to decide based on facts, you
| didn't give them any. And if they're willing to decide
| based on whether or not they assess themselves as an idiot,
| they aren't going to assess themselves as an idiot.
| himinlomax wrote:
| https://www.google.com/search?sxsrf=ALeKk02SkvKxY0RsgH6we
| x0w...
| edgyquant wrote:
| There are no images of that sticker on actual equipment
| in that search
| [deleted]
| Fannon wrote:
| It would have been better to frame it more generically as
| "adding another store or db".
|
| But the official self-description of Redis is the
| following:
|
| > Redis is an open source (BSD licensed), in-memory data
| structure store, used as a database, cache, and message
| broker.
|
| Quote: https://redis.io/
|
| I'd be interested on why exactly you think that Redis
| shouldn't be called a DB?
| himinlomax wrote:
| Most products labelled as "database" afford most if not
| all of the ACID guarantees. Redis offers none by default.
| It's a database in the loosest sense of the word, so it's
| a database alright, but then so is a text file.
|
| The problem is that in practice, some people will assume
| that it, being a "database", has similar properties to
| the other databases they know when it doesn't. I've had
| heated debates in meetings, and I wasn't just spewing an
| opinion but pointing at the documentation. In the latest
| case, that involved replication in a cluster, there is
| absolutely no guarantee (or there wasn't at the time that
| happened) of consistency between members. The developers
| had assumed there was. It's the kind of assumption that
| works just fine and dandy in a staging environment, but
| breaks subtly and painfully in production, eventually.
| squiggleblaz wrote:
| > I'd be interested on why exactly you think that Redis
| shouldn't be called a DB?
|
| Normally I think when people say "Redis isn't a DB", they
| mean "an update will always return successfully before
| the data has been committed to storage".
|
| If your user makes a change, you record that in a
| Postgres database, you've informed the user of your
| success, and the power goes out on the Postgres database,
| then the user will not have been misled; the data is in
| the database. If you did it with a persistent Redis
| system, there is some reasonable probability that the
| user has been misled and their data is no longer
| available.
|
| I don't think this is a good definition of a database. In
| reality, we take various decisions, some of which
| increase the likelihood data will be preserved across
| some kinds of failures, some of which decrease the
| likelihood. No one would say "you don't have a database
| because you take daily backups from a single,
| unreplicated Postgres instance". They say "you have made
| decisions which increase the possibility of dataloss
| under certain circumstances" (well, they say things I
| translate that way).
| Fannon wrote:
| And there I'm unsure what we consider essential
| characteristics of a database.
|
| This characteristic you're mentioning is essential for
| your persistence database, but maybe not so much for a
| derived read-only databases (e.g. when using the CQRS
| pattern). Those guarantees and characteristics are often
| a trade off anyway.
| squiggleblaz wrote:
| I certainly agree with you. As engineers, we should be
| comfortable with properties and requirements, and
| selecting the combinations of tools and self-written code
| which match them most maintainably.
|
| Unfortunately, it's easier to think in terms of tools and
| substitutes. For instance, it's difficult for us to say
| "I used a combination of handwritten code and redis to
| provide some of the guarantees that I would've got for
| less handwritten code and Postgres". However, this is
| exactly that sort of article: it argues that sometimes,
| it is better to use Postgres even if the features you
| want are more naturally provided by Redis.
|
| > And there I'm unsure what we consider essential
| characteristics of a database.
|
| I doubt there really is anything. I don't think it's
| possible to actually narrow down "database" to some
| precise definition, so I think for instance the person
| who said "only an idiot thinks Redis is a database" needs
| to explain what they mean by "a database" before we can
| discuss the merits of their position.
| himinlomax wrote:
| > And there I'm unsure what we consider essential
| characteristics of a database.
|
| Simple, people know other databases (Postgres, MySQL,
| SQLite, SQL Server, Berkeley DB, CouchDB, ....), those
| have many things in common, so when they hear "Redis is a
| database" they unconsciously assume Redis has properties
| those other products have. And it has few of the
| properties all those products share.
| squiggleblaz wrote:
| Just fyi: If you're deep in a thread and the Reply button
| doesn't show on the comment you want to reply to, you can
| click on the time. The reply button will be available
| there. Basically, permitted thread depth is relative to
| the view, not absolute to the root.
| rualca wrote:
| > As German engineers are known to say: "Why make things simple
| when complicated will do?"
|
| Isn't Redis an in-memory key/value store? Paying for RAM is not
| the same as paying for SSD.
| welder wrote:
| There are Redis-compatible alternatives that use disk
| space instead of RAM.
|
| https://wakatime.com/blog/45-using-a-diskbased-redis-
| clone-t...
| hardwaresofton wrote:
| A blog post series I've been meaning to write for over 3 years
| now:
|
| * Every database a Postgres 1: Key/Value store
|
| * Every database a Postgres 2: Document stores
|
| * Every database a Postgres 3: Logs (Kafka-esque)
|
| * Every database a Postgres 4: Timeseries
|
| * Every database a Postgres 5: Full Text Search
|
| * Every database a Postgres 6: Message Queues
|
| Low key, you could make almost every single type of database a
| modern startup needs out of Postgres, and get the benefits (and
| drawbacks) of Postgres everywhere.
|
| Should you do it? Probably not. Is it good enough for a
| theoretical ~70% of the startups out there who really don't
| shuffle around too much data or need to pretend to do any hyper
| scaling? Maybe.
|
| If anyone from 2ndQuadrant/Citus/EDB see this, please do a series
| like this, make the solutions open source, and I bet we'd get
| some pretty decent performance out of Postgres compared to the
| purpose built solutions (remember, TimescaleDB did amazing
| compared to InfluxDB, a purpose built tool, not too long ago).
|
| New features like custom table access methods and stuff also
| shift the capabilities of Postgres a ton. I'm fairly certain I
| could write a table access method that "just" allocated some
| memory and gave it to a redis subprocess (or even a compiled-in
| version) to use.
|
| [EDIT] - It's not clear but the listing is in emacs org mode,
| those bullet points are expandable and I have tons of notes in
| each one of these (ex. time series has lots of activity in
| postgres -- TimescaleDB, native partitioning, Citus, etc).
| Unfortunately the first bullet point is 43 (!) bullet points
| down. If someone wants to fund my yak shaving reach out,
| otherwise someone signal boost this to 2Q/Citus/EDB so
| professionals can take a stab at it.
|
| [EDIT2] - I forgot some, Postgres actually has:
|
| - Graph support, w/ AgensGraph now known as AGE[0]
|
| - OLAP workloads with Citus Columnar[1] (and zedstore[2]).
|
| [0]: https://age.apache.org
|
| [1]:
| https://www.citusdata.com/blog/2021/03/05/citus-10-release-o...
|
| [2]: https://github.com/greenplum-db/postgres/tree/zedstore
| rkwz wrote:
| > Should you do it? Probably not. Is it good enough for a
| theoretical ~70% of the startups out there who really don't
| shuffle around too much data or need to pretend to do any hyper
| scaling? Maybe.
|
| It's also useful when you want to quickly build a "good enough"
| version of a feature like search so you can get it in front of
| your users fast and iterate on their feedback. Most of the
| time, they'd be quite happy with the results and you don't have
| to spend time on something like managing Elasticsearch.
|
| I wrote a post on how you can use postgres to add search
| capabilities with support for queries like
|
| jaguar speed -car
|
| ipad OR iphone
|
| "chocolate chip" recipe
|
| http://www.sheshbabu.com/posts/minimal-viable-search-using-p...
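| For reference, that query syntax maps directly onto Postgres
| 11+'s websearch_to_tsquery; the table and columns below are
| made up for illustration:

```sql
-- websearch_to_tsquery understands quoted phrases, OR, and
-- -exclusion, matching the example queries above
SELECT title
FROM posts
WHERE to_tsvector('english', title || ' ' || body)
      @@ websearch_to_tsquery('english', 'jaguar speed -car');
```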
| hardwaresofton wrote:
| Yup -- I'm a big fan of always writing the interface and the
| implementation, even if there's only one. You're always glad
| you wrote `Queue` and `PostgresQueue` when it comes time to
| write `KafkaQueue` or `NATSQueue`.
|
| That said, I am an unrepentant yak shaver, and there is a lot
| to be said in just writing those things when you need it but,
| Postgres would be perfect for rapid prototyping in this way.
| osigurdson wrote:
| I do think this is a product that everyone wants - support all
| popular models (relational, kvs, queue, log, etc) in a
| consistent, scalable, open source and easy to operate service.
| I'm not sure that this is actually possible but I think if such
| a thing did exist it really would dominate.
|
| In the current reality today, implementing everything in
| Postgres is probably going to be slower to market (i.e. for a
| start-up) than using off-the-shelf products. When you do need
| to scale, this is when you get to learn about how valid your
| assumptions were in your abstraction layer - most likely in
| production. As a concrete example, Kafka isn't designed to work
| well with large numbers of topics. Similarly, InfluxDB isn't
| designed to work well with high cardinality time series. I
| think it is generally wiser to "skate where the puck is going"
| in this situation.
|
| Of course, everything is a trade-off. Postgres is incredibly
| reliable (like insane) and simple to operate. I'd say for any
| kind of internal line-of-business type application where
| scalability is less of a concern you really would be doing your
| ops team a service by implementing everything in Postgres.
| des1nderlase wrote:
| But I don't get it, why would you use PG for all these if
| specialized systems (and arguably optimized for that use case)
| already exist?
| jpalomaki wrote:
| One practical thing is that consistent backups can become
| very difficult if you distribute your state to multiple
| places.
| cnorthwood wrote:
| Operational ease
| konschubert wrote:
| Because you already have a Postgres DB running probably and
| you know how to back it up, you know how to upgrade it, all
| your services can already authenticate towards it, your
| developers can run it locally, you know how to mock it...
| hardwaresofton wrote:
| Just repeating what others have said:
|
| - Postgres is probably already running (it's pretty good for
| OLTP workloads)
|
| - Operational ease and robustness
|
| - Cloud support everywhere for Postgres
|
| - People know how to backup and restore postgres
|
| - Sometimes Postgres will beat or wholly subsume your
| specialized system and be a good choice
|
| - Postgres has ACID compliance and a _very_ good production-
| ready grasp on the research-level problems involved in
| transactions. I've never met an etcd/zookeeper cluster I
| didn't wish was simply a Postgres table. Imagine being able
| to actually change your cache _and_ your data at the same
| time and ensure that both changes happen or none of them
| happen (this is a bit vaporware-y, because locks and access
| pattern discrepancies and stuff, but bear with me). You're
| much less likely to see Postgres fail a Jepsen test[0]
|
| [0]: https://jepsen.io
| dave_sid wrote:
| I'd think you don't really need Redis or Elasticsearch in about
| 80% of the places you see them. You certainly don't need the
| baggage.
| Xophmeister wrote:
| SKIP LOCKED looks interesting; I'll have to try that. I've used
| advisory locks in the past, but I kept running into deadlocks
| when I tried to acquire them intelligently (just for popping off
| the queue). It was unclear why, at the time, so I just put an
| advisory lock on every transaction. Obviously that causes serious
| contention problems as the number of jobs and workers increase.
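| For the curious, the SKIP LOCKED pop usually looks something
| like this (the jobs schema here is hypothetical):

```sql
-- Atomically claim one pending job; concurrent workers skip
-- rows another transaction has already locked instead of blocking
UPDATE jobs
SET status = 'running'
WHERE id = (
    SELECT id FROM jobs
    WHERE status = 'pending'
    ORDER BY created_at
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
RETURNING id, payload;
```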
| [deleted]
| u678u wrote:
| Surely redis is trivial to set up. Postgres gives you lots more
| maintenance and admin headaches.
___________________________________________________________________
(page generated 2021-06-12 23:01 UTC)