[HN Gopher] Redis is fast - I'll cache in Postgres
___________________________________________________________________
Redis is fast - I'll cache in Postgres
Author : redbell
Score : 309 points
Date : 2025-09-25 23:34 UTC (23 hours ago)
(HTM) web link (dizzy.zone)
(TXT) w3m dump (dizzy.zone)
| joonas wrote:
| In a similar vein, I'd always thought Redka
| (https://github.com/nalgeon/redka) was a neat idea since it gives
| you access to a subset of the Redis API backed by either SQLite
| or Postgres
| notpushkin wrote:
| Came here to post this. I'm wondering how Redka will compare
| with Postgres (and Redis) in terms of performance though.
|
| Edit: https://antonz.org/redka/#performance
| iamcalledrob wrote:
| This isn't a great test setup. It's testing RTT rather than the
| peak throughput of Redis.
|
| I'd suggest using Redis pipelining -- or better: using the
| excellent rueidis redis client which performs auto-pipelining.
| Wouldn't be surprising to see a 10x performance boost.
|
| https://github.com/redis/rueidis
| oa335 wrote:
| Postgres also supports query pipelining - and it seems like the
| popular Go Postgres client library pgx supports it:
| https://github.com/jackc/pgx/issues/1113#event-6964024724
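Why pipelining changes the picture can be seen with simple arithmetic. This is a rough model, not a measurement from the article; the RTT and per-command costs are illustrative assumptions:

```python
# Rough model: without pipelining, every command pays one network
# round trip; with pipelining, many commands share one round trip.
RTT_MS = 0.5          # assumed network round-trip time
PER_CMD_MS = 0.01     # assumed server-side cost per command

def total_ms(commands: int, batch: int) -> float:
    """Time to issue `commands` commands in pipeline batches of `batch`."""
    round_trips = -(-commands // batch)  # ceiling division
    return round_trips * RTT_MS + commands * PER_CMD_MS

sequential = total_ms(1000, 1)    # one command per round trip
pipelined = total_ms(1000, 100)   # 100 commands per round trip

print(f"sequential: {sequential:.0f} ms")
print(f"pipelined:  {pipelined:.0f} ms")
```

Under these assumptions, 1000 sequential GETs cost ~510 ms while the pipelined version costs ~15 ms, which is why a benchmark without pipelining mostly measures RTT.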
| aaronblohowiak wrote:
| Even so, 35ms+ latency for Redis reads is very, very high. I'd
| want to understand what is happening there.
| dizzyVik wrote:
| Author here. Redis is definitely faster. I was specifically not
| going for absolute peak performance for either redis or
| postgres - that would require going down to the wire protocols.
| The idea was to emphasize that there's a "good enough" level of
| performance. Once you need that sort of speed - sure, there are
| ways to achieve it.
| motorest wrote:
| > The idea was to emphasize that there's a "good enough" level of
| performance. Once you need that sort of speed - sure, there
| are ways to achieve it.
|
| To this I would add that more often than not the extra cost
| and complexity of a memory cache does not justify shaving off
| a few hypothetical milliseconds from a fetch.
|
| On top of that, some nosql offerings from popular cloud
| providers already have CRUD operations faster than 20ms.
| arp242 wrote:
| When I last benchmarked Redis vs. PostgreSQL for a simple k/v
| cache it was about ~1ms for PostgreSQL to fetch a key, and ~0.5ms
| for Redis with a similar setup as in this post (although I used
| "value bytea" instead of "value string" - I don't know if it
| matters, probably not; 1ms was fast enough that I didn't care to
| test).
|
| I didn't measure setting keys or req/sec because for my use case
| keys were updated infrequently.
|
| I generally find single-request latency (ms) to be a more useful
| metric than reqs/sec or latency at full load, as full load is not
| a typical load. Or at least wasn't for my use case.
|
| Of course all depends on your use case etc. etc. In some cases
| throughput does matter. I would encourage everyone to run their
| own benchmarks suited to their own use case to be sure - should
| be quick and easy.
|
| As a rule I recommend starting with PostgreSQL and using
| something else only if you're heavily using the cache or you run
| into problems. Redis isn't too hard to run, but skipping it is
| still one less service to worry about. Or alternatively, just use
| an in-memory DB. Not always appropriate of course, but sometimes
| it is.
| Maskawanian wrote:
| When you benchmarked Postgres did you disable WAL for the cache
| table? That may minimize the difference.
| arp242 wrote:
| Unlogged table, yes. 1ms is more than fast enough so I didn't
| bother to look further.
| xyzzy_plugh wrote:
| A difference of 0.5ms is negligible with single digit network
| latency. You would need significant batching to experience the
| effects of this difference.
|
| Of course such sensitive environments are easily imaginable but
| I wonder why you'd select either in that case.
| arp242 wrote:
| > A difference of 0.5ms is negligible with single digit
| network latency
|
| Yes, that was my take-away.
| brightball wrote:
| The big win for Redis is pipelining IMO.
| bart3r wrote:
| Don't tell DHH
| Fire-Dragon-DoL wrote:
| Didn't DHH just release a sqlite-only cache?
| firecall wrote:
| Solid Cache, default in Rails 8.
|
| Doesn't require SQLite.
|
| Works with other DBs:
|
| https://github.com/rails/solid_cache
| Fire-Dragon-DoL wrote:
| Yeah, so no redis
| mehphp wrote:
| I think you just convinced me to drop redis for my new project.
|
| Definitely a premature optimization on my part.
| throwup238 wrote:
| "Dropping" something from a "new" project is premature
| optimization?
|
| Wherever you go, there you are.
| danielheath wrote:
| I read it as dropping something that _had been_ a premature
| optimisation.
| qu4z-2 wrote:
| Presumably adding Redis to a new project with no performance
| issues (yet?) is the premature optimisation.
| rplnt wrote:
| If you are optimizing for simplicity it may not be, as the
| use as a cache is much (much) more straightforward. Also,
| for a new project, I'd go with in-memory service-level
| cache as it outperforms both (in any metric) and can be
| easily replaced once the need arises.
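The in-process, service-level cache mentioned above can be very small indeed. A minimal sketch (the class and parameter names are my own, not from any library):

```python
import time

class TTLCache:
    """Minimal in-process cache: a dict of (expires_at, value) pairs."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable for testing
        self.store = {}

    def set(self, key, value):
        self.store[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self.store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.store[key]  # lazy eviction on read
            return default
        return value
```

Swapping this out for Redis (or Postgres) later only means replacing these two methods, which is what makes deferring the decision cheap.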
| mehphp wrote:
| I was referring to adding redis prematurely
| MobiusHorizons wrote:
| "Premature optimization" typically refers to optimizing before
| profiling. Ie optimizing in places that won't help.
|
| Is redis not improving your latency? Is it adding complexity
| that isn't worth it? Why bother removing it?
| maxbond wrote:
| I like to call cases like this "premature distribution." Or
| maybe you could call it "premature capacity." If you have an
| application running in the cloud with several thousand
| requests per day, you could probably really benefit from
| adding a service like Redis.
|
| But when you have 0-10 users and 0-1000 requests per day, it
| can make more sense to write something more monolithic and
| with limited scalability. Eg, doing everything in Postgres.
| Caching is especially amenable to adding in later. If you get
| too far into the weeds managing services and creating
| scalability you might get bogged down and never get your
| application in front of potential users in the first place.
|
| Eg, your UX sucks and key features aren't implemented, but
| you're tweaking TTLs and getting a Redis cluster to work
| inside Docker Compose. Is that a good use of your time? If
| your goal is to get a functional app in front of potential
| users, probably not.
| not_kurt_godel wrote:
| You probably don't need Redis until you have thousands of
| requests per minute, nevermind per day.
| foobarian wrote:
| I'd go further and even say per second! Actually PG can
| still handle it, the main problem is that it has a more
| complex runtime that can spike. Backups? Background jobs
| doing heavy writes? Replication? Vacuum? Can tend to
| cause multisecond slowdowns which may be undesirable
| depending on your SLA. But otherwise it would be fine.
| MobiusHorizons wrote:
| To be clear my question isn't claiming redis isn't
| premature optimization, but rather asking why the op thinks
| that it is. Being new doesn't automatically mean that there
| is no need for latency sensitivity. Making that assumption
| could be just as premature. Ripping something out that is
| already working also takes time and the trade offs need to
| be weighed.
| maxbond wrote:
| Can't respond for them of course but I didn't take the
| impression that it was already fully implemented,
| working, and functionally necessary. I took the
| impression they had started going down that path but it
| was still easy to bail. That's just a vibe though.
|
| But I agree that it would be appropriate to start out
| that way in some projects.
| flanked-evergl wrote:
| We dropped Redis from a 4 year old app that had a rapidly
| growing userbase. Best choice ever. We never looked back once
| other than to think how annoying it was to have to deal with
| Redis in addition to Postgres.
| Implicated wrote:
| Sincerely (I feel the need to add that, given the tension
| around here in these comments), I'm curious how Redis was
| annoying. Can you give any detail/insight?
| flanked-evergl wrote:
| Every component takes work. You have to upgrade it,
| something goes wrong with tuning, you have to debug it,
| etc. It went wrong about once a month, somehow. I'm sure it
| was fixable, but time is a commodity, and not having it any
| more gives us more time to work on other things.
|
| We can't get rid of Postgres, but since we run Postgres on
| GCP we really never even think about it.
| sixo wrote:
| If you use an UNLOGGED table in Postgres as a cache, and your DB
| restarts, you no longer have a cache. Then your main table gets a
| huge spike in traffic and likely grinds to a halt.
| Spivak wrote:
| Same as the folks who use in-memory Redis. Is there something
| uniquely bad about Postgres for this situation?
|
| If your cache is so performance critical that you can't lose
| the data then it sounds like you need a (denormalized)
| database.
| frollogaston wrote:
| Cache isn't meant to persist, and something is wrong if you
| hard depend on it persisting.
| shakna wrote:
| Cache is not the only solution to the thundering herd.
| wielebny wrote:
| If you need persistence on your Redis, then you're not using it
| as a cache. You're using it as a key-value store.
| sgarland wrote:
| Not quite - if you have a crash / hard restart, the table is
| truncated. If Postgres gracefully shuts down, the data is
| retained.
| carterschonwald wrote:
| This seems to be testing how the software is optimized for low-
| core deployments... how does Postgres performance vary as you add
| more cores and RAM? It's the sort of software where I'd presume
| more cores and RAM yield better performance. Assuming, as always,
| that mature systems software sees more many-core perf
| engineering.
| cortesoft wrote:
| Having the ability to set a TTL on the cache key is a critical
| feature of a cache, not something that can be tacked on later.
|
| I always find these "don't use redis" posts kind of strange.
| Redis is so simple to operate at any scale, I don't quite get why
| it is important to remove it.
| frollogaston wrote:
| Yeah, the article was like "I always need a DB anyway" but then
| sets up an extra cronjob to expire keys, plus more code. I get
| YAGNI and avoiding deps, but this is really extra stuff to deal
| with.
|
| Maybe Postgres could use a caching feature. Until then, I'm
| gonna drop in Redis or memcached instead of reinventing the
| wheel.
| maxbond wrote:
| Expiring keys in Postgres with a created_at column and a
| pg_cron job is very easy (at least, if you're comfortable in
| Postgres). Redis is world class though of course, and can be
| deployed turn-key in basically any environment. If you're
| more comfortable in Redis than Postgres, more power to you.
| Different choices can be pragmatic to different people.
|
| Personally for a greenfield project, my thinking would be
| that I am _paying_ for Postgres already. So I would want to
| avoid paying for Redis too. My Postgres database is likely to
| be underutilized until (and unless) I get any real scale. So
| adding caching to it is free in terms of dollars.
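The created_at-plus-cron pattern described above can be sketched in a few lines. This uses Python's stdlib SQLite as a stand-in for Postgres (in Postgres the purge would be a pg_cron-scheduled DELETE); the table, column names, and TTL are illustrative:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cache (
        key        TEXT PRIMARY KEY,
        value      TEXT NOT NULL,
        created_at REAL NOT NULL
    )
""")

TTL_SECONDS = 300

def cache_set(key, value):
    conn.execute(
        "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
        (key, value, time.time()),
    )

def cache_get(key):
    # Exclude expired rows at read time, so correctness never
    # depends on how promptly the purge job runs.
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ? AND created_at > ?",
        (key, time.time() - TTL_SECONDS),
    ).fetchone()
    return row[0] if row else None

def purge_expired():
    # In Postgres this DELETE would be scheduled with pg_cron;
    # it only reclaims space, it is not needed for correctness.
    conn.execute(
        "DELETE FROM cache WHERE created_at <= ?",
        (time.time() - TTL_SECONDS,),
    )
```

Because reads filter on `created_at`, a late or missed purge run degrades storage, not correctness, which addresses the "what if the cron job fails" worry elsewhere in the thread.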
| frollogaston wrote:
| I'm comfy with Postgres though, like I'll center my entire
| backend around it and do the heavy lifting in SQL (never
| ORM). It's more that I don't want to depend on a cronjob
| for something as fundamental as a cache.
|
| Usually Postgres costs a lot more than Redis if you're
| paying for a platform. Like a decent Redis or memcached in
| Heroku is free. And I don't want to waste precious Postgres
| connections or risk bogging down the whole DB if there's
| lots of cache usage, which actually happened last time I
| tried skipping Redis.
| maxbond wrote:
| I can understand being nervous about some cron job
| running on some other service, but what's concerning
| about a cron job managed inside of Postgres with pg_cron?
| If that doesn't run, your database is probably down
| anyway.
|
| Postgres might cost more but I'm probably already paying.
| I agree that exhausting connections and writing at a high
| rate are easy ways to bring down Postgres, but I'm
| personally not going to worry about exhausting
| connections to Postgres until I have at least a thousand
| of them. Everything has to be considered within the
| actual problem you are solving, there are definitely
| situations to start out with a cache.
| frollogaston wrote:
| I might be ok with this if it were built in, but pg_cron
| is an extension, so first off you might not even have
| access to it. And then you still have to monitor it.
| maxbond wrote:
| Seems like it's available on all the major providers.
| frollogaston wrote:
| Heroku doesn't have it. That's actually kinda annoying
| that they don't, cause the others do. AND they no longer
| have free Redis, so that changes things a bit.
|
| Edit: well a tiny bit, max $3/mo
| motorest wrote:
| > Usually Postgres costs a lot more than Redis if you're
| paying for a platform.
|
| You need to back up your unbelievable assertion with
| facts. A memory cache is typically far more expensive than
| a simple database, especially as provisioning the same
| memory capacity as RAM is orders of magnitude more
| expensive than storing the equivalent data in a database.
| frollogaston wrote:
| I didn't say it's cheaper for the same cache size. But
| yeah a base tier Redis that will carry a small project
| tends to be a lot cheaper than the base tier Postgres.
| motorest wrote:
| > I didn't say it's cheaper for the same cache size.
|
| So be specific. What exactly did you want to say?
|
| > But yeah a base tier Redis that will carry a small
| project tends to be a lot cheaper than the base tier
| Postgres.
|
| This is patently false. I mean, some cloud providers offer
| nosql databases with sub-20ms performance as part of
| their free tier.
|
| Just go ahead and provide any evidence, any at all, that
| supports the idea that Redis is cheaper than Postgres. Any
| concrete data will do.
| frollogaston wrote:
| Look at the Heroku pricing. If you don't like Heroku then
| look at AWS pricing. Specifically for Postgres, not a
| NoSQL DB (which Redis can be too)
| MangoToupe wrote:
| Why would you cache the entire database though? Seems
| like an apples to oranges comparison.
| motorest wrote:
| > Why would you cache the entire database though?
|
| I have no idea where you got that from.
| MangoToupe wrote:
| > especially as provisioning the same memory capacity as
| RAM is orders of magnitude more expensive than storing
| the equivalent data in a database.
|
| I'm not sure how else to interpret this
| motorest wrote:
| Why are you assuming that memory capacity means storing the
| whole database in memory? I mean, think. What do you think
| is the biggest performance bottleneck with caches, and how
| does this relate to memory capacity?
| motorest wrote:
| > Yeah, the article was like "I always need a DB anyway" but
| then sets up an extra cronjob to expire keys, plus more code.
|
| You do not need cron jobs to do cache. Sometimes you don't
| even need a TTL. All you need is a way to save data in a way
| that is easy and cheaper to retrieve. I feel these comments
| just misinterpret what a cache is by confusing it with what
| some specific implementation does. Perhaps that's why we see
| expensive and convoluted strategies using Redis and the like
| when they are absolutely not needed at all.
| maxbond wrote:
| If we don't use a TTL, aren't we going to have to either
| accept that our cache will grow without bounds or take an
| even more sophisticated approach (like tracking access
| times instead of creation times)? Is there something
| simpler I'm not seeing?
| frollogaston wrote:
| I've tried it this way. You can get away with no TTL if
| your keys are constrained. Sometimes there are enough
| keys to be a problem. I'd rather just set up a TTL and
| not worry about this.
| maxbond wrote:
| Agreed, simple and widely applicable heuristics are
| great, and you can think deeply on it when and if it
| proves to be an issue worthy of such attention.
| motorest wrote:
| > If we don't use a TTL, aren't we going to have to
| either accept that our cache will grow without bounds
| (...)
|
| Do you have a bound? I mean, with Redis you do, but
| that's primarily a cost-driven bound.
|
| Nevertheless, I think you're confusing the point of a
| TTL. TTLs are not used to limit how much data you cache.
| The whole point of a TTL is to be able to tell whether a
| cache entry is still fresh or it is stale and must be
| revalidated. Some cache strategies do use TTL to determine
| which entry to evict, but that is just a scenario that takes
| place when memory is at full capacity.
| maxbond wrote:
| A TTL doesn't really tell you if it's stale though. It
| gives you an upper bound on how long it can have been
| stale. But something becomes stale when the underlying
| resource is written to, which can happen an hour or an
| instant after you cache it. You should probably evict it
| when the write comes in. In my mind, it's for evicting
| things that aren't in use (to free up memory).
| motorest wrote:
| > A TTL doesn't really tell you if it's stale though
| (...)
|
| Non sequitur, and immaterial to the discussion.
|
| > You should probably evict it when the write comes in.
|
| No. This is only required if memory is maxed out and
| there is no more room to cache your entry. Otherwise you
| are risking cache misses by evicting entries that are
| still relatively hot.
| maxbond wrote:
| > Non-sequitur,and imaterial to the discussion.
|
| You said:
|
| > The whole point of a TTL is to be able to tell whether
| a cache entry is still fresh or it is stale and must be
| revalidated.
|
| So I responded to it. I don't really understand why you
| think that's a non sequitur.
|
| > No.
|
| I'm a bit confused. We're not using TTLs and we're not
| evicting things when they become invalid. What is your
| suggestion?
| frollogaston wrote:
| The cache isn't the only hot thing here. Relax.
| akdor1154 wrote:
| It's not like it's bad, it's more like cutting down on the
| number of systems you need to operate.
| pphysch wrote:
| I have been running Redis for years as a cache and have spent
| less than 5 cumulative minutes "operating" it.
|
| I'm a big "just use Postgres" fan but I think Redis is
| sufficiently simple and orthogonal to include in the stack.
| coder543 wrote:
| I keep hoping Postgres will one day have the ability to mark a
| timestamp column as an expiry column. It would be useful for
| all kinds of things beyond caching, including session tokens,
| feature flags, background jobs, rate limiting, delayed deletion
| (as a variant of soft deletion), etc.
|
| It seems like the autovacuum could take care of these expired
| rows during its periodic vacuum. The query planner could
| automatically add a condition that excludes any expired rows,
| preventing expired rows from being visible before autovacuum
| cleans them up.
| neoden wrote:
| One could use a trigger for this. All we need is to set up a
| trigger that deletes all expired records, based on some
| timestamp column, on each update. That would add some latency,
| but as was said, most projects would find it good enough
| anyway.
| nhumrich wrote:
| I use pg_cron for this. But I don't need the TTL to be
| accurate to the minute, or even to the hour.
| dmitry-vsl wrote:
| Probably better to use partitioned table and drop old
| partitions.
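The trigger variant suggested above can be sketched the same way, with Python's stdlib SQLite standing in for Postgres (Postgres would need a trigger function plus CREATE TRIGGER; the schema and trigger name here are made up for illustration):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cache (
        key        TEXT PRIMARY KEY,
        value      TEXT NOT NULL,
        expires_at REAL NOT NULL
    )
""")

# Every write opportunistically deletes expired rows, trading a bit
# of write latency for not needing a scheduled purge job at all.
conn.execute("""
    CREATE TRIGGER purge_on_write AFTER INSERT ON cache
    BEGIN
        DELETE FROM cache
        WHERE expires_at <= CAST(strftime('%s', 'now') AS REAL);
    END
""")

# Expired entries never survive past the next write:
conn.execute("INSERT INTO cache VALUES ('stale', 'x', ?)",
             (time.time() - 100,))
conn.execute("INSERT INTO cache VALUES ('fresh', 'y', ?)",
             (time.time() + 100,))
```

After the two inserts, only the 'fresh' row remains; the already-expired one was swept by the trigger.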
| motorest wrote:
| > Having the ability to set a TTL on the cache key is a
| critical feature of a cache, not something that can be tacked
| on later.
|
| What exactly is the challenge you're seeing? At the very least,
| you can save an expiry timestamp as part of the db entry. Your
| typical caching strategy already involves revalidating cache
| before it expires, and it's not as if returning stale while
| revalidating is something completely unheard of.
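Storing an expiry timestamp with the entry and serving stale while revalidating can be sketched in a few lines. This is a toy in-process version; the function name and the injectable clock are my own, not from any library:

```python
import time

store = {}  # key -> (expires_at, value)

def get_swr(key, ttl, fetch, revalidate, clock=time.monotonic):
    """Serve from cache; past the expiry timestamp, return the stale
    value immediately and trigger a refresh instead of blocking the
    caller (stale-while-revalidate)."""
    now = clock()
    entry = store.get(key)
    if entry is None:
        value = fetch(key)                # cold miss: caller waits once
        store[key] = (now + ttl, value)
        return value
    expires_at, value = entry
    if now >= expires_at:
        revalidate(key)                   # e.g. enqueue a background refresh
    return value                          # serve what we have, stale or not
```

Note that only the first request for a key ever waits on the slow fetch; everyone else gets an immediate answer, at the cost of occasionally serving a slightly stale one.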
| zzbzq wrote:
| Postgres nationalists will applaud the conclusion no matter how
| bad the reasoning is.
|
| Don't get me wrong, the idea that he wants to just use an RDBMS
| because his needs aren't great enough is a perfectly
| inoffensive conclusion. The path that led him there is very
| unpersuasive.
|
| It's also dangerous. Ultimately the author is willing to do a
| bit more work rather than learn something new. This works
| because he's using a popular tool people like. But overall, he
| doesn't demonstrate he's even thought about any of the things
| I'd consider most important; he just sort of assumes running a
| Redis is going to be hard and he'd rather not mess with it.
|
| To me, the real question is just cost vs. how much load the DB
| can even take. My most important Redis cluster basically exists
| to take load off the DB, which sees high load even from simple
| queries. Using the DB as a cache only works if your issue is
| expensive queries.
|
| I think there's an appeal that this guy reaches the conclusion
| someone wants to hear, and it's not an unreasonable conclusion,
| but it creates the illusion the reasoning he used to get there
| was solid.
|
| I mean, if you take the same logic, cross out the word
| Postgres, and write in "Elasticsearch," and now it's an article
| about a guy who wants to cache in Elasticsearch because it's
| good enough, and he uses the exact same arguments about how
| he'll just write some jobs to handle expiry--is this still
| sounding like solid, reasonable logic? No it's crazy.
| nvader wrote:
| I hope you remember to add monitoring for your cache expiry cron
| job so you can be notified if it ever fails to run due to a code
| change or configuration change. For example, database credentials
| might be rolled and if you forget to update the job your primary
| database could fill up.
|
| Perhaps you could have a second cron job that runs to verify that
| the first one completed. It could look for a last-ran entry. You
| shouldn't put it in the same database, so maybe you could use a
| key value store like redis for that.
| monkeyelite wrote:
| This is more than a few leaps of logic. A bad config of any
| infrastructure can cause problems.
| frollogaston wrote:
| The values in this test are like 20 bytes, right? Wonder how
| things compare if they're about 1KB.
| dizzyVik wrote:
| Not on my computer right now, but I think it's 45 bytes per
| value.
| monkeyelite wrote:
| In nginx config you can pretty easily cache web pages or API
| responses. I would also like to know how far that goes.
|
| Also, does anyone like memcached anymore? When I compared it
| with Redis in the past it appeared simpler.
| hnaccountme wrote:
| I think there is more performance to be taken from PostgreSQL.
| Not sure how pgx works internally since I have not used it.
|
| There are async functions provided by the PostgreSQL client
| library (libpq). I've used them to process around 2000 queries
| per second on a single connection against a logged table.
| jamesblonde wrote:
| Why do we promote articles like this that have nice graphs and
| are well written, when they should get a grade 'F' as an actual
| benchmark study. The way it is presented, a casual reader would
| think Postgres is 2/3rds the performance of Redis. Good god. He
| even admits Postgres maxed out its 2 cores, but Redis was
| bottlenecked by the HTTP server. We need more of an academic, not
| a hacker, culture for benchmarks.
| dizzyVik wrote:
| There's a reason this is on my blog and not a paper in a
| journal. This isn't supposed to show the absolute speed of
| either tool, the benchmark is not set up for that. I do state
| that redis has more performance left on the table in the blog
| post.
| vasco wrote:
| It's not a paper or a journal but you could at least try to
| run a decent benchmark. As it is this serves no purpose other
| than reinforcing whatever point you started with. Didn't even
| tweak postgres buffers, literally what's the point.
| dizzyVik wrote:
| I still end up recommending using postgres though, don't I?
| pcthrowaway wrote:
| "I'll use postgres" was going to be your conclusion no
| matter what I guess?
|
| I mean what if an actual benchmark showed Redis is 100X
| as fast as postgres for a certain use case? What are the
| constraints you might be operating with? What are the
| characteristics of your workload? What are your budgetary
| constraints?
|
| Why not just write a blog post saying "Unoptimized
| postgres vs redis for the lazy, running virtualized with
| a bottleneck at the networking level"
|
| I even think that blog post would be interesting, and
| might be useful to someone choosing a stack for a proof
| of concept. For someone who needs to scale to large
| production workloads (~10,000 requests/second or more),
| this isn't a very useful article, so the criticism is
| fair, and I'm not sure why you're dismissing it out of
| hand.
| dizzyVik wrote:
| I completely agree that this is not relevant for anyone
| running such workloads, the article is not aimed at them
| at all.
|
| Within the constraints of my setup, postgres came out
| slower but still fast enough. I don't think I can
| quantify what fast enough is though. Is it 1000 req/s? Is
| it 200? It all depends on what you're doing with it. For
| many of my hobby projects which see tens of requests per
| second it definitely is fast enough.
|
| You could argue that caching is indeed redundant in such
| cases, but some of those have quite a lot of data that
| takes a while to query.
| motorest wrote:
| > "I'll use postgres" was going to be your conclusion no
| matter what I guess?
|
| Would it bother you as well if the conclusion was
| rephrased as "based on my observations, I see no point in
| rearchitecting the system to improve the performance by
| this much"?
|
| I think you are so tied to a template solution that you
| don't stop to think about why you're using it, or even
| whether it is justified at all. Then, when you are faced
| with observations that challenge your unfounded beliefs,
| you somehow opt to get defensive? That's not right.
| vasco wrote:
| That's the point: you put in no effort and just did what
| you had already decided to do.
| dizzyVik wrote:
| I don't think this is a fair assessment. Had my
| benchmarks shown, say, that postgres crumbled under heavy
| write load then the conclusion would be different. That's
| exactly why I decided to do this - to see what the
| difference was.
| m000 wrote:
| Of course you didn't see postgres crumble. This is still a
| toy example of a benchmark. Nobody starts (much less pays
| for) a postgres instance to use exclusively as a cache. It
| is guaranteed that even in the simplest of deployments some
| other app (if not many of them) will be the main postgres
| tenant.
|
| Add an app that actually uses postgres as a database, and
| you will probably see its performance crumble, as the app
| will contend with the cache for resources.
|
| Nobody asked for benchmarking as rigorous as you would
| have in a published paper. But toy examples are toy
| examples, be it in a publication or not.
| a_c wrote:
| I find your article valuable. It shows me what amount of
| configuration is needed for a reasonable expectation of
| performance. In the real world, I'm not going to spend effort
| maxing out the configuration of a single tool. Not having the
| best-performing config for either tool is the least of my
| concerns. Picking either of them - or, as you suggested,
| Postgres - and then worrying about getting one billion requests
| to the service is far more important.
| lemagedurage wrote:
| The main issue is that a reader might mistake Redis for a 2X
| faster postgres. Memory is 1000X faster than disk (SSD) and
| with network overhead Redis can still be 100X as fast as
| postgres for caching workloads.
|
| Otherwise, the article does well to show that we can get a
| lot of baseline performance either way. Sometimes a cache is
| premature optimisation.
| phiresky wrote:
| If your cache fits in Redis then it fits in RAM, if your
| cache fits in RAM then Postgres will serve it from RAM just
| as well.
|
| Writes will go to RAM as well if you have synchronous=off.
| senorrib wrote:
| Not necessarily true. If you're sharing the database with
| your transaction workload your cache will be paged out
| eventually.
| jgalt212 wrote:
| This was my take as well, but I'm a MySQL / Redis shop. I
| really have no idea what tables MySQL has in RAM at any
| given moment, but with Redis I know what's in RAM.
| motorest wrote:
| > The main issue is that a reader might mistake Redis as a
| 2X faster postgres. Memory is 1000X faster than disk (SSD)
| and with network overhead Redis can still be 100X as fast
| as postgres for caching workloads.
|
| Your comments suggest that you are definitely missing some
| key insights into the topic.
|
| If you, like the whole world, consume Redis through a
| network connection, it should be obvious to you that
| network is in fact the bottleneck.
|
| Furthermore, using a RDBMS like Postgres may indeed imply
| storing data in a slower memory. However, you are ignoring
| the obvious fact that a service such as Postgres also has
| its own memory cache, and some query results can and are
| indeed fetched from RAM. Thus it's not like each and every
| single query forces a disk read.
|
| And at the end of the day, what exactly is the performance
| tradeoff? And does it pay off to spend more on an in-memory
| cache like Redis to buy you that performance delta?
|
| That's why real world benchmarks like this one are
| important. They help people think through the problem and
| reassess their irrational beliefs. You may nitpick about
| setup and configuration and test patterns and choice of
| libraries. What you cannot refute are the real world
| numbers. You may argue they could be better if this and
| that, but the real world numbers are still there.
| lossolo wrote:
| > If you, like the whole world, consume Redis through a
| network connection
|
| I think "you are definitely missing some key insights
| into the topic". The whole world is a lot bigger than
| your anecdotes.
| Implicated wrote:
| > If you, like the whole world, consume Redis through a
| network connection, it should be obvious to you that
| network is in fact the bottleneck.
|
| Not to be annoying - but... what?
|
| I specifically _do not_ use Redis over a network. It's
| _wildly_ fast. High volume data ingest use case - lots
| and lots of parallel queue workers. The database is over
| the network, Redis is local (socket). Yes, this means
| that each server running these workers has its own cache
| - that's fine, I'm using the cache for absolutely insane
| speed and I'm not caching huge objects of data. I don't
| persist it to disk, I don't care (well, it's not a big
| deal) if I lose the data - it'll rehydrate in such a
| case.
|
| Try it some time, it's fun.
|
| > And at the end of the day, what exactly is the
| performance tradeoff? And does it pay off to spend more
| on an in-memory cache like Redis to buy you that
| performance delta?
|
| Yes, yes it is.
|
| > That's why real world benchmarks like this one are
| important.
|
| That's not what this is though. Just about nobody who has
| a clue is using default configurations for things like PG
| or Redis.
|
| > They help people think through the problem and reassess
| their irrational beliefs.
|
| Ok but... um... you just stated that "the whole world"
| consumes redis through a network connection. (Which, IMO,
| is the wrong tool for the job - sure it will work, but
| that's not where/how Redis shines.)
|
| > What you cannot refute are the real world numbers.
|
| Where? This article is not that.
| gmm1990 wrote:
| That is an interesting use case, I hadn't thought about a
| setup like this with a local redis cache before. Are the
| typical advantages of using a db over a filesystem the
| reason to use redis instead of just reading from memory
| mapped files?
| Implicated wrote:
| > Are the typical advantages of using a db over a
| filesystem the reason to use redis instead of just
| reading from memory-mapped files?
|
| Eh - while surely not everyone has the benefits of doing
| so, I'm running Laravel and using Redis is just _really_
| simple and easy. To do something via memory mapped files
| I'd have to implement quite a bit of stuff I don't
| want/need to (locking, serialization, ttl/expiration,
| etc).
|
| Redis just works. Disable persistence, choose the
| eviction policy that fits the use, config for unix socket
| connection and you're _flying_.
|
| My use case is generally data ingest of some sort where
| the processing workers (in my largest projects I'm
| talking about 50-80 concurrent processes chewing through
| tasks from a queue, also backed by redis) are likely to
| end up running the same queries against the database
| (mysql) to get 'parent' records (ie: user associated with
| object by username, post by slug, etc), and there's no
| way to know if there will be multiples (ie: if we're
| processing 100k objects there might be 1 from UserA or
| there might be 5000 by UserA - where each one processed
| will need the object/record of UserA). In this project
| in particular there are ~40 million of these 'user'
| records and hundreds of millions of related objects - so
| I can't store/cache _all_ users locally - but I sure
| would benefit from not querying for the same record 5000
| times in a 10 second period.
|
| For the most part, when caching these records over the
| network, the performance benefits were negligible
| (depending on the table) compared to just querying mysql
| for them. They are just `select where id/slug =` queries.
| But when you lose that little bit of network latency and
| you can make _dozens_ of these calls to the cache in the
| time it would take to make a single networked call... it
| adds up real quick.
|
| PHP has direct memory "shared memory" but again, it would
| require handling/implementing a bunch of stuff I just
| don't want to be responsible for - especially when it's
| so easy and performant to lean on Redis over a unix
| socket. If I needed to go faster than this I'd find
| another language and likely do something direct-to-memory
| style.
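The setup described above - socket-only, no persistence, LRU eviction - boils down to a handful of redis.conf directives. A minimal sketch; the socket path and memory limit are illustrative assumptions, not values from the comment:

```
# listen on a unix socket only, skip TCP entirely
port 0
unixsocket /var/run/redis/redis.sock
unixsocketperm 770

# no persistence - the cache simply rehydrates if lost
save ""
appendonly no

# cap memory and evict least-recently-used keys
maxmemory 512mb
maxmemory-policy allkeys-lru
```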
| pigbearpig wrote:
| That's the reader's fault then. I see the blog post as the
| counter to the insane resume-building over-engineered
| architecture you see at a lot of non-tech companies. Oh,
| you need a cache for our 25-user internal web application?
| Let's put up a redis cluster fronted by elasticsearch,
| using an LLM to publish cache invalidations over Kafka.
| themgt wrote:
| There's also a sort of anti-everything attitude that gets
| boring and lazy. Redis is about the simplest thing
| possible to deploy. This wasn't about "a redis cluster
| with elasticsearch using an LLM" - it was just Redis.
|
| I sometimes read this stuff like people explaining how
| they replaced their spoon and fork with a spork and
| measured only a 50% decrease in food eating performance.
| And have you heard of the people with a $20,000 Parisian
| cutlery set to eat McDonalds? I just can't understand
| insane fork enjoyers with their over-engineered dining
| experience.
| lomase wrote:
| There is this cv-driven-development where you have to use
| Redis, Kafka, Mongo, Rabbit, Docker, AWS, job schedulers,
| Microservices, and so on.
|
| The fewer dependencies my project has the better. If it
| is not needed, why use it?
| array_key_first wrote:
| Software development has such a pro-complexity culture
| that, I think, we need more anti-stuff or pushback.
| rollcat wrote:
| Thank you for the article.
|
| My own conclusions from your data:
|
| - Under light workloads, you can get away with Postgres. 7k
| RPS is fine for a lot of stuff.
|
| - Introducing Redis into the mix has to be carefully weighed
| against increased architectural complexity, and having a
| common interface allows us to change that decision down the
| road.
|
| Yeah maybe that's not up to someone else's idea of a good
| synthetic benchmark. Do your load-testing against actual
| usage scenarios - spinning up an HTTP server to serve traffic
| is a step in the right direction. Kudos.
| whateveracct wrote:
| most people with blogs don't know what they're doing. or don't
| care to know? sadly they get hired at companies and everyone
| does what they say cuz they have a blog. i've seen some shit in
| that department it's wild how much some people really are
| imposters.
| motorest wrote:
| > most people with blogs don't know what they're doing. or
| don't care to know?
|
| I don't see any point to this blend of cynical contrarianism.
| If you feel you can do better, put your money where your
| mouth is. Lashing at others because they went through the
| trouble of sharing something they did is absurd and
| creates no value.
|
| Also, maintaining a blog doesn't make anyone an expert, but
| not maintaining a blog doesn't mean you are suddenly more
| competent than those who do.
| whateveracct wrote:
| just an observation :)
| motorest wrote:
| > He even admits Postgres maxxed out its 2 cores, but Redis was
| bottlenecked by the HTTP server.
|
| What exactly is your point? That you can further optimize
| either option? Well yes, that comes as no surprise. I mean, the
| latencies alone are in the range of some transcontinental
| requests. Were you surprised that Redis outperformed Postgres?
| I hardly think so.
|
| So what's the problem?
|
| The main point that's proven is that there are indeed
| diminishing returns in terms of performance. For applications
| where you can afford an extra 20ms when hitting a cache,
| caching using a persistent database is an option. For some
| people, it seems this fact was very surprising. That's food for
| thought, isn't it?
| hvb2 wrote:
| I've done this many times in AWS leveraging dynamodb.
|
| Comes with ttl support (which isn't precise so you still need
| to check expiration on read), and can support long TTLs as
| there's essentially no limit to the storage.
|
| All of this at a fraction of the cost of HA redis. Only if
| you need that last millisecond of performance and have done
| all other optimizations should one consider redis, imho.
| motorest wrote:
| > I've done this many times in AWS leveraging dynamodb.
|
| Exactly. I think nosql offerings from any cloud provider
| already support both TTL and conditional requests
| out-of-the-box, and the performance of basic key-value
| CRUD operations is often <10ms.
|
| I've seen some benchmarks advertise memory cache services
| as having latencies around 1ms. Yeah, this would mean the
| latency of a database is 10 times higher. But relative
| numbers mean little on their own. What matters is absolute numbers,
| as they are the ones that drive tradeoff analysis. Does a
| feature afford an extra 10ms in latency, and is that
| performance improvement worth paying a premium?
| re-thc wrote:
| > All of this at a fraction of the cost of HA redis
|
| This depends on your scale. Dynamodb is pay per request and
| the scaling isn't as smooth. At certain scales Redis is
| cheaper.
|
| Then if you don't have high demand maybe it's ok without HA
| for Redis and it can still be cheaper.
| motorest wrote:
| > At certain scales Redis is cheaper.
|
| Can you specify in which scenario you think Redis is
| cheaper than caching things in, say, dynamodb?
| odie5533 wrote:
| High read/write and low-ish size. Also it's faster.
| motorest wrote:
| > High read/write and low-ish size. Also it's faster
|
| You posted a vague and meaningless assertion. If you do
| not have latency numbers and cost differences, you have
| absolutely nothing to show for it, and you failed to provide
| any rationale that justified even whether any cache is
| required at all.
| odie5533 wrote:
| At 10k RPS you'll see a significant cost savings with
| Redis over DynamoDB.
|
| ElastiCache Serverless (Redis/Memcached): Typical latency
| is 300-500 microseconds (sub-millisecond response)
|
| DynamoDB On-Demand: Typical latency is single-digit
| milliseconds (usually between 1-10 milliseconds for
| standard requests)
| motorest wrote:
| > At 10k RPS you'll see a significant cost savings with
| Redis over DynamoDB.
|
| You need to be more specific than that. Depending on your
| read/write patterns and how much memory you need to
| allocate to Redis, back of the napkin calculations still
| point to the fact that Redis can still cost >$1k/month
| more than DynamoDB.
|
| Did you actually do the math on what it costs to run
| Redis?
| hvb2 wrote:
| > At 10k RPS
|
| You would've used local memory first. At which point I
| cannot see getting to those request levels anymore
|
| > ElastiCache Serverless (Redis/Memcached): Typical
| latency is 300-500 microseconds (sub-millisecond
| response)
|
| Sure
|
| > DynamoDB On-Demand: Typical latency is single-digit
| milliseconds (usually between 1-10 milliseconds for
| standard requests)
|
| I know very few use cases where that difference is
| meaningful. Unless you have to do this many times
| sequentially in which case optimizing that would be much
| more interesting than a single read being .5 ms versus
| the typical 3 to 4 for dynamo (that last number is based
| on experience)
| hvb2 wrote:
| You would need to get to insane read counts pretty much
| 24/7 for this to work out.
|
| For HA redis you need at least 6 instances, 2 regions * 3
| AZs. And you're paying for all of that 24/7.
|
| And if you truly have 24/7 use then just 2 regions won't
| make sense as the latency to get to those regions from
| the other side of the globe easily removes any caching
| benefit.
| ahoka wrote:
| A 6 node cache and caching in DynamoDB, what the hell
| happened to the industry? Or do people just call every
| kind of non-business-object persistence a cache now?
| hvb2 wrote:
| I don't understand your comment.
|
| If you're given the requirement of highly available, how
| do you not end up with at least 3 nodes? I wouldn't
| consider a single region to be HA but I could see that
| argument as being paranoid.
|
| A cache is just a store for things that expire after a
| while and take load off your persistent store. It's
| inherently eventually consistent and supposed to help you
| scale reads. Whatever you use for storage is irrelevant
| to the concept of offloading reads
| odie5533 wrote:
| It's $9/mo for 100 MB of ElastiCache Serverless which is
| HA.
|
| It's $15/mo for 2x cache.t4g.micro nodes for ElastiCache
| Valkey with multi-az HA and a 1-year commitment. This
| gives you about 400 MB.
|
| It very much depends on your use case, though - if you
| need multiple regions then I think DynamoDB might be
| better.
|
| I prefer Redis over DynamoDB usually because it's a
| widely supported standard.
| motorest wrote:
| > It's $9/mo for 100 MB of ElastiCache Serverless which
| is HA.
|
| You need to be more specific with your scenario. Having
| to cache 100MB of anything is hardly a scenario that
| involves introducing a memory cache service such as
| Redis. This is well within the territory of just storing
| data in a dictionary. Whatever is driving the requirement
| for Redis in your scenario, performance and memory
| clearly isn't it.
| lelanthran wrote:
| I'm not seeing your point. This wouldn't get an F, purely
| because all the parameters are documented.
|
| Conclusions aren't incorrect either, so what's the problem?
| m000 wrote:
| The use case is not representative of a real-life scenario,
| so the value of the presented results is minimal.
|
| A takeaway could be that you can dedicate a postgres instance
| for caching and have acceptable results. But who does that?
| Even for a relatively simple intranet app, your #1 cost when
| deploying in Google Cloud would probably be running Postgres.
| Redis OTOH is dirt cheap.
| lelanthran wrote:
| > The use case is not representative of a real-life
| scenario, so the value of the presented results is
| minimal.
|
| Maybe I'm reading the article wrong, but it is
| representative of any application that uses a PostgreSQL
| server for data, correct?
|
| In what way is that not a real-life scenario? I've deployed
| Single monolith + PostgreSQL to about 8 different clients
| in the last 2.5 years. It's my largest source of income.
| m000 wrote:
| When you run a relational database, you typically do it
| for the joins, aggregations, subqueries, etc. So a real-
| life scenario would include some application actually
| putting some stress on postgres.
|
| If you don't mind overprovisioning your postgres, yes I
| guess the presented benchmarks are kind of
| representative. But they also don't tell you anything
| you didn't already know before reading the article.
| lelanthran wrote:
| > If you don't mind overprovisioning your postgres
|
| Why would I mind it? I'm not using overpriced hosted
| PostgreSQL, after all.
| sherburt3 wrote:
| My stance has always been stick to 1 database for as long
| as humanly possible because having 2 databases is 1000x
| harder.
| Implicated wrote:
| > I've deployed Single monolith + PostgreSQL to about 8
| different clients in the last 2.5 years. It's my largest
| source of income.
|
| And... do you do that with the default configuration?
| lelanthran wrote:
| > And... do you do that with the default configuration?
|
| Yes. Internal apps/LoB apps for a large company might
| have, at most, 5k users. PostgreSQL seems to manage it
| fine, none of my metrics are showing high latencies even
| when all employees log on in the morning during the same
| 30m period.
| Implicated wrote:
| I'm definitely getting the wrong kind of clients.
|
| Kudos to you sir. Sincerely, I'm not hating, I'm actually
| jealous of the environment being that mellow.
| ENGNR wrote:
| There's too many hackers on hacker news!
| zer00eyz wrote:
| You might not have been here 25 years ago when the dot com
| bubble burst.
|
| A lot of us ate shit to stay in the Bay Area, to stay in
| computing. I have stories of great engineers doing really
| crappy jobs and "contracting" on the side.
|
| I couldn't really have a 'startup' out of my house and a slice
| of rented hosting. Hardware was expensive and nothing was easy.
| Today I can set up a business and thrive on 1000 users at 10
| bucks a month. That's a viable, easy-to-build business. It's
| an achievable metric.
|
| But I'm not going to let amazon - with its infinite
| "bill you for everything at 2012 prices so our hosting
| can be profitable" model - be my first choice. I'm not
| going to do that when I can get fixed-cost hosting.
|
| For me, all the interesting things going on in tech aren't
| coming out of FB, Google and hyperscalers. They aren't AI or
| ML. We don't need another Kubernetes or Kafka or react (no more
| Conways law projects). There is more interesting work going on
| down at the bottom. In small 2 and 3 man shops solving their
| problems on limited time and budget with creative "next step"
| solutions. Their work is likely more applicable to most people
| reading HN than another well-written engineering blog from
| Cloudflare about their latest massive rust project.
| positron26 wrote:
| A lot of great benchmarking probably dies inside internal
| tuning. When we're lucky, we get a blog post, but if the
| creator isn't incentivized or is even discouraged by an
| employer from sharing the results, it will never see the light
| of day.
| oulipo2 wrote:
| The main point was not to fully benchmark and compare both, but
| just to get a rough sense of whether a Postgres cache was fast
| enough to be useful in practice. The comparison with Redis was
| more a crutch to get a sense of that, than really something
| that pretends to be "rock-solid benchmarking"
| KronisLV wrote:
| I feel like the outrage is unwarranted.
|
| > The way it is presented, a casual reader would think Postgres
| is 2/3rds the performance of Redis.
|
| If a reader cares about the technical choice, they'll probably
| at least read enough to learn of the benchmarks in this popular
| use case, or even just the conclusion:
|
| > Redis is faster than postgres when it comes to caching,
| there's no doubt about it. It conveniently comes with a bunch
| of other useful functionality that one would expect from a
| cache, such as TTLs. It was also bottlenecked by the hardware,
| my service or a combination of both and could definitely show
| better numbers. Surely, we should all use Redis for our caching
| needs then, right? Well, I think I'll still use postgres.
| Almost always, my projects need a database. Not having to add
| another dependency comes with its own benefits. If I need my
| keys to expire, I'll add a column for it, and a cron job to
| remove those keys from the table. As far as speed goes - 7425
| requests per second is still a lot. That's more than half a
| billion requests per day. All on hardware that's 10 years old
| and using laptop CPUs. Not many projects will reach this scale
| and if they do I can just upgrade the postgres instance or if
| need be spin up a redis then. Having an interface for your
| cache so you can easily switch out the underlying store is
| definitely something I'll keep doing exactly for this purpose.
|
| I might take an issue with the first sentence (might add "...at
| least when it comes to my hardware and configuration."), but
| the rest seems largely okay.
|
| As a casual reader, you more or less just get:
|
| * Oh hey, someone's experience and data points. I won't
| base my entire opinion upon it, but it's cool that people
| are sharing their experiences.
| * If I wanted to use either, I'd probably also need to
| look into bottlenecks, even the HTTP server, something
| you might not look into at first!
| * Even without putting in a lot of work into tuning, both
| of the solutions process a lot of data and are within an
| order of magnitude when it comes to performance.
| * So as a casual reader, for casual use cases, it seems
| like the answer is - just pick whatever feels the
| easiest.
|
| If I wanted to read super serious benchmarks, I'd go looking
| for those (which would also have so many details that they
| would no longer be a casual read, short of just the abstract,
| but then I'm missing out on a lot anyway), or do them myself.
| This is more like your average pop-sci article, nothing wrong
| with that, unless you're looking for something else.
|
| Eliminating the bottlenecks would be a cool followup post
| though!
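The "expiry column plus a cron job" approach quoted above can be sketched in a few statements. Table and key names here are illustrative; the article's actual schema isn't shown:

```sql
-- a minimal cache table with per-key expiry
CREATE UNLOGGED TABLE cache (
    key        text PRIMARY KEY,
    value      jsonb NOT NULL,
    expires_at timestamptz NOT NULL
);

-- upsert a key with a TTL
INSERT INTO cache (key, value, expires_at)
VALUES ('user:42', '"alice"', now() + interval '60 seconds')
ON CONFLICT (key) DO UPDATE
    SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at;

-- read, ignoring expired rows
SELECT value FROM cache
WHERE key = 'user:42' AND expires_at > now();

-- the cron job's cleanup statement
DELETE FROM cache WHERE expires_at <= now();
```

UNLOGGED skips WAL writes, which suits cache data you're willing to lose - roughly analogous to running Redis with persistence disabled.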
| lomase wrote:
| This site is called Hackernews btw.
| ezekiel68 wrote:
| > Both postgres and redis are used with the out of the box
| settings
|
| Ugh. I know this gives the illusion of fairness, but it's not how
| any self-respecting software engineer should approach benchmarks.
| You have hardware. Perhaps you have virtualized hardware. You
| tune to the hardware. There simply isn't another way, if you want
| to be taken seriously.
|
| Some will say that in a container-orchestrated environment,
| tuning goes out the window since "you never know" where the
| orchestrator will schedule the service but this is bogus. If
| you've got time to write a basic deployment config for the
| service on the orchestrator, you've also got time to at least
| size the memory usage configs for PostgreSQL and/or Redis. It's
| just that simple.
|
| This is the kind of thing that is "hard and tedious" for only
| about five minutes of LLM query or web search time and then you
| don't need to revisit it again (unless you decide to change the
| orchestrator deployment config to give the service more/less
| resources). It doesn't invite controversy to right-size your
| persistence services, especially if you are going to publish the
| results.
| wewewedxfgdf wrote:
| Fully agree.
|
| Postgres is a power tool usable for many many use cases - if
| you want performance it must be tuned.
|
| If you judge Postgres without tuning it - that's not Postgres
| being slow, that's the developer being naive.
| gopalv wrote:
| > If you judge Postgres without tuning it - that's not
| Postgres being slow, that's the developer being naive.
|
| Didn't OP end by picking Postgres anyway?
|
| It's the right answer even for a naive developer, perhaps
| even more so for a naive one.
|
| At the end of the post it even says
|
| >> Having an interface for your cache so you can easily
| switch out the underlying store is definitely something I'll
| keep doing
| lelanthran wrote:
| He concluded postgresql to be fast enough, so what's the
| problem?
|
| IOW, he judged it fast enough.
| vidarh wrote:
| On one hand I agree with you, but on the other hand defaults
| matter because I regularly see systems with the default config
| and no attempt to tune.
|
| Benchmarking the defaults and benchmarking a tuned setup will
| measure very different things, but both of them matter.
| matt-p wrote:
| IME very very few people tune the underlying host. Orgs like
| uber, google or whatever do but outside of that few people
| know what they're really doing/care that much. Easier to
| "increase EC2 size" or whatever.
| kijin wrote:
| Defaults have all sorts of assumptions built into them. So if
| you compare different programs with their respective
| defaults, you are actually comparing the assumptions that the
| developers of those programs have in mind.
|
| For example, if you keep adding data to a Redis server under
| default config, it will eat up all of your RAM and suddenly
| stop working. Postgres won't do the same, because its default
| buffer size is quite small by modern standards. It will
| happily accept INSERTs until you run out of disk, albeit more
| slowly as your index size grows.
|
| The two programs behave differently because Redis was
| conceived as an in-memory database with optional persistence,
| whereas Postgres puts persistence first. When you use either
| of them with their default config, you are trusting that the
| developers' assumptions will match your expectations. If not,
| you're in for a nasty surprise.
| vidarh wrote:
| Yes, all of this is fine but none of it address my point:
|
| Enough people use the default settings that benchmarking
| the default settings is very relevant.
|
| It often isn't a good thing to rely on the defaults, but
| it's nevertheless the case that many do.
|
| (Yes, it is _also_ relevant to benchmark tuned versions, as
| I also pointed out, my argument was against the claim that
| it is somehow unfair not to tune)
| high_na_euv wrote:
| Disagree, majority of software is running on defaults, it makes
| sense to compare them this way
| IanCal wrote:
| I disagree. They found that Postgres, without tuning, was
| easily fast enough on low level hardware and would come with
| the benefit of not deploying another service. Additional
| tuning isn't really relevant.
|
| If the defaults are fine for a use case then unless I want to
| tune it for personal interest it's either a poor use of my fun
| time or a poor use of my clients funds.
| lemagedurage wrote:
| "If we don't need performance, we don't need caches" feels
| like a great broader takeaway here.
| motorest wrote:
| > "If we don't need performance, we don't need caches"
| feels like a great broader takeaway here.
|
| I don't think this holds true. Caches are used for reasons
| other than performance. For example, caches are used in
| some scenarios for stampede protection to mitigate DoS
| attacks.
|
| Also, the impact of caches on performance is sometimes
| negative. With distributed caching, each match and put
| require a network request. Even when those calls don't
| leave a data center, they do cost far more than just
| reading a variable from memory. I already had the
| displeasure of stumbling upon a few scenarios where cache
| was prescribed in a cargo cult way and without any data
| backing up the assertion, and when we took a look at traces
| it was evident that the bottleneck was actually the cache
| itself.
| ralegh wrote:
| DoS is a performance problem: if your server were
| infinitely fast with infinite storage, it wouldn't be an
| issue.
| lomase wrote:
| If my grandma had wheels she would be a car.
| indymike wrote:
| It is actually a financial problem too. Servers stop
| working when the bill goes unpaid. Sad but true.
| motorest wrote:
| > DoS is a performance problem
|
| Not really. Running out of computational resources to
| fulfill requests is not a performance issue. Think of
| thinks such as exhausting a connection pool. More often
| than not, some components of a system can't scale
| horizontally.
| IanCal wrote:
| A cache being fast enough doesn't mean no caching is
| relevant - I'm not sure why you'd equate the two.
| hobs wrote:
| I see people downvoting this. Anyone who disagrees with
| this, we have YAGNI for a reason - if someone said to me my
| performance was fine and they added caches, I would look at
| them with a big hairy eyeball because we already know cache
| invalidation is a PITA, that correctness issues are easy to
| create, and now you have the performance of two different
| systems to manage.
|
| Amazon actually moved away from caches for some parts of
| its system because consistent behavior is a feature,
| because what happens if your cache has problems and the
| interaction between that and your normal thing is slow?
| What if your cache has some bugs or edge case behavior? If
| you don't need it you are just doing a bunch of extra work
| to make sure things are in sync.
| indymike wrote:
| Sometimes, a cache is all about reducing expense, i.e. a
| free cache query vs an expensive API query.
| amluto wrote:
| Sometimes people host software on a server they own or
| rent, the server is plenty fast, and it costs literally
| nothing to issue those queries at the scale on which
| they're needed.
| indymike wrote:
| Yes, that is true, but the original poster said getting
| rid of caches was always a good idea, when in reality the
| answer (as usual with engineering) is "it depends."
| perrygeo wrote:
| The default shared_buffers is 128MB, not even 1% of typical
| machines today. A benchmark run with these settings is
| effectively crippling your hardware by making sure 99% of
| your available memory is ignored by postgres. It's an invalid
| benchmark, unless redis is similarly crippled.
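For a sense of scale, un-crippling a dedicated box is mostly a few lines of postgresql.conf. The values below are illustrative for a machine with 16 GB of RAM, not recommendations:

```
# postgresql.conf - defaults vs. sized-for-the-hardware
shared_buffers = 4GB          # default is only 128MB
effective_cache_size = 12GB   # planner hint, includes OS page cache
work_mem = 64MB               # per sort/hash node, per backend
maintenance_work_mem = 1GB    # index builds, VACUUM
```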
| igneo676 wrote:
| > If the defaults are fine for a use case then unless I
| want to tune it for personal interest it's either a poor
| use of my fun time or a poor use of my clients funds.
|
| It doesn't matter if you've crippled the benchmark if the
| performance of both options still exceeds your
| expectations. Not all of us are trying to eke out every drop
| of performance
|
| And, well, if you are then you can ignore the entire post
| because Redis offers better perf than postgres and you'd
| use that. It's that simple.
| re-thc wrote:
| > They found that Postgres, without tuning, was easily fast
| enough on low level hardware
|
| Is that production? When you basket it into "low level" it
| sounds like a base case but it really isn't.
|
| In production you don't have local storage, RAM being used
| for all kinds of other things, your CPU only available in
| small slices, network effects and many others.
|
| > If the defaults are fine for a use case
|
| Which I hope isn't the developer's edition of it works on my
| machine.
| Timshel wrote:
| > for only about five minutes of LLM query or web search
|
| I think I have more trust in the PG defaults than in the output
| of a LLM or copy pasting some configuration I might not really
| understand ...
| rollcat wrote:
| It's crazy how wildly inaccurate "top-of-the-list" LLMs are
| for straightforward yet slightly nuanced inquiries.
|
| I've asked ChatGPT to summarize Go build constraints,
| especially in the context of CPU microarchitectures (e.g.
| mapping "amd64.v2" to GOARCH=amd64 GOAMD64=v2). It repeatedly
| smashed its head on GORISCV64, claiming all sorts of nonsense
| such as v1, v2; then G, IMAFD, Zicsr; only arriving at
| rva20u64 et al under hand-holding. Similar nonsense for
| GOARM64 and GOWASM. It was all right there in e.g. the docs
| for [cmd/go].
|
| This is the future of computer engineering. Brace yourselves.
| yomismoaqui wrote:
| If you are going to ask ChatGPT some specific tidbit it's
| better to force it to search on the web.
|
| Remember, an LLM is a JPG of all the text of the internet.
| dgfitz wrote:
| Wait, what?
|
| Isn't that the whole point, to ask it specific tidbits of
| information? Are we to ask it large, generic
| pontifications and claim success when we get large,
| generic pontifications back?
|
| The narrative around these things changes weekly.
| wredcoll wrote:
| I mean, like most tools they work when they work and
| don't when they fail. Sometimes I can use an llm to find
| a specific datum and sometimes I use google and sometimes
| I use bing.
|
| You might think of it as a cache, worth checking first
| for speed reasons.
|
| The big downside is not that they sometimes fail, it's
| that they give zero indication when they do.
| simonw wrote:
| ChatGPT is exceptionally good at using search now, but
| that's new this year, as of o3 and then GPT-5. I didn't
| trust GPT-4o and earlier to use the search tool well
| enough to be useful.
|
| You can see if it's used search in the interface, which
| helps evaluate how likely it is to get the right answer.
| dpkirchner wrote:
| I use it as a tool that understands natural language and
| the context of the environments in work in well enough to
| get by, while guiding it to use search or just facts I
| know if I want more one-shot accuracy. Just like I would
| if I were communicating with a newbie who has their own
| preconceived notions.
| simonw wrote:
| Did you try pasting in the docs for cmd/go and asking
| again?
| Implicated wrote:
| I mean - this is the entire problem right here.
|
| Don't ask LLMs that are trained on a whole bunch of
| different versions of things with different flags and
| options and parameters where a bunch of people who have
| no idea what they're doing have asked and answered
| stackoverflow questions that are likely out of date or
| wrong in the first place how to do things with that thing
| without providing the docs for the version you're working
| with. _Especially_ if it's the newest version, regardless
| of whether its cutoff date was after that version was released -
| you have no way to know if it was _included_. (Especially
| about something related to a programming language with
| ~2% market share)
|
| The contexts are so big now - feed it the docs. Just copy
| paste the whole damn thing into it when you prompt it.
| pbronez wrote:
| How was the LLM accessing the docs? I'm not sure what the
| best pattern is for this.
|
| You can put the relevant docs in your prompt, add them to a
| workspace/project, deploy a docs-focused MCP server, or
| even fine-tune a model for a specific tool or ecosystem.
| Implicated wrote:
| > I'm not sure what the best pattern is for this.
|
| > You can put the relevant docs in your prompt
|
| I've done a lot of experimenting with these various
| options for how to get the LLM to reference docs. IMO
| it's almost always best to include in prompt where
| appropriate.
|
| For a UI lib that I use that's rather new, specifically
| there's a new version that the LLMs aren't aware of yet,
| I had the LLM write me a quick python script that just
| crawls the docs site for the lib and feeds the entire
| page content back into itself with a prompt describing
| what it's supposed to do (basically telling it to
| generate a .md document with the specifics about that
| _thing_ , whether it's a component or whatever, ie:
| properties, variants, etc in an extremely brief manner)
| as well as build an 'index.md' that includes a short
| paragraph about what the library is and a list of each
| component/page document that is generated. So in about 60
| seconds it spits out a directory full of .md files and I
| then tell my project-specific LLM (ie: Claude Code or
| Opencode within the project) to review those files with
| the intention of updating the CLAUDE.md in the project to
| instruct that any time we're building UI elements we
| should refer to the index.md for the library to
| understand what components are available and when
| appropriate to use one of them we _must_ review the
| correlating document first.
|
| Works very very very well. Much better than an MCP server
| specifically built for that same lib (huge waste of
| tokens, the LLM doesn't always use it, etc). Well enough
| that I just copy/paste this directory of docs into my
| active projects using that library - if I wasn't lazy I'd
| package it up, but I'm too busy building stuff.
| simonw wrote:
| So run the LLM in an agent loop: give it a benchmarking tool,
| let it edit the configuration and tell it to tweak the
| settings, measure, and see how much of a performance
| improvement it can get.
|
| That's what you'd do by hand if you were optimizing, so save
| some time and point Claude Code or Codex CLI or GitHub
| Copilot at it and see what happens.
| lomase wrote:
| How much would that cost?
| carlhjerpe wrote:
| They charge per token...
| danielbln wrote:
| Not if you're on a subscription they don't.
| carlhjerpe wrote:
| So the hidden usage caps don't equate to token usage?
|
| They charge per token, everyone charges per token.
| simonw wrote:
| Probably about 10 cents, if you're even paying for
| tokens. Plenty of these tools have generous free tiers or
| allowances included in your subscription.
|
| I run a pricing calculator here - for 50,000 input
| tokens, 5,000 output tokens (which I estimate would be
| about right for a PostgreSQL optimization loop) GPT-5
| would cost 11.25 cents: https://www.llm-
| prices.com/#it=50000&ot=5000&ic=1.25&oc=10
|
| I use Codex CLI with my $20/month ChatGPT account and so
| far I've not hit the limit with it despite running things
| like this multiple times a day.
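| The arithmetic behind that 11.25-cent figure is just token
| counts times per-million prices. A quick sketch in Go, using
| the GPT-5 prices quoted above ($1.25/M input, $10/M output):

```go
package main

import "fmt"

// costUSD estimates an API bill from token counts and
// per-million-token prices in USD.
func costUSD(inTok, outTok, inPerM, outPerM float64) float64 {
	return inTok/1e6*inPerM + outTok/1e6*outPerM
}

func main() {
	// 50,000 input and 5,000 output tokens, as estimated above.
	c := costUSD(50000, 5000, 1.25, 10)
	fmt.Printf("$%.4f (%.2f cents)\n", c, c*100) // $0.1125 (11.25 cents)
}
```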
| lomase wrote:
| If optimizing a PostgreSQL server costs 11.25 cents and
| everybody can do it because of AI, how much are you going
| to bill your customer? 20 cents?
|
| If that is true, in a few months there will be no DBA jobs.
|
| Funny that at the same time SQL is one of the most
| requested languages in job postings.
| simonw wrote:
| Knowing what "optimizing a PostgreSQL server's
| configuration" even means continues to be high value
| technical knowledge.
|
| Knowing how to "run an agentic loop to optimize the
| config file" is meaningless techno-jabber to 99.99% of
| the world's population.
|
| I am entirely unconcerned for my future career prospects.
| lomase wrote:
| So your big advantage is that nobody has launched agentic
| tools for the end user yet?
| simonw wrote:
| Anyone can learn to unblock a sink by watching YouTube
| videos these days, and yet most people still hire a
| professional to do it for them.
|
| I don't think end users want to "optimize their
| PostgreSQL servers" even if they DID know that's a thing
| they can do. They want to hire experts who know how to
| make "that tech stuff" work.
| lomase wrote:
| I agree that people like to hire professionals. That is
| why I hire DB experts to work on our infra, not prompt
| engineers.
|
| Saying that anybody can learn to unblock a sink by
| watching YouTube is your typical HN mentality of stating
| opinions as facts.
| cdelsolar wrote:
| you can become a db expert with the right prompts
| lomase wrote:
| You can learn how to pour a drink in 1 minute; that is
| why most bartenders earn minimum wage.
|
| You can't become a DB expert with a prompt.
|
| I hope you make a lot of money with your lies and good
| luck.
| simonw wrote:
| You can become a DB expert by reading books, forums and
| practicing hard.
|
| These days you can replace those books and forums with a
| top tier LLM, but you still need to put in the practice
| yourself. Even with AI assistance that's still a _lot_ of
| work.
| lomase wrote:
| You could not replace good books with the Internet and
| you can't replace good books with any LLM.
|
| You can replace books with your own time and research.
|
| Again, making statements that are just not true. Typical
| HN behavior.
| simonw wrote:
| "Saying that anybody can learn to unblock a sink by
| watching YouTube is your typical HN mentality of stating
| opinions as facts."
|
| I don't understand what you mean. Are you saying that
| it's not true that anyone could learn to unblock a sink
| by watching YouTube videos?
| lomase wrote:
| Yes, I do think not all people could fix it with YouTube.
| My grandma couldn't, for example. I had a neighbor come
| for help with something like that too.
|
| It's not that hard to understand, mate. Maybe put my
| comment in the LLM so you can get it.
|
| What is your point again?
| simonw wrote:
| My analogy holds up. Anyone could type "optimize my
| PostgreSQL database by editing the configuration file"
| into an LLM, but most people won't - same as most people
| won't watch YouTube to figure out how to unblock a sink.
|
| If you don't like the sink analogy what analogy would you
| use instead for this? I'm confident there's a "people
| could learn X from YouTube but choose to pay someone else
| instead" that's more effective than the sink one.
| simonw wrote:
| Personally I'd like to hire a DB expert who also knows
| how to drive an agentic coding system to help them
| accelerate their work. AI tools, used correctly, act as
| an amplifier of existing knowledge and experience.
| lomase wrote:
| As far as I know nobody has really come up with proof
| that LLMs act as an amplifier of existing knowledge.
|
| It does make people FEEL more productive.
| simonw wrote:
| What would a "proof" of that even look like?
|
| There are thousands (probably millions) of us walking
| around with anecdotal personal evidence at this point.
| lomase wrote:
| Some years ago everybody here gave their anecdotal
| evidence about how Bitcoin and Blockchain were the future
| and how they used them every day. You were a fool if you
| did not jump on the bandwagon.
|
| If the personal opinions on this site were true, half of
| the code in the world would be functional, Lisp would be
| one of the most-used languages and Microsoft would not
| have bought Dropbox.
|
| I really think HN hive-mind opinions mean nothing. Too
| much money here to be real.
| IgorPartola wrote:
| "We will take all the strokes off Jerry's game when we kill
| him." - the LLM, probably.
|
| Just like Mr Meeseeks, it's only a matter of time before it
| realizes that deleting all the data will make the DB
| lightning fast.
| simonw wrote:
| Exactly true, which is why you need to run your agent
| against a safe environment. That's a skill that's worth
| developing.
| Implicated wrote:
| > copy pasting some configuration I might not really
| understand
|
| Uh, yea... why _would_ you? Do you do that for configurations
| you found that weren't from LLMs? I didn't think so.
|
| I see takes like this all the time and I'm really just mind-
| boggled by it.
|
| There are more than just the "prompt it and use what it gives
| me" use cases with the LLMs. You don't have to be that rigid.
| They're incredible learning and teaching tools. I'd argue
| that the single best use case for these things is as a
| research and learning tool for those who are curious.
|
| Quite often I will query Claude about things I don't know
| and it will tell me things. Then I will dig deeper into
| those things myself. Then I will query further. Then I
| will ask it details where I'm curious. I won't blindly
| follow or trust it, just as I wouldn't a professor or
| anyone or any _thing_ else, for that matter. Just as I
| would when querying a human or the internet in general
| for information, I'll verify.
|
| You don't have to trust its code, or its configurations.
| But you can sure learn a lot from them, particularly when you
| know how to ask the right questions. Which, hold onto your
| chairs, only takes some experience and language skills.
| Timshel wrote:
| My comment is mainly in opposition to the "five minutes"
| part from parent.
|
| If you have 5 minutes then you can't, as you say:
|
| > Then I will dig deeper into those things myself ...
|
| So my point is I don't care if it's coming from an LLM or
| a random blog: you won't have time to know if it's really
| working (ideally you would want to benchmark the change).
|
| If you can't invest the time, it's better to stay with
| the defaults, which in most projects the maintainers have
| spent quite a bit of time making sensible.
| Implicated wrote:
| Yea, I guess in that case I'd say it's likely a bad move
| in every direction if you're constrained to 5 min to
| deploy something you don't understand.
| dotancohen wrote:
| Then either have the LLM explain the config, or go Google it.
| LLM output is a starting point, not your final config.
| oulipo2 wrote:
| Perhaps, but in this case this shows at least that even non-
| tuned Postgres can be used as a fast cache for many real-world
| use-cases
| conradfr wrote:
| But why doesn't Postgres tune itself based on the system it
| is running on, at least the basics based on available RAM &
| cores?
| simonw wrote:
| I've not tried it myself but I believe that's what pgtune
| does: https://github.com/gregs1104/pgtune
| otikik wrote:
| > Ugh.
|
| > if you want to be taken seriously
|
| For someone so enthusiastic about giving feedback you don't
| seem to have invested a lot of effort into figuring out how
| to give it effectively. Your tone and demeanor diminish the
| value of your comment.
| GuinansEyebrows wrote:
| > This is the kind of thing that is "hard and tedious" for only
| about five minutes of LLM query or web search time
|
| not even! if you don't need to go super deep with tablespace
| configs or advanced replication right away, pgtune will get you
| to a pretty good spot in the time it takes to fill out a form.
|
| https://pgtune.leopard.in.ua/
|
| https://github.com/le0pard/pgtune
| rich_sasha wrote:
| Isn't tuning before hitting constraints a premature
| optimisation? The approach of not spending time on tuning
| settings before you have to seems sane.
|
| And TFA shows you that in this world Postgres is close enough
| to Redis.
| aprdm wrote:
| Yep. I worked at a famous big company that had a 15-year-
| old service that was dog-slow; systemd restarts would take
| multiple hours.
|
| Everyone was talking about C++ optimizations, mutex everywhere
| etc - which was in fact a problem.
|
| However.. I seemed to be the first person to actually try to
| debug what the database was doing, and it was going to disk all
| the time with a very small cache.. weird..
|
| I saw the MySQL settings on a machine with 1TB of RAM and
| they were... out-of-the-box settings.
|
| With small adjustments I improved the performance of this core
| system an order of magnitude.
| icedchai wrote:
| At one startup, all I did was increase the innodb buffer pool
| size. They were using default settings.
| wewewedxfgdf wrote:
| Given there is nothing at all said about the many config options
| that would contribute to Postgres for this use case, we must
| assume no configuration has been done.
|
| Also, no discussion of indexes or what the data looks like, so we
| must assume no attention has been paid to their critical factors
| either.
|
| So, another case of lies, damned lies and benchmarks.
|
| It seems strange to me that people are so willing to post
| such definitive and poorly researched/argued things - if
| you're going to take a public position, don't you want to be
| obviously right instead of so easy to discount?
| didip wrote:
| Is this engagement bait? Sigh... fine, I'll bite:
|
| 1. The use-case is super specific to homelab where consistency
| doesn't matter. You didn't show us the Redis persistence setup.
| What is the persistence/durability setting? I bet you'd lose data
| the one day you forget and flip the breaker of your homelab.
|
| 2. What happened when data is bigger than your 8GB of RAM on
| Redis?
|
| 3. You didn't show us the PG config as well, it is possible to
| just use all of your RAM as buffer and caching.
|
| 4. Postgres has a lot of processes and you give it only 2 CPU?
| Vanilla Redis is single core so this race is rigged to begin
| with. The UNLOGGED table evens things out a bit.
|
| In general, what are you trying to achieve with this "benchmark"?
| What outcome would you like to learn? Because this "benchmark"
| will not tell you what you need to know in a production
| environment.
|
| Side note for other HN readers: UNLOGGED tables are actually
| a very nifty trick for speeding up unit tests. Just ALTER the
| tables to UNLOGGED inside the PG instance dedicated to CI/CD:
| ALTER TABLE my_test_table SET UNLOGGED;
| dizzyVik wrote:
| 1. No persistence for redis. 2. Redis would get OOM killed. 3.
| The default config coming with the image was used. 4. Yes, I
| gave it 2 cpus.
|
| I wanted to compare how my HTTP server would behave if I
| used postgres for caching and what the difference would be
| if I used redis instead.
|
| This benchmark is only here to drive home the point that
| sometimes you might not even need a dedicated KV store.
| Maybe using postgres for this is good enough for your use
| case.
|
| The term production environment might mean many things. Perhaps
| you're processing hundreds of thousands of requests per second
| then you'll definitely need a different architecture with HA,
| scaling, dedicated shared caches etc. However, not many
| applications reach such a point and often end up using more
| than necessary to serve their consumers.
|
| So I guess I'm just trying to say keep it simple.
| piniondna wrote:
| Redis is one of the simplest services I've used... we could
| flip the script and say "for many db use cases postgresdb is
| overkill, just use Redis... you get caching too". I'm not
| sure exactly what this commentary adds to a real world
| architecture discussion. The whole thing seems a little
| sophomoric, tbh.
| evanelias wrote:
| > No persistence for redis
|
| In this case, I would expect that a fairer comparison would
| be running Postgres on tmpfs. UNLOGGED only skips WAL writes,
| not _all_ writes; if you do a clean shutdown, your data is
| still there. It's only lost on crash.
| staplung wrote:
| I guess my question is: why bother with a benchmark if the pick
| is pre-ordained? Is it the case that at some point the results
| would be so lopsided that you _would_ pick the faster solution?
| If so, what is that threshold? I.e. when does performance trump
| system simplicity? To me _those_ are the interesting questions.
| 000ooo000 wrote:
| >why bother with a benchmark if the pick is pre-ordained
|
| Validating assumptions
|
| Curiosity/learning
|
| Enraging a bunch of HN readers who were apparently born with
| deep knowledge of PG and Redis tuning
| chiefmix wrote:
| love that the author finished with:
|
| "i do not care. i am not adding another dependency"
| mj2718 wrote:
| My laptop running OOTB Postgres in docker does 100k requests per
| second with ~5ms latency. How are you getting 20x slower?
| nasretdinov wrote:
| That's what I'm struggling with too. Redis can also serve
| roughly 500k-1m QPS using just ~4-8 cores, so on two cores it
| should be about 100k-200k at least
| mj2718 wrote:
| Yep... this is what I expect as a baseline.
|
| This is also why I rarely use redis - Postgres at 100k TPS is
| perfectly fine for all my use cases, including high usage
| apps.
| Koffiepoeder wrote:
| If you look at their lab [0], it seems his NAS is separate from
| his kubernetes nodes. If he hasn't tuned his networking and NAS
| to the maximum, network storage may in fact add a LOT of delay
| on IOPS. Could be the difference between fractions of a
| millisecond vs actual milliseconds. If your DB load is mostly
| random reads this can really harm performance. Just
| hypothesizing here though, since it is not clear whether his DB
| storage is actually done on the NAS.
|
| [0]: https://dizzy.zone/2025/03/10/State-of-my-Homelab-2025/
| mj2718 wrote:
| Sorry but if he's using a setup that's 20x worse than a
| regular laptop then I'm not really interested in his setup.
|
| To be fair, I asked the question and you found the answer -
| lol, my bad.
|
| Yes, I agree using a NAS that adds latency would reduce
| the TPS and explain his results. "Little's law"
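| Little's law makes that ceiling concrete: throughput =
| requests in flight / average latency. A sketch with
| hypothetical numbers (50 concurrent requests is an
| assumption, not a figure from the article):

```go
package main

import "fmt"

// Little's law: L = lambda * W, so the achievable throughput
// is lambda = L / W for L requests in flight at average
// latency W (in seconds).
func throughput(inFlight, latencySec float64) float64 {
	return inFlight / latencySec
}

func main() {
	const inFlight = 50 // hypothetical concurrency
	fmt.Printf("5ms latency:  %.0f req/s\n", throughput(inFlight, 0.005))
	fmt.Printf("35ms latency: %.0f req/s\n", throughput(inFlight, 0.035))
}
```

| At the same concurrency, adding 30ms of storage latency drops
| you from roughly 10k req/s to roughly 1.4k - which is the
| ballpark gap being discussed.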
| necovek wrote:
| I'd be curious to see how Postgres behaves with numeric IDs
| instead of strings (or another built-in, faster-to-index/hash
| type).
|
| Still leaves you needing to performantly hash your strings
| into those IDs on top, but I'm mostly curious about the
| Postgres performance plateau compared to purpose-built KV
| DBs.
| Neikius wrote:
| I skimmed the article, but why is everyone always going for
| distributed cache? What is wrong with in-memory cache? Lowest
| latency, fast, easy to implement.
|
| Yeah ok, you have 30 million entries? Sure.
|
| You need to sync something over multiple nodes? Not sure I would
| call that a cache.
| solatic wrote:
| In the naive/default case, durability is more important than
| latency. Servers crash, applications are restarted. If it takes
| a long time to rebuild your cache, or if rebuilding it would be
| unreliable (e.g. dependency on external APIs) then you court
| disaster by not using a durable-first cache.
|
| If you _actually_ need lower latency then great, design for it.
| But it should be a conscious decision, not a default one.
| ahoka wrote:
| Consistency could be one reason, but I think the best
| caching strategy is not to need a cache. Adding a cache
| (not just a distributed one) early can hide performance
| issues that could be fixed instead of worked around, while
| introducing complexity and maybe even data consistency
| issues and, paradoxically, performance degradation.
| lemagedurage wrote:
| This is modern backend development. The server scales
| horizontally by default, nodes can be removed and added without
| disrupting service. With redis as cache, we can do e.g. rate
| limiting fast without tying a connection to a node, but also
| scale and deploy without impacting availability.
| kiney wrote:
| because PHP scripts started fresh for each request...
| bearjaws wrote:
| A lot of people are using NodeJS and storing a massive in-
| memory object means longer GC time.
| jerf wrote:
| I've got a couple of systems that 10-15 years ago needed
| something like Redis and multiple nodes distributing them but
| are today just a single node with an in-memory cache that is
| really just a hash keyed by a string. They're running on a
| hot/cold spare system. If one of them dies, it takes maybe 30
| seconds to fully reconstruct the cache, which these systems
| happen to be capable of doing in advance - they don't need
| to wait for the requests to come in.
|
| One thing that I think has gotten lost in the "I need redundant
| redundancy for my redundantly redundant replicas of my
| redundantly-distributed resources" world is that you really
| only need all that for super-real-time systems. Which a lot of
| things are, such as, all user-facing websites need to be up the
| moment the user hits them and not 30 seconds later. But when
| you _don't_ have that constraint, if things can take an extra
| few minutes or drop some requests and it's not a big deal, you
| can get away with something a _lot_ cheaper, made even more
| cheap by the fact that running things on a single node gets you
| access to a lot of performance you simply can not have in a
| distributed system because _nothing_ is as fast as the RAM bus
| being accessed by a single OS process. And sometimes you have
| enough flexibility to design your system to be that way in the
| first place instead of accidentally wiring it up to be
| dependent on complicated redundancy schemes.
|
| (Next up after that, if that isn't enough, is the system where
| you have redundant nodes but you make sure they don't need to
| cross-talk at all with something like Redis. Observation: If
| you have two nodes for redundancy, and they are doing something
| with caching, and the cached values are generally stable for
| long periods of time, it is often not that big a deal just to
| let each node have its own in-memory cache and if they happen
| to recreate a value twice, let them. If you work the math out
| carefully, depending on your cache utilization profile you
| often are losing less than you think here (in particular, if
| the modal result is that you never hit a given cached value
| again, it's cheap especially if the ones you hit you end up
| hitting a lot, and if on average you get cached values all the
| time, the amortized cost of the second computation is nearly
| nothing, it's only in the "almost always hit them 2 or 3 times"
| case that this incurs extra expense and that's actually a very,
| very specific place in the caching landscape), especially since
| the in-process caching and such is faster on its own terms too
| which mitigates the problem, especially because you can set it
| up so you have no serialization costs in this case, and the
| architectural simplicity can be very beneficial. No, by no
| means does this work with every system, and it is helpful to
| scan out into the future to be sure you probably won't ever
| need to upgrade to a more complicated setup, but there's a lot
| of redundantly redundant systems that really don't need to be
| written with such complication because this would have been
| fine for them.)
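| The math in that parenthetical can be sketched with a toy
| model: two nodes with independent caches, requests for a
| given key landing uniformly at random. The formula and
| numbers here are a back-of-envelope assumption, not taken
| from the comment:

```go
package main

import (
	"fmt"
	"math"
)

// With 2 nodes and k total hits on a key spread uniformly at
// random, a shared cache computes the value once, while
// independent caches compute it once per node that sees the
// key: E[distinct nodes] = 2 * (1 - 2^-k). extraPerHit
// returns the expected extra computations amortized per hit.
func extraPerHit(k int) float64 {
	expectedDistinct := 2 * (1 - math.Pow(0.5, float64(k)))
	return (expectedDistinct - 1) / float64(k)
}

func main() {
	for _, k := range []int{1, 2, 3, 4, 10, 100} {
		fmt.Printf("k=%3d: %.3f extra computations per hit\n", k, extraPerHit(k))
	}
}
```

| Under this model the overhead is zero for keys never hit
| again, peaks at 0.25 per hit in the "hit 2 or 3 times" band,
| and amortizes away for hot keys - matching the landscape
| described above.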
| gethly wrote:
| What is all the fuss about? In the ancient times, you put
| your entry into the database (ANY database), either with a
| TTL or cache tags, and have a cron job that periodically
| purges expired entries from the table. Then in your
| application you check if you have the entry cached in
| memory; if not, you check the db entry; if it is not there,
| you build it from scratch and cache it.
|
| Why do people complicate things? We've solved caching ages ago.
| nessex wrote:
| I've got a similar setup with a k3s homelab and a bunch of small
| projects that need basic data storage and caching. One thing
| worth considering is that if someone wants to run both redis and
| postgres, they need to allocate enough memory for both including
| enough overhead that they don't suddenly OOM.
|
| In that sense, seeing if the latency impact of postgres is
| tolerable is pretty reasonable. You may be able to get away with
| postgres putting things on disk (yes, redis can too), and only
| paying the overhead cost of allocating sufficient excess RAM to
| one pod rather than two.
|
| But if making tradeoffs like that, for a low-traffic service in a
| small homelab, I do wonder if you even need a remote cache. It's
| always worth considering whether you can just have the web
| server keep its own cache in memory or even on disk. If
| using Go like in the article, you'd likely only need a map
| and a mutex. That'd be an
| order of magnitude faster, and be even less to manage... Of
| course it's not persistent, but then neither was Redis (excl.
| across web server restarts).
| phendrenad2 wrote:
| I'm glad someone did this analysis. I've also been tempted to
| remove complexity by removing Redis from my stack. But there's a
| decent speedup from using Redis, so I'll keep it.
|
| (And to the people complaining about this benchmark not being
| extremely scientifically rigorous: Nobody cares.)
| k9294 wrote:
| > For postgres, the bottleneck was the CPU on the postgres side.
| It consistently maxed out the 2 cores dedicated to it, while also
| using ~5000MiB of RAM.
|
| Comparing throttled pg vs non-throttled redis is not a benchmark.
|
| Of course when pg is throttled you will see bad results and high
| latencies.
|
| A correct performance benchmark would be to give all components
| unlimited resources and measure performance and how much they use
| without saturation. In this case, PG might use 3-4 CPUs and 8GB
| of RAM but have comparable latencies and throughput, which is the
| main idea behind the notion "pg for everything".
|
| In a real-world situation, when I see a problem with saturated
| CPU, I add one more CPU. For a service with 10k req/sec, it's
| most likely a negligible price.
| Timshel wrote:
| Since it's in the context of a homelab, you usually don't
| change your hardware for one application; using the same
| resources in both tests seems logical (you could argue that
| the test should be pg vs redis + pg).
|
| And their point is that it's good enough as is.
| m000 wrote:
| It's a homelab. If it works, it works. And we already knew
| that it would work without reading TFA. No new insights
| whatsoever. So what's the point of sharing or discussing?
| k9294 wrote:
| In a home lab you can go the other way around and compare the
| number of requests before saturation.
|
| e.g. 4k/sec saturates PG CPU to 95%, you get only 20% on
| redis at this point. Now you can compare latencies and
| throughput per $.
|
| In the article PG latencies are misleading.
| est wrote:
| reminds me of Handlersocket in MySQL
| truth_seeker wrote:
| PostgreSQL Tuning further for this use case
|
| - Reduce page size from 8KB to 4KB; great for write-heavy
| operations and indexed reads. Needs compiling from source
| with those flags; it can't be configured after installation.
|
| - Increase the buffer cache
|
| - Table partitioning for the UNLOGGED table the author is
| using
|
| - At the connection/session level, lower the transaction
| isolation level from SERIALIZABLE
|
| - The new UUIDv7 in PG 18 might also help as the primary
| indexed key type, as it also supports range queries on
| timestamp
| finalhacker wrote:
| The postgres message protocol is much more complex than
| redis's. I think that's the bottleneck behind such a
| difference.
| smacker wrote:
| I like using postgres for everything, it lets me simplify
| infrastructure. But using it as a cache is a bit concerning in
| terms of reliability, in my opinion.
|
| I have witnessed many incidents when DB was considerably
| degrading. However, thanks to the cache in redis/memcache, a
| large part of the requests could still be processed with minimal
| increase in latency. If I were serving cache from the same DB
| instance, I guess, it would cause cache degradation too when
| there are any problems with the DB.
| aiisthefiture wrote:
| Select by id is fast. If you're using it as a cache and not
| doing select by id then it's not a cache.
| smacker wrote:
| absolutely. But when PG is running out of open connections or
| has already consumed all available CPU even the simplest
| query will struggle.
| motorest wrote:
| > But when PG is running out of open connections or has
| already consumed all available CPU even the simplest query
| will struggle.
|
| I don't think it is reasonable to assume or even believe
| that connection exhaustion is an issue specific to
| Postgres. If you take the time to learn about the topic,
| you won't need to spend too much time before stumbling upon
| Redis and connection pool exhaustion issues.
| IsTom wrote:
| You can have a separate connection pool for 'cache'
| requests. You shouldn't have too many PG connections open
| anyway, on the order of O(num of CPUs).
| motorest wrote:
| > But using it as a cache is a bit concerning in terms of
| reliability, in my opinion.
|
| This was the very first time I heard anyone even suggest that
| storing data in Postgres was a concern in terms of reliability,
| and I doubt you are the only person in the whole world who has
| access to critical insight into the matter.
|
| Is it possible that your prior beliefs are unsound and
| unsubstantiated?
|
| > I have witnessed many incidents when DB was considerably
| degrading.
|
| This vague anecdote is meaningless. Do you actually have any
| concrete scenario in mind? Because anyone can make any system
| "considerably degrading", even Redis, if they make enough
| mistakes.
| baobun wrote:
| No need to be so combative. Take a chill pill, zoom out and
| look at the reliability of the entire system and its services
| rather than the db in isolation. If postgres has issues, it
| can affect the reliability of the service further if it's
| also running the cache.
|
| Besides, having the cache on separate hardware can reduce the
| impact on the db on spikes, which can also factor into
| reliability.
|
| Having more headroom for memory and CPU can mean that you
| never reach the load where it turns into service
| degradation on the same hw.
|
| Obviously a purpose-built tool can perform better for a
| specific use-case than the swiss army knife. Which is not to
| diss on the latter.
| motorest wrote:
| > No need to be so combative.
|
| You're confusing being "combative" with asking you to
| substantiate your extraordinary claims. You opted to make
| some outlandish and very broad sweeping statements, and
| when asked to provide any degree of substance, you resorted
| to talking about "chill pills"? What does that say about
| the substance of your claims?
|
| > If postgres has issues, it can affect the reliability of
| the service further if it's also running the cache.
|
| That assertion is meaningless, isn't it? I mean, isn't that
| the basis of any distributed systems analysis? That if a
| component has issues, it can affect the reliability of the
| whole system? Whether the component in question is Redis,
| Postgres, doesn't that always hold true?
|
| > Besides, having the cache on separate hardware can reduce
| the impact on the db on spikes, which can also factor into
| reliability.
|
| Again, isn't this assertion pointless? I mean, it holds
| true whether it's Postgres and Redis, doesn't it?
|
| > Having more headroom for memory and CPU can mean that you
| never reach the load where ot turns to service degradation
| on the same hw.
|
| Again, this claim is not specific to any specific service.
| It's meaningless to make this sort of claim to single out
| either Redis or Postgres.
|
| > Obviously a purpose-built tool can perform better for a
| specific use-case than the swiss army knife. Which is not
| to diss on the latter.
|
| Is it obvious, though? There is far more to life than
| synthetic benchmarks. In fact, the whole point of this sort
| of comparison is that for some scenarios a dedicated memory
| cache does not offer any tangible advantage over just using
| a vanilla RDBMS.
|
| This reads as some naive auto enthusiasts claiming that a
| Formula 1 car is obviously better than a Volkswagen Golf
| because they read somewhere they go way faster, but in
| reality what they use the car for is to drive to the
| supermarket.
| scns wrote:
| > You opted to make some outlandish and very broad
| sweeping statements, and when asked to provide any degree
| of substance, you resorted to talk about "chill pills"?
|
| You are not replying to the OP here. Maybe it's time for a
| little reflection?
| baobun wrote:
| > You're confusing being "combative" with asking you to
| substantiate your extraordinary claims. You opted to make
| some outlandish and very broad sweeping statements, and
| when asked to provide any degree of substance, you
| resorted to talk about "chill pills"?
|
| what are these "extraordinary claims" you speak of? I
| believe it's you who are confusing me with someone else.
| I am not GP. You appear to be tilting at windmills.
| motorest wrote:
| > what are these "extraordinary claims" you speak of?
|
| The claim that using postgres to store data, such as a
| cache, "is a bit concerning in terms of reliability".
| baobun wrote:
| Can you point to where I made that claim?
| abtinf wrote:
| Inferring one meaning for "reliability" when the original
| post is obviously using a different meaning suggests LLM use.
|
| This is a class of error a human is extremely unlikely to
| make.
| didntcheck wrote:
| > This was the very first time I heard anyone even suggest
| that storing data in Postgres was a concern in terms of
| reliability
|
| You seem to be reading "reliability" as "durability", when I
| believe the parent post meant "availability" in this context
|
| > Do you actually have any concrete scenario in mind? Because
| anyone can make any system "considerably degrading", even
| Redis
|
| And even Postgres. It can also happen due to seemingly random
| events like unusual load or network issues. What do you find
| outlandish about the scenario of a database server being
| unavailable/degraded and the cache service not being?
| tzahifadida wrote:
| I think the moral here is: if you'll never have more than 1000
| active customers per second, don't do microservices, don't use
| Redis - just use a monolith with a simple stack and you'll
| still be happy. Less maintenance, less know-how needed...
| cheaper all around.
| mattacular wrote:
| For cache, having TTL is invaluable. Having to tune cleanup jobs
| in Postgres is annoying. Deletes create dead rows (so do updates)
| so now you have to deal with vacuum as well. If the service
| ever needs to scale beyond what they estimated, the method the
| author suggested will run into a lot more problems than a
| traditional dedicated cache like Redis in front would.
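The TTL-on-a-table pattern discussed above can be sketched as follows, using Python's sqlite3 module as a stand-in for Postgres (table and column names are invented for illustration). In Postgres the upsert would be `INSERT ... ON CONFLICT ... DO UPDATE`, and the periodic DELETE is exactly the cleanup job whose dead rows create the vacuum pressure the comment mentions.

```python
import sqlite3
import time

# A cache table with an explicit expiry column; sqlite3 stands in
# for Postgres so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cache (
        key        TEXT PRIMARY KEY,
        value      TEXT NOT NULL,
        expires_at REAL NOT NULL
    )
""")

def cache_set(key, value, ttl_seconds):
    # Upsert; in Postgres: INSERT ... ON CONFLICT (key) DO UPDATE ...
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, value, expires_at) VALUES (?, ?, ?)",
        (key, value, time.time() + ttl_seconds),
    )

def cache_get(key):
    # Filter expired rows on read; the row itself is removed later.
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ? AND expires_at > ?",
        (key, time.time()),
    ).fetchone()
    return row[0] if row else None

def cache_evict():
    # The periodic cleanup job: in Postgres, every DELETE (and UPDATE)
    # leaves dead tuples behind, which is why vacuum becomes a concern.
    conn.execute("DELETE FROM cache WHERE expires_at <= ?", (time.time(),))

cache_set("session:42", "alice", ttl_seconds=60)
print(cache_get("session:42"))   # alice
cache_set("stale", "x", ttl_seconds=-1)
print(cache_get("stale"))        # None: already expired
cache_evict()                    # physically removes expired rows
```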
| wvh wrote:
| I've done some benchmarks over the years and 6-7k/s for getting
| simple data out of a basic Postgres installation seems pretty
| spot on. The question is whether, when using Postgres as a
| cache, you are taking performance away from the actual, more
| complex business-logic queries. That is going to depend on the
| level of overlap between
| endpoints that use caching and those that need more complex
| queries.
|
| What I'd be interested to see is a benchmark that mixes lots of
| dumb cache queries with typically more complex business logic
| queries to see how much Postgres performance tanks during highly
| concurrent load.
| lukaslalinsky wrote:
| Cache is a very relative term. If I'm caching a heavy
| computation, or perhaps externally acquired resources (e.g.
| web-scraped data), I'd use the database as a cache. If I'm
| caching database results, then I'd obviously use something
| faster than that.
| h1fra wrote:
| Having native TTL in PostgreSQL would remove so much unnecessary
| Redis in the wild.
| jmull wrote:
| Besides using the word "cache" what does this have to do with
| caching?
|
| It looks like it's just storing the session in postgres/redis.
|
| Caching implies there's some slower/laggier/more remote primary
| storage, for which the cache provides faster/readier access to
| some data of the primary storage.
| noisy_boy wrote:
| I have dealt with an abomination of a bespoke setup on Redis that
| was neither simple key/value nor an RDBMS but a grotesque attempt
| to simulate tables using key prefix with value being a hash
| containing "references" to entries in other "tables" (with some
| "special condition" edge cases thrown in to make it spicy). All
| with next to no documentation. It took adding many unit tests
| with generous logging to expose the logic and the underlying
| structure. That complete madness could have been handled in a bog
| standard way in PostgreSQL with minimal fuss. All for something
| that didn't even need to be "real-time".
|
| If you see yourself starting with simple key/value setup and then
| feature requests come in that make you consider having
| "references" to other keys, it is time to reconsider Redis, not
| double down on it. Even if you insist on continuing, at the very
| least, add a service to manage it with some clean abstractions
| instead of raw-dogging it.
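The "clean abstraction" suggestion might look something like this minimal sketch: key naming and record layout live in one class instead of being scattered as raw client calls through the codebase. A tiny fake client stands in for Redis here, and all names (`UserStore`, the `user:<id>` scheme) are illustrative, not from the thread.

```python
# FakeRedis mimics just the two client methods the store needs;
# a real redis.Redis client exposes the same hset/hgetall calls.
class FakeRedis:
    def __init__(self):
        self._hashes = {}

    def hset(self, key, mapping):
        self._hashes.setdefault(key, {}).update(mapping)

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))

class UserStore:
    """All knowledge of the `user:<id>` key scheme is confined here."""

    def __init__(self, client):
        self._client = client

    def _key(self, user_id):
        return f"user:{user_id}"

    def save(self, user_id, **fields):
        self._client.hset(self._key(user_id), mapping=fields)

    def load(self, user_id):
        return self._client.hgetall(self._key(user_id))

store = UserStore(FakeRedis())
store.save(7, name="bob", team_id="19")
print(store.load(7))  # {'name': 'bob', 'team_id': '19'}
```

The point is not the dozen lines themselves but that callers never build keys or poke at hash fields directly, so the layout can be documented, tested, and changed in one place.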
| bhaak wrote:
| I'm surprised to see this being somewhat controversial.
|
| In Rails we just got database-backed everything, with the option
| to go to special backends if need be.
|
| The only question I have is how do I notice that my current
| backend doesn't scale anymore and who or what would tell me to
| switch.
| enigma101 wrote:
| postgres for everything!
| saberience wrote:
| Note to anyone reading this blog in the future, this is both very
| very basic, and also highly misleading. The author makes it seem
| as though Postgres and Redis are interchangeable and that there's
| not much difference between them.
|
| This is totally misguided and incorrect.
|
| Redis can be easily deployed such that any request returns in
| less than a millisecond, and this is where it's most useful. It's
| also consistent and stable as hell. There are many use-cases for
| Redis where Postgres is totally unsuitable and doesn't make
| sense, and vice versa.
|
| Do yourself a favour and ignore this blog (again: inaccurate,
| poorly benchmarked, misleading) and do your own research and use
| better sources of information.
| sgarland wrote:
| Aside from the lack of tuning mentioned, it wasn't mentioned what
| variety of UUID was used - I assume v4. If so, the visibility map
| impacted the performance [0].
|
| [0]: https://www.cybertec-postgresql.com/en/unexpected-
| downsides-...
| paulsutter wrote:
| Or just use sync.Map, which would be faster and about 10x less
| work
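For a single-process service, the sync.Map suggestion amounts to an in-memory map guarded for concurrent access. A rough Python analogue (a lock-guarded dict with lazy TTL eviction; purely illustrative) might be:

```python
import threading
import time

class LocalCache:
    """In-process cache: no network hop, but also no sharing across
    processes and nothing survives a restart."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def set(self, key, value, ttl_seconds):
        with self._lock:
            self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if time.monotonic() >= expires_at:
                del self._data[key]  # lazy eviction on read
                return None
            return value

cache = LocalCache()
cache.set("greeting", "hello", ttl_seconds=10)
print(cache.get("greeting"))  # hello
```

The trade-off versus Redis or Postgres is the docstring above: it only helps when one process is the whole service, which is often exactly the low-traffic case the thread is about.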
| Ellipsis753 wrote:
| Great article. Not understanding the hate.
|
| I think the gist of it is: you probably have sufficiently low
| requests/second (<1000) that using Postgres as a cache is
| totally reasonable - which it is. If you're hitting your load
| targets within your hardware budget, there's no need to
| optimise more.
| BinaryIgor wrote:
| As expected, Postgres - especially with Unlogged Tables - is fast
| enough; additionally, having fewer things to run on your
| infrastructure and fewer integrations to manage in your app is a
| hugely underrated benefit.
|
| I am just a little surprised by the relatively low write
| performance for both Postgres and Redis here; but as I can see,
| the tests were run on a machine with just 2 CPUs and 8 GB of
| RAM. In my experience, with 8 CPUs, Postgres can easily handle
| more than 15,000 writes per second using regular tables; I would
| imagine it can easily be 20,000+ for the unlogged variety - who
| needs more than that to cache?
| sharadov wrote:
| Postgres needs to be tuned for the workload and the hardware.
|
| It's a travesty to run it on default settings.
|
| All it takes is 5 mins to do it.
|
| Use pgtune - https://pgtune.leopard.in.ua/
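For a sense of what such tuning looks like, these are illustrative postgresql.conf values in the style of what pgtune emits for a small (2 CPU / 8 GB) machine like the one used in the benchmark. The right numbers depend on your workload and storage, so treat this as a sketch, not a recommendation:

```ini
# Illustrative only - generate real values with pgtune for your machine
shared_buffers = 2GB              ; ~25% of RAM
effective_cache_size = 6GB        ; ~75% of RAM, planner hint only
maintenance_work_mem = 512MB
wal_buffers = 16MB
random_page_cost = 1.1            ; assumes SSD storage
effective_io_concurrency = 200    ; assumes SSD storage
```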
|
| I know the author concludes that he would still use Postgres for
| his projects.
|
| But, he would get much better benchmark numbers if it was tuned.
| dvcoolarun wrote:
| I get the argument for using Postgres, but it's mostly about the
| convenience of the tool, specific constraints, and the
| bottlenecks it solves, especially with TTL and a cache interface
| already implemented.
|
| Also, unlogged table contents are not crash-safe.
|
| I believe Redis would have performed better with more allocated
| CPU.
___________________________________________________________________
(page generated 2025-09-26 23:01 UTC)