[HN Gopher] Redis is fast - I'll cache in Postgres
       ___________________________________________________________________
        
       Redis is fast - I'll cache in Postgres
        
       Author : redbell
       Score  : 309 points
       Date   : 2025-09-25 23:34 UTC (23 hours ago)
        
 (HTM) web link (dizzy.zone)
 (TXT) w3m dump (dizzy.zone)
        
       | joonas wrote:
        | In a similar vein, I'd always thought Redka
       | (https://github.com/nalgeon/redka) was a neat idea since it gives
       | you access to a subset of the Redis API backed by either SQLite
       | or Postgres
        
         | notpushkin wrote:
         | Came here to post this. I'm wondering how Redka will compare
         | with Postgres (and Redis) in terms of performance though.
         | 
         | Edit: https://antonz.org/redka/#performance
        
       | iamcalledrob wrote:
       | This isn't a great test setup. It's testing RTT rather than the
       | peak throughput of Redis.
       | 
       | I'd suggest using Redis pipelining -- or better: using the
       | excellent rueidis redis client which performs auto-pipelining.
       | Wouldn't be surprising to see a 10x performance boost.
       | 
       | https://github.com/redis/rueidis
        
         | oa335 wrote:
          | Postgres also supports query pipelining - and it seems like the
          | popular Go Postgres client library pgx supports it:
         | https://github.com/jackc/pgx/issues/1113#event-6964024724
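Both suggestions amount to amortizing network round trips. A toy cost model (sketched in Python; the RTT and per-command timings are illustrative assumptions, not measurements from the post) shows why pipelining can yield the kind of speedup discussed above:

```python
def total_time_ms(n_commands: int, rtt_ms: float, server_time_ms: float,
                  batch_size: int = 1) -> float:
    """Rough cost model: each batch of commands pays one network round
    trip, plus per-command processing time on the server."""
    batches = -(-n_commands // batch_size)  # ceiling division
    return batches * rtt_ms + n_commands * server_time_ms

# 1000 GETs, 1 ms RTT, 0.02 ms of server work per command (made-up numbers):
print(total_time_ms(1000, rtt_ms=1.0, server_time_ms=0.02))
print(total_time_ms(1000, rtt_ms=1.0, server_time_ms=0.02, batch_size=100))
```

Under this model the unpipelined run is dominated by the 1000 round trips (roughly 1020 ms vs 30 ms pipelined), which is why an unpipelined benchmark largely measures RTT rather than server throughput.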
        
         | aaronblohowiak wrote:
          | Even so, 35ms+ latency for Redis reads is very, very high. I'd
          | want to understand what is happening there.
        
         | dizzyVik wrote:
         | Author here. Redis is definitely faster. I was specifically not
         | going for absolute peak performance for either redis or
         | postgres - that would require going down to the wire protocols.
          | The idea was to emphasize that there's a "good enough" level of
         | performance. Once you need that sort of speed - sure, there are
         | ways to achieve it.
        
           | motorest wrote:
            | > The idea was to emphasize that there's a "good enough"
            | level of
           | performance. Once you need that sort of speed - sure, there
           | are ways to achieve it.
           | 
            | To this I would add that more often than not the extra cost
            | and complexity of a memory cache do not justify shaving off
            | a few hypothetical milliseconds from a fetch.
           | 
           | On top of that, some nosql offerings from popular cloud
           | providers already have CRUD operations faster than 20ms.
        
       | arp242 wrote:
       | When I last benchmarked Redis vs. PostgreSQL for a simple k/v
       | cache it was about ~1ms for PostgreSQL to fetch a key, and ~0.5ms
       | for Redis with a similar setup as in this post (although I used
       | "value bytea" instead of "value string" - I don't know if it
       | matters, probably not; 1ms was fast enough that I didn't care to
       | test).
       | 
       | I didn't measure setting keys or req/sec because for my use case
       | keys were updated infrequently.
       | 
       | I generally find ms to be a more useful metric than reqs/sec or
       | latency at full load, as this is not a typical load. Or at least
       | wasn't for my use case.
       | 
       | Of course all depends on your use case etc. etc. In some cases
       | throughput does matter. I would encourage everyone to run their
       | own benchmarks suited to their own use case to be sure - should
       | be quick and easy.
       | 
        | As a rule I recommend starting with PostgreSQL and using
        | something else only if you're heavily using the cache or you run
        | into problems. Redis isn't too hard to run, but skipping it is
        | still one less service to worry about. Or alternatively, just
        | use an in-memory DB. Not always appropriate of course, but
        | sometimes it is.
        
         | Maskawanian wrote:
         | When you benchmarked Postgres did you disable WAL for the cache
         | table? That may minimize the difference.
        
           | arp242 wrote:
           | Unlogged table, yes. 1ms is more than fast enough so I didn't
           | bother to look further.
        
         | xyzzy_plugh wrote:
         | A difference of 0.5ms is negligible with single digit network
         | latency. You would need significant batching to experience the
         | effects of this difference.
         | 
         | Of course such sensitive environments are easily imaginable but
         | I wonder why you'd select either in that case.
        
           | arp242 wrote:
           | > A difference of 0.5ms is negligible with single digit
           | network latency
           | 
           | Yes, that was my take-away.
        
       | brightball wrote:
        | The big win for Redis is pipelining IMO.
        
       | bart3r wrote:
       | Don't tell DHH
        
         | Fire-Dragon-DoL wrote:
         | Didn't DHH just release a sqlite-only cache?
        
           | firecall wrote:
           | Solid Cache, default in Rails 8.
           | 
           | Doesn't require SQLite.
           | 
           | Works with other DBs:
           | 
           | https://github.com/rails/solid_cache
        
             | Fire-Dragon-DoL wrote:
              | Yeah, so no Redis
        
       | mehphp wrote:
       | I think you just convinced me to drop redis for my new project.
       | 
       | Definitely a premature optimization on my part.
        
         | throwup238 wrote:
         | "Dropping" something from a "new" project is premature
         | optimization?
         | 
         | Wherever you go, there you are.
        
           | danielheath wrote:
           | I read it as dropping something that _had been_ a premature
           | optimisation.
        
           | qu4z-2 wrote:
           | Presumably adding Redis to a new project with no performance
           | issues (yet?) is the premature optimisation.
        
             | rplnt wrote:
              | If you are optimizing for simplicity it may not be, as its
              | use as a cache is much (much) more straightforward. Also,
             | for a new project, I'd go with in-memory service-level
             | cache as it outperforms both (in any metric) and can be
             | easily replaced once the need arises.
        
           | mehphp wrote:
           | I was referring to adding redis prematurely
        
         | MobiusHorizons wrote:
          | "Premature optimization" typically refers to optimizing before
          | profiling, i.e. optimizing in places that won't help.
         | 
         | Is redis not improving your latency? Is it adding complexity
         | that isn't worth it? Why bother removing it?
        
           | maxbond wrote:
           | I like to call cases like this "premature distribution." Or
           | maybe you could call it "premature capacity." If you have an
           | application running in the cloud with several thousand
           | requests per day, you could probably really benefit from
           | adding a service like Redis.
           | 
           | But when you have 0-10 users and 0-1000 requests per day, it
           | can make more sense to write something more monolithic and
           | with limited scalability. Eg, doing everything in Postgres.
            | Caching is especially amenable to adding in later. If you get
            | too far into the weeds managing services and creating
            | scalability you might get bogged down and never get your
            | application in front of potential users in the first place.
           | 
           | Eg, your UX sucks and key features aren't implemented, but
           | you're tweaking TTLs and getting a Redis cluster to work
           | inside Docker Compose. Is that a good use of your time? If
           | your goal is to get a functional app in front of potential
           | users, probably not.
        
             | not_kurt_godel wrote:
             | You probably don't need Redis until you have thousands of
             | requests per minute, nevermind per day.
        
               | foobarian wrote:
                | I'd go further and even say per second! Actually PG can
                | still handle it; the main problem is that it has a more
               | complex runtime that can spike. Backups? Background jobs
               | doing heavy writes? Replication? Vacuum? Can tend to
               | cause multisecond slowdowns which may be undesirable
               | depending on your SLA. But otherwise it would be fine.
        
             | MobiusHorizons wrote:
             | To be clear my question isn't claiming redis isn't
             | premature optimization, but rather asking why the op thinks
             | that it is. Being new doesn't automatically mean that there
             | is no need for latency sensitivity. Making that assumption
             | could be just as premature. Ripping something out that is
             | already working also takes time and the trade offs need to
             | be weighed.
        
               | maxbond wrote:
               | Can't respond for them of course but I didn't take the
               | impression that it was already fully implemented,
               | working, and functionally necessary. I took the
               | impression they had started going down that path but it
               | was still easy to bail. That's just a vibe though.
               | 
               | But I agree that it would be appropriate to start out
               | that way in some projects.
        
         | flanked-evergl wrote:
         | We dropped Redis from a 4 year old app that had a rapidly
         | growing userbase. Best choice ever. We never looked back once
         | other than to think how annoying it was to have to deal with
         | Redis in addition to Postgres.
        
           | Implicated wrote:
           | Sincerely (Feel the need to add that given the tension around
           | here in these comments), I'm curious how Redis was annoying.
           | Can you give any detail/insight?
        
             | flanked-evergl wrote:
             | Every component takes work. You have to upgrade it,
             | something goes wrong with tuning, you have to debug it,
             | etc. It went wrong about once a month, somehow. I'm sure it
             | was fixable, but time is a commodity, and not having it any
             | more gives us more time to work on other things.
             | 
             | We can't get rid of Postgres, but since we run Postgres on
             | GCP we really never even think about it.
        
       | sixo wrote:
       | If you use an UNLOGGED table in Postgres as a cache, and your DB
       | restarts, you no longer have a cache. Then your main table gets a
       | huge spike in traffic and likely grinds to a halt.
        
         | Spivak wrote:
         | Same as the folks who use in-memory Redis. Is there something
         | uniquely bad about Postgres for this situation?
         | 
         | If your cache is so performance critical that you can't lose
         | the data then it sounds like you need a (denormalized)
         | database.
        
         | frollogaston wrote:
         | Cache isn't meant to persist, and something is wrong if you
         | hard depend on it persisting.
        
         | shakna wrote:
         | Cache is not the only solution to the thundering herd.
        
         | wielebny wrote:
          | If you need persistence on your Redis, then you're not using it
          | as a cache. You're using it as a key-value store.
        
         | sgarland wrote:
         | Not quite - if you have a crash / hard restart, the table is
         | truncated. If Postgres gracefully shuts down, the data is
         | retained.
        
       | carterschonwald wrote:
        | This seems to be testing how the software is optimized for low-
        | core deployments... how does Postgres performance vary as you
        | add more cores and RAM? It's the sort of software where I'd
        | presume more cores and RAM yield better performance. Assuming as
        | always that mature systems software sees more many-core perf
        | engineering.
        
       | cortesoft wrote:
       | Having the ability to set a TTL on the cache key is a critical
       | feature of a cache, not something that can be tacked on later.
       | 
       | I always find these "don't use redis" posts kind of strange.
       | Redis is so simple to operate at any scale, I don't quite get why
       | it is important to remove it.
        
         | frollogaston wrote:
         | Yeah, the article was like "I always need a DB anyway" but then
         | sets up an extra cronjob to expire keys, plus more code. I get
         | YAGNI and avoiding deps, but this is really extra stuff to deal
         | with.
         | 
         | Maybe Postgres could use a caching feature. Until then, I'm
         | gonna drop in Redis or memcached instead of reinventing the
         | wheel.
        
           | maxbond wrote:
           | Expiring keys in Postgres with a created_at column and a
           | pg_cron job is very easy (at least, if you're comfortable in
           | Postgres). Redis is world class though of course, and can be
           | deployed turn-key in basically any environment. If you're
           | more comfortable in Redis than Postgres, more power to you.
           | Different choices can be pragmatic to different people.
           | 
           | Personally for a greenfield project, my thinking would be
           | that I am _paying_ for Postgres already. So I would want to
           | avoid paying for Redis too. My Postgres database is likely to
           | be underutilized until (and unless) I get any real scale. So
           | adding caching to it is free in terms of dollars.
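The created_at pattern described above can be modeled in a few lines (Python stands in for the SQL here; the class and its names are hypothetical). In Postgres, the purge step would be a pg_cron job issuing a DELETE on rows older than the TTL, and reads would filter out expired rows that the purge hasn't removed yet:

```python
import time

class PgStyleCache:
    """In-memory model of a cache table with a created_at column and a
    periodic purge job (illustrative sketch, not a Postgres client)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.rows = {}  # key -> (value, created_at)

    def set(self, key, value, now=None):
        self.rows[key] = (value, now if now is not None else time.time())

    def get(self, key, now=None):
        # Reads must check created_at themselves, because the purge job
        # only runs periodically and expired rows may still be present.
        now = now if now is not None else time.time()
        hit = self.rows.get(key)
        if hit is None or now - hit[1] >= self.ttl:
            return None  # miss, or expired but not yet purged
        return hit[0]

    def purge(self, now=None):
        """The cron-job part: drop rows older than the TTL."""
        now = now if now is not None else time.time()
        self.rows = {k: v for k, v in self.rows.items()
                     if now - v[1] < self.ttl}
```

The read-side expiry check is the important detail: correctness doesn't depend on the cron job running on time, only cache-table size does.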
        
             | frollogaston wrote:
             | I'm comfy with Postgres though, like I'll center my entire
             | backend around it and do the heavy lifting in SQL (never
             | ORM). It's more that I don't want to depend on a cronjob
             | for something as fundamental as a cache.
             | 
             | Usually Postgres costs a lot more than Redis if you're
             | paying for a platform. Like a decent Redis or memcached in
             | Heroku is free. And I don't want to waste precious Postgres
             | connections or risk bogging down the whole DB if there's
             | lots of cache usage, which actually happened last time I
             | tried skipping Redis.
        
               | maxbond wrote:
               | I can understand being nervous about some cron job
               | running on some other service, but what's concerning
               | about a cron job managed inside of Postgres with pg_cron?
               | If that doesn't run, your database is probably down
               | anyway.
               | 
               | Postgres might cost more but I'm probably already paying.
               | I agree that exhausting connections and writing at a high
               | rate are easy ways to bring down Postgres, but I'm
               | personally not going to worry about exhausting
               | connections to Postgres until I have at least a thousand
                | of them. Everything has to be considered within the
                | actual problem you are solving; there are definitely
                | situations where starting out with a cache makes sense.
        
               | frollogaston wrote:
               | I might be ok with this if it were built in, but pg_cron
               | is an extension, so first off you might not even have
               | access to it. And then you still have to monitor it.
        
               | maxbond wrote:
               | Seems like it's available on all the major providers.
        
               | frollogaston wrote:
               | Heroku doesn't have it. That's actually kinda annoying
               | that they don't, cause the others do. AND they no longer
               | have free Redis, so that changes things a bit.
               | 
               | Edit: well a tiny bit, max $3/mo
        
               | motorest wrote:
               | > Usually Postgres costs a lot more than Redis if you're
               | paying for a platform.
               | 
               | You need to back up your unbelievable assertion with
               | facts. Memory cache is typically far more expensive than
                | a simple database, especially as provisioning the same
               | memory capacity as RAM is orders of magnitude more
               | expensive than storing the equivalent data in a database.
        
               | frollogaston wrote:
               | I didn't say it's cheaper for the same cache size. But
               | yeah a base tier Redis that will carry a small project
               | tends to be a lot cheaper than the base tier Postgres.
        
               | motorest wrote:
               | > I didn't say it's cheaper for the same cache size.
               | 
                | So be specific. What exactly did you want to say?
               | 
               | > But yeah a base tier Redis that will carry a small
               | project tends to be a lot cheaper than the base tier
               | Postgres.
               | 
                | This is patently false. I mean, some cloud providers
                | offer NoSQL databases with sub-20ms performance as part
                | of their free tier.
               | 
                | Just go ahead and provide any evidence, any at all, that
                | supports the idea that Redis is cheaper than Postgres.
                | Any concrete data will do.
        
               | frollogaston wrote:
               | Look at the Heroku pricing. If you don't like Heroku then
               | look at AWS pricing. Specifically for Postgres, not a
               | NoSQL DB (which Redis can be too)
        
               | MangoToupe wrote:
               | Why would you cache the entire database though? Seems
               | like an apples to oranges comparison.
        
               | motorest wrote:
               | > Why would you cache the entire database though?
               | 
                | I have no idea where you got that from.
        
               | MangoToupe wrote:
                | > especially as provisioning the same memory capacity as
               | RAM is orders of magnitude more expensive than storing
               | the equivalent data in a database.
               | 
               | I'm not sure how else to interpret this
        
               | motorest wrote:
                | Why are you treating memory capacity as a requirement to
                | store the whole database in memory? I mean, think. What
               | do you think is the biggest performance bottleneck with
               | caches, and how does this relate to memory capacity?
        
           | motorest wrote:
           | > Yeah, the article was like "I always need a DB anyway" but
           | then sets up an extra cronjob to expire keys, plus more code.
           | 
            | You do not need cron jobs to run a cache. Sometimes you don't
           | even need a TTL. All you need is a way to save data in a way
           | that is easy and cheaper to retrieve. I feel these comments
           | just misinterpret what a cache is by confusing it with what
           | some specific implementation does. Perhaps that's why we see
           | expensive and convoluted strategies using Redis and the like
           | when they are absolutely not needed at all.
        
             | maxbond wrote:
             | If we don't use a TTL, aren't we going to have to either
             | accept that our cache will grow without bounds or take an
             | even more sophisticated approach (like tracking access
             | times instead of creation times)? Is there something
             | simpler I'm not seeing?
        
               | frollogaston wrote:
               | I've tried it this way. You can get away with no TTL if
               | your keys are constrained. Sometimes there are enough
               | keys to be a problem. I'd rather just set up a TTL and
               | not worry about this.
        
               | maxbond wrote:
               | Agreed, simple and widely applicable heuristics are
               | great, and you can think deeply on it when and if it
               | proves to be an issue worthy of such attention.
        
               | motorest wrote:
               | > If we don't use a TTL, aren't we going to have to
               | either accept that our cache will grow without bounds
               | (...)
               | 
               | Do you have a bound? I mean, with Redis you do, but
               | that's primarily a cost-driven bound.
               | 
               | Nevertheless, I think you're confusing the point of a
               | TTL. TTLs are not used to limit how much data you cache.
               | The whole point of a TTL is to be able to tell whether a
               | cache entry is still fresh or it is stale and must be
                | revalidated. Some cache strategies do use TTLs to
                | determine which entry to evict, but that is just a
                | scenario that takes place when memory is at full
                | capacity.
        
               | maxbond wrote:
               | A TTL doesn't really tell you if it's stale though. It
               | gives you an upper bound on how long it can have been
               | stale. But something becomes stale when the underlying
               | resource is written to, which can happen an hour or an
               | instant after you cache it. You should probably evict it
               | when the write comes in. In my mind, it's for evicting
               | things that aren't in use (to free up memory).
        
               | motorest wrote:
               | > A TTL doesn't really tell you if it's stale though
               | (...)
               | 
                | Non sequitur, and immaterial to the discussion.
               | 
               | > You should probably evict it when the write comes in.
               | 
               | No. This is only required if memory is maxed out and
               | there is no more room to cache your entry. Otherwise you
               | are risking cache misses by evicting entries that are
               | still relatively hot.
        
               | maxbond wrote:
               | > Non-sequitur,and imaterial to the discussion.
               | 
               | You said:
               | 
               | > The whole point of a TTL is to be able to tell whether
               | a cache entry is still fresh or it is stale and must be
               | revalidated.
               | 
                | So I responded to it. I don't really understand why you
                | think that's a non sequitur.
               | 
               | > No.
               | 
               | I'm a bit confused. We're not using TTLs and we're not
               | evicting things when they become invalid. What is your
               | suggestion?
        
               | frollogaston wrote:
               | The cache isn't the only hot thing here. Relax.
        
         | akdor1154 wrote:
          | It's not that it's bad, it's more about cutting down on the
          | number of systems you need to operate.
        
           | pphysch wrote:
           | I have been running Redis for years as a cache and have spent
           | less than 5 cumulative minutes "operating" it.
           | 
           | I'm a big "just use Postgres" fan but I think Redis is
           | sufficiently simple and orthogonal to include in the stack.
        
         | coder543 wrote:
         | I keep hoping Postgres will one day have the ability to mark a
         | timestamp column as an expiry column. It would be useful for
         | all kinds of things beyond caching, including session tokens,
         | feature flags, background jobs, rate limiting, delayed deletion
         | (as a variant of soft deletion), etc.
         | 
         | It seems like the autovacuum could take care of these expired
         | rows during its periodic vacuum. The query planner could
         | automatically add a condition that excludes any expired rows,
         | preventing expired rows from being visible before autovacuum
         | cleans them up.
        
           | neoden wrote:
            | One could use a trigger for this. All we need is to set up a
            | trigger that deletes expired records, looking at some
            | timestamp column, on update. That would add some latency
           | but as was said, most projects would find it good enough
           | anyway.
        
             | nhumrich wrote:
             | I use pg cron for this. But I don't have a need for TTL to
             | be to the minute accurate, or even to the hour.
        
             | dmitry-vsl wrote:
             | Probably better to use partitioned table and drop old
             | partitions.
        
         | motorest wrote:
         | > Having the ability to set a TTL on the cache key is a
         | critical feature of a cache, not something that can be tacked
         | on later.
         | 
          | What exactly is the challenge you're seeing? At the very least,
         | you can save an expiry timestamp as part of the db entry. Your
         | typical caching strategy already involves revalidating cache
         | before it expires, and it's not as if returning stale while
         | revalidating is something completely unheard of.
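The expiry-timestamp approach mentioned above can be sketched as a read-time decision (a minimal sketch; the three-way classification and the grace window are illustrative choices, not something the comment specifies):

```python
# Store expires_at alongside the value, then decide at read time whether
# the entry is fresh, stale-but-servable, or a miss.
FRESH, STALE_REVALIDATE, MISS = "fresh", "stale-revalidate", "miss"

def classify(entry, now: float, grace_seconds: float = 30.0) -> str:
    """entry is (value, expires_at) or None; 'now' is a Unix timestamp."""
    if entry is None:
        return MISS  # nothing cached: fetch from the source
    _, expires_at = entry
    if now < expires_at:
        return FRESH  # serve directly from the cache
    if now < expires_at + grace_seconds:
        return STALE_REVALIDATE  # serve stale, refresh in the background
    return MISS  # too old even to serve stale: refetch synchronously
```

This is the stale-while-revalidate idea in miniature: no background expiry job is required for correctness, only for reclaiming space.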
        
         | zzbzq wrote:
         | Postgres nationalists will applaud the conclusion no matter how
         | bad the reasoning is.
         | 
          | Don't get me wrong, the idea that he wants to just use an RDBMS
         | because his needs aren't great enough, is a perfectly
         | inoffensive conclusion. The path that led him there is very
         | unpersuasive.
         | 
         | It's also dangerous. Ultimately the author is willing to do a
         | bit more work rather than learn something new. This works
         | because he's using a popular tool people like. But overall, he
         | doesn't demonstrate he's even thought about any of the things
         | I'd consider most important; he just sort of assumes running a
         | Redis is going to be hard and he'd rather not mess with it.
         | 
         | To me, the real question is just cost vs. how much load the DB
         | can even take. My most important Redis cluster basically exists
          | to take load off the DB, which sees high load even from simple
         | queries. Using the DB as a cache only works if your issue is
         | expensive queries.
         | 
         | I think there's an appeal that this guy reaches the conclusion
         | someone wants to hear, and it's not an unreasonable conclusion,
         | but it creates the illusion the reasoning he used to get there
         | was solid.
         | 
         | I mean, if you take the same logic, cross out the word
         | Postgres, and write in "Elasticsearch," and now it's an article
         | about a guy who wants to cache in Elasticsearch because it's
         | good enough, and he uses the exact same arguments about how
         | he'll just write some jobs to handle expiry--is this still
         | sounding like solid, reasonable logic? No it's crazy.
        
       | nvader wrote:
       | I hope you remember to add monitoring for your cache expiry cron
       | job so you can be notified if it ever fails to run due to a code
       | change or configuration change. For example, database credentials
       | might be rolled and if you forget to update the job your primary
       | database could fill up.
       | 
        | Perhaps you could have a second cron job that runs to verify that
        | the first one completed. It could look for a last-ran entry. You
        | shouldn't put it in the same database, so perhaps you could use a
        | key-value store like Redis for that.
        
         | monkeyelite wrote:
          | This is more than a few leaps of logic. A bad config of any
          | infrastructure can cause problems.
        
       | frollogaston wrote:
       | The values in this test are like 20 bytes, right? Wonder how
       | things compare if they're about 1KB.
        
         | dizzyVik wrote:
         | Not on my computer right now but I think it's 45 bytes per
         | value
        
       | monkeyelite wrote:
       | In nginx config you can pretty easily cache web pages or API
       | responses. I would also like to know how far that goes.
       | 
        | Also, does anyone like memcached anymore? When I compared it with
        | Redis in the past it appeared simpler.
        
       | hnaccountme wrote:
       | I think there is more performance to be taken from PostgreSQL.
       | Not sure how pgx works internally since I have not used it.
       | 
       | There are async functions provided by PostgreSQL client library
        | (libpq). I've used it to process around 2000 queries per second
        | on a single connection against a logged table.
        
       | jamesblonde wrote:
        | Why do we promote articles like this that have nice graphs and
        | are well written, when they should get a grade 'F' as an actual
        | benchmark study? The way it is presented, a casual reader would
        | think Postgres is 2/3rds the performance of Redis. Good god. He
        | even admits Postgres maxed out its 2 cores, but Redis was
        | bottlenecked by the HTTP server. We need more of an academic, not
        | a hacker, culture for benchmarks.
        
         | dizzyVik wrote:
         | There's a reason this is on my blog and not a paper in a
         | journal. This isn't supposed to show the absolute speed of
         | either tool, the benchmark is not set up for that. I do state
         | that redis has more performance on the table in the blog post.
        
           | vasco wrote:
           | It's not a paper or a journal but you could at least try to
           | run a decent benchmark. As it is this serves no purpose other
           | than reinforcing whatever point you started with. Didn't even
           | tweak postgres buffers, literally what's the point.
        
             | dizzyVik wrote:
             | I still end up recommending using postgres though, don't I?
        
               | pcthrowaway wrote:
               | "I'll use postgres" was going to be your conclusion no
               | matter what I guess?
               | 
               | I mean what if an actual benchmark showed Redis is 100X
               | as fast as postgres for a certain use case? What are the
               | constraints you might be operating with? What are the
               | characteristics of your workload? What are your budgetary
               | constraints?
               | 
               | Why not just write a blog post saying "Unoptimized
               | postgres vs redis for the lazy, running virtualized with
               | a bottleneck at the networking level"
               | 
               | I even think that blog post would be interesting, and
                | might be useful to someone choosing a stack for a proof
                | of concept. For someone who wants to scale to large
                | production workloads (~10,000 requests/second or more),
                | this isn't a very useful article, so the criticism is
                | fair, and I'm not sure why you're dismissing it offhand.
        
               | dizzyVik wrote:
               | I completely agree that this is not relevant for anyone
               | running such workloads; the article is not aimed at them
               | at all.
               | 
               | Within the constraints of my setup, postgres came out
               | slower but still fast enough. I don't think I can
               | quantify what fast enough is though. Is it 1000 req/s? Is
               | it 200? It all depends on what you're doing with it. For
               | many of my hobby projects which see tens of requests per
               | second it definitely is fast enough.
               | 
               | You could argue that caching is indeed redundant in such
               | cases, but some of those have quite a lot of data that
               | takes a while to query.
        
               | motorest wrote:
               | > "I'll use postgres" was going to be your conclusion no
               | matter what I guess?
               | 
               | Would it bother you as well if the conclusion was
               | rephrased as "based on my observations, I see no point in
               | rearchitecting the system to improve the performance by
               | this much"?
               | 
               | I think you are so tied to a template solution that you
               | don't stop to think about why you're using it, or whether
               | it is justified at all. Then, when faced with
               | observations that challenge those unfounded beliefs, you
               | somehow opt to get defensive? That's not right.
        
               | vasco wrote:
               | That's the point, you put no effort and decided to do
               | what you had decided already to do before.
        
               | dizzyVik wrote:
               | I don't think this is a fair assessment. Had my
               | benchmarks shown, say, that postgres crumbled under heavy
               | write load then the conclusion would be different. That's
               | exactly why I decided to do this - to see what the
               | difference was.
        
               | m000 wrote:
               | Of course you didn't see postgres crumble. This is still
               | a toy example of a benchmark. Nobody starts (and even more
               | pays for) a postgres instance to use exclusively as a
               | cache. It is guaranteed that even in the simplest of
               | deployments some other app (if not many of them) will be
               | the main postgres tenant.
               | 
               | Add an app that actually uses postgres as a database and
               | you will probably see its performance crumble, as the app
               | will contend with the cache for resources.
               | 
               | Nobody asked for benchmarking as rigorous as you would
               | have in a published paper. But toy examples are toy
               | examples, be it in a publication or not.
        
           | a_c wrote:
           | I find your article valuable. It shows me how much
           | configuration is needed for a reasonable expectation of
           | performance. In the real world, I'm not going to spend effort
           | maxing out the configuration of a single tool. Not having the
           | most performant config for either of the tools is the least
           | of my concern. Picking either of them, or, as you suggested,
           | Postgres, and then worrying about getting one billion
           | requests to the service is far more important.
        
           | lemagedurage wrote:
           | The main issue is that a reader might mistake Redis as a 2X
           | faster postgres. Memory is 1000X faster than disk (SSD) and
           | with network overhead Redis can still be 100X as fast as
           | postgres for caching workloads.
           | 
           | Otherwise, the article does well to show that we can get a
           | lot of baseline performance either way. Sometimes a cache is
           | premature optimisation.
        
             | phiresky wrote:
             | If your cache fits in Redis then it fits in RAM, if your
             | cache fits in RAM then Postgres will serve it from RAM just
             | as well.
             | 
             | Writes will go to RAM as well with synchronous_commit=off.
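A sketch of the postgresql.conf settings this refers to (the directive names are the actual GUCs; the sizes are illustrative):

```conf
# postgresql.conf sketch: let Postgres serve a hot cache from RAM
shared_buffers = '2GB'        # hot pages are served straight from this RAM cache
synchronous_commit = off      # commits return before the WAL is flushed to disk;
                              # a crash can lose the last few commits, which is
                              # usually acceptable for cache data
```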
        
               | senorrib wrote:
               | Not necessarily true. If you're sharing the database with
               | your transaction workload your cache will be paged out
               | eventually.
        
               | jgalt212 wrote:
               | This was my take as well, but I'm a MySQL / Redis shop. I
               | really have no idea what tables MySQL has in RAM at any
               | given moment, but with Redis I know what's in RAM.
        
             | motorest wrote:
             | > The main issue is that a reader might mistake Redis as a
             | 2X faster postgres. Memory is 1000X faster than disk (SSD)
             | and with network overhead Redis can still be 100X as fast
             | as postgres for caching workloads.
             | 
             | Your comments suggest that you are definitely missing some
             | key insights onto the topic.
             | 
             | If you, like the whole world, consume Redis through a
             | network connection, it should be obvious to you that
             | network is in fact the bottleneck.
             | 
             | Furthermore, using a RDBMS like Postgres may indeed imply
             | storing data in a slower memory. However, you are ignoring
             | the obvious fact that a service such as Postgres also has
             | its own memory cache, and some query results can and are
             | indeed fetched from RAM. Thus it's not like each and every
             | single query forces a disk read.
             | 
             | And at the end of the day, what exactly is the performance
             | tradeoff? And does it pay off to spend more on an in-memory
             | cache like Redis to buy you the performance Delta?
             | 
             | That's why real world benchmarks like this one are
             | important. They help people think through the problem and
             | reassess their irrational beliefs. You may nitpick about
             | setup and configuration and test patterns and choice of
             | libraries. What you cannot refute are the real world
             | numbers. You may argue they could be better if this and
             | that, but the real world numbers are still there.
        
               | lossolo wrote:
               | > If you, like the whole world, consume Redis through a
               | network connection
               | 
               | I think "you are definitely missing some key insights
               | onto the topic". The whole world is a lot bigger than
               | your anecdotes.
        
               | Implicated wrote:
               | > If you, like the whole world, consume Redis through a
               | network connection, it should be obvious to you that
               | network is in fact the bottleneck.
               | 
               | Not to be annoying - but... what?
               | 
               | I specifically _do not_ use Redis over a network. It's
               | _wildly_ fast. High volume data ingest use case - lots
               | and lots of parallel queue workers. The database is over
               | the network, Redis is local (socket). Yes, this means
               | that each server running these workers has its own cache
               | - that's fine, I'm using the cache for absolutely insane
               | speed and I'm not caching huge objects of data. I don't
               | persist it to disk, I don't care (well, it's not a big
               | deal) if I lose the data - it'll rehydrate in such a
               | case.
               | 
               | Try it some time, it's fun.
               | 
               | > And at the end of the day, what exactly is the
               | performance tradeoff? And does it pay off to spend more
               | on an in-memory cache like Redis to buy you the
               | performance Delta?
               | 
               | Yes, yes it is.
               | 
               | > That's why real world benchmarks like this one are
               | important.
               | 
               | That's not what this is though. Just about nobody who has
               | a clue is using default configurations for things like PG
               | or Redis.
               | 
               | > They help people think through the problem and reassess
               | their irrational beliefs.
               | 
               | Ok but... um... you just stated that "the whole world"
               | consumes redis through a network connection. (Which, IMO,
               | is the wrong tool for the job - sure it will work, but
               | that's not where/how Redis shines)
               | 
               | > What you cannot refute are the real world numbers.
               | 
               | Where? This article is not that.
        
               | gmm1990 wrote:
               | that is an interesting use case, I hadn't thought about a
               | setup like this with a local redis cache before. Is it
               | the typical advantages of using a db over a filesystem
               | the reason to use redis instead of just reading from
               | memory mapped files?
        
               | Implicated wrote:
               | > Is it the typical advantages of using a db over a
               | filesystem the reason to use redis instead of just
               | reading from memory mapped files?
               | 
               | Eh - while surely not everyone has the benefits of doing
               | so, I'm running Laravel and using Redis is just _really_
               | simple and easy. To do something via memory mapped files
               | I'd have to implement quite a bit of stuff I don't
               | want/need to (locking, serialization, ttl/expiration,
               | etc).
               | 
               | Redis just works. Disable persistence, choose the
               | eviction policy that fits the use, config for unix socket
               | connection and you're _flying_.
               | 
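The setup described above (persistence off, eviction policy chosen, unix socket) can be sketched in a few redis.conf directives; the directive names are standard, the values illustrative:

```conf
# redis.conf sketch: local, cache-only Redis over a unix socket
port 0                         # no TCP listener; unix socket only
unixsocket /var/run/redis/redis.sock
unixsocketperm 770
save ""                        # disable RDB snapshots
appendonly no                  # disable AOF persistence
maxmemory 512mb
maxmemory-policy allkeys-lru   # evict least-recently-used keys when full
```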
               | My use case is generally data ingest of some sort where
               | the processing workers (in my largest projects I'm
               | talking about 50-80 concurrent processes chewing through
               | tasks from a queue (also backed by redis) and are likely
               | to end up running the same queries against the database
               | (mysql) to get 'parent' records (ie: user associated with
               | object by username, post by slug, etc) and there's no way
               | to know if there will be multiples (ie: if we're
               | processing 100k objects there might be 1 from UserA or
               | there might be 5000 by UserA - where each one processed
               | will need the object/record of UserA). This project in
               | particular there's ~40 million of these 'user' records
               | and hundreds of millions of related objects - so can't
               | store/cache _all_ users locally - but sure would benefit
               | from not querying for the same record 5000 times in a 10
               | second period.
               | 
               | For the most part, when caching these records over the
               | network, the performance benefits were negligible
               | (depending on the table) compared to just querying mysql
               | for them. They are just `select where id/slug =` queries.
               | But when you lose that little bit of network latency and
               | you can make _dozens_ of these calls to the cache in the
               | time it would take to make a single networked call... it
               | adds up real quick.
               | 
               | PHP has direct memory "shared memory" but again, it would
               | require handling/implementing a bunch of stuff I just
               | don't want to be responsible for - especially when it's
               | so easy and performant to lean on Redis over a unix
               | socket. If I needed to go faster than this I'd find
               | another language and likely do something direct-to-memory
               | style.
        
             | pigbearpig wrote:
             | That's the reader's fault then. I see the blog post as the
             | counter to the insane resume-building over-engineered
             | architecture you see at a lot of non-tech companies. Oh,
             | you need a cache for our 25-user internal web application?
             | Let's front it with a redis cluster with elastisearch
             | using an LLM to publish cache invalidation with Kafka.
        
               | themgt wrote:
               | There's also a sort of anti-everything attitude that gets
               | boring and lazy. Redis is about the simplest thing
               | possible to deploy. This wasn't about "a redis cluster
               | with elastisearch using an LLM" it was just Redis.
               | 
               | I sometimes read this stuff like people explaining how
               | they replaced their spoon and fork with a spork and
               | measured only a 50% decrease in food eating performance.
               | And have you heard of the people with a $20,000 Parisian
               | cutlery set to eat McDonalds? I just can't understand
               | insane fork enjoyers who over-engineered their dining
               | experience.
        
               | lomase wrote:
               | There is this cv-driven development where you have to use
               | Redis, Kafka, Mongo, Rabbit, Docker, AWS, job schedulers,
               | microservices, and so on.
               | 
               | The less dependencies my project has the better. If it is
               | not needed why use it?
        
               | array_key_first wrote:
               | Software development has such a pro-complexity culture
               | that, I think, we need more anti-stuff or pushback.
        
           | rollcat wrote:
           | Thank you for the article.
           | 
           | My own conclusions from your data:
           | 
           | - Under light workloads, you can get away with Postgres. 7k
           | RPS is fine for a lot of stuff.
           | 
           | - Introducing Redis into the mix has to be carefully weighed
           | against increased architectural complexity, and having a
           | common interface allows us to change that decision down the
           | road.
           | 
           | Yeah maybe that's not up to someone else's idea of a good
           | synthetic benchmark. Do your load-testing against actual
           | usage scenarios - spinning up an HTTP server to serve traffic
           | is a step in the right direction. Kudos.
        
         | whateveracct wrote:
         | most people with blogs don't know what they're doing. or don't
         | care to know? sadly they get hired at companies and everyone
         | does what they say cuz they have a blog. i've seen some shit in
         | that department; it's wild how much some people really are
         | imposters.
        
           | motorest wrote:
           | > most people with blogs don't know what they're doing. or
           | don't care to know?
           | 
           | I don't see any point to this blend of cynical contrarianism.
           | If you feel you can do better, put your money where your
           | mouth is. Lashing out at others because they went to the
           | trouble of sharing something they did is absurd and creates
           | no value.
           | 
           | Also, maintaining a blog doesn't make anyone an expert, but
           | not maintaining a blog doesn't mean you are suddenly more
           | competent than those who do.
        
             | whateveracct wrote:
             | just an observation :)
        
         | motorest wrote:
         | > He even admits Postgres maxxed out its 2 cores, but Redis was
         | bottlenecked by the HTTP server.
         | 
         | What exactly is your point? That you can further optimize
         | either option? Well yes, that comes as no surprise. I mean, the
         | latencies alone are in the range of some transcontinental
         | requests. Were you surprised that Redis outperformed Postgres?
         | I hardly think so.
         | 
         | So what's the problem?
         | 
         | The main point that's proven is that there are indeed
         | diminishing returns in terms of performance. For applications
         | where you can afford an extra 20ms when hitting a cache,
         | caching using a persistent database is an option. For some
         | people, it seems this fact was very surprising. That's food for
         | thought, isn't it?
        
           | hvb2 wrote:
           | I've done this many times in AWS leveraging dynamodb.
           | 
           | Comes with ttl support (which isn't precise so you still need
           | to check expiration on read), and can support long TTLs as
           | there's essentially no limit to the storage.
           | 
           | All of this at a fraction of the cost of HA redis. Only if
           | you need that last millisecond of performance and have done
           | all other optimizations should one consider redis, imho.
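Since DynamoDB's TTL sweeper only deletes expired items eventually, the read path still needs its own check, as noted above. A minimal sketch of that guard (plain Python; the boto3 plumbing is omitted, and an `expires_at` epoch-seconds attribute is an assumption about the item layout):

```python
import time

def get_if_fresh(item, now=None):
    """Return a cached item only if its TTL has not passed.

    DynamoDB's TTL attribute is epoch seconds; expired items can
    linger until the background sweeper deletes them, so a stale
    item must be treated as a cache miss.
    """
    if item is None:
        return None
    now = time.time() if now is None else now
    if item.get("expires_at", 0) <= now:
        return None  # expired but not yet swept: miss
    return item
```

Usage would wrap the raw read, e.g. `get_if_fresh(table.get_item(Key=...).get("Item"))`, falling back to the source of truth on a miss.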
        
             | motorest wrote:
             | > I've done this many times in AWS leveraging dynamodb.
             | 
             | Exactly. I think the nosql offerings from any cloud
             | provider already support both TTL and conditional requests
             | out-of-the-box, and the performance of basic key-value
             | CRUD operations is often <10ms.
             | 
             | I've seen some benchmarks advertise memory cache services
             | as having latencies around 1ms. Yeah, this would mean the
             | latency of a database is 10 times higher. But relative
             | numbers mean nothing. What matters is absolute numbers,
             | as they are the ones that drive tradeoff analysis. Does a
             | feature afford an extra 10ms in latency, and is that
             | performance improvement worth paying a premium?
        
             | re-thc wrote:
             | > All of this at a fraction of the cost of HA redis
             | 
             | This depends on your scale. Dynamodb is pay per request and
             | the scaling isn't as smooth. At certain scales Redis is
             | cheaper.
             | 
             | Then if you don't have high demand maybe it's ok without HA
             | for Redis and it can still be cheaper.
        
               | motorest wrote:
               | > At certain scales Redis is cheaper.
               | 
               | Can you specify in which scenario you think Redis is
               | cheaper than caching things in, say, dynamodb.
        
               | odie5533 wrote:
               | High read/write and low-ish size. Also it's faster.
        
               | motorest wrote:
               | > High read/write and low-ish size. Also it's faster
               | 
               | You posted a vague and meaningless assertion. If you do
               | not have latency numbers and cost differences, you have
               | absolutely nothing to show for it, and you failed to
               | provide any rationale justifying whether any cache is
               | required at all.
        
               | odie5533 wrote:
               | At 10k RPS you'll see a significant cost savings with
               | Redis over DynamoDB.
               | 
               | ElastiCache Serverless (Redis/Memcached): Typical latency
               | is 300-500 microseconds (sub-millisecond response)
               | 
               | DynamoDB On-Demand: Typical latency is single-digit
               | milliseconds (usually between 1-10 milliseconds for
               | standard requests)
        
               | motorest wrote:
               | > At 10k RPS you'll see a significant cost savings with
               | Redis over DynamoDB.
               | 
               | You need to be more specific than that. Depending on your
               | read/write patterns and how much memory you need to
               | allocate to Redis, back of the napkin calculations still
               | point to the fact that Redis can still cost >$1k/month
               | more than DynamoDB.
               | 
               | Did you actually do the math on what it costs to run
               | Redis?
        
               | hvb2 wrote:
               | > At 10k RPS
               | 
               | You would've used local memory first. At which point I
               | cannot see getting to those request levels anymore
               | 
               | > ElastiCache Serverless (Redis/Memcached): Typical
               | latency is 300-500 microseconds (sub-millisecond
               | response)
               | 
               | Sure
               | 
               | > DynamoDB On-Demand: Typical latency is single-digit
               | milliseconds (usually between 1-10 milliseconds for
               | standard requests)
               | 
               | I know of very few use cases where that difference is
               | meaningful, unless you have to do this many times
               | sequentially, in which case optimizing that would be much
               | more interesting than a single read being .5 ms versus
               | the typical 3 to 4 ms for dynamo (that last number is
               | based on experience).
        
               | hvb2 wrote:
               | You would need to get to insane read counts pretty much
               | 24/7 for this to work out.
               | 
               | For HA redis you need at least 6 instances, 2 regions * 3
               | AZs. And you're paying for all of that 24/7.
               | 
               | And if you truly have 24/7 use then just 2 regions won't
               | make sense as the latency to get to those regions from
               | the other side of the globe easily removes any caching
               | benefit.
        
               | ahoka wrote:
               | A 6 node cache and caching in DynamoDB, what the hell
                | happened to the industry? Or do people just call every
                | kind of non-business-object persistence a cache now?
        
               | hvb2 wrote:
               | I don't understand your comment.
               | 
               | If you're given the requirement of highly available, how
               | do you not end up with at least 3 nodes? I wouldn't
               | consider a single region to be HA but I could see that
               | argument as being paranoid.
               | 
                | A cache is just a store for things that expire after a
                | while and that takes load off your persistent store.
                | It's inherently eventually consistent and supposed to
                | help you scale reads. Whatever you use for storage is
                | irrelevant to the concept of offloading reads.
        
               | odie5533 wrote:
               | It's $9/mo for 100 MB of ElastiCache Serverless which is
               | HA.
               | 
               | It's $15/mo for 2x cache.t4g.micro nodes for ElastiCache
               | Valkey with multi-az HA and a 1-year commitment. This
               | gives you about 400 MB.
               | 
                | It very much depends on your use case, though; if you
                | need multiple regions then I think DynamoDB might be
                | better.
               | 
               | I prefer Redis over DynamoDB usually because it's a
               | widely supported standard.
        
               | motorest wrote:
               | > It's $9/mo for 100 MB of ElastiCache Serverless which
               | is HA.
               | 
               | You need to be more specific with your scenario. Having
               | to cache 100MB of anything is hardly a scenario that
               | involves introducing a memory cache service such as
               | Redis. This is well within the territory of just storing
               | data in a dictionary. Whatever is driving the requirement
               | for Redis in your scenario, performance and memory
               | clearly isn't it.
        
         | lelanthran wrote:
         | I'm not seeing your point. This wouldn't get an F, purely
         | because all the parameters are documented.
         | 
         | Conclusions aren't incorrect either, so what's the problem?
        
           | m000 wrote:
           | The use case is not representative of a real-life scenario,
            | so the value of the presented results is minimal.
           | 
           | A takeaway could be that you can dedicate a postgres instance
           | for caching and have acceptable results. But who does that?
           | Even for a relatively simple intranet app, your #1 cost when
           | deploying in Google Cloud would probably be running Postgres.
           | Redis OTOH is dirt cheap.
        
             | lelanthran wrote:
             | > The use case is not representative of a real-life
             | scenario, so the value of the presented results are
             | minimal.
             | 
             | Maybe I'm reading the article wrong, but it is
              | representative of any application that uses a PostgreSQL
             | server for data, correct?
             | 
             | In what way is that not a real-life scenario? I've deployed
             | Single monolith + PostgreSQL to about 8 different clients
             | in the last 2.5 years. It's my largest source of income.
        
               | m000 wrote:
               | When you run a relational database, you typically do it
               | for the joins, aggregations, subqueries, etc. So a real-
               | life scenario would include some application actually
               | putting some stress on postgres.
               | 
                | If you don't mind overprovisioning your postgres, yes I
                | guess the presented benchmarks are kind of
                | representative. But they also don't tell you anything
                | you didn't already know before reading the article.
        
               | lelanthran wrote:
               | > If your don't mind overprovisioning your postgres
               | 
               | Why would I mind it? I'm not using overpriced hosted
               | PostgreSQL, after all.
        
               | sherburt3 wrote:
               | My stance has always been stick to 1 database for as long
               | as humanly possible because having 2 databases is 1000x
               | harder.
        
               | Implicated wrote:
               | > I've deployed Single monolith + PostgreSQL to about 8
               | different clients in the last 2.5 years. It's my largest
               | source of income.
               | 
               | And... do you do that with the default configuration?
        
               | lelanthran wrote:
               | > And... do you do that with the default configuration?
               | 
               | Yes. Internal apps/LoB apps for a large company might
                | have, at most, 5k users. PostgreSQL seems to manage it
               | fine, none of my metrics are showing high latencies even
               | when all employees log on in the morning during the same
               | 30m period.
        
               | Implicated wrote:
               | I'm definitely getting the wrong kind of clients.
               | 
               | Kudos to you sir. Sincerely, I'm not hating, I'm actually
               | jealous of the environment being that mellow.
        
         | ENGNR wrote:
         | There's too many hackers on hacker news!
        
         | zer00eyz wrote:
         | You might not have been here 25 years ago when the dot com
         | bubble burst.
         | 
         | A lot of us ate shit to stay in the Bay Area, to stay in
         | computing. I have stories of great engineers doing really
         | crappy jobs and "contracting" on the side.
         | 
         | I couldn't really run a 'startup' out of my house and a slice
         | of rented hosting. Hardware was expensive and nothing was easy.
         | Today I can set up a business and thrive on 1000 users at 10
         | bucks a month. That's a viable and easy-to-build business. It's
         | an achievable metric.
         | 
         | But I'm not going to make Amazon, with its infinite bill-you-
         | for-everything-at-2012-prices-so-it-can-be-profitable hosting,
         | my first choice. I'm not going to do that when I can get
         | fixed-cost hosting.
         | 
         | For me, all the interesting things going on in tech aren't
         | coming out of FB, Google and hyperscalers. They aren't AI or
         | ML. We don't need another Kubernetes or Kafka or React (no more
         | Conway's law projects). There is more interesting work going on
         | down at the bottom, in small 2 and 3 man shops solving their
         | problems on limited time and budget with creative "next step"
         | solutions. Their work is likely more applicable to most people
         | reading HN than another well-written engineering blog from
         | Cloudflare about their latest massive Rust project.
        
         | positron26 wrote:
         | A lot of great benchmarking probably dies inside internal
         | tuning. When we're lucky, we get a blog post, but if the
         | creator isn't incentivized or is even discouraged by an
         | employer from sharing the results, it will never see the light
         | of day.
        
         | oulipo2 wrote:
         | The main point was not to fully benchmark and compare both, but
         | just to get a rough sense of whether a Postgres cache was fast
         | enough to be useful in practice. The comparison with Redis was
         | more a crutch to get a sense of that than something that
         | pretends to be "rock-solid benchmarking".
        
         | KronisLV wrote:
         | I feel like the outrage is unwarranted.
         | 
         | > The way it is presented, a casual reader would think Postgres
         | is 2/3rds the performance of Redis.
         | 
         | If a reader cares about the technical choice, they'll probably
         | at least read enough to learn of the benchmarks in this popular
         | use case, or even just the conclusion:
         | 
         | > Redis is faster than postgres when it comes to caching,
         | there's no doubt about it. It conveniently comes with a bunch
         | of other useful functionality that one would expect from a
         | cache, such as TTLs. It was also bottlenecked by the hardware,
         | my service or a combination of both and could definitely show
         | better numbers. Surely, we should all use Redis for our caching
         | needs then, right? Well, I think I'll still use postgres.
         | Almost always, my projects need a database. Not having to add
         | another dependency comes with its own benefits. If I need my
         | keys to expire, I'll add a column for it, and a cron job to
         | remove those keys from the table. As far as speed goes - 7425
         | requests per second is still a lot. That's more than half a
         | billion requests per day. All on hardware that's 10 years old
         | and using laptop CPUs. Not many projects will reach this scale
         | and if they do I can just upgrade the postgres instance or if
         | need be spin up a redis then. Having an interface for your
         | cache so you can easily switch out the underlying store is
         | definitely something I'll keep doing exactly for this purpose.
         | 
         | I might take issue with the first sentence (might add "...at
         | least when it comes to my hardware and configuration."), but
         | the rest seems largely okay.
         | 
         | As a casual reader, you more or less just get:
         | 
         | * Oh hey, someone's experience and data points. I won't base
         | my entire opinion upon it, but it's cool that people are
         | sharing their experiences.
         | 
         | * If I wanted to use either, I'd probably also need to look
         | into bottlenecks, even the HTTP server, something you might
         | not look into at first!
         | 
         | * Even without putting in a lot of work into tuning, both of
         | the solutions process a lot of data and are within an order
         | of magnitude when it comes to performance.
         | 
         | * So as a casual reader, for casual use cases, it seems like
         | the answer is - just pick whatever feels the easiest.
         | 
         | If I wanted to read super serious benchmarks, I'd go looking
         | for those (which would also have so many details that they
         | would no longer be a casual read, short of just the abstract,
         | but then I'm missing out on a lot anyways), or do them myself.
         | This is more like your average pop-sci article, nothing wrong
         | with that, unless you're looking for something else.
         | 
         | Eliminating the bottlenecks would be a cool followup post
         | though!
        
         | lomase wrote:
         | This site is called Hackernews btw.
        
       | ezekiel68 wrote:
       | > Both postgres and redis are used with the out of the box
       | settings
       | 
       | Ugh. I know this gives the illusion of fairness, but it's not how
       | any self-respecting software engineer should approach benchmarks.
       | You have hardware. Perhaps you have virtualized hardware. You
       | tune to the hardware. There simply isn't another way, if you want
       | to be taken seriously.
       | 
       | Some will say that in a container-orchestrated environment,
       | tuning goes out the window since "you never know" where the
       | orchestrator will schedule the service but this is bogus. If
       | you've got time to write a basic deployment config for the
       | service on the orchestrator, you've also got time to at least
       | size the memory usage configs for PostgreSQL and/or Redis. It's
       | just that simple.
       | 
       | This is the kind of thing that is "hard and tedious" for only
       | about five minutes of LLM query or web search time and then you
       | don't need to revisit it again (unless you decide to change the
       | orchestrator deployment config to give the service more/less
       | resources). It doesn't invite controversy to right-size your
       | persistence services, especially if you are going to publish the
       | results.
        
         | wewewedxfgdf wrote:
         | Fully agree.
         | 
         | Postgres is a power tool usable for many many use cases - if
         | you want performance it must be tuned.
         | 
         | If you judge Postgres without tuning it - that's not Postgres
         | being slow, that's the developer being naive.
        
           | gopalv wrote:
           | > If you judge Postgres without tuning it - that's not
           | Postgres being slow, that's the developer being naive.
           | 
           | Didn't OP end by picking Postgres anyway?
           | 
           | It's the right answer for any developer, perhaps even more
           | so for a naive one.
           | 
           | At the end of the post it even says
           | 
           | >> Having an interface for your cache so you can easily
           | switch out the underlying store is definitely something I'll
           | keep doing
        
           | lelanthran wrote:
           | He concluded postgresql to be fast enough, so what's the
           | problem?
           | 
           | IOW, he judged it fast enough.
        
         | vidarh wrote:
         | On one hand I agree with you, but on the other hand defaults
         | matter because I regularly see systems with the default config
         | and no attempt to tune.
         | 
         | Benchmarking the defaults and benchmarking a tuned setup will
         | measure very different things, but both of them matter.
        
           | matt-p wrote:
           | IME very very few people tune the underlying host. Orgs
           | like Uber, Google or whatever do, but outside of that few
           | people know what they're really doing/care that much.
           | Easier to "increase EC2 size" or whatever.
        
           | kijin wrote:
           | Defaults have all sorts of assumptions built into them. So if
           | you compare different programs with their respective
           | defaults, you are actually comparing the assumptions that the
           | developers of those programs have in mind.
           | 
           | For example, if you keep adding data to a Redis server under
           | default config, it will eat up all of your RAM and suddenly
           | stop working. Postgres won't do the same, because its default
           | buffer size is quite small by modern standards. It will
           | happily accept INSERTs until you run out of disk, albeit more
           | slowly as your index size grows.
           | 
           | The two programs behave differently because Redis was
           | conceived as an in-memory database with optional persistence,
           | whereas Postgres puts persistence first. When you use either
           | of them with their default config, you are trusting that the
           | developers' assumptions will match your expectations. If not,
           | you're in for a nasty surprise.
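For concreteness, the Redis guard rails described above amount to two redis.conf directives that the defaults leave unset (the values here are illustrative, not recommendations):

```conf
# redis.conf -- unset by default, so Redis grows until the OS steps in
maxmemory 2gb
# evict least-recently-used keys instead of refusing writes once full
# (the default policy is noeviction)
maxmemory-policy allkeys-lru
```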
        
             | vidarh wrote:
             | Yes, all of this is fine, but none of it addresses my
             | point:
             | 
             | Enough people use the default settings that benchmarking
             | the default settings is very relevant.
             | 
             | It often isn't a good thing to rely on the defaults, but
             | it's nevertheless the case that many do.
             | 
             | (Yes, it is _also_ relevant to benchmark tuned versions, as
             | I also pointed out, my argument was against the claim that
             | it is somehow unfair not to tune)
        
         | high_na_euv wrote:
         | Disagree. The majority of software runs on defaults, so it
         | makes sense to compare them this way.
        
         | IanCal wrote:
         | I disagree. They found that Postgres, without tuning, was
         | easily fast enough on low level hardware and would come with
         | the benefit of not deploying another service, so additional
         | tuning isn't really relevant.
         | 
         | If the defaults are fine for a use case then unless I want to
         | tune it for personal interest it's either a poor use of my fun
         | time or a poor use of my clients funds.
        
           | lemagedurage wrote:
           | "If we don't need performance, we don't need caches" feels
           | like a great broader takeaway here.
        
             | motorest wrote:
             | > "If we don't need performance, we don't need caches"
             | feels like a great broader takeaway here.
             | 
             | I don't think this holds true. Caches are used for reasons
             | other than performance. For example, caches are used in
             | some scenarios for stampede protection to mitigate DoS
             | attacks.
             | 
             | Also, the impact of caches on performance is sometimes
             | negative. With distributed caching, each match and put
             | require a network request. Even when those calls don't
             | leave a data center, they do cost far more than just
             | reading a variable from memory. I already had the
             | displeasure of stumbling upon a few scenarios where cache
             | was prescribed in a cargo cult way and without any data
             | backing up the assertion, and when we took a look at traces
             | it was evident that the bottleneck was actually the cache
             | itself.
        
               | ralegh wrote:
               | DoS is a performance problem: if your server was
               | infinitely fast with infinite storage, they wouldn't be
               | an issue.
        
               | lomase wrote:
               | If my grandma had wheels it would be a car.
        
               | indymike wrote:
               | It is actually a financial problem too. Servers stop
               | working when the bill goes unpaid. Sad but true.
        
               | motorest wrote:
               | > DoS is a performance problem
               | 
               | Not really. Running out of computational resources to
               | fulfill requests is not a performance issue. Think of
               | things such as exhausting a connection pool. More often
               | than not, some components of a system can't scale
               | horizontally.
        
             | IanCal wrote:
             | A cache being fast enough doesn't mean no caching is
             | relevant - I'm not sure why you'd equate the two.
        
             | hobs wrote:
             | I see people downvoting this. Anyone who disagrees with
             | this, we have YAGNI for a reason - if someone said to me my
             | performance was fine and they added caches, I would look at
             | them with a big hairy eyeball because we already know cache
             | invalidation is a PITA, that correctness issues are easy to
             | create, and now you have the performance of two different
             | systems to manage.
             | 
             | Amazon actually moved away from caches for some parts of
             | its system because consistent behavior is a feature,
             | because what happens if your cache has problems and the
             | interaction between that and your normal thing is slow?
             | What if your cache has some bugs or edge case behavior? If
             | you don't need it you are just doing a bunch of extra work
             | to make sure things are in sync.
        
             | indymike wrote:
             | Sometimes, a cache is all about reducing expense: i.e.,
             | a free cache query vs an expensive API query.
        
               | amluto wrote:
               | Sometimes people host software on a server they own or
               | rent, the server is plenty fast, and it costs literally
               | nothing to issue those queries at the scale on which
               | they're needed.
        
               | indymike wrote:
               | Yes, that is true, but the original poster said getting
               | rid of caches was always a good idea, when in reality the
               | answer (as usual with engineering) is "it depends."
        
           | perrygeo wrote:
            | The default shared_buffers is 128MiB, not even 1% of the
            | RAM in a typical machine today. A benchmark run with these
            | settings is
           | effectively crippling your hardware by making sure 99% of
           | your available memory is ignored by postgres. It's an invalid
           | benchmark, unless redis is similarly crippled.
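Widening that 1% takes only a few lines in postgresql.conf. The numbers below are a common rule-of-thumb sizing for a dedicated 16 GB machine, not values from the post:

```conf
# postgresql.conf -- the stock default is shared_buffers = 128MB
shared_buffers = 4GB          # ~25% of RAM is the usual starting point
effective_cache_size = 12GB   # planner hint: ~75% of RAM
work_mem = 32MB               # per-sort/per-hash budget, so keep modest
```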
        
             | igneo676 wrote:
             | > If the defaults are fine for a use case then unless I
             | want to tune it for personal interest it's either a poor
             | use of my fun time or a poor use of my clients funds.
             | 
             | It doesn't matter if you've crippled the benchmark if the
             | performance of both options still exceeds your
             | expectations. Not all of us are trying to eke out every
             | drop of performance.
             | 
             | And, well, if you are then you can ignore the entire post
             | because Redis offers better perf than postgres and you'd
             | use that. It's that simple.
        
           | re-thc wrote:
           | > They found that Postgres, without tuning, was easily fast
           | enough on low level hardware
           | 
           | Is that production? When you bucket it as "low level" it
           | sounds like a base case, but it really isn't.
           | 
           | In production you don't have local storage, RAM being used
           | for all kinds of other things, your CPU only available in
           | small slices, network effects and many others.
           | 
           | > If the defaults are fine for a use case
           | 
           | Which I hope isn't the developer's edition of "it works on
           | my machine".
        
         | Timshel wrote:
         | > for only about five minutes of LLM query or web search
         | 
         | I think I have more trust in the PG defaults than in the output
         | of a LLM or copy pasting some configuration I might not really
         | understand ...
        
           | rollcat wrote:
           | It's crazy how wildly inaccurate "top-of-the-list" LLMs are
           | for straightforward yet slightly nuanced inquiries.
           | 
           | I've asked ChatGPT to summarize Go build constraints,
           | especially in the context of CPU microarchitectures (e.g.
           | mapping "amd64.v2" to GOARCH=amd64 GOAMD64=v2). It repeatedly
           | smashed its head on GORISCV64, claiming all sorts of nonsense
           | such as v1, v2; then G, IMAFD, Zicsr; only arriving at
           | rva20u64 et al under hand-holding. Similar nonsense for
           | GOARM64 and GOWASM. It was all right there in e.g. the docs
           | for [cmd/go].
           | 
           | This is the future of computer engineering. Brace yourselves.
        
             | yomismoaqui wrote:
             | If you are going to ask ChatGPT some specific tidbit it's
             | better to force it to search on the web.
             | 
             | Remember, an LLM is a JPG of all the text of the internet.
        
               | dgfitz wrote:
               | Wait, what?
               | 
               | Isn't that the whole point, to ask it specific tidbits of
               | information? Are we to ask it large, generic
               | pontifications and claim success when we get large,
               | generic pontifications back?
               | 
               | The narrative around these things changes weekly.
        
               | wredcoll wrote:
               | I mean, like most tools they work when they work and
               | don't when they fail. Sometimes I can use an llm to find
               | a specific datum and sometimes I use google and sometimes
               | I use bing.
               | 
               | You might think of it as a cache, worth checking first
               | for speed reasons.
               | 
               | The big downside is not that they sometimes fail, it's
               | that they give zero indication when they do.
        
               | simonw wrote:
               | ChatGPT is exceptionally good at using search now, but
               | that's new this year, as of o3 and then GPT-5. I didn't
               | trust GPT-4o and earlier to use the search tool well
               | enough to be useful.
               | 
               | You can see if it's used search in the interface, which
               | helps evaluate how likely it is to get the right answer.
        
               | dpkirchner wrote:
               | I use it as a tool that understands natural language and
               | the context of the environments I work in well enough to
               | get by, while guiding it to use search or just facts I
               | know if I want more one-shot accuracy. Just like I would
               | if I were communicating with a newbie who has their own
               | preconceived notions.
        
             | simonw wrote:
             | Did you try pasting in the docs for cmd/go and asking
             | again?
        
               | Implicated wrote:
               | I mean - this is the entire problem right here.
               | 
               | Don't ask LLMs that are trained on a whole bunch of
               | different versions of things with different flags and
               | options and parameters where a bunch of people who have
               | no idea what they're doing have asked and answered
               | stackoverflow questions that are likely out of date or
               | wrong in the first place how to do things with that thing
               | without providing the docs for the version you're working
               | with. _Especially_ if it's the newest version,
               | regardless of whether its cutoff date was after that
               | version was released -
               | you have no way to know if it was _included_. (Especially
               | about something related to a programming language with
               | ~2% market share)
               | 
               | The contexts are so big now - feed it the docs. Just copy
               | paste the whole damn thing into it when you prompt it.
        
             | pbronez wrote:
             | How was the LLM accessing the docs? I'm not sure what the
             | best pattern is for this.
             | 
             | You can put the relevant docs in your prompt, add them to a
             | workspace/project, deploy a docs-focused MCP server, or
             | even fine-tune a model for a specific tool or ecosystem.
        
               | Implicated wrote:
               | > I'm not sure what the best pattern is for this.
               | 
               | > You can put the relevant docs in your prompt
               | 
               | I've done a lot of experimenting with these various
               | options for how to get the LLM to reference docs. IMO
               | it's almost always best to include in prompt where
               | appropriate.
               | 
               | For a UI lib that I use that's rather new, specifically
               | there's a new version that the LLMs aren't aware of yet,
               | I had the LLM write me a quick python script that just
               | crawls the docs site for the lib and feeds the entire
               | page content back into itself with a prompt describing
               | what it's supposed to do (basically telling it to
               | generate a .md document with the specifics about that
               | _thing_, whether it's a component or whatever, ie:
               | properties, variants, etc in an extremely brief manner)
               | as well as build an 'index.md' that includes a short
               | paragraph about what the library is and a list of each
               | component/page document that is generated. So in about 60
               | seconds it spits out a directory full of .md files and I
               | then tell my project-specific LLM (ie: Claude Code or
               | Opencode within the project) to review those files with
               | the intention of updating the CLAUDE.md in the project to
               | instruct that any time we're building UI elements we
               | should refer to the index.md for the library to
               | understand what components are available and when
               | appropriate to use one of them we _must_ review the
               | correlating document first.
               | 
               | Works very very very well. Much better than an MCP server
               | specifically built for that same lib. (Huge waste of
               | tokens, LLM doesn't always use it, etc) Well enough that
               | I just copy/paste this directory of docs into my active
               | projects using that library - if I wasn't lazy I'd
               | package it up but too busy building stuff.
        
           | simonw wrote:
           | So run the LLM in an agent loop: give it a benchmarking tool,
           | let it edit the configuration and tell it to tweak the
           | settings and measure and see how much of a performance
           | improvement it can get.
           | 
           | That's what you'd do by hand if you were optimizing, so save
           | some time and point Claude Code or Codex CLI or GitHub
           | Copilot at it and see what happens.
        
             | lomase wrote:
             | How much would that cost?
        
               | carlhjerpe wrote:
               | They charge per token...
        
               | danielbln wrote:
               | Not if you're on a subscription they don't.
        
               | carlhjerpe wrote:
               | So the hidden usage caps don't equate to token usage?
               | 
               | They charge per token, everyone charges per token.
        
               | simonw wrote:
               | Probably about 10 cents, if you're even paying for
               | tokens. Plenty of these tools have generous free tiers or
               | allowances included in your subscription.
               | 
               | I run a pricing calculator here - for 50,000 input
               | tokens, 5,000 output tokens (which I estimate would be
               | about right for a PostgreSQL optimization loop) GPT-5
               | would cost 11.25 cents:
               | https://www.llm-prices.com/#it=50000&ot=5000&ic=1.25&oc=10
               | 
               | I use Codex CLI with my $20/month ChatGPT account and so
               | far I've not hit the limit with it despite running things
               | like this multiple times a day.
        
               | lomase wrote:
               | If optimizing a PostgreSQL server costs 11.25 cents
               | and everybody can do it because of AI, how much are you
               | going to bill your customer? 20 cents?
               | 
               | If that is true, in some months there will be no DBA
               | jobs.
               | 
               | Funny that at the same time SQL is one of the most
               | requested languages in job postings.
        
               | simonw wrote:
               | Knowing what "optimizing a PostgreSQL server's
               | configuration" even means continues to be high value
               | technical knowledge.
               | 
               | Knowing how to "run an agentic loop to optimize the
               | config file" is meaningless techno-jabber to 99.99% of
               | the world's population.
               | 
               | I am entirely unconcerned for my future career prospects.
        
               | lomase wrote:
               | So your big advantage is that nobody has launched
               | agentic tools for the end user yet?
        
               | simonw wrote:
               | Anyone can learn to unblock a sink by watching YouTube
               | videos these days, and yet most people still hire a
               | professional to do it for them.
               | 
               | I don't think end users want to "optimize their
               | PostgreSQL servers" even if they DID know that's a thing
               | they can do. They want to hire experts who know how to
               | make "that tech stuff" work.
        
               | lomase wrote:
               | I agree that people like to hire professionals. That
               | is why I hire db experts to work on our infra, not
               | prompt engineers.
               | 
               | Saying that anybody can learn to unblock a sink by
               | watching YouTube is your typical HN mentality of
               | stating opinions as facts.
        
               | cdelsolar wrote:
               | you can become a db expert with the right prompts
        
               | lomase wrote:
               | You can learn how to pour a drink in 1 minute, that is
               | why most bartenders earn minimum wage.
               | 
               | You can't become a db expert with a prompt.
               | 
               | I hope you make a lot of money with your lies and good
               | luck.
        
               | simonw wrote:
               | You can become a DB expert by reading books, forums and
               | practicing hard.
               | 
               | These days you can replace those books and forums with a
               | top tier LLM, but you still need to put in the practice
               | yourself. Even with AI assistance that's still a _lot_ of
               | work.
        
               | lomase wrote:
               | You could not replace good books with the Internet and
               | you can't replace good books with any LLM.
               | 
               | You can replace books with your own time and research.
               | 
               | Again, making statements that are just not true.
               | Typical HN behavior.
        
               | simonw wrote:
               | "Saying that anybody can learn to unblock a sink by
               | watching YouTube is your typical HN mentality of
               | stating opinions as facts."
               | 
               | I don't understand what you mean. Are you saying that
               | it's not true that anyone could learn to unblock a sink
               | by watching YouTube videos?
        
               | lomase wrote:
               | Yes, I do think not all people could fix it with
               | YouTube. My grandma couldn't, for example. I had a
               | neighbor come for help with something like that too.
               | 
               | It's not that hard to understand, mate. Maybe put my
               | comment in the LLM so you can get it.
               | 
               | What is your point again?
        
               | simonw wrote:
               | My analogy holds up. Anyone could type "optimize my
               | PostgreSQL database by editing the configuration file"
               | into an LLM, but most people won't - same as most people
               | won't watch YouTube to figure out how to unblock a sink.
               | 
               | If you don't like the sink analogy what analogy would you
               | use instead for this? I'm confident there's a "people
               | could learn X from YouTube but chose to pay someone else
               | instead" that's more effective than the sink one.
        
               | simonw wrote:
               | Personally I'd like to hire a DB expert who also knows
               | how to drive an agentic coding system to help them
               | accelerate their work. AI tools, used correctly, act as
               | an amplifier of existing knowledge and experience.
        
               | lomase wrote:
               | As far as I know nobody has really came up with proof
               | that LLMs act as an amplifier of existing knoledge.
               | 
               | It does make people FEEL more productive.
        
               | simonw wrote:
               | What would a "proof" of that even look like?
               | 
               | There are thousands (probably millions) of us walking
               | around with anecdotal personal evidence at this point.
        
               | lomase wrote:
               | Some years ago when everybody here gave their anecdotal
               | evidence about how Bitcoin and Blockchain were the future
               | and they used it every day. You were a fool if you did
               | not jump on the bandwagon.
               | 
               | If the personal opinions on this site were true, half
               | of the code in the world would be functional, Lisp
               | would be one of the most used languages, and Microsoft
               | would not have bought Dropbox.
               | 
               | I really think HN hive mind opinions mean nothing. Too
               | much money here to be real.
        
             | IgorPartola wrote:
             | "We will take all the strokes off Jerry's game when we kill
             | him." - the LLM, probably.
             | 
             | Just like Mr Meeseeks, it's only a matter of time before it
             | realizes that deleting all the data will make the DB
             | lightning fast.
        
               | simonw wrote:
               | Exactly true, which is why you need to run your agent
               | against a safe environment. That's a skill that's worth
               | developing.
        
           | Implicated wrote:
           | > copy pasting some configuration I might not really
           | understand
           | 
            | Uh, yea... why _would_ you? Do you do that for
            | configurations you found that weren't from LLMs? I didn't
            | think so.
           | 
           | I see takes like this all the time and I'm really just mind-
           | boggled by it.
           | 
           | There are more than just the "prompt it and use what it gives
           | me" use cases with the LLMs. You don't have to be that rigid.
           | They're incredible learning and teaching tools. I'd argue
           | that the single best use case for these things is as a
           | research and learning tool for those who are curious.
           | 
           | Quite often I will query Claude about things I don't know and
           | it will tell me things. Then I will dig deeper into those
           | things myself. Then I will query further. Then I will ask it
            | details where I'm curious. I won't blindly follow or trust
            | it, just as I wouldn't a professor or anyone or any
            | _thing_ else, for that matter. Just like when querying a
            | human or the internet in general for information, I'll
            | verify.
           | 
            | You don't have to trust its code, or its configurations.
           | But you can sure learn a lot from them, particularly when you
           | know how to ask the right questions. Which, hold onto your
           | chairs, only takes some experience and language skills.
        
             | Timshel wrote:
             | My comment is mainly in opposition to the "five minutes"
             | part from parent.
             | 
              | If you have 5 minutes then you can't, as you say:
             | 
             | > Then I will dig deeper into those things myself ...
             | 
             | So my point is I don't care if it's coming from LLM or a
             | random blog, you won't have time to know if it's really
             | working (ideally you would want to benchmark the change).
             | 
              | If you can't invest the time, it's better to stay with the
              | defaults, which in most projects the maintainers have
              | spent quite a bit of time making sensible.
        
               | Implicated wrote:
               | Yea, I guess in that case I'd say it's likely a bad move
               | in every direction if you're constrained to 5 min to
               | deploy something you don't understand.
        
           | dotancohen wrote:
           | Then either have the LLM explain the config, or go Google it.
           | LLM output is a starting point, not your final config.
        
         | oulipo2 wrote:
         | Perhaps, but in this case this shows at least that even non-
         | tuned Postgres can be used as a fast cache for many real-world
         | use-cases
        
         | conradfr wrote:
          | But why doesn't Postgres tune itself based on the system it's
          | running on, at least the basics based on available RAM & cores?
        
           | simonw wrote:
           | I've not tried it myself but I believe that's what pgtune
           | does: https://github.com/gregs1104/pgtune
        
         | otikik wrote:
         | > Ugh.
         | 
         | > if you want to be taken seriously
         | 
         | For someone so enthusiastic about giving feedback you don't
         | seem to have invested a lot of effort into figuring out how to
         | give it effectively. Your done and demeanor diminish the value
         | of your comment.
        
         | GuinansEyebrows wrote:
         | > This is the kind of thing that is "hard and tedious" for only
         | about five minutes of LLM query or web search time
         | 
         | not even! if you don't need to go super deep with tablespace
         | configs or advanced replication right away, pgtune will get you
         | to a pretty good spot in the time it takes to fill out a form.
         | 
         | https://pgtune.leopard.in.ua/
         | 
         | https://github.com/le0pard/pgtune
        
         | rich_sasha wrote:
         | Isn't tuning before hitting constraints a premature
         | optimisation? The approach of not spending time on tuning
         | settings before you have to seems sane.
         | 
         | And TFA shows you that in this world Postgres is close enough
         | to Redis.
        
         | aprdm wrote:
         | Yep. I worked in a famous-big-company that had a 15 years old
         | service that was dogslow, systemd restarts would take multiple
         | hours.
         | 
         | Everyone was talking about C++ optimizations, mutex everywhere
         | etc - which was in fact a problem.
         | 
         | However.. I seemed to be the first person to actually try to
         | debug what the database was doing, and it was going to disk all
         | the time with a very small cache.. weird..
         | 
         | I see the MySQL settings on a 1TB ram machine and they were...
         | out-of-the-box settings.
         | 
         | With small adjustments I improved the performance of this core
         | system an order of magnitude.
        
           | icedchai wrote:
           | At one startup, all I did was increase the innodb buffer pool
           | size. They were using default settings.
        
       | wewewedxfgdf wrote:
       | Given there is nothing at all said about the many config options
       | that would contribute to Postgres for this use case, we must
       | assume no configuration has been done.
       | 
       | Also, no discussion of indexes or what the data looks like, so we
       | must assume no attention has been paid to their critical factors
       | either.
       | 
       | So, another case of lies, damned lies and benchmarks.
       | 
        | It seems strange to me that people are so willing to post such
        | definitive and poorly researched/argued things - if you're
        | going to take a public position, don't you want to be obviously
        | right instead of so easy to discount?
        
       | didip wrote:
        | Is this engagement bait? Sigh... fine, I'll bite:
       | 
       | 1. The use-case is super specific to homelab where consistency
       | doesn't matter. You didn't show us the Redis persistence setup.
       | What is the persistence/durability setting? I bet you'd lose data
        | the one day you forget and flip the breaker of your homelab.
       | 
        | 2. What happens when the data is bigger than your 8GB of RAM on
       | Redis?
       | 
        | 3. You didn't show us the PG config either; it is possible to
        | just use all of your RAM for buffers and caching.
       | 
       | 4. Postgres has a lot of processes and you give it only 2 CPU?
        | Vanilla Redis is single core, so this race is rigged to begin
        | with. The UNLOGGED table evens things out a bit.
       | 
       | In general, what are you trying to achieve with this "benchmark"?
       | What outcome would you like to learn? Because this "benchmark"
       | will not tell you what you need to know in a production
       | environment.
       | 
        | Side note for other HN readers: The UNLOGGED table is actually
        | a very nifty trick for speeding up unit tests. Just perform
        | ALTER to UNLOGGED tables inside the PG that's dedicated to CI/CD:
       | ALTER TABLE my_test_table SET UNLOGGED;
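A minimal sketch of how that CI trick could be scripted from Go (the article's language). The helper name and the table list are illustrative, not from the comment; it assumes trusted, hard-coded identifiers, since the names are interpolated directly into SQL:

```go
package main

import "fmt"

// buildUnloggedStatements returns the ALTER statements that flip the given
// tables to UNLOGGED, e.g. as a setup step before a CI test run. The table
// names are assumed to be trusted identifiers, not user input.
func buildUnloggedStatements(tables []string) []string {
	stmts := make([]string, 0, len(tables))
	for _, t := range tables {
		stmts = append(stmts, fmt.Sprintf("ALTER TABLE %s SET UNLOGGED;", t))
	}
	return stmts
}

func main() {
	// Prints one ALTER statement per table; in CI these would be
	// executed against the dedicated test database.
	for _, s := range buildUnloggedStatements([]string{"users", "sessions"}) {
		fmt.Println(s)
	}
}
```

In a real test suite each statement would be executed once against the CI database before the tests start; UNLOGGED skips WAL writes, so the speedup comes at the cost of crash durability, which is irrelevant for throwaway test data.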
        
         | dizzyVik wrote:
         | 1. No persistence for redis. 2. Redis would get OOM killed. 3.
         | The default config coming with the image was used. 4. Yes, I
         | gave it 2 cpus.
         | 
          | I wanted to compare how my http server would behave if I used
          | postgres for caching and what the difference would be if I used
         | redis instead.
         | 
         | This benchmark is only here to drive the point that sometimes
         | you might not even need a dedicated kv store. Maybe using
         | postgres for this is good enough for your use case.
         | 
         | The term production environment might mean many things. Perhaps
         | you're processing hundreds of thousands of requests per second
         | then you'll definitely need a different architecture with HA,
          | scaling, dedicated shared caches etc. However, not many
          | applications reach such a point, and many end up using more
          | infrastructure than necessary to serve their consumers.
         | 
         | So I guess I'm just trying to say keep it simple.
        
           | piniondna wrote:
           | Redis is one of the simplest services I've used... we could
           | flip the script and say "for many db use cases postgresdb is
           | overkill, just use Redis... you get caching too". I'm not
           | sure exactly what this commentary adds to a real world
           | architecture discussion. The whole thing seems a little
           | sophomoric, tbh.
        
           | evanelias wrote:
           | > No persistence for redis
           | 
           | In this case, I would expect that a fairer comparison would
           | be running Postgres on tmpfs. UNLOGGED only skips WAL writes,
           | not _all_ writes; if you do a clean shutdown, your data is
            | still there. It's only lost on crash.
        
       | staplung wrote:
       | I guess my question is: why bother with a benchmark if the pick
       | is pre-ordained? Is it the case that at some point the results
       | would be so lopsided that you _would_ pick the faster solution?
       | If so, what is that threshold? I.e. when does performance trump
       | system simplicity? To me _those_ are the interesting questions.
        
         | 000ooo000 wrote:
         | >why bother with a benchmark if the pick is pre-ordained
         | 
         | Validating assumptions
         | 
         | Curiosity/learning
         | 
         | Enraging a bunch of HN readers who were apparently born with
         | deep knowledge of PG and Redis tuning
        
       | chiefmix wrote:
       | love that the author finished with:
       | 
       | "i do not care. i am not adding another dependency"
        
       | mj2718 wrote:
       | My laptop running OOTB Postgres in docker does 100k requests per
       | second with ~5ms latency. How are you getting 20x slower?
        
         | nasretdinov wrote:
         | That's what I'm struggling with too. Redis can also serve
         | roughly 500k-1m QPS using just ~4-8 cores, so on two cores it
         | should be about 100k-200k at least
        
           | mj2718 wrote:
           | Yep... this is what I expect as a baseline.
           | 
           | This is also why I rarely use redis - Postgres at 100k TPS is
           | perfectly fine for all my use cases, including high usage
           | apps.
        
         | Koffiepoeder wrote:
         | If you look at their lab [0], it seems his NAS is separate from
         | his kubernetes nodes. If he hasn't tuned his networking and NAS
         | to the maximum, network storage may in fact add a LOT of delay
         | on IOPS. Could be the difference between fractions of a
         | millisecond vs actual milliseconds. If your DB load is mostly
         | random reads this can really harm performance. Just
         | hypothesizing here though, since it is not clear whether his DB
         | storage is actually done on the NAS.
         | 
         | [0]: https://dizzy.zone/2025/03/10/State-of-my-Homelab-2025/
        
           | mj2718 wrote:
           | Sorry but if he's using a setup that's 20x worse than a
           | regular laptop then I'm not really interested in his setup.
           | 
           | To be fair, I asked the question and you found the answer -
           | lol, my bad.
           | 
            | Yes, I agree using a NAS that adds latency would reduce the
            | TPS and explain his results ("Little's law").
        
       | necovek wrote:
       | I'd be curious to see how Postgres behaves with numeric IDs
       | instead of strings (or another built-in, faster-to-index/hash
       | type).
       | 
        | Still leaves you needing to performantly hash your strings
        | into those IDs on top, but I'm mostly curious about the
        | Postgres performance plateau compared to purpose-built KV DBs.
        
       | Neikius wrote:
       | I skimmed the article, but why is everyone always going for
       | distributed cache? What is wrong with in-memory cache? Lowest
       | latency, fast, easy to implement.
       | 
       | Yeah ok, you have 30 million entries? Sure.
       | 
       | You need to sync something over multiple nodes? Not sure I would
       | call that a cache.
        
         | solatic wrote:
         | In the naive/default case, durability is more important than
         | latency. Servers crash, applications are restarted. If it takes
         | a long time to rebuild your cache, or if rebuilding it would be
         | unreliable (e.g. dependency on external APIs) then you court
         | disaster by not using a durable-first cache.
         | 
         | If you _actually_ need lower latency then great, design for it.
         | But it should be a conscious decision, not a default one.
        
         | ahoka wrote:
          | Consistency could be one reason, but I think the best caching
          | strategy is not to need a cache. Adding a (not just
          | distributed) cache early can hide performance issues that
          | could be fixed instead of worked around, while introducing
          | complexity, possible data consistency issues, and,
          | paradoxically, performance degradation.
        
         | lemagedurage wrote:
         | This is modern backend development. The server scales
         | horizontally by default, nodes can be removed and added without
         | disrupting service. With redis as cache, we can do e.g. rate
         | limiting fast without tying a connection to a node, but also
         | scale and deploy without impacting availability.
        
         | kiney wrote:
         | because PHP scripts started fresh for each request...
        
         | bearjaws wrote:
         | A lot of people are using NodeJS and storing a massive in-
         | memory object means longer GC time.
        
         | jerf wrote:
         | I've got a couple of systems that 10-15 years ago needed
         | something like Redis and multiple nodes distributing them but
         | are today just a single node with an in-memory cache that is
         | really just a hash keyed by a string. They're running on a
         | hot/cold spare system. If one of them dies, it takes maybe 30
          | seconds to fully reconstruct the cache, which these systems
         | happen to be capable of doing in advance, they don't need to
         | wait for the requests to come in.
         | 
         | One thing that I think has gotten lost in the "I need redundant
         | redundancy for my redundantly redundant replicas of my
         | redundantly-distributed resources" world is that you really
         | only need all that for super-real-time systems. Which a lot of
         | things are, such as, all user-facing websites need to be up the
         | moment the user hits them and not 30 seconds later. But when
          | you _don't_ have that constraint, if things can take an extra
         | few minutes or drop some requests and it's not a big deal, you
         | can get away with something a _lot_ cheaper, made even more
         | cheap by the fact that running things on a single node gets you
         | access to a lot of performance you simply can not have in a
         | distributed system because _nothing_ is as fast as the RAM bus
         | being accessed by a single OS process. And sometimes you have
         | enough flexibility to design your system to be that way in the
         | first place instead of accidentally wiring it up to be
         | dependent on complicated redundancy schemes.
         | 
         | (Next up after that, if that isn't enough, is the system where
         | you have redundant nodes but you make sure they don't need to
         | cross-talk at all with something like Redis. Observation: If
         | you have two nodes for redundancy, and they are doing something
         | with caching, and the cached values are generally stable for
         | long periods of time, it is often not that big a deal just to
         | let each node have its own in-memory cache and if they happen
          | to recreate a value twice, let them. If you work the math out
          | carefully, depending on your cache utilization profile, you
          | often lose less than you think here. If the modal result is
          | that you never hit a given cached value again, the duplication
          | is cheap, especially if the values you do hit get hit a lot;
          | and if on average you get cache hits all the time, the
          | amortized cost of the second computation is nearly nothing.
          | It's only in the "almost always hit them 2 or 3 times" case
          | that this incurs real extra expense, and that's a very, very
          | specific place in the caching landscape. On top of that,
          | in-process caching is faster on its own terms, which
          | mitigates the problem, you can set it up with no
          | serialization costs at all, and the architectural simplicity
          | can be very beneficial. No, by no
         | means does this work with every system, and it is helpful to
         | scan out into the future to be sure you probably won't ever
         | need to upgrade to a more complicated setup, but there's a lot
         | of redundantly redundant systems that really don't need to be
         | written with such complication because this would have been
         | fine for them.)
        
       | gethly wrote:
        | What is all the fuss about? In the ancient times, you put your
        | entry into the database (ANY database), either with a TTL or
        | cache tags, and have a cron job that periodically flushes
        | expired entries from the table.
       | Then in your application you check if you have the entry cached
       | in memory, if not, you check the db entry, if it is not there you
       | build it from scratch and cache it.
       | 
       | Why do people complicate things? We've solved caching ages ago.
        
       | nessex wrote:
       | I've got a similar setup with a k3s homelab and a bunch of small
       | projects that need basic data storage and caching. One thing
       | worth considering is that if someone wants to run both redis and
       | postgres, they need to allocate enough memory for both including
       | enough overhead that they don't suddenly OOM.
       | 
       | In that sense, seeing if the latency impact of postgres is
       | tolerable is pretty reasonable. You may be able to get away with
       | postgres putting things on disk (yes, redis can too), and only
       | paying the overhead cost of allocating sufficient excess RAM to
       | one pod rather than two.
       | 
       | But if making tradeoffs like that, for a low-traffic service in a
       | small homelab, I do wonder if you even need a remote cache. It's
       | always worth considering if you can just have the web server keep
        | its own cache in-memory or even on-disk. If using go like in the
       | article, you'd likely only need a map and a mutex. That'd be an
       | order of magnitude faster, and be even less to manage... Of
       | course it's not persistent, but then neither was Redis (excl.
       | across web server restarts).
        
       | phendrenad2 wrote:
       | I'm glad someone did this analysis. I've also been tempted to
       | remove complexity by removing Redis from my stack. But there's a
       | decent speedup from using Redis, so I'll keep it.
       | 
       | (And to the people complaining about this benchmark not being
       | extremely scientifically rigorous: Nobody cares.)
        
       | k9294 wrote:
       | > For postgres, the bottleneck was the CPU on the postgres side.
       | It consistently maxed out the 2 cores dedicated to it, while also
       | using ~5000MiB of RAM.
       | 
       | Comparing throttled pg vs non-throttled redis is not a benchmark.
       | 
       | Of course when pg is throttled you will see bad results and high
       | latencies.
       | 
       | A correct performance benchmark would be to give all components
       | unlimited resources and measure performance and how much they use
       | without saturation. In this case, PG might use 3-4 CPUs and 8GB
       | of RAM but have comparable latencies and throughput, which is the
       | main idea behind the notion "pg for everything".
       | 
       | In a real-world situation, when I see a problem with saturated
       | CPU, I add one more CPU. For a service with 10k req/sec, it's
       | most likely a negligible price.
        
         | Timshel wrote:
         | Since it's in the context of a homelab you usually don't change
         | your hardware for one application, using the same resources in
          | both tests seems logical (could argue that the test should be pg
         | vs redis + pg).
         | 
         | And their point is that it's good enough as is.
        
           | m000 wrote:
           | It's a homelab. If it works, it works. And we already knew
           | that it would work without reading TFA. No new insights
           | whatsoever. So what's the point of sharing or discussing?
        
           | k9294 wrote:
           | In a home lab you can go the other way around and compare the
           | number of requests before saturation.
           | 
           | e.g. 4k/sec saturates PG CPU to 95%, you get only 20% on
           | redis at this point. Now you can compare latencies and
           | throughput per $.
           | 
           | In the article PG latencies are misleading.
        
       | est wrote:
       | reminds me of Handlersocket in MySQL
        
       | truth_seeker wrote:
       | PostgreSQL Tuning further for this use case
       | 
        | - Reduce the page size from 8KB to 4KB, great for write-heavy
        | operations and indexed reads. Requires compiling from source
        | with those flags; it can't be configured after installation.
        | 
        | - Increase the buffer cache
        | 
        | - Table partitioning for the UNLOGGED table the author is using
        | 
        | - At the connection/session level, lower the transaction
        | isolation level from SERIALIZABLE
        | 
        | - The new UUIDv7 in PG 18 might also help as the primary
        | indexed key type, as it also supports range queries on timestamp
        
       | finalhacker wrote:
        | postgres' message protocol is much more complex than redis'. i
        | think it's the bottleneck behind such a difference.
        
       | smacker wrote:
       | I like using postgres for everything, it lets me simplify
       | infrastructure. But using it as a cache is a bit concerning in
       | terms of reliability, in my opinion.
       | 
       | I have witnessed many incidents when DB was considerably
       | degrading. However, thanks to the cache in redis/memcache, a
       | large part of the requests could still be processed with minimal
       | increase in latency. If I were serving cache from the same DB
       | instance, I guess, it would cause cache degradation too when
       | there are any problems with the DB.
        
         | aiisthefiture wrote:
         | Select by id is fast. If you're using it as a cache and not
         | doing select by id then it's not a cache.
        
           | smacker wrote:
           | absolutely. But when PG is running out of open connections or
           | has already consumed all available CPU even the simplest
           | query will struggle.
        
             | motorest wrote:
             | > But when PG is running out of open connections or has
             | already consumed all available CPU even the simplest query
             | will struggle.
             | 
             | I don't think it is reasonable to assume or even believe
             | that connection exhaustion is an issue specific to
             | Postgres. If you take the time to learn about the topic,
             | you won't need to spend too much time before stumbling upon
             | Redis and connection pool exhaustion issues.
        
             | IsTom wrote:
             | You can have a separate connection pool for 'cache'
             | requests. You shouldn't have too many PG connections open
             | anyway, on the order of O(num of CPUs).
        
         | motorest wrote:
         | > But using it as a cache is a bit concerning in terms of
         | reliability, in my opinion.
         | 
         | This was the very first time I heard anyone even suggest that
         | storing data in Postgres was a concern in terms of reliability,
         | and I doubt you are the only person in the whole world who has
         | access to critical insight onto the matter.
         | 
         | Is it possible that your prior beliefs are unsound and
         | unsubstantiated?
         | 
         | > I have witnessed many incidents when DB was considerably
         | degrading.
         | 
         | This vague anecdote is meaningless. Do you actually have any
         | concrete scenario in mind? Because anyone can make any system
         | "considerably degrading", even Redis, if they make enough
         | mistakes.
        
           | baobun wrote:
           | No need to be so combative. Take a chill pill, zoom out and
           | look at the reliability of the entire system and its services
           | rather than the db in isolation. If postgres has issues, it
           | can affect the reliability of the service further if it's
           | also running the cache.
           | 
           | Besides, having the cache on separate hardware can reduce the
           | impact on the db on spikes, which can also factor into
           | reliability.
           | 
            | Having more headroom for memory and CPU can mean that you
            | never reach the load where it turns into service degradation
            | on the same hw.
           | 
           | Obviously a purpose-built tool can perform better for a
           | specific use-case than the swiss army knife. Which is not to
           | diss on the latter.
        
             | motorest wrote:
             | > No need to be so combative.
             | 
             | You're confusing being "combative" with asking you to
             | substantiate your extraordinary claims. You opted to make
             | some outlandish and very broad sweeping statements, and
             | when asked to provide any degree of substance, you resorted
             | to talk about "chill pills"? What does that say about the
              | substance of your claims?
             | 
             | > If postgres has issues, it can affect the reliability of
             | the service further if it's also running the cache.
             | 
             | That assertion is meaningless, isn't it? I mean, isn't that
             | the basis of any distributed systems analysis? That if a
             | component has issues, it can affect the reliability of the
             | whole system? Whether the component in question is Redis,
             | Postgres, doesn't that always hold true?
             | 
             | > Besides, having the cache on separate hardware can reduce
             | the impact on the db on spikes, which can also factor into
             | reliability.
             | 
             | Again, isn't this assertion pointless? I mean, it holds
             | true whether it's Postgres and Redis, doesn't it?
             | 
             | > Having more headroom for memory and CPU can mean that you
             | never reach the load where ot turns to service degradation
             | on the same hw.
             | 
             | Again, this claim is not specific to any specific service.
             | It's meaningless to make this sort of claim to single out
             | either Redis or Postgres.
             | 
             | > Obviously a purpose-built tool can perform better for a
             | specific use-case than the swiss army knife. Which is not
             | to diss on the latter.
             | 
             | Is it obvious, though? There is far more to life than
             | synthetic benchmarks. In fact, the whole point of this sort
             | of comparison is that for some scenarios a dedicated memory
             | cache does not offer any tangible advantage over just using
             | a vanilla RDBMS.
             | 
             | This reads as some naive auto enthusiasts claiming that a
             | Formula 1 car is obviously better than a Volkswagen Golf
             | because they read somewhere they go way faster, but in
             | reality what they use the car for is to drive to the
             | supermarket.
        
               | scns wrote:
               | > You opted to make some outlandish and very broad
               | sweeping statements, and when asked to provide any degree
               | of substance, you resorted to talk about "chill pills"?
               | 
               | You are not answering to OP here. Maybe it's time for a
               | little reflection?
        
               | baobun wrote:
               | > You're confusing being "combative" with asking you to
               | substantiate your extraordinary claims. You opted to make
               | some outlandish and very broad sweeping statements, and
               | when asked to provide any degree of substance, you
               | resorted to talk about "chill pills"?
               | 
               | what are these "extraordinary claims" you speak of? I
               | believe it's you who are confusing me with someone else.
               | I am not GP. You appear to be fighting windmills.
        
               | motorest wrote:
               | > what are these "extraordinary claims" you speak of?
               | 
               | The claim that using postgres to store data, such as a
               | cache, "is a bit concerning in terms of reliability".
        
               | baobun wrote:
               | Can you point to where I made that claim?
        
           | abtinf wrote:
           | Inferring one meaning for "reliability" when the original
           | post is obviously using a different meaning suggests LLM use.
           | 
           | This is a class of error a human is extremely unlikely to
           | make.
        
           | didntcheck wrote:
           | > This was the very first time I heard anyone even suggest
           | that storing data in Postgres was a concern in terms of
           | reliability
           | 
           | You seem to be reading "reliability" as "durability", when I
           | believe the parent post meant "availability" in this context
           | 
           | > Do you actually have any concrete scenario in mind? Because
           | anyone can make any system "considerably degrading", even
           | Redis
           | 
           | And even Postgres. It can also happen due to seemingly random
           | events like unusual load or network issues. What do you find
           | outlandish about the scenario of a database server being
           | unavailable/degraded and the cache service not being?
        
       | tzahifadida wrote:
       | I think the moral here is, if you'll ever have less than 1000
       | active customers per second, don't do micro services, don't use
       | redis, just use a monolith with a simple stack and you'll still
        | be happy, less maintenance, less know-how, etc... cheaper all
       | around.
        
       | mattacular wrote:
       | For cache, having TTL is invaluable. Having to tune cleanup jobs
       | in Postgres is annoying. Deletes create dead rows (so do updates)
       | so now you have to deal with vacuum as well. The method the
        | author suggested will run into a lot more problems than a
        | traditional dedicated cache like Redis in front if the service
        | ever needs to scale up beyond what they estimated.
        
       | wvh wrote:
       | I've done some benchmarks over the years and 6-7k/s for getting
       | simple data out of a basic Postgres installation seems pretty
       | spot on. The question is when using Postgres as a cache, are you
       | taking away performance for actual, more complex business logic
       | queries. That is going to depend on the level of overlap between
       | endpoints that use caching and those that need more complex
       | queries.
       | 
       | What I'd be interested to see is a benchmark that mixes lots of
       | dumb cache queries with typically more complex business logic
       | queries to see how much Postgres performance tanks during highly
       | concurrent load.
        
       | lukaslalinsky wrote:
       | Cache is a very relative term. If I'm caching heavy computation,
        | or perhaps externally acquired resources (e.g. web-scraped data),
       | I'd use database as a cache. If I'm caching database results,
       | then I'd obviously use something faster than that.
        
       | h1fra wrote:
       | Having native TTL in PostgreSQL would remove so much unnecessary
       | Redis in the wild.
        
       | jmull wrote:
       | Besides using the word "cache" what does this have to do with
       | caching?
       | 
       | It looks like it's just storing the session in postgres/redis.
       | 
       | Caching implies there's some slower/laggier/more remote primary
       | storage, for which the cache provides faster/readier access to
       | some data of the primary storage.
        
       | noisy_boy wrote:
       | I have dealt with an abomination of a bespoke setup on Redis that
       | was neither simple key/value nor an RDBMS but a grotesque attempt
       | to simulate tables using key prefix with value being a hash
       | containing "references" to entries in other "tables" (with some
       | "special condition" edge cases thrown in to make it spicy). All
       | with next to no documentation. It took adding many unit tests
       | with generous logging to expose the logic and the underlying
       | structure. That complete madness could have been handled in a bog
       | standard way in PostgreSQL with minimal fuss. All for something
       | that didn't even need to be "real-time".
       | 
       | If you see yourself starting with a simple key/value setup and
       | then feature requests come in that make you consider having
       | "references" to other keys, it is time to reconsider Redis, not
       | double down on it. Even if you insist on continuing, at the very
       | least, add a service to manage it with some clean abstractions
       | instead of raw-dogging it.
        
       | bhaak wrote:
       | I'm surprised to see this being somewhat controversial.
       | 
       | In Rails we just got database backed everything with the option
       | to go to special backends if need be.
       | 
       | The only question I have is how do I notice that my current
       | backend doesn't scale anymore and who or what would tell me to
       | switch.
        
       | enigma101 wrote:
       | postgres for everything!
        
       | saberience wrote:
       | Note to anyone reading this blog in the future: this is both
       | very basic and highly misleading. The author makes it seem as
       | though Postgres and Redis are interchangeable and that there's
       | not much difference between them.
       | 
       | This is totally misguided and incorrect.
       | 
       | Redis can be easily deployed such that any request returns in
       | less than a millisecond, and this is where it's most useful. It's
       | also consistent and stable as hell. There are many use-cases for
       | Redis where Postgres is totally unsuitable and doesn't make
       | sense, and vice versa.
       | 
       | Do yourself a favour and ignore this blog (again: inaccurate,
       | poorly benchmarked, misleading) and do your own research and use
       | better sources of information.
        
       | sgarland wrote:
       | Aside from the lack of tuning mentioned, it wasn't mentioned what
       | variety of UUID was used - I assume v4. If so, the visibility map
       | impacted the performance [0].
       | 
       | [0]: https://www.cybertec-postgresql.com/en/unexpected-
       | downsides-...
        
       | paulsutter wrote:
       | Or just use sync.Map, which would be faster and about 10x less
       | work
        
       | Ellipsis753 wrote:
       | Great article. Not understanding the hate.
       | 
       | I think the gist of it is: you probably have sufficiently low
       | requests/second (<1000) that using postgres as a cache is
       | totally reasonable - which it is. If you're meeting your load
       | targets within your hardware budget, there's no need to
       | optimise further.
        
       | BinaryIgor wrote:
       | As expected, Postgres - especially with Unlogged Tables - is
       | fast enough; additionally, having fewer things to run on your
       | infrastructure and fewer integrations to manage in your app is
       | a hugely underrated benefit.
       | 
       | I am just a little bit surprised by the relatively low write
       | performance for both Postgres and Redis here; but then, the
       | tests were run on a machine with just 2 CPUs and 8 GB of RAM.
       | In my
       | experience, with 8 CPUs, Postgres can easily handle more than _15
       | 000_ writes per second using regular tables; I would imagine that
       | it can easily be _20 000+_ for the Unlogged variety - who needs
       | more than that to cache?
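For reference, a table can be created unlogged from the start or switched later. The trade-off: unlogged tables skip WAL writes (hence the speed), but their contents are truncated after a crash - usually acceptable for a cache.

```sql
-- Created unlogged from the start:
CREATE UNLOGGED TABLE cache (key text PRIMARY KEY, value bytea);

-- Or an existing table can be switched either way
-- (note: this rewrites the whole table):
ALTER TABLE cache SET UNLOGGED;
ALTER TABLE cache SET LOGGED;
```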
        
       | sharadov wrote:
       | Postgres needs to be tuned for the workload and the hardware.
       | 
       | It's a travesty to run it on default settings.
       | 
       | All it takes is 5 mins to do it.
       | 
       | Use pgtune - https://pgtune.leopard.in.ua/
       | 
       | I know the author concludes that he would still use Postgres for
       | his projects.
       | 
       | But, he would get much better benchmark numbers if it was tuned.
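For illustration, the values below are roughly what pgtune suggests for a small web-app box with 8 GB RAM and 2 vCPUs - treat them as a starting point to verify against your own workload, not as recommendations.

```ini
# Illustrative pgtune-style settings (8 GB RAM / 2 vCPUs, "web" profile).
shared_buffers = 2GB
effective_cache_size = 6GB
maintenance_work_mem = 512MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
random_page_cost = 1.1          # assumes SSD storage
effective_io_concurrency = 200  # assumes SSD storage
```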
        
       | dvcoolarun wrote:
       | I get the argument for using Postgres, but choosing a tool is
       | mostly about its convenience, its specific constraints, and the
       | bottlenecks it solves - Redis comes with TTL and a cache
       | interface already implemented.
       | 
       | There's also the fact that unlogged table contents are not
       | crash-safe, which you have to deal with.
       | 
       | I believe Redis would have performed better with more allocated
       | CPU.
        
       ___________________________________________________________________
       (page generated 2025-09-26 23:01 UTC)