[HN Gopher] You Don't Need a Dedicated Cache Service - PostgreSQ...
       ___________________________________________________________________
        
       You Don't Need a Dedicated Cache Service - PostgreSQL as a Cache
       (2023)
        
       Author : John23832
       Score  : 62 points
       Date   : 2024-03-06 18:27 UTC (4 hours ago)
        
 (HTM) web link (martinheinz.dev)
 (TXT) w3m dump (martinheinz.dev)
        
       | BiteCode_dev wrote:
        | Systems that are used as caches, such as redis, evict keys with some
       | randomness to avoid scanning all of them. They also evict keys
       | you read that expired without you having to check for that. And
       | they don't store both a key and an id.
       | 
        | Given the value of redis for caching and outside of caching, and
        | the ease and low cost of setting it up, I would say postgres as a
        | cache seems niche.
        
         | Rapzid wrote:
         | Postgres doesn't need to scan them all because it has indexes.
         | It also has a number of random sampling strategies available.
         | 
         | It is niche, but it's a viable option for a lot of use cases if
         | you need to limit the number of service dependencies.
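          | A sketch of what sampled eviction could look like in plain SQL
          | (the cache table and its columns are assumptions, not from the
          | article):

```sql
-- Evict expired keys found in a ~1% random sample of the table's
-- pages, rather than scanning every row on each sweep.
DELETE FROM cache
WHERE key IN (
    SELECT key
    FROM cache TABLESAMPLE SYSTEM (1)
    WHERE expires_at < now()
);
```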
        
       | bjornsing wrote:
       | > Benchmarking and comparing UNLOGGED tables against Memcached or
       | Redis is out-of-scope of this article
       | 
       | That's the interesting part...
        
         | netcraft wrote:
          | yeah exactly - write isn't generally the concern, it's the read
          | that makes or breaks a cache...
        
       | williamdclt wrote:
       | It's a weird article. It claims that one only needs Postgres but
       | then presents a half-baked solution: the takeaway I got is "it's
       | not HA or horizontally scalable, and it's probably slow" so it
       | sounds like I _do_ need a dedicated cache service?
       | 
        | They just handwave benchmarking against cache databases as "out
        | of scope", but that is very much in scope if you want to claim I
        | can use PG rather than redis.
       | 
       | Also a cache would impact the rest of the application
       | (reading/writing to this cache consumes cpu and IO).
       | 
       | Also, missing an index on the inserted_at and there's no
       | uniqueness constraint on the key.
       | 
        | Overall, it reads like someone read about UNLOGGED and thought it
        | could be useful for caching use cases. I don't think anyone really
        | doubted that PG can be used as a cache; the trade-offs would
        | be a more interesting read than the POC
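        | The two fixes called out above could look like this (assuming a
        | cache table with key and inserted_at columns, which the article
        | doesn't spell out):

```sql
-- A uniqueness constraint so upserts on the key are possible,
-- and an index so expiry sweeps don't scan the whole table.
ALTER TABLE cache ADD CONSTRAINT cache_key_unique UNIQUE (key);
CREATE INDEX cache_inserted_at_idx ON cache (inserted_at);
```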
        
       | bedobi wrote:
       | at a previous gig, we put a Redis cache in front of a slow
       | monolithic Microsoft SQL Server db
       | 
       | that gave us breathing room to migrate the data we were
       | interested in into a much smaller dedicated db for our purposes
       | (the user data, used on every login and other operation in the
       | system)
       | 
       | in a correctly specced RDS, MySQL can be set up to store all the
       | data both in memory and on disk (note, I'm not referring to the
       | ephemeral disk only db engine, I'm referring to giving the RDS
       | the correct memory settings and parameters to store all the data
       | both in memory and on disk)
       | 
       | once we did that, there was no difference in read speed between
       | the Redis and just hitting the db, so we got rid of the Redis,
       | which vastly simplified the architecture of the user service
       | 
       | = the article title is true, caching is often understood to be a
       | default requirement but it kind of isn't if you architect things
       | correctly
        
       | whalesalad wrote:
       | > Benchmarking and comparing UNLOGGED tables against Memcached or
       | Redis is out-of-scope of this article
        
         | CharlesW wrote:
         | That was my "why bother writing this" moment also.
        
       | antimora wrote:
        | Before reading the article, I thought the author would talk about
        | caching query results automatically, which most DBs do.
        
       | techscruggs wrote:
       | This appears to be riding the [Solid
       | Cache](https://dev.37signals.com/solid-cache/) idea that DHH has
        | started espousing. At the core of this idea is that SSDs are
        | sufficiently fast for caching. Sure, this may work for 37signals,
        | but that is because they manage their own hardware. That is the
        | exception and not the rule. And, per [this discussion two weeks
        | ago](https://news.ycombinator.com/item?id=39443679), SSDs aren't
        | as fast for most of us who leverage "the cloud".
       | 
       | Sure, you can do it, but why? I am all for removing complexities
       | in my infrastructure. That said, Memcached and/or Redis are both
       | rock solid and dead simple.
       | 
        | This just seems to be a waste of time for 99.999% of use cases.
        
         | wlll wrote:
         | > Sure, you can do it, but why?
         | 
          | Re: Solid Cache, they effectively answered this in the post.
          | The cost is lower, meaning they can cache more, so p95 response
          | time goes down. If you're not constrained by money you can just
          | buy a bigger memcached or Redis, of course.
         | 
         | If you're talking about moving things to Postgres /in general/,
         | then cost and/or complexity are compelling reasons to do so.
         | Small engineering teams, small budgets, simplifying and
         | reducing cost can really help. Obviously it's not suitable for
         | everyone but it's nice to have the option.
        
       | osigurdson wrote:
        | All "Postgres as an X" approaches are valid up to a point. If
        | your use case doesn't require anything else and you prefer to
        | build / test / maintain dedicated (probably a little snowflakey)
        | logic, fine.
       | Unfortunately, few such articles discuss the edges at which these
       | solutions no longer work.
        
         | ants_everywhere wrote:
         | Do you mean just in general, or are there known edge cases
         | where Postgres performs poorly as a cache?
        
       | api wrote:
       | You almost always don't need a specialized or dedicated (blank).
       | You can usually get away with using some boring off the shelf
       | tech that can do the job, even if it isn't doing the job as
       | perfectly as a dedicated (blank). Any additional cost in
       | resources is probably offset by the savings in cognitive load and
       | admin hours.
       | 
       | If you do, you're probably operating at a scale at which you have
       | a whole big devops team to figure out, deploy, and manage your
       | dedicated (blank). Or if you're really huge, you might be
        | _creating_ your dedicated (blank) in a way that's custom
       | engineered around your problem.
        
       | Spiwux wrote:
       | Calling Postgres experts:
       | 
       | Why, exactly, do we need to put a memory cache such as Redis in
       | front of Postgres? Postgres has its own in-memory cache that it
       | updates on reads and writes, right? What makes Postgres' cache so
       | much worse than a dedicated Redis?
        
         | darth_avocado wrote:
          | Redis is completely in memory, therefore all data is in memory.
          | Postgres, on the other hand, does have a cache of its own, but
          | does not give you fine control over what stays in cache; that
          | depends on data access patterns. E.g., I cannot pin an entire
          | table of my choosing in cache.
        
         | candiddevmike wrote:
         | Caching with postgres also lets you do cache invalidation via
         | triggers!
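          | A minimal sketch of trigger-based invalidation (the users
          | table, cache table, and key format here are assumptions):

```sql
-- Delete the cached entry whenever the underlying row changes,
-- so the cache can never serve a value the trigger saw replaced.
CREATE OR REPLACE FUNCTION invalidate_user_cache() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        DELETE FROM cache WHERE key = 'user:' || OLD.id;
        RETURN OLD;
    END IF;
    DELETE FROM cache WHERE key = 'user:' || NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_cache_invalidation
    AFTER UPDATE OR DELETE ON users
    FOR EACH ROW EXECUTE FUNCTION invalidate_user_cache();
```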
        
         | kapperchino wrote:
         | Because you can't scale out just the cache part of Postgres,
         | one machine can only have so much memory
        
           | 10000truths wrote:
           | If you have a second machine, why not just put a Postgres
           | read replica on it? Letting the WAL deal with replica
           | consistency is much simpler than making the client
           | responsible for keeping an external cache in sync, and you
           | get the benefit of keeping everything in Postgres.
        
             | dpedu wrote:
             | Either I pay a performance penalty waiting for my cache
             | entry to be synced to the replica, or I risk reading stale
             | data from the replica, no?
        
         | VWWHFSfQ wrote:
         | > Why, exactly, do we need to put a memory cache such as Redis
         | in front of Postgres?
         | 
         | Maybe you don't want to run the same expensive queries all the
         | time to serve your json API?
         | 
         | There's a million reasons you might want to cache expensive
         | queries somewhere upstream from the actual database.
        
         | ww520 wrote:
         | Machines used to have limited memory. Distributed caching can
         | utilize many machines to form the overall cache. Nowadays
         | machines have plenty of memory with numerous cores and fast
          | bandwidth. The need for a large network of cache servers has
         | waned.
        
         | dpedu wrote:
         | How would the architecture in the OP mesh with master-slave
         | postgres setups? If I write a cache item how can I be certain
         | the freshest entry is read back from the read-only slave?
         | Can/do I pay a performance penalty on writes waiting for it to
         | be synchronized? Is it better, when it comes to caching, to
         | ignore the slave and send all read/write cache related queries
         | to the master?
         | 
         | All of these questions go away or are greatly simplified with
         | redis.
        
           | watermelon0 wrote:
           | They don't really go away, because if you need read-only
           | replicas with PostgreSQL, there is a good chance that you
           | will also need read-only replicas with Redis.
           | 
           | Similarly to Postgres, Redis replication is also async, which
           | means that replicas can be out-of-sync for a brief period of
           | time.
        
             | tetha wrote:
              | I was unsure whether to comment this: You can mark postgres
             | replicas as sync replicas. Writes on the leader only commit
             | fully once the writes are fully replicated to all sync
             | replicas. This way postgres could ensure consistency across
             | several replicas.
             | 
             | This however can come with a lot of issues if you started
             | to use this to ensure consistency across many replicas.
             | Writes are only as fast as the slowest replica, and any
              | hiccup on any replica could stall all writes.
             | 
             | What I wasn't sure about - IMO in such a situation, you
             | should rather fix the application to deal with (briefly)
             | stale information, and then you can throw either async
              | postgres replicas at it... or redis replication, or
             | something based on memcache.
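              | For reference, a sync replica is configured roughly like
              | this (the standby names are made up):

```sql
-- In postgresql.conf on the leader: a commit waits for one of
-- the listed standbys to confirm the write.
--   synchronous_standby_names = 'FIRST 1 (replica_a, replica_b)'

-- Per session or transaction, you can also pick how long to wait:
SET synchronous_commit = remote_apply;  -- wait until the standby has applied the WAL
```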
        
         | dvfjsdhgfv wrote:
         | > What makes Postgres' cache so much worse than a dedicated
         | Redis?
         | 
         | It's not worse, this is just a cheap way to increase
         | performance without having to scale the main instance
         | vertically.
        
         | tetha wrote:
         | Postgres can develop problematic behavior if you have high
         | churn tables - tables with lots of deletes on them.
         | 
         | If you have many inserts and deletes on a table, the table will
         | build up tombstones and postgres will eventually be forced to
         | vacuum the table. This doesn't block normal operation, but auto
         | vacuums on large tables can be resource intensive - especially
         | on the storage/io side. And this - at worst - can turn into a
         | resource contention so you either end up with an infinite auto
         | vacuum (because the vacuum can't keep up fast enough), or a
         | severe performance impact on all queries on the system (and
         | since this is your postgres-as-redis, there is a good chance
         | all of the hot paths rely on the cache and get slowed down
         | significantly).
         | 
         | Both of these result in different kinds of fun - either your
         | applications just stop working because postgres is busy
         | cleaning up, or you end up with some horrible table bloat in
         | the future, which will take hours and hours of application
         | downtime to fix, because your drives are fast, but not that
         | fast.
         | 
         | There are ways to work around this, naturally. You could have
         | an expiration key with an index on it, and do "select * from
         | cache order by expiration_key desc limit 1", and throw
         | pg_partman at it to partition the table based on the expiration
         | key, and drop old values by dropping partitions and such... but
         | at some point you start wondering if using a system meant for
         | this kinda workload is easier.
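          | The partitioning workaround sketched above could look roughly
          | like this (table names and ranges are illustrative):

```sql
-- Partition the cache by expiry so old entries are removed by
-- dropping partitions instead of DELETE + autovacuum churn.
CREATE TABLE cache (
    key        text NOT NULL,
    value      jsonb,
    expires_at timestamptz NOT NULL
) PARTITION BY RANGE (expires_at);

CREATE TABLE cache_2024_03_06 PARTITION OF cache
    FOR VALUES FROM ('2024-03-06') TO ('2024-03-07');

-- Expiring a whole day of entries is then cheap metadata work:
DROP TABLE cache_2024_03_06;
```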
        
           | zomnoys wrote:
           | I believe the author addresses this by making the table
            | `UNLOGGED`.
            | https://www.crunchydata.com/blog/postgresl-unlogged-tables
            | These have less overhead.
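            | For reference, the only schema change involved is the
            | UNLOGGED keyword (columns here are illustrative):

```sql
-- UNLOGGED skips WAL writes: faster inserts, but the table is
-- truncated after a crash and is not replicated to standbys.
CREATE UNLOGGED TABLE cache (
    key         text PRIMARY KEY,
    value       jsonb,
    inserted_at timestamptz NOT NULL DEFAULT now()
);
```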
        
         | mjdijk wrote:
         | IME -- and I've just replaced a Postgres-only unlogged cache
         | table with Redis -- it's not about the storage or caching, but
         | about the locking. Postgres needs to acquire (or at least
         | check) a lock for reading as well as writing. Although some
         | optimizations have been done for mostly-reading-workloads
         | (search for postgres fast-path locking), you'll still run into
         | lock contention problems relatively quickly.
        
         | hamandcheese wrote:
         | As far as I know there is no way to tell Postgres to keep a
         | particular index or table in memory, which is one reason to be
         | weary of using one PG instance for many varied workloads. You
         | might solve this by earmarking workload-specific replicas,
         | though.
         | 
         | If you can keep your entire working set in memory, though, then
         | it probably doesn't matter that much.
        
           | gjvc wrote:
           | Being wary for too long can make one weary.
        
         | renegade-otter wrote:
         | Add read replicas before doing cache "optimization", because
         | cache timing bugs are a special kind of hell.
        
         | jasonwatkinspdx wrote:
         | The buffer pool in a rdbms ends up intimately connected with
         | the concurrency control and durability protocols. There's also
         | a variety of tradeoffs in how to handle conflicts between
         | transactions (steal vs no steal, force vs no force, etc). You
         | need deadlock detection or prevention. That creates a necessary
         | minimum of complexity and overhead.
         | 
         | By comparison an in memory kv cache is _much_ more streamlined.
         | They basically just need to move bytes from a hash table to a
         | network socket as fast as possible, with no transactional
         | concerns.
         | 
         | The semantics matter as well. PostgreSQL has to assume all data
         | needs to be retained. Memcached can always just throw something
         | away. Redis persistence is best effort with an explicit loss
         | window. That has enormous practical implications on their
         | internals.
         | 
         | So in practical terms this means they're in different universes
         | performance wise. If your workload is compatible with a kv
         | cache semantically, adding memcached to your infrastructure
         | will probably result in a savings overall.
        
         | vbezhenar wrote:
         | Because Redis is almost infinitely scalable while Postgres is
         | not. You have limited vertical scalability budget for your
         | database. The more things you put into your database, the more
          | budget you're spending on things that could be done elsewhere.
         | 
         | Sometimes it makes sense, when your workload is not going to
         | hit the limits of your available hardware.
         | 
          | But generally you should be prepared to move everything you can
          | out of the database, so the database will not spend any CPU on
          | things that could be computed on another computer. And cache is
          | one of those things. If you can avoid hitting the database by
          | hitting another server, that's a great thing to do.
         | 
         | Of course you should not prematurely optimize. Start simple,
         | hit your database limits, then introduce cache.
        
         | stonemetal12 wrote:
          | Even though PG caches, it is still doing all the work to run
         | the query. It is like saying why does a 3d render take so long
         | to render an image when the same image saved to a PNG opens so
         | much faster.
         | 
          | The article talks about using unlogged tables, which double
         | write speed by forgoing the durability and safety of the WAL.
         | It doesn't mention query speed because it is completely
         | unaffected by the change.
        
       | koliber wrote:
       | The reason I use Redis as a cache is raw speed. I would consider
       | using this Postgres-based solution if the speed was comparable.
       | The article leaves benchmarking up to the reader.
        
       | darth_avocado wrote:
       | The reason you build cache for most purposes, is fast data
       | retrieval. A major component of "fast" is the latency of a disk
       | read vs a memory read. Redis and memcached are in memory by
        | design, so the entirety of your cached table is in memory. When
       | you design a cache using postgres (assuming on an RDS instance),
       | although data will be cached in memory based on access patterns,
       | I don't believe there is a direct way to control what data gets
       | cached, which makes this implementation slower for a lot of real
       | world use cases.
       | 
       | This gets even slower when you have a distributed DB like AWS
       | Aurora Postgres because your data on disk can be on different EBS
       | volumes, so bringing it in memory can be slower than even RDS
       | Postgres.
        
       | ctc24 wrote:
       | Why wouldn't you simply use SQLite (or some other in-memory
       | flavor of SQL) instead of hacking the main Postgres db and adding
       | load to the primary instance?
       | 
       | The author makes a valid point that there's something nice about
       | using familiar tooling (including the SQL interface) for a cache,
       | but it feels like there are better solutions.
        
         | VWWHFSfQ wrote:
         | > hacking the main Postgres db and adding load to the primary
         | instance
         | 
         | nobody said this had to be on the primary database server, and
         | how is this hacking?
         | 
         | Is every app server going to have its own local "sqlite cache"?
         | Or is it going to use one of the sqlite server/replication
         | things? So why not just use PG?
        
           | ctc24 wrote:
           | That's a bit of a strawman argument. Per the post, you can't
           | leverage this on a read replica, it has to be run on primary.
           | So you're going to stand up and manage a full new Postgres
           | instance for this?
           | 
           | I'm sure there are many cases when that makes sense, but
           | there are many cases when that's also overkill. An in-memory
           | cache inside your server will give you better performance,
            | and a lot less infrastructure maintenance complexity.
        
         | 0x457 wrote:
         | Because SQLite is in process. Usually, when you start thinking
         | about cache, you have more than one application server. Each
          | application server running its own cache makes cache
          | invalidation a nightmare (I worked at a company where one
          | genius did that and caused a lot of trouble). Don't show me
         | any sqlite replication things because that's not how you want
         | your cache to work.
         | 
         | My issue with running PostgreSQL as cache would be its thread
         | per connection model and downsides of MVCC for cache.
        
       | antoineleclair wrote:
       | For projects where I know the team will remain small (less than
       | let's say 15 developers), I usually push to keep the architecture
       | as simple as possible.
       | 
       | I've used something similar in the past, but kept the expiration
       | code in the app code (Python) instead of using "fancy" Postgres
       | features, like stored procedures. It's much easier to maintain
       | since most developers will know how to read and maintain the
        | Python code, which is also committed to the git repository.
       | 
       | Also, instead of using basic INSERT statements, you can "upsert".
       | 
        | INSERT INTO cache_items (key, created, updated, expires, value)
        | VALUES (...) ON CONFLICT ON CONSTRAINT pk_cache_items DO
        | UPDATE SET updated = ..., expires = ..., value = ...;
       | 
       | And since you have control over the table, you can customize it
       | however you want. Like adding categories of cache that you can
       | invalidate all at once, etc.
       | 
       | Postgres is also pretty good at key/values.
       | 
       | In other words, I agree that using Postgres for things like
       | caching, key/values, and even maybe message queue, can make
       | sense, until it doesn't. When it doesn't make sense anymore, it's
       | usually easy to migrate that one thing off of Postgres and keep
       | the rest there.
       | 
       | Also, one benefit that's not often talked about is the complexity
       | of distributed transactions when you have many systems.
       | 
       | Let's say you compute a value inside a transaction, cache it in
        | Redis, and then the transaction fails. The cached value is
        | wrong. If everything is inside of Postgres, the cached value will
        | also not be committed. One less thing to worry about.
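        | A sketch of the point above (the table, key, and values are made
        | up; it assumes a unique constraint on cache_items.key):

```sql
-- Both the source-of-truth write and the cache write commit or
-- roll back together; no distributed-state cleanup is needed.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
INSERT INTO cache_items (key, value, expires)
VALUES ('account:42:balance', '...', now() + interval '5 minutes')
ON CONFLICT (key) DO UPDATE
    SET value = EXCLUDED.value, expires = EXCLUDED.expires;
COMMIT;
```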
        
         | VWWHFSfQ wrote:
         | > Let's say you compute a value inside a transaction, cache it
         | in Redis, and then the transaction fails.
         | 
         | That just sounds like an application bug. Nothing should be
         | done with the query result anyway until the transaction either
            | completes or rolls back.
        
           | antoineleclair wrote:
           | Transactions can fail because they conflict with other
           | transactions happening at the same time. It's not an
           | application bug. It's real life transactions happening on a
           | production system. It's normal for that to happen all the
           | time. The app can retry, etc., but it should be expected to
            | happen. Having to deal with distributed transactions is not
            | easy, especially when they're part of many
           | different systems. For example, you'd have to wait until the
           | transaction commits successfully before setting the value in
           | the cache, which makes it hard to read. Also, life in general
           | happens. Compute a value, cache it, save things to the
           | database, make API calls, and then a network error happens
           | cancelling everything that you've just done. Having code that
           | handles this kind of possibility is relatively hard to
           | write/read.
        
             | VWWHFSfQ wrote:
             | Right but postgres isn't going to help with this if the
             | application developer isn't doing safe and proper
             | transaction management in the first place. What you
             | described is a bug in the application logic for when and
             | how to update the cache.
        
               | singron wrote:
               | It's super hard to get this right. E.g. if you only
               | update the cache after the transaction commits, you might
               | commit without updating the cache, or if 2 writers
               | interleave, the first one to commit might make the final
               | update to the cache with a stale value.
        
       | mrweasel wrote:
       | So presumably the reason why you'd do something like this is if
       | you have expensive queries or calculations, but in that case you
       | could use materialized views instead, depending on your data and
       | queries obviously.
       | 
       | If it's to serve as a cache on an application level, memcache
       | seems like it would be simpler and faster. While I do like Redis,
       | I can see why you'd be careful introducing it. Redis can blur the
       | boundaries between caching and data storage a bit, but it's just
       | so handy.
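        | A materialized view for an expensive query could look like this
        | (table and column names are assumptions, not from the article):

```sql
-- Precompute an expensive aggregate once, then serve reads from it.
CREATE MATERIALIZED VIEW daily_totals AS
SELECT account_id, sum(amount) AS total
FROM orders
GROUP BY account_id;

-- A unique index lets REFRESH ... CONCURRENTLY avoid locking readers.
CREATE UNIQUE INDEX ON daily_totals (account_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_totals;
```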
        
       | 0xbadcafebee wrote:
       | Sometimes I feel like the articles posted here are elaborate
       | trolls. Like they wanted to know what the purpose of a cache is
       | and how/when to implement one, but they didn't want to do
       | research, so they came up with the worst idea they could imagine
       | and blogged about it, hoping someone on HN would tell them the
       | right way (or just laughing at people trying to correct them).
       | 
       | It gives me anxiety how much truly awful advice is upvoted here.
       | It feels like children playing in a sand pit, and then one of the
       | children dumps a box of rat poison on the ground, and some kid
       | says how rat poison is actually good for you because it contains
       | minerals or something. So they all start quickly scarfing it
       | down. You try to tell them rat poison is bad for them, and then
       | several of the kids start defending rat poison, with one
       | lecturing you for being so negative about rat poison. I guess I
       | should just let the kids poison themselves, but it's a terrible
       | thing to watch.
        
         | luibelgo wrote:
          | Not super constructive either; the beauty of having things like
         | HN is to be able to discuss ideas, rather than just point to
         | them as wrong.
         | 
          | At least offer a couple of counterpoints on why this would be
          | a bad idea, and a better approach.
        
           | mrd3v0 wrote:
           | There are way more terrible ideas than good ones, and that
           | means treating all of them equally is a one-way road to HN
            | losing the one thing it has going for it: new ideas.
           | 
            | Feedback like the root comment is important to keep pushing
            | ideas forward, instead of going in circles.
        
         | eloisant wrote:
         | Yes, and honestly his "why" section really reads like "A hammer
         | can do everything a screwdriver can, and I really like my
         | hammer so I'll use it to push screws into planks".
        
         | inopinatus wrote:
         | This is Cunningham's Law.
         | https://en.wikipedia.org/wiki/Ward_Cunningham#%22Cunningham'...
        
         | macNchz wrote:
          | Do you have specific criticisms? There are myriad and varied
          | use cases for a cache. I've used Postgres in a pinch.
         | It's fine, for certain things. I've used a lot of other caches
         | as well that would be highly inappropriate for the situations
         | I've used Postgres in.
         | 
         | In my experience there are far more highly-overwrought bloated
         | architectures deployed than there are overly-clever minimalist
         | ones. Everyone wants to put the cool tools on their resume.
        
           | lnxg33k1 wrote:
            | So for one, we cache to remove load from the DBMS, so it
            | makes little sense to give the DBMS the load of the cache
            | too. Another thing is that separate services are optimised
            | for different tasks: redis is built to be a first-class
            | in-memory cache service, while postgres is built with
            | different first-class usage in mind. Another one I can think
            | of is resources: I can scale redis and the DBMS
            | independently as their specific needs change.
            | 
            | Overall, for me this cache in postgres is a bad idea too
        
         | lnxg33k1 wrote:
          | I was working at a company where we used stupid things from
          | the tech world as loading messages for Slack, like "parsing
          | html with regex". I think "caching in postgres" might deserve
          | a spot there too
        
         | biorach wrote:
         | Very dramatic. No actual technical arguments.
         | 
         | It's easy to envisage systems where this would have no negative
         | impact whatsoever.
        
       | shrubble wrote:
       | I was checking out used rack servers on ebay and servers with
       | 768GB RAM are not very expensive, they are under $3k USD... seems
       | that if you could cache 256GB to 512GB of Postgres into RAM you
       | could do pretty well...
        
         | 0cf8612b2e1e wrote:
         | If you own the hardware, pricing becomes an entirely different
          | question. Cloud will make you pay heavily for RAM, which is
          | where this strategy of using a disk-backed cache suddenly shines.
        
       | nodesocket wrote:
       | Curious in the db schema, why bother with an id as the primary
       | key? Just make the key the primary key.
        
       ___________________________________________________________________
       (page generated 2024-03-06 23:01 UTC)