[HN Gopher] Single dependency stacks
___________________________________________________________________
Single dependency stacks
Author : jeffreyrogers
Score : 142 points
Date : 2022-02-09 16:53 UTC (6 hours ago)
(HTM) web link (brandur.org)
(TXT) w3m dump (brandur.org)
| bcrosby95 wrote:
| I always start with just MySQL and introduce things as needed -
| not as guessed. These days I don't work on anything with enough
| traffic to need more than that.
|
| An RDBMS is a lot more than just SQL these days, and it offers a
| lot of good-enough solutions to a wide variety of problems.
| mrweasel wrote:
| Completely agreed. Sadly we're seeing a ton of developers who
| are honestly more interested in getting half-baked solutions
| out the door so they can move on to the next project. We have
| one customer who runs a huge national project on a few MariaDB
| servers; one can technically run the whole thing, it's no
| problem. Another customer is smaller but insists on using
| Hibernate, but they don't really know how to use it, so they'll
| frequently kill the database by generating silly queries.
| Instead of accepting that maybe they don't fully understand
| their chosen stack, they try to "solve" the problem by adding
| things like Kubernetes and Kafka, complicating everything.
|
| Modern databases, and servers in general, are capable of amazing
| things, but there's a shortage of developers with the skills to
| utilize them.
| alilleybrinker wrote:
| Apparently Tailscale for a long time just used a JSON file as
| their data storage, and moved from that to SQLite with a hot-
| swappable backup via Litestream [0], and hey, they've done fine
| with that.
|
| [0]:
| https://securitycryptographywhatever.buzzsprout.com/1822302/...
| fizx wrote:
| This is great, but you might want to have multiple postgreses for
| the different workloads. DB postgres != rate-limit PG != search
| PG. It's pretty hard to optimize one DB for every workload.
| goostavos wrote:
| Counter point: most people operate on workloads so trivial that
| they don't need optimized.
|
| I think the most important line in the article is the "let's
| see how far it gets us." It is absolutely trivial to invent
| situations where an architecture wouldn't work well, or scale,
| or "be optimal." It's far, far harder to just exist in reality,
| where most things are boring, and your "bad" architecture is
| all you ever need.
| RedShift1 wrote:
| Why? You can have multiple databases in one instance; running
| multiple pg instances seems counterproductive?
| rtheunissen wrote:
| Maybe instance-level configuration?
| mikeklaas wrote:
| Multiple databases in postgres fundamentally share the same
| underlying infrastructure (i.e., WAL), and so do not offer
| much in terms of scalability or blast-radius protection
| compared to putting all tables in the same database.
| theptip wrote:
| I'm a big fan of this approach, having built a Django monolith
| with the standard Celery/RMQ, dabbled in Redis for latency-
| sensitive things like session caching, and never hit scale
| where any of those specialized tools were actually required
| (despite generating substantial revenue with the company).
|
| One thing to note, if you use pq or another Postgres-as-queue
| approach, you should be aware of the work you'll need to do to
| move off it -- this pattern lets you do exactly-once processing
| by consuming your tasks in the same DB transaction where you
| process the side-effects. In general when using a separate task
| queue (RMQ, SQS, etc.) you need to do idempotent processing (at-
| least-once message semantics). A possible exception is if you use
| Kafka and use transactional event processing, but it's not
| serializable isolation.
|
| This is probably a reason in favor of using a Postgres task queue
| initially since exactly-once is way simpler to build, but just be
| aware that you're going to need to rethink some of your
| architectural foundation if/when you need to move to a higher-
| throughput queue implementation.
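|
| A minimal sketch of what that exactly-once pattern could look
| like with plain psycopg2; the tasks table and column names here
| are made up for illustration:
|
|     import psycopg2
|
|     # Hypothetical queue table:
|     #   CREATE TABLE tasks (id bigserial PRIMARY KEY,
|     #                       payload jsonb);
|     conn = psycopg2.connect("dbname=app")
|
|     def work_one():
|         # One transaction covers both the claim and the side
|         # effects, so a crash rolls back both together.
|         with conn, conn.cursor() as cur:
|             cur.execute(
|                 """SELECT id, payload FROM tasks
|                    ORDER BY id
|                    LIMIT 1
|                    FOR UPDATE SKIP LOCKED""")
|             row = cur.fetchone()
|             if row is None:
|                 return False
|             task_id, payload = row
|             # ... perform the task's side effects here, in the
|             # same transaction ...
|             cur.execute("DELETE FROM tasks WHERE id = %s",
|                         (task_id,))
|         return True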
| btown wrote:
| https://www.2ndquadrant.com/en/blog/what-is-select-skip-lock...
| describes the benefits of the above approach.
|
| Something to bear in mind is that if you have a bug or crash in
| your task handler that causes a rollback, another worker will
| likely try to grab the same task again, and you might end up
| clogging all of your workers trying the same failed tasks over
| and over again. We use a hybrid approach where a worker takes
| responsibility for a task atomically using SKIP LOCKED and
| setting a work-in-progress flag, but actually does the bulk of
| the work outside of a transaction; you can then run arbitrary
| cleanup code periodically for things that were set as work-in-
| progress but abandoned, perhaps putting them into lower-
| priority queues or tracking how many failures were seen in a
| row.
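|
| A rough sketch of that hybrid claim step (hypothetical table
| and column names); only the claim runs in a transaction, and
| the actual work happens afterwards:
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")
|
|     def claim_task():
|         # Short transaction: atomically mark one task as
|         # in-progress and return it.
|         with conn, conn.cursor() as cur:
|             cur.execute(
|                 """UPDATE tasks
|                    SET in_progress = true, claimed_at = now()
|                    WHERE id = (SELECT id FROM tasks
|                                WHERE NOT in_progress
|                                ORDER BY id
|                                LIMIT 1
|                                FOR UPDATE SKIP LOCKED)
|                    RETURNING id, payload""")
|             return cur.fetchone()  # None if queue is empty
|
|     task = claim_task()
|     if task:
|         # Do the long-running work outside any transaction; a
|         # periodic sweeper can requeue or demote rows whose
|         # claimed_at is old but in_progress is still true.
|         pass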
|
| Postgres is absolutely incredible. If you are at anything less
| than unicorn scale, outside of analytics where columnar stores
| are better (though Citus apparently now has a solution for this
| - https://www.citusdata.com/blog/2021/03/06/citus-10-columnar-.
| ..), and highly customizable full-text search (it's hard to
| beat Elastic/Lucene's bitmap handling), there are very few
| things that other databases can do better if you have the right
| indices, tuning, and read replicas on your Postgres database.
| [deleted]
| samwillis wrote:
| Your whole first paragraph literally describes our situation
| exactly, same stack and all. Classic premature optimisation.
|
| It's made it so clear to me that so much of what you read on HN
| about the latest and greatest scaling tricks is only relevant to
| a tiny, tiny fraction of businesses.
| jrumbut wrote:
| Getting good at solving problems with relational databases is
| a highly underrated skill.
|
| They are really underutilized by many projects.
|
| Not to say any other software is bad, just that keeping the
| stack simple can help small teams move quickly and not get
| bogged down fighting fires. Also, the paths to scale up the
| major RDBMSes are well documented and understood. With a
| newer service and many interacting systems in your back end
| you end up having to be a pioneer (which takes time away from
| implementing new features).
| jaxrtech wrote:
| Absolutely. I've also seen my fair share of horrendous
| home-grown "ETL" programs that waste time shuffling bytes
| to and from the database with poor ORM queries in loops --
| work that could be done with a couple of half-decent SQL
| queries.
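|
| As a contrived illustration of that point (invented table and
| column names), the set-based statement does in one round trip
| what the row-by-row loop does in N+1:
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")
|     cur = conn.cursor()
|
|     # Row-at-a-time, roughly what a naive ORM loop does:
|     cur.execute(
|         "SELECT id, subtotal_cents, tax_cents FROM orders "
|         "WHERE total_cents IS NULL")
|     for order_id, subtotal, tax in cur.fetchall():
|         cur.execute(
|             "UPDATE orders SET total_cents = %s WHERE id = %s",
|             (subtotal + tax, order_id))
|
|     # The same work as one set-based statement:
|     cur.execute(
|         "UPDATE orders SET total_cents = "
|         "subtotal_cents + tax_cents WHERE total_cents IS NULL")
|     conn.commit()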
|
| Probably the most useful thing for me was learning
| relational algebra in college, and having been thrown in
| the deep end on a team that was very SQL heavy
| (notwithstanding attempting to debug Oracle PL/SQL syntax
| errors while pulling your hair out over a missing closing
| parenthesis -- which isn't actually the problem).
|
| The usual challenge seems to be fetching data from external
| services or performing complex business logic that may
| conditionally load things -- things that can be awkward in
| procedural SQL. At the end of the day, you're building a
| messy ad-hoc dependency graph that is being manually
| executed very inefficiently. It would be better to have
| your code describe the dependency graph, treat each value
| transparently as a promise/future, and then have a separate
| engine execute it.
|
| Anyhow, something something monads with lipstick, I
| digress...
| KptMarchewa wrote:
| ETL using ORM? Really?
| jjice wrote:
| FWIW, setting up a Redis server and setting up some basic
| caching middleware is pretty straightforward in my
| experience. Did this at my job a month ago in an afternoon.
|
| I'd say the biggest overhead is adding Redis if you don't
| already have it, and that addition's difficulty will vary
| based on how you host Redis. We use ElastiCache on AWS, so it's
| just a few clicks to set it up in the same VPC.
|
| I guess the real question comes down to how you feel about an
| extra moving part. Redis is probably the part of our system
| that has had the least hiccups (very much set and forget and
| no resource issues so far), but I can understand in the case
| where you'd rather not add more than a DB.
|
| I'd say it's just as easy to set up as Postgres. Elasticsearch,
| I hear, is a pain, though I have no personal experience with it.
| blowski wrote:
| The pain is not initially setting it up, it's in the
| ongoing maintenance. Redis is one of the less painful
| services to support, especially if using a managed version.
| But I don't like the trend of defaulting to using Redis
| without really justifying it.
| jjice wrote:
| You're right for sure if the service is small - Redis
| would be overkill. I guess I'm coming at it from the
| perspective I'm most used to where we use it for data
| caching and session data since we have multiple servers,
| so handling it any other way would be more work.
| barrkel wrote:
| Setting up caching usually isn't the problem, it's
| invalidation and eviction that bites you.
| smoe wrote:
| From my view, the issue in this case is not how difficult it
| is to set up Redis for caching (it is indeed just a couple
| of clicks/commands), but the new issues one has to deal
| with when resorting to caching things prematurely,
| instead of making the app fast enough with minimal effort.
| smoe wrote:
| > just be aware that you're going to need to rethink some of
| your architectural foundation if/when you need to move to a
| higher-throughput queue implementation
|
| I haven't seen many, if any, projects that didn't require some
| architectural rethinking over their lifetime. I have seen more
| that were arguably over-engineered in the beginning, but then
| never lived long enough to actually benefit from it.
|
| Not saying everyone should use Postgres-as-queue for every
| project. But for a lot of projects it is going to be much
| harder to acquire the active user base generating the
| throughput Postgres can't handle than to do continuous
| refactoring of the system to deal with changing
| requirements.
| rattray wrote:
| For Node folks interested in a postgres-based task queue, I
| find graphile-worker[1] to be pretty terrific. The docs make it
| sound like it's only for postgraphile/postgrest, but it's great
| with any Node app IMO.
|
| [1] https://www.npmjs.com/package/graphile-worker
| closeparen wrote:
| Dependencies are not equal in this regard. For example in a
| corporate context, we have basically 1.5 people in Eastern
| Europe maintaining Redis for 5,000 engineers. Kafka is more
| like 15.
| [deleted]
| kelp wrote:
| I kind of love this idea.
|
| It reminds me of a redis use case we had at a former employer.
|
| We had a cluster with a high double digit number of nodes that
| delivered a lot of data to various external APIs.
|
| Some of those APIs required some metadata along with the payload.
| That metadata was large enough that it had made sense to cache it
| in Redis. But over time, with growth, the cluster got large
| enough, and high volume enough, that just fetching that data from
| Redis was saturating the 10Gbps NIC on the ElastiCache instance,
| creating a significant scaling bottleneck. (I don't remember if
| we moved up to the 25Gbps ones or not.)
|
| But we could have just as easily done a local cache (on disk or
| something) for this metadata on each node and avoided the cost of
| the ElastiCache and all the scaling and operational issues. It
| would have also avoided the network round trips to Redis, and the
| whole thing probably would have just performed better.
| rkhacker wrote:
| I am sure there is a momentary thrill in achieving minimalism but
| alas the world is not so simple anymore. I would refer the OP and
| the community here to the paper from the creator of PostgreSQL:
| http://cs.brown.edu/~ugur/fits_all.pdf
| finiteseries wrote:
| (2005)
| recuter wrote:
| Exactly. The title is - "One Size Fits All": An Idea Whose
| Time Has Come and Gone
|
| As is so often the case in this industry an idea comes, goes,
| and comes back around again. Time to reevaluate.
| luhn wrote:
| I think that paper is making an argument orthogonal to OP. OP
| is saying Postgres is a good enough solution, that the
| advantages of simplifying the stack outweigh the disadvantages
| of using a non-optimal database for basic use cases.
| samwillis wrote:
| I am so for this. Being the sole developer in my company for the
| last 10 years, I introduced far too many "moving parts" as it
| grew and I'm now going through the process of simplifying it all.
|
| I love Redis but it's next to go, currently used for user
| sessions and a task queue, both of which Postgres is more than
| capable of doing [0]. Also, as a mostly Python backend, I want to
| rip out a couple of Node parts, shrink that Docker image.
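|
| For the session part, a minimal table-backed store could be as
| simple as this (table and column names are illustrative):
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")
|     cur = conn.cursor()
|
|     cur.execute(
|         """CREATE TABLE IF NOT EXISTS sessions (
|                token text PRIMARY KEY,
|                user_id bigint NOT NULL,
|                expires_at timestamptz NOT NULL)""")
|     conn.commit()
|
|     def get_user(token):
|         # Returns the user id for a live session, else None.
|         cur.execute(
|             """SELECT user_id FROM sessions
|                WHERE token = %s AND expires_at > now()""",
|             (token,))
|         row = cur.fetchone()
|         return row[0] if row else None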
|
| 0: https://news.ycombinator.com/item?id=21536698
| deathanatos wrote:
| I'm not sure what qualifies as "stateful", but
|
| > _Fewer dependencies to fail and take down the service._
|
| No logging? No metrics? No monitoring? (& _yes_, you'd think
| those shouldn't take down the stack if they went offline. And I'd
| agree. And yet, I've witnessed that failure mode multiple times.
| In one, a call to Sentry was synchronous & a hard-fail, so when
| Sentry went down, that service 500'd. In another, syslog couldn't
| push logs out to the logging service, as that was _very_ down,
| having been inadvertently deleted by someone who ran "terraform
| apply", didn't read the plan, & then said "make it so"; syslog
| then responded to the logging service being down by logging that
| error to a local file. Repeatedly. As fast as it possibly could.
| Fill the disk. Disk is full. Service fails.)
|
| I've also seen our alerting provider have an outage _during an
| outage we're having_ & thus not sending pages for our outage,
| causing me to ponder and wonder how I'd just rolled a 1 on the
| SRE d20 and what god did I anger? Also who watches the watchmen?
|
| > _A common pitfall is to introduce something like ElasticSearch,
| only to realize a few months later that no one knows how to run
| it._
|
| Yeah I've seen that exact pit fallen into.
|
| No DNS? Global Cloudflare outage == fun.
|
| No certificates?
|
| I've seen certs fail so many different ways. Of course not
| getting renewed, that's your table stakes "welcome to certs!"
| failure mode. Certs get renewed but an _allegedly Semver
| compatible_ upgrade changed the defaults, and required extensions
| don't get included, leading to the client rejecting the cert.
| I've seen a
| service which watches certs to make sure they don't expire (see
| the outage earlier in this paragraph!) have an outage (which, b/c
| it's monitoring, wasn't customer visible) because a tool issued a
| malformed cert (...by... default...) that the monitor failed to
| parse (as it was malformed). Oh, and then the LE cross-signing
| expiration took out an Azure service that wasn't ready for it, a
| service from a third-party of ours that wasn't ready for it,
| _and_ our CI system b/c several tools were out of date, including
| _an up-to-date system on Debian that was theoretically
| "supported"..._ but still shipped an ancient crypto library
| riddled with bugs in its path validation.
|
| > _Okay fine, S3 too, but that's a different animal._
|
| _Is it?_ I've seen that have outages too, & bring down a
| service with it. (There really wasn't a choice there; S3 was the
| service's backing store, & without it, the service was truly
| screwed.)
|
| But of course, all this is to say I violently agree with the
| article's core point: think carefully about each dependency, as
| they have a very real production cost.
|
| (I've recently been considering changing my title to SRE because
| I have done very little in the way of SWE recently...)
| AtNightWeCode wrote:
| Redis is overused in my opinion. For many requests it does not
| beat a database for the same amount of money. There can be other
| reasons for using a cache though. I often hear that people claim
| that the cache "protects" the database. From my experience it is
| more common that once the database has problems, they spill over
| to the cache. If, for instance, a circuit breaker then opens to
| the cache, the database will be smacked senseless.
| rtheunissen wrote:
| Often, cache is relied on so much that we are afraid to clear
| it because no one knows what the impact will be on the
| database. We now duplicate our data in many cases, have to deal
| with cache invalidation, and ironically create more risk than
| protection. Cache should be extremely selective and
| encapsulated very well.
| AtNightWeCode wrote:
| Most projects I work on use a lot of edge caching but it is
| not business critical. It is for speed. It is a problem if
| the design depends on both a cache and a database and the
| cache is dependent on the database.
| rglover wrote:
| Personally staking my own future on this "less is more" approach
| having seen some serious horror flicks in terms of
| app/infrastructure stacks the past few years.
|
| What continues to surprise me: a lot of time and money has been
| or is being wasted on reinventing the wheel (speaking
| specifically about webdev here).
|
| There are a ton of great tools out there that are battle-tested
| (e.g., my favorite find of late as someone just digging into
| hand-rolled--meaning no k8s/docker/etc--infrastructure is
| haproxy).
| xupybd wrote:
| Simple manually deployed docker images have been a great win
| for us.
|
| You get to declare all your dependencies in the docker build.
| All config is in one .env file.
|
| Installs and roll backs are trivial.
|
| Setting up new dev environments is easy.
| justin_oaks wrote:
| What do you mean when you say "manually deployed docker
| images"?
|
| It could mean that you build the images on one machine, then
| export the images as a tar files, copy those to the
| destination server, and then import the images.
|
| Or it could mean that you copy the Dockerfile and any
| necessary context files to the destination server and run the
| Docker build there.
|
| Or it could mean you still use a Docker registry (Docker Hub,
| AWS ECR, or a self-hosted registry), but you're manually
| running docker or docker-compose commands on the destination
| server instead of using an orchestrator like Kubernetes.
|
| As for me, I've done pretty well with that last option. I
| still use either an in-house Docker registry or AWS ECR, but
| I haven't needed anything like Kubernetes yet.
| ftlio wrote:
| I've seen millions of dollars spent on infrastructure to
| support what Heroku could do for a few thousand a month. Not to
| mention the egregious dev heartache caused by having to work
| against it. Anyone who argued it was a waste of time and money
| just "didn't get it" apparently.
|
| I'm a huge fan of all the cool container stuff, queues, stream
| processing, all the weird topologies for apps built with node,
| Golang. I'm with it, kids. But for an MVP, just use Heroku,
| GAE, a Droplet with SCP for god sakes.
|
| If you need to do something more complicated, growth will tell
| you.
| pphysch wrote:
| > If you need to do something more complicated, growth will
| tell you.
|
| +1. Needing to refactor & scale your infrastructure to enable
| more growth is almost always a "good problem to have".
|
| You've steadily grown to $100mm revenue and your backend is
| starting to show it because you prioritized productivity over
| premature optimizations? Oh no, the world is ending! (said no
| one ever)
| VWWHFSfQ wrote:
| I worked for a very profitable small internet biz whose
| entire deployment was git-archive + rsync. I never had to
| troubleshoot that thing even once. Now it seems like
| everybody is playing golf trying to see how many AWS services
| they can use to unpack a gzip on a server.
| rlawson wrote:
| There are so many benefits of keeping things as simple as
| possible:
| - easier troubleshooting
| - easier to maintain documentation
| - quicker onboarding of new devs
| - easier to migrate to new hosting if needed
| - quicker to add features (or decide not to add)
| baggy_trough wrote:
| The next level up of this approach is running everything on one
| box.
| pnathan wrote:
| my take on this looks similar, but I'll have more going on:
|
| 1. kubernetes
| 2. postgres
| 3. application
|
| where the kubernetes bit is used for the more integration test
| side of things.
|
| a lot of machinery can be employed that gets in the way of "wtf
| just happened".
| eezing wrote:
| I was worried about how long the initial indexing would take for
| a recent full text search implementation in Postgres.
|
| Took less than a second on a few hundred thousand rows.
|
| Naive and simple is good enough for now.
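|
| For reference, a naive Postgres full-text setup along those
| lines might look like this (the docs table and query are made
| up):
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")
|     cur = conn.cursor()
|
|     # Expression index so to_tsvector() isn't recomputed for
|     # every row on every query.
|     cur.execute(
|         """CREATE INDEX IF NOT EXISTS docs_fts_idx ON docs
|            USING gin (to_tsvector('english', body))""")
|     conn.commit()
|
|     cur.execute(
|         """SELECT id, title FROM docs
|            WHERE to_tsvector('english', body)
|                  @@ websearch_to_tsquery('english', %s)
|            LIMIT 20""", ("single dependency stacks",))
|     print(cur.fetchall())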
| cjfd wrote:
| I think this is the right idea. The pendulum between having as
| many dependencies as possible and having no dependencies at all
| has swung way too far toward the 'as many dependencies as
| possible' side. It is a major PITA when yet another random
| component breaks. Suppose that what A does can be done in B in,
| let us say, three man-weeks. I would say it is worth it. The
| advantage is that A will never break, because it is not there.
| Note that A may also break 3 years from now when everybody who
| knows anything about A has left the company.... B is now used in
| more places, so people are more likely to have been forced to
| learn it, so when the emulation of A breaks there is a better
| chance that people will know what to do. I see mostly advantages.
| mberning wrote:
| I was a bit disappointed. I thought they were going to implement
| their entire system using stored procedures. That would be
| "single dependency". As it stands it is "all my app tier
| dependencies and postgres".
| [deleted]
| M0r13n wrote:
| This is why I love Ansible. As a DevOps engineer I do not design
| or implement complex systems or programs. But I am responsible
| for the reliability of our systems and infrastructure. And
| Ansible is just pleasant to use, for the same reasons stated by
| the author:
|
| - a single package without any additional dependencies
| - no client side software
| - pure SSH
| - simple playbooks written in only YAML
|
| Focusing on simplicity and maintainability has helped me deliver
| reliable systems.
| chishaku wrote:
| What other examples are there of "single dependency stacks"?
|
| This article is really about the versatility and reliability of
| postgres.
|
| And I'm all in agreement.
|
| Reminiscent of:
|
| https://www.craigkerstiens.com/2017/04/30/why-postgres-five-...
|
| https://webapp.io/blog/postgres-is-the-answer/
|
| http://rachbelaid.com/postgres-full-text-search-is-good-enou...
|
| http://boringtechnology.club/
|
| As much as HN could lead you astray with the hype of this and
| that tech, articles like the above are some of the most
| consistently upvoted on this website.
| chubot wrote:
| Also:
|
| https://sive.rs/pg2 - Simplify: move code into database
| functions
|
| https://sive.rs/pg - PostgreSQL example of self-contained
| stored procedures
|
| some linked examples:
| https://github.com/sivers/store/tree/master/store/functions
|
| I like this idea in theory ... although it would cause me to
| need to know a lot more SQL, which is a powerful but hostile
| language :-/
|
| I care about factoring stuff out into expressions / functions
| and SQL fails in that regard ...
|
| https://www.scattered-thoughts.net/writing/against-sql/
|
| It's hard to imagine doing this without a ton of duplication. I
| have written SQL by hand and there are probably more confusing
| corners than in shell, which is saying a lot!
| np_tedious wrote:
| I have nothing to disagree with here, but it's worth noting that
| his company Crunchy Data is itself a postgres provider. So
| they, more than most, have the chops and incentive to do a great
| deal in postgres alone.
|
| https://www.crunchydata.com/
| 0xbadcafebee wrote:
| > 1 Okay fine, S3 too, but that's a different animal.
|
| I think people forget that AWS S3 isn't immutable. Unlike an EBS
| volume, it is impossible to "snapshot" S3 the way you can a
| database. There are arbitrary global limitations outside the
| scope of your control, and a dozen different problems with trying
| to restore or move or version all the things about buckets that
| aren't objects (although the objects too can be problematic
| depending on a series of factors).
|
| If you want real simplicity/repeatability/reliability, but have
| to use S3, host your own internal S3 service. This way you can
| completely snapshot both the metadata and block devices used by
| your S3 service, making it actually immutable. Plus you can do
| things like re-use a bucket name or move it to any region without
| worrying about global conflicts. (All of that is hard/expensive
| to do, however, so you should just use AWS S3 and be very careful
| not to use it in a way that is unreliable)
| [deleted]
| simonw wrote:
| My guess is that they use S3 mainly for things like backups,
| where you write once to a brand new key.
|
| I'd be surprised if they were using mutable S3 objects that
| constantly get updated in-place. They have PostgreSQL for that!
| [deleted]
| tulstrup wrote:
| This idea is super cool.
|
| I am not sure it necessarily has to be just one single
| dependency, but keeping the number of dependencies as low as
| possible makes a lot of sense to me. At least the overhead of
| introducing any given new dependency should be taken into serious
| consideration and held against the concrete benefits that will be
| gained from it.
|
| I wrote a blog post on a very similar subject, essentially all of
| the same arguments, but targeted more towards the dependencies
| and abstractions found within a given system's code structure and
| application architecture.
|
| If you are interested, you can read it here:
| https://betterprogramming.pub/avoiding-premature-software-ab...
| wwweston wrote:
| > normally I'd push all rate limiting to Redis. Here, we rate
| limit in memory and assume roughly uniform distribution between
| server nodes.
|
| Dumb question: what do they mean by rate limiting in memory vs
| via Redis? Does that mean keeping track of request origins+counts
| using those storage mechanisms, or something else?
| winrid wrote:
| You can use an in-memory LRU cache of request origin+count. You
| can also periodically take that data and do an INCREMENT
| against your DB to get fairly scalable rate limiting.
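|
| A toy per-process version of that idea (fixed one-minute
| window, invented limit), which is roughly all the "assume
| uniform distribution between nodes" approach needs:
|
|     import time
|     from collections import defaultdict
|
|     WINDOW_SECONDS = 60
|     MAX_PER_WINDOW = 100   # per node, not a global limit
|
|     _counts = defaultdict(int)   # (origin, window) -> hits
|
|     def allow(origin):
|         window = int(time.time() // WINDOW_SECONDS)
|         _counts[(origin, window)] += 1
|         # Drop old windows so the dict doesn't grow forever.
|         for key in list(_counts):
|             if key[1] < window:
|                 del _counts[key]
|         return _counts[(origin, window)] <= MAX_PER_WINDOW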
| VWWHFSfQ wrote:
| I'm guessing process-local memory. Like a Python dict or
| something
| KwisaksHaderach wrote:
| Many can even get away with less: sqlite.
|
| One less process to worry about.
| chucke wrote:
| Operationally, it makes sense. But the inevitable moment (if you
| survive) when you need to migrate to something else that depends
| on a different queue system, it'll be a pain to retrofit the code
| relying on db-level transactions and locks.
| bcrosby95 wrote:
| Except it's not inevitable. We have a few 15+ year old
| profitable projects that are still working fine on RDBMS backed
| queues.
| jjav wrote:
| > but the inevitable moment (if you survive)
|
| It's probably not inevitable. Simple is fast, and fast can
| scale you really far.
|
| Sure, if you end up being google-scale then yeah, the world
| changes. But there's very few companies that large, yours is
| probably not growing to that size.
|
| Over a decade ago I joined a mid-sized startup and took
| ownership of a service using MySQL. The first urgent warning I
| was given was that they had to migrate to Cassandra ASAP
| because soon MySQL couldn't possibly handle it.
|
| I took a look at the traffic and the growth curve and projected
| customer adoption. And then put that project on hold, no need
| yet. Company went on to an IPO, grew a lot, pretty successful
| in its industry. And years later when I left, it was still
| going strong on MySQL with no hint of approaching any
| limitations.
| xupybd wrote:
| The "if you survive" bit is key. If you survive to the point
| where you need to scale like this, you will no doubt have more
| resources available. Do what you can to get going now. Solve
| future problems as they come.
| Nextgrid wrote:
| This only makes sense if the effort to migrate is more than the
| accumulated effort of working with and maintaining that
| solution from the start.
| VWWHFSfQ wrote:
| Note that you can approximate rate-limiting in Redis with
| Postgres' UNLOGGED tables [0]. They're a lot faster than regular
| tables since they don't write to the WAL. Of course, if you
| restart your server or if it crashes then you lose the table
| contents. But for stuff like rate-limiting you probably don't
| care. And unless you're using some kind of persistence in Redis
| it happens there also.
|
| I tend to run this kind of stuff on a separate PG server so that
| the query velocity doesn't affect more biz-critical things.
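|
| A sketch of what that could look like (table name, window, and
| limit are made up), with the counter table created once at
| startup:
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")
|     cur = conn.cursor()
|
|     # Unlogged: skips the WAL, fast, but emptied after a crash.
|     cur.execute(
|         """CREATE UNLOGGED TABLE IF NOT EXISTS rate_limits (
|                origin text NOT NULL,
|                window_start timestamptz NOT NULL,
|                hits int NOT NULL DEFAULT 1,
|                PRIMARY KEY (origin, window_start))""")
|     conn.commit()
|
|     def allow(origin, limit=100):
|         cur.execute(
|             """INSERT INTO rate_limits (origin, window_start)
|                VALUES (%s, date_trunc('minute', now()))
|                ON CONFLICT (origin, window_start)
|                DO UPDATE SET hits = rate_limits.hits + 1
|                RETURNING hits""", (origin,))
|         row = cur.fetchone()
|         conn.commit()
|         return row[0] <= limit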
|
| [0] https://www.postgresql.org/docs/current/sql-
| createtable.html...
| timando wrote:
| You don't lose data on a clean restart.
| jamil7 wrote:
| Nice, Redis is awesome but definitely something I've seen
| pulled in prematurely all the time.
| hcarvalhoalves wrote:
| > But for stuff like rate-limiting you probably don't care.
|
| I guess you would care in this scenario, otherwise you have
| cascading failures (something makes pg crash, and the lack of
| rate limit allows the abuse to continue).
|
| So implementing a rate limiter separate from the rest may make
| sense too. I like their idea of doing it in memory and keeping
| the load balanced, as it doesn't rely on any dependency.
| simonw wrote:
| I hadn't seen UNLOGGED tables before, that's a really neat
| trick, thanks.
| pphysch wrote:
| There's definitely potential to go too far into monolith
| territory and misinterpret how simple your architecture actually
| is.
|
| An example: Django backed by Postgres. I tend to view this as 1
| architectural unit, i.e. Postgres is wholly embedded in Django. I
| am under no illusion that I have _both_ a Django project and a
| PostgreSQL instance. I have a Django-backed-by-Postgres. I _can_
| have that PostgreSQL instance be a standalone interface, but that
| means increasing my architectural units from 1 to 2. Instead, if
| I want to integrate with Django's raw tables, I'm going to do it
| on Django's terms (via a custom HTTP API) rather than fighting
| the ORM over who gets to DBA the database. Bad for performance?
| No doubt. We'll worry about that when we get there.
|
| Yes, you can run a web app server directly out of Postgres
| without an additional "app layer" like Django (Crunchy has some
| cool tools for this). But should you?
|
| To be clear, I'm a big fan of KISS, just skeptical of false
| minimalism.
| justin_oaks wrote:
| Agreed. This quote seems relevant: "Everything should be made
| as simple as possible, but not simpler."
|
| The article talks about rate limiting using Redis and
| dropping it in favor of handling it on each server node and
| assuming uniform distribution of requests. Doing that is a
| trade-off of precision rate-limiting for a simpler
| architecture.
|
| That may be a good trade-off, but only if you can get away with
| it. If they were required to have more precise rate-limiting
| then the simpler architecture would not have been possible.
|
| In my own work, I used Memcached instead of Redis for rate
| limiting data. The applications were coded to fall back to the
| per-node rate limiting if Memcached weren't reachable.
| Memcached may have been another dependency, but it was one of
| the less troublesome dependencies. I never experienced a
| problem with it in production. The fallback behavior meant that
| we didn't even need Memcached in a dev environment.
|
| I guess my point is this: Not all dependencies are as
| troublesome as others.
| KwisaksHaderach wrote:
| What's the crunchy tool for this?
| craigkerstiens wrote:
| I believe they're referring to some tools like pg_tileserv
| which gives you a turnkey tile server on top of PostGIS. As
| it stands today we don't have anything to automatically run
| that app from Postgres itself (but stay tuned we might be
| launching something around that in just a few weeks).
| Tileserv is in an interesting category, like many other turnkey
| APIs or services (like PostgREST or Postgraphile) on top
| of a database, but I don't view them as different than, say,
| running a Django app for example.
___________________________________________________________________
(page generated 2022-02-09 23:00 UTC)