[HN Gopher] Building a highly-available web service without a da...
___________________________________________________________________
Building a highly-available web service without a database
Author : tdrhq
Score : 236 points
Date : 2024-08-10 02:37 UTC (20 hours ago)
(HTM) web link (blog.screenshotbot.io)
(TXT) w3m dump (blog.screenshotbot.io)
| nephy wrote:
| We didn't want to build something complicated, so we implemented
| our own raft consensus layer. Have you considered just using
| Redis?
| tdrhq wrote:
| Haha, I totally hear you. But we didn't really build the
| raft consensus layer from scratch. We used an existing robust
| library for that: https://github.com/baidu/braft
| ramon156 wrote:
| You completely skipped the question though
| ActorNightly wrote:
| To throw the question back at you: have you considered that
| this isn't complicated?
| nephy wrote:
| No I haven't because it's quite complicated. Databases are
| very much a solved problem. Unfortunately, this architecture
| is going to be nigh impossible to hire for and when it goes
| absolutely sideways recovery will be difficult.
| ahoka wrote:
| That's the best part, you don't realize when things go
| sideways.
| echoangle wrote:
| Compared to installing, configuring and maintaining an
| installation of Redis, this absolutely is complicated. Do you
| think this is less complicated than using Redis?
| 1oooqooq wrote:
| redis and mongo are the type of things i will yak shave to no
| end so i don't have to deploy them in production
| nephy wrote:
| I'm honestly not sure what you are talking about. In my
| experience, Redis is super easy to run and manage in
| production.
| ahoka wrote:
| If you like split brains, yes. :)
| sparrish wrote:
| I'm with you. We've been using Redis in production for more
| than a decade and it's one of the easiest distributed DBs
| we've ever used.
| gunapologist99 wrote:
| Redis is best as an in-memory cache, not a database. Having
| used it in production for roughly a decade, I don't trust its
| on-disk capabilities (AOF/RDB etc) as either solid or reliable
| (or even performant) in an emergency scenario, especially with
| DR or DB migration in mind.
| localfirst wrote:
| I would use cloudflare R2 but it's not globally distributed so it's
| pointless using it on edge
|
| otherwise I get the messaging: with edge, the database is the
| bottleneck
|
| just need a one stop shop to do edge functions + edge db
| tazu wrote:
| Cloudflare's durable objects seem similar to this article's
| "objects in RAM", but I think you still have to do some minimal
| serialization.
| jeremycarter wrote:
| The Cloudflare durable object is very much the same as a
| Virtual Actor
|
| https://www.microsoft.com/en-us/research/project/orleans-
| vir...
| Zak wrote:
| Decades ago, PG wrote that he didn't use a database for Viaweb,
| and that it seemed odd for web apps to be frontends to databases
| when desktop apps were not[0]. HN also doesn't use a database.
|
| That's no longer true, with modern desktop and mobile apps often
| using a database (usually SQLite) because relational data storage
| and queries turn out to be pretty useful in a wide range of
| applications.
|
| [0] https://www.paulgraham.com/vwfaq.html
| never_inline wrote:
| I think even SQLite itself wasn't as ubiquitous (edit: it
| didn't exist) when pg wrote viaweb. If SQLite wasn't there and
| my options were basically key value stores, I could as well use
| filesystem in most cases.
|
| Second, querying the RDBMS has been much simplified in the past
| 20 years. We have all kinds of ORMs and row mappers to reduce the
| boilerplate.
|
| We also got advanced features like FTS which are useful for
| desktop and mobile apps.
|
| Today it's a good choice to use RDBMS for desktop apps.
| knallfrosch wrote:
| > Today it's a good choice to use RDBMS for desktop apps.
|
| Is there an alternative? I haven't seen a "local filesystem
| is okay as data storage" software in the 21st century.
| zimpenfish wrote:
| > If SQLite wasn't there and my options were basically key
| value stores
|
| Well, there were "options" other than KV stores - MySQL
| launched a month before Viaweb (but flakey for a good long
| while.) Oracle was definitely around (but probably $$$$.)
| mSQL was being used on the web and reasonably popular by 1995
| (cheap! cheerful! not terrible!)
|
| (definitely understand making your own in-memory DB in 1995
| though)
| endorphine wrote:
| HN does not use a database?! Can you expand on that? It's very
| surprising to me.
| exe34 wrote:
| probably uses the filesystem as the backing store
| szundi wrote:
| Filesystems these days are like dbs
| ahoka wrote:
| Good luck transactionally writing files to a random FS,
| but especially without access to native OS APIs.
| 1oooqooq wrote:
| if pg is still stuck in the 90s lisp, i bet it's just a
| single process with the site in ram, using make-object-
| persistent and loading as needed (kinda like python pickle).
|
| that was all the rage for prototypes back then.
| Zak wrote:
| It just persists its in-memory data structures to disk.
| Here's the source of an old version; note uses of `diskvar`
| and `disktable`. A "table" here is just a hashtable.
|
| https://github.com/wting/hackernews/blob/master/news.arc
| tim333 wrote:
| I think the structure is very simple. It's just a lot of
| items; your comment, for example, is item 41207393, as in
| https://news.ycombinator.com/item?id=41207393
|
| I think that is just written to disk as something like
| file41207393 when you click reply.
|
| When the system needs an item it sees if it's cached in
| memory and otherwise reads it from disk, and I think that is
| pretty much the whole memory system. Some other stuff, like
| user IDs, works in the same sort of way.
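|
| A minimal sketch of the scheme guessed at above: one file per
| item, with an in-memory cache in front. This is not HN's actual
| code (that is Arc); the `ItemStore` class, the JSON encoding and
| the `item<id>.json` file naming are all assumptions made for
| illustration.

```python
import json
from pathlib import Path


class ItemStore:
    """Hypothetical file-per-item store with an in-memory cache."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.cache = {}  # item id -> item dict

    def _path(self, item_id):
        return self.root / f"item{item_id}.json"

    def save(self, item_id, item):
        # Write to a temp file and rename it into place, so a reader
        # never observes a half-written item file.
        tmp = self._path(item_id).with_suffix(".tmp")
        tmp.write_text(json.dumps(item))
        tmp.replace(self._path(item_id))
        self.cache[item_id] = item

    def load(self, item_id):
        # Serve from memory when possible, otherwise read from disk once.
        if item_id not in self.cache:
            self.cache[item_id] = json.loads(self._path(item_id).read_text())
        return self.cache[item_id]
```

| A fresh `ItemStore` pointed at the same directory starts with an
| empty cache and fills it lazily on the first `load` of each item.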
| tdrhq wrote:
| I was certainly inspired by PG's writing (after all we do use
| Common Lisp, and it's hard to avoid PG in this space). But I
| don't think they did things like transaction logs like how
| bknr.datastore does, which makes the development process a lot
| more seamless.
| chipdart wrote:
| > Decades ago, PG wrote that he didn't use a database for
| Viaweb, and that it seemed odd for web apps to be frontends to
| databases when desktop apps were not[0].
|
| After reading the link, I don't think that database means the
| same thing for everyone.
|
| The vwfaq still mentions loading data from disk, and also
| mentions "start up a process to respond to an HTTP request."
| This suggests that by "database" they meant a separate server
| dedicated to persist data, and having to communicate with
| another server to fetch that data.
|
| Obviously, this leaves SQLite out of this definition of
| database. Also, if you're loading data from disk already,
| either you're using a database or you're implementing your own
| ad-hoc persistence layer. Would you still consider you're using
| a database if you load data from SQLite at app start?
|
| The problem with this sort of mental model is that it ignores
| the fact that the whole point of a database is to persist and
| fetch data in a way that is convenient to you without having to
| bother about low-level details. Storing data in a database does
| not mean running a postgres instance somewhere and fetching
| data over the web. If you store all your data in-memory and
| have a process that saves snapshots to disk using a log-
| structured data structure... Congratulations, you just
| developed your own database.
| cultofmetatron wrote:
| it was a different time. to my knowledge, viaweb was a series
| of common lisp instances. All state for a user session was
| held IN MEMORY on the individual machine. I remember reading
| somewhere that they would be on a call with a user on
| production and patch bugs in real time while they were on the
| phone.
|
| The web has gotten bigger and a lot of these practices simply
| would not fly today. If I were pushing a live fix on our prod
| machine with the amount of testing that doing it live while the
| customer is on the phone entails, a good portion of you
| would be questioning my sanity.
| Zak wrote:
| An important reason that practice wasn't as reckless as it
| sounds is that early Viaweb was just a page builder. The
| actual web stores its customers were building were _static
| HTML_, so updating a customer's instance while talking to
| them on the phone only affected that one user's backend.
| Sn0wCoder wrote:
| Not sure I would call that setup simple, but it is interesting. I
| have honestly never heard of 'Raft' or the Raft Consensus
| Protocol or bknr.datastore, so always happy to learn something on
| a Friday night.
| tdrhq wrote:
| Author here.
|
| I agree, the infrastructure required to make this happen
| eventually gets quite complicated. But the developer experience
| is what's super simple. If somebody had to take all our
| infrastructure and just use it to build their next big app,
| they can get the simplicity without worrying about the internal
| plumbing.
| pclmulqdq wrote:
| Raft is fantastic and most modern systems with more than one
| node are built on Raft. It is actually proven to be equivalent
| to Paxos, but the semantics of it are closer to what you would
| prefer as a software writer and the implementation is much
| simpler.
| nickpsecurity wrote:
| What they described early on in the article was basically how
| NUMA machines worked (eg SGI Altix or UV). Also, their claimed
| benefit was being able to parallelize things with multithreading
| in low-latency, huge RAM. Clustering came as a low-cost
| alternative to $1+ million machines. There's similarities to
| persistence in AS/400, too, where apps just wrote memory that
| gets transparently mapped to disk.
|
| Now, with cheap hardware, they're going back in time to the
| benefits of clustered, NUMA machines. They've improved on it
| along the way. I did enjoy the article.
|
| Another trick from the past was eliminating TCP/IP stacks from
| within clusters to knock out their issues. Solutions like Active
| Messages were a thin layer on top of the hardware. There's also
| designs for network routers that have strong consistency built
| into them. Quite a few things they could do.
|
| If they get big, there's hardware opportunities. On CPU side, SGI
| did two things. Their NUMA machines expanded the number of CPU's
| and RAM for one system. They also allowed FPGA's to plug directly
| into the memory bus to do custom accelerators. Finally, some
| CompSci papers modified processor ISA's, networks on a chip, etc
| to remove or reduce bottlenecks in multithreading. Also, chips
| like OpenPiton increase core counts (eg 32) with open,
| customizable cores.
| oconnore wrote:
| My first thought was, "oh, I used to do this when I wrote Common
| Lisp, it's funny someone rediscovered that technique in
| <rust/typescript/java/whatever>".
|
| But no, just more lispers.
| joatmon-snoo wrote:
| This is cool! I'm always excited by people trying simpler things,
| as a big fan of using Boring Technology.
|
| But I have some bad news: you haven't built a system without a
| database, you've just built your own database without
| transactions and with weak durability properties.
|
| > Hold on, what if you've made changes since the last snapshot?
| And this is the clever bit: you ensure that every time you change
| parts of RAM, we write a transaction to disk.
|
| This is actually not an easy thing to do. If your shutdowns are
| always clean SIGSTOPs, yes, you can reliably flush writes to
| disk. But if you get a SIGKILL at the wrong time, or don't handle
| an io error correctly, you're probably going to lose data.
| (Postgres' 20-year fsync issue was one of these:
| https://archive.fosdem.org/2019/schedule/event/postgresql_fs...)
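|
| To make the durability point concrete, here is a minimal sketch
| of an append that only returns once the OS reports the bytes are
| on disk. The `append_durably` helper is hypothetical and is not
| taken from the article or from bknr.datastore; it only uses
| standard `os` calls.

```python
import os


def append_durably(path, record: bytes):
    """Append one record; return only after fsync has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        data = record + b"\n"
        while data:
            # os.write may write fewer bytes than asked, so loop.
            written = os.write(fd, data)
            data = data[written:]
        # Without this call the record may sit in the page cache and
        # vanish on a crash or power loss. An OSError raised here
        # should be treated as fatal rather than swallowed or retried.
        os.fsync(fd)
    finally:
        os.close(fd)
```

| A process killed before `os.fsync` returns may or may not have
| persisted the record, so a caller should acknowledge a write only
| after this function has returned.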
|
| The open secret in database land is that for all we talk about
| transactional guarantees and durability, the reality is that
| those properties only start to show up in the very, very, _very_
| long tail of edge cases, many of which are easily remedied by
| some combination of humans getting paged and end users developing
| workarounds (eg double entry bookkeeping). This is why MySQL's
| default isolation level can lose writes: there are usually enough
| safeguards in any given system that it doesn't matter.
|
| A lot of what you're describing as "database issues"
| doesn't sound to me like DB issues, so much as latency issues
| caused by not colocating your service with your DB. By hand-
| rolling a DB implementation using Raft, you've also colocated
| storage with your service.
|
| > Screenshotbot runs on their CI, so we get API requests 100s of
| times for every single commit and Pull Request.
|
| I'm sorry, but I don't think this was as persuasive as you meant
| it to be. This is the type of workload that, to be snarky about
| it, I could run off my phone[0].
|
| [0]: https://tailscale.com/blog/new-internet
| tdrhq wrote:
| > This is actually not an easy thing to do. If your shutdowns
| are always clean SIGSTOPs, yes, you can reliably flush writes
| to disk. But if you get a SIGKILL at the wrong time, or don't
| handle an io error correctly, you're probably going to lose
| data.
|
| Thanks for the comment! This is handled correctly by
| Raft/Braft. With Raft, before a transaction is considered
| committed it must be committed by a majority of nodes. So if
| the transaction log gets corrupted, it will restore and get the
| latest transaction logs from the other node.
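|
| The majority rule can be sketched as follows. This is a toy
| model, not Braft's API: `replicate` and the follower callables
| are hypothetical, and real Raft also tracks terms, log indexes
| and leader election, all omitted here.

```python
def replicate(entry, followers, cluster_size):
    """Return True if a majority of the cluster stored the entry.

    `followers` is a list of callables that try to append the entry
    on one follower each and return True on success.
    """
    acks = 1  # the leader has already appended the entry to its own log
    for append in followers:
        try:
            if append(entry):
                acks += 1
        except ConnectionError:
            pass  # an unreachable follower simply does not count
    return acks >= cluster_size // 2 + 1
```

| In a 3-node cluster the leader plus one follower is enough to
| commit, so a single node with a corrupted or missing log can
| catch up from the others.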
|
| > I'm sorry, but I don't think this was as persuasive as you
| meant it to be.
|
| I wasn't trying to be persuasive about this. :) I was trying to
| drive home the point that you don't need a massively
| distributed system to make a useful startup. I think some
| founders go the opposite direction and try to build something
| that scales to a billion users before they even get their first
| user.
| joatmon-snoo wrote:
| Wait, so you're blocking on a Raft round-trip to make forward
| progress? That's the correct decision wrt durability, but...
|
| I'm now completely lost as to why you believe this was a good
| idea over using something like MySQL/Postgres/Aurora. As I
| see it, you've added complexity in three different dimensions
| (novel DB API, novel infra/maintenance, and novel
| oncall/incident response) with minimal gain in availability
| and no gain in performance. What am I missing?
|
| (FWIW, I worked on Bigtable/Megastore/Spanner/Firestore in a
| previous job. I'm pretty familiar with what goes into
| consensus, although it's been a few years since I've had to
| debug Paxos.)
|
| > I was trying to drive home the point that you don't need a
| massively distributed system to make a useful startup. I
| think some founders go the opposite direction and try to
| build something that scales to a billion users before they
| even get their first user.
|
| This reads to me as exactly the opposite: overengineering for
| a problem that you don't have.
|
| For exactly the reasons you describe, I would argue the
| burden of proof is on you to demonstrate why Redis, MySQL,
| Postgres, SQLite, and other comparable options are
| insufficient for your use case.
|
| To offer you an example: let's say your Big Customer decides
| "hey, let's split our repo into N micro repos!" and they now
| want you to create N copies of their instance so they can
| split things up. As implemented, you'll now need to implement
| a ton of custom logic for the necessary data transforms. With
| Postgres, there's a really good chance you could do all of
| that by manipulating the backups with a few lines of SQL.
| aeinbu wrote:
| > As implemented, you'll now need to implement a ton of
| custom logic for the necessary data transforms. With
| Postgres, there's a really good chance you could do all of
| that by manipulating the backups with a few lines of SQL.
|
| Isn't writing <<a few Lines of SQL>> also custom logic? The
| difference is just the language.
|
| It is also possible that the custom data store is more
| easily manipulated with other languages than SQL.
|
| SQL really is great for manipulating data, but not all
| relational databases are easy to work with.
| oefrha wrote:
| Seems weird to start with "not talking about using something like
| SQLite where your data is still serialized", then end up with a
| home grown transaction log that requires serialization and needs
| to be replicated, which is how databases are replicated anyway.
|
| If your load fits entirely on one server, then just run the
| database on that damn server and forget about "special
| architectures to reduce round-trips to your database". If your
| data fits entirely in RAM, then use a ramdisk for the database if
| you want, and replicate it to permanent storage with standard
| tools. Now that's actually simple.
| Groxx wrote:
| I do feel like this largely summarizes as "we built our own
| sqlite + raft replication", yeah. But without sqlite's battle-
| tested reliability or the ability to efficiently offload memory
| back to disk.
|
| So, basically, https://litestream.io/ . But perhaps faster
| switching thanks to an explicit Raft setup? I'm not a
| litestream user so I'm not sure about the subtleties, but it
| sounds awfully similar.
|
| That overly-simplified summary aside, I quite like the idea and
| I think the post does a pretty good job of selling the concept.
| For a lot of systems it'll scale more than well enough to
| handle most or all of your business even if you become
| abnormally successful, and the performance will be absurdly
| good compared to almost anything else.
| kitd wrote:
| Rqlite would be a better comparison. It is actually SQLite +
| raft
|
| https://github.com/rqlite/rqlite
| otoolep wrote:
| rqlite author here, happy to answer any questions.
| lifeisstillgood wrote:
| So some dumb questions if you don't mind
|
| - In GitHub readme you mention etcd / consul. Is rqlite
| suitable for transaction processing as well ?
|
| - I am imagining a dirt simple load balancer over two web
| servers. They are a crud app backed onto a database. What
| are the disadvantages of putting rqlite on each server
| compared to, say, having a third backend database?
| otoolep wrote:
| It depends on what kind of transaction support you want.
| If your transactions need to span rqlite API requests
| then no, rqlite doesn't support that (due to the
| stateless nature of HTTP requests). That sort of thing
| could be developed, but it's substantial work. I have
| some design ideas, it may arrive in the future.
|
| If you need to ensure that a given API request (which can
| contain multiple SQL statements) is atomically processed
| (all SQL statements succeed or none do) that _is_
| supported however [1]. That's why I think of rqlite as
| closer to the kind of use cases that etcd and Consul
| support, rather than something like Postgres -- though
| some people have replaced their use of Postgres with
| rqlite! [2]
|
| [1] https://rqlite.io/docs/api/api/#transactions
|
| [2] https://www.replicated.com/blog/app-manager-with-rqlite
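|
| A sketch of what such an atomic request might look like from a
| client, based on my reading of the docs linked as [1]: several
| statements posted as one JSON array with the `transaction` query
| parameter. The `transaction_request` helper, the table name and
| the `localhost:4001` address are assumptions for illustration;
| the request is only built here, not sent.

```python
import json
import urllib.request


def transaction_request(statements, base_url="http://localhost:4001"):
    """Build (but do not send) one atomic multi-statement request."""
    return urllib.request.Request(
        f"{base_url}/db/execute?transaction",
        data=json.dumps(statements).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = transaction_request([
    "INSERT INTO foo(name) VALUES('fiona')",
    "INSERT INTO foo(name) VALUES('sinead')",
])
# Sending it would be urllib.request.urlopen(req) against a live node.
```

| Either both inserts apply or neither does, but the transaction
| cannot stay open across a second HTTP request.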
| lifeisstillgood wrote:
| Thank you - so my takeaway is that rqlite is well suited
| for distributed "publishing" of data ala etcd, but it is
| possible to use it as a Postgres replacement - thank you
| I will give it a go
| otoolep wrote:
| As for your second question, I don't think you'd benefit
| much from that, for two reasons:
|
| - rqlite is a Raft-based system, with quorum requirements.
|   Running a 2-node system doesn't make much sense. [1]
| - All writes go to the Raft leader (rqlite makes sure this
|   happens transparently if you don't initially contact the
|   Leader node [2]). A load balancer, in this case, isn't
|   going to allow you to "spread load".
|
| What a load balancer is useful for when it comes to rqlite is
| making life simpler for clients -- they just hit the load
| balancer, and it will find some rqlite node to handle the
| request (redirecting to the Leader _if_ needed).
|
| [1] https://rqlite.io/docs/clustering/general-
| guidelines/#cluste...
|
| [2] https://rqlite.io/docs/faq/#can-any-node-execute-a-
| write-req...
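|
| The quorum arithmetic behind the 2-node remark, as a small
| sketch. The helper names are made up; the only assumption is the
| usual majority rule for Raft clusters.

```python
def quorum(nodes):
    """Smallest number of nodes that forms a majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can be lost while a majority remains."""
    return nodes - quorum(nodes)


for n in (1, 2, 3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
# prints:
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

| Two nodes need both members up to make progress, so they
| tolerate no more failures than one node does; three is the
| smallest cluster that survives losing a node.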
| Groxx wrote:
| I'll throw in a "ehh... sorta" though rqlite is quite neat
| and very much worth considering.
|
| The main caveat here is that rqlite is an out-of-process
| database, which you communicate with over http. That puts
| it on similar grounds as e.g. postgres, just significantly
| lighter weight, and somewhat biased in favor of running it
| locally on every machine that needs the data.
|
| So minimum read latency is likely much lower than postgres,
| but it's still noticeable when compared to in-process
| stuff, and you lose other benefits of in-process sqlite,
| like trivial extensibility.
| oefrha wrote:
| They basically only save on serialization & deserialization
| at query time, which I would consider an infinitesimal saving
| in the vast majority of use cases. They claim to be able to
| build some magical index that's not possible with existing
| disk-based databases (I didn't read the linked blog post).
| They lose access to a nice query language and entire
| ecosystems of tools and domain knowledge.
|
| I fail to see how this little bit of saving justifies all the
| complexity for run-of-the-mill web services that fit on one
| or a few servers as described in the article. The context
| isn't large scale services where 1ms/request saving
| translates to $$$, and the proposal doesn't (vertically)
| scale anyway.
| oefrha wrote:
| One thing I forgot to mention: if you use a not-in-process
| RDBMS on the same machine you also incur some socket
| overhead. But that's also small.
| rrrix1 wrote:
| You should probably RTFA before making broad assumptions on
| their solution and how it works. Most of what you wrote is
| both incorrect and addressed in the article.
| otabdeveloper4 wrote:
| SQLite doesn't do Raft. There isn't any simple way to do
| replicated SQLite. (In fact, writing your own database is
| probably the simplest way currently, if SQLite+Raft is
| actually what you want.)
| carderne wrote:
| What about rqlite?
| robertclaus wrote:
| Agreed. Reinventing the WAL means reinventing (or ignoring) all
| the headaches that come with it. I got the impression it takes
| them a long time to recover from the logs, so they likely
| haven't even gotten as far as log checkpointing.
| chipdart wrote:
| > Agreed. Reinventing the WAL means reinventing (or ignoring)
| all the headaches that come with it.
|
| But if the blogger learned SQLite, how would they have a
| topic to blog about?
|
| Also, no benchmarks. It's quite odd that an argument grounded
| on performance claims does not bother to put out any hard
| data comparing the output of this project. I'm talking about
| basic things like how does this contrived custom ad-hoc setup
| compare with vanilla, out-of-the-box SQLite deployment? Which
| one performs worse and by how much? How does the performance
| difference reflect in request times and infrastructure cost?
| Does it actually pay off to replace the dozen lines of code
| of onboarding SQLite with a custom, in-development, ad-hoc
| setup? I mean, I get the weekend personal project vibe of
| this blog post, but if this is supposed to be a production-
| minded project then step zero would have been a performance
| test on the default solution. Where is it?
| bjornsing wrote:
| > I got the impression it takes them a long time to recover
| from the logs, so they likely haven't even gotten as far as
| log checkpointing.
|
| The OP starts out by talking about periodically dumping
| everything in RAM to disk. I'd say that's your checkpointing.
| bingo-bongo wrote:
| You don't even need a ram disk imho, databases already cache
| everything in memory and only writes reach the disk.
|
| Just try and cold-start your database and run a fairly large
| select twice.
| piker wrote:
| Also the OS will cache a lot of the reads even if your
| database isn't sophisticated enough or tuned correctly. Still
| could be a fun exercise, as with all things on here.
| LtdJorge wrote:
| Any half decent DBMS bypasses the page cache, except for
| LMDB.
| nine_k wrote:
| Trading systems bluntly keep everything in RAM, in preallocated
| structures. It all depends on the kind of tradeoffs you're
| willing to make.
| ahoka wrote:
| I used to work on a telecom platform (think something that
| runs 4G services), where every node was just part of an in-
| memory database that replicated using 2PC and just did
| periodic snapshot to avoid losing data. Basically processes
| were colocated with their data in the DB.
| icedchai wrote:
| I worked on a lottery / casino system that was similar. In
| memory database ( memory mapped files), with a WAL log for
| transaction replay / recovery. There was also a periodic
| snapshot capability. It was incredibly low latency on late
| 90's era hardware.
| phamilton wrote:
| Very erlang/otp. Joe Armstrong used to rant to anyone who
| would listen that we used databases too often. If data was
| important, multiple nodes probably need a copy of it. If
| multiple nodes need a copy, you probably have plenty of
| durability.
|
| Even if you weren't using erlang, his influence (and in
| general, ericsson) permeates the telecom industry.
| ActorNightly wrote:
| Setting up a single server with database replication and
| restore functionality is arguably more complex than setting
| this up.
|
| There are libraries available to wrap your stuff with this
| algorithm, and the benefit is that you write your server like
| it would run on a single machine, and then when launching it in
| prod across multiple, everything just works.
| tdrhq wrote:
| I think it's important to understand that every startup goes
| through three phases: Explore, Expand, Extract. What's simple
| in one phase isn't simple in the other.
|
| A transactional database is simple in Expand and Extract, but
| adds additional overhead during the Explore phase, because
| you're focusing on infrastructure issues rather than product.
| Data reliability isn't critical in the Explore phase either,
| because you just don't have customers, so you just don't have
| data.
|
| Having everything in memory with bknr.datastore (without
| replication) is simple in the Explore phase, but once you get
| to Expand phase it adds operational overhead to make sure that
| data is consistent.
|
| But by the time I've reached the Expand phase, I've already
| proven my product and I've already written a bunch of code.
| Rewriting it with a transactional database doesn't make sense,
| and it's easier to just add replication on top of it with Raft.
| gtirloni wrote:
| I'd assume in the beginning you do not want to spend time
| writing a bunch of highly difficult code until you've proven
| your idea/product. Then when you're big enough and have the
| money, start replacing things where it makes sense. It seems
| to be the strategy used by many companies.
|
| Unless, of course, your startup is in the business of selling
| DBMSes.
| Groxx wrote:
| Having Explored with a transactional database: I really can't
| agree. Just change your database, migrations are easy and
| should be something you're comfortable doing at any time, or
| you'll get stuck working around it for 100x more effort in
| the future.
| kasey_junk wrote:
| That was the biggest disconnect I had as well. SQL DBs have
| the _best_ data migration tooling and practices of any data
| system. It's not addressed in the article how migrations
| are handled with this system but I'm assuming it's a hand
| rolled set of code for each one.
|
| I think SQL DBs make the most sense during the explore phase
| and you switch off of them once you know you need an
| improvement somewhere (like latency or horizontal
| scalability).
| troupo wrote:
| > If your data fits entirely in RAM, then use a ramdisk for
| the database if you want, and replicate it to permanent storage
| with standard tools
|
| Then you get used to near-zero latency that in-RAM data gives
| you, and when it outgrows your RAM, it's a pain in the butt to
| move it to disk :)
| theideaofcoffee wrote:
| I get the desire to experiment with interesting things, but it
| seems like such a huge waste of time to avoid having to learn the
| most basic aspects of MySQL or postgres. You could "just" build
| on top of one and be done with it, especially if you're running in a
| public cloud provider. I don't buy the increased RTT or troubles
| with concurrency issues, the latter having simple solutions by
| basic tuning, or breaking out your noisy customers. There's
| another post on their blog mentioning the possibility of adding
| 10 million rows per day and the challenges of indexing that.
| That's... literally nothing and I don't think even 10x that
| justifies having to engineer a custom solution.
|
| Worse is better until you absolutely need to be less worse, then
| you'll know for sure. At that point you'll know your pain points
| and can address them more wisely than building more up front.
| chipdart wrote:
| > I get the desire to experiment with interesting things, but
| it seems like such a huge waste of time to avoid having to
| learn the most basic aspects of MySQL or postgres.
|
| For server-based database engines you can still make an
| argument on shedding network calls. It's dubious, but you can.
|
| What's baffling is that the blogger tries to justify not
| picking up SQLite claiming it might have features that they
| don't need, which is absurd and does not justify anything.
|
| The blog post reads like a desperate attempt to start with a
| poor solution to a fictitious problem and proceed to come up
| with far-fetched arguments hoping to reject the obvious
| solution.
| wongarsu wrote:
| If you want to shed network calls, the easiest solution would
| be to just run postgres or MySQL on the same server and
| connect to it via a Unix domain socket. So even if SQLite
| wasn't an option, network overhead isn't a good argument.
| ibash wrote:
| 1. If your entire cluster goes down do you permanently lose
| state?
|
| 2. Are network requests / other ephemeral things also saved to
| the snapshot?
| tdrhq wrote:
| [Author here] The transactions and snapshots are still logged
| to disk. So if the cluster goes down and comes back up, each
| one just reloads the state. Until at least two machines are
| back up, we won't be able to serve requests though.
|
| Not sure what you mean by ephemeral things. If you mean things
| like file descriptors, they are not stored. Technically the
| snapshot is not a simple snapshot of RAM, it snapshots through
| all the objects in memory that are set up to be part of the
| datastore. (It's a bit more complicated and flexible than this,
| but that's the general idea.)
| ibash wrote:
| Ah awesome! Thank you!
| wmf wrote:
| This sounds a lot like Prevayler. https://prevayler.org/
| tdrhq wrote:
| [Author here] Indeed, bknr.datastore was inspired by Prevayler
| and similar libraries
| Tehdasi wrote:
| Hmm, but the problem with having in-memory objects rather than a
| db is you end up having to replicate a lot of the features of a
| relational database to get a usable system. And adding all these
| extra features you want from those dbs ends up making a simple
| solution not very simple at all.
| swiftcoder wrote:
| To some extent I think this is an "if all you have is a
| hammer..." situation. Relational DBs are often not a great fit
| for how contemporary software manages data in memory (hence the
| proliferation of ORMs, and adapter layers like graphql). I
| think it's often easier to write out one's relations in the
| data structures directly, rather than mapping them to queries
| and joins
| iammrpayments wrote:
| Isn't this like redis?
| andrewstuart wrote:
| But why, when you can build things in an ordinary way with
| ordinary tech like Python/Java/C#/TypeScript and Postgres. Lots
| of developers know it, lots of answers to your questions online,
| the AI knows how to write it.
|
| Reading posts like this makes me think the founders/CTO are mixing
| hobby programming with professional programming.
| nesarkvechnep wrote:
| Why not, though? Because you only know the languages you
| listed?
| andrewstuart wrote:
| A home grown maintenance nightmare. Try logging in and
| querying and working out what is going on.
|
| There's literally no reason to waste time doing all this.
|
| So many lines of pointless, wasted code.
|
| Which is absolutely fine if you are hobby programming but if
| you are running a business then this approach is wasteful.
| jhardy54 wrote:
| > Hold on, what if you've made changes since the last snapshot?
| And this is the clever bit: you ensure that every time you change
| parts of RAM, we write a transaction to disk. So if you have a
| line like foo.setBar(2), this will first write a transaction that
| says we've changed the bar field of foo to 2, and then actually
| set the field to 2. An operation like new Foo() writes a
| transaction to disk to say that a Foo object was created, and
| then returns the new object.
|
| >
|
| > And so, if your process crashes and restarts, it first reloads
| the snapshot, and replays the transaction logs to fully recover
| the state. (Notice that index changes don't need to be part of
| the transaction log. For instance if there's an index on field
| bar from Foo, then setBar should just update the index, which
| will get updated whether it's read from a snapshot, or from a
| transaction.)
|
| That's a database. You even linked to the specific database
| you're using [0], which describes itself as:
|
| > [...] in-memory database with transactions [...]
|
| Am I misunderstanding something?
|
| [0]: https://github.com/bknr-datastore/bknr-datastore
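The snapshot-plus-transaction-log scheme quoted in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how bknr.datastore is implemented: the `Store` and `Foo` classes, the JSON file formats and the file names are all hypothetical. It shows the two ideas from the quote: each change is written to disk before memory is mutated, and recovery reloads the snapshot and then replays the log.

```python
import json
import os
import tempfile


class Foo:
    """Hypothetical persistent object with a single field, `bar`."""

    def __init__(self, bar=0):
        self.bar = bar


class Store:
    """Toy store: a snapshot file plus an append-only transaction log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.objects = {}  # object id (str) -> Foo
        self._recover()

    def _apply(self, tx):
        # Apply one transaction to the in-memory state.
        if tx["op"] == "create":
            self.objects[tx["id"]] = Foo()
        elif tx["op"] == "set_bar":
            self.objects[tx["id"]].bar = tx["value"]

    def _commit(self, tx):
        # Write the transaction to disk first, then mutate memory.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(tx) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(tx)

    def create_foo(self, obj_id):
        self._commit({"op": "create", "id": obj_id})
        return self.objects[obj_id]

    def set_bar(self, obj_id, value):
        self._commit({"op": "set_bar", "id": obj_id, "value": value})

    def snapshot(self):
        # Dump every object, then empty the log the snapshot supersedes.
        state = {obj_id: foo.bar for obj_id, foo in self.objects.items()}
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        open(self.log_path, "w").close()

    def _recover(self):
        # Reload the last snapshot, then replay transactions logged since.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                for obj_id, bar in json.load(f).items():
                    self.objects[obj_id] = Foo(bar)
        if os.path.exists(self.log_path):
            with open(self.log_path) as log:
                for line in log:
                    self._apply(json.loads(line))
```

Creating a second `Store` on the same directory simulates a restart: it ends up with the same objects as the first. Indexes, concurrency and replication are deliberately left out of this sketch.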
| apexkid wrote:
| > periodically just take a snapshot of everything in RAM.
|
| Sounds similar to `stop the world Garbage collection` in Java.
| Does your entire processing come to a halt when you do this? How
| frequently do you need to take snapshots? Or do you have a way to
| do this without halting everything?
| tdrhq wrote:
| Good catch! Snapshotting was certainly a bottleneck that I
| chose not to write about.
|
| But we aren't really taking the snapshot of RAM, instead we're
| running some code asking each object to snapshot itself into a
| stream. If you do this naively, it will block writes on the
| server until the snapshot is done (reads will continue to
| work).
|
| But Raft has a protocol for asynchronous snapshots. So in the
| first step we take an immutable fast snapshot of the state we
| care about which happens quickly, then writes can keep going
| while in the background we serialize the state to disk.
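The two-step snapshot described in the reply above (a quick in-memory copy, then background serialization) might look roughly like this in Python. It is a sketch under assumptions: `SnapshottingStore` is a hypothetical class, and `copy.deepcopy` stands in for whatever cheap immutable copy the real system takes. The sketch does not use braft's snapshot API.

```python
import copy
import json
import os
import tempfile
import threading


class SnapshottingStore:
    """Toy key-value state whose snapshots only briefly block writers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}

    def write(self, key, value):
        with self._lock:
            self._state[key] = value

    def read(self, key):
        with self._lock:
            return self._state.get(key)

    def snapshot_async(self, path):
        # Step 1: hold the lock only long enough to copy the state.
        with self._lock:
            frozen = copy.deepcopy(self._state)

        # Step 2: serialize the frozen copy in the background while
        # writes continue against the live state.
        def serialize():
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(frozen, f)
            os.replace(tmp, path)

        worker = threading.Thread(target=serialize)
        worker.start()
        return worker
```

A write issued after `snapshot_async` returns is not part of that snapshot, even if the background thread has not yet written the file. For large states a deep copy would itself be slow, so a real implementation would need a cheaper way to freeze the state.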
| AdieuToLogic wrote:
| > Imagine all the wonderful things you could build if you never
| had to serialize data into SQL queries.
|
| This exists in sufficiently mature Actor model[0]
| implementations, such as Akka Event Sourcing[1], which also
| addresses:
|
| > But then comes the important part: how do you recover when your
| process crashes? It turns out that answer is easy, periodically
| just take a snapshot of everything in RAM.
|
| Intrinsically and without having to create "a new architecture
| for web development". There are even open source efforts which
| explore the RAFT protocol using actors here[2] and here[3].
|
| 0 - https://en.wikipedia.org/wiki/History_of_the_Actor_model
|
| 1 - https://doc.akka.io/docs/akka/current/typed/persistence.html
|
| 2 - https://github.com/Michael-Dratch/RAFT_Implementation
|
| 3 - https://github.com/invkrh/akka-raft
| jeremycarter wrote:
| I have built some medium sized systems using Microsoft Orleans
| (Virtual Actors). There was no transactional database involved,
| but everything was ordered and fully transactional.
|
| If you choose say Cosmos DB, MongoDB or DynamoDB as your
| persistence provider you can even query the persisted state.
|
| https://learn.microsoft.com/en-us/dotnet/orleans/grains/grai...
|
| https://learn.microsoft.com/en-us/dotnet/orleans/grains/tran...
|
| https://learn.microsoft.com/en-us/dotnet/orleans/grains/even...
| mg wrote:
| When I start a new project, the data structure usually is a "list
| of items with attributes". For example right now, I am writing a
| fitness app. The data consists of a list of exercises and each
| exercise has a title, a description, a video url and some other
| attributes.
|
| I usually start by putting those items into YAML files in a
| "data" directory. Actually a custom YAML dialect without the
| quirks of the original. Each value is a string. No magic type
| conversions. Creating a new item is just "vim crunches.yaml" and
| putting the data in. Editing, deleting etc all is just
| wonderfully easy with this data structure.
|
| Then when the project grows, I usually create a DB schema and
| move the items into MariaDB or SQLite.
|
| This time, I think I will move the items (exercises) into a JSON
| column of an SQLite DB. All attributes of an item will be stored
| in a single JSON field. And then write a little DB explorer which
| lets me edit JSON fields as YAML. So I keep the convenience of
| editing human readable data.
|
| Writing the DB explorer should be rather straightforward. A bit
| of ncurses to browse through tables, select one, browse through
| rows, insert and delete rows. And for editing a field, it will
| fire up Vim. And if the field is a JSON field, it converts it to
| YAML before it sends it to Vim and back to JSON when the user
| quits Vim.
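The "all attributes in a single JSON field" idea from the comment above can be sketched with Python's standard `sqlite3` and `json` modules. The table name, column names and helper functions are hypothetical, and the `edit` callback merely stands in for the round trip through YAML and Vim that the commenter describes (YAML conversion is left out because it would need a third-party library).

```python
import json
import sqlite3


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS exercises ("
        "id INTEGER PRIMARY KEY, data TEXT NOT NULL)"
    )
    return conn


def add_item(conn, attributes):
    # Store every attribute of the item in one JSON text column.
    with conn:
        cur = conn.execute(
            "INSERT INTO exercises (data) VALUES (?)",
            (json.dumps(attributes),),
        )
    return cur.lastrowid


def get_item(conn, item_id):
    row = conn.execute(
        "SELECT data FROM exercises WHERE id = ?", (item_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def edit_item(conn, item_id, edit):
    # `edit` takes the decoded dict and returns the edited dict; in the
    # commenter's design this would be the YAML/Vim round trip.
    updated = edit(get_item(conn, item_id))
    with conn:
        conn.execute(
            "UPDATE exercises SET data = ? WHERE id = ?",
            (json.dumps(updated), item_id),
        )
    return updated
```

For example, `edit_item(conn, item_id, lambda d: {**d, "description": "..."})` adds an attribute without any schema change, which is the convenience the comment is after.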
| aorloff wrote:
| This is like an example case of a lambda + kinesis
| nesarkvechnep wrote:
| It'll be interesting to do something like this in Elixir where
| clustering is almost a runtime primitive.
| nilirl wrote:
| I'm baffled at the arguments made in this article. This is
| supposed to be a simpler and faster way to build stateful
| applications?
|
| The premises are weak and the claims absurd. The author uses
| overstatement of the difficulties of serialization just to make
| their weak claim stronger.
| t0mas88 wrote:
| And then they implement serialization to write their
| transactions to a log and replicate them to the other nodes...
| voidfunc wrote:
| Big vibes of "We are very smart, see how smart we are?" from
| the blog post.
|
| These kind of people usually suck to work with. I'm glad
| they've found a startup to sink so I don't have to deal with
| them.
| golergka wrote:
| > Imagine all the wonderful things you could build if you never
| had to serialize data into SQL queries.
|
| No transactions, no WAL, no relational schema to keep data design
| sane, no query planner doing all kinds of optimisations and
| memory layout things I don't have to think about?
|
| You could say that transactions, for example, would be redundant
| if there is no external communication between app server and the
| database. But it is far from the only thing they're useful for.
| Transactions are a great way of fulfilling important invariants
| about the data, just like a good strict database schema. You
| rollback a transaction if an internal error throws. You make sure
| that transaction data changes get serialised to disk all at once.
| You remove a possibility that statements from two simultaneous
| transactions access the same data in a random order (at least if
| you pick a proper transaction isolation level, which you usually
| should).
|
| > You also won't need special architectures to reduce round-trips
| to your database. In particular, you won't need any of that
| Async-IO business, because your threads are no longer IO bound.
| Retrieving data is just a matter of reading RAM. Suddenly
| debugging code has become a lot easier too.
|
| Database is far from the only other server I have to communicate
| with when I'm working on user's HTTP request. As a web developer,
| I don't think I've worked on a single product in the last 4 years
| that didn't have some kind of server-server communication for
| integrations with other tools and social media sites.
|
| > You don't need crazy concurrency protocols, because most of
| your concurrency requirements can be satisfied with simple in-
| memory mutexes and condition variables.
|
| Ah, mutexes. Something that programmers never shot themselves in
| the foot with. Also, deadlocks don't exist.
|
| > Hold on, what if you've made changes since the last snapshot?
| And this is the clever bit: you ensure that every time you change
| parts of RAM, we write a transaction to disk. So if you have a
| line like foo.setBar(2), this will first write a transaction that
| says we've changed the bar field of foo to 2, and then actually
| set the field to 2. An operation like new Foo() writes a
| transaction to disk to say that a Foo object was created, and
| then returns the new object.
|
| A disk write latency is added to every RAM write. It has no
| performance cost and nobody notices this.
|
| I apologise if this comes off too snarky. Despite all of the
| above, I really like this idea -- and am already thinking of
| implementing it in a hobby project, just to see how well it
| really works. I'm still not sure if it's practical, but I love
| the creative thinking behind this, and the fact that it actually
| helped them build a business.
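The point in the comment above about rolling back a transaction when an internal error is thrown can be illustrated with Python's standard `sqlite3` module. The `accounts` table and the `transfer` function are made up for this example. The behaviour it relies on is that using a connection as a context manager commits on success and rolls back if an exception escapes the block.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    # Both updates become visible together, or not at all.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Raising here rolls back both updates above.
            raise ValueError("insufficient funds")
```

A failed `transfer` leaves both balances exactly as they were, with no hand-written undo logic. That is the invariant-keeping role of transactions the commenter is describing.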
| briHass wrote:
| I would add that the 'serialization' to an RDBMS schema that the
| article cites as a negative is actually a huge positive for most
| systems. Modeling your data relationally, often in 3NF, usually
| differs from the in-memory/code objects in all but the most simple
| ORM class=table projects. Thinking deeply about how to persist data
| in a way that makes it flexible and useful as application needs
| change (i.e. the database outlives the application(s)) has
| value in itself; it is not just a pointless cost.
|
| I like being able to draw a hard line between application data
| structures, often ephemeral and/or optimized for particular
| tasks -- and the persisted, domain data which has meaning
| beyond a specific application use case.
| antman wrote:
| As a side question is there a python library for braft or a
| production grade raft library for python?
| tdrhq wrote:
| There's a list of libraries here, which include a few Python
| libraries: https://raft.github.io/
|
| I don't know if they're production grade. I was drawn to Braft
| because of Baidu's backing.
| leokennis wrote:
| I'm not from "start up world" but in the end, few things give me
| more comfort and lack of surprises down the line than just having
| a relational database with built in redundancy/transaction
| logs/back up/recovery. Sure there might always be edge cases
| (lack of money, regulations, specialist software offering) but in
| the vast majority of cases - just get a database.
| gunapologist99 wrote:
| It's interesting you say "backup/recovery" as a strong point of
| relational databases (servers), because backup and recovery on
| hot databases have always been a challenge.
|
| With many enterprise databases these days, often "incremental"
| or other seemingly required backup modes are not included in
| the "community source" versions; perhaps because surely if you
| want your database to be backed up safely and then come back
| online safely, you certainly will fall into the "contact us for
| quote" enterprise customer demographic.
|
| At least, with SQLite, copying even a hot (in-use) db file to a
| remote server will usually "just work", with the potential loss
| of a few transactions, but with most other database/servers,
| you definitely can't just backup the data directory
| occasionally and call it a day.
| leokennis wrote:
| Like I mentioned, I don't have experience working in a start
| up. My real world experience with backup/recovery of a live
| relational DB has been with Oracle using ZDLRA - and indeed
| its license probably costs dearly.
|
| For stuff like MariaDB a quick search also finds options to
| perform snapshots, backups, restores etc.
|
| And if you need to be super highly available, set up a
| distributed DB like Cassandra - you lose the relational and
| transaction part, but at least you're running a product with
| known failure modes and known ways to prevent/circumvent
| them.
|
| I guess my bigger point is that besides "don't roll your own
| crypto", I'd also advise not to roll your own DB. There's a
| lot of known stuff in the market, all built by people who
| made and fixed the mistakes you're going to make a long time
| ago.
| lpapez wrote:
| I once saw a project in the wild where the "database" was
| implemented using filesystem directories as "tables" with JSON
| files inside as "rows".
|
| When I asked people working on it if they considered Redis or
| Mongo or Postgres with jsonb columns, they just said they
| considered all of those things but decided to roll their own
| db anyway because "they understood it better".
|
| This article gives off the same energy. I really hope it works
| out for you, but IMO spending innovation tokens to build a
| database is nuts.
| ActorNightly wrote:
| This isn't innovation though. You literally just write your
| server like you would for a single machine, then wrap it in any
| of the available Raft libraries.
|
| AWS and other cloud providers are money printers because a lot
| of engineers are insanely tied into established patterns of
| doing things and can't think through things at a fundamental
| level. I've seen company backends where their entire AWS stacks
| could be replaced by 2 EC2 instances behind a load balancer
| with a domain name, without affecting business flow.
|
| We did something similar to the work in the OP post at my work,
| we had a bunch of ECS tasks for a service, where the service
| did another call to an upstream service to fetch some
| intermediate results. We wanted to cache results for lower
| response latency. People were working to set up a Redis
| cluster. Except the TPS of the service was like 0.1.
|
| Took me one day to code a /sync api endpoint, which was just a
| replica of the main endpoint. The only difference is that the
| main endpoint would spin off a thread to call the /sync
| endpoint, whereas the /sync endpoint didn't. Both endpoints
| ended with caching the results in memory before returning. Easy
| as day, no additional infra costs necessary.
|
| But overall, personally, I don't hate the "spending innovation
| tokens to build a database is nuts" sentiment too much, because
| it keeps me employed at high salary while doing minimal work,
| where things that really should be basic CS are considered
| innovation.
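The /sync endpoint trick from the comment above can be approximated in-process as follows. This is a guess at the shape of the design, not the commenter's code: `Node`, `handle_main`, `handle_sync` and `wait_for_sync` are hypothetical names, HTTP is replaced by direct method calls, and the sketch assumes the main endpoint forwards the computed result to its peers (the comment does not say whether /sync receives the result or recomputes it).

```python
import threading


class Node:
    """Hypothetical service instance with an in-memory result cache."""

    def __init__(self, fetch_upstream):
        self.fetch_upstream = fetch_upstream  # slow upstream call
        self.peers = []
        self.cache = {}
        self.lock = threading.Lock()
        self.sync_threads = []

    def handle_main(self, key):
        # Main endpoint: serve from cache, else fetch, cache and
        # spin off threads that push the result to each peer's /sync.
        with self.lock:
            if key in self.cache:
                return self.cache[key]
        value = self.fetch_upstream(key)
        with self.lock:
            self.cache[key] = value
        for peer in self.peers:
            t = threading.Thread(target=peer.handle_sync, args=(key, value))
            t.start()
            self.sync_threads.append(t)
        return value

    def handle_sync(self, key, value):
        # /sync endpoint: caches the result but never fans out again,
        # so two nodes cannot ping-pong the same update forever.
        with self.lock:
            self.cache[key] = value
        return value

    def wait_for_sync(self):
        # Test helper: block until outstanding peer updates finish.
        for t in self.sync_threads:
            t.join()
        self.sync_threads.clear()
```

With two nodes listed as each other's peers, a request served by one node warms the other node's cache, so the upstream service is called once per key. There is no eviction or consistency guarantee here, which fits the "disposable cached values" use case rather than a database.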
| gtirloni wrote:
| Software Engineering is different than CS though.
| bastawhiz wrote:
| > then wrap it in any of the available Raft libraries.
|
| Raft does consensus. Raft does not do persistence to disk,
| WAL, crash recovery, indexing, vacuuming (you're using
| tombstones for your deletes, right?), or any of the other
| necessary pieces of a database. That's not mentioning how
| such a system has _no query engine_, so every piece of data
| you're looking up in every place you need data is traversing
| your bespoke data structures.
|
| What you described isn't a database. Keeping some disposable
| values cached isn't a database.
| jmull wrote:
| I get your point and I don't doubt the project you're talking
| about was a mess, but the file system _is_ a database, and can
| be a very good choice, depending on exactly what you're doing.
| rrrix1 wrote:
| The file system is a database _and_ an API.
|
| Magic!
| philipwhiuk wrote:
| > I once saw a project in the wild where the "database" was
| implemented using filesystem directories as "tables" with JSON
| files inside as "rows".
|
| I did this sort of thing recently. I felt bad doing it, I still
| objectively hate it, because I do know enough to know that
| basically I'm re-implementing what years of hardworking O/S
| developers have done, piecemeal. But at least I'm going in with
| my eyes open which feels better.
|
| The only real mitigating factor I have is that the application
| is largely 'never-read' and then when reading is done, it's
| sequential batches. Which is not normally something databases
| optimise for and works okay for file-storage.
|
| (If someone _does_ know a lightweight database architecture
| that performs like this, let me know).
| hankchinaski wrote:
| textbook example of overengineering for no reason
| annacappa wrote:
| It's great that people explore new ideas. However this does not
| seem like a good idea.
|
| It claims to solve a bunch of problems by ignoring them. There
| are solid reasons why people distribute their applications across
| multiple machines. After reading this article I feel like we need
| to state a bunch of them.
|
| Redundancy - what if one machine breaks, whether from a hardware
| failure, a software failure or a network failure (a network
| partition where you can't reach the machine or it can't reach the
| internet)?
|
| Scaling - what if you can't serve all of your customers from one
| machine? Perhaps you have many customers and a small app, or
| perhaps your app can use a lot of resources (maybe it loads gigs
| of data).
|
| Deployment - what happens when we want to change the code and not
| go down? If you are running multiple copies of your app you get
| this for cheap.
|
| There are tons of smaller benefits, such as right-sizing your
| architecture. What if the one machine you choose is not big
| enough and you need to move to a new machine? With multiple
| machines you just increase the number of machines. You also get to
| use a variety of machine sizes and can choose ones that fit your
| needs, so this flexibility allows you to choose cheaper machines.
|
| I feel like the authors don't know why people invented the
| standard way of doing things.
| annacappa wrote:
| The more I think about it the worse it gets.
|
| Because we don't want everything to fall over when one machine
| goes down we need at least 3 machines (for raft). So if our
| traditional db would have 500 GB of data we now need 3 machines
| with 500 GB of ram running at all times. That is an epic waste
| of money. Millions per year to run? And you could store it in
| a db for a couple of dollars.
| 1oooqooq wrote:
| their use case is mostly-never-retrieved images!
|
| they store the index of files only in memory. and have the
| entire build time to fetch build-1 images to get ready for
| the diff.
|
| it's much easier than most use cases
| annacappa wrote:
| So all of this ram is being used and is only accessed
| sporadically if at all. This is not good. Sounds like you
| could implement the entire thing on a micro db instance
| (redis or a regular db) with no raft or any other custom
| implementation or messing.
| LAC-Tech wrote:
| Really got a kick out of this article. RAM is big, and cheap. And
| as we all know the database is the log, and everything else is
| just the cache. A few questions, comments!
|
| 1. I take it you've seen the LMAX talk [0], and were similarly
| inspired? :)
|
| 2. Are you familiar with the event sourcing approach? It's
| basically what you describe, except you don't flush to disk after
| editing every field, you batch your updates into a single
| "event". (you've come at it from the exact opposite end, but it
| looks like roughly the same thing).
|
| [0] https://www.infoq.com/presentations/LMAX/
| qprofyeh wrote:
| We used Redis with persistence to build our first prototype. It
| performed amazingly and development speed was awesome. We were a
| full year beyond break-even before adding MySQL to the stack for
| the few times we missed the ability to run SQL queries, for
| finance.
| _the_inflator wrote:
| This reminds me of the heated discussions around jQuery by some
| so-called performance driven devs, which culminated in this
| website:
|
| https://youmightnotneedjquery.com/
|
| The overwhelming majority underestimates the beauty and effort as
| well as experience that goes into abstractions. There are some
| true geniuses at times doing fantastic work, to deliver
| syntactical sugar while the critics mock the maybe somewhat
| larger bundle size for "a couple of lines frequently used."
| That's why.
|
| In the end, a good framework is more than just an abstraction. It
| guarantees consistency and accessibility.
|
| Try to understand the source code if possible before reinventing
| the wheel is my advice.
|
| What maybe starts out to be fun quickly becomes a burden. If
| there weren't any edge cases or different conditions, you
| wouldn't need an abstraction. Been there, done that.
| k__ wrote:
| Well, at least they gave an example of what not to do.
| gchamonlive wrote:
| The problem is, you only know what you know.
|
| Sure you reduce deployment complexity, but what about maintaining
| your algorithm that implements data persistence and replication?
|
| To assume that will never spectacularly bite you is naive. Tests
| also only go so far as you know what you are testing for, and
| while you don't know if your product will ever be used, you also
| don't know if it will explode in success and you will be hostage
| of your own decisions and technical debt.
|
| These are HARD decisions. Hard decisions require solid solutions.
| You can surely try that with toy projects, but if I was in a
| position to build a software architecture for something that had
| a remote possibility of being used in production, I would oppose
| such designs adamantly.
| paxys wrote:
| There is so much wrong with this I don't know where to even
| start. You want to "keep things simple" and not stand up a
| separate instance of MySQL/Postgres/Redis/MongoDB/whatever else.
| So, you:
|
| 1. Create your own in-memory database.
|
| 2. Make sure every transaction in this DB can be serialized and
| is simultaneously written to disk.
|
| 3. Use some orchestration platform to make all web servers aware
| of each other.
|
| 4. Synchronize transaction logs between your web servers (by
| implementing the Raft protocol) and update the in-memory DB.
|
| 5. Write some kind of conflict resolution algorithm, because
| there's no way to implement locking or enforce
| consistency/isolation in your DB.
|
| 6. Shard your web servers by tenant and write another load
| balancing layer to make sure that requests are getting to the
| server their data is on.
|
| Simple indeed.
| ozim wrote:
| I don't want to go ad personam on the blog author - but
| checking his socials he is not really an experienced person.
|
| I don't think we have anything to discuss here. He seems to
| just want to do cool stuff, and he seems to have dropped databases
| because he doesn't know a lot of the stuff there is to know.
|
| I applaud the attempt, and it might be that his needs will be
| covered by what he is doing.
|
| But for everyone else: yes, pick boring technology if you want
| to do a startup, because technology shouldn't be hard or something
| you worry about if you are making web applications.
| tdrhq wrote:
| > but checking his socials he is not really an experienced
| person.
|
| I'm not sure what qualifies as experience if Meta/Google
| doesn't. ;)
| ozim wrote:
| Well he is not Kent Beck or Jon Skeet, Martin Fowler - that
| is the kind of experience I need to take a blog post seriously.
|
| Just working at Meta/Google doesn't impress me much, as
| Shania Twain would sing.
| rrrix1 wrote:
| > he is not Kent Beck or Jon Skeet, Martin Fowler
|
| Just FYI, you are (perhaps unintentionally) showing your
| lack of experience.
|
| There are many thousands of brilliant engineers for every
| brilliant engineer who is also an
| author/speaker/publisher. These are very different
| skills.
|
| Also, perhaps the author _is_ the next Martin Fowler? You
| never know...
| mtlynch wrote:
| > _I don't want to go ad personam on the blog author - but
| checking his socials he is not really an experienced person._
|
| According to LinkedIn:
|
| - Masters in CS from UPenn
|
| - 1 year as SWE at Google
|
| - 6 years as SWE at FB/Meta
|
| - 6 years running his own company
|
| When I hear "not really experienced," I think recent college
| grad, not someone with a Master's and 15 years of industry
| experience.
| williamdclt wrote:
| Well that's only 7y of working with people to learn from,
| it's not nothing but it's not enough credentials to make me
| go from "it's a horrible idea" to "I must be missing
| something"
| karmakaze wrote:
| I played with making an in-memory database too, but I wouldn't
| recommend anyone use one in production unless they have strict
| latency requirements.
|
| Simple is what people are already using. And beware 'good for
| startups' tech. If you're successful you'll have legacy 'bad
| for scale' tech.
| SoftTalker wrote:
| Yeah and good luck when the CEO starts asking for reports and
| metrics (or anything else that databases have been optimized
| over the last 50 years to do very well).
|
| Surely this is a parody article of some sort?
| 0x74696d wrote:
| This architecture is roughly how HashiCorp's Nomad, Consul, and
| Vault are built (I'm one of the maintainers of Nomad). While it's
| definitely a "weird" architecture, the developer experience is
| really nice once you get the hang of it.
|
| The in-memory state can be whatever you want, which means you can
| build up your own application-specific indexing and querying
| functions. You _could_ just use sqlite with :memory: for the Raft
| FSM, but if you can build /find an in-memory transaction store
| (we use our own go-memdb), then reading from the state is just
| function calls. Protecting yourself from stale reads or write
| skew is trivial; every object you write has a Raft index so you
| can write APIs like "query a follower for object foo and wait
| till it's at least at index 123". It sweeps away a lot of "magic"
| that normally you'd shove into a RDBMS or other external store.
|
| That being said, I'd be hesitant to pick this kind of
| architecture for a new startup outside of the "infrastructure"
| space... you are effectively building your own database here
| though. You need to pick (or write) good primitives for things
| like your inter-node RPC, on-disk persistence, in-memory
| transactional state store, etc. Upgrades are especially
| challenging, because the new code can try to write entities to
| the Raft log that nodes still on the previous version don't
| understand (or worse, misunderstand because the way they're
| handled has changed!). There's no free lunch.
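The "query a follower for object foo and wait till it's at least at index 123" API mentioned in the comment above could be sketched like this. It is not Nomad's or go-memdb's code: `FollowerState` and its methods are hypothetical, and the Raft machinery that would call `apply` in log order is assumed rather than shown.

```python
import threading


class FollowerState:
    """In-memory state on a follower, tagged with the last applied index."""

    def __init__(self):
        self._cond = threading.Condition()
        self._applied_index = 0
        self._objects = {}  # key -> (value, index that last wrote it)

    def apply(self, index, key, value):
        # Assumed to be called by the Raft layer for each committed
        # log entry, in increasing index order.
        with self._cond:
            self._objects[key] = (value, index)
            self._applied_index = index
            self._cond.notify_all()

    def get(self, key, min_index=0, timeout=None):
        # Block until this follower has applied at least `min_index`,
        # so the caller never observes a state older than that.
        with self._cond:
            caught_up = self._cond.wait_for(
                lambda: self._applied_index >= min_index, timeout
            )
            if not caught_up:
                raise TimeoutError(
                    f"follower still behind index {min_index}"
                )
            return self._objects.get(key)
```

A client that wrote an object and was told the write landed at index 123 can pass `min_index=123` to any follower and be sure to read its own write, which is the stale-read protection the commenter describes.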
| jstrong wrote:
| like you I'm more open to the idea of keeping data in memory
| than most of the responders here. when I got to the part of the
| article about how they are using common lisp with hot
| reloading, I was thinking, well you guys can do whatever you
| want, but not everybody is working on that team, ha.
| donatj wrote:
| I've got a handful of small Go applications where I just have a
| "go generate" command that generates the entire dataset as Go, so
| the data set ends up compiled into the binary. Works great.
|
| https://emoji.boats/ is the most public facing of these.
|
| I also have built a whole class of micro-services that pull their
| entire dataset from an API on start up, hold it resident and
| update on occasion. These have been amazing for speeding up
| certain classes of lookup for us where we don't always need
| entirely up to date data.
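The second pattern in the comment above (pull the whole dataset on start-up, keep it resident, refresh occasionally) might be sketched as below. `ResidentDataset` is a hypothetical name, `load` stands in for the API call, and refreshing lazily on lookup is an assumption; the commenter does not say how their services schedule updates.

```python
import threading
import time


class ResidentDataset:
    """Holds a full dataset in memory and reloads it when it gets old."""

    def __init__(self, load, max_age_seconds, clock=time.monotonic):
        self._load = load              # callable returning a dict
        self._max_age = max_age_seconds
        self._clock = clock            # injectable for testing
        self._lock = threading.Lock()
        self._data = load()            # pull everything on start-up
        self._loaded_at = clock()

    def lookup(self, key):
        with self._lock:
            # Refresh only when the resident copy is older than allowed;
            # otherwise a lookup is a plain dictionary read.
            if self._clock() - self._loaded_at >= self._max_age:
                self._data = self._load()
                self._loaded_at = self._clock()
            return self._data.get(key)
```

Readers may see data up to `max_age_seconds` old, which matches the comment's caveat that this suits lookups that do not need entirely up-to-date data.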
| tofflos wrote:
| Check out https://eclipsestore.io (previously named Microstream)
| if you're into Java and interested in some of the ideas presented
| in this article. You use regular objects, such as Records, and
| regular code, such as java.util.stream, for processing, and the
| library does snapshotting to disk.
|
| I haven't tried it out but just thinking of how many fewer
| organizational hoops I would have to jump through makes me want
| to try it out:
|
| - No ordering a database from database operations.
|
| - No ordering a port opening from network operations.
|
| - No ordering of certificates.
|
| - The above times 3 for development, test and production.
|
| - Not having to run database containers during development.
|
| I think the sweet spot for me would be in services that I don't
| expect to grow beyond a single node and there is an acceptance
| for a small amount of downtime during service windows.
| ksec wrote:
| >RAM is super cheap
|
| I think this has to be the number one misunderstanding for
| developers.
|
| Yes, SSD in terms of throughput or IOPS has gone up by 100 to
| 10000x. vCPU performance per dollar has gone up by 20 - 50x. We
| went from 45/32nm to now 5nm/3nm, and much higher IPC.
|
| But RAM price hasn't gotten anywhere near the same fall as CPU or
| SSD. It may have gotten a lot faster, and you may even be able to
| fit a lot more memory thanks to higher density chips and channels
| going from dual to 8 or 12. But if you look at the DRAM spot price
| from 2008 to 2022, you will see the lowest DRAM price has been
| the same, at around $2.8/GB, three times, with the price cycling
| up to $8 / $6 per GB in between. I.e. had you bought DRAM at its
| lowest point (or at its highest point) in any of these cycles
| during the past ~15 years, it would have cost roughly the same as
| in the other cycles, plus or minus 10-20%, ignoring inflation.
|
| It was not until mid-2022 that it finally broke through the
| $2.8/GB barrier and collapsed to close to $1/GB before settling at
| ~$2/GB for DDR5.
|
| Yes you can now get 4TB RAM on a server. But it doesn't mean DRAM
| is super cheap. Developers on average, or at least those in big
| Tech, are now earning way more than they were in 2010, which makes
| them think RAM has gotten a lot more affordable. In reality, even
| at the lowest point over the past 15 years you only get at best a
| slightly more than 2x reduction in DRAM price. And we will likely
| see DRAM prices shoot up again in a year or two.
| klysm wrote:
| Simultaneously, many developers reach for distributed systems
| too quickly when they could just buy more ram. Perhaps that's
| what the writer means
| rrrix1 wrote:
| An alternative interpretation is that the maximum RAM capacity
| for an individual node has drastically increased over the last
| couple of decades.
|
| A simplistic example, if a given node was limited to 16GB of
| RAM 20 years ago, I would need 256 nodes to have 4TB of RAM for
| my system (not including overhead for each OS).
|
| Compared to today, where a single node can have that entire 4TB
| all in one chassis.
|
| The total cost of RAM chips themselves may not have changed,
| but the actual cost of using that RAM in a physical system has
| dropped dramatically.
| jb3689 wrote:
| We wanted to simplify our architecture and not use a database, so
| instead we created our own version of everything databases
| already do for us. Super risky for a company. Hopefully you don't
| spend all of your time maintaining, optimizing, and scaling this
| custom architecture.
| bastawhiz wrote:
| Please, someone explain how building your own in-memory database
| and snapshotting on top of Raft is simpler than just installing
| Postgres or SQLite with one of the modern durability tools.
| Seriously, if you genuinely believe writing concurrency code with
| mutexes and other primitives and hoping that's all correct is
| _easier_ than just writing a little SQL, you've tragically lost
| your way.
| samarabbas wrote:
| Notice how the complexity of this grows suddenly when you start
| thinking about infrastructure failure and restarts due to
| deployments. I have seen this play out dozens of times in my
| professional career, where these systems start out very simple
| but eventually become a huge maintenance burden over time. This
| is where high level abstractions like Durable Execution are much
| more powerful for developers, as they have the potential to
| abstract away this level of detail. Basically, code up your
| application as if infrastructure failures do not exist and let an
| underlying Durable Execution platform like Temporal or something
| similar handle resiliency for you.
___________________________________________________________________
(page generated 2024-08-10 23:01 UTC)