[HN Gopher] Saving cloud costs by writing our own database
___________________________________________________________________
Saving cloud costs by writing our own database
Author : wolframhempel
Score : 146 points
Date : 2024-04-04 11:54 UTC (2 days ago)
(HTM) web link (hivekit.io)
(TXT) w3m dump (hivekit.io)
| icsa wrote:
| How is it possible to save more than 100%?
| wolframhempel wrote:
| Fair, should be 98%. Can't change the title anymore though
| jbverschoor wrote:
| Aws credits
| jayd16 wrote:
| Move from the cloud to on-prem and then sell extra
| availability.
| olddustytrail wrote:
| Isn't it obvious?!
|
| 1. Write your own database
|
| 2. ???
|
| 3. Profit!
| aclatuts wrote:
| Receive a license fee from someone else for using your
| software!
| mdaniel wrote:
| Anytime I hear "we need to blast in per-second measurements of
| ..." my mind jumps to "well, have you looked at the bazillions of
| timeseries databases out there?" Because the fact those payloads
| happen to be (time, lat, long, device_id) tuples seems immaterial
| to the timeseries database and can then be rolled up into
| whatever level of aggregation one wishes for long-term storage
|
| It also seems that just about every open source "datadog / new
| relic replacement" is built on top of ClickHouse, and even they
| themselves allege multi-petabyte capabilities
| <https://news.ycombinator.com/item?id=39905443>
|
| OT1H, I saw the "we did research" part of the post, and I for
| sure have no horse in your race of NIH, but "we write to EBS,
| what's the worst that can happen" strikes me as ... be sure
| you're comfortable with the tradeoffs you've made in order to get
| a catchy blog post title
| robertlagrant wrote:
| > but "we write to EBS, what's the worst that can happen"
| strikes me as ... be sure you're comfortable with the tradeoffs
| you've made in order to get a catchy blog post title
|
| In what way?
| freeone3000 wrote:
| EBS latency is all over the place. The jitter is up to the
| 100ms scale, even on subsequent IOPS. We've also had
| intermittent failures for fsync(), which is a case that
| should be handled but is exceptionally rare for
| traditionally-attached drives.
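|
| Handling it means treating a failed fsync() as a failed write
| rather than blindly retrying on the same file descriptor
| (after a failed fsync the page cache state is unreliable). A
| minimal sketch in Python:
|
|     import os
|
|     def append_durably(path: str, payload: bytes) -> bool:
|         fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT)
|         try:
|             os.write(fd, payload)
|             os.fsync(fd)  # may raise OSError on an EBS hiccup
|             return True
|         except OSError:
|             # report failure; the caller should re-queue the
|             # record on a fresh fd, not fsync the same one again
|             return False
|         finally:
|             os.close(fd)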
| RHSeeger wrote:
| The author does note in the writeup that they are
| comfortable with some (relatively rare) data loss, e.g. from
| server failure. Given their use cases, it seems like the
| jitter/loss of EBS wouldn't be too impactful for them.
| solatic wrote:
| There are different kinds of data loss. There's data loss
| because you lose the whole drive; because you lost a whole
| write; because a write was only partially written. Half the
| problem with NIH solutions is: what happens when you try to
| read from your bespoke binary format and the result is
| corrupted in some way? So much of the value of battle-
| tested, multi-decade-old databases is that those are
| _solved problems_ that you, the engineer building on
| top of the database, do not need to worry about.
|
| Of course data loss is alright when you're talking about
| a few records within a billion. It is categorically
| _unacceptable_ when AWS loses your drive, you try to
| restore from backup, the application crashes when trying
| to use the restored backup because of "corruption", the
| executives are pissed because downtime is reaching into
| the hours/days while you frantically try to FedEx a
| laptop to the one engineer who knows your bespoke binary
| format and can maybe heal the backup by hand except he's
| on vacation and didn't bring his laptop with him.
| Spivak wrote:
| I mean, if you spun up Postgres on EC2 you would be directly
| writing to EBS, so that's not really the part I'm worried
| about. I'm more worried about the lack of replication,
| seemingly no way to scale reads or writes beyond a single
| server, and no way to fail over uninterrupted.
|
| I'm guessing it doesn't matter for their use-case, which is a
| good thing. When you realize you only need like this teeny
| subset of db features and none of the hard parts, writing your
| own starts to get feasible.
| VirusNewbie wrote:
| Right, the Cassandra/Scylla model is _really_ good for time
| series use cases; I've yet to see good arguments against
| them.
| speedgoose wrote:
| ClickHouse is one of the few databases that can handle most of
| the time-series use cases.
|
| InfluxDB, the most popular time-series database, is optimised
| for a very specific kind of workload: many sensors publishing
| frequently to a single node, and frequent queries that don't
| reach far back in time. It's great for that. But it doesn't
| support slightly more advanced queries, such as an average over
| two sensors. It also doesn't scale, and is pretty slow to query
| far back in time due to its architecture.
|
| TimescaleDB is a bit more advanced, because it's built on top
| of PostgreSQL, but it's not very fast. It's better than vanilla
| PostgreSQL for time-series.
|
| The TSM Bench paper has interesting figures, but in short
| ClickHouse wins and manages well in almost all benchmarks.
|
| https://dl.acm.org/doi/abs/10.14778/3611479.3611532
|
| https://imgur.com/a/QmWlxz9
|
| Unfortunately, the paper didn't benchmark DuckDB, Apache IoTDB,
| and VictoriaMetrics. They also didn't benchmark proprietary
| databases such as Vertica or BigQuery.
|
| If you deal with time-series data, ClickHouse is likely going
| to perform very well.
| lispisok wrote:
| I work on a project that ingests sensor measurements from the
| field, and in our testing found TimescaleDB was by far the
| best choice. The performance x all their timeseries-specific
| features like continuous aggregates and `time_bucket`, plus
| access to the postgres ecosystem, was killer for us. We get
| about a 90% reduction in storage with compression without much
| of a performance hit too.
| omeze wrote:
| Did you try clickhouse? What were its weak points?
| Too wrote:
| Apache Parquet as data format on disk seems to be popular these
| days for similar DIY log/time series applications. It can be
| appended locally and flushed to S3 for persistence.
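|
| e.g., with pyarrow and boto3 (bucket and key names made up):
|
|     import boto3
|     import pyarrow as pa, pyarrow.parquet as pq
|
|     table = pa.table({"ts":  [1712200000, 1712200001],
|                       "lat": [52.52, 52.53],
|                       "lon": [13.40, 13.41],
|                       "id":  ["a", "b"]})
|     pq.write_table(table, "/tmp/chunk-000.parquet")
|     boto3.client("s3").upload_file(
|         "/tmp/chunk-000.parquet", "my-bucket",
|         "locations/2024-04-04/chunk-000.parquet")
|
| (Parquet files aren't appendable in place, so "appended
| locally" in practice means buffering and writing a new row
| group or file per batch.)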
| nikonyrh wrote:
| Very interesting. It must feel great to get to apply CS knowledge
| at work, rather than writing basic CRUD APIs / websites.
| hasmanean wrote:
| Stick the GPS data in a binary file. Store the filename in the
| database record.
| MuffinFlavored wrote:
| > We want to be able to handle up to 30k location updates per
| second per node. They can be buffered before writing, leading to
| a much lower number of IOPS.
|
| > This storage engine is part of our server binary, so the cost
| for running it hasn't changed. What has changed though, is that
| we've replaced our $10k/month Aurora instances with a $200/month
| Elastic Block Storage (EBS) volume. We are using Provisioned IOPS
| SSD (io2) with 3000 IOPS and are batching updates to one write
| per second per node and realm.
|
| I would be curious to hear what that "1 write per second" looks
| like in terms of throughput/size?
| zaroth wrote:
| Well they said ~40 bytes per update, so 30k * 40 = 1.2MB/sec...
| so quite trivial.
|
| They also said 30GB per month, which works out to about
| 0.7MB/min (~12KB/sec) if load is perfectly constant.
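|
| Checking the arithmetic (using the figures quoted in this
| thread):
|
|     print(30_000 * 40 / 1e6)  # ingest: 1.2 MB/s
|     month = 30 * 24 * 3600    # ~2.6M seconds
|     print(30e9 / month)       # 30GB/mo ~= 11,574 B/s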
| MuffinFlavored wrote:
| > we've replaced our $10k/month Aurora
|
| How does ~1MB/sec end up costing $10k/mo in a hosted
| database?
|
| Can you not achieve 1MB/sec of "queued writes" or something
| clever against SQLite?
| speedgoose wrote:
| SQLite in WAL mode would manage for sure.
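|
| A minimal sketch (schema made up; WAL plus batching writes
| into one transaction per second is the whole trick):
|
|     import sqlite3
|
|     db = sqlite3.connect("locations.db")
|     db.execute("PRAGMA journal_mode=WAL")
|     db.execute("CREATE TABLE IF NOT EXISTS loc "
|                "(ts INT, id TEXT, lat REAL, lon REAL)")
|
|     batch = [(1712200000, "v1", 52.5, 13.4)] * 30_000
|     with db:  # one transaction ~= one second of updates
|         db.executemany("INSERT INTO loc VALUES (?,?,?,?)", batch)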
| awinter-py wrote:
| we have invented write concern = 0
| RHSeeger wrote:
| > we've replaced our $10k/month Aurora instances with a
| $200/month Elastic Block Storage (EBS) volume.
|
| Without any intent to insult what you've done (because the
| information is interesting and the writeup is well done)... how
| do the numbers work out when you account for actually
| implementing and maintaining the database?
|
| - Developer(s) time to initially implement it
|
| - PjM/PM time to organize initial build
|
| - Developer(s) time for maintenance (fix bugs and enhancement
| requirements)
|
| - PjM/PM time to organize maintenance
|
| The cost of someone to maintain the actual "service"
| (independent of the development of it) is, I assume, either
| similar or lower, so there's probably a win there. I'm assuming
| you have someone on board who was in charge of making sure
| Aurora was configured / being used correctly, and it would be
| just as easy if not easier to do the same for your custom
| database.
|
| The cost of $120,000/year for Aurora seems like it would be
| less than the cost of development/organization time for the
| custom database.
|
| Note: It's clear you have other reasons for needing your custom
| database. I get that. I was just curious about the costs.
| donohoe wrote:
| I came here to ask the same question.
|
| If this db requires one full-time developer, then the cost
| would immediately not be worth it (assuming salary + benefits >
| $120k/yr).
|
| As you say, without details it's hard to know if this was a
| good idea.
| bilsbie wrote:
| Shouldn't we up our standard developer cost for inflation?
|
| That barely qualifies you for the median mortgage in the US.
| twbarber wrote:
| I believe the 120k number was in reference to the OP's
| Aurora spend.
| Filligree wrote:
| What makes you think a standard developer can afford a
| mortgage?
| nightski wrote:
| I find that highly unlikely. Maybe in specific markets but
| not US wide.
| hibikir wrote:
| The median home price is under 400K, so a 120k salary is
| not really stretched.
|
| Now, median in the Seattle metro, or in San Francisco,
| sure. But 120k in, say, St Louis is still going to get you
| an intermediate dev, no problem, and they can afford a
| house by themselves too. There are 4 bedroom houses in my
| neighborhood for 300K.
| solatic wrote:
| I actually disagree with you here. There are costs above and
| beyond the engineer's effect on the balance sheet. There's
| the partial salary of management to manage them, plus asking
| them to document their work and train others so that the
| database won't have a bus factor of 1. So in well-run
| engineering departments, there's no such thing as paying for
| a "single" engineering salary. You have teams; a team
| maintains the system and it has a pre-existing workload.
|
| A large part of the value of popular platforms is precisely
| that they are _not_ bespoke. You can hire engineers with
| MySQL/Postgres experience. You cannot hire engineers who
| already have experience with your bespoke systems.
| Spivak wrote:
| I think for this kind of thing their needs are so simple and
| well-suited to a bespoke implementation that it probably paid
| for itself in less than 4 months. This doesn't seem like a db
| implementation that's going to need dedicated maintenance.
|
| They're operationally using a funny spelling of SQLite and I
| don't imagine anyone arguing that such a thing needs constant
| attention.
| ReflectedImage wrote:
| Well presumably they need only 1/3 of a developer to do this
| and they intend to scale up 10x in the next 5 years.
|
| $60,000 per year in-house vs $1,200,000 per year aurora. No
| brainer really.
| andai wrote:
| Also worth mentioning that it's 150x faster.
| cortesoft wrote:
| Its $120,000 a year for aurora, not $1,200,000.
| bbarnett wrote:
| _What has changed though, is that we've replaced our $10k
| /month Aurora instances with a $200/month Elastic Block
| Storage (EBS) volume._
|
| Note 'instances', i.e. plural, versus a singular EBS volume.
| There is some ambiguity here; I'm not sure where the 10x came
| from, but it seems plausible.
| swasheck wrote:
| if they were able to replace aurora instances with a
| glorified kvp store, then they bought the wrong tool in
| the first place.
|
| i saved hundreds of dollars per month by switching from
| an audi a4 to riding a home-built bike for my 1.5 mile
| commute to work
| bbarnett wrote:
| To be fair, one hardware MySQL server for 5k outperforms
| a dozen Aurora instances.
|
| It really bugs me how everyone has drunk the Kool-Aid.
| Cloud is stupidly expensive, but of course this is cloud vs
| cloud.
| rdtsc wrote:
| > The cost of 120,000/year for Aurora seems like it would be
| less than the cost of development/organization time for the
| custom database
|
| Only if they planned on hiring someone just to develop this new
| database, and if they switched to Aurora they'd let them go
| immediately. If the said developer was already costing them
| $250k to maintain and develop the application on top of Aurora
| anyway, this seems like a good way to save $100k/year.
| organsnyder wrote:
| There's the opportunity cost of whatever else they could have
| been paying that developer to work on.
| grogenaut wrote:
| Agreed. A developer who can pull this off is pretty good,
| if maybe distracted by shiny objects. What could they do
| working on the actual product instead of this technological
| terror?
| rdtsc wrote:
| True. Also, to your point, one could argue that if that
| developer leaves, they'd have an easier time hiring anyone
| with Aurora experience as opposed to someone to learn and
| maintain the custom database.
|
| But at the same time, Aurora costs could also scale with
| usage. It may cost $120k one year, $180k next year, $500k
| the year after. If the database they have now is well
| designed, then once it's built it may not need active
| development every year, just a feature added here and there.
| Also, switching back to Aurora could carry its own opportunity
| cost: "we should have written our own thing and could have
| saved millions ..."
| addicted wrote:
| Well, considering the cost is lower than Aurora, isn't the
| opportunity cost in favor of the home-built solution?
| organsnyder wrote:
| That's only true if Aurora is the most valuable thing
| that developer could be working on.
| preommr wrote:
| I feel like the word "database" is throwing people off because
| they're comparing it with something like MySQL/Postgres, when
| this seems only slightly more complex than a k/v store written
| to a file, with some indexing on top, where data integrity is a
| low priority. That shouldn't take too much time, and should be
| fairly isolated on the tech side, so there'd be little
| involvement from product/project managers.
| hmottestad wrote:
| A k/v store typically is really fast at looking up the value
| based on the key. So there are usually some pretty advanced
| indexes involved.
| arandomusername wrote:
| or a simple b-tree...
| eatonphil wrote:
| My simple btrees have had bugs in them. :)
|
| (Though to be fair, if I actually wanted to put this in
| prod it probably wouldn't take too long to fuzz it and
| fix the kinks.)
|
| https://github.com/eatonphil/btree-rs
| paulddraper wrote:
| 80/20
| samatman wrote:
| I would imagine, as someone with no special insight into
| goings-on at Hivekit, that the answer is intended scale.
|
| They mention 13.5k simultaneous connections. The US has 4.2
| million tractors alone, just the US, just tractors. If they get
| 10% of those tractors on the network that's a 30x to their data
| storage needs. So multiply that across the entire planet, and
| all the use cases they hope to serve.
|
| Investing time early on so that they can store 50x data-per-
| dollar is almost certainly time well spent.
| kdazzle wrote:
| Presumably those tractors wouldn't be connecting directly to
| the db though. Not sure why they don't just go the standard IoT
| events route and store data in a data lake and propagate it
| into an analytics db/warehouse from there. Add a layer to
| make recent events available immediately.
|
| S3 is relatively cheap.
| exe34 wrote:
| > PjM/PM
|
| What do you need them for?
| g9yuayon wrote:
| > PjM/PM time to organize initial build
|
| This sounds like what big companies or a disorganized company
| would need. For an efficient enough company, a project like
| this needs just one or two dedicated engineers.
|
| In fact, I can't imagine why this project needs a PM at all.
| The database is used by engineers and is built by engineers.
| Engineers should be their own PMs. It's like saying we need a
| PM for a programming language, but no, the compiler writer must
| be the language designer and must use the language. Those who
| do not use a product or do not have in-depth knowledge of the
| domain should not be the PM of the product.
| vannevar wrote:
| >For an efficient enough company, a project like this needs
| just one or two dedicated engineers.
|
| Maybe for a research project or a hobby project, but not for
| a real, high performance database to be used in a business-
| critical application.
|
| FTA:
|
| "Databases are a nightmare to write, from Atomicity,
| Consistency, Isolation, and Durability (ACID) requirements to
| sharding to fault recovery to administration - everything is
| hard beyond belief."
|
| >Engineers should be their own PMs.
|
| For small projects, sure (your "one or two dedicated
| engineers"). But once you start tackling projects that
| require larger teams, or even teams of teams, you need
| someone to track and prioritize the work remaining and the
| work in progress (as well as the corresponding budgets for
| personnel, services, and other resources). Similar to the way
| a sole proprietor can do their own accounting, but a multi-
| million dollar business probably should have an accountant.
|
| As an aside, I wonder if this might be a use case for a
| bitmap db engine like Featurebase
| (https://www.featurebase.com/).
| delusional wrote:
| > what we've built is just a cursor streaming a binary file
| feed with a very limited set of functionality - but then
| again, it's the exact functionality we need and we didn't
| lose any features.
|
| The trick is that they didn't need a database that provides
| "Atomicity, Consistency, Isolation, and Durability (ACID)".
| By only implementing what they need they were able to keep
| the project small.
|
| It's like people are scared of doing anything without
| making it into some huge multi-hundred-developer effort.
| They've written a super simple append-only document store.
| It's not rocket science. It's not a general purpose
| arbitrary SQL database.
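|
| For concreteness, a toy version of such a store (length-
| prefixed records, no index; certainly not their actual
| format):
|
|     import struct
|
|     def append(f, doc: bytes):
|         f.write(struct.pack("<I", len(doc)) + doc)
|
|     def scan(f):
|         f.seek(0)
|         while head := f.read(4):
|             (size,) = struct.unpack("<I", head)
|             yield f.read(size)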
| cortesoft wrote:
| > a project like this needs just one or two dedicated
| engineers.
|
| So that is at least 20k a month, for fairly cheap engineers.
| vineyardmike wrote:
| > In fact, I can't imagine why this project needs a PM at
| all. The database is used by engineers and is built by
| engineers. Engineers should be their own PMs.
|
| What about when two different projects have two different
| requirements they need supported by the database? Which one
| is implemented first? What about if there is only engineering
| capacity to implement one?
|
| I don't think a database is the place for "just send a PR for
| adding your required feature and ping the team that owns it"
| kind of development. It requires research, planning,
| architecture review, testing, etc. It's not a hobby project,
| it's a critical tool for the business.
| delusional wrote:
| > Which one is implemented first?
|
| One of them. This is true whether you have a person named
| "PM" or not. It's just a matter of who picks.
|
| > What about if there is only engineering capacity to
| implement one?
|
| How does naming some guy "PM" solve the issue? The team
| just picks one of the features.
| deedasmi wrote:
| Don't forget this is largely a one-time cost, vs Aurora, which
| scales cost with usage.
|
| Also they said their current volume is around 13k/second.
| They've built the new platform for 30k/sec per node. This
| should last them a long time with minimal maintenance.
| loftsy wrote:
| Apache Cassandra could be a good fit here. Highly parallel
| frequent writes with some consistency loss allowed.
| bawolff wrote:
| Kind of misleading to not include the cost of developing it
| yourself.
|
| I think everything is cheaper than cloud if you do it yourself
| when you don't count staffing cost.
| benrutter wrote:
| Yeah, and for most companies without a huge supply of
| developers, there's the financial risk of having all your stuff
| blitzed when your home-spun solution fails.
| kaladin_1 wrote:
| I love the attitude: we didn't see a good fit, so we rolled our
| own.
|
| Sure, it won't cover the bazillion cases the DBs out there do,
| but that's not what you need. The source code is small enough
| for any team member to jump in and debug while pushing
| performance in any direction you want.
|
| Kudos!
| INTPenis wrote:
| That is such an insane headline.
|
| You might as well say "we saved 100% of cloud costs by writing
| our own cloud".
| yunohn wrote:
| This is more a bespoke file format than a full blown database.
| It's optimized for one table schema and a few specific queries.
|
| Not a negative though, not everything needs a general purpose
| database. Clearly this satisfies their requirements, which is the
| most important thing.
| Kalanos wrote:
| Exactly. There are a hundred questions that come to mind, like
| how it handles concurrent writes, sharding, views.
|
| https://en.wikipedia.org/wiki/Database#Database_management_s...
|
| I'm sure they learned a lot, but probably a waste in the long
| run
| yau8edq12i wrote:
| Wasn't this already discussed here yesterday? The main criticism
| of the article is that they didn't write a database, they wrote
| an append-only log system with limited query capabilities. Which
| is fine. But it's not a "database" in the sense that someone
| would understand when reading the title.
| throwaway63467 wrote:
| Why isn't that a database? In my understanding a DB needs to be
| able to store structured data and retrieve it, so not sure
| what's missing here? Many modern DBs are effectively append
| only logs with compaction and some indexing on top as well as a
| query engine, so personally I don't think it's weird to call
| this a DB.
| hmottestad wrote:
| I don't know what point you are really trying to make. At uni
| the DBMS that everyone learns in their database course is an
| SQL database. The database part is technically just a binary
| file, but it's not what people usually mean when they say
| they need a database for their project. Just like a search
| engine doesn't have to be anything more than indexOf and a
| big text file. It's just not very useful to think of it like
| that.
| Symbiote wrote:
| You're describing a relational database management system,
| which is a specific type of software implementing a
| specific type of database.
| didgetmaster wrote:
| I agree. Is there an industry accepted definition of what a
| system must do before it can be called a database?
|
| I also wrote a KV system to keep track of metadata (tags) for
| an object store I invented. I discovered that it could also
| be used to create relational tables and perform fast queries
| against them without needing separate indexes.
|
| I started calling it a database and many people complained
| that I was misusing the term because it can't yet do
| everything that Postgres, MySQL, or SQLite can do.
| josephg wrote:
| Sounds like a database to me.
|
| Databases have a long history that reaches back much
| further than the modern, full featured SQL databases we
| have today. What you built sounds like it would fit in well
| amongst the non-SQL databases of the world, like
| BerkeleyDB, IndexedDB, Mongo, Redis, and so on.
| yau8edq12i wrote:
| Don't be absurd. By your standard, cat, grep and a file form
| a database. Sure, if you interpret literally what a database
| is, that fits. But once again, it's not what people have in
| mind when they read "we cut cloud costs by writing our own
| database".
| swiftcoder wrote:
| cat + grep absolutely constitute a database (and it's
| probably in use in production _somewhere_). No need to
| gatekeep the concept of a database.
| com2kid wrote:
| File systems are databases. Different file systems choose
| different trade offs, different indexing strategies, etc.
|
| Git is also a database. I got into this argument with
| someone when I proposed using GitHub as a database to store
| configuration entries. Our requirements included needing
| the ability to review changes before they went live, and
| the ability to easily undo changes to the config. If your
| requirements for a DB include those two things, GitHub is a
| damn good database platform! (Azure even has built-in
| support for storing configurations in GitHub!)
| superq wrote:
| It's difficult to be pedantic about an ambiguous term like
| database without additional qualification or specificity.
|
| There are more types of databases than those that end in "SQL".
|
| A CSV file alone is a database. The rows are, well, rows. So is
| a DBM file, which is what MySQL was originally built on (might
| still be). Or an SQLite file.
|
| The client or server API doesn't have to be part of the
| database itself.
| xyst wrote:
| Sounds like Kafka to me, except they'd have to rewrite
| components like ksqlDB.
| Retr0id wrote:
| If you described those needs to the average engineer, they'd
| correctly say "use a database".
| eatonphil wrote:
| > they wrote an append-only log system with limited query
| capabilities.
|
| This sounds like a database to me.
| forrestthewoods wrote:
| Writing custom code that does exactly what you need and nothing
| else is underrated. More people should do that! This is a great
| example.
| hmottestad wrote:
| Yeah. They basically defined a binary format. I wouldn't call
| it a database either.
| mamcx wrote:
| > But it's not a "database" in the sense that someone would
| understand when reading the title.
|
| Sure, because it is common for people to mix up a "database"
| (aka data in some kind of structure) with a paradigm
| (relational, SQL, document, kv) and with a "database
| _system_", aka an app that manages the database.
| Simon_ORourke wrote:
| I've no doubt this is true; however, anyone I've ever met who
| exclaimed "let's create our own database" would be viewed as
| dangerous, unprofessional or downright uneducated in any
| business meeting. There's just too much that can go badly
| wrong, for all the sunk cost in getting anything up and
| running.
| democracy wrote:
| Depends on their meaning of a "database"
| mavili wrote:
| That is such a problem in today's world. Of course you don't
| want to re-invent the wheel and all that, but we must be open
| to the idea of having to do it. Innovation stagnates if people
| suggesting redoing something are immediately seen as
| "dangerous, unprofessional or downright uneducated"
| MaKey wrote:
| I think the issue is that you rarely get to see a neat new
| solution to a given problem. Usually you'll see some kind of
| half-baked attempted solution that's worse than the already
| existing alternatives.
| mavili wrote:
| Yes, but what I'm describing is the problem of not even
| listening to the idea of a new attempt.
| akira2501 wrote:
| > would be viewed as dangerous, unprofessional or downright
| uneducated in any business meeting
|
| Sounds like a great place to work.
|
| > There's just too much can go badly wrong, for all the sunk
| cost in getting anything up and running.
|
| Engineering is the art of compromise. In many cases the
| compromises would not be worth it, but that doesn't mean there
| are zero places where it would be, and eschewing the discussion
| out of fear of how it would be perceived is the opposite of
| Engineering.
| endisneigh wrote:
| It would be interesting to see a database built from the ground
| up for being trivial to maintain.
|
| I use managed databases, but is there really that much to do to
| maintain a database? The host requires some level of
| maintenance - changing disks, updating the host operating
| system, failover during downtime for machine repair, etc. If
| you use a database built for failover, I imagine much of this
| doesn't actually affect operations that much, assuming you
| slightly over-provision.
|
| For a database alone, I think the work needed to maintain it is
| greatly exaggerated. That being said, I still think it's more
| than using a managed database, which is why my company still
| does so.
|
| In this case though, an append log seems pretty simple imo.
| Better to self-host.
| the_duke wrote:
| I don't know what geospatial features are needed, but otherwise
| time series databases are great for this use case.
|
| I especially like ClickHouse: it's generic but also a powerhouse
| that handles most things you throw at it, handles huge write
| volumes (with sufficient batching), supports horizontal scaling,
| and can offload long-term storage to S3 for much smaller disk
| requirements. The geo features in ClickHouse are pretty basic,
| but it does have some built-in geo datatypes and functions for,
| e.g., calculating distances.
| fifilura wrote:
| Would building a data lakehouse be an option?
|
| Stream the events to s3 stored as Parquet or Avro files, maybe in
| Iceberg format.
|
| And then use Trino/Athena to do the long term heavy lifting. Or
| for on-demand use cases.
|
| Then only push what you actually need live to Aurora.
| bsaul wrote:
| I had a similar idea (except using Kafka): have all the nodes
| write to a Kafka cluster, used for buffering, and let some
| consumer write those data in batches into whatever database
| engine(s) you need for querying, with intermediate pre-
| processing steps whenever needed. This lets you trade latency
| for write buffering, while not losing data, thanks to Kafka's
| durability guarantees.
|
| What would you use for streaming directly to S3 at high
| volumes?
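|
| Sketched with kafka-python (broker/topic made up; store() is
| a stand-in for whatever batch writer feeds the query
| database):
|
|     from kafka import KafkaProducer, KafkaConsumer
|
|     # node side: buffered, fire-and-forget updates
|     p = KafkaProducer(bootstrap_servers="broker:9092",
|                       linger_ms=500)  # batch ~0.5s of sends
|     p.send("locations", b"1712200000,v1,52.5,13.4")
|
|     # consumer side: drain and batch-write to the query store
|     c = KafkaConsumer("locations",
|                       bootstrap_servers="broker:9092")
|     for msg in c:
|         store(msg.value)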
| fifilura wrote:
| Yeah, Kafka would handle it, except in my experience I would
| like to avoid Kafka if possible, since it adds complexity.
| (Fair enough, it depends on how precious your data is and
| whether it is acceptable to lose some of it if a node crashes.)
|
| But somehow they are ingesting the data over the network. Would
| writing files to S3 be slower than that? Otherwise you don't
| need much more than a RAM buffer.
|
| Edit: to be clear, Kafka is probably the right choice here;
| it is just that Kafka and me is not a love story.
|
| But it should be cheaper to store long-term data in S3 than
| storing it in Kafka, right?
| diziet wrote:
| As others have mentioned, hosting your own ClickHouse instance
| could probably yield major savings while allowing much more
| flexibility for querying data in the future. If your use case
| can be served by what ClickHouse offers, gosh is it an
| incredibly fast and reliable open source solution that you can
| host yourself.
| zinodaur wrote:
| Very cool! When I started reading the article I thought it was
| going to end up using an LSM tree/RocksDB but y'all went even
| more custom than that
| kumarm wrote:
| I built a similar system in 2002 using JGroups (JavaGroups at
| the time, before the open source project was acquired by JBoss)
| while persisting asynchronously to a DB (Oracle at the time).
| Our scale even in 2002 was much higher than 13,000 vehicles.
|
| The project, I believe, still appears as a success story on the
| JGroups website after 20+ years. I am surprised people are
| writing their own databases for location storage in 2024 :).
| There was no need to invent new technology in 2002, and
| definitely not in 2024.
| jrockway wrote:
| Everyone seems fixated on the word database and the engineering
| cost of writing one. This is a log file. You write data to the
| end of it. You flush it to disk whenever you've filled up some
| unit of storage that is efficient to write to disk. Every query
| is a full table scan. If you have multiple writers, this works
| out very nicely when you have one API server per disk; each
| server writes its own files (with a simple mutex gating the write
| out of a batch of records), and queries involve opening all the
| files in parallel and aggregating the result. (Map, shuffle,
| reduce.)
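|
| A toy version, to make the point (newline-framed records;
| error handling omitted):
|
|     BUF, UNIT = [], 4 << 20  # flush in ~4MB units
|
|     def write(log, rec: bytes):
|         BUF.append(rec)
|         if sum(map(len, BUF)) >= UNIT:
|             log.write(b"".join(BUF))  # one write per batch
|             log.flush()
|             BUF.clear()
|
|     def query(paths, pred):
|         # every query is a full scan over every writer's file
|         for p in paths:
|             with open(p, "rb") as f:
|                 for rec in f:  # newline-framed records
|                     if pred(rec):
|                         yield rec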
|
| Atomic: not applicable, as there are no transactions. Consistent:
| no, as there is no protection against losing the tail end of
| writes (consider "no space left on device" halfway through a
| record). Isolated: not applicable, as there are no transactions.
| Durable: no, the data is buffered in memory before being written
| to the network (EBS is the network, not a disk).
|
| So with all of this in mind, the engineering cost is not going to
| be higher than $10,000 a month. It's a print statement.
|
| If it sounds like I'm being negative, I'm not. Log files are one
| of my favorite types of time series data storage. A for loop that
| reads every record is one of my favorite query plans. But this is
| not what things like Postgres or Aurora aim to do, they aim for
| things like "we need to edit past data several times per second
| and derive some of those edits from data that is also being
| edited". Now you have some complexity and a big old binary log
| file and some for loops isn't really going to get you there. But
| if you don't need those things, then you don't need those things,
| and you don't need to pay for them.
|
| The question you always have to ask, though, is have you reasoned
| about the business impacts of losing data through unhandled
| transactional conflicts? "read committed" or "non-durable writes"
| are often big customer service problems. "You deducted this bill
| payment twice, and now I can't pay the rent!" Does it matter to
| your end users? If not, you can save a lot of time and money. If
| it does, well, then the best-effort log file probably isn't going
| to be good for business.
| bradleyjg wrote:
| If you only need those things, there's also an off-the-shelf
| solution for log files. Time you spend reinventing the wheel is
| time you aren't spending finding product-market fit (if you've
| already found it, you wouldn't even consider this, because
| you'd be too busy servicing the flood of customers).
|
| Unless your company is so far past product-market fit that it
| hires qualified applicants by the classful, _or_ whatever-it-is
| is their product, they have no business coding up custom infra
| bits. The opportunity cost alone is sufficient argument
| against, though far from the only one.
| jrockway wrote:
| I think that EBS is the difficult engineering problem that
| they purchased instead of built from scratch here. Writing
| binary records to a file and reading them all into memory is
| not going to be a time sink that prevents you from finding
| product/market fit. The $120,000/year burn rate on Aurora
| they had seems alarming; an alarm that strongly implies "we
| didn't use the right system for this problem".
|
| My guess for "why didn't they use something off the shelf" is
| that no existing software would be satisfied with the
| tradeoffs they made here. Nobody else wants this.
| happymellon wrote:
| It's also nonsense.
|
| If those were your requirements, why on earth are you using
| Aurora?
|
| Aurora is a multi-region, failover-protected, backup-managed
| service.
|
| This isn't. It would have been cheaper and quicker to install
| an open source logging DB on an EC2 instance, like Elastic.
| mavili wrote:
| That's called engineering; you had a problem, you came up with a
| solution THAT WORKS for your needs. Nicely done and thanks for
| sharing.
| xyst wrote:
| This seems like they rewrote Kafka to me.
|
| Even moderately sized Kafka clusters can handle the throughput
| requirement. Can even optimize for performance over durability.
|
| Some limited query capability with components such as ksqldb.
|
| Maybe offload historical data to blob storage.
|
| Then again, Kafka is kind of complicated to run at these scales.
| Very easy to fuck up.
| kdazzle wrote:
| Plus, managed Kafka is pretty expensive.
| time0ut wrote:
| Good article.
|
| > EBS has automated backups and recovery built in and high uptime
| guarantees, so we don't feel that we've missed out on any of the
| reliability guarantees that Aurora offered.
|
| It may not matter for their use case, but I don't believe this is
| accurate in a general sense. EBS volumes are local to an
| availability zone while Aurora's storage is replicated across a
| quorum of AZs [0]. If a region loses an AZ, the database instance
| can be failed over to a healthy one with little downtime. This
| has only happened to me a couple times over the past three years,
| but it was pretty seamless and things were back on track pretty
| fast.
|
| I didn't see anything in the article about addressing
| availability if there is an AZ outage. It may simply not matter
| or maybe they have solved for it. Could be a good topic for a
| follow up article.
|
| [0] https://aws.amazon.com/blogs/database/introducing-the-
| aurora...
| SmellTheGlove wrote:
| I'm surprised to see the (mostly) critical posts. My reaction
| before coming to the comments was:
|
| - This is core to their platform, makes sense to fit it closely
| to their use cases
|
| - They didn't need most of what a full database offers - they're
| "just" logging
|
| - They know the tradeoffs and designed appropriately to accept
| those to keep costs down
|
| I'm a big believer in building on top of the solved problems in
| the world, but it's also completely okay to build shit. That used
| to be what this industry did, and now it seems to have shifted in
| the direction of like 5-10% of large players invent shit and open
| source it, and the other 90-95% are just stitching together
| things they didn't build in infrastructure that they don't own or
| operate, to produce the latest CRUD app. And hell, that's not bad
| either, it's pretty much my job. But it's also occasionally nice
| to see someone build to their spec and save a few dollars. It's a
| good reminder that costs matter, particularly when money isn't
| free and incinerating endless piles of it chasing a (successful)
| public exit is no longer the norm.
|
| I get the arguments that developer time isn't free, but neither
| is running AWS managed services, despite the name. And they
| didn't really build a general purpose database, they built a much
| simpler logger for their use case to replace a database. I'd be
| surprised if they hired someone additional to build this, and if
| they did, I'd guess (knowing absolutely nothing) that the added
| dev spends 80% of their time doing other things. It's not like
| they launched a datacenter. They just built the software and run
| it on cheaper AWS services versus paying AWS extra for the more
| complex product.
| CapeTheory wrote:
| It's amazing what can happen when software companies start doing
| something approximating real engineering, rather than just
| sticking a UI on top of some managed services.
| zX41ZdbW wrote:
| Sounds totally redundant to me. You can write all location
| updates into ClickHouse, and the problem is solved.
|
| As a demo, I've recently implemented a tool to browse 50 billion
| airplane locations: https://adsb.exposed/
|
| Disclaimer: I'm the author of ClickHouse.
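|
| e.g., with clickhouse-driver (schema made up):
|
|     from datetime import datetime
|     from clickhouse_driver import Client
|
|     ch = Client("localhost")
|     ch.execute(
|         "CREATE TABLE IF NOT EXISTS loc "
|         "(ts DateTime, id String, lat Float64, lon Float64) "
|         "ENGINE = MergeTree ORDER BY (id, ts)")
|     ch.execute("INSERT INTO loc VALUES",
|                [(datetime.utcnow(), "v1", 52.5, 13.4)])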
| kroolik wrote:
| I could be missing something, but I can't really wrap my head
| around "unlimited parallelism".
|
| What they say is that the logic is embedded into their server
| binary and they write to a local EBS. But what happens when they
| have two servers? EBS can't be rw mounted in multiple places.
|
| Won't adding a second and more servers cause trouble, like
| migrating data when a new server joins the cluster or a
| server leaves it?
|
| I understand Aurora was too expensive for them. But I think it is
| important to note their whole setup is not HA at all (which may
| be fine, but the header could be misleading).
| klohto wrote:
| EBS volumes have supported multi-attach for a long time now,
| with no perf impact.
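|
| e.g., via boto3 (multi-attach requires io1/io2 volumes, and
| attached instances must be in the volume's AZ):
|
|     import boto3
|
|     boto3.client("ec2").create_volume(
|         AvailabilityZone="us-east-1a",
|         Size=100, VolumeType="io2", Iops=3000,
|         MultiAttachEnabled=True)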
| kroolik wrote:
| Oh, thanks! I always thought that was what EFS was for. They
| are still limited to the same AZ though, so no multi-AZ
| redundancy.
| klohto wrote:
| Yea, multi-AZ failover would be an issue, but I assume they
| don't care that much.
|
| You could spin up a new EBS volume from the backup when the
| first AZ fails, or keep a warm copy there, but that seems like
| a lot of extra engineering work.
| afro88 wrote:
| These two sentences don't work together:
|
| > [We need to cater for] Delivery companies that want to be able
| to replay the exact seconds leading up to an accident.
|
| > We are ok with losing some data. We buffer about 1 second worth
| of updates before we write to disk
|
| Impressive engineering effort on its own though!
| bevekspldnw wrote:
| "We are running a cloud platform that tracks tens of thousands of
| people and vehicles simultaneously"
|
| ...that's not something to brag about.
| sneak wrote:
| Why, because you think the surveillance implies that it's
| nonconsensual and thus unethical, or the very small scale
| (<100k clients) means this isn't actually a very difficult
| engineering challenge?
___________________________________________________________________
(page generated 2024-04-06 23:01 UTC)