[HN Gopher] An Unlikely Database Migration
___________________________________________________________________
An Unlikely Database Migration
Author : ifcologne
Score : 172 points
Date : 2021-01-13 19:52 UTC (3 hours ago)
(HTM) web link (tailscale.com)
(TXT) w3m dump (tailscale.com)
| jbverschoor wrote:
| Doesn't sound like a smart thing to do, and sounds more like a JS
| dev/student discovering step by step why SQL databases are so
| popular.
|
| Probably not so, because Tailscale is a decent product, but this
| post did not change my view in a good way.
| jclulow wrote:
| On the contrary, it sounds like a seasoned group of people who
| understand their needs and are wary of the very real challenges
| presented by most existing SQL systems with respect to
| deployment, testing, and especially fault tolerance. I'm
| interested in Cockroach myself, but I also acknowledge it's
| relatively new, and itself a large and complicated body of
| software, and choosing it represents a risk.
| hankchinaski wrote:
| seems like the reason for not going the MySQL/PSQL/DBMS route is
| a lack of good Go libraries to handle relational databases
| (ORM/migration/testing)? From the story it seems more like a
| solution in search of a problem.
| bradfitz wrote:
| I thought we adequately addressed all those points in the
| article? Search for ORM and dockertest.
| dekhn wrote:
| """Even with fast NVMe drives and splitting the database into two
| halves (important data vs. ephemeral data that we could lose on a
| tmpfs), things got slower and slower. We knew the day would come.
| The file reached a peak size of 150MB and we were writing it as
| quickly as the disk I/O would let us. Ain't that just peachy?"""
|
| Uh, you compressed it first, right? Because CPUs can compress
| data faster than disk I/O can write it.
| bradfitz wrote:
| Yeah, I think it was zstd.
|
| But the bigger problem was the curve. Doing something O(N*M)
| where N (file size) and M (writes per second) were both growing
| was not a winning strategy, compression or not.
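|
| For anyone who hasn't hit this wall, the pattern being described
| is roughly the sketch below (illustrative Go, not Tailscale's
| actual code); compressing buf before the write shrinks each write
| by a constant factor but doesn't change the O(N*M) curve.
|
|     // Sketch of the "rewrite the whole state file on every
|     // change" pattern. Every write serializes the full map, so
|     // bytes written per second grow as O(N*M): N = state size,
|     // M = write rate.
|     package jsonstore
|
|     import (
|       "encoding/json"
|       "os"
|       "sync"
|     )
|
|     type Store struct {
|       mu    sync.Mutex
|       state map[string]string // the entire "database"
|       path  string
|     }
|
|     func (s *Store) Put(k, v string) error {
|       s.mu.Lock()
|       defer s.mu.Unlock()
|       s.state[k] = v
|
|       buf, err := json.Marshal(s.state) // whole state, every time
|       if err != nil {
|         return err
|       }
|       tmp := s.path + ".tmp"
|       if err := os.WriteFile(tmp, buf, 0600); err != nil {
|         return err
|       }
|       return os.Rename(tmp, s.path) // replace the old file
|     }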
| jeffbee wrote:
| Hrmm. Even lz4 level 1 compresses at "only" about 500-1000MB/s
| on various CPU types, which isn't quite as fast as NVMe devices
| demand.
| npiit wrote:
| Unfortunately this company is known for its shady, aggressive
| marketing on Hacker News. The upvote count is really suspicious
| and this is not the first time.
|
| EDIT: Why the downvoting? The post got like 9 upvotes in the
| first 5 minutes. I frequently go to "new" and this is highly
| suspicious behaviour.
| VikingCoder wrote:
| I upvote literally everything I see from Tailscale, because I
| am hugely impressed with Avery Pennarun and David Crawshaw.
| Seeing Fitz is there too takes it to an even higher level. I
| want them to succeed, and I think they create good HN content.
| Major kudos.
| [deleted]
| hfern wrote:
| I don't think it's shady marketing. The company is composed of
| several prominent software engineers (eg Brad Fitzpatrick of
| memcached fame). I think many folks on HN (myself included) are
| interested to see what they are working on. Especially as Brad
| left Google to work on it.
| npiit wrote:
| I understand. But look at the product's features compared to the
| rest of the industry: Tailscale brings no real value over
| ZeroTier when it comes to meshes; it's not zero trust like
| Cloudflare, Twingate, and many others; it claims to be open
| source while only the client is, and it cannot be used without
| their closed-source control plane, where most of the features
| are behind a paywall; and it's way more expensive than reputable
| offerings like Citrix, Cloudflare and others. Their security
| is very dubious to me (they can in fact inject their own
| public keys to connect to clients' machines and there is no
| way but to trust their word that they won't). I mean, what's
| the innovation compared to the industry that justifies getting
| such systematically excessive coverage here?
| lima wrote:
| It's _more_ zerotrust-y than Cloudflare et al since it's
| entirely P2P, with only the control plane running in the
| cloud.
|
| Compared to ZeroTier, the Tailscale client has a permissive
| license, the mesh is fully routed (vs. a L2 network with
| unencrypted broadcasts), is written in a memory-safe
| programming language, integrates with company SSO, and uses
| the Wireguard protocol (i.e. sane, audited crypto instead
| of a DIY protocol).
| npiit wrote:
| Zero trust has nothing to do with P2P. Zero trust is about
| making sure that this user is authorized to access that
| application at the resource level, not using decades-old
| segmentation/network-level policies. ZeroTier also claims
| to be zero trust but it's technically not. Cloudflare,
| Citrix, and PulseSecure have zero-trust offerings, but many
| others sadly just claim to be, either out of ignorance or
| dishonesty.
| lima wrote:
| Yes, and implementing that is exactly the point of
| Tailscale, with the added advantage of not relying on a
| centralized proxy.
| ignoramous wrote:
| Brad Fitzpatrick aside (he's done a lot more than memcached),
| David Crawshaw, co-author of TFA, led _go-mobile_ (native
| Go apps for Android and iOS) and _NetStack_ (userspace
| TCP/IP implementation used by gVisor and in anger by
| Fuchsia) while at Google.
| npiit wrote:
| Of course I know of Brad Fitzpatrick. I am just questioning
| whether the product and its "innovation", compared to the
| rest of the industry, merit being promoted here this much,
| and I think I know enough about the industry to see that
| there is nothing here to justify all that hype, assuming
| it's genuine. I can get that for FOSS projects, but not for
| companies.
| detaro wrote:
| > _EDIT: Why the downvoting?_
|
| Because this kind of thing is something you should contact the
| mods about, not leave comments that nobody can really
| (dis-)prove
| gk1 wrote:
| What is suspicious about the number 36? (The upvote count at
| this moment.)
|
| Just because you dislike the product (and it's clear you do)
| does not prevent others from liking it, or at least finding
| their articles interesting.
| npiit wrote:
| The post got like 9 upvotes in the first 5 minutes. I
| frequently go to "new" and this is highly suspicious
| behavior.
| Foxboron wrote:
| >EDIT: Why the downvoting? The post got like 9 upvotes in the
| first 5 minutes. I frequently go to "new" and this is highly
| suspicious behaviour.
|
| They are popular people, so people submit the link. When a
| duplicate link is submitted, it counts as an upvote on the
| first submission instead of creating a duplicate.
|
| Source: I was the second upvote.
| nwsm wrote:
| Is tailscale downvoting you as well?
| joshxyz wrote:
| It's OK, I'm part of the audience that doesn't know much about
| databases; it's fun to read discussions about them once in a
| while.
|
| I learned a lot about PostgreSQL, Redis, ClickHouse and
| Elasticsearch here. People's perspectives here are great to
| learn from; they tell you which to avoid and which to try.
| judofyr wrote:
| Interesting choice of technology, but you didn't completely
| convince me why this is better than just using SQLite or
| PostgreSQL with a lagging replica. (You could probably start with
| either one and easily migrate to the other one if needed.)
|
| In particular you've designed a very complicated system:
| Operationally you need an etcd cluster and a tailetc cluster.
| Code-wise you now have to maintain your own transaction-aware
| caching layer on top of etcd
| (https://github.com/tailscale/tailetc/blob/main/tailetc.go).
| That's quite a brave task considering how many databases fail at
| Jepsen. Have you tried running Jepsen tests on tailetc yourself?
| You also mentioned a secondary index system which I assume is
| built on top of tailetc again? How does that interact with
| tailetc?
|
| Considering that high-availability was _not_ a requirement and
| that the main problem with the previous solution was performance
| ( "writes went from nearly a second (sometimes worse!) to
| milliseconds") it _looks_ like a simple server with SQLite + some
| indexes could have gotten you quite far.
|
| We don't really get the full overview from a short blog post like
| this though so maybe it turns out to be a great solution for you.
| The code quality itself looks great and it seems that you have
| thought about all of the hard problems.
| bradfitz wrote:
| > and a tailetc cluster
|
| What do you mean by this part? tailetc is a client library for
| talking to etcd.
|
| Running an etcd cluster is much easier than running an HA
| PostgreSQL or MySQL config. (I previously made LiveJournal and
| ran its massively sharded HA MySQL setup)
| jgraettinger1 wrote:
| Neat. This is very similar to [0], which is _not_ a cache but
| rather a complete mirror of an Etcd keyspace. It does
| Key/Value decoding up front, into a user-defined & validated
| runtime type, and promises to never mutate an existing
| instance (instead decoding into a new instance upon revision
| change).
|
| The typical workflow is to do all of your "reads" out of the
| keyspace, attempt to apply Etcd transactions, and (if needed)
| block until your keyspace has caught up such that you read
| your write -- or someone else's conflicting write.
|
| [0] https://pkg.go.dev/go.gazette.dev/core/keyspace
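|
| (For readers unfamiliar with that workflow, the basic building
| block is an optimistic transaction keyed on ModRevision. A rough
| sketch with the stock clientv3 API and made-up keys; not gazette
| or tailetc code:)
|
|     // Sketch: optimistic update against etcd. Read a key and its
|     // ModRevision, then commit only if nobody else has written
|     // the key in the meantime.
|     package main
|
|     import (
|       "context"
|       "fmt"
|       "time"
|
|       clientv3 "go.etcd.io/etcd/client/v3"
|     )
|
|     func main() {
|       cli, err := clientv3.New(clientv3.Config{
|         Endpoints:   []string{"localhost:2379"},
|         DialTimeout: 5 * time.Second,
|       })
|       if err != nil {
|         panic(err)
|       }
|       defer cli.Close()
|
|       ctx := context.Background()
|       resp, err := cli.Get(ctx, "nodes/n1")
|       if err != nil {
|         panic(err)
|       }
|       var rev int64 // 0 means the key must not exist yet
|       if len(resp.Kvs) > 0 {
|         rev = resp.Kvs[0].ModRevision
|       }
|
|       // The Put applies only if the key is unchanged since the
|       // read; otherwise re-read and retry.
|       txn, err := cli.Txn(ctx).
|         If(clientv3.Compare(clientv3.ModRevision("nodes/n1"), "=", rev)).
|         Then(clientv3.OpPut("nodes/n1", `{"name":"n1"}`)).
|         Commit()
|       if err != nil {
|         panic(err)
|       }
|       fmt.Println("committed:", txn.Succeeded)
|     }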
| crawshaw wrote:
| Drat! I went looking for people doing something similar
| when I sat down to design our client, but did not find your
| package. That's a real pity, I would love to have
| collaborated on this.
|
| I guess Go package discovery remains an unsolved problem.
| bradfitz wrote:
| Whoa, we hadn't seen that! At first glance it indeed
| appears to be perhaps exactly identical to what we did.
| jgraettinger1 wrote:
| Slightly different trade-offs. This package is
| emphatically just "for" Etcd, choosing to directly expose
| MVCC types & concepts from the client.
|
| It also doesn't wrap transactions -- you use the etcd
| client directly for that.
|
| The Nagle delay it implements helps quite a bit with
| scaling, though, while keeping the benefits of a tightly
| packed sorted keyspace. And you can directly access /
| walk decoded state without copies.
| judofyr wrote:
| > What do you mean by this part? tailetc is a library used by
| the client of etcd.
|
| Oh. Since they have a full cache of the database I thought it
| was intended to be used as a separate set of servers layered
| in front of etcd to lessen the read load. But you're actually
| using it directly? Interesting. What's the impact on memory
| usage and scalability? Are you not worried that this will not
| scale over time since all clients need to have all the data?
| bradfitz wrote:
| Well, we have exactly 1 client (our 1 control server
| process).
|
| So architecturally it's:
|
|     3 or 5 etcd (forget what we last deployed)
|       <--> 1 control process
|       <--> every Tailscale client in the world
|
| The "Future" section is about bumping "1 control process"
| to "N control processes" where N will be like 2 or max 5
| perhaps.
|
| The memory overhead isn't bad, as the "database" isn't big.
| Modern computers have tons of RAM.
| judofyr wrote:
| You're able to serve _all_ your clients from a _single_
| control process? And this would probably work for quite a
| while? Then I struggle to see why you couldn't just use
| SQLite. On startup read the full database into memory.
| Serve reads straight from memory. Writes go to SQLite
| first and if it succeeds then you update the data in
| memory. What am I missing here?
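|
| (Concretely, that pattern is something like the sketch below:
| database/sql with the mattn/go-sqlite3 driver, a hypothetical
| key/value schema, and a write-through in-memory map. Illustrative
| only, not Tailscale's code.)
|
|     // SQLite as the source of truth, with a full copy in memory:
|     // reads hit the map; writes go to SQLite first and update
|     // the map only on success.
|     package kvstore
|
|     import (
|       "database/sql"
|       "sync"
|
|       _ "github.com/mattn/go-sqlite3"
|     )
|
|     type Store struct {
|       db    *sql.DB
|       mu    sync.RWMutex
|       cache map[string]string
|     }
|
|     func Open(path string) (*Store, error) {
|       db, err := sql.Open("sqlite3", path)
|       if err != nil {
|         return nil, err
|       }
|       if _, err := db.Exec(
|         `CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)`,
|       ); err != nil {
|         return nil, err
|       }
|       s := &Store{db: db, cache: make(map[string]string)}
|       rows, err := db.Query(`SELECT k, v FROM kv`) // load it all
|       if err != nil {
|         return nil, err
|       }
|       defer rows.Close()
|       for rows.Next() {
|         var k, v string
|         if err := rows.Scan(&k, &v); err != nil {
|           return nil, err
|         }
|         s.cache[k] = v
|       }
|       return s, rows.Err()
|     }
|
|     func (s *Store) Get(k string) (string, bool) {
|       s.mu.RLock()
|       defer s.mu.RUnlock()
|       v, ok := s.cache[k]
|       return v, ok
|     }
|
|     func (s *Store) Put(k, v string) error {
|       // Durable write first; update memory only if it succeeded.
|       _, err := s.db.Exec(`INSERT INTO kv (k, v) VALUES (?, ?)
|         ON CONFLICT(k) DO UPDATE SET v = excluded.v`, k, v)
|       if err != nil {
|         return err
|       }
|       s.mu.Lock()
|       s.cache[k] = v
|       s.mu.Unlock()
|       return nil
|     }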
| crawshaw wrote:
| We could use SQLite. (I love SQLite and have written
| about it before!) The goal is N control processes not for
| scale, but for more flexibility with deployment,
| canarying, etc.
| judofyr wrote:
| That makes sense. Thanks for answering all of my critical
| questions. Looks like a very nice piece of technology
| you're building!
| robmccoll wrote:
| I'm curious what drove the decision to move to an
| external store (and multinode HA config at that) now
| compared to using a local Go KV store like Badger or
| Pebble?
|
| Given that the goals seem to be improving performance
| over serializing a set of maps to disk as JSON on every
| change and keeping complexity down for fast and simple
| testing, a KV library would seem to accomplish both with
| less effort, without introducing dependence on an
| external service, and would enable the DB to grow out of
| memory if needed. Do you envision going to 2+ control
| processes that soon?
|
| Any consideration given to running the KV store inside
| the control processes themselves (either by embedding
| something like an etcd or by integrating a raft library
| and a KV store to reinvent that wheel) since you are
| replicating the entire DB into the client anyway?
|
| Meanwhile I'm working with application-sharded PG
| clusters with in-client caches with coherence maintained
| through Redis pubsub, so who am I to question the
| complexity of this setup haha.
| bradfitz wrote:
| Yes, we're going to be moving to 2+ control servers for
| HA + blue/green reasons pretty soon here.
| mwcampbell wrote:
| > Running an etcd cluster is much easier than running an HA
| PostgreSQL or MySQL config.
|
| What if you used one of the managed RDBMS services offered by
| the big cloud providers? BTW, if you don't mind sharing,
| where are you hosting the control plane?
| bradfitz wrote:
| > What if you used one of the managed RDBMS services
| offered by the big cloud providers?
|
| We could (and likely would, despite the costs) but that
| doesn't address our testing requirements.
|
| The control plane is on AWS.
|
| We use 4 or 5 different cloud providers (Tailscale makes
| that much easier) but the most important bit is on AWS.
| nicoburns wrote:
| Why is testing Postgres/MySQL difficult? You can easily
| run a server locally (or on CI) and create new databases
| for test runs, etc.
| bradfitz wrote:
| It's not difficult. We've done it before and have code
| for it. See the article.
| endymi0n wrote:
| This is about spot on. I do get the part about testability, but
| with a simple Key/Value use case like this, BoltDB or Pebble
| might have fit extremely well into the Native Golang paradigm
| as a backing store for the in-memory maps while not needing
| nearly as much custom code.
|
| Plus maybe replacing the sync.Mutex with RWMutexes for optimum
| read performance in a seldom-write use case.
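|
| (Roughly the shape that would take, as a sketch: an RWMutex-
| guarded in-memory map warmed from and written through to a bbolt
| bucket. Names are made up; this isn't a claim about Tailscale's
| actual internals.)
|
|     // In-memory map guarded by an RWMutex, persisted to one
|     // bbolt bucket. Readers never block each other; writes hit
|     // disk before updating the map.
|     package boltcache
|
|     import (
|       "sync"
|
|       bolt "go.etcd.io/bbolt"
|     )
|
|     var bucket = []byte("nodes")
|
|     type Cache struct {
|       mu sync.RWMutex
|       m  map[string][]byte
|       db *bolt.DB
|     }
|
|     func Open(path string) (*Cache, error) {
|       db, err := bolt.Open(path, 0600, nil)
|       if err != nil {
|         return nil, err
|       }
|       c := &Cache{m: make(map[string][]byte), db: db}
|       err = db.Update(func(tx *bolt.Tx) error {
|         b, err := tx.CreateBucketIfNotExists(bucket)
|         if err != nil {
|           return err
|         }
|         // Warm the map; copy values, which are only valid
|         // inside the transaction.
|         return b.ForEach(func(k, v []byte) error {
|           c.m[string(k)] = append([]byte(nil), v...)
|           return nil
|         })
|       })
|       return c, err
|     }
|
|     func (c *Cache) Get(k string) ([]byte, bool) {
|       c.mu.RLock() // many concurrent readers, rare writers
|       defer c.mu.RUnlock()
|       v, ok := c.m[k]
|       return v, ok
|     }
|
|     func (c *Cache) Put(k string, v []byte) error {
|       err := c.db.Update(func(tx *bolt.Tx) error {
|         return tx.Bucket(bucket).Put([]byte(k), v)
|       })
|       if err != nil {
|         return err
|       }
|       c.mu.Lock()
|       c.m[k] = v
|       c.mu.Unlock()
|       return nil
|     }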
|
| On the other hand again, I feel a bit weird criticizing Brad
| Fitzpatrick ;-) -- so there might be other things at play I
| don't have a clue about...
| jamescun wrote:
| This post touches on "innovation tokens". While I agree with the
| premise of "choose boring technology", it feels like a process
| smell, particularly of a startup whose goal is to innovate a
| technology. Feels demotivating as an engineer if management says
| our team can only innovate an arbitrary N times.
| marcinzm wrote:
| I mean, what do you propose in terms of getting a diverse group
| of engineers to trade off innovation versus choosing boring
| technologies? Keeping in mind that how much risk the company is
| willing to take is not the decision of engineers but the
| executive team. Innovation tokens convey the level of risk the
| executive team is willing to take in terms of technologies. The
| alternative I've often seen is a dictatorial CTO (or VP of Eng)
| who simply says NO a lot which is a lot more demotivating. A
| large company may do detailed risk analyses but those are too
| cumbersome for a startup.
| sbierwagen wrote:
| What is the purpose of a startup? Is it to put keywords on your
| resume, or is it to create a product?
| ZephyrBlu wrote:
| Depends if you have funding or not.
| crawshaw wrote:
| It is a tricky tradeoff for startups. On the one hand, a
| startup has very limited resources and so has to focus on the
| business. On the other hand, a startup has to experiment to
| find the business. I don't think there's an easy answer.
|
| In our case, the control plane data store really should be as
| boring as possible. It was a real stretch using anything other
| than MySQL. We tried to lay out the arguments in the post, but
| the most compelling was that we had lots of unit tests that spun
| up a DB and shut it down quickly. Maybe a hundred tests whose
| total execution time was 1.5s. The "boring" options made that
| surprisingly difficult.
|
| (Tailscaler and blog post co-author)
| eatonphil wrote:
| One solution is to use a standard subset of sql so you can
| use sqlite in unit tests and mysql/postgres in prod. Many
| languages also have in-memory SQL implementations that are an
| even more convenient substitute for sqlite.
|
| Of course the benefit of what you did, even if I wouldn't
| have done it, is that you're _not_ using a different system
| in dev vs prod.
| jrockway wrote:
| I would not go down the road of figuring out what subset of
| SQL various databases understand. You will always be
| surprised, and you'll be surprised in production because
| "the thing" you developed against sqlite doesn't work in
| postgres.
|
| I used to do this and stopped when I noticed that sqlite
| and postgres treat booleans differently; postgres accepts
| 't' as true, but SQLite stores a literal 't' in the
| boolean-typed field. This means you get different results
| when you read things back out. All in all, not a rabbit
| hole you want to go down.
|
| Personally, I just create a new database for each test
| against a postgres server on localhost. The startup time is
| nearly zero, and the accuracy compared to production is
| nearly 100%.
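|
| (That pattern fits in a small test helper; a sketch assuming a
| local postgres, the lib/pq driver, and made-up names:)
|
|     // Each test gets a freshly created database on a local
|     // postgres server; it's dropped again when the test ends.
|     package dbtest
|
|     import (
|       "database/sql"
|       "fmt"
|       "testing"
|       "time"
|
|       _ "github.com/lib/pq"
|     )
|
|     func NewTestDB(t *testing.T) *sql.DB {
|       t.Helper()
|       admin, err := sql.Open("postgres",
|         "postgres://postgres@localhost/postgres?sslmode=disable")
|       if err != nil {
|         t.Fatal(err)
|       }
|       name := fmt.Sprintf("test_%d", time.Now().UnixNano())
|       if _, err := admin.Exec("CREATE DATABASE " + name); err != nil {
|         t.Fatal(err)
|       }
|       db, err := sql.Open("postgres", fmt.Sprintf(
|         "postgres://postgres@localhost/%s?sslmode=disable", name))
|       if err != nil {
|         t.Fatal(err)
|       }
|       t.Cleanup(func() {
|         db.Close()
|         admin.Exec("DROP DATABASE " + name)
|         admin.Close()
|       })
|       return db
|     }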
| bradfitz wrote:
| Exactly. That's why I wrote: "No Docker, no mocks, testing
| what we'd actually use in production."
| marcinzm wrote:
| I'm used to Scala for these things and it seemed fairly easy
| to do. Docker DB container would be spun up at the start of
| testing (using a package such as testcontainers-scala). Then
| the DB would be setup using the migration scripts (using
| Flyway) so it has a known state. Every test (or set of tests
| if they're linked) would have a pre-hook which nuked the DB
| to a clean state. Then the docker container would be shut
| down at the end of testing. I'm guessing there's even some
| way to have this run concurrently with multiple DBs (one per
| thread) but we never did that. Is Java's ecosystem for this
| type of tooling just that much better?
| bradfitz wrote:
| I wrote about that in the article. Search for "dockertest".
|
| We considered that option but didn't like it. It still
| wasn't fast enough, and placed onerous (or at least
| annoying) dependencies on future employees.
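|
| (For anyone who hasn't seen it, the dockertest approach under
| discussion looks roughly like this; a sketch with ory/dockertest
| and made-up settings, not the actual harness from the article:)
|
|     // Spin up a throwaway postgres container for the test run
|     // and wait until it accepts connections.
|     package testutil
|
|     import (
|       "database/sql"
|       "fmt"
|       "log"
|
|       _ "github.com/lib/pq"
|       "github.com/ory/dockertest/v3"
|     )
|
|     func runWithPostgres(fn func(db *sql.DB)) {
|       pool, err := dockertest.NewPool("") // local Docker daemon
|       if err != nil {
|         log.Fatal(err)
|       }
|       res, err := pool.Run("postgres", "13",
|         []string{"POSTGRES_PASSWORD=secret", "POSTGRES_DB=testdb"})
|       if err != nil {
|         log.Fatal(err)
|       }
|       defer pool.Purge(res)
|
|       var db *sql.DB
|       // Retry until postgres inside the container is ready.
|       if err := pool.Retry(func() error {
|         var err error
|         db, err = sql.Open("postgres", fmt.Sprintf(
|           "postgres://postgres:secret@localhost:%s/testdb?sslmode=disable",
|           res.GetPort("5432/tcp")))
|         if err != nil {
|           return err
|         }
|         return db.Ping()
|       }); err != nil {
|         log.Fatal(err)
|       }
|       fn(db)
|     }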
| mwcampbell wrote:
| > It still wasn't fast enough, and placed onerous (or at
| least annoying) dependencies on future employees.
|
| Did you configure the Postgres (or MySQL) database to be
| entirely in memory, e.g. by using a tmpfs Docker volume?
|
| As for being onerous or annoying for new employees, which
| is worse: having to set up a Docker environment, or using
| a relatively obscure data store in a way that nobody else
| does?
| rubenv wrote:
| For PostgreSQL and Go, here's a package to spin up a temp in-
| mem PostgreSQL: https://github.com/rubenv/pgtest/
| aidos wrote:
| From memory, in Postgres, you could also have a copy of
| your base database and then copy it in for each test, which
| I seem to recall being fairly fast. It includes the data
| too:
|
|     CREATE DATABASE test_db TEMPLATE initial_db;
| 256dpi wrote:
| I found myself in a similar situation some time ago with MongoDB.
| In one project my unit tests started slowing me down too much to
| be productive. In another, I had so little data that running a
| server alongside it was a waste of resources. I invested a couple
| of weeks in developing a SQLite type of library[1] for Go that
| implemented the official Go drivers API with a small wrapper to
| select between the two. Up until now, it has paid huge dividends
| in both projects' ongoing simplicity and was totally worth the
| investment.
|
| [1]: https://github.com/256dpi/lungo
| ed25519FUUU wrote:
| > _"Yeah, whenever something changes, we grab a lock in our
| single process and rewrite out the file!"_
|
| Is this actually easier than using SQLite?
| bradfitz wrote:
| Keep reading!
| [deleted]
| bob1029 wrote:
| I do like putting .json files on disk when it makes sense, as
| this is a one-liner to serialize both ways in .NET/C#. But, once
| you hit that wall of wanting to select subsets of data because
| the total dataset got larger than your CPU cache (or some other
| step-wise NUMA constraint)... It's time for a little bit more
| structure. I would have just gone with SQLite to start. If I am
| not serializing a singleton out to disk, I reach for SQLite by
| default.
| [deleted]
| worik wrote:
| This jumped out at me: "The obvious next step to take was to move
| to SQL"
|
| No. Not unless your data is relational. This is a common problem:
| relational databases have a lot of overhead. They are worth it
| when dealing with relational data, not so much with non-
| relational data.
| bradfitz wrote:
| Maybe "obvious" was the wrong word there. What we meant to say
| was that would be the typical route people would go, doing the
| common, well-known, risk-averse thing.
| shapiro92 wrote:
| I find the article a bit hard to follow. What were the actual
| requirements? I probably didn't understand all of this, but was
| the time spent on thinking about this more valuable than using a
| KV Store?
| breck wrote:
| If you liked this I highly recommend "Clean Architecture" by
| Uncle Bob. He has a great story of how they kept putting off
| switching to a SQL DB on a project and then never needed to and
| it was a big success anyway.
| petters wrote:
| > (Attempts to avoid this with ORMs usually replace an annoying
| amount of typing with an annoying amount of magic and loss of
| efficiency.)
|
| Loss of efficiency? Come on, you were using a file before! :-)
|
| Article makes me glad I'm using Django. Just set up a managed
| Postgres instance in AWS and be done with it. Sqlite for testing
| locally. Just works and very little engineering time spent on
| persistent storage.
|
| Note: I do realize Brad is a very, very good engineer.
| malisper wrote:
| The post touches upon it, but I didn't really understand the
| point. Why doesn't synchronous replication in Postgres work for
| this use case? With synchronous replication you have a primary
| and secondary. Your queries go to the primary and the secondary
| is guaranteed to be at least as up to date as the primary. That
| way if the primary goes down, you can query the secondary instead
| and not lose any data.
| apenwarr wrote:
| That would have been considerably less scalable. etcd has some
| interesting scaling characteristics. I posted some followup
| notes on twitter here:
| https://twitter.com/apenwarr/status/1349453076541927425
| judofyr wrote:
| How is PostgreSQL (or MySQL) "considerably less scalable"
| exactly? etcd isn't particularly known for being scalable or
| performant. I'm sure it's fast enough for your use-case
| (since you've benchmarked it), but people have been scaling
| both PostgreSQL and MySQL far beyond what etcd can achieve
| (usually at the cost of availability of course).
| apenwarr wrote:
| [I work at Tailscale] I only mean scalable for our very
| specific and weird access patterns, which involves
| frequently read-iterating through a large section of the
| keyspace to calculate and distribute network+firewall
| updates.
|
| Our database has very small _amounts_ of data but a very,
| very large number of parallel readers. etcd explicitly
| disclaims any ability to scale to large data sizes, and
| probably rightly so :)
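|
| (In etcd terms that's a ranged read over a key prefix; a minimal
| sketch with clientv3 and a made-up key layout:)
|
|     // Read-iterate everything under a prefix in one ranged
|     // read, then recompute derived state from it.
|     package netmap
|
|     import (
|       "context"
|       "fmt"
|
|       clientv3 "go.etcd.io/etcd/client/v3"
|     )
|
|     func dumpNodes(ctx context.Context, cli *clientv3.Client) error {
|       resp, err := cli.Get(ctx, "nodes/",
|         clientv3.WithPrefix(),
|         clientv3.WithSort(clientv3.SortByKey, clientv3.SortAscend))
|       if err != nil {
|         return err
|       }
|       for _, kv := range resp.Kvs {
|         // This is where network/firewall state would be
|         // recalculated for distribution to clients.
|         fmt.Printf("%s: %d bytes (rev %d)\n",
|           kv.Key, len(kv.Value), kv.ModRevision)
|       }
|       return nil
|     }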
| bradfitz wrote:
| We could've done that. We could've also used DRBD, etc. But
| then we still have the SQL/ORM/testing latency/dependency
| problems.
| irrational wrote:
| Can you go into more about what these problems are? I've
| always used databases (about 15 years on Oracle and about 5
| years on Postgres) and I'm not sure if I know what problems
| you are referring to. Maybe I have experienced them, but have
| thought of them by a different name.
|
| SQL - I'm not sure what the problems are with SQL. But it is
| like a second language to me so maybe I experienced these
| problems long ago and have forgotten about them.
|
| ORM - I never use an ORM, so I have no idea what the problems
| might be.
|
| testing latency - I don't know what this refers to.
|
| dependency - ditto
| bradfitz wrote:
| SQL is fine. We use it for some things. But not writing SQL
| is easier than writing SQL. Our data is small enough to fit
| in memory. Having all the data in memory and just
| accessible is easier than doing SQL + network round trips
| to get anything.
|
| ORMs: consider yourself lucky. They try to make SQL easy by
| auto-generating terrible SQL.
|
| Testing latency: we want to run many unit tests very
| quickly without high start-up cost. Launching
| MySQL/PostgreSQL docker containers and running tests
| against Real Databases is slower than we'd like.
|
| Dependencies: Docker and those MySQL or PostgreSQL servers
| in containers.
| nmjohn wrote:
| Can you put some numbers on how much time is too much?
| I've never seen anyone go this far to avoid using a
| database for what sounds like the only "real" reason is
| to avoid testing latency (a problem which has many other
| solutions) so I am really confused, but curious to
| understand!
| bradfitz wrote:
| Running all of our control server tests (including
| integration tests) right now takes 8 seconds, and we're
| not even incredibly happy with that. There's no reason it
| should even be half that.
|
| So it's not really in our patience budget for adding a
| mysqld or postgres start up (possible docker pull, create
| its schema, etc).
| viraptor wrote:
| Not sure what their requirements are, but I'm using a
| "spin up an isolated postgres instance per test run"
| solution and end up with ~3s overhead to do that. (Using
| https://pypi.org/project/testing.postgresql/ )
|
| Edit: 3s for global setup/teardown. Not per test
| function/suite.
| majormajor wrote:
| Were you doing a lot of logic in SQL itself? Sounds like
| not really, but then I'm surprised you'd have so many
| tests hitting the DB directly, vs most feature logic
| living above that layer in a way that doesn't need the DB
| running at all.
| jolux wrote:
| Why are your _unit tests_ touching a database? I'm a real
| stickler about keeping unit tests isolated, because once
| I/O gets involved, they invariably become much less
| reliable and as you mention, too slow.
| bradfitz wrote:
| Sorry, I should've just said tests. Our tests overall are
| a mix of unit tests and integration tests and all sizes
| in between, depending on what they want to test.
| irrational wrote:
| Ah, we don't use Docker or any other container
| technology. Maybe that is why we aren't seeing the
| latency issues you are referring to.
| bradfitz wrote:
| Docker itself doesn't add much latency. It just makes
| getting MySQL and PostgreSQL easier. If anything, it
| helps with dependencies. The database server startup
| still isn't great, though.
| nicoburns wrote:
| If you don't use Docker, you can just leave the database
| server running in the background, which removes the
| startup latency (you can of course do this with Docker
| too, but Docker has a tendency to use quite a few
| resources when left running in the background, which a
| database server on its own won't).
| bradfitz wrote:
| So then every engineer needs to install & maintain a
| database on their machines. (Hope they install the right
| version!)
|
| I mean, that's what my old company did pre-Docker. It
| works, but it's tedious.
| nicoburns wrote:
| I mean that's an `apt install postgres` or `brew install
| postgres` away. Takes about 5 minutes. I guess it could
| become a pain if you need to work with multiple different
| versions at once.
| majormajor wrote:
| Being deep in the cloud world right now, with aws and
| terraform and kubernetes cli tools, etc, not having to
| install third party tools on my machine does sound pretty
| great, but also entirely unrealistic.
|
| Managing local DBs once new versions are out and your
| server isn't upgraded yet is irritating, but when I'm
| using a Mac I'd still rather use a native DB than Docker
| because of the VM overhead, since I've not yet run into a
| bug caused by something like "my local postgres was a
| different version than the server was." (Closest I've
| gotten was imagemagick for mac doing something a bit
| differently than for linux, about 10 years ago at this
| point.)
| jgraettinger1 wrote:
| I think the "database" label is tripping up the
| conversation here. What's being talked about here, really,
| is fast & HA coordination over a (relatively) small amount
| of shared state by multiple actors within a distributed
| system. This is literally Etcd's raison d'etre, it excels
| at this use case.
|
| There are many operational differences between Etcd and a
| traditional RDBMS, but the biggest ones are that
| broadcasting updates (so that actors may react) is a core
| operation, and the MVCC log is "exposed" (via ModRevision)
| so that actors can resolve state disagreements (am I out of
| date, or are you?).
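|
| (Concretely, that broadcast primitive is etcd's watch API: each
| actor subscribes to a prefix and sees every change along with
| its revision, so it can tell whether its local state is stale. A
| small sketch, assuming the stock clientv3 client and a made-up
| key prefix:)
|
|     // Watch a key prefix and react to every change, resuming
|     // from a known revision so no update is missed.
|     package watcher
|
|     import (
|       "context"
|       "log"
|
|       clientv3 "go.etcd.io/etcd/client/v3"
|     )
|
|     func watchNodes(ctx context.Context, cli *clientv3.Client, fromRev int64) {
|       ch := cli.Watch(ctx, "nodes/",
|         clientv3.WithPrefix(), clientv3.WithRev(fromRev+1))
|       for resp := range ch {
|         for _, ev := range resp.Events {
|           log.Printf("%s %q at rev %d",
|             ev.Type, ev.Kv.Key, ev.Kv.ModRevision)
|           // React here: recompute derived state, notify peers.
|         }
|       }
|     }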
| dekhn wrote:
| I never took a course in databases. At some point I was expected
| to store some data for a webserver, looked at the BSDDB API, and
| went straight to mysql (this was in ~2000). I spent the time to
| read the manual on how to do CRUD but didn't really look at
| indices or anything exotic. The webserver just wrote raw SQL
| queries against an ultra-simple schema, storing lunch orders.
| It's worked for a good 20 years and only needed minor data
| updates when the vendor changed and small python syntax changes
| to move to python3.
|
| At that point I thought "hmm, I guess I know databases" and a few
| years later, attempted to store some slightly larger, more
| complicated data in MySQL and query it. My query was basically
| "join every record in this table against itself, returning only
| rows that satisfy some filter". It ran incredibly slowly, but it
| turned out our lab secretary was actually an ex-IBM Database
| Engineer, and she said "did you try sorting the data first?" One
| call to strace showed that MySQL was doing a very inefficient
| full table scan for each row, but by inserting the data in sorted
| order, the query ran much faster. Uh, OK. I can't repeat the
| result, so I expect MySQL fixed it at some point. She showed me
| the sorts of DBs "real professionals" designed: it was a third-
| normal-form menu ordering system for an early meal delivery
| website (wayyyyy ahead of its time. food.com). At that point I
| realized that there was obviously something I didn't know about
| databases, in particular that there was an entire schema theory
| on how to structure knowledge to take advantage of the features
| that databases have.
|
| My next real experience with databases came when I was hired to
| help run Google's MySQL databases. Google's Ads DB was
| implemented as a collection of mysql primaries with many local
| and remote replicas. It was a beast to run, required many trained
| engineers, and never used any truly clever techniques, since the
| database was sharded so nobody could really do any interesting
| joins.
|
| I gained a ton of appreciation for MySQL's capabilities from that
| experience but I can't say I really enjoy MySQL as a system. I
| like PostgreSQL much better; it feels like a grown-up database.
|
| What I can say is that all this experience, and some recent work
| with ORMs, has led me to believe that while the SQL query model
| and RDBMSes are very powerful, you basically have to fully buy
| into the mental model and retain some serious engineering talent:
| folks who understand database index disk structures,
| multithreading, etc.
|
| For everybody else, a simple single-machine on-disk key-value
| store with no schema is probably the best thing you can do.
| gfody wrote:
| > The file reached a peak size of 150MB
|
| Is this a typo? 150MB is such a minuscule amount of data that you
| could do pretty much anything and be OK.
| bradfitz wrote:
| Not a typo. See why we're holding it all in RAM?
|
| But writing out 150MB many times per second isn't super nice
| when both the 150MB and the number of times per second are
| growing.
| jeff-davis wrote:
| This post illustrates the difference between persistence and a
| database.
|
| If you are expecting to simply persist one instance of one
| application's state across different runs and failures, a
| database can be frustrating.
|
| But if you want to manage your data across different versions of
| an app, different apps accessing the same data, or concurrent
| access, then a database will save you a lot of headaches.
|
| The trick is knowing which one you want. Persistence is tempting,
| so a lot of people fool themselves into going that direction, and
| it can be pretty painful.
|
| I like to say that rollback is the killer feature of SQL. A
| single request fails (e.g. unique violation), and the overall app
| keeps going, handling other requests. Your application code can be
| pretty bad, and you can still have a good service. That's why PHP
| was awesome despite being bad -- SQL made it good (except for all
| the security pitfalls of PHP, which the DB couldn't help with).
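|
| (In Go terms the pattern is roughly: wrap each request in a
| transaction and roll back on failure, so the connection stays
| healthy for the next request. A sketch with made-up table names,
| not from the article:)
|
|     // Per-request transaction: a unique violation aborts just
|     // this request, and Rollback leaves the connection healthy
|     // for the next one.
|     package handlers
|
|     import (
|       "context"
|       "database/sql"
|       "fmt"
|     )
|
|     func createUser(ctx context.Context, db *sql.DB, email string) error {
|       tx, err := db.BeginTx(ctx, nil)
|       if err != nil {
|         return err
|       }
|       defer tx.Rollback() // no-op if Commit already succeeded
|
|       _, err = tx.ExecContext(ctx,
|         `INSERT INTO users (email) VALUES ($1)`, email)
|       if err != nil {
|         // e.g. unique violation: this request fails, the app
|         // keeps serving the rest.
|         return fmt.Errorf("createUser %q: %w", email, err)
|       }
|       return tx.Commit()
|     }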
| mrkstu wrote:
| Maybe also part of the success of Rails? Similarly an easy-to-
| engineer veneer atop a database.
| toddh wrote:
| So what happens when the computer fails in the middle of a long
| write to disk?
| bradfitz wrote:
| Keep reading!
| richardARPANET wrote:
| I like to call this "architectural masturbation". Should have
| just used Postgres.
___________________________________________________________________
(page generated 2021-01-13 23:00 UTC)