hngopher.com

       [HN Gopher] Databases in 2022: A Year in Review
       ___________________________________________________________________
        
       Databases in 2022: A Year in Review
        
       Author : gavinray
       Score  : 171 points
       Date   : 2023-01-02 17:17 UTC (5 hours ago)
        
 (HTM) web link (ottertune.com)
 (TXT) w3m dump (ottertune.com)
        
       | friedman23 wrote:
       | Happy to find this blog, wish there was more talk about NewSQL
       | databases. I've personally seen the amount of cost that goes into
       | migrating off of traditional RDBMs systems to sharded RDBMs
       | systems and was wondering why people don't go with these scalable
       | systems from the start now that they exist.
        
         | pm90 wrote:
         | Probably because there is a lot more operational understanding
         | of "traditional" RDBMs and they're sufficient in early stages.
         | Once your product takes off it can be difficult to scale but by
         | that time you probably have enough money to attract engineers
         | that can do the migration.
        
           | friedman23 wrote:
           | Yes, that's the common refrain, just migrate when you hit
           | scale. Just pay people to do the migration. My experience at
           | multiple, billion dollar companies has been that it's never
           | that easy.
           | 
           | People become afraid of doing schema migrations and tools
           | like dynamodb become the default datastore choice for all new
           | feature development.
           | 
           | Every single startup I've worked at, technical cruft that
           | existed from company founding never disappears despite the
           | founders intent that it was never supposed to stay that way
           | and it slows down feature development to a crawl.
           | 
           | If NewSQL databases exist that support 90% of the features of
           | postgres but give support for cross shard transactions then
           | I'd rather use them than technology that has been proven to
           | not scale. Unfortunately I'm not an expert on distributed
           | databases and there is not much written about these databases
           | in practice outside of the startups creating the databases
           | themselves.
        
         | likeabbas wrote:
         | +1. Recently, Yugabyte has seemed intriguing to me. (It seems
         | like) You can launch it as a standalone instance and use it
         | just like Postgres early in your company, and then when you
         | need to distribute, you can do that.
        
           | brnewd wrote:
           | I think many love the promise of a distributed database, but
           | in practice there is a real speed and feature set compromise
           | to make.
        
       | zX41ZdbW wrote:
       | > he claimed that because MonetDB focuses on read-only OLAP
       | workloads, then MMAP was good enough
       | 
       | ClickHouse has switchable IO engines: - read; - pread; - mmap; -
       | pread_threadpool;
       | 
       | The best performance is achieved with the combined method, named
       | 'pread_threadpool'. It uses the `preadv2` syscall to check if the
       | file is in the page cache.
       | 
       | More details here:
       | https://github.com/ClickHouse/ClickHouse/pull/26791
       | 
       | There was also big research of performance on HDD, SSD, and
       | Optane, comparing various methods:
       | https://clickhouse.com/blog/a-journey-to-io_uring-aio-and-mo...
       | 
       | And there is a pull request adding io_uring:
       | https://github.com/ClickHouse/ClickHouse/pull/38456 -
       | Unfortunately, it's unfinished and cannot be merged because the
       | CI has found bugs. Nevertheless, the advantages of io_uring for
       | analytical databases are negligible.
       | 
       | The `mmap` method is still useful. For example, for data import.
       | See here: https://github.com/ClickHouse/ClickHouse/pull/43927
        
       | nikita wrote:
       | Neon CEO here.
       | 
       | There are a number of ideas in the database space that the
       | industry is adopting across the board:
       | 
       | - Separation of storage and compute (Neon, AlloyDB, Aurora).
       | Every cloud database should built one. It's a big undertaking,
       | but benefits are undeniable.
       | 
       | - Good query processor for analytics (Snowflake, Velox,
       | Singlestore)
       | 
       | - Open source. Especially in OLTP open source == trust
       | 
       | - HTAP. Can run mixed workload (Singlestore, Unistore): both OLTP
       | (apps) and OLAP (reporting). This has always been a dream, but we
       | still live in the world of dedicated systems: E.g. Snowflake and
       | Postgres.
       | 
       | - Shared nothing sharding (Vitess). This is the most
       | controversial as you lose compatibility with the mothership
       | (MySQL for Vitess). So it's unclear this will be the dominant
       | architecture in the future. I think the world may get to "dynamic
       | sharding" where storage stays separate and compute can be
       | multinode and the user can easily and instantly change the number
       | of nodes.
        
         | emptysea wrote:
         | I'm not well versed in the tech but is Spanner the sort of
         | dynamic sharding approach your talking about?
        
           | gavinray wrote:
           | Scylla also uses a shared-nothing sharding approach, they
           | have an open-source "asynchronous I/O engine" toolkit that is
           | used inside of ScyllaDB called Seastar
           | 
           | They have some good reading on this approach in the Seastar
           | docs if you're curious, it's a great general introduction to
           | Shared Nothing:
           | 
           | https://seastar.io/shared-
           | nothing/#:~:text=The%20Seastar%20M....
           | 
           | https://www.linuxfoundation.org/webinars/under-the-hood-
           | of-a...
           | 
           | This is usually generalized to some variation of "Physical-
           | OS-thread-per-CPU-core" architecture, where each CPU core
           | gets it's own dedicated chunk of data. You use hashing or
           | some other mechanism to route requests for data to the right
           | core.
           | 
           | Each CPU core can have user-land concurrency or "virtual
           | threads" (fibers/coroutines/etc) as well, so you're not
           | limited to sequential code necessarily.
           | 
           | A combination of thread-per-core x userland-coroutines-per-
           | thread is how modern query processing approaches like
           | "Morsel-Driven Parallelism" work.                 CPU-1-OS-
           | THREAD-1         COROUTINE_1         COROUTINE_2
           | CPU-2-OS-THREAD-2        COROUTINE_35        ...
           | COROUTINE_102
           | 
           | https://db.in.tum.de/~leis/papers/morsels.pdf
        
         | swyx wrote:
         | since you were also a founder of Singlestore, just wondering -
         | what will be Neon's take on HTAP workloads?
         | 
         | trying to see how much you mean it that the industry is
         | "adopting (HTAP) across the board"
        
           | nikita wrote:
           | I wouldn't say HTAP is an across the board trend. The need is
           | mostly in the enterprise where we see high value mixed
           | workloads.
           | 
           | Neon will take a Postgres centric POV. I consider a few
           | approaches, but not committing to any particular at this
           | point. The reason to not commit is that there is a lot more
           | opportunity for Neon in serverless, devtools, and edge. And
           | it makes a ton more sense to partner for OLAP workloads. So
           | what is Postgres centric POV:
           | 
           | - Plugins like Timescale and Citus
           | 
           | - Seamless integration with sending CDC to Snowflake,
           | Singlestore, Databricks, Clickhouse, and Motherduck
           | 
           | - FDW for integration with Snowflake, Singlestore,
           | Clickhouse, Databricks, and Motherduck
           | 
           | - We looked at putting DuckDB on Neon storage with Jordan
           | Tigani, but it was too big of a lift as DuckDB doesn't push
           | all data through the WAL. Maybe in the future.
        
         | zX41ZdbW wrote:
         | The most developed analytical query engine as of today is
         | ClickHouse [1].
         | 
         | It is open-source, has the separation of storage and compute
         | and the best performance.
         | 
         | ClickHouse is a source of inspiration for many emerging
         | database engines: Velox, Doris, DuckDB, Datafusion...
         | 
         | [1] https://github.com/ClickHouse/ClickHouse
         | 
         | Disclaimer: I work on ClickHouse.
        
           | ddorian43 wrote:
           | > has the separation of storage and compute and the best
           | performance.
           | 
           | This part is not yet production ready though. And it's
           | unclear when it will be. Though no other open source OLAP
           | does it either AFAIK.
        
         | jandrewrogers wrote:
         | The separation of storage and compute is controversial, though
         | it is definitely convenient for cloud operators. It is
         | literally going backward in terms of the available storage
         | bandwidth, and many modern database architectures can very
         | effectively use all the storage bandwidth available in modern
         | systems. There is not enough network bandwidth, and won't be
         | for the foreseeable future, for running storage over networks
         | to be efficient. Database workloads are also starting to move
         | to edge environments, which make it even worse. The resource
         | mismatch currently leaves a lot of compute sitting idle in
         | practice.
         | 
         | Another issue with this model is that it tends to offer poor
         | ratios of bandwidth to storage size. This is at odds with the
         | desire to run mixed workloads in a single system.
        
           | glogla wrote:
           | The separation of compute and storage is most of all
           | convenient for whoever operates the database.
           | 
           | Need more compute, or scale it down when not in use? You can
           | add it in minutes, no re-balancing, no careful moving of
           | data. Need more storage? You have it on tap. Want to give
           | some group of users their dedicated compute? Or run
           | differently configured computes on top of one data, including
           | even using different compute engines for batch and for
           | interactive? You can.
           | 
           | Compare Snowflake "create warehouse", which can give you
           | compute that can immediately access any data, with a dance
           | you need to do to resize Clickhouse and decide which tables
           | are replicated in what way.
           | 
           | At the same time, everything you say is true, and that is why
           | system without storage-compute separation can give you order
           | of magnitude more performance for the same money than a
           | separated system can. If someone could solve the fragility
           | and hassle of it, it would be quite a winner.
        
           | nikita wrote:
           | > The resource mismatch currently leaves a lot of compute
           | sitting idle in practice.
           | 
           | When separate storage is multi tenant and compute is elastic
           | you can really squeeze more efficiency. With that you can
           | keep utilization extremely high, b/c storage is effectively
           | spread across the whole fleet. We really see this now running
           | 10K databases on a single (single per region really) storage
           | deployment.
           | 
           | Compute is also elastic, so you again can drive utilization
           | up. So my argument is that separation of storage and compute
           | is not necessarily for performance but rather for
           | utilization, convenience to operate, and additional
           | capabilities (branching, serverless).
        
           | AdamProut wrote:
           | There are hybrid designs for separation of storage and
           | compute that are aimed at mixed workloads[1]. They avoid
           | writes to remote storage on transaction commit (i.e., act
           | like a shared nothing databases for commits, but still push
           | data asynchronously to a shared remote disk that can be used
           | for scale up/point in time restores/branching).
           | 
           | [1] https://www.singlestore.com/blog/separating-storage-and-
           | comp...
           | 
           | (disclosure: SingleStoreDB CTO)
        
             | nikita wrote:
             | Adam, you still synchronously replicate each log record to
             | 2 places, right? Technically this should be roughly
             | equivalent to writing into a consensus. And use write
             | through cache for reads.
        
               | AdamProut wrote:
               | Yep, for writes network bandwidth usage is independent of
               | separation of storage of compute in some sense. Any
               | database that provides high availability is writing over
               | the network somewhere before it acks a transaction
               | committed. It matters who is on the other end of that
               | network write though. Take the typical Cloud DW (i.e.,
               | Snowflake) design of forcing writes to the blob storage
               | (shared remote disks) before commit. That is a much
               | higher latency write then what a high performance
               | transaction log replication protocol will do to replicate
               | a write to another host.
        
           | gavinray wrote:
           | > The separation of storage and compute is controversial
           | 
           | I don't think it's a controversial topic, it's an emerging
           | trend. See for example this slide from the recent RocksDB
           | meetup on their vision for the next 3 years of RocksDB's
           | future:
           | 
           | https://twitter.com/GavinRayDev/status/1600666127025156096
           | 
           | Disaggregated storage is becoming quite a hot topic, and
           | there are several technologies fueling this.
           | 
           | There are two major advances happening in terms of disk I/O
           | right now (that aren't to do with async IO uring stuff)
           | 
           | 1. NVMe Zones + NVMe FDP specs, which are ways for userland
           | applications to control placement of writes on drive (like a
           | software Flash-Translation-Layer)
           | 
           | 2. NVMe-over-Fabrics or similar technologies (with rdma/tcp)
           | enables the access of a remote disk with 'comparable' latency
           | to that of a local one.
           | 
           | E.g. if local disk access is 100us, AWS can give you 110us
           | with NVMe-OF.
           | 
           | These two advances support the growing trend of disaggregated
           | storage
           | 
           | ---
           | 
           | Full credit to Arun George from the Samsung Memory group for
           | explaining this to me.
        
             | jandrewrogers wrote:
             | The storage-over-fabric latency improvements allow it to
             | work well for software that was not particularly optimized
             | for storage performance, which has obvious benefits that I
             | am not arguing. There is a good reason storage-over-fabric
             | has been a popular database deployment model for decades,
             | it does have advantages, but this emerging trend is nothing
             | new nor are the issues that it has.
             | 
             | For systems that _are_ highly optimized for storage
             | performance, storage bandwidth tends to be more important
             | than storage latency and that is what is being sacrificed
             | for the sake of disaggregating storage. I see integer
             | factor reductions in workload throughput when deployed on
             | disaggregated storage versus the usually NVMe JBOD, and
             | that factor tracks real-world storage bandwidth pretty
             | well, particularly at high storage densities. At scale,
             | these economics start to matter quite a lot.
             | 
             | To be clear, disaggregated storage has no effect on
             | database design generally, seeing as how it has been widely
             | used for decades. It does put an artificial floor on
             | achievable throughput for many workloads due to the
             | bandwidth bottleneck. When people realize this then the
             | pendulum swings the other way (again), so I wouldn't want
             | to overfit for disaggregated storage.
        
               | gavinray wrote:
               | I didn't realize who I was talking to, just checked the
               | username
               | 
               | Would encourage other folks to do the same/read post
               | history
        
               | sitkack wrote:
               | The truthiness of your comments shouldn't change upon who
               | hears them?
        
               | gavinray wrote:
               | No, I still stand by my opinion that I don't find
               | disaggregated storage controversial
               | 
               | But if you're going to take an opinion on databases from
               | someone, jandrewrogers is probably a good person to take
               | them from. Me, on the other hand, I'm just a
               | hobbyist/enthusiast, I've no accolades under my belt and
               | my experience on the engineering end is shy of a year.
        
         | gavinray wrote:
         | Really enjoyed the talk from your engineer Heikki at Andy's CMU
         | "Databases!" seminar series earlier this year, great stuff =)
         | 
         | https://www.youtube.com/watch?v=rES0yzeERns
         | 
         | Also, it's been neat to have mostly worked our replacing our
         | old Heroku integration at Hasura with you folks.
        
       | Alifatisk wrote:
       | Is there a site that lists every db in every category?
       | 
       | Like Timescale, Clickhouse for time series & Mongodb for
       | documents etc
        
         | [deleted]
        
         | jsumrall wrote:
         | Check out https://db-engines.com/en/ranking
         | 
         | Note that you can select categories on the left side to limit
         | rankings to that category.
        
         | boberoni wrote:
         | Dbdb.io has a catalog of database tech. It is maintained by the
         | CMU database research group.
         | 
         | https://dbdb.io
        
           | apavlo wrote:
           | It's maintained by me (also the author of this article).
        
       | yingjunwu wrote:
       | I'm the founder of RisingWave Labs. Glad to know that our company
       | was mentioned in the blog.
       | 
       | Technology wise, agreed with Nikita that storage-compute-
       | decoupling is the trend. Severless will also be the main focus.
       | 
       | Market wise, I observed several trends:
       | 
       | * streaming. streamlining the data pipeline is a new trend.
       | RisingWave and Materialize are both addresssing this market. They
       | will probably need to figure out how to compete against
       | established solutions like KsqlDB and Apache Flink.
       | 
       | * real-time. There are probably 20+ vendors selling real-time
       | OLAP databases. They all claim that they are cheaper, faster,
       | better than Snowflake/Redshift. Just name a few: Druid, Pinot,
       | ClickHouse, StarRocks etc. Let's see who will be the winner in
       | 2023.
       | 
       | * small data. The biggest winner in this domain is Motherduck, a
       | startup commercializing DuckDB. Redshift/Snowflake are a overkill
       | for most companies who do not own large amount of data.
       | Motherduck may be the right choice.
       | 
       | * hybrid. Several companies have started selling hybrid databases
       | (mainly HTAP for now): Google AlloyDB, Snowflake Unistore,
       | SingleStore, PingCAP TiDB+TiFlash, etc.
       | 
       | In 2023, we will also watch how Databricks challenges Snowflake
       | in the data warehouse domain.
       | 
       | Best luck!
        
         | danielvaughn wrote:
         | Druid is indeed a bit faster than Snowflake, but a previous
         | team I worked on was able to achieve similar latencies under
         | heavy load using some clever techniques with materialized
         | views.
        
       | lioeters wrote:
       | Seems like Supabase deserves a mention, with their contributions
       | to the Postgres ecosystem.
       | 
       | https://supabase.com/
       | 
       | pg_graphql - https://supabase.com/blog/pg-graphql-v1
       | 
       | pg_jsonschema - https://supabase.com/blog/pg-jsonschema-a-
       | postgres-extension...
       | 
       | Postgres WASM - https://supabase.com/blog/postgres-wasm
       | 
       | ..And other open-source features.
        
         | gavinray wrote:
         | Supabase didn't do Postgres WASM first, wasmer did FYI (3 years
         | ago)
         | 
         | https://github.com/wasmerio/wasmer-postgres
        
           | kiwicopple wrote:
           | (supabase ceo)
           | 
           | these are two different things - wasmer runs WASM code
           | _inside_ postgres (as an extension), whereas our
           | implementation runs an entire Postgres server as WASM (eg,
           | you can run it in a browser: wasm.supabase.com)
           | 
           | We weren't the first to do this though - CrunchyData was.
           | Theirs wasn't open source, so we worked with Snaplet to build
           | an open source version and added networking:
           | https://news.ycombinator.com/item?id=33067962
        
             | gavinray wrote:
             | My mistake, I fully eat crow, and this is what I get for
             | not reading the thing I argued against!
        
         | apavlo wrote:
         | We want to write a paper on the vast extensions ecosystem for
         | Postgres at some point. It's on my TODO list for 2023.
         | 
         | Having so many plug-ins/add-ons available is unique to Postgres
         | and due to Stonebraker's original vision of including
         | extensibility as a first-class design principle in the DBMS.
        
           | dang wrote:
           | I didn't know that and it would make for an interesting
           | historical article in its own right.
        
         | danielvaughn wrote:
         | Also Planetscale
         | 
         | edit: I'm okay with being downvoted, but I'm honestly asking
         | why, not complaining. Is Planetscale not an interesting
         | database?
        
           | mritchie712 wrote:
           | fwiw, I don't get the downvotes either
        
           | viraptor wrote:
           | Possibly lack of context. Without the edit, I would assume
           | you're going for a version of "webscale" meme about postgres.
           | (Haven't heard of planetscale before)
        
             | danielvaughn wrote:
             | Ah interesting. Nope, definitely talking about a real
             | database. I think cloud native is interesting, and two
             | solutions I'm aware of in the space are CockroachDB and
             | Planetscale.
        
       | mythhouse wrote:
       | > Timescale's $110m series C, Voltron Data's $110m seed + series
       | A, and Dbt Labs's $222m series D.
       | 
       | weird that dbt is mentioned in 'big database' funding. Also
       | suprised to see the huge investment in dbt.
        
         | rorymalcolm wrote:
         | I think it makes sense in the context of the poential of plain
         | old SQL (and additions such as dbt) winning out vs Python or
         | other general purpose programming solutions as the analytics
         | workbench (although this is controversial, there's a school of
         | thought that backs it).
        
       | eximius wrote:
       | What I want is a new database that defines tables in terms of
       | _rich_ structures like proto, like Spanner.
       | 
       | If that doesn't exist in a couple years, maybe I'll try to build
       | one.
        
         | lukeramsden wrote:
         | > What I want is a new database that defines tables in terms of
         | rich structures like proto, like Spanner.
         | 
         | Do you mean like https://www.edgedb.com/? Or something more...
         | rich?
        
           | eximius wrote:
           | Well, that's certainly something. I haven't stumbled across
           | this before.
           | 
           | It's very interesting, but one reason I want it to be proto
           | is so you can re-use that for bindings with your
           | applications.
        
         | hardwaresofton wrote:
         | You might be interested in Project M36:
         | 
         | https://github.com/agentm/project-m36
        
       | [deleted]
        
       | saurabhnanda wrote:
       | What an amazingly written piece. I was learning interesting stuff
       | and chuckling alongside, as well.
       | 
       | Didn't get the closing remarks about Larry Ellison, though. Any
       | explanation?
        
         | SPBS wrote:
         | Oh great, it seems like this year he made the sarcasm about
         | Larry Ellison a lot easier to pick up. Frankly I was still
         | unsure how he really felt about Larry after his review from
         | last year[1]. After reading this, it confirms to me that he
         | doesn't like Larry :).
         | 
         | [1] https://news.ycombinator.com/item?id=29733729
        
         | friedman23 wrote:
         | The Larry Ellison remarks are written with a lot of sarcasm.
        
       | mildbyte wrote:
       | I mentioned it recently[0], but this looks like a very good topic
       | to plug our new database, Seafowl, that we released last year
       | [1]. It also uses Apache DataFusion (like IOx) and separates
       | storage and compute (like Neon, Snowflake etc) but is designed
       | for client-side Web apps to run analytical SQL queries over HTTP
       | (using semantics that make the query results cacheable by browser
       | caches and CDNs). This makes it really useful for things like
       | interactive visualizations or dashboards.
       | 
       | We're currently doing a lot of work at Splitgraph to reposition
       | the product around this "analytics at the edge" use case, with
       | usage-based billing, eventually moving our query execution from
       | PostgreSQL to Seafowl.
       | 
       | [0] https://news.ycombinator.com/item?id=34175545
       | 
       | [1] https://seafowl.io
        
         | ddorian43 wrote:
         | Your first link is not the correct one (click the timestamp of
         | the comment to get the exact link)
        
           | mildbyte wrote:
           | Thanks for the catch! Fixed.
        
       | getcrunk wrote:
       | I see the author links to his youtube playlist for his CMU
       | Databases class.
       | https://www.youtube.com/playlist?list=PLSE8ODhjZXjaKScG3l0nu...
       | 
       | Are there any good recommendations for a lecture or series of
       | lectures that go over Databases from a theoretical perspective
       | but also do a practical survey of the various DB systems and
       | technologies out there?
        
       | marsupialtail_2 wrote:
       | I am writing a Python based SQL query engine:
       | https://github.com/marsupialtail/quokka. My personal goal is to
       | get Andy Pavlo to mention it in his year-end blogs.
       | 
       | I agree with many of the points made in the blog by Andy. Writing
       | a distributed database has become way easier due to open source
       | components like Ray, Arrow, Velox, DuckDB, SQLGlot etc.
       | 
       | I personally believe we will see a switch from JVM based
       | technologies to Rust/C based with Python wrapper
        
         | apavlo wrote:
         | I saw your engine a week or so ago. Good stuff. I wanted to add
         | it to dbdb.io but I couldn't find an SVG version of this:
         | 
         | https://github.com/marsupialtail/quokka/blob/master/docs/doc...
         | 
         | Can you email it to me?
        
           | marsupialtail_2 wrote:
           | Just did.
        
       | bullen wrote:
       | I came back to MariaDB after leaving MySQL about a decade ago.
       | 
       | The JDBC driver is 10x as big and times-out after less than 12
       | hours default.
       | 
       | Things are not going in the right direction, but on the upside I
       | now learned how to install MariaDB manually so installers can't
       | brick my OS like MySQL did back in 2014 (MySQL bricked itself and
       | no way to unbrick it except reinstalling Windows).
        
         | KronisLV wrote:
         | > Things are not going in the right direction, but on the
         | upside I now learned how to install MariaDB manually so
         | installers can't brick my OS like MySQL did back in 2014 (MySQL
         | bricked itself and no way to unbrick it except reinstalling
         | Windows).
         | 
         | For local development and testing, running a database in a
         | container is generally a good idea:
         | https://hub.docker.com/_/mariadb/
         | 
         | You know, if you ever want to be able to create various
         | throwaway environments with resource limits or port mappings
         | easily, for any number of versions you might need to run in
         | each of your projects.
         | 
         | This works similarly well for something like PostgreSQL, too!
         | And if you need persistence, you can use either volumes or bind
         | mounts.
        
       | gavinray wrote:
       | Can we also start a general "Andy Pavlo" appreciation comment
       | chain?
       | 
       | Please comment below if Andy has in any way positively improved
       | your life or helped you further your education/career.
       | 
       | I can start:
       | 
       | - Early in 2022, I decided I wanted to learn how databases like
       | Postgres work and build one from scratch. If it wasn't for the
       | CMU lectures available to the public on YouTube, I don't think I
       | would have been able to give myself anywhere near a decent
       | education.
       | 
       | - This year, Andy even made the homework-grading platform
       | available to the public, so you could submit solutions to the
       | database coursework and get graded as if you were a student
       | attending CMU.
       | 
       | - Not only does Andy take the time to record, edit, and upload
       | his lectures to the public, but he hosts a yearly seminar series
       | where he interviews speakers from all over the database industry
       | (https://db.cs.cmu.edu/seminar2022)
       | 
       | - Andy does all he does (Professorship) on top of running
       | OtterTune, I have no idea how he functions. He truly is the
       | realest of all gangsters, and a secret member of the Wu-Tang
       | Clan. Andy makes databases fun.
        
         | [deleted]
        
         | polskibus wrote:
         | Can you point me to the homework-grading platform ? Can one use
         | it outside of standard course schedule ? It would be great to
         | do the task in my own pace.
        
           | gavinray wrote:
           | The homework solutions are submitted to a platform called
           | "Gradescope", see this tweet and the tweets above it in the
           | thread for more details:
           | 
           | https://twitter.com/andy_pavlo/status/1564305764730167303
           | 
           | Also, I should mention there is a very large (unofficial)
           | Discord server (+1,000 people) where some of the TA's from
           | this DB course and myself hang out if anyone is interested:
           | 
           | https://discord.gg/7HPufafdsW
           | 
           | And if you find database engineering interesting in general,
           | I might also invite you to come hang out in Phil Eaton's
           | hacker-space Discord, which is quite good:
           | 
           | https://discord.gg/SAqUyrPkY7
        
       | aanfhn wrote:
       | What are people's and the market take on Starburst? I haven't
       | kept up with databases and the general market, but recently heard
       | about them. Is there a sizeable demand and market for them? Any
       | other big players to watch for?
        
       ___________________________________________________________________
       (page generated 2023-01-02 23:00 UTC)