http://muratbuffalo.blogspot.com/2022/03/cockroachdb-resilient-geo-distributed.html

CockroachDB: The Resilient Geo-Distributed SQL Database

March 04, 2022

This paper appeared in SIGMOD 2020. There is a 10-minute, but extremely useful, SIGMOD presentation, and also a 1-hour extended presentation. CockroachDB is open source and available on GitHub. The core features of the database are under a Business Source License (BSL), which converts to a fully open-source Apache 2.0 license after three years.

Storage layer

CockroachDB (or CRDB for short) consists of a distributed SQL layer on top of a distributed key-value store. The key-value store is laid out as a single monolithic ordered key space. The logical key space is physically realized by dividing it into contiguous chunks of keys, called ranges. Ranges are about 64 megabytes in size. Ranges start empty, grow, split when they get too large, and merge with their neighbors when they get too small. The ranges are kept sorted; I will explain why in the SQL discussion later.

As we will discuss soon, CRDB uses multi-version concurrency control (MVCC). Hybrid logical clock (HLC) timestamps are used for versioning. Values are never updated in place, and tombstones are used for deletion. This way the multi-version store provides a snapshot to each transaction. CRDB uses RocksDB (now Pebble) for storing these key-value ranges at the storage layer.
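To make this concrete, here is a minimal Go sketch of a multi-versioned key-value map of this kind: each key holds multiple versions ordered by an HLC-style timestamp, deletes write tombstones, and a read returns the newest version at or below the read timestamp, which is what gives each transaction a consistent snapshot. This is only an illustration of the idea; the types and names are made up, not CockroachDB's actual storage code (which sits on top of Pebble).

```go
package main

import (
	"fmt"
	"sort"
)

// Timestamp is a simplified HLC timestamp: physical wall time plus a
// logical counter for ordering events within the same wall-clock tick.
type Timestamp struct {
	WallTime int64
	Logical  int32
}

func (t Timestamp) Less(o Timestamp) bool {
	if t.WallTime != o.WallTime {
		return t.WallTime < o.WallTime
	}
	return t.Logical < o.Logical
}

// version is one MVCC version of a key. A nil value marks a tombstone.
type version struct {
	ts    Timestamp
	value []byte // nil => deletion tombstone
}

// mvccStore keeps all versions of every key; values are never updated in place.
type mvccStore struct {
	data map[string][]version // versions kept sorted by descending timestamp
}

func newMVCCStore() *mvccStore { return &mvccStore{data: map[string][]version{}} }

// Put writes a new version at ts; Delete writes a tombstone at ts.
func (s *mvccStore) Put(key string, ts Timestamp, val []byte) {
	vs := append(s.data[key], version{ts: ts, value: val})
	sort.Slice(vs, func(i, j int) bool { return vs[j].ts.Less(vs[i].ts) })
	s.data[key] = vs
}

func (s *mvccStore) Delete(key string, ts Timestamp) { s.Put(key, ts, nil) }

// Get returns the newest version visible at the read timestamp, giving each
// transaction a consistent snapshot of the store.
func (s *mvccStore) Get(key string, readTS Timestamp) ([]byte, bool) {
	for _, v := range s.data[key] {
		if !readTS.Less(v.ts) { // v.ts <= readTS
			return v.value, v.value != nil
		}
	}
	return nil, false
}

func main() {
	s := newMVCCStore()
	s.Put("dog/1", Timestamp{WallTime: 10}, []byte("Sunny"))
	s.Put("dog/1", Timestamp{WallTime: 20}, []byte("Sunny v2"))
	v, ok := s.Get("dog/1", Timestamp{WallTime: 15})
	fmt.Println(string(v), ok) // "Sunny" true: a snapshot at ts=15 sees the older version
}
```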
Ranges are the unit of replication. CRDB uses the Raft consensus protocol for replication, and each range is a Raft group. Again, each range is a Raft group: replication is not at the level of nodes but at the level of ranges. This provides fine-grained control over data placement. For example, here you can see a database of dogs, where the orange range is a Raft group replicated over the 3 nodes it is currently assigned to.

Raft provides a useful building block: atomic replication. Commands are proposed by the leaseholder, which you can assume is the same as the Raft leader. A command is accepted when a quorum of replicas acknowledge it.

Distributed transactions

Transactions can span arbitrary ranges, and they follow a conversational protocol to support SQL, where the full set of operations may not be known up front. Transactions in CockroachDB use the serializable isolation level. To guarantee atomicity even for transactions that span multiple ranges, CockroachDB takes advantage of the range-level atomicity that Raft provides. Each transaction is associated with a transaction record, which is itself stored in a range. Just like other data updates, writes to the transaction record go through Raft, which is how atomicity is supported for multi-range transactions.

Let's look at this simple example from the presentation to understand how transactions work. Consider the older, suboptimal version of the protocol first. This is basically 2PC over Paxos groups. An important point is the use of a transaction record that is flipped atomically to maintain the status of the transaction.

The example shows a cluster with three ranges spread across four nodes. The leaseholders for each range are highlighted with a black outline. The insert statement inserts two rows, Sunny and Ozzie, into our dogs table. To begin, the client connects to a gateway node, which connects to the leaseholder for the range containing Sunny, since Sunny is the first key written as part of the transaction. The transaction record is created on the range containing Sunny. To replicate the transaction record, the leaseholder proposes a Raft command which writes the record to itself and the follower replicas. Once the leaseholder and at least one of the followers accept the creation of the transaction record, the transaction is in progress. Next, the same leaseholder proposes a Raft command that writes Sunny to itself and the followers. It again waits for a quorum before moving on.

Next we write Ozzie. The leaseholder for Ozzie proposes a Raft command to write the record to itself and its followers; one of the followers acknowledges, and since we have a quorum the write is complete. As a final step, the gateway node commits the transaction by updating the transaction record from pending to committed via another round of consensus.

Optimized transactions

The above protocol requires multiple consensus round trips. CockroachDB's protocol evolved into one that can commit a distributed transaction at the latency of a single consensus round trip, with additional asynchronous round trips to perform cleanup. The idea is to pipeline the writes, write the transaction record with a "staging" status, and flip it to committed once quorums acknowledge the writes.

Here is the explanation from the presentation. The first write combines two operations: begin transaction and write Sunny. With the serial protocol we had to wait for each write to complete before initiating the next one: write the transaction record, wait for it to complete; write Sunny, wait for it to complete. With pipelining we can fire off the write to Sunny without waiting for it to complete. The transaction record is also not written until later. This is safe because there is a small grace window where the transaction can have a non-existent transaction record, and that means it is in progress rather than aborted. With pipelining, we can initiate a second write before waiting for the first to complete, and the commit is also pipelined. But in this case we need to mark the transaction record as staging rather than committed. Staging basically means that the transaction is finished, but it is not clear yet whether the commit was successful. Once all the writes complete successfully, the gateway node knows that the commit was successful and can return to the client.
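Below is a rough Go sketch of this pipelined commit flow, under the assumption that a proposeWrite call stands in for replicating an intent write through Raft: the gateway fires off all intent writes, marks the transaction record as staging in parallel, and acknowledges the client as soon as every write has been acknowledged by its quorum; the flip to committed (and intent cleanup) happens asynchronously. The names and structure are illustrative, not CockroachDB's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

type txnStatus int

const (
	PENDING txnStatus = iota
	STAGING
	COMMITTED
	ABORTED
)

// TxnRecord is the per-transaction record whose status is flipped atomically
// (via Raft, in the real system) to decide the fate of all the txn's intents.
type TxnRecord struct {
	ID       string
	Status   txnStatus
	InFlight []string // keys whose intent writes must succeed for an implicit commit
}

// proposeWrite stands in for a Raft proposal that replicates an intent write
// to a quorum of the range's replicas. Here it always succeeds.
func proposeWrite(key, value string) error {
	fmt.Printf("replicated intent %s=%s to a quorum\n", key, value)
	return nil
}

// commitPipelined fires off the intent writes without waiting on each other,
// records the transaction as staging, and returns once all writes succeed.
func commitPipelined(rec *TxnRecord, writes map[string]string) error {
	var wg sync.WaitGroup
	errs := make(chan error, len(writes))

	for k, v := range writes {
		rec.InFlight = append(rec.InFlight, k)
		wg.Add(1)
		go func(k, v string) {
			defer wg.Done()
			errs <- proposeWrite(k, v)
		}(k, v)
	}
	// The staging record is written in parallel with the intents (in the real
	// system this is itself a replicated write listing the in-flight keys).
	rec.Status = STAGING

	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			rec.Status = ABORTED
			return err
		}
	}
	// All writes succeeded: the transaction is implicitly committed and the
	// client can be acknowledged now. The explicit flip to COMMITTED and the
	// cleanup of intents happen asynchronously in the real system.
	rec.Status = COMMITTED
	return nil
}

func main() {
	rec := &TxnRecord{ID: "txn-1"} // zero value of Status is PENDING
	_ = commitPipelined(rec, map[string]string{"dog/1": "Sunny", "dog/2": "Ozzie"})
	fmt.Println("acknowledged commit to client")
}
```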
Concurrency control

To achieve serializable transactions, concurrency control is required to detect conflicts and reorder transactions as needed. Below I use the description from the paper, in order not to make any mistakes in my description.

An atomic commit for a transaction is achieved by considering all of its writes provisional until commit time. CRDB calls these provisional values write intents. An intent is a regular MVCC KV pair, except that it is preceded by metadata indicating that what follows is an intent. This metadata points to a transaction record, which is a special key (unique per transaction) that stores the current disposition of the transaction: pending, staging, committed, or aborted. The transaction record serves to atomically change the visibility of all the intents at once, and is durably stored in the same range as the first write of the transaction.

As mentioned before, CRDB is an MVCC system and each transaction performs its reads and writes at its commit timestamp. This results in a total ordering of all transactions in the system, representing a serializable execution. However, conflicts between transactions may require adjustments of the commit timestamp. The situations in which they arise are described below; whenever the commit timestamp does change, the transaction typically tries to prove that its prior reads remain valid at the new timestamp (Section 3.4 of the paper), in which case it can simply continue forward at the updated timestamp.

Write-read conflicts. A read running into an uncommitted intent with a lower timestamp will wait for the earlier transaction to finalize. Waiting is implemented using in-memory queue structures. A read running into an uncommitted intent with a higher timestamp ignores the intent and does not need to wait.

Read-write conflicts. A write to a key at timestamp ta cannot be performed if there has already been a read on the same key at a higher timestamp tb >= ta. CRDB forces the writing transaction to advance its commit timestamp past tb.

Write-write conflicts. A write running into an uncommitted intent with a lower timestamp will wait for the earlier transaction to finalize (similar to write-read conflicts). If it runs into a committed value at a higher timestamp, it advances its timestamp past it (similar to read-write conflicts). Write-write conflicts may also lead to deadlocks in cases where different transactions have written intents in different orders. CRDB employs a distributed deadlock-detection algorithm to abort one transaction from a cycle of waiters.

Certain types of conflicts described above require advancing the commit timestamp of a transaction. To maintain serializability, the read timestamp must be advanced to match the commit timestamp. Advancing a transaction's read timestamp from ta to tb > ta is possible if we can prove that none of the data that the transaction read at ta has been updated in the interval (ta, tb]. If the data has changed, the transaction needs to be restarted. To determine whether the read timestamp can be advanced, CRDB maintains the set of keys in the transaction's read set (up to a memory budget). A "read refresh" request validates that the keys have not been updated in a given timestamp interval (Algorithm 1, lines 11 to 14 in the paper). This involves re-scanning the read set and checking whether any MVCC values fall in the given interval.

Advancing the transaction's read timestamp is also required when a scan encounters an uncertain value: a value on a key at a timestamp above the transaction's provisional commit timestamp but within its uncertainty interval. In this case the transaction performs an uncertainty restart, moving its provisional commit timestamp above the uncertain value but keeping the upper bound of its uncertainty interval fixed. This corresponds to treating all values in a transaction's uncertainty window as past writes. As a result, the operations on each key performed by transactions take place in an order consistent with the real-time ordering of those transactions.
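The read-refresh check described above can be phrased as a small validation loop over the transaction's read set, in the spirit of the paper's Algorithm 1: the read timestamp can be advanced from ta to tb only if no key in the read set has a committed version in the interval (ta, tb]. A minimal Go sketch, with simplified timestamps and a toy version store (the names are mine, not CockroachDB's):

```go
package main

import "fmt"

// Timestamp is a simplified timestamp (wall time only, for brevity).
type Timestamp int64

// versions maps each key to the timestamps of its committed MVCC versions.
type versions map[string][]Timestamp

// refreshReads checks whether a transaction that read the keys in readSet at
// timestamp ta can advance its read timestamp to tb > ta. This is allowed only
// if none of those keys has a committed version in the interval (ta, tb].
func refreshReads(store versions, readSet []string, ta, tb Timestamp) bool {
	for _, key := range readSet {
		for _, ts := range store[key] {
			if ts > ta && ts <= tb {
				return false // a prior read would change at tb: refresh fails, restart the txn
			}
		}
	}
	return true // prior reads are still valid at tb: continue at the new timestamp
}

func main() {
	store := versions{
		"dog/1": {5, 12}, // written at ts 5 and again at ts 12
		"dog/2": {7},
	}
	// A txn read dog/2 at ta=10 and now must push its commit timestamp to tb=20.
	fmt.Println(refreshReads(store, []string{"dog/2"}, 10, 20)) // true: dog/2 unchanged in (10, 20]
	// The same push fails if the txn had also read dog/1, which changed at ts 12.
	fmt.Println(refreshReads(store, []string{"dog/1", "dog/2"}, 10, 20)) // false
}
```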
Spanner versus CockroachDB

The CRDB commit protocol is similar to Spanner's commit protocol (2012) in that Spanner is also 2PL + 2PC over replication groups. However, in Spanner the commit timestamp does not move up: Spanner sets it once the locks are taken, and then the transaction commit waits out the clock uncertainty of the commit timestamp as needed. Spanner can afford to do this by using atomic clocks with very small uncertainty intervals. CRDB does not rely on specialized hardware for clock synchronization, so it can run on off-the-shelf servers in public and private clouds with software-level clock synchronization services such as NTP. Each node within a CRDB cluster maintains a hybrid logical clock (HLC), which provides timestamps that are a combination of physical and logical time. HLCs provide causality tracking through their logical component upon each inter-node exchange, strict monotonicity within and across restarts on a single node, and self-stabilization in the presence of isolated transient clock skew fluctuations. However, since the physical clock synchronization uncertainty can be high, CRDB still requires the read-refresh and uncertainty-restart techniques above to ensure serializability of transactions.

SQL layer

OK, let's start by revisiting the question of why CRDB maintains sorted ranges. Sorting is useful for SQL, in order to do efficient range scans for SQL queries using indexing. The range index (the addressing information for locating ranges) is itself maintained as a range; it is treated specially and kept as the first range in the system. When a range splits, the same transaction that updates the range also updates this index.

To run SQL over CRDB, SQL's tabular data is mapped to key-value data, with keys of the form /table/index/key. For example, /inventory/primary/1. The key prefix (table and index) is stored in a compact encoded form for storage savings. Let's consider a secondary index, and examples of how it maps to keys (the primary key is appended as a suffix to keep index keys unique):

INDEX name_idx(name)
/inventory/name_idx/"Bat"/1
/inventory/name_idx/"Bat"/4
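Here is a toy Go encoder that mirrors this mapping, producing primary-index keys like /inventory/primary/1 and secondary-index keys like /inventory/name_idx/"Bat"/1 (with the primary key appended to keep index entries unique). It is only illustrative; real CockroachDB keys use numeric table and index IDs and an order-preserving binary encoding rather than printable strings.

```go
package main

import "fmt"

// Row is a row of a toy "inventory" table with a primary key id and a name.
type Row struct {
	ID   int
	Name string
}

// primaryKey encodes the row's primary index entry: /table/primary/<id>.
func primaryKey(table string, r Row) string {
	return fmt.Sprintf("/%s/primary/%d", table, r.ID)
}

// secondaryKey encodes an entry for a secondary index on name:
// /table/<index>/<name>/<id>. The primary key id is appended as a suffix so
// that two rows with the same name still get distinct index keys.
func secondaryKey(table, index string, r Row) string {
	return fmt.Sprintf("/%s/%s/%q/%d", table, index, r.Name, r.ID)
}

func main() {
	rows := []Row{{1, "Bat"}, {4, "Bat"}}
	for _, r := range rows {
		fmt.Println(primaryKey("inventory", r))
		fmt.Println(secondaryKey("inventory", "name_idx", r))
	}
	// Output:
	// /inventory/primary/1
	// /inventory/name_idx/"Bat"/1
	// /inventory/primary/4
	// /inventory/name_idx/"Bat"/4
}
```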
The SQL execution layer turns a transaction into a series of key-value operations. It has the following phases: Scan -> Filter -> Project -> Results. If there is an index, the filter gets pushed down into the scan. The SQL query optimizer has the phases Parse -> Prep -> Search -> Execute, and it can do cost-based search.

Lessons learned

The paper has an interesting lessons learned section. CockroachDB doesn't support any isolation level lower than serializable. The paper says the following: "Since CRDB was primarily designed for SERIALIZABLE, we initially expected that offering just snapshot isolation by removing the check for write skews would be simple. However, this proved not to be the case. The only safe mechanism to enforce strong consistency under snapshot isolation is pessimistic locking, via the explicit locking modifiers FOR SHARE and FOR UPDATE on queries. To guarantee strong consistency across concurrent mixed isolation levels, CRDB would need to introduce pessimistic locking for any row updates, even for SERIALIZABLE transactions. To avoid this pessimization of the common path, we opted to eschew true support for SNAPSHOT, keeping it as an alias to SERIALIZABLE instead."

The long presentation includes a passing comment: "Raft is not any easier to implement than another Paxos protocol." The paper has a similar comment: "We initially chose Raft as the consensus algorithm for CRDB due to its supposed ease of use and the precise description of its implementation. In practice, we have found there are several challenges in using Raft in a complex system like CRDB."

The lessons learned section discusses many Raft improvements. They make Raft less chatty by coalescing the heartbeat messages into one per node to save on per-RPC overhead, and by pausing Raft groups that have seen no recent write activity. They also implemented a change to the Raft reconfiguration protocol, called Joint Consensus, which is similar to ZooKeeper's reconfiguration protocol.

The discussion about PostgreSQL compatibility is also interesting: "We chose to adopt PostgreSQL's SQL dialect and network protocol in CRDB to capitalize on the ecosystem of client drivers. This choice initially boosted adoption and still results today in enhanced focus and decision-making in the engineering team. However, CRDB behaves differently from PostgreSQL in ways that require intervention in client-side code. For example, clients must perform transaction retries after an MVCC conflict and configure result paging. Reusing PostgreSQL drivers as-is requires us to teach developers how to deploy CRDB-specific code at a higher level, anew in every application. This is a recurring source of friction which we had not anticipated. As a result, we are now considering the gradual introduction of CRDB-specific client drivers."

Labels: databases, distributed transactions, distSQL, newsql

Comments

Tobin Baker said...
I don't know how the claim that snapshot isolation requires pessimistic concurrency control ever got through review. There are plenty of MVCC databases that use OCC (I know because I've written one, but there are many others). To claim the contrary is not just inaccurate; it shows a misunderstanding of elementary database principles. There are legitimate reasons to use PCC rather than OCC for an MVCC database (e.g., for reducing wasted work due to aborts in highly contended workloads), but it is not remotely a requirement.
March 5, 2022 at 1:02 PM