Extending MVCC to be serializable, in TLA+
In the previous blog post, we saw how a transaction isolation strategy built on multi-version concurrency control (MVCC) does not implement the serializable isolation level. Instead, it implements a weaker isolation level called snapshot isolation. In this post, I’ll discuss how that MVCC model can be extended in order to achieve serializability, based on work published by Michael Cahill, Uwe Röhm, and Alan Fekete.
You can find the model I wrote in the https://github.com/lorin/snapshot-isolation-tla repo, in the SSI module (source, pdf).
A quick note on conventionsIn this post, I denote read of x=1 as r[x,1]. This means a transaction read the object x which returned a value of 1. As I mentioned in the previous post, you can imagine a read as being the following SQL statement:
SELECT v FROM obj WHERE k='x';Similarly, I denote a write of y←2 as w[y,2]. This means a transaction wrote the object y with a value of 2. You can imagine this as:
UPDATE obj SET v=2 WHERE k='y';Finally, I’ll assume that there’s an initial transaction (T0) which sets the values of all of the objects to 0, and has committed before any other transaction starts.
We assume this transaction always precedes all other transactionsBackgroundThe SQL isolation levels and phenomenaThe ANSI/ISO SQL standard defines four transaction isolation levels: read uncommitted, read committed, repeatable read, and serializable. The standard defines the isolation levels in terms of the phenomena they prevent. For example, the dirty read phenomenon is when one transaction reads a write done by a concurrent transaction that has not yet committed. Phenomena are dangerous because they may violate a software developer’s assumptions about how the database will behave, leading to software that behaves incorrectly.
Problems with the standard and a new isolation levelBerenson et al. noted that the standard’s wording is ambiguous, and of the two possible interpretations of the definitions, one was incorrect (permitting invalid execution histories) and the other was overly strict (proscribing valid execution histories).
The overly strict definition implicitly assumed that concurrency control would be implemented using locking, and this ruled out valid implementations based on alternate schemes, in particular, multi-version concurrency control. They also proposed a new isolation level: snapshot isolation.
Formalizing phenomena and anti-dependenciesIn his PhD dissertation work, Adya introduced a new formalization for reasoning about transaction isolation. The formalism is based on a graph of direct dependencies between transactions.
One type of dependency Adya introduced is called an anti-dependency, which is critical to the difference between snapshot isolation and serializable.
An anti-dependency between two concurrent transactions is when one read an object and the other writes the object with a different value, for example:
T1 is said to have an anti-dependency on T2: T1 must come before T2 in a serialization:
If T2 is sequenced before T1, then the read will not match the most recent write. Therefore, T1 must come before T2.In dependency graphs, anti-dependencies are labeled with rw because the transaction which does the read must be sequenced before the transaction that does the write, as shown above.
Adya demonstrated that for an implementation that supports snapshot isolation to generate execution histories that are not serializable, there must be a cycle in the dependency graph that includes an anti-dependency.
Non-serializable execution histories in snapshot isolationIn the paper Making Snapshot Isolation Serializable, Fekete et al. further narrowed the conditions under which snapshot isolation could lead to a non-serializable execution history, by proving the following theorem:
THEOREM 2.1. Suppose H is a multiversion history produced under Snapshot Isolation that is not serializable. Then there is at least one cycle in the serialization graph DSG(H), and we claim that in every cycle there are three consecutive transactions Ti.1, Ti.2, Ti.3 (where it is possible that Ti.1 and Ti.3 are the same transaction) such that Ti.1 and Ti.2 are concurrent, with an edge Ti.1 → Ti.2, and Ti.2 and Ti.3 are concurrent with an edge Ti.2 → Ti.3.
They also note:
By Lemma 2.3, both concurrent edges whose existence is asserted must be anti-dependencies:Ti.1→ Ti.2 and Ti.2→ Ti.3.
This means that one of the following two patterns must always be present in a snapshot isolation history that is not serializable:
Non-serializable snapshot isolation histories must contain one of these as subgraphs in the dependency graphModifying MVCC to avoid non-serializable historiesCahill et al. proposed a modification to MVCC that can dynamically identify potential problematic transactions that could lead to non-serializable histories, and abort them. By aborting these transactions, the resulting algorithm guarantees serializability.
As Fekete et al. proved, under snapshot isolation, cycles can only occur if there exists a transaction which contains an incoming anti-dependency edge and an outgoing anti-dependency edge, which they call pivot transactions.
Pivot transactions shown in redTheir approach is to identify and abort pivot transactions: if an active transaction contains both an outgoing and an incoming anti-dependency, the transaction is aborted. Note that this is a conservative algorithm: some of the transactions that it aborts may have still resulted in serializable execution histories. But it does guarantee serializability.
Their modification to MVCC involves some additional bookkeeping:
Reads performed by each transactionWhich transactions have outgoing anti-dependenciesWhich transactions have incoming anti-dependenciesThe tracking of reads is necessary to identify the presence of anti-dependencies, since an anti-dependency always involve a read (outgoing dependency edge) and a write (incoming dependency edge).
Extending our MVCC TLA+ model for serializabilityAdding variablesI created a new module called SSI, which stands for Serializable Snapshot Isolation. I extended the MVCC model to add three variables to implement the additional bookkeeping required by the Cahill et al. algorithm. MVCC already tracks which objects are written by each transaction, but we need to now also track reads.
rds – which objects are read by which transactionsoutc – set of transactions that have outbound anti-dependenciesinc – set of transactions that have inbound anti-dependencies
TLA+ is untyped (unless you’re using Apalache), but we can represent type information by defining a type invariant (above, called TypeOkS). Defining this is useful both for the reader, and because we can check that this holds with the TLC model checker.
Changes in behavior: new abort opportunitiesHere’s how the Next action in MVCC compares to the equivalent in SSI.
Note: Because extending the MVCC module brings all of the MVCC names into scope, I had to create new names for each of the equivalent actions in SSI, I did this by appending an S (e.g., StartTransactionS, DeadlockDetectionS).
In our original MVCC implementation, reads and commits always succeeded. Now, it’s possible for an attempted read or an attempted to commit to result in aborts as well, so we needed an action for this, which I called AbortRdS.
Commits can now also fail, so instead of having a single-step Commit action, we now have a BeginCommit action, which will complete successfully by an EndCommit action, or fail with an abort by the AbortCommit action. Writes can also now abort due to the potential for introducing pivot transactions.
Finding aborts with the model checkerHere’s how I used the TLC model checker to generate witnesses of the new abort behaviors:
Aborted readsTo get the model checker to generate a trace for an aborted read, I defined the following invariant in the MCSSI.tla file:
Then I specified it as an invariant to check in the model checker in the MCSSI.cfg file:
INVARIANT NeverAbortsReadBecause aborted reads can, indeed, happen, the model checker returned an error, with the following error trace:
The resulting trace looks like this, with the red arrows indicating the anti-dependencies.
Aborted commitSimilarly, we can use the model checker to identify scenarios where it a commit would fail, by specifying the following invariant:
The checker finds the following violation of that invariant:
While T2 is in the process of committing, T1 performs a read which turns T2 into a pivot transaction. This results in T2 aborting.
Checking serializability using refinement mappingJust like we did previously with MVCC, we can define a refinement mapping from our SSI spec to our Serializability spec. You can find it in the SSIRefinement module (source, pdf). It’s almost identical to the MVCCRefinement module (source, pdf), with some minor modifications to handle the new abort scenarios.
The main difference is that now the refinement mapping should actually hold, because SSI ensures serializability! I wasn’t able to find a counterexample when I ran the model checker against the refinement mapping, so that gave me some confidence in my model. Of course, that doesn’t prove that my implementation is correct. But it’s good enough for a learning exercise.
Coda: on extending TLA+ specificationsSerializable Snapshot Isolation provides us with a nice example of when we can extend an existing specification rather than create a new one from scratch.
Even so, it’s still a fair amount of work to extend an existing specification. I suspect it would have been less work to take a copy-paste-and-modify approach rather than extending it. Still, I found it a useful exercise in learning how to modify a specification by extending it.


