Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
35%
Traditional relational databases don’t limit the duration of a transaction,
35%
wait for human input.
35%
databases running 2PL can have quite unstable latencies,
35%
can be very slow at high percentiles
36%
Although deadlocks can happen with the lock-based read committed isolation level, they occur much more frequently under 2PL serializable isolation
36%
If deadlocks are frequent, this can mean significant wasted effort.
36%
a predicate lock
36%
it belongs to all objects that match some search condition,
36%
If transaction A wants to insert, update, or delete any object, it must first check whether either the old or the new value matches any existing predicate lock.
36%
a predicate lock applies even to objects that do not yet exist in the database, but which might be added in the future (phantoms).
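A rough sketch of the check described in the highlights above, assuming a meeting-room booking table as the example. `PredicateLock`, `conflicting_owners`, and the row fields (`room_id`, `start`, `end`) are hypothetical names chosen for illustration, not any database's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Row = dict  # assumption: a row is modeled as a plain dict


@dataclass
class PredicateLock:
    owner: str                      # transaction holding the lock
    matches: Callable[[Row], bool]  # the search condition the lock covers


def conflicting_owners(locks, writer, old: Optional[Row], new: Optional[Row]):
    """Return owners of predicate locks matched by the old or the new row value."""
    owners = []
    for lock in locks:
        if lock.owner == writer:
            continue  # a transaction does not conflict with its own lock
        if any(v is not None and lock.matches(v) for v in (old, new)):
            owners.append(lock.owner)
    return owners


# Transaction B searched for bookings of room 123 overlapping 12:00-13:00.
locks = [PredicateLock("B", lambda r: r["room_id"] == 123
                       and r["start"] < 13 and r["end"] > 12)]

# Transaction A inserts a row that did not exist when B searched (a phantom).
phantom = {"room_id": 123, "start": 12, "end": 13}
print(conflicting_owners(locks, "A", old=None, new=phantom))
```

Because the lock stores a condition rather than a set of rows, the insert is caught even though no matching row existed when B ran its query. The loop over every lock also hints at why this check gets expensive with many active transactions.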
36%
checking for matching locks becomes time-consuming.
36%
most databases with 2PL actually implement index-range locking
36%
next-key lo...
36%
It’s safe to simplify a predicate by making it match a greater set of objects.
36%
If there is no suitable index where a range lock can be attached, the database can fall back to a shared lock on the entire table.
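One way to picture the fallback described above, as a minimal sketch. `choose_lock`, the predicate layout, and the column names are assumptions for illustration only; real databases decide this inside the query planner and lock manager.

```python
def choose_lock(predicate: dict, indexed_columns: set) -> tuple:
    """Pick a coarser lock that still covers every row the predicate matches.

    predicate maps column name -> the value or range being searched for.
    Returns ("index", column, value) when an index can carry the lock,
    otherwise ("table",) as the fallback that covers everything.
    """
    for column, value in predicate.items():
        if column in indexed_columns:
            # Locking one index entry or range matches a superset of the predicate.
            return ("index", column, value)
    return ("table",)


search = {"room_id": 123, "time": (12, 13)}
print(choose_lock(search, indexed_columns={"room_id"}))
print(choose_lock(search, indexed_columns=set()))
```

Both results lock more than the original condition requires, which is safe but blocks more writers than a precise predicate lock would.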
36%
Serializable Snapshot Isolation (SSI)
36%
Are serializable isolation and good performance fundamentally at odds with each other?
36%
serializable snapshot isolation (SSI) is very promising.
36%
Today SSI is used both in single-node databases
36%
and distributed databases
36%
Two-phase locking is a so-called pessimistic concurrency control mechanism:
36%
By contrast, serializable snapshot isolation is an optimistic concurrency control technique.
36%
When a transaction wants to commit, the database checks whether anything bad happened
36%
if so, the transaction is aborted and has to be retried.
36%
Only transactions that executed serializably are a...
36%
It performs badly if there is high contention
36%
if there is enough spare capacity, and if contention between transactions is not too high, optimistic concurrency control techniques tend to perform better
36%
for example, if several transactions concurrently want to increment a counter, it doesn’t matter in which order the increments are applied
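A small check of the counter claim above: applying the same increments in every possible order always ends at the same value. `apply_increments` is a hypothetical helper, and the sketch models only the arithmetic, not real concurrent transactions.

```python
from itertools import permutations


def apply_increments(start, increments):
    """Apply each increment in the given order and return the final counter."""
    value = start
    for delta in increments:
        value += delta
    return value


increments = [1, 5, 10]
# Collect the final value for every possible ordering of the increments.
results = {apply_increments(0, order) for order in permutations(increments)}
print(results)  # a single distinct final value
```

Since every ordering gives the same result, such increments do not need to conflict with one another, provided the transactions do not also read the counter.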
36%
SSI is based on snapshot isolation
36%
techniques. On top of snapshot isolation, SSI adds an algorithm for detecting serialization conflicts among writes and determining which transactions to abort.
36%
under snapshot isolation, the result from the original query may no longer be up-to-date by the time the transaction commits,
36%
the transaction is taking an action based on a premise
36%
Later, when the transaction wants to commit, the original data may have changed — the premise may no longer be true.
36%
To be safe, the database needs to assume that any change in the query result (the premise) means that writes in that transaction may be invalid.
36%
causal dependency between the queries and the writes in the transaction.
36%
Detecting reads of a stale MVCC object version (uncommitted write occurred before the read)
36%
Detecting writes that affect prior reads
36%
When the transaction wants to commit, the database checks whether any of the ignored writes have now been committed. If so, the transaction must be aborted.
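The commit-time check above might look roughly like this. It is a simplified sketch: `Transaction`, `Database`, `read_ignoring`, and `try_commit` are hypothetical names, and refinements such as letting read-only transactions commit anyway are left out.

```python
class Transaction:
    def __init__(self, txid):
        self.txid = txid
        # Writers whose uncommitted writes this transaction skipped while reading.
        self.ignored_writers = set()


class Database:
    def __init__(self):
        self.committed = set()  # ids of committed transactions

    def read_ignoring(self, reader, writer_txid):
        """Record that reader skipped a write by a not-yet-committed writer."""
        if writer_txid not in self.committed:
            reader.ignored_writers.add(writer_txid)

    def try_commit(self, tx):
        """Abort (return False) if any ignored write has since been committed."""
        if tx.ignored_writers & self.committed:
            return False
        self.committed.add(tx.txid)
        return True


db = Database()
t42, t43 = Transaction(42), Transaction(43)
db.read_ignoring(t43, writer_txid=42)  # 43 reads a snapshot without 42's write
print(db.try_commit(t42))  # 42 commits first
print(db.try_commit(t43))  # 43's premise may be outdated, so it is aborted
```

Deferring the check to commit time avoids aborting transactions whose ignored writer never commits.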
36%
shift_id, the database can use the index entry 1234 to record the fact that transactions 42 and 43 read this data.
36%
a transaction writes to the database, it must look in the indexes for any other transactions that have recently read the affected data.
36%
it simply notifies the transactions that the data they read may no longer be up to date.
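A minimal sketch of the read tracking described in the last three highlights, reusing the index entry 1234 and transactions 42 and 43 from the text. `ReadTracker` and its methods are hypothetical names, and the tracking here never expires, unlike a real system, which can discard it once the readers have finished.

```python
from collections import defaultdict


class ReadTracker:
    def __init__(self):
        # index key -> ids of transactions that recently read that entry
        self.readers = defaultdict(set)

    def record_read(self, key, txid):
        self.readers[key].add(txid)

    def record_write(self, key, writer_txid):
        """Return the other transactions whose read of key may now be outdated.

        Nothing is blocked: the writer proceeds and the readers are only flagged.
        """
        return self.readers[key] - {writer_txid}


tracker = ReadTracker()
tracker.record_read(1234, 42)
tracker.record_read(1234, 43)
print(tracker.record_write(1234, writer_txid=42))  # transaction 43 is flagged
```

The flag acts like a tripwire rather than a lock: the database only decides whether to abort when a flagged transaction later tries to commit.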
36%
Less detailed tracking is faster, but may lead to more transactions being aborted than strictly necessary.
36%
depending on what else happened, it’s sometimes possible to prove that the result of the execution is nevertheless serializable.
36%
serializable snapshot isolation is not limited to the throughput of a single CPU core:
36%
Even though data may be partitioned across multiple machines, transactions can read and write data in multiple partitions while ensuring serializable isolation
36%
so SSI requires that read-write transactions be fairly short
36%
(long-running read-only transactions may be okay).
37%
A large class of errors is reduced down to a simple transaction abort, and the application just needs to try again.
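On the application side, "try again" can be as simple as the loop below. This is a sketch under assumptions: `SerializationFailure` and `run_with_retries` are hypothetical names, and a real database driver would raise its own error type for an aborted transaction.

```python
class SerializationFailure(Exception):
    """Stand-in for the error a driver raises when a transaction is aborted."""


def run_with_retries(txn_fn, max_attempts=3):
    """Run txn_fn, retrying when it is aborted; re-raise after the last attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise


calls = {"count": 0}


def flaky_transfer():
    # Simulated transaction: aborted on the first attempt, succeeds on the second.
    calls["count"] += 1
    if calls["count"] < 2:
        raise SerializationFailure("aborted, please retry")
    return "committed"


print(run_with_retries(flaky_transfer))
```

The whole transaction function is rerun, including its reads, because the premise behind the earlier writes may no longer hold.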
37%
Weak isolation levels protect against some of those anomalies but leave you, the application developer, to handle others manually (e.g., using explicit locking).
38%
In distributed systems, we are no longer operating in an idealized system model