Kindle Notes & Highlights
Read between August 2 - December 28, 2020
Causal consistency goes further: it needs to track causal dependencies across the entire database, not just for a single key. Version vectors can be generalized to do this. However, actually keeping track of all causal dependencies can become impracticable: it is not clear whether a write is causally dependent on all or only some of the reads that preceded it, and explicitly tracking everything that has been read would mean a large overhead.
Sequence numbers
If there is not a single leader, it is less clear how to generate sequence numbers for operations. Various methods are used in practice:
- Each node can generate its own independent set of sequence numbers; for example, with two nodes, one node can generate only odd numbers and the other only even numbers (sketched below).
- Each operation can be tagged with a timestamp from a time-of-day clock.
- You can preallocate blocks of sequence numbers.
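As a minimal sketch of the first option (class and variable names are hypothetical), a per-node generator can interleave number ranges so that two nodes never hand out the same number:

```python
# Hypothetical sketch: per-node sequence numbers. With two nodes, node 0
# hands out even numbers and node 1 hands out odd numbers, so the numbers
# never collide, but their order says nothing about causality.
class NodeSequenceGenerator:
    def __init__(self, node_id: int, num_nodes: int = 2):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.count = 0

    def next(self) -> int:
        seq = self.count * self.num_nodes + self.node_id
        self.count += 1
        return seq

node_a = NodeSequenceGenerator(node_id=0)  # generates 0, 2, 4, ...
node_b = NodeSequenceGenerator(node_id=1)  # generates 1, 3, 5, ...
```

If one node processes more operations than the other, its counter races ahead, so a lower sequence number does not imply an earlier operation.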
The causality problems occur because these sequence number generators do not correctly capture the ordering of operations across different nodes. However, there is actually a simple method for generating sequence numbers that is consistent with causality: the Lamport timestamp, which is simply a pair of (counter, node ID).
Two Lamport timestamps are ordered by counter value, and if the counter values are the same, the one with the greater node ID is the greater timestamp. Every node and every client keeps track of the maximum counter value it has seen so far, and includes that maximum on every request. When a node receives a request or response with a maximum counter value greater than its own counter value, it immediately increases its own counter to that maximum.
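A minimal sketch of this scheme (class name hypothetical), where the tie-breaking rule falls out of ordinary tuple comparison:

```python
# Hypothetical sketch of a Lamport clock: timestamps are (counter, node_id)
# pairs, and ties on the counter are broken by node ID.
class LamportClock:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.counter = 0

    def tick(self) -> tuple[int, int]:
        """Stamp a local operation or an outgoing message."""
        self.counter += 1
        return (self.counter, self.node_id)

    def observe(self, seen_counter: int) -> None:
        """On receiving a request or response carrying a counter value,
        advance the local counter to at least that value."""
        self.counter = max(self.counter, seen_counter)

a, b = LamportClock(node_id=1), LamportClock(node_id=2)
t1 = a.tick()      # (1, 1)
b.observe(t1[0])   # b learns about a's counter
t2 = b.tick()      # (2, 2): causally after t1, and indeed t2 > t1
```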
Although Lamport timestamps sometimes look similar to version vectors, they have a different purpose: version vectors can distinguish whether two operations are concurrent or whether one is causally dependent on the other, whereas Lamport timestamps always enforce a total ordering, from which you cannot tell whether two operations are concurrent or causally dependent. The advantage of Lamport timestamps over version vectors is that they are more compact.
However, in order to implement something like a uniqueness constraint for usernames, it is not sufficient to have a total ordering of operations; you also need to know when that order is finalized, because another node may concurrently be generating an operation with a lower timestamp that you have not yet seen.
This idea of knowing when your total order is finalized is captured in the topic of total order broadcast, also known as atomic broadcast. In a partitioned database with a single leader per partition, each partition maintains an ordering only of its own operations; total ordering across all partitions is possible, but requires additional coordination.
Total order broadcast is usually described as a protocol for exchanging messages between nodes. It requires that two safety properties always be satisfied:
- Reliable delivery: no messages are lost; if a message is delivered to one node, it is delivered to all nodes.
- Totally ordered delivery: messages are delivered to every node in the same order.
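To make the shape of the protocol concrete, here is a hypothetical interface sketch (the names are not from any particular library): a node submits messages with broadcast(), and the protocol invokes a delivery callback on every node, reliably and in one agreed order.

```python
# Hypothetical interface for total order broadcast. The two safety properties
# live in the contract of the delivery callback: every broadcast message is
# eventually delivered on every node (reliable delivery), and all nodes see
# deliveries in the same order (totally ordered delivery).
from abc import ABC, abstractmethod
from typing import Callable

class TotalOrderBroadcast(ABC):
    @abstractmethod
    def broadcast(self, message: bytes) -> None:
        """Submit a message for delivery on all nodes."""

    @abstractmethod
    def on_deliver(self, callback: Callable[[bytes], None]) -> None:
        """Register a callback invoked once per delivered message,
        in the single order agreed by all nodes."""
```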
A correct algorithm for total order broadcast must ensure that these properties are satisfied even if a node or the network is faulty. This is a hint that there is a strong connection between total order broadcast and consensus.
Total order broadcast is exactly what you need for database replication: if every message represents a write to the database, and every replica processes the same writes in the same order, then the replicas remain consistent with each other (a principle known as state machine replication). Similarly, total order broadcast can be used to implement serializable transactions: if every message represents a deterministic transaction to be executed as a stored procedure, and every node processes those messages in the same order, then the partitions and replicas of the database are kept consistent with each other.
Another way of looking at total order broadcast is that it is a way of creating a log: delivering a message is like appending to the log, and all nodes read the log and see the same sequence of messages in the same order.
Total order broadcast is also useful for implementing a lock service that provides fencing tokens: every request to acquire the lock is appended as a message to the log, and the message's position in the log can serve as the fencing token.
In a linearizable system there is a total order of operations, but the two guarantees are not quite the same. Total order broadcast is asynchronous: messages are guaranteed to be delivered reliably in a fixed order, but there is no guarantee about when a message will be delivered (so one recipient may lag behind the others). By contrast, linearizability is a recency guarantee: a read is guaranteed to see the latest value written.
Imagine that for every possible username, you have a linearizable register with an atomic compare-and-set operation. Every register initially has the value null (indicating that the username is not taken). If multiple users try to concurrently grab the same username, only one of the compare-and-set operations will succeed, because the others will see a value other than null. You can implement such a linearizable compare-and-set operation as follows, using total order broadcast as an append-only log:
1. Append a message to the log, tentatively indicating the username you want to claim.
2. Read the log, and wait for the message you appended to be delivered back to you.
3. Check for any messages claiming the username. If the first such message is your own, you have claimed it successfully; if it came from another user, you abort the operation.
Because log entries are delivered to all nodes in the same order, choosing the first of the conflicting writes as the winner and aborting later ones ensures that all nodes agree on whether a write was committed or aborted.
While this procedure ensures linearizable writes, it does not guarantee linearizable reads: if you read from a store that is asynchronously updated from the log, the value may be stale.
You can sequence reads through the log by appending a message, reading the log, and performing the actual read when the message is delivered back to you.
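Assuming some total order broadcast primitive (such as the interface sketched earlier), the username scheme and this read-sequencing option might look roughly as follows. The names are hypothetical, and a plain in-process list stands in for the shared log purely for illustration:

```python
# Sketch (hypothetical names): claiming usernames via a total order broadcast
# log. The "log" here is an in-process list; in a real system it would be the
# distributed total order broadcast primitive.
import uuid

class UsernameService:
    def __init__(self, log: list):
        self.log = log        # shared, totally ordered append-only log
        self.claimed = {}     # username -> message id of the winning claim
        self.delivered = 0    # how far this node has read the log

    def _catch_up(self) -> None:
        """Deliver log entries in order, applying the first claim per username."""
        while self.delivered < len(self.log):
            kind, username, msg_id = self.log[self.delivered]
            if kind == "claim" and username not in self.claimed:
                self.claimed[username] = msg_id   # first claim wins
            self.delivered += 1

    def claim_username(self, username: str) -> bool:
        """Linearizable compare-and-set: succeed only if ours is the first claim."""
        msg_id = uuid.uuid4().hex
        self.log.append(("claim", username, msg_id))  # step 1: append tentatively
        self._catch_up()                              # step 2: wait for own message
        return self.claimed.get(username) == msg_id   # step 3: did we win?

    def is_taken(self, username: str) -> bool:
        """Linearizable read: sequence the read through the log before answering."""
        self.log.append(("read", username, uuid.uuid4().hex))
        self._catch_up()                # wait for the read marker to come back
        return username in self.claimed

log = []
alice, bob = UsernameService(log), UsernameService(log)
print(alice.claim_username("martin"))  # True
print(bob.claim_username("martin"))    # False: alice's claim appears first
print(bob.is_taken("martin"))          # True
```

Waiting for your own message to come back is what ties the result to a single point in the total order, which is why both the write path and the read path go through the log.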
Turning it around, you can also build total order broadcast from linearizable storage. Assume you have a linearizable register that stores an integer and supports an atomic increment-and-get operation. The algorithm is then simple: for every message you want to send through total order broadcast, you increment-and-get the linearizable integer, and then attach the value you got from the register as a sequence number to the message; recipients deliver messages consecutively by sequence number. Unlike Lamport timestamps, the numbers you get from incrementing the linearizable register form a sequence with no gaps. Thus, if a node has delivered message 4 and receives an incoming message with a sequence number of 6, it knows that it must wait for message 5 before it can deliver message 6.
In fact, this is the key difference between total order broadcast and timestamp ordering.
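A sketch of that construction (hypothetical names, with a single in-process counter standing in for the linearizable register): senders stamp each message via increment-and-get, and recipients buffer anything that arrives ahead of a gap.

```python
# Sketch (hypothetical names) of total order broadcast built from a
# linearizable increment-and-get register: senders stamp each message with
# the next integer, and recipients deliver strictly by sequence number,
# holding back any message that arrives before its predecessors.
import threading

class LinearizableCounter:
    """Stand-in for a linearizable register with atomic increment-and-get."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment_and_get(self) -> int:
        with self._lock:
            self._value += 1
            return self._value

class Recipient:
    def __init__(self):
        self.next_expected = 1
        self.buffer = {}       # seq -> message, held back until gaps are filled
        self.delivered = []

    def receive(self, seq: int, message: str) -> None:
        self.buffer[seq] = message
        # Deliver consecutively: having delivered 4, wait for 5 before 6.
        while self.next_expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_expected))
            self.next_expected += 1

counter = LinearizableCounter()
node = Recipient()
m1 = (counter.increment_and_get(), "first")
m2 = (counter.increment_and_get(), "second")
m3 = (counter.increment_and_get(), "third")
node.receive(*m3)      # arrives early: buffered, not yet delivered
node.receive(*m1)      # delivered immediately
node.receive(*m2)      # fills the gap, so "second" and then "third" are delivered
print(node.delivered)  # ['first', 'second', 'third']
```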
The problem lies in handling the situation when network connections to the node holding that register are interrupted, and in restoring the value when that node fails. In general, if you think hard enough about linearizable sequence number generators, you inevitably end up with a consensus algorithm.
This is no coincidence: it can be proved that a linearizable compare-and-set (or increment-and-get) register and total order broadcast are both equivalent to consensus. That is, if you can solve one of these problems, you can transform it into a solution for the others.