Kindle Notes & Highlights
Read between October 21 - November 26, 2024
Pretending that replication is synchronous when in fact it is asynchronous is a recipe for problems down the line.
A natural extension of the leader-based replication model is to allow more than one node to accept writes.
It rarely makes sense to use a multi-leader setup within a single datacenter, because the benefits rarely outweigh the added complexity.
With a normal leader-based replication setup, the leader has to be in one of the datacenters, and all writes must go through that datacenter.
In a multi-leader configuration, you can have a leader in each datacenter.
In a single-leader configuration, every write must go over the internet to the datacenter with the leader.
In a multi-leader configuration, every write can be processed in the local datacenter and is replicated asynchronously to the other datacenters.
If you make any changes while you are offline, they need to be synced with a server and your other devices when the device is next online.
The biggest problem with multi-leader replication is that write conflicts can occur, which means that conflict resolution is required.
If you want synchronous conflict detection, you might as well just use single-leader replication.
In a multi-leader configuration, there is no defined ordering of writes, so it’s not clear what the final value should be.
As the most appropriate way of resolving a conflict may depend on the application, most multi-leader replication tools let you write conflict resolution logic using application code.
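A minimal sketch of what such application-level conflict resolution can look like. The resolver below is illustrative (not any particular database's API): the database hands the application every conflicting version of a value, and the application returns a single winner using a deterministic rule, so all replicas converge on the same result.

```python
# Hypothetical on-read conflict resolver: given all conflicting
# versions ("siblings") of a value, pick one deterministically.
# The rule here (longest string, ties broken lexicographically) is
# arbitrary; what matters is that every replica applies the same rule.

def resolve_conflict(siblings):
    return max(siblings, key=lambda value: (len(value), value))

resolve_conflict(["B", "C"])          # "C" (tie on length, "C" > "B")
resolve_conflict(["draft", "final2"])  # "final2" (longer)
```

A deterministic rule is essential: if two replicas resolved the same siblings differently, the conflict would never actually converge.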
Even if the application checks availability before allowing a user to make a booking, there can be a conflict if the two bookings are made on two different leaders.
A replication topology describes the communication paths along which writes are propagated from one node to another.
With more than two leaders, various different topologies are possible.
In circular and star topologies, a write may need to pass through several nodes before it reaches all replicas.
When a node receives a data change that is tagged with its own identifier, that data change is ignored, because the node knows that it has already been processed.
A leader determines the order in which writes should be processed, and followers apply the leader’s writes in the same order.
In some leaderless implementations, the client directly sends its writes to several replicas, while in others, a coordinator node does this on behalf of the client.
The client simply ignores the fact that one of the replicas missed the write.
The replication system should ensure that eventually all the data is copied to every replica.
After an unavailable node comes back online, how does it catch up on the writes that it missed?
Note that without an anti-entropy process, values that are rarely read may be missing from some replicas and thus have reduced durability, because read repair is only performed when a value is read by the application.
As long as w + r > n, we expect to get an up-to-date value when reading, because at least one of the r nodes we’re reading from must be up to date.
If w + r > n, at least one of the r replicas you read from must have seen the most recent successful write.
With a smaller w and r you are more likely to read stale values, because it’s more likely that your read didn’t include the node with the latest value.
However, even with w + r > n, there are likely to be edge cases where stale values are returned.
Stronger guarantees generally require transactions or consensus.
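The quorum arithmetic behind the last few highlights can be checked in a couple of lines: with n replicas, w write acknowledgements, and r read responses, w + r > n means the read set and the write set must overlap in at least one replica.

```python
# The quorum overlap condition: reads and writes share at least one
# replica whenever w + r > n (pigeonhole argument).

def quorum_overlaps(n, w, r):
    return w + r > n

# Common choice for n = 3: majority quorums on both sides.
assert quorum_overlaps(n=3, w=2, r=2)

# With w = 1 and r = 1 the two sets can be disjoint, so stale reads
# are possible -- the trade-off described above.
assert not quorum_overlaps(n=3, w=1, r=1)
```

Note that, as the highlight says, satisfying the inequality still leaves edge cases (sloppy quorums, concurrent writes, partial write failures) where stale values can be returned.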
The only safe way of using a database with LWW is to ensure that a key is only written once and thereafter treated as immutable, thus avoiding any concurrent updates to the same key.
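A sketch of why LWW is lossy: each write carries a timestamp, and when writes conflict, only the one with the highest timestamp survives; the others are silently discarded. (The function and data shapes here are illustrative, not a specific database's API.)

```python
# Last write wins (LWW): keep only the write with the highest
# timestamp. Concurrent writes with lower timestamps are silently
# dropped -- which is why LWW is only safe when each key is written
# once and then treated as immutable.

def lww_resolve(writes):
    """writes: list of (timestamp, value); returns the surviving value."""
    return max(writes, key=lambda tv: tv[0])[1]

lww_resolve([(101, "A"), (105, "B"), (103, "C")])  # "B" survives; "A", "C" are lost
```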
An operation A happens before another operation B if B knows about A, or depends on A, or builds upon A in some way.
If one operation happened before another, the later operation should overwrite the earlier operation, but if the operations are concurrent, we have a conflict that needs to be resolved.
For defining concurrency, exact time doesn’t matter: we simply call two operations concurrent if they are both unaware of each other, regardless of the physical time at which they occurred.
When a write includes the version number from a prior read, that tells us which previous state the write is based on.
With the example of a shopping cart, a reasonable approach to merging siblings is to just take the union.
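The version-number scheme and the union merge from the last two highlights can be sketched together (class and method names are illustrative): each write carries the version number from the client's prior read, the server keeps values from concurrent writes as siblings, and the client merges siblings by taking the union of the cart items.

```python
# Sketch of single-replica versioning with siblings, applied to a
# shopping cart. A write overwrites only the siblings at or below the
# version it was based on; anything written concurrently is kept.

class CartServer:
    def __init__(self):
        self.version = 0
        self.siblings = {}  # version -> frozenset of cart items

    def write(self, cart, based_on_version):
        # Drop only the siblings this write claims to supersede.
        self.siblings = {v: c for v, c in self.siblings.items()
                         if v > based_on_version}
        self.version += 1
        self.siblings[self.version] = frozenset(cart)
        return self.version

    def read(self):
        return self.version, list(self.siblings.values())

def merge_carts(siblings):
    # Client-side merge: take the union of all sibling carts.
    return set().union(*siblings)

server = CartServer()
server.write({"milk"}, based_on_version=0)  # client 1
server.write({"eggs"}, based_on_version=0)  # client 2, concurrent
version, siblings = server.read()           # two siblings survive
cart = merge_carts(siblings)                # {"milk", "eggs"}
```

Taking the union works for additions but not for removals: if one sibling has deleted an item, the union resurrects it, which is why real systems record deletions with tombstones rather than by simply removing the item.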
Each replica increments its own version number when processing a write, and also keeps track of the version numbers it has seen from each of the other replicas.
The version vector allows the database to distinguish between overwrites and concurrent writes.
A version vector is sometimes also called a vector clock, even though they are not quite the same.
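A rough sketch of the comparison a version vector enables (illustrative code, not a specific database's implementation): write A happened before write B if every per-replica counter in A is less than or equal to the corresponding counter in B; if neither vector dominates the other, the writes are concurrent and both values must be kept as siblings.

```python
# Comparing two version vectors (dicts mapping replica id -> counter).

def compare(a, b):
    replicas = set(a) | set(b)
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in replicas)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in replicas)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a happened before b"   # safe to overwrite a with b
    if b_le_a:
        return "b happened before a"
    return "concurrent"                # keep both as siblings

compare({"r1": 2, "r2": 1}, {"r1": 3, "r2": 1})  # a happened before b
compare({"r1": 2, "r2": 0}, {"r1": 1, "r2": 3})  # concurrent
```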
If a leader fails and you promote an asynchronously updated follower to be the new leader, recently committed data may be lost.
In effect, each partition is a small database of its own, although the database may support operations that touch multiple partitions at the same time.
The main reason for wanting to partition data is scalability.
Partitioning is usually combined with replication so that copies of each partition are stored on multiple nodes.
A node may store more than one partition.
Each partition’s leader is assigned to one node, and its followers are assigned to other nodes.
Our goal with partitioning is to spread the data and the query load evenly across nodes.
A partition with disproportionately high load is called a hot spot.
A good hash function takes skewed data and makes it uniformly distributed.
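A minimal sketch of partitioning by hash of key. The choice of MD5 here is an assumption (real systems use various hash functions), and taking the hash mod N is the simplest possible assignment; many systems instead assign contiguous ranges of hash values to partitions, because mod N forces most keys to move when the number of partitions changes.

```python
# Hash partitioning sketch: a stable hash spreads even skewed,
# sequential keys roughly uniformly across partitions.

import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Adjacent keys land on unrelated partitions, which balances load
# but, as noted above, destroys the keys' sort order for range scans.
assignments = [partition_for(f"user_{i}", 8) for i in range(4)]
```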
Keys that were once adjacent are now scattered across all the partitions, so their sort order is lost.
The problem with secondary indexes is that they don’t map neatly to partitions.