Chena Lee’s Kindle Notes & Highlights for Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Rate it:

Open Preview

More on this book

Community

Sparsh Priyadarshi

1 note & 1 highlight

Jefersson Nathan

11 notes & 11 highlights

Charles Fonseca

4 notes & 524 highlights

Ucchishta Sivaguru

9 notes & 20 highlights

Sugan

1 note & 44 highlights

Guzman Monne

28 notes & 34 highlights

Dong

2 notes & 26 highlights

Mohamed Elsherif

5 notes & 17 highlights

Joe Soltzberg

20 notes & 75 highlights

Corey

6 notes & 10 highlights

Dinesh Singh

2 notes & 11 highlights

Robert Gustavo

38 notes & 38 highlights

Cezar Castro rosa

Nikhil Goyal

Vladimir

Ion Gritco

Keith Sader

Guilherme Camargo

Vipin Ajayakumar

Jason

Alexis

Ory

Faisal Morensya

Muhaimen Ezabbad

Frederico Cabral

Ian Dunn

Antonio Bustamante

Asif Hoda

zhouqiang

Nick Fahrenkrog

Matt Chamlee

Atthavit Wannasakwong

Xuan Lin

Eric Chong

Dallin Coons

Di Fan

Prakash Srivastava

Denis

Kindle Notes & Highlights

by Chena Lee

See all Chena’s Notes & Highlights

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

by Martin Kleppmann

Read between August 2 - December 28, 2020

49%

consensus problem.

49%

There are a number of situations in which it is important for nodes to agree.

49%

Leader el...

This highlight has been truncated due to consecutive passage length restrictions.

49%

Atomic commit

49%

FLP result

49%

If the algorithm is allowed to use timeouts, or some other way of identifying suspected crashed nodes (even if the suspicion is sometimes wrong), then consensus becomes solvable

49%

Atomicity prevents failed transactions from littering the database with half-finished results and half-updated state.

49%

especially important for multi-object transactions (see

49%

maintain secondary indexes.

49%

secondary index is a separate data structure from the primary data

49%

record: before that moment, it is still possible to abort (due to a crash),

49%

have a multi-object transaction in a partitioned database, or a term-partitioned secondary index

49%

it is not sufficient to simply send a commit request to all of the nodes and independently commit the transaction on each one.

49%

A transaction commit must be irrevocable — you are not allowed to change your mind and retroactively abort a transaction after it has been committed.

49%

Introduction to two-phase commit

50%

2PC uses a new component that does not normally appear in single-node transactions: a coordinator

50%

The coordinator is often implemented as a library within the same application process that is requesting the transaction

50%

Surely the prepare and commit requests can just as easily be lost in the two-phase case. What makes 2PC different?

50%

transaction, it requests a transaction ID from the coordinator. This transaction ID is globally unique.

50%

participants. If this request fails or times out, the coordinator must retry forever until it succeeds. There is no more going back: if the decision was to commit, that decision must be enforced, no matter how many retries it takes. If a participant has crashed in the meantime, the transaction will be committed when it recovers

50%

If the coordinator fails before sending the prepare requests, a participant can safely abort the transaction. But once the participant has received a prepare request and voted “yes,” it can no longer abort unilaterally — it must wait to hear back from the coordinator whether the transaction was committed or aborted.

50%

in doubt or uncertain.

50%

The only way 2PC can complete is by waiting for the coordinator to recover.

50%

This is why the coordinator must write its commit or abort decision to a transaction log on disk before sending commit or abort requests to participants:

50%

Three-phase commit

50%

Two-phase commit is called a blocking atomic commit protocol due to the fact that 2PC can become stuck waiting for the coordinator to recover.

50%

3PC assumes a network with bounded delay and nodes with bounded response times; in most practical systems with unbounded network delay and process pauses

50%

it cannot guarantee atomicity.

50%

nonblocking atomic commit requires a perfect failure detector [67, 71] — i.e., a reliable mechanism for telling whether a node has crashed

50%

distributed transactions in MySQL are reported to be over 10 times slower than single-node transactions

50%

additional disk forcing (fsync) that is required for crash recovery [88], and the additional network round-trips.

50%

Database-internal distributed transactions

50%

all the nodes participating in the transaction are running the same database software.

50%

Heterogeneous distributed transactions

50%

the participants are two or more different technologies: for example, two databases from different vendors, or even non-database systems such as message brokers.

50%

Exactly-once message processing

50%

For example, say a side effect of processing a message is to send an email, and the email server does not support two-phase commit: it could happen that the email is sent two or more times if message processing fails and is retried.

50%

XA transactions

50%

X/Open XA (short for eXtended Architecture) is a standard for implementing two-phase commit across heterogeneous technologies

50%

XA is not a network protocol — it is merely a C API for interfacing with a transaction coordinator.

50%

in turn is supported by many drivers for databases using Java Database Connectivity (JDBC) and drivers for message brokers using the Java Message Service (JMS) APIs.

50%

If the driver supports XA, that means it calls the XA API to find out whether an operation should be part of a distributed transaction

50%

Since the coordinator’s log is on the application server’s local disk, that server must be restarted,

51%

in practice, orphaned in-doubt transactions do occur [89, 90] — that is, transactions for which the coordinator cannot decide the outcome for whatever reason

51%

Many XA implementations have an emergency escape hatch called heuristic decisions: allowing a participant to unilaterally decide to abort or commit an in-doubt transaction without a definitive decision from the coordinator

51%

heuristic here is a euphemism for probably breaking atomicity, since the heuristic decision violates the system of promises in two-phase commit.

51%

the key realization is that the transaction coordinator is itself a kind of database

51%

If the coordinator is not replicated

51%

Many server-side applications are developed in a stateless model (as

51%

However, when the coordinator is part of the application server, it changes the nature of the deployment.

« Prev 1 … 17 18 19 … 28 Next »

See a Problem?

Preview — Designing Data-Intensive Applications by Martin Kleppmann