Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
49%
consensus problem.
49%
There are a number of situations in which it is important for nodes to agree.
49%
Leader el...
This highlight has been truncated due to consecutive passage length restrictions.
49%
Atomic commit
49%
FLP result
49%
If the algorithm is allowed to use timeouts, or some other way of identifying suspected crashed nodes (even if the suspicion is sometimes wrong), then consensus becomes solvable
49%
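A minimal sketch of the kind of timeout-based suspicion this highlight describes. The class and method names are hypothetical, and the clock is passed in explicitly so the example stays deterministic. It only illustrates that a suspicion can turn out to be wrong; it is not a consensus algorithm.

```python
class TimeoutFailureDetector:
    """Suspects a node when no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node id -> time of the most recent heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def suspected(self, node, now):
        # A node that was never heard from is treated as suspected.
        last = self.last_seen.get(node)
        return last is None or now - last > self.timeout


detector = TimeoutFailureDetector(timeout=5.0)
detector.heartbeat("node-a", now=0.0)
print(detector.suspected("node-a", now=3.0))   # False
print(detector.suspected("node-a", now=9.0))   # True: crashed, or merely slow
detector.heartbeat("node-a", now=10.0)
print(detector.suspected("node-a", now=11.0))  # False: the suspicion was wrong
```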
Atomicity prevents failed transactions from littering the database with half-finished results and half-updated state.
49%
especially important for multi-object transactions (see
49%
maintain secondary indexes.
49%
secondary index is a separate data structure from the primary data
49%
record: before that moment, it is still possible to abort (due to a crash),
49%
have a multi-object transaction in a partitioned database, or a term-partitioned secondary index
49%
it is not sufficient to simply send a commit request to all of the nodes and independently commit the transaction on each one.
49%
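To illustrate why independent commits are not sufficient, here is a small simulation. The `Node` class and `naive_commit` function are hypothetical names made up for this sketch. One node fails to commit after the others have already done so, leaving the nodes in disagreement.

```python
class Node:
    """Hypothetical storage node that commits independently of its peers."""

    def __init__(self, name, will_fail=False):
        self.name = name
        self.will_fail = will_fail
        self.state = "pending"

    def commit(self):
        if self.will_fail:
            # Stands in for a constraint violation, a crash, or a lost request.
            self.state = "aborted"
            raise RuntimeError(f"{self.name} could not commit")
        self.state = "committed"


def naive_commit(nodes):
    """Send commit to every node with no prepare phase (the unsafe approach)."""
    for node in nodes:
        try:
            node.commit()
        except RuntimeError:
            pass  # too late: other nodes may already have committed


nodes = [Node("n1"), Node("n2", will_fail=True), Node("n3")]
naive_commit(nodes)
print([n.state for n in nodes])  # ['committed', 'aborted', 'committed']
```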
A transaction commit must be irrevocable — you are not allowed to change your mind and retroactively abort a transaction after it has been committed.
49%
Introduction to two-phase commit
50%
2PC uses a new component that does not normally appear in single-node transactions: a coordinator
50%
The coordinator is often implemented as a library within the same application process that is requesting the transaction
50%
Surely the prepare and commit requests can just as easily be lost in the two-phase case. What makes 2PC different?
50%
transaction, it requests a transaction ID from the coordinator. This transaction ID is globally unique.
50%
participants. If this request fails or times out, the coordinator must retry forever until it succeeds. There is no more going back: if the decision was to commit, that decision must be enforced, no matter how many retries it takes. If a participant has crashed in the meantime, the transaction will be committed when it recovers
50%
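The "retry forever" rule can be sketched roughly as follows. `FlakyParticipant` and `deliver_decision` are hypothetical names, and the sketch assumes the participant's commit handler is idempotent, so repeated delivery is harmless. A real coordinator would sleep between attempts; here the backoff hook does nothing.

```python
import itertools


class FlakyParticipant:
    """Stand-in participant whose first few commit requests time out."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.committed = False

    def commit(self, txid):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TimeoutError(f"no acknowledgement for {txid}")
        self.committed = True


def deliver_decision(participant, txid, backoff=lambda attempt: None):
    """Retry the commit request until acknowledged; return the attempt count."""
    for attempt in itertools.count(1):
        try:
            participant.commit(txid)
            return attempt
        except TimeoutError:
            backoff(attempt)  # real code would sleep here, with a capped delay


p = FlakyParticipant(failures_before_success=3)
attempts = deliver_decision(p, txid="tx-1")
print(attempts, p.committed)  # 4 True
```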
If the coordinator fails before sending the prepare requests, a participant can safely abort the transaction. But once the participant has received a prepare request and voted “yes,” it can no longer abort unilaterally — it must wait to hear back from the coordinator whether the transaction was committed or aborted.
50%
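The participant's side of this promise can be pictured as a small state machine. This is a simplified sketch with hypothetical names: it models only the rule from the highlight, that a unilateral abort is allowed before voting "yes" and not after. It leaves out durability, timeouts, and an abort decision arriving before prepare.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PREPARED = "prepared"   # voted yes: in doubt until the coordinator decides
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    def __init__(self):
        self.state = State.ACTIVE

    def prepare(self):
        """Vote yes. A real participant would first make its writes durable."""
        if self.state is not State.ACTIVE:
            raise RuntimeError(f"cannot prepare from {self.state.value}")
        self.state = State.PREPARED
        return True

    def abort_unilaterally(self):
        """Allowed only before voting yes."""
        if self.state is not State.ACTIVE:
            raise RuntimeError(
                f"cannot abort unilaterally from {self.state.value}")
        self.state = State.ABORTED

    def on_decision(self, commit):
        """Apply the coordinator's decision after having voted yes."""
        if self.state is not State.PREPARED:
            raise RuntimeError(f"no decision expected in {self.state.value}")
        self.state = State.COMMITTED if commit else State.ABORTED


p = Participant()
p.prepare()
try:
    p.abort_unilaterally()
except RuntimeError as err:
    print(err)  # cannot abort unilaterally from prepared
p.on_decision(commit=True)
print(p.state.value)  # committed
```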
in doubt or uncertain.
50%
The only way 2PC can complete is by waiting for the coordinator to recover.
50%
This is why the coordinator must write its commit or abort decision to a transaction log on disk before sending commit or abort requests to participants:
50%
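A rough sketch of the ordering this highlight describes: the decision is appended to a log and forced to disk before any participant is told about it, so a restarted coordinator can read it back. The file layout, record format, and function names are assumptions made for illustration, not the format used by any real coordinator.

```python
import json
import os
import tempfile


def log_decision(log_path, txid, decision):
    """Append the decision and force it to disk before anyone is told."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"txid": txid, "decision": decision}) + "\n")
        log.flush()
        os.fsync(log.fileno())


def recover_decisions(log_path):
    """After a restart, rebuild the txid -> decision map from the log."""
    decisions = {}
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                decisions[record["txid"]] = record["decision"]
    return decisions


def decide_and_send(log_path, txid, votes, send):
    """Commit only if every participant voted yes; log first, then send."""
    decision = "commit" if all(votes) else "abort"
    log_decision(log_path, txid, decision)  # the decision is now durable
    send(txid, decision)                    # only after the log write
    return decision


log_path = os.path.join(tempfile.mkdtemp(), "coordinator.log")
sent = []
decide_and_send(log_path, "tx-1", [True, True], lambda t, d: sent.append((t, d)))
decide_and_send(log_path, "tx-2", [True, False], lambda t, d: sent.append((t, d)))
print(recover_decisions(log_path))  # {'tx-1': 'commit', 'tx-2': 'abort'}
```

If the process crashed between `log_decision` and `send`, `recover_decisions` would still return the outcome, which is what lets in-doubt participants be resolved after recovery.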
Three-phase commit
50%
Two-phase commit is called a blocking atomic commit protocol due to the fact that 2PC can become stuck waiting for the coordinator to recover.
50%
3PC assumes a network with bounded delay and nodes with bounded response times; in most practical systems with unbounded network delay and process pauses
50%
it cannot guarantee atomicity.
50%
nonblocking atomic commit requires a perfect failure detector [67, 71] — i.e., a reliable mechanism for telling whether a node has crashed
50%
distributed transactions in MySQL are reported to be over 10 times slower than single-node transactions
50%
additional disk forcing (fsync) that is required for crash recovery [88], and the additional network round-trips.
50%
Database-internal distributed transactions
50%
all the nodes participating in the transaction are running the same database software.
50%
Heterogeneous distributed transactions
50%
the participants are two or more different technologies: for example, two databases from different vendors, or even non-database systems such as message brokers.
50%
Exactly-once message processing
50%
For example, say a side effect of processing a message is to send an email, and the email server does not support two-phase commit: it could happen that the email is sent two or more times if message processing fails and is retried.
50%
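The duplicate-email scenario can be reproduced with a toy consumer. Everything here is hypothetical (`send_email`, `process`, `consume_with_retry`): the email "server" is just a list, and the failure is injected on purpose. The point is that the side effect happens on every attempt, because it cannot be rolled back together with the message acknowledgement.

```python
sent_emails = []


def send_email(to, body):
    """Stand-in for an email server that cannot take part in 2PC."""
    sent_emails.append((to, body))


def process(message, fail_after_side_effect):
    send_email(message["to"], message["body"])  # side effect happens at once
    if fail_after_side_effect:
        raise RuntimeError("crashed before acknowledging the message")


def consume_with_retry(message, failures):
    """Redeliver the message until processing succeeds; return deliveries."""
    attempt = 0
    while True:
        try:
            process(message, fail_after_side_effect=attempt < failures)
            return attempt + 1
        except RuntimeError:
            attempt += 1


deliveries = consume_with_retry(
    {"to": "a@example.com", "body": "hi"}, failures=1)
print(deliveries, len(sent_emails))  # 2 2
```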
XA transactions
50%
X/Open XA (short for eXtended Architecture) is a standard for implementing two-phase commit across heterogeneous technologies
50%
XA is not a network protocol — it is merely a C API for interfacing with a transaction coordinator.
50%
in turn is supported by many drivers for databases using Java Database Connectivity (JDBC) and drivers for message brokers using the Java Message Service (JMS) APIs.
50%
If the driver supports XA, that means it calls the XA API to find out whether an operation should be part of a distributed transaction
50%
Since the coordinator’s log is on the application server’s local disk, that server must be restarted,
51%
in practice, orphaned in-doubt transactions do occur [89, 90] — that is, transactions for which the coordinator cannot decide the outcome for whatever reason
51%
Many XA implementations have an emergency escape hatch called heuristic decisions: allowing a participant to unilaterally decide to abort or commit an in-doubt transaction without a definitive decision from the coordinator
51%
heuristic here is a euphemism for probably breaking atomicity, since the heuristic decision violates the system of promises in two-phase commit.
51%
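A tiny illustration of how a heuristic decision can break atomicity. The `resolve` function and the participant names are invented for this sketch: each in-doubt participant ends up with the coordinator's decision unless it has already decided on its own.

```python
def resolve(in_doubt, coordinator_decision, heuristic=None):
    """Return the final outcome per participant.

    `heuristic` maps a participant name to the unilateral choice it made
    while in doubt; participants not listed follow the coordinator.
    """
    heuristic = heuristic or {}
    return {name: heuristic.get(name, coordinator_decision)
            for name in in_doubt}


outcomes = resolve(["db", "broker"], "commit", heuristic={"broker": "abort"})
print(outcomes)                           # {'db': 'commit', 'broker': 'abort'}
print(len(set(outcomes.values())) == 1)   # False: atomicity was violated
```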
the key realization is that the transaction coordinator is itself a kind of database
51%
If the coordinator is not replicated
51%
Many server-side applications are developed in a stateless model (as
51%
However, when the coordinator is part of the application server, it changes the nature of the deployment.