Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Kindle Notes & Highlights
32%
A key feature of a transaction is that it can be aborted and safely retried if an error occurs.
32%
Object-relational mapping (ORM) frameworks such as Rails’s ActiveRecord and Django don’t retry aborted transactions: the error usually results in an exception bubbling up the stack, so any user input is thrown away and the user gets an error message.
32%
If the error is due to overload, retrying the transaction will make the problem worse, not better. To avoid such feedback cycles, you can limit the number of retries, use exponential backoff, and handle overload-related errors differently.
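A minimal sketch of such a retry policy in Python, assuming a hypothetical run_transaction callable and hypothetical TransientError/OverloadError exception types (real database drivers raise their own error classes):

    import random
    import time

    class TransientError(Exception):
        """Hypothetical: a retryable abort (e.g., chosen as a deadlock victim)."""

    class OverloadError(Exception):
        """Hypothetical: an overload-related failure (e.g., a timeout under load)."""

    def run_with_retries(run_transaction, max_retries=5, base_delay=0.1):
        for attempt in range(max_retries):
            try:
                return run_transaction()
            except OverloadError:
                # Overload is handled differently: retrying would feed the
                # feedback cycle, so give up (or shed load) immediately.
                raise
            except TransientError:
                # Bounded retries with exponential backoff plus jitter.
                time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
        raise TransientError(f"gave up after {max_retries} attempts")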
32%
Serializable isolation means that the database guarantees that transactions have the same effect as if they ran serially (i.e., one at a time, without any concurrency).
32%
Databases prevent dirty writes by using row-level locks: when a transaction wants to modify a particular object (row or document), it must first acquire a lock on that object. It must then hold that lock until the transaction is committed or aborted.
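As a toy in-process illustration (not how a real database’s lock manager is implemented), a per-object exclusive lock held until the end of the transaction might look like this:

    import threading
    from contextlib import contextmanager

    class LockManager:
        """Toy row-level lock table: one exclusive lock per object key."""
        def __init__(self):
            self._table_guard = threading.Lock()
            self._locks = {}  # object key -> threading.Lock

        def _lock_for(self, key):
            with self._table_guard:
                return self._locks.setdefault(key, threading.Lock())

        @contextmanager
        def transaction(self, keys):
            # Acquire in sorted order so two transactions cannot deadlock.
            held = [self._lock_for(k) for k in sorted(keys)]
            for lock in held:
                lock.acquire()
            try:
                yield  # the transaction's writes happen here
            finally:
                # The locks are held until commit or abort, then released.
                for lock in reversed(held):
                    lock.release()

Two concurrent transactions writing the same key then serialize on that key’s lock, which is what prevents dirty writes.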
32%
Read skew is considered acceptable under read committed isolation.
33%
A key principle of snapshot isolation is that readers never block writers, and writers never block readers. This allows a database to handle long-running read queries on a consistent snapshot at the same time as processing writes normally, without any lock contention between the two.
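A sketch of the multi-version idea behind this property, ignoring commit/abort visibility rules and garbage collection for brevity (the names are illustrative, not any particular database’s API):

    class MVCCStore:
        """Toy multi-version store: writers append new versions instead of
        overwriting in place, so readers on an old snapshot never block."""
        def __init__(self):
            self.next_txid = 1
            self.versions = {}  # key -> list of (txid, value), ascending txid

        def begin(self):
            txid = self.next_txid
            self.next_txid += 1
            return txid

        def write(self, txid, key, value):
            # Appending a new version never blocks readers of old versions.
            self.versions.setdefault(key, []).append((txid, value))

        def read(self, snapshot_txid, key):
            # Readers ignore versions created after their snapshot, so they
            # see a consistent picture and never block writers either.
            visible = [v for t, v in self.versions.get(key, []) if t <= snapshot_txid]
            return visible[-1] if visible else None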
38%
In a supercomputer, a job typically checkpoints the state of its computation to durable storage from time to time. If one node fails, a common solution is to simply stop the entire cluster workload. After the faulty node is repaired, the computation is restarted from the last checkpoint [7, 8]. Thus, a supercomputer is more like a single-node computer than a distributed system: it deals with partial failure by letting it escalate into total failure.
39%
Even if TCP acknowledges that a packet was delivered, the application may have crashed before handling it. If you want to be sure that a request was successful, you need a positive response from the application itself [24].
39%
Conversely, if something has gone wrong, you may get an error response at some level of the stack, but in general you have to assume that you will get no response at all.
39%
UDP is a good choice in situations where delayed data is worthless.
39%
A circuit is a fixed amount of reserved bandwidth which nobody else can use while the circuit is established, whereas the packets of a TCP connection opportunistically use whatever network bandwidth is available.
40%
Multi-tenancy with dynamic resource partitioning provides better utilization, so it is cheaper, but it has the downside of variable delays. Variable delays in networks are not a law of nature, but simply the result of a cost/benefit trade-off.
40%
It makes no sense to compare monotonic clock values from two different computers, because they don’t mean the same thing.
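Monotonic clocks are only good for measuring elapsed time on a single machine, as a quick Python illustration shows:

    import time

    start = time.monotonic()
    time.sleep(0.1)  # stand-in for real work
    elapsed = time.monotonic() - start  # safe: same clock, same machine
    print(f"elapsed: {elapsed:.3f}s")

    # time.monotonic() values from two different machines (or even two boots
    # of the same machine) share no common origin, so shipping one over the
    # network and comparing it with a local reading is meaningless.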
40%
The quartz clock in a computer is not very accurate: it drifts (runs faster or slower than it should). Clock drift varies depending on the temperature of the machine.
40%
NTP synchronization can only be as good as the network delay, so there is a limit to its accuracy when you’re on a congested network with variable packet delays.
40%
When a CPU core is shared between virtual machines, each VM is paused for tens of milliseconds while another VM is running. From an application’s point of view, this pause manifests itself as the clock suddenly jumping forward.
40%
Additional causality tracking mechanisms, such as version vectors, are needed in order to prevent violations of causality (see “Detecting Concurrent Writes”).
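A sketch of the core version-vector comparison, assuming vectors are represented as dicts mapping replica IDs to per-replica counters:

    def compare(vv_a, vv_b):
        """Return 'equal', 'a<b', 'b<a', or 'concurrent' for two version
        vectors; 'concurrent' is the case that needs explicit merging."""
        replicas = set(vv_a) | set(vv_b)
        a_le_b = all(vv_a.get(r, 0) <= vv_b.get(r, 0) for r in replicas)
        b_le_a = all(vv_b.get(r, 0) <= vv_a.get(r, 0) for r in replicas)
        if a_le_b and b_le_a:
            return "equal"
        if a_le_b:
            return "a<b"        # a happened before b
        if b_le_a:
            return "b<a"        # b happened before a
        return "concurrent"     # neither dominates: sibling values

    assert compare({"r1": 2, "r2": 1}, {"r1": 1, "r2": 3}) == "concurrent"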
40%
NTP’s synchronization accuracy is itself limited by the network round-trip time.
41%
If you have two confidence intervals, each consisting of an earliest and latest possible timestamp (A = [A_earliest, A_latest] and B = [B_earliest, B_latest]), and those two intervals do not overlap (i.e., A_earliest < A_latest < B_earliest < B_latest), then B definitely happened after A: there can be no doubt. Only if the intervals overlap are we unsure in which order A and B happened.
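The non-overlap test is simple to state in code (the timestamps here are hypothetical floats on a shared timescale):

    def definitely_before(a, b):
        """a and b are (earliest, latest) confidence intervals. True only
        when they do not overlap, i.e. a's latest < b's earliest."""
        _, a_latest = a
        b_earliest, _ = b
        return a_latest < b_earliest

    A = (100.0, 107.0)
    B = (110.0, 115.0)
    assert definitely_before(A, B)                   # disjoint: B after A
    assert not definitely_before((100.0, 112.0), B)  # overlap: order unknown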
41%
Spanner deliberately waits for the length of the confidence interval before committing a read-write transaction. By doing so, it ensures that any transaction that may read the data is at a sufficiently later time, so their confidence intervals do not overlap.
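The idea as a sketch (this is not Spanner’s actual TrueTime API; commit_interval stands in for the uncertainty interval the clock reports):

    import time

    def commit_wait(commit_interval):
        """Wait out the clock uncertainty before acknowledging a commit, so
        that the commit timestamp is definitely in the past afterwards."""
        earliest, latest = commit_interval
        time.sleep(latest - earliest)  # the width of the confidence interval
        # Any transaction that starts after this returns is assigned an
        # interval that lies entirely later, so the two cannot overlap.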
41%
When a node obtains a lease, it knows that it is the leader for some amount of time, until the lease expires. In order to remain leader, the node must periodically renew the lease before it expires. If the node fails, it stops renewing the lease, so another node can take over when it expires.
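A sketch of that renewal loop, assuming hypothetical acquire_lease/renew_lease calls that return the granted lease duration in seconds; note that expiry is tracked with the local monotonic clock:

    import time

    def lead_while_lease_held(acquire_lease, renew_lease, do_leader_work,
                              lease_seconds=10.0):
        expires_at = time.monotonic() + acquire_lease(lease_seconds)
        while time.monotonic() < expires_at:
            if expires_at - time.monotonic() < lease_seconds / 2:
                try:  # renew well before expiry
                    expires_at = time.monotonic() + renew_lease(lease_seconds)
                except Exception:
                    return  # renewal failed: another node may take over
            do_leader_work()
        # Lease expired without renewal: stop acting as leader immediately.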
41%
In the case of a virtual machine, the CPU time spent in other virtual machines is known as steal time.
41%
Real-time systems may have lower throughput, since they have to prioritize timely responses above all else.
42%
If ZooKeeper is used as a lock service, the transaction ID zxid or the node version cversion can be used as a fencing token. Since they are guaranteed to be monotonically increasing, they have the required properties.
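The token only protects you if the storage service checks it; a sketch of that check (the storage-side API here is illustrative, not ZooKeeper’s):

    class FencedStorage:
        """Rejects writes carrying a fencing token older than one already
        seen, e.g. from a client whose lock expired during a long pause."""
        def __init__(self):
            self.max_token_seen = -1
            self.data = {}

        def write(self, token, key, value):
            if token < self.max_token_seen:
                # A newer lock holder has already written here.
                raise PermissionError(f"stale fencing token {token}")
            self.max_token_seen = token
            self.data[key] = value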
42%
A system is Byzantine fault-tolerant if it continues to operate correctly even if some of the nodes are malfunctioning and not obeying the protocol, or if malicious attackers are interfering with the network.
42%
Most Byzantine fault-tolerant algorithms require a supermajority of more than two-thirds of the nodes to be functioning correctly.
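In the common formulation, tolerating f Byzantine nodes requires at least 3f + 1 nodes in total, which is where the more-than-two-thirds figure comes from:

    def min_cluster_size(f):
        """Minimum cluster size to tolerate f Byzantine faults in typical
        BFT protocols: 3f + 1, so the 2f + 1 correct nodes needed for a
        quorum are always more than two-thirds of the cluster."""
        return 3 * f + 1

    assert min_cluster_size(1) == 4  # one faulty node needs a 4-node cluster
    assert min_cluster_size(2) == 7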
43%
The use of multiple servers makes NTP more robust than relying on a single server.
43%
The synchronous model assumes bounded network delay, bounded process pauses, and bounded clock error.
43%
Uniqueness and monotonic sequence are safety properties, but availability is a liveness property.
43%
Scalability is not the only reason for wanting to use a distributed system. Fault tolerance and low latency (by placing data geographically close to users) are equally important goals.
45%
The best way of building fault-tolerant systems is to find some general-purpose abstractions with useful guarantees, implement them once, and then let applications rely on those guarantees.
45%
If two nodes both believe that they are the leader, that situation is called split brain, and it often leads to data loss.
45%
The edge cases of eventual consistency only become apparent when there is a fault in the system (e.g., a network interruption) or at high concurrency.
45%
In a linearizable system, as soon as one client successfully completes a write, all clients reading from the database must be able to see the value just written.
46%
A database may provide both serializability and linearizability, and this combination is known as strict serializability.
46%
Coordination services like Apache ZooKeeper [15] and etcd [16] are often used to implement distributed locks and leader election.
46%
Libraries like Apache Curator [17] help by providing higher-level recipes on top of ZooKeeper.
46%
A hard uniqueness constraint, such as the one you typically find in relational databases, requires linearizability.
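One way to see why: claiming a unique name must be an atomic set-if-absent on a register that all clients agree on. A single-process sketch, where a lock stands in for the linearizable register that a distributed system would get from a consensus-backed store:

    import threading

    class UsernameRegistry:
        def __init__(self):
            self._lock = threading.Lock()
            self._owners = {}

        def claim(self, username, user_id):
            # Atomic check-and-set: exactly one concurrent claimant wins,
            # and every subsequent reader sees that winner.
            with self._lock:
                if username in self._owners:
                    return False
                self._owners[username] = user_id
                return True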
46%
Using the leader for reads relies on the assumption that you know for sure who the leader is.
46%
Consensus algorithms can implement linearizable storage safely. This is how ZooKeeper [21] and etcd [22] work, for example.
46%
Systems with multi-leader replication are generally not linearizable, because they concurrently process writes on multiple nodes and asynchronously replicate them to other nodes.
46%
It is safest to assume that a leaderless system with Dynamo-style replication does not provide linearizability.
47%
If your application does not require linearizability, then it can be written in a way that each replica can process requests independently, even if it is disconnected from other replicas (e.g., multi-leader). In this case, the application can remain available in the face of a network problem, but its behavior is not linearizable.
47%
CAP is sometimes presented as Consistency, Availability, Partition tolerance: pick 2 out of 3.
47%
If you want linearizability, the response time of read and write requests is at least proportional to the uncertainty of delays in the network.
47%
There are no concurrent operations in a linearizable datastore: there must be a single timeline along which all operations are totally ordered.
47%
Linearizability implies causality: any system that is linearizable will preserve causality correctly.
47%
Causal consistency is the strongest possible consistency model that does not slow down due to network delays, and remains available in the face of network failures.
52%
Unlike linearizability, which puts all operations in a single, totally ordered timeline, causality provides us with a weaker consistency model: some things can be concurrent, so the version history is like a timeline with branching and merging. Causal consistency does not have the coordination overhead of linearizability and is much less sensitive to network problems.