Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Rate it:
Open Preview
40%
Flag icon
For correct ordering, you would need the clock source to be significantly more accurate than the thing you are measuring (namely network delay).
41%
Flag icon
So-called logical clocks [56, 57], which are based on incrementing counters rather than an oscillating quartz crystal, are a safer alternative for ordering events
41%
Flag icon
it doesn’t make sense to think of a clock reading as a point in time — it is more like a range of times, within a confidence interval:
41%
Flag icon
An interesting exception is Google’s TrueTime API in Spanner
41%
Flag icon
which explicitly reports the confidence interval on the local clock.
41%
Flag icon
The most common implementation of snapshot isolation requires a monotonically increasing transaction ID.
41%
Flag icon
However, when a database is distributed across many machines, potentially in multiple datacenters, a global, monotonically increasing transaction ID (across all partitions) is difficult to generate,
41%
Flag icon
With lots of small, rapid transactions, creating transaction IDs in a distributed system becomes an untenable bottleneck.vi
41%
Flag icon
Only if the intervals overlap are we unsure in which order A and B happened.
41%
Flag icon
Spanner deliberately waits for the length of the confidence interval before committing a read-write transaction.
41%
Flag icon
Google deploys a GPS receiver or atomic clock in each datacenter, allowing clocks to be synchronized to within about 7 ms
41%
Flag icon
Using clock synchronization for distributed transaction semantics is an area of active research
41%
Flag icon
Process Pauses
41%
Flag icon
writes. How does a node know that it is still leader
41%
Flag icon
One option is for the leader to obtain a lease from the other nodes, which is similar to a lock with a timeout
41%
Flag icon
Firstly, it’s relying on synchronized clocks: the expiry time on the lease is set by a different machine
41%
Flag icon
and it’s being compared to the local system clock.
41%
Flag icon
the code assumes
41%
Flag icon
that very little time passes between the point that it checks the time (System.currentTimeMillis()) and the time when the request is processed (process(request)). Normally this code runs very quickly, so the 10 second buffer
41%
Flag icon
what if there is an unexpected pause in the execution of the program? For example, imagine the thread stops for 15 seconds around the line lease.isValid()
41%
Flag icon
thread might be paused for so long? Unfortunately not. There are various reasons why this could
41%
Flag icon
garbage collector (GC) that occasionally needs to stop all running threads.
41%
Flag icon
“concurrent” garbage collectors like the HotSpot JVM’s CMS cannot fully run in parallel with the application code — even they need to stop the world from time to time
41%
Flag icon
a virtual machine can be suspended (pausing the execution of all processes and saving the contents of memory to disk) and resumed
41%
Flag icon
execution may also be suspended and resumed arbitrarily, e.g., when the user closes the lid of their laptop.
41%
Flag icon
the operating system context-switches to another thread,
41%
Flag icon
the hypervisor switches to a different virtual machine
41%
Flag icon
the currently running thread can be paused at any arbitrary...
This highlight has been truncated due to consecutive passage length restrictions.
41%
Flag icon
the application performs synchronous disk access, a thread may be paused waiting for a slow disk I/O operation to complete
41%
Flag icon
the Java classloader lazily loads class files when they are first used, which could happen at any time in the program execution.
41%
Flag icon
the disk is actually a network filesystem or network block device (such as Amazon’s EBS), the I/O latency is further subject to the variability of network delays
41%
Flag icon
swapping to disk (paging), a simple memory access may result in a page fault that requires a page from disk to be loaded into memory.
41%
Flag icon
swapping pages in and out of memory and getting little actual work done (this is known as thrashing).
41%
Flag icon
A Unix process can be paused by sending it the SIGSTOP signal, for example by pressing Ctrl-Z in a shell.
41%
Flag icon
The problem is similar to making multi-threaded code on a single machine thread-safe: you can’t assume anything about timing, because arbitrary context switches and parallelism may occur.
41%
Flag icon
A node in a distributed system must assume that its execution can be paused for a significant length of time at any point, even in the middle of a function.
41%
Flag icon
Some software runs in environments where a failure to respond within a specified time can cause serious damage: computers that control aircraft, rockets, robots, cars, and other physical objects must respond quickly and predictably to their sensor inputs.
41%
Flag icon
deadline
41%
Flag icon
hard real-time systems.
41%
Flag icon
you wouldn’t want the release of the airbag to be delayed due to an inopportune GC pause in the airbag release system.
41%
Flag icon
a real-time operating system (RTOS) that allows processes to be scheduled with a guaranteed allocation of CPU time in specified intervals is needed;
41%
Flag icon
Moreover, “real-time” is not the same as “high-performance” — in fact, real-time systems may have lower throughput, since they have to prioritize timely responses above
41%
Flag icon
An emerging idea is to treat GC pauses like brief planned outages of a node,
42%
Flag icon
A variant of this idea is to use the garbage collector only for short-lived objects (which are fast to collect) and to restart processes periodically,
42%
Flag icon
a node cannot necessarily trust its own judgment of a situation. A distributed system cannot exclusively rely on a single node, because a node may fail at any time, potentially leaving the system stuck and unable to recover.
42%
Flag icon
many distributed algorithms rely on a quorum, that is, voting among the nodes
42%
Flag icon
system requires there to be only one of some thing.
42%
Flag icon
Implementing this in a distributed system requires care:
42%
Flag icon
The problem is an example of what we discussed in “Process Pauses”: if the client holding the lease is paused for too long, its lease expires. Another client can obtain a lease for the same file, and start writing to the file. When the paused client comes back, it believes (incorrectly) that it still has a valid lease and proceeds to also write to the file.
42%
Flag icon
Let’s assume that every time the lock server grants a lock or lease, it also returns a fencing token, which is a number that increases every time a lock is granted
1 14 28