Kindle Notes & Highlights
Read between October 21 – November 26, 2024
Clients maintain a long-lived session on ZooKeeper servers, and the client and server periodically exchange heartbeats to check that the other node is still alive.
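The heartbeat-and-timeout idea can be sketched in a few lines. This is an illustration of the general mechanism, not ZooKeeper's actual session protocol (real sessions expire permanently once timed out, and both client and server run such checks against each other):

```ruby
# Illustrative heartbeat-based liveness check: the peer counts as alive
# only while heartbeats keep arriving within the timeout window.
class HeartbeatSession
  def initialize(timeout_seconds)
    @timeout = timeout_seconds
    @last_heartbeat = Time.now
  end

  # Called whenever a heartbeat arrives from the peer.
  def heartbeat!
    @last_heartbeat = Time.now
  end

  # The peer is presumed dead once the timeout elapses with no heartbeat.
  def alive?
    Time.now - @last_heartbeat < @timeout
  end
end
```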
If the leader fails, one of the other nodes should take over.
As nodes are removed or fail, other nodes need to take over the failed nodes’ work.
An application may initially run only on a single node, but eventually may grow to thousands of nodes.
ZooKeeper is not intended for storing the runtime state of the application, which may change thousands or even millions of times per second.
ZooKeeper, etcd, and Consul are also often used for service discovery—that is, to find out which IP address you need to connect to in order to reach a particular service.
Although service discovery does not require consensus, leader election does.
A membership service determines which nodes are currently active and live members of a cluster.
Causal consistency does not have the coordination overhead of linearizability and is much less sensitive to network problems.
If one node is going to accept a registration, it needs to somehow know that another node isn’t concurrently in the process of registering the same name.
When several clients are racing to grab a lock or lease, the lock decides which one successfully acquired it.
When several transactions concurrently try to create conflicting records with the same key, the constraint must decide which one to allow and which should fail with a constraint violation.
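Both cases reduce to an atomic compare-and-set on a linearizable register: every contender tries to flip the register from "unowned" to its own identity, and exactly one succeeds. The sketch below illustrates that decision with in-process threads and a mutex; a real lock service such as ZooKeeper makes the same decision across the network, with sessions and ephemeral nodes:

```ruby
# A register supporting atomic compare-and-set; the mutex stands in for
# the linearizability that a consensus-backed service would provide.
class LinearizableRegister
  def initialize
    @value = nil
    @mutex = Mutex.new
  end

  # Atomically set to new_value only if the current value == expected.
  # Returns true iff this caller's write won the race.
  def compare_and_set(expected, new_value)
    @mutex.synchronize do
      return false unless @value == expected
      @value = new_value
      true
    end
  end

  def value
    @mutex.synchronize { @value }
  end
end

# Five clients race for the lock; exactly one CAS from nil succeeds.
lock_owner = LinearizableRegister.new
winners = (1..5).map { |client_id|
  Thread.new { client_id if lock_owner.compare_and_set(nil, client_id) }
}.map(&:value).compact
```

The same pattern decides a uniqueness constraint: the "register" is the key being claimed, and the losing transactions fail with a constraint violation instead of retrying.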
Although a single-leader database can provide linearizability without executing a consensus algorithm on every write, it still requires consensus to maintain its leadership and for leadership changes.
If you find yourself wanting to do one of those things that is reducible to consensus, and you want it to be fault-tolerant, then it is advisable to use something like ZooKeeper.
Strictly speaking, ZooKeeper and etcd provide linearizable writes, but reads may be stale, since by default they can be served by any one of the replicas.
A total order that is inconsistent with causality is easy to create, but not very useful.
Consensus is allowed to decide on any value that is proposed by one of the participants.
In a large application you often need to be able to access and process data in many different ways, and there is no one database that can satisfy all those different needs simultaneously.
In reality, integrating disparate systems is one of the most important things that needs to be done in a nontrivial application.
A system of record, also known as source of truth, holds the authoritative version of your data.
Data in a derived system is the result of taking some existing data from another system and transforming or processing it in some way.
If you lose derived data, you can recreate it from the original source.
Most databases, storage engines, and query languages are not inherently either a system of record or a derived system.
The distinction between system of record and derived data system depends not on the tool, but on how you use it in your application.
By being clear about which data is derived from which other data, you can bring clarity to an otherwise ...
A batch processing system takes a large amount of input data, runs a job to process it, and produces some output data.
The Ruby script keeps an in-memory hash table of URLs, where each URL is mapped to the number of times it has been seen.
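The counting logic might look like the sketch below. The field position is an assumption for illustration: in a typical web server access log the requested URL is the seventh whitespace-separated field.

```ruby
# Count how many times each URL appears in the given log lines, using an
# in-memory hash from URL to occurrence count.
def count_urls(lines, field_index: 6)
  counts = Hash.new(0)  # missing keys default to 0
  lines.each do |line|
    url = line.split[field_index]
    counts[url] += 1 if url
  end
  counts
end
```

This works well as long as the set of distinct URLs fits in memory; that limitation is part of what motivates the sort-based Unix pipeline and, at larger scale, MapReduce.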
Make each program do one thing well.
Design and build software, even operating systems, to be tried early, ideally within weeks.
The sort tool is a great example of a program that does one thing well.
If you want to be able to connect any program’s output to any program’s input, that means that all programs must use the same input/output interface.
A file is just an ordered sequence of bytes.
Although it’s not perfect, even decades later, the uniform interface of Unix is still something remarkable.
If you run a program and don’t specify anything else, stdin comes from the keyboard and stdout goes to the screen.
Pipes let you attach the stdout of one process to the stdin of another process (with a small in-memory buffer, and without writing the entire intermediate data stream to disk).
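The mechanism behind `cmd1 | cmd2` can be reproduced with ordinary process primitives. The sketch below wires the stdout of one process into the stdin of another through an in-memory pipe buffer, with no intermediate file on disk:

```ruby
# Recreate a shell pipeline: printf's stdout feeds sort's stdin.
reader, writer = IO.pipe

# First process: writes three lines to the pipe's write end.
producer = spawn("printf 'banana\\napple\\ncherry\\n'", out: writer)
writer.close  # parent closes its copy so `sort` eventually sees EOF

# Second process: `sort` reads its stdin from the pipe's read end.
sorted = IO.popen("sort", in: reader, &:read)
reader.close
Process.wait(producer)
```

Closing the parent's copy of the write end matters: `sort` only terminates when every writer of the pipe has closed it.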
Programs that need multiple inputs or outputs are possible but tricky.
MapReduce is a bit like Unix tools, but distributed across potentially thousands of machines.
As with most Unix tools, running a MapReduce job normally does not modify the input and does not have any side effects other than producing the output.
While Unix tools use stdin and stdout as input and output, MapReduce jobs read and write files on a distributed filesystem.
Shared-disk storage is implemented by a centralized storage appliance, often using custom hardware and special network infrastructure such as Fibre Channel.
HDFS consists of a daemon process running on each machine, exposing a network service that allows other nodes to access files stored on that machine (assuming that every general-purpose machine in a datacenter has some disks attached to it).
A central server called the NameNode keeps track of which file blocks are stored on which machine.
In order to tolerate machine and disk failures, file blocks are replicated on multiple machines.
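A toy sketch of the NameNode-style metadata: each block is mapped to `replication` distinct machines, so any single machine can fail without losing the block. (Hypothetical placement logic for illustration; real HDFS placement also accounts for racks, free space, and load.)

```ruby
# Assign each file block to `replication` distinct machines, round-robin.
# Returns a hash from block id to its list of replica machines.
def place_blocks(blocks, machines, replication: 3)
  blocks.each_with_index.map { |block, i|
    replicas = (0...replication).map { |r| machines[(i + r) % machines.size] }
    [block, replicas]
  }.to_h
end
```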
MapReduce is a programming framework with which you can write code to process large datasets in a distributed filesystem like HDFS.
The mapper is called once for every input record, and its job is to extract the key and value from the input record.
The MapReduce framework takes the key-value pairs produced by the mappers, collects all the values belonging to the same key, and calls the reducer with an iterator over that collection of values.
The main difference from pipelines of Unix commands is that MapReduce can parallelize a computation across many machines, without you having to write code to explicitly handle the parallelism.
In Hadoop MapReduce, the mapper and reducer are each a Java class that implements a particular interface.
The reduce task takes the files from the mappers and merges them together, preserving the sort order.
The reducer is called with a key and an iterator that sequentially scans over all records with the same key (which may in some cases not all fit in memory).
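The map–shuffle–reduce dataflow described above can be simulated in-process. This is only a sketch of the structure: the real framework distributes these steps across machines, sorts mapper output on disk, and hands the reducer an iterator in key order rather than a fully materialized array.

```ruby
# Run a mapper over every record, group the emitted key-value pairs by
# key (the "shuffle"), then call the reducer once per distinct key.
def map_reduce(records, mapper, reducer)
  # Map phase: one mapper call per input record, each emitting pairs.
  pairs = records.flat_map { |record| mapper.call(record) }

  # Shuffle phase: collect all values belonging to the same key.
  grouped = pairs.group_by(&:first).transform_values { |kvs| kvs.map(&:last) }

  # Reduce phase: one reducer call per key, over that key's values.
  grouped.map { |key, values| reducer.call(key, values) }
end

# Word count, the classic example:
mapper  = ->(line) { line.split.map { |word| [word, 1] } }
reducer = ->(word, counts) { [word, counts.sum] }
```

For instance, `map_reduce(["a b a", "b"], mapper, reducer)` yields one `[word, total]` pair per distinct word.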