Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Rate it:
Open Preview
18%
Flag icon
When HTTP is used as the underlying protocol for talking to the service, it is called a web service.
18%
Flag icon
There are two popular approaches to web services: REST and SOAP.
18%
Flag icon
REST is not a protocol, but rather a design philosophy that builds upon the principles of HTTP
19%
Flag icon
The client and the service may be implemented in different programming languages, so the RPC framework must translate datatypes from one language into another.
19%
Flag icon
All of these factors mean that there’s no point trying to make a remote service look too much like a local object in your programming language, because it’s a fundamentally different thing.
19%
Flag icon
Finagle and Rest.li use futures (promises) to encapsulate asynchronous actions that may fail. Futures also simplify situations where you need to make requests to multiple services in parallel, and combine their results
19%
Flag icon
The main focus of RPC frameworks is on requests between services owned by the same organization, typically within the same datacenter.
19%
Flag icon
look at asynchronous message-passing systems, which are somewhere between RPC and databases.
19%
Flag icon
goes via an intermediary called a message broker (also called a message queue or message-oriented middleware), which stores the message temporarily.
19%
Flag icon
It can act as a buffer if the recipient is unavailable
19%
Flag icon
It can automatically redeliver messages to a process that has crashed, and thus prevent messages from being lost.
19%
Flag icon
It avoids the sender needing to know the IP address and port number of the recipient (which is particularly useful in a cloud deployment whe...
This highlight has been truncated due to consecutive passage length restrictions.
19%
Flag icon
It logically decouples the sender from the recipient (the sender just publishes messages and does...
This highlight has been truncated due to consecutive passage length restrictions.
19%
Flag icon
More recently, open source implementations such as RabbitMQ, ActiveMQ, HornetQ, NATS, and Apache Kafka have become popular.
19%
Flag icon
message brokers are used as follows: one process sends a message to a named queue or topic, and the broker ensures that the message is delivered to one or more consumers of or subscribers to that queue or topic. There can be many producers and many consumers on the same topic.
21%
Flag icon
Partitioning Splitting a big database into smaller subsets called partitions so that different partitions can be assigned to different nodes (also known as sharding).
21%
Flag icon
leader-based replication (also known as active/passive or master–slave replication)
21%
Flag icon
it also sends the data change to all of its followers as part of a replication log or change stream.
21%
Flag icon
writes are only accepted on the leader
21%
Flag icon
(the followers are read-only from the client’s point of view).
21%
Flag icon
This mode of replication is a built-in feature of many relational databases, such as PostgreSQL (since version 9.0), MySQL,
21%
Flag icon
MongoDB, RethinkDB, and Espresso
21%
Flag icon
distributed message brokers such as Kafka [5] and RabbitMQ highly available queues [6] also use it.
21%
Flag icon
it is impracticable for all followers to be synchronous: any one node outage would cause the whole system to grind to a halt.
21%
Flag icon
If the synchronous follower becomes unavailable or slow, one of the asynchronous followers is made synchronous. This guarantees that you have an up-to-date copy of the data on at least two nodes: the leader and one synchronous follower.
21%
Flag icon
Often, leader-based replication is configured to be completely asynchronous. In this case, if the leader fails and is not recoverable, any writes that have not yet been replicated to followers are lost.
21%
Flag icon
chain replication [8, 9] is a variant of synchronous replication that has been successfully implemented in a few systems such as Microsoft Azure Storage
21%
Flag icon
The follower connects to the leader and requests all the data changes that have happened since the snapshot was taken. This requires that the snapshot is associated with an exact position in the leader’s replication log. That position has various names: for example, PostgreSQL calls it the log sequence number, and MySQL calls it the binlog coordinates.
22%
Flag icon
each follower keeps a log of the data changes it has received from the leader. If a follower crashes and is restarted, or if the network between the leader and the follower is temporarily interrupted, the follower can recover quite easily:
22%
Flag icon
Handling a failure of the leader is trickier: one of the followers needs to be promoted to be the new leader, clients need to be reconfigured to send their writes to the new leader, and the other followers need to start consuming data changes from the new leader. This process is called failover.
22%
Flag icon
There is no foolproof way of detecting what has gone wrong, so most systems simply use a timeout: nodes frequently bounce messages back and forth between each other, and if a node doesn’t respond for some period of time — say, 30 seconds — it is assumed to be dead.
22%
Flag icon
If the old leader comes back, it might still believe that it is the leader, not realizing that the other replicas have forced it to step down. The system needs to ensure that the old leader becomes a follower
22%
Flag icon
If asynchronous replication is used, the new leader may not have received all the writes from the old leader before it failed. If the former leader rejoins the cluster after a new leader has been chosen, what should happen to those writes?
22%
Flag icon
As a safety catch, some systems have a mechanism to shut down one node if two leaders are detected.ii
22%
Flag icon
Statement-based replication In the simplest case, the leader logs every write request (statement) that it executes and sends that statement log to its followers.
22%
Flag icon
Any statement that calls a nondeterministic function, such as NOW() to get the current date and time or RAND() to get a random number, is likely to generate a different value on each replica.
22%
Flag icon
If statements use an autoincrementing column, or if they depend on the existing data in the database (e.g., UPDATE … WHERE <some condition>), they must be executed in exactly the same order on each replica,
22%
Flag icon
Statement-based replication was used in MySQL before version 5.1.
22%
Flag icon
This method of replication is used in PostgreSQL and Oracle, among others [16]. The main disadvantage is that the log describes the data on a very low level:
22%
Flag icon
This makes replication closely coupled to the storage engine. If the database changes its storage format from one version to another,
22%
Flag icon
Logical (row-based) log replication
22%
Flag icon
Since a logical log is decoupled from the storage engine internals, it can more easily be kept backward compatible, allowing the leader and the follower to run different versions of the database software, or even different storage engines.
22%
Flag icon
This aspect is useful if you want to send the contents of a database to an external system, such as a data warehouse for offline analysis, or for building custom indexes and caches
22%
Flag icon
A trigger lets you register custom application code that is automatically executed when a data change (write transaction) occurs in a database system.
22%
Flag icon
In this read-scaling architecture, you can increase the capacity for serving read-only requests simply by adding more followers.
22%
Flag icon
if you tried to synchronously replicate to all followers, a single node failure or network outage would make the entire system unavailable for writing. And the more nodes you have, the likelier it is that one will be down, so a fully synchronous configuration would be very unreliable.
22%
Flag icon
For that reason, this effect is known as eventual consistency
22%
Flag icon
In normal operation, the delay between a write happening on the leader and being reflected on a follower — the replication lag — may be only a fraction of a second, and not noticeable in practice.
23%
Flag icon
In this situation, we need read-after-write consistency, also known as read-your-writes consistency
23%
Flag icon
When reading something that the user may have modified, read it from the leader;