Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
In these circumstances, it’s not sufficient to just append another event to the log to indicate that the prior data should be considered deleted — you actually want to rewrite history and pretend that the data was never written in the first place.
Truly deleting data is surprisingly hard [64], since copies can live in many places: for example, storage engines, filesystems, and SSDs often
You can take the data in the events and write it to a database, cache, search index, or similar storage system, from where it can then be queried by other clients.
You can push the events to users in some way, for example by sending email alerts or push notifications, or by streaming the events to a real-time dashboard where they are visualized.
You can process one or more input streams to produce one or more output streams.
Fault-tolerance mechanisms must also change: with a batch job that has been running for a few minutes, a failed task can simply be restarted from the beginning, but with a stream job that has been running for several years, restarting from the beginning after a crash may not be a viable option.
Complex event processing (CEP) is an approach developed in the 1990s for analyzing event streams, especially geared toward the kind of application that requires searching for certain event patterns
For example, you might want to know the average number of queries per second to a service over the last 5 minutes.
The time interval over which you aggregate is known as a window.
This use of approximation algorithms sometimes leads people to believe that stream processing systems are always lossy and inexact, but that is wrong: there is nothing inherently approximate about stream processing, and probabilistic algorithms are merely an optimization
Maintaining materialized views
We can regard these examples as specific cases of maintaining materialized views
Besides CEP, which allows searching for patterns consisting of multiple events, there is also sometimes a need to search for individual events based on complex criteria, such as full-text search queries.
For example, media monitoring services subscribe to feeds of news articles and broadcasts from media outlets, and search for any news mentioning companies, products, or topics of interest.
Conventional search engines first index the documents and then run queries over the index. By contrast, searching a stream turns the processing on its head: the queries are stored, and the documents run past the queries.
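The stored-query idea can be sketched in a few lines: each saved query is a set of terms, and every incoming document is run past all of them. The query names and the all-terms-must-appear matching rule are illustrative assumptions, not how any particular media-monitoring product works:

```python
# Stored queries: each named query is a set of terms that must ALL appear
# in a document for it to match. Names and terms are made up.
STORED_QUERIES = {
    "acme-news": {"acme", "corp"},
    "widget-buzz": {"widget"},
}

def matching_queries(document: str) -> set:
    # Run the document past every stored query -- the reverse of a
    # conventional search engine, which runs queries over a document index.
    words = set(document.lower().split())
    return {name for name, terms in STORED_QUERIES.items() if terms <= words}
```

For example, `matching_queries("Acme Corp launches a new widget")` matches both stored queries, while an unrelated headline matches none.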
A tricky problem when defining windows in terms of event time is that you can never be sure when you have received all of the events for a particular window, or whether there are some events still to come.
Ignore the straggler events, as they are probably a small percentage of events in normal circumstances.
Publish a correction, an updated value for the window with stragglers included.
However, the clock on a user-controlled device often cannot be trusted, as it may be accidentally or deliberately set to the wrong time.
To adjust for incorrect device clocks, one approach is to log three timestamps [82]: the time at which the event occurred, according to the device clock; the time at which the event was sent to the server, according to the device clock; and the time at which the event was received by the server, according to the server clock.
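The three timestamps make a simple correction possible: the difference between the device's send time and the server's receive time estimates the device clock's offset (assuming network delay is negligible), and subtracting that offset from the device's event time maps it onto the server's clock. A minimal sketch, with illustrative epoch-second timestamps:

```python
def corrected_event_time(event_device, send_device, receive_server):
    # The device clock's offset from the server clock, assuming the
    # network delay between sending and receiving is negligible.
    clock_offset = send_device - receive_server
    # Subtract the offset to express the event time on the server's clock.
    return event_device - clock_offset
```

For a device clock running 100 seconds fast, `corrected_event_time(1000, 1005, 905)` returns 900.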
Tumbling window A tumbling window has a fixed length, and every event belongs to exactly one window.
Hopping window A hopping window also has a fixed length, but allows windows to overlap in order to provide some smoothing.
Sliding window A sliding window contains all the events that occur within some interval of each other.
Session window Unlike the other window types, a session window has no fixed duration. Instead, it is defined by grouping together all events for the same user that occur closely together in time, and the window ends when the user has been inactive for some time
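The first two window types can be sketched as pure functions from an event timestamp to the window(s) it belongs to, identifying each window by its start time. The 60-second window length and 15-second hop are arbitrary choices for illustration:

```python
WINDOW = 60  # window length in seconds (illustrative)
HOP = 15     # a new hopping window starts every HOP seconds (illustrative)

def tumbling_window(ts: int) -> int:
    # Every event belongs to exactly one fixed-length window.
    return ts - ts % WINDOW

def hopping_windows(ts: int) -> list:
    # Fixed-length windows that overlap: the event at `ts` falls into every
    # HOP-aligned window whose start lies in (ts - WINDOW, ts].
    last_start = ts - ts % HOP
    return [s for s in range(last_start - WINDOW + HOP, last_start + 1, HOP)
            if s >= 0]
```

An event at `ts = 70` lands in the single tumbling window starting at 60, but in the four hopping windows starting at 15, 30, 45, and 60.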
Stream-stream join (window join)
The click may never come if the user abandons their search, and even if it comes, the time between the search and the click may be highly variable: in many cases it might be a few seconds, but it could be as long as days or weeks
Due to variable network delays, the click event may even arrive before the search event.
The stream processor needs to maintain state: for example, all the events that occurred in the last hour, indexed by session ID.
If there is a matching event, you emit an event saying which search result was clicked. If the search event expires without you seeing a matching click event, you emit an event saying which search results were not clicked.
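A sketch of that search/click join, using a plain dict as the state store; a real stream processor would keep this state fault-tolerantly and would also emit not-clicked events when buffered searches expire. The one-hour join window and field names are illustrative:

```python
JOIN_WINDOW = 3600  # match a click to a search within one hour (illustrative)

searches = {}  # session_id -> (timestamp, query): the processor's state

def on_search(session_id, ts, query):
    # Buffer the search so a later click from the same session can join it.
    searches[session_id] = (ts, query)

def on_click(session_id, ts, url):
    # Emit a joined event if a recent search from this session is buffered.
    if session_id in searches:
        search_ts, query = searches.pop(session_id)
        if ts - search_ts <= JOIN_WINDOW:
            return {"query": query, "clicked": url}
    # No match: the search may have expired, or (due to network delays)
    # the click may have arrived before the search.
    return None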
Stream-table join (stream enrichment)
This process is sometimes known as enriching the activity events with information from the database.
such remote queries are likely to be slow and risk overloading the database
Another approach is to load a copy of the database into the stream processor so that it can be queried locally without a network round-trip. This technique is very similar to the hash joins we discussed in “Map-Side Joins”.
The difference to batch jobs is that a batch job uses a point-in-time snapshot of the database as input, whereas a stream processor is long-running, and the contents of the database are likely to change over time, so the stream processor’s local copy of the database needs to be kept up to date.
This issue can be solved by change data capture: the stream processor can subscribe to a changelog of the user profile database as well as the stream of activity events.
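A sketch of stream enrichment driven by change data capture: one handler applies changelog events to a local copy of the profile table, and the activity handler joins against that local copy with no remote query. The table layout and field names are made up for illustration:

```python
profiles = {}  # local replica of the user profile table, fed by the changelog

def on_profile_change(user_id, new_profile):
    # Apply a change data capture event, keeping the local copy up to date.
    profiles[user_id] = new_profile

def on_activity(event):
    # Enrich the activity event via a local lookup, not a remote database
    # query, avoiding the latency and load problems noted above.
    enriched = dict(event)
    enriched["profile"] = profiles.get(event["user_id"])
    return enriched
```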
Table-table join (materialized view maintenance)
The stream processor needs to maintain a database containing the set of followers for each user so that it knows which timelines need to be updated when a new tweet arrives.
Put another way: if state changes over time, and you join with some state, what point in time do you use for the join?
If the ordering of events across streams is undetermined, the join becomes nondeterministic [87], which means you cannot rerun the same job on the same input and necessarily get the same result.
In data warehouses, this issue is known as a slowly changing dimension (SCD), and it is often addressed by using a unique identifier for a particular version of the joined record.
This principle is known as exactly-once semantics, although effectively-once would be a more descriptive term
Microbatching and checkpointing One solution is to break the stream into small blocks, and treat each block like a miniature batch process.
In order to give the appearance of exactly-once processing in the presence of faults, we need to ensure that all outputs and side effects of processing an event take effect if and only if the processing is successful. Those effects include any messages sent to downstream
Those things either all need to happen atomically, or none of them must happen, but they should not go out of sync with each other.
Idempotence Our goal is to discard the partial output of any failed tasks so that they can be safely retried without taking effect twice. Distributed transactions are one way of achieving that goal, but another way is to rely on idempotence [97].
For example, when consuming messages from Kafka, every message has a persistent, monotonically increasing offset. When writing a value to an external database, you can include the offset of the message that triggered the last write with the value.
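A sketch of that offset-based idempotent write, with a plain dict standing in for the external database; storing the value and offset together as a `(value, offset)` pair is an illustrative assumption, not a Kafka API:

```python
def apply_write(db, key, value, offset):
    # Store the triggering message's offset alongside the value. A replayed
    # message carries an offset we have already applied, so the write is
    # skipped and the retry takes effect only once.
    _, last_offset = db.get(key, (None, -1))
    if offset > last_offset:
        db[key] = (value, offset)
```

Replaying the message at offset 1 after it has been applied leaves the database unchanged; only the genuinely new message at offset 2 takes effect.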
Relying on idempotence implies several assumptions: restarting a failed task must replay the same messages in the same order (a log-based message broker does this), the processing must be deterministic, and no other node may concurrently update the same value [98,
When failing over from one processing node to another, fencing may be required (see “The leader and the lock”) to prevent interference from a node that is thought to be dead but is actually alive.