Antonio Bustamante’s Kindle Notes & Highlights for Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Rate it:

Open Preview

More on this book

Community

Sparsh Priyadarshi

1 note & 1 highlight

Jefersson Nathan

11 notes & 11 highlights

Charles Fonseca

4 notes & 524 highlights

Ucchishta Sivaguru

9 notes & 20 highlights

Sugan

1 note & 44 highlights

Guzman Monne

28 notes & 34 highlights

Dong

2 notes & 26 highlights

Mohamed Elsherif

5 notes & 17 highlights

Chena Lee

6 notes & 1353 highlights

Joe Soltzberg

20 notes & 75 highlights

Corey

6 notes & 10 highlights

Dinesh Singh

2 notes & 11 highlights

Robert Gustavo

38 notes & 38 highlights

Cezar Castro rosa

Nikhil Goyal

Vladimir

Ion Gritco

Keith Sader

Guilherme Camargo

Vipin Ajayakumar

Jason

Alexis

Ory

Faisal Morensya

Muhaimen Ezabbad

Frederico Cabral

Ian Dunn

Asif Hoda

zhouqiang

Nick Fahrenkrog

Matt Chamlee

Atthavit Wannasakwong

Xuan Lin

Eric Chong

Dallin Coons

Di Fan

Prakash Srivastava

Denis

Kindle Notes & Highlights

by Antonio Bustamante

See all Antonio’s Notes & Highlights

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

by Martin Kleppmann

Started reading July 24, 2020

if you have an application-managed caching layer (using Memcached or similar), or a full-text search server (such as Elasticsearch or Solr) separate from your main database, it is normally the application code’s responsibility to keep those caches and indexes in sync with the main database.

It is impossible to reduce the probability of a fault to zero; therefore it is usually best to design fault-tolerance mechanisms that prevent faults from causing failures.

A service that the system depends on that slows down, becomes unresponsive, or starts returning corrupted responses.

The bugs that cause these kinds of software faults often lie dormant for a long time until they are triggered by an unusual set of circumstances. In those circumstances, it is revealed that the software is making some kind of assumption about its environment — and while that assumption is usually true, it eventually stops being true for some reason [11

constantly check itself while it is running and raise an alert if a discrepancy is found

Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry.

Maintain a cache for each user’s home timeline — like a mailbox of tweets for each recipient user (see Figure 1-3). When a user posts a tweet, look up all the people who follow that user, and insert the new tweet into each of their home timeline caches.

The first version of Twitter used approach 1, but the systems struggled to keep up with the load of home timeline queries, so the company switched to approach 2. This works better because the average rate of published tweets is almost two orders of magnitude lower than the rate of home timeline reads, and so in this case it’s preferable

The final twist of the Twitter anecdote: now that approach 2 is robustly implemented, Twitter is moving to a hybrid of both approaches.

Amazon describes response time requirements for internal services in terms of the 99.9th percentile, even though it only affects 1 in 1,000 requests. This is because the customers with the slowest requests are often those who have the most data on their accounts because they have made many purchases — that is, they’re the most valuable customers

you are working on a fast-growing service, it is therefore likely that you will need to rethink your architecture on every order of magnitude load increase — or perhaps even more often than that.

magic scaling sauce).

An architecture that scales well for a particular application is built around assumptions of which operations will be common and which will be rare — the load parameters. If those assumptions turn out to be wrong, the engineering effort for scaling is at best wasted, and at worst counterproductive. In an early-stage startup or an unproven product it’s usually more important to be able to iterate quickly on product features than it is to scale to some hypothetical future load.

End of chapter 1

See a Problem?

Preview — Designing Data-Intensive Applications by Martin Kleppmann