Kindle Notes & Highlights
Read between August 2 - December 28, 2020
Snapshot isolation does not prevent another user from concurrently inserting a conflicting meeting.
Phantoms causing write skew
However, the other four examples are different: they check for the absence of rows matching some search condition, and the write adds a row matching the same condition. If the query doesn't return any rows, SELECT FOR UPDATE can't attach locks to anything.
This effect, where a write in one transaction changes the result of a search query in another transaction, is called a phantom.
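To see why SELECT FOR UPDATE doesn't help here, consider a rough sketch of the check-then-insert booking pattern. This is my own illustration, assuming PostgreSQL accessed through psycopg2 and a hypothetical bookings(room_id, start_time, end_time) table; the names are not from the book.

```python
# Sketch of a check-then-insert booking that is vulnerable to phantoms.
import psycopg2
# conn = psycopg2.connect(...)  # connection setup omitted

def book_room(conn, room_id, start, end):
    with conn.cursor() as cur:
        # Step 1: check for the ABSENCE of a conflicting booking.
        # FOR UPDATE locks the rows the query returns, but if no rows match
        # there is nothing to lock, so nothing stops a concurrent insert.
        cur.execute(
            """SELECT 1 FROM bookings
               WHERE room_id = %s AND end_time > %s AND start_time < %s
               FOR UPDATE""",
            (room_id, start, end))
        if cur.fetchone() is not None:
            conn.rollback()
            return False  # conflict found
        # Step 2: insert the new booking. Under snapshot isolation, two
        # concurrent transactions can both pass step 1 and both insert here.
        cur.execute(
            "INSERT INTO bookings (room_id, start_time, end_time) VALUES (%s, %s, %s)",
            (room_id, start, end))
        conn.commit()
        return True
```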
Materializing conflicts
If the problem of phantoms is that there is no object to which we can attach locks, perhaps we can artificially introduce a lock object into the database?
You create rows for all possible combinations of rooms and time periods ahead of time, e.g. for the next six months.
This approach is called materializing conflicts, because it takes a phantom and turns it into a lock conflict on a concrete set of rows that exist in the database.
However, it can be hard and error-prone to figure out how to materialize conflicts, and it's ugly to let a concurrency control mechanism leak into the application data model.
Materializing conflicts should be considered a last resort if no alternative is possible.
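For illustration only, here is a sketch of how the pre-created slot rows could be used purely as lock objects, under the same assumptions as the previous snippet (PostgreSQL, psycopg2, made-up table names such as room_time_slots):

```python
# Sketch of materializing conflicts: room_time_slots(room_id, slot) is
# populated ahead of time (e.g. one row per room per 15-minute slot for the
# next six months) and used only as something to lock.
import psycopg2
# conn = psycopg2.connect(...)  # connection setup omitted

def book_room_materialized(conn, room_id, slots, start, end):
    with conn.cursor() as cur:
        # Lock the pre-created rows covering the requested time range. Two
        # transactions wanting overlapping slots now conflict on concrete
        # rows, so one blocks until the other commits or aborts.
        cur.execute(
            """SELECT 1 FROM room_time_slots
               WHERE room_id = %s AND slot = ANY(%s)
               FOR UPDATE""",
            (room_id, slots))
        # With the slot rows locked, the absence check is safe.
        cur.execute(
            """SELECT 1 FROM bookings
               WHERE room_id = %s AND end_time > %s AND start_time < %s""",
            (room_id, start, end))
        if cur.fetchone() is not None:
            conn.rollback()
            return False
        cur.execute(
            "INSERT INTO bookings (room_id, start_time, end_time) VALUES (%s, %s, %s)",
            (room_id, start, end))
        conn.commit()
        return True
```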
Serializable isolation is usually regarded as the strongest isolation level.
But if serializable isolation is so much better than the mess of weak isolation levels, then why isn't everyone using it?
We will discuss these techniques primarily in the context of single-node databases.
The simplest way of avoiding concurrency problems is to remove the concurrency entirely: execute only one transaction at a time, in serial order, on a single thread.
It was only fairly recently that database designers decided that a single-threaded loop for executing transactions was feasible.
RAM became cheap enough that for many use cases it is now feasible to keep the entire active dataset in memory. When all the data a transaction needs to access is in memory, transactions can execute much faster than if they have to wait for data to be loaded from disk.
However, its throughput is limited to that of a single CPU core.
If a database transaction needs to wait for input from a user, the database needs to support a potentially huge number of concurrent transactions, most of them idle. For this reason, almost all OLTP applications keep transactions short by avoiding interactively waiting for a user within a transaction.
Systems with single-threaded serial transaction processing don't allow interactive multi-statement transactions. Instead, the application must submit the entire transaction code to the database ahead of time, as a stored procedure.
Provided that all data required by a transaction is in memory, the stored procedure can execute very fast, without waiting for any network or disk I/O.
A badly written stored procedure (e.g., using a lot of memory or CPU time) in a database can cause much more trouble than equivalent badly written code in an application server.
With stored procedures and in-memory data, executing all transactions on a single thread becomes feasible: since they don't need to wait for I/O and they avoid the overhead of other concurrency control mechanisms, they can achieve quite good throughput on a single thread.
VoltDB also uses stored procedures for replication: instead of copying a transaction's writes from one node to another, it executes the same stored procedure on each replica.
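To make the single-threaded, stored-procedure model concrete, here is a minimal sketch in Python. It is not VoltDB's or any other database's actual implementation: transactions are registered up front as plain functions, submitted whole, and executed one at a time by a single loop over an in-memory dictionary.

```python
# Minimal sketch of serial execution with stored procedures: one thread runs
# whole transactions one at a time over in-memory data, so no locks or other
# concurrency control mechanisms are needed.
import queue
import threading

class SerialExecutor:
    def __init__(self):
        self.data = {}                 # the in-memory dataset
        self.procedures = {}           # name -> callable(store, *args)
        self.requests = queue.Queue()  # submitted transactions
        threading.Thread(target=self._run_loop, daemon=True).start()

    def register(self, name, fn):
        self.procedures[name] = fn

    def submit(self, name, *args):
        """Submit a whole transaction; returns a queue that will hold the result."""
        result = queue.Queue(maxsize=1)
        self.requests.put((name, args, result))
        return result

    def _run_loop(self):
        # The single-threaded loop: one transaction at a time, in serial order.
        while True:
            name, args, result = self.requests.get()
            try:
                result.put(("ok", self.procedures[name](self.data, *args)))
            except Exception as exc:   # the transaction aborts
                result.put(("error", exc))

# Example "stored procedure": transfer between two in-memory accounts.
def transfer(store, src, dst, amount):
    if store.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    store[src] = store.get(src, 0) - amount
    store[dst] = store.get(dst, 0) + amount
    return store[src], store[dst]

db = SerialExecutor()
db.register("transfer", transfer)
db.data.update({"alice": 100, "bob": 0})
print(db.submit("transfer", "alice", "bob", 30).get())  # ("ok", (70, 30))
```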
In order to scale to multiple CPU cores and multiple nodes, you can potentially partition your data. In this case, you can give each CPU core its own partition, which allows transaction throughput to scale with the number of cores. However, for any transaction that needs to access multiple partitions, the database must coordinate the transaction across all the partitions that it touches.
Since cross-partition transactions have additional coordination overhead, they are vastly slower than single-partition transactions.
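A rough sketch of that partitioning idea, again my own simplification rather than any real system's design: single-partition transactions proceed independently in each partition, while a cross-partition transaction must take hold of every partition it touches for its whole duration.

```python
# Sketch of partitioned serial execution. A lock per partition stands in for
# each partition's single-threaded executor.
import threading

N_PARTITIONS = 4

class PartitionedStore:
    def __init__(self):
        self.partitions = [{} for _ in range(N_PARTITIONS)]
        self.locks = [threading.Lock() for _ in range(N_PARTITIONS)]

    @staticmethod
    def partition_of(key):
        return hash(key) % N_PARTITIONS

    def run_single_partition(self, key, fn):
        p = self.partition_of(key)
        with self.locks[p]:            # serial within one partition only
            return fn(self.partitions[p])

    def run_cross_partition(self, keys, fn):
        parts = sorted({self.partition_of(k) for k in keys})
        # Coordinate across all touched partitions: acquire them in a fixed
        # order and hold them for the whole transaction, which is why
        # cross-partition transactions are much slower.
        for p in parts:
            self.locks[p].acquire()
        try:
            return fn([self.partitions[p] for p in parts])
        finally:
            for p in reversed(parts):
                self.locks[p].release()
```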
For around 30 years, there was only one widely used algorithm for serializability in databases: two-phase locking (2PL).
But as soon as anyone wants to write (modify or delete) an object, exclusive access is required:
If transaction A has read an object and transaction B wants to write to that object, B must wait until A commits or aborts before it can continue. (This ensures that B can't change the object unexpectedly behind A's back.)
If transaction A has written an object and transaction B wants to read that object, B must wait until A commits or aborts before it can continue.
(Reading an old version of the object, like in Figure 7-4, is not acceptable under 2PL.)
In 2PL, writers don't just block other writers; they also block readers and vice versa.
On the other hand, because 2PL provides serializability, it protects against all the race conditions discussed earlier, including lost updates and write skew.
The lock can either be in shared mode or in exclusive mode.
Several transactions are allowed to hold the lock in shared mode simultaneously, but if another transaction already has an exclusive lock on the object, these transactions must wait.
If a transaction wants to write to an object, it must first acquire the lock in exclusive mode. No other transaction may hold the lock at the same time (either in shared or in exclusive mode), so if there is any existing lock on the object, the transaction must wait.
If a transaction first reads and then writes an object, it may upgrade its shared lock to an exclusive lock.
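A simplified sketch of such a shared/exclusive lock, including the upgrade path, written from the description above rather than from any particular database's code:

```python
# Sketch of the per-object lock used by a 2PL implementation: shared and
# exclusive modes, blocking, and upgrading from shared to exclusive.
import threading

class SharedExclusiveLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = set()   # transaction ids holding the lock in shared mode
        self._writer = None     # transaction id holding the lock in exclusive mode

    def acquire_shared(self, txid):
        with self._cond:
            # Wait while another transaction holds the lock exclusively.
            while self._writer is not None and self._writer != txid:
                self._cond.wait()
            self._readers.add(txid)

    def acquire_exclusive(self, txid):
        with self._cond:
            # Wait until no other transaction holds the lock in any mode.
            # If txid already holds it in shared mode, this is an upgrade.
            while (self._writer is not None and self._writer != txid) or \
                  (self._readers - {txid}):
                self._cond.wait()
            self._readers.discard(txid)
            self._writer = txid

    def release(self, txid):
        # Under 2PL this is only called when the transaction commits or aborts.
        with self._cond:
            self._readers.discard(txid)
            if self._writer == txid:
                self._writer = None
            self._cond.notify_all()
```

Note that if two transactions both hold the lock in shared mode and then both try to upgrade, each waits for the other forever, which is exactly the deadlock scenario described next.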
Since so many locks are in use, it can happen quite easily that transaction A is stuck waiting for transaction B to release its lock, and vice versa. This situation is called deadlock.
The database automatically detects deadlocks between transactions and aborts one of them so that the others can make progress.
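One common way to detect such deadlocks (an illustrative sketch, not a description of any specific database) is to maintain a wait-for graph and look for cycles; a transaction on the cycle is then chosen as the victim and aborted:

```python
# Deadlock detection via a wait-for graph: an edge A -> B means transaction A
# is waiting for a lock held by B; a cycle in the graph is a deadlock.
def find_deadlock(waits_for):
    """waits_for maps a transaction id to the set of ids it is waiting on.
    Returns a list of transactions forming a cycle, or None."""
    def visit(tx, path, seen):
        if tx in path:
            return path[path.index(tx):]   # found a cycle
        if tx in seen:
            return None
        seen.add(tx)
        for other in waits_for.get(tx, ()):
            cycle = visit(other, path + [tx], seen)
            if cycle:
                return cycle
        return None

    seen = set()
    for tx in waits_for:
        cycle = visit(tx, [], seen)
        if cycle:
            return cycle
    return None

# A waits for B and B waits for A: a deadlock that must be broken by aborting one.
print(find_deadlock({"A": {"B"}, "B": {"A"}}))  # ['A', 'B']
```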
The big downside of two-phase locking is performance: transaction throughput and the response times of queries are significantly worse under two-phase locking than under weak isolation. This is partly due to the overhead of acquiring and releasing all those locks, but more importantly due to reduced concurrency.