Chena Lee’s Kindle Notes & Highlights for Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Kindle Notes & Highlights

by Chena Lee

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

by Martin Kleppmann

Read between August 2 - December 28, 2020

33%

index simply point to all versions of an object and require an index query to filter out any

33%

For example, PostgreSQL has optimizations for avoiding index updates if different versions of the same object can fit on the same page

33%

they use an append-only/copy-on-write variant that does not overwrite pages of the tree when they are updated, but instead creates a new copy of each modified page.

33%

append-only B-trees, every write transaction (or batch of transactions) creates a new B-tree root, and a particular root is a consistent snapshot of the database at the point in time when it was created.

33%

However, this approach also requires a background process for compaction and garbage collection.
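
The append-only/copy-on-write idea in the highlights above can be sketched in a few lines. This is a minimal illustration only: it uses a plain binary search tree rather than a real B-tree with pages, and the names Node, put, and get are hypothetical, not taken from the book or any database. Each write returns a new root, and every older root remains a consistent snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; a simplified stand-in for a B-tree page."""
    key: str
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def put(root: Optional[Node], key: str, value: int) -> Node:
    """Return a NEW root. Only nodes on the path to `key` are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, put(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, put(root.right, key, value))
    # Same key: copy this node with the new value, reuse both subtrees.
    return Node(key, value, root.left, root.right)


def get(root: Optional[Node], key: str) -> Optional[int]:
    """Look up `key` starting from a particular root (i.e. a snapshot)."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


snapshot_1 = put(put(None, "a", 1), "b", 2)
snapshot_2 = put(snapshot_1, "a", 10)  # a later write creates a new root

old_value = get(snapshot_1, "a")  # still sees the value from before the write
new_value = get(snapshot_2, "a")
```

Unmodified subtrees are shared between roots. Old roots that no reader needs anymore still hold memory, which is why the highlighted passage mentions compaction and garbage collection.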

33%

Snapshot isolation

33%

In Oracle it is called serializable, and in PostgreSQL and MySQL it is called repeatable read

33%

Instead, it defines repeatable read, which looks superficially similar to snapshot isolation.

33%

PostgreSQL and MySQL call their snapshot isolation level repeatable read because it meets the requirements of the standard,

33%

the SQL standard’s definition of isolation levels is flawed — it is ambiguous, imprecise, and not as implementation-independent as a standard should be

33%

ostensibly

33%

nobody really knows what repeatable read means.

33%

Preventing Lost Updates

33%

(a read-modify-write cycle).

33%

Atomic write operations

33%

atomic update operations, which remove the need to implement read-modify-write cycles in application code.

33%

UPDATE counters SET value = value + 1 WHERE key = 'foo';
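
To make the contrast with a read-modify-write cycle concrete, here is a small sketch using Python's built-in sqlite3 module. The counters table and the 'foo' key come from the highlighted statement; the 'bar' row, the read helper, and the simulated interleaving of two clients are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (key TEXT PRIMARY KEY, value INTEGER NOT NULL)")
conn.execute("INSERT INTO counters VALUES ('foo', 0), ('bar', 0)")


def read(key):
    """Return the current value stored for `key`."""
    row = conn.execute("SELECT value FROM counters WHERE key = ?", (key,)).fetchone()
    return row[0]


# Unsafe read-modify-write: two clients both read 0 before either writes.
seen_by_a = read("bar")
seen_by_b = read("bar")
conn.execute("UPDATE counters SET value = ? WHERE key = 'bar'", (seen_by_a + 1,))
conn.execute("UPDATE counters SET value = ? WHERE key = 'bar'", (seen_by_b + 1,))
unsafe_result = read("bar")  # one of the two increments is lost

# Atomic update: the database reads and writes in a single operation.
for _ in range(2):
    conn.execute("UPDATE counters SET value = value + 1 WHERE key = 'foo'")
atomic_result = read("foo")
```

The interleaving here is simulated on a single connection, so it shows the shape of the anomaly rather than real concurrency.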

33%

MongoDB provide atomic operations for making local modifications to a part of a JSON document,

33%

Redis provides atomic operations for modifying data structures such as priority queues.

33%

Not all writes can easily be expressed in terms of atomic operations — for example, updates to a wik...

This highlight has been truncated due to consecutive passage length restrictions.

33%

exclusive lock on the object when it is read so that no other transaction can read it until the update has been applied.

33%

cursor stability

33%

simply force all atomic operations to be executed on a single thread.

33%

object-relational mapping frameworks make it easy to accidentally write code that performs unsafe read-modify-write cycles instead of using atomic operations provided by the database

33%

for the application to explicitly lock objects that are going to be updated.

33%

FOR UPDATE clause indicates that the database should take a lock on all rows returned by this query.
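
The following is an in-process analogy for explicit locking, not database row locking: a per-key threading.Lock plays the role that FOR UPDATE plays for the rows a query returns. The LockingStore class and the "figure-1" key are hypothetical names chosen for this sketch.

```python
import threading
from collections import defaultdict


class LockingStore:
    """Tiny key-value store with one lock per key (illustrative only)."""

    def __init__(self):
        self._data = {}
        self._locks = defaultdict(threading.Lock)
        self._registry_lock = threading.Lock()

    def lock_for(self, key):
        # Guard the lock registry so two threads never create two locks
        # for the same key.
        with self._registry_lock:
            return self._locks[key]

    def read(self, key):
        return self._data.get(key, 0)

    def write(self, key, value):
        self._data[key] = value


store = LockingStore()


def bump(times):
    for _ in range(times):
        # Take the lock BEFORE reading, and hold it until the write is done,
        # so the whole read-modify-write cycle is exclusive.
        with store.lock_for("figure-1"):
            current = store.read("figure-1")
            store.write("figure-1", current + 1)


threads = [threading.Thread(target=bump, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_value = store.read("figure-1")
```

The weakness is the same as with FOR UPDATE: forgetting to take the lock at any one place in the application reintroduces the race.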

34%

An alternative is to allow them to execute in parallel and, if the transaction manager detects a lost update, abort the transaction and force it to retry its read-modify-write cycle.

34%

this approach is that databases can perform this check efficiently in conjunction with snapshot isolation.

34%

Some authors [28, 30] argue that a

34%

database must prevent lost updates in order to qualify as providing snapshot isolation, so MySQL does not provide snapshot isolation under this definition.

34%

databases that don’t provide transactions, you sometimes find an atomic compare-and-set operation

34%

this operation is to avoid lost updates by allowing an update to happen only if the value has not changed since you last read it.

34%

However, if the database allows the WHERE clause to read from an old snapshot, this statement may not prevent lost updates,

34%

Check whether your database’s compare-and-set operation is safe before relying on it.
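
A compare-and-set can be sketched as a conditional UPDATE whose WHERE clause checks the value that was read earlier. The example below uses Python's sqlite3 module; the wiki_pages table, its columns, and the compare_and_set helper are illustrative assumptions. On a single SQLite connection the WHERE clause sees the latest data, so the stale-snapshot caveat from the highlights does not arise here, but it may in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wiki_pages (id INTEGER PRIMARY KEY, content TEXT NOT NULL)")
conn.execute("INSERT INTO wiki_pages VALUES (1, 'old content')")


def compare_and_set(page_id, expected, new):
    """Write `new` only if the page still holds `expected`; report success."""
    cur = conn.execute(
        "UPDATE wiki_pages SET content = ? WHERE id = ? AND content = ?",
        (new, page_id, expected),
    )
    # rowcount is 0 when the content changed since it was read.
    return cur.rowcount == 1


# Two editors both read 'old content' and then try to save.
first_saved = compare_and_set(1, "old content", "edit by A")
second_saved = compare_and_set(1, "old content", "edit by B")

final_content = conn.execute(
    "SELECT content FROM wiki_pages WHERE id = 1"
).fetchone()[0]
```

The second editor's write is rejected instead of silently overwriting the first, so the application can re-read the page and retry.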

34%

Conflict resolution and replication

34%

Locks and compare-and-set operations assume that there is a single up-to-date copy of the data.

34%

databases with multi-leader or leaderless replication usually allow several writes to happen concurrently and replicate them asynchronously,

34%

techniques based on locks or compare-and-set do not appl...

This highlight has been truncated due to consecutive passage length restrictions.

34%

resolve and merge these versions after the fact.

34%

Atomic operations

34%

For example, incrementing a counter or adding an element to a set are commutative operations.

34%

Unfortunately, LWW is the default in many replicated databases.
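
A rough sketch of why commutative operations merge cleanly while last write wins (LWW) loses updates. The helper names, the timestamps, and the starting value of 10 are assumptions made up for this illustration.

```python
from itertools import permutations


def apply_increments(start, deltas):
    """Apply a sequence of increments to a counter."""
    total = start
    for delta in deltas:
        total += delta
    return total


# Commutative: every ordering of the same increments gives the same result,
# so replicas can apply them in whatever order they arrive.
outcomes = {apply_increments(0, order) for order in permutations([1, 5, 2])}


def lww_merge(write_a, write_b):
    """Keep whichever (timestamp, value) pair has the later timestamp."""
    return max(write_a, write_b, key=lambda write: write[0])


# Two replicas each read 10 and concurrently write back 10 + 1.
write_a = (1, 10 + 1)
write_b = (2, 10 + 1)
lww_value = lww_merge(write_a, write_b)[1]  # one increment is discarded

# Replicating the operations (two increments) instead of the resulting values.
merged_value = apply_increments(10, [1, 1])
```

LWW keeps a single value of 11, whereas merging the two increments gives 12.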

34%

Write Skew and Phantoms

34%

Your requirement of having at least one doctor on call has been violated.

34%

This anomaly is called write skew

This is like vinyaas like application?

34%

write skew as a generalization of the lost update problem. Write skew can occur if two transactions read the same objects, and then update some of those objects (different transactions may update different objects). In the special case where different transactions update the same object, you get a dirty write or lost update anomaly
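
The on-call example can be simulated in a few lines. This is a toy model under stated assumptions: plain dictionaries stand in for database snapshots, and the doctor names and the request_leave function are illustrative. Each transaction checks the invariant against its own snapshot, then updates a different object.

```python
on_call = {"alice": True, "bob": True}


def request_leave(snapshot, doctor):
    """Decide from a snapshot; return the write to apply, or None to refuse."""
    currently_on_call = sum(snapshot.values())
    if currently_on_call >= 2:
        return (doctor, False)  # looks safe: someone else is still on call
    return None


# Both transactions take their snapshot before either one commits.
snapshot_a = dict(on_call)
snapshot_b = dict(on_call)

write_a = request_leave(snapshot_a, "alice")
write_b = request_leave(snapshot_b, "bob")

# The writes touch different keys, so neither overwrites the other and
# lost-update detection has nothing to catch.
for write in (write_a, write_b):
    if write is not None:
        doctor, status = write
        on_call[doctor] = status

remaining = sum(on_call.values())
```

Both checks passed individually, yet nobody is left on call. Had the two transactions run one after the other, the second check would have seen only one doctor on call and refused.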

34%

Automatically preventing write skew requires true serializable isolation

34%

Some databases allow you to configure constraints, which are then enforced by the database (e.g., uniqueness, foreign key constraints, or restrictions on a particular value).

34%

you would need a constraint that involves multiple objects. Most databases do not have built-in support for such constraints, but you may be able to implement them with triggers or materialized views, depending on the database

34%

the second-best option in this case is probably to explicitly lock the rows that the transaction depends on.
