Chena Lee’s Kindle Notes & Highlights for Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

by Martin Kleppmann

Read between August 2 - December 28, 2020

39%

TCP performs flow control

39%

backpressure), in which a node limits its own rate of sending in order to avoid overloading

39%

additional queueing at the sender before the
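
A minimal sketch of the backpressure idea in the highlights above: the sender keeps a bounded buffer of unacknowledged messages and refuses to send more once it is full. This is an illustration only, not how TCP implements flow control; the class `BoundedSender` and its methods are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration of backpressure: the sender limits itself
// to a fixed number of unacknowledged messages.
class BoundedSender {
    private final BlockingQueue<String> inFlight;

    BoundedSender(int capacity) {
        inFlight = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false when the buffer is full, telling the caller to slow down. */
    boolean trySend(String message) {
        return inFlight.offer(message);
    }

    /** Frees one slot once the receiver has acknowledged the oldest message. */
    String acknowledgeOldest() {
        return inFlight.poll();
    }
}
```

With a capacity of 2, a third `trySend` call returns `false` until `acknowledgeOldest` has freed a slot.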

39%

TCP considers a packet to be lost if it is not acknowledged within some timeout

39%

and lost packets are automatically retransmitted.

39%

you can only choose timeouts experimentally:

39%

rather than using configured constant timeouts, systems can continually measure response times and their variability (jitter),

39%

TCP retransmission timeouts also work similarly
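
A minimal sketch of a timeout that adapts to measured response times and their jitter, as the highlights above describe. It assumes an exponentially weighted estimator with the constants commonly cited for TCP's retransmission timer (RFC 6298): weights of 1/8 and 1/4, and a margin of four times the jitter. The class `AdaptiveTimeout` is a hypothetical name, not something from the book.

```java
// Hypothetical sketch: timeout = smoothed response time + 4 * smoothed jitter.
class AdaptiveTimeout {
    private static final double ALPHA = 0.125; // weight of a new sample in the mean
    private static final double BETA = 0.25;   // weight of a new sample in the jitter

    private double smoothedMs;  // running estimate of the response time
    private double jitterMs;    // running estimate of its variability
    private boolean initialized = false;

    /** Feeds in one measured response time, in milliseconds. */
    void observe(double sampleMs) {
        if (!initialized) {
            smoothedMs = sampleMs;
            jitterMs = sampleMs / 2.0;
            initialized = true;
            return;
        }
        // Update the jitter first, using the previous smoothed value.
        jitterMs = (1 - BETA) * jitterMs + BETA * Math.abs(smoothedMs - sampleMs);
        smoothedMs = (1 - ALPHA) * smoothedMs + ALPHA * sampleMs;
    }

    /** Current timeout: the mean plus a safety margin proportional to the jitter. */
    double timeoutMs() {
        if (!initialized) {
            throw new IllegalStateException("no samples observed yet");
        }
        return smoothedMs + 4.0 * jitterMs;
    }
}
```

Steady samples shrink the jitter term and tighten the timeout; a single slow response widens it again.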

39%

network is synchronous:

39%

queueing, because the 16 bits of space for the call have already been reserved in the next hop of the network.

39%

bounded delay.

39%

You can give TCP a variable-sized block of data

39%

Why do datacenter networks and the internet use packet switching?

39%

that they are optimized for bursty traffic.

39%

On the other hand, requesting a web page, sending an email, or transferring a file doesn’t have any particular bandwidth requirement

39%

If you guess too low, the transfer is unnecessarily slow, leaving network capacity unused. If you guess too high, the circuit cannot be set up

39%

build hybrid networks

39%

ATM.

40%

Variable delays in networks are not a law of nature, but simply the result of a cost/benefit trade-off.

40%

However, such quality of service is currently not enabled in multi-tenant datacenters and public clouds, or when communicating via the internet.

40%

variable delays in the network,

40%

This fact sometimes makes it difficult to determine the order in which things happened when multiple machines are involved.

40%

which is an actual hardware device: usually a quartz crystal oscillator.

40%

so each machine has its own notion of time, which may be slightly faster or slower than on other machines.

40%

possible to synchronize clocks to some degree: the most commonly used mechanism is the Ne...

This highlight has been truncated due to consecutive passage length restrictions.

40%

Modern computers have at least two different kinds of clocks: a time-of-day clock and a monotonic clock.

40%

System.currentTimeMillis() in Java return

40%

calendar, not counting leap seconds.

40%

Time-of-day clocks are usually synchronized with NTP, which means that a timestamp from one machine (ideally) means the same as a timestamp on another machine.

40%

These jumps, as well as similar jumps caused by leap seconds, make time-of-day clocks unsuitable for measuring elapsed time

40%

A monotonic clock is suitable for measuring a duration (time interval),

40%

System.nanoTime() in Java are monotonic clocks,

40%

The name comes from the fact that they are guaranteed to always move forward
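
A short sketch of measuring a duration with the monotonic clock. `System.nanoTime()` is the Java call named in the highlights; the class `ElapsedTimer` and the method `timeNanos` are hypothetical names used for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: time a task with the monotonic clock.
class ElapsedTimer {
    /** Runs the task and returns how long it took, in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();   // monotonic: only differences are meaningful
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long nanos = timeNanos(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed ms: " + TimeUnit.NANOSECONDS.toMillis(nanos));

        // For contrast, the time-of-day clock gives a point in time.
        // Subtracting two of these readings can mislead if the clock jumps in between.
        System.out.println("wall clock ms since epoch: " + System.currentTimeMillis());
    }
}
```

The absolute value returned by `System.nanoTime()` carries no meaning on its own, so the sketch only ever uses the difference between two readings.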

40%

On a server with multiple CPU sockets, there may be a separate timer per CPU,

40%

Operating systems compensate for any discrepancy and try to present a monotonic view of the clock to application threads,

40%

NTP may adjust the frequency at which the monotonic clock moves forward (this is known as slewing the clock)

40%

Clock drift varies depending on the temperature of the machine.

40%

Anecdotal

40%

NTP synchronization can only be as good as the network delay, so there is a limit to its accuracy when you’re on a congested network with variable packet delays.

40%

NTP clients are quite robust, because they query several servers and ignore outliers.
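
A simplified sketch of "query several servers and ignore outliers": take the median of the reported offsets, drop any reading further than a tolerance from it, and average the rest. This is an assumption made for illustration and not the selection algorithm real NTP clients use; `OffsetFilter`, `robustOffsetMs`, and `toleranceMs` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical outlier filter over clock offsets reported by several servers.
class OffsetFilter {
    static double robustOffsetMs(double[] offsetsMs, double toleranceMs) {
        if (offsetsMs.length == 0) {
            throw new IllegalArgumentException("need at least one offset");
        }
        double[] sorted = offsetsMs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

        // Average only the readings that agree with the median.
        double sum = 0;
        int kept = 0;
        for (double offset : sorted) {
            if (Math.abs(offset - median) <= toleranceMs) {
                sum += offset;
                kept++;
            }
        }
        return kept > 0 ? sum / kept : median;
    }
}
```

For offsets of 1, 2, 3, and 500 ms with a 5 ms tolerance, the 500 ms reading is discarded and the result is 2 ms.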

40%

The fact that leap seconds have crashed many large systems [38, 46] shows how easy it is for incorrect assumptions about clocks to sneak into a system.

40%

In virtual machines,

40%

When a CPU core is shared between virtual machines, each VM is paused for tens of milliseconds while another VM is running.

40%

a day may not have exactly 86,400 seconds,

40%

time-of-day clocks may

40%

move backward ...

This highlight has been truncated due to consecutive passage length restrictions.

40%

Thus, if you use software that requires synchronized clocks, it is essential that you also carefully monitor the clock offsets between all the machines.
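
A minimal sketch of such monitoring, assuming the offsets have already been measured by some other means and are handed in as a map from node name to offset in milliseconds. `ClockOffsetMonitor`, `nodesOutOfSync`, and the threshold parameter are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical check: which nodes have drifted beyond an allowed offset?
class ClockOffsetMonitor {
    static List<String> nodesOutOfSync(Map<String, Double> offsetMsByNode, double maxOffsetMs) {
        List<String> outOfSync = new ArrayList<>();
        // TreeMap gives a stable, alphabetical reporting order.
        for (Map.Entry<String, Double> entry : new TreeMap<>(offsetMsByNode).entrySet()) {
            if (Math.abs(entry.getValue()) > maxOffsetMs) {
                outOfSync.add(entry.getKey());
            }
        }
        return outOfSync;
    }
}
```

What to do with a flagged node (alert, or take it out of service) is a policy decision that the sketch leaves open.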

40%

Database writes can mysteriously disappear: a node with a lagging clock is unable to overwrite values previously written by a node with a fast clock until the clock skew between the nodes has elapsed
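
A small sketch of how such a write can vanish, assuming a store that resolves conflicts by keeping the value with the highest timestamp (last write wins) and trusts each writer's own clock. `LwwRegister` and the timestamps are hypothetical and chosen only for illustration.

```java
// Hypothetical last-write-wins register that trusts the writer's clock.
class LwwRegister {
    private String value;
    private long timestampMs = Long.MIN_VALUE;

    /** Applies the write only if its timestamp is newer; returns whether it was applied. */
    boolean write(String newValue, long writerClockMs) {
        if (writerClockMs <= timestampMs) {
            return false; // silently dropped: the stored value looks "newer"
        }
        value = newValue;
        timestampMs = writerClockMs;
        return true;
    }

    String read() {
        return value;
    }

    public static void main(String[] args) {
        LwwRegister register = new LwwRegister();
        // Node A's clock runs 50 ms fast: at real time 1000 it stamps the write 1050.
        register.write("from A", 1050);
        // Node B's clock is accurate: at real time 1010 it stamps the write 1010.
        boolean applied = register.write("from B", 1010);
        System.out.println("B's later write applied? " + applied); // false
        System.out.println("stored value: " + register.read());    // from A
    }
}
```

Node B's write happened later in real time but carries the smaller timestamp, so it is discarded. B's writes succeed again only once its clock passes 1050, that is, after the 50 ms skew has elapsed.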

40%

Even with tightly NTP-synchronized clocks, you could send a packet at timestamp 100 ms (according to the sender’s clock) and have it arrive at timestamp 99 ms (according to the recipient’s clock)

40%

Probably not, because NTP’s synchronization accuracy is itself limited by the network round-trip time,
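
A worked sketch of why the round-trip time limits accuracy, using the standard NTP offset and delay formulas over four timestamps: t0 (client sends), t1 (server receives), t2 (server replies), t3 (client receives). The class `NtpEstimate` and the example numbers are hypothetical.

```java
// Hypothetical helper around the standard NTP offset and delay formulas.
class NtpEstimate {
    /** Estimated server-minus-client clock offset; exact only if both directions are equally slow. */
    static double offsetMs(double t0, double t1, double t2, double t3) {
        return ((t1 - t0) + (t2 - t3)) / 2.0;
    }

    /** Network round-trip time, excluding the server's processing time. */
    static double roundTripMs(double t0, double t1, double t2, double t3) {
        return (t3 - t0) - (t2 - t1);
    }

    public static void main(String[] args) {
        // Assumed scenario: the server clock is really 10 ms ahead of the client,
        // and the server takes 1 ms to reply.
        // Symmetric path (20 ms each way): the estimate is exact.
        System.out.println(offsetMs(0, 30, 31, 41));    // 10.0
        // Asymmetric path (30 ms out, 10 ms back): the estimate is off by 10 ms.
        System.out.println(offsetMs(0, 40, 41, 41));    // 20.0
        System.out.println(roundTripMs(0, 40, 41, 41)); // 40.0
    }
}
```

The client cannot tell the two cases apart from the timestamps alone, so the error in the offset can be as large as half the round-trip time.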
