Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
32%
abandoned multi-object transactions because they are difficult to implement across partitions,
32%
In a relational data model, a row in one table often has a foreign key reference to a row in another table.
32%
Multi-object transactions allow you to ensure that these references remain valid:
32%
document data model,
32%
no multi-object transactions are needed when
32%
However, document databases lacking join functionality also encourage denormalization
32%
In particular, datastores with leaderless replication (see “Leaderless Replication”) work much more on a “best effort” basis, which could be summarized as “the database will do as much as it can, and if it runs into an error, it won’t undo something it has already done”
32%
the happy path rather than the intricacies of error handling. For example, popular object-relational mapping (ORM) frameworks such as Rails’s ActiveRecord and Django don’t retry aborted transactions
32%
This is a shame, because the whole point of aborts is to enable safe retries.
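A minimal sketch of what retrying an aborted transaction could look like at the application level. The `TransactionAborted` exception and the `run_with_retries` helper are hypothetical names chosen for illustration; they are not part of ActiveRecord, Django, or any database driver. The backoff is an assumption added so that retries do not pile further load onto a struggling database.

```python
import random
import time


class TransactionAborted(Exception):
    """Hypothetical error raised when the database aborts a transaction."""


def run_with_retries(transaction, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run `transaction` (a zero-argument callable), retrying only on aborts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransactionAborted:
            if attempt == max_attempts:
                raise  # give up and surface the abort to the caller
            # Exponential backoff with random jitter between attempts.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            sleep(delay)
```

Only the abort error is retried here; any other exception propagates immediately, since retrying a permanent error (such as a constraint violation) would be pointless.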
32%
If the transaction actually succeeded, but the network failed while the server tried to acknowledge the successful commit to the client (so the client thinks it failed), then retrying the transaction causes it to be performed twice — unless you have an additional application-level deduplication mechanism in place.
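One way such a deduplication mechanism might work is for the client to attach a unique request ID to each logical operation, and for the server to record which IDs it has already applied. The sketch below uses a toy in-memory class (`InMemoryStore`, `deposit`, and `request_id` are all hypothetical names); in a real database the check and the write would have to happen inside the same transaction.

```python
class InMemoryStore:
    """Toy stand-in for a database that remembers applied request IDs."""

    def __init__(self):
        self.balance = 0
        self.applied = {}  # request_id -> result of the first successful attempt

    def deposit(self, request_id, amount):
        # If this request was already applied, return the recorded result
        # instead of performing the deposit a second time.
        if request_id in self.applied:
            return self.applied[request_id]
        self.balance += amount
        self.applied[request_id] = self.balance
        return self.balance


store = InMemoryStore()
first = store.deposit("req-1", 100)   # original attempt; suppose the acknowledgment is lost
second = store.deposit("req-1", 100)  # client retries with the same request ID
```

Because the retry reuses `"req-1"`, the balance ends at 100 rather than 200.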
32%
If the error is due to overload, retrying the transaction will ma...
32%
If the transaction also has side effects outside of the database, those side effects may happen even if the transaction is aborted. For example, if you’re sending an email, you wouldn’t want to send the email again every time you retry the transaction.
32%
If the client process fails while retrying, any data it was trying to write to the database is lost.
32%
In theory, isolation should make your life easier by letting you pretend that no concurrency is happening:
32%
serializable isolation means that the database guarantees that transactions have the same effect as if they ran serially (i.e., one at a time, without any concurrency).
32%
It’s therefore common for systems to use weaker levels of isolation, which protect against some concurrency issues, but not all.
32%
Even many popular relational database systems (which are usually considered “ACID”) use weak isolation,
32%
we need to develop a good understanding of the kinds of concurrency problems that exist, and how to prevent them.
32%
read committed.
32%
(no dirty reads).
32%
you will only overwrite data that has been committed (no dirty writes).
32%
read committed does not prevent the race condition between two counter increments in Figure 7-1. In this case, the second write happens after the first transaction has committed, so it’s not a dirty write.
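The interleaving described above can be replayed step by step. This is an illustrative simulation in plain Python, not a real database; the starting value of 42 and the variable names are assumptions made for the example.

```python
# Committed state visible to every transaction under read committed.
committed = {"counter": 42}

# Both transactions read the committed value before either one writes.
read_by_tx1 = committed["counter"]
read_by_tx2 = committed["counter"]

# Transaction 1 writes its increment and commits.
committed["counter"] = read_by_tx1 + 1

# Transaction 2 writes after transaction 1 has committed. This is not a
# dirty write, but the increment is computed from a stale read.
committed["counter"] = read_by_tx2 + 1

final_value = committed["counter"]
```

Two increments ran, yet `final_value` is 43 rather than 44: one update has been lost even though neither dirty reads nor dirty writes occurred.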
32%
Read committed is a very popular isolation level. It is the default setting in Oracle 11g, PostgreSQL, SQL Server 2012, MemSQL, and many other databases
32%
How do we prevent dirty reads? One option would be to use the same lock, and to require any transaction that wants to read an object to briefly acquire the lock and then release it again immediately after reading.
32%
because one long-running write transaction can force many other transactions to wait until the long-running transaction has completed,
32%
While the transaction is ongoing, any other transactions that read the object are simply given the old value.
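The approach of handing readers the old value can be sketched as an object that keeps both the committed value and the uncommitted value set by the transaction holding the write lock. The class and method names below are hypothetical, and the sketch ignores aborts and lock waiting for brevity.

```python
class ReadCommittedObject:
    """Toy object that serves the old value to readers while a write is pending."""

    def __init__(self, value):
        self.committed = value  # last committed value
        self.pending = None     # uncommitted value, if any
        self.writer = None      # ID of the transaction holding the write lock

    def write(self, txid, value):
        if self.writer not in (None, txid):
            raise RuntimeError("another transaction holds the write lock")
        self.writer = txid
        self.pending = value

    def read(self, txid):
        # The writing transaction sees its own uncommitted write;
        # every other transaction is given the old committed value.
        if self.writer == txid:
            return self.pending
        return self.committed

    def commit(self, txid):
        if self.writer == txid:
            self.committed = self.pending
            self.pending, self.writer = None, None
```

Readers never wait here: a long-running writer delays other writers to the same object, but not transactions that only read it.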
32%
This anomaly is called read skew, and it is an example of a nonrepeatable read:
32%
Read skew is considered acceptable under read committed isolation:
33%
During the time that the backup process is running, writes will continue to be made to the database.
33%
If you need to restore from such a backup, the inconsistencies (such as disappearing money) become permanent.
33%
Snapshot isolation
33%
The idea is that each transaction reads from a consistent snapshot of the database — that is, the transaction sees all the data that was committed in the database at the start of the transaction.
33%
Snapshot isolation is a boon for long-running, read-only queries such as backups and analytics.
33%
From a performance point of view, a key principle of snapshot isolation is that readers never block writers, and writers never block readers.
33%
The database must potentially keep several different committed versions of an object, because various in-progress transactions may need to see the state of the database at different points in time.
33%
multi-version concurrency control (MVCC).
33%
read committed is...
33%
be sufficient to keep two versions ...
33%
that support snapshot isolation typically use MVCC for their read committed isolation level as well.
33%
read committed uses a separate snapshot for each query, while snapshot isolation uses the same snapshot for an entire transaction.
33%
MVCC-based snapshot isolation is implemented in PostgreSQL
33%
When a transaction is started, it is given a unique, always-increasing transaction ID (txid).
33%
Each row in a table has a created_by field, containing the ID of the transaction that inserted this row into the table.
33%
each row has a deleted_by field, which is initially empty.
33%
a transaction deletes a row, the row isn’t actually deleted from the database, but it is marked for deletion by setti...
33%
data, a garbage collection process in the database removes any rows marked for deletion
33%
delete and a create.
33%
The accounts table now actually contains two rows for account 2: a row with a balance of $500 which was marked as deleted by transaction 13, and a row with a balance of $400 which was created by transaction 13.
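The two rows for account 2 can be used to sketch how a reader might decide which version belongs to its snapshot. This is a simplified visibility check, not PostgreSQL's actual implementation: it assumes no transaction has aborted, and the `RowVersion` type, the `is_visible` function, and the `created_by=5` value on the old row are hypothetical choices for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    account_id: int
    balance: int
    created_by: int                   # txid that inserted this version
    deleted_by: Optional[int] = None  # txid that marked it deleted, if any


def is_visible(row, reader_txid, in_progress_at_start):
    """Return True if `row` is part of the snapshot seen by `reader_txid`."""

    def counts(txid):
        # A writer's effects count if it is the reader itself, or if it has a
        # lower txid and was not still in progress when the reader started.
        return txid == reader_txid or (
            txid < reader_txid and txid not in in_progress_at_start
        )

    created = counts(row.created_by)
    deleted = row.deleted_by is not None and counts(row.deleted_by)
    return created and not deleted


old_row = RowVersion(account_id=2, balance=500, created_by=5, deleted_by=13)
new_row = RowVersion(account_id=2, balance=400, created_by=13)
```

A reader with txid 12 sees only the $500 row, because transaction 13 started later and its delete and create are both ignored. A reader with txid 14 that started after transaction 13 committed sees only the $400 row.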
33%
Indexes and snapshot isolation