Kindle Notes & Highlights
Read between August 2 and December 28, 2020
By assigning more partitions to nodes that are more powerful, you can force those nodes to take a greater share of the load.
Although in principle it’s possible to split and merge partitions (see the next section), a fixed number of partitions is operationally simpler, and so many fixed-partition databases choose not to implement partition splitting.
Thus, the number of partitions configured at the outset is the maximum number of nodes you can have, so you need to choose it high enough to accommodate future growth.
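As a rough illustration of this fixed-partition scheme (the node names, partition count, and hash choice are made up, not any particular database’s API):

    import hashlib

    NUM_PARTITIONS = 1024  # fixed at the outset; also the ceiling on node count

    def partition_for_key(key: str) -> int:
        # Hash the key and take it modulo the fixed partition count.
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return digest % NUM_PARTITIONS

    # The partition-to-node assignment is just a lookup table; rebalancing
    # rewrites entries here, moving entire partitions between nodes.
    assignment = {p: f"node{p % 3}" for p in range(NUM_PARTITIONS)}  # 3 nodes

    def node_for_key(key: str) -> str:
        return assignment[partition_for_key(key)]

    print(node_for_key("foo"))

Because a key always maps to the same partition, only the assignment table changes during rebalancing, never the key-to-partition mapping.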
However, each partition also has management overhead, so it’s counterproductive to choose too high a number.
If partitions are very large, rebalancing and recovery from node failures become expensive. But if partitions are too small, they incur too much overhead.
For databases that use key range partitioning (see “Partitioning by Key Range”), a fixed number of partitions with fixed boundaries would be very inconvenient: if the boundaries were chosen badly, all of the data could end up in one partition while the others sit empty. For this reason, key-range-partitioned databases create partitions dynamically.
When a partition grows to exceed a configured size (on HBase, the default is 10 GB), it is split into two partitions so that approximately half of the data ends up on each side of the split [26]. Conversely, if lots of data is deleted and a partition shrinks below some threshold, it can be merged with an adjacent partition.
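A toy sketch of the splitting half of that process (a tiny in-memory threshold stands in for 10 GB):

    SPLIT_THRESHOLD = 4  # toy threshold; HBase’s default is 10 GB of data

    class Partition:
        """A key-range partition modeled as a sorted list of keys."""
        def __init__(self, keys):
            self.keys = sorted(keys)

        def maybe_split(self):
            # Once the partition exceeds the threshold, split it so that
            # approximately half of the data ends up on each side.
            if len(self.keys) <= SPLIT_THRESHOLD:
                return [self]
            mid = len(self.keys) // 2
            return [Partition(self.keys[:mid]), Partition(self.keys[mid:])]

    left, right = Partition(["a", "c", "f", "k", "q", "z"]).maybe_split()
    print(left.keys, right.keys)  # ['a', 'c', 'f'] ['k', 'q', 'z']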
An advantage of dynamic partitioning is that the number of partitions adapts to the total data volume.
However, a caveat is that an empty database starts off with a single partition, since there is no a priori information about where to draw the partition boundaries. To mitigate this, HBase and MongoDB allow an initial set of partitions to be configured on an empty database (this is called pre-splitting).
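Conceptually, pre-splitting amounts to handing the database a list of initial boundary keys up front; a minimal sketch (the boundary values are made up):

    import bisect

    # Hypothetical initial boundaries, yielding four key ranges:
    # (-inf, "f"), ["f", "m"), ["m", "t"), ["t", +inf)
    initial_boundaries = ["f", "m", "t"]

    def partition_index(key: str) -> int:
        # Find which range the key falls into via binary search.
        return bisect.bisect_right(initial_boundaries, key)

    print(partition_index("apple"), partition_index("zebra"))  # 0 3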
With dynamic partitioning, the number of partitions is proportional to the size of the dataset, since splitting and merging keep each partition’s size between a fixed minimum and maximum. On the other hand, with a fixed number of partitions, the size of each partition is proportional to the size of the dataset. A third option is to make the number of partitions proportional to the number of nodes. In this case, the size of each partition grows proportionally to the dataset size while the number of nodes remains unchanged, but when you increase the number of nodes, the partitions become smaller again.
When a new node joins the cluster, it randomly chooses a fixed number of existing partitions to split, and then takes ownership of one half of each of those split partitions while leaving the other half of each partition in place. The randomization can produce unfair splits, but when averaged over a larger number of partitions, the new node ends up taking a fair share of the load from the existing nodes.
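A toy version of that join step (the field names and splits-per-join count are made up for illustration):

    import random

    SPLITS_PER_JOIN = 2  # how many existing partitions a joining node splits

    def node_joins(partitions, new_node):
        # The new node picks random existing partitions, splits each one, and
        # takes ownership of one half; the other half stays with its old owner.
        for part in random.sample(partitions, SPLITS_PER_JOIN):
            mid = len(part["keys"]) // 2
            partitions.append({"owner": new_node, "keys": part["keys"][mid:]})
            part["keys"] = part["keys"][:mid]

    partitions = [{"owner": "node0", "keys": list("abcdefgh")},
                  {"owner": "node0", "keys": list("ijklmnop")},
                  {"owner": "node1", "keys": list("qrstuvwx")}]
    node_joins(partitions, "node2")
    for p in partitions:
        print(p["owner"], "".join(p["keys"]))  # which partitions got split varies per run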
If it is not done carefully, this process can overload the network or the nodes and harm the performance of other requests while the rebalancing is in progress.
As partitions are rebalanced, the assignment of partitions to nodes changes. Somebody needs to stay on top of those changes in order to answer the question: if I want to read or write the key “foo”, which IP address and port number do I need to connect to?
This is an instance of a more general problem called service discovery, which isn’t limited to databases.
This routing tier does not itself handle any requests; it only acts as a partition-aware load balancer.
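A minimal sketch of such a partition-aware lookup (the addresses and the key-to-partition rule are made up):

    import hashlib

    # Hypothetical routing table, kept up to date by whatever coordinates
    # rebalancing (for example ZooKeeper); addresses here are invented.
    routing_table = {0: ("10.0.0.1", 6379), 1: ("10.0.0.2", 6379), 2: ("10.0.0.3", 6379)}

    def route(key):
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        partition = digest % len(routing_table)  # simplistic key-to-partition rule
        return routing_table[partition]          # forward the request to this address

    print(route("foo"))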
Cassandra and Riak take a different approach: they use a gossip protocol among the nodes to disseminate any changes in cluster state.
This model puts more complexity in the database nodes but avoids the dependency on an external coordination service such as ZooKeeper.
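As a sketch of the gossip idea (a toy version-stamped state map, not Cassandra’s actual protocol): each node periodically exchanges its view of cluster state with a random peer, and both keep the newest version of each entry.

    import random

    def gossip_round(nodes):
        for node in nodes:
            peer = random.choice([n for n in nodes if n is not node])
            for key in set(node["state"]) | set(peer["state"]):
                a = node["state"].get(key, (0, ""))
                b = peer["state"].get(key, (0, ""))
                newest = max(a, b)  # higher version number wins
                node["state"][key] = peer["state"][key] = newest

    nodes = [{"state": {"partition-7": (1, "node0")}},
             {"state": {}},
             {"state": {"partition-7": (2, "node2")}}]  # newer assignment
    for _ in range(3):
        gossip_round(nodes)
    print([n["state"]["partition-7"] for n in nodes])  # all converge on (2, 'node2')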
However, massively parallel processing (MPP) relational database products, often used for analytics, are much more sophisticated in the types of queries they support.
For decades, transactions have been the mechanism of choice for simplifying these issues: crashes, interrupted operations, and concurrent access to the same data.
A transaction is a way for an application to group several reads and writes together into a logical unit.
Either the entire transaction succeeds (commit) or it fails (abort, rollback); if it fails, the application can safely retry.
However, sometimes there are advantages to weakening transactional guarantees or abandoning them entirely (for example, to achieve higher performance or higher availability).
Transactions were the main casualty of the NoSQL movement: many of this new generation of databases abandoned them entirely, or redefined the word to describe a much weaker set of guarantees than had previously been understood.
There emerged a popular belief that transactions were the antithesis of scalability, and that any large-scale system would have to abandon them to maintain good performance and high availability.
The safety guarantees provided by transactions are often described by the acronym ACID, which stands for Atomicity, Consistency, Isolation, and Durability.
(Systems that do not meet the ACID criteria are sometimes called BASE, which stands for Basically Available, Soft state, and Eventual consistency.)
Without atomicity, if an error occurs partway through making multiple changes, it’s difficult to know which changes have taken effect and which haven’t. The application could try again, but that risks making the same change twice, leading to duplicate or incorrect data.
In the context of ACID, consistency refers to an application-specific notion of the database being in a “good state.”
The idea of ACID consistency is that you have certain statements about your data (invariants) that must always be true.
However, this idea of consistency depends on the application’s notion of invariants, and it’s the application’s responsibility to define its transactions correctly so that they preserve consistency. If you write bad data that violates your invariants, the database can’t stop you.
Atomicity, isolation, and durability are properties of the database, whereas consistency (in the ACID sense) is a property of the application.
Isolation in the sense of ACID means that concurrently executing transactions are isolated from each other: they cannot step on each other’s toes.
Classic database textbooks formalize isolation as serializability, which means that each transaction can pretend that it is the only transaction running on the entire database. The database ensures that when the transactions have committed, the result is the same as if they had run serially (one after another), even though in reality they may have run concurrently.
However, in practice, serializable isolation is rarely used, because it carries a performance penalty.
In Oracle there is an isolation level called “serializable,” but it actually implements something called snapshot isolation, which is a weaker guarantee than serializability.
In order to provide a durability guarantee, a database must wait until these writes or replications are complete before reporting a transaction as successfully committed.
One study of SSDs found that between 30% and 80% of drives develop at least one bad block during the first four years of operation. When a worn-out SSD (that has gone through many write/erase cycles) is disconnected from power, it can start losing data within a timescale of weeks to months, depending on the temperature.
In practice, there is no one technique that can provide absolute guarantees. There are only various risk-reduction techniques, including writing to disk, replicating to remote machines, and backups — and they can and should be used together.
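For instance, the writing-to-disk layer by itself looks roughly like this (the file name is arbitrary):

    import os

    def durable_append(path: str, record: bytes) -> None:
        # Append a record and force it to stable storage before acknowledging.
        with open(path, "ab") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to flush its buffers to disk
        # A real system would layer replication to remote machines and
        # periodic backups on top of this; no single technique is absolute.

    durable_append("wal.log", b"key=foo value=bar\n")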
Multi-object transactions are often needed if several pieces of data need to be kept in sync.
In relational databases, that is typically done based on the client’s TCP connection to the database server: on any particular connection, everything between a BEGIN TRANSACTION and a COMMIT statement is considered to be part of the same transaction.
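A runnable illustration using SQLite from Python (the account names and the transfer scenario are made up for the example; Python’s sqlite3 module opens the transaction implicitly before the first write):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
    conn.commit()

    try:
        # Both writes form one logical unit on this connection: either both
        # take effect (commit) or neither does (rollback).
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'bob'")
        conn.commit()
    except Exception:
        conn.rollback()  # abort: no half-finished transfer becomes visible
        raise

    print(conn.execute("SELECT name, balance FROM accounts").fetchall())
    # [('alice', 50), ('bob', 50)]

Here the invariant that money is neither created nor destroyed holds at commit time because the two updates are applied atomically.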
Many nonrelational databases, on the other hand, don’t have such a way of grouping operations together.
Many of them do, however, provide more complex atomic operations, such as an increment operation (which removes the need for a read-modify-write cycle) and a compare-and-set operation (which allows a write only if the value has not been concurrently changed by someone else).
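A toy single-object register showing both operations (the class and its locking strategy are illustrative, not any database’s implementation):

    import threading

    class Register:
        """Toy single-object store with atomic increment and compare-and-set."""
        def __init__(self, value=0):
            self._value = value
            self._lock = threading.Lock()

        def increment(self, delta=1):
            # Atomic increment: callers need no read-modify-write cycle.
            with self._lock:
                self._value += delta
                return self._value

        def compare_and_set(self, expected, new):
            # Write only if nobody changed the value since we last read it.
            with self._lock:
                if self._value != expected:
                    return False
                self._value = new
                return True

    r = Register()
    r.increment()
    print(r.compare_and_set(expected=1, new=10))  # True
    print(r.compare_and_set(expected=1, new=20))  # False: value is now 10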