Kindle Notes & Highlights
Read between August 2 and December 28, 2020
SQL is a declarative query language, whereas IMS and CODASYL queried the database using imperative code.
An imperative language tells the computer to perform certain operations in a certain order.
In a declarative query language, like SQL or relational algebra, you just specify the pattern of the data you want — what conditions the results must meet, and how you want the data to be transformed (e.g., sorted, grouped, and aggregated) — but not how to achieve that goal.
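A small sketch of the contrast, assuming a hypothetical in-memory `animals` dataset (the imperative version fixes the iteration order; the declarative-style version only states the condition, leaving the "how" to the engine):

```python
# Hypothetical dataset for illustration.
animals = [
    {"name": "Great white", "family": "Sharks"},
    {"name": "Dolphin", "family": "Mammals"},
    {"name": "Hammerhead", "family": "Sharks"},
]

def get_sharks_imperative(animals):
    # Imperative: spell out *how* to walk the data, step by step,
    # in a fixed order.
    sharks = []
    for animal in animals:
        if animal["family"] == "Sharks":
            sharks.append(animal["name"])
    return sharks

def get_sharks_declarative(animals):
    # Declarative-style: state *what* you want; the comprehension
    # (standing in for a query optimizer) decides how to produce it.
    return [a["name"] for a in animals if a["family"] == "Sharks"]
```

The declarative form is what lets the database reorder records behind the scenes without breaking queries, since no query depends on physical order.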
If the database wants to reclaim unused disk space behind the scenes, it might need to move records around, changing the order in which the animals appear. Can the database do that safely, without breaking queries?
MapReduce is neither a declarative query language nor a fully imperative query API, but somewhere in between: the logic of the query is expressed with snippets of code, which are called repeatedly by the processing framework. It is based on the map (also known as collect) and reduce (also known as fold or inject) functions that exist in many functional programming languages.
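A minimal MapReduce-style sketch using Python's built-in `functools.reduce` (the fold/inject function mentioned above); the data and grouping are hypothetical:

```python
from functools import reduce

# Hypothetical observations: (month, number_of_animals_seen).
observations = [("1995-12", 1), ("1995-12", 2), ("1996-01", 3)]

# Map phase: emit (key, value) pairs (here the records already are pairs).
mapped = [(month, num) for month, num in observations]

# Shuffle: group values by key, as the framework would between phases.
groups = {}
for month, num in mapped:
    groups.setdefault(month, []).append(num)

# Reduce phase: fold each group of values down to a single result.
totals = {month: reduce(lambda a, b: a + b, nums)
          for month, nums in groups.items()}
```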
MongoDB 2.2 added support for a declarative query language called the aggregation pipeline.
In the property graph model, each vertex consists of:
- A unique identifier
- A set of outgoing edges
- A set of incoming edges
- A collection of properties (key-value pairs)

Each edge consists of:
- A unique identifier
- The vertex at which the edge starts (the tail vertex)
- The vertex at which the edge ends (the head vertex)
- A label to describe the kind of relationship between the two vertices
- A collection of properties (key-value pairs)
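The structure above maps directly onto two record types; a minimal in-memory sketch (names and the example data are hypothetical, and a real graph database stores all of this on disk):

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)  # ids of outgoing edges
    in_edges: list = field(default_factory=list)   # ids of incoming edges

@dataclass
class Edge:
    id: int
    tail: int        # id of the vertex the edge starts at
    head: int        # id of the vertex the edge ends at
    label: str       # kind of relationship
    properties: dict = field(default_factory=dict)

# Hypothetical usage: Lucy was born in Idaho.
idaho = Vertex(id=1, properties={"name": "Idaho", "type": "state"})
lucy = Vertex(id=2, properties={"name": "Lucy"})
born = Edge(id=10, tail=lucy.id, head=idaho.id, label="born_in")
lucy.out_edges.append(born.id)
idaho.in_edges.append(born.id)
```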
You can think of a graph store as consisting of two relational tables, one for vertices and one for edges.
In a triple-store, all information is stored in the form of very simple three-part statements: (subject, predicate, object). For example, in the triple (Jim, likes, bananas), Jim is the subject, likes is the predicate (verb), and bananas is the object.
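At its simplest, a triple-store is just a set of three-part statements plus pattern matching over them; a toy sketch (the `objects` helper and the extra triples are hypothetical):

```python
# All information as (subject, predicate, object) statements.
triples = {
    ("Jim", "likes", "bananas"),
    ("Jim", "lives_in", "Oakland"),
    ("Oakland", "within", "USA"),
}

def objects(subject, predicate):
    """All objects matching a (subject, predicate, ?) pattern."""
    return {o for s, p, o in triples if s == subject and p == predicate}
```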
The semantic web is fundamentally a simple and reasonable idea: websites already publish information as text and pictures for humans to read, so why don’t they also publish information as machine-readable data for computers to read?
plethora
hubris.
The reasoning behind this design is that you should be able to combine your data with someone else’s data, and if they attach a different meaning to the word within or lives_in, you won’t get a conflict because their predicates are actually <http://other.org/foo#within> and <http://other.org/foo#lives_in>.
for example, it is the query language of Datomic [40], and Cascalog [47] is a Datalog implementation for querying large datasets in Hadoop.
Similarly to what db_set does, many databases internally use a log, which is an append-only data file.
log is used in the more general sense: an append-only sequence of records.
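The book's `db_set` is a shell one-liner; a Python equivalent of the same append-only idea (a sketch, with a temp file standing in for the database file):

```python
import os
import tempfile

def db_set(path, key, value):
    # Append-only: every write adds a line, never overwrites in place.
    with open(path, "a") as f:
        f.write(f"{key},{value}\n")

def db_get(path, key):
    # Full scan, O(n) per read; the *last* matching line wins,
    # which is how updates work without overwriting.
    value = None
    with open(path) as f:
        for line in f:
            k, _, v = line.rstrip("\n").partition(",")
            if k == key:
                value = v
    return value

path = os.path.join(tempfile.mkdtemp(), "database")
db_set(path, "42", "San Francisco")
db_set(path, "42", "Oakland")  # supersedes the earlier value
```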
An index is an additional structure that is derived from the primary data.
well-chosen indexes speed up read queries, but every index slows down writes.
Bitcask
subject to the requirement that all the keys fit in the available RAM.
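A Bitcask-style sketch of that idea: an in-memory hash map records the byte offset of each key's latest entry in the append-only file, so a read is one seek instead of a scan, subject to all keys fitting in RAM (class and file layout are hypothetical simplifications):

```python
import os
import tempfile

class HashIndexedLog:
    def __init__(self, path):
        self.path = path
        self.index = {}             # key -> byte offset, kept in RAM
        open(path, "a").close()     # ensure the file exists

    def set(self, key, value):
        with open(self.path, "a") as f:
            self.index[key] = f.tell()   # offset where this entry starts
            f.write(f"{key},{value}\n")

    def get(self, key):
        if key not in self.index:
            return None
        with open(self.path) as f:
            f.seek(self.index[key])      # one seek, no scan
            return f.readline().rstrip("\n").partition(",")[2]

store = HashIndexedLog(os.path.join(tempfile.mkdtemp(), "log"))
store.set("city", "San Francisco")
store.set("city", "Oakland")  # index now points at the newer entry
```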
Concurrency and crash recovery are much simpler if segment files are append-only or immutable.
SSTable
LSM-Tree
B-trees have stood the test of time very well. They remain the standard index implementation in almost all relational databases, and many nonrelational databases use them too.
This design corresponds more closely to the underlying hardware, as disks are also arranged in fixed-size blocks.
In order to make the database resilient to crashes, it is common for B-tree implementations to include an additional data structure on disk: a write-ahead log (WAL, also known as a redo log).
This is typically done by protecting the tree's data structures with latches (lightweight locks).
Log-structured approaches are simpler in this regard, because they do all the merging in the background without interfering with incoming queries and atomically swap old segments for new segments from time to time.
copy-on-write scheme
We can save space in pages by not storing the entire key, but abbreviating it.
keys only need to provide enough information to act as boundaries between key ranges.
Many B-tree implementations therefore try to lay out the tree so that leaf pages appear in sequential order on disk.
each leaf page may have references to its sibling pages to the left and right, which allows scanning keys in order without jumping back to parent pages.
Some storage engines even overwrite the same page twice in order to avoid ending up with a partially updated page in the event of a power failure.
This effect — one write to the database resulting in multiple writes to the disk over the course of the database’s lifetime — is known as write amplification.
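The definition reduces to simple arithmetic; the numbers below are hypothetical, purely for illustration:

```python
# Write amplification = bytes physically written to disk over the
# database's lifetime / bytes of data written by the application.
app_bytes = 100 * 1024 * 1024        # 100 MiB written by the application
disk_bytes = app_bytes * (1 + 1 + 3) # e.g. WAL + memtable flush + 3 compaction rewrites
write_amplification = disk_bytes / app_bytes
```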
On many SSDs, the firmware internally uses a log-structured algorithm to turn random writes into sequential writes on the underlying storage chips, so the impact of the storage engine’s write pattern is less pronounced [19].
the disk’s finite write bandwidth needs to be shared between the initial write (logging and flushing a memtable to disk) and the compaction threads running in the background.
If write throughput is high and compaction is not configured carefully, it can happen that compaction cannot keep up with the rate of incoming writes.
Typically, SSTable-based storage engines do not throttle the rate of incoming writes, even if compaction cannot keep up, so you need explicit monitoring to detect this situation.
in many relational databases, transaction isolation is implemented using locks on ranges of keys, and in a B-tree index, those locks can be directly attached to the tree.
The main difference is that in a secondary index, the indexed values are not necessarily unique;
it could be a reference to the row stored elsewhere. In the latter case, the place where rows are stored is known as a heap file, and it stores data in no particular order (it may be append-only, or it may keep track of deleted rows in order to overwrite them with new data later).
The heap file approach is common because it avoids duplicating data when multiple secondary indexes are present: each index just references a location in the heap file, and the actual data is kept in one place.
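An in-memory sketch of the heap-plus-secondary-indexes arrangement (a list position stands in for a heap location; names and columns are hypothetical):

```python
heap = []                   # rows live once, in insertion order, unsorted
by_name, by_city = {}, {}   # secondary indexes; indexed values not unique

def insert(row):
    pos = len(heap)         # "heap location" = list position in this sketch
    heap.append(row)
    # Each index stores only the position, not a copy of the row.
    by_name.setdefault(row["name"], []).append(pos)
    by_city.setdefault(row["city"], []).append(pos)

insert({"name": "Alice", "city": "Oakland"})
insert({"name": "Bob", "city": "Oakland"})

# A lookup goes index -> heap position -> row:
rows_in_oakland = [heap[pos] for pos in by_city.get("Oakland", [])]
```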
The situation is more complicated if the new value is larger, as it probably needs to be moved to a new location in the heap where there is enough space. In that case, either all indexes need to be updated to point at the new heap location of the record, or a forwarding pointer is left behind in the old heap location.
it can be desirable to store the indexed row directly within an index. This is known as a clustered index.
covering index or index with included columns, which stores some of a table’s columns within the index
As with any kind of duplication of data, clustered and covering indexes can speed up reads, but they require additional storage and can add overhead on writes.
The most common type of multi-column index is called a concatenated index,
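The classic analogy is an old-fashioned phone book sorted by (lastname, firstname); a toy sketch with hypothetical entries shows why column order matters:

```python
# A concatenated index combines several columns into one sort key.
entries = [("Smith", "Alice"), ("Jones", "Bob"), ("Smith", "Bob")]
index = sorted(entries)   # sorted by lastname first, then firstname

# Efficient: all Smiths (a prefix of the sort key)...
smiths = [e for e in index if e[0] == "Smith"]
# ...but useless for finding everyone named "Bob" regardless of
# lastname, because firstname alone is not a prefix of the key.
```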
Multi-dimensional indexes are a more general way of querying several columns at once, which is particularly important for geospatial data.