Perfect. I only wish it saw more frequent updates and more extended commentary.
* JSON is good for sparse data but not for general hierarchical data, so RDBMSs will subsume it as a data type.
* SQL could have been cleaned up, but there was no time for it, so it is the COBOL of 2020. SQL won against natural language.
* ODBC isn't the best interface for embedding queries into programming languages: open db, open cursor, bind query, run fetches, etc. Look at LINQ instead.
* PostgreSQL's open source code educated many people who are influential in new systems.
* Query planning (best-effort): cost estimation (catalog statistics), equivalence rules, and cost-based search (dynamic programming). A toy join-ordering sketch follows this list.
* Concurrency control: serializable, but that is generally more than applications need, so finer-grained locks and weaker isolation are the default today. There is no best scheme; it totally depends on the workload.
* Any performance test without a crossover point is uninteresting (at worst it trades on an unlimited-resources assumption). Resolving conflicts via blocking might make more sense, since every system is limited by nature.
* Durability: ARIES (no force: dirty pages need not be written at commit time; steal: dirty pages can be flushed at any time).
* Distribution brings its own set of problems. 2PC (atomic commit) with presumed commit or presumed abort (sketched after this list). Consensus (Paxos, Raft, etc.) is generally used for replication, where a master executes transactions by itself and a new master is elected on failure.
* New architectures: column stores, main-memory systems (with new concurrency control and recovery), semi-structured data (JSON), and dataflow (Hadoop, Spark, Naiad, etc.).
* Dataflow: started with MapReduce (only two stages). Nowadays higher-level query languages (SQL), general graphs (not only two stages), and indexing (leveraging the structured parts) are supported (Spark, Flink, etc.). The influential points are schema, interface, and architecture flexibility.
* Non-serializable isolation is the default in practice, and the remedies for it are difficult to use. The clear research interest in weak isolation is to find simpler ways to maintain semantics while keeping programming as easy as under serializability.
* Rethinking the query optimizer due to streaming, errors in estimation, data outside the RDBMS, user-defined aggregates, etc. Decouple the optimizer from execution: the planner generates a dataflow that the executor runs later. There are two kinds of adaptivity: inter-operator (exploiting blocking operators such as hash join as re-planning points) and intra-operator (feedback from execution to self-adapt the plan).
* OLAP: sampling (online or materialized - BlinkDB; count-min, HyperLogLog, Bloom filters, etc.), precomputation (all cells or a critical subset, since the cube is a lattice and the rest can be derived), and online aggregation (feedback to the user, who can stop when satisfied). Count-min and online-aggregation sketches follow below.
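To make the cost-based search concrete, here is a minimal Python sketch of Selinger-style dynamic programming over join orders. The table names, cardinalities, selectivities, and the cost model are all invented for illustration; a real optimizer would read statistics from the catalog and also track physical properties such as sort order.

```python
from itertools import combinations

# Hypothetical catalog statistics: base-table cardinalities and
# pairwise join selectivities (missing pairs default to 1.0, i.e.
# a cross product).
card = {"orders": 1_000_000, "customers": 50_000, "items": 200_000}
selectivity = {frozenset(["orders", "customers"]): 0.0001,
               frozenset(["orders", "items"]): 0.00002}

def join_cardinality(left_tables, right_tables, left_card, right_card):
    """Estimate output size: product of inputs times every applicable selectivity."""
    est = left_card * right_card
    for l in left_tables:
        for r in right_tables:
            est *= selectivity.get(frozenset([l, r]), 1.0)
    return est

def best_plan(tables):
    """Selinger-style DP: find the cheapest plan for every subset, bottom-up."""
    # best[subset] = (estimated cost, estimated cardinality, plan tree)
    best = {frozenset([t]): (0.0, card[t], t) for t in tables}
    for size in range(2, len(tables) + 1):
        for subset in map(frozenset, combinations(tables, size)):
            for k in range(1, size):
                for left in map(frozenset, combinations(subset, k)):
                    right = subset - left
                    lcost, lcard, lplan = best[left]
                    rcost, rcard, rplan = best[right]
                    out = join_cardinality(left, right, lcard, rcard)
                    # Toy cost model: pay for reading both inputs plus the output.
                    cost = lcost + rcost + lcard + rcard + out
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, out, (lplan, rplan))
    return best[frozenset(tables)]

cost, rows, plan = best_plan(list(card))
print(f"plan={plan} est_cost={cost:.0f} est_rows={rows:.0f}")
```

The point of the DP is that the cheapest plan for a set of tables can be built from the cheapest plans for its sub-sets, so the search avoids re-enumerating full join trees.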
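And a toy presumed-abort 2PC round, assuming in-memory participants and using a print call as a stand-in for the coordinator's durable log; the class and function names here are hypothetical, not from any real system.

```python
import random

class Participant:
    """A toy resource manager that votes in phase 1 and obeys in phase 2."""
    def __init__(self, name):
        self.name, self.state = name, "active"

    def prepare(self):
        # A real participant force-writes a prepare record before voting yes.
        vote = random.random() > 0.1          # occasionally vote no
        self.state = "prepared" if vote else "aborted"
        return vote

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants, log=print):
    """Presumed-abort 2PC: the transaction commits only once the coordinator
    durably logs the decision; if that record is absent, abort is assumed,
    so no abort record ever needs to be written."""
    # Phase 1: voting. all() short-circuits on the first "no".
    if all(p.prepare() for p in participants):
        log("COMMIT")                         # the commit point
        for p in participants:                # phase 2: completion
            p.commit()
        return True
    for p in participants:
        p.abort()                             # presumed abort: no log write
    return False

workers = [Participant(f"node{i}") for i in range(3)]
print("outcome:", "commit" if two_phase_commit(workers) else "abort")
```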
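Of the OLAP summaries mentioned, a count-min sketch is small enough to show in full. This is a generic implementation, not BlinkDB's; the width/depth defaults and the choice of blake2b as the hash family are arbitrary.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts in sublinear space; errors are
    one-sided (estimates can only be too high, never too low)."""
    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # One independent hash per row, derived via blake2b personalization.
        for seed in range(self.depth):
            h = hashlib.blake2b(item.encode(), person=bytes([seed]) * 16)
            yield seed, int.from_bytes(h.digest()[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._buckets(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # The true count is at most the smallest counter it touched.
        return min(self.rows[row][col] for row, col in self._buckets(item))

cms = CountMinSketch()
for word in "to be or not to be".split():
    cms.add(word)
print(cms.estimate("to"), cms.estimate("question"))  # 2 and, very likely, 0
```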
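Finally, online aggregation in miniature: a running mean with a normal-approximation confidence interval, so a user watching the output could stop once the error bar is tight enough. The stream, reporting interval, and z value are made up for the example; a real engine must also randomize the scan order for these statistics to be valid.

```python
import math, random

def online_avg(stream, report_every=10_000, z=1.96):
    """Consume a stream of numbers, periodically yielding the running mean
    and a ~95% confidence half-width (Welford's online variance update)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n % report_every == 0:
            stderr = math.sqrt(m2 / (n - 1) / n)
            yield n, mean, z * stderr

# Hypothetical workload: average a noisy column, scanned in random order.
data = (random.gauss(100, 15) for _ in range(100_000))
for n, mean, halfwidth in online_avg(data):
    print(f"after {n:>6} rows: {mean:.2f} ± {halfwidth:.2f}")
```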
A relatively quick read: a collection of commentaries on foundational and state-of-the-art papers about database design and its problems. A mostly easy read for application programmers like myself; those interested in low-level details can go on to the cited sources.
A book that, I believe, is still used at MIT (since 1988) for readings on database-related topics.
It helped me understand the basic architecture of databases, especially units 1: Data Models and DBMS Architecture, 2: Query Processing, and 4: Transaction Management.
The Data Warehousing part was not as interesting as I expected. You can get more out of the book if you complement it with lectures such as: https://archive.org/details/UCBerkele...
A fast and eye-opening read. It's a brief booklet of around 50 pages: about 11 commentary articles on selected important database papers covering major cutting-edge areas. It is for database professionals, not for newbies like me with very limited knowledge of databases beyond the few everyone uses, so I didn't understand most of it. 🤨 But it did add lots of new terms to my vocabulary.
The standard text for "introduction to database systems and theory" classes in an ICS environment. Even though it presumes a lot more technical background than many IS/LIS students have, that too is useful as it provides a great primary text for syntagmatic analysis. "What Goes Around Comes Around" and "Anatomy of a Database System" should be required reading in any Information Studies course focused on databases.