Database Internals: A deep-dive into how distributed data systems work

Average rating 4.26 · 143 ratings · 22 reviews
When it comes to choosing, using, and maintaining a database, understanding its internals is essential. But with so many distributed databases and tools available today, it’s often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internal ...
Paperback, 376 pages
Published November 4th 2019 by O'Reilly Media

Community Reviews

Sebastian Gebski
Feb 09, 2020 rated it it was amazing
One of the best tech books I've read in the last 12 months.
It consists of 2 parts: DB internals & DB distribution internals.

The 1st part is pure gold: one can learn about B*-trees, LSM-trees, differences between locks and latches, memory vs. disk optimizations, rebalancing, concurrency models for transactions, and much, much more. I can't recall any single book that covers as much deep-level knowledge on these topics.

The 2nd part is less unique - there are other good resources on distributed syst
Dec 13, 2019 rated it it was amazing
Shelves: 2019
I liked this one a lot.

It complements nicely "Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems" by Martin Kleppmann.
While Kleppmann's book provides a pretty solid overview of the data processing landscape, this book goes deeper into implementation details, data structures, and algorithms.
It's a bit drier and more technical, but it's still a relatively easy read.
Emre Sevinç
May 29, 2020 rated it it was amazing
“Database Internals: A Deep Dive Into How Distributed Data Systems Work” by Alex Petrov belongs to a very special category of O’Reilly books, alongside “Designing Data-Intensive Applications” and “Cassandra: The Definitive Guide”, in the sense that it is a serious deep dive into the most fundamental and challenging aspects of the big and distributed data systems that we rely on daily.

Today there’s an unprecedented proliferation of distributed database technologies, combined with an ever-growing
Mikhail Filatov
Apr 01, 2020 rated it it was ok
The book is a strange mix: the second part is really about distributed data systems and it's OK, although "Designing Data-Intensive Applications" is better on this topic.
The first part contains a lot of descriptions of different implementations of B*-Trees (replace * with any other symbol(s)), most of them unreadable.
Povilas Balzaravičius
Nov 05, 2020 rated it really liked it
Good and interesting content. But some chapters are scattered, with missing transitions between topics or algorithms. Other parts are well developed. Diagrams are missing where I expected a better explanation, or are present for obvious things. The writing is 3/5, but the content is 5/5, so overall it is 4/5. I'm glad I've read the book and will definitely get back to some chapters to refresh some details.
Adrian Bercovici Simon
Unfortunately, I read this book after Martin Kleppmann's Designing Data-Intensive Applications, and that is probably why I rated it poorly.

The book starts really well by describing the internal systems that make up any DBMS (connection listener layer, query parser and optimizer layer, execution layer, and of course the storage layer).

I wanted the book to explore this subject further: how the components are designed, how they deal with concurrency, etc.

The book then took a deep dive
Apr 28, 2020 rated it liked it
The book is divided into two parts. The first part deals with storage on hard disk and solid-state drives, but in the context of a single system, while the second part deals with distributed systems. In this sense it differs from most other books on distributed storage, which typically do not discuss the topics in the first part of this book.

I found the book informative, but not very effective in building a solid understanding of concepts. I felt the author jumps from idea to (related) idea too
Łukasz Słonina
Apr 06, 2020 rated it liked it
I like this book for its content. If you would like to know more about databases and distributed systems, plus get a long list of further reading, then go for this book. What I don't like is that this material does not read like a book (e.g. DDIA); it's more like a compendium of algorithms, data structures, and theories. Some of the algorithms could be better presented (more diagrams).
Ahmad hosseini
In Part 1, the book explains the internal structure of a database in detail and examines its parts, such as the storage engine, very well.
“The storage engine (or database engine) is a software component of a database management system responsible for storing, retrieving, and managing data in memory and on disk, designed to capture a persistent, long-term memory of each node.”
Part 2 explains the characteristics of distributed systems in general and examines some specific topics related to distributed databases.
Book also intr
Bartosz Sypytkowski
Aug 09, 2020 rated it it was amazing
This book pairs nicely with "Designing Data-Intensive Applications" by Martin Kleppmann: they both focus on the core, fundamental concepts of persistent, distributed systems, providing a wide variety of known algorithms and protocols for common problems in that area, including the rationale behind each one, which helps to build intuition about their trade-offs. It's also full of references for anyone who wants to continue with a more in-depth exploration of a given topic.
Elijah Oyekunle
Apr 25, 2020 rated it it was amazing
Shelves: computers
Love it!
Jan 31, 2021 rated it liked it
Shelves: owned
This book really feels like two incomplete books in one. The first half of the book focuses on database internals, file formats, caching strategies etc. The second half of the book switches gears and dives into the components (algorithms, and strategies) used by distributed systems.

The problem is that there is nothing to tie the first and second parts of the book together. You could be reading entirely different books. The second issue is that even within each part, you are presented with a lot
Lauro Caetano
Apr 02, 2020 rated it it was amazing
Excellent book! It goes somewhat in the same direction as Designing Data-Intensive Applications when it covers distributed systems, distributed transactions, and so on.
But this book goes a few steps further, explaining how the database represents data internally and also explaining distributed systems algorithms.

Excellent read!
Jul 20, 2020 rated it really liked it
A detailed, though not exhaustive, description of the structures and algorithms behind modern systems (the lack of depth is made up for by the large number of references and recommendations for further study).
May 03, 2020 rated it it was ok
Informative, but I would have preferred more examples with practical scenarios. There is no code, and it is all mostly conceptual. There are some good references to papers for subsequent reading. The first part of the book deals primarily with storage and covers B-trees and their types in depth. The second half is focused on distributed systems and has useful sections on consensus protocols. Concepts like "2-phase commits" are explained well with figures. However, the lack of practical examples/code and over ...
Ricardo Hernández
Jun 24, 2020 rated it it was amazing
I really appreciated this book as a way to expand and get a 360-degree refresh on database essentials. The progression of the book is built in an organic way, starting from basic concepts at the low-level implementation and moving up to modern distributed challenges. This helps you build comprehension naturally, in contrast with some other technical books that jump from topic to topic without any regard for the cognitive challenges.

Totally recommended to get a solid understanding on databases to help solve contemporary pr
Lu Pan
Oct 18, 2020 rated it really liked it
The first part of the book is better. Part two, which focuses on distributed systems, is less than a deep dive, and I agree with the other reviewer that Designing Data-Intensive Applications is a better book on distributed systems. But this is still a great book on single-host databases!
Aboullaite Mohammed
Jan 19, 2021 rated it it was amazing
I very much liked reading this book! It nicely complements "Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems" by Martin Kleppmann. I highly recommend it!
Mikhail Turnovskiy
Sep 22, 2020 rated it really liked it
Shelves: computer-science
This book seems unbalanced to me.
Too complex for a technical overview, too shallow for a deep dive.
But it is a good starting point for learning about different areas of database and distributed system design.
Oct 20, 2020 rated it really liked it
The book is split into two parts: algorithms and distributed systems.
A must-read for deepening your knowledge of databases.
Can't believe I forgot to write a review for this one!

Partly it's probably because I usually have less to say (or more precisely it's harder for me to be properly articulate) about things I like than I do about the ones I don't. And boy did I like Database Internals! I'll try my best to explain why, the book and the author surely deserve it.

Being a back-end engineer, the main reason for picking this one up was to better understand the distributed databases that I may end up in (or have already h
Dmitry Lomov
Jan 04, 2020 rated it it was amazing  ·  review of another edition
Great overview of modern databases.

Very nice overview of the state of the art in databases. A lot of references to literature and source code. Includes a multitude of distributed systems algorithms.
rated it it was amazing
Apr 05, 2020
Juliang Li
rated it really liked it
Jan 03, 2021
Laurynas Biveinis
rated it it was amazing
Feb 08, 2020
Jury Razumau
rated it really liked it
May 07, 2020
Dmytro Hambal
rated it liked it
Oct 01, 2020
Ankush Sharma
rated it really liked it
Dec 24, 2019
rated it really liked it
Apr 26, 2020
Nikita Chizhov
rated it it was amazing
Jan 19, 2020
Readers also enjoyed

  • Designing Data-Intensive Applications
  • Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services
  • The Night Fire (Harry Bosch, #22; Renée Ballard, #3; Harry Bosch Universe, #32)
  • Fundamentals of Software Architecture: An Engineering Approach
  • The Pragmatic Programmer: From Journeyman to Master
  • Programming Rust: Fast, Safe Systems Development
  • Practical Recommender Systems
  • Magic Universe: A Grand Tour of Modern Science
  • The Test Book
  • The Pentium Chronicles: The People, Passion, and Politics Behind Intel's Landmark Chips
  • HTTP/2 in Action
  • Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale
  • An Elegant Puzzle: Systems of Engineering Management
  • The Moment of Lift: How Empowering Women Changes the World
  • Clean Code: A Handbook of Agile Software Craftsmanship
  • Software Engineering at Google: Lessons Learned from Programming Over Time
  • The Case of Alan Turing: The Extraordinary and Tragic Story of the Legendary Codebreaker
  • Philip K. Dick: A Comics Biography

“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.” (Leslie Lamport)