Take advantage of today's sky-high demand for data engineers. With this in-depth book, current and aspiring engineers will learn powerful real-world best practices for managing data big and small. Contributors from notable companies including Twitter, Google, Stitch Fix, Microsoft, Capital One, and LinkedIn share their experiences and lessons learned for overcoming a variety of specific and often nagging challenges. Edited by Tobias Macey, host of the popular Data Engineering Podcast, this book presents 97 concise and useful tips for cleaning, prepping, wrangling, storing, processing, and ingesting data. Data engineers, data architects, data team managers, data scientists, machine learning engineers, and software engineers will greatly benefit from the wisdom and experience of their peers.
I didn't like it, mainly because I think this formula (very brief "snack-size" suggestions provided by respected practitioners) didn't work well here. I don't think it's the interviewees' fault - I have no reason to question their knowledge and experience. The problem is that they had too little space to illustrate their suggestion with a case study/visuals/working example - that's why the majority of advice felt very high-level, oversimplified, rushed, and frankly - much less valuable than what we could have got from the same people in different circumstances.
An unfulfilled potential, sadly. My suggestion would be to cut the number of interviewees by 3-4, group their 'topics' by general theme, and give them more space to elaborate. Yes, I know it may be a no-go for busy engineers, but maybe some of them won't mind.
In case you are not familiar with it, the "97 Things Every XX Should Know" series offers high-level advice on the specialisation each book focuses on, in this case data engineering. Do not expect hands-on solutions; it is always high-level perspectives from different authors.
The articles vary in level of expertise (from entry level to advanced) and in topic, so it is similar to meeting a few friends in a bar and asking them for stories about data engineering. The topics covered well across the articles can be classified as:
* Foundational concepts: data lake, data warehouse, data mesh, etc.
* Architecture and patterns: CAP, SQL vs NoSQL, ETL, observability, quality, DevOps, etc.
* Data governance topics: schemas, producers, consumers, lineage, etc.
* Data quality: approaches (e.g. team responsibilities), tools (e.g. Great Expectations) and processes (e.g. SLAs)
* Management: building a reputation for your team, different kinds of profiles, etc.
Overall, how good this book is will depend a lot on your previous experience. In my case, 50% of the topics were relevant or good refreshers. Most of the authors share their experience with the goal of being useful to others.
I have always liked the 97 Things series. There are a lot of great ideas, best practices, warnings and stories all wrangled together into this little book. I enjoyed looking into the various worlds, tech stacks, and mental models that existed in the heads of the authors. As an aside I am also one of these authors, and it was great to read the collective wisdom of my peers.
This "book" is comprised of 97 mini-articles from different authors. I am generally not a fan of the generic information an article provides. Instead, I would prefer a book that talks about the foundations and theory of a field. This is challenging in the world of software (fast-evolving tech, undefined buzzwords, repetition of ideas under new terminology, etc.); however, Martin Kleppmann's "Designing Data-Intensive Applications" serves as a foundation for all data engineers.
A collection of 97 “blog posts” (in some cases literally) about various data engineering topics. Each is 2-3 pages long and completely unrelated to the others, which makes the book easy to read, but some topics are repeated, and it would have been much more beneficial if those had been merged and discussed in more depth.
A smaller number of contributions with more details (including implementation details!) might have worked better. Even so, this book still contains quite a few observations of value or interest to people in the field, and some of the individual contributions are really very good, despite the limitations imposed by the format. Closer to three stars than one.
Got this as a free audiobook from the library. It was pretty decent! Basically 97 little lightning talks about Data Engineering topics. Good refreshers on a lot of relevant topics for those already working in the role.
Great book that touches pretty much all Data Engineering topics. Very good for someone considering a career change (coming from Analytics or Software) and curious to know what this field is about.
Meh… I finally took the time to finish it. It’s so incoherent and superficial that I wouldn’t recommend it. The only nice thing is seeing some familiar faces from the field.