I read it to get an introduction and can't complain about that. Not sure how people well-versed in the subject will feel, but I enjoyed it. It also helped that it was quite humorous.
I'm not a data scientist and I found this book interesting enough to read it through to the end. It gave me a glimpse into what data science is and how to use it. I also learned a little about what the typical role of a data scientist is and how to implement some of the tools in a small application.
This book was worth my time. Thank you to the authors.
I think it’s a decent book that gives a good overview of the technologies data scientists need. It contains a lot of hands-on exercises. If you are not interested in doing those exercises, you can just scroll through them.
The book is well structured to help beginners like me get an overview of the data science ecosystem...good examples are used, and the summaries in each chapter are very helpful for refreshing my memory and understanding the milestones as I progress through the reading...thumbs up for the book.
INTRODUCING DATA SCIENCE is a broad introduction to the field. Each chapter includes the theory as well as practical examples. This is a big, complex field, and this book will take some time to absorb. For example, in Chapter 6 the authors illustrate the large number of database products that are used in this field. These include both "NoSQL" and "NewSQL" designs. There isn't just one "right" method. To be honest, I didn't know there was any such thing as "graph" databases.
I recommend trying some of the detailed examples to get a feel for the subject matter. For example, the authors show, in detail, how to use Wikipedia with a custom Python program to try some data mining. The example pretends that you are trying to research diseases using information in Wikipedia, and shows you how to write your program.
I thought the best chapter was Chapter 6, "Join the NoSQL Movement." Here, the authors explain the intrinsic limitations of the traditional RDBMS structure. A typical RDBMS table is stored with ALL the columns of a row together. This often works well, but what if you only want certain columns? Well, that's too bad--you will end up "touching" all the other columns as well.
Column databases don't have the above limitation, so they are far faster at scanning through large amounts of information when you just want certain columns. (As an Oracle DBA for several decades, I can confirm that the authors correctly state the limitation.)
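To make the row-versus-column point concrete, here is a small Python sketch of my own (it is not taken from the book). The table, its field names, and the helper functions are all made up for illustration, and the "fields touched" count is only a rough stand-in for the data a storage engine would have to read, not a measurement of real database I/O.

```python
# Hypothetical toy table; names and values are illustrative only.
rows = [
    {"id": 1, "name": "Ann", "city": "Oslo", "amount": 120.0},
    {"id": 2, "name": "Bob", "city": "Rome", "amount": 80.5},
    {"id": 3, "name": "Cy", "city": "Lima", "amount": 42.0},
]


def to_columns(row_store):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in row_store:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def sum_from_rows(row_store, column):
    """Row layout: model each record as read in full to reach one field."""
    total = 0.0
    fields_touched = 0
    for row in row_store:
        fields_touched += len(row)  # count every field stored with the row
        total += row[column]
    return total, fields_touched


def sum_from_columns(column_store, column):
    """Column layout: only the requested column's values are read."""
    values = column_store[column]
    return sum(values), len(values)


columns = to_columns(rows)
row_total, row_touched = sum_from_rows(rows, "amount")
col_total, col_touched = sum_from_columns(columns, "amount")

print(row_total, row_touched)  # 242.5 12
print(col_total, col_touched)  # 242.5 3
```

Both layouts give the same sum, but in this toy model the row layout counts 12 fields touched against 3 for the column layout. With wide tables and millions of rows, that gap is the scanning advantage described above.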
So all in all, I found INTRODUCING DATA SCIENCE to be a very good book--albeit a bit overwhelming. I found the examples especially helpful. The book has several appendices that explain how to install required libraries for use in the chapter examples.