Practical Data Science with R, Second Edition is a task-based tutorial that leads readers through dozens of useful data analysis practices using the R language. By concentrating on the most important tasks you'll face on the job, this friendly guide is comfortable both for business analysts and data scientists. Because data is only useful if it can be understood, you'll also find fantastic tips for organizing and presenting data in tables, as well as snappy visualizations.
I'm not always happy with the Manning texts (in comparison to the O'Reilly books), but this one was great.
Step-by-step instructions walk the reader through getting the results shown in the book.
The code is all in a GitHub repo, and the authors introduce new tools that they created (SQL Screwdriver et al.) for use by everyone.
This isn't a book about R per se, but a book about how to choose and attack data science projects (and maybe the title is misleading, since most of us "data science" types actually do data analysis or data engineering). The chapter on classification and clustering algorithms is a perfect example. They use R to teach the algos, rather than using algo examples to walk you through coding in R.
It's easy enough to just follow along with the code in the book, but you'll get the most out of it if you sit down with RStudio and work through it.
Couldn't be happier having spent the money for a dead-tree copy of this one. It's already been heavily marked up, and there's more to come.
This book is a great introduction to data science in R. However, as the title implies, it is geared towards those looking only for a quick, high-level overview of data science practices as they apply in the business world, and of how to communicate results to non-practitioners and business partners.
If this is what you are looking for then I recommend this book. If you are looking for a more in-depth introduction to the theory of data science and machine learning, I would look elsewhere, as the topics are covered in a very superficial manner.
Had I done more research into this book before purchasing, I would not have bought it, opting instead for a more theoretical and statistics-heavy primer. Zumel and Mount do, however, do an excellent and concise job of making data science accessible to those who have an interest in it at the business level.
"Practical Data Science with R" is an original book, yet not a great one, and I would not recommend it. This book belongs to the trend of data science books written by practitioners. These books promote themselves as material with a practical focus and an accessible writing style. However, they usually fail to explain the theory behind the practice. This book suffers from the same malaise: it struggles to explain the principles and is sometimes even wrong about basic concepts in statistics (for example, the explanation of heteroscedasticity). Not everything was terrible: it introduces R, version control, databases, a bit of visualization, and some techniques that everyone doing data science should have in their toolbox. Definitely better than "Doing Data Science: Straight Talk from the Frontline", but not memorable at all.
Quickly scanned through this book. The code base is well prepared. The business use cases are described. Also glad to find that the authors took care of model preparation, which is rare for a book on data science and R. Drawbacks are obvious as well - the theory behind the code is explained neither well nor very accurately. Still, I may go back to this book for its richness of R code.
(This is my January book for my "read one work book per month" New Year's resolution.)
Good practical book on applying machine learning. Lots of examples, though I probably would have appreciated more effort to use a single domain or "business", rather than constantly leaping around, just because taking a number of approaches to a single problem area is a useful skill to develop. I'd also have liked to see more generic functions: most of their illustrations would need to be adapted. For example, they used a "..." notation in their function for calculating Euclidean distance to indicate that you do the calculation for each dimension, but it would have been trivial to write the function to take the number of dimensions from the input vectors. A final quibble is that their treatment of kernels and SVMs seemed far more theoretical than the other sections.
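To illustrate the Euclidean distance point, here is a minimal sketch of a version that takes the number of dimensions from the input vectors. It is written in Python rather than the book's R, purely for illustration; the function name `euclidean_distance` and the length check are assumptions and are not taken from the book's code.

```python
import math


def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length numeric vectors.

    Hypothetical helper for illustration only: the dimensionality is taken
    from the inputs, so nothing has to be written out per dimension.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of dimensions")
    # Sum the squared differences over every dimension, then take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Works for any number of dimensions without changing the function.
print(euclidean_distance([0, 0], [3, 4]))        # 5.0
print(euclidean_distance([1, 2, 3], [1, 2, 3]))  # 0.0
```

The same idea carries over to R, where vectorized arithmetic on two equal-length vectors removes the need to spell out each dimension.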
So, not a definitive reference, but definitely a good book to have on your shelf when working on an ML project in R.
Also, I seem to recall that they had an n-part series on their blog recently on verification and validation of models... maybe for the second edition they'll add a chapter specifically on this topic, in addition to the tips throughout on which summary stats indicate model soundness.