Mastering Machine Learning with scikit-learn

Apply effective learning algorithms to real-world problems using scikit-learn.

If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential.

This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features. You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models. The book will also walk you through an example project that prompts you to label the most uncertain training examples. You will also use an unsupervised Hidden Markov Model to predict stock prices. By the end of the book, you will be an expert in scikit-learn and will be well versed in machine learning.

240 pages, Kindle Edition

First published November 10, 2014

23 people are currently reading
55 people want to read

Ratings & Reviews

Community Reviews

5 stars: 10 (22%)
4 stars: 25 (55%)
3 stars: 10 (22%)
2 stars: 0 (0%)
1 star: 0 (0%)
Displaying 1 - 8 of 8 reviews
1 review
December 14, 2014

The book is a reasonable soft introduction to machine learning concepts for practitioners whose goal is to understand just enough of the theory to use the tools. The explanations are sometimes imprecise analogies, but usually seem to communicate about the right intuition. The book could do with some exercises to guide new learners but is otherwise a quite good introduction.

That said, did anyone proof-read the mathematics? For example, on page 40 of the pdf, a missing absolute value suggests that L1 norms can be negative. At another point he explains that only square matrices are invertible (true) and so we multiply by the transpose in order to get a square matrix that we can invert. Mechanically correct, but otherwise wrong: just because we have a square matrix doesn't make it meaningful to invert it.
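
Both points can be checked numerically. The following is a minimal sketch in plain Python, not code from the book; the helper names `l1_norm`, `gram`, and `det2` and the toy matrix are assumptions made for illustration. It shows that the L1 norm sums absolute values and so cannot be negative, and that multiplying a matrix by its transpose gives a square matrix that may still be singular.

```python
def l1_norm(v):
    """L1 norm: the sum of absolute values, so it is never negative."""
    return sum(abs(x) for x in v)


def gram(X):
    """Return X^T X for a matrix given as a list of rows."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Without the absolute value, these entries would sum to -3.
weights = [1, -4]
norm = l1_norm(weights)

# Hypothetical design matrix whose second column is twice the first.
X = [[1, 2], [2, 4], [3, 6]]
G = gram(X)                # square (2x2), yet not invertible
singular = det2(G) == 0
```

Here `G` is square but its determinant is zero, so the inverse does not exist even though the shapes line up. Collinear features are the usual cause, and a pseudo-inverse or a regularization term is the usual remedy.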

The text is rife with this sort of almost-correct stuff. "Occam's razor states that a hypothesis with the fewest assumptions is the best". Actually, it states that among competing hypotheses and in the absence of certainty, one should tentatively prefer the hypothesis with the fewest assumptions until we have more data. "Hyperparameters are parameters of the model that are not learned automatically and must be set manually". Actually, hyperparameters are parameters of the prior distribution rather than parameters of the model.
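
For readers unfamiliar with the term, here is a small sketch of the sense in which the quoted sentence uses "hyperparameter": a value the caller fixes before fitting, as opposed to a weight computed from the data. It is plain Python, not code from the book, and the function name `fit_ridge_1d`, the argument `alpha`, and the toy data are assumptions made for illustration. The penalty `alpha` can also be read as encoding a prior on the weight, which is closer to the definition given above.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Fit y ~ w * x with an L2 penalty and return the learned weight w.

    alpha is chosen by the caller before fitting (a hyperparameter in the
    quoted sense); w is computed from the data (a learned parameter).
    Minimizing sum((y - w*x)^2) + alpha * w^2 gives the closed form below.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


# Toy data generated by y = 2x (an assumption for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_unpenalized = fit_ridge_1d(xs, ys, alpha=0.0)   # 28 / 14
w_penalized = fit_ridge_1d(xs, ys, alpha=14.0)    # 28 / 28
```

The data are the same in both calls; only the manually chosen `alpha` changes, and the learned weight shrinks toward zero as it grows.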

A few points are differently perplexing. Why in 2014 does he use Python 2.7 instead of Python 3.x to illustrate his examples? Why does he not even mention IPython? Why does he use manifestly bad variable names (for example, "xx")?

Some of my biggest niggles concerned the EPUB formatting. Reading on my Nexus 7, the mathematics scaled differently (smaller) than the rest of the text, even rendering unreadably small for inline symbols. Many of the figures were also too small to read without enlargement. The python code itself wrapped awkwardly. Numbers are often left-justified where convention would right-justify or justify on the decimal point. The book was almost unreadable for me in epub. The pdf was fine on a 10" tablet but not on a 7" tablet. Otherwise, stick with paper.

Overall, the book is a decent enough introduction to machine learning concepts for those who just need to use its techniques and who will have plenty of opportunity to test their results. This last point is important, and largely absent in the book: machine learning techniques are not use-and-forget. We try things, we measure, and then we try some more. And we keep measuring, because the underlying data may change in subtle ways over time. Whatever theoretical mistakes result from a high-level-only view become even more important to measure.
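
The "keep measuring" habit can be sketched in a few lines. This is an illustrative example in plain Python, not code from the book; the fixed-threshold `predict` function, the batch names, and the data are all hypothetical. The same model is scored on two labeled batches collected at different times, and a drop in accuracy signals that the data has shifted.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def predict(x, threshold=0.5):
    """Stand-in for a trained model: a fixed decision threshold."""
    return 1 if x >= threshold else 0


# Hypothetical labeled batches: (features, true labels).
batches = {
    "january": ([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1]),
    "june": ([0.1, 0.4, 0.6, 0.9], [0, 1, 1, 1]),  # labels have drifted
}

scores = {
    name: accuracy(labels, [predict(x) for x in features])
    for name, (features, labels) in batches.items()
}
```

The model is unchanged between batches, yet its score falls from 1.0 to 0.75 once the relationship between features and labels moves, which is why a single evaluation at training time is not enough.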
Okeyo Mayaka
1 review
November 28, 2018
Excellent book overall. I read this book as a complete beginner to machine learning, and I was more than satisfied with its content. I liked the structure of the chapters, where one or more models are discussed and then, after training, the emphasis shifts to the performance metrics used to evaluate them. I found this informative, since most books I'd read to this point only discussed the model and how to train it, and nothing more. I would recommend this for a beginner in machine learning.
Moustafa
14 reviews, 7 followers
July 15, 2015
This book presents a very gentle introduction to machine learning in Python using scikit-learn. It covers a broad range of machine learning methods (linear regression, logistic regression, SVM, clustering, neural networks, etc.) with real examples and complete code to run those tasks yourself. This is certainly an effective way to teach how to use the scikit-learn framework.

On the other hand, I have some comments about how the book could have been better:
1. It assumes a good knowledge of Python and NumPy (which is used extensively by the scikit-learn implementation); adding an introductory chapter (or perhaps an appendix) with quick tutorials about both would have made the meal more complete.

2. This book is NOT a machine learning book! It provides a quick overview of the methods, but it focuses more on the scikit-learn API and the code examples. If you are eager to learn more about machine learning, you still need to read a real machine learning book to understand the theory and limitations of the various methods.

3. Some minor typos exist :)

My overall recommendation is that the book achieves its goal of providing quick hands-on experiments in machine learning with sklearn for those who are looking for them!
Phil Moyer
24 reviews, 3 followers
January 4, 2016
Another excellent book from this series; it deals exclusively with the scikit-learn library in Python. This is more advanced than Python Machine Learning, but it is a very good book. It does not delve too deeply into the mathematics of machine learning systems, so it is much more "applied science" than, for example, Machine Learning: A Probabilistic Perspective, which is extremely technical and mathematical (or rather, statistical) in nature. You'll learn how to build machine learning systems from this book, while you'll learn (or be seriously challenged to learn) the back theory in Murphy's book. If you want to build ML systems In Real Life, this is one of the books to grab.
Franck Chauvel
119 reviews, 5 followers
December 14, 2016
As the title suggests, this is about machine learning algorithms using Python's scikit-learn library.

As for algorithms, the content is very similar to other machine learning books such as Machine Learning for Hackers, but also includes decision trees and random forests. I find the text really easy to read, and I appreciated the effort made to convey the intuition behind the formulas.

As for Python code examples, although some are out-of-date, I found the documentation of scikit-learn detailed enough and I managed to reproduce the ones I was interested in without any problem.
67 reviews, 1 follower
November 6, 2019
A good overview of some of the machine learning algorithms in scikit-learn.