Data Science from Scratch: First Principles with Python
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.
If you have an aptitude for mathematics and som
Unfortunately, this book is based on Python 2.7; that notwithstanding, I *would* recommend this book since it is very well written!
The author's approach is not merely to explain how to apply the pre-made data science tools (i.e. the aforementioned libraries); rather, he *teaches the actual algorithms* behind the principal ideas in statistical analysis and machine learning by coding them in pure Python, which is great!
Now, be aware that in real applications of data science you will most probably not follow this route (that is, coding your algorithms directly in Python) because pure Python is slow. The preferred approach is to use Python to call highly optimized libraries such as NumPy that do the actual computation for you.
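To make the contrast concrete, here is a minimal sketch of a "from scratch" statistic next to the library call that would normally replace it. The function names and the sample data are illustrative and not taken from the book, and the standard library's `statistics` module stands in for an optimized library such as NumPy so that the sketch has no dependencies.

```python
import math
import statistics


def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def variance(xs):
    """Sample variance (divides by n - 1); needs at least two values."""
    n = len(xs)
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


# Illustrative data, not from the book.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

from_scratch = variance(data)             # the "learn how it works" route
from_library = statistics.variance(data)  # the "just call a library" route

# Both routes should agree up to floating-point rounding.
assert math.isclose(from_scratch, from_library)
print(from_scratch, from_library)
```

Writing the loop yourself shows what the statistic actually computes; in production code the library call is usually the one you would keep.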
For this reason, the differences between Python 2 and 3 are minimal for the language features used in this book.
What I am trying to say is that you can most likely follow the book as if it were a Python 3 book, and the only issue will be having to change "print x" into "print(x)" as needed.
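As a small sketch of that change (the variable names are made up for illustration), the Python 2.7 statement form is shown in a comment and the Python 3 function form is what actually runs:

```python
x = 42

# Python 2.7 statement form, which is a SyntaxError on Python 3:
#     print x

# Python 3 function form:
print(x)

# Anything more elaborate works the same way: build the string, then pass it to print().
formatted = "x = {}".format(x)
print(formatted)
```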
1: as stated in chapter 2, paragraph 1 "getting python"
1. KNOWING data science
2. DOING data science
This book is about the second one. Make no mistake, this is a "statistical computation" manual. This shows you how to find statistical answers using Python. Fully half this book is code samples. If you do not plan to actually attempt to find statistical answers to known questions by writing Python code, then this isn't the book for you.
I would look ...
(I also learned that my linear algebra is very rusty and I need a brush up ...)
I disagree with some of the reviews that he doesn't do a good job explaining the computation ...
- Practicing entry projects (exercises)
- simple language
- lack of some required details in some sections
- outdated code
- the applications (code samples) are not that useful in some sections
Overall, the book is a good, refreshing read, but not that good for studying.
1. you won't learn the math behind this, nor will the methods be explained (it's good for programming, though)
2. regarding the programming part, I think that people would benefit more if there were some actual exercises for them to do, not just a "type in this code" attitude
3. it would be nice if all of the data sets were actually generated in the book, not just "there is some data set with 2000 points, that I just pulled out of my ass"
4. more u ...
I can foresee using this as a reference for the main concepts, or when looking for a straightforward implementation of the algorithms discussed. The information is very solid.
If you want to power straight through, it's a tough read at times--but Joel's a very good writer, and I enjoyed the dry humor intersp ...
Joel does a great job walking through the tasks a data scientist would take to solve hypothetical problems, and explaining the models most popularly implemented in machine learning. An overwhelming majority of the code examples are useless, which is intentional as Joel notes how to build things from scratch. Libraries (like pandas, scikit-learn, etc) provide APIs to accomplish many of these tasks without writing from scratch, but without the un ...
Recommended reading for a solid understanding of the practice ...
There's no shortage of information on the topic, but it's hard to find it all in one place. You could spend weeks combing through forums, blog posts, and video tutorials only to find half as much useful information. Data Science from Scratch covers the foundations of many basic Machine Learning algorithms in a succinct and humorous way.
As fair warning, the math is a little much to take in for a single book. The author provides introducti ...
I rate it 4 because some examples shown in the book do not provide data to test them.
Even though the book is shallow, I would recommend it; here and there you can get a valuable piece of information from it.