
Pandas Cookbook

Key Features

Learn to use the power of Pandas to solve the most complex scientific computing problems
Leverage fast, robust data structures in Pandas to gain the most from your data
Perform various data analysis tasks efficiently and with ease

Book Description

Pandas is one of the most efficient scientific computing packages in Python. It has an enormous amount of power and flexibility to tackle any data task in a variety of ways. It is common for advanced users to write “ugly” Pandas code. With this book, you will explore data in Pandas through dozens of practice problems with detailed solutions in IPython notebooks.

This book will provide you with clean, clear recipes and solutions for common data manipulation tasks. You will be introduced to Pandas and its various features. You will learn about working with different types of data sets, data manipulation, and data wrangling. You will explore the power of Pandas DataFrames and find out about Boolean indexing and multi-indexing with Pandas. You will perform statistical and time series computations and implement them in financial and scientific applications.

By the end of this book, you will know how to perform fast and accurate scientific computing in Python.

What you will learn

Group, aggregate, transform, reshape, and filter data to discover meaningful insights
Combine and merge data from different sources through Pandas' SQL-like operations
Create beautiful and insightful visualizations through Pandas' direct hooks to Matplotlib and Seaborn
Perform efficient and powerful analyses with Pandas' time series functionality
Build pipelines to import, clean, and prepare real-world messy data sets for machine learning
Create big data workflows for processing data that is too large to fit in memory

About the Author

Ted Petrou is a data scientist at Schlumberger, where he spends the vast majority of his time exploring data. Some of his projects include using targeted sentiment analysis to discover the root cause of part failure from engineer text, and developing customized client/server dashboarding applications and real-time web services to avoid mispricing of sales items. Ted received his master's degree in statistics from Rice University and used his analytical skills to play poker professionally and teach math before becoming a data scientist. He is also head of Houston Data Science and a top Pandas answerer on Stack Overflow.

538 pages, Paperback

Published December 6, 2017

55 people are currently reading
69 people want to read

About the author

Ted Petrou

4 books, 2 followers

Ratings & Reviews

Community Reviews

5 stars: 13 (38%)
4 stars: 13 (38%)
3 stars: 5 (14%)
2 stars: 3 (8%)
1 star: 0 (0%)

Displaying 1 - 3 of 3 reviews
12 reviews, 7 followers
March 4, 2019
Just revisited the book after a while. It is still one of the better books available on the market. Cleaning and transforming data remains a huge issue in data mining and machine learning. Until we see a pipe-like toolkit comparable to the one in R, this book remains a valuable reference for those leaning towards Python/pandas.

John Banner
32 reviews, 10 followers
February 25, 2018
I found this to be a good book, providing a clear grounding in Pandas with lots of good examples. It is a worthwhile read (taking the time to work through the examples before you get to the explanations of why they work is especially worth the effort) and worth keeping around if you are working on building your knowledge.

120 reviews, 18 followers
May 13, 2025
This book is well written. The author spends a lot of time explaining how things work and his depth of understanding shows up as assorted nuggets of knowledge. For example, I didn't realize that the round() function differs in Python 2.x and Python 3.x.

Since this is a cookbook, it is a collection of recipes for doing specific things. Where I think it really shines, though, is in the case studies in which the author pulls together a bunch of tasks and uses them to perform data science.

I think the book is best suited to people with limited programming experience, since it moves too slowly for more advanced users, especially if they have much experience processing and massaging data.

Note that I believe (and follow) the meanings Goodreads gives for what each number of stars means. Therefore, the majority of my ratings are 3 stars ("liked it").