Python for Data Analysis: Data Wrangling with Pandas, Numpy, and Ipython

3.97 of 5 stars  ·  199 ratings  ·  26 reviews
"Python for Data Analysis" is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data...more
ebook, 470 pages
Published October 8th 2012 by O'Reilly Media (first published October 1st 2012)

Community Reviews

Ben
A better title for this book might be Pandas and NumPy in Action

As the creator of the pandas project, a Python data analysis framework, Wes McKinney is well placed to write this book. His experience and vision for the pandas framework are clear, and he explains the main functions and inner workings of both pandas and another package, NumPy, very well.

Although the title of the book suggests a broad look at the Python language for data analysis, McKinney almost exclusively focuses on an in...more
Rob
Aug 24, 2012 Rob rated it 3 of 5 stars  ·  review of another edition
Recommends it for: folks doing data analysis that have already decided to use Python
Shelves: 2012, technical, python
I did copy editing on this book, so my review is of an unfinished (but close to finished) version. That being said: McKinney is the principal author on pandas, a Python package for doing data transformation and statistical analysis. The book is largely about pandas (and NumPy), but also delves into general methodologies for munging data and performing analytical operations on them (e.g., normalizing messy data and turning it into graphs and tables); he also delves into some (semi) esoteric infor...more
Louis
For some time now I have been using R and Python for data analysis. I long ago discovered the Python technical stack of IPython, NumPy, SciPy, and Matplotlib, and I thought I knew what I was doing. I even dipped my toe into pandas as my data structure for analysis. But Python for Data Analysis showed me entire worlds of improvement in my workflow and my ability to work with data in the messy form that is found in the real world.

Python, like most interpreted languages, is slow compared to...more
James Williams
In my office, we spend a lot of time in the database. As such, we tend to become fairly adept at analyzing data with SQL: join some tables on interesting columns, group by other interesting columns, sprinkle in some aggregates, and pretty soon you have yourself a table of answers.

The relatively new windowing functions added in SQL Server 2012 let you do even fancier analysis (at the risk of needing to understand some new syntax).

Yet, sometimes, a raw table of SQL results just isn't enough. You mi...more
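The SQL workflow this review describes (join on a column, group by another, add aggregates) has a close pandas counterpart. What follows is only a minimal sketch: the `orders` and `customers` tables, their column names, and their values are made up for illustration and do not come from the book or the review.

```python
import pandas as pd

# Hypothetical tables, invented for this example.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, 30],
    "amount": [25.0, 75.0, 40.0, 10.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 30],
    "region": ["east", "west", "east"],
})

# Roughly: SELECT ... FROM orders JOIN customers USING (customer_id)
joined = orders.merge(customers, on="customer_id", how="inner")

# Roughly: GROUP BY region with SUM(amount) and COUNT(order_id)
summary = (
    joined.groupby("region")
    .agg(total_amount=("amount", "sum"), n_orders=("order_id", "count"))
    .reset_index()
)
print(summary)
```

The result is itself a DataFrame, so it can be filtered, reshaped, or plotted further instead of stopping at a raw table of results.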
Sefa
Good introduction to pandas data analysis library by its main contributor, Wes McKinney. Also covers useful Python tools/libraries for data analysis such as ipython and numpy. Lots of examples.

Didn't read the last three chapters on time series, financial data analysis and advanced numpy.

Ipython notebooks are available here, forked from the official repository of the book.
Steve
Good introduction to Python Pandas and other libraries for data analysis. However, the book goes directly from the introduction into pretty complicated examples. As a reader new to R, Pandas, and statistical languages, it was hard work to learn the data structures and semantics. After working through several web-based tutorials, I had a better intuitive sense for how to solve problems with the framework presented by the author.

As documentation for Pandas alone, this book is useful.
Derek Bridge
This book is a reasonably comprehensive tutorial to pandas - the Python library for data wrangling. As a tutorial, it works well.

But it wasn't quite what I was expecting. I was expecting less tutorial and more case studies - taking meaningful datasets (instead of makey-upy ones) and using pandas and other tools to pose and answer questions. For me, this would have made the book a much more practical resource.
Dani Arribas-bel
I've been using Python for quantitative economics and geography for the last five years and I wish this book had been written back then. It is clear, easy to read, full of (cool) examples and incredibly useful, regardless of whether you read it from first to last page or have it around to check while hacking. If you are getting started on data crunching with Python, you should not miss it. If you are not new to the party, this will still be a good resource to get your head around a bit more advan...more
Michael
I've used Python for a few years and have recently started implementing it at work. We used to have Matlab for basic scripting, plotting, and analysis and recently dropped it due to their ridiculous licensing costs. Some people moved on to Scilab successfully (it's a great solution) but I started using Python in lieu of Matlab.

This book is a great introduction to pandas (it's written by the main author of pandas) as well as an introduction to Numpy. Great read.
Nancy Wu
This book was the perfect set of training wheels for me, especially since my main goal was to operate on economic and financial data. By chapter 4 (practically the beginning of this book), I was able to sample random stocks and run correlations between stocks and commodities. I think that the TimeSeries chapter should be read just before or after chapter 4, to avoid some time groping in the dark with this datatype. Chapter 11 is also very useful, with a focus on data munging for financial data.
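For readers wondering what "sample random stocks, run correlations" might look like in pandas, here is a rough sketch. The price series are synthetic random walks and the column names (`STOCK_A`, `STOCK_B`, `COMMODITY_X`) are hypothetical; none of this is taken from the book's own examples.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices (random walks), invented for illustration only.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2012-01-02", periods=250)
log_steps = rng.normal(0.0, 0.01, size=(250, 3))
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(log_steps, axis=0)),
    index=dates,
    columns=["STOCK_A", "STOCK_B", "COMMODITY_X"],
)

# Pick two columns at random (a stand-in for sampling random tickers).
picked = prices.sample(n=2, axis="columns", random_state=0)

# Correlate daily percentage returns rather than raw price levels.
returns = prices.pct_change().dropna()
corr_matrix = returns.corr()
pair_corr = returns["STOCK_A"].corr(returns["COMMODITY_X"])
print(corr_matrix.round(3))
```

Working on returns instead of prices is a common choice because price levels of unrelated assets can appear correlated simply through shared trends.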
Matt Heavner
This is definitely pandas and financial analysis focused (by the author's own admission). It is filled with lots of great examples (it doesn't matter so much if they are scientific time series or financial time series, but much of the focus on "business days" and "last day of the quarter" isn't so applicable to scientific data sets, though it still has illustrative value). There were a few moments when the overall level "dipped" (one I remember specifically was the "hangup" early on regard...more
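As a small illustration of the "business days" and "last day of the quarter" handling mentioned above, the sketch below uses pandas date offsets. The dates are arbitrary, and this is an assumed example rather than one drawn from the book.

```python
import pandas as pd

# Business days (Monday to Friday) in a calendar week; dates chosen arbitrarily.
week = pd.bdate_range("2012-10-01", "2012-10-07")

# Roll a date forward to the last business day of its quarter.
quarter_end = pd.Timestamp("2012-10-08") + pd.offsets.BQuarterEnd()

print(list(week.strftime("%a %Y-%m-%d")))
print(quarter_end.date())
```

Note that these offsets know nothing about market holidays by default; a custom holiday calendar would be needed for that.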
Rishi Singh
It's a good book and great for understanding Wes's original intent on how to use pandas. I recommend that any serious pandas or data analysis user have this on their bookshelf. It is much better than the pandas docs for learning about the bigger picture and understanding how to connect the pieces of pandas.

My only major issue is that the content will become more outdated with each passing edition. Pandas is rapidly developing. This is an unfair reason to remove a star; the second reason is the b...more
Tristan Williams
A great handbook for anyone looking to break down data sets in Python. This won't teach you what to look for or how to do data analysis, but it will show you all the tools to get it done.
Philipp
Better name: 'Data munging with Python's pandas and numpy'

It focuses heavily on pandas and the myriad of things you can do with a DataFrame. Very often the examples are extremely specific, yet the example data is contrived, like "here's this rather specific case in which you want to average a subset of a column in a table, but only those cases where the person linked to the index of that column has the astrological sign Pisces", and about 5 minutes later, I already forgot how the author did it....more
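The kind of operation this review paraphrases (averaging part of a column, restricted to rows whose index label meets a condition stored elsewhere) usually comes down to boolean indexing. The sketch below is an assumption about what such an example could look like; the `people` and `scores` tables and their values are invented and are not from the book.

```python
import pandas as pd

# Hypothetical tables sharing the same index of person names.
people = pd.DataFrame(
    {"sign": ["Pisces", "Aries", "Pisces", "Leo"]},
    index=["ann", "bob", "cat", "dan"],
)
scores = pd.DataFrame(
    {"score": [80.0, 65.0, 90.0, 70.0]},
    index=["ann", "bob", "cat", "dan"],
)

# Boolean mask built from one table, aligned by index label with the other.
is_pisces = people["sign"] == "Pisces"

# Average the score column over only the rows where the mask is True.
pisces_mean = scores.loc[is_pisces, "score"].mean()
print(pisces_mean)
```

The same pattern of building a mask, selecting with `.loc`, and then aggregating covers many of the "very specific" cases; only the condition changes.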
Matthew
Want to use Python instead of Excel, Matlab, or R? Are you crunching numbers in Python? Want to learn about IPython, NumPy, Matplotlib, and pandas? Then Wes McKinney's book is for you. Whenever I'm performing data analysis using Python, I find myself referencing Python for Data Analysis. It's well written and covers a broad range of topics that you'll need when importing, manipulating, aggregating, calculating, or plotting data.
Eliot St. John
What I really need is "Python for Stata users."
Tom
It's not a bad book but if you are looking for a good book for scientific computing with Python you will probably be disappointed.
The book covers mostly pandas and doesn't give much information on numpy and matplotlib, and says nothing at all about scipy, all of which are more essential for scientific computing as far as I understand the topic.
On the other hand I'm sure that I will use what I've learned here soon, but only after reading more comprehensive information about the whole scipy stack...more
Grace
This is not a first book on data analysis or a learn-to-program-in-Python book. But if you already know how to program a little in Python and have experience with data analysis in other programming languages, this book will teach you exactly what you need to do the job in Python.

It's clearly written and well-edited and organized, just like other O'Reilly books. Even if you have no time to learn this yourself, buy it for your lab so your grad students will be more productive. ;-)
M Sheik Uduman Ali
As a primer for data analysis, this book is well written. With enough sample data for learning purposes, Wes explains the NumPy, pandas, and matplotlib libraries, in addition to the knowledge required for file handling, data loading, storage, wrangling, and aggregation.

Python for Data Analysis would be a good first stepping stone...

Read the full review at: http://udooz.net/blog/2012/12/book-re...
Jess
This book is helpful IF you already use Python, or are willing to run through a bunch of basic Python learning besides learning how to use Python for data analysis. There are some examples, but they are not fully annotated. Wes McKinney has produced a useful package with pandas. I warn readers (especially newbie readers) to be very patient in working out examples and annotating them until you understand what is happening.
Dgg32
With this book, I am beginning to ditch R and dive into Pandas. This way, I can have both data conversion and analysis in one code.

It is certainly not for Python beginners. I also got stuck on some mind-boggling syntax and financial jargon. But the beauty of pandas is still right there to see. It is only with pandas that I dare to take on some projects that were considered too cumbersome to tackle before.
Jascha
Not bad, but it doesn't provide anything more than the official documentation. Being the only book about pandas, you can't compare it. And you don't have alternatives. I honestly expected more. The only way to learn pandas is spending time with ipython and searching on stackoverflow, creating your own code snippets.
Court Corley
One of the best O'Reilly books; the introduction to IPython was as groundbreaking as learning Python itself. As an R user also, learning pandas for data frames and time series is a huge time saver. Overall, this book is a must-have for an aspiring and active data scientist!
Karl
I don't know if I'm reviewing this book or I'm reviewing the "pandas" package for Python, but I suspect that both will make my work infinitely easier. Highly recommended for those doing analysis in Python.
Alexander
numpy and pandas explained with examples
Rodrigo Rivera
Great book to learn how to leverage pandas and other Python libraries for data analysis. We need more books like this to compete against the vast library of R books.
  • Data Analysis with Open Source Tools
  • Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
  • Programming Collective Intelligence: Building Smart Web 2.0 Applications
  • Machine Learning for Hackers
  • Natural Language Processing with Python
  • Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann Series in Data Management Systems)
  • Data Science for Business: What you need to know about data mining and data-analytic thinking
  • Pattern Recognition and Machine Learning
  • Think Python
  • Algorithms
  • Hadoop: The Definitive Guide
  • JavaScript Patterns
  • Dive Into Python
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction
  • Objective-C Programming: The Big Nerd Ranch Guide
  • Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
  • Violent Python: A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers
  • Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die
