# Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

by Wes McKinney

"Python for Data Analysis" is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data

ebook, 466 pages

Published October 8th 2012 by O'Reilly Media (first published December 30th 2011)

## Community Reviews


*Pandas and NumPy in Action*

As the creator of the pandas project, a Python data analysis framework, Wes McKinney is well placed to write this book. His experience with pandas and his vision for it are clear, and he explains the main functions and inner workings of both pandas and another package, NumPy, very well.

Although the title of the book suggests a broad look at the Python language for data analysis, McKinney almost exclusively focuses on an in ...more

Nov 09, 2012 · Louis rated it: it was amazing · review of another edition

Shelves: computer, math-stats

For some time now I have been using R and Python for data analysis. I long ago discovered the Python technical stack of IPython, NumPy, SciPy, and Matplotlib, and I thought I knew what I was doing. I even dipped my toe into pandas as my data structure for analysis. But Python for Data Analysis showed me entire worlds of improvement in my workflow and in my ability to work with data in the messy form found in the real world.

Python, like most interpreted languages, is slow compared to ...more


Aug 20, 2012 · Rob rated it: liked it · review of another edition

Recommends it for: folks doing data analysis that have already decided to use Python

I did copy editing on this book, so my review is of an unfinished (but close to finished) version. That being said: McKinney is the principal author on pandas, a Python package for doing data transformation and statistical analysis. The book is largely about pandas (and NumPy), but also delves into general methodologies for munging data and performing analytical operations on them (e.g., normalizing messy data and turning it into graphs and tables); he also delves into some (semi) esoteric infor
...more

Didn't read the last three chapters on time series, financial data analysis, and advanced NumPy.

IPython notebooks are available here, forked from the official repository of the book.

But it wasn't quite what I was expecting. I was expecting less tutorial and more case studies - taking meaningful datasets (instead of makey-upy ones) and using pandas and other tools to pose and answer questions. For me, this would have made the book a much more practical resource.

As documentation for Pandas alone, this book is useful.

A new edition due in fall 2017 should correct some of the outdated material as well.

As well as Pandas you'll cover IPython, NumPy and Matplotlib in enough depth to get you started with data analysis and visualization.

You don't need to be a Python expert, but some Python knowledge and some experience of R will definitely help.

The book is well structured, breaking down the different topics into well defined chapters which deal with topi ...more

The relatively new windowing functions added in SQL Server 2012 let you do even fancier analysis (at the risk of needing to understand some new syntax).

Yet, sometimes, a raw table of SQL results just isn't enough. You mi ...more

I gave it 2 stars because it is a little out of date and offers almost no practical examples while covering pandas' functionality.

If you want to master pandas, I would strongly recommend diving into the official documentation instead (the 10-minute intro and the tutorials).

**Great for Transition**

As an R user I always hear people say that one should also learn Python as a secondary language. So I gave this book a shot and it did not disappoint.

*Wes McKinney* is the creator of Pandas, a framework for working with structured data in Python. Accordingly, a great deal of the book deals with Pandas and with solving classical data science tasks (cleaning and munging your data). His style of writing is very clear and one can easily grasp the concepts by applying them in one's o ...more

One can find fault with it, but it is worth remembering: pandas currently has, de facto, no alternatives for data analysis in Python, and the tool is ...more

My only major issue is that the content will become more outdated with each passing edition. Pandas is rapidly developing. This is an unfair reason to remove a star; the second reason is the b ...more

It focuses heavily on pandas and the myriad of things you can do with a DataFrame. Very often the examples are extremely specific, yet the example data is contrived, like "here's this rather specific case in which you want to average a subset of a column in a table, but only those cases where the person linked to the index of that column has the astrological sign Pisces", and about 5 minutes later, I already forgot how the author did it. ...more
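For anyone who, like this reviewer, forgets the pattern: the kind of operation described above can be sketched in a few lines of pandas. This is only an illustrative guess at the shape of such an example, not the book's own code. The data, the names `people` and `scores`, and the column `sign` are all hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration (not taken from the book).
people = pd.DataFrame(
    {"sign": ["Pisces", "Aries", "Pisces", "Leo"]},
    index=["ann", "bob", "cy", "dee"],
)
scores = pd.Series(
    [10.0, 20.0, 30.0, 40.0],
    index=["ann", "bob", "cy", "dee"],
    name="score",
)

# Boolean mask: True for each index label whose person is a Pisces.
is_pisces = people["sign"] == "Pisces"

# Select only the matching entries of the column, then average them.
pisces_mean = scores[is_pisces].mean()
print(pisces_mean)
```

The mask and the column share the same index, so pandas lines them up by label and not by position.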

The book covers mostly pandas. It doesn't give much information on NumPy and matplotlib, and says nothing at all about SciPy, all of which are more essential for scientific computing as far as I understand the topic.

On the other hand I'm sure that I will use what I've learned here soon, but only after reading more comprehensive information about the whole scipy stack ...more

It's clearly written and well-edited and organized, just like other O'Reilly books. Even if you have no time to learn this yourself, buy it for your lab so your grad students will be more productive. ;-)

This book is a great introduction to pandas (it's written by the main author of pandas) as well as an introduction to NumPy. Great read.

It is certainly not for Python beginners. I also got stuck on some mind-boggling syntax and financial jargon. But the beauty of pandas is still right there to see. It is only with pandas that I dare to take on some projects that were previously considered too cumbersome to tackle.

Thankfully, much of what it covers is contained in the official pandas documentation for all versions of the library, and if supplemented in this way the book is

*Python for Data Analysis*. It's well written and covers a broad range of topics that you'll need when importing, manipulating, aggregating, calculating, or plotting data.

Will definitely put this book at my side for a while for reference.
