Haskell Data Analysis Cookbook

Explore intuitive data analysis techniques and powerful machine learning methods using over 130 practical recipes.

This book will take you on a voyage through all the steps involved in data analysis. It provides synergy between Haskell and data modeling, consisting of carefully chosen examples featuring some of the most popular machine learning techniques.

You will begin with how to obtain and clean data from various sources. You will then learn how to use various data structures such as trees and graphs. The meat of data analysis occurs in the topics involving statistical techniques, parallelism, concurrency, and machine learning algorithms, along with various examples of visualizing and exporting results. By the end of the book, you will be empowered with techniques to maximize your potential when using Haskell for data analysis.

288 pages, Paperback

First published January 1, 2014

About the author

Nishant Shukla

10 books, 2 followers

Community Reviews

5 stars: 4 (20%)
4 stars: 8 (40%)
3 stars: 6 (30%)
2 stars: 1 (5%)
1 star: 1 (5%)
Displaying 1 - 3 of 3 reviews
4 reviews, 8 followers
August 4, 2014
[Full disclosure – I was given a free review copy of the book from the publisher. This review refers to the ebook version]

This book sees the purely functional, strongly typed language Haskell throw its hat in the data science ring. It is a great book for data scientists and analysts wanting to leverage the power of functional programming in data analysis applications. There are chapters on data cleaning, text scraping, hashing, tree traversal, social network analysis, basic stats and machine learning algorithms, MapReduce and visualisation. It is also good for more general-purpose beginner and intermediate-level Haskell hackers, since it covers a lot of areas that can be essential in day-to-day programming, such as reading and writing data from and to a variety of sources (including databases and the web), text processing, parallel programming and dealing with real-time data. Often in books on Haskell (as in many books on Lisp) much space is given to showing off the cool FP aspects of the language, but you are left struggling with the practical IO tasks that are no-brainers in more traditional languages.

The book covers a wide range of subjects, but it provides only a primer for most of them to get you started. In most cases this is enough, but there are a couple of areas where I would have liked to see greater depth. For example, in the statistics section, a more comprehensive introduction to linear modelling and regression would have been more convincing as to the advantages of switching to Haskell from, say, R or Python. I would also have liked to see a treatment of random number generation and simulation; the purely functional nature of Haskell seems to make it difficult to generate simple random sequences, because you need to set the seed for each run in order to maintain referential transparency (i.e. a function in Haskell generally needs to return the same value for the same input every time).
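
As a rough illustration of the point about seeds and referential transparency, here is a minimal sketch of a toy linear congruential generator that uses only base. The names `nextSeed` and `lcgStream` are hypothetical, and this is neither the API of the `random` package nor code from the book; it only shows that a pure generator has to take its seed as an explicit input and pass the updated seed along.

```haskell
import Data.List (unfoldr)
import Data.Word (Word32)

-- Toy linear congruential generator (hypothetical helper for illustration).
-- Word32 arithmetic wraps around, so this is implicitly modulo 2^32.
nextSeed :: Word32 -> Word32
nextSeed s = 1664525 * s + 1013904223

-- A pure, lazy stream of pseudo-random values.
-- Each step emits the new value and uses it as the seed for the next step.
lcgStream :: Word32 -> [Word32]
lcgStream = unfoldr (\s -> let s' = nextSeed s in Just (s', s'))

main :: IO ()
main = do
  let xs = take 3 (lcgStream 42)
      ys = take 3 (lcgStream 42)
  -- Same seed in, same sequence out: the function stays referentially transparent.
  print (xs == ys)
```

To get a different sequence on each run, the seed would have to come from outside the pure code, for example from the clock or the operating system inside `IO`.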

The examples are well chosen, written and explained, and there is a GitHub page with the source code from all the chapters if you don’t want to type out every one from scratch. Another nice touch is the list of data sources and APIs for doing your own analyses with. I’m sure you could find them with a little Googling, but it was good to have them all in one place and there were several I was not aware of.

It must be said that this is not an ‘introduction to Haskell’ book. There is plenty of assumed knowledge and the syntax is hardly discussed at all. However, I would highly recommend that anyone starting off with Haskell get this book alongside an introductory book such as the fantastic “Learn You a Haskell for Great Good!” to get to grips with the mind-bending complexity of the pure FP paradigm alongside the practical real-world applications.
Ivan Fraixedes
22 reviews
September 18, 2014
Haskell is one of the languages I have always been curious about. However, it is not widely used in the professional world, and many people are reluctant to explore new horizons, for reasons outside the scope of this review.

I could have started learning it in many ways, but I never found the time, because other duties and interests took priority.

When I was asked to review this book, I saw it as a good opportunity to get an introduction to Haskell, and to do so in a practical rather than an academic way.

This great book allowed me to discover Haskell’s “odd” syntax, as well as its strengths and its powerful support for parallel and concurrent computation.

Because the book is centred on how to use Haskell for data analysis, I got the chance to see how the language can be put to practical use. Today we live in a world where data is growing much faster than most people can imagine, and those of us who work with this raw data directly, or who build systems that have to support it, are constantly looking for new approaches and technologies that may improve our systems in several respects.

This awesome book dives into the core of Haskell, its API and the libraries available for analysing data. It starts with getting data in from different sources, such as JSON files and databases like MongoDB, and then covers sanitising it and organising it into common, well-regarded data structures.

Once the data is in those data structures, the book teaches how to analyse it with statistics and common techniques, and then how to boost performance with Haskell’s awesome parallel and concurrent design.

However, the book does not stop there. It moves on to the step where most of us developers may have the most fun: real-time data. It shows how to analyse data coming from Twitter and IRC channels, from polling web servers, watching the file system and communicating over sockets, and, why not, how to get data from a camera and tinker with it.

Furthermore, it teaches you how to produce a nice outcome from all the effort spent getting the data into your code and processing it: visualising it with plotting libraries and, at the end of the pipeline, exporting it to several formats so that you can keep it and report it to the people who make decisions from it.

By now you should feel that a practical take on Haskell is interesting enough to have in your hands and move your ideas forward, so go for it on PacktPub and have a great read.
Jake McCrary
424 reviews, 25 followers
January 1, 2015
Packt Publishing recently asked me to write a review of the book Haskell Data Analysis Cookbook by Nishant Shukla. The book is broken into small sections that show you how to do a particular task related to data analysis. These tasks vary from reading a CSV file or parsing JSON to listening to a stream of tweets.

I’m not a Haskell programmer. My Haskell experience is limited to reading some books (Learn You a Haskell for Great Good and most of Real World Haskell) and solving some toy problems. All of that reading and programming happened years ago, though, so I’m out of practice.

This book is not for a programmer that is unfamiliar with Haskell. If you’ve never studied it before you’ll find yourself turning towards documentation. If you enter this book with a solid understanding of functional programming you can get by with a smaller understanding of Haskell but you will not get much from the book.

I’ve only read a few cookbook style books and this one followed the usual format. It will be more useful as a quick reference than as something you would read through. It doesn’t dive deep into any topic but does point you toward libraries for various tasks and shows a short example of using them.

A common criticism I have of most code examples applies to this book. Most examples do not use qualified imports of namespaces or selective imports of functions from namespaces. Such imports are especially useful when your examples might be read by people who are not familiar with the language’s standard libraries. Reading code and immediately knowing where a function comes from is incredibly useful to understanding.
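
To make the distinction concrete, here is a small sketch of both import styles. The functions `wordCounts` and `topWords` are hypothetical and are not taken from the book; the example assumes the `containers` package that ships with GHC.

```haskell
-- Qualified import: every use must be written with the Map. prefix.
import qualified Data.Map.Strict as Map
-- Selective imports: only the listed names come into scope.
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

-- Count how often each whitespace-separated word occurs.
wordCounts :: String -> Map.Map String Int
wordCounts = Map.fromListWith (+) . map (\w -> (w, 1)) . words

-- The n most frequent words, most frequent first.
topWords :: Int -> String -> [(String, Int)]
topWords n = take n . sortBy (comparing (Down . snd)) . Map.toList . wordCounts

main :: IO ()
main = print (topWords 2 "a b a c b a")  -- prints [("a",3),("b",2)]
```

With these imports a reader can tell at a glance that `Map.fromListWith` comes from `Data.Map.Strict` and `sortBy` from `Data.List`, without having to look anything up.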

The code for this book is available on GitHub. It is useful to look at the full example for a section. The examples in the book are broken into parts with English explanations and I found that made it hard to fully understand how the code fit together. Looking at the examples in the GitHub repo helped.

Recommendation

I’d recommend this book for Haskell programmers who find the table of contents interesting. If you read the table of contents and think it would be useful to have a shallow introduction to the topics listed then you’ll find this book useful. It doesn’t give a detailed dive into anything but at least gives you a starting point.

If you are either learning Haskell or using Haskell then this book doesn’t have much to offer you.