Data Science at the Command Line: Facing the Future with Time-Tested Tools

3.87  ·  Rating details ·  99 ratings  ·  10 reviews
This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.

To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Scien
Paperback, 212 pages
Published October 12th 2014 by O'Reilly Media (first published June 1st 2014)

Community Reviews

The book provides an easy and simple route to basic data analysis tasks -- scrubbing and exploration. It will be useful to readers who 1) are interested in data analysis and just getting started, 2) have been using tools such as R and Python for data analysis and have wanted simpler ways to scrub and explore data, or 3) are interested in improving their command-line chops in the context of data analysis. However, this is not the book from which to learn data analysis/science.

The author provides a virtualiza
Oct 06, 2018 rated it it was ok
I use and love the CLI on a daily basis, but the fact is that it is not suitable for most data analysis tasks.

Although there are some tools (a few introduced in the book) for working with data, you will sooner or later (and preferably sooner) end up redoing everything in Python/R...

The book also did not age well...

csvkit has nowadays been replaced by xsv, and drake has not seen a commit since 2015 (and does not seem very useful anyway).

So what you get:
- some very basic intro to relevant bash tools (curl, sed)
- some o
Pritesh Shrivastava
Sep 15, 2018 rated it liked it
I used this book as more of a guide to get familiar with the regular command line workflow. I find IDEs a much better tool for data science than a command line, as they are reproducible. But it is nice to know some of the shortcuts available in bash scripts.
Sweemeng Ng
This book is somewhat focused on the tools, which is good, but it also means I need to revisit the book as I explore each tool. The command-line tools it introduces are very interesting, and I will definitely adopt them. It just didn't bring in a lot of new ideas for me.
John Alan
Feb 24, 2015 rated it really liked it
This book will really help you turn your command line hacking into scalable and well managed data projects.

This book isn't about BIG data, it's about getting hands on data on your desktop in a flexible, fast and fun way.
However, the author isn't asking you to give up Hadoop etc.; he's asking whether you'd like another set of tools for another day.

The book is well structured, its flow and style are good, and it provides an easy read.

If you have no command line experience there's a brief intro but
Steven Pennebaker
This is an excellent book. Thorough and clear, it has enough basic information for beginners but even intermediate and advanced users will pick up plenty of new tricks. When I've had to solve these types of problems in the past, I've leaned pretty heavily on AWK and, to a lesser extent, XSL (!). This book introduced me to a bunch of utilities that were new to me and reminded me of a few old friends I haven't used in years.
Ravi Sinha
Dec 29, 2014 rated it really liked it
Shelves: coding, data-science, unix
Great compilation of well-known, not-so-well-known, and brand-new custom command line tools for OSEMNing with your data. Reads like a good tutorial. Could use a little bit of refinement - e.g. some commands are used multiple times before they are explained in detail. I installed the tools natively, but you can also install a VM instead. In either case, clone the book's companion repository; it has all the data and the author-supplied command line tools.
Jul 31, 2014 rated it really liked it  ·  review of another edition
It is a promising book, mostly for beginners, but an intermediate data scientist will find some good material to learn or will be inspired to dig into some very advanced topics.
In general, the idea of having your own data science toolkit as a service is great!

The book is still in the making, so this is only my preliminary rating.
Nov 15, 2014 rated it it was amazing  ·  review of another edition
A good but short demonstration of using command line tools to do data science. I have learned quite a few new ideas from the book. Well worth reading.
Mar 27, 2016 rated it it was amazing  ·  review of another edition
It is a short but very useful book. Most of the commands are practical and can be used without a lot of adjustments. Highly recommended reading!
rated it it was amazing
Jun 06, 2016
rated it really liked it
Feb 03, 2018
Miroslav Vidović
rated it it was amazing
Apr 27, 2016
rated it it was amazing
Jul 15, 2018
rated it really liked it
Dec 15, 2014
Ronald L Cogswell
rated it it was ok
Apr 15, 2015
rated it liked it
Aug 27, 2016
rated it it was amazing
Feb 04, 2015
rated it really liked it
May 09, 2015
rated it really liked it
Jan 19, 2015
Guilherme Rocha
rated it it was amazing
Apr 23, 2015
Daniel Nguyen
rated it it was amazing
Apr 29, 2017
rated it really liked it
Dec 27, 2014
Alok Singh
rated it really liked it
Aug 24, 2015
rated it liked it
Sep 24, 2015
rated it really liked it
Sep 07, 2016
rated it really liked it
Jul 27, 2015
Zaighum Rajput
rated it really liked it
Oct 25, 2016
Simo Tumelius
rated it really liked it
Sep 14, 2018
David Elks
rated it really liked it
Feb 10, 2018
  • Graph Databases
  • Thoughtful Machine Learning with Python: A Test-Driven Approach
  • Advanced Analytics with Spark
  • Programming JavaScript Applications: Robust Web Architecture With Node, HTML5, and Modern JS Libraries
  • Python Data Science Handbook: Tools and Techniques for Developers
  • Machine Learning with R
  • Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions
  • The Node Beginner Book
  • Hadoop Explained
  • Test-Driven Web Development with Python
  • Docker: Up & Running: Shipping Reliable Containers in Production
  • Ruby on Rails 3 Tutorial: Learn Rails by Example (Addison-Wesley Professional Ruby Series)
  • The Hitchhiker's Guide to Python: Best Practices for Development
  • Think Bayes
  • I Heart Logs: Event Data, Stream Processing, and Data Integration
  • Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
  • Cassandra: The Definitive Guide
  • Hadoop: The Definitive Guide
