Data Science at the Command Line: Facing the Future with Time-Tested Tools
3.83  ·  Rating details ·  115 ratings  ·  16 reviews
This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.

To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Scien
Paperback, 212 pages
Published October 12th 2014 by O'Reilly Media (first published June 1st 2014)


Community Reviews

Oct 06, 2018 rated it it was ok
I use and love the CLI on a daily basis, but the fact is that it is not suitable for most data analysis tasks.

Although there are some tools (a few introduced in the book) for working with data, you will sooner or later (and preferably sooner) end up redoing everything in Python/R...

The book also did not age well...

csvkit has nowadays been replaced by xsv, and drake has not seen a commit since 2015 (and does not seem very useful anyway).

So what you get:
- some very basic intro to relevant bash tools (curl, sed)
- some
The book provides an easy and simple route to basic data analysis tasks -- scrubbing and exploration. It will be useful to readers who 1) are interested in data analysis and just getting started, 2) have been using tools such as R and Python for data analysis and have wanted simpler ways to scrub and explore data, or 3) are interested in improving their command-line chops in the context of data analysis. However, this is not the book to learn data analysis/science.

The author provides a virtualiza
Tim Tilberg
Aug 14, 2020 rated it it was amazing
Shelves: my-books
The first 40 pages unpack dozens of practical ideas and tools I had never previously considered. It introduced me to tools like jq, csvtoolkit (which led to xsv), and many more. It talks about why sometimes running data tasks using Unix shell tools can be significantly more efficient than writing a move in a programming language (full parallelization with an entirely buffered pipelin...
As a backend/infra engineer with a few years' experience and an introductory-level knowledge of machine learning algorithms, I still found this sprinkled with useful snippets, tools, tips, and references. That being said, I found the structure of it odd - seems like it would be better off as a cookbook - and thought it was confused in identifying and catering to its target audience, e.g. the early chapters are beginner (i.e. familiar with data science, but not the command line) friendly, but the l...
Aug 16, 2019 rated it really liked it
Pretty cool book if you are not already set in your own ways (there is a chapter on modelling data with good tools which I would not use because I am much more used to different stuff). If you already have your own habits, you can still learn quite a few things (at least I did) and get some inspiration to build your own command line tools for data science!
Pritesh Shrivastava
Sep 15, 2018 rated it liked it
I used this book more as a guide to get familiar with the regular command line workflow. I find IDEs a much better tool for data science than a command line, as it's reproducible. But it's nice to know some of the shortcuts available in bash scripts.
Sweemeng Ng
This book is a little focused on the tools. That is good, but it also means I need to revisit the book as I explore each tool. The command-line tools introduced are very interesting. I will definitely adopt them. It just didn't bring in a lot of new ideas for me.
Ondrej Kokes
Jul 10, 2019 rated it it was ok
Not time tested tools. The book mentions coreutils and other unix tools fairly lightly, then spends much of the time with random tools that will become obsolete sooner or later (as we can see already now).

Pick up Unix Power Tools or Classic Shell Scripting instead.
Stein Karlsen
Mar 25, 2020 rated it it was amazing
Great overview and examples of command line tools to perform data science. I would wish for a few more tools and a deeper dive.
Michael Lee
Aug 08, 2019 rated it it was amazing
Must read for ML engineers in enterprise settings.
John Alan
Feb 24, 2015 rated it really liked it
This book will really help you turn your command line hacking into scalable and well managed data projects.

This book isn't about BIG data, it's about getting hands on data on your desktop in a flexible, fast and fun way.
However, the author isn't asking you to give up Hadoop etc.; he's asking you if you'd like another set of tools for another day.

The book is well structured, its flow and style are good, and it provides an easy read.

If you have no command line experience there's a brief intro but
Steven Pennebaker
This is an excellent book. Thorough and clear, it has enough basic information for beginners but even intermediate and advanced users will pick up plenty of new tricks. When I've had to solve these types of problems in the past, I've leaned pretty heavily on AWK and, to a lesser extent, XSL (!). This book introduced me to a bunch of utilities that were new to me and reminded me of a few old friends I haven't used in years.
Ravi Sinha
Dec 29, 2014 rated it really liked it
Shelves: coding, data-science, unix
Great compilation of well-known, not-so-well-known, and brand-new custom command line tools for OSEMNing with your data. Reads like a good tutorial. Could use a little bit of refinement - e.g. some commands are used multiple times before they are explained in detail. I installed the tools natively, but you can also install a VM instead. In either case, clone the book's companion repository; it has all the data and the author-supplied command line tools.
Jul 31, 2014 rated it really liked it  ·  review of another edition
It is a promising book, mostly for beginners, but an intermediate data scientist will find some good material to learn or will be inspired to dig into some very advanced topics.
In general, the idea of having your own data science toolkit as a service is great!

The book is still in the making, so this is only my preliminary rating.
Nov 15, 2014 rated it it was amazing  ·  review of another edition
A good but short demonstration of using command line tools to do data science. I have learned quite a few new ideas from the book. Well worth reading.
Mar 27, 2016 rated it it was amazing  ·  review of another edition
It is a short but very useful book. Most of the commands are practical and can be used without a lot of adjustments. Highly recommended reading!
Guido Fawkes
rated it it was amazing
Jun 06, 2016
rated it really liked it
Feb 03, 2018
Miroslav Vidović
rated it it was amazing
Apr 27, 2016
rated it it was amazing
Jul 15, 2018
rated it really liked it
Dec 15, 2014
Ronald L Cogswell
rated it it was ok
Apr 15, 2015
rated it liked it
Aug 27, 2016
rated it it was amazing
Feb 04, 2015
rated it really liked it
May 09, 2015
rated it really liked it
Jan 19, 2015
Guilherme Rocha
rated it it was amazing
Apr 23, 2015
rated it really liked it
Dec 27, 2014
Alok Singh
rated it really liked it
Aug 24, 2015
rated it liked it
Sep 24, 2015

Readers also enjoyed

  • Learning Scrapy
  • The Minitest Cookbook: Testing Tactics for the Pragmatic Rubyist
  • Text Processing with Ruby: Extract Value from the Data That Surrounds You
  • The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
  • Enterprise Rails
  • Build Awesome Command-Line Applications in Ruby: Control Your Computer, Simplify Your Life
  • Deep Learning and the Game of Go
  • Stillness Is the Key
  • Annihilation Factor
  • A Monk's Guide to a Clean House and Mind
  • Advice Not Given: A Guide to Getting Over Yourself
  • God's Little Soldier
  • Neil Gaiman and Charles Vess' Stardust #1
  • The Icarus Deception: How High Will You Fly?
  • The Order of Time
  • Thinking with Data
  • Machine Trading: Deploying Computer Algorithms to Conquer the Markets (Wiley Trading)
  • Positive Evolutionary Psychology: Darwin's Guide to Living a Richer Life