Praise for previous editions: "Gandrud has written a great outline of how a fully reproducible research project should look from start to finish, with brief explanations of each tool that he uses along the way... Advanced undergraduate students in mathematics, statistics, and similar fields as well as students just beginning their graduate studies would benefit the most from reading this book. Many more experienced R users or second-year graduate students might find themselves thinking, 'I wish I'd read this book at the start of my studies, when I was first learning R!'...This book could be used as the main text for a class on reproducible research ..." (The American Statistician)
Reproducible Research with R and RStudio, Third Edition brings together the skills and tools needed for doing and presenting computational research. Using straightforward examples, the book takes you through an entire reproducible research workflow. This practical workflow enables you to gather and analyze data as well as dynamically present results in print and on the web. Supplementary materials and examples are available on the author's website.
New to the Third Edition
Updated package recommendations, examples, and URLs, and removed technologies no longer in regular use.
More advanced R Markdown (and less LaTeX) in discussions of markup languages and examples.
Stronger focus on reproducible working directory tools.
Updated discussion of cloud storage services and persistent reproducible material citation.
Added discussion of Jupyter notebooks and reproducible practices in industry.
Examples of data manipulation with Tidyverse tibbles (in addition to standard data frames) and pivot_longer() and pivot_wider() functions for pivoting data.
Features
Incorporates the most important advances that have been developed since the previous editions were published
Describes a complete reproducible research workflow, from data gathering to the presentation of results
Shows how to automatically generate tables and figures using R
Includes instructions on formatting a presentation document via markup languages
Discusses cloud storage and versioning services, particularly GitHub
Explains how to use Unix-like shell programs for working with large research projects
This book overpromises a bit. Gandrud claims to want to talk about sustainable ways to conduct reproducible research, which is a very laudable goal. He sketches out a project workflow to do this, but then he gets seduced by the weeds and starts just giving a tutorial about various actual coding commands in R, knitr, R Markdown, etc. He ends up giving relatively short shrift to discussing and analyzing the actual workflow itself, its mechanics, pros and cons, ramifications, etc. Beyond that, there are good tidbits scattered throughout, as there usually are in these types of walkthrough books. I appreciate Gandrud's perspectives throughout. The main points are fine; I just wish he had dwelt on them more.
One of the best among many books on data science and reproducible research using R or Python. This book focuses on the entire flow of reproducible research using R. It also points out some interesting issues, such as citing and licensing reproducible research, at the end of the book. It also has a great cover that summarizes the book as well as its title does.
A non-reproducible book on reproducible research would be awkward. The book provides all the source code and the text that can be run to generate the book itself at https://github.com/christophergandrud....
I learned a lot from this book. The book files are available on GitHub, but some files are missing, so some of the code does not run without errors. This is disappointing given that the aim of the book is to make work reproducible and code future-proof. I would like to see a discussion of new "container tools" like Docker, which would be more useful and closer to the aims of the book than sections like "12.3 Slideshows".