Key Features
Apply R to simplify predictive modeling with short and simple code
Use machine learning to solve problems ranging from small to big data
Build a training and testing dataset from the churn dataset, applying different classification methods

Book Description
The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.
This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. Data exploration examples are provided that demonstrate how powerful data visualization and machine learning are in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.

What you will learn
Create and inspect the transaction dataset, performing association analysis with the Apriori algorithm
Visualize patterns and associations using a range of graphs and find frequent itemsets using the Eclat algorithm
Compare differences between each regression method to discover how they solve problems
Predict possible churn users with the classification approach
Implement the clustering method to segment customer data
Compress images with the dimension reduction method
Incorporate R and Hadoop to solve machine learning problems on Big Data

About the Author
Yu-Wei, Chiu (David Chiu) is the founder of LargitData. He has previously worked for Trend Micro as a software engineer, with the responsibility of building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis.

Table of Contents
Practical Machine Learning with R
Data Exploration with RMS Titanic
R and Statistics
Understanding Regression Analysis
Classification (I) – Tree, Lazy, and Probabilistic
Classification (II) – Neural Network and SVM
Model Evaluation
Ensemble Learning
Clustering
Association Analysis and Sequence Mining
Dimension Reduction
Big Data Analysis (R and Hadoop)

I received a free copy of "Machine Learning with R Cookbook" from Packt for review. I thank the publisher for the opportunity to read it, and I appreciate the work done by the author, Yu-Wei Chiu. What follows is a brief summary of my review.
By way of introduction: the book makes machine learning and statistics simple with R, which puts R on a competitive level with Python and other open source communities.
Chapters -1 & 2 ------------------------------------------------------------------
The author covers R installation and resources clearly and in detail. This saves beginners from having to google the basics.
Cons : 1. This assumes some basic R programming experience, especially in data management/handling.
Chapters -3 to 7 ------------------------------------------------------------------
Chapter 3 presents R's statistical tools with clear examples, giving a very quick view of R's statistical strength. The later chapters explain the various regression and classification methods in sufficient depth. The coverage makes for a valid comparison between Python's scikit-learn and R.
Cons : 1. This expects basic knowledge of statistics; what is explored is the R utilities for the available statistical methods. 2. This does not explain which methods are suitable for which kinds of data or applications. Readers have to work that out from their own knowledge.
Chapters -8 to 9 ------------------------------------------------------------------
Chapter 8 explains the boosting and bagging approaches. The author has cleverly built this chapter around data examples for these approaches.
Chapters -10 to 12 ------------------------------------------------------------------
These chapters open a wide door for R, especially on big data, scale, and dimensionality problems. I would say these last chapters show the real strength of R. The author's presentation is really commendable.
Cons : More light should have been shed on Big Data.
All in all, this book lives up to the title "Cookbook" for machine learning in R.