Data Analysis with R, Second Edition

Overview of this book

Frequently the tool of choice for academics, R has spread deep into the private sector and can be found in the production pipelines at some of the most advanced and successful enterprises. The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. Starting with the basics of R and statistical reasoning, this book dives into advanced predictive analytics, showing how to apply those techniques to real-world data through real-world examples. Packed with engaging problems and exercises, this book begins with a review of R and its syntax with packages like Rcpp, ggplot2, and dplyr. From there, get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. Solve the difficulties relating to performing data analysis in practice and find solutions to working with messy data, large data, communicating results, and facilitating reproducibility. This book is engineered to be an invaluable resource through many stages of anyone’s career as a data analyst.

Exercises


Practise the following exercises to get a firm grasp on the concepts learned so far:

  • Did you notice that I put CV in italics when I said that using k=27 seems like a safe bet as measured by the minimization of the CV error? Did you wonder why? I (quite deliberately) made a gaffe in choosing the k in the k-NN from Figure 10.4. My choice wasn't wrong, per se, but it may have been informed by data that should have been unavailable to me. How might I have committed a common but serious error in hyper-parameter tuning? How might I have done things differently?
  • Remember that we spent a long time talking about the assumptions of linear regression? In contrast, we spent virtually no time discussing the assumptions of logistic regression. Although logistic regression has less stringent assumptions than its cousin, it is not assumption-free. Think about what some assumptions of logistic regression might be. Confirm your suspicions by doing research on the web. My omission of the...