R Data Analysis Cookbook - Second Edition

By: Kuntal Ganguly, Shanthi Viswanathan, Viswa Viswanathan

Overview of this book

Data analytics with R has become an important focus for organizations of all kinds. R enables even those with only an intuitive grasp of the underlying concepts, and no deep mathematical background, to carry out powerful and detailed examinations of their data. This book shows you how to put your R data analysis skills to practical use, with recipes for both basic and advanced data analysis tasks. From acquiring your data and preparing it for analysis to applying more complex techniques, the book shows you how to implement each technique effectively. You will also visualize your data using popular R packages such as ggplot2 and draw hidden insights from it. Starting with basic concepts such as handling your data and creating basic plots, you will move on to more advanced techniques such as performing cluster analysis and generating effective analysis reports and visualizations. Throughout the book, you will learn about the common problems and obstacles you might encounter while implementing each technique in R, along with straightforward ways to overcome them. By the end of this book, you will have the knowledge you need to become an expert in data analysis with R and to put your skills to the test in real-world scenarios.
Table of Contents (14 chapters)

What this book covers

Chapter 1, Acquire and Prepare the Ingredients – Reading Your Data, provides recipes to acquire, format, and cleanse data in multiple formats. Handling missing values, standardizing datasets, and converting between numerical and categorical data are also covered.

Chapter 2, What's in There? – Exploratory Data Analysis, shows you how to perform exploratory data analysis and find underlying patterns so that you understand your dataset before getting into the analysis process.

Chapter 3, Where does it belong? – Classification, covers several classification techniques, from basic classification trees, logistic regression, and support vector machines to text classification with Naive Bayes for sentiment analysis.

Chapter 4, Give me a number – Regression, covers several algorithms for data prediction, such as linear regression, random forests, neural networks, and regression trees.

Chapter 5, Can you simplify that? – Data Reduction Techniques, covers code recipes for data reduction and clustering. We explore the different clustering algorithms in a practical way.

Chapter 6, Lessons from history – Time Series Analysis, explores how to work with financial time series data, how to visualize it, and how to make predictions using ARIMA models.

Chapter 7, How does it look? – Advanced data visualization, explores how to create attractive visualizations, 3D graphs, and advanced maps.

Chapter 8, This may also interest you – Building Recommendations Systems, guides you step by step through applying machine learning and data mining techniques and building and optimizing recommender models, followed by a practical fraud system example.

Chapter 9, It's all about Connections – Social Network Analysis, explores how to acquire, visualize, and cluster social network data using public APIs.

Chapter 10, Put your best foot forward – Document and present your Analysis, shows you how to present and share the results of your data analysis. It includes recipes that use R Markdown, knitr, and Shiny to create reports and dynamic dashboards.

Chapter 11, Work Smarter, not Harder – Efficient and elegant R code, covers recipes to handle large datasets using the apply family of functions, the plyr package, and data tables to slice and dice data.

Chapter 12, Where in the world? – Geospatial Analysis, teaches you how to perform geospatial data analysis in R with tools such as Google Maps and QGIS. It covers how to import maps and plot your own data on them.

Chapter 13, Playing nice – Working with external data sources, shows you how to work with external data sources such as Excel, MySQL, or MongoDB, and how to process large datasets in memory using Apache Spark.