Machine Learning with R - Fourth Edition

By: Brett Lantz

Overview of this book

Dive into R with this data science guide on machine learning (ML). Machine Learning with R, Fourth Edition, takes you through classification methods like nearest neighbor and Naive Bayes and regression modeling, from simple linear to logistic. Dive into practical deep learning with neural networks and support vector machines and unearth valuable insights from complex data sets with market basket analysis. Learn how to unlock hidden patterns within your data using k-means clustering. With three new chapters on data, you’ll hone your skills in advanced data preparation, mastering feature engineering, and tackling challenging data scenarios. This book helps you conquer high-dimensionality, sparsity, and imbalanced data with confidence. Navigate the complexities of big data with ease, harnessing the power of parallel computing and leveraging GPU resources for faster insights. Elevate your understanding of model performance evaluation, moving beyond accuracy metrics. With a new chapter on building better learners, you’ll pick up techniques that top teams use to improve model performance with ensemble methods and innovative model stacking and blending techniques. Machine Learning with R, Fourth Edition, equips you with the tools and knowledge to tackle even the most formidable data challenges. Unlock the full potential of machine learning and become a true master of the craft.

Summary

In this chapter, we learned the basics of managing data in R. We started by taking an in-depth look at the structures used for storing various types of data. The foundational R data structure is the vector, which is extended and combined into more complex data types, such as lists and data frames. The data frame is an R data structure that corresponds to the notion of a dataset having both features (columns) and examples (rows). R provides functions for reading data frames from, and writing them to, spreadsheet-like tabular data files.
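
As a quick refresher, the structures described above can be sketched in a few lines of base R. This is only an illustrative sketch: the variable names and the patient values are made up for this example, and the CSV file is written to a temporary location rather than to any path used in the chapter.

```r
# Vectors: ordered sets of values that all share one type
# (all names and values here are hypothetical illustration data)
subject_name <- c("John Doe", "Jane Doe", "Steve Graves")
temperature  <- c(98.1, 98.6, 101.4)
flu_status   <- c(FALSE, FALSE, TRUE)

# List: can combine values of different types in one object
subject1 <- list(
  name        = subject_name[1],
  temperature = temperature[1],
  flu_status  = flu_status[1]
)

# Data frame: columns are features, rows are examples
pt_data <- data.frame(subject_name, temperature, flu_status,
                      stringsAsFactors = FALSE)

# Round trip through a spreadsheet-like CSV file
csv_path <- tempfile(fileext = ".csv")
write.csv(pt_data, csv_path, row.names = FALSE)
pt_copy <- read.csv(csv_path, stringsAsFactors = FALSE)

str(pt_copy)  # inspect the structure of the re-imported data frame
```

Setting row.names = FALSE when writing keeps R's row labels out of the file, so the data frame read back has the same three columns as the original.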

We then explored a real-world dataset containing the prices of used cars. We examined numeric variables using common summary statistics of center and spread, and visualized relationships between prices and odometer readings with a scatterplot. Next, we examined nominal variables using tables. In examining the used car data, we followed an exploratory process that can be used to understand any dataset. These skills will be required for the other projects throughout this book.
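
The exploratory process summarized above can be sketched as follows. The small usedcars data frame built here is a hypothetical stand-in with invented values and assumed column names (price, mileage, color), not the dataset analyzed in the chapter; the functions themselves are all part of base R.

```r
# Hypothetical toy data standing in for a used car dataset
usedcars <- data.frame(
  price   = c(21992, 20995, 19995, 17809, 17500, 14995, 13950, 12995),
  mileage = c(7413, 10926, 7351, 11613, 8367, 25125, 36124, 40539),
  color   = c("Yellow", "Gray", "Silver", "Gray",
              "White", "Silver", "Black", "Silver"),
  stringsAsFactors = FALSE
)

# Numeric variables: measures of center and spread
print(summary(usedcars$price))        # five-number summary plus the mean
price_mean   <- mean(usedcars$price)
price_median <- median(usedcars$price)
price_range  <- range(usedcars$price) # minimum and maximum
price_iqr    <- IQR(usedcars$price)   # spread of the middle 50 percent
price_sd     <- sd(usedcars$price)

# Relationship between two numeric variables: a scatterplot
if (!interactive()) pdf(NULL)         # avoid writing a plot file in batch mode
plot(x = usedcars$mileage, y = usedcars$price,
     main = "Price vs. Mileage (toy data)",
     xlab = "Odometer (mi.)", ylab = "Price ($)")

# Nominal variables: one-way table of counts and percentages
color_counts <- table(usedcars$color)
color_pct    <- round(prop.table(color_counts) * 100, 1)
print(color_counts)
print(color_pct)
```

The same sequence (summarize numeric features, plot pairs of interest, tabulate categorical features) applies to any data frame once the column names are swapped in.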

Now that we have spent some time understanding the basics of data management with R, you are ready to begin using machine learning to solve real-world problems. In the next chapter, we will tackle our first classification task using nearest neighbor methods. You may be surprised to discover that with just a few lines of R code, a machine can achieve human-like performance on a challenging medical diagnosis task.

Join our book’s Discord space

Join our Discord community to meet like-minded readers and learn alongside more than 4000 members at:

https://packt.link/r