Machine Learning with R Quick Start Guide

By: Iván Pastor Sanz

Overview of this book

Machine Learning with R Quick Start Guide takes you on a data-driven journey that starts with the very basics of R and machine learning. It gradually builds upon core concepts so you can handle the varied complexities of data and understand each stage of the machine learning pipeline. From data collection to implementing Natural Language Processing (NLP), this book covers it all. You will implement key machine learning algorithms to understand how they are used to build smart models. You will cover tasks such as clustering, logistic regression, random forests, support vector machines, and more. You will also look at more advanced aspects such as training neural networks and topic modeling. By the end of the book, you will be able to apply the concepts of machine learning, deal with data-related problems, and solve them using the powerful yet simple language that is R.
Table of Contents (9 chapters)

R Fundamentals for Machine Learning

You're probably used to hearing terms such as big data, machine learning, and artificial intelligence in the news. It's amazing how many new applications of these technologies appear every day. Recommender systems such as those used by Amazon and Netflix, search engines, stock market analysis, and even speech recognition are only a few examples. New algorithms and techniques emerge every year, and many of them build on previous approaches or combine existing algorithms. At the same time, there are more and more tutorials and courses focused on teaching them.

Many courses share a number of limitations, such as solving toy problems or focusing all of their attention on algorithms. These limitations can leave you with an incorrect understanding of the data modeling approach. The modeling process entails important earlier steps, such as business understanding, data understanding, and data preparation. Without these steps, there is no guarantee that the model can be applied without flaws in the future. Furthermore, model development does not finish once an appropriate algorithm has been found. The performance evaluation of the model, its interpretability, and its deployment are equally relevant and form the culmination of the modeling process.

In this book, we will learn how to develop different predictive models. The applications and examples in this book are based on the financial sector. We will also try to build a theoretical framework that helps you understand the causes of the financial crisis, which had a dramatic impact on countries around the world.

All of the algorithms and techniques used in this book will be applied using the R language. Nowadays, R is one of the major languages for data science. There is an enormous debate about which language is better, R or Python. Both languages have many strengths and some weaknesses as well.

In my experience, R is more powerful for the analysis of financial data. I've found many R libraries that specialize in this field, but not so many in Python. Nevertheless, credit risk and financial information are closely related to the treatment of time series, which, at least in my opinion, is handled better in Python. Recurrent and Long Short-Term Memory (LSTM) networks are better implemented in Python as well. However, R provides more powerful libraries for data visualization and interactivity. I recommend using both R and Python, choosing between them depending on your project. Good resources on machine learning with Python are available at Packt, some of which are listed here for your convenience:

In this chapter, let's refresh your knowledge of machine learning and get you started with coding in R.

The following topics will be covered in this chapter:

  • R and RStudio installation
  • Some basic commands
  • Objects, special cases, and basic operators in R
  • Controlling code flow
  • All about R packages
  • Taking further steps