Machine Learning with R Quick Start Guide

By: Iván Pastor Sanz

Overview of this book

Machine Learning with R Quick Start Guide takes you on a data-driven journey that starts with the very basics of R and machine learning. It gradually builds upon core concepts so you can handle the varied complexities of data and understand each stage of the machine learning pipeline. From data collection to implementing Natural Language Processing (NLP), this book covers it all. You will implement key machine learning algorithms to understand how they are used to build smart models. You will cover tasks such as clustering, logistic regressions, random forests, support vector machines, and more. Furthermore, you will also look at more advanced aspects such as training neural networks and topic modeling. By the end of the book, you will be able to apply the concepts of machine learning, deal with data-related problems, and solve them using the powerful yet simple language that is R.

Logistic regression

Mathematically, a binary logistic model has a dependent variable with two categorical values. In our example, these values relate to whether or not a bank is solvent.

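In a logistic model, the log odds are the logarithm of the odds of a class, and they are modeled as a linear combination of one or more independent variables, as follows:

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$$

Here, $p$ is the probability of the class of interest and $x_1, \dots, x_n$ are the independent variables.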

The coefficients (beta values, β) of a logistic regression are estimated by maximum likelihood. Maximum likelihood estimation finds the coefficient values that make the observed outcomes most probable under the model, which amounts to minimizing the discrepancy between the probabilities predicted by the model and the outcomes actually observed.
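To make the estimation step concrete, the following is a minimal sketch, not the book's code, that fits a one-variable logistic model by minimizing the negative log-likelihood directly with `optim()` and compares the result with `glm()`. The data are simulated, and every name in it (`x`, `y`, `true_beta`, `neg_log_lik`) is an assumption made for illustration only.

```r
# Simulated data; all names are illustrative and not taken from the book's data set
set.seed(1234)
n <- 500
x <- rnorm(n)
true_beta <- c(-0.5, 1.2)                       # intercept and slope used to simulate
y <- rbinom(n, size = 1, prob = plogis(true_beta[1] + true_beta[2] * x))

# Negative log-likelihood of a binary logistic model
neg_log_lik <- function(beta, x, y) {
  eta <- beta[1] + beta[2] * x                  # log odds (linear predictor)
  # Per-observation negative log-likelihood: log(1 + exp(eta)) - y * eta
  sum(log1p(exp(eta)) - y * eta)
}

# Maximizing the likelihood is equivalent to minimizing its negative
fit_optim <- optim(par = c(0, 0), fn = neg_log_lik, x = x, y = y, method = "BFGS")

# glm() solves the same problem with iteratively reweighted least squares
fit_glm <- glm(y ~ x, family = binomial)

print(fit_optim$par)
print(unname(coef(fit_glm)))
```

The two sets of estimates should agree up to numerical tolerance, which illustrates that `glm()` with a binomial family is performing maximum likelihood estimation.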

Logistic regression is very sensitive to outliers and to multicollinearity, so extreme values should be treated and highly correlated variables should be avoided. Logistic regression in R can be applied as follows:

set.seed(1234)
LogisticRegression=glm(train...
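
As a self-contained illustration of the same workflow, here is a hedged sketch that fits a binary logistic regression with `glm()` on simulated data and scores a hold-out set. It does not use the book's bank data set: the data frame `banks`, its columns (`capital_ratio`, `bad_loans`, `failed`), the 70/30 split, and the 0.5 cutoff are all assumptions introduced for this example.

```r
set.seed(1234)
n <- 1000

# Hypothetical bank-level data; column names are invented for this example
banks <- data.frame(
  capital_ratio = rnorm(n, mean = 10, sd = 2),
  bad_loans     = rnorm(n, mean = 5, sd = 1.5)
)
log_odds <- -1 - 0.4 * (banks$capital_ratio - 10) + 0.8 * (banks$bad_loans - 5)
banks$failed <- rbinom(n, size = 1, prob = plogis(log_odds))

# Random 70/30 split into training and test sets
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train_set <- banks[train_idx, ]
test_set  <- banks[-train_idx, ]

# Binary logistic regression: the binomial family uses the logit link by default
model <- glm(failed ~ capital_ratio + bad_loans,
             data = train_set, family = binomial(link = "logit"))
print(summary(model))

# type = "response" returns probabilities instead of log odds
pred_prob  <- predict(model, newdata = test_set, type = "response")
pred_class <- ifelse(pred_prob > 0.5, 1, 0)     # 0.5 is an arbitrary cutoff

# Confusion matrix of observed against predicted classes
print(table(observed = test_set$failed, predicted = pred_class))
```

Calling `predict()` without `type = "response"` would return the linear predictor, that is, the log odds, so the conversion to probabilities has to be requested explicitly.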