Machine Learning for Data Mining

By: Jesus Salcedo

Overview of this book

Combining machine learning (ML) with data mining gives you several ways to look at your data. This book will help you improve your data mining practice with smart modeling techniques and teach you how to implement ML algorithms in your data mining work. It will enable you to pair the best algorithms with the right tools and processes. You will learn how to identify patterns and make predictions with minimal human intervention. You will build different types of ML models, such as neural networks, Support Vector Machines (SVMs), and decision trees. You will see how each of these models works and what kind of data it is suited for. You will learn how to combine the results of different models in order to improve accuracy. Topics such as removing noise and handling errors will give you an added edge in model building and optimization. By the end of this book, you will be able to build predictive models and extract information of interest from a dataset.
Table of Contents (7 chapters)

Using statistics to interpret machine learning models

In this section, we will use statistics to interpret the results of a machine learning model. We talked about graphs and how they allow us to see and interpret the predictions of a machine learning model, but we can also use those graphs for statistical tests. Have a look at the following table:

Outcome / Predictor    Categorical    Continuous
Categorical            Chi-square     ANOVA
Continuous             ANOVA          Correlation

If we have a categorical outcome variable and a categorical predictor, we can use the chi-square test. The chi-square test of independence lets us examine the relationship between two categorical variables.
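To make the chi-square test of independence concrete, here is a minimal sketch that computes the test statistic by hand from a table of counts. The counts and the function name `chi_square_independence` are hypothetical, chosen only for illustration. In practice, a statistics library would normally do this and also report a p-value; the sketch stops at the statistic and its degrees of freedom.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count we would expect in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are predictor categories, columns are outcome categories
observed = [
    [30, 10],
    [20, 40],
]

chi2, dof = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

The larger the statistic is relative to its degrees of freedom, the stronger the evidence that the two categorical variables are related.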

Also, if we have a categorical outcome variable and a continuous predictor, we can use analysis of variance (ANOVA) to compare the mean of the continuous variable across the categories. If we are comparing only two categories, we can use a t-test instead, which...
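As an illustration of the ANOVA and t-test comparison, the sketch below computes a one-way ANOVA F statistic and a two-sample t statistic by hand. The data, the group names, and the helper functions `one_way_anova_f` and `pooled_t` are hypothetical, and the t statistic assumes equal variances in both groups (the pooled-variance form). Under that assumption, the squared t statistic for two groups equals the ANOVA F statistic for the same two groups.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pooled_t(a, b):
    """Two-sample t statistic, assuming equal variances (pooled variance)."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


# Hypothetical continuous predictor values, grouped by outcome category
low = [1, 2, 3]
mid = [2, 3, 4]
high = [5, 6, 7]

f_three, df_b, df_w = one_way_anova_f([low, mid, high])
print(f"Three categories: F = {f_three:.2f} on ({df_b}, {df_w}) degrees of freedom")

# With only two categories, the t-test and ANOVA agree: t squared equals F
t_two = pooled_t(low, mid)
f_two, _, _ = one_way_anova_f([low, mid])
print(f"Two categories: t = {t_two:.3f}, t^2 = {t_two ** 2:.3f}, F = {f_two:.3f}")
```

A large F statistic indicates that the group means differ by more than the within-group spread would suggest.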