Machine Learning for Data Mining

By: Jesus Salcedo

Overview of this book

Combining machine learning (ML) with data mining gives you several ways to look at your data and can noticeably improve your results. This book shows you how to implement ML algorithms and smart modeling techniques in your data mining work, and how to pair the best algorithms with the right tools and processes. You will learn how to identify patterns and make predictions with minimal human intervention. You will build different types of ML models, such as neural networks, Support Vector Machines (SVMs), and decision trees, and see how each model works and what kind of data it is suited for. You will also learn how to combine the results of different models to improve accuracy. Topics such as removing noise and handling errors will give you an added edge in model building and optimization. By the end of this book, you will be able to build predictive models and extract information of interest from a dataset.
Table of Contents (7 chapters)

Removing noise to improve models

Let's focus on how noise can affect the results. Here, noise means missing data, outliers, or an excess of predictors, all of which can confuse the model and degrade its predictions.
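To make the first two kinds of noise concrete, here is a minimal sketch in plain Python that counts missing values and flags outliers per column. The column names, the sample records, and the 1.5 × IQR fence are illustrative assumptions, not something prescribed by the book.

```python
import statistics

def noise_profile(rows, columns):
    """Count missing values and IQR-rule outliers for each numeric column."""
    profile = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        missing = len(values) - len(present)
        # Quartiles from the standard library; the 1.5 * IQR fence is a
        # common convention, assumed here for illustration only.
        q1, _, q3 = statistics.quantiles(present, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        outliers = [v for v in present if v < low or v > high]
        profile[col] = {"missing": missing, "outliers": outliers}
    return profile

# Hypothetical records: one missing age, one missing income, one extreme income.
rows = [
    {"age": 34, "income": 52000},
    {"age": 29, "income": 48000},
    {"age": None, "income": 51000},
    {"age": 41, "income": 950000},
    {"age": 38, "income": 47000},
    {"age": 31, "income": None},
    {"age": 36, "income": 50000},
]

profile = noise_profile(rows, ["age", "income"])
print(profile)
```

A profile like this only locates candidate noise. Whether to impute, cap, or drop the flagged values remains a modeling decision.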

Decision tree models largely avoid the noise caused by too many predictors because, by default, they leave out any predictor they do not use for predictions. Most other statistical and machine learning models keep every predictor they are given.
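The behavior described above can be illustrated with a toy tree builder. This is a simplified sketch that assumes Gini impurity and binary threshold splits; production tree algorithms differ in their split criteria and stopping rules. The predictor names `tenure` and `shoe_size` and the data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) with the largest impurity reduction, or None."""
    parent = gini(labels)
    best, best_gain = None, 0.0
    for feature in features:
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            gain = parent - weighted
            if gain > best_gain + 1e-12:
                best, best_gain = (feature, threshold), gain
    return best

def used_predictors(rows, labels, features):
    """Grow a tree recursively and collect the predictors it actually splits on."""
    split = best_split(rows, labels, features)
    if split is None:  # pure node or no useful split: stop growing
        return set()
    feature, threshold = split
    used = {feature}
    for keep in (lambda v: v <= threshold, lambda v: v > threshold):
        subset = [(row, y) for row, y in zip(rows, labels) if keep(row[feature])]
        sub_rows = [row for row, _ in subset]
        sub_labels = [y for _, y in subset]
        used |= used_predictors(sub_rows, sub_labels, features)
    return used

# Hypothetical data: churn depends only on tenure; shoe_size is irrelevant.
rows = [
    {"tenure": 1, "shoe_size": 42}, {"tenure": 3, "shoe_size": 38},
    {"tenure": 5, "shoe_size": 44}, {"tenure": 6, "shoe_size": 40},
    {"tenure": 12, "shoe_size": 41}, {"tenure": 18, "shoe_size": 39},
    {"tenure": 24, "shoe_size": 43}, {"tenure": 30, "shoe_size": 38},
]
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]

used = used_predictors(rows, labels, ["tenure", "shoe_size"])
print(used)
```

In this sketch the tree separates the classes with a single split on `tenure`, so `shoe_size` never enters the model even though it was offered as a predictor.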

Having too many predictors in a model causes the following problems:

  • Additional noise in the data that affects the overall accuracy of the model
  • The model becomes much more complex than it should be
  • To make predictions on new data, we must collect values even for unimportant variables that the predictions do not really need, because the model still uses them to some extent
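The last point in the list above can be sketched with a toy linear scorer. The coefficient values and predictor names are made up for illustration and do not come from any fitted model: once a predictor is part of the model, new records cannot be scored without it, however small its weight.

```python
# Hypothetical coefficients; shoe_size barely matters but is still part of the model.
COEFFICIENTS = {"tenure_months": -0.08, "monthly_charges": 0.03, "shoe_size": 0.001}
INTERCEPT = 0.5

def score(record, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Score one record with a linear model; every model predictor must be present."""
    missing = sorted(name for name in coefficients if name not in record)
    if missing:
        raise ValueError(f"cannot score: missing predictors {missing}")
    return intercept + sum(coefficients[name] * record[name] for name in coefficients)

# A complete record scores without trouble.
full_record = {"tenure_months": 10, "monthly_charges": 50, "shoe_size": 42}
full_score = score(full_record)

# A record without the unimportant predictor cannot be scored at all.
partial_record = {"tenure_months": 10, "monthly_charges": 50}
try:
    score(partial_record)
    scoring_failed = False
except ValueError as error:
    scoring_failed = True
    print(error)
```

The model keeps demanding `shoe_size` for every new record, even though its weight here is negligible.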

If these kinds of predictors are...