Feature Engineering Made Easy

By: Sinan Ozdemir, Divya Susarla

Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start by understanding your data; the success of your ML models often depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features and to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning to automatically learn useful features for your data. By the end of the book, you will be proficient in feature selection, feature learning, and feature optimization.

Restricted Boltzmann Machines


RBMs are a family of unsupervised feature learning algorithms that use probabilistic models to learn new features. As with PCA and LDA, we can use RBMs to extract a new feature set from raw data and use it to enhance machine learning pipelines. The features extracted by RBMs tend to work best when followed by linear models such as linear regression, logistic regression, and perceptrons.
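As a minimal sketch of this idea, the snippet below chains an RBM with a logistic regression using scikit-learn's `BernoulliRBM` and `Pipeline`. The toy binary dataset, the number of hidden components, and the other hyperparameters are assumptions chosen purely for illustration, not values taken from the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

# Toy data (an assumption for illustration): 200 samples, 16 binary features.
rng = np.random.RandomState(0)
X = (rng.rand(200, 16) > 0.5).astype(float)
# Hypothetical label: 1 when the first half of the features has more ones.
y = (X[:, :8].sum(axis=1) > X[:, 8:].sum(axis=1)).astype(int)

# The RBM learns new features; the linear model is trained on top of them.
pipeline = Pipeline([
    ("rbm", BernoulliRBM(n_components=8, learning_rate=0.05,
                         n_iter=10, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X, y)

# The extracted features are hidden-unit activation probabilities.
rbm_features = pipeline.named_steps["rbm"].transform(X)
predictions = pipeline.predict(X)
print(rbm_features.shape, predictions.shape)
```

Note that only the final classifier uses `y`; the RBM step inside the pipeline ignores the labels when it is fitted.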

The unsupervised nature of RBMs is important: in this respect they resemble PCA more than LDA. They do not require ground-truth labels to extract new features, which makes them useful in a wider variety of machine learning problems.
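The difference shows up directly in how the transformers are fitted. The following sketch, which assumes scikit-learn's `PCA`, `BernoulliRBM`, and `LinearDiscriminantAnalysis` classes and a randomly generated dataset, fits the first two on `X` alone, while LDA also needs the labels `y`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import BernoulliRBM

# Random data (an assumption for illustration): values in [0, 1],
# which is the input range BernoulliRBM expects.
rng = np.random.RandomState(0)
X = rng.rand(100, 10)
y = rng.randint(0, 3, size=100)  # three hypothetical classes

# Unsupervised: fitted on X only, no labels involved.
X_pca = PCA(n_components=2).fit_transform(X)
X_rbm = BernoulliRBM(n_components=4, n_iter=5,
                     random_state=0).fit_transform(X)

# Supervised: LDA needs y to find class-separating directions.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_rbm.shape, X_lda.shape)
```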

Conceptually, RBMs are shallow (two-layer) neural networks. They serve as the building blocks of a class of algorithms called Deep Belief Networks (DBNs). In keeping with standard terminology, there is a visible layer (the first layer), followed by a hidden layer (the second layer). These are the only...
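To make the two-layer structure concrete, here is a small pure-Python sketch of how a visible vector can be mapped to hidden-unit activations. It assumes the standard binary RBM formulation, in which each hidden unit's activation probability is a sigmoid of a weighted sum of the visible units plus a bias; the layer sizes, the random weights, and the helper name `hidden_probabilities` are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, weights, hidden_bias):
    """Return P(h_j = 1 | v) for every hidden unit j.

    weights[j][i] connects visible unit i to hidden unit j.
    """
    return [
        sigmoid(bias + sum(w * v for w, v in zip(row, visible)))
        for row, bias in zip(weights, hidden_bias)
    ]


random.seed(0)
n_visible, n_hidden = 6, 3  # hypothetical layer sizes

# One weight per (hidden, visible) pair; untrained, small random values.
weights = [[random.gauss(0, 0.1) for _ in range(n_visible)]
           for _ in range(n_hidden)]
hidden_bias = [0.0] * n_hidden

visible = [1, 0, 1, 1, 0, 0]  # one binary input in the visible layer
hidden = hidden_probabilities(visible, weights, hidden_bias)
print(hidden)  # one activation probability per hidden unit
```

In a trained RBM the weights and biases would be learned from data; here they are left random, so the sketch shows only the shape of the computation between the two layers.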