Feature Engineering Made Easy

By: Sinan Ozdemir, Divya Susarla

Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start with understanding your data; often the success of your ML models depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features. You will learn to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will become proficient in Feature Selection, Feature Learning, and Feature Optimization.
Table of Contents (14 chapters)
Title Page
Copyright and Credits
Packt Upsell
Contributors
Preface

Text-specific feature construction


Until this point, we have been working with categorical and numerical data. Although our categorical data has come in the form of strings, each string has represented a single category. We will now dive deeper into longer-form text data. This form of text data is much more complex than single-category text, because each value is now a series of categories, or tokens.
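
To make the difference between single-category text and longer-form text concrete, here is a minimal sketch in Python. The example strings and the `tokenize` helper are hypothetical, and splitting on runs of letters, digits, and apostrophes is only one simple assumption about what counts as a token; it is not the method the book goes on to use.

```python
import re

# Single-category text: the whole string stands for one category.
city = "san francisco"

# Longer-form text: a made-up review, used here for illustration only.
review = "The tacos were great, and the service was great too!"


def tokenize(text):
    """Lowercase the text and return its word tokens.

    Assumption: a token is any run of letters, digits, or apostrophes.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


tokens = tokenize(review)

print(city)              # one value, treated as one category
print(tokens)            # one value, broken into a series of tokens
print(len(set(tokens)))  # number of distinct tokens in the review
```

Where the city value contributes one category to a dataset, the review contributes a whole sequence of tokens, some of them repeated. That is why longer-form text needs its own feature construction techniques.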

Before we get any further into working with text data, let's make sure we have a good understanding of what we mean when we refer to text data. Consider a service like Yelp, where users write up reviews of restaurants and businesses to share their thoughts on their experience. These reviews, all written in text format, contain a wealth of information that would be useful for machine learning purposes, for example, in predicting the best restaurant to visit. 

In general, a large part of how we communicate in today's world is through written text, whether in messaging services, social media, or email. As...