Book Image

Data Science for Marketing Analytics - Second Edition

By : Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali
Book Image

Data Science for Marketing Analytics - Second Edition

By: Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali

Overview of this book

Unleash the power of data to reach your marketing goals with this practical guide to data science for business. This book will help you get started on your journey to becoming a master of marketing analytics with Python. You'll work with relevant datasets and build your practical skills by tackling engaging exercises and activities that simulate real-world market analysis projects. You'll learn to think like a data scientist, build your problem-solving skills, and discover how to look at data in new ways to deliver business insights and make intelligent data-driven decisions. As well as learning how to clean, explore, and visualize data, you'll implement machine learning algorithms and build models to make predictions. As you work through the book, you'll use Python tools to analyze sales, visualize advertising data, predict revenue, address customer churn, and implement customer segmentation to understand behavior. By the end of this book, you'll have the knowledge, skills, and confidence to implement data science and machine learning techniques to better understand your marketing data and improve your decision-making.
Table of Contents (11 chapters)
Preface

Using Recursive Feature Selection for Feature Elimination

So far, we have discussed two important evaluation metrics – the MAE and RMSE. We also saw how these metrics can be used with the help of the scikit-learn library and how a change in the values of these metrics can be used as an indicator of a feature's importance. However, if you have a large number of features, removing one feature at a time would become a very tedious job, and this is where RFE comes into the picture. When a dataset contains features (all columns, except the column that we want to predict) that either are not related to the target column or are related to other columns, the performance of the model can be adversely affected if all the features are used for model training. Let's understand the basic reasoning behind this.

For example, consider that you want to predict the number of sales of a product given the cost price of the product, the discount available, the selling price of the...