Data Cleansing Master Class in Python [Video]

By: Mike West

Overview of this book

Data preparation may be the most important part of a machine learning project. It is the most time-consuming part, yet it is the least discussed topic. Data preparation, sometimes referred to as data preprocessing, is the act of transforming raw data into a form that is appropriate for modeling. Machine learning algorithms require input data to be numeric, and most algorithm implementations maintain this expectation. Therefore, if your data contains data types and values that are not numbers, such as labels, you will need to convert them into numbers. Further, specific machine learning algorithms have expectations regarding the data types, scale, probability distribution, and relationships between input variables, and you may need to change the data to meet these expectations. In this course, you will learn data imputation and advanced data cleansing techniques, and how to apply real-world data cleansing techniques to your own data. You will also learn how to prepare data in a way that avoids data leakage and, in turn, incorrect model evaluation. By the end of this course, you will be able to perform data preprocessing and will have mastered data cleaning skills. The complete code bundle for this course is available at https://github.com/PacktPublishing/Data-Cleansing-Master-Class-in-Python
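As a rough illustration of two ideas from the overview, converting labels into numbers and avoiding data leakage, here is a minimal sketch in plain Python. It is not taken from the course materials: the toy rows, the `encode` helper, and the choice of -1 for unseen labels are all assumptions made for this example.

```python
# Hypothetical toy data: each row is (colour label, size in cm).
rows = [("red", 10.0), ("blue", 12.0), ("red", 11.0), ("green", 30.0)]

# Split first, so that nothing about the test rows influences preparation.
train, test = rows[:3], rows[3:]

# Learn the label-to-integer mapping from the training rows only.
categories = sorted({label for label, _ in train})
label_to_code = {label: code for code, label in enumerate(categories)}


def encode(label):
    """Return the integer code for a label, or -1 if it was not seen in training.

    The -1 fallback is an illustrative convention, not a standard.
    """
    return label_to_code.get(label, -1)


train_encoded = [(encode(label), size) for label, size in train]
test_encoded = [(encode(label), size) for label, size in test]

print(train_encoded)
print(test_encoded)
```

Because the mapping is built from the training split alone, the label "green", which appears only in the test row, is not silently absorbed into the encoding. Building the mapping from all rows before splitting would be a small instance of the leakage the overview warns about.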
Table of Contents (7 chapters)
Chapter 5
Data Transforms
Section 1
Scale Numerical Data
In this video, we will focus on scaling numerical data.
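The listing does not say which scaling techniques the video covers. As a sketch under that caveat, two common options are min-max normalization (rescaling to the range 0 to 1) and standardization (zero mean, unit standard deviation). The helper names and toy values below are assumptions made for illustration.

```python
from statistics import mean, pstdev


def min_max_scale(values, lo, hi):
    """Rescale values so that lo maps to 0.0 and hi maps to 1.0."""
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values, mu, sigma):
    """Shift and scale values using a given mean and standard deviation."""
    return [(v - mu) / sigma for v in values]


# Hypothetical toy feature, already split into training and test values.
train = [10.0, 20.0, 30.0]
test = [40.0]

# Estimate the scaling parameters from the training values only.
lo, hi = min(train), max(train)
mu, sigma = mean(train), pstdev(train)

train_minmax = min_max_scale(train, lo, hi)
test_minmax = min_max_scale(test, lo, hi)  # may fall outside the 0 to 1 range
train_std = standardize(train, mu, sigma)
test_std = standardize(test, mu, sigma)

print(train_minmax, test_minmax)
print(train_std, test_std)
```

The parameters are estimated on the training values and then reused for the test values, which keeps the scaling step consistent with the leakage-avoidance goal stated in the course overview. Both helpers assume the training values are not all identical, since that would make the divisor zero.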