Data Cleansing Master Class in Python [Video]
Video
$49.99
Subscription
$15.99
$10 p/m for three months
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
Introduction (Free Chapter)
Foundations
- Introducing Data Preparation
- The Machine Learning Process
- Data Preparation Defined
- Choosing a Data Preparation Technique
- What is Data in Machine Learning?
- Raw Data
- Machine Learning is Mostly Data Preparation
- Common Data Preparation Tasks - Data Cleansing
- Common Data Preparation Tasks - Feature Selection
- Common Data Preparation Tasks - Data Transforms
- Common Data Preparation Tasks - Feature Engineering
- Common Data Preparation Tasks - Dimensionality Reduction
- Data Leakage
- Problem with Naïve Data Preparation
- Case Study: Data Leakage: Train / Test / Split Naïve Approach
- Case Study: Data Leakage: Train / Test / Split Correct Approach
- Case Study: Data Leakage: K-Fold Naïve Approach
- Case Study: Data Leakage: K-Fold Correct Approach
Data Cleansing
- Data Cleansing Overview
- Identify Columns That Contain a Single Value
- Identify Columns with Few Values
- Remove Columns with Low Variance
- Identify and Remove Rows That Contain Duplicate Data
- Defining Outliers
- Remove Outliers - The Standard Deviation Approach
- Remove Outliers - The IQR Approach
- Automatic Outlier Detection
- Mark Missing Values
- Remove Rows with Missing Values
- Statistical Imputation
- Mean Value Imputation
- Simple Imputer with Model Evaluation
- Compare Different Statistical Imputation Strategies
- K-Nearest Neighbors Imputation
- KNNImputer and Model Evaluation
- Iterative Imputation
- IterativeImputer and Model Evaluation
- IterativeImputer and Different Imputation Order
Feature Selection
- Feature Selection Introduction
- Feature Selection Defined
- Statistics for Feature Selection
- Loading a Categorical Dataset
- Encode the Dataset for Modelling
- Chi-Squared
- Mutual Information
- Modeling with Selected Categorical Features
- Feature Selection with ANOVA on Numerical Input
- Feature Selection with Mutual Information
- Modeling with Selected Numerical Features
- Tuning a Number of Selected Features
- Select Features for Numerical Output
- Linear Correlation with Correlation Statistics
- Linear Correlation with Mutual Information
- Baseline and Model Built Using Correlation
- Model Built Using Mutual Information Features
- Tuning Number of Selected Features
- Recursive Feature Elimination
- RFE for Classification
- RFE for Regression
- RFE Hyperparameters
- Feature Ranking for RFE
- Feature Importance Scores Defined
- Feature Importance Scores: Linear Regression
- Feature Importance Scores: Logistic Regression and CART
- Feature Importance Scores: Random Forests
- Permutation Feature Importance
- Feature Selection with Importance
Data Transforms
- Scale Numerical Data
- Diabetes Dataset for Scaling
- MinMaxScaler Transform
- StandardScaler Transform
- Robust Scaling Data
- Robust Scaler Applied to Dataset
- Explore Robust Scaler Range
- Nominal and Ordinal Variables
- Ordinal Encoding
- One-Hot Encoding Defined
- One-Hot Encoding
- Dummy Variable Encoding
- Ordinal Encoder Transform on Breast Cancer Dataset
- Make Distributions More Gaussian
- Power Transform on Contrived Dataset
- Power Transform on Sonar Dataset
- Box-Cox on Sonar Dataset
- Yeo-Johnson on Sonar Dataset
- Polynomial Features
- Effect of Polynomial Degrees
Advanced Transforms
Dimensionality Reduction
About this video
Data preparation may be the most important part of a machine learning project. It is often the most time-consuming part, yet it is the least discussed. Data preparation, sometimes referred to as data preprocessing, is the act of transforming raw data into a form that is appropriate for modeling.
Machine learning algorithms require input data to be numeric, and most algorithm implementations maintain this expectation. Therefore, if your data contains data types and values that are not numbers, such as labels, you will need to convert them to numbers. Further, specific machine learning algorithms have expectations regarding the data types, scale, probability distribution, and relationships between input variables, and you may need to change the data to meet these expectations.
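As an illustration of converting labels to numbers, here is a minimal sketch of ordinal and one-hot encoding written with plain Python only. The `colors` column and the helper names `ordinal_encode` and `one_hot_encode` are hypothetical and are not taken from the course material, which may well use library encoders instead.

```python
# Hypothetical toy column of categorical labels (not from the course data).
colors = ["red", "green", "blue", "green", "red"]


def ordinal_encode(values):
    """Map each distinct label to an integer, using sorted label order."""
    categories = sorted(set(values))
    index = {label: i for i, label in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Map each label to a 0/1 vector with one position per category."""
    codes, categories = ordinal_encode(values)
    width = len(categories)
    rows = [[1 if code == j else 0 for j in range(width)] for code in codes]
    return rows, categories


ordinal, categories = ordinal_encode(colors)
one_hot, _ = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
print(ordinal)     # [2, 1, 0, 1, 2]
print(one_hot[0])  # [0, 0, 1]
```

Ordinal codes imply an order between categories, so they suit ordinal variables; one-hot vectors avoid implying an order and are the usual choice for nominal variables.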
In this course, you will learn data imputation and advanced data cleansing techniques, and how to apply them to real-world data. You will also learn how to prepare data in a way that avoids data leakage and, in turn, incorrect model evaluation.
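To make the data leakage point concrete, the sketch below splits the data first and then learns the imputation and scaling statistics from the training rows only, reusing them unchanged on the test rows. It is a stdlib-only illustration under stated assumptions: the `values` column, the 70/30 split, and the helpers `fit` and `transform` are hypothetical, and this is not the course's own code.

```python
import random
import statistics

# Hypothetical single-feature dataset; None marks a missing value.
values = [2.0, None, 4.0, 6.0, None, 8.0, 10.0, 12.0, None, 14.0]

# 1. Split BEFORE computing any statistics, so the test rows stay unseen.
rng = random.Random(0)
indices = list(range(len(values)))
rng.shuffle(indices)
cut = int(0.7 * len(values))
train = [values[i] for i in indices[:cut]]
test = [values[i] for i in indices[cut:]]


def fit(train_column):
    """Learn the imputation value and scaling statistics from training rows only."""
    observed = [v for v in train_column if v is not None]
    fill = statistics.fmean(observed)
    filled = [fill if v is None else v for v in train_column]
    return {
        "fill": fill,
        "mean": statistics.fmean(filled),
        "std": statistics.pstdev(filled),
    }


def transform(column, params):
    """Apply mean imputation, then standardization, using already-fitted statistics."""
    filled = [params["fill"] if v is None else v for v in column]
    return [(v - params["mean"]) / params["std"] for v in filled]


# 2. Fit on the training split only, then apply the same parameters to both splits.
params = fit(train)
train_scaled = transform(train, params)
test_scaled = transform(test, params)

print(params)
print(test_scaled)
```

The naïve alternative would call `fit(values)` on the full dataset before splitting, which lets information from the test rows influence the training data and tends to make the evaluation look better than it should.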
By the end of this course, you will be able to perform data preprocessing and will have mastered data cleaning skills.
The complete code bundle for this course is available at https://github.com/PacktPublishing/Data-Cleansing-Master-Class-in-Python
- Publication date: December 2021
- Publisher: Packt
- Duration: 3 hours 33 minutes
- ISBN: 9781803239040