Chapter 1: Exploring Data for Machine Learning | Data Labeling in Machine Learning with Python

Data Labeling in Machine Learning with Python

By: Vijaya Kumar Suda

Overview of this book

Data labeling is the invisible hand that guides the power of artificial intelligence and machine learning. In today’s data-driven world, mastering data labeling is not just an advantage; it’s a necessity. Data Labeling in Machine Learning with Python empowers you to unearth value from raw data, create intelligent systems, and influence the course of technological evolution. With this book, you’ll learn to use summary statistics, weak supervision, programmatic rules, and heuristics to assign labels to unlabeled training data. As you progress, you’ll enhance your datasets with semi-supervised learning and data augmentation. You’ll then annotate image, video, and audio data using Python libraries such as seaborn, matplotlib, cv2, librosa, openai, and langchain. With hands-on guidance and practical examples, you’ll gain proficiency in annotating diverse data types effectively. By the end of this book, you’ll have the practical expertise to programmatically label diverse data types and enhance datasets, unlocking the full potential of your data.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Share your thoughts

Download a free PDF copy of this book

Part 1: Labeling Tabular Data

Chapter 1: Exploring Data for Machine Learning

Technical requirements

EDA and data labeling

Understanding the ML project life cycle

Introducing Pandas DataFrames

Summary statistics and data aggregates

Creating visualizations using Seaborn for univariate and bivariate analysis

Profiling data using the ydata-profiling library

Unlocking insights from data with OpenAI and LangChain

Summary

Chapter 2: Labeling Data for Classification

Technical requirements

Predicting labels with LLMs for tabular data

Data labeling using Snorkel

Labeling data using the Compose library

Labeling data using semi-supervised learning

Labeling data using K-means clustering

Summary

Chapter 3: Labeling Data for Regression

Technical requirements

Using summary statistics to generate housing price labels

Using data augmentation to label regression data

Using k-means clustering to label regression data

Summary

Part 2: Labeling Image Data

Chapter 4: Exploring Image Data

Technical requirements

Visualizing image data using Matplotlib in Python

Analyzing image size and aspect ratio

Performing transformations on images – image augmentation

Summary

Chapter 5: Labeling Image Data Using Rules

Technical requirements

Labeling rules based on image visualization

Labeling images using rules based on properties

Labeling images using transfer learning

Labeling images using transformations

Summary

Chapter 6: Labeling Image Data Using Data Augmentation

Technical requirements

Training support vector machines with augmented image data

Implementing an SVM with data augmentation in Python

Image classification using the SVM with data augmentation on the MNIST dataset

Convolutional neural networks using augmented image data

Summary

Part 3: Labeling Text, Audio, and Video Data

Chapter 7: Labeling Text Data

Technical requirements

Real-world applications of text data labeling

Tools and frameworks for text data labeling

Exploratory data analysis of text

Exploring Generative AI and OpenAI for labeling text data

Hands-on labeling of text data using the Snorkel API

Hands-on text labeling using Logistic Regression

Hands-on label prediction using K-means clustering

Generating labels for customer reviews (sentiment analysis)

Summary

Chapter 8: Exploring Video Data

Technical requirements

Loading video data using cv2

Extracting frames from video data for analysis

Extracting features from video frames

Visualizing video data using Matplotlib

Labeling video data using k-means clustering

Advanced concepts in video data analysis

Summary

Chapter 9: Labeling Video Data

Technical requirements

Capturing real-time video

Building a CNN model for labeling video data

Using autoencoders for video data labeling

Using the Watershed algorithm for video data labeling

Real-world examples for video data labeling

Advances in video data labeling and classification

Summary

Chapter 10: Exploring Audio Data

Technical requirements

Real-life applications for labeling audio data

Audio data fundamentals

Hands-on with analyzing audio data

Extracting properties from audio data

Visualizing audio data with matplotlib and Librosa

Ethical implications of audio data

Recent advances in audio data analysis

Troubleshooting common issues during data analysis

Troubleshooting common installation issues for audio libraries

Summary

Chapter 11: Labeling Audio Data

Technical requirements

Real-time voice classification with Random Forest

Transcribing audio using the OpenAI Whisper model

Hands-on – labeling audio data using a CNN

Exploring audio data augmentation

Introducing Azure Cognitive Services – the speech service

Summary

Chapter 12: Hands-On Exploring Data Labeling Tools

Technical requirements

Data labeling using Azure Machine Learning

Exploring Label Studio

pyOpenAnnotate

Computer Vision Annotation Tool

Comparison of data labeling tools

Advanced methods in data labeling

Summary

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share your thoughts

Download a free PDF copy of this book
