Machine Learning with LightGBM and Python

By: Andrich van Wyk

Overview of this book

Machine Learning with LightGBM and Python is a comprehensive guide to learning the basics of machine learning and progressing to building scalable machine learning systems that are ready for release. This book will get you acquainted with the high-performance gradient-boosting LightGBM framework and show you how it can be used to solve various machine learning problems to produce highly accurate, robust, and predictive solutions. Starting with simple machine learning models in scikit-learn, you’ll explore the intricacies of gradient boosting machines and LightGBM. You’ll be guided through various case studies to better understand the data science processes and learn how to practically apply your skills to real-world problems. As you progress, you’ll elevate your software engineering skills by learning how to build and integrate scalable machine learning pipelines to process data, train models, and deploy them to serve secure APIs using Python tools such as FastAPI. By the end of this book, you’ll be well equipped to use various state-of-the-art tools that will help you build production-ready systems, including FLAML for AutoML, PostgresML for operating ML pipelines using Postgres, high-performance distributed training and serving via Dask, and creating and running models in the cloud with AWS SageMaker.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Share Your Thoughts

Download a free PDF copy of this book

Part 1: Gradient Boosting and LightGBM Fundamentals

Chapter 1: Introducing Machine Learning

Technical requirements

What is machine learning?

Introducing models, datasets, and supervised learning

Decision tree learning

Chapter 2: Ensemble Learning – Bagging and Boosting

Technical requirements

Ensemble learning

Bagging and random forests

Gradient-boosted decision trees

Advanced boosting algorithm – DART

Chapter 3: An Overview of LightGBM in Python

Technical requirements

Introducing LightGBM

Getting started with LightGBM in Python

Building LightGBM models

Chapter 4: Comparing LightGBM, XGBoost, and Deep Learning

Technical requirements

An overview of XGBoost

Deep learning and TabTransformers

Comparing LightGBM, XGBoost, and TabTransformers

Part 2: Practical Machine Learning with LightGBM

Chapter 5: LightGBM Parameter Optimization with Optuna

Technical requirements

Optuna and optimization algorithms

Optimizing LightGBM with Optuna

Chapter 6: Solving Real-World Data Science Problems with LightGBM

Technical requirements

The data science life cycle

Predicting wind turbine power generation with LightGBM

Classifying individual credit scores with LightGBM

Chapter 7: AutoML with LightGBM and FLAML

Technical requirements

Automated machine learning

Introducing FLAML

Case study – using FLAML with LightGBM

Part 3: Production-ready Machine Learning with LightGBM

Chapter 8: Machine Learning Pipelines and MLOps with LightGBM

Technical requirements

Introducing machine learning pipelines

Understanding MLOps

Deploying an ML pipeline for customer churn

Chapter 9: LightGBM MLOps with AWS SageMaker

Technical requirements

An introduction to AWS and SageMaker

Building a LightGBM ML pipeline with Amazon SageMaker

Chapter 10: LightGBM Models with PostgresML

Technical requirements

Introducing PostgresML

Getting started with PostgresML

Case study – customer churn with PostgresML

Chapter 11: Distributed and GPU-Based Learning with LightGBM

Technical requirements

Distributed learning with LightGBM and Dask

GPU training for LightGBM

Index

Other Books You May Enjoy

Packt is searching for authors like you

Comparing LightGBM, XGBoost, and TabTransformers

In this section, we compare the performance of LightGBM, XGBoost, and TabTransformers on two different datasets. We also look at more data preparation techniques for unbalanced classes, missing values, and categorical data.

Predicting census income

The first dataset we use is the Census Income dataset, where the task is to predict whether a person’s income exceeds $50,000 based on attributes such as education, marital status, and occupation [4]. The dataset has 48,842 instances and, as we’ll see, contains some missing values and unbalanced classes.
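Missing values and class imbalance can both be checked with a few lines of pandas before any modeling. The following is a minimal sketch rather than the book's own code: the helper name summarize_quality and the toy rows are invented for illustration, and only the income_bracket column name is taken from the text.

```python
import numpy as np
import pandas as pd


def summarize_quality(df, target):
    """Return (missing-value count per column, class proportions of `target`)."""
    missing = df.isna().sum()
    balance = df[target].value_counts(normalize=True)
    return missing, balance


# Toy rows invented for illustration only; they are not real dataset records.
toy = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "occupation": ["Sales", np.nan, "Tech-support", "Sales"],
    "income_bracket": ["<=50K", "<=50K", "<=50K", ">50K"],
})

missing, balance = summarize_quality(toy, "income_bracket")
print(missing)   # number of missing entries in each column
print(balance)   # fraction of rows belonging to each class
```

A class proportion far from an even split suggests that accuracy alone may be a misleading metric, which is why the imbalance is worth quantifying early.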

The dataset is available from the following URL: https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data. The data has already been split into a training set and a test set. Once loaded, we can sample the data:

train_data.sample(5)[["age", "education", "marital_status", "hours_per_week", "income_bracket"]]
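The sampling call above presumes that train_data is already a DataFrame with named columns. One way the loading step could look is sketched below, under some assumptions: the raw file is taken to have no header row, to separate fields with a comma followed by a space, and to mark missing values with "?". The column names are assumed, chosen to match the underscore style of the sample call, and load_census is a hypothetical helper. The two rows in the buffer are made up to mimic that layout so the sketch runs without network access.

```python
import io

import pandas as pd

# Assumed column names, written in the underscore style used by the sample call.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income_bracket",
]

TRAIN_URL = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
)


def load_census(source):
    """Read a headerless census file (path, URL, or buffer) into a DataFrame."""
    return pd.read_csv(
        source,
        names=COLUMNS,
        header=None,
        skipinitialspace=True,  # fields are assumed to be separated by ", "
        na_values="?",          # "?" is assumed to mark a missing value
    )


# Made-up rows in the assumed layout, used in place of the real download.
sample_rows = io.StringIO(
    "41, Private, 100000, Bachelors, 13, Never-married, Sales, "
    "Not-in-family, White, Female, 0, 0, 40, United-States, <=50K\n"
    "54, ?, 120000, Some-college, 10, Married-civ-spouse, ?, "
    "Husband, White, Male, 0, 0, 60, United-States, >50K\n"
)

train_data = load_census(sample_rows)
print(train_data[["age", "education", "marital_status",
                  "hours_per_week", "income_bracket"]])
```

Passing TRAIN_URL to load_census instead of the in-memory buffer would read the full training file, assuming the URL is still reachable.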

The data...