Production-Ready Applied Deep Learning

By: Tomasz Palczewski, Jaejun (Brandon) Lee, Lenin Mookiah

Overview of this book

Machine learning engineers, deep learning specialists, and data engineers encounter various problems when moving deep learning models to a production environment. The main objective of this book is to close the gap between theory and applications by providing a thorough explanation of how to transform various models for deployment and efficiently distribute them with a full understanding of the alternatives. First, you will learn how to construct complex deep learning models in PyTorch and TensorFlow. Next, you will acquire the knowledge you need to transform your models from one framework to the other and learn how to tailor them for specific requirements that deployment environments introduce. The book also provides concrete implementations and associated methodologies that will help you apply the knowledge you gain right away. You will get hands-on experience with commonly used deep learning frameworks and popular cloud services designed for data analytics at scale. Additionally, you will get to grips with the authors’ collective knowledge of deploying hundreds of AI-based services at a large scale. By the end of this book, you will have understood how to convert a model developed for proof of concept into a production-ready application optimized for a particular production setting.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Share Your Thoughts

Part 1 – Building a Minimum Viable Product

Chapter 1: Effective Planning of Deep Learning-Driven Projects

Technical requirements

What is DL?

Understanding the role of DL in our daily lives

Overview of DL projects

Planning a DL project

Summary

Further reading

Chapter 2: Data Preparation for Deep Learning Projects

Technical requirements

Setting up notebook environments

Data collection, data cleaning, and data preprocessing

Extracting features from data

Performing data visualization

Introduction to Docker

Summary

Chapter 3: Developing a Powerful Deep Learning Model

Technical requirements

Going through the basic theory of DL

Components of DL frameworks

Implementing and training a model in PyTorch

Implementing and training a model in TF

Decomposing a complex, state-of-the-art model implementation

Summary

Chapter 4: Experiment Tracking, Model Management, and Dataset Versioning

Technical requirements

Overview of DL project tracking

DL project tracking with Weights & Biases

DL project tracking with MLflow and DVC

Dataset versioning – beyond Weights & Biases, MLflow, and DVC

Summary

Part 2 – Building a Fully Featured Product

Chapter 5: Data Preparation in the Cloud

Technical requirements

Data processing in the cloud

Introduction to Apache Spark

Setting up a single-node EC2 instance for ETL

Setting up an EMR cluster for ETL

Creating a Glue job for ETL

Utilizing SageMaker for ETL

Comparing the ETL solutions in AWS

Summary

Chapter 6: Efficient Model Training

Technical requirements

Training a model on a single machine

Training a model on a cluster

Training a model using SageMaker

Training a model using Horovod

Training a model using Ray

Training a model using Kubeflow

Summary

Chapter 7: Revealing the Secret of Deep Learning Models

Technical requirements

Obtaining the best performing model using hyperparameter tuning

Understanding the behavior of the model with Explainable AI

Summary

Part 3 – Deployment and Maintenance

Chapter 8: Simplifying Deep Learning Model Deployment

Technical requirements

Introduction to ONNX

Conversion between TensorFlow and ONNX

Conversion between PyTorch and ONNX

Summary

Chapter 9: Scaling a Deep Learning Pipeline

Technical requirements

Inferencing using Elastic Kubernetes Service

Inferencing using SageMaker

Summary

Chapter 10: Improving Inference Efficiency

Technical requirements

Network quantization – reducing the number of bits used for model parameters

Weight sharing – reducing the number of distinct weight values

Network pruning – eliminating unnecessary connections within the network

Knowledge distillation – obtaining a smaller network by mimicking the prediction

Network Architecture Search – finding the most efficient network architecture

Summary

Chapter 11: Deep Learning on Mobile Devices

Preparing DL models for mobile devices

Creating iOS apps with a DL model

Creating Android apps with a DL model

Summary

Chapter 12: Monitoring Deep Learning Endpoints in Production

Technical requirements

Introduction to DL endpoint monitoring in production

Monitoring using CloudWatch

Monitoring a SageMaker endpoint using CloudWatch

Monitoring an EKS endpoint using CloudWatch

Summary

Chapter 13: Reviewing the Completed Deep Learning Project

Reviewing a DL project

Gathering the reusable knowledge, concepts, and artifacts for future projects

Summary

Index

Network pruning – eliminating unnecessary connections within the network

Network pruning is an optimization process that eliminates unnecessary connections from a network. It can be applied after training, but it can also be applied during training, where the remaining weights can adapt and the loss in model accuracy can be reduced further. With fewer connections, fewer weights are necessary, so both the model size and the inference latency can be reduced. In the following sections, we will present how to apply network pruning in TF and PyTorch.
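To make the idea concrete, the following is a minimal sketch of one common pruning criterion, magnitude-based pruning, written in plain Python rather than with either framework. It assumes that the weights of a layer are given as a flat list and that the connections with the smallest absolute weights are the unnecessary ones. The function name prune_by_magnitude and the sparsity parameter are hypothetical names chosen for this illustration, not part of any library.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0.0.

    Hypothetical helper for illustration only: `sparsity` is the fraction
    of weights to remove, for example 0.5 to zero out half of them.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    num_to_prune = int(len(weights) * sparsity)
    # Sort indices by ascending absolute value; the first ones get pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])
    # A pruned connection is represented by a weight of exactly zero.
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

Zeroed weights only reduce the model size and latency when the model is stored or executed in a way that takes advantage of the zeros, for example through compression or sparse kernels. The sketch also omits the retraining step that pruning during training relies on to recover accuracy.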

Network pruning in TensorFlow

Like model quantization and weight sharing, network pruning for TF is available through the TensorFlow Model Optimization Toolkit. Therefore, the first step is to import the toolkit with the following line of code:

import tensorflow_model_optimization as tfmot

To apply network pruning during training, you must modify your model using the tfmot.sparsity.keras.prune_low_magnitude function:

...
