The Deep Learning Architect's Handbook

By: Ee Kin Chin

Overview of this book

Deep learning enables previously unattainable feats in automation, but extracting real-world business value from it is a daunting task. This book will teach you how to build complex deep learning models and gain intuition for structuring your data to accomplish your deep learning objectives. This deep learning book explores every aspect of the deep learning life cycle, from planning and data preparation to model deployment and governance, using real-world scenarios that will take you through creating, deploying, and managing advanced solutions. You’ll also learn how to work with image, audio, text, and video data using deep learning architectures, as well as optimize and evaluate your deep learning models objectively to address issues such as bias, fairness, adversarial attacks, and model transparency. As you progress, you’ll harness the power of AI platforms to streamline the deep learning life cycle and leverage Python libraries and frameworks such as PyTorch, ONNX, Catalyst, MLFlow, Captum, Nvidia Triton, Prometheus, and Grafana to execute efficient deep learning architectures, optimize model performance, and streamline the deployment processes. You’ll also discover the transformative potential of large language models (LLMs) for a wide array of applications. By the end of this book, you'll have mastered deep learning techniques to unlock its full potential for your endeavors.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Share Your Thoughts

Download a free PDF copy of this book

Part 1 – Foundational Methods

Chapter 1: Deep Learning Life Cycle

Technical requirements

Understanding the machine learning life cycle

Strategizing the construction of a deep learning system

Developing deep learning models

Delivering model insights

Further reading

Chapter 2: Designing Deep Learning Architectures

Technical requirements

Exploring the foundations of neural networks using an MLP

Understanding neural network gradients

Understanding gradient descent

Implementing an MLP from scratch

Chapter 3: Understanding Convolutional Neural Networks

Technical requirements

Understanding the convolutional neural network layer

Understanding the pooling layer

Building a CNN architecture

Designing a CNN architecture for practical usage

Exploring the CNN architecture families

Chapter 4: Understanding Recurrent Neural Networks

Technical requirements

Understanding LSTM

Understanding GRU

Understanding advancements over the standard GRU and LSTM layers

Chapter 5: Understanding Autoencoders

Technical requirements

Decoding the standard autoencoder

Exploring autoencoder variations

Building a CNN autoencoder

Chapter 6: Understanding Neural Network Transformers

Exploring neural network transformers

Decoding the original transformer architecture holistically

Uncovering transformer improvements using only the encoder

Uncovering transformer improvements using only the decoder

Chapter 7: Deep Neural Architecture Search

Technical requirements

Understanding the big picture of NAS

Understanding general hyperparameter search-based NAS

Understanding RL-based NAS

Understanding non-RL-based NAS

Chapter 8: Exploring Supervised Deep Learning

Technical requirements

Exploring supervised use cases and problem types

Implementing neural network layers for foundational problem types

Training supervised deep learning models effectively

Exploring general techniques to realize and improve supervised deep learning based solutions

Breaking down the multitask paradigm in supervised deep learning

Chapter 9: Exploring Unsupervised Deep Learning

Technical requirements

Exploring unsupervised deep learning applications

Creating pretrained network weights for downstream tasks

Creating general representations through unsupervised deep learning

Exploring zero-shot learning

Exploring the dimensionality reduction component of unsupervised deep learning

Detecting anomalies in external data

Part 2 – Multimodal Model Insights

Chapter 10: Exploring Model Evaluation Methods

Technical requirements

Exploring the different model evaluation methods

Engineering the base model evaluation metric

Exploring custom metrics and their applications

Exploring statistical tests for comparing model metrics

Relating the evaluation metric to success

Directly optimizing the metric

Chapter 11: Explaining Neural Network Predictions

Technical requirements

Exploring the value of prediction explanations

Demystifying prediction explanation techniques

Exploring gradient-based prediction explanations

Trusting and understanding integrated gradients

Using integrated gradients to aid in understanding predictions

Explaining prediction explanations automatically

Exploring common pitfalls in prediction explanations and how to avoid them

Further reading

Chapter 12: Interpreting Neural Networks

Technical requirements

Interpreting neurons

Finding neurons to interpret

Interpreting learned image patterns

Discovering the counterfactual explanation strategy

Chapter 13: Exploring Bias and Fairness

Technical requirements

Exploring the types of bias

Understanding the source of AI bias

Discovering bias and fairness evaluation methods

Evaluating the bias and fairness of a deep learning model

Tailoring bias and fairness measures across use cases

Mitigating AI bias

Chapter 14: Analyzing Adversarial Performance

Technical requirements

Using data augmentations for adversarial analysis

Analyzing adversarial performance for audio-based models

Analyzing adversarial performance for image-based models

Exploring adversarial analysis for text-based models

Part 3 – DLOps

Chapter 15: Deploying Deep Learning Models to Production

Technical requirements

Exploring the crucial components for DL model deployment

Identifying key DL model deployment requirements

Choosing the right DL model deployment options

Exploring deployment decisions based on practical use cases

Discovering general recommendations for DL deployment

Deploying a language model with ONNX, TensorRT, and NVIDIA Triton Server

Chapter 16: Governing Deep Learning Models

Technical requirements

Governing deep learning model utilization

Governing a deep learning model through monitoring

Governing a deep learning model through maintenance

Chapter 17: Managing Drift Effectively in a Dynamic Environment

Technical requirements

Exploring the issues of drift

Exploring the types of drift

Exploring strategies to handle drift

Detecting drift programmatically

Chapter 18: Exploring the DataRobot AI Platform

Technical requirements

A high-level look into what the DataRobot AI platform provides

Preparing data with DataRobot

Executing modeling experiments with DataRobot

Deploying a deep learning blueprint

Governing a deployed deep learning blueprint

Exploring some customer success stories

Chapter 19: Architecting LLM Solutions

Overview of LLM solutions

Handling knowledge for LLM solutions

Evaluating LLM solutions

Identifying challenges with LLM solutions

Tackling challenges with LLM solutions

Leveraging LLM to build autonomous agents

Exploring LLM solution use cases

Further reading

Index

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book

Uncovering transformer improvements using only the decoder

Recall that the decoder block of the transformer is built around an autoregressive structure. In the decoder-only line of transformer models, the task remains the same: predict tokens autoregressively. With the encoder removed, the architecture has to adapt its input to accept more than one sentence, similar to what BERT does. Start, end, and separator tokens are used to encode the input data as a single sequence. As in the original transformer, masking is still applied so that the prediction for a given position can draw only on the tokens up to that position and never on future tokens, and positional embeddings are still used.
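To make the input packing and masking concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation used by any particular model: the special-token strings and the helper names `pack_segments`, `causal_mask`, and `masked_softmax` are hypothetical, and the attention scores are set to a uniform value purely for demonstration.

```python
import math

# Hypothetical special-token strings; real models define their own vocabulary.
START, SEP, END = "<start>", "<sep>", "<end>"


def pack_segments(segments):
    """Join several tokenized sentences into one sequence for a decoder-only model."""
    tokens = [START]
    for i, segment in enumerate(segments):
        if i > 0:
            tokens.append(SEP)  # separator goes between sentences only
        tokens.extend(segment)
    tokens.append(END)
    return tokens


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives masked-out (future) positions zero weight."""
    weights = []
    for row, allowed in zip(scores, mask):
        # Subtract the largest visible score for numerical stability.
        peak = max(s for s, ok in zip(row, allowed) if ok)
        exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


tokens = pack_segments([["the", "cat"], ["sat"]])
n = len(tokens)
scores = [[0.0] * n for _ in range(n)]  # uniform scores, for illustration only
attention = masked_softmax(scores, causal_mask(n))

print(tokens)
for row in attention:
    print([round(w, 2) for w in row])
```

Each printed row of attention weights is nonzero only up to its own position, so the representation at a given position never draws on later tokens. In a real model the scores would come from query-key dot products rather than a constant.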

Diving into the GPT model family

All these architectural concepts were introduced in 2018 by the GPT model, whose name is short for generative pre-training. As the name suggests, GPT also adopts unsupervised pre-training as the initial stage and subsequently moves into the supervised fine...