Chapter 6: Understanding Neural Network Transformers | The Deep Learning Architect's Handbook

The Deep Learning Architect's Handbook

By: Ee Kin Chin

Overview of this book

Deep learning enables previously unattainable feats in automation, but extracting real-world business value from it is a daunting task. This book teaches you how to build complex deep learning models and gain intuition for structuring your data to accomplish your deep learning objectives. It explores every aspect of the deep learning life cycle, from planning and data preparation to model deployment and governance, using real-world scenarios that take you through creating, deploying, and managing advanced solutions. You'll also learn how to work with image, audio, text, and video data using deep learning architectures, and how to optimize and evaluate your models objectively to address issues such as bias, fairness, adversarial attacks, and model transparency. As you progress, you'll use AI platforms to streamline the deep learning life cycle and leverage libraries, frameworks, and tools such as PyTorch, ONNX, Catalyst, MLflow, Captum, NVIDIA Triton, Prometheus, and Grafana to execute efficient deep learning architectures, optimize model performance, and streamline deployment. You'll also discover the potential of large language models (LLMs) for a wide array of applications. By the end of this book, you'll have mastered deep learning techniques and be ready to unlock their full potential for your endeavors.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Share Your Thoughts

Download a free PDF copy of this book

Part 1 – Foundational Methods

Chapter 1: Deep Learning Life Cycle

Technical requirements

Understanding the machine learning life cycle

Strategizing the construction of a deep learning system

Preparing data

Developing deep learning models

Delivering model insights

Managing risks

Summary

Further reading

Chapter 2: Designing Deep Learning Architectures

Technical requirements

Exploring the foundations of neural networks using an MLP

Understanding neural network gradients

Understanding gradient descent

Implementing an MLP from scratch

Summary

Chapter 3: Understanding Convolutional Neural Networks

Technical requirements

Understanding the convolutional neural network layer

Understanding the pooling layer

Building a CNN architecture

Designing a CNN architecture for practical usage

Exploring the CNN architecture families

Summary

Chapter 4: Understanding Recurrent Neural Networks

Technical requirements

Understanding LSTM

Understanding GRU

Understanding advancements over the standard GRU and LSTM layers

Summary

Chapter 5: Understanding Autoencoders

Technical requirements

Decoding the standard autoencoder

Exploring autoencoder variations

Building a CNN autoencoder

Summary

Chapter 6: Understanding Neural Network Transformers

Exploring neural network transformers

Decoding the original transformer architecture holistically

Uncovering transformer improvements using only the encoder

Uncovering transformer improvements using only the decoder

Summary

Chapter 7: Deep Neural Architecture Search

Technical requirements

Understanding the big picture of NAS

Understanding general hyperparameter search-based NAS

Understanding RL-based NAS

Understanding non-RL-based NAS

Summary

Chapter 8: Exploring Supervised Deep Learning

Technical requirements

Exploring supervised use cases and problem types

Implementing neural network layers for foundational problem types

Training supervised deep learning models effectively

Exploring general techniques to realize and improve supervised deep learning based solutions

Breaking down the multitask paradigm in supervised deep learning

Summary

Chapter 9: Exploring Unsupervised Deep Learning

Technical requirements

Exploring unsupervised deep learning applications

Creating pretrained network weights for downstream tasks

Creating general representations through unsupervised deep learning

Exploring zero-shot learning

Exploring the dimensionality reduction component of unsupervised deep learning

Detecting anomalies in external data

Summary

Part 2 – Multimodal Model Insights

Chapter 10: Exploring Model Evaluation Methods

Technical requirements

Exploring the different model evaluation methods

Engineering the base model evaluation metric

Exploring custom metrics and their applications

Exploring statistical tests for comparing model metrics

Relating the evaluation metric to success

Directly optimizing the metric

Summary

Chapter 11: Explaining Neural Network Predictions

Technical requirements

Exploring the value of prediction explanations

Demystifying prediction explanation techniques

Exploring gradient-based prediction explanations

Trusting and understanding integrated gradients

Using integrated gradients to aid in understanding predictions

Explaining prediction explanations automatically

Exploring common pitfalls in prediction explanations and how to avoid them

Summary

Further reading

Chapter 12: Interpreting Neural Networks

Technical requirements

Interpreting neurons

Finding neurons to interpret

Interpreting learned image patterns

Discovering the counterfactual explanation strategy

Summary

Chapter 13: Exploring Bias and Fairness

Technical requirements

Exploring the types of bias

Understanding the source of AI bias

Discovering bias and fairness evaluation methods

Evaluating the bias and fairness of a deep learning model

Tailoring bias and fairness measures across use cases

Mitigating AI bias

Summary

Chapter 14: Analyzing Adversarial Performance

Technical requirements

Using data augmentations for adversarial analysis

Analyzing adversarial performance for audio-based models

Analyzing adversarial performance for image-based models

Exploring adversarial analysis for text-based models

Summary

Part 3 – DLOps

Chapter 15: Deploying Deep Learning Models to Production

Technical requirements

Exploring the crucial components for DL model deployment

Identifying key DL model deployment requirements

Choosing the right DL model deployment options

Exploring deployment decisions based on practical use cases

Discovering general recommendations for DL deployment

Deploying a language model with ONNX, TensorRT, and NVIDIA Triton Server

Summary

Chapter 16: Governing Deep Learning Models

Technical requirements

Governing deep learning model utilization

Governing a deep learning model through monitoring

Governing a deep learning model through maintenance

Summary

Chapter 17: Managing Drift Effectively in a Dynamic Environment

Technical requirements

Exploring the issues of drift

Exploring the types of drift

Exploring strategies to handle drift

Detecting drift programmatically

Summary

Chapter 18: Exploring the DataRobot AI Platform

Technical requirements

A high-level look into what the DataRobot AI platform provides

Preparing data with DataRobot

Executing modeling experiments with DataRobot

Deploying a deep learning blueprint

Governing a deployed deep learning blueprint

Exploring some customer success stories

Summary

Chapter 19: Architecting LLM Solutions

Overview of LLM solutions

Handling knowledge for LLM solutions

Evaluating LLM solutions

Identifying challenges with LLM solutions

Tackling challenges with LLM solutions

Leveraging LLM to build autonomous agents

Exploring LLM solution use cases

Summary
