The Deep Learning Architect's Handbook

By: Ee Kin Chin
Overview of this book

Deep learning enables previously unattainable feats in automation, but extracting real-world business value from it is a daunting task. This book will teach you how to build complex deep learning models and gain intuition for structuring your data to accomplish your deep learning objectives. This deep learning book explores every aspect of the deep learning life cycle, from planning and data preparation to model deployment and governance, using real-world scenarios that will take you through creating, deploying, and managing advanced solutions. You’ll also learn how to work with image, audio, text, and video data using deep learning architectures, as well as optimize and evaluate your deep learning models objectively to address issues such as bias, fairness, adversarial attacks, and model transparency. As you progress, you’ll harness the power of AI platforms to streamline the deep learning life cycle and leverage Python libraries and frameworks such as PyTorch, ONNX, Catalyst, MLFlow, Captum, Nvidia Triton, Prometheus, and Grafana to execute efficient deep learning architectures, optimize model performance, and streamline the deployment processes. You’ll also discover the transformative potential of large language models (LLMs) for a wide array of applications. By the end of this book, you'll have mastered deep learning techniques to unlock its full potential for your endeavors.
Table of Contents (25 chapters)

Part 1 – Foundational Methods
Part 2 – Multimodal Model Insights
Part 3 – DLOps

Building a CNN autoencoder

Let’s start by going through what a transpose convolution is. Figure 5.3 shows an example transpose convolution applied to a 2x2 input with a 2x2 convolutional filter and a stride of 1.

Figure 5.3 – A transposed convolutional filter operation

In Figure 5.3, note that each value of the 2x2 input is marked with a number from 1 to 4. These numbers map each input value to its corresponding partial result, presented as a 3x3 output. The convolutional kernel is applied to each input value individually in a sliding-window manner, scaling the entire kernel by that value, and the partial outputs of the four resulting operations are presented in the bottom part of the figure. Once the operation is done, these partial outputs are added elementwise at their respective offsets to form the final output, and a bias is then added. This example shows how a 2x2 input can be scaled up to a 3x3 output without relying entirely on padding.
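To make this concrete, the following minimal sketch (not taken from the book’s own code) uses PyTorch’s nn.ConvTranspose2d to upsample a 2x2 input to a 3x3 output with a 2x2 kernel and a stride of 1, and checks the result against the manual scatter-and-sum procedure described above; the tensor values and variable names here are purely illustrative:

import torch
import torch.nn as nn

# 2x2 single-channel input, batch size 1 (values are illustrative)
x = torch.tensor([[1., 2.],
                  [3., 4.]]).reshape(1, 1, 2, 2)

# Transpose convolution: 1 input channel, 1 output channel, 2x2 kernel, stride 1
tconv = nn.ConvTranspose2d(in_channels=1, out_channels=1,
                           kernel_size=2, stride=1, bias=True)

with torch.no_grad():
    out = tconv(x)
print(out.shape)  # torch.Size([1, 1, 3, 3])

# Manual check: each input value scales the full 2x2 kernel, the scaled
# copies are placed at the corresponding offsets in the 3x3 grid and
# summed elementwise, then the bias is added.
w = tconv.weight.detach().squeeze()  # the 2x2 kernel
b = tconv.bias.detach()
manual = torch.zeros(3, 3)
for i in range(2):
    for j in range(2):
        manual[i:i+2, j:j+2] += x[0, 0, i, j] * w
manual += b
assert torch.allclose(out.squeeze(), manual, atol=1e-6)

The output size follows from (H_in - 1) * stride + kernel_size = (2 - 1) * 1 + 2 = 3, which is why the 2x2 input grows to 3x3 without any padding.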

Let’s implement...