Python Deep Learning - Third Edition

By : Ivan Vasilev

Overview of this book

The field of deep learning has developed rapidly in recent years and today covers a broad range of applications. This makes it challenging to navigate and hard to understand without solid foundations. This book will guide you from the basics of neural networks to the state-of-the-art large language models in use today. The first part of the book introduces the main machine learning concepts and paradigms. It covers the mathematical foundations, the structure, and the training algorithms of neural networks and dives into the essence of deep learning. The second part of the book introduces convolutional networks for computer vision. We’ll learn how to solve image classification, object detection, instance segmentation, and image generation tasks. The third part focuses on the attention mechanism and transformers – the core network architecture of large language models. We’ll discuss new types of advanced tasks they can solve, such as chatbots and text-to-image generation. By the end of this book, you’ll have a thorough understanding of the inner workings of deep neural networks. You’ll be able to develop new models and adapt existing ones to solve your tasks. You’ll also understand enough to continue your research and stay up to date with the latest advancements in the field.
Table of Contents (17 chapters)
Part 1: Introduction to Neural Networks
Part 2: Deep Neural Networks for Computer Vision
Part 3: Natural Language Processing and Transformers
Part 4: Developing and Deploying Deep Neural Networks

Transfer learning (TL)

So far, we’ve trained small models on toy datasets, where the training took no more than an hour. But if we want to work with large datasets, such as ImageNet, we will need a much bigger network that trains for a lot longer. More importantly, large datasets are not always available for the tasks we’re interested in. Keep in mind that the images not only have to be obtained but also labeled, which can be expensive and time-consuming. So, what does a humble engineer do when they want to solve a real ML problem with limited resources? Enter TL.

TL is the process of applying an existing trained ML model to a new, but related, problem. For example, we can take a network trained on ImageNet and repurpose it to classify grocery store items. Alternatively, we could use a driving simulator game to train an NN to drive a simulated car, and then use the network to drive a real car (but don’t try this at home!). TL is a general ML concept that applies...
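The idea of repurposing a trained network for a new set of classes can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the book's own implementation: `backbone` is a hypothetical stand-in with random weights (in practice, it would be a network whose weights were actually trained on a large dataset such as ImageNet), the input is a 16-dimensional vector rather than an image, and the three target classes stand for hypothetical grocery store categories. The sketch freezes the existing layers, attaches a new output layer for the new task, and trains only that new layer.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a network trained on a large source dataset.
# Its weights are random here; a real use case would load trained weights.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())

# Freeze the existing layers so that training does not modify them.
for param in backbone.parameters():
    param.requires_grad = False

# Attach a new output layer for the new task (3 hypothetical target classes).
new_head = nn.Linear(32, 3)
model = nn.Sequential(backbone, new_head)

# Pass only the trainable (new) parameters to the optimizer.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=0.1)

# One training step on a random toy batch (assumed data, for illustration).
inputs = torch.randn(8, 16)
labels = torch.randint(0, 3, (8,))

frozen_before = backbone[0].weight.clone()

logits = model(inputs)
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Number of parameters that are updated: 32 * 3 weights + 3 biases.
trainable = sum(p.numel() for p in trainable_params)
```

After the step, `backbone[0].weight` is unchanged, and only the parameters of `new_head` have been updated. Freezing all existing layers is one design choice; another option is to leave some or all of them trainable so that they are adjusted for the new task as well.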