Python Deep Learning - Third Edition

By: Ivan Vasilev

Overview of this book

The field of deep learning has developed rapidly in recent years and today covers a broad range of applications. This makes it challenging to navigate and hard to understand without solid foundations. This book will guide you from the basics of neural networks to the state-of-the-art large language models in use today. The first part of the book introduces the main machine learning concepts and paradigms. It covers the mathematical foundations, the structure, and the training algorithms of neural networks and dives into the essence of deep learning. The second part of the book introduces convolutional networks for computer vision. We’ll learn how to solve image classification, object detection, instance segmentation, and image generation tasks. The third part focuses on the attention mechanism and transformers, the core network architecture of large language models. We’ll discuss new types of advanced tasks they can solve, such as chatbots and text-to-image generation. By the end of this book, you’ll have a thorough understanding of the inner workings of deep neural networks. You’ll be able to develop new models and adapt existing ones to solve your tasks. You’ll also understand enough to continue your research and stay up to date with the latest advancements in the field.
Table of Contents (17 chapters)
Part 1: Introduction to Neural Networks
Part 2: Deep Neural Networks for Computer Vision
Part 3: Natural Language Processing and Transformers
Part 4: Developing and Deploying Deep Neural Networks

Introduction to ML

Machine learning (ML) is often associated with terms such as big data and artificial intelligence (AI). However, both are quite different from ML. To understand what ML is and why it’s useful, it’s important to understand what big data is and how ML applies to it.

Big data is a term used to describe huge datasets that result from large increases in the amount of data that is gathered and stored, for example, through cameras, sensors, or social networking sites.

How much data do we create daily?

It’s estimated that Google alone processes over 20 petabytes of information per day, and this number is only going to increase. In 2018, Forbes estimated that 2.5 quintillion bytes of data are created every day and that 90% of all the data in the world had been created in the preceding two years.

(https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/)
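To put these figures on a common scale, the following short sketch converts the quoted daily volume into more familiar units. It assumes the short-scale meaning of quintillion (10^18) and decimal (SI) byte prefixes; the variable names are illustrative only.

```python
# Decimal (SI) byte units, assumed here: 1 TB = 10**12, 1 PB = 10**15, 1 EB = 10**18 bytes.
TERABYTE = 10**12
PETABYTE = 10**15
EXABYTE = 10**18
SECONDS_PER_DAY = 24 * 60 * 60

# 2.5 quintillion bytes per day (short scale: 1 quintillion = 10**18).
world_bytes_per_day = 25 * 10**17

world_in_exabytes = world_bytes_per_day / EXABYTE
world_in_petabytes = world_bytes_per_day // PETABYTE
world_bytes_per_second = world_bytes_per_day / SECONDS_PER_DAY

print(f"{world_in_exabytes} EB per day")                          # 2.5 EB per day
print(f"{world_in_petabytes} PB per day")                         # 2500 PB per day
print(f"{world_bytes_per_second / TERABYTE:.1f} TB per second")   # 28.9 TB per second
```

In other words, under these assumptions, 2.5 quintillion bytes per day corresponds to 2.5 exabytes, or 2,500 petabytes, per day.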

Humans alone are unable to grasp, let alone analyze, such huge amounts of data, so ML techniques are used to make sense of these very large datasets. ML is a tool for large-scale data processing. It is well suited to complex datasets that have huge numbers of variables and features. One of the strengths of many ML techniques, and deep learning (DL) in particular, is that they perform best when used on large datasets, which improves their analytic and predictive power. In other words, ML techniques, and DL neural networks (NNs) in particular, learn best when they can access large datasets in which they can discover hidden patterns and regularities.
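The effect of dataset size can be illustrated with a toy experiment. The sketch below is not a DL model; it assumes a simple setting in which a straight line is fitted by least squares to noisy synthetic points, and it compares how far the estimated slope is from the true one for a small and a large dataset. All function names, the noise level, and the sample sizes are hypothetical choices made for illustration.

```python
import random


def fit_slope(xs, ys):
    """Least-squares slope of a line fitted to the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


def mean_slope_error(n_samples, n_trials=50, true_slope=3.0, noise=1.0, seed=0):
    """Average absolute slope error over several synthetic datasets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Synthetic data: y = true_slope * x + Gaussian noise.
        xs = [rng.uniform(0.0, 1.0) for _ in range(n_samples)]
        ys = [true_slope * x + rng.gauss(0.0, noise) for x in xs]
        total += abs(fit_slope(xs, ys) - true_slope)
    return total / n_trials


small = mean_slope_error(n_samples=10)
large = mean_slope_error(n_samples=2000)
print(f"mean slope error with 10 samples:   {small:.3f}")
print(f"mean slope error with 2000 samples: {large:.3f}")
```

In this toy setting, the model fitted on the larger dataset should recover the underlying pattern much more accurately, because the random noise averages out as more examples become available.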

ML’s predictive ability can also be applied to AI systems. ML can be thought of as the brain of an AI system. AI can be defined (though this definition may not be unique) as a system that can interact with its environment. AI machines are endowed with sensors that let them perceive the environment they are in and with tools that let them act on it. ML is therefore the brain that allows the machine to analyze the data ingested through its sensors and formulate an appropriate response. A simple example is Siri on an iPhone. Siri hears the command through the microphone and outputs an answer through the speakers or the display, but to do so, it needs to understand what it’s being told. Similarly, driverless cars are equipped with cameras, GPS systems, sonar, and LiDAR, but all this information needs to be processed to arrive at a correct decision, such as whether to accelerate, brake, or turn. ML is the information-processing method that leads to that decision.

We’ve explained what ML is, but what about DL? For now, let’s just say that DL is a subfield of ML whose methods share some distinctive features. The most popular representatives of such methods are deep NNs.