Deep Learning with TensorFlow 2 and Keras - Second Edition

By: Antonio Gulli, Amita Kapoor, Sujit Pal

Overview of this book

Deep Learning with TensorFlow 2 and Keras, Second Edition teaches neural networks and deep learning techniques alongside TensorFlow (TF) and Keras. You’ll learn how to write deep learning applications in the most powerful, popular, and scalable machine learning stack available. TensorFlow is the machine learning library of choice for professional applications, while Keras offers a simple and powerful Python API for accessing TensorFlow. TensorFlow 2 provides full Keras integration, making advanced machine learning easier and more convenient than ever before. This book also introduces neural networks with TensorFlow, runs through the main applications (regression, ConvNets (CNNs), GANs, RNNs, NLP), covers two working example apps, and then dives into TF in production, TF mobile, and using TensorFlow with AutoML.

Machine learning, artificial intelligence, and the deep learning Cambrian explosion

Artificial intelligence (AI) lays the groundwork for everything this book discusses. Machine learning (ML) is a branch of AI, and deep learning (DL) is in turn a subset of ML. This section briefly discusses these three concepts, which you will encounter regularly throughout the rest of this book.

AI denotes any activity where machines mimic intelligent behaviors typically shown by humans. More formally, it is a research field in which machines aim to replicate cognitive capabilities such as learning behaviors, proactive interaction with the environment, inference and deduction, computer vision, speech recognition, problem solving, knowledge representation, and perception. AI builds on elements of computer science, mathematics, and statistics, as well as psychology and other sciences studying human behaviors. There are multiple strategies for building AI. During the 1970s and 1980s, "expert" systems became extremely popular. The goal of these systems was to solve complex problems by representing knowledge with a large number of manually defined if–then rules. This approach worked for small problems in very specific domains, but it did not scale to larger problems or multiple domains. Later, AI focused increasingly on the statistical methods that are part of ML.

ML is a subdiscipline of AI that focuses on teaching computers how to learn without the need to be programmed for specific tasks. The key idea behind ML is that it is possible to create algorithms that learn from, and make predictions on, data. There are three different broad categories of ML:

  • Supervised learning, in which the machine is presented with input data and a desired output, and the goal is to learn from those training examples in such a way that meaningful predictions can be made for data that the machine has never observed before.
  • Unsupervised learning, in which the machine is presented with input data only, and the machine has to subsequently find some meaningful structure by itself, with no external supervision or input.
  • Reinforcement learning, in which the machine acts as an agent, interacting with the environment. The machine is provided with "rewards" for behaving in a desired manner, and "penalties" for behaving in an undesired manner. The machine attempts to maximize rewards by learning to develop its behavior accordingly.
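To make the three categories concrete, here is a minimal sketch in plain Python. The data, the function names, and the toy reward function are all hypothetical and chosen only for illustration; these are deliberately tiny stand-ins, not the algorithms used later in this book.

```python
# Supervised learning: every input comes with the desired output (a label).
labeled_examples = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.5, "large")]

def predict_nearest(x, examples):
    """Predict a label for x by copying the label of the closest training input."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label

# Unsupervised learning: inputs only; structure must be found without labels.
def split_at_largest_gap(values):
    """Group 1-D values into two clusters separated by the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

# Reinforcement learning: the agent acts, collects rewards or penalties,
# and then prefers the action that paid best.
def toy_reward(action):
    """Hypothetical environment: +1 for 'right', -1 for any other action."""
    return 1.0 if action == "right" else -1.0

def learn_best_action(actions, reward_fn, trials_per_action=3):
    """Try every action a few times and return the one with the highest total reward."""
    totals = {action: 0.0 for action in actions}
    for action in actions:
        for _ in range(trials_per_action):
            totals[action] += reward_fn(action)
    return max(totals, key=totals.get)

print(predict_nearest(7.0, labeled_examples))             # label for an unseen input
print(split_at_largest_gap([9.5, 1.0, 8.0, 1.5]))         # two groups found without labels
print(learn_best_action(["left", "right"], toy_reward))   # action with the best reward
```

The supervised function needs the labels to make a prediction, the unsupervised one never sees a label, and the reinforcement learner only ever sees the rewards its own actions produce.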

DL took the world by storm in 2012. During that year, the ImageNet 2012 challenge [3] was launched with the goal of predicting the content of photographs using a subset of a large hand-labeled dataset. A deep learning model named AlexNet [4] achieved a top-5 error rate of 15.3%, a significant improvement over previous state-of-the-art results. According to The Economist [5], "Suddenly people started to pay attention, not just within the AI community but across the technology industry as a whole." Since 2012, we have seen constant progress [5] (see Figure 1), with several models classifying ImageNet photographs with an error rate of less than 2%, better than the estimated human error rate of 5.1%:

Figure 1: Top 5 accuracy achieved by different deep learning models on ImageNet 2012
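The top-5 error rate quoted for ImageNet counts a prediction as wrong only when the true label is missing from the model's five highest-scoring classes. The sketch below shows one way to compute it in plain Python; the class names and scores are invented for illustration and are not ImageNet data.

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is not among the k highest-scoring classes."""
    misses = 0
    for scores, truth in zip(scores_per_image, true_labels):
        # Rank class names from the highest score to the lowest.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if truth not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)

# Hypothetical model scores for four images over seven classes.
scores_per_image = [
    {"cat": 0.40, "dog": 0.25, "fox": 0.15, "owl": 0.10, "bee": 0.05, "ant": 0.03, "elk": 0.02},
    {"cat": 0.05, "dog": 0.30, "fox": 0.20, "owl": 0.25, "bee": 0.10, "ant": 0.08, "elk": 0.02},
    {"cat": 0.10, "dog": 0.50, "fox": 0.10, "owl": 0.10, "bee": 0.10, "ant": 0.05, "elk": 0.05},
    {"cat": 0.05, "dog": 0.05, "fox": 0.10, "owl": 0.10, "bee": 0.60, "ant": 0.05, "elk": 0.05},
]
true_labels = ["fox", "elk", "dog", "bee"]

top5 = top_k_error(scores_per_image, true_labels, k=5)
top1 = top_k_error(scores_per_image, true_labels, k=1)
print(f"top-5 error: {top5:.2f}, top-1 error: {top1:.2f}")
```

Because the top-5 criterion is more forgiving than top-1, the top-5 error of a model is never higher than its top-1 error on the same data.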

That was only the beginning. Today, DL techniques are successfully applied in heterogeneous domains including, but not limited to: healthcare, environment, green energy, computer vision, text analysis, multimedia, finance, retail, gaming, simulation, industry, robotics, and self-driving cars. In each of these domains, DL techniques can solve problems with a level of accuracy that was not possible using previous methods.

It is worth noting that interest in DL is also increasing. According to the State of Deep Learning H2 2018 Review [9], "Every 20 minutes, a new ML paper is born. The growth rate of machine learning papers has been around 3.5% a month [..] around a 50% growth rate annually." Over the past three years, it seems that we have been living through a Cambrian explosion for DL, with the number of articles on arXiv growing faster than Moore's Law (see Figure 2). Still, according to the review, this "gives you a sense that people believe that this is where the future value in computing is going to come from":

Figure 2: ML papers on arXiv appear to be growing faster than Moore's Law (source: https://www.kdnuggets.com/2018/12/deep-learning-major-advances-review.html)

arXiv is a repository of electronic preprints approved for posting after moderation, but not full peer review.
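The two growth figures quoted from the review are consistent with each other once monthly growth is compounded. The short calculation below checks this; the variable names are illustrative, and the comparison with Moore's Law assumes its common paraphrase of a doubling roughly every 24 months.

```python
import math

monthly_growth = 0.035  # 3.5% a month, as quoted in the review

# Compound the monthly rate over 12 months to get the annual rate.
annual_growth = (1 + monthly_growth) ** 12 - 1

# Number of months needed for the paper count to double at this rate.
doubling_months = math.log(2) / math.log(1 + monthly_growth)

print(f"Annual growth: {annual_growth:.1%}")
print(f"Doubling time: {doubling_months:.1f} months")
```

A doubling time of about 20 months is shorter than the roughly 24 months assumed here for Moore's Law, which is what "growing faster than Moore's Law" means in practice.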

The complexity of deep learning models is also increasing. ResNet-50 is an image recognition model (see Chapters 4 and 5) with about 26 million parameters. Each parameter is a weight that is adjusted during training. Transformers, GPT-1, BERT, and GPT-2 [7] are natural language processing models (see Chapter 8, Recurrent Neural Networks) able to perform a variety of tasks on text. These models progressively grew from 340 million to 1.5 billion parameters. Recently, Nvidia claimed that it has been able to train the largest-known model, with 8.3 billion parameters, in just 53 minutes. This training allowed Nvidia to build one of the most powerful models to process textual information (https://devblogs.nvidia.com/training-bert-with-gpus/).

Figure 3: Growth in number of parameters for various deep learning models
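To get a feel for where such parameter counts come from, consider fully connected (dense) layers, where every input is connected to every unit by one weight and every unit has one bias. The sketch below counts the parameters of a small, hypothetical network; the layer sizes are made up for illustration and do not describe any of the models named above.

```python
def dense_layer_parameters(num_inputs, num_units):
    """One weight per input-unit connection, plus one bias per unit."""
    return num_inputs * num_units + num_units

def dense_network_parameters(layer_sizes):
    """Total parameters of a stack of dense layers, given sizes [inputs, hidden..., outputs]."""
    return sum(
        dense_layer_parameters(num_inputs, num_units)
        for num_inputs, num_units in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical network: 784 inputs, one hidden layer of 128 units, 10 outputs.
layer_sizes = [784, 128, 10]
total = dense_network_parameters(layer_sizes)
print(f"Total trainable parameters: {total:,}")
```

Even this tiny network has over one hundred thousand parameters, which helps explain how deeper and wider architectures reach millions or billions.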

In addition, computational capacity is increasing significantly. GPUs and TPUs (Chapter 16, Tensor Processing Unit) are deep learning accelerators that have made it possible to train large models in a very short amount of time. TPU3s, announced in May 2018, are about twice as powerful (360 teraflops) as the TPU2s announced in May 2017. A full TPU3 pod can deliver more than 100 petaflops of machine learning performance, while TPU2 pods can reach 11.5 petaflops.

An improvement of roughly 10x per pod (see Figure 4) was achieved in just one year, which allows faster training:

Figure 4: TPU accelerators performance in petaflops

However, DL's growth is not only in terms of better accuracy, more research papers, larger models, and faster accelerators. There are additional trends that have been observed over the last four years.

First, the availability of flexible programming frameworks such as Keras [1], TensorFlow [2], PyTorch [8], and fast.ai. These frameworks have proliferated within the ML and DL community and have produced some very impressive results, as we'll see throughout this book. According to the Kaggle State of the Machine Learning and Data Science Survey 2019, based on responses from 19,717 Kaggle (https://www.kaggle.com/) members, Keras and TensorFlow are clearly the most popular choices (see Figure 5). TensorFlow 2.0 is the framework covered in this book. It aims to combine the best features of Keras and TensorFlow 1.x:

Figure 5: Adoption of deep learning frameworks

Second, the increasing possibility of using managed services in the cloud (see Chapter 12, TensorFlow and Cloud) with accelerators (Chapter 16, Tensor Processing Unit). This allows data scientists to focus on ML problems without having to manage the underlying infrastructure.

Third, the increasing capacity to deploy models on more heterogeneous systems: mobile devices, Internet of Things (IoT) devices, and even the browsers you normally use on your desktop and laptop (see Chapter 13, TensorFlow for Mobile and IoT and TensorFlow.js).

Fourth, the increased understanding of how to use more and more sophisticated DL architectures such as Dense Networks (Chapter 1, Neural Network Foundations with TensorFlow 2.0), Convolutional Networks (Chapter 4, Convolutional Neural Networks, and Chapter 5, Advanced Convolutional Neural Networks), Generative Adversarial Networks (Chapter 6, Generative Adversarial Networks), Word Embeddings (Chapter 7, Word Embeddings), Recurrent Networks (Chapter 8, Recurrent Neural Networks), Autoencoders (Chapter 9, Autoencoders), and advanced techniques such as Reinforcement Learning (Chapter 11, Reinforcement Learning).

Fifth, the advent of new AutoML techniques (Chapter 14, An Introduction to AutoML) that enable domain experts who are unfamiliar with ML technologies to use ML techniques easily and effectively. AutoML reduces the burden of finding the right model for a specific application domain, of fine-tuning that model, and of identifying the right set of features to use as input for a given application problem.

The above five trends culminated in 2019 when Yoshua Bengio, Geoffrey Hinton, and Yann LeCun – three of the fathers of Deep Learning – won the Turing Award "for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing." The ACM A.M. Turing Award is an annual prize given to an individual selected for contributions "of lasting and major technical importance to the computer field." Quotes taken from the ACM website (https://awards.acm.org/). Many consider this award to be the Nobel Prize of computer science.

Looking back at the previous eight years, it is fascinating to see the extent of the contributions that DL has made to science and industry. There is no reason to believe that the next eight years will see any less; indeed, as the field of DL continues to advance, we anticipate even more exciting contributions.

The intent of this book is to cover all five of the above trends, and to introduce you to the magic of deep learning. We will start with simple models and will progressively introduce increasingly sophisticated ones. The approach will always be hands-on, with a healthy dose of code to work with.