Book Image

Enhancing Deep Learning with Bayesian Inference

By : Matt Benatan, Jochem Gietema, Marian Schneider
Book Image

Enhancing Deep Learning with Bayesian Inference

By: Matt Benatan, Jochem Gietema, Marian Schneider

Overview of this book

Deep learning has an increasingly significant impact on our lives, from suggesting content to playing a key role in mission- and safety-critical applications. As the influence of these algorithms grows, so does the concern for the safety and robustness of the systems which rely on them. Simply put, typical deep learning methods do not know when they don’t know. The field of Bayesian Deep Learning contains a range of methods for approximate Bayesian inference with deep networks. These methods help to improve the robustness of deep learning systems as they tell us how confident they are in their predictions, allowing us to take more in how we incorporate model predictions within our applications. Through this book, you will be introduced to the rapidly growing field of uncertainty-aware deep learning, developing an understanding of the importance of uncertainty estimation in robust machine learning systems. You will learn about a variety of popular Bayesian Deep Learning methods, and how to implement these through practical Python examples covering a range of application scenarios. By the end of the book, you will have a good understanding of Bayesian Deep Learning and its advantages, and you will be able to develop Bayesian Deep Learning models for safer, more robust deep learning systems.
Table of Contents (11 chapters)

3.3 Reviewing neural network architectures

In the previous section, we saw how to implement a fully-connected network in the form of an MLP. While such networks were very popular in the early days of deep learning, over the years, machine learning researchers have developed more sophisticated architectures that work more successfully by including domain-specific knowledge (such as computer vision or Natural Language Processing (NLP)). In this section, we will review some of the most common of these neural network architectures, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), as well as attention mechanisms and transformers.

3.3.1 Exploring CNNs

When looking back at the example of trying to predict London housing prices with an MLP model, the input features we used (distance to the city centre, floor area, and construction year of the house) were still ”hand-engineered,” meaning that a human looked at the problem and decided which...