Deep Learning with TensorFlow

By: Giancarlo Zaccone, Md. Rezaul Karim, Ahmed Menshawy

Overview of this book

Deep learning is the step that comes after machine learning, and has more advanced implementations. Machine learning is not just for academics anymore, but is becoming a mainstream practice through wide adoption, and deep learning has taken the front seat. As a data scientist, if you want to explore data abstraction layers, this book will be your guide. This book shows how this can be exploited in the real world with complex raw data using TensorFlow 1.x. Throughout the book, you’ll learn how to implement deep learning algorithms for machine learning systems and integrate them into your product offerings, including search, image recognition, and language processing. Additionally, you’ll learn how to analyze and improve the performance of deep learning models. This can be done by comparing algorithms against benchmarks, along with machine intelligence, to learn from the information and determine ideal behaviors within a specific context. After finishing the book, you will be familiar with machine learning techniques, in particular the use of TensorFlow for deep learning, and will be ready to apply your knowledge to research or commercial projects.
Table of Contents (11 chapters)

Neural network architectures

The architecture of a neural network is defined by the way its nodes are connected, the number of layers (that is, the levels of nodes between input and output), and the number of neurons per layer. There are various types of neural network architecture, but this book will focus mainly on two large architectural families.

Multilayer perceptron

In a multilayer network, the artificial neurons are organized in layers such that:

  • Each neuron is connected with all those of the next layer
  • There are no connections between neurons belonging to the same layer
  • There are no connections between neurons belonging to non-adjacent layers
  • The number of layers and of neurons per layer depends on the problem to be solved

The input and output layers define the inputs and outputs of the network; between them are the hidden layers, whose complexity determines the different behaviors the network can realize. Finally, the connections between neurons are represented by as many matrices as there are pairs of adjacent layers. Each matrix contains the weights of the connections between the nodes of two adjacent layers. Feed-forward networks are networks whose connections contain no loops.
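The matrix representation described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes, the random initial weights, the sigmoid activation, and the function names `init_weights` and `forward` are all hypothetical choices, and bias terms are omitted for brevity.

```python
import math
import random

def init_weights(layer_sizes, seed=0):
    """Create one weight matrix per pair of adjacent layers."""
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # matrix[j][i] is the weight from neuron i of one layer
        # to neuron j of the next layer (fully connected).
        weights.append([[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                        for _ in range(n_out)])
    return weights

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, inputs):
    """Propagate the inputs layer by layer, with no loops and no skipped layers."""
    activations = list(inputs)
    for matrix in weights:
        activations = [sigmoid(sum(w * a for w, a in zip(row, activations)))
                       for row in matrix]
    return activations

# Assumed sizes: 3 inputs, one hidden layer of 4 neurons, 2 outputs.
layer_sizes = [3, 4, 2]
weights = init_weights(layer_sizes)
outputs = forward(weights, [0.5, -0.2, 0.1])
```

With three layers there are two pairs of adjacent layers, so `weights` holds two matrices: one of shape 4 x 3 and one of shape 2 x 4.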

The following figure shows a multilayer perceptron architecture:

Figure 8: A multilayer perceptron architecture

DNN architectures

Deep Neural Networks (DNNs) are artificial neural networks strongly oriented to deep learning. Where normal procedures of analysis are inapplicable due to the complexity of the data to be processed, such networks are an excellent modeling tool. DNNs are very similar to the neural networks we have discussed, but they implement a more complex model (a greater number of neurons, hidden layers, and connections), while still following the learning principles that apply to all machine learning problems (for example, supervised learning).

By construction, DNNs work in parallel, so they are able to process large amounts of data. They are sophisticated statistical systems with good tolerance to errors.

Unlike algorithmic systems, where you can examine how the output is generated step by step, neural networks can produce very reliable results without always letting you understand the reasons for those results. There are no theorems for generating optimal neural networks; the likelihood of getting a good network is in the hands of its creator, who must be familiar with statistical concepts and must pay particular attention to the choice of predictor variables.

For a brief cheat sheet on the different neural network architectures and their related publications, refer to the Asimov Institute website at http://www.asimovinstitute.org/neural-network-zoo/.

Finally, we observe that, in order to be productive, DNNs require training that properly tunes the weights. Training can take a long time if the amount of data to be examined and the number of variables involved are large, as is often the case when you want optimal results. This section introduces the deep learning architectures that we will cover during the course of this book.
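To make the idea of tuning weights through training concrete, here is a minimal sketch that fits a single linear neuron by gradient descent on a mean squared error. The text above does not prescribe a particular training algorithm, so gradient descent, the toy data, the learning rate, the number of epochs, and the function name `train_linear_neuron` are all assumptions chosen for illustration.

```python
def train_linear_neuron(samples, learning_rate=0.1, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in samples:
            error = (w * x + b) - target
            # Derivatives of the mean of error**2 with respect to w and b.
            grad_w += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        # Move each parameter a small step against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (an assumed target, for illustration).
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
w, b = train_linear_neuron(samples)
```

Every epoch visits every sample, so the cost of training grows with both the amount of data and the number of parameters, which is why training larger networks can take a long time.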

Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have been designed specifically for image recognition. Each image used in learning is divided into compact topological portions, each of which is processed by filters that search for particular patterns. Formally, each image is represented as a three-dimensional matrix of pixels (width, height, and color), and every sub-portion is convolved with the filter set. In other words, sliding each filter across the image computes, at each position, the inner product of the filter and the portion of the input beneath it. This procedure produces a set of feature maps (activation maps), one for each filter. By stacking the feature maps computed over the same portion of the image, we get an output volume. This type of layer is called a convolutional layer.
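The sliding inner product described above can be sketched in plain Python as follows. This is an illustrative sketch, not an optimized or library implementation: it assumes a stride of 1, no padding, and an image stored as nested lists indexed by height, width, and channel. The names `feature_map` and `conv_layer` and the two example filters are hypothetical.

```python
def feature_map(image, kernel):
    """Slide one filter over the image (stride 1, no padding).

    image is H x W x C and kernel is kh x kw x C, both as nested lists.
    Each output value is the inner product of the filter and the patch under it.
    """
    height, width = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    channels = len(kernel[0][0])
    out = []
    for top in range(height - kh + 1):
        row = []
        for left in range(width - kw + 1):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    for c in range(channels):
                        total += image[top + i][left + j][c] * kernel[i][j][c]
            row.append(total)
        out.append(row)
    return out

def conv_layer(image, kernels):
    """Stack one feature map per filter into an output volume (filter-first order)."""
    return [feature_map(image, k) for k in kernels]

# A 4 x 4 single-channel image with a vertical edge between columns 1 and 2.
image = [[[0.0], [0.0], [1.0], [1.0]] for _ in range(4)]
vertical_edge = [[[-1.0], [1.0]], [[-1.0], [1.0]]]    # 2 x 2 x 1 filter
horizontal_edge = [[[-1.0], [-1.0]], [[1.0], [1.0]]]  # 2 x 2 x 1 filter
volume = conv_layer(image, [vertical_edge, horizontal_edge])
```

Here the first feature map responds only where the filter straddles the vertical edge, while the second stays at zero because the image has no horizontal edge.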

The following figure shows a typical CNN architecture:

Figure 9: Convolutional neural network architecture

Restricted Boltzmann Machines

A Restricted Boltzmann Machine (RBM) consists of a visible layer and a hidden layer of nodes, but without visible-visible or hidden-hidden connections, hence the term restricted. These restrictions allow more efficient network training (training that can be supervised or unsupervised).

This type of neural network can represent a large number of input features with a small network; in fact, n hidden nodes can represent up to 2^n features. The network can be trained to answer a single yes/no question, or (again, in binary terms) up to a total of 2^n questions.
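A short sketch can illustrate both points: the restricted connectivity and the count of binary hidden configurations. It assumes binary units and the usual sigmoid conditional for the hidden units given the visible ones; the weights, the biases, and the function name `hidden_probabilities` are made up for illustration.

```python
import math
from itertools import product

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probabilities(visible, weights, hidden_bias):
    """Return p(h_j = 1 | v) for every hidden unit j.

    weights[i][j] links visible unit i to hidden unit j. There are no
    visible-visible or hidden-hidden weights, so each hidden unit can be
    computed independently once the visible layer is given.
    """
    return [
        sigmoid(hidden_bias[j]
                + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]

# Assumed toy RBM: 3 visible units, 2 hidden units, arbitrary weights.
weights = [[0.5, -1.0],
           [1.5, 0.0],
           [-0.5, 2.0]]
hidden_bias = [0.0, 0.0]
probs = hidden_probabilities([1, 0, 1], weights, hidden_bias)

# With n binary hidden units there are 2**n possible hidden configurations.
n_hidden = len(hidden_bias)
hidden_states = list(product([0, 1], repeat=n_hidden))
```

The single weight matrix between the two layers is the bipartite structure: a network with only two hidden units already has four distinct hidden configurations.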

The architecture of the RBM is as follows, with neurons arranged according to a symmetrical bipartite graph:

Figure 10: Restricted Boltzmann Machine architecture