Book Image

Intelligent Projects Using Python

By : Santanu Pattanayak
Book Image

Intelligent Projects Using Python

By: Santanu Pattanayak

Overview of this book

This book will be a perfect companion if you want to build insightful projects from leading AI domains using Python. The book covers detailed implementation of projects from all the core disciplines of AI. We start by covering the basics of how to create smart systems using machine learning and deep learning techniques. You will assimilate various neural network architectures such as CNN, RNN, LSTM, to solve critical new world challenges. You will learn to train a model to detect diabetic retinopathy conditions in the human eye and create an intelligent system for performing a video-to-text translation. You will use the transfer learning technique in the healthcare domain and implement style transfer using GANs. Later you will learn to build AI-based recommendation systems, a mobile app for sentiment analysis and a powerful chatbot for carrying customer services. You will implement AI techniques in the cybersecurity domain to generate Captchas. Later you will train and build autonomous vehicles to self-drive using reinforcement learning. You will be using libraries from the Python ecosystem such as TensorFlow, Keras and more to bring the core aspects of machine learning, deep learning, and AI. By the end of this book, you will be skilled to build your own smart models for tackling any kind of AI problems without any hassle.
Table of Contents (12 chapters)

Neural activation units

Several kinds of neural activation units are used in neural networks, depending on the architecture and the problem at hand. We will discuss the most commonly used activation functions, as these play an important role in determining the network architecture and performance. Linear and sigmoid unit activation functions were primarily used in artificial neural networks until rectified linear units (ReLUs), invented by Hinton et al., revolutionized the performance of neural networks.

Linear activation units

A linear activation unit outputs the total input to the neuron that is attenuated, as shown in the following graph:

Figure 1.5: Linear neuron

If x is the total input to the linear activation unit, then the output, y, can be represented as follows:

Sigmoid activation units

The output of the sigmoid activation unit, y, as a function of its total input, x, is expressed as follows:

Since the sigmoid activation unit response is a nonlinear function, as shown in the following graph, it is used to introduce nonlinearity in the neural network:

Figure 1.6: Sigmoid activation function

Any complex process in nature is generally nonlinear in its input-output relation, and hence, we need nonlinear activation functions to model them through neural networks. The output probability of a neural network for a two-class classification is generally given by the output of a sigmoid neural unit, since it outputs values from zero to one. The output probability can be represented as follows:

Here, x represents the total input to the sigmoid unit in the output layer.

The hyperbolic tangent activation function

The output, y, of a hyperbolic tangent activation function (tanh) as a function of its total input, x, is given as follows:

The tanh activation function outputs values in the range [-1, 1], as you can see in the following graph:

Figure 1.7: Tanh activation function

One thing to note is that both the sigmoid and the tanh activation functions are linear within a small range of the input, beyond which the output saturates. In the saturation zone, the gradients of the activation functions (with respect to the input) are very small or close to zero; this means that they are very prone to the vanishing gradient problem. As you will see later on, neural networks learn from the backpropagation method, where the gradient of a layer is dependent on the gradients of the activation units in the succeeding layers, up to the final output layer. Therefore, if the units in the activation units are working in the saturation region, much less of the error is backpropagated to the early layers of the neural network. Neural networks minimize the prediction error in order to learn the weights and biases (W) by utilizing the gradients. This means that, if the gradients are small or vanish to zero, then the neural network will fail to learn these weights properly.

Rectified linear unit (ReLU)

The output of a ReLU is linear when the total input to the neuron is greater than zero, and the output is zero when the total input to the neuron is negative. This simple activation function provides nonlinearity to a neural network, and, at the same time, it provides a constant gradient of one with respect to the total input. This constant gradient helps to keep the neural network from developing saturating or vanishing gradient problems, as seen in activation functions, such as sigmoid and tanh activation units. The ReLU function output (as shown in Figure 1.8) can be expressed as follows:

The ReLU activation function can be plotted as follows:

Figure 1.8: ReLU activation function

One of the constraints for ReLU is its zero gradients for negative values of input. This may slow down the training, especially at the initial phase. Leaky ReLU activation functions (as shown in Figure 1.9) can be useful in this scenario, where the output and gradients are nonzero, even for negative values of the input. A leaky ReLU output function can be expressed as follows:

The parameter is to be provided for leaky ReLU activation functions, whereas for a parametric ReLU, is a parameter that the neural network will learn through training. The following graph shows the output of the leaky ReLU activation function:

Figure 1.9: Leaky ReLU activation function

The softmax activation unit

The softmax activation unit is generally used to output the class probabilities, in the case of a multi-class classification problem. Suppose that we are dealing with an n class classification problem, and the total input corresponding to the classes is given by the following:

In this case, the output probability of the kth class of the softmax activation unit is given by the following formula:

There are several other activation functions, mostly variations of these basic versions. We will discuss them as we encounter them in the different projects that we will cover in the following chapters.