Book Image

Neural Networks with R

By : Balaji Venkateswaran, Giuseppe Ciaburro
Book Image

Neural Networks with R

By: Balaji Venkateswaran, Giuseppe Ciaburro

Overview of this book

Neural networks are one of the most fascinating machine learning models for solving complex computational problems efficiently. Neural networks are used to solve wide range of problems in different areas of AI and machine learning. This book explains the niche aspects of neural networking and provides you with foundation to get started with advanced topics. The book begins with neural network design using the neural net package, then you’ll build a solid foundation knowledge of how a neural network learns from data, and the principles behind it. This book covers various types of neural network including recurrent neural networks and convoluted neural networks. You will not only learn how to train neural networks, but will also explore generalization of these networks. Later we will delve into combining different neural network models and work with the real-world use cases. By the end of this book, you will learn to implement neural network models in your applications with the help of practical examples in the book.
Table of Contents (14 chapters)
Title Page
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface

Gradient descent


Gradient descent is an iterative approach for error correction in any learning model. For neural networks during backpropagation, the process of iterating the update of weights and biases with the error times derivative of the activation function is the gradient descent approach. The steepest descent step size is replaced by a similar size from the previous step. Gradient is basically defined as the slope of the curve and is the derivative of the activation function:

The objective of deriving gradient descent at each step is to find the global cost minimum, where the error is the lowest. And this is where the model has a good fit for the data and predictions are more accurate.

Gradient descent can be performed either for the full batch or stochastic. In full batch gradient descent, the gradient is computed for the full training dataset, whereas Stochastic Gradient Descent (SGD) takes a single sample and performs gradient calculation. It can also take mini-batches and perform the calculations. One advantage of SGD is faster computation of gradients.