
Deep learning terminologies


  • Deep Neural Network (DNN): This can be defined as a multilayer perceptron with many hidden layers. The layers are fully connected; that is, each unit receives connections from all the units in the previous layer. The weights are initialized with either supervised or unsupervised learning.

  • Recurrent Neural Networks (RNN): An RNN is a kind of deep learning network that is used specifically for learning from time-series or sequential data, such as speech, video, and so on. The primary concept of an RNN is that observations from the previous state need to be retained for the next state. A recent hot topic in deep learning with RNNs is Long Short-Term Memory (LSTM).

  • Deep belief network (DBN): This type of network [9] [10] [11] can be defined as a probabilistic generative model with visible units and multiple layers of latent (hidden) variables. Through learning, each hidden layer captures a statistical relationship with the units in the layer below; the higher the layer, the more complex these relationships become. This type of network can be productively trained using greedy layer-wise training, where the hidden layers are trained one at a time in a bottom-up fashion.

  • Boltzmann machine (BM): This can be defined as a network of symmetrically connected, neuron-like units that take stochastic decisions about whether to be on or off. BMs have a simple learning algorithm that allows them to uncover many interesting features representing complex regularities in the training dataset.

  • Restricted Boltzmann machine (RBM): An RBM, which is a generative stochastic artificial neural network, is a special type of Boltzmann machine. These networks can learn a probability distribution over a collection of inputs. An RBM consists of a layer of visible units and a layer of hidden units, with no visible-visible or hidden-hidden connections.

  • Convolutional neural networks (CNN): A convolutional neural network is a type of neural network in which the layers are sparsely connected to one another and to the input layer; each neuron of the subsequent layer is responsible for only a part of the input. Deep convolutional neural networks have achieved unmatched performance in fields such as location recognition, image classification, face recognition, and so on.

  • Deep auto-encoder: A deep auto-encoder is a type of auto-encoder with multiple hidden layers. This type of network can be pre-trained as a stack of single-layered auto-encoders. The training process is usually difficult: first, the first hidden layer is trained to reconstruct the input data; it is then used to train the next hidden layer to reconstruct the states of the first hidden layer, and so on.

  • Gradient descent (GD): This is an optimization algorithm that is widely used in machine learning to determine the coefficients of a function (f) that minimize the overall cost function. Gradient descent is mostly used when the desired parameters cannot be calculated analytically (for example, with linear algebra) and must instead be found by an optimization algorithm.

In gradient descent, the weights of the model are updated incrementally after every iteration over the training dataset (epoch).

The cost function, J(w), with the sum of the squared errors can be written as follows:

J(w) = ½ Σⱼ (t⁽ʲ⁾ − o⁽ʲ⁾)²

Here, t⁽ʲ⁾ is the target value and o⁽ʲ⁾ is the model's output for the jth training sample.

The magnitude and direction of the weight update are calculated by taking a step in the reverse direction of the cost gradient, as follows:

∆wᵢ = −η ∂J/∂wᵢ

In the preceding equation, η is the learning rate of the network. Weights are updated incrementally after every epoch with the following rule:

                         for one or more epochs, 
                           for each weight i, 
                             wi := wi + ∆wi 
                           end  
                         end 

Popular models that can be optimized using gradient descent are Logistic Regression and Linear Regression, as the sketch below illustrates.
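To make the procedure concrete, here is a minimal, self-contained Java sketch (not taken from the book or Deeplearning4j) of batch gradient descent fitting a one-variable linear regression by minimizing the sum-of-squared-errors cost. The toy data, learning rate, and epoch count are purely illustrative assumptions:

// A minimal sketch of batch gradient descent for y = w0 + w1 * x,
// minimizing the sum-of-squared-errors cost J(w).
public class GradientDescentSketch {

    public static void main(String[] args) {
        // Toy training data: y is roughly 2x + 1 (illustrative values)
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {3.1, 4.9, 7.2, 9.1, 10.8};

        double w0 = 0.0, w1 = 0.0;   // weights, initialized to zero
        double eta = 0.01;           // learning rate (η)
        int epochs = 1000;           // passes over the full dataset

        for (int epoch = 0; epoch < epochs; epoch++) {
            // Accumulate the gradient of J(w) over the ENTIRE dataset
            // before making a single weight update (one step per epoch).
            double grad0 = 0.0, grad1 = 0.0;
            for (int j = 0; j < x.length; j++) {
                double error = (w0 + w1 * x[j]) - y[j];  // output - target
                grad0 += error;
                grad1 += error * x[j];
            }
            // Step in the reverse direction of the cost gradient:
            // wi := wi + ∆wi, where ∆wi = -η * ∂J/∂wi
            w0 -= eta * grad0;
            w1 -= eta * grad1;
        }
        System.out.printf("Learned model: y = %.3f + %.3f * x%n", w0, w1);
    }
}

Note that the weights move only once per epoch, after the gradient has been accumulated over the whole dataset; this is exactly the behaviour that becomes expensive for very large datasets, as discussed next.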

  • Stochastic Gradient Descent (SGD): Various deep learning algorithms that operate on large datasets are based on an optimization algorithm called stochastic gradient descent. Gradient descent performs well only on small datasets; for very large-scale datasets, the approach becomes extremely costly. In gradient descent, only a single weight update is performed for each pass over the entire training dataset; therefore, as the size of the dataset increases, the whole algorithm slows down. The weights are updated at a very slow rate, and the time taken to converge to the global cost minimum becomes protracted.

Therefore, to deal with such large-scale datasets, a variation of gradient descent called stochastic gradient descent is used. Unlike gradient descent, the weights are updated after each individual training sample, rather than only at the end of a full pass over the entire dataset.

                     until cost minimum is reached 
                       for each training sample j: 
                         for each weight i 
                            wi := wi + ∆wi 
                         end 
                       end 
                     end 
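The following sketch adapts the earlier toy linear-regression example to stochastic gradient descent; the only substantive change is that the weights are updated immediately after each individual training sample. The random sampling, data, and learning rate are again illustrative assumptions, not code from the book:

import java.util.Random;

// A minimal sketch of stochastic gradient descent: one weight update
// per training sample instead of one per pass over the whole dataset.
public class StochasticGradientDescentSketch {

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {3.1, 4.9, 7.2, 9.1, 10.8};

        double w0 = 0.0, w1 = 0.0;   // weights
        double eta = 0.01;           // learning rate (η)
        Random rnd = new Random(42);

        for (int epoch = 0; epoch < 1000; epoch++) {
            for (int n = 0; n < x.length; n++) {
                // Pick one training sample at random (the "stochastic" part)
                int j = rnd.nextInt(x.length);
                double error = (w0 + w1 * x[j]) - y[j];  // output - target
                // Update the weights immediately from this single sample:
                // wi := wi + ∆wi, with ∆wi = -η * ∂Jⱼ/∂wi
                w0 -= eta * error;
                w1 -= eta * error * x[j];
            }
        }
        System.out.printf("Learned model: y = %.3f + %.3f * x%n", w0, w1);
    }
}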

 

In the last few years, deep learning has gained tremendous popularity, as it sits at the junction of several widely practiced research areas, such as pattern recognition, neural networks, graphical modelling, machine learning, and signal processing.

The other important reasons for this popularity can be summarized by the following points:

  • In recent years, the processing capability of GPUs (Graphics Processing Units) has increased drastically

  • The size of the datasets used for training purposes has increased significantly

  • Recent research in machine learning, data science, and information processing has shown some serious advancements

Detailed descriptions of all these points will be provided in an upcoming topic in this chapter.