Java Deep Learning Cookbook

By: Rahul Raj

Overview of this book

Java is one of the most widely used programming languages in the world. With this book, you will see how to perform deep learning using Deeplearning4j (DL4J), the most popular Java library for training neural networks efficiently. This book starts by showing you how to install and configure Java and DL4J on your system. You will then gain insights into deep learning basics and use your knowledge to create a deep neural network for binary classification from scratch. As you progress, you will discover how to build a convolutional neural network (CNN) in DL4J, and understand how to construct numeric vectors from text. This deep learning book will also guide you through performing anomaly detection on unsupervised data and help you set up neural networks in distributed systems effectively. In addition to this, you will learn how to import models from Keras and change the configuration in a pre-trained DL4J model. Finally, you will explore benchmarking in DL4J and tune neural networks for the best results. By the end of this book, you will have a clear understanding of how you can use DL4J to build robust deep learning applications in Java.

Deep learning intuition

If you're new to deep learning, you may be wondering how exactly it differs from machine learning, or whether the two are the same. Deep learning is a subset of the larger domain of machine learning. Let's think about this in the context of an automobile image classification problem:

As you can see in the preceding diagram, we need to perform feature extraction ourselves, because traditional machine learning algorithms cannot do it on their own. They may be highly efficient and produce accurate results, but they cannot learn features directly from raw data. In other words, they still rely on human effort to decide which signals matter:

On the other hand, deep learning algorithms learn to extract features and perform the task on their own. Deep learning is built on neural networks, which adjust themselves during training to optimize the results. However, the final decision process is largely hidden and hard to trace. Deep learning is loosely inspired by the functioning of the human brain.

Backpropagation

The backbone of a neural network is the backpropagation algorithm. Refer to the sample neural network structure shown as follows:

In any neural network, data flows from the input layer to the output layer during the forward pass. Each circle in the diagram represents a neuron, and every layer contains a number of neurons. The input must be numeric so that the neurons can perform computations on it. Each neuron holds a set of weights, one for each incoming connection, and applies an activation function. It computes the weighted sum of its inputs and passes the result through the activation function to produce its output. At the output layer, a loss function measures the error, that is, the deviation of the prediction from the actual value.

During the backward pass (from the output layer back to the input layer), we use the loss score to adjust the weights so that the loss decreases. Weights that contributed more to the error receive larger adjustments, and weights that contributed less receive smaller ones. The process continues backward, layer by layer, until the weights feeding the first hidden layer have been updated. In a nutshell, we are tracking the rate of change of the loss with respect to each weight in the network. One forward pass followed by one backward pass over a batch of data is called an iteration, and one complete pass over the entire training dataset is called an epoch. A training session runs for multiple epochs, and the network's results tend to improve after each one.
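To make the forward pass, loss calculation, and backward pass concrete, here is a minimal sketch in plain Java with no DL4J dependency. It is not how DL4J implements training. The class name `BackpropSketch`, the 2-2-1 layout, the sigmoid activation, the mean squared error loss, the starting weights, and the logical OR dataset are all assumptions chosen for illustration. Because the whole dataset is processed as a single batch, one call to `trainEpoch` is both one iteration and one epoch.

```java
/** A 2-2-1 feed-forward network trained with hand-written backpropagation. */
final class BackpropSketch {
    // Assumed starting weights, fixed only so that runs are reproducible.
    private final double[][] wHidden = {{0.5, -0.4}, {0.3, 0.8}}; // [hidden neuron][input]
    private final double[] bHidden = {0.1, -0.1};
    private final double[] wOut = {0.7, -0.6};                     // [hidden neuron]
    private double bOut = 0.05;

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Forward pass, first half: input layer to hidden layer activations. */
    double[] hidden(double[] x) {
        double[] h = new double[2];
        for (int j = 0; j < 2; j++) {
            h[j] = sigmoid(wHidden[j][0] * x[0] + wHidden[j][1] * x[1] + bHidden[j]);
        }
        return h;
    }

    /** Forward pass, second half: hidden activations to the single output. */
    double output(double[] h) {
        return sigmoid(wOut[0] * h[0] + wOut[1] * h[1] + bOut);
    }

    double predict(double[] x) {
        return output(hidden(x));
    }

    /** Loss function: mean of the squared errors over all examples. */
    double meanSquaredError(double[][] xs, double[] ys) {
        double sum = 0.0;
        for (int n = 0; n < xs.length; n++) {
            double error = predict(xs[n]) - ys[n];
            sum += error * error;
        }
        return sum / xs.length;
    }

    /** One epoch: forward and backward pass per example, then one weight update. */
    void trainEpoch(double[][] xs, double[] ys, double learningRate) {
        double[][] gradWHidden = new double[2][2];
        double[] gradBHidden = new double[2];
        double[] gradWOut = new double[2];
        double gradBOut = 0.0;

        for (int n = 0; n < xs.length; n++) {
            double[] h = hidden(xs[n]);
            double out = output(h);
            // Output layer: derivative of the squared error times the sigmoid derivative.
            double deltaOut = 2.0 * (out - ys[n]) * out * (1.0 - out);
            gradBOut += deltaOut;
            for (int j = 0; j < 2; j++) {
                gradWOut[j] += deltaOut * h[j];
                // Hidden layer: the error flows backward through the output weight.
                double deltaHidden = deltaOut * wOut[j] * h[j] * (1.0 - h[j]);
                gradBHidden[j] += deltaHidden;
                for (int i = 0; i < 2; i++) {
                    gradWHidden[j][i] += deltaHidden * xs[n][i];
                }
            }
        }

        // Gradient descent: move every weight against its averaged gradient.
        double step = learningRate / xs.length;
        bOut -= step * gradBOut;
        for (int j = 0; j < 2; j++) {
            wOut[j] -= step * gradWOut[j];
            bHidden[j] -= step * gradBHidden[j];
            for (int i = 0; i < 2; i++) {
                wHidden[j][i] -= step * gradWHidden[j][i];
            }
        }
    }

    public static void main(String[] args) {
        double[][] xs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[] ys = {0, 1, 1, 1}; // logical OR
        BackpropSketch net = new BackpropSketch();
        System.out.println("Loss before training: " + net.meanSquaredError(xs, ys));
        for (int epoch = 0; epoch < 2000; epoch++) {
            net.trainEpoch(xs, ys, 0.5);
        }
        System.out.println("Loss after training:  " + net.meanSquaredError(xs, ys));
    }
}
```

Running `main` prints the loss before and after training, so you can see the loss score falling as the weights are updated over many epochs.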

Multilayer Perceptron (MLP)

An MLP is a standard feed-forward neural network with at least three layers: an input layer, one or more hidden layers, and an output layer. Hidden layers sit between the input layer and the output layer. The simplest MLP has a single hidden layer, while networks with two or more hidden layers are generally referred to as deep neural networks.
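The layered structure can be sketched as a forward pass in plain Java. This is an illustrative example rather than DL4J code: the class `MlpForwardSketch`, its method names, the ReLU activation on hidden layers, and the linear output layer are assumptions. Each entry in `weights` connects one layer to the next, so the number of hidden layers is the number of weight matrices minus one.

```java
/** Forward pass through an MLP with any number of hidden layers. */
final class MlpForwardSketch {
    static double relu(double z) {
        return Math.max(0.0, z);
    }

    /** One fully connected layer: out[j] = act(b[j] + sum over i of w[j][i] * in[i]). */
    static double[] dense(double[] in, double[][] w, double[] b, boolean applyRelu) {
        double[] out = new double[w.length];
        for (int j = 0; j < w.length; j++) {
            double sum = b[j];
            for (int i = 0; i < in.length; i++) {
                sum += w[j][i] * in[i];
            }
            out[j] = applyRelu ? relu(sum) : sum;
        }
        return out;
    }

    /** Feeds the input through every layer; the last layer is left linear. */
    static double[] forward(double[] input, double[][][] weights, double[][] biases) {
        double[] activation = input;
        for (int layer = 0; layer < weights.length; layer++) {
            boolean isOutputLayer = (layer == weights.length - 1);
            activation = dense(activation, weights[layer], biases[layer], !isOutputLayer);
        }
        return activation;
    }

    /** The input layer has no weights, and the last weight matrix feeds the output layer. */
    static int hiddenLayerCount(double[][][] weights) {
        return weights.length - 1;
    }

    public static void main(String[] args) {
        // 2 inputs -> 2 hidden neurons -> 1 output, with hand-picked weights.
        double[][][] weights = {{{1, -1}, {-1, 1}}, {{1, 1}}};
        double[][] biases = {{0, 0}, {0.5}};
        double[] result = forward(new double[]{3, 1}, weights, biases);
        System.out.println("Hidden layers: " + hiddenLayerCount(weights));
        System.out.println("Output: " + result[0]);
    }
}
```

Adding another matrix to `weights` (with a matching entry in `biases`) adds another hidden layer, which is the step from a simple MLP towards a deeper network.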

Convolutional Neural Network (CNN)

CNNs are generally used for image classification problems, but they can also be used in Natural Language Processing (NLP), in conjunction with word vectors, because of their proven results. Unlike a regular neural network, a CNN has additional layers such as convolutional layers and subsampling layers. Convolutional layers take input data (such as images) and apply convolution operations to it. You can think of this as applying a function to the input. Convolutional layers act as filters that pass a feature of interest to the upcoming subsampling layer. A feature of interest can be anything that helps to identify the image (for example, fur or shading). In the subsampling layer, the output of the convolutional layers is downsampled further. So, we end up with a much smaller image resolution and reduced color contrast, preserving only the important information. The result is then passed on to fully connected layers, which resemble regular feed-forward neural networks.
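The two CNN-specific operations can be sketched in plain Java as follows. This is a simplified illustration, not DL4J's implementation: the class `ConvPoolSketch`, the vertical-edge kernel, the single-channel 6x6 image, and the choice of 2x2 max pooling as the subsampling step are all assumptions. The convolution uses no padding and a stride of 1, and it is computed as a sliding dot product, which is how deep learning libraries usually define it.

```java
/** A single convolution filter followed by 2x2 max pooling on a grayscale image. */
final class ConvPoolSketch {
    /** Slides the kernel over the image (no padding, stride 1) and sums the products. */
    static double[][] convolve(double[][] image, double[][] kernel) {
        int kernelHeight = kernel.length;
        int kernelWidth = kernel[0].length;
        int outHeight = image.length - kernelHeight + 1;
        int outWidth = image[0].length - kernelWidth + 1;
        double[][] featureMap = new double[outHeight][outWidth];
        for (int r = 0; r < outHeight; r++) {
            for (int c = 0; c < outWidth; c++) {
                double sum = 0.0;
                for (int i = 0; i < kernelHeight; i++) {
                    for (int j = 0; j < kernelWidth; j++) {
                        sum += kernel[i][j] * image[r + i][c + j];
                    }
                }
                featureMap[r][c] = sum;
            }
        }
        return featureMap;
    }

    /** Subsampling: keeps the largest value in each non-overlapping 2x2 block. */
    static double[][] maxPool2x2(double[][] input) {
        int outHeight = input.length / 2;
        int outWidth = input[0].length / 2;
        double[][] pooled = new double[outHeight][outWidth];
        for (int r = 0; r < outHeight; r++) {
            for (int c = 0; c < outWidth; c++) {
                double top = Math.max(input[2 * r][2 * c], input[2 * r][2 * c + 1]);
                double bottom = Math.max(input[2 * r + 1][2 * c], input[2 * r + 1][2 * c + 1]);
                pooled[r][c] = Math.max(top, bottom);
            }
        }
        return pooled;
    }

    public static void main(String[] args) {
        // Bright left half, dark right half: one vertical edge in the middle.
        double[][] image = new double[6][];
        for (int r = 0; r < 6; r++) {
            image[r] = new double[]{1, 1, 1, 0, 0, 0};
        }
        // Filter that responds to a bright-to-dark transition from left to right.
        double[][] verticalEdge = {{1, 0, -1}, {1, 0, -1}, {1, 0, -1}};

        double[][] featureMap = convolve(image, verticalEdge); // 4x4
        double[][] pooled = maxPool2x2(featureMap);            // 2x2
        System.out.println("Feature map size: " + featureMap.length + "x" + featureMap[0].length);
        System.out.println("Pooled size: " + pooled.length + "x" + pooled[0].length);
    }
}
```

The feature map is strongest where the filter overlaps the edge, and pooling halves each dimension while keeping that strong response, which is the "preserve only the important information" idea in miniature.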

Recurrent Neural Network (RNN)

An RNN is a neural network that can process sequential data. In a regular feed-forward neural network, only the current input is considered by the neurons in the next layer. An RNN, on the other hand, also takes previously received inputs into account by keeping an internal memory (a hidden state) that is carried from one step of the sequence to the next. This lets it capture dependencies between elements of a sequence. RNNs are a popular choice for sequence tasks such as NLP and speech recognition. In practice, a variant structure called Long Short-Term Memory (LSTM) is used as a better alternative to a plain RNN, because it is better at preserving long-term dependencies.
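The idea of carrying memory from one step to the next can be sketched with a single recurrent neuron in plain Java. This is a deliberately tiny illustration and not an LSTM or DL4J code: the class `RnnSketch`, the scalar weights, and the tanh activation are assumptions. At each step, the new hidden state depends on the current input and on the previous hidden state.

```java
/** A single recurrent neuron: h_t = tanh(wInput * x_t + wRecurrent * h_(t-1) + bias). */
final class RnnSketch {
    static double step(double x, double previousHidden,
                       double wInput, double wRecurrent, double bias) {
        return Math.tanh(wInput * x + wRecurrent * previousHidden + bias);
    }

    /** Processes a sequence in order and returns the hidden state after every step. */
    static double[] run(double[] sequence, double wInput, double wRecurrent, double bias) {
        double[] hiddenStates = new double[sequence.length];
        double hidden = 0.0; // the memory starts empty
        for (int t = 0; t < sequence.length; t++) {
            hidden = step(sequence[t], hidden, wInput, wRecurrent, bias);
            hiddenStates[t] = hidden;
        }
        return hiddenStates;
    }

    public static void main(String[] args) {
        // Both sequences end with the same input (0), but their histories differ.
        double[] afterSilence = run(new double[]{0, 0}, 1.0, 0.5, 0.0);
        double[] afterSignal = run(new double[]{1, 0}, 1.0, 0.5, 0.0);
        System.out.println("Final state after {0, 0}: " + afterSilence[1]);
        System.out.println("Final state after {1, 0}: " + afterSignal[1]);
    }
}
```

The last input is identical in both runs, yet the final hidden states differ, because the second sequence leaves a trace of the earlier `1` in the hidden state. A feed-forward network given only the last input could not tell the two cases apart.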

Why is DL4J important for deep learning?

The following points will help you understand why DL4J is important for deep learning:

DL4J provides commercial support. It is the first commercial-grade, open source deep learning library in Java.
Writing training code is simple and precise. DL4J works in a plug-and-play fashion, which means that switching between hardware (CPU to GPU) is just a matter of changing the Maven dependencies, with no modifications needed in the code.
DL4J uses ND4J as its backend. ND4J is a computation library that is reported to run up to twice as fast as NumPy (a computation library in Python) in large matrix operations. DL4J also exhibits faster training times in GPU environments compared to its Python counterparts.
DL4J supports training on a cluster of CPU or GPU machines using Apache Spark. DL4J brings automated parallelism to distributed training, which means that it sets up the worker nodes and connections itself and bypasses the need for extra libraries.
DL4J is a good production-oriented deep learning library. Because it is JVM-based, DL4J applications can be easily integrated into, or deployed alongside, existing corporate applications that run in Java/Scala.
