Java Deep Learning Cookbook

By: Rahul Raj

Overview of this book

Java is one of the most widely used programming languages in the world. With this book, you will see how to perform deep learning using Deeplearning4j (DL4J) – the most popular Java library for training neural networks efficiently. This book starts by showing you how to install and configure Java and DL4J on your system. You will then gain insights into deep learning basics and use your knowledge to create a deep neural network for binary classification from scratch. As you progress, you will discover how to build a convolutional neural network (CNN) in DL4J, and understand how to construct numeric vectors from text. This deep learning book will also guide you through performing anomaly detection on unsupervised data and help you set up neural networks in distributed systems effectively. In addition, you will learn how to import models from Keras and change the configuration of a pre-trained DL4J model. Finally, you will explore benchmarking in DL4J and tune neural networks for the best results. By the end of this book, you will have a clear understanding of how you can use DL4J to build robust deep learning applications in Java.

Determining the right batch size and learning rates

No single batch size or learning rate works for every model, but we can find good values by experimenting across multiple training runs. The first step is to train the model with a set of candidate batch sizes and learning rates. Judge each run by evaluating additional metrics such as Precision, Recall, and F1 Score, because test scores alone don't confirm the model's performance. Which of these metrics matters most varies with the use case, so analyze your problem statement to decide. In this recipe, we will walk through the key steps to determine the right batch size and learning rate.
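To make the metrics mentioned above concrete, here is a minimal plain-Java sketch that derives Accuracy, Precision, Recall, and F1 Score from binary confusion-matrix counts. It does not use DL4J; the class name `BinaryMetrics` and the counts in `main` are hypothetical and chosen only for illustration.

```java
/** Accuracy, precision, recall, and F1 from binary confusion-matrix counts. */
final class BinaryMetrics {
    private BinaryMetrics() {}

    /** Fraction of all records that were classified correctly. */
    static double accuracy(long tp, long tn, long fp, long fn) {
        long total = tp + tn + fp + fn;
        return total == 0 ? 0.0 : (double) (tp + tn) / total;
    }

    /** Of everything predicted positive, the fraction that really is positive. */
    static double precision(long tp, long fp) {
        return (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    /** Of everything that really is positive, the fraction that was found. */
    static double recall(long tp, long fn) {
        return (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    /** Harmonic mean of precision and recall. */
    static double f1(long tp, long fp, long fn) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Hypothetical counts for an imbalanced dataset of 1,000 records
        // (100 positives, 900 negatives); not taken from the recipe.
        long tp = 20, fn = 80, fp = 10, tn = 890;
        System.out.printf("Accuracy:  %.4f%n", accuracy(tp, tn, fp, fn));
        System.out.printf("Precision: %.4f%n", precision(tp, fp));
        System.out.printf("Recall:    %.4f%n", recall(tp, fn));
        System.out.printf("F1 Score:  %.4f%n", f1(tp, fp, fn));
    }
}
```

With these illustrative counts, accuracy is 91% while F1 Score is only about 0.31, which is the kind of gap that a single score can hide on imbalanced data.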

How to do it...

1. Run the training instance multiple times and track the evaluation metrics.
2. Run experiments by increasing the learning rate and track the results.
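The two steps above can be sketched as a simple sweep over candidate values. This is a plain-Java outline under stated assumptions, not DL4J code: `HyperparameterSweep`, `Result`, and the `trainAndScore` callback are hypothetical names, and the scorer in `main` is a synthetic stand-in. In a real project the callback would train the network with the given batch size and learning rate and return the metric you care about (for example, F1 Score).

```java
import java.util.function.ToDoubleBiFunction;

/** Tries every (batch size, learning rate) pair and keeps the best-scoring one. */
final class HyperparameterSweep {
    private HyperparameterSweep() {}

    /** One configuration together with the score it obtained. */
    static final class Result {
        final int batchSize;
        final double learningRate;
        final double score;

        Result(int batchSize, double learningRate, double score) {
            this.batchSize = batchSize;
            this.learningRate = learningRate;
            this.score = score;
        }
    }

    /**
     * Runs the callback once per combination and returns the combination
     * with the highest score, or null if either array is empty.
     */
    static Result sweep(int[] batchSizes, double[] learningRates,
                        ToDoubleBiFunction<Integer, Double> trainAndScore) {
        Result best = null;
        for (int batchSize : batchSizes) {
            for (double learningRate : learningRates) {
                double score = trainAndScore.applyAsDouble(batchSize, learningRate);
                System.out.printf("batch=%d lr=%.3f score=%.4f%n",
                        batchSize, learningRate, score);
                if (best == null || score > best.score) {
                    best = new Result(batchSize, learningRate, score);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Synthetic placeholder scorer: it only makes the sketch runnable and
        // says nothing about how a real model would behave.
        ToDoubleBiFunction<Integer, Double> placeholder =
                (batch, lr) -> 1.0 / (1.0 + Math.abs(batch - 50) / 50.0 + Math.abs(lr - 0.008));

        Result best = sweep(new int[] {8, 50}, new double[] {0.008, 0.6}, placeholder);
        System.out.printf("Best: batch=%d lr=%.3f%n", best.batchSize, best.learningRate);
    }
}
```

Keeping the training call behind a callback means the same loop can be reused with any metric, which matters because the metric to optimize depends on the use case.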

How it works...

Consider the following experiments to illustrate step 1.

The following training was performed on 10,000 records with a batch size of 8 and a learning rate of 0.008:

The following is the evaluation performed on the same dataset for a batch size of 50 and a learning rate of 0.008:

To perform step 2, we increased the learning rate to 0.6 and observed the results. Note that raising the learning rate beyond a certain limit no longer improves the model. Our job is to find that limit:

You can observe that Accuracy is reduced to 82.40% and F1 Score is reduced to 20.7%. This indicates that F1 Score might be the evaluation metric to focus on for this model. This is not true for all models; we reach this conclusion only after experimenting with a couple of batch sizes and learning rates. In a nutshell, you have to repeat the same process for your own model's training and choose the values that yield the best results.

There's more...

When we increase the batch size, the number of iterations per epoch goes down, and so does the number of evaluations. With a large batch size, this can overfit the data. A batch size of 1 is as useless as a batch size equal to the entire dataset. So, you need to experiment with values, starting from a safe arbitrary point.
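The relationship between batch size and iteration count is simple arithmetic: one epoch takes the number of records divided by the batch size, rounded up. The sketch below shows this for the 10,000-record dataset used in this recipe; the class and method names are hypothetical.

```java
/** Computes how many iterations (mini-batches) one epoch needs. */
final class IterationsPerEpoch {
    private IterationsPerEpoch() {}

    /** Ceiling of records / batchSize, using integer arithmetic only. */
    static long iterationsPerEpoch(long records, int batchSize) {
        if (records < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("records must be >= 0 and batchSize > 0");
        }
        return (records + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long records = 10_000;
        for (int batchSize : new int[] {1, 8, 50, 10_000}) {
            System.out.printf("batch size %5d -> %5d iterations per epoch%n",
                    batchSize, iterationsPerEpoch(records, batchSize));
        }
    }
}
```

For 10,000 records, a batch size of 8 gives 1,250 iterations per epoch and a batch size of 50 gives 200, while the two extremes (1 and 10,000) give 10,000 iterations and a single iteration respectively.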

A very small learning rate leads to very slow convergence toward the target, which also lengthens the training time. A very large learning rate causes divergent behavior in the model. We need to keep increasing the learning rate for as long as the evaluation metrics keep getting better. There is an implementation of a cyclic learning rate in the fast.ai and Keras libraries; however, a cyclic learning rate is not implemented in DL4J.
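The effect of the learning rate can be seen on a toy problem. The following sketch runs plain gradient descent on f(w) = w², whose minimum is at w = 0, starting from w = 1. It is a simplified illustration, not a neural network and not DL4J code; the function, starting point, and learning rates are assumptions chosen to show slow convergence, good convergence, and divergence.

```java
/** Gradient descent on f(w) = w^2 to illustrate the effect of the learning rate. */
final class LearningRateDemo {
    private LearningRateDemo() {}

    /** Starts at w = 1.0 and applies w <- w - learningRate * f'(w) for the given steps. */
    static double descend(double learningRate, int steps) {
        double w = 1.0;
        for (int i = 0; i < steps; i++) {
            double gradient = 2.0 * w; // derivative of w^2
            w -= learningRate * gradient;
        }
        return w;
    }

    public static void main(String[] args) {
        int steps = 50;
        for (double learningRate : new double[] {0.001, 0.1, 1.1}) {
            System.out.printf("lr=%.3f -> w after %d steps = %.6g%n",
                    learningRate, steps, descend(learningRate, steps));
        }
    }
}
```

Each step multiplies w by (1 - 2 × learning rate). With 0.001 the value barely moves in 50 steps, with 0.1 it shrinks quickly toward 0, and with 1.1 the factor has magnitude 1.2, so w grows without bound.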
