Java Deep Learning Cookbook

By Rahul Raj

Overview of this book

Java is one of the most widely used programming languages in the world. With this book, you will see how to perform deep learning using Deeplearning4j (DL4J) – the most popular Java library for training neural networks efficiently. This book starts by showing you how to install and configure Java and DL4J on your system. You will then gain insights into deep learning basics and use your knowledge to create a deep neural network for binary classification from scratch. As you progress, you will discover how to build a convolutional neural network (CNN) in DL4J, and understand how to construct numeric vectors from text. This deep learning book will also guide you through performing anomaly detection on unsupervised data and help you set up neural networks in distributed systems effectively. In addition to this, you will learn how to import models from Keras and change the configuration in a pre-trained DL4J model. Finally, you will explore benchmarking in DL4J and optimize neural networks for the best results. By the end of this book, you will have a clear understanding of how you can use DL4J to build robust deep learning applications in Java.

Combating overfitting problems

As we know, overfitting is a major challenge that machine learning developers face. It becomes a particularly hard problem when the neural network architecture is complex relative to the amount of training data available. While we focus on overfitting, we are not ignoring the possibility of underfitting; we will treat overfitting and underfitting in the same category. Let's discuss how we can combat overfitting problems.

Possible causes of overfitting include, but are not limited to, the following:

  • Too many feature variables compared to the number of data records
  • A complex neural network model

Self-evidently, overfitting reduces the generalization power of the network: when it happens, the network fits the noise instead of the signal. In this recipe, we will walk through the key steps to prevent overfitting problems.

How to do it...

  1. Use KFoldIterator to perform k-fold cross-validation-based resampling:
KFoldIterator kFoldIterator = new KFoldIterator(k, dataSet);
  2. Construct a simpler neural network architecture.
  3. Use enough training data to train the neural network.

How it works...

In step 1, k is an arbitrary number of your choice (5 and 10 are common values) and dataSet is the dataset object that represents your training data. We perform k-fold cross-validation to make the model evaluation process more robust.
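To make step 1 concrete, here is a minimal sketch of a k-fold training and evaluation loop. It assumes the ND4J KFoldIterator API, where next() returns the k-1 training folds and testFold() returns the held-out fold; buildModel() is a hypothetical helper that constructs a fresh MultiLayerNetwork for each fold:

import org.deeplearning4j.eval.Evaluation;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.KFoldIterator;

// k-fold cross-validation: train on k-1 folds, evaluate on the remaining fold
KFoldIterator kFoldIterator = new KFoldIterator(5, dataSet);
while (kFoldIterator.hasNext()) {
    DataSet trainFolds = kFoldIterator.next();    // the k-1 training folds
    DataSet testFold = kFoldIterator.testFold();  // the held-out test fold
    MultiLayerNetwork model = buildModel();       // hypothetical helper: fresh model per fold
    model.fit(trainFolds);
    Evaluation eval = new Evaluation();
    eval.eval(testFold.getLabels(), model.output(testFold.getFeatures()));
    System.out.println(eval.stats());             // per-fold accuracy, precision, recall, and F1
}

Training a fresh model per fold matters: reusing one model across folds would leak information from earlier test folds into later training runs.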

Complex neural network architectures tend to make the network memorize patterns, so the network will have a hard time generalizing to unseen data. For example, it's better and more efficient to have a few hidden layers rather than hundreds of them. That's the relevance of step 2.
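As a rough illustration of step 2, the following is a minimal sketch of a deliberately small DL4J configuration for binary classification. Here, numFeatures is a placeholder for your input width, and the layer size and L2 coefficient are arbitrary example values, not recommendations from this recipe:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// A single small hidden layer plus mild L2 regularization: a simple
// architecture that is less likely to memorize the training data.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .updater(new Adam(1e-3))
        .l2(1e-4)                                  // mild weight decay
        .list()
        .layer(new DenseLayer.Builder()
                .nIn(numFeatures).nOut(16)         // numFeatures: placeholder input width
                .activation(Activation.RELU)
                .build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
                .nIn(16).nOut(1)
                .activation(Activation.SIGMOID)    // single sigmoid unit for binary output
                .build())
        .build();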

A fairly large amount of training data encourages the network to learn better, and batch-wise evaluation of test data increases the generalization power of the network. That's the relevance of step 3. Although there are multiple types of data iterators and various ways to introduce a batch size in an iterator in DL4J, the following is a conventional definition of RecordReaderDataSetIterator:

public RecordReaderDataSetIterator(RecordReader recordReader,
                                   WritableConverter converter,
                                   int batchSize,
                                   int labelIndexFrom,
                                   int labelIndexTo,
                                   int numPossibleLabels,
                                   int maxNumBatches,
                                   boolean regression)
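
In practice, you will often use the shorter convenience constructor for classification. The following sketch assumes a CSV file at data/train.csv with one header row, the label in column 10, and two classes; the path and column indices are placeholders:

import java.io.File;
import org.datavec.api.records.reader.RecordReader;
import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
import org.datavec.api.split.FileSplit;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

// Read CSV records, skipping one header line; fields are comma-separated.
RecordReader recordReader = new CSVRecordReader(1, ',');
// initialize(...) declares checked IOException/InterruptedException
recordReader.initialize(new FileSplit(new File("data/train.csv")));

// Batch size 8; the label sits in column 10; two possible labels (binary classification).
DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader, 8, 10, 2);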

There's more...

When you perform k-fold cross-validation, the data is divided into k subsets. In every iteration, we evaluate by keeping one of the subsets for testing and using the remaining k-1 subsets for training. We repeat this k times. Effectively, every record is used for training at some point, so no data is permanently set aside for testing.

Underfitting is addressed here as well, since no training data is sacrificed for a fixed test split. However, note that we perform the evaluation only k times.

When you perform batch training, the entire dataset is divided into batches of the given batch size. If your dataset has 1,000 records and the batch size is 8, then you have 125 training batches.
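The arithmetic is worth spelling out; a trivial sketch using the example numbers above:

int numRecords = 1000;
int batchSize  = 8;
int numBatches = numRecords / batchSize;  // 125 full batches; any remainder would add one partial batch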

You need to note the training-to-testing ratio as well; according to that ratio, every batch is divided into a training set and a testing set, and the evaluation is performed accordingly. For 8-fold cross-validation, you evaluate the model 8 times, whereas with a batch size of 8 you perform 125 model evaluations.

Note the more rigorous mode of evaluation here: it helps to improve the generalization power of the network, while increasing the chances of underfitting.