Java Deep Learning Projects

Overview of this book

Java is one of the most widely used programming languages. With the rise of deep learning, it has become a popular choice of tool among data scientists and machine learning experts. Java Deep Learning Projects starts with an overview of deep learning concepts and then delves into advanced projects. You will see how to build several projects using different deep neural network architectures such as multilayer perceptrons, Deep Belief Networks, CNNs, LSTMs, and Factorization Machines. You will get acquainted with popular deep and machine learning libraries for Java such as Deeplearning4j, Spark ML, and RankSys, and you’ll be able to use their features to build and deploy projects on distributed computing environments. You will then explore advanced domains such as transfer learning and deep reinforcement learning using the Java ecosystem, covering various real-world domains such as healthcare, NLP, image classification, and multimedia analytics with an easy-to-follow approach. Expert reviews and tips will follow every project to give you insights and hacks. By the end of this book, you will have stepped up your expertise in deep learning with Java, taken it beyond theory, and be able to build your own advanced deep learning systems.

Delving into deep learning

Simple ML methods that worked well for normal-size data analysis are no longer effective at scale and should be replaced by more robust ML methods. Although classical ML techniques allow researchers to identify groups or clusters of related variables, their accuracy and effectiveness diminish with large and high-dimensional datasets.

This is where deep learning comes in. It is one of the most important developments in artificial intelligence in the last few years. Deep learning is a branch of ML based on a set of algorithms that attempt to model high-level abstractions in data.

How did DL take ML to the next level?

In short, deep learning algorithms are mostly a set of ANNs that can learn better representations of large-scale datasets, so that models built on them capture the structure of the data in depth. Deep learning is no longer limited to ANNs, however, and many theoretical advances and software and hardware improvements were necessary to get to where we are today. In this regard, Ian Goodfellow et al. (Deep Learning, MIT Press, 2016) defined deep learning as follows:

"Deep learning is a particular kind of machine learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones."

Let's take an example. Suppose we want to develop a predictive analytics model, such as an animal recognizer, where our system has to solve two problems:

To classify whether an image represents a cat or a dog
To cluster images of dogs and cats

If we solve the first problem using a typical ML method, we must define the facial features (ears, eyes, whiskers, and so on) and write a method to identify which features (typically nonlinear) are more important when classifying a particular animal.

However, at the same time, we cannot address the second problem, because classical ML algorithms for clustering images (such as k-means) cannot handle nonlinear features. Deep learning algorithms take these two problems one step further: they determine which features matter most for classification or clustering and extract them automatically.

In contrast, when using a classical ML algorithm, we would have to provide the features manually. In summary, the deep learning workflow would be as follows:

A deep learning algorithm would first identify the edges that are most relevant when clustering cats or dogs. It would then try to find various combinations of shapes and edges hierarchically. This step is called ETL.
After several iterations, hierarchical identification of complex concepts and features is carried out. Then, based on the identified features, the DL algorithm automatically decides which of these features are most significant (statistically) to classify the animal. This step is feature extraction.
Finally, it drops the label column and performs unsupervised training using AutoEncoders (AEs) to extract the latent features, which are then fed to k-means for clustering.
Then the clustering assignment hardening loss (CAH loss) and reconstruction loss are jointly optimized towards optimal clustering assignment. Deep Embedding Clustering (see more at https://arxiv.org/pdf/1511.06335.pdf) is an example of such an approach. We will discuss deep learning-based clustering approaches in Chapter 11, Discussion, Current Trends, and Outlook.
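
The final clustering step of this workflow can be sketched in plain Java. This is only a minimal illustration and not the Deep Embedding Clustering method itself: the 2-D points are made-up stand-ins for the latent features an autoencoder would produce, and the class name `LatentKMeansSketch`, the method `cluster`, and the choice to seed centroids from the first k points are assumptions made for this example.

```java
import java.util.Arrays;

final class LatentKMeansSketch {

    /** Runs Lloyd's k-means and returns the cluster index assigned to each point. */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int dim = points[0].length;
        // Assumption: seed the centroids with the first k points (deterministic, illustration only).
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = squaredDistance(points[i], centroids[c]);
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[assignment[i]][j] += points[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep an empty cluster's centroid where it is
                }
                for (int j = 0; j < dim; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return assignment;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical latent features: two well-separated groups, interleaved.
        double[][] latent = {
            {0.1, 0.2}, {0.9, 0.8}, {0.2, 0.1}, {0.8, 0.9}, {0.15, 0.25}, {0.85, 0.95}
        };
        System.out.println(Arrays.toString(cluster(latent, 2, 10)));
    }
}
```

In a real pipeline, the rows of `latent` would come from the encoder half of a trained autoencoder rather than being typed in by hand.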

Up to this point, we have seen that deep learning systems are able to recognize what an image represents. A computer does not see an image as we see it because it only knows the position of each pixel and its color. Using deep learning techniques, the image is analyzed in successive layers.

At a lower level, the software analyzes, for example, a grid of a few pixels with the task of detecting a type of color or its various shades. If it finds something, it informs the next level, which then checks whether or not that color belongs to a larger form, such as a line. The process continues through the upper levels until the system recognizes what is shown in the image. The following diagram shows what we have discussed in the case of an image classification system:

A deep learning system at work on a dog versus cat classification problem

More precisely, the preceding image classifier can be built layer by layer, as follows:

Layer 1: The algorithm starts identifying the dark and light pixels from the raw images
Layer 2: The algorithm then identifies edges and shapes
Layer 3: It then learns more complex shapes and objects
Layer 4: The algorithm then learns which objects define the face of a cat or a dog
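
To make the first two layers concrete, the following sketch shows how an edge can be found from raw pixel intensities by sliding a small filter over the image. It is an illustration under stated assumptions: the 3x3 Sobel kernel is written by hand here, whereas a CNN learns its filters from data, and the tiny grayscale image and the class name `EdgeDetectionSketch` are invented for this example.

```java
import java.util.Arrays;

final class EdgeDetectionSketch {

    // Sobel kernel that responds to left-to-right intensity changes (vertical edges).
    static final int[][] SOBEL_X = {
        {-1, 0, 1},
        {-2, 0, 2},
        {-1, 0, 1}
    };

    /**
     * Slides a 3x3 kernel over the image without padding and returns the
     * response map (cross-correlation, as convolutional layers compute it).
     */
    static int[][] filter(int[][] image, int[][] kernel) {
        int rows = image.length - 2;
        int cols = image[0].length - 2;
        int[][] out = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int sum = 0;
                for (int i = 0; i < 3; i++) {
                    for (int j = 0; j < 3; j++) {
                        sum += kernel[i][j] * image[r + i][c + j];
                    }
                }
                out[r][c] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 4x6 grayscale image: dark left half (0), light right half (9).
        int[][] image = {
            {0, 0, 0, 9, 9, 9},
            {0, 0, 0, 9, 9, 9},
            {0, 0, 0, 9, 9, 9},
            {0, 0, 0, 9, 9, 9}
        };
        for (int[] row : filter(image, SOBEL_X)) {
            System.out.println(Arrays.toString(row)); // prints [0, 36, 36, 0] for each row
        }
    }
}
```

The response is zero over the flat dark and light regions and large only where the two meet, which is the kind of signal a later layer can combine into shapes.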

Although this is a very simple classifier, software capable of this kind of recognition is now widespread; it is found in systems for recognizing faces and in those for searching by image on Google, for example. Such software is based on deep learning algorithms.

By contrast, we cannot build such applications with a linear ML algorithm, since these algorithms are incapable of handling nonlinear image features. Also, with classical ML approaches, we typically handle only a few hyperparameters. However, when neural networks are brought to the party, things become much more complex. A deep network can have millions or even billions of parameters (weights) to learn, and its cost function is non-convex in those parameters.

One reason is that the activation functions used in hidden layers are nonlinear, and composing such layers makes the cost non-convex. We will discuss this phenomenon in more detail in later chapters, but first let's take a quick look at ANNs.
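
The non-convexity of a neural network's cost can be checked numerically on a toy case. The sketch below is an illustration only: the network (a single tanh hidden unit with weights `w1` and `w2`), the inputs, and the class name `NonConvexLossSketch` are assumptions made for this example. A convex function can never be higher at the midpoint of two points than the average of its values at those points, and the code tests exactly that condition.

```java
final class NonConvexLossSketch {

    /** Output of a network with one tanh hidden unit: w2 * tanh(w1 * x). */
    static double predict(double w1, double w2, double x) {
        return w2 * Math.tanh(w1 * x);
    }

    /** Mean squared error against targets generated by the weights (1, 1). */
    static double loss(double w1, double w2) {
        double[] inputs = {-2.0, -1.0, 0.5, 1.0, 2.0}; // hypothetical training inputs
        double sum = 0.0;
        for (double x : inputs) {
            double target = predict(1.0, 1.0, x);
            double error = predict(w1, w2, x) - target;
            sum += error * error;
        }
        return sum / inputs.length;
    }

    public static void main(String[] args) {
        // Two different weight settings that fit the data equally well,
        // because tanh is odd: -tanh(-x) equals tanh(x).
        double lossA = loss(1.0, 1.0);
        double lossB = loss(-1.0, -1.0);
        // The midpoint of those two settings is (0, 0), which predicts 0 everywhere.
        double lossMid = loss(0.0, 0.0);

        System.out.println("loss at (1, 1):   " + lossA);
        System.out.println("loss at (-1, -1): " + lossB);
        System.out.println("loss at midpoint: " + lossMid);
        // Convexity would require lossMid <= (lossA + lossB) / 2.
        System.out.println("violates convexity: " + (lossMid > (lossA + lossB) / 2));
    }
}
```

Both weight settings reach a loss of essentially zero while the point halfway between them does not, so the cost surface has separate minima with a ridge in between. This is why gradient-based training of neural networks is not guaranteed to find a global optimum.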