Differentiating inputs with Siamese networks

Let's see how the similarity function is implemented through Siamese networks. The idea was first presented in a paper published by Taigman et al. in 2014, DeepFace: Closing the Gap to Human-Level Performance in Face Verification. Afterwards, we will give a slightly more formal definition of how Siamese networks learn.

First, we will continue to use convolutional architectures with many convolutional layers:

These are followed by fully connected layers of neurons, and a softmax layer for the prediction.
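
Such an architecture could be sketched in Deeplearning4j roughly as follows. Note that the layer sizes, the 28 x 28 grayscale input shape, and the hyperparameters here are illustrative assumptions rather than the exact configuration used in this chapter:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class ConvNetSketch {

    public static MultiLayerNetwork build() {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(123)
                .updater(new Adam(1e-3))
                .list()
                // Convolutional layers extract visual features from the image
                .layer(0, new ConvolutionLayer.Builder(5, 5)
                        .nIn(1).nOut(20).stride(1, 1)
                        .activation(Activation.RELU).build())
                .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                .layer(2, new ConvolutionLayer.Builder(5, 5)
                        .nOut(50).stride(1, 1)
                        .activation(Activation.RELU).build())
                .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                // Fully connected layer: its activations will later serve as the encoding of the image
                .layer(4, new DenseLayer.Builder().nOut(128)
                        .activation(Activation.RELU).build())
                // Softmax output layer for the prediction
                .layer(5, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build())
                // Assumed input: 28 x 28 grayscale images
                .setInputType(InputType.convolutionalFlat(28, 28, 1))
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        return model;
    }
}

The dense layer just before the softmax output is the one whose activations we will reuse as the encoding of the image.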

Let's feed in the first image we want to compare, X1:

What we will do is, through a forward pass, grab the activation values of the last fully connected layer, and we will refer to those values as F(x1), or sometimes the encoded values of the image, because we transform the image through the forward pass into another set of activation values...
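
To make this concrete, here is a rough sketch of how such an encoding could be obtained and compared in Deeplearning4j, reusing the ConvNetSketch network from the previous example. We run a forward pass, take the activations of the last fully connected (dense) layer as F(x1) and F(x2), and measure how far apart they are. The layer index and the use of Euclidean distance are assumptions for illustration only:

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

import java.util.List;

public class SiameseEncodingSketch {

    // Forward pass: grab the activations of the last fully connected layer
    // (layer index 4 in ConvNetSketch) and treat them as the encoding F(x).
    static INDArray encode(MultiLayerNetwork model, INDArray image) {
        List<INDArray> activations = model.feedForward(image, false);
        return activations.get(5); // index 0 is the input itself, so layer 4's output is at index 5
    }

    // Compare two encodings with the Euclidean distance:
    // a small distance suggests the same person, a large distance suggests different people.
    static double distance(INDArray fx1, INDArray fx2) {
        return fx1.distance2(fx2);
    }

    public static void main(String[] args) {
        MultiLayerNetwork model = ConvNetSketch.build();
        // x1 and x2 stand in for preprocessed face images (28 x 28 grayscale, flattened)
        INDArray x1 = Nd4j.rand(1, 28 * 28);
        INDArray x2 = Nd4j.rand(1, 28 * 28);
        System.out.println("d(F(x1), F(x2)) = " + distance(encode(model, x1), encode(model, x2)));
    }
}

How this comparison of encodings is actually trained is what the more formal definition of Siamese network learning will cover.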