Deep Learning By Example

Implementing the fish recognition/detection model

To demonstrate the power of machine learning, and of deep learning in particular, we are going to implement a fish recognition example. No understanding of the inner details of the code is required; the point of this section is to give you an overview of a typical machine learning pipeline.

Our knowledge base for this task will be a set of images, each of them labeled as either opah or tuna. For this implementation, we are going to use one of the deep learning architectures that made a breakthrough in imaging and computer vision in general. This architecture is called a Convolutional Neural Network (CNN). It is a family of deep learning architectures that use the convolution operation from image processing to extract features from images that can explain the object we want to classify. For now, you can think of it as a magic box that takes our images and learns from them how to distinguish between our two classes (opah and tuna); we will then test the learning process of this box by feeding it unlabeled images and seeing whether it can tell which type of fish is in each image.

Different types of learning will be addressed in a later section, so you will understand later on why our fish recognition task falls under the supervised learning category.

In this example, we will be using Keras. For the moment, you can think of Keras as an API that makes building and using deep learning models much easier than writing them from scratch. So let's get started! From the Keras website we have:

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
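
To get a feel for what this looks like in practice, here is a minimal, self-contained sketch (not part of the fish pipeline, with made-up data and layer sizes chosen purely for illustration) of defining, compiling, and fitting a tiny fully connected network in Keras:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense

# Made-up data: 100 samples with 20 features each, and a binary label per sample
X = np.random.rand(100, 20)
y = np.random.randint(2, size=(100, 1))

# A tiny fully connected network: 20 inputs -> 16 hidden units -> 1 output
model = Sequential()
model.add(Dense(16, activation='relu', input_dim=20))
model.add(Dense(1, activation='sigmoid'))

# Compile with an optimizer and a loss function, then train on the made-up data
model.compile(optimizer='sgd', loss='binary_crossentropy')
model.fit(X, y, batch_size=10, verbose=0)

The same pattern of stacking layers, compiling, and fitting is what we will use, on a larger scale, for the fish recognition model later in this chapter.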

Knowledge base/dataset

As we mentioned earlier, we need a historical base of data that will be used to teach the learning algorithm about the task that it's supposed to do later. But we also need another dataset for testing its ability to perform the task after the learning process. So to sum up, we need two types of datasets during the learning process:

  1. The first one is the knowledge base where we have the input data and their corresponding labels such as the fish images and their corresponding labels (opah or tuna). This data will be fed to the learning algorithm to learn from it and try to discover the patterns/trends that will help later on for classifying unlabeled images.
  2. The second one is mainly for testing the ability of the model to apply what it learned from the knowledge base to unlabeled images or unseen data, in general, and see if it's working well.

As you can see, we only have labeled data to use as a knowledge base for our learning method; all of the data we have at hand has the correct output associated with it. So we need to somehow create data that does not have any correct output associatedated with it, which is the data we are going to apply the model to.

While performing data science, we'll be doing the following:

  • Training phase: We present the data from our knowledge base and train our learning method/model by feeding it the input data along with the correct output.
  • Validation/test phase: In this phase, we measure how well the trained model is doing. We also use different performance measures, depending on the type of model, to evaluate it (the R-squared score for regression, classification error for classifiers, recall and precision for IR models, and so on).

The validation/test phase is usually split into two steps:

  1. In the first step, we use different learning methods/models and choose the best performing one based on our validation data (validation step)
  2. Then we measure and report the accuracy of the selected model based on the test set (test step)

Now let's see how we obtain the data to which we are going to apply the model in order to see how well trained it is.

Since we don't have any training samples without the correct output, we can create some from the original training samples that we will be using. So we can split our data samples into three different sets (as shown in Figure 1.9):

  • Train set: This will be used as a knowledge base for our model. Usually, this will be 70% of the original data samples.
  • Validation set: This will be used to choose the best performing model among a set of models. Usually this will be 10% of the original data samples.
  • Test set: This will be used to measure and report the accuracy of the selected model. Usually, it will be as big as the validation set.
Figure 1.9: Splitting data into train, validation, and test sets

In case you have only one learning method, you can drop the validation set and re-split your data into train and test sets only. Usually, data scientists use a 75/25 or a 70/30 split. The sketch below shows one way to carve out these portions in code.
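
As a concrete illustration, here is a minimal NumPy sketch (the array names and sizes are made up for illustration) of the three-way 70/10/20 split described above; dropping the validation portion gives the simpler two-way split just mentioned:

import numpy as np

# Made-up labeled dataset: 1000 samples, 10 features each, one label per sample
samples = np.random.rand(1000, 10)
labels = np.random.randint(2, size=1000)

# Shuffle the sample indices so the split is random
indices = np.random.permutation(len(samples))
train_end = int(0.7 * len(samples))       # first 70% of indices -> train set
validation_end = int(0.8 * len(samples))  # next 10% -> validation set, rest -> test set

train_idx = indices[:train_end]
validation_idx = indices[train_end:validation_end]
test_idx = indices[validation_end:]

train_X, train_y = samples[train_idx], labels[train_idx]
validation_X, validation_y = samples[validation_idx], labels[validation_idx]
test_X, test_y = samples[test_idx], labels[test_idx]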

Data analysis pre-processing

In this section, we are going to analyze and preprocess the input images and put them into an acceptable format for our learning algorithm, which here is a convolutional neural network.

So let's start off by importing the required packages for this implementation:

import numpy as np
np.random.seed(2018)
import os
import glob
import cv2
import datetime
import pandas as pd
import time
import warnings
warnings.filterwarnings("ignore")
# Note: sklearn.cross_validation was renamed to sklearn.model_selection in newer scikit-learn releases
from sklearn.cross_validation import KFold
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten
# Note: Convolution2D and the dim_ordering/nb_epoch arguments used below follow the Keras 1.x API
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD
from keras.callbacks import EarlyStopping
from keras.utils import np_utils
from sklearn.metrics import log_loss
from keras import __version__ as keras_version

In order to use the images provided in the dataset, we need to resize them to the same size. OpenCV is a good choice for doing this; from the OpenCV website:

OpenCV (Open Source Computer Vision Library) is released under a BSD license and hence it’s free for both academic and commercial use. It has C++, C, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. OpenCV was designed for computational efficiency and with a strong focus on real-time applications. Written in optimized C/C++, the library can take advantage of multi-core processing. Enabled with OpenCL, it can take advantage of the hardware acceleration of the underlying heterogeneous compute platform.
You can install OpenCV by using the Python package manager and issuing pip install opencv-python
# Parameters
# ----------
# img_path : path
#     path of the image to be resized
def resize_image(img_path):
    # Reading the image file
    img = cv2.imread(img_path)
    # Resize the image to be 32 by 32
    img_resized = cv2.resize(img, (32, 32), interpolation=cv2.INTER_LINEAR)
    return img_resized

Now we need to load all the training samples of our dataset and resize each image using the previous function. So we are going to implement a function that will load the training samples from the different folders that we have for each fish type:

# Loading the training samples and their corresponding labels
def load_training_samples():
    # Variables to hold the training input and output variables
    train_input_variables = []
    train_input_variables_id = []
    train_label = []
    # Scanning all images in each folder of a fish type
    print('Start Reading Train Images')
    folders = ['ALB', 'BET', 'DOL', 'LAG', 'NoF', 'OTHER', 'SHARK', 'YFT']
    for fld in folders:
        folder_index = folders.index(fld)
        print('Load folder {} (Index: {})'.format(fld, folder_index))
        imgs_path = os.path.join('..', 'input', 'train', fld, '*.jpg')
        files = glob.glob(imgs_path)
        for file in files:
            file_base = os.path.basename(file)
            # Resize the image
            resized_img = resize_image(file)
            # Appending the processed image to the input/output variables of the classifier
            train_input_variables.append(resized_img)
            train_input_variables_id.append(file_base)
            train_label.append(folder_index)
    return train_input_variables, train_input_variables_id, train_label

As we discussed, we have a test set that will act as the unseen data to test the generalization ability of our model. So we need to do the same with the testing images: load them and resize them:

def load_testing_samples():
    # Scanning images from the test folder
    imgs_path = os.path.join('..', 'input', 'test_stg1', '*.jpg')
    files = sorted(glob.glob(imgs_path))
    # Variables to hold the testing samples
    testing_samples = []
    testing_samples_id = []
    # Processing the images and appending them to the array that we have
    for file in files:
        file_base = os.path.basename(file)
        # Image resizing
        resized_img = resize_image(file)
        testing_samples.append(resized_img)
        testing_samples_id.append(file_base)
    return testing_samples, testing_samples_id

Now we need to wrap the previous function in another one that uses load_training_samples() to load and resize the training samples. It also adds a few lines of code to convert the training data into NumPy format, reshape that data to fit our classifier, and finally convert it to float:

def load_normalize_training_samples():
    # Calling the load function in order to load and resize the training samples
    training_samples, training_samples_id, training_label = load_training_samples()
    # Converting the loaded and resized data into NumPy format
    training_samples = np.array(training_samples, dtype=np.uint8)
    training_label = np.array(training_label, dtype=np.uint8)
    # Reshaping the training samples to the channels-first format expected by the model
    training_samples = training_samples.transpose((0, 3, 1, 2))
    # Converting the training samples into float format and scaling to [0, 1]
    training_samples = training_samples.astype('float32')
    training_samples = training_samples / 255
    # One-hot encoding the 8 fish classes
    training_label = np_utils.to_categorical(training_label, 8)
    return training_samples, training_label, training_samples_id

We also need to do the same with the test set:

def load_normalize_testing_samples():
    # Calling the load function in order to load and resize the testing samples
    testing_samples, testing_samples_id = load_testing_samples()
    # Converting the loaded and resized data into NumPy format
    testing_samples = np.array(testing_samples, dtype=np.uint8)
    # Reshaping the testing samples to the channels-first format expected by the model
    testing_samples = testing_samples.transpose((0, 3, 1, 2))
    # Converting the testing samples into float format and scaling to [0, 1]
    testing_samples = testing_samples.astype('float32')
    testing_samples = testing_samples / 255
    return testing_samples, testing_samples_id

Model building

Now it's time to create the model. As we mentioned, we are going to use a deep learning architecture called a CNN as the learning algorithm for this fish recognition task. Again, you are not required to understand any of the previous or upcoming code in this chapter, as we are only demonstrating how complex data science tasks can be solved with just a few lines of code, with the help of Keras and TensorFlow as a deep learning platform.

Also note that CNNs and other deep learning architectures will be explained in greater detail in later chapters:

Figure 1.10: CNN architecture

So let's go ahead and write a function that builds the CNN architecture to be used in our fish recognition task:

def create_cnn_model_arch():
    pool_size = 2  # we will use 2x2 pooling throughout
    conv_depth_1 = 32  # we will initially have 32 kernels per conv. layer...
    conv_depth_2 = 64  # ...switching to 64 after the first pooling layer
    kernel_size = 3  # we will use 3x3 kernels throughout
    drop_prob = 0.5  # dropout in the FC layers with probability 0.5
    hidden_size = 32  # each FC layer will have 32 neurons
    num_classes = 8  # there are 8 fish types
    # Conv [32] -> Conv [32] -> Pool
    cnn_model = Sequential()
    cnn_model.add(ZeroPadding2D((1, 1), input_shape=(3, 32, 32), dim_ordering='th'))
    cnn_model.add(Convolution2D(conv_depth_1, kernel_size, kernel_size, activation='relu',
                                dim_ordering='th'))
    cnn_model.add(ZeroPadding2D((1, 1), dim_ordering='th'))
    cnn_model.add(Convolution2D(conv_depth_1, kernel_size, kernel_size, activation='relu',
                                dim_ordering='th'))
    cnn_model.add(MaxPooling2D(pool_size=(pool_size, pool_size), strides=(2, 2),
                               dim_ordering='th'))
    # Conv [64] -> Conv [64] -> Pool
    cnn_model.add(ZeroPadding2D((1, 1), dim_ordering='th'))
    cnn_model.add(Convolution2D(conv_depth_2, kernel_size, kernel_size, activation='relu',
                                dim_ordering='th'))
    cnn_model.add(ZeroPadding2D((1, 1), dim_ordering='th'))
    cnn_model.add(Convolution2D(conv_depth_2, kernel_size, kernel_size, activation='relu',
                                dim_ordering='th'))
    cnn_model.add(MaxPooling2D(pool_size=(pool_size, pool_size), strides=(2, 2),
                               dim_ordering='th'))
    # Now flatten to 1D, apply FC layers with ReLU (and dropout), and finally a softmax output layer
    cnn_model.add(Flatten())
    cnn_model.add(Dense(hidden_size, activation='relu'))
    cnn_model.add(Dropout(drop_prob))
    cnn_model.add(Dense(hidden_size, activation='relu'))
    cnn_model.add(Dropout(drop_prob))
    cnn_model.add(Dense(num_classes, activation='softmax'))
    # Initiating the stochastic gradient descent optimiser
    stochastic_gradient_descent = SGD(lr=1e-2, decay=1e-6, momentum=0.9, nesterov=True)
    cnn_model.compile(optimizer=stochastic_gradient_descent,  # using the SGD optimiser
                      loss='categorical_crossentropy')  # using the cross-entropy loss function
    return cnn_model

Before starting to train the model, we need a model assessment and validation method to help us evaluate the model and see its generalization ability. For this, we are going to use a method called k-fold cross-validation. Again, you are not required to understand this method or how it works, as we will explain it in much more detail later.
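
As a rough sketch of the idea, k-fold cross-validation simply generates k different train/validation index splits over the same samples. The snippet below uses the same sklearn.cross_validation.KFold class imported earlier (in newer scikit-learn versions this class lives in sklearn.model_selection and takes the data through its split() method instead), with a made-up number of samples:

from sklearn.cross_validation import KFold

number_of_samples = 100  # made-up number of training samples, purely for illustration
kf = KFold(number_of_samples, n_folds=5, shuffle=True, random_state=51)

# Each iteration yields a different train/validation split of the same sample indices
for train_index, validation_index in kf:
    print('train size: {}, validation size: {}'.format(len(train_index), len(validation_index)))

Each fold trains a fresh model on its train indices and evaluates it on its validation indices, which is exactly what the function below does for our CNN.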

So let's start and create a function that will help us assess and validate the model:

def create_model_with_kfold_cross_validation(nfolds=10):
    batch_size = 16  # in each iteration, we consider 16 training examples at once
    num_epochs = 30  # we iterate 30 times over the entire training set
    random_state = 51  # control the randomness for reproducibility of the results on the same platform
    # Loading and normalizing the training samples prior to feeding them to the created CNN model
    training_samples, training_samples_target, training_samples_id = load_normalize_training_samples()
    yfull_train = dict()
    # Providing training/validation indices to split the training samples
    # into nfolds consecutive folds with shuffling
    kf = KFold(len(training_samples_id), n_folds=nfolds, shuffle=True, random_state=random_state)
    fold_number = 0  # initial value for the fold number
    sum_score = 0  # overall score (will be incremented at each iteration)
    trained_models = []  # storing the model of each iteration over the folds
    # Getting the training/validation samples based on the indices generated by KFold
    for train_index, test_index in kf:
        cnn_model = create_cnn_model_arch()
        training_samples_X = training_samples[train_index]  # Getting the training input variables
        training_samples_Y = training_samples_target[train_index]  # Getting the training output/label variable
        validation_samples_X = training_samples[test_index]  # Getting the validation input variables
        validation_samples_Y = training_samples_target[test_index]  # Getting the validation output/label variable
        fold_number += 1
        print('Fold number {} from {}'.format(fold_number, nfolds))
        callbacks = [
            EarlyStopping(monitor='val_loss', patience=3, verbose=0),
        ]
        # Fitting the CNN model given the defined settings
        cnn_model.fit(training_samples_X, training_samples_Y, batch_size=batch_size,
                      nb_epoch=num_epochs, shuffle=True, verbose=2,
                      validation_data=(validation_samples_X, validation_samples_Y),
                      callbacks=callbacks)
        # Measuring the generalization ability of the trained model based on the validation set
        predictions_of_validation_samples = cnn_model.predict(validation_samples_X.astype('float32'),
                                                              batch_size=batch_size, verbose=2)
        current_model_score = log_loss(validation_samples_Y, predictions_of_validation_samples)
        print('Current model score log_loss: ', current_model_score)
        # Incrementing the sum_score value by the current model's calculated score
        sum_score += current_model_score * len(test_index)
        # Store valid predictions
        for i in range(len(test_index)):
            yfull_train[test_index[i]] = predictions_of_validation_samples[i]
        # Store the trained model
        trained_models.append(cnn_model)
    # Computing the overall (sample-weighted average) score across the folds
    overall_score = sum_score / len(training_samples)
    print("Log_loss train independent avg: ", overall_score)
    # Reporting the model loss at this stage
    overall_settings_output_string = 'loss_' + str(overall_score) + '_folds_' + str(nfolds) + '_ep_' + str(num_epochs)
    return overall_settings_output_string, trained_models

Now, after building the model and using the k-fold cross-validation method to assess and validate it, we need to report the results of the trained model over the test set. To do this, we are going to reuse the models trained on the different folds, but this time applying them to the test set and averaging their predictions, to see how well our trained model generalizes.

So let's define the function that will take the trained CNN models as an input and then test them using the test set that we have:

def test_generality_crossValidation_over_test_set(overall_settings_output_string, cnn_models):
    batch_size = 16  # in each iteration, we consider 16 testing examples at once
    fold_number = 0  # fold iterator
    number_of_folds = len(cnn_models)  # Creating the number of folds based on the value used in the training step
    yfull_test = []  # variable to hold the overall predictions for the test set
    # Executing the actual cross-validation test process over the test set
    for j in range(number_of_folds):
        model = cnn_models[j]
        fold_number += 1
        print('Fold number {} out of {}'.format(fold_number, number_of_folds))
        # Loading and normalizing the testing samples
        testing_samples, testing_samples_id = load_normalize_testing_samples()
        # Calling the current model over the current test fold
        test_prediction = model.predict(testing_samples, batch_size=batch_size, verbose=2)
        yfull_test.append(test_prediction)
    # Averaging the fold predictions and formatting the results
    # (merge_several_folds_mean and format_results_for_types are helper functions defined in the Appendix)
    test_result = merge_several_folds_mean(yfull_test, number_of_folds)
    overall_settings_output_string = 'loss_' + overall_settings_output_string + '_folds_' + str(number_of_folds)
    format_results_for_types(test_result, testing_samples_id, overall_settings_output_string)
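
The merge_several_folds_mean() and format_results_for_types() helpers are part of the full code in the Appendix. Just to convey the idea of the merging step, here is a hypothetical sketch of averaging the per-fold predictions (the Appendix implementation may differ):

import numpy as np

# Hypothetical sketch: element-wise average of the per-fold prediction arrays
def merge_several_folds_mean(fold_predictions, number_of_folds):
    merged = np.array(fold_predictions[0], dtype=np.float64)
    for fold_prediction in fold_predictions[1:]:
        merged += np.array(fold_prediction, dtype=np.float64)
    return (merged / number_of_folds).tolist()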

Model training and testing

Now we are ready to start the model training phase by calling the main function create_model_with_kfold_cross_validation() for building and training the CNN model using 10-fold cross-validation; then we can call the testing function to measure the model's ability to generalize to the test set:

if __name__ == '__main__':
    info_string, models = create_model_with_kfold_cross_validation()
    test_generality_crossValidation_over_test_set(info_string, models)

Fish recognition – all together

After explaining the main building blocks for our fish recognition example, we are ready to see all the code pieces connected together and see how we managed to build such a complex system with just a few lines of code. The full code is placed in the Appendix section of the book.