Deep Learning By Example

Understanding data science by an example

To illustrate the life cycle and challenges of building a learning algorithm for specific data, let us consider a real example. The Nature Conservancy is working with fishing companies and other partners to monitor fishing activities and preserve fisheries for the future. The conservancy is looking to use cameras to scale up this monitoring process, but the amount of data produced by deploying these cameras will be cumbersome and very expensive to process manually. So the conservancy wants to develop a learning algorithm that automatically detects and classifies different species of fish, to speed up the video reviewing process.

Figure 1.1 shows a sample of images taken by conservancy-deployed cameras. These images will be used to build the system.

Figure 1.1: Sample of the conservancy-deployed cameras' output

Our aim in this example is to separate the different species, such as tuna and sharks, that fishing boats catch. As an illustrative example, we can limit the problem to only two classes: tuna and opah.

Figure 1.2: Tuna fish type (left) and opah fish type (right)

After limiting our problem to only two types of fish, we can take a sample of random images from our collection and note some physical differences between the two types. For example, consider the following physical differences:

  • Length: Compared to the opah fish, the tuna fish is longer
  • Width: The opah fish is wider than the tuna fish
  • Color: The opah fish tends to be red, whereas the tuna fish tends to be blue and white

We can use these physical differences as features that help our learning algorithm (classifier) differentiate between these two types of fish.
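
To make this concrete, here is a minimal sketch of how each fish could be represented as a feature vector that a classifier can consume. The feature names and values here are hypothetical, chosen only for illustration:

import numpy as np

# Hypothetical feature vectors: [length (cm), width (cm), mean redness (0-1)].
# Each row describes one fish from our labeled collection.
features = np.array([
    [90.0, 25.0, 0.2],   # a tuna: long, narrow, bluish
    [60.0, 45.0, 0.8],   # an opah: shorter, wider, reddish
])
labels = np.array(["tuna", "opah"])

for x, y in zip(features, labels):
    print(f"length={x[0]}, width={x[1]}, redness={x[2]} -> {y}")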

Explanatory features of an object are something we use in daily life to discriminate between the objects that surround us; even babies use such features to learn about their environment. The same goes for data science: in order to build a learned model that can discriminate between different objects (for example, fish types), we need to give it some explanatory features to learn from (for example, fish length). To make the model more certain and reduce confusion errors, we can (to some extent) increase the number of explanatory features of the objects.

Given that there are physical differences between the two types of fish, the two fish populations have different models or descriptions. The ultimate goal of our classification task is for the classifier to learn these different models; then, given an image of one of the two types as input, the classifier will classify it by choosing the model (tuna model or opah model) that best corresponds to this image.

In this case, the collection of tuna and opah fish images will act as the knowledge base for our classifier. Initially, the knowledge base (training samples) will be labeled/tagged, so for each image you will know beforehand whether it's a tuna or an opah. The classifier will use these training samples to model the different types of fish, and then we can use the output of the training phase to automatically label unlabeled/untagged fish that the classifier didn't see during training. This kind of unlabeled data is often called unseen data. The training phase of the life cycle is shown in the following diagram:

Figure 1.3: Training phase life cycle

Supervised data science is all about learning from historical data with a known target or output, such as the fish type, and then using this learned model to predict cases or data samples for which we don't know the target/output.

Let's have a look at how the training phase of the classifier will work:

  • Pre-processing: In this step, we will segment the fish from the image using a relevant segmentation technique.
  • Feature extraction: After segmenting the fish from the image by subtracting the background, we will measure the physical differences (length, width, color, and so on) of each fish. In the end, you will get something like Figure 1.4.

Finally, we will feed this data into the classifier in order to model different fish types.
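
As a rough sketch of what these two steps might look like in code, the following uses a naive brightness threshold as the segmentation step, purely for illustration; a real system would use a proper segmentation technique:

import numpy as np

def extract_features(image, background_level=0.2):
    # Pre-processing: segment the fish from the (darker) background
    # with a naive brightness threshold.
    mask = image > background_level
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    # Feature extraction: use the segmented region's extent as rough
    # proxies for length and width, plus its mean brightness.
    length = int(cols.sum())    # horizontal extent in pixels
    width = int(rows.sum())     # vertical extent in pixels
    brightness = float(image[mask].mean()) if mask.any() else 0.0
    return np.array([length, width, brightness])

# Toy example: a bright rectangular "fish" on a dark background.
img = np.zeros((100, 100))
img[40:60, 10:90] = 0.9
print(extract_features(img))    # -> [80. 20. 0.9]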

As we have seen, we can visually differentiate between tuna and opah fish based on the physical differences (features) that we proposed, such as length, width, and color.

We can use the length feature to differentiate between the two types of fish: we can try to classify a fish by observing its length and checking whether it exceeds some threshold value (length*) or not.

So, based on our training sample, we can derive the following rule:

If length(fish) > length* then label(fish) = Tuna
Otherwise label(fish) = Opah

In order to find this length*, we can take length measurements from our training samples. Suppose we take these measurements and obtain the following histogram:

Figure 1.4: Histogram of the length measurements for the two types of fish

In this case, we can derive a rule based on the length feature to differentiate between the tuna and opah fish. In this particular example, we can tell that length* is 7, so we can update the preceding rule:

If length(fish) > 7 then label(fish) = Tuna
Otherwise label(fish) = Opah
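
A minimal sketch of this rule in code follows; the training lengths are hypothetical, and the threshold length* is derived as the midpoint between the two class means rather than hardcoded:

import numpy as np

# Hypothetical length measurements from labeled training samples.
tuna_lengths = np.array([8.0, 9.0, 10.0, 8.5, 9.5])
opah_lengths = np.array([4.5, 5.0, 5.5, 4.0, 6.0])

# One simple choice of length*: the midpoint between the class means.
length_star = (tuna_lengths.mean() + opah_lengths.mean()) / 2
print(length_star)              # -> 7.0 for this data

def classify_by_length(length):
    # If length(fish) > length* then Tuna, otherwise Opah.
    return "Tuna" if length > length_star else "Opah"

print(classify_by_length(9.0))  # -> Tuna
print(classify_by_length(5.5))  # -> Opah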

As you may notice, this is not a promising result because of the overlap between the two histograms: the length feature alone is not a perfect one for differentiating between the two types. So we can try to incorporate more features, such as the width, and then combine them. If we manage to measure the width of our training samples as well, we might get a histogram like the following:

Figure 1.5: Histogram of the width measurements for the two types of fish

As you can see, depending on only one feature will not give accurate results, and the resulting model will make many misclassifications. Instead, we can combine the two features and come up with something that looks more reasonable.

If we combine both features, we might get something that looks like the following graph:

Figure 1.6: Combining the length and width measurements for the two types of fish

Combining the readings of the length and width features, we get a scatter plot like the one in the preceding graph, with red dots representing the tuna fish and green dots representing the opah fish. The black line is a suggested rule, or decision boundary, for differentiating between the two types of fish.

For example, if the reading for a fish falls above this decision boundary, it is classified as a tuna fish; otherwise, it is predicted to be an opah fish.
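
Here is a minimal sketch of such a linear decision boundary in code; the weight and bias values are hypothetical, chosen only to illustrate the above-the-line-means-tuna rule:

import numpy as np

# A hypothetical decision boundary w . x + b = 0 in (length, width) space.
w = np.array([1.0, -1.5])       # weights for [length, width]
b = -2.0                        # intercept

def classify(length, width):
    # The sign of the score tells us which side of the line we are on.
    score = w @ np.array([length, width]) + b
    return "Tuna" if score > 0 else "Opah"

print(classify(10.0, 3.0))      # 10 - 4.5 - 2 =  3.5 -> Tuna
print(classify(6.0, 5.0))       #  6 - 7.5 - 2 = -3.5 -> Opah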

We could try to increase the complexity of the rule to avoid any errors on the training data and get a decision boundary like the one in the following graph:

Figure 1.7: Increasing the complexity of the decision boundary to avoid misclassifications over the training data

The advantage of this model is that we get almost zero misclassifications over the training samples. But this is not actually the objective of data science. The objective of data science is to build a model that is able to generalize and perform well over unseen data. In order to find out whether our model will generalize, we introduce a new phase called the testing phase, in which we give the trained model an unlabeled image and expect it to assign the correct label (tuna or opah).

Data science's ultimate objective is to build a model that will work well in production, not over the training set. So don't be happy when you see your model performing well on the training set, like the one in Figure 1.7; most likely, this kind of model will fail to recognize the fish type in new images. This phenomenon of having your model work well only over the training set is called overfitting, and many practitioners fall into this trap.
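
One common way to expose this trap is to hold out some labeled data as a test set and compare training accuracy against test accuracy; a large gap signals overfitting. The following is a minimal sketch using scikit-learn on synthetic data, where the features, labels, and choice of model are all illustrative assumptions rather than the book's pipeline:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, overlapping (length, width) measurements for two classes.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal([9.0, 3.0], 1.5, (100, 2)),   # "tuna"
               rng.normal([6.0, 5.0], 1.5, (100, 2))])  # "opah"
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

for depth in (None, 2):  # None = unconstrained (complex), 2 = simple
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train={model.score(X_train, y_train):.2f}, "
          f"test={model.score(X_test, y_test):.2f}")
# The unconstrained tree typically reaches ~1.00 on the training set
# but scores lower on the test set: the signature of overfitting.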

Instead of coming up with such a complex model, you can derive a less complex one that will generalize in the testing phase. The following graph shows the use of a less complex model that makes fewer misclassification errors and generalizes well over the unseen data:

Figure 1.8: Using a less complex model in order to be able to generalize over the testing samples (unseen data)