Deep Learning By Example

Overview of this book

Deep learning is a popular subset of machine learning, and it allows you to build complex models that are faster and give more accurate predictions. This book is your companion for taking your first steps into the world of deep learning, with hands-on examples to boost your understanding of the topic. It starts with a quick overview of the essential data science and machine learning concepts that you need to get started with deep learning. It then introduces you to TensorFlow, the most widely used machine learning library for training deep learning models. You will work on your first deep learning problem by training a deep feed-forward neural network for digit classification, and move on to tackle other real-world problems in computer vision, language processing, sentiment analysis, and more. Advanced deep learning models such as generative adversarial networks and their applications are also covered. By the end of this book, you will have a solid understanding of the essential concepts in deep learning. With the help of the examples and code provided, you will be equipped to train your own deep learning models with more confidence.

Design procedure of data science algorithms

Different learning systems usually follow the same design procedure. They start by acquiring the knowledge base and selecting the relevant explanatory features from the data, then try several candidate learning algorithms while monitoring the performance of each one, and finish with the evaluation process, which measures how successful the training was.

In this section, we are going to address all these different design steps in more detail:

Figure 1.11: Model learning process outline

Data pre-processing

This component of the learning cycle represents the knowledge base of our algorithm. To help the learning algorithm make accurate decisions about unseen data, we need to provide this knowledge base in the best possible form. Our data may therefore need a lot of cleaning and pre-processing (conversions).

Data cleaning

Most datasets require this step, in which you get rid of errors, noise, and redundancies. We need our data to be accurate, complete, reliable, and unbiased, because many problems can arise from using a bad knowledge base, such as:

  • Inaccurate and biased conclusions
  • Increased error
  • Reduced generalizability, which is the model's ability to perform well over the unseen data that it didn't train on previously
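
The cleaning step described above can be sketched in a few lines of plain Python. This is a minimal illustration only: the records, the `clean` helper, and the assumed valid age range of 0 to 120 are all hypothetical and not taken from the book.

```python
# Hypothetical raw records: (age, weight_kg); None marks a missing value.
raw_records = [
    (25, 70.0),
    (25, 70.0),   # exact duplicate (redundancy)
    (31, None),   # missing value (incomplete record)
    (-4, 65.0),   # impossible age (error)
    (47, 82.5),
]


def clean(records):
    """Drop duplicates, incomplete rows, and rows with an invalid age."""
    seen = set()
    cleaned = []
    for record in records:
        age, weight = record
        if record in seen:
            continue  # redundancy: this exact record was already kept
        if age is None or weight is None:
            continue  # incomplete record
        if not 0 <= age <= 120:
            continue  # assumed validity range, chosen for this illustration
        seen.add(record)
        cleaned.append(record)
    return cleaned


cleaned_records = clean(raw_records)
print(cleaned_records)  # [(25, 70.0), (47, 82.5)]
```

Real datasets usually need domain-specific rules for what counts as an error or as noise, so the checks above are only placeholders for that kind of logic.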

Data pre-processing

In this step, we apply conversions to our data to make it consistent and usable by the learning algorithm. There are many different conversions that you can consider while pre-processing your data:

  • Renaming (relabeling): Converting categorical values to numbers, because some learning methods cannot work with categorical values directly. Be aware that numbers impose an order between the values, which may not exist in the original categories
  • Rescaling (normalization): Transforming/bounding continuous values to some range, typically [-1, 1] or [0, 1]
  • New features: Making up new features from the existing ones. For example, obesity-factor = weight/height
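
The three conversions listed above could look roughly like this in plain Python. The helper names `relabel` and `rescale` and the sample values are assumptions made for this sketch, not code from the book.

```python
def relabel(values):
    """Map each distinct category to an integer code (sorted for determinism)."""
    codes = {category: index for index, category in enumerate(sorted(set(values)))}
    return [codes[value] for value in values], codes


def rescale(values, low=0.0, high=1.0):
    """Min-max rescale continuous values to the range [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]  # constant column: avoid dividing by zero
    return [low + (v - v_min) * (high - low) / (v_max - v_min) for v in values]


# Hypothetical sample columns.
colors = ["red", "green", "blue", "green"]
weights = [60.0, 80.0, 100.0]
heights = [1.5, 1.6, 2.0]

encoded_colors, color_codes = relabel(colors)             # renaming
weights_01 = rescale(weights)                             # rescaling to [0, 1]
weights_pm1 = rescale(weights, low=-1.0, high=1.0)        # rescaling to [-1, 1]
obesity_factor = [w / h for w, h in zip(weights, heights)]  # new feature

print(color_codes)   # {'blue': 0, 'green': 1, 'red': 2}
print(weights_01)    # [0.0, 0.5, 1.0]
print(weights_pm1)   # [-1.0, 0.0, 1.0]
```

Note that the integer codes make `blue < green < red`, which is exactly the artificial ordering that relabeling can introduce.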

Feature selection

The number of explanatory features (input variables) of a sample can be enormous: each training sample (observation/example) is a vector $x_i = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{id})$, and d is very large. An example is a document classification task in which the vocabulary contains 10,000 different words and the input variables are the numbers of occurrences of each word.

This enormous number of input variables can be problematic, and sometimes a curse, because we have many input variables and few training samples to help us in the learning procedure. To avoid this curse of dimensionality, data scientists use dimensionality reduction techniques to select a subset of the input variables or to combine them into a smaller set. For example, in the text classification task they can do the following:

  • Extracting relevant inputs (for instance, mutual information measure)
  • Principal component analysis (PCA)
  • Grouping (cluster) similar words (this uses a similarity measure)
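
As a rough illustration of the first option, the sketch below scores each word by the mutual information between its presence in a document and the class label, then keeps the highest-scoring words. The toy corpus, the `mutual_information` helper, and the choice to keep three words are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (document text, class label).
documents = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("monday lunch meeting now", "ham"),
]


def mutual_information(word, docs):
    """Mutual information (in bits) between word presence and the class label."""
    n = len(docs)
    joint = Counter((word in text.split(), label) for text, label in docs)
    present = Counter(word in text.split() for text, _ in docs)
    labels = Counter(label for _, label in docs)
    score = 0.0
    for (is_present, label), count in joint.items():
        p_joint = count / n
        p_independent = (present[is_present] / n) * (labels[label] / n)
        score += p_joint * math.log2(p_joint / p_independent)
    return score


vocabulary = sorted({word for text, _ in documents for word in text.split()})
scores = {word: mutual_information(word, documents) for word in vocabulary}

# Keep only the most informative words as input variables.
top_words = sorted(vocabulary, key=lambda w: scores[w], reverse=True)[:3]
print(top_words)
```

In this corpus a word such as "cheap" appears only in one class and gets the maximum score of 1 bit, while "now" appears in both classes and scores lower, so it is a weaker candidate feature.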

Model selection

This step comes after selecting a proper subset of your input variables by using a dimensionality reduction technique. Choosing a proper subset of the input variables makes the rest of the learning process much simpler.

In this step, you are trying to figure out the right model to learn.

If you have prior experience with data science and with applying learning methods to different domains and kinds of data, you will find this step easy. It requires knowing how your data looks and which assumptions fit its nature, and you choose the learning method based on that. If you don't have any prior knowledge, that's also fine: you can try different learning methods with different parameter settings and choose the one that gives the best performance on the test set.

Also, initial data analysis and visualization will help you to make a good guess about the form of the distribution and nature of your data.
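
The trial-and-error approach can be sketched as a loop over candidate settings. The example below is an assumption-laden illustration, not the book's code: it uses a simple one-dimensional k-nearest-neighbour regressor as the learning method, hypothetical data, and a small held-out set to compare three values of `k`.

```python
def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def held_out_error(train, held_out, k):
    """Mean squared error of the k-NN predictor on the held-out samples."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in held_out]
    return sum(errors) / len(errors)


# Hypothetical data that roughly follows y = 2x.
train = [(0, 0.0), (1, 2.5), (2, 3.5), (3, 6.0), (4, 8.5), (5, 9.5)]
held_out = [(1.5, 3.0), (3.5, 7.0)]

candidate_ks = [1, 2, 4]  # the parameter settings we want to try
errors = {k: held_out_error(train, held_out, k) for k in candidate_ks}
best_k = min(errors, key=errors.get)

print(errors)
print(best_k)
```

The same loop works for entirely different learning methods, as long as each candidate can be trained on the training samples and scored on data it has not seen.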

Learning process

By learning, we mean the optimization criterion that you are going to use to select the best model parameters. There are various criteria to choose from:

  • Mean square error (MSE)
  • Maximum likelihood (ML) criterion
  • Maximum a posteriori probability (MAP)

The optimization problem may be hard to solve, but the right choice of model and error function makes a difference.
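
To make the first criterion concrete, here is a minimal sketch that fits a straight line y = w*x + b by gradient descent on the mean square error. The data, the learning rate, and the number of steps are illustrative assumptions, and the model is deliberately simple enough that the optimization is easy.

```python
def mse(w, b, data):
    """Mean square error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fit_linear(data, learning_rate=0.05, steps=2000):
    """Minimize the MSE with plain gradient descent, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free samples of y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_linear(data)
print(round(w, 4), round(b, 4), mse(w, b, data))
```

For a linear model with Gaussian noise of fixed variance, minimizing the MSE gives the same parameters as the maximum likelihood criterion, which is one reason the two are often mentioned together.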

Evaluating your model

In this step, we try to measure the generalization error of our model on unseen data. Since we only have the data at hand and cannot know the unseen data beforehand, we can randomly select a test set from the data and never use it in the training process, so that it acts as valid unseen data. There are different ways you can evaluate the performance of the selected model:

  • Simple holdout method, which is dividing the data into training and testing sets
  • Other complex methods, based on cross-validation and random subsampling
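
Both evaluation styles come down to how the data is partitioned. The sketch below shows one possible way to produce a simple holdout split and k-fold cross-validation splits with the standard library only. The function names, the 25% test fraction, and the fixed seed are assumptions for illustration.

```python
import random


def holdout_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_splits(samples, k=4, seed=0):
    """Yield (train, test) pairs so that each sample is tested exactly once."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test_fold


samples = list(range(12))  # stand-ins for real observations

train_set, test_set = holdout_split(samples)
print(len(train_set), len(test_set))  # 9 3

for train_fold, test_fold in k_fold_splits(samples, k=4):
    print(len(train_fold), len(test_fold))  # 9 3 for each of the 4 folds
```

With cross-validation, the model is trained and scored once per fold and the scores are averaged, which uses every sample for testing at the cost of training the model k times.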

Our objective in this step is to compare the predictive performance of different models trained on the same data and choose the one with the smaller testing error, which serves as our estimate of the generalization error on unseen data. You can also be more certain about the generalization error by using a statistical method to test the significance of your results.