Deep Learning with R Cookbook

By: Gupta, Ansari, Sarkar
Overview of this book

Deep learning (DL) has evolved in recent years with developments such as generative adversarial networks (GANs), variational autoencoders (VAEs), and deep reinforcement learning. This book will get you up and running with R 3.5.x to help you implement DL techniques. The book starts with the various DL techniques that you can implement in your apps. A unique set of recipes will help you solve binomial and multinomial classification problems, and perform regression and hyperparameter optimization. To help you gain hands-on experience of the concepts, the book features recipes for implementing convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, as well as sequence-to-sequence models and reinforcement learning. You'll then learn about high-performance computing on GPUs and the parallel computation capabilities of R. Later, you'll explore libraries, such as MXNet, that are designed for GPU computing and state-of-the-art DL. Finally, you'll discover how to solve different problems in NLP, object detection, and action identification, before understanding how to use pre-trained models in DL apps. By the end of this book, you'll have comprehensive knowledge of DL and DL packages, and be able to develop effective solutions for different DL problems.
Table of Contents (11 chapters)

Understanding strides and padding

In this recipe, we will learn about two key configuration hyperparameters of CNNs: strides and padding. Strides are mainly used to reduce the size of the output volume, while padding is a technique that lets us preserve the dimensions of the input volume in the output volume, enabling us to extract low-level features efficiently.

Strides: A stride, in simple terms, is the step size of the convolution operation: it specifies the amount by which the filter shifts as it convolves over the input. For example, if we set the strides argument to 1, the filter shifts one unit at a time over the input matrix.

Strides can be used for multiple purposes, primarily the following:

  • To avoid feature overlapping
  • To achieve smaller spatial dimensionality of the output volume

In the following diagram, you can see an example of a convolution operation on 7 × 7 input data with a 3 × 3 filter and a stride of 1:
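As a quick check of this arithmetic, here is a minimal sketch in keras (assuming the keras R package is installed and loaded), showing how the output shrinks as the stride grows:

```r
library(keras)

# 7 x 7 input, 3 x 3 filter, stride 1: output width is (7 - 3) / 1 + 1 = 5
stride1 <- keras_model_sequential() %>%
  layer_conv_2d(filters = 1, kernel_size = c(3, 3), strides = c(1L, 1L),
                input_shape = c(7, 7, 1))

# The same layer with stride 2: output width is (7 - 3) / 2 + 1 = 3
stride2 <- keras_model_sequential() %>%
  layer_conv_2d(filters = 1, kernel_size = c(3, 3), strides = c(2L, 2L),
                input_shape = c(7, 7, 1))

stride1 %>% summary()  # output shape: (None, 5, 5, 1)
stride2 %>% summary()  # output shape: (None, 3, 3, 1)
```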

Padding: For better modeling performance, we need to preserve information about the low-level features of the input volume in the early layers of the network. As we keep applying convolutional layers, the size of the output volume shrinks. In addition, the pixels at the corners of the input matrix are covered by the filter fewer times than the pixels in the middle, so a lot of the information near the edges of the image is thrown away. To avoid this, we use zero padding, which symmetrically pads the input volume with zeros around the border.

There are two types of padding:

  • Valid: With valid padding, the convolutional layer does not pad the input matrix at all, so the size of the output volume keeps decreasing as we add layers.

  • Same: This pads the original input with zeros around the edges of the input matrix before convolving, so that the output has the same spatial dimensions as the input.

In the following screenshot, we can see a pictorial representation of zero padding:
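The two padding modes can be compared on the same input with a minimal sketch (assuming the keras R package is installed and loaded):

```r
library(keras)

# "valid": no padding, so a 3 x 3 filter shrinks a 7 x 7 input to 5 x 5
valid_model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 1, kernel_size = c(3, 3), padding = "valid",
                input_shape = c(7, 7, 1))

# "same": the border is zero-padded, so the output stays 7 x 7
same_model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 1, kernel_size = c(3, 3), padding = "same",
                input_shape = c(7, 7, 1))

valid_model %>% summary()  # output shape: (None, 5, 5, 1)
same_model %>% summary()   # output shape: (None, 7, 7, 1)
```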

Now that we are familiar with the concepts of strides and padding, let's move on to the implementation.

How to do it...

In this section, we will use the same Fashion-MNIST dataset that was used in the Introduction to the convolution operation recipe earlier in this chapter. The data exploration and transformation remain the same, so we jump straight to the model configuration:

  1. Let's define our model with strides and padding:
cnn_model_sp <- keras_model_sequential() %>%
  layer_conv_2d(filters = 8, kernel_size = c(4, 4), activation = 'relu',
                input_shape = c(28, 28, 1),
                strides = c(2L, 2L), padding = "same") %>%
  layer_conv_2d(filters = 16, kernel_size = c(3, 3), activation = 'relu') %>%
  layer_flatten() %>%
  layer_dense(units = 16, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax')

Let's look at the summary of the model:

cnn_model_sp %>% summary()

The following screenshot shows the details about the model created:

  2. After configuring our model, we define its loss function, then compile and train it:
# Loss function (keras passes y_true first, then y_pred)
loss_entropy <- function(y_true, y_pred) {
  loss_categorical_crossentropy(y_true, y_pred)
}

# Compile model
cnn_model_sp %>% compile(
  loss = loss_entropy,
  optimizer = optimizer_sgd(),
  metrics = c('accuracy')
)

# Train model
cnn_model_sp %>% fit(
  x_train, y_train,
  batch_size = 128,
  epochs = 5,
  validation_split = 0.2
)

Let's evaluate the performance of the model on the test data and print the evaluation metrics:

scores <- cnn_model_sp %>% evaluate(x_test, y_test, verbose = 0)

Now we print the model loss and accuracy on the test data:

# Output metrics
cat('Test loss:', scores[[1]], '\n')
cat('Test accuracy:', scores[[2]], '\n')

We can see that the model's accuracy on the test data is around 78%, so it did a good job on the classification task.

How it works...

In the previous recipe, Introduction to the convolution operation, we built a simple CNN model. Apart from the filter size and the number of filters, there are two more parameters of a convolutional layer that can be configured for better feature extraction: strides and padding. In step 1, we passed a vector of two integers (width and height), specifying the strides of the convolution along the width and height. The padding argument takes one of two values, valid or same: valid means no padding, while same means the input and output spatial sizes remain the same. Next, we printed a summary of the model.

The output shape and number of trainable parameters of a convolutional layer can be derived as follows:

  • Output shape: If the input to our convolutional layer is W_in × H_in × D_in and we apply F filters of size F_w × F_h with strides S_w × S_h and padding P, then the output shape W_out × H_out × F is given by the following formula:

    W_out = (W_in − F_w + 2P) / S_w + 1
    H_out = (H_in − F_h + 2P) / S_h + 1

  • The number of parameters in each layer is calculated as follows, where the +1 accounts for the bias of each filter:

    parameters = (F_w × F_h × D_in + 1) × F
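Applying these formulas to the two convolutional layers of the model defined in step 1 reproduces the shapes and parameter counts shown in the model summary (a worked sketch in base R; no packages required):

```r
# Layer 1: 28 x 28 x 1 input, 8 filters of 4 x 4, stride 2, "same" padding.
# With "same" padding, Keras pads so that W_out = ceiling(W_in / S).
out1    <- ceiling(28 / 2)      # 14, so the output volume is 14 x 14 x 8
params1 <- (4 * 4 * 1 + 1) * 8  # 136 weights (including one bias per filter)

# Layer 2: 14 x 14 x 8 input, 16 filters of 3 x 3, stride 1, "valid" padding
out2    <- (14 - 3) / 1 + 1       # 12, so the output volume is 12 x 12 x 16
params2 <- (3 * 3 * 8 + 1) * 16   # 1168
```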

In step 2, we defined the loss function of our model, then compiled and trained it. We then tested the model's performance on the testing dataset and printed the model's loss and accuracy.
