
Hands-On Convolutional Neural Networks with TensorFlow

By : Iffat Zafar, Giounona Tzanidou, Richard Burton, Nimesh Patel, Leonardo Araujo

Overview of this book

Convolutional Neural Networks (CNNs) are one of the most popular architectures used in computer vision applications. This book is an introduction to CNNs through solving real-world problems in deep learning while teaching you their implementation in the popular Python library TensorFlow. By the end of the book, you will be training CNNs in no time! We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation. After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks. Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models on large datasets of millions of images.

The session

Now we have constructed all the parts of our computational graph. The very final thing we need to do is create a tf.Session and run our graph. The TensorFlow session is a way to connect your TensorFlow program, written in Python, with the C++ runtime powering TensorFlow. The session also gives TensorFlow access to devices such as CPUs and GPUs present on your local or remote machine. In addition, the session will cache information about the constructed graph so computation can be efficiently run many times.



The standard way to create a session is to use a Python context manager, that is, the with statement block:

with tf.Session() as sess:

The reason for this is that when you create a session, it takes control of CPU, memory, and GPU resources on your computer. When you are finished using your session, you want all these resources to be freed up again, and the easiest way to ensure this is by using a with statement.
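As a minimal, self-contained sketch of this pattern (the constants here are illustrative; the tf.compat.v1 alias is used so the 1.x-style Session API shown in this chapter also runs on TensorFlow 2.x):

```python
import tensorflow.compat.v1 as tf  # 1.x-style API; on TF 1.x, `import tensorflow as tf` works too

tf.disable_eager_execution()  # graphs and sessions require eager execution to be off on TF 2.x

# Build a tiny graph: two constants and their sum.
a = tf.constant(2.0)
b = tf.constant(3.0)
total = a + b

# The with block guarantees the session's CPU/GPU/memory resources
# are released when the block exits, even if an error is raised.
with tf.Session() as sess:
    result = sess.run(total)
    print(result)  # 5.0
```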

The first thing we'll do after creating our session is run our initializer op. You can evaluate nodes and Tensors in a graph by calling sess.run() on the objects you want to evaluate. When you supply part of your graph to sess.run(), TensorFlow will work its way through the graph, evaluating everything that the supplied part depends on in order to produce a result.

So, in our example, calling sess.run() on our initializer op will search back through the graph, find everything that is required to execute the initializer, and then execute these nodes in order. In this case, nothing is connected to the initializer node, so sess.run() will simply execute this one node, which initializes all our Variables.
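A short sketch of what running the initializer looks like (the Variable and the name init here are illustrative, not the book's model; tf.compat.v1 is used for TensorFlow 2.x compatibility):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# A Variable holds state; it must be initialized before it can be read.
w = tf.Variable([1.0, 2.0], name="w")
init = tf.global_variables_initializer()  # an op that initializes every Variable in the graph

with tf.Session() as sess:
    sess.run(init)        # executes only the init node and whatever it depends on
    w_val = sess.run(w)   # now the Variable can be evaluated
    print(w_val)          # [1. 2.]
```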

Now that our variables are initialized, we start the training loop. We will train for 1000 steps, or iterations, so we create a for loop in which our training steps will take place. The number of steps to train for is a hyperparameter of sorts; it is something that we need to decide on when we train our model. There can be trade-offs with the value you choose, and this will be discussed in future chapters. For this problem, 1000 steps will be good enough to get the desired result.

We grab a batch of training data and labels that we will feed into our graph. Next, we call sess.run() again. This time, we call it on two things: the loss and the optimizer. We can supply as many things as we want to evaluate by putting them in a list that we supply to sess.run(). TensorFlow will be smart enough not to evaluate the graph multiple times if it doesn't need to, and it will reuse results that have already been calculated. The list we supply is called our fetches; these are the nodes in the graph that we want to evaluate and fetch.

After the list of fetches, we supply a feed_dict, or feed dictionary. This is a dictionary in which each key is a Tensor in the graph that we will feed values to (in this case, our placeholders), and the corresponding value is the value that will be fed to it.
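To make the fetches and feed_dict mechanics concrete, here is a small sketch (the placeholder and ops are illustrative, not part of the book's model):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# A placeholder is a graph input; its value is supplied at run time via feed_dict.
x = tf.placeholder(tf.float32, shape=[None])
doubled = x * 2.0
summed = tf.reduce_sum(doubled)

with tf.Session() as sess:
    # The fetch list asks for two results in one pass over the graph;
    # the feed_dict maps each placeholder to the value fed into it.
    d_val, s_val = sess.run([doubled, summed], feed_dict={x: [1.0, 2.0, 3.0]})
    print(d_val)  # [2. 4. 6.]
    print(s_val)  # 12.0
```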

The return values of sess.run() correspond to each of the values in our fetch list. Our first fetch is the loss Tensor in our graph, so the first return value comes from this. The second fetch is the optimizer node. We don't care about the value returned from this node, only about the update the optimizer performs, so we discard its corresponding return value with an underscore:

with tf.Session() as sess: 
    sess.run(tf.global_variables_initializer())  # run our initializer op first
    for i in range(1000): 
        batch_x, batch_y = train_data[:,:], train_labels[:,:] 
        loss_val, _ = sess.run([loss, optimizer], feed_dict={x: batch_x, y: batch_y}) 
    print("Train Accuracy:", sess.run(accuracy, feed_dict={x: train_data, y: train_labels})) 
    print("Test Accuracy:", sess.run(accuracy, feed_dict={x: test_data, y: test_labels})) 

After running for 1000 iterations, we use another sess.run() call to fetch the output of our accuracy node. We do this twice: once feeding in our training data to get the accuracy on the training set, and once feeding in our held-out test data to get the accuracy on the test set. You should see a test accuracy of 0.977778 printed out, which means our model correctly classified 44 of the 45 samples in our test set; not too bad at all!