Hands-On Neural Network Programming with C#

By: Matt Cole

Overview of this book

Neural networks have made a surprise comeback in the last few years and have brought tremendous innovation to the world of artificial intelligence. The goal of this book is to provide C# programmers with practical guidance in solving complex computational challenges using neural networks and C# libraries such as CNTK and TensorFlowSharp. This book will take you on a step-by-step practical journey, covering everything from the mathematical and theoretical aspects of neural networks to building your own deep neural networks into your applications with C# and the .NET framework. The book begins with a quick refresher on neural networks. You will learn how to build a neural network from scratch using packages such as Encog, AForge, and Accord. You will learn about various concepts and techniques, such as deep networks, perceptrons, optimization algorithms, convolutional networks, and autoencoders. You will learn ways to add intelligent features to your .NET apps, such as facial and motion detection, object detection and labeling, language understanding, knowledge, and intelligent search. Throughout this book, you will work on interesting demonstrations that will make it easier to implement complex neural networks in your enterprise applications.

Neural network overview

Let's start by defining exactly what we are going to call a neural network. Let me first note that you may also hear a neural network called an Artificial Neural Network (ANN). Although personally I do not like the term artificial, we'll use those terms interchangeably throughout this book.

"Let's state that a neural network, in its simplest form, is a system comprising several simple but highly interconnected elements, each of which processes information based upon its response to external inputs."

Did you know that neural networks are commonly, but loosely, modeled after the cerebral cortex of a mammalian brain? Why didn't I say that they were modeled after humans? Because many biological and computational studies draw on the brains of rats, monkeys, and, yes, humans. A large neural network may have hundreds or maybe even thousands of processing units, whereas a mammalian brain has billions of neurons. It's the neurons that do the magic, and we could in fact write an entire book on that topic alone.

Here's why I say they do all the magic: If I showed you a picture of Halle Berry, you would recognize her right away. You wouldn't have time to analyze things; you would know based upon a lifetime of collected knowledge. Similarly, if I said the word pizza to you, you would have an immediate mental image and possibly even start to get hungry. How did all that happen just like that? Neurons! Even though the neural networks of today continue to gain in power and speed, they pale in comparison to the ultimate neural network of all time, the human brain. There is so much we do not yet know or understand about this neural network; just wait and see what neural networks will become once we do!

Neural networks are organized into layers made up of interconnected nodes, or neurons (throughout this book we use the terms nodes and neurons interchangeably). Information is presented to the input layer, processed by one or more hidden layers, and then given to the output layer for final (or further) processing. Lather, rinse, repeat!

But what is a neuron, you ask? Using the following diagram, let's state this:

"A neuron is the basic unit of computation in a neural network"

As I mentioned earlier, a neuron is sometimes also referred to as a node or a unit. It receives input from other nodes or external sources and computes an output. Each input has an associated weight (w1 and w2 in the diagram), which is assigned based on its importance relative to the other inputs. The node applies a function f (an activation function, which we will learn more about later on) to the weighted sum of its inputs. Although that is an extreme oversimplification of what a neuron is and what it can do, that's basically it.
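The description above can be sketched in a few lines of code. This is only an illustrative sketch, written in Python for brevity rather than C#; the sigmoid activation, the function names, and the input and weight values are all assumptions chosen for the example, not something prescribed by the text.

```python
import math

def sigmoid(z):
    """Logistic activation (an assumed choice of f): maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, activation=sigmoid):
    """Apply the activation function f to the weighted sum of the inputs."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# Two inputs with hypothetical weights w1 and w2 (values picked for illustration only).
x = [1.0, 0.5]
w = [0.4, -0.8]
y = neuron_output(x, w)  # weighted sum is 0.4 - 0.4 = 0.0, and sigmoid(0.0) is 0.5
```

Swapping in a different activation function only requires passing another callable as `activation`.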

Let's look visually at the progression from a single neuron into a very deep learning network. Here is what a single neuron looks like visually based on our description:

Next, the following diagram shows a very simple neural network comprised of several neurons:

Here is a somewhat more complicated, or deeper, network:
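As a complement to the diagrams, the flow of information from the input layer through a hidden layer to the output layer might be sketched as follows. This is a minimal sketch in Python, assuming a sigmoid activation and made-up weights; the names `layer_forward` and `network_forward` are hypothetical helpers introduced only for this illustration.

```python
import math

def sigmoid(z):
    """Assumed activation function for every neuron in this sketch."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, layer_weights):
    """Compute one layer's outputs; each row holds the incoming weights of one neuron."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row))) for row in layer_weights]

def network_forward(inputs, layers):
    """Feed the inputs through each layer in turn and return the final layer's outputs."""
    activations = inputs
    for layer_weights in layers:
        activations = layer_forward(activations, layer_weights)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden = [[0.5, -0.5], [0.3, 0.8], [-0.2, 0.1]]
output = [[0.7, -0.4, 0.2]]
result = network_forward([1.0, 0.0], [hidden, output])
```

A deeper network is simply a longer list of layers passed to `network_forward`.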

Neural network training

Now that we know what a neural network and neurons are, we should talk about what they do and how they do it. How does a neural network learn? Those of you with children already know the answer to this one. If you want your child to learn what a cat is, what do you do? You show them cats (pictures or real). You want your child to learn what a dog is? Show them dogs. A neural network is conceptually no different. It uses a learning rule that modifies the weights on its connections as inputs pass from the input layer, through the hidden layers and an activation function, so that it will hopefully be able to identify, in our case, cats and dogs. And, if done correctly, the cat does not become a dog!

One of the most common learning rules with neural networks is what is known as the delta rule. This is a supervised rule that is invoked each time the network is presented with another learning pattern. One complete pass through all of the learning patterns is called a cycle or epoch. Each invocation of the rule consists of a forward propagation pass, in which the input pattern flows through the layers to produce an output, followed by a backward propagation pass, in which the error is used to adjust the weights.
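For a single linear neuron, the delta rule is commonly written as: new weight = old weight + learning rate × (target − output) × input. The following Python sketch applies that update to a tiny, made-up data set. The learning rate, the number of passes, the patterns, and the helper names are all assumptions for illustration; a real network with hidden layers needs the backward propagation step mentioned above, which this sketch does not show.

```python
def predict(weights, inputs):
    """Forward step: output of a single linear neuron (weighted sum, no activation)."""
    return sum(w * x for w, x in zip(weights, inputs))

def delta_rule_update(weights, inputs, target, learning_rate):
    """Nudge each weight in proportion to the error and to that weight's input."""
    error = target - predict(weights, inputs)
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]

def train(patterns, weights, learning_rate=0.1, passes=200):
    """Present every pattern repeatedly, invoking the rule once per presentation."""
    for _ in range(passes):
        for inputs, target in patterns:
            weights = delta_rule_update(weights, inputs, target, learning_rate)
    return weights

# Hypothetical supervised patterns generated by target = 2*x1 - 1*x2.
patterns = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
learned = train(patterns, [0.0, 0.0])  # approaches the weights [2.0, -1.0]
```

Because the targets here are an exact linear function of the inputs, the weights settle very close to the values that generated them.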

More simply put, when a neural network is presented with an image, it tries to determine what the answer might be. The difference between the correct answer and the network's guess is the error, or error rate. Training optimizes an objective, which in general can be either minimized or maximized; when the objective is the error rate, we minimize it. In that case, we need the error rate to be as close to 0 as possible for each guess. The closer we are to 0, the closer we are to success.

As we progress, we undertake what is termed gradient descent, meaning we repeatedly adjust the weights in the direction that reduces the error and so continue toward what is called the global minimum, our lowest possible error, which hopefully is tantamount to success.
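To make the idea of descending toward a minimum concrete, here is a minimal Python sketch of gradient descent on a one-weight model. The squared-error function, the input and target values, the learning rate, and the step count are all assumptions chosen so that the arithmetic is easy to follow; with only one weight there is a single minimum, which is not true of real networks in general.

```python
def error(w, x, target):
    """Squared error of a one-weight model whose guess is w * x."""
    return (target - w * x) ** 2

def error_gradient(w, x, target):
    """Derivative of the squared error with respect to w."""
    return -2.0 * x * (target - w * x)

def gradient_descent(w, x, target, learning_rate=0.05, steps=50):
    """Repeatedly step against the gradient, recording the error along the way."""
    history = [error(w, x, target)]
    for _ in range(steps):
        w -= learning_rate * error_gradient(w, x, target)
        history.append(error(w, x, target))
    return w, history

# Hypothetical problem: with x = 2 and target = 6, the error is lowest at w = 3.
w_final, history = gradient_descent(w=0.0, x=2.0, target=6.0)
```

Each step moves the weight a little further downhill, so the recorded error shrinks toward 0 as `w_final` approaches 3.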

Once the network itself is trained, and you are happy, the training cycle can be put to bed and you can move on to the testing cycle. During the testing cycle, only forward propagation is used. The result of this process is the model that will be used for further analysis. Again, no back propagation occurs during testing.
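A testing cycle can be sketched as nothing more than forward passes over examples the network has not been trained on, with the weights left untouched. In this Python sketch, the trained weights, the 0.5 decision threshold, the test patterns, and the helper names are hypothetical values made up for illustration.

```python
def predict_label(weights, inputs, threshold=0.5):
    """Forward pass only: weighted sum compared against an assumed threshold."""
    score = sum(w * x for w, x in zip(weights, inputs))
    return 1 if score >= threshold else 0

def evaluate(weights, test_patterns):
    """Return the fraction of test patterns classified correctly; weights never change."""
    correct = sum(
        1 for inputs, label in test_patterns
        if predict_label(weights, inputs) == label
    )
    return correct / len(test_patterns)

# Hypothetical weights produced by an earlier training cycle.
trained_weights = [0.9, 0.1]
test_patterns = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([0.0, 0.0], 0)]
accuracy = evaluate(trained_weights, test_patterns)
```

Note that nothing in `evaluate` writes to `trained_weights`, which mirrors the point that no back propagation occurs during testing.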

A visual guide to neural networks

In this section, I could type thousands of words trying to describe all of the combinations of neural networks and what they look like. However, no amount of words would do any better than the diagram that follows:

Reprinted with permission, Copyright Asimov Institute
Source: http://www.asimovinstitute.org/neural-network-zoo/

Let's talk about a few of the more common networks from the previous diagram:

  • Perceptron: This is the simplest feed-forward neural network available, and, as you can see, it does not contain any hidden layers:
  • Feed-forward network: This network is perhaps the simplest type of artificial neural network devised. It contains multiple neurons (nodes) arranged in layers. Nodes from adjacent layers have connections or edges between them. Each connection has a weight associated with it:
  • Recurrent neural network (RNN): RNNs are called recurrent because they perform the same task for every element of a sequence, with the output depending on the previous computations. They are also able to look back at previous steps, which form a sort of short-term memory:
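
The short-term memory of a recurrent network can be illustrated with a single recurrent neuron whose output at each step depends on both the current input and its own previous output. This is a minimal Python sketch; the tanh activation, the scalar weights `w_in` and `w_rec`, and the input sequence are assumptions made for the example, and practical RNNs use vectors and weight matrices instead of single numbers.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec):
    """Combine the current input with the previous hidden state."""
    return math.tanh(w_in * x + w_rec * h_prev)

def run_rnn(sequence, w_in=0.5, w_rec=0.8, h0=0.0):
    """Perform the same computation for every element, carrying the state forward."""
    h = h0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec)
        states.append(h)
    return states

# Only the first element is non-zero, yet later states are still non-zero:
# the hidden state carries a fading trace of what the network saw earlier.
states = run_rnn([1.0, 0.0, 0.0])
```

Setting `w_rec` to 0 removes the recurrent connection, and each output then depends only on the current input, as in a feed-forward network.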