Hands-On Neural Network Programming with C#

By: Matt Cole

Overview of this book

Neural networks have made a surprise comeback in the last few years and have brought tremendous innovation in the world of artificial intelligence. The goal of this book is to provide C# programmers with practical guidance in solving complex computational challenges using neural networks and C# libraries such as CNTK and TensorFlowSharp. This book will take you on a step-by-step practical journey, covering everything from the mathematical and theoretical aspects of neural networks, to building your own deep neural networks into your applications with the C# and .NET frameworks. This book begins by giving you a quick refresher of neural networks. You will learn how to build a neural network from scratch using packages such as Encog, AForge, and Accord. You will learn about various concepts and techniques, such as deep networks, perceptrons, optimization algorithms, convolutional networks, and autoencoders. You will learn ways to add intelligent features to your .NET apps, such as facial and motion detection, object detection and labeling, language understanding, knowledge, and intelligent search. Throughout this book, you will be working on interesting demonstrations that will make it easier to implement complex neural networks in your enterprise applications.

Understanding perceptrons

The most basic element that we will deal with is called the neuron. If we were to take the most basic form of an activation function that a neuron would use, we would have a function that has only two possible results, 1 and 0. Visually, such a function would be represented like this:

This function returns 1 if the input is positive or zero; otherwise, it returns 0. A neuron whose activation function is like this is called a perceptron. It is the simplest form of neural network we could develop. Visually, it looks like the following:

The perceptron follows the feed-forward model, meaning inputs are sent into the neuron, processed, and turned into an output. Inputs come in, and output goes out. Let's use an example.

Let's suppose that we have a single perceptron with two inputs as shown previously. For the purposes of this example, input 0 will be x1 and input 1 will be x2. If we assign values to those two variables, they will look something like this:

Input 0: x1 = 12
Input 1: x2 = 4

Each of those inputs must be weighted, that is, multiplied by some value, which is often a number between -1 and 1. When we create our perceptron, we begin by assigning the inputs random weights. As an example, input 0 (x1) will have a weight we'll label w1, and input 1 (x2) will have a weight we'll label w2. Given this, here's how our weights look for this perceptron:

Weight 0: w1 = 0.5
Weight 1: w2 = -1

Once the inputs are weighted, they need to be summed. Using the previous example, we would have this:

(12 × 0.5) + (4 × -1) = 6 + (-4) = 2

That sum would then be passed through an activation function, which we will cover in much more detail in a later chapter. This would generate the output of the perceptron. The activation function is what will ultimately tell the perceptron whether it is OK to fire, that is, to activate.

Now, for our activation function we will just use a very simple one. If the sum is positive, the output will be 1. If the sum is negative, the output will be -1. It can't get any simpler than that, right?

So, in pseudo code, our algorithm for our single perceptron looks like the following:

  • For every input, multiply that input by its weight
  • Sum all the weighted inputs
  • Compute the output of the perceptron based on that sum passed through an activation function (the sign of the sum)
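
The three steps above can be sketched in code as follows. This is only a minimal illustration, written in Java to match the syntax of the snippets used later in this section; the names WeightedSumSketch, weightedSum, and sign are hypothetical, and treating a sum of exactly zero as +1 is an assumption, since the text only covers positive and negative sums.

```java
// Illustrative sketch only: class and method names are hypothetical.
class WeightedSumSketch {

    // Steps 1 and 2: multiply each input by its weight and sum the results.
    static float weightedSum(float[] inputs, float[] weights) {
        float sum = 0f;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sum;
    }

    // Step 3: the activation is the sign of the sum.
    // Assumption: a sum of exactly 0 maps to +1.
    static int sign(float sum) {
        return sum >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        float[] inputs = {12f, 4f};     // x1 and x2 from the example
        float[] weights = {0.5f, -1f};  // w1 and w2 from the example
        float sum = weightedSum(inputs, weights);
        System.out.println("sum = " + sum + ", output = " + sign(sum));
    }
}
```

With the example values, the weighted sum is 2, which is positive, so the perceptron outputs 1.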

Is this useful?

Yes, in fact it is, and let's show you how. Consider an input vector as the coordinates of a point. For a vector with n elements, the point would lie in an n-dimensional space. Take a sheet of paper, and on this paper, draw a set of points. Now separate those points into two groups with a single straight line. Your piece of paper should now look something like the following:

As you can see, the points are now divided into two sets, one set on each side of the line. If we can take a single line and clearly separate all the points, then those two sets are what is known as linearly separable.

Our single perceptron, believe it or not, will be able to learn where this line is, and when your program is complete, the perceptron will also be able to tell whether a single point is above or below the line (or to the left or the right of it, depending upon how the line was drawn).
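
Before a perceptron can learn where the line is, each training point needs a known answer saying which side of the line it falls on. The sketch below shows one way such an answer could be computed. The line y = 2x + 1 is purely an assumption for illustration (the text does not specify a line), as are the names LineSideSketch, lineY, and sideOfLine and the convention of +1 for "on or above" and -1 for "below".

```java
// Illustrative sketch only: the line and all names are hypothetical.
class LineSideSketch {

    // Hypothetical separating line y = 2x + 1 (not taken from the text).
    static float lineY(float x) {
        return 2f * x + 1f;
    }

    // Returns +1 if the point (x, y) is on or above the line, -1 if it is below.
    static int sideOfLine(float x, float y) {
        return y >= lineY(x) ? 1 : -1;
    }

    public static void main(String[] args) {
        // At x = 0 the line is at y = 1, so (0, 5) is above it.
        System.out.println(sideOfLine(0f, 5f));
        // At x = 3 the line is at y = 7, so (3, 0) is below it.
        System.out.println(sideOfLine(3f, 0f));
    }
}
```

Labels produced this way are the "known answers" a perceptron would be trained against.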

Let's quickly code a Perceptron class, just so it becomes clearer for those of you who love to read code more than words (like me!). The goal will be to create a simple perceptron that can determine which side of the line a point should be on, just like the previous diagram:

class Perceptron {

float[] weights;

The constructor could receive an argument indicating the number of inputs (in this case three: x, y, and a bias) and size the array accordingly:

Perceptron(int n) {
weights = new float[n];
for (int i = 0; i<weights.length; i++) {

The weights are picked randomly to start with:

      weights[i] = (float) (Math.random() * 2 - 1); // random value between -1 and 1
}
}

Next, we'll need a function for the perceptron to receive its inputs, an array that will be the same length as the array of weights, and then return the output value to us. We'll call this feedforward:

int feedforward(float[] inputs) {
float sum = 0;
for (int i = 0; i<weights.length; i++) {
sum += inputs[i]*weights[i];
}

The result is the sign of the sum, which will be either -1 or +1; activate is the simple sign-based activation function described earlier. In this case, the perceptron is attempting to guess which side of the line the point should be on:

 return activate(sum);
}

Thus far, we have a minimally functional perceptron that should be able to make an educated guess as to where our point will lie.

Create the Perceptron:

Perceptron p = new Perceptron(3);

The input is 3 values: x, y, and bias:

float[] point = {5,-2,19};

Obtain the answer:

int result = p.feedforward(point);

The only thing left that will make our perceptron more valuable is the ability to train it rather than have it make educated guesses. We do that by creating a train function such as this:

  1. We will introduce a new variable to control the learning rate:
float c = 0.01f;
  2. We will also provide the inputs and the known answer:
void train(float[] inputs, int desired) {
  3. And we will make an educated guess according to the inputs provided:
  int guess = feedforward(inputs);
  4. We will compute the error, which is the difference between the answer and our guess:
  float error = desired - guess;
  5. And, finally, we will adjust all the weights according to the error and learning constant:
  for (int i = 0; i < weights.length; i++) {
    weights[i] += c * error * inputs[i];
  }
}
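
Putting the pieces of this section together, a complete version might look like the following. Treat it as a sketch under stated assumptions rather than a definitive implementation: it is written in Java to match the snippet syntax above, and the class name PerceptronSketch, the label helper, the line y = 2x + 1, the bias input of 1, the random seed, and the number of training points are all illustrative choices that do not come from the text.

```java
import java.util.Random;

// Illustrative sketch assembling the snippets from this section.
// Assumptions (not from the text): the line y = 2x + 1, a bias input of 1,
// the fixed random seed, and the number of training points.
class PerceptronSketch {
    final float[] weights;
    final float c = 0.01f; // learning rate

    PerceptronSketch(int n, Random rng) {
        weights = new float[n];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = rng.nextFloat() * 2f - 1f; // random starting weight
        }
    }

    // Sign activation; a sum of exactly 0 is treated as +1 (assumption).
    static int activate(float sum) {
        return sum >= 0 ? 1 : -1;
    }

    int feedforward(float[] inputs) {
        float sum = 0f;
        for (int i = 0; i < weights.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return activate(sum);
    }

    // One training step: nudge each weight in proportion to the error and its input.
    void train(float[] inputs, int desired) {
        int guess = feedforward(inputs);
        float error = desired - guess; // 0 when correct, otherwise +2 or -2
        for (int i = 0; i < weights.length; i++) {
            weights[i] += c * error * inputs[i];
        }
    }

    // Known answer for a point: +1 on or above the hypothetical line, -1 below.
    static int label(float x, float y) {
        return y >= 2f * x + 1f ? 1 : -1;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        PerceptronSketch p = new PerceptronSketch(3, rng);
        for (int i = 0; i < 20000; i++) {
            float x = rng.nextFloat() * 20f - 10f;
            float y = rng.nextFloat() * 40f - 20f;
            p.train(new float[] {x, y, 1f}, label(x, y));
        }
        // Two points far from the line; a trained perceptron is expected,
        // though not guaranteed by this sketch, to classify them correctly.
        System.out.println(p.feedforward(new float[] {0f, 15f, 1f}));
        System.out.println(p.feedforward(new float[] {5f, -15f, 1f}));
    }
}
```

Because the guess and the desired answer are both either -1 or +1, the error is 0 when the guess is right and +2 or -2 when it is wrong, so the weights only move on misclassified points.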

So, now that you know and see what a perceptron is, let's add activation functions into the mix and take it to the next level!