In step 1, we initialized a sequential model by calling the keras_model_sequential() function. In step 2, we stacked the hidden and output layers using a series of layer functions. The layer_dense() function adds a densely connected layer to the model. The first layer of a sequential model needs to know what input shape to expect, so we passed a value to the input_shape argument of the first layer; in our case, the input shape was equal to the number of features in the dataset. Note that when we add layers to a keras sequential model, the model object is modified in place, so we do not need to assign the updated object back to the original. This behavior is unlike that of most R objects, which are typically immutable.

For our model, we used the relu activation function. The layer_activation() function creates an activation layer that applies the specified activation to the output of the preceding hidden layer. We can also use other activation functions, such as leaky ReLU, softmax, and more (activation functions are discussed in the Implementing a single-layer neural network recipe). No activation was applied in the output layer of our model. A minimal sketch of this model definition is shown below.
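The following sketch assumes a dataset with four features; the layer width of 16 units and the single-unit output are illustrative placeholders, not values from the recipe:

```r
library(keras)

# Initialize an empty sequential model
model <- keras_model_sequential()

# Layers modify the model in place; no reassignment is needed
model %>%
  layer_dense(units = 16, input_shape = 4) %>%  # first layer declares the input shape
  layer_activation("relu") %>%                  # applies relu to the previous layer's output
  layer_dense(units = 1)                        # output layer, no activation
```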
We can also implement an activation function for each layer by passing a value to the activation argument of the layer_dense() function, instead of adding an activation layer explicitly. In that case, the layer applies the following operation:
output = activation(dot(input, kernel) + bias)
Here, activation refers to the element-wise activation function that is passed, kernel is the weights matrix created by the layer, and bias is the bias vector created by the layer.
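For example, the model sketched earlier could be written more compactly as follows (again, the unit counts and input shape are illustrative assumptions):

```r
# Equivalent to the explicit layer_activation() version above
model <- keras_model_sequential()
model %>%
  layer_dense(units = 16, activation = "relu", input_shape = 4) %>%
  layer_dense(units = 1)
```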
To train a model, we need to configure the learning process, which we did in step 3 using the compile() function. In our training process, we applied a stochastic gradient descent optimizer to find the weights and biases that minimize our objective loss function; that is, the mean squared error. The metrics argument specifies the metric(s) to be evaluated by the model during training and testing.
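A compile() call matching this description might look like the following sketch; the mean absolute error metric is an assumption chosen for illustration:

```r
model %>% compile(
  optimizer = optimizer_sgd(),          # stochastic gradient descent
  loss = "mse",                         # mean squared error objective
  metrics = c("mean_absolute_error")    # evaluated during training and testing
)
```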
In step 4, we looked at the summary of the model; it showed us information about each layer, such as the output shape of each layer and the number of parameters in each layer.
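In the R interface, this is a plain summary() call on the model object:

```r
summary(model)  # prints layer types, output shapes, and parameter counts
```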
In the last step, we trained our model for a fixed number of iterations over the dataset. The epochs argument defines the number of complete passes through the training data. The validation_split argument takes a float value between 0 and 1 and specifies the fraction of the training data to be set aside as validation data. Finally, batch_size defines the number of samples that propagate through the network before the weights are updated.
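A fit() call along these lines might look as follows; x_train and y_train are hypothetical training data, and the epoch, batch size, and split values are placeholders:

```r
history <- model %>% fit(
  x_train, y_train,
  epochs = 30,             # number of passes over the training data
  batch_size = 32,         # samples per gradient update
  validation_split = 0.2   # hold out 20% of the training data for validation
)
```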