
Chapter 5: Autoencoders


Activity 8: Modeling Neurons with a ReLU Activation Function

Solution:

  1. Import numpy and matplotlib:

    import numpy as np
    import matplotlib.pyplot as plt
  2. Allow LaTeX symbols to be used in labels:

    plt.rc('text', usetex=True)
  3. Define the ReLU activation function as a Python function:

    def relu(x):
        return np.max((0, x)) # Returns max(0, x) for the scalar inputs used below
  4. Define the inputs (x) and tunable weights (theta) for the neuron. In this example, the inputs (x) will be 100 numbers linearly spaced between -5 and 5. Set theta = 1:

    theta = 1
    x = np.linspace(-5, 5, 100)
    x

    The output is as follows:

    Figure 5.35: Printing the inputs

  5. Compute the output (y):

    y = [relu(_x * theta) for _x in x]
  6. Plot the output of the neuron versus the input:

    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111)
    
    ax.plot(x, y)
    ax.set_xlabel('$x$', fontsize=22);
    ax.set_ylabel(r'$h(x\Theta)$', fontsize=22);
    ax.spines['left'].set_position(('data', 0));
    ax.spines['top'].set_visible(False);
    ax.spines['right'].set_visible(False);
    ax.tick_params(axis='both', which='major', labelsize=22)

    The output is as follows:

    Figure 5.36: Plot of the neuron versus input

  7. Now, set theta = 5 and recompute and store the output of the neuron:

    theta = 5
    y_2 = [relu(_x * theta) for _x in x]
  8. Now, set theta = 0.2 and recompute and store the output of the neuron:

    theta = 0.2
    y_3 = [relu(_x * theta) for _x in x]
  9. Plot the three different output curves of the neuron (theta = 1, theta = 5, theta = 0.2) on one graph:

    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111)
    
    ax.plot(x, y, label=r'$\Theta=1$');
    ax.plot(x, y_2, label=r'$\Theta=5$', linestyle=':');
    ax.plot(x, y_3, label=r'$\Theta=0.2$', linestyle='--');
    ax.set_xlabel(r'$x\Theta$', fontsize=22);
    ax.set_ylabel(r'$h(x\Theta)$', fontsize=22);
    ax.spines['left'].set_position(('data', 0));
    ax.spines['top'].set_visible(False);
    ax.spines['right'].set_visible(False);
    ax.tick_params(axis='both', which='major', labelsize=22);
    ax.legend(fontsize=22);

    The output is as follows:

    Figure 5.37: Three output curves of the neuron

In this activity, we created a model of a ReLU-based artificial neural network neuron. We can see that the output of this neuron is very different from that of the sigmoid activation function. There is no saturation region for values greater than 0, because the function simply returns its input. In the negative direction, there is a saturation region where 0 is returned for any input less than 0. The ReLU function is an extremely powerful and commonly used activation function that has been shown to outperform the sigmoid function in some circumstances, and it is often a good first choice of activation function.
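
To make the contrast with the sigmoid concrete, the following minimal sketch (not part of the activity solution) plots an element-wise ReLU alongside a sigmoid helper defined here purely for illustration:

    import numpy as np
    import matplotlib.pyplot as plt

    def relu(x):
        # Element-wise ReLU: 0 for negative inputs, the identity for positive inputs
        return np.maximum(0, x)

    def sigmoid(x):
        # The sigmoid saturates towards 0 and 1 as |x| grows
        return 1 / (1 + np.exp(-x))

    x = np.linspace(-5, 5, 100)

    fig, ax = plt.subplots(figsize=(10, 7))
    ax.plot(x, relu(x), label='ReLU')
    ax.plot(x, sigmoid(x), label='sigmoid', linestyle='--')
    ax.set_xlabel('x', fontsize=22)
    ax.legend(fontsize=22)
    plt.show()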

Activity 9: MNIST Neural Network

Solution:

In this activity, you will train a neural network to identify images in the MNIST dataset and reinforce your skills in training neural networks:

  1. Import pickle, numpy, matplotlib, and the Sequential and Dense classes from Keras:

    import pickle
    import numpy as np
    import matplotlib.pyplot as plt
    from keras.models import Sequential
    from keras.layers import Dense
  2. Load the mnist.pkl file, which contains the first 10,000 images and corresponding labels from the MNIST dataset that are available in the accompanying source code. The MNIST dataset is a series of 28 x 28 grayscale images of handwritten digits 0 through 9. Extract the images and labels:

    with open('mnist.pkl', 'rb') as f:
        data = pickle.load(f)
        
    images = data['images']
    labels = data['labels']
  3. Plot the first 10 samples along with the corresponding labels:

    plt.figure(figsize=(10, 7))
    for i in range(10):
        plt.subplot(2, 5, i + 1)
        plt.imshow(images[i], cmap='gray')
        plt.title(labels[i])
        plt.axis('off')

    The output is as follows:

    Figure 5.38: First 10 samples

  4. Encode the labels using one hot encoding:

    one_hot_labels = np.zeros((images.shape[0], 10))
    
    for idx, label in enumerate(labels):
        one_hot_labels[idx, label] = 1
        
    one_hot_labels

    The output is as follows:

    Figure 5.39: Result of one hot encoding

  5. Prepare the images for input into a neural network. As a hint, there are two separate steps in this process:

    images = images.reshape((-1, 28 ** 2)) # Flatten each 28 x 28 image into a 784-element vector
    images = images / 255. # Scale the pixel values to the range [0, 1]
  6. Construct a neural network model in Keras that accepts the prepared images, has a hidden layer of 600 units with a ReLU activation function, and has an output layer with as many units as there are classes (10). The output layer uses a softmax activation function:

    model = Sequential([
        Dense(600, input_shape=(784,), activation='relu'),
        Dense(10, activation='softmax'),
    ])
  7. Compile the model using multiclass cross-entropy, stochastic gradient descent, and an accuracy performance metric:

    model.compile(loss='categorical_crossentropy',
                  optimizer='sgd',
                  metrics=['accuracy'])
  8. Train the model. How many epochs are required to achieve at least 95% classification accuracy on the training data? Let's have a look:

    model.fit(images, one_hot_labels, epochs=20)

    The output is as follows:

    Figure 5.40: Training the model

    15 epochs are required to achieve at least 95% classification accuracy on the training set.

In this example, we measured the performance of the neural network classifier using the same data that the classifier was trained on. In general, this method should not be used, as it typically reports a higher level of accuracy than should be expected of the model on unseen data. In supervised learning problems, there are a number of cross-validation techniques that should be used instead. As this is a book on unsupervised learning, cross-validation lies outside its scope.
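
As a rough illustration of why held-out data matters, the following sketch rebuilds the same architecture, trains it on 80% of the samples, and evaluates it on the remaining 20%. It assumes scikit-learn is available for train_test_split, and holdout_model is simply an illustrative name:

    from sklearn.model_selection import train_test_split
    from keras.models import Sequential
    from keras.layers import Dense

    # Hold back 20% of the samples so the model never sees them during training
    train_images, test_images, train_labels, test_labels = train_test_split(
        images, one_hot_labels, test_size=0.2, random_state=42)

    # Rebuild the same architecture so the evaluation is not biased by earlier training
    holdout_model = Sequential([
        Dense(600, input_shape=(784,), activation='relu'),
        Dense(10, activation='softmax'),
    ])
    holdout_model.compile(loss='categorical_crossentropy',
                          optimizer='sgd',
                          metrics=['accuracy'])
    holdout_model.fit(train_images, train_labels, epochs=20)

    # Accuracy on the held-out images is typically lower than the training accuracy
    test_loss, test_acc = holdout_model.evaluate(test_images, test_labels)
    print('Hold-out accuracy: {:.3f}'.format(test_acc))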

Activity 10: Simple MNIST Autoencoder

Solution:

  1. Import pickle, numpy, and matplotlib, and the Model, Input, and Dense classes from Keras:

    import pickle
    import numpy as np
    import matplotlib.pyplot as plt
    from keras.models import Model
    from keras.layers import Input, Dense
  2. Load the images from the supplied sample of the MNIST dataset that is provided with the accompanying source code (mnist.pkl):

    with open('mnist.pkl', 'rb') as f:
        images = pickle.load(f)['images']
  3. Prepare the images for input into a neural network. As a hint, there are two separate steps in this process:

    images = images.reshape((-1, 28 ** 2)) # Flatten each 28 x 28 image into a 784-element vector
    images = images / 255. # Scale the pixel values to the range [0, 1]
  4. Construct a simple autoencoder network that reduces the image size to 10 x 10 after the encoding stage:

    input_stage = Input(shape=(784,))
    encoding_stage = Dense(100, activation='relu')(input_stage)
    decoding_stage = Dense(784, activation='sigmoid')(encoding_stage)
    autoencoder = Model(input_stage, decoding_stage)
  5. Compile the autoencoder using a binary cross-entropy loss function and adadelta gradient descent:

    autoencoder.compile(loss='binary_crossentropy',
                        optimizer='adadelta')
  6. Fit the autoencoder model, using the images as both the input and the target output:

    autoencoder.fit(images, images, epochs=100)

    The output is as follows:

    Figure 5.41: Training the model

  7. Calculate and store the output of the encoding stage for the first five samples:

    encoder_output = Model(input_stage, encoding_stage).predict(images[:5])
  8. Reshape the encoder output to 10 x 10 (10 x 10 = 100) pixels and multiply by 255:

    encoder_output = encoder_output.reshape((-1, 10, 10)) * 255
  9. Calculate and store the output of the decoding stage for the first five samples:

    decoder_output = autoencoder.predict(images[:5])
  10. Reshape the output of the decoder to 28 x 28 and multiply by 255:

    decoder_output = decoder_output.reshape((-1, 28, 28)) * 255
  11. Plot the original images, the encoder output, and the decoder output:

    images = images.reshape((-1, 28, 28))
    plt.figure(figsize=(10, 7))
    for i in range(5):
        plt.subplot(3, 5, i + 1)
        plt.imshow(images[i], cmap='gray')
        plt.axis('off')
        plt.subplot(3, 5, i + 6)
        plt.imshow(encoder_output[i], cmap='gray')
        plt.axis('off')   
        
        plt.subplot(3, 5, i + 11)
        plt.imshow(decoder_output[i], cmap='gray')
        plt.axis('off')    

    The output is as follows:

    Figure 5.42: The original image, the encoder output, and the decoder output

So far, we have shown how a single hidden layer in each of the encoding and decoding stages can be used to reduce the data to a lower-dimensional space. We can also make this model more complex by adding additional layers to both the encoding and decoding stages.
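
As a hedged sketch of that idea (the 256-unit layer size is an illustrative choice, not taken from the activity), a deeper autoencoder could be constructed as follows, reusing the scaled MNIST images from the earlier steps:

    from keras.models import Model
    from keras.layers import Input, Dense

    # Flatten again in case the images were reshaped back to 28 x 28 for plotting
    flat_images = images.reshape((-1, 28 ** 2))

    input_stage = Input(shape=(784,))
    encoding_1 = Dense(256, activation='relu')(input_stage)    # first encoding layer
    encoding_2 = Dense(100, activation='relu')(encoding_1)     # 100-unit (10 x 10) bottleneck
    decoding_1 = Dense(256, activation='relu')(encoding_2)     # first decoding layer
    decoding_2 = Dense(784, activation='sigmoid')(decoding_1)  # reconstruct the 784 pixels

    deep_autoencoder = Model(input_stage, decoding_2)
    deep_autoencoder.compile(loss='binary_crossentropy', optimizer='adadelta')
    deep_autoencoder.fit(flat_images, flat_images, epochs=100)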

Activity 11: MNIST Convolutional Autoencoder

Solution:

  1. Import pickle, numpy, matplotlib, and the Model class from keras.models and import Input, Conv2D, MaxPooling2D, and UpSampling2D from keras.layers:

    import pickle
    import numpy as np
    import matplotlib.pyplot as plt
    from keras.models import Model
    from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
  2. Load the data:

    with open('mnist.pkl', 'rb') as f:
        images = pickle.load(f)['images']
  3. Rescale the images to have values between 0 and 1:

    images = images / 255.
  4. We need to reshape the images to add a single depth channel for use with convolutional stages. Reshape the images to have a shape of 28 x 28 x 1:

    images = images.reshape((-1, 28, 28, 1))
  5. Define an input layer. We will use the same shape input as an image:

    input_layer = Input(shape=(28, 28, 1,))
  6. Add a convolutional stage to the encoder with 16 filters, a 3 x 3 kernel (weight matrix), a ReLU activation function, and same padding, which means the output has the same height and width as the input image:

    hidden_encoding = Conv2D(
        16, # Number of filters in the convolutional layer
        (3, 3), # Size of the convolution kernel (weight matrix)
        activation='relu',
        padding='same', # Output has the same height and width as the input
    )(input_layer)
  7. Add a max pooling layer to the encoder with a 2 x 2 kernel:

    encoded = MaxPooling2D((2, 2))(hidden_encoding)
  8. Add a decoding convolutional layer:

    hidden_decoding = Conv2D(
        16, # Number of filters in the convolutional layer
        (3, 3), # Size of the convolution kernel (weight matrix)
        activation='relu',
        padding='same', # Output has the same height and width as the input
    )(encoded)
  9. Add an upsampling layer:

    upsample_decoding = UpSampling2D((2, 2))(hidden_decoding)
  10. Add the final convolutional stage, using a single filter to match the depth of the original image:

    decoded = Conv2D(
        1, # A single filter to match the depth of the original image
        (3, 3), # Size of the convolution kernel (weight matrix)
        activation='sigmoid',
        padding='same', # Output has the same height and width as the input
    )(upsample_decoding)
  11. Construct the model by passing the first and last layers of the network to the Model class:

    autoencoder = Model(input_layer, decoded)
  12. Display the structure of the model:

    autoencoder.summary()

    The output is as follows:

    Figure 5.43: Structure of model

  13. Compile the autoencoder using a binary cross-entropy loss function and adadelta gradient descent:

    autoencoder.compile(loss='binary_crossentropy',
                        optimizer='adadelta')
  14. Now, let's fit the model; again, we pass the images as the training data and as the desired output. Train for 20 epochs as convolutional networks take a lot longer to compute:

    autoencoder.fit(images, images, epochs=20)

    The output is as follows:

    Figure 5.44: Training the model

  15. Calculate and store the output of the encoding stage for the first five samples:

    encoder_output = Model(input_layer, encoded).predict(images[:5])
  16. Reshape the encoder output for visualization; each encoded sample consists of 16 feature maps of 14 x 14 pixels, which are flattened here into a 196 x 16 image:

    encoder_output = encoder_output.reshape((-1, 14 * 14, 16))
  17. Get the output of the decoder for the first five images:

    decoder_output = autoencoder.predict(images[:5])
  18. Reshape the decoder output to 28 x 28 in size:

    decoder_output = decoder_output.reshape((-1, 28, 28))
  19. Reshape the original images back to 28 x 28 in size:

    images = images.reshape((-1, 28, 28))
  20. Plot the original images, the encoder output, and the decoder output:

    plt.figure(figsize=(10, 7))
    for i in range(5):
        plt.subplot(3, 5, i + 1)
        plt.imshow(images[i], cmap='gray')
        plt.axis('off')
        
        plt.subplot(3, 5, i + 6)
        plt.imshow(encoder_output[i], cmap='gray')
        plt.axis('off')   
        
        plt.subplot(3, 5, i + 11)
        plt.imshow(decoder_output[i], cmap='gray')
        plt.axis('off')        

    The output is as follows:

    Figure 5.45: The original image, the encoder output, and the decoder output

At the end of this activity, you will have developed an autoencoder comprising convolutional layers within the neural network. Note the improvements made in the decoder representations. This architecture offers a significant performance benefit over fully connected neural network layers and is extremely useful when working with image-based datasets and for generating artificial data samples.
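
One way to see where that benefit comes from is to compare parameter counts. The sketch below rebuilds both architectures with the same layer settings used in Activities 10 and 11 and prints the number of trainable parameters via Keras' count_params method; the exact figures depend on the layer settings, but with these choices the convolutional model uses only a few thousand weights versus roughly 150,000 for the fully connected one:

    from keras.models import Model
    from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D

    # Fully connected autoencoder, as in Activity 10
    dense_input = Input(shape=(784,))
    dense_encoded = Dense(100, activation='relu')(dense_input)
    dense_decoded = Dense(784, activation='sigmoid')(dense_encoded)
    dense_autoencoder = Model(dense_input, dense_decoded)

    # Convolutional autoencoder, as in Activity 11
    conv_input = Input(shape=(28, 28, 1))
    conv_hidden_enc = Conv2D(16, (3, 3), activation='relu', padding='same')(conv_input)
    conv_encoded = MaxPooling2D((2, 2))(conv_hidden_enc)
    conv_hidden_dec = Conv2D(16, (3, 3), activation='relu', padding='same')(conv_encoded)
    conv_upsampled = UpSampling2D((2, 2))(conv_hidden_dec)
    conv_decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(conv_upsampled)
    conv_autoencoder = Model(conv_input, conv_decoded)

    print('Fully connected autoencoder parameters:', dense_autoencoder.count_params())
    print('Convolutional autoencoder parameters:  ', conv_autoencoder.count_params())

The convolutional filters are shared across every position in the image, which is why so few weights are needed while the 2D spatial structure of the data is still preserved.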