PyTorch is a Python deep learning framework that has been gaining a lot of traction lately. It is the Python implementation of Torch, which uses Lua. It was developed by Facebook and is fast thanks to GPU-accelerated tensor computations. A major benefit of PyTorch over frameworks with static graphs is that its computation graphs are created on the fly. This means networks are dynamic: you can adjust your network without having to start over, and the graph that is built can be different for each example. PyTorch supports multiple GPUs, and you can manually set which computation is performed on which device (CPU or GPU).
- First, we install PyTorch in our Anaconda environment, as follows:
conda install pytorch torchvision cuda80 -c soumith
If you want to install on another platform, you can have a look at the PyTorch website for clear guidance: http://pytorch.org/.
- Let's import PyTorch into our Python environment:
import torch
- While Keras provides a higher-level abstraction for building neural networks, PyTorch has this feature built in. This means you can build with higher-level building blocks or even write the forward and backward pass manually. In this introduction, we will use the higher-level abstraction. First, we need to set the size of our random training data:
batch_size = 32
input_shape = 5
output_shape = 10
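To make the "manual forward and backward pass" option mentioned in this step more concrete, here is a minimal sketch for a single linear layer with a mean squared error loss. It assumes a PyTorch release (0.4 or later) in which tensors track gradients directly, and it runs on the CPU. The names `w`, `b`, `grad_w`, and `grad_b` are illustrative and not part of the recipe.

```python
import torch

torch.manual_seed(0)

# Sizes reused from the step above
batch_size, input_shape, output_shape = 32, 5, 10

# Random data and the parameters of one linear layer (illustrative names)
x = torch.randn(batch_size, input_shape)
y = torch.randn(batch_size, output_shape)
w = torch.randn(input_shape, output_shape, requires_grad=True)
b = torch.zeros(output_shape, requires_grad=True)

# Forward pass: linear map followed by mean squared error
y_pred = x @ w + b
loss = ((y_pred - y) ** 2).mean()

# Manual backward pass, derived by hand from the loss above
with torch.no_grad():
    grad_y_pred = 2.0 * (y_pred - y) / y_pred.numel()
    grad_w = x.t() @ grad_y_pred
    grad_b = grad_y_pred.sum(dim=0)

# Autograd computes the same gradients, which lets us check the manual ones
loss.backward()
print(torch.allclose(grad_w, w.grad, atol=1e-5))
print(torch.allclose(grad_b, b.grad, atol=1e-5))
```

The higher-level building blocks used in the rest of this recipe hide exactly this bookkeeping.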
- To make use of GPUs, we set the default tensor type as follows:
torch.set_default_tensor_type('torch.cuda.FloatTensor')
This makes newly created tensors CUDA float tensors by default, so that computations on them run on the attached GPU.
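If you would rather choose the device per tensor, or need the code to run on a machine without a GPU, one possible approach is sketched below. It assumes PyTorch 0.4 or later, where `torch.device` and the `device=` argument are available. The variable names are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create one tensor directly on the chosen device and move another one there
a = torch.randn(3, 4, device=device)
b = torch.randn(4, 2).to(device)

# The matrix product is computed on the device that holds both operands
c = a @ b
print(c.device, c.shape)
```

Both operands must live on the same device. Mixing a CPU tensor with a GPU tensor in one operation raises an error.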
- We can use this to generate random training data:
from torch.autograd import Variable

# Random inputs and targets; the targets do not need gradients
x = Variable(torch.randn(batch_size, input_shape))
y = Variable(torch.randn(batch_size, output_shape), requires_grad=False)
- We will use a simple neural network having one hidden layer with 32 units and an output layer:
model = torch.nn.Sequential(
    torch.nn.Linear(input_shape, 32),
    torch.nn.Linear(32, output_shape),
).cuda()
We use the .cuda() method to make sure the model runs on the GPU.
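To see what this model contains, the following sketch lists its parameters and counts them. It builds the same architecture on the CPU, without `.cuda()`, so that it runs on any machine. The variable `n_params` is an illustrative name.

```python
import torch

input_shape, output_shape = 5, 10

# Same architecture as above, kept on the CPU so the sketch runs anywhere
model = torch.nn.Sequential(
    torch.nn.Linear(input_shape, 32),
    torch.nn.Linear(32, output_shape),
)

# Each Linear layer stores a weight of shape (out_features, in_features) and a bias
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

# Total number of trainable values in the model
n_params = sum(p.numel() for p in model.parameters())
print(n_params)
```

The count works out to 5 × 32 + 32 weights and biases for the hidden layer plus 32 × 10 + 10 for the output layer, 522 in total.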
- Next, we define the MSE loss function:
loss_function = torch.nn.MSELoss()
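As a quick check of what this loss computes, the sketch below compares `torch.nn.MSELoss` with the mean of the squared differences written out by hand. The tensor values are made up for illustration, and `loss.item()` assumes PyTorch 0.4 or later.

```python
import torch

loss_function = torch.nn.MSELoss()

# Small hand-picked tensors (illustrative values, not from the recipe)
y_pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y_true = torch.tensor([[1.0, 0.0], [3.0, 8.0]])

# By default, MSELoss averages the squared differences over all elements
loss = loss_function(y_pred, y_true)
manual = ((y_pred - y_true) ** 2).mean()
print(loss.item(), manual.item())
```

Here the squared differences are 0, 4, 0, and 16, so both values equal 20 / 4 = 5.0.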
- We are now ready to start training our model for 10 epochs with the following code:
learning_rate = 0.001
for i in range(10):
    # Forward pass and loss
    y_pred = model(x)
    loss = loss_function(y_pred, y)
    print(loss.item())  # on PyTorch versions before 0.4, use loss.data[0]
    # Zero gradients
    model.zero_grad()
    loss.backward()
    # Update weights
    for param in model.parameters():
        param.data -= learning_rate * param.grad.data
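The manual weight update above is plain gradient descent. As a hedged alternative, the same loop can be written with `torch.optim.SGD`, which applies the identical update rule when no momentum or weight decay is set. The sketch below assumes PyTorch 0.4 or later and runs on the CPU. The list `losses` is an illustrative addition for inspecting progress.

```python
import torch

torch.manual_seed(0)
batch_size, input_shape, output_shape = 32, 5, 10
learning_rate = 0.001

# Random training data, as in the recipe
x = torch.randn(batch_size, input_shape)
y = torch.randn(batch_size, output_shape)

model = torch.nn.Sequential(
    torch.nn.Linear(input_shape, 32),
    torch.nn.Linear(32, output_shape),
)
loss_function = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

losses = []
for i in range(10):
    y_pred = model(x)
    loss = loss_function(y_pred, y)
    losses.append(loss.item())
    optimizer.zero_grad()  # clear gradients from the previous iteration
    loss.backward()        # compute gradients of the loss
    optimizer.step()       # param -= lr * grad for every parameter

print(losses[0], losses[-1])
```

Using an optimizer object makes it straightforward to switch to another update rule later without touching the rest of the loop.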
Note
The PyTorch framework gives a lot of freedom to implement simple neural networks and more complex deep learning models. What we didn't demonstrate in this introduction is the use of dynamic graphs in PyTorch. This is a really powerful feature that we will demonstrate in other chapters of this book.