Caffe2 Quick Start Guide

By: Ashwin Nanjappa

Overview of this book

Caffe2 is a popular deep learning library used for fast, scalable training and inference of deep learning models on different platforms. This book introduces you to the Caffe2 framework and demonstrates how you can use it to build, train, and deploy efficient neural network models at scale. The Caffe2 Quick Start Guide will help you install Caffe2, compose networks using its operators, train models, and deploy models to different architectures. The book also shows you how to import models from Caffe and other frameworks using the ONNX interchange format. You will then cover deep learning hardware such as CPUs and GPUs and learn how to deploy Caffe2 models for inference on accelerators using inference engines. Finally, you'll understand how to deploy Caffe2 to a diverse set of hardware, using containers on the cloud and resource-constrained hardware such as Raspberry Pi. By the end of this book, you will be able not only to compose and train popular neural network models with Caffe2, but also to deploy them on accelerators, to the cloud, and on resource-constrained platforms such as mobile and embedded hardware.
Table of Contents (9 chapters)

Introduction to deep learning

Terms such as artificial intelligence (AI), machine learning (ML), and deep learning (DL) are popular right now. This popularity can be attributed to the significant improvements that deep learning techniques have brought about in the last few years in enabling computers to see, hear, read, and create. First, we'll introduce these three fields and how they intersect:

Figure 1.1: Relationship between deep learning, ML, and AI

AI

Artificial intelligence (AI) is a general term used to refer to the intelligence of computers, specifically their ability to reason, sense, perceive, and respond. It is used to refer to any non-biological system that has intelligence, where this intelligence is a consequence of a set of rules. In AI, it does not matter whether those rules were created manually by a human or learned automatically by a computer analyzing data. Research into AI started in 1956, and it has been through many ups and a couple of downs, called AI winters, since then.

ML

Machine learning (ML) is a subset of AI that uses statistics, data, and learning algorithms to teach computers to learn from given data. This data, called training data, is specific to the problem being solved, and contains examples of input and the expected output for each input. ML algorithms learn models or representations automatically from training data, and these models can be used to obtain predictions for new input data.
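
The idea of learning a model from training data and then using it for prediction can be sketched in a few lines of plain Python. This is a minimal illustration, not Caffe2 code: the training examples are made up (they follow y = 2x + 1), the model is a straight line fitted by ordinary least squares, and the names `fit_line` and `predict` are hypothetical helpers chosen for this example.

```python
# Training data: each example pairs an input with its expected output.
# The values are made up for illustration and follow y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def fit_line(examples):
    """Learn a (slope, intercept) model by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    var_x = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the learned model to predict the output for a new input."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(training_data)      # learning step
prediction = predict(model, 10.0)    # prediction for an input not seen in training
print(model, prediction)
```

The learning algorithm never sees the rule y = 2x + 1 directly. It recovers the slope and intercept from the examples alone, which is the sense in which the model is learned automatically from training data.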

There are many popular types of models in ML, including artificial neural networks (ANNs), Bayesian networks, support vector machines (SVMs), and random forests. The ML model that is of interest to us in this book is the ANN. The structure of ANNs is inspired by the connections in the brain. These neural network models were initially popular in ML, but later fell out of favor since they required enormous computing power that was not available at that time.

Deep learning

Over the last decade, using the parallel processing capability of graphics processing units (GPUs) to solve general computation problems became popular. This type of computation came to be known as general-purpose computing on GPU (GPGPU). GPUs were quite affordable and easy to use as accelerators through GPGPU programming models and APIs such as Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). Starting in 2012, neural network researchers harnessed GPUs to train neural networks with a large number of layers and started to generate breakthroughs in solving computer vision, speech recognition, and other problems. The use of such deep neural networks with a large number of layers of neurons gave rise to the term deep learning. Deep learning algorithms form a subset of ML and use multiple layers of abstraction to learn and parameterize multi-layer neural network models of data.
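
To make the notion of a multi-layer network concrete, the following is a rough sketch of a forward pass through a tiny network with three layers, written in plain Python rather than Caffe2. Everything in it is an assumption made for illustration: the weights and biases are hand-picked instead of learned, ReLU is chosen as the activation function, and `dense`, `relu`, and `forward` are hypothetical helper names.

```python
def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: each row of weights produces one neuron's output."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical hand-picked parameters as (weights, biases) per layer.
# In a real network these values are learned from training data.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),   # layer 1: 2 inputs -> 2 neurons
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, 1.0]),   # layer 2: 2 inputs -> 2 neurons
    ([[2.0, -1.0]], [0.5]),                    # output layer: 2 inputs -> 1 neuron
]


def forward(inputs, layers):
    """Pass the inputs through each layer in turn."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        # Apply the non-linearity on every layer except the output layer.
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


output = forward([2.0, 1.0], layers)
print(output)
```

Each layer transforms the output of the previous one, so adding more entries to `layers` makes the network deeper. The per-layer arithmetic is dominated by multiply-and-add operations that are independent across neurons, which is the kind of work GPUs parallelize well.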