Caffe2 Quick Start Guide

By: Ashwin Nanjappa

Overview of this book

Caffe2 is a popular deep learning library used for fast and scalable training and inference of deep learning models on different platforms. This book introduces the Caffe2 framework and demonstrates how you can leverage its power to build, train, and deploy efficient neural network models at scale. The Caffe2 Quick Start Guide will help you install Caffe2, compose networks using its operators, train models, and deploy models to different architectures. The book will also guide you through importing models from Caffe and other frameworks using the ONNX interchange format. You will then cover deep learning accelerators such as CPUs and GPUs and learn how to deploy Caffe2 models for inference on accelerators using inference engines. Finally, you'll understand how to deploy Caffe2 to a diverse set of hardware, using containers on the cloud and on resource-constrained hardware such as the Raspberry Pi. By the end of this book, you will be able not only to compose and train popular neural network models with Caffe2, but also to deploy them on accelerators, to the cloud, and to resource-constrained platforms such as mobile and embedded hardware.
Table of Contents (9 chapters)

Inference engines

Popular DL frameworks, such as TensorFlow, PyTorch, and Caffe, are designed primarily for training deep neural networks. They focus on features that help researchers experiment easily with different network structures, training regimens, and techniques to achieve the best training accuracy for a particular real-world problem. After a neural network model has been trained successfully, practitioners can continue to use the same DL framework to deploy the trained model for inference. However, more efficient deployment solutions exist for inference. These are pieces of inference software that compile a trained model into a computation engine that delivers the best latency or throughput on the accelerator hardware used for deployment.
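To make the idea of "compiling" a trained model concrete, here is a minimal, hypothetical sketch in Python (not the Caffe2 or any real inference-engine API). It models a trained network as a list of operators and shows one optimization that real inference engines perform: fusing adjacent operators ahead of time so that fewer operations run per inference call. The graph representation, function names, and operators are all illustrative assumptions.

```python
# Toy illustration of what an inference engine's compile step does:
# fold adjacent ("scale", k) operators in a trained graph into one,
# so the deployed engine executes fewer ops per inference.
# All names and the graph format here are hypothetical.

def fuse_scale_ops(graph):
    """Fuse consecutive ("scale", k) ops into a single op (toy operator fusion)."""
    fused = []
    for op, arg in graph:
        if fused and op == "scale" and fused[-1][0] == "scale":
            _, prev_arg = fused.pop()
            fused.append(("scale", prev_arg * arg))  # fold the two constants
        else:
            fused.append((op, arg))
    return fused

def run(graph, x):
    """Interpret the toy graph on a scalar input x."""
    for op, arg in graph:
        if op == "scale":
            x = x * arg
        elif op == "add":
            x = x + arg
    return x

trained = [("scale", 2.0), ("scale", 3.0), ("add", 1.0)]
compiled = fuse_scale_ops(trained)  # three ops reduced to two

# The compiled graph computes the same result with fewer operations.
assert run(trained, 10.0) == run(compiled, 10.0)
```

Production engines apply this idea at a much larger scale (fusing convolution, batch-norm, and activation kernels, folding constants, selecting hardware-specific kernels), but the principle is the same: spend effort once at compile time to minimize work at inference time.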

Much like a C or C++ compiler, inference engines take the trained model...