Applied Deep Learning and Computer Vision for Self-Driving Cars

By: Sumit Ranjan, Dr. S. Senthamilarasu

Overview of this book

Thanks to a number of recent breakthroughs, self-driving car technology is now an emerging subject in the field of artificial intelligence and has shifted data scientists' focus to building autonomous cars that will transform the automotive industry. This book is a comprehensive guide to using deep learning and computer vision techniques to develop autonomous cars. Starting with the basics of self-driving cars (SDCs), this book will take you through the deep neural network techniques required to get up and running with building your autonomous vehicle. Once you are comfortable with the basics, you'll delve into advanced computer vision techniques and learn how to use deep learning methods to perform computer vision tasks such as finding lane lines and improving image classification. You will explore the basic structure and workings of a semantic segmentation model and get to grips with detecting cars using semantic segmentation. The book also covers advanced applications such as behavior cloning and vehicle detection using OpenCV, transfer learning, and deep learning methodologies to train SDCs to mimic human driving. By the end of this book, you'll have learned how to implement a variety of neural networks to develop your own autonomous vehicle using modern Python libraries.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Section 1: Deep Learning Foundation and SDC Basics

The Foundation of Self-Driving Cars

Introduction to SDCs

Benefits of SDCs

Advancements in SDCs

Challenges in current deployments

Building safe systems

The cheapest computer and hardware

Software programming

Levels of autonomy

Level 0 – manual cars

Level 1 – driver support

Level 2 – partial automation

Level 3 – conditional automation

Level 4 – high automation

Level 5 – complete automation

Deep learning and computer vision approaches for SDCs

LIDAR and computer vision for SDC vision

Dive Deep into Deep Neural Networks

Diving deep into neural networks

Introduction to neurons

Understanding neurons and perceptrons

The workings of ANNs

Understanding activation functions

The threshold function

The sigmoid function

The rectifier linear function

The hyperbolic tangent activation function

The cost function of neural networks

Understanding hyperparameters

Model training-specific hyperparameters

Number of epochs

Network architecture-specific hyperparameters

Number of hidden layers

L1 and L2 regularization

Activation functions as hyperparameters

TensorFlow versus Keras

Implementing a Deep Learning Model Using Keras

Starting work with Keras

Advantages of Keras

The working principle behind Keras

Building Keras models

The sequential model

The functional model

Types of Keras execution

Keras for deep learning

Building your first deep learning model

Description of the Auto-Mpg dataset

Importing the data

Splitting the data

Standardizing the data

Building and compiling the model

Training the model

Predicting new, unseen data

Evaluating the model's performance

Saving and loading models

Section 2: Deep Learning and Computer Vision Techniques for SDC

Computer Vision for Self-Driving Cars

Introduction to computer vision

Challenges in computer vision

Artificial eyes versus human eyes

Building blocks of an image

Digital representation of an image

Converting images from RGB to grayscale

Road-marking detection

Detection with the grayscale image

Detection with the RGB image

Challenges in color selection techniques

Color space techniques

Introducing the RGB space

Color space manipulation

Introduction to convolution

Sharpening and blurring

Edge detection and gradient calculation

Introducing Sobel

Introducing the Laplacian edge detector

Canny edge detection

Image transformation

Affine transformation

Projective transformation

Image translation

Perspective transformation

Cropping, dilating, and eroding an image

Masking regions of interest

The Hough transform

Finding Road Markings Using OpenCV

Finding road markings in an image

Loading the image using OpenCV

Converting the image into grayscale

Smoothing the image

Canny edge detection

Masking the region of interest

Applying bitwise_and

Applying the Hough transform

Optimizing the detected road markings

Detecting road markings in a video

Improving the Image Classifier with CNN

Images in computer format

The need for CNNs

The intuition behind CNNs

Introducing CNNs

Understanding the convolution layer

Depth, stride, and padding

Fully connected layers

The softmax function

Introduction to handwritten digit recognition

Problem and aim

Loading the data

Reshaping the data

The transformation of data

One-hot encoding the output

Building and compiling our model

Compiling the model

Training the model

Validation versus train loss

Validation versus test accuracy

Saving the model

Visualizing the model architecture

Confusion matrix

The accuracy report

Road Sign Detection Using Deep Learning

Dataset overview

Dataset structure

Loading the data

Image exploration

Data preparation

Section 3: Semantic Segmentation for Self-Driving Cars

The Principles and Foundations of Semantic Segmentation

Introduction to semantic segmentation

Understanding the semantic segmentation architecture

Overview of different semantic segmentation architectures

Implementing Semantic Segmentation

Semantic segmentation in images

Semantic segmentation in videos

Section 4: Advanced Implementations

Behavioral Cloning Using Deep Learning

Neural network for regression

Behavior cloning using deep learning

Data collection

Data preparation

Model development

Evaluating the simulator

Vehicle Detection Using OpenCV and Deep Learning

What makes YOLO different?

The YOLO loss function

The YOLO architecture

Implementation of YOLO object detection

Importing the libraries

Processing the image function

The get class function

Draw box function

Detect image function

Detect video function

Detecting objects in images

Detecting objects in videos

Next Steps

Ultrasonic sensors

Odometric sensors

Introduction to sensor fusion

Other Books You May Enjoy

Leave a review - let other readers know what you think

YOLO v2

YOLO v2 (introduced in the YOLO9000 paper) fine-tunes its classification network at 448x448 instead of the 224x224 resolution used to pretrain the original YOLO classifier. The authors report that this higher resolution improves mAP by almost 4%. YOLO v2 also adds batch normalization to its convolution layers, which improves mAP by roughly a further 2% and helps to regularize the model. The detection of small objects also improves, because the image is divided using a finer 13x13 grid. To obtain good priors (anchors) for the model, YOLO v2 runs k-means clustering on the widths and heights of the training-set bounding boxes. YOLO v2 uses five anchor boxes, as shown in the following image:

Fig 11.3: Anchor boxes

In the preceding image, the boxes in blue are anchor boxes, while the box in red is the ground truth box for the object.
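The anchor selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's or Darknet's implementation: it assumes that each ground truth box is reduced to a (width, height) pair and that the clustering distance is 1 - IoU, as proposed in the YOLO v2 paper. The function names `iou_wh` and `kmeans_anchors` and the randomly generated box sizes are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=5, iters=100, seed=0):
    """Cluster (width, height) pairs using the distance d = 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # start from k randomly chosen boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:  # keep the old centroid if a cluster is empty
                new_centroids.append(centroids[i])
                continue
            mean_w = sum(m[0] for m in members) / len(members)
            mean_h = sum(m[1] for m in members) / len(members)
            new_centroids.append((mean_w, mean_h))
        if new_centroids == centroids:  # assignments have stopped changing
            break
        centroids = new_centroids
    # Sort by area so the smallest anchor comes first.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical box sizes, expressed in grid-cell units of a 13x13 grid.
data_rng = random.Random(1)
boxes = [(data_rng.uniform(0.5, 12.0), data_rng.uniform(0.5, 12.0))
         for _ in range(200)]
anchors = kmeans_anchors(boxes, k=5)
for w, h in anchors:
    print(f"anchor: {w:.2f} x {h:.2f}")
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering error, so the resulting five anchors cover small and large objects more evenly.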

YOLO v2 uses the Darknet-19 architecture as its classification backbone; it has 19 convolution layers, five max-pooling layers, and a softmax layer.
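The five max-pooling layers also explain where the 13x13 grid comes from. The short sketch below assumes that each max-pooling layer halves the width and height and that the convolution layers are padded so that they preserve spatial size. It also assumes a 416x416 detection input, which is the size used in the YOLO v2 paper and is not stated in this section. The helper name `grid_size` is hypothetical.

```python
def grid_size(input_size, num_pool_layers=5):
    """Side length of the output grid after repeated 2x2, stride-2 max pooling."""
    stride = 2 ** num_pool_layers  # five halvings give a total stride of 32
    if input_size % stride != 0:
        raise ValueError(f"input size must be a multiple of {stride}")
    return input_size // stride


side = grid_size(416)            # 416 / 32 = 13
boxes_per_image = side * side * 5  # five anchor boxes per grid cell
print(side, boxes_per_image)
```

With a 416x416 input, the grid has an odd number of cells per side, so a single cell sits at the center of the image, and the model predicts 13 x 13 x 5 = 845 boxes per image.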