OpenCV 3.x with Python By Example - Second Edition

By: Gabriel Garrido Calvo, Prateek Joshi

Overview of this book

Computer vision is found everywhere in modern technology. OpenCV for Python enables us to run computer vision algorithms in real time. With the advent of powerful machines, we have more processing power to work with. Using this technology, we can seamlessly integrate our computer vision applications into the cloud. Focusing on OpenCV 3.x and Python 3.6, this book will walk you through the building blocks needed to build computer vision applications. We start off by manipulating images using simple filtering and geometric transformations. We then discuss affine and projective transformations and see how we can use them to apply advanced manipulations to photos, such as resizing them while keeping the content intact or smoothly removing undesired elements. We then cover object tracking, body part recognition, and object recognition using machine learning techniques such as artificial neural networks. 3D reconstruction and augmented reality techniques are also included. The book is a practical tutorial that works through examples at different levels, teaching you about the different functions of OpenCV and how they are used in practice. By the end of this book, you will have acquired the skills to use OpenCV and Python to develop real-world computer vision applications.

Image translation


In this section, we will discuss shifting an image. Let's say we want to move the image within our frame of reference. In computer vision terminology, this is referred to as translation. Let's go ahead and see how we can do that:

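import cv2
import numpy as np

# Load the image and read its height (rows) and width (columns)
img = cv2.imread('images/input.jpg')
num_rows, num_cols = img.shape[:2]

# Shift 70 pixels to the right and 110 pixels down
translation_matrix = np.float32([ [1,0,70], [0,1,110] ])
# Pass flags by keyword: the fourth positional parameter is dst, not flags
img_translation = cv2.warpAffine(img, translation_matrix, (num_cols, num_rows), flags=cv2.INTER_LINEAR)

cv2.imshow('Translation', img_translation)
cv2.waitKey()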

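If you run the preceding code, a window shows the image shifted 70 pixels to the right and 110 pixels down. The area uncovered at the top and left is filled with black, and the part of the image pushed past the right and bottom edges is no longer visible.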

What just happened?

To understand the preceding code, we need to understand how warping works. Translation means shifting the image by adding an offset to (or subtracting one from) the x and y coordinates of every pixel. To do this, we create a 2x3 transformation matrix, as follows:

\[ T = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix} \]

Here, tx and ty are the translation values along the x and y axes; that is, the image is moved tx pixels to the right and ty pixels down. Once we have a matrix like this, we use the warpAffine function to apply it to our image. The third argument of warpAffine is the size of the resulting image, given as (width, height), that is, the number of columns followed by the number of rows. The flags argument takes one of the InterpolationFlags values, such as cv2.INTER_LINEAR, which selects the interpolation method. In the Python bindings the fourth positional parameter is the optional output array dst, so flags is best passed by keyword.
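To make the matrix concrete, here is a minimal NumPy-only sketch of what it does to a single pixel coordinate. The translate_point helper is hypothetical and exists only for illustration; it multiplies the 2x3 matrix by the homogeneous coordinate [x, y, 1], which is the mapping a translation matrix describes.

```python
import numpy as np

# Same translation as in the example above: tx = 70, ty = 110.
tx, ty = 70, 110
translation_matrix = np.float32([[1, 0, tx], [0, 1, ty]])

def translate_point(matrix, x, y):
    """Apply a 2x3 affine matrix to the point (x, y). Hypothetical helper."""
    # Append 1 to form the homogeneous coordinate [x, y, 1].
    new_x, new_y = matrix @ np.float32([x, y, 1])
    return float(new_x), float(new_y)

print(translate_point(translation_matrix, 0, 0))    # (70.0, 110.0)
print(translate_point(translation_matrix, 10, 20))  # (80.0, 130.0)
```

The top-left pixel of the source ends up at column 70, row 110 of the output, which is why the top and left of the translated image are empty.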

Because the output has the same number of rows and columns as the original image, the resulting image is cropped: the pixels shifted past the right and bottom edges have no room in the output. To avoid cropping, we can enlarge the output by the amount of the shift:

img_translation = cv2.warpAffine(img, translation_matrix,
 (num_cols + 70, num_rows + 110))

If you replace the corresponding line in our program with the preceding line, the whole image stays visible, with a black band 70 pixels wide on the left and 110 pixels tall at the top.

Now let's say you want to move the image away from the edges, toward the middle of a bigger image frame. One way to do this is to apply two translations in a row:

import cv2
import numpy as np

img = cv2.imread('images/input.jpg')
num_rows, num_cols = img.shape[:2]

# First pass: shift right by 70 and down by 110 into an enlarged frame
translation_matrix = np.float32([ [1,0,70], [0,1,110] ])
img_translation = cv2.warpAffine(img, translation_matrix, (num_cols + 70, num_rows + 110))

# Second pass: shift back left by 30 and up by 50, and enlarge the frame again
translation_matrix = np.float32([ [1,0,-30], [0,1,-50] ])
img_translation = cv2.warpAffine(img_translation, translation_matrix, (num_cols + 70 + 30, num_rows + 110 + 50))

cv2.imshow('Translation', img_translation)
cv2.waitKey()

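If you run the preceding code, the image appears inside a larger black frame, 40 pixels from the left edge and 60 pixels from the top. The frame is 100 pixels wider and 160 pixels taller than the original image.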
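The two passes above leave the image 40 pixels from the left and 60 pixels from the top of a frame that is 100 pixels wider and 160 pixels taller, so the margins are not equal on all sides. If an exactly centered image is wanted, one translation by half of the added size should be enough. The following is a minimal sketch under that assumption; centering_translation and its parameters are hypothetical names, not part of OpenCV.

```python
import numpy as np

def centering_translation(num_rows, num_cols, extra_cols, extra_rows):
    """Return (matrix, dsize) that centers an image in an enlarged frame.

    Hypothetical helper: extra_cols and extra_rows are the total number of
    columns and rows added to the frame.
    """
    # Splitting the added space evenly gives equal margins on both sides.
    tx = extra_cols / 2
    ty = extra_rows / 2
    matrix = np.float32([[1, 0, tx], [0, 1, ty]])
    # dsize is given as (width, height), i.e. (columns, rows).
    dsize = (num_cols + extra_cols, num_rows + extra_rows)
    return matrix, dsize

# Example with a made-up 400x300 image and the frame growth used above.
matrix, dsize = centering_translation(num_rows=300, num_cols=400,
                                      extra_cols=100, extra_rows=160)
print(matrix)
print(dsize)  # (500, 460)
```

The returned values can be passed straight to cv2.warpAffine(img, matrix, dsize), which also avoids resampling the image twice.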

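Moreover, warpAffine accepts two more arguments, borderMode and borderValue, which control how the empty border left by the translation is filled. borderMode selects a pixel extrapolation method, and borderValue is the fill value used when the mode is the default, cv2.BORDER_CONSTANT. Here we wrap the image around instead: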

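import cv2
import numpy as np

img = cv2.imread('./images/input.jpg')
num_rows, num_cols = img.shape[:2]
translation_matrix = np.float32([ [1,0,70], [0,1,110] ])

# Pass flags and borderMode by keyword: the fourth positional parameter is dst.
# borderValue is only used with cv2.BORDER_CONSTANT, so it is omitted here.
img_translation = cv2.warpAffine(img, translation_matrix, (num_cols, num_rows), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_WRAP)

cv2.imshow('Translation', img_translation)
cv2.waitKey()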
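To get a feel for what wrapping does without opening an image, the following NumPy-only sketch shifts a tiny array with wrap-around. It assumes an integer shift and an output of the same size as the input, in which case a wrapped translation should behave like np.roll. The translate_wrap helper is hypothetical and only illustrates the idea; it is not how OpenCV implements BORDER_WRAP.

```python
import numpy as np

def translate_wrap(img, tx, ty):
    """Shift img right by tx and down by ty, wrapping pixels around the edges.

    Hypothetical helper for illustration; assumes integer tx and ty.
    """
    # Axis 0 is rows (y), axis 1 is columns (x).
    return np.roll(img, shift=(ty, tx), axis=(0, 1))

img = np.arange(12).reshape(3, 4)
print(img)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(translate_wrap(img, tx=1, ty=1))
# [[11  8  9 10]
#  [ 3  0  1  2]
#  [ 7  4  5  6]]
```

The pixels pushed off the right and bottom edges reappear on the left and top, instead of being replaced by a constant fill value.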