OpenCV 3 Computer Vision with Python Cookbook

By: Aleksei Spizhevoi, Aleksandr Rybnikov

Overview of this book

OpenCV 3 is a native cross-platform library for computer vision, machine learning, and image processing. OpenCV's convenient high-level APIs hide very powerful internals designed for computational efficiency that can take advantage of multicore and GPU processing. This book will help you tackle increasingly challenging computer vision problems by providing a number of recipes that you can use to improve your applications. In this book, you will learn how to process an image by manipulating pixels and analyze an image using histograms. Then, we'll show you how to apply image filters to enhance image content and exploit the image geometry in order to relay different views of a pictured scene. We’ll explore techniques to achieve camera calibration and perform a multiple-view analysis. Later, you’ll work on reconstructing a 3D scene from images, converting low-level pixel information to high-level concepts for applications such as object detection and recognition. You’ll also discover how to process video from files or cameras and how to detect and track moving objects. Finally, you'll get acquainted with recent approaches in deep learning and neural networks. By the end of the book, you’ll be able to apply your skills in OpenCV to create computer vision applications in various domains.

Reading images from files

In this recipe, we will learn how to read images from files. OpenCV supports reading images in different formats, such as PNG, JPEG, and TIFF. Let's write a program that takes the path to an image as a command-line argument, reads the image, and prints its shape and data type.

Getting ready

You need to have OpenCV 3.x installed with Python API support.

How to do it...

For this recipe, you need to perform the following steps:

  1. You can easily read an image with the cv2.imread function, which takes the path to an image and optional flags:
import argparse
import cv2

# Take the image path from the command line; fall back to the sample image
parser = argparse.ArgumentParser()
parser.add_argument('--path', default='../data/Lena.png', help='Image path.')
params = parser.parse_args()
img = cv2.imread(params.path)
  2. It is useful to check whether the image was loaded successfully, because cv2.imread returns None on failure instead of raising an exception:
assert img is not None  # check if the image was successfully loaded
print('read {}'.format(params.path))
print('shape:', img.shape)
print('dtype:', img.dtype)
  3. Load the image and convert it to grayscale, even if it originally had several color channels:
img = cv2.imread(params.path, cv2.IMREAD_GRAYSCALE)
assert img is not None
print('read {} as grayscale'.format(params.path))
print('shape:', img.shape)
print('dtype:', img.dtype)

How it works...

The loaded image is represented as a NumPy array. The same representation is used in OpenCV for matrices. NumPy arrays expose properties such as shape, which holds the image's dimensions and number of color channels, and dtype, which is the underlying data type (for example, uint8 or float32). Note that OpenCV loads color images in BGR, not RGB, channel order.
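
To illustrate the channel order, here is a minimal sketch that uses a synthetic one-pixel array rather than a loaded file, so it needs only NumPy. The variable names are illustrative and are not part of the recipe.

```python
import numpy as np

# One pure-blue pixel as OpenCV would store it: channel order is B, G, R
bgr = np.zeros((1, 1, 3), dtype=np.uint8)
bgr[0, 0] = (255, 0, 0)

# Reversing the last axis swaps the B and R channels, giving RGB order
rgb = bgr[..., ::-1]

print('BGR pixel:', bgr[0, 0].tolist())
print('RGB pixel:', rgb[0, 0].tolist())
```

The slice returns a view of the same data. When a separate copy in RGB order is needed, for example before passing the image to a library that expects RGB, cv2.cvtColor(img, cv2.COLOR_BGR2RGB) performs the same channel swap.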

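The shape tuple in this case should be interpreted as follows: image height, image width, number of color channels. A grayscale image has no channel dimension, so its shape contains only the height and the width.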
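
The following sketch shows one way to unpack the shape tuple so that color and grayscale images are handled alike. It uses zero-filled arrays as stand-ins for loaded images, and describe is a hypothetical helper name introduced only for this example.

```python
import numpy as np

# Stand-ins for loaded images: 480 rows (height) by 640 columns (width)
color = np.zeros((480, 640, 3), dtype=np.uint8)
gray = np.zeros((480, 640), dtype=np.uint8)


def describe(img):
    """Hypothetical helper: return (height, width, channels) for any image."""
    height, width = img.shape[:2]
    # A 2D array has no channel axis, so treat it as a single channel
    channels = 1 if img.ndim == 2 else img.shape[2]
    return height, width, channels


print('color:', describe(color))
print('gray:', describe(gray))
```

Because the height comes first, a pixel is addressed as img[y, x] (row, then column), not img[x, y].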

The cv2.imread function also supports optional flags, where users can specify whether conversion to uint8 type should be performed, and whether the image should be loaded as grayscale or color.

Having run the code with the default parameters, you should see the following output:

read ../data/Lena.png
shape: (512, 512, 3)
dtype: uint8

read ../data/Lena.png as grayscale
shape: (512, 512)
dtype: uint8