Hands-On Image Processing with Python

By: Sandipan Dey

Overview of this book

Image processing plays an important role in our daily lives, with applications in social media (face detection), medical imaging (X-ray, CT scan), security (fingerprint recognition), robotics, and space. This book covers the core of image processing, from concepts to code, using Python. It starts with classical image processing techniques and follows the evolution of image processing algorithms up to recent advances in image processing and computer vision with deep learning. We will learn how to use image processing libraries such as PIL, scikit-image, and scipy ndimage in Python. This book will enable us to write code snippets in Python 3 and quickly implement complex image processing algorithms for tasks such as image enhancement, filtering, segmentation, object detection, and classification. We will use machine learning models from the scikit-learn library, later explore deep CNNs such as VGG-19 with Keras, and also use an end-to-end deep learning model called YOLO for object detection. We will also cover a few advanced problems, such as image inpainting, gradient blending, variational denoising, seam carving, quilting, and morphing. By the end of this book, we will have learned to implement various algorithms for efficient image processing.

What is image processing and some applications


Let's start by defining what an image is, how it is stored on a computer, and how we are going to process it with Python.

What is an image and how it is stored on a computer

Conceptually, an image in its simplest form (single-channel; for example, a binary, monochrome, grayscale, or black-and-white image) is a two-dimensional function f(x,y) that maps a coordinate pair to an integer or real value, which is related to the intensity or color of the point. Each point is called a pixel or pel (picture element). An image can have multiple channels too (for example, colored RGB images, where a color is represented using three channels: red, green, and blue). For a colored RGB image, each pixel at the (x,y) coordinate can be represented by a three-tuple (r_{x,y}, g_{x,y}, b_{x,y}).
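As a minimal sketch of this idea, the snippet below treats a tiny hand-written grayscale image as a function f(x,y) that returns one intensity, and an RGB image as a function that returns a three-tuple. The nested lists, the pixel values, and the helper names f_gray and f_rgb are illustrative assumptions, not code from the book; real libraries hand back array objects rather than lists.

```python
# Hypothetical grayscale image with 2 rows and 3 columns.
# Each entry is a single intensity value in the range 0..255.
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

# Hypothetical RGB image of the same size.
# Each entry is an (r, g, b) three-tuple, one value per channel.
rgb = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]


def f_gray(x, y):
    """Return the intensity at column x, row y of the grayscale image."""
    return gray[y][x]


def f_rgb(x, y):
    """Return the (r, g, b) three-tuple at column x, row y of the RGB image."""
    return rgb[y][x]


print(f_gray(2, 0))  # 255
print(f_rgb(1, 1))   # (128, 128, 128)
```

Note that the lists are indexed row first (y, then x), which is why the helpers swap the argument order before the lookup.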

To be processed on a computer, an image f(x,y) needs to be digitized both spatially and in amplitude. Digitization of the spatial coordinates (x,y) is called image sampling. Amplitude digitization is called gray-level quantization. In a computer, a pixel value corresponding to a channel is generally represented as an integer between 0 and 255 or a floating-point value between 0 and 1. An image is stored as a file, and there are many different file types (formats). Each file generally has some metadata and some data that can be extracted as multi-dimensional arrays (for example, 2-D arrays for binary or gray-level images and 3-D arrays for RGB and YUV colored images). The following figure shows how the image data is stored as matrices for different types of image. As shown, for a grayscale image, a matrix (2-D array) of width x height suffices to store the image, whereas an RGB image requires a 3-D array of dimension width x height x 3:
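Sampling and quantization can be sketched in a few lines of plain Python. The function below evaluates a continuous function f(x,y) on a regular grid (sampling) and maps each value in [0, 1] to one of 256 integer levels (quantization). The names sample_and_quantize, to_float, and to_rgb, and the test function passed in, are hypothetical and chosen only for illustration; libraries such as scikit-image would return NumPy arrays instead of nested lists.

```python
def sample_and_quantize(f, width, height, levels=256):
    """Sample f(x, y) on a width x height grid over [0, 1] x [0, 1]
    and quantize each sample to an integer in 0..levels-1."""
    image = []
    for row in range(height):
        y = row / (height - 1) if height > 1 else 0.0
        line = []
        for col in range(width):
            x = col / (width - 1) if width > 1 else 0.0
            value = min(max(f(x, y), 0.0), 1.0)       # clip to [0, 1]
            line.append(round(value * (levels - 1)))  # amplitude quantization
        image.append(line)
    return image


def to_float(image, levels=256):
    """Convert integer levels back to floating-point values in [0, 1]."""
    return [[v / (levels - 1) for v in row] for row in image]


def to_rgb(image):
    """Build a three-channel image by repeating the gray value per channel."""
    return [[(v, v, v) for v in row] for row in image]


# A simple diagonal intensity ramp, used here as a stand-in for a real scene.
gray = sample_and_quantize(lambda x, y: (x + y) / 2, width=4, height=3)

print(len(gray), len(gray[0]))  # 3 4  (rows, columns)
print(gray[0][0], gray[2][3])   # 0 255
print(to_rgb(gray)[2][3])       # (255, 255, 255)
```

The result is stored row first, so the grayscale image has 3 rows of 4 values and the RGB version has 3 rows of 4 three-tuples. Array libraries built on NumPy typically use the same rows-then-columns ordering, that is, height x width (x 3 for RGB).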

The next figure shows example binary, grayscale, and RGB images:

In this book, we shall focus on processing image data and will use Python libraries to extract the data from images for us, as well as run different algorithms for different image processing tasks on the image data. Sample images are taken from the internet, from the Berkeley Segmentation Dataset and Benchmark (https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/BSDS300/html/dataset/images.html), and the USC-SIPI Image Database (http://sipi.usc.edu/database/), and many of them are standard images used for image processing.

What is image processing?

Image processing refers to the automatic processing, manipulation, analysis, and interpretation of images using algorithms and code on a computer. It has applications in many disciplines and fields of science and technology, such as television, photography, robotics, remote sensing, medical diagnosis, and industrial inspection. Social networking sites such as Facebook and Instagram, which have become part of our daily lives and where we upload tons of images every day, are typical examples of industries that need to use and innovate on many image processing algorithms to process the images we upload.

In this book, we are going to use a few Python packages to process images. First, we shall use a set of libraries for classical image processing: extracting the image data, then transforming it with library functions to pre-process, enhance, restore, represent (with descriptors), segment, classify, and detect and recognize objects, so that we can analyze, understand, and interpret the data better. Next, we shall use another set of libraries for image processing based on deep learning, a technology that has become very popular in the last few years.

 

Some applications of image processing

Some typical applications of image processing include medical/biological fields (for example, X-rays and CT scans), computational photography (Photoshop), fingerprint authentication, face recognition, and so on.