OpenCV 3 Blueprints

By: Joseph Howse, Puttemans, Sinha

Overview of this book

Computer vision is becoming accessible to a large audience of software developers who can leverage mature libraries such as OpenCV. However, as they move beyond their first experiments in computer vision, developers may struggle to ensure that their solutions are sufficiently well optimized, well trained, robust, and adaptive in real-world conditions. With sufficient knowledge of OpenCV, these developers will have enough confidence to go about creating projects in the field of computer vision. This book will help you tackle increasingly challenging computer vision problems that you may face in your careers. It makes use of OpenCV 3 to work around some interesting projects. Inside these pages, you will find practical and innovative approaches that are battle-tested in the authors’ industry experience and research. Each chapter covers the theory and practice of multiple complementary approaches so that you will be able to choose wisely in your future projects. You will also gain insights into the architecture and algorithms that underpin OpenCV’s functionality. We begin by taking a critical look at inputs in order to decide which kinds of light, cameras, lenses, and image formats are best suited to a given purpose. We proceed to consider the finer aspects of computational photography as we build an automated camera to assist nature photographers. You will gain a deep understanding of some of the most widely applicable and reliable techniques in object detection, feature selection, tracking, and even biometric recognition. We will also build Android projects in which we explore the complexities of camera motion: first in panoramic image stitching and then in video stabilization. By the end of the book, you will have a much richer understanding of imaging, motion, machine learning, and the architecture of computer vision libraries and applications!

Supercharging the PlayStation Eye

Sony developed the Eye camera in 2007 as an input device for PlayStation 3 games. Originally, no other system supported the Eye. Since then, third parties have created drivers and SDKs for Linux, Windows, and Mac. The following list describes the current state of some of these third-party projects:

  • For Linux, the gspca_ov534 driver supports the PlayStation Eye and works out of the box with OpenCV's videoio module. This driver comes standard with most recent Linux distributions. Current releases of the driver support modes as fast as 320x240 @ 125 FPS and 640x480 @ 60 FPS. An upcoming release will add support for 320x240 @ 187 FPS. If you want to upgrade to this future version today, you will need to familiarize yourself with the basics of Linux kernel development and build the driver yourself.

    Note

    See the driver's latest source code at https://github.com/torvalds/linux/blob/master/drivers/media/usb/gspca/ov534.c. Briefly, you would need to obtain the source code of your Linux distribution's kernel, merge the new ov534.c file, build the driver as part of the kernel, and finally, load the newly built gspca_ov534 driver.

  • For Mac and Windows, developers can add PlayStation Eye support to their applications using an SDK called PS3EYEDriver, available from https://github.com/inspirit/PS3EYEDriver. Despite the name, this project is not a driver; it supports the camera at the application level, but not the OS level. The supported modes include 320x240 @ 187 FPS and 640x480 @ 60 FPS. The project comes with sample application code. Much of the code in PS3EYEDriver is derived from the GPL-licensed gspca_ov534 driver, and thus, the use of PS3EYEDriver is probably only appropriate to projects that are also GPL-licensed.
  • For Windows, a commercial driver and SDK are available from Code Laboratories (CL) at https://codelaboratories.com/products/eye/driver/. At the time of writing, the CL-Eye Driver costs $3. However, the driver does not work with OpenCV 3's videoio module. The CL-Eye Platform SDK, which depends on the driver, costs an additional $5. The fastest supported modes are 320x240 @ 187 FPS and 640x480 @ 75 FPS.
  • For recent versions of Mac, no driver is available. A driver called macam is available at http://webcam-osx.sourceforge.net/, but it was last updated in 2009 and does not work on Mac OS X Mountain Lion and newer versions.

Thus, OpenCV in Linux can capture data directly from an Eye camera, but OpenCV in Windows or Mac requires another SDK as an intermediary.

First, for Linux, let us consider a minimal example of a C++ application that uses OpenCV to record a slow-motion video based on high-speed input from an Eye. Also, the program should log its frame rate. Let's call this application Unblinking Eye.

Note

Unblinking Eye's source code and build files are in this book's GitHub repository at https://github.com/OpenCVBlueprints/OpenCVBlueprints/tree/master/chapter_1/UnblinkingEye.

Note that this sample code should also work with other OpenCV-compatible cameras, albeit at a slower frame rate compared to the Eye.

Unblinking Eye can be implemented in a single file, UnblinkingEye.cpp, containing these few lines of code:

#include <stdio.h>
#include <time.h>

#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>

int main(int argc, char *argv[]) {

  const int cameraIndex = 0;
  const bool isColor = true;
  const int w = 320;
  const int h = 240;
  const double captureFPS = 187.0;
  const double writerFPS = 60.0;
  // With MJPG encoding, OpenCV requires the AVI extension.
  const char filename[] = "SlowMo.avi";
  const int fourcc = cv::VideoWriter::fourcc('M','J','P','G');
  const unsigned int numFrames = 3750;

  cv::Mat mat;

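  // Initialize and configure the video capture.
  cv::VideoCapture capture(cameraIndex);
  if (!capture.isOpened()) {
    // Without this check, the capture loop below would never terminate.
    fprintf(stderr, "Failed to open camera %d\n", cameraIndex);
    return 1;
  }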
  if (!isColor) {
    capture.set(cv::CAP_PROP_MODE, cv::CAP_MODE_GRAY);
  }
  capture.set(cv::CAP_PROP_FRAME_WIDTH, w);
  capture.set(cv::CAP_PROP_FRAME_HEIGHT, h);
  capture.set(cv::CAP_PROP_FPS, captureFPS);

  // Initialize the video writer.
  cv::VideoWriter writer(
      filename, fourcc, writerFPS, cv::Size(w, h), isColor);

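  // Get the start time from a monotonic wall clock. (clock() would
  // measure CPU time, which excludes time spent waiting for frames.)
  struct timespec startTime;
  clock_gettime(CLOCK_MONOTONIC, &startTime);

  // Capture frames and write them to the video file.
  for (unsigned int i = 0; i < numFrames;) {
    if (capture.read(mat)) {
      writer.write(mat);
      i++;
    }
  }

  // Get the end time.
  struct timespec endTime;
  clock_gettime(CLOCK_MONOTONIC, &endTime);

  // Calculate and print the actual frame rate.
  double elapsedSeconds = (endTime.tv_sec - startTime.tv_sec) +
      (endTime.tv_nsec - startTime.tv_nsec) / 1e9;
  double actualFPS = numFrames / elapsedSeconds;
  printf("FPS: %.1f\n", actualFPS);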
}

Note that the camera's specified mode is 320x240 @ 187 FPS. If our version of the gspca_ov534 driver does not support this mode, we can expect it to fall back to 320x240 @ 125 FPS. Meanwhile, the video file's specified mode is 320x240 @ 60 FPS, meaning that the video will play back at slower-than-real speed as a special effect.

Unblinking Eye can be built using a terminal command such as the following:

$ g++ UnblinkingEye.cpp -o UnblinkingEye -lopencv_core -lopencv_videoio

Build Unblinking Eye, run it, record a moving subject, observe the frame rate, and play back the recorded video, SlowMo.avi. How does your subject look in slow motion?

On a machine with a slow CPU or slow storage, Unblinking Eye might drop some of the captured frames due to a bottleneck in video encoding or file output. Do not be fooled by the low resolution! The rate of data transfer for a camera in 320x240 @ 187 FPS mode is greater than for a camera in 1280x720 @ 15 FPS mode (an HD resolution at a slightly choppy frame rate). Multiply the pixels by the frame rate to see how many pixels per second are transferred in each mode.

Suppose we want to reduce the amount of data per frame by capturing and recording monochrome video. Such an option is available when OpenCV 3 is built for Linux with libv4l support. (The relevant CMake definition is WITH_LIBV4L, which is turned on by default.) By changing the following line in the code of Unblinking Eye and then rebuilding it, we can switch to grayscale capture:

const bool isColor = false;

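Note that the change to this Boolean affects two parts of the following code: the conditional call to capture.set with cv::CAP_PROP_MODE, and the isColor argument passed to the cv::VideoWriter constructor: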

  cv::VideoCapture capture(cameraIndex);
  if (!isColor) {
    capture.set(cv::CAP_PROP_MODE, cv::CAP_MODE_GRAY);
  }
  capture.set(cv::CAP_PROP_FRAME_WIDTH, w);
  capture.set(cv::CAP_PROP_FRAME_HEIGHT, h);
  capture.set(cv::CAP_PROP_FPS, captureFPS);

  cv::VideoWriter writer(
      filename, fourcc, writerFPS, cv::Size(w, h), isColor);

Behind the scenes, the VideoCapture and VideoWriter objects are now using a planar YUV format. The captured Y data are copied to a single-channel OpenCV Mat and are ultimately stored in the video file's Y channel. Meanwhile, the video file's U and V color channels are just filled with the mid-range value, 128, for gray. U and V use a lower resolution than Y, so at the time of capture, the YUV format has only 12 bits per pixel (bpp), compared to 24 bpp for OpenCV's default BGR format.

Note

The libv4l interface in OpenCV's videoio module currently supports the following values for cv::CAP_PROP_MODE:

  • cv::CAP_MODE_BGR (the default) captures 24 bpp color in BGR format (8 bits per channel).
  • cv::CAP_MODE_RGB captures 24 bpp color in RGB format (8 bits per channel).
  • cv::CAP_MODE_GRAY extracts 8 bpp grayscale from a 12 bpp planar YUV format.
  • cv::CAP_MODE_YUYV captures 16 bpp color in a packed YUV format (8 bpp for Y and 4 bpp each for U and V).

For Windows or Mac, we should instead capture data using PS3EYEDriver, CL-Eye Platform SDK, or another library, and then create an OpenCV Mat that references the data. This approach is illustrated in the following partial code sample:

int width = 320, height = 240;
int matType = CV_8UC3; // 8 bits per channel, 3 channels
void *pData;

// Use the camera SDK to capture image data.
someCaptureFunction(&pData);

// Create the matrix. No data are copied; the pointer is copied.
// The matrix does not own the buffer, so pData must remain valid
// for as long as mat is in use.
cv::Mat mat(height, width, matType, pData);

Indeed, the same approach applies to integrating almost any source of data into OpenCV. Conversely, to use OpenCV as a source of data for another library, we can get a pointer to the data stored in a matrix:

void *pData = mat.data;

Later in this chapter, in Supercharging the GS3-U3-23S6M-C and other Point Grey Research cameras, we cover a nuanced example of integrating OpenCV with other libraries, specifically FlyCapture2 for capture and SDL2 for display. PS3EYEDriver comes with a comparable sample, in which the pointer to captured data is passed to SDL2 for display. As an exercise, you might want to adapt these two examples to build a demo that integrates OpenCV with PS3EYEDriver for capture and SDL2 for display.

Hopefully, after some experimentation, you will conclude that the PlayStation Eye is a more capable camera than its $10 price tag suggests. For fast-moving subjects, its high frame rate is a good tradeoff for its low resolution. Banish motion blur!

If we are willing to invest in hardware modifications, the Eye has even more tricks hidden up its sleeve (or in its socket). The lens and IR blocking filter are relatively easy to replace. An aftermarket lens and filter can allow for NIR capture. Furthermore, an aftermarket lens can yield higher resolution, a different FOV, less distortion, and greater efficiency. Peau Productions sells premodified Eye cameras as well as do-it-yourself (DIY) kits, at http://peauproductions.com/store/index.php?cPath=136_1. The company's modifications support interchangeable lenses with an M12 mount or CS mount (two different standards of screw mounts). The website offers detailed recommendations based on lens characteristics such as distortion and IR transmission. Peau's price for a premodified NIR Eye camera plus a lens starts from approximately $85. More expensive options, including distortion-corrected lenses, range up to $585. However, at these prices, it is advisable to compare lens prices across multiple vendors, as described later in this chapter's Shopping for glass section.

Next, we will examine a camera that lacks high-speed modes, but is designed to separately capture visible and NIR light, with active NIR illumination.