OpenCV is a native, cross-platform C++ library for computer vision, machine learning, and image processing, and it is increasingly used from Python. It offers C++, C, Python, and Java interfaces and supports Windows, Linux, Mac, iOS, and Android. Developers use OpenCV to build applications that process visual data, including still photographs and live video streamed from a device such as a camera. However, as developers move beyond their first computer vision applications, they might find it difficult to come up with solutions that are well-optimized, robust, and scalable for real-world scenarios.
This book demonstrates how to develop a series of intermediate to advanced projects using OpenCV and Python, rather than teaching the core concepts of OpenCV in theoretical lessons. The working projects developed in this book teach you how to apply your theoretical knowledge to topics such as image manipulation, augmented reality, object tracking, 3D scene reconstruction, statistical learning, and object categorization.
By the end of this book, you will be an OpenCV expert, and your newly gained experience will allow you to develop your own advanced computer vision applications.
Chapter 1, Fun with Filters, explores a number of interesting image filters (such as a black-and-white pencil sketch, warming/cooling filters, and a cartoonizer effect), and we apply them to the video stream of a webcam in real time.
Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, helps you develop an app to detect and track simple hand gestures in real time using the output of a depth sensor, such as a Microsoft Kinect 3D Sensor or Asus Xtion.
Chapter 3, Finding Objects via Feature Matching and Perspective Transforms, is where you develop an app to detect an arbitrary object of interest in the video stream of a webcam, even if the object is viewed from different angles or distances, or under partial occlusion.
Chapter 4, 3D Scene Reconstruction Using Structure from Motion, shows you how to reconstruct and visualize a scene in 3D by inferring its geometrical features from camera motion.
Chapter 5, Tracking Visually Salient Objects, helps you develop an app to track multiple visually salient objects in a video sequence (such as all the players on the field during a soccer match) at once.
Chapter 6, Learning to Recognize Traffic Signs, shows you how to train a support vector machine to recognize traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB) dataset.
Chapter 7, Learning to Recognize Emotions on Faces, is where you develop an app that is able to both detect faces and recognize their emotional expressions in the video stream of a webcam in real time.
This book supports several operating systems as development environments, including Windows XP or a later version, Mac OS X 10.6 or a later version, and Ubuntu 12.04 or a later version. The only hardware requirement is a webcam (or camera device), except in Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, which instead requires access to a Microsoft Kinect 3D Sensor or an Asus Xtion.
The book contains seven projects, with the following requirements.
All projects can run on any of Windows, Mac, or Linux, and they require the following software packages:
OpenCV 2.4.9 or later: Recent 32-bit and 64-bit versions as well as installation instructions are available at http://opencv.org/downloads.html. Platform-specific installation instructions can be found at http://docs.opencv.org/doc/tutorials/introduction/table_of_content_introduction/table_of_content_introduction.html.
Python 2.7 or later: Recent 32-bit and 64-bit installers are available at https://www.python.org/downloads. The installation instructions can be found at https://wiki.python.org/moin/BeginnersGuide/Download.
NumPy 1.9.2 or later: This package for scientific computing officially comes in 32-bit format only, and can be obtained from http://www.scipy.org/scipylib/download.html. The installation instructions can be found at http://www.scipy.org/scipylib/building/index.html#building.
wxPython 2.8 or later: This GUI programming toolkit can be obtained from http://www.wxpython.org/download.php. Its installation instructions are given at http://wxpython.org/builddoc.php.
In addition, some chapters require the following free Python modules:
SciPy 0.16.0 or later (Chapter 1): This scientific Python library officially comes in 32-bit only, and can be obtained from http://www.scipy.org/scipylib/download.html. The installation instructions can be found at http://www.scipy.org/scipylib/building/index.html#building.
matplotlib 1.4.3 or later (Chapters 4 to 7): This 2D plotting library can be obtained from http://matplotlib.org/downloads.html. Its installation instructions can be found by going to http://matplotlib.org/faq/installing_faq.html#how-to-install.
libfreenect 0.5.2 or later (Chapter 2): The libfreenect module by the OpenKinect project (http://www.openkinect.org) provides drivers and libraries for the Microsoft Kinect hardware, and can be obtained from https://github.com/OpenKinect/libfreenect. Its installation instructions can be found at http://openkinect.org/wiki/Getting_Started.
Furthermore, the use of IPython (http://ipython.org/install.html) is highly recommended as it provides a flexible, interactive console interface.
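Once the packages are installed, it can help to confirm that each one is importable and meets the minimum version listed above. The following is a minimal sketch, not part of the book's example code. It assumes the usual import names (cv2, numpy, wx, scipy, matplotlib) and that each module exposes a __version__ attribute; the helper names parse_version, check, and report are hypothetical. libfreenect is left out because its Python wrapper is not assumed to report a version.

```python
import importlib

# Minimum versions taken from the requirements listed above.
# The keys are the assumed import names of each package.
REQUIRED = {
    "cv2": "2.4.9",
    "numpy": "1.9.2",
    "wx": "2.8",
    "scipy": "0.16.0",
    "matplotlib": "1.4.3",
}


def parse_version(text):
    """Turn a string such as '2.4.9' or '3.0.0-dev' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first component with no leading digits.
            break
        parts.append(int(digits))
    return tuple(parts)


def check(name, minimum):
    """Return 'ok', 'too old', 'unknown version', or 'missing' for a module."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return "missing"
    found = getattr(module, "__version__", None)
    if found is None:
        return "unknown version"
    if parse_version(str(found)) >= parse_version(minimum):
        return "ok"
    return "too old"


def report():
    """Print one status line per required package."""
    for name, minimum in sorted(REQUIRED.items()):
        print("%-12s needs >= %-8s %s" % (name, minimum, check(name, minimum)))


if __name__ == "__main__":
    report()
```

Running the script prints one line per package. Versions are compared as integer tuples, so suffixes such as "-dev" are ignored; a package reported as "unknown version" should be checked by hand.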
Finally, if you are looking for help or get stuck along the way, you can go to several websites that provide excellent help, documentation, and tutorials:
The official OpenCV API reference, user guide, and tutorials: http://docs.opencv.org
The official OpenCV forum: http://www.answers.opencv.org/questions
OpenCV-Python tutorials by Alexander Mordvintsev and Abid Rahman K: http://opencv-python-tutroals.readthedocs.org/en/latest
This book is for intermediate users of OpenCV who aim to master their skills by developing advanced practical applications. You should already have some experience of building simple applications, and you are expected to be familiar with OpenCV's concepts and Python libraries. Basic knowledge of Python programming is assumed.
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "In OpenCV, a webcam can be accessed with a call to cv2.VideoCapture."
A block of code is set as follows:
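    import cv2

    def main():
        # open the default camera (device index 0)
        capture = cv2.VideoCapture(0)
        if not(capture.isOpened()):
            # open() needs the device index as an argument
            capture.open(0)

        # request a 640x480 frame size (OpenCV 2.4 constant names)
        capture.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 640)
        capture.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 480)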
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "The Take Snapshot button is placed below the radio buttons."
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. The latest and most up-to-date example code for this book is also publicly available on GitHub: http://www.github.com/mbeyeler/opencv-python-blueprints.
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/OpenCVwithPythonBlueprints_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.