Machine Learning for OpenCV 4 - Second Edition

By: Aditya Sharma, Vishwesh Ravi Shrimali, Michael Beyeler

Overview of this book

OpenCV is an open source library for building computer vision apps. The latest release, OpenCV 4, offers a plethora of features and platform improvements that are covered comprehensively in this up-to-date second edition. You'll start by understanding the new features and setting up OpenCV 4 to build your computer vision applications. You will explore the fundamentals of machine learning and even learn to design different algorithms that can be used for image processing. Gradually, the book will take you through supervised and unsupervised machine learning. You will gain hands-on experience using scikit-learn in Python for a variety of machine learning applications. Later chapters will focus on different machine learning algorithms, such as decision trees, support vector machines (SVMs), and Bayesian learning, and how they can be used for computer vision operations such as object detection. You will then delve into deep learning and ensemble learning, and discover their real-world applications, such as handwritten digit classification and gesture recognition. Finally, you’ll get to grips with the latest Intel OpenVINO for building an image processing system. By the end of this book, you will have developed the skills you need to use machine learning for building intelligent computer vision applications with OpenCV 4.
Table of Contents (18 chapters)
Section 1: Fundamentals of Machine Learning and OpenCV
Section 2: Operations with OpenCV
Section 3: Advanced Machine Learning with OpenCV

Using decision trees to diagnose breast cancer

Now that we have built our first decision tree, it's time to turn our attention to a real dataset: the Breast Cancer Wisconsin dataset (https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)).
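As a quick sketch of what this dataset looks like in practice, we can load the copy that ships with scikit-learn and inspect its shape. Using `load_breast_cancer` here is an assumption on our part: the text above only links to the UCI repository, and the variable names are purely illustrative.

```python
# Minimal sketch: load the Breast Cancer Wisconsin (Diagnostic) dataset.
# Assumption: we use the copy bundled with scikit-learn rather than
# downloading the files from the UCI repository linked above.
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

X = data.data      # feature matrix, one row per tissue sample
y = data.target    # integer class label for each sample

n_samples, n_features = X.shape
class_names = [str(name) for name in data.target_names]

print("samples: ", n_samples)
print("features:", n_features)
print("classes: ", class_names)
```

Each row of `X` holds the numeric measurements for one sample, and `y` holds the matching label, so no image handling is needed to get started.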

This dataset is a direct result of medical imaging research and is considered a classic today. The dataset was created from digitized images of healthy (benign) and cancerous (malignant) tissues. Unfortunately, I wasn't able to find any public-domain examples from the original study, but the images look similar to the following screenshot:

The goal of the research was to classify tissue samples into benign and malignant (a binary classification task).
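To make the binary classification task concrete, here is a minimal sketch that fits a decision tree to the data and checks how well it separates the two classes. It assumes scikit-learn's `DecisionTreeClassifier` and the bundled copy of the dataset; the 80/20 split and the `random_state` value are arbitrary choices made for illustration, not settings taken from the original study.

```python
# Minimal sketch of the binary classification task with a decision tree.
# Assumptions: scikit-learn's bundled dataset, an 80/20 train/test split,
# and a fixed random_state so the run is repeatable.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()

# Hold out part of the data so we can measure generalization.
# stratify keeps the benign/malignant ratio similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target,
    test_size=0.2, random_state=42, stratify=data.target,
)

# With no depth limit, the tree grows until the training data is separated.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)
test_acc = clf.score(X_test, y_test)

print("training accuracy:", train_acc)
print("test accuracy:    ", test_acc)
```

A gap between the training and test accuracy is a sign that the unrestricted tree is overfitting; limiting `max_depth` is one common way to explore that trade-off.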

To make the classification task feasible, the researchers performed feature extraction on the images, as we did in Chapter 4, Representing Data and Engineering...