Machine Learning for OpenCV 4 - Second Edition

By: Aditya Sharma, Vishwesh Ravi Shrimali, Michael Beyeler

Overview of this book

OpenCV is an open-source library for building computer vision apps. The latest release, OpenCV 4, offers a plethora of features and platform improvements that are covered comprehensively in this up-to-date second edition. You'll start by understanding the new features and setting up OpenCV 4 to build your computer vision applications. You will explore the fundamentals of machine learning and even learn to design different algorithms that can be used for image processing. Gradually, the book will take you through supervised and unsupervised machine learning. You will gain hands-on experience using scikit-learn in Python for a variety of machine learning applications. Later chapters will focus on different machine learning algorithms, such as decision trees, support vector machines (SVMs), and Bayesian learning, and how they can be used for computer vision operations such as object detection. You will then delve into deep learning and ensemble learning, and discover their real-world applications, such as handwritten digit classification and gesture recognition. Finally, you’ll get to grips with the latest Intel OpenVINO for building an image processing system. By the end of this book, you will have developed the skills you need to use machine learning for building intelligent computer vision applications with OpenCV 4.
Table of Contents (18 chapters)
Section 1: Fundamentals of Machine Learning and OpenCV
Section 2: Operations with OpenCV
Section 3: Advanced Machine Learning with OpenCV

Representing categorical variables

One of the most common data types we might encounter while building a machine learning system is the categorical feature (also known as a discrete feature), such as the color of a fruit or the name of a company. The challenge with categorical features is that they take values from a fixed set of labels rather than changing in a continuous way, which makes it hard to represent them with numbers.

For example, a banana is either green or yellow, but not both. A product belongs either in the clothing department or in the books department, but rarely in both, and so on.
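
To see why plain numbers are an awkward fit, consider what happens if each category is simply assigned an integer. The following is only a small illustration: the third color (`'red'`) and the `color_to_code` mapping are hypothetical names introduced here, not something taken from the book.

```python
# Hypothetical example: assign an arbitrary integer to each fruit color.
colors = ['green', 'yellow', 'red']
color_to_code = {color: code for code, color in enumerate(colors)}

# Look up the integer codes for the three colors.
green, yellow, red = (color_to_code[c] for c in colors)

# The integers imply an ordering and distances that the colors do not have.
print(red - green)                  # 2: 'red' looks twice as far from 'green'...
print(yellow - green)               # 1: ...as 'yellow' does
print((green + red) / 2 == yellow)  # True: the "average" of green and red is yellow
```

None of these relationships means anything for fruit colors, yet a learning algorithm that treats the codes as ordinary numbers may rely on them.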

How would you go about representing such features?
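
One widely used answer is one-hot encoding, where each category gets its own 0/1 slot. The sketch below is a minimal plain-Python illustration under that assumption; the `one_hot` helper is a hypothetical name written for this example and is not part of OpenCV or scikit-learn.

```python
def one_hot(value, categories):
    """Return a list with a 1 in the slot for `value` and 0 everywhere else."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


# The banana from the text is either green or yellow, never both.
colors = ['green', 'yellow']
print(one_hot('green', colors))   # [1, 0]
print(one_hot('yellow', colors))  # [0, 1]
```

With this representation no category is numerically larger than another, at the cost of one extra column per category.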

As a concrete case, let's assume we are trying to encode a dataset consisting of a list of forefathers of machine learning and artificial intelligence:

In [1]: data = [
... {'name': 'Alan Turing', 'born': 1912, 'died': 1954},
... ...