R Machine Learning Essentials

By: Michele Usuelli
Building the feature data


This section shows how to structure the raw data to build the features. For each country, the data consists of:

  • A picture of the flag

  • Some geographical data such as continent, geographic quadrant, area, and population

  • The language and religion of the country

The goal is to build a model that predicts a country's language from its flag. Most models can deal only with numeric and/or categorical data, so we can't use the image of the flag directly as a feature. The solution is to define some features that describe each flag, for instance the number of colors. In this way, we obtain a table whose rows correspond to the countries and whose columns correspond to the flag features.
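To make the shape of such a table concrete, here is a minimal sketch in base R. The country names, the column names (nColors, hasRed), and all the values are invented for illustration and are not taken from the dataset used in this chapter.

```r
# Hypothetical toy data: names and values are made up for illustration only
dfFlags <- data.frame(
  country  = c("CountryA", "CountryB", "CountryC"),
  nColors  = c(3L, 2L, 4L),                        # numeric feature: number of colors
  hasRed   = factor(c("yes", "no", "yes")),        # categorical feature
  language = factor(c("English", "Spanish", "English")),  # attribute to predict
  stringsAsFactors = FALSE
)

# Each row is a country; each column is a flag feature or the target
str(dfFlags)

# Keep only the columns that describe the flag
featureNames <- setdiff(names(dfFlags), c("country", "language"))
dfFeatures <- dfFlags[, featureNames, drop = FALSE]
```

Here, nColors is a numeric feature and hasRed is a categorical one, so both are types that most models can handle, unlike the raw picture.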

Building the matrix of flag attributes from the pictures would take a lot of time. Fortunately, we can use a dataset that already contains some of these features. The data is still a bit messy, so we need to clean and transform it to build a feature table in the right format...