This section shows you how to estimate a new country's language starting from its flag, using a simple supervised learning technique: k-nearest neighbors (KNN). In this case, we estimate the language, which is a categorical attribute, so we use a classification technique. If the attribute were numeric, we would have used a regression technique. The reason I chose KNN is that it's simple to explain, and its parameters can be adjusted to improve the accuracy of the results.
Let's see how KNN works. We know the flag and the language of 150 countries, and we want to determine the language of a new country starting from its flag. First, we identify the 10 countries whose flags are the most similar to the new one. Among them, we have six Spanish-speaking countries, two English-speaking countries, one French-speaking country, and one Arabic-speaking country.
Out of these 10 countries, the most common language is Spanish, so we can...
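
To make the voting step concrete, here is a minimal sketch in Python. It is an illustration, not the implementation used in this section: the function names `majority_vote` and `knn_classify` are hypothetical, and the sketch assumes each flag has already been turned into a numeric feature vector compared with Euclidean distance. The language counts are the ones from the example above.

```python
from collections import Counter
import math

def majority_vote(labels):
    """Return the most common label and the share of votes it received."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

def knn_classify(query, examples, k):
    """Classify `query` from its k nearest examples.

    `examples` is a list of (feature_vector, label) pairs. Representing a
    flag as a numeric vector and using Euclidean distance are assumptions
    made for this sketch.
    """
    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    return majority_vote([label for _, label in nearest])

# Languages of the 10 most similar flags, as counted in the example above.
neighbor_languages = ["Spanish"] * 6 + ["English"] * 2 + ["French"] + ["Arabic"]

predicted, share = majority_vote(neighbor_languages)
print(predicted, share)  # Spanish 0.6
```

If two languages tie, `Counter.most_common` returns the one that appears first among the neighbors, so a real application might prefer an odd `k` or a distance-weighted vote.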