This section shows you how to improve the performance of KNN by tuning its parameters. Here we tune the k parameter, which defines the number of neighbors. Use these steps to identify the best-performing k:
Define which values of k to test. KNN works locally, in the sense that, given a new country flag, it identifies just a few similar flags. How many of them should we use at most? Since there are fewer than 200 flags in total, we don't want to use more than 50 flags. So we should test each k between 1 and 50, and we can define arrayK containing the options:
# define the k to test
arrayK <- 1:50
Define the number of iterations. For each k in arrayK, we need to build and validate the KNN a sufficiently large number of times, defined by nIterations. In the previous chapter, we learned that we need at least 100 iterations to have a meaningful KNN accuracy:
nIterations <- 100
Evaluate the accuracy for each k.
Choose the k that maximizes the accuracy...
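The four steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the built-in iris dataset stands in for the flag data, knn() from the class package stands in for the chapter's own KNN code, and the helper validateKnn, the 80/20 split, and the names accuracyK and bestK are hypothetical.

```r
# Sketch only: iris and class::knn() are stand-ins for the chapter's
# flag data and KNN code; validateKnn and the 80/20 split are hypothetical.
library(class)

set.seed(1)            # make the random splits reproducible
arrayK <- 1:50         # the k values to test
nIterations <- 100     # random train/test splits per k

features <- iris[, 1:4]
labels <- iris$Species
nTrain <- round(0.8 * nrow(iris))

# Build and validate the KNN once on a random split; return its accuracy
validateKnn <- function(k) {
  indexTrain <- sample(nrow(iris), nTrain)
  predicted <- knn(
    train = features[indexTrain, ],
    test = features[-indexTrain, ],
    cl = labels[indexTrain],
    k = k
  )
  mean(predicted == labels[-indexTrain])
}

# Average the accuracy over nIterations splits for each k
accuracyK <- sapply(arrayK, function(k) {
  mean(replicate(nIterations, validateKnn(k)))
})

# Choose the k with the highest average accuracy
bestK <- arrayK[which.max(accuracyK)]
```

Plotting accuracyK against arrayK, for example with plot(arrayK, accuracyK, type = "l"), can help show whether the maximum is a clear peak or just noise between neighboring values of k.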