A multivariate dataset is defined as a set of multiple observations (attributes) associated with different aspects of a phenomenon. In this chapter, we will use a multivariate dataset, which is the result of a chemical analysis of wines that grew in three different cultivars from the same area in Italy. The Wine dataset is available in the UC Irvine Machine Learning Repository and can be freely downloaded from http://archive.ics.uci.edu/ml/datasets/Wine. This dataset includes physicochemical data from white and red wine from the north of Portugal in order to find quality levels. The dataset includes 13 features with no missing data, and all the features are numerical or real values.
The complete list of features is listed here:
Alcohol
Malic acid
Ash
Alkalinity of ash
Magnesium
Total phenols
Flavanoids
Nonflavanoid phenols
Proanthocyanins
Color intensity
Hue
OD280/OD315 of diluted wines
Proline
The classes in the dataset are ordered and not balanced; this means that...