Types of data reduction
There are two types of data reduction methods: numerosity data reduction and dimensionality data reduction. As their names suggest, the former reduces the number of data objects (rows) in a dataset, while the latter reduces the number of dimensions (attributes) in a dataset.
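To make the distinction concrete, here is a minimal sketch using only the Python standard library. The toy dataset, the attribute names (`age`, `income`, `city`), and the choice of which rows and attributes to keep are all assumptions made for illustration; the point is only how the shape of the data changes.

```python
# Toy dataset: each dict is a data object (row); each key is an attribute (dimension).
rows = [
    {"age": 25, "income": 40_000, "city": "A"},
    {"age": 32, "income": 52_000, "city": "B"},
    {"age": 47, "income": 61_000, "city": "A"},
    {"age": 51, "income": 58_000, "city": "C"},
]

# Numerosity reduction: fewer rows, same attributes.
fewer_rows = rows[::2]  # keep every other row, purely for illustration

# Dimensionality reduction: same rows, fewer attributes.
keep = ("age", "income")  # hypothetical choice of attributes to retain
fewer_attrs = [{k: r[k] for k in keep} for r in rows]

print(len(fewer_rows), len(fewer_rows[0]))    # 2 rows, 3 attributes
print(len(fewer_attrs), len(fewer_attrs[0]))  # 4 rows, 2 attributes
```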
In this chapter, we will cover three methods for numerosity reduction and six methods for dimensionality reduction. The following are the numerosity reduction methods we will cover:
- Random Sampling: Randomly selecting some of the data objects to avoid unaffordable computational costs.
- Stratified Sampling: Randomly selecting some of the data objects to avoid unaffordable computational costs, while preserving the proportions of the sub-populations in the sample.
- Random Over/Under Sampling: Randomly selecting some of the data objects to avoid the unaffordable computational costs...
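
The first two methods in the list can be sketched in a few lines. This is a minimal illustration that assumes the data is a list of dicts and uses only the standard library; the helper names `random_sample` and `stratified_sample`, the `label` attribute, and the 80/20 toy data are hypothetical and chosen for this example only.

```python
import random
from collections import defaultdict

def random_sample(rows, n, seed=0):
    """Pick n rows uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, n)

def stratified_sample(rows, key, frac, seed=0):
    """Sample the same fraction from every sub-population defined by `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for members in groups.values():
        k = round(len(members) * frac)  # rows to draw from this sub-population
        sample.extend(rng.sample(members, k))
    return sample

# Toy data: 20 rows labelled "yes" and 80 labelled "no" (a 20/80 split).
data = [{"id": i, "label": "yes" if i < 20 else "no"} for i in range(100)]

simple = random_sample(data, 10)
strat = stratified_sample(data, key="label", frac=0.1)

# The stratified sample keeps the 20/80 ratio exactly; the simple one need not.
print(sum(r["label"] == "yes" for r in strat), len(strat))  # 2 10
```

Both functions return 10 of the 100 rows here. The difference is that the stratified version draws from each sub-population separately, so the sample's label ratio matches the full dataset, whereas plain random sampling only matches it on average.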