In the previous chapters, we discussed how real datasets are very often high-dimensional representations of samples that lie on low-dimensional manifolds (this is one of the semi-supervised pattern's assumptions, but it's generally true). As the complexity of a model is proportional to the dimensionality of the input data, many techniques have been analyzed and optimized in order to reduce the actual number of valid components. For example, PCA selects the features according to the relative explained variance, while ICA and generic dictionary learning techniques look for basic atoms that can be combined to rebuild the original samples. In this chapter, we are going to analyze a family of models based on a slightly different approach, but whose capabilities are dramatically increased by the employment of deep learning methods.
A generic autoencoder is a model that is split into two separate (but not completely autonomous) components called an Encoder and a Decoder. The task of...