Deep convolutional networks
In the previous chapter, we saw how a multi-layer perceptron can achieve very high accuracy on an image dataset that is not very complex, such as the MNIST handwritten digits dataset. However, as fully-connected layers operate on flat vectors, images, which are in general three-dimensional structures (width × height × channels), must be flattened into one-dimensional arrays, where the geometric properties are irreversibly lost.
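To make this concrete, here is a minimal sketch (using NumPy, and an MNIST-sized 28 × 28 grayscale image as an assumed example) of how flattening destroys spatial adjacency: two pixels that are vertical neighbors in the image end up 28 positions apart in the flat array, so a fully-connected layer sees no trace of their proximity.

```python
import numpy as np

# A toy grayscale image the size of an MNIST digit: 28 x 28
image = np.zeros((28, 28), dtype=np.float32)

# Flattening turns the 2D grid into a plain 1D array of 784 values
flat = image.reshape(-1)
print(flat.shape)  # (784,)

# Pixels (10, 5) and (11, 5) are vertical neighbors in the image,
# but in the flattened array their indices differ by a full row width
idx_a = 10 * 28 + 5
idx_b = 11 * 28 + 5
print(idx_b - idx_a)  # 28: adjacency in 2D is not visible in 1D
```

Every weight of the first fully-connected layer treats these 784 inputs as unrelated scalars, which is exactly the information a convolutional layer is designed to preserve.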
With more complex datasets, where the distinction between classes depends on fine details and on their spatial relationships, this approach can yield moderate accuracy, but it can never reach the precision required by production-ready applications.
The conjunction of neuroscientific studies and image-processing techniques suggested experimenting with neural networks whose first layers work with two-dimensional structures (leaving the channels aside), trying to extract a hierarchy of features...
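The kind of two-dimensional operation such layers perform can be sketched with a plain NumPy cross-correlation; the `conv2d` helper and the Sobel-style vertical-edge kernel below are illustrative assumptions, not the book's implementation, but they show how a small 2D filter slid over the image responds strongly only where a spatial feature (here, a vertical edge) is present.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation (illustrative helper)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Dot product between the kernel and the patch under it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# An image with a sharp left (0) / right (1) split: one vertical edge
image = np.hstack([np.zeros((5, 3)), np.ones((5, 3))])

# Sobel-like kernel that responds to vertical edges
sobel_x = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])

response = conv2d(image, sobel_x)
print(response.max())  # 4.0 along the edge; 0.0 on the flat regions
```

The same small set of kernel weights is reused at every spatial position, which is what lets convolutional layers exploit the geometry that flattening throws away.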