From the previous chapter, we know that image classification only really handles the case where an input image contains a single instance of a class. Even then, it gives only a coarse output: it tells us what object is present in the image, but not where it is. A more interesting scenario is finding where all instances of a class, or even of multiple different classes, are located in an input image.
To deal with this more challenging problem, object detection and segmentation come into the picture. Until recently, both were very hard areas of computer vision. However, applying convolutional neural networks (CNNs) to them has attracted a lot of attention in recent years, and as a result, for many practical purposes, these problems can now be considered largely solved. In this chapter we will see how CNNs have managed to tackle these difficult tasks so well.
The following image shows how the different solutions compare...