In this chapter, we examined the sources of unstructured data and the motivation for analyzing it. We explained the techniques required for pre-processing unstructured data and how Spark provides most of these tools out of the box. We also covered some of the algorithms supported by Spark that can be used in text analytics.
In the next chapter, we will go through different types of visualization techniques that are insightful at different stages of the data analytics lifecycle.