In this chapter, we've reviewed many ideas for exploring data quality and data content. We have also introduced tools and techniques for working with GDELT, with the aim of encouraging the reader to expand their own investigations. We have demonstrated rapid development in Zeppelin, and written much of our code in SparkSQL to demonstrate the excellent portability of this method. As the GKG files are so complex in terms of content, much of the rest of this book is dedicated to in-depth analyses that move beyond exploration, and we step away from SparkSQL as we dig deeper into the Spark codebase.
In the next chapter, Chapter 5, Spark for Geographic Analysis, we will explore GeoMesa, an ideal tool for managing and exploring the GeoGCAM dataset created in this chapter. We will also explore the GeoServer and GeoTools toolsets to further expand our knowledge of spatio-temporal exploration and visualization.