In this section, we continue to demonstrate Spark's computation speed and ease of coding with a real-life movie recommendation project, this time completed with SPSS on Apache Spark.
SPSS is a widely used software package for statistical analysis. The name originally stood for Statistical Package for the Social Sciences, but the software is also used by market researchers, health researchers, survey companies, government agencies, education researchers, marketing organizations, data miners, and others. Long produced by SPSS Inc., it was acquired by IBM in 2009. Since then, IBM has developed it further and turned it into a popular tool for data scientists and machine learning professionals. To make Spark available to SPSS users, IBM developed technologies that make integrating SPSS with Spark easy, which will be covered in this chapter.