In this recipe, we will use the scatter() and plot() functions from the Scala Breeze linear algebra library (part of ScalaNLP) to draw a scatter plot from two-dimensional data. Once the results are computed on the Spark cluster, either the actionable data can be used in the driver for drawing, or a JPEG or GIF can be generated in the backend and pushed forward for efficiency and speed (an approach popular with GPU-based analytical databases such as MapD).
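As a rough illustration of drawing on the driver, the following minimal sketch plots a handful of points with Breeze's scatter() and plot(). It assumes the breeze-viz JAR is on the classpath; the sample points, the object name ScatterSketch, and the helper toAxes are made up for illustration and are not part of the recipe. In a real job, the pairs would more likely come from a small result collected from the Spark cluster.

```scala
import breeze.linalg.DenseVector
import breeze.plot._

object ScatterSketch {
  // Hypothetical helper: split (x, y) pairs into one Breeze vector per axis.
  def toAxes(points: Seq[(Double, Double)]): (DenseVector[Double], DenseVector[Double]) =
    (DenseVector(points.map(_._1).toArray), DenseVector(points.map(_._2).toArray))

  def main(args: Array[String]): Unit = {
    // Made-up two-dimensional sample data (an assumption, not from the recipe).
    val points = Seq((1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1), (5.0, 9.8))
    val (x, y) = toAxes(points)

    val fig = Figure()           // opens a plot window on the driver
    val chart = fig.subplot(0)   // first (and only) subplot

    // scatter() takes the two vectors plus a function giving the marker size per point index.
    chart += scatter(x, y, _ => 0.1)
    // plot() with the '-' style overlays a line through the same points.
    chart += plot(x, y, '-')

    chart.xlabel = "x"
    chart.ylabel = "y"
    chart.title = "Scatter plot sketch"
    fig.refresh()
  }
}
```

Keeping the data preparation in a small helper separates it from the drawing calls, so the same plotting code can be reused whether the points are hard-coded or collected from Spark.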
- First, we need to download the necessary ScalaNLP library. Download the JAR from the Maven repository available at https://repo1.maven.org/maven2/org/scalanlp/breeze-viz_2.11/0.12/breeze-viz_2.11-0.12.jar.
- Place the JAR in the C:\spark-2.0.0-bin-hadoop2.7\examples\jars directory on a Windows machine.
- In macOS, put the JAR in the corresponding path. For our example setup, the path is /Users/USERNAME/spark/spark-2.0.0-bin-hadoop2.7/examples/jars/.
- The following is the sample screenshot...
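The download and copy steps above can also be scripted. The following is a minimal sketch using only the Java standard library from Scala; the object name FetchBreezeViz and the use of a SPARK_HOME environment variable are assumptions made for illustration, while the URL and the examples/jars subdirectory come from the steps above.

```scala
import java.net.URL
import java.nio.file.{Files, Path, Paths, StandardCopyOption}

object FetchBreezeViz {
  // JAR location as given in the recipe.
  val jarUrl: String =
    "https://repo1.maven.org/maven2/org/scalanlp/breeze-viz_2.11/0.12/breeze-viz_2.11-0.12.jar"

  // Resolve <sparkHome>/examples/jars/<file name taken from the URL>.
  def targetPath(sparkHome: String): Path = {
    val fileName = jarUrl.substring(jarUrl.lastIndexOf('/') + 1)
    Paths.get(sparkHome, "examples", "jars", fileName)
  }

  // Download the JAR and write it to the target path, replacing any older copy.
  def fetch(sparkHome: String): Path = {
    val target = targetPath(sparkHome)
    Files.createDirectories(target.getParent)
    val in = new URL(jarUrl).openStream()
    try Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
    target
  }

  def main(args: Array[String]): Unit = {
    // Assumption: SPARK_HOME points at the Spark installation directory.
    val sparkHome = sys.env.getOrElse("SPARK_HOME", sys.error("SPARK_HOME is not set"))
    println(s"Saved ${fetch(sparkHome)}")
  }
}
```

Because the target directory is built with Paths.get, the same code resolves to the Windows or macOS layout depending on where it runs.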