In this recipe, we'll take a look at how to bring Spark into our project (using SBT) and how Spark works internally.
Note
The code for this recipe can be found at https://github.com/arunma/ScalaDataAnalysisCookbook/blob/master/chapter1-spark-csv/build.sbt.
Let's now throw some Spark dependencies into our build.sbt file so that we can start playing with them in subsequent recipes. For now, we'll just focus on three of them: Spark Core, Spark SQL, and Spark MLlib. We'll take a look at a host of other Spark dependencies as we proceed further in this book:
1. Under a brand new folder (which will be your project root), create a new file called build.sbt.
2. Next, let's add the Spark libraries to the project dependencies.
Note that Spark 1.4.x requires Scala 2.10.x. This becomes the first section of our build.sbt:

organization := "com.packt"

name := "chapter1-spark-csv"

scalaVersion := "2.10.4"

val sparkVersion = "1.4.1"

libraryDependencies ++= Seq(
  "org.apache.spark...