Apache Spark 2.x Cookbook
Spark comes bundled with a read–eval–print loop (REPL) shell, which is a wrapper around the Scala shell. Although the Spark shell may look like a command line for simple tasks, many complex queries can also be executed from it. The Spark shell is often used during the initial development phase; once the code has stabilized, it is written as a class file and bundled into a jar to be run with the spark-submit command. This chapter explores the different development environments in which Spark applications can be developed.
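The workflow above can be sketched as a pair of commands; the application class, jar name, and master URL are illustrative placeholders, not values from the book:

```
# Interactive development: launch the REPL (a wrapper around the Scala shell)
$ spark-shell

# Once the code is stabilized and bundled as a jar, run it with spark-submit
# (--class, --master, and the jar name shown here are placeholders)
$ spark-submit --class com.example.WordCount --master local[2] wordcount.jar
```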
Hadoop MapReduce's word count, which requires at least three class files and one configuration file, namely the project object model (POM), becomes very simple with the Spark shell. In this recipe, we are going to create a simple one-line text file, upload it to the Hadoop Distributed File System (HDFS), and use Spark to count the occurrences of words. Let's see how:
Create the words directory using the following command:
$ mkdir words
...
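The rest of the recipe can be sketched as the following session; the file contents, HDFS path, and hostname are illustrative assumptions, not values from the book:

```
# Create a one-line text file and upload it to HDFS (paths are illustrative)
$ echo "to be or not to be" > words/words.txt
$ hdfs dfs -put words /user/hduser/words

# In the Spark shell, load the file and count word occurrences
$ spark-shell
scala> val words = sc.textFile("hdfs://localhost:9000/user/hduser/words")
scala> val counts = words.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _)
scala> counts.collect.foreach(println)
```

Here flatMap splits each line into words, map pairs every word with a count of 1, and reduceByKey sums the counts per word.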