Let's start with a simple streaming example: we will type some text in one terminal, and the Streaming application will capture it in another.
Start the Spark shell and give it some extra memory:
$ spark-shell --driver-memory 1G
Stream specific imports:
scala> import org.apache.spark.SparkConf
scala> import org.apache.spark.streaming.{Seconds, StreamingContext}
scala> import org.apache.spark.storage.StorageLevel
scala> import StorageLevel._
Imports for the implicit conversions:
scala> import org.apache.spark._
scala> import org.apache.spark.streaming._
scala> import org.apache.spark.streaming.StreamingContext._
Create a StreamingContext with a 2-second batch interval:
scala> val ssc = new StreamingContext(sc, Seconds(2))
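To make the batch interval concrete, here is a minimal plain-Scala sketch of how lines arriving over time fall into 2-second batches. It is only an illustration, not how Spark Streaming is implemented. The names Arrival and toBatches and the sample arrival times are hypothetical, and the code does not need Spark.

```scala
// Plain-Scala illustration (no Spark involved): assign each arriving line to a
// batch based on its arrival time and a fixed batch interval.
// `Arrival` and `toBatches` are hypothetical names, not part of the Spark API.
case class Arrival(timeMs: Long, line: String)

def toBatches(arrivals: Seq[Arrival], batchIntervalMs: Long): Map[Long, Seq[String]] = {
  arrivals
    .groupBy(a => a.timeMs / batchIntervalMs) // batch index = floor(time / interval)
    .map { case (idx, group) => idx -> group.sortBy(_.timeMs).map(_.line) }
}

// Made-up arrival times, in milliseconds since the stream started.
val arrivals = Seq(
  Arrival(100L, "to be"),
  Arrival(1900L, "or not to be"),
  Arrival(2500L, "that is the question")
)

val batches = toBatches(arrivals, batchIntervalMs = 2000L)

batches.toSeq.sortBy(_._1).foreach { case (idx, batchLines) =>
  println(s"batch $idx: ${batchLines.mkString(" | ")}")
}
// prints:
// batch 0: to be | or not to be
// batch 1: that is the question
```

The first two lines arrive within the first 2 seconds and land in the same batch. The third arrives later and starts the next one. With Seconds(2), the streaming application processes its input in a similar per-interval fashion.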
Create a SocketTextStream DStream on localhost, port 8585, with MEMORY_ONLY caching:
scala> val lines = ssc.socketTextStream("localhost", 8585, MEMORY_ONLY)
Divide the lines into multiple...