Before we dive into Structured Streaming, let's start by talking about DStreams. DStreams are built on top of RDDs and represent a stream of data divided into small chunks. The following figure represents these chunks as micro-batches, whose intervals can range from milliseconds to seconds. In this example, the lines of the DStream are micro-batched at one-second intervals, where each square represents the micro-batch of events that occurred within that one-second window:
- At time interval 1 second, there are five occurrences of the event blue and three occurrences of the event green
- At time interval 2 seconds, there is a single occurrence of the event gohawks
- At time interval 4 seconds, there are two occurrences of the event green
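
To make the idea of micro-batching concrete, here is a minimal plain-Python sketch that groups timestamped events into one-second buckets and counts them, reproducing the counts listed above. It does not use the Spark API; the `events` list, its timestamps, and the `micro_batch` helper are all hypothetical and exist only for illustration. It also assumes that an event with timestamp `t` belongs to the interval `floor(t)`.

```python
from collections import Counter, defaultdict

# Hypothetical (timestamp_in_seconds, event_name) pairs chosen to match
# the counts described for the figure; the exact timestamps are made up.
events = (
    [(1.0 + 0.1 * i, "blue") for i in range(5)]
    + [(1.5 + 0.1 * i, "green") for i in range(3)]
    + [(2.3, "gohawks")]
    + [(4.2, "green"), (4.7, "green")]
)


def micro_batch(events, batch_seconds=1):
    """Group events into fixed-length intervals and count each event name.

    Assumption: an event with timestamp t falls into interval
    floor(t / batch_seconds).
    """
    batches = defaultdict(Counter)
    for ts, name in events:
        interval = int(ts // batch_seconds)
        batches[interval][name] += 1
    return dict(batches)


batches = micro_batch(events)
for interval in sorted(batches):
    print(interval, dict(batches[interval]))
```

No events fall into interval 3, so it has no entry in `batches` in this sketch.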