Book Image

Real-time Analytics with Storm and Cassandra

By : Shilpi Saxena
Book Image

Real-time Analytics with Storm and Cassandra

By: Shilpi Saxena

Overview of this book

Table of Contents (19 chapters)
Real-time Analytics with Storm and Cassandra
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Stream groupings


Next we need to get acquainted with various stream groupings (a stream grouping is basically the mechanism that defines how Storm partitions and distributes the streams of tuples amongst tasks of bolts) provided by Storm. Streams are the basic wiring component of a Storm topology, and understanding them provides a lot of flexibility to the developer to handle various problems in programs efficiently.

Local or shuffle grouping

Local or shuffle grouping is the most common grouping that randomly distributes the tuples emitted by the source ensuring equal distribution, that is, each instance of the bolt gets to process the same number of events. Load balancing is automatically taken care of by this grouping.

Due to the random nature of distribution of this grouping, it's useful only for atomic operations by specifying a single parameter—source of stream. The following snippet is from WordCount topology (which we reated earlier), which demonstrates the usage of shuffle grouping...