Book Image

Real-time Analytics with Storm and Cassandra

By : Shilpi Saxena
Book Image

Real-time Analytics with Storm and Cassandra

By: Shilpi Saxena

Overview of this book

Table of Contents (19 chapters)
Real-time Analytics with Storm and Cassandra
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Setting up workers and parallelism to enhance processing


Storm is a highly scalable, distributed, and fault tolerant real-time parallel processing compute framework. Note that the emphasis is on scalability, distributed, and parallel processing—well, we already know that Storm operates in clustered mode and is therefore distributed in its basic nature. Scalability was covered in the previous section; now, let's have a closer look at parallelism. We introduced you to this concept in an earlier chapter, but now we'll get you acquainted with how to tweak it to achieve the desired performance. The following points are the key criteria for this:

  • A topology is allocated a certain number of workers at the time it's started.

  • Each component in the topology (bolts and spouts) has a specified number of executors associated with it. These executors specify the number or degree of parallelism for each running component of the topology.

  • The whole efficiency and speed factor of Storm are driven by the parallelism...