Storm Real-time Processing Cookbook

By: Quinton Anderson

Overview of this book

Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!

Storm Real-time Processing Cookbook offers basic to advanced recipes on Storm for real-time computation.

The book begins with setting up the development environment and then teaches log stream processing. This is followed by a real-time payments workflow, distributed RPC, integrating Storm with other software such as Hadoop and Apache Camel, and more.
Counting and persisting log statistics


Many statistics can be gathered for log streams. To illustrate the concept, this recipe deals with only a single time series (log volume per minute); the same design and approach carry over to implementing other analyses.
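
To make the idea of a per-minute volume series concrete, here is a minimal standalone sketch in plain Java, independent of Storm and Cassandra. The class and method names (`LogVolumePerMinute`, `minuteBucket`, `countPerMinute`) are hypothetical and are not taken from the recipe; the sketch assumes each log entry carries a `java.util.Date` timestamp and that a minute is identified by the number of whole minutes since the Unix epoch.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: counts log entries per minute in memory.
public class LogVolumePerMinute {

    // Map a timestamp to its minute bucket (whole minutes since the Unix epoch).
    // Assumption: timestamps are at or after the epoch, so truncation rounds down.
    public static long minuteBucket(Date time) {
        return TimeUnit.MILLISECONDS.toMinutes(time.getTime());
    }

    // Build the time series: minute bucket -> number of log entries in that minute.
    public static Map<Long, Long> countPerMinute(Iterable<Date> logTimes) {
        Map<Long, Long> counts = new TreeMap<>();
        for (Date t : logTimes) {
            // Increment the counter for this entry's minute, starting at 1.
            counts.merge(minuteBucket(t), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two entries in minute 0 (0 s and 30 s) and one in minute 1 (61 s).
        List<Date> times = Arrays.asList(
                new Date(0L), new Date(30_000L), new Date(61_000L));
        System.out.println(countPerMinute(times)); // prints {0=2, 1=1}
    }
}
```

In the recipe itself the counts are not held in a local map. The bolt emits a row key, a column, and an increment so that the counters can be persisted downstream. The in-memory map here only shows the bucketing and counting logic.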

How to do it…

  1. Download and install the storm-cassandra contrib project into your Maven repository:

    git clone https://github.com/quintona/storm-cassandra
    cd storm-cassandra
    mvn clean install
    
  2. Create a new BaseRichBolt class called VolumeCountingBolt in the storm.cookbook.log package. The bolt must declare three output fields:

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields(FIELD_ROW_KEY, FIELD_COLUMN, FIELD_INCREMENT));
    }

  3. Then implement a static utility method to derive the minute representation of the log's time:

    public static Long getMinuteForTime(Date time) {
          Calendar c = Calendar.getInstance();
          c.setTime...