Book Image

Storm Blueprints: Patterns for Distributed Real-time Computation

Book Image

Storm Blueprints: Patterns for Distributed Real-time Computation

Overview of this book

Table of Contents (17 chapters)
Storm Blueprints: Patterns for Distributed Real-time Computation
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Implementing the design


Let's first examine the real-time portion of the system beginning with the spout through to the Druid persistence. The topology is straightforward and mimics topologies we have written in previous chapters.

The following are the critical lines of the topology:

TwitterSpout spout = new TwitterSpout();
Stream inputStream = topology.newStream("nlp", spout);
try {
inputStream.each(new Fields("tweet"), new TweetSplitterFunction(), new Fields("word"))
          .each(new Fields("searchphrase", "tweet", "word"), new WordFrequencyFunction(), new Fields("baseline"))
          .each(new Fields("searchphrase", "tweet", "word", "baseline"), new PersistenceFunction(), new Fields())	
          .partitionPersist(new DruidStateFactory(), new Fields("searchphrase", "tweet", "word", "baseline"), new DruidStateUpdater());
} catch (IOException e) {
throw new RuntimeException(e);
}
return topology.build();

In the end, after parsing and enrichment, the tuples have four fields as shown in...