Storm Blueprints: Patterns for Distributed Real-time Computation

Architecture


The architecture for our application is relatively simple. We will create a Twitter client application that reads a subset of the Twitter firehose and writes each message to a Kafka queue as a JSON data structure. We'll then use the Kafka spout to feed that data into our Storm topology. Finally, our Storm topology will analyze the incoming messages and populate the graph database.
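To make the data flow concrete, here is a minimal sketch of the three stages using only the Java standard library. It is not the code used in this chapter: an in-memory BlockingQueue stands in for the Kafka queue, a plain list stands in for the graph database, and the names PipelineSketch, Message, toJson, publish, and consume are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: an in-memory queue plays the role of Kafka, and a
// list plays the role of the graph database. All names are hypothetical.
final class PipelineSketch {

    /** Simplified stand-in for a message read from the Twitter stream. */
    public static final class Message {
        final String user;
        final String text;

        public Message(String user, String text) {
            this.user = user;
            this.text = text;
        }
    }

    /** Minimal JSON string escaping: backslashes and double quotes only
     *  (control characters are not handled in this sketch). */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serializes a message as a flat JSON object. */
    public static String toJson(Message m) {
        return "{\"user\":\"" + escape(m.user) + "\",\"text\":\""
                + escape(m.text) + "\"}";
    }

    /** Stage 1: the client writes each message to the queue as JSON. */
    public static void publish(BlockingQueue<String> queue, List<Message> messages) {
        for (Message m : messages) {
            queue.add(toJson(m));
        }
    }

    /** Stages 2 and 3: drain the queue (the spout's job) and hand every
     *  JSON message to the downstream store (here, just a list). */
    public static List<String> consume(BlockingQueue<String> queue) {
        List<String> store = new ArrayList<>();
        queue.drainTo(store);
        return store;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        publish(queue, Arrays.asList(
                new Message("alice", "Learning Storm"),
                new Message("bob", "Kafka says \"hello\"")));
        for (String json : consume(queue)) {
            System.out.println(json);
        }
    }
}
```

The point of the sketch is the decoupling: the producer only knows how to serialize and enqueue, and the consumer only knows how to dequeue and process, which is the same separation the Kafka queue provides between the Twitter client and the Storm topology.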

The Twitter client

Twitter provides a comprehensive RESTful API that, in addition to a typical request-response interface, includes a streaming API that supports long-lived connections. The Twitter4J Java library (http://twitter4j.org/) offers full compatibility with the latest version of the Twitter API and hides the low-level details (connection management, OAuth authentication, and JSON parsing) behind a clean Java API. We will use Twitter4J to connect to the Twitter streaming API.

Kafka spout

In the previous chapter, we developed a Logback Appender extension that allowed us to easily publish...