Storm Blueprints: Patterns for Distributed Real-time Computation

Twitter graph topology


The Twitter graph topology will read raw tweet data from the Kafka queue, parse out the relevant information, and then create nodes and relationships in the Titan graph database. Instead of writing to the graph database once for each tuple received, we will implement a Trident state that performs persistence operations in bulk, one batch at a time, using Trident's transaction mechanism.

This approach offers several benefits. First, for graph databases that support transactions, such as Titan, we can leverage this capability to provide additional exactly-once processing guarantees. Second, it allows us to perform a bulk write followed by a bulk commit (when supported) for an entire batch of tuples, rather than a write and a commit for each individual tuple. Finally, by using the generic Blueprints API, our Trident state implementation will be largely agnostic to the underlying graph database implementation, allowing any Blueprints graph database backend to be easily...
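The first two benefits can be illustrated with a minimal, dependency-free sketch. It is not the book's implementation and uses neither Storm nor Titan: `InMemoryGraph` and `GraphStateSketch` are hypothetical names invented for this example, and only the `beginCommit(txid)` and `commit(txid)` method names are borrowed from Trident's `State` interface. The sketch assumes each batch arrives with a transaction ID, writes every edge in the batch before issuing a single commit, and skips a batch whose transaction ID has already been committed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BulkGraphWriteSketch {

    /** Hypothetical in-memory stand-in for a transactional graph; not the Blueprints or Titan API. */
    static class InMemoryGraph {
        final List<String> pending = new ArrayList<String>();
        final List<String> committed = new ArrayList<String>();
        int commitCount = 0;

        // Stage an edge; it only becomes visible after commit().
        void addEdge(String from, String to) {
            pending.add(from + "->" + to);
        }

        void commit() {
            committed.addAll(pending);
            pending.clear();
            commitCount++;
        }
    }

    /** Hypothetical state object; beginCommit/commit mirror the method names of Trident's State. */
    static class GraphStateSketch {
        private final InMemoryGraph graph;
        private long lastCommittedTxid = -1L; // assumption: real transaction IDs are non-negative
        private boolean replay = false;

        GraphStateSketch(InMemoryGraph graph) {
            this.graph = graph;
        }

        void beginCommit(Long txid) {
            // A batch whose txid was already committed is a replay and must not be written again.
            replay = (txid.longValue() == lastCommittedTxid);
        }

        void updateBatch(List<String[]> edges) {
            if (replay) {
                return;
            }
            for (String[] edge : edges) {
                graph.addEdge(edge[0], edge[1]); // bulk write, nothing committed yet
            }
        }

        void commit(Long txid) {
            if (replay) {
                return;
            }
            graph.commit(); // one commit for the whole batch
            lastCommittedTxid = txid.longValue();
        }
    }

    public static void main(String[] args) {
        InMemoryGraph graph = new InMemoryGraph();
        GraphStateSketch state = new GraphStateSketch(graph);
        List<String[]> batch = Arrays.asList(
            new String[] {"alice", "bob"},
            new String[] {"alice", "carol"},
            new String[] {"bob", "carol"});

        // First delivery of batch 1, followed by a replay of the same batch.
        for (int attempt = 0; attempt < 2; attempt++) {
            state.beginCommit(1L);
            state.updateBatch(batch);
            state.commit(1L);
        }
        // Prints: 3 edges, 1 commit(s)
        System.out.println(graph.committed.size() + " edges, " + graph.commitCount + " commit(s)");
    }
}
```

The sketch keeps the last committed transaction ID in memory only. For the replay check to survive a worker restart, that ID would presumably have to be stored durably, for example in the same database transaction as the batch itself.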