
Introducing the log analysis topology


With the means to write our log data to Kafka, we're ready to turn our attention to the implementation of a Trident topology to perform the analytical computation. The topology will perform the following operations:

  1. Receive and parse the raw JSON log event data.

  2. Extract and emit necessary fields.

  3. Update an exponentially weighted moving average (EWMA) of the event rate (a sketch of this calculation follows this list).

  4. Determine if the moving average has crossed a specified threshold.

  5. Filter out events that do not represent a state change (that is, pass on only those events where the rate moved above or below the threshold).

  6. Send an instant message (XMPP) notification.
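The heart of the computation is the moving average in step 3. As a rough illustration of the underlying calculation (not the book's implementation; the class and field names here are hypothetical), an exponentially weighted moving average with time-based decay can be updated as each event arrives:

    /**
     * Illustrative sketch of an exponentially weighted moving average with
     * time-based decay. The window length and all names are assumptions
     * used for explanation only.
     */
    public class Ewma {
        private final double windowMillis;  // decay window, for example one hour
        private double average = 0.0;       // current moving average
        private long lastTimestamp = -1;    // timestamp of the previous sample

        public Ewma(double windowMillis) {
            this.windowMillis = windowMillis;
        }

        /** Incorporate a new sample and return the updated average. */
        public double update(long timestampMillis, double value) {
            if (lastTimestamp < 0) {
                average = value;             // the first sample seeds the average
            } else {
                long delta = timestampMillis - lastTimestamp;
                // Older samples decay exponentially with the time elapsed since the last update.
                double alpha = 1.0 - Math.exp(-(double) delta / windowMillis);
                average = (alpha * value) + ((1.0 - alpha) * average);
            }
            lastTimestamp = timestampMillis;
            return average;
        }
    }

Steps 4 and 5 then reduce to comparing the returned average against the configured threshold and remembering whether the previous comparison was above or below it, so that only transitions are passed downstream.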

The topology is depicted in the following diagram with the Trident stream operations at the top and stream processing components at the bottom:

Kafka spout

The first step in creating the log analysis topology is to configure the Kafka spout, which streams the data received from Kafka into our topology, as follows:

        TridentTopology topology = new TridentTopology();

        StaticHosts kafkaHosts = KafkaConfig...
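The preceding snippet is cut off. As a rough sketch of how the spout wiring might look (this uses the storm-kafka Trident spout with ZkHosts rather than the StaticHosts approach begun above, and the ZooKeeper address, topic name, and stream name are placeholders rather than values from the book), the configuration could continue like this:

    import backtype.storm.spout.SchemeAsMultiScheme;
    import storm.kafka.BrokerHosts;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;
    import storm.kafka.trident.OpaqueTridentKafkaSpout;
    import storm.kafka.trident.TridentKafkaConfig;
    import storm.trident.Stream;
    import storm.trident.TridentTopology;

    // Inside the topology-building method:
        TridentTopology topology = new TridentTopology();

        // Locate the Kafka brokers through ZooKeeper (address is a placeholder).
        BrokerHosts kafkaHosts = new ZkHosts("localhost:2181");

        // Consume the topic the log events were written to (topic name assumed).
        TridentKafkaConfig spoutConf = new TridentKafkaConfig(kafkaHosts, "log-analysis");
        spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme());

        // An opaque transactional spout provides exactly-once processing semantics in Trident.
        OpaqueTridentKafkaSpout spout = new OpaqueTridentKafkaSpout(spoutConf);

        // Attach the spout to the topology as a named stream for the later operations.
        Stream logStream = topology.newStream("log-analysis", spout);

The remaining operations of the topology (parsing the JSON, updating the moving average, the threshold filter, and the XMPP notification function) are then chained onto logStream.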