Storm Blueprints: Patterns for Distributed Real-time Computation

Configuring the infrastructure


First, we need to configure the infrastructure. Because Storm will run on YARN, we will configure YARN first and then show how to configure Storm-YARN for deployment on that cluster.

The Hadoop infrastructure

To configure a set of machines, each machine needs a copy of Hadoop, either installed locally or in a location accessible to it. First, download a Hadoop release and unpack the archive. For this example, we will use Version 2.1.0-beta.

Assuming that you have uncompressed the archive into /home/user/hadoop, add the following environment variables on each of the nodes in the cluster:

export HADOOP_PREFIX=/home/user/hadoop
export HADOOP_YARN_HOME=/home/user/hadoop
export HADOOP_CONF_DIR=/home/user/hadoop/etc/hadoop
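
Because the same three variables must be set on every node, it can help to derive them from a single base path so they cannot drift apart. The following is a minimal sketch, not part of the original setup: HADOOP_BASE is a hypothetical variable name introduced here for illustration, and the snippet assumes the stock tarball layout, in which the configuration directory is the lowercase etc/hadoop under the install root.

```shell
# HADOOP_BASE is a hypothetical helper variable (an assumption, not a Hadoop setting).
# It defaults to the install location used in this example.
HADOOP_BASE="${HADOOP_BASE:-/home/user/hadoop}"

# Derive all three variables from the single base path
export HADOOP_PREFIX="$HADOOP_BASE"
export HADOOP_YARN_HOME="$HADOOP_BASE"
export HADOOP_CONF_DIR="$HADOOP_BASE/etc/hadoop"

# Warn (rather than abort) if the configuration directory is missing on this node
if [ ! -d "$HADOOP_CONF_DIR" ]; then
    echo "warning: $HADOOP_CONF_DIR does not exist on this node" >&2
fi
```

Placing a snippet like this in a file that each node's login shell sources (for example, ~/.bashrc) is one way to keep the settings identical across the cluster.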

Add the YARN executables to your PATH as follows:

export PATH=$PATH:$HADOOP_YARN_HOME/bin
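
If the export above is placed in a file that is sourced more than once, PATH accumulates duplicate entries. The sketch below is one hedged way to make the step idempotent and to check the result. YARN_BIN is a hypothetical variable name used only for illustration, and the fallback path assumes the install location from this example.

```shell
# YARN_BIN is a hypothetical helper variable; fall back to the example
# install path if HADOOP_YARN_HOME has not been set in this shell.
YARN_BIN="${HADOOP_YARN_HOME:-/home/user/hadoop}/bin"

# Append the directory only when it is not already on PATH
case ":$PATH:" in
    *":$YARN_BIN:"*) ;;                      # already present, nothing to do
    *) export PATH="$PATH:$YARN_BIN" ;;
esac

# Report whether the yarn launcher can now be resolved
if command -v yarn >/dev/null 2>&1; then
    echo "yarn found at $(command -v yarn)"
else
    echo "yarn not found on PATH; check $YARN_BIN" >&2
fi
```

Running the snippet a second time leaves PATH unchanged, which makes it safe to keep in a shell startup file.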

All the Hadoop configuration files are located in $HADOOP_CONF_DIR. The three key configuration files for this example are: core-site.xml, yarn-site...