First, we need to configure the infrastructure. Since Storm will run on the YARN infrastructure, we will first configure YARN and then show how to configure Storm-YARN for deployment on that cluster.
To configure a set of machines, you will need a copy of Hadoop residing on them or a copy that is accessible to each of them. First, download the latest copy of Hadoop and unpack the archive. For this example, we will use Version 2.1.0-beta.
Assuming that you have uncompressed the archive into /home/user/hadoop, add the following environment variables on each of the nodes in the cluster:
export HADOOP_PREFIX=/home/user/hadoop
export HADOOP_YARN_HOME=/home/user/hadoop
export HADOOP_CONF_DIR=/home/user/hadoop/etc/hadoop
Add YARN to your execute path as follows:
export PATH=$PATH:$HADOOP_YARN_HOME/bin
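Before moving on, it can help to confirm that each node picked up these settings. The following is a minimal sketch of such a check. The function name check_hadoop_env is hypothetical and is not part of Hadoop or Storm-YARN. It only assumes the three variables and the PATH entry described above.

```shell
# Hypothetical helper (not shipped with Hadoop): report any of the variables
# above that are unset or point to a missing directory, and confirm that the
# yarn command can be found on PATH. Returns 0 only when every check passes.
check_hadoop_env() {
    status=0
    for pair in \
        "HADOOP_PREFIX=${HADOOP_PREFIX:-}" \
        "HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-}" \
        "HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-}"
    do
        name=${pair%%=*}    # text before the first "="
        value=${pair#*=}    # text after the first "="
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value"
            status=1
        fi
    done
    if ! command -v yarn >/dev/null 2>&1; then
        echo "yarn is not on PATH"
        status=1
    fi
    return "$status"
}
```

Running check_hadoop_env in a fresh shell on each node prints one line per problem and nothing when the environment looks correct. On a case-sensitive filesystem, the directory test also catches a mistyped path.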
All the Hadoop configuration files are located in $HADOOP_CONF_DIR. The three key configuration files for this example are: core-site.xml, yarn-site...