Apache Flume: Distributed Log Collection for Hadoop, Second Edition

By: Steve Hoffman, Steven Hoffman

Web logs to searchable UI


Let's simulate a web application by setting up a simple web server whose logs we want to stream into some searchable application. In this case, we'll be using a Kibana UI to perform ad hoc queries against Elasticsearch.

For this example, I'll start three servers in Amazon's Elastic Compute Cloud (EC2), as shown in this diagram:

Each server has a public IP (starting with 54) and a private IP (starting with 172). For interserver communication, I'll be using the private IPs in my configurations. My personal interaction with the web server (to simulate traffic) and with Kibana and Elasticsearch will require the public IPs, since I'm sitting in my house and not in Amazon's data centers.

Note

Pay careful attention to the IP addresses in the shell prompts, as we will be jumping from machine to machine, and I don't want you to get lost. For instance, on the Collector box, the prompt will contain its private IP:

[ec2-user@ip-172-31-26-205 ~]$
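
The hostname in that prompt follows the default EC2 pattern: ip- followed by the private IPv4 address, with dashes in place of dots. As a minimal sketch of how to read a prompt back into an address, here is a small Python helper. The function name private_ip_from_prompt is hypothetical and only for illustration, and it assumes the default IP-based hostname has not been changed on the instance.

```python
import re


def private_ip_from_prompt(prompt: str) -> str:
    """Return the private IPv4 address encoded in an ip-a-b-c-d hostname.

    Assumes the default EC2 IP-based hostname; a renamed host will not match.
    """
    match = re.search(r"ip-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})", prompt)
    if match is None:
        raise ValueError(f"no ip-a-b-c-d hostname found in {prompt!r}")
    # Join the four captured octets back into dotted notation.
    return ".".join(match.groups())


# The Collector prompt from the note above.
print(private_ip_from_prompt("[ec2-user@ip-172-31-26-205 ~]$"))  # 172.31.26.205
```

The resulting address starts with 172, so it is the private IP to use between servers, not the public one you would use from outside Amazon's network.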

If you try this out yourself...