Now, we will create a new Storm topology that reads the data from Kafka using KafkaSpout,
processes the server logfiles, and stores the processed data in MySQL for further analysis.
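To make the log-processing step concrete, the following is a minimal sketch of pulling fields such as the IP address, status code, referrer, and bytes sent out of a single server log line. It assumes the logs are in the Apache combined log format, which the text does not state. The class name LogLineParser, the field names, and the sample line are hypothetical and are not the book's own code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the book's bolt implementation.
class LogLineParser {

    // Assumption: Apache "combined" log format, that is:
    // ip identd user [timestamp] "request" status bytes "referrer" "user-agent"
    private static final Pattern COMBINED = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-) "
            + "\"([^\"]*)\" \"([^\"]*)\"$");

    // Returns the extracted fields, or null when the line does not match the assumed format.
    static Map<String, String> parse(String line) {
        Matcher m = COMBINED.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("ip", m.group(1));
        fields.put("timestamp", m.group(4));
        fields.put("request", m.group(5));
        fields.put("status", m.group(6));
        // Apache writes "-" when no bytes were sent; normalize that to "0".
        fields.put("bytesSent", "-".equals(m.group(7)) ? "0" : m.group(7));
        fields.put("referrer", m.group(8));
        fields.put("userAgent", m.group(9));
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed format.
        String sample = "127.0.0.1 - - [10/Oct/2013:13:55:36 -0700] "
            + "\"GET /index.html HTTP/1.1\" 200 2326 "
            + "\"http://example.com/start.html\" \"Mozilla/5.0\"";
        System.out.println(parse(sample));
    }
}
```

In a topology, a bolt would typically run this kind of parsing on each incoming tuple and emit the extracted fields downstream. The regular expression here does not handle quoted fields that themselves contain escaped double quotes, so a production parser would need to be more careful.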
In this section, we will write a bolt, ApacheLogSplitterBolt, which has the logic to extract the IP address, status code, referrer, bytes sent, and other such information from a server log line. We will create a new Maven project for this use case:
1. Create a new Maven project with com.learningstorm for groupId and stormlogprocessing for artifactId.
2. Add the following dependencies to the pom.xml file:
   <!-- Dependency for Storm -->
   <dependency>
     <groupId>storm</groupId>
     <artifactId>storm-core</artifactId>
     <version>0.9.0.1</version>
     <scope>provided</scope>
   </dependency>
   <dependency>
     <groupId>com.google.guava</groupId>
     <artifactId>guava</artifactId...