Building Python Real-Time Applications with Storm

By: Kartik Bhatnagar, Barry Hart

Overview of this book

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today. It has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily. You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios for creating a Storm cluster. Next, you’ll get an overview of Petrel, followed by an example Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.
Table of Contents (14 chapters)

Twitter analysis

Most of you have heard of Twitter, but if you have not, here is how Wikipedia describes it:

"an online social networking service that enables users to send and read short 140-character messages called 'tweets'."

In 2013, users posted 400 million messages per day on Twitter. Twitter offers an API that gives developers real-time access to streams of tweets, and messages on Twitter are public by default. The volume of messages, the availability of an API, and the public nature of tweets combine to make Twitter a valuable source of insights on current events, topics of interest, public sentiment, and so on.

Storm was originally developed at BackType to process tweets, and Twitter analysis is still a popular use case of Storm. You can see several examples on the Storm website at

The topology in this chapter demonstrates how to read from Twitter's real-time streaming API and compute a ranking of the most popular words. It's a Python...
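To make the idea of ranking popular words concrete, here is a minimal plain-Python sketch. It does not use Storm, Petrel, or the Twitter API: the `rank_words` function and the `sample_tweets` list are hypothetical names invented for illustration, and the tokenizing rule (lowercase runs of letters) is an assumption rather than the book's approach.

```python
import re
from collections import Counter


def rank_words(tweets, top_n=3):
    """Return the top_n most frequent words across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Assumed tokenizer: lowercase the text and keep runs of letters only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(words)
    return counts.most_common(top_n)


# Hypothetical sample data standing in for a live tweet stream.
sample_tweets = [
    "Storm makes stream processing easy",
    "Analysis with Storm and Python",
    "Python and Storm work well together",
]

top_word = rank_words(sample_tweets, top_n=1)
print(top_word)  # [('storm', 3)]
```

In a real topology, the counting would typically be split across components that run continuously on an unbounded stream, rather than over a fixed list as in this sketch.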