Building Python Real-Time Applications with Storm

By: Kartik Bhatnagar, Barry Hart

Overview of this book

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today. It has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily. You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios for creating a Storm cluster. Next, you will get an overview of Petrel, followed by an example Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.
Table of Contents (14 chapters)

Installing Winpdb

Activate your Petrel virtual environment and then use pip to install it:

source <virtualenv directory>/bin/activate
pip install winpdb
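
To confirm that the install landed in the active virtual environment, a quick check along these lines can help. This is only a sketch: the helper name winpdb_available is made up for illustration, and the only assumption is that Winpdb provides the rpdb2 module used later in this section.

```python
def winpdb_available():
    """Return True if the rpdb2 module (installed with Winpdb) can be imported."""
    try:
        import rpdb2  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("rpdb2 importable:", winpdb_available())
```

If this reports False, the virtual environment was probably not active when pip ran.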

Adding a Winpdb breakpoint

In the file, add the following at the beginning of the run() function:

import rpdb2
rpdb2.start_embedded_debugger('password')

The 'password' value can be anything; this is simply the password that you will use in the next step to attach to the script.

When this line of code executes, the script will freeze for a default period of 5 minutes, waiting for a debugger to attach.
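
Because the script blocks for up to 5 minutes each time this line runs, it may be convenient to make the breakpoint opt-in. The sketch below is one possible approach, not something Petrel or Winpdb provides: the helper maybe_wait_for_debugger and the PETREL_DEBUG environment variable are hypothetical names, and it assumes the variable is visible to the process that runs the component. The timeout argument of rpdb2.start_embedded_debugger is given in seconds.

```python
import os


def maybe_wait_for_debugger(password="password", timeout=300, env_var="PETREL_DEBUG"):
    """Start the embedded debugger only when env_var is set to "1".

    The helper and env_var are hypothetical names used for illustration.
    Returns True if the embedded debugger was started, False otherwise.
    """
    if os.environ.get(env_var) != "1":
        return False  # debugging not requested: do not block the component

    # Imported lazily so the component still runs where Winpdb is not installed.
    import rpdb2

    # Blocks until a debugger attaches or `timeout` seconds elapse.
    rpdb2.start_embedded_debugger(password, timeout=timeout)
    return True
```

Calling maybe_wait_for_debugger() at the beginning of run() then behaves like the original line when PETREL_DEBUG=1 is set, and returns immediately otherwise.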

Launching and attaching the debugger

Now run the topology:

petrel submit --config topology.yaml

Once you see log messages from the spout, you will know that the topology is up and running, so you can connect with the debugger.

Launch Winpdb simply by running winpdb.

For more details on how to use Winpdb for embedded debugging, see the documentation at

When the window appears, select File...