Building Python Real-Time Applications with Storm

By: Kartik Bhatnagar, Barry Hart

Overview of this book

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today. It has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily. You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios for creating a Storm cluster. Next, you’ll get an overview of Petrel, followed by an example Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.

Until now, we've debugged topologies using log messages and automated tests. These techniques are very powerful, but sometimes it may be necessary to debug directly inside the Storm environment. For example, the problem may:

  • Depend on running as a particular user

  • Occur only with real data

  • Occur only when there are many instances of the component running in parallel

This section introduces a tool for debugging inside Storm.

Winpdb is a portable, GUI-based debugger for Python with support for embedded debugging. If you're not familiar with the term "embedded debugging", it simply means that Winpdb can attach to a program that was launched in some other way, not necessarily from Winpdb or your command shell. For this reason, it is a good fit for debugging Petrel components that run in Storm.
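
To make the idea of embedded debugging concrete, here is a minimal sketch of how a component might opt in to a debug session. It assumes the `rpdb2` module that ships with Winpdb is installed on the worker machine. The helper name `maybe_start_debugger` and the environment variable `PETREL_DEBUG_PASSWORD` are hypothetical, chosen only for illustration; they are not part of Petrel or Storm.

```python
import os


def maybe_start_debugger(env=None, starter=None):
    """Wait for Winpdb to attach, but only when a password is configured.

    PETREL_DEBUG_PASSWORD is a hypothetical variable name used in this sketch.
    Returns True if the embedded debugger was started, False otherwise.
    """
    env = os.environ if env is None else env
    password = env.get("PETREL_DEBUG_PASSWORD")
    if not password:
        return False  # normal run: no debugger, no delay
    if starter is None:
        # rpdb2 is the debugger engine installed alongside Winpdb.
        import rpdb2
        starter = rpdb2.start_embedded_debugger
    # Blocks here until a Winpdb client attaches or rpdb2's timeout expires.
    starter(password)
    return True
```

A component could call `maybe_start_debugger()` early in its setup code. With the variable set, the process pauses so that you can attach from the Winpdb GUI using the same password; with it unset, the component runs normally. Gating the call this way keeps the debugging hook out of the way in everyday runs.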