Building Python Real-Time Applications with Storm

By: Kartik Bhatnagar, Barry Hart

Overview of this book

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today. It has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily. You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios for creating a Storm cluster. Next, you’ll be provided with an overview of Petrel, followed by an example Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.
Table of Contents (14 chapters)

Chapter 6. Petrel in Practice

In previous chapters, we saw working examples of Storm topologies, both simple and complex. In doing so, however, we skipped some of the tools and techniques that you'll need while developing your own topologies:

  • Storm is a great environment for running your code, but deploying to Storm (even on your local machine) adds complexity and takes extra time. We'll see how to test your spouts and bolts outside of Storm.

  • When components run inside Storm, they can't read from the console, which prevents the use of pdb, the standard Python debugger. This chapter demonstrates Winpdb, an interactive debugging tool suitable for debugging components inside Storm.

  • Storm lets you easily harness the power of many servers, but the performance of your code still matters. In this chapter, we'll see some ways of measuring the performance of our topology's components.
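
To make the first and last of these points concrete, here is a minimal sketch that uses only the Python standard library. It is not Petrel's API: the names `split_sentence`, `process`, and `FakeCollector` are hypothetical stand-ins for a bolt's logic and for Storm's emit mechanism. The idea is simply that if a component's logic lives in ordinary functions, it can be tested and timed without deploying to Storm.

```python
import cProfile
import io
import pstats
import timeit


def split_sentence(sentence):
    """Hypothetical bolt logic: split a sentence into lowercase words."""
    return [word.lower() for word in sentence.split()]


class FakeCollector:
    """Hypothetical stand-in for Storm's emit mechanism.

    It records tuples in a list instead of sending them downstream.
    """

    def __init__(self):
        self.emitted = []

    def emit(self, tup):
        self.emitted.append(tup)


def process(sentence, collector):
    """Hypothetical bolt entry point: emit one tuple per word."""
    for word in split_sentence(sentence):
        collector.emit([word])


# 1. Test the logic outside Storm with a plain assertion.
collector = FakeCollector()
process("The quick brown Fox", collector)
assert collector.emitted == [["the"], ["quick"], ["brown"], ["fox"]]

# 2. Measure wall-clock time for many calls with timeit.
elapsed = timeit.timeit(
    lambda: process("The quick brown Fox", FakeCollector()),
    number=10000,
)
print("10000 calls took %.4f seconds" % elapsed)

# 3. Profile the same workload with cProfile to see where time goes.
profiler = cProfile.Profile()
profiler.enable()
for _ in range(10000):
    process("The quick brown Fox", FakeCollector())
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Because the example never touches Storm, it also runs under an ordinary console debugger such as pdb. Debugging a component while it runs inside Storm is a separate problem that this sketch does not address.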