Building Python Real-Time Applications with Storm

Overview of this book

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today. It has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily. You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios to create a Storm cluster. Next, you’ll be provided with an overview of Petrel, followed by an example of a Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.
Table of Contents (14 chapters)

Productivity tips with Petrel


We've covered a lot of ground in this chapter. While we don't know every detail of Storm, we've seen how to construct a topology with multiple components and send data between them.

The Python code for the topology is quite short, only about 75 lines in all. This makes a nice example, but really, it's just a little too short. When you start writing your own topologies, things probably won't work perfectly the first time. New code usually has bugs, and may even crash sometimes. To get things working correctly, you'll need to know what's happening in the topology, especially when there are problems. As you work on fixing problems, you'll be running the same topology over and over, and the 30-second startup time for a topology can seem like an eternity.

Improving startup performance

Let's address startup performance first. By default, when a Petrel topology starts up, it creates a new Python virtualenv and installs Petrel and other dependencies in it. While this behavior...