Building Python Real-Time Applications with Storm

Overview of this book

Big data is a trending concept that everyone wants to learn about. With its ability to process all kinds of data in real time, Storm is an important addition to your big data “bag of tricks.” At the same time, Python is one of the fastest-growing programming languages today and has become a top choice for both data science and everyday application development. Together, Storm and Python enable you to build and deploy real-time big data applications quickly and easily. You will begin with some basic command tutorials to set up Storm and learn about its configuration in detail. You will then go through the requirement scenarios for creating a Storm cluster. Next, you’ll get an overview of Petrel, followed by an example Twitter topology and persistence using Redis and MongoDB. Finally, you will build a production-quality Storm topology using development best practices.

Running the topology


Just a few more details and we'll be ready to run the topology:

  1. Create a topology.yaml file. This is a configuration file for Storm. A complete explanation of this file is beyond the scope of this book, but you can see the entire set of available options at https://github.com/apache/storm/blob/master/conf/defaults.yaml:

    nimbus.host: "localhost"
    topology.workers: 1

  2. Create an empty manifest.txt file. You can use an editor to do this or simply run touch manifest.txt. This is a Petrel-specific file that tells Petrel which additional files (if any) should be included in the .jar file that it submits to Storm. In Chapter 4, Example Topology – Twitter, we'll see an example that actually uses this file.

  3. Before running the topology, let's review the list of files we've created. Make sure you have created these files correctly:

    • randomsentence.py

    • splitsentence.py

    • wordcount.py

    • create.py

    • topology.yaml

    • manifest.txt

  4. Run the topology with the following command:

    petrel submit --config topology...
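
Steps 1 to 3 above can also be scripted as a quick pre-flight check before submitting. The following is a minimal sketch, not part of Petrel or Storm: the helper names prepare_config_files and missing_files are hypothetical, and it assumes Python 3 (for pathlib) and that all files live in a single project directory. It writes the two-line topology.yaml shown in step 1 if that file is absent, creates an empty manifest.txt, and reports any file from the step 3 checklist that is still missing.

```python
from pathlib import Path

# File names taken from the checklist in step 3.
REQUIRED_FILES = [
    "randomsentence.py",
    "splitsentence.py",
    "wordcount.py",
    "create.py",
    "topology.yaml",
    "manifest.txt",
]

# The two settings shown in step 1.
TOPOLOGY_YAML = 'nimbus.host: "localhost"\ntopology.workers: 1\n'


def prepare_config_files(project_dir):
    """Create topology.yaml and an empty manifest.txt if they are absent.

    Hypothetical helper for illustration; it is not provided by Petrel.
    """
    project = Path(project_dir)
    config = project / "topology.yaml"
    if not config.exists():
        config.write_text(TOPOLOGY_YAML)
    # Same effect as `touch manifest.txt`; an existing file keeps its contents.
    (project / "manifest.txt").touch(exist_ok=True)


def missing_files(project_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in project_dir."""
    project = Path(project_dir)
    return [name for name in required if not (project / name).is_file()]


if __name__ == "__main__":
    prepare_config_files(".")
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All files present; ready to submit the topology.")
```

Run the script from the directory that holds the topology files. It only checks that the files exist; it does not validate their contents.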