Chapter 1. Distributed Word Count
In this chapter, we will introduce you to the core concepts involved in creating distributed stream processing applications with Storm. We will do this by building a simple application that calculates a running word count from a continuous stream of sentences. The word count example involves many of the structures, techniques, and patterns required for more complex computations, yet it is simple and easy to follow.
We will begin with an overview of Storm's data structures and move on to implementing the components that make up a fully fledged Storm application. By the end of the chapter, you will have a basic understanding of how Storm computations are structured, how to set up a development environment, and how to develop and debug Storm applications.
This chapter covers the following topics:
Storm's basic constructs – topologies, streams, spouts, and bolts
Setting up a Storm development environment
Implementing a basic word count application
Parallelization and fault tolerance
Scaling by parallelizing computation tasks
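Before turning to Storm itself, it may help to see what a "running word count over a stream of sentences" means in a single process. The following is only a minimal plain-Java sketch: it uses no Storm APIs, and the class and method names (RunningWordCount, accept, snapshot) are hypothetical names chosen for this illustration. Splitting sentences on whitespace is also an assumption, not something the chapter prescribes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-process illustration; not part of Storm.
public class RunningWordCount {
    // Running totals, updated as each sentence arrives.
    private final Map<String, Long> counts = new LinkedHashMap<>();

    // Consume one sentence: split it into words and bump each word's count.
    public void accept(String sentence) {
        for (String word : sentence.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // a blank sentence yields one empty token; skip it
            }
            counts.merge(word, 1L, Long::sum);
        }
    }

    // Return a copy of the current totals so callers cannot mutate our state.
    public Map<String, Long> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    public static void main(String[] args) {
        // A finite list stands in for what would be an unbounded stream.
        List<String> sentences = Arrays.asList(
                "the cow jumped over the moon",
                "the dog ate my homework");
        RunningWordCount counter = new RunningWordCount();
        for (String sentence : sentences) {
            counter.accept(sentence);
            // The count is "running": it is available after every sentence.
            System.out.println(counter.snapshot());
        }
    }
}
```

The three steps in this sketch (produce sentences, split them into words, update per-word totals) are the same steps the chapter's application performs; the difference in a distributed setting is that each step can run as a separate component, possibly on different machines.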