Apache Beam is a new programming model and library for portable, massive-scale data processing, both batch and streaming. Using Beam, you can author data processing pipelines and execute them on various data processing engines, including Apache Apex.
In this chapter, we will cover the following topics:
- Introducing the technical vision of Apache Beam
- Explaining the most important concepts in the Beam programming model
- Discussing a simple, classic example: counting the occurrences of words in the works of Shakespeare
- Launching this pipeline on Apache Apex