Storm Blueprints: Patterns for Distributed Real-time Computation

Chapter 8. Natural Language Processing

Some people believe Storm will eventually replace Hadoop as demand increases for real-time analytics and data processing. In this chapter, we will see how Storm and Hadoop actually complement each other.

Storm blurs the lines between traditional Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP): it can handle a high volume of transactions while performing the aggregations and dimensional analysis typically associated with data warehouses. Even so, you often still need additional infrastructure to perform historical analysis and to support ad hoc queries across the entire dataset. Additionally, batch processing is often used to correct anomalies where the OLTP system cannot ensure consistency in the event of failures. This is exactly what we encountered in the Storm-Druid integration.

For these reasons, batch processing infrastructure is often paired with real-time infrastructure. Hadoop provides us with...