Storm Real-time Processing Cookbook

By: Quinton Anderson

Overview of this book

Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!

Storm Real-time Processing Cookbook offers basic to advanced recipes on Storm for real-time computation.

The book begins with setting up the development environment and then teaches log stream processing. This is followed by a real-time payments workflow, distributed RPC, integration with other software such as Hadoop and Apache Camel, and more.

Creating a Random Forest classification model using R


As mentioned in the introduction to the chapter, three approaches to machine learning within Storm will be presented, and it is important to understand when to use each one. The choice starts with deciding which machine learning approach and algorithm you would like to use: online or batch-based. Remember that, in machine learning, online and batch refer to the way in which the model is trained: a batch model is fitted on a complete data set, whereas an online model is updated incrementally as each observation arrives. This distinction does not imply any particular engineering approach for achieving either mode. A batch model can be built on a real-time software platform and, conversely, an online model can be built as part of a batch software process. Keep the machine learning aspect and the engineering aspect separate when choosing an approach.
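To make the distinction concrete, the following is a minimal sketch in Python, not code from the recipe. It uses a running mean as a stand-in for a real model (it is not the Random Forest built later in this recipe), and all names in it (`train_batch`, `OnlineMean`, `stream`) are hypothetical. It trains an online model from a pre-collected list, which is a batch-style process, and retrains a batch model inside a streaming loop, which is a real-time-style process.

```python
from typing import Iterable, Iterator, List


def train_batch(samples: List[float]) -> float:
    """Batch training: the full data set is available before fitting."""
    return sum(samples) / len(samples)


class OnlineMean:
    """Online training: the model is updated one observation at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, x: float) -> float:
        self.count += 1
        # Incremental update; no past observations are stored.
        self.mean += (x - self.mean) / self.count
        return self.mean


def stream(samples: Iterable[float]) -> Iterator[float]:
    """Hypothetical stand-in for tuples arriving one by one."""
    for x in samples:
        yield x


data = [2.0, 4.0, 6.0, 8.0]

# Online model, batch-style engineering: iterate over a stored data set.
online = OnlineMean()
for x in data:
    online.update(x)

# Batch model, streaming-style engineering: refit on everything seen so far
# each time a new observation arrives.
seen: List[float] = []
batch_estimates: List[float] = []
for x in stream(data):
    seen.append(x)
    batch_estimates.append(train_batch(seen))

print(online.mean, batch_estimates[-1])
```

Both routes arrive at the same estimate for this toy model. What differs is how the model is trained (all at once or incrementally), which is independent of whether the surrounding software runs as a batch job or as a stream processor.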

If you would like to explore this concept in more depth, the following overview of neural networks discusses the distinction in detail: http...