Learning Cascading

Next steps


We have just completed a relatively simple NLP application using Cascading. As the example code in this chapter showed, Cascading is a natural fit for "pipeline processing" (pun intended) of large volumes of unstructured text. Because Cascading is designed for big data and parallel processing, we are in a position to scale text analytics by several orders of magnitude. By running Cascading on top of Hadoop, we address the volume, speed, veracity, and complexity challenges of text analytics.

Take a look at the following diagram, which covers most of the domain of text analytics. With Cascading, we implemented the subassemblies marked with * in their names cleanly and with little effort. However, we have only scratched the surface: the diagram also shows the other components of the art and science of NLP and text analytics. Using the methodology we described in this chapter, all we really need is NLP/TA domain expertise to implement the rest...