Learning Apache Mahout

Chapter 10. Case Study – Text Analytics

So far, we have focused on deriving insights and building models on top of data that has a well-defined, fixed structure. Data sources such as delimited files and database tables have a fixed format and are called structured sources of data. Structured data is the mainstay of analytics, and most of the use cases we discussed rely on it. Data sources such as social media posts, support case comments, e-mails, and articles are called unstructured data, and they can contain business insights about customers and products that are not readily available in structured data. For example, structured information such as product usage tables can tell us that a particular customer is not using the product, but the reason could be documented in a support case comment. Mining unstructured data for information follows a slightly different approach from the one we have discussed so far. In this chapter, we are going to discuss the steps...