Book Image

Mastering Text Mining with R

By : KUMAR ASHISH
Book Image

Mastering Text Mining with R

By: KUMAR ASHISH

Overview of this book

Text Mining (or text data mining or text analytics) is the process of extracting useful and high-quality information from text by devising patterns and trends. R provides an extensive ecosystem to mine text through its many frameworks and packages. Starting with basic information about the statistics concepts used in text mining, this book will teach you how to access, cleanse, and process text using the R language and will equip you with the tools and the associated knowledge about different tagging, chunking, and entailment approaches and their usage in natural language processing. Moving on, this book will teach you different dimensionality reduction techniques and their implementation in R. Next, we will cover pattern recognition in text data utilizing classification mechanisms, perform entity recognition, and develop an ontology learning framework. By the end of the book, you will develop a practical application from the concepts learned, and will understand how text mining can be leveraged to analyze the massively available data on social media.
Table of Contents (15 chapters)

Summary


In this chapter, we learned about text classification methods and their implementation in R language. We learnt how to use classifiers to build applications such as spam filter, topic models, and so on. We also looked into the model evaluation and validation methods. In the coming chapters, we will take up more practical examples to utilize the tools and knowledge gained in the previous chapters to mine and extract insights from social media. We will also look into different distributed text mining methods to help applications build in R to scale up for high dimensional data.