Data Manipulation with R - Second Edition

By: Jaynal Abedin, Kishor Kumar Das

Overview of this book

This book starts with installing R and using R and its libraries. We then discuss the mode of R objects and their classes, and highlight the different R data types along with their basic operations.

The primary focus is group-wise data manipulation with the split-apply-combine strategy, which is explained with specific examples. The book also covers specific libraries such as lubridate, reshape2, plyr, dplyr, stringr, and sqldf. You will learn not only group-wise data manipulation, but also how to efficiently handle date, string, and factor variables, and how to work with different dataset layouts using the reshape2 package.

By the end of this book, you will have learned about text manipulation using stringr, how to extract data from Twitter using the twitteR library, how to clean raw data, and how to structure your raw data for data mining.

Chapter 6. Text Manipulation

Text data is one of the most important kinds of data in analytics. Every day, we produce a huge amount of text through various media; Twitter posts, blog articles, and Facebook posts are major sources. Text data can be used to extract information for tasks such as sentiment analysis and entity recognition, as well as for text categorization, predictive analytics, lexical analysis, document summarization, and natural language processing. In this chapter, we will discuss how R can be used to process text data for use in any of these text analytics areas. First, we will discuss R's default functions for processing text data. Then, we will introduce the stringr package for working with text data. We will cover the following topics in this chapter:

  • What is text data?

  • Sources of text data

  • Obtaining text data

  • Text processing using default functions

  • Text processing using stringr

  • Structuring text data...
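
As a rough preview of the kind of processing the default functions make possible, here is a minimal sketch using only base R. The sample strings and variable names are made up for illustration and do not come from the book. The stringr package offers counterparts such as str_length(), str_to_upper(), str_sub(), and str_replace_all().

```r
# Hypothetical sample text standing in for short posts (not from the book)
posts <- c("R makes text processing easy!", "Text data is everywhere.")

# Number of characters in each string
n_chars <- nchar(posts)

# Convert each string to upper case
upper <- toupper(posts)

# Extract the first four characters of each string
prefix <- substr(posts, 1, 4)

# Remove punctuation using a POSIX character class
no_punct <- gsub("[[:punct:]]", "", posts)

# Lower-case the cleaned text and split it into words on single spaces;
# strsplit() returns a list with one character vector per input string
words <- strsplit(tolower(no_punct), " ")

print(n_chars)
print(words)
```

Splitting on a single space is enough for this toy input. Real text usually needs a more forgiving pattern, such as one or more whitespace characters.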