Most statistical software expects data to be numeric, or at least a factor, but sometimes we have to work with character data. In text mining, character (string) manipulation is the central task. Base R provides a full set of functions for manipulating character data for further analysis. In addition, the contributed package stringr offers a more user-friendly and intuitive interface than its base R counterparts. Wickham developed the stringr package in 2010 to manipulate character data with user-friendly functions. In this section, we introduce the stringr functions alongside their base R counterparts in a table, so that readers can start using the stringr package easily.
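To make the base R and stringr pairing concrete, here is a minimal sketch that runs a few common operations both ways. The sample vector `x` and the choice of functions are illustrative assumptions rather than the book's own table, and the code assumes stringr (version 1.0.0 or later) is installed.

```r
# Assumes stringr is installed: install.packages("stringr")
library(stringr)

# Illustrative data (not from the book)
x <- c("data manipulation", "text mining")

# Number of characters in each string
len_base    <- nchar(x)
len_stringr <- str_length(x)

# Join strings with a separator
join_base    <- paste("text", "mining", sep = " ")
join_stringr <- str_c("text", "mining", sep = " ")

# Extract characters 1 to 4 of each string
sub_base    <- substr(x, 1, 4)
sub_stringr <- str_sub(x, 1, 4)

# Test whether each string contains a pattern
has_base    <- grepl("mining", x)
has_stringr <- str_detect(x, "mining")

# Replace the first match of a pattern in each string
rep_base    <- sub("a", "A", x)
rep_stringr <- str_replace(x, "a", "A")

# Convert to upper case
up_base    <- toupper(x)
up_stringr <- str_to_upper(x)

print(data.frame(x, len_base, len_stringr, sub_base, sub_stringr))
```

Note the difference in argument order: the base R pattern functions (`grepl`, `sub`) take the pattern first, while the stringr functions take the string first, which makes them easier to chain.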
Data Manipulation with R - Second Edition