Data Manipulation with R - Second Edition

By: Jaynal Abedin, Kishor Kumar Das
Overview of this book

This book starts with the installation of R and how to go about using R and its libraries. We then discuss the modes and classes of R objects and highlight the different R data types with their basic operations.

The primary focus, group-wise data manipulation with the split-apply-combine strategy, is explained with specific examples. The book also covers some specific libraries such as lubridate, reshape2, plyr, dplyr, stringr, and sqldf. You will not only learn about group-wise data manipulation, but also learn how to efficiently handle date, string, and factor variables, along with different layouts of datasets using the reshape2 package.

By the end of this book, you will have learned about text manipulation using stringr, how to extract data from Twitter using the twitteR library, how to clean raw data, and how to structure your raw data for data mining.

R and sqldf


The sqldf package allows users to run SQL statements within R. SQL is a widely used language for querying and manipulating data in relational databases, and sqldf makes it possible to apply SQL statements directly to an R data frame. With this package, the user can do the following tasks easily:

  • Write data frame manipulations in an alternative SQL syntax, which can also speed up processing: sqldf (with SQLite as the underlying database) is often faster than the same manipulation performed with built-in R functions

  • Read portions of large files into R without reading the entire file
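Both tasks can be sketched in a few lines. This is a minimal illustration, assuming the sqldf package is installed; the built-in `mtcars` data set and the temporary CSV file are stand-ins chosen for the example, not data used by the book. The first query computes a group-wise mean in SQL (the base R `aggregate()` call is shown only for comparison), and `read.csv.sql()` then loads just the rows matching a condition instead of the whole file.

```r
library(sqldf)

# Task 1: SQL as an alternative syntax for data frame manipulation.
# mtcars is a built-in data frame; sqldf looks it up by the name used in the query.
by_sql <- sqldf("SELECT cyl, AVG(mpg) AS mean_mpg
                 FROM mtcars
                 GROUP BY cyl
                 ORDER BY cyl")

# The same summary with a built-in R function, for comparison.
by_base <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Task 2: read only part of a file.
# A temporary CSV stands in for a large file (assumption for this sketch).
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE, quote = FALSE)

# Inside read.csv.sql(), the file is referred to as the table "file".
four_cyl <- read.csv.sql(csv_path, sql = "SELECT * FROM file WHERE cyl = 4")

unlink(csv_path)  # remove the temporary file
```

Whether the SQL version is actually faster depends on the data and the query, so it is worth timing both forms on your own data before relying on it.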

Once sqldf is in use, the user need not perform the following tasks, because they are done automatically:

  • Database setup

  • Writing the CREATE TABLE statement, which defines each table

  • Importing and exporting to and from the database

  • Coercing the returned columns to the appropriate class in common cases
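The automatic steps listed above can be seen in a short sketch. It assumes the sqldf package is installed; the `visits` data frame and its columns are hypothetical, made up for this illustration. No database is created by hand, no CREATE TABLE statement is written, and nothing is imported or exported explicitly. The second call uses `method = "raw"` to show what the result looks like when sqldf is asked not to coerce column classes.

```r
library(sqldf)

# Hypothetical data frame with a Date column (not from the book).
visits <- data.frame(
  day = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03")),
  n   = c(10L, 25L, 7L)
)

# sqldf sets up a temporary SQLite database, creates and loads the table,
# runs the query, and returns the result as a data frame.
busy <- sqldf("SELECT day, n FROM visits WHERE n >= 10")

# With the default method = "auto", a returned column whose name matches
# an input column is coerced back to that column's class (here, Date).
class(busy$day)

# With method = "raw", the values come back as stored in SQLite,
# without the class coercion.
busy_raw <- sqldf("SELECT day, n FROM visits WHERE n >= 10", method = "raw")
class(busy_raw$day)
```

The coercion is a heuristic based on column names, so a computed column with a new name (for example, an aggregate over `day`) may need to be converted by hand.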