Data Manipulation with R - Second Edition

By: Jaynal Abedin, Kishor Kumar Das

Overview of this book

This book starts with installing R and using R and its libraries. We then discuss the mode of R objects and their classes, and highlight the different R data types along with their basic operations. The primary focus is group-wise data manipulation with the split-apply-combine strategy, which is explained through specific examples. The book also covers specific libraries such as lubridate, reshape2, plyr, dplyr, stringr, and sqldf. You will learn not only group-wise data manipulation, but also how to handle date, string, and factor variables efficiently, and how to move between different dataset layouts using the reshape2 package. By the end of this book, you will have learned about text manipulation using stringr, how to extract data from Twitter using the twitteR library, how to clean raw data, and how to structure raw data for data mining.

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Preface

Introduction to R Data Types and Basic Operations

Getting different versions of R

Installing R on different platforms

Installing and using R libraries

Comparing R with other software

R as an enterprise solution

Writing commands in R

R data types and basic operations

The R object structure and mode conversion

Factor and its types

Missing values in R

Summary

Basic Data Manipulation

Acquiring data

Vector and matrix operations

Factor manipulation

Factors from numeric variables

Date processing using lubridate

Character manipulation

Subscripting and subsetting

Summary

Data Manipulation Using plyr and dplyr

Applying the split-apply-combine strategy

Introducing the plyr and dplyr libraries

Comparing base R and plyr

Powerful data manipulation with dplyr

Summary

Reshaping Datasets

Typical layout of a dataset

New layout of a dataset

Reshaping the dataset from the typical layout

Reshaping the dataset with the reshape package

The reshape2 package

Summary

R and Databases

R and different databases

Relational databases in R

R and sqldf

Data manipulation using sqldf

Summary

Text Manipulation

Text data and its source

Text processing using default functions

Working with Twitter data

Summary

Index

Summary

This chapter introduced a theoretical framework for reshaping a dataset. It pointed out the limitations of conventional approaches and highlighted a new paradigm of data layout, in which just two functions let users rearrange a dataset into whatever layout is required. The chapter also discussed structural missing values and sampling zeros, and how to deal with these missing values during the melting process. For faster rearrangement of large datasets, you were directed to the reshape2 package.
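As a quick reminder of what the melt-then-cast workflow looks like in practice, here is a minimal sketch using reshape2, where the casting step for data frames is called dcast() rather than cast(). The data frame and its column names (id, height, weight) are invented for illustration and are not taken from the chapter.

```r
# Sketch only: the data frame below is hypothetical illustration data.
library(reshape2)

# A "typical" wide layout: one row per subject, one column per measurement.
wide <- data.frame(
  id     = c(1, 2, 3),
  height = c(150, 160, NA),  # NA stands in for a missing measurement
  weight = c(50, 60, 70)
)

# melt(): collapse the measured columns into variable/value pairs.
# na.rm = TRUE drops the molten rows whose value is missing.
long <- melt(wide, id.vars = "id", na.rm = TRUE)

# dcast(): rebuild a wide layout from the molten data.
# Combinations with no molten row are filled with NA.
wide_again <- dcast(long, id ~ variable, value.var = "value")
```

Dropping missing values while melting and letting the cast step fill the gaps is only one way to handle them; whether it is appropriate depends on whether the gaps are structural or sampling zeros.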

In the next chapter, we will discuss how R can connect to databases and handle large-scale data.
