Mastering Data Analysis with R

By: Gergely Daróczi
Filtering data by string matching


Although some filtering methods were already discussed in the previous chapters, the dplyr package contains some handy features that have not yet been covered and are worth mentioning here. As we know by now, the subset function in base R or the filter function from dplyr is used for filtering rows, and the select function can be used to choose a subset of columns.

The row-filtering functions usually take an R expression that evaluates to a logical vector marking the rows to keep; passing that vector to the which function would return the IDs of those rows. On the other hand, providing such R expressions to describe column names is often more problematic for the select function; it is harder, if not impossible, to evaluate R expressions on column names.
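To make the link between a filtering expression and the which function concrete, here is a minimal sketch in base R. The flights data frame and its values are made up for illustration and are not taken from the hflights dataset.

```r
# Toy data frame; the values are hypothetical and only serve the example
flights <- data.frame(
  carrier  = c("AA", "UA", "AA", "DL"),
  ArrDelay = c(5, 42, -3, 17),
  DepDelay = c(0, 35, 1, 20)
)

# The filtering expression evaluates to one logical value per row
late <- flights$ArrDelay > 15

# which() turns that logical vector into the IDs of the matching rows
late_ids <- which(late)

# subset() keeps the rows for which the expression is TRUE
late_flights <- subset(flights, ArrDelay > 15)

# With dplyr loaded, filter(flights, ArrDelay > 15) selects the same rows
print(late_ids)
print(late_flights)
```

Both subset and filter evaluate the expression inside the data frame, so the column can be referred to by its bare name.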

The dplyr package provides some useful functions to select columns of the data based on column name patterns. For example, we can keep only the variables ending with the string delay:

> library(dplyr)
> library(hflights)
> str(select(hflights...
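
Selecting columns by name pattern can also be tried without the hflights package. The following sketch uses the ends_with helper of dplyr on a small, made-up data frame (its column names only imitate the delay columns mentioned above) and adds a base R equivalent built on grepl for comparison.

```r
library(dplyr)

# Toy data frame; names and values are hypothetical
flights <- data.frame(
  carrier  = c("AA", "UA"),
  ArrDelay = c(5, 42),
  DepDelay = c(0, 35)
)

# ends_with() matches case-insensitively by default,
# so "delay" also matches the names ending in "Delay"
delays <- select(flights, ends_with("delay"))
print(names(delays))

# Base R equivalent: a regular expression applied to the column names
delay_cols <- grepl("delay$", names(flights), ignore.case = TRUE)
delays_base <- flights[, delay_cols, drop = FALSE]
```

The drop = FALSE argument keeps the base R result a data frame even when only one column matches.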