Mastering Data Analysis with R

By: Gergely Daróczi

Some other metrics


Of course, once the package descriptions have been quantified a bit, we can also apply the standard data analysis tools. For example, let's look at the length, in characters, of the documents in the corpus:

> vnchar <- sapply(v, function(x) nchar(x$content))
> summary(vnchar)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00   27.00   37.00   39.85   50.00  168.00

So, the average package description consists of around 40 characters, while one description has only two. More precisely, two characters are left after removing numbers, punctuation, and common words. To see which package has this very short description, we can simply call the which.min function:

> (vm <- which.min(vnchar))
[1] 221
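
A side note on which.min, sketched on a made-up vector rather than on the real corpus: it reports only the first position holding the minimum and skips missing values, so tied documents would go unnoticed. The names and values below are purely illustrative assumptions.

```r
# Toy character counts; names and values are made up for illustration
lens <- c(alpha = 27, beta = 2, gamma = 50, delta = 2, epsilon = NA)

# which.min() returns the position of the first minimum and skips NA values
first_min <- which.min(lens)

# which() with a comparison returns every position tied for the minimum
all_min <- which(lens == min(lens, na.rm = TRUE))

print(first_min)
print(all_min)
```

If ties matter, the which() form is the safer choice, at the cost of having to handle NA explicitly.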

And this is what's strange about the shortest document in our corpus:

> v[[vm]]
<<PlainTextDocument (metadata: 7)>>
NA
> res[vm, ]
    V1   V2
221    <NA>

So, this is not a real package after all, but rather an empty row in the original table. Let's visually inspect the overall...
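
An empty row like the one found above could be dropped before the corpus is built. The following is only a minimal sketch on a hypothetical stand-in table: res_toy and its contents are invented here, and it assumes that V1 holds the package name and V2 the description, as the output above suggests.

```r
# Hypothetical stand-in for the res table (assumed: V1 = name, V2 = description)
res_toy <- data.frame(
  V1 = c("pkgA", "", "pkgB"),
  V2 = c("Tools for A", NA, "Helpers for B"),
  stringsAsFactors = FALSE
)

# Keep only rows whose description is neither missing nor blank
keep <- !is.na(res_toy$V2) & nzchar(trimws(res_toy$V2))
res_clean <- res_toy[keep, , drop = FALSE]

print(res_clean)
```

Filtering at the data frame level keeps the row indices of the table and the corpus aligned, so a lookup such as res[vm, ] still points at the matching document.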