Probably the easiest way to retrieve network-flavored information about the R ecosystem is to analyze how R packages depend on each other. Based on Chapter 2, Getting the Data, we could try to load this data by fetching and parsing the CRAN mirror pages over HTTP but, luckily, R has a built-in function that returns all R packages available on CRAN, along with some useful meta-information:
Tip
The number of packages hosted on CRAN is growing from day to day. As we are working with live data, the actual results you see might be slightly different.
> library(tools)
> pkgs <- available.packages()
> str(pkgs)
 chr [1:6548, 1:17] "A3" "abc" "ABCanalysis" "abcdeFBA" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:6548] "A3" "abc" "ABCanalysis" "abcdeFBA" ...
  ..$ : chr [1:17] "Package" "Version" "Priority" "Depends" ...
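As a rough illustration of what working with this kind of matrix looks like, the following sketch builds a tiny stand-in with the same layout and pulls bare package names out of its Depends column. The package names, the dependency strings, and the parse_deps helper are all made up for this example; only base R functions are used, and no CRAN mirror is contacted.

```r
# Toy stand-in for the matrix returned by available.packages();
# package names and dependency strings are invented for illustration.
pkgs_demo <- matrix(
  c("pkgA", "1.0", "R (>= 3.0.0), pkgB, pkgC (>= 0.2)",
    "pkgB", "0.3", NA,
    "pkgC", "0.2", "pkgB"),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("pkgA", "pkgB", "pkgC"),
                  c("Package", "Version", "Depends"))
)

# Hypothetical helper: split one Depends string into bare package names
parse_deps <- function(x) {
  if (is.na(x)) return(character(0))        # no dependencies listed
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  parts <- sub("\\(.*\\)", "", parts)       # drop version constraints
  parts <- trimws(parts)
  parts[nzchar(parts) & parts != "R"]       # R itself is not a package
}

# Named list: one character vector of dependencies per package
deps <- lapply(pkgs_demo[, "Depends"], parse_deps)
```

Indexing by column name rather than by position keeps the code readable and guards against the column order changing. Real Depends fields can be messier than this toy data, so treat the helper as a starting point only.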
So we have a matrix with more than 6,500 rows and 17 columns, and the fourth column, Depends, holds the dependencies as a comma-separated list. Instead of parsing those strings and cleaning...