RStudio for R Statistical Computing Cookbook

By: Andrea Cirillo

Overview of this book

The need to handle complex datasets, perform unprecedented statistical analysis, and provide real-time visualizations to businesses concerns statisticians and analysts across the globe. RStudio is a powerful tool for statistical analysis that harnesses the power of R for computational statistics, visualization, and data science in an integrated development environment. This book is a collection of recipes that will help you learn and understand RStudio features so that you can effectively perform statistical analysis and reporting, code editing, and R development. The first few chapters will teach you how to set up your own data analysis project in RStudio, acquire data from different data sources, and manipulate and clean data for analysis and visualization purposes. You'll get hands-on with various data visualization methods using ggplot2, and you will create interactive and multidimensional visualizations with D3.js. Additional recipes will help you optimize your code, implement various statistical models to manage large datasets, perform text analysis and predictive analysis, and master time series analysis, machine learning, and forecasting. In the final few chapters, you'll learn how to create reports from your analytical application with the full range of static and dynamic reporting tools available in RStudio, so that you can effectively communicate results and even transform them into interactive web applications.

Getting data from Facebook with the Rfacebook package


The Rfacebook package, developed and maintained by Pablo Barberá, provides a series of functions that let you easily connect to Facebook's API and take advantage of it.

As we did for the twitteR package, we are going to establish a connection with the API and retrieve posts pertaining to a given keyword.

Getting ready

This recipe is mainly based on functions from the Rfacebook package. Therefore, we need to install and load this package in our environment:

install.packages("Rfacebook")
library(Rfacebook)
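
If you rerun the script often, you may prefer to install the package only when it is missing. The following is a minimal sketch of that idea; the helper name ensure_package() is hypothetical and not part of Rfacebook or base R:

```r
# Hypothetical helper: install a package only if it is not already available,
# then report whether it can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Only attempt the installation in an interactive session
if (interactive() && ensure_package("Rfacebook")) {
  library(Rfacebook)
}
```

The interactive() guard is a design choice for this sketch: it keeps the block from installing anything when the script is run non-interactively.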

How to do it...

  1. To use the API's functionality, we first have to create an application from our Facebook profile. Navigating to the following URL (assuming you are already logged in to Facebook) will let you create an app: https://developers.facebook.com.

    After skipping the quick start (the button in the upper-right corner), you can see the settings of your app and take note of app_id and app_secret, which you will need to establish a connection with the app.

  2. After installing and loading the Rfacebook package, you will easily be able to establish a connection by running the fbOAuth() function as follows:

    fb_connection <- fbOAuth(app_id     = "your_app_id",
                             app_secret = "your_app_secret")
    fb_connection

    Running this code will result in a console prompt similar to the following:

    copy and paste into site URL on Facebook App Settings: http://localhost:1410/
    When done press any key to continue
    

    Following this prompt, copy the URL and go to your Facebook app settings.

    Once there, select the Settings tab and create a new platform through the + Add Platform control. In the form that appears after clicking this control, you should find a field named Site URL. Paste the copied URL into this field.

    Complete the process by clicking on the Save Changes button.

    At this point, a browser window will open and ask you to grant the app access to your profile. After you grant this permission, the R console will print the following messages:

    Authentication complete
    Authentication successful.
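
To avoid repeating the browser handshake in every session, the token object can be written to disk and read back later. The following is a minimal sketch using base R's saveRDS() and readRDS(); the file name is hypothetical, and a placeholder list stands in for fb_connection so that the example runs without a Facebook app. It is an assumption that the real token object round-trips the same way, and tokens can expire, so re-authentication may still be needed:

```r
# Placeholder standing in for the token returned by fbOAuth();
# in a real session you would save fb_connection instead.
fb_connection_demo <- list(note = "placeholder, not a real token")

# Hypothetical location for the saved token
token_file <- file.path(tempdir(), "fb_connection.rds")
saveRDS(fb_connection_demo, token_file)

# In a later session, restore the object instead of calling fbOAuth() again
fb_connection_restored <- readRDS(token_file)
identical(fb_connection_demo, fb_connection_restored)
```

Keep the saved file out of version control, since a real token grants access to your account.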
    
  3. To test our API connection, we are going to search Facebook for pages related to data science with R and save the results in a data frame for further analysis.

    Among other useful functions, Rfacebook provides the searchPages() function, which, as you would expect, allows you to search the social network for pages mentioning a given string.

    Unlike the searchTwitter() function, this function accepts only a few arguments:

    • string: This is the query string

    • token: This is a valid OAuth token created with the fbOAuth() function

    • n: This is the maximum number of pages to be retrieved

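    The following two lines run the search for data science with R and then plot the distribution of likes with the base R hist() function:

      pages <- searchPages('data science with R', fb_connection)
      hist(pages$likes)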

    Note

    The Unix timestamp

    The Unix timestamp is a time-tracking system originally developed for the Unix OS. Technically, a Unix timestamp x expresses the number of seconds elapsed between the Unix Epoch (January 1, 1970 UTC) and the moment the timestamp represents.
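
If a function or parameter expects or returns a Unix timestamp, base R can convert in both directions. The following is a minimal sketch; the sample value is chosen purely for illustration:

```r
# 1456790400 seconds after the epoch corresponds to 2016-03-01 00:00:00 UTC
ts_seconds  <- 1456790400
ts_datetime <- as.POSIXct(ts_seconds, origin = "1970-01-01", tz = "UTC")
ts_label    <- format(ts_datetime, "%Y-%m-%d %H:%M:%S", tz = "UTC")
ts_label

# And back again: from a date-time to a Unix timestamp
ts_back <- as.numeric(as.POSIXct("2016-03-01 00:00:00", tz = "UTC"))
ts_back
```

Setting tz = "UTC" explicitly avoids surprises from the local time zone of the machine running the code.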

    The searchPages() call returns a data frame storing all the pages retrieved along with the data concerning them.

    As seen for the twitteR package, the base R hist() function then gives a quick look at the distribution of likes across those pages, drawn as a histogram.

    Refer to the data visualization section for further recipes on this topic.
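
Like counts often span several orders of magnitude, in which case a plain histogram can be dominated by a handful of very popular pages. The following sketch shows one way to inspect such a column using only base R. The pages_demo data frame and its values are made up for illustration; with a live connection you would use the pages object returned by searchPages() instead:

```r
# Made-up like counts for illustration only; in a real session this column
# would be pages$likes as returned by searchPages().
pages_demo <- data.frame(
  name  = c("page_a", "page_b", "page_c", "page_d", "page_e"),
  likes = c(12, 150, 980, 4300, 250000)
)

# Numeric summary of the like distribution
summary(pages_demo$likes)

# Histogram on a log10 scale; the +1 keeps pages with zero likes defined
log_likes <- log10(pages_demo$likes + 1)
like_hist <- hist(log_likes,
                  main = "Distribution of page likes (log10 scale)",
                  xlab = "log10(likes + 1)")

# Pages sorted from most to least liked
pages_sorted <- pages_demo[order(pages_demo$likes, decreasing = TRUE), ]
pages_sorted
```

Whether the log transform is appropriate depends on the data actually retrieved, so compare it with the untransformed histogram before drawing conclusions.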