Most of the data you need will not be in your own database; it is published in various forms on the Internet. To extract valuable information from these sources, we need to know how to access and scrape data from the Web. Here, we will illustrate how to use the rvest package to harvest finance data from http://www.bloomberg.com/.
In this recipe, you need to prepare your environment with R installed and a computer that can access the Internet.
Perform the following steps to scrape data from http://www.bloomberg.com/:
First, access the following link to browse the S&P 500 index on the Bloomberg Business website: http://www.bloomberg.com/quote/SPX:IND
Once the page appears, as shown in the preceding screenshot, we can begin installing and loading the rvest package:
> install.packages("rvest")
> library(rvest)
Next, you can use the HTML function from the rvest package to scrape and...
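As a rough illustration of the parsing step, the following sketch runs rvest against a small inline HTML string rather than the live Bloomberg page, so it works without network access. The markup, the class names (.name, .price), and the values are invented for this example and do not reflect Bloomberg's actual page structure. It also assumes a current rvest release, in which read_html() is the parsing function; html() was deprecated in its favor.

```r
library(rvest)

# Hypothetical stand-in for a downloaded quote page. The markup, class names,
# and numbers are invented for illustration and are not Bloomberg's.
page_source <- '<html><body>
  <div class="quote">
    <span class="name">Example Index</span>
    <span class="price">1,234.56</span>
  </div>
</body></html>'

# read_html() parses a literal HTML string (or a URL) into a document object.
page <- read_html(page_source)

# Select nodes with CSS selectors and extract their text content.
index_name <- page %>% html_node(".name") %>% html_text()
price_text <- page %>% html_node(".price") %>% html_text()

# Remove the thousands separator before converting the text to a number.
price <- as.numeric(gsub(",", "", price_text))

print(index_name)
print(price)
```

Passing a URL string to read_html() instead of page_source would fetch and parse a live page in the same way; the selectors then have to match that page's real markup, which can change at any time.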