-
Book Overview & Buying
-
Table Of Contents
Machine Learning for the Web
By :
Since scraping directly from the most relevant search engines such as Google, Bing, Yahoo, and others is against their term of service, we need to take initial review pages from their REST API (using scraping services such as Crawlera, http://crawlera.com/, is also possible). We decided to use the Bing service, which allows 5,000 queries per month for free.
In order to do that, we register to the Microsoft Service to obtain the key needed to allow the search. Briefly, we followed these steps:
Register online on https://datamarket.azure.com.
In My Account, take the Primary Account Key.
Register a new application (under DEVELOPERS | REGISTER; put Redire
ct URI: https://www.
bing.com)
After that, we can write a function that retrieves as many URLs relevant to our query as we want:
num_reviews = 30
def bing_api(query):
keyBing = API_KEY # get Bing key from: https://datamarket.azure.com/account/keys
credentialBing = 'Basic ' + (':%s' % keyBing...