Spark Cookbook

By: Rishi Yadav

Streaming Twitter data


Twitter is a famous microblogging platform. It produces a massive amount of data, with around 500 million tweets sent each day. Twitter exposes its data through APIs, which makes it the poster child for testing any big data streaming application.

In this recipe, we will see how to live stream data into Spark using Twitter streaming libraries. Twitter is just one of many sources of streaming data for Spark and has no special status, so Spark has no built-in libraries for it. Spark does, however, provide some APIs to facilitate integration with Twitter libraries.

An example use of a live Twitter data feed is finding the tweets that were trending in the last 5 minutes.
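
The windowing idea behind this use case can be sketched outside Spark. The following is a minimal plain-Python illustration, not Spark Streaming code and not the implementation used in this recipe. The `TrendingWindow` class and its method names are hypothetical, and the sketch assumes that tweets arrive in timestamp order and that hashtag frequency is the measure of "trending".

```python
from collections import Counter, deque

WINDOW_SECONDS = 5 * 60  # the 5-minute window mentioned in the text


class TrendingWindow:
    """Counts hashtags seen within a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag) pairs, oldest first
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def add(self, timestamp, text):
        """Record every hashtag in a tweet's text; timestamps are in seconds
        and are assumed to be non-decreasing across calls."""
        for word in text.split():
            tag = word.rstrip(".,!?:;").lower()
            if tag.startswith("#") and len(tag) > 1:
                self.events.append((timestamp, tag))
                self.counts[tag] += 1

    def top(self, now, n=3):
        """Evict events older than the window, then return the n most common tags."""
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] < cutoff:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]
        return self.counts.most_common(n)


window = TrendingWindow()
window.add(0, "Learning #Spark today")
window.add(100, "#spark streaming with #scala")
window.add(400, "More #Scala tips")
print(window.top(now=400))  # [('#scala', 2), ('#spark', 1)]
```

At time 400 the tweet from time 0 has fallen out of the 5-minute window, so only one `#spark` mention is still counted. In Spark Streaming, this bookkeeping would typically be delegated to a windowed operation on the stream rather than written by hand.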

How to do it...

  1. Create a Twitter account if you do not already have one.

  2. Go to http://apps.twitter.com.

  3. Click on Create New App.

  4. Enter Name, Description, Website, and Callback URL, and then click on Create your Twitter Application.

  5. You will reach the Application Management screen.

  6. Navigate to Keys and...