Learning Hadoop 2

Getting started


We will use the stream.py script's options to extract tweets as JSON and to retrieve a specific number of them; for example, the following command collects 10,000 tweets:

$ python stream.py -j -n 10000 > tweets.json

The tweets.json file will contain one JSON string per line, each representing a single tweet.
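Because each line is a standalone JSON document, the file can be read back one line at a time rather than parsed as a single JSON value. The following is a minimal sketch of that idea, not part of stream.py; the load_tweets helper is a hypothetical name introduced here, and the choice to skip blank or malformed lines is an assumption about how one might want to handle a partially written file.

```python
import json


def load_tweets(lines):
    """Parse an iterable of lines holding one JSON document each.

    Hypothetical helper for illustration only. Blank lines and lines
    that are not valid JSON are skipped (an assumption, not something
    the original script is known to do).
    """
    tweets = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        try:
            tweets.append(json.loads(line))
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            continue
    return tweets
```

An open file object is itself an iterable of lines, so something like load_tweets(open("tweets.json")) should work on the output of the command above.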

Remember that the Twitter API credentials need to be made available as environment variables or hardcoded in the script itself.
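As a rough illustration of the environment-variable approach, the sketch below reads four credentials and fails early if any are missing. The variable names are placeholders chosen for this example; they are assumptions and may not match the names that stream.py actually expects.

```python
import os

# Hypothetical variable names, used here for illustration only.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_KEY",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(environ=None):
    """Return a dict of credentials read from the environment.

    Raises RuntimeError naming every variable that is unset or empty.
    """
    if environ is None:
        environ = os.environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in REQUIRED_VARS}
```

Reading from the environment keeps secrets out of the script, so the file can be shared or committed without exposing them.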