Python Data Analysis

By: Ivan Idris

Parsing RSS and Atom feeds


Really Simple Syndication (RSS) and Atom feeds (refer to http://en.wikipedia.org/wiki/RSS) are often used for blogs and news. These types of feeds follow the publish/subscribe model. For instance, Packt Publishing has an RSS feed with article and book announcements, and we can subscribe to it to get timely updates. The Python feedparser module lets us parse RSS and Atom feeds without dealing with many of the technical details of either format. The feedparser module can be installed with pip as follows:

$ sudo pip install feedparser
$ pip freeze|grep feedparser
feedparser==5.1.3

After parsing an RSS file, we can access the underlying data using dotted notation. Parse the Packt Publishing RSS feed and print the number of entries:

import feedparser as fp

rss = fp.parse("http://www.packtpub.com/rss.xml")

print("# Entries", len(rss.entries))

The number of entries is printed (the number may vary for each program run):

# Entries 50

Print entry titles and summaries if the entry...