We discussed the basics of automated analysis of social networks. We looked at one particular social network: the people who use Twitter to exchange messages. This is about 316 million active users, exchanging about 500 million messages a day. We saw how to find information about specific people, the list of friends a person follows, and the tweets a person makes.
We also discussed how to download additional media from social networking sites. We used PIL to confirm that a downloaded file is a valid image we can work with, and to create thumbnails of images. People readily publish a great deal of data about themselves, and we can do a great deal of processing to gather and analyze it.
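As a reminder of the two PIL steps mentioned above, here is a minimal sketch. It assumes the Pillow fork of PIL is installed; the function names `is_valid_image()`, `make_thumbnail()`, and `sample_png()` are hypothetical helpers for illustration, not names from the chapter, and the in-memory PNG stands in for a file downloaded from a social networking site.

```python
from io import BytesIO

from PIL import Image  # assumes the Pillow fork of PIL


def is_valid_image(data: bytes) -> bool:
    """Return True if Pillow can parse the bytes as an image file."""
    try:
        with Image.open(BytesIO(data)) as img:
            img.verify()  # integrity check; does not decode the pixel data
        return True
    except Exception:
        # Pillow raises different exception types for different problems,
        # so this sketch treats any failure as "not a usable image".
        return False


def make_thumbnail(data: bytes, max_size=(64, 64)) -> Image.Image:
    """Return a copy of the image shrunk to fit within max_size."""
    # Reopen the data: an image object can't be reused after verify().
    img = Image.open(BytesIO(data))
    img.thumbnail(max_size)  # resizes in place, preserving aspect ratio
    return img


def sample_png(size=(200, 100)) -> bytes:
    """Build a small PNG in memory to stand in for a downloaded file."""
    buffer = BytesIO()
    Image.new("RGB", size, "navy").save(buffer, format="PNG")
    return buffer.getvalue()


if __name__ == "__main__":
    data = sample_png()
    if is_valid_image(data):
        thumb = make_thumbnail(data)
        print("thumbnail size:", thumb.size)
```

Checking the file before processing it means a corrupt or mislabeled download is rejected early, and `thumbnail()` never enlarges an image, so small originals pass through at their own size.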
In the next chapter, we'll look at another source of data that's often difficult to work with. The ubiquitous PDF file format is difficult to process without specialized tools. The format is designed to allow consistent display and printing of documents. It's not, however, very helpful for analysis of content. We'll need...