Clojure for Data Science

By: Henry Garner

Download the data


This chapter makes use of follower data from the Twitter social network. The data is provided as part of the Stanford Large Network Dataset Collection. You can download the Twitter data from https://snap.stanford.edu/data/egonets-Twitter.html.

We'll be making use of both the twitter.tar.gz and twitter_combined.txt.gz files. Both should be downloaded and decompressed inside the sample code's data directory.

Note

The sample code for this chapter is available at https://github.com/clojuredatascience/ch8-network-analysis.

As usual, a script has been provided that will download and decompress the data for you. You can run it by executing the following command from within the project directory:

script/download-data.sh

If you'd like to run this chapter's examples, make sure you download the data before continuing.
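
The contents of the provided script are not shown here. If you would rather fetch the files by hand, the steps might look roughly like the following sketch. It assumes that both archives are served directly under https://snap.stanford.edu/data/ (check the dataset page for the actual links) and that the data directory is named data relative to the project root. The function names and the --run flag are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumptions: the base URL and directory name below are guesses for
# illustration; override them via the environment if they differ.
BASE_URL="${BASE_URL:-https://snap.stanford.edu/data}"
DATA_DIR="${DATA_DIR:-data}"

# Fetch one file into DATA_DIR, skipping the download if it is already there.
download() {
  local name="$1"
  mkdir -p "$DATA_DIR"
  [ -f "$DATA_DIR/$name" ] || curl -fsSL -o "$DATA_DIR/$name" "$BASE_URL/$name"
}

# Decompress one file inside DATA_DIR, keeping the original archive.
unpack() {
  local name="$1"
  case "$name" in
    *.tar.gz) tar -xzf "$DATA_DIR/$name" -C "$DATA_DIR" ;;
    *.gz)     gzip -dc "$DATA_DIR/$name" > "$DATA_DIR/${name%.gz}" ;;
  esac
}

# Hypothetical flag: only touch the network when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  for f in twitter.tar.gz twitter_combined.txt.gz; do
    download "$f"
    unpack "$f"
  done
fi
```

Saved under a name of your choosing and invoked with --run from the project directory, this should leave the extracted contents of twitter.tar.gz and a decompressed twitter_combined.txt inside the data directory.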

Inspecting the data

Let's look at one of the files in the Twitter directory, specifically the twitter/98801140.edges file. If you open it in a text editor...