Clojure for Data Science

By: Henry Garner

Whole-graph analysis


Let's turn our attention away from the smaller graphs we've been working with towards the larger graph of followers provided by the twitter_combined.txt file. This contains over 2.4 million edges and will provide a more interesting sample to work with.

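One of the simplest metrics to determine about a whole graph is its density. For directed graphs, this is defined as the number of edges |E| divided by |V|(|V| - 1), where |V| is the number of vertices. In other words, it is the ratio of the edges actually present to the maximum number of directed edges the graph could contain.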
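To make the definition concrete, here is a minimal sketch that computes D = |E| / (|V|(|V| - 1)) directly from a sequence of edge pairs, without any graph library. The function name directed-density is hypothetical and is not part of Loom. The sketch assumes that every vertex appears in at least one edge, that there are no self-loops, and that duplicate edges should be counted once.

```clojure
;; Hypothetical helper (not part of Loom): density of a directed graph
;; given as a sequence of [from to] pairs.
;; Assumes every vertex appears in at least one edge and no self-loops.
(defn directed-density [edges]
  (let [edge-set (set edges)                      ; duplicate edges count once
        vertices (set (mapcat identity edge-set)) ; every endpoint is a vertex
        v        (count vertices)
        e        (count edge-set)]
    (if (< v 2)
      0.0                                         ; fewer than 2 vertices: no possible edges
      (double (/ e (* v (dec v)))))))

;; Three vertices with all six possible directed edges present
(directed-density [[1 2] [2 1] [1 3] [3 1] [2 3] [3 2]])
;; => 1.0

;; Three vertices with only two of the six possible edges
(directed-density [[1 2] [2 3]])
;; 2 / (3 * 2), or one third
```

Building the edge set first means repeated lines in an edge list do not inflate the result.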

For a complete graph (one where every vertex is connected to every other vertex by an edge), the density would be 1. By contrast, an empty graph (one with no edges) would have a density of 0. Loom implements graph density as the alg/density function. Let's calculate the density of the larger Twitter graph:

(defn ex-8-17 []
  (->> (load-edges "twitter_combined.txt")
       (apply loom/digraph)
       (alg/density)
       (double)))

;; 2.675E-4

This seems very sparse, but bear in mind that a value of 1 would...