Clojure for Data Science

By: Henry Garner

Cluster evaluation measures


At the bottom of the file we looked at in the previous section, you'll see some statistics that suggest how well the data has been clustered:

Inter-Cluster Density: 0.6135607681542804
Intra-Cluster Density: 0.6957348405534836

These two numbers can be considered equivalent to the within-group and between-group variance measures we saw in Chapter 2, Inference and Chapter 3, Correlation. Ideally, we are seeking a lower variance (or a higher density) within clusters than between clusters.

Inter-cluster density

Inter-cluster density is the average distance between cluster centroids. Good clusters probably don't have centers that are too close to each other. If they did, it would indicate the clustering is creating groups with similar features, and perhaps drawing distinctions between cluster members that are hard to support.

Thus, ideally our clustering will produce clusters with a large inter-cluster distance.
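
The definition above, the average distance between cluster centroids, can be sketched in a few lines of Clojure. This is a minimal illustration rather than the code that produced the statistics shown earlier: it assumes Euclidean distance, averages over every unordered pair of centroids, and applies no normalization, so its output will not be on the same scale as the reported figures. The function names and the sample centroids are hypothetical.

```clojure
(defn euclidean-distance
  "Euclidean distance between two equal-length numeric vectors.
  Assumption: the text does not say which distance measure is used."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y]
                              (let [d (- x y)]
                                (* d d)))
                            a b))))

(defn mean-centroid-distance
  "Mean distance over all unordered pairs of centroids.
  Returns nil when there are fewer than two centroids."
  [centroids]
  (let [cs        (vec centroids)
        n         (count cs)
        ;; Visit each pair (i, j) with i < j exactly once.
        distances (for [i (range n)
                        j (range (inc i) n)]
                    (euclidean-distance (cs i) (cs j)))]
    (when (seq distances)
      (/ (reduce + distances) (count distances)))))

;; Hypothetical centroids forming a 3-4-5 right triangle.
(mean-centroid-distance [[0 0] [0 3] [4 0]])
;; => 4.0
```

A larger value means the centroids are, on average, further apart, which is the property the text describes as desirable.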

Intra-cluster density

By contrast...