Network Science with Python

By: David Knickerbocker

Overview of this book

Network analysis is often taught with tiny or toy datasets, leaving you with a limited scope of learning and practical usage. Network Science with Python helps you extract relevant data, draw conclusions, and build networks using industry-standard, practical datasets. You'll begin by learning the basics of natural language processing, network science, and social network analysis, then move on to programmatically building and analyzing networks. You'll get a hands-on understanding of the data sources, how to extract data from them, how to interact with that data, and how to draw insights from it. This is a hands-on book grounded in theory, with specific technical and mathematical details for future reference. As you progress, you'll learn to construct and clean networks, conduct network analysis and egocentric network analysis, detect communities, and use network data with machine learning. You'll also explore network analysis concepts from the basics to an advanced level. By the end of the book, you'll be able to identify network data and use it to extract unconventional insights to comprehend the complex world around you.
Table of Contents (17 chapters)

Part 1: Getting Started with Natural Language Processing and Networks
Part 2: Graph Construction and Cleanup
Part 3: Network Science and Social Network Analysis

Unsupervised Machine Learning on Network Data

Welcome to another exciting chapter exploring network science and data science together. In the previous chapter, we used supervised ML to train a model that could detect the revolutionaries from the book Les Miserables using graph features alone. In this chapter, we are going to explore unsupervised ML and how it can be useful in graph analysis as well as in node classification with supervised ML.

The order of these two chapters was intentional. I wanted you to learn how to create your own training data using graphs rather than being reliant on embeddings from unsupervised ML. The reason for this is important: when you rely on embeddings, you lose the ability to interpret why ML models classify nodes the way they do. You lose interpretability and explainability. The classifier essentially works as a black box, no matter which model you use. I wanted to show you the interpretable and explainable approach...
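To make that contrast concrete, here is a minimal sketch, not taken from the book's code, assuming NetworkX, pandas, and scikit-learn are available. It builds a table of hand-crafted, interpretable graph features for the Les Miserables graph and, for comparison, an unsupervised embedding of the same nodes; a truncated SVD of the adjacency matrix is used here purely as a stand-in for whichever embedding technique you prefer.

# A minimal sketch (not from the book) contrasting hand-crafted,
# interpretable graph features with an unsupervised embedding.
# Assumes NetworkX, pandas, and scikit-learn are installed; truncated SVD
# of the adjacency matrix stands in for any other embedding method.
import networkx as nx
import pandas as pd
from sklearn.decomposition import TruncatedSVD

G = nx.les_miserables_graph()
nodes = list(G.nodes())

# Interpretable features: every column has a clear, explainable meaning.
features = pd.DataFrame({
    "degree_centrality": nx.degree_centrality(G),
    "pagerank": nx.pagerank(G),
    "betweenness": nx.betweenness_centrality(G),
    "clustering": nx.clustering(G),
}).loc[nodes]

# Unsupervised embedding: the columns are abstract dimensions with no
# human-readable meaning, so any downstream classifier acts as a black box.
A = nx.to_numpy_array(G, nodelist=nodes)
embedding = TruncatedSVD(n_components=8, random_state=42).fit_transform(A)
embedding_df = pd.DataFrame(
    embedding,
    index=nodes,
    columns=[f"dim_{i}" for i in range(embedding.shape[1])],
)

print(features.head())
print(embedding_df.head())

With the hand-crafted table, a classifier's decisions can be traced back to named features such as PageRank or clustering coefficient; with the embedding, the columns are anonymous dimensions, which is exactly the loss of interpretability described above.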