Network Science with Python

By: David Knickerbocker

Overview of this book

Network analysis is often taught with tiny or toy data sets, leaving you with a limited scope of learning and practical usage. Network Science with Python helps you extract relevant data, draw conclusions, and build networks using practical, industry-standard data sets. You’ll begin by learning the basics of natural language processing, network science, and social network analysis, then move on to programmatically building and analyzing networks. You’ll get a hands-on understanding of the data source, how to extract data from it and interact with it, and how to draw insights from it. This is a hands-on book grounded in theory, with specific technical and mathematical details for future reference. As you progress, you’ll learn to construct and clean networks, conduct network analysis, egocentric network analysis, and community detection, and use network data with machine learning. You’ll also explore network analysis concepts, from basic to advanced. By the end of the book, you’ll be able to identify network data and use it to extract unconventional insights to comprehend the complex world around you.
Table of Contents (17 chapters)

  • Part 1: Getting Started with Natural Language Processing and Networks
  • Part 2: Graph Construction and Cleanup
  • Part 3: Network Science and Social Network Analysis

Why NLP in a network analysis book?

Most of you probably bought this book in order to learn applied social network analysis using Python. So, why am I explaining NLP? Here’s why: if you know your way around NLP and are comfortable extracting data from text, that can be extremely powerful for creating network data and investigating the relationship between things that are mentioned in text. Here is an example from the book Alice’s Adventures in Wonderland by Lewis Carroll, my favorite book.

“Once upon a time there were three little sisters” the Dormouse began in a great hurry; “and their names were Elsie, Lacie, and Tillie; and they lived at the bottom of a well.”

What can we observe from these words? What characters or places are mentioned? We can see that the Dormouse is telling a story about three sisters named Elsie, Lacie, and Tillie and that they lived at the bottom of a well. If you allow yourself to think in terms of relationships, you will see that these relationships exist:

  • Three sisters -> Dormouse (he either knows them or knows a story about them)
  • Dormouse -> Elsie
  • Dormouse -> Lacie
  • Dormouse -> Tillie
  • Elsie -> bottom of a well
  • Lacie -> bottom of a well
  • Tillie -> bottom of a well

It’s also very likely that the three sisters all know each other, so additional relationships emerge:

  • Elsie -> Lacie
  • Elsie -> Tillie
  • Lacie -> Elsie
  • Lacie -> Tillie
  • Tillie -> Elsie
  • Tillie -> Lacie

Our minds build these relationship maps so effectively that we don’t even realize that we are doing it. The moment I read that the three were sisters, I drew a mental image that the three knew each other.
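
The relationships listed above can be written down as data. The following is a minimal sketch in plain Python, not the book's own code: it stores each relationship as a directed (source, target) pair and groups the pairs into an adjacency map. The variable names (sisters, edges, adjacency) are illustrative assumptions, and no graph library is assumed.

```python
from collections import defaultdict
from itertools import permutations

sisters = ["Elsie", "Lacie", "Tillie"]

# Directed (source, target) pairs read off the Dormouse passage.
edges = [("Three sisters", "Dormouse")]
edges += [("Dormouse", sister) for sister in sisters]
edges += [(sister, "bottom of a well") for sister in sisters]

# Inference from the text: sisters know each other, in both directions.
edges += list(permutations(sisters, 2))

# Adjacency map: each source node points to the set of nodes it links to.
adjacency = defaultdict(set)
for source, target in edges:
    adjacency[source].add(target)

for node in sorted(adjacency):
    print(node, "->", sorted(adjacency[node]))
```

An edge list like this is a common starting format for graph tools, so the same pairs could later be loaded into a network library for analysis.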

Let’s try another example from a current news story: Ocasio-Cortez doubles down on Manchin criticism (CNN, June 2021: https://edition.cnn.com/videos/politics/2021/06/13/alexandria-ocasio-cortez-joe-manchin-criticism-sot-sotu-vpx.cnn).

Rep. Alexandria Ocasio-Cortez (D-NY) says that Sen. Joe Manchin (D-WV) not supporting a house voting rights bill is being influenced by the legislation’s sweeping reforms to limit the role of lobbyists and the influence of “dark money” political donations.

Who is mentioned, and what is their relationship? What can we learn from this short text?

  • Rep. Alexandria Ocasio-Cortez is talking about Sen. Joe Manchin
  • Both are Democrats
  • Sen. Joe Manchin does not support a house voting rights bill
  • Rep. Alexandria Ocasio-Cortez claims that Sen. Joe Manchin is being influenced by the legislation’s reforms
  • Rep. Alexandria Ocasio-Cortez claims that Sen. Joe Manchin is being influenced by “dark money” political donations
  • There may be a relationship between Sen. Joe Manchin and “dark money” political donors

We can see that even a small amount of text has a lot of information embedded.
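
One way to record this kind of information is as (subject, relation, object) triples, where the relation labels the edge between two entities. The sketch below is only an illustration under that assumption: the triples are written by hand from the observations above, and the relation names (talks_about, member_of, and so on) are hypothetical labels chosen for this example, not output from any NLP tool.

```python
from collections import defaultdict

AOC = "Alexandria Ocasio-Cortez"
MANCHIN = "Joe Manchin"

# Hand-written (subject, relation, object) triples; relation names are hypothetical.
triples = [
    (AOC, "talks_about", MANCHIN),
    (AOC, "member_of", "Democratic Party"),
    (MANCHIN, "member_of", "Democratic Party"),
    (MANCHIN, "does_not_support", "House voting rights bill"),
    # The text only suggests this link, so the label is deliberately tentative.
    (MANCHIN, "possibly_linked_to", '"dark money" political donors'),
]

# Group the labeled edges by their subject.
by_subject = defaultdict(list)
for subject, relation, obj in triples:
    by_subject[subject].append((relation, obj))

# Every distinct entity that appears as a subject or an object becomes a node.
entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})

for subject, facts in by_subject.items():
    for relation, obj in facts:
        print(f"{subject} --{relation}--> {obj}")
```

Keeping the relation as a label preserves the difference between a stated fact and a possible link, which a bare pair of names would lose.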

If you are stuck trying to figure out relationships when dealing with text, consider the “W” questions (and How), which I learned in college creative writing classes as a way to explain things in a story:

  • Who: Who is involved? Who is telling the story?
  • What: What is being talked about? What is happening?
  • When: When does this take place? What time of the day is it?
  • Where: Where is this taking place? What location is being described?
  • Why: Why is this important?
  • How: How is the thing being done?

If you ask these questions, you will notice relationships between the things mentioned in the text, which is foundational for building and analyzing networks. If you can do this, you can identify relationships in text. If you can identify relationships in text, you can use that knowledge to build social networks. If you can build social networks, you can analyze relationships, detect importance, detect weaknesses, and use this knowledge to gain a really profound understanding of whatever it is that you are analyzing. You can also use this knowledge to attack dark networks (crime, terrorism, and so on) or protect people, places, and infrastructure. These aren’t just insights. They are actionable insights, the best kind.

That is the point of this book. Marrying NLP with social network analysis and data science is extremely powerful for acquiring a new perspective. If you can scrape or get the data you need, you can really gain deep knowledge of how things relate and why.

That is why this chapter aims to explain very simply what NLP is, how to use it, and what it can be used for. But before that, let’s get into the history for a bit, as that is often left out of NLP books.