Book Image

Network Science with Python

By : David Knickerbocker
Book Image

Network Science with Python

By: David Knickerbocker

Overview of this book

Network analysis is often taught with tiny or toy data sets, leaving you with a limited scope of learning and practical usage. Network Science with Python helps you extract relevant data, draw conclusions and build networks using industry-standard – practical data sets. You’ll begin by learning the basics of natural language processing, network science, and social network analysis, then move on to programmatically building and analyzing networks. You’ll get a hands-on understanding of the data source, data extraction, interaction with it, and drawing insights from it. This is a hands-on book with theory grounding, specific technical, and mathematical details for future reference. As you progress, you’ll learn to construct and clean networks, conduct network analysis, egocentric network analysis, community detection, and use network data with machine learning. You’ll also explore network analysis concepts, from basics to an advanced level. By the end of the book, you’ll be able to identify network data and use it to extract unconventional insights to comprehend the complex world around you.
Table of Contents (17 chapters)
1
Part 1: Getting Started with Natural Language Processing and Networks
5
Part 2: Graph Construction and Cleanup
9
Part 3: Network Science and Social Network Analysis

How has NLP helped me?

I want to do more than show you how to do something. I want to show you how it can help you. The easiest way for me to explain how this may be useful to you is to explain how it has been useful to me. There are a few things that were really appealing to me about NLP.

Simple text analysis

I am really into reading and grew up loving literature, so when I first learned that NLP techniques could be used for text analysis, I was immediately intrigued. Even something as simple as counting the number of times a specific word is mentioned in a book can be interesting and spark curiosity. For example, how many times is Eve, the first woman mentioned in the Bible, mentioned in the book of Genesis? How many times is she mentioned in the entire Bible? How many times is Adam mentioned in Genesis? How many times is Adam mentioned in the entire Bible? For this example, I’m using the King James Version.

Let’s compare:

Name

Genesis Count

Bible Count

Eve

2

4

Adam

17

29

Figure 1.1 – Table of biblical mentions of Adam and Eve

These are interesting results. Even if we do not take the Genesis story as literal truth, it’s still an interesting story, and we often hear about Adam and Eve. Hence, it is easy to assume they would be mentioned as frequently, but Adam is actually mentioned over eight times as often as Eve in Genesis and over seven times as often in the entire Bible. Part of understanding literature is building a mental map of what is happening in the text. To me, it’s a little odd that Eve is mentioned so rarely, and it makes me want to investigate the amount of male versus female mentions or maybe investigate which books of the Bible have the largest number of female characters and then what those stories are about. If nothing else, it sparks curiosity, which should lead to deeper analysis and understanding.

NLP gives me tools to extract quantifiable data from raw text. It empowers me to use that quantifiable data in research that would have been impossible otherwise. Imagine how long it would have taken to do this manually, reading every single page without missing a detail, to get to these small counts. Now, consider that this took me about a second once the code was written. That is powerful, and I can use this functionality to research any person in any text.

Community sentiment analysis

Second, and related to the point I just made, NLP provides ways to investigate themes and sentiments carried by groups of people. During the Covid-19 pandemic, a group of people has been vocally anti-mask, spreading fear and misinformation. If I capture text from those people, I can use sentiment analysis techniques to determine and measure sentiment shared by that group of people across various topics. I did exactly that. I was able to scrape thousands of tweets and understand what they really felt about various topics such as Covid-19, the flag, the Second Amendment, foreigners, science, and many more. I did this exact analysis for one of my projects, #100daysofnlp, on LinkedIn (https://www.linkedin.com/feed/hashtag/100daysofnlp/), and the results were illuminating.

NLP allows us to objectively investigate and analyze group sentiment about anything, so long as we are able to acquire text or audio. Much of Twitter data is posted openly and is consumable by the public. Just one caveat: if you are going to scrape, please use your abilities for good, not evil. Use this to understand what people are thinking about and what they feel. Use it for research, not surveillance.

Answer previously unanswerable questions

Really, what ties these two together is that NLP helps me answer questions that were previously unanswerable. In the past, we could have conversations discussing what people felt and why or describing literature we had read but only at a surface level. With what I am going to show you how to do in this book, you will no longer be limited to the surface. You will be able to map out complex relationships that supposedly took place thousands of years ago very quickly, and you will be able to closely analyze any relationship and even the evolution of relationships. You will be able to apply these same techniques to any kind of text, including transcribed audio, books, news articles, and social media posts. NLP opens up a universe of untapped knowledge.

Safety and security

In 2020, the Covid-19 pandemic hit the entire world. When it hit, I was worried that many people would lose their jobs and their homes, and I feared that the world would spin out of control into total anarchy. It’s gotten bad but we do not have armed gangs raiding towns and communities around the world. Tension is up, but I wanted a way to keep an eye on violence in my area in real time. So, I scraped police tweets from several police accounts in my area, as they report all kinds of crime in near real time, including violence. I created a dataset of violent versus non-violent tweets, where violent tweets contained words such as shooting, stabbing, and other violence-related words. I then trained an ML classifier to detect tweets having to do with violence. Using the results of this classifier, I can keep an eye on violence in my area. I can keep an eye on anything that I want, so long as I can get text but knowing how my area is doing in terms of street violence could serve as a warning or give me comfort. Again, replacing what was limited to feeling and emotion with quantifiable data is powerful.