Network Science with Python

By: David Knickerbocker

Overview of this book

Network analysis is often taught with tiny or toy data sets, which limits both what you learn and how far you can apply it. Network Science with Python helps you extract relevant data, build networks, and draw conclusions using practical, industry-standard data sets. You’ll begin by learning the basics of natural language processing, network science, and social network analysis, then move on to programmatically building and analyzing networks. You’ll get a hands-on understanding of data sources, data extraction, working with the data, and drawing insights from it. This is a hands-on book grounded in theory, with specific technical and mathematical details for future reference. As you progress, you’ll learn to construct and clean networks, conduct network analysis and egocentric network analysis, detect communities, and use network data with machine learning. You’ll also explore network analysis concepts, from the basics to an advanced level. By the end of the book, you’ll be able to identify network data and use it to extract unconventional insights to comprehend the complex world around you.
Table of Contents (17 chapters)
Part 1: Getting Started with Natural Language Processing and Networks
Part 2: Graph Construction and Cleanup
Part 3: Network Science and Social Network Analysis

Summary

In this chapter, we covered two easier ways to scrape text data from the internet. Newspaper3k made short work of scraping news websites, returning clean text, headlines, keywords, and more. It allowed us to skip steps we’d done using BeautifulSoup and get to clean data much more quickly. We used this clean text and NER to create and visualize networks. Finally, we used the Twitter Python library and V2 API to scrape tweets and connections, and we also used tweets to create and visualize networks. Between what you learned in this chapter and the previous one, you now have a lot of flexibility in scraping the web and converting text into networks so that you can explore embedded and hidden relationships.
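
As a reminder of the core idea behind converting text into a network, here is a minimal sketch using only the Python standard library. It is not the chapter's actual code: the entity lists are hand-written stand-ins for NER output, the names in them are made up, and `build_edge_weights` is a hypothetical helper introduced for illustration. It assumes that two entities mentioned in the same sentence should be linked, and it counts how often each pair co-occurs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical NER output: one list of named entities per sentence.
# In practice, these would come from an NER model run on scraped text.
sentence_entities = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Carol"],
    ["Bob", "Dave"],
]


def build_edge_weights(entity_lists):
    """Count how often each pair of entities appears in the same sentence."""
    weights = Counter()
    for entities in entity_lists:
        # Deduplicate and sort so that (Alice, Bob) and (Bob, Alice)
        # are counted as the same undirected edge.
        unique = sorted(set(entities))
        for source, target in combinations(unique, 2):
            weights[(source, target)] += 1
    return weights


edge_weights = build_edge_weights(sentence_entities)

# Print a weighted edge list: source, target, co-occurrence count.
for (source, target), weight in sorted(edge_weights.items()):
    print(source, target, weight)
```

A weighted edge list like this one can then be loaded into a graph library for analysis and visualization. The same pattern would apply to tweets, with the account that posted and the accounts it mentions taking the place of co-occurring entities.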

Here is some good news: collecting and cleaning data is the most difficult part of what we are going to do, and this marks the end of data collection and most of the cleanup. After this chapter, we will mostly be having fun with networks!

In the next chapter, we will...