Network Science with Python

By: David Knickerbocker

Overview of this book

Network analysis is often taught with tiny or toy data sets, which limits both what you learn and how far you can apply it. Network Science with Python helps you extract relevant data, build networks, and draw conclusions using practical, industry-standard data sets. You’ll begin by learning the basics of natural language processing, network science, and social network analysis, then move on to programmatically building and analyzing networks. You’ll get a hands-on understanding of data sources, data extraction, working with the extracted data, and drawing insights from it. This is a hands-on book grounded in theory, with specific technical and mathematical details for future reference. As you progress, you’ll learn to construct and clean networks, conduct network analysis, egocentric network analysis, and community detection, and use network data with machine learning. You’ll also explore network analysis concepts from the basics to an advanced level. By the end of the book, you’ll be able to identify network data and use it to extract unconventional insights that help you comprehend the complex world around you.
Table of Contents (17 chapters)
Part 1: Getting Started with Natural Language Processing and Networks
Part 2: Graph Construction and Cleanup
Part 3: Network Science and Social Network Analysis

Choosing between libraries, APIs, and source data

As part of this demonstration, I showed several ways to pull useful data from the internet. Several libraries can load data directly, but they are limited to the data they bundle. NLTK offers only a small portion of the complete Gutenberg book archive, so we had to use the Requests library to load The Metamorphosis. I also demonstrated that Requests, paired with BeautifulSoup, can easily harvest links and raw text.
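As a reminder of what that pairing looks like, here is a minimal sketch rather than the exact code from the demonstration. The helper names fetch_html, extract_links, and extract_text are hypothetical and chosen only for illustration; the parsing helpers work on any HTML string, so they can be tried without touching the network.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_links(html):
    """Return the href of every anchor tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


def extract_text(html):
    """Return the visible text of the page with the markup removed."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text(separator=" ", strip=True)
```

In practice you would pass the result of fetch_html for a page you are allowed to scrape into extract_links or extract_text. The text that comes back is still raw, so expect to clean it before building anything on top of it.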

Python libraries can make loading data very easy when they include data loading functionality, but you are limited to what those libraries make available. If you just want some data to play with and minimal cleanup, this may be ideal, but there will still be some cleanup. You will not get away from that when working with text.
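To make the trade-off concrete, the sketch below loads a book from NLTK's bundled Gutenberg sample and then applies a small cleanup pass. It is an illustration under assumptions, not the book's own code: load_gutenberg_text and clean_text are hypothetical names, and the cleanup shown (normalizing line endings and collapsing whitespace within paragraphs) is only one simple example of the kind of work text usually needs.

```python
import re


def load_gutenberg_text(file_id):
    """Return the raw text of one book from NLTK's Gutenberg sample.

    NLTK is imported here so that clean_text can be used without it.
    """
    import nltk
    from nltk.corpus import gutenberg

    nltk.download("gutenberg", quiet=True)  # fetches the corpus if missing
    return gutenberg.raw(file_id)


def clean_text(raw):
    """Split raw text into paragraphs and collapse whitespace in each one."""
    text = raw.replace("\r\n", "\n")  # normalize Windows line endings
    paragraphs = re.split(r"\n\s*\n", text)  # blank lines separate paragraphs
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return [p for p in cleaned if p]  # drop empty paragraphs
```

For example, clean_text(load_gutenberg_text("carroll-alice.txt")) would return a list of paragraphs from one of the books NLTK does ship. A book outside that sample, such as The Metamorphosis, has to be loaded some other way.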

Other web resources expose their own APIs, which makes it pretty simple to load data after sending a request to them...
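A minimal sketch of that request-then-load pattern follows. Everything specific in it is an assumption made for illustration: the helper names fetch_json and titles_from_payload are hypothetical, and so is the payload shape (a "results" list of objects with a "title" key). A real API will document its own endpoints, parameters, and response format.

```python
import requests


def fetch_json(url, params=None, timeout=10):
    """Send a GET request and return the decoded JSON body."""
    response = requests.get(url, params=params, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def titles_from_payload(payload):
    """Pull titles out of a payload shaped like {"results": [{"title": ...}]}.

    This shape is hypothetical; adapt the keys to the API you are using.
    """
    return [item["title"] for item in payload.get("results", [])]
```

Keeping the parsing step separate from the request makes it easy to try the parsing on a saved response before sending any live requests.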