Applied Geospatial Data Science with Python

By: David S. Jordan

Overview of this book

Data scientists faced with large volumes of data can lose sight of how to present geospatial analyses in a way that makes sense to everyone. Using Python to visualize data helps stakeholders in less technical roles understand the problem and seek solutions. The goal of this book is to help data scientists and GIS professionals learn and implement geospatial data science workflows using Python. Throughout this book, you’ll uncover numerous geospatial Python libraries with which you can develop end-to-end spatial data science workflows. You’ll learn how to read, process, and manipulate spatial data effectively. With data in hand, you’ll move on to crafting spatial data visualizations to better understand and tell the story of your data through static and dynamic mapping applications. As you progress through the book, you’ll find yourself developing geospatial AI and ML models focused on clustering, regression, and optimization. The use cases can be leveraged as building blocks for more advanced work in a variety of industries. By the end of the book, you’ll be able to tackle unfamiliar datasets, find meaningful correlations, and build geospatial data models.
Table of Contents (17 chapters)

  • Part 1: The Essentials of Geospatial Data Science
  • Chapter 1: Introducing Geographic Information Systems and Geospatial Data Science
  • Part 2: Exploratory Spatial Data Analysis
  • Part 3: Geospatial Modeling Case Studies

To get the most out of this book

We assume that readers of this book come from a background in either data science or GIS. We also expect that you have some foundational knowledge of working with Python.

Software/hardware covered in the book    Operating system requirements
Anaconda Distribution                    Windows, macOS, or Linux
Python 3.10.6                            Windows, macOS, or Linux

Additionally, you will need to set up keys to several APIs, from which you will access data throughout the book.

API                 Setup link
OpenMapQuest        https://developer.mapquest.com/user/login/sign-up
Google Maps         https://developers.google.com/maps
US Census Bureau    https://api.census.gov/data/key_signup.html
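
One common way to keep API keys out of your notebooks is to read them from environment variables. The following is a minimal sketch under that assumption, not the book's own method; the variable names (MAPQUEST_API_KEY, GOOGLE_MAPS_API_KEY, CENSUS_API_KEY) and the load_api_keys helper are hypothetical and can be renamed freely.

```python
import os

# Hypothetical environment variable names; choose whatever names you prefer.
API_KEY_VARS = {
    "mapquest": "MAPQUEST_API_KEY",
    "google_maps": "GOOGLE_MAPS_API_KEY",
    "census": "CENSUS_API_KEY",
}


def load_api_keys(env=None):
    """Return a dict of service name -> key, raising if any key is missing."""
    env = os.environ if env is None else env
    # Collect every unset (or empty) variable so the error lists them all at once.
    missing = [var for var in API_KEY_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[var] for name, var in API_KEY_VARS.items()}
```

Calling load_api_keys() with no arguments reads from the current process environment, so the keys must be set before you launch Jupyter.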

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

As with most data science work, the quality of your hardware can affect the runtime of some analyses. To avoid potential issues, we recommend hardware similar to or better than the following:

  • NVIDIA GeForce GTX 1050
  • 16 GB RAM

We recommend that you use Anaconda as your Python environment and package manager. To begin installing the Anaconda Distribution, you’ll want to visit the Anaconda Distribution installation website at https://docs.anaconda.com/anaconda/install/. The Python version we are using throughout this book is 3.10.6, as this is one of the latest versions of Python available at the time of publication. Leveraging this version will ensure that all packages are compatible. To make the setup of your virtual environment as streamlined as possible, we’ve exported our environment.yml file and uploaded it to the GitHub repository at https://github.com/PacktPublishing/Applied-Geospatial-Data-Science-with-Python.

To set up the virtual environment called GeospatialPython, launch Anaconda Prompt and execute the following command:

conda env create --file environment.yml

You’ll need to replace environment.yml with the full path of the downloaded file.

After the environment is installed, you can activate it by executing the following command:

conda activate GeospatialPython
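
Once the environment is active, a quick check from Python can confirm that you are running the interpreter you expect. This is a small sketch rather than a step from the book; the matches_expected helper is hypothetical, and the check relies on the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated.

```python
import os
import sys

EXPECTED = (3, 10)  # major.minor version the book targets (3.10.6)


def matches_expected(version_info=None, expected=EXPECTED):
    """Return True if the major.minor version equals the expected pair."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) == expected


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Python version:", sys.version.split()[0])
    # CONDA_DEFAULT_ENV is unset when no conda environment is active.
    print("Active conda environment:", os.environ.get("CONDA_DEFAULT_ENV"))
    print("Matches book version:", matches_expected())
```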

Throughout the book, you’ll see the following code:

data_path = r'YOUR FILE PATH'

Anytime you see this, you’ll need to replace ‘YOUR FILE PATH’ with the path to the data folder, which can be downloaded from the GitHub repo. The data is available in the repo’s Releases section at https://github.com/PacktPublishing/Applied-Geospatial-Data-Science-with-Python/releases. It comes in three parts:

  • Data.pt1.zip
  • LCMS_CONUS_v2021-7_Land_Cover_Annual_2021.zip
  • S2B_MSIL2A_20220504T161829_N0400_R040_T17TNF_20220504T210702.SAFE.zip

You’ll need to extract the contents of these zip folders and store the contents in a single folder. You’ll then point to this folder any time you see ‘YOUR FILE PATH’ referenced in the Jupyter notebooks.
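
If you prefer to script the extraction, the step above can be sketched with Python's standard zipfile module. This assumes the three archives have already been downloaded into one folder; the extract_all helper and its argument names are hypothetical, and the sketch does not flatten any subfolders the archives may contain.

```python
import zipfile
from pathlib import Path


def extract_all(zip_dir, data_dir):
    """Extract every .zip archive in zip_dir into data_dir and return data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    # Process archives in name order so repeated runs behave the same way.
    for archive in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
    return data_dir


# Example call (placeholder paths, left commented out):
# data_path = extract_all(r"YOUR DOWNLOAD FOLDER", r"YOUR FILE PATH")
```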

Similarly, you will also see the following code from time to time:

out_path = r"YOUR FILE PATH"

You’ll need to substitute YOUR FILE PATH in this code reference with the directory to which you’d like the output to be saved.
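
Writing to a directory that does not yet exist will fail, so it can help to create the output folder up front. The sketch below is one way to do that with pathlib; the output_file helper and the example file name are hypothetical and do not come from the book's notebooks.

```python
from pathlib import Path


def output_file(out_path, filename):
    """Create out_path if needed and return the full path for filename inside it."""
    out_dir = Path(out_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


# Example call (placeholder path and file name, left commented out):
# results_file = output_file(r"YOUR FILE PATH", "results.csv")
```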