Applied Geospatial Data Science with Python

By: David S. Jordan

Overview of this book

Data scientists, when presented with a myriad of data, can often lose sight of how to present geospatial analyses in a meaningful way so that they make sense to everyone. Using Python to visualize data helps stakeholders in less technical roles understand the problem and seek solutions. The goal of this book is to help data scientists and GIS professionals learn and implement geospatial data science workflows using Python. Throughout this book, you’ll uncover numerous geospatial Python libraries with which you can develop end-to-end spatial data science workflows. You’ll learn how to read, process, and manipulate spatial data effectively. With data in hand, you’ll move on to crafting spatial data visualizations to better understand and tell the story of your data through static and dynamic mapping applications. As you progress through the book, you’ll find yourself developing geospatial AI and ML models focused on clustering, regression, and optimization. The use cases can be leveraged as building blocks for more advanced work in a variety of industries. By the end of the book, you’ll be able to tackle random data, find meaningful correlations, and build geospatial data models.
Table of Contents (17 chapters)
Part 1: The Essentials of Geospatial Data Science
Chapter 1: Introducing Geographic Information Systems and Geospatial Data Science
Part 2: Exploratory Spatial Data Analysis
Part 3: Geospatial Modeling Case Studies

What is GIS?

GIS stands for Geographic Information Systems. GIS are computerized systems used in the creation, collection, organization, analysis, and visualization of geospatial data. Geospatial data is a representation of the real world and it is rooted in geography. Geography is the study of the physical features of the Earth and its atmosphere, as well as how human activity impacts both. Human activity is looked at through many lenses, such as population distribution and land usage.

To represent the Earth in a GIS, you will leverage one of two data formats: vectors or rasters. Figure 1.1 shows a stylized version of how real-world data can be represented in vector and raster formats. We’ll define and discuss both of these terms in more detail in Chapter 2, What Is Geospatial Data and Where Can I Find It?

Figure 1.1 – Real-world data in vector and raster format
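
To make the distinction concrete, here is a minimal sketch that stores the same square feature both ways using only the Python standard library. The coordinates, the 5 x 5 grid, and the helper names `point_in_polygon` and `rasterize` are assumptions made for illustration; they are not taken from any GIS package covered later in this book.

```python
# Vector: the feature is stored as an explicit list of (x, y) vertices.
# This square is a hypothetical example, not real-world data.
polygon = [(1.0, 1.0), (4.0, 1.0), (4.0, 4.0), (1.0, 4.0)]


def point_in_polygon(x, y, ring):
    """Ray-casting test: return True if (x, y) lies inside the ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, width, height, cell_size=1.0):
    """Raster: the same feature as a grid of cells.

    A cell is 1 if its center falls inside the polygon, otherwise 0.
    Row 0 is the southernmost row of the grid.
    """
    grid = []
    for row in range(height):
        y = (row + 0.5) * cell_size
        grid.append([
            1 if point_in_polygon((col + 0.5) * cell_size, y, ring) else 0
            for col in range(width)
        ])
    return grid


raster = rasterize(polygon, width=5, height=5)

# Print with the northernmost row first, as a map would be drawn.
for row in reversed(raster):
    print(" ".join(str(value) for value in row))
```

The vector version needs only four vertices and keeps the boundary exact, while the raster version approximates the same area with nine filled cells whose precision depends on the cell size.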

A typical GIS enables you to query and combine data assets based on their spatial relationships to one another. The results are then visualized in the form of a static or interactive map or within a mapping application.
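
As a rough illustration of combining two data assets by a spatial relationship, the sketch below pairs each neighborhood with its nearest fire station using great-circle (haversine) distance. The place names and coordinates are invented for the example, and `haversine_km` and `nearest` are hypothetical helpers; a real GIS would typically offer this as a built-in spatial join or nearest-neighbor query.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Two hypothetical layers, each mapping a name to (latitude, longitude).
neighborhoods = {"Riverside": (40.00, -75.00), "Hilltop": (40.10, -75.20)}
fire_stations = {"Station 1": (40.01, -75.01), "Station 2": (40.09, -75.18)}


def nearest(point, candidates):
    """Return the (name, location) pair in candidates closest to point."""
    return min(candidates.items(),
               key=lambda item: haversine_km(*point, *item[1]))


# Combine the two layers: attach the nearest station to each neighborhood.
nearest_station = {}
for name, location in neighborhoods.items():
    station, station_location = nearest(location, fire_stations)
    distance = haversine_km(*location, *station_location)
    nearest_station[name] = (station, round(distance, 2))

for name, (station, distance) in nearest_station.items():
    print(f"{name}: {station} ({distance} km)")
```

The join key here is location rather than a shared ID column, which is the main thing that separates a spatial query from an ordinary table join.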

Geospatial data stored within a GIS comes in many different formats and from many different domains. A GIS used in local government may include information on the land parcels of local neighborhoods, the roads that run through those neighborhoods, and the location of public service infrastructure, such as hospitals and fire stations. A GIS servicing a local weather station may include some of these assets, but will likely also include other types of data, such as real-time feeds of storm paths, rainfall totals, and wind speeds at various points around an area at various times. In Chapter 2, What Is Geospatial Data and Where Can I Find It?, we will focus more on the various types of spatial data, their file formats, including shapefiles and GeoJSON, and some of the public sources in which spatial data can be found.
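
As a small preview of one of these formats, the sketch below builds a GeoJSON `FeatureCollection` holding a hospital (a point) and a land parcel (a polygon) with Python's built-in `json` module. The feature names, property keys, and coordinates are made up for illustration. Two details worth noticing are that GeoJSON positions are written as longitude first, then latitude, and that a polygon ring repeats its first vertex at the end to close the shape.

```python
import json

# A hypothetical local-government dataset with two features.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            # Point geometry: a single [longitude, latitude] position.
            "geometry": {"type": "Point", "coordinates": [-75.01, 40.01]},
            "properties": {"name": "Example Hospital", "kind": "hospital"},
        },
        {
            "type": "Feature",
            # Polygon geometry: a list of rings; the first and last
            # positions of each ring are identical to close it.
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [-75.02, 40.00],
                    [-75.00, 40.00],
                    [-75.00, 40.02],
                    [-75.02, 40.02],
                    [-75.02, 40.00],
                ]],
            },
            "properties": {"name": "Example Parcel", "kind": "parcel"},
        },
    ],
}

# Serialize to GeoJSON text, then read it back as any consumer would.
geojson_text = json.dumps(feature_collection, indent=2)
parsed = json.loads(geojson_text)

summary = [
    (feature["properties"]["name"], feature["geometry"]["type"])
    for feature in parsed["features"]
]
print(summary)
```

Because GeoJSON is plain JSON, it can be inspected with any text editor, which makes it a convenient format for small datasets and for exchanging data with web mapping tools.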

In your day-to-day life, you’ve likely used a GIS platform or application more frequently than you may have realized. Take, for instance, Google Maps, which is arguably the most used GIS application in the world. Google Maps allows you to search for points of interest around you, such as a coffee shop or an auto mechanic, find directions to these points of interest, and also understand adverse conditions such as rush-hour traffic or roadworks that may impact your commute. There are many other forms of GIS applications out there, including applications that trace the route of an Amazon delivery vehicle as it approaches your home, applications that help you understand where public buses and transit hubs are located, and even applications that help monitor the spread of infectious diseases, as we mentioned in the preface to this book.

In addition to web and mobile GIS applications, there are also desktop-based, point-and-click GIS platforms that allow users to perform more complex spatial operations and analyses. These platforms are typically used by specialized GIS practitioners, who often have the title of geographer, GIS analyst, GIS engineer, or GIS specialist. These platforms are used in a variety of industries for different purposes. A GIS analyst in local government may use a desktop GIS platform to edit parcel boundaries within a town, while a GIS analyst for a rail operator may use it to monitor the operational status and location of each railcar. The uses of GIS and the industries in which it is used are nearly limitless.

Typically, desktop GIS platforms are provided by vendors, with the most dominant vendor in the space being Esri. Esri’s proprietary software integrates with numerous applications from other vendors, including Microsoft products and Autodesk’s AutoCAD. In more recent versions of its software, Esri has also extended its platform to work with many open source data science languages, such as Python and R, and Integrated Development Environments (IDEs), such as Jupyter Notebook. This book will focus on open source Python packages that do not require licensing. In Chapter 4, Exploring Geospatial Data Science Packages, we will cover packages including GeoPandas, PySAL, and GeoViews, along with many others you’ll leverage in the case studies later in this book.

Now that you have an understanding of GIS, let’s define what data science is. As we define data science, you’ll hopefully begin to see how GIS and data science interact.