Learning Geospatial Analysis with Python

By: Joel Lawhead

Overview of this book

Geospatial analysis is used in almost every field you can think of, from medicine to defense to farming. This book will guide you gently into this exciting and complex field. It walks you through the building blocks of geospatial analysis and shows how to apply them to influence decision making using the latest Python software. Learning Geospatial Analysis with Python, 2nd Edition uses the expressive and powerful Python 3 programming language to guide you through geographic information systems, remote sensing, topography, and more, while providing a framework for you to approach geospatial analysis effectively, but on your own terms. We start by giving you a little background on the field and a survey of the techniques and technology used. We then split the field into its component specialty areas: GIS, remote sensing, elevation data, advanced modeling, and real-time data. This book will teach you everything you need to know about geospatial analysis, from using a particular software package or API to applying generic algorithms. This book focuses on pure Python whenever possible to minimize compiling platform-dependent binaries, so that you don't become bogged down in just getting ready to do analysis. This book will round out your technical library through handy recipes that will give you a good understanding of a field that supplements many modern-day human endeavors.

Common vector GIS concepts

This section will discuss the different types of GIS processes commonly used in geospatial analysis. This list is not exhaustive; however, it provides you with the essential operations that all other operations are based on. If you understand these operations, you can quickly understand much more complex processes as they are either derivatives or combinations of these processes.

Data structures

GIS vector data uses coordinates consisting of, at a minimum, an x horizontal value and a y vertical value to represent a location on the Earth. In many cases, a point may also contain a z value. Other ancillary values are possible including measurements or timestamps.

These coordinates are used to form points, lines, and polygons to model real-world objects. Points can be a geometric feature in and of themselves or they can connect line segments. Closed areas created by line segments are considered polygons. Polygons model objects such as buildings, terrain, or political boundaries.

A GIS feature can consist of a single point, line, or polygon, or it can consist of more than one shape. For example, in a GIS polygon dataset containing world country boundaries, the Philippines, which is made up of more than 7,000 islands, would be represented as a single country made up of thousands of polygons.
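
As a rough illustration of these structures, the following sketch models points, lines, polygons, and a multi-part feature with plain Python tuples, lists, and dictionaries. The coordinate values, the `island_nation` dictionary layout, and the `part_count` helper are made up for this example and do not correspond to any particular file format or library.

```python
# A point is an (x, y) tuple; a z value or timestamp could be appended.
point = (-89.5, 30.3)

# A line is an ordered sequence of points.
line = [(-89.5, 30.3), (-89.4, 30.4), (-89.3, 30.35)]

# A polygon is a closed ring: the first and last points are identical.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

# A multi-part feature: one set of attributes shared by several polygons
# (hypothetical layout, similar in spirit to an island nation).
island_nation = {
    "attributes": {"name": "Example Islands"},
    "polygons": [
        [(0, 0), (1, 0), (1, 1), (0, 0)],
        [(5, 5), (6, 5), (6, 6), (5, 5)],
    ],
}


def part_count(feature):
    """Return the number of polygons that make up a feature."""
    return len(feature["polygons"])


print(part_count(island_nation))  # 2
```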

Vector data typically represents topographic features better than raster data. Vector data has better accuracy potential and is more precise. However, collecting vector data on a large scale has traditionally been more costly than collecting raster data.

Two other important terms related to vector data structures are bounding box and convex hull. The bounding box, or minimum bounding box, is the smallest possible rectangle, aligned with the coordinate axes, that contains all of the points in a dataset.

[Figure: a bounding box drawn around a collection of points]
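
A bounding box can be computed from the minimum and maximum x and y values. The sketch below assumes points are stored as `(x, y)` tuples; the `bounding_box` function name and the `(min_x, min_y, max_x, max_y)` return order are choices made for this example.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a sequence of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


points = [(1, 5), (3, 2), (6, 4), (4, 8), (2, 3)]
print(bounding_box(points))  # (1, 2, 6, 8)
```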

The convex hull of a dataset is similar to the bounding box, but instead of a rectangle, it is the smallest convex polygon that can contain the dataset. The bounding box of a dataset always contains its convex hull.

[Figure: the same point data as the previous example, with the convex hull polygon shown in red]
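
One way to compute a convex hull in pure Python is Andrew's monotone chain algorithm, sketched below. This is only one of several hull algorithms and is not taken from the book; the function names are illustrative. The hull is returned in counterclockwise order without repeating the first point.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull of (x, y) points using the monotone chain method."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
print(convex_hull(points))  # [(0, 0), (4, 0), (4, 4), (0, 4)]
```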

Buffer

A buffer operation can be applied to spatial objects including points, lines, or polygons. This operation creates a polygon around the object at a specified distance. Buffer operations are used for proximity analysis, for example, establishing a safety zone around a dangerous area.

[Figure: black shapes represent the original geometry, while red outlines represent the larger buffer polygons generated from the original shapes]
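
The simplest case, buffering a single point, can be sketched as a circle approximated by a polygon. The example assumes planar coordinates in which the distance uses the same units as x and y; buffering lines and polygons is considerably more involved and is usually left to a geometry library. The `buffer_point` name and the `segments` parameter are hypothetical.

```python
import math


def buffer_point(center, distance, segments=32):
    """Approximate a circular buffer around a point as a closed polygon ring."""
    cx, cy = center
    ring = []
    for i in range(segments):
        angle = 2 * math.pi * i / segments
        ring.append((cx + distance * math.cos(angle),
                     cy + distance * math.sin(angle)))
    ring.append(ring[0])  # close the ring
    return ring


safety_zone = buffer_point((10.0, 20.0), 5.0, segments=16)
print(len(safety_zone))  # 17
```

More segments give a smoother circle at the cost of more points to store and draw.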

Dissolve

A dissolve operation creates a single polygon out of adjacent polygons. A common use for a dissolve operation is to merge two adjacent properties in a tax database that have been purchased by a single owner. Dissolves are also used to simplify data extracted from remote sensing.
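
For polygons that share edges exactly, a dissolve can be sketched by discarding the shared edges and stitching the remaining ones into a single ring. This simplified example assumes that all rings are closed, wound in the same direction, share identical vertices along their common boundary, and produce one outer ring without holes. Real GIS software handles far more general cases; the `dissolve` function here is illustrative only.

```python
def dissolve(polygons):
    """Merge edge-adjacent polygon rings into a single closed ring."""
    edges = set()
    for ring in polygons:
        for a, b in zip(ring, ring[1:]):
            if (b, a) in edges:
                # The same edge in the opposite direction is a shared boundary.
                edges.remove((b, a))
            else:
                edges.add((a, b))
    # Follow the remaining directed edges around the outer boundary.
    next_vertex = dict(edges)
    start = min(next_vertex)
    merged = [start]
    current = next_vertex[start]
    while current != start:
        merged.append(current)
        current = next_vertex[current]
    merged.append(start)
    return merged


parcel_a = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
parcel_b = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]
print(dissolve([parcel_a, parcel_b]))
# [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 0)]
```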

Generalize

Objects that have more points than necessary for the geospatial model can be generalized to reduce the number of points used to represent the shape. This operation usually requires a few attempts to find the optimal number of points without compromising the overall shape. It is a data optimization technique that simplifies data for more efficient computing or better visualization. This technique is useful in web-mapping applications. Computer screens have a limited resolution, traditionally quoted as 72 dots per inch (dpi), although modern displays vary widely. Highly detailed point data, which would not be visible at that resolution, can be reduced so that less bandwidth is used to send a visually equivalent map to the user.
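
A widely used generalization method is the Douglas-Peucker algorithm, sketched below as one possible approach; the text above does not prescribe a specific algorithm. The `tolerance` parameter is the maximum distance a dropped point may lie from the simplified line, and tuning it corresponds to the "few attempts" mentioned above. Function names are illustrative.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Reduce a line's point count with the Douglas-Peucker algorithm."""
    if len(points) < 3:
        return list(points)
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest point and simplify both halves recursively.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0), (5, 0.1), (6, 0)]
print(simplify(line, 0.5))  # [(0, 0), (2, 0), (3, 3), (4, 0), (6, 0)]
```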

Intersection

An intersection operation tests whether any part of a feature intersects one or more other features. This operation is used for spatial queries in proximity analysis and is often a follow-on operation to a buffer analysis.
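
At its lowest level, an intersection test between two lines or polygon boundaries reduces to checking pairs of line segments. The sketch below uses the standard orientation test for segments. It only detects boundaries that cross or touch, so a polygon lying entirely inside another would need an additional point-in-polygon check. All function names are illustrative.

```python
def orientation(a, b, c):
    """Return 1 for a counterclockwise turn a->b->c, -1 for clockwise, 0 if collinear."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def on_segment(a, b, p):
    """True if p, already known to be collinear with a-b, lies between a and b."""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, q1, q2):
    """True if segment p1-p2 crosses or touches segment q1-q2."""
    o1, o2 = orientation(p1, p2, q1), orientation(p1, p2, q2)
    o3, o4 = orientation(q1, q2, p1), orientation(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear special cases: an end point lies on the other segment.
    return ((o1 == 0 and on_segment(p1, p2, q1))
            or (o2 == 0 and on_segment(p1, p2, q2))
            or (o3 == 0 and on_segment(q1, q2, p1))
            or (o4 == 0 and on_segment(q1, q2, p2)))


def lines_intersect(line_a, line_b):
    """True if any segment of line_a intersects any segment of line_b."""
    return any(segments_intersect(a1, a2, b1, b2)
               for a1, a2 in zip(line_a, line_a[1:])
               for b1, b2 in zip(line_b, line_b[1:]))


road = [(0, 0), (4, 4)]
river = [(0, 4), (4, 0)]
print(lines_intersect(road, river))  # True
```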

Merge

A merge operation combines two or more non-overlapping shapes into a single multishape object. In a multishape object, the shapes maintain separate geometries but are treated by the GIS as a single feature with a single set of attributes.
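
The idea can be sketched with plain dictionaries: the geometries are kept as separate rings, but they end up under one attribute record. The feature layout (`"attributes"` and `"polygons"` keys) and the `merge_features` function are assumptions made for this example, not a standard format.

```python
def merge_features(features, attributes):
    """Combine several features into one multishape feature with shared attributes."""
    return {
        "attributes": dict(attributes),
        # Geometries are collected, not altered: each ring stays separate.
        "polygons": [ring for f in features for ring in f["polygons"]],
    }


island_a = {"attributes": {"id": 1}, "polygons": [[(0, 0), (1, 0), (1, 1), (0, 0)]]}
island_b = {"attributes": {"id": 2}, "polygons": [[(5, 5), (6, 5), (6, 6), (5, 5)]]}

merged = merge_features([island_a, island_b], {"name": "Example Islands"})
print(len(merged["polygons"]))  # 2
```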

Point in polygon

A fundamental geospatial operation is checking to see whether a point is inside a polygon. This one operation is the atomic building block of many different types of spatial queries. In this discussion, a point on the boundary of the polygon is considered inside, although individual libraries may treat boundary points differently. Very few spatial queries exist that do not rely on this calculation in some way. However, it can be very slow when applied to a large number of points.

A common and efficient algorithm to detect whether a point is inside a polygon is the ray casting algorithm. First, a test is performed to see if the point is on the polygon boundary. Next, the algorithm draws a line (a ray) from the point in question in a single direction. The program counts the number of times the ray crosses the polygon boundary until it reaches the bounding box of the polygon, which is the smallest rectangle that can be drawn around the entire polygon. If the number of boundary crossings is odd, the point is inside. If the number is even, the point is outside.
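
The steps above can be sketched in pure Python as follows. The example assumes the polygon is a closed ring of `(x, y)` tuples (first and last points identical) and casts the ray in the positive x direction. The function names and the small `eps` tolerance used for the boundary test are choices made for this illustration.

```python
def point_on_segment(p, a, b, eps=1e-12):
    """True if point p lies on the segment a-b, within a small tolerance."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > eps:
        return False
    return (min(a[0], b[0]) - eps <= p[0] <= max(a[0], b[0]) + eps
            and min(a[1], b[1]) - eps <= p[1] <= max(a[1], b[1]) + eps)


def point_in_polygon(point, ring):
    """Ray casting test; points on the boundary count as inside."""
    x, y = point
    edges = list(zip(ring, ring[1:]))
    # Step 1: boundary check.
    if any(point_on_segment(point, a, b) for a, b in edges):
        return True
    # Step 2: count crossings of a ray cast toward positive x.
    inside = False
    for (x1, y1), (x2, y2) in edges:
        if (y1 > y) != (y2 > y):  # edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside  # odd count means inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
print(point_in_polygon((2, 2), square))  # True
print(point_in_polygon((5, 2), square))  # False
print(point_in_polygon((4, 2), square))  # True (on the boundary)
```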

Union

The union operation is less common but very useful for combining two or more overlapping polygons into a single shape. It is similar to dissolve, but in this case, the polygons overlap rather than being adjacent. Usually, this operation is used to clean up automatically generated feature datasets from remote sensing operations.

Join

A join or SQL join is a database operation used to combine two or more tables of information. Relational databases are designed to avoid storing redundant information for one-to-many relationships. For example, a U.S. state may contain many cities. Rather than creating a table for each state containing all of its cities, a table of states with numeric IDs is created, while a single table for all the cities in every state is created with a numeric state ID in each row.

In a GIS, you can also have spatial joins, which are part of the spatial extension software for a database. In a spatial join, you combine the attributes of two features in the same way that you do in a SQL join, but the relation is based on the spatial proximity of the two features. To follow the previous cities example, we could add the name of the county that each city resides in using a spatial join. The cities layer could be loaded over a county polygon layer whose attributes contain the county name. The spatial join would determine which city is in which county and perform a SQL join to add the county name to each city's attribute row.
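
Both kinds of join can be sketched with small in-memory tables. In this example, the attribute join matches on a numeric state ID, while the spatial join uses a compact ray casting test to find the county polygon that contains each city. All names, IDs, and coordinates are invented for illustration, and a real spatial database would perform these steps with SQL and a spatial index.

```python
states = {1: "State One", 2: "State Two"}

cities = [
    {"name": "City A", "state_id": 1, "xy": (2, 2)},
    {"name": "City B", "state_id": 2, "xy": (7, 1)},
]

counties = [
    {"name": "West County", "ring": [(0, 0), (5, 0), (5, 5), (0, 5), (0, 0)]},
    {"name": "East County", "ring": [(5, 0), (10, 0), (10, 5), (5, 5), (5, 0)]},
]


def attribute_join(cities, states):
    """Add the state name to each city row by matching on state_id."""
    return [dict(city, state=states[city["state_id"]]) for city in cities]


def inside(point, ring):
    """Compact ray casting test (boundary cases are not handled specially)."""
    x, y = point
    result = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            result = not result
    return result


def spatial_join(cities, counties):
    """Add the name of the containing county to each city row."""
    joined = []
    for city in cities:
        county = next((c["name"] for c in counties
                       if inside(city["xy"], c["ring"])), None)
        joined.append(dict(city, county=county))
    return joined


print([c["state"] for c in attribute_join(cities, states)])
# ['State One', 'State Two']
print([c["county"] for c in spatial_join(cities, counties)])
# ['West County', 'East County']
```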

Geospatial rules about polygons

In geospatial analysis, there are several general rules of thumb regarding polygons that are different from mathematical descriptions of polygons:

Polygons must have at least four points, because the first and last points must be the same
A polygon boundary should not overlap itself
Polygons in a layer shouldn't overlap
A polygon in a layer inside another polygon is considered a hole in the underlying polygon

Different geospatial software packages and libraries handle exceptions to these rules differently, which can lead to confusing errors or software behaviors. The safest route is to make sure that your polygons obey these rules.

There is one more important piece of information about polygons. A polygon is by definition a closed shape, which means that the first and last vertices of the polygon are identical. Some geospatial software will throw an error if you haven't explicitly duplicated the first point as the last point in the polygon dataset. Other software will automatically close the polygon without complaining. The data format that you use to store your geospatial data may also dictate how polygons are defined. How strictly closure is enforced is a gray area, but knowing this quirk will come in handy someday when you run into an error that you can't explain easily.
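
A defensive habit is to check and, if necessary, close polygon rings before handing them to other software. The following sketch covers only the point count and closure rules; detecting self-overlapping boundaries or overlapping polygons in a layer requires more work. The `close_ring` and `is_valid_ring` helpers are hypothetical names used for this example.

```python
def close_ring(ring):
    """Return a copy of the ring with the first point repeated at the end if needed."""
    ring = list(ring)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring


def is_valid_ring(ring):
    """Check the minimum point count and closure rules for a polygon ring."""
    return len(ring) >= 4 and ring[0] == ring[-1]


triangle = [(0, 0), (4, 0), (2, 3)]
print(is_valid_ring(triangle))              # False
print(is_valid_ring(close_ring(triangle)))  # True
```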
