Mastering Geospatial Analysis with Python

By: Silas Toms, Paul Crickard, Eric van Rees

Overview of this book

Python offers a host of open source libraries and tools for professional geoprocessing tasks, with no need to invest in expensive software. This book introduces Python developers, both new and experienced, to a variety of code libraries developed for geospatial analysis, statistical analysis, and data management. Its examples and code snippets explain how Python 3 differs from Python 2 and how these libraries can be used to solve age-old problems in geospatial analysis. You will begin by understanding what geoprocessing is and exploring the tools and libraries that Python 3 offers. You will then learn to use Python code libraries to read and write geospatial data, perform geospatial queries within databases, and use PyQGIS to automate analysis within the QGIS mapping suite. Moving forward, you will explore the newly released ArcGIS API for Python and ArcGIS Online to perform geospatial analysis and create ArcGIS Online web maps. Finally, you will take a deep dive into Python geospatial web frameworks and learn to create a geospatial REST API.

HDFS and Hive in Python

This book is about Python for geospatial development, so this section shows how to use Python for HDFS operations and Hive queries. Several Python libraries wrap Hadoop services, but none appears to have become the standard choice, and some, such as Snakebite, do not appear ready to run on Python 3. In this section, you will learn how to use two libraries: PyHive and PyWebHDFS. You will also learn how to use the Python subprocess module to execute HDFS and Hive commands.
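The subprocess approach needs no extra libraries, so it can be sketched before anything is installed. The following is a minimal sketch, assuming the hdfs and hive command-line tools are on the PATH; the helper names hdfs_command, hive_command, and run are hypothetical and chosen only for illustration.

```python
import shutil
import subprocess


def hdfs_command(*args):
    """Build an argument list for the `hdfs dfs` file system shell."""
    return ["hdfs", "dfs", *args]


def hive_command(query):
    """Build an argument list that passes one query string to `hive -e`."""
    return ["hive", "-e", query]


def run(command):
    """Run a command and return its standard output as text.

    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return completed.stdout


# Only call the tools when both are actually available on this machine.
if __name__ == "__main__" and shutil.which("hdfs") and shutil.which("hive"):
    print(run(hdfs_command("-ls", "/")))      # list the HDFS root directory
    print(run(hive_command("SHOW TABLES;")))  # list tables in the default database
```

Passing the command as a list rather than a single shell string avoids quoting problems when a Hive query contains spaces or quotes.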

To get PyHive, you can use conda and the following command:

conda install -c blaze pyhive

You may also need to install the sasl library:

conda install -c blaze sasl
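With both packages installed, a Hive query from Python might look like the following. This is a minimal sketch, not a definitive setup: the host, port, username, and database values are assumptions (10000 is the usual HiveServer2 port), and connect_hive, fetch_rows, and main are hypothetical helper names. PyHive is imported inside the function so the cursor helper can be reused or tested without a running cluster.

```python
def connect_hive(host="localhost", port=10000, username="hive", database="default"):
    """Open a HiveServer2 connection; every default here is an assumption."""
    from pyhive import hive  # imported lazily; requires PyHive to be installed
    return hive.Connection(host=host, port=port, username=username, database=database)


def fetch_rows(cursor, query):
    """Execute a query on a DB-API style cursor and return all rows as a list."""
    cursor.execute(query)
    return list(cursor.fetchall())


def main():
    """Print the tables in the default database of the assumed Hive server."""
    connection = connect_hive()
    try:
        for row in fetch_rows(connection.cursor(), "SHOW TABLES"):
            print(row)
    finally:
        connection.close()
```

Calling main() only makes sense once HiveServer2 is reachable at the assumed address; adjust the connection arguments to match your environment.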

With PyHive (and sasl, if needed) installed, you can run Hive queries from Python. You will also want to be able to move files to HDFS. To do so, you can install pywebhdfs:

conda install -c conda-forge pywebhdfs

The preceding command will install the library, and as always, you can...
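Moving a file to HDFS with pywebhdfs might then look like the sketch below. It rests on several assumptions: WebHDFS is enabled on a NameNode at localhost:50070 (the Hadoop 2 default HTTP port), the user is hdfs, and paths are given without a leading slash, as in the library's own examples. The helper names make_client, webhdfs_path, upload_text, and listed_names are hypothetical.

```python
def make_client(host="localhost", port="50070", user_name="hdfs"):
    """Create a WebHDFS client; host, port, and user are assumptions."""
    from pywebhdfs.webhdfs import PyWebHdfsClient  # requires pywebhdfs
    return PyWebHdfsClient(host=host, port=port, user_name=user_name)


def webhdfs_path(path):
    """Strip leading slashes, since paths are appended to the WebHDFS base URL."""
    return path.lstrip("/")


def upload_text(client, directory, filename, text):
    """Create the directory if needed, write the text as a file, return its path."""
    target_dir = webhdfs_path(directory)
    client.make_dir(target_dir)
    target = f"{target_dir}/{filename}"
    client.create_file(target, text)
    return target


def listed_names(listing):
    """Extract file names from the dictionary returned by client.list_dir()."""
    return [status["pathSuffix"] for status in listing["FileStatuses"]["FileStatus"]]
```

With a reachable cluster, a typical sequence would be client = make_client(), then upload_text(client, "user/hdfs/data", "points.csv", "id,x,y\n"), then listed_names(client.list_dir("user/hdfs/data")) to confirm the file arrived.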