Book Image

Geospatial Development By Example with Python

By : Pablo Carreira
5 (1)
Book Image

Geospatial Development By Example with Python

5 (1)
By: Pablo Carreira

Overview of this book

From Python programming good practices to the advanced use of analysis packages, this book teaches you how to write applications that will perform complex geoprocessing tasks that can be replicated and reused. Much more than simple scripts, you will write functions to import data, create Python classes that represent your features, and learn how to combine and filter them. With pluggable mechanisms, you will learn how to visualize data and the results of analysis in beautiful maps that can be batch-generated and embedded into documents or web pages. Finally, you will learn how to consume and process an enormous amount of data very efficiently by using advanced tools and modern computers’ parallel processing capabilities.
Table of Contents (17 chapters)
Geospatial Development By Example with Python
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Programming and running your first example


Now that we have all we need installed, we will go through our first example. In this example, we will test the installation and then see a glimpse of OGR's capabilities.

To do this, we will open a vector file containing the boundaries of all the countries in the world and make a list of country names. The objective of this simple example is to present the logic behind OGR objects and functions and give an understanding of how geospatial files are represented. Here's how:

  1. First, you need to copy the sample data provided with the book to your project folder. You can do this by dragging and dropping the data folder into the geopy folder. Make sure that the data folder is named data and that it's inside the geopy folder.

  2. Now, create a new directory for this chapter code, inside PyCharm. With your geopy project opened, right-click on the project folder and select New | Directory. Name it Chapter1.

  3. Create a new Python file. To do this, right-click on the Chapter1 folder and select New | Python File. Name it world_borders.py, and PyCharm will automatically open the file for editing.

  4. Type the following code in this file:

    import ogr
    # Open the shapefile and get the first layer.
    datasource = ogr.Open("../data/world_borders_simple.shp") 
    layer = datasource.GetLayerByIndex(0)
    print("Number of features: {}".format(layer.GetFeatureCount()))
  5. Now, run the code; in the menu bar, navigate to Run | Run, and in the dialog, choose world_borders. An output console will open at the bottom of the screen, and if everything goes fine, you should see this output:

    C:\Python27\python.exe C:/geopy/world_borders.py
    Number of features: 246
    
    Process finished with exit code 0
    

Congratulations! You successfully opened a Shapefile and counted the number of features inside it. Now, let's understand what this code does.

The first line imports the ogr package. From this point on, all the functions are available as ogr.FunctionName(). Note that ogr doesn't follow the Python naming conventions for functions.

The line after the comment opens the OGR datasource (this opens the shapefile containing the data) and assigns the object to the datasource variable. Note that the path, even on Windows, uses a forward slash (/) and not a backslash.

The next line gets the first layer of the data source by its index (0). Some data sources can have many layers, but this is not the case of a Shapefile, which has only one layer. So, when working with Shapefiles, we always know that the layer of interest is layer 0.

In the final line, the print statement prints the number of features returned by layer.GetFeatureCount(). Here, we will use Python's string formatting, where the curly braces are replaced by the argument passed to format().

Now, perform the following steps:

  1. In the same file, let's type the next part of our program:

    # Inspect the fields available in the layer.
    feature_definition = layer.GetLayerDefn()
    for field_index in range(feature_definition.GetFieldCount()):
        field_definition = feature_definition.GetFieldDefn(field_index)
        print("\t{}\t{}\t{}".format(field_index,
                                    field_definition.GetTypeName(),
                                    field_definition.GetName()))
  2. Rerun the code; you can use the Shift + F10 shortcut for this. Now, you should see the number of features as before plus a pretty table showing information on all the fields in the shapefile, as follows:

    Number of features: 246
     0 String  FIPS
     1 String  ISO2
     2 String  ISO3
     3 Integer UN
     4 String  NAME
     5 Integer POP2005
     6 Integer REGION
     7 Integer SUBREGION
    
    Process finished with exit code 0
    

    What happens in this piece of code is that feature_definition = layer.GetLayerDefn() gets the object that contains the definition of the features. This object contains the definition for each field and the type of geometry.

    In the for loop, we will get each field definition and print its index, name, and type. Note that the object returned by layer.GetLayerDefn() is not iterable, and we can't use for directly with it. So first, we will get the number of fields and use it in the range() function so that we can iterate through the indexes of the fields:

  3. Now, type the last part, as follows:

    # Print a list of country names.
    layer.ResetReading()
    for feature in layer:
        print(feature.GetFieldAsString(4))
  4. Run the code again and check the results in the output:

    Number of features: 246
     0 String FIPS
     1 String ISO2
     2 String ISO3
     3 Integer UN
     4 String NAME
     5 Integer POP2005
     6 Integer REGION
     7 Integer SUBREGION
    Antigua and Barbuda
    Algeria
    Azerbaijan
    Albania
    Armenia
    Angola
    ...
    Saint Barthelemy
    Guernsey
    Jersey
    South Georgia South Sandwich Islands
    Taiwan
    
    Process finished with exit code 0
    

The layer is iterable, but first, we need to ensure that we are at the beginning of the layer list with layer.ResetReading() (this is one of OGR's "gotcha" points).

The feature.GetFieldAsString(4) method returns the value of field 4 as a Python string. There are two ways of knowing whether the country names are in field 4:

  • Looking at the data's DBF file (by opening it with LibreOffice or Excel)

  • Looking at the table that we printed in the first part of the code

Your complete code should look similar to the following:

import ogr


# Open the shapefile and get the first layer.
datasource = ogr.Open("../data/world_borders_simple.shp")
layer = datasource.GetLayerByIndex(0)
print("Number of features: {}".format(layer.GetFeatureCount()))

# Inspect the fields available in the layer.
feature_definition = layer.GetLayerDefn()
for field_index in range(feature_definition.GetFieldCount()):
    field_definition = feature_definition.GetFieldDefn(field_index)
    print("\t{}\t{}\t{}".format(field_index,
                                field_definition.GetTypeName(),
                                field_definition.GetName()))

# Print a list of country names.
layer.ResetReading()
for feature in layer:
    print(feature.GetFieldAsString(4))