Book Image

Geospatial Development By Example with Python

By : Pablo Carreira
5 (1)
Book Image

Geospatial Development By Example with Python

5 (1)
By: Pablo Carreira

Overview of this book

From Python programming good practices to the advanced use of analysis packages, this book teaches you how to write applications that will perform complex geoprocessing tasks that can be replicated and reused. Much more than simple scripts, you will write functions to import data, create Python classes that represent your features, and learn how to combine and filter them. With pluggable mechanisms, you will learn how to visualize data and the results of analysis in beautiful maps that can be batch-generated and embedded into documents or web pages. Finally, you will learn how to consume and process an enormous amount of data very efficiently by using advanced tools and modern computers’ parallel processing capabilities.
Table of Contents (17 chapters)
Geospatial Development By Example with Python
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Transforming the coordinate system and calculating the area of all countries


Now, the objective is to know how much area is occupied by each country. However, the coordinates of country borders are expressed in latitude and longitude, and we can't calculate areas in this coordinate system. We want the area to be in the metric system, so first we need to convert the spatial reference system of the geometries.

Let's also take a step further in the programming techniques and start using functions to avoid the repetition of code. Perform the following steps:

  1. Create a new file in the Chapter1 directory, name this file world_areas.py, and program this first function:

    import ogr
    
    
    def open_shapefile(file_path):
        """Open the shapefile, get the first layer and returns
        the ogr datasource.
        """
        datasource = ogr.Open(file_path)
        layer = datasource.GetLayerByIndex(0)
        print("Opening {}".format(file_path))
        print("Number of features: {}".format(  
        layer.GetFeatureCount()))
        return datasource
  2. Run the code, go to Run | Run... in the menu, and select world_areas. If everything is correct, nothing should happen. This is because we are not calling our function. Add this line of code at the end and outside the function:

    datasource = open_shapefile("../data/world_borders_simple.shp") 
  3. Now, run the code again with Shift + F10 and check the output, as follows:

    Opening ../data/world_borders_simple.shp
    Number of features: 246
    
    Process finished with exit code 0
    

That's wonderful! You just created a piece of very useful and reusable code. You now have a function that can open any shapefile, print the number of features, and return the ogr datasource. From now on, you can reuse this function in any of your projects.

You are already familiar with how this code works, but there are a few novelties here that deserve an explanation. The def statement defines a function with the def function_name(arguments): syntax.

Remember when I told you that OGR doesn't follow Python's naming convention? Well, the convention is that function names should all be in lowercase with an underscore between words. A good hint for names is to follow the verb_noun rule.

Tip

These conventions are described in a document called PEP-8, where PEP stands for Python Enhancement Program. You can find this document at https://www.python.org/dev/peps/pep-0008/.

Right after the function's definition, you can see a description between triple quotes; this is a docstring, and it is used to document the code. It's optional but very useful for you to know what the function does.

Now, let's get back to our code. The second important thing to point out is the return statement. This makes the function return the values of the variables listed after the statement—in this case, the datasource.

Note

It's very important that all pieces of the OGR objects flow together through the program. In this case, if we return only the layer, for example, we will get a runtime error later in our program. This happens because in OGR internals, the layer has a reference to the data source, and when you exit a Python function, all objects that don't exit the function are trashed, and this breaks the reference.

Now, the next step is to create a function that performs the transformation. In OGR, the transformation is made in the feature's geometry, so we need to iterate over the features, get the geometry, and transform its coordinates. We will do this using the following steps:

  1. Add the following function to your world_areas.py file just after the open_shapefile function:

    def transform_geometries(datasource, src_epsg, dst_epsg):
        """Transform the coordinates of all geometries in the
     first layer.
     """
        # Part 1
        src_srs = osr.SpatialReference()
        src_srs.ImportFromEPSG(src_epsg)
        dst_srs = osr.SpatialReference()
        dst_srs.ImportFromEPSG(dst_epsg)
        transformation = osr.CoordinateTransformation(src_srs, dst_srs)
        layer = datasource.GetLayerByIndex(0)
    
        # Part 2
        geoms = []
        layer.ResetReading()
        for feature in layer:
            geom = feature.GetGeometryRef().Clone()
            geom.Transform(transformation)
            geoms.append(geom)
         return geoms

    The function takes three arguments: the ogr layer, the EPSG code of the coordinate system of the file, and the EPSG code for the transformation output.

    Here, it created an osr.CoordinateTransformation object; this object contains the instructions to perform the transformation.

    Probably by now, Pycharm should be complaining that osr is an unresolved reference; osr is the part of GDAL that deals with coordinate systems.

  2. Now, import the module by adding this line at the top of your code:

    import osr
    

    Here, the code iterates over all features, gets a reference to the geometry, and performs the transformation. As we don't want to change the original data, the geometry is cloned, and the transformation is made on the clone.

    Python lists are ordered; this means that the elements are in the same order in which they are appended to the list, and this order is always kept. This allows us to create a list of geometries in the same order of the features that are in the data source. This means that the geometries in the list and the features have the same index and can be related in the future by the index.

  3. Now, let's test the code; add the following lines at the end of the file (the first line is the one that you already added before):

    datasource = open_shapefile("../data/world_borders_simple.shp")
    layer = datasource.GetLayerByIndex(0)
    feature = layer.GetFeature(0)
    print("Before transformation:")
    print(feature.GetGeometryRef())
    transformed_geoms = transform_geometries(datasource, 4326, 3395)
    print("After transformation:")
    print(transformed_geoms[0])
  4. Finally, before you run the code, add one more import at the beginning of the program. It should be the first statement of your code, as follows:

    from __future__ import print_function

    This import allows us to use the print() function from Python 3 with the desired behavior, thus maintaining the compatibility.

  5. The complete code should look similar to this:

    from __future__ import print_function
    import ogr
    import osr
    
    
    def open_shapefile(file_path):
        ...
    
    def transform_geometries(datasource, src_epsg, dst_epsg):
        ...
        
    datasource = open_shapefile("../data/world_borders_simple.shp")
    layer = datasource.GetLayerByIndex(0)
    feature = layer.GetFeature(0)
    print("Before transformation:")
    print(feature.GetGeometryRef())
    transformed_geoms = transform_geometries(datasource, 4326, 3395)
    print("After transformation:")
    print(transformed_geoms[0])
  6. Run your program again by pressing Shift + F10. In the output, note the difference in the coordinates before and after the transformation:

    Opening ../data/world_borders_simple.shp
    Number of features: 246
    Before transformation:
    MULTIPOLYGON (((-61.686668 17.024441000000138 ... )))
    After transformation:
    MULTIPOLYGON (((-6866928.4704937246 ... )))
    
    Process finished with exit code 0
    
  7. Now, add another function. This function will calculate the area in square meters (because we will use the geometries that have coordinates in meters), convert the value (or not) to square kilometers or square miles, and store the values in another list with the same order as before. Execute the following code:

    def calculate_areas(geometries, unity='km2'):
        """Calculate the area for a list of ogr geometries."""
        # Part 1
        conversion_factor = {
            'sqmi': 2589988.11,
            'km2': 1000000,
            'm': 1}
        # Part2
        if unity not in conversion_factor:
            raise ValueError(
                "This unity is not defined: {}".format(unity))
        # Part 3
        areas = []
        for geom in geometries:
            area = geom.Area()
            areas.append(area / conversion_factor[unity])
        return areas

    Firstly, note that in the function definition, we use unity='km2'; this is a keyword argument, and when you call the functions, this argument is optional.

    In Part 1, a dictionary is used to define a few conversion factors for the area unit. Feel free to add more units if you wish. By the way, Python doesn't care if you use single or double quotes.

    In Part 2, a verification is made to check whether the passed unity exists and whether it is defined in conversion_factor. Another way of doing this is catching the exception later; however, for now, let's opt for readability.

    In Part 3, the code iterates the ogr geometries, calculates the area, converts the values, and puts it on a list.

  8. Now, to test the code, edit your first line, including division to the future imports. This will ensure that all divisions return floating point numbers and not integers. Here's how it should look:

    from __future__ import print_function, division
  9. Then, update the testing part of your code to the following:

    datasource = open_shapefile("../data/world_borders_simple.shp")
    transformed_geoms = transform_geometries(datasource, 4326, 3395)
    calculated_areas = calculate_areas(transformed_geoms, unity='sqmi')
    print(calculated_areas)
  10. Run it, change the unity, then run again, and note how the results change.

Very well, unity conversion is another very important procedure in geoprocessing, and you just implemented it in your calculate_areas function.

However, having a list of numbers as the output is not very useful to us. So, it's time to combine everything that we did so far in order to extract valuable information.