Book Image

Mastering PostGIS

By : Dominik Mikiewicz, Michal Mackiewicz , Tomasz Nycz
Book Image

Mastering PostGIS

By: Dominik Mikiewicz, Michal Mackiewicz , Tomasz Nycz

Overview of this book

PostGIS is open source extension onf PostgreSQL object-relational database system that allows GIS objects to be stored and allows querying for information and location services. The aim of this book is to help you master the functionalities offered by PostGIS- from data creation, analysis and output, to ETL and live edits. The book begins with an overview of the key concepts related to spatial database systems and how it applies to Spatial RMDS. You will learn to load different formats into your Postgres instance, investigate the spatial nature of your raster data, and finally export it using built-in functionalities or 3th party tools for backup or representational purposes. Through the course of this book, you will be presented with many examples on how to interact with the database using JavaScript and Node.js. Sample web-based applications interacting with backend PostGIS will also be presented throughout the book, so you can get comfortable with the modern ways of consuming and modifying your spatial data.
Table of Contents (9 chapters)

Importing vector data using ogr2ogr


ogr2ogr is the GDAL's vector transform utility. It is - not without reason - considered a Swiss Army knife for vector transformations. Despite its size, ogr2ogr can handle a wide range of formats and this makes it a really worthy tool.

We'll use ogr2ogr to import a few data formats other than SHP, although ogr2ogr can obviously import SHP too. For this scenario, we'll use some data downloaded earlier:

  • OS GB address base in GML format
  • OS GB code point polygons in MapInof MIF & TAB formats
  • USGS earthquakes in KML format

Some of the most common ogr2ogr params are:

  • -f: The format of the output (when importing to PostGIS it will be PostgreSQL).
  • -nln: Assigns a name to the layer. In the case of importing the data to PostGIS this will be the table name.
  • -select: Lets you specify a comma separated list of columns to pick.
  • -where: Lets you specify a sql like query to filter out the data.
  • append: Appends data to the output dataset.
  • overwrite: Overwrites the output datasource - in case of PostgreSQL it will drop and re-create a table.
  • s_srs: Specifies the input SRID.
  • t_srs: Transforms coordinates to the specified SRID.
  • a_srs: Specifies the output SRID.
    • lco NAME=VALUE: Layer creation options - these are driver specific; for pgsql options, see http://www.gdal.org/drv_pg.html. The most commonly used layer creation options are:
    • LAUNDER: This defaults to YES. It is responsible for converting column names into pgsql compatible ones (lower case, underscores).
    • PRECISION: This defaults to YES. It is responsible for using numeric and char types over float and varchar.
    • GEOMETRY_NAME: Defaults to wkb_geometry.

Note

For a full list of ogr2ogr params, just type ogr2ogr.Ogr2ogr has an accompanying utility called ogrinfo. This tool lets one inspect the metadata of a dataset. Verifying the metadata of any dataset prior to working with it is considered good practice and one should get into the habit of always using it before importing or exporting the data.

 

 

Importing GML

Let's start with importing the GML of the OS GB address base. First we'll see what data we're dealing with exactly:

    ogrinfo sx9090.gml

The following should be the output:

    Had to open data source read-only.
    INFO: Open of `sx9090.gml'
          using driver `GML' successful.
    1: Address (Point)

We can then review the layer information:

    ogrinfo sx9090.gml Address -so

You should see a similar output:

    Had to open data source read-only.
    INFO: Open of `sx9090.gml'
          using driver `GML' successful.

    Layer name: Address
    Geometry: Point
    Feature Count: 42861
    Extent: (-3.560100, 50.699470) - (-3.488340, 50.744770)
    Layer SRS WKT:
    GEOGCS["ETRS89",
        DATUM["European_Terrestrial_Reference_System_1989",
            SPHEROID["GRS 1980",6378137,298.257222101,
                AUTHORITY["EPSG","7019"]],
            TOWGS84[0,0,0,0,0,0,0],
            AUTHORITY["EPSG","6258"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AUTHORITY["EPSG","4258"]]
    gml_id: String (0.0)
    uprn: Real (0.0)
    osAddressTOID: String (20.0)
    udprn: Integer (0.0)
    subBuildingName: String (25.0)
    buildingName: String (36.0)
    thoroughfare: String (27.0)
    postTown: String (6.0)
    postcode: String (7.0)
    postcodeType: String (1.0)
    rpc: Integer (0.0)
    country: String (1.0)
    changeType: String (1.0)
    laStartDate: String (10.0)
    rmStartDate: String (10.0)
    lastUpdateDate: String (10.0)
    class: String (1.0)
    buildingNumber: Integer (0.0)
    dependentLocality: String (27.0)
    organisationName: String (55.0)
    dependentThoroughfare: String (27.0)
    poBoxNumber: Integer (0.0)
    doubleDependentLocality: String (21.0)
    departmentName: String (37.0)

Note

-so param makes ogrinfo display the data summary only; otherwise, info on a full dataset would be displayed.

Once we're ready to import the data, let's execute the following command:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" sx9090.gml -nln data_import.osgb_address_base_gml -geomfield geom

At this stage, the address GML should be available in our database.

Note

We did not specify the SRID of the GML data. This is because this information is present in GML and the utility picks it up automatically.

Importing MIF and TAB

Both MIF and TAB are MapInfo formats. TAB is the default format that contains formatting, while MIF is the interchange format.

We'll start with reviewing metadata:

    ogrinfo EX_sample.mif

And then:

    ogrinfo EX_sample.mif EX_Sample -so
    Had to open data source read-only.
    INFO: Open of `EX_sample.mif'
          using driver `MapInfo File' successful.

    Layer name: EX_sample
    Geometry: Unknown (any)
    Feature Count: 4142
    Extent: (281282.800000, 85614.570000) - (300012.000000, 
    100272.000000)
    Layer SRS WKT:
    PROJCS["unnamed",
        GEOGCS["unnamed",
            DATUM["OSGB_1936",
                SPHEROID["Airy 1930",6377563.396,299.3249646],
                TOWGS84[375,-111,431,-0,-0,-0,0]],
            PRIMEM["Greenwich",0],
            UNIT["degree",0.0174532925199433]],
        PROJECTION["Transverse_Mercator"],
        PARAMETER["latitude_of_origin",49],
        PARAMETER["central_meridian",-2],
        PARAMETER["scale_factor",0.9996012717],
        PARAMETER["false_easting",400000],
        PARAMETER["false_northing",-100000],
        UNIT["Meter",1]]
    POSTCODE: String (8.0)
    UPP: String (20.0)
    PC_AREA: String (2.0)

Note

Please note that ogrinfo projection metadata for our MIF file does not specify the EPSG code. This is fine, as the projection definition is present. But it will result in ogr2ogr creating a new entry in the spatial_ref_sys, which is not too good, as we'll end up with the wrong coordsys identifiers; the coordinate reference id will be the next available. This is because ogr2ogr expands the coordinate reference into a WKT string and then does a string comparison against the coordsys identifiers definitions in the spatial_ref_sys table; minor differences in formatting or precision will result in ogr2ogr failing to match coordsys. In such a scenario, a new entry will be created; for example, if you happen to use the EPSG:3857 coordinate system and the system's definition is slightly different and cannot be matched, the assigned SRID will not be 3857, but the next available ID will be chosen. A solution to this is to specify the exact coordinate system; ogr2ogr should output the data via the a_srs parameter.

Once ready, we can import the data:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" EX_sample.mif -nln data_import.osgb_code_point_polygons_mif -lco GEOMETRY_NAME=geom -a_srs EPSG:27700

If you followed the very same procedure for TAB file and loaded the data, both datasets are now in their own tables in the data_import schema:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" EX_sample.tab -nln data_import.osgb_code_point_polygons_tab -lco GEOMETRY_NAME=geom -a_srs EPSG:27700

Importing KML

As usual, we'll start with the dataset's metadata checkup:

    ogrinfo 2.5_day_age.kml

The output shows that there is more than one layer:

    INFO: Open of `2.5_day_age.kml'
          using driver `LIBKML' successful.
    1: Magnitude 5
    2: Magnitude 4
    3: Magnitude 3
    4: Magnitude 2

Therefore, in order to review metadata for each layer at once, the following command should be used:

    ogrinfo 2.5_day_age.kml -al -so

The output of the previous command is rather longish, so we'll truncate it a bit and only show the info for the first layer:

    INFO: Open of `2.5_day_age.kml'
          using driver `LIBKML' successful.

    Layer name: Magnitude 5
    Geometry: Unknown (any)
    Feature Count: 2
    Extent: (-101.000100, -36.056300) - (120.706400, 13.588200)
    Layer SRS WKT:
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            TOWGS84[0,0,0,0,0,0,0],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9108"]],
        AUTHORITY["EPSG","4326"]]
    Name: String (0.0)
    description: String (0.0)
    timestamp: DateTime (0.0)
    begin: DateTime (0.0)
    end: DateTime (0.0)
    altitudeMode: String (0.0)
    tessellate: Integer (0.0)
    extrude: Integer (0.0)
    visibility: Integer (0.0)
    drawOrder: Integer (0.0)
    icon: String (0.0)
    snippet: String (0.0)

This time, EPSG information is available, so we do not have to worry; ogr2ogr will create extra SRID definition in the database.

Once we've confirmed that this is the exact dataset we'd like to import, we can continue with the following command:

ogr2ogr -f "PostgreSQL" PG:"host=localhost port=5434 user=postgres dbname=mastering_postgis" 2.5_day_age.kml -nln data_import.usgs_earthquakes_kml -lco GEOMETRY_NAME=geom -append

Note

Note the append param in the command earlier. This is required because our KML has more than one layer and ogr2ogr is importing them one by one. Without the append param, only the first layer would be imported and then ogr2ogr would fail with a similar output:FAILED: Layer data_import.usgs_earthquakes_kml already exists, and -append not specified. Consider using -append, or -overwrite. ERROR 1: Terminating translation prematurely after failed translation of layer Magnitude 4 (use -skipfailures to skip errors) 

The cmd output should be similar to:

WARNING: Layer creation options ignored since an existing layer is
    being appended to.
WARNING: Layer creation options ignored since an existing layer is
    being appended to.
WARNING: Layer creation options ignored since an existing layer is
    being appended to.

At this stage, the KML dataset should have made it to our PostGIS database.

ogr2ogr GUI (Windows only)

For those preferring GUI over CMD, there is an alternative to plain old ogr2ogr--ogr2gui available from http://www.ogr2gui.ca/.

Simply download the required archive, extract it, and launch the appropriate .exe. After having played with ogr2ogr a bit already, the GUI should be rather self-explanatory: