Book Image

PostGIS Cookbook

Book Image

PostGIS Cookbook

Overview of this book

Table of Contents (18 chapters)
PostGIS Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Importing nonspatial tabular data (CSV) using PostGIS functions


There are a couple of alternative approaches to import a Comma Separated Values (CSV) file, which stores attributes and geometries in PostGIS. In this recipe, we will use the approach of importing such a file using the PostgreSQL COPY command and a couple of PostGIS functions.

Getting ready

We will import the firenews.csv file that stores a series of web news collected from the various RSS feeds related to forest fires in Europe in the context of the European Forest Fire Information System (EFFIS ), available at http://effis.jrc.ec.europa.eu/.

For each news feed, there are attributes like place name, size of the fire in hectares, URL, and so on. Most importantly, there are the x and y fields that give the position of the geolocalized news in decimal degrees (in the WGS 84 spatial reference system, SRID = 4326).

How to do it...

The steps you need to follow to complete this recipe are as shown:

  1. Inspect the structure of the CSV file, firenews.csv, which you can find within the book dataset (if you are on Windows, open the CSV file with an editor such as Notepad).

    Tip

    Downloading the example code

    You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

    $ cd ~/postgis_cookbook/data/chp01/
    $ head -n 5 firenews.csv
    

    The output of the preceding command is as shown:

    x,y,place,size,update,startdate,enddate,title,url-8.2499,42.37657,Avión,52,2011/03/07,2011/03/05,2011/03/06,Dos incendios calcinan 74 hectáreas el fin de semana,http://www.laregion.es/noticia/145578/incendios/calcinan/hectareas/semana/
    -8.1013,42.13924,Quintela de Leirado,22,2011/03/07,2011/03/06,2011/03/06,Dos incendios calcinan 74 hectáreas el fin de semana,http://www.laregion.es/noticia/145578/incendios/calcinan/hectareas/semana/
    3.48159,43.99156,Arrigas,4,2011/03/06,2011/03/05,2011/03/05,"À Arrigas, la forêt sous la menace d'un feu",http://www.midilibre.com/articles/2011/03/06/NIMES-A-Arrigas-la-foret-sous-la-menace-d-39-un-feu-1557923.php5
    6.1672,44.96038,Vénéon,9,2011/03/06,2011/03/06,2011/03/06,Isère Spectaculaire incendie dans la vallée du Vénéon,http://www.ledauphine.com/isere-sud/2011/03/06/isere-spectaculaire-incendie-dans-la-vallee-du-veneon
    
  2. Connect to PostgreSQL and create the following table:

    $ psql -U me -d postgis_cookbook
    postgis_cookbook=> CREATE TABLE chp01.firenews
    (
      x float8,
      y float8,
      place varchar(100),
      size float8,
      update date,
      startdate date,
      enddate date,
      title varchar(255),
      url varchar(255),
      the_geom geometry(POINT, 4326)
    );
    

    Note

    We are using the psql client for connecting to PostgreSQL, but you can use your favorite one, for example, pgAdmin.

    Using the psql client, we will not show the host and port options as we will assume that you are using a local PostgreSQL installation on the standard port.

    If that is not the case, please provide those options!

  3. Copy the records from the CSV file to the PostgreSQL table using the COPY command (if you are on Windows, use an input directory such as c:\temp instead of /tmp) as follows:

    postgis_cookbook=> COPY chp01.firenews (x, y, place, size, update, startdate, enddate, title, url) FROM '/tmp/firenews.csv' WITH CSV HEADER;
    

    Tip

    Make sure that the firenews.csv file is in a location accessible from the PostgreSQL process user. For example, in Linux, copy the file to the /tmp directory.

    If you are on Windows, you most likely will need to set the encoding to UTF-8 before copying:

    postgis_cookbook=# set client_encoding to 'UTF-8';
    
  4. Check if all of the records have been imported from the CSV file to the PostgreSQL table:

    postgis_cookbook=> SELECT COUNT(*) FROM chp01.firenews;
    

    The output of the preceding command is as follows:

    count
    -------
    3006
    (1 row)
    
  5. Check if a record related to this new table is in the PostGIS geometry_columns metadata view:

    postgis_cookbook=# SELECT f_table_name, f_geometry_column, coord_dimension, srid, type FROM geometry_columns where f_table_name = 'firenews';
    
     f_table_name | f_geometry_column | coord_dimension | srid  | type  
    --------------+-------------------+-----------------+-------+-------
     firenews     | the_geom          |    2            | 4326 | POINT
    (1 row)
    

    Tip

    Before PostGIS 2.0, you had to create a table containing spatial data in two distinct steps; in fact, the geometry_columns view was a table that needed to be manually updated. For that purpose, you had to use the AddGeometryColumn function to create the column. For example, for this recipe:

    postgis_cookbook=> CREATE TABLE chp01.firenews
    (
      x float8,
      y float8,
      place varchar(100),
      size float8,
      update date,
      startdate date,
      enddate date,
      title varchar(255),
      url varchar(255)
    )
    WITHOUT OIDS;
    postgis_cookbook=> SELECT AddGeometryColumn('chp01', 'firenews', 'the_geom', 4326, 'POINT', 2);
    chp01.firenews.the_geom SRID:4326 TYPE:POINT DIMS:2
    

    Tip

    In PostGIS 2.0, you can still use the AddGeometryColumn function if you wish; however, you need to set its use_typmod parameter to false.

  6. Now, import the points in the geometric column using the ST_MakePoint or ST_PointFromText functions (use one of the following two update commands):

    postgis_cookbook=> UPDATE chp01.firenews SET the_geom = ST_SetSRID(ST_MakePoint(x,y), 4326);
    postgis_cookbook=> UPDATE chp01.firenews SET the_geom = ST_PointFromText('POINT(' || x || ' ' || y || ')', 4326);
    
  7. Check how the geometry field has been updated in some records from the table:

    postgis_cookbook=# SELECT place, ST_AsText(the_geom) AS wkt_geom FROM chp01.firenews ORDER BY place LIMIT 5;
    

    The output of the preceding comment is as follows:

    place                             | wkt
    ----------------------------------------------------------
    Abbaslık                          | POINT(29.95...
    Abeledos, Montederramo            | POINT(-7.48...
    Abreiro                           | POINT(-7.28...
    Abrunheira, Montemor-o-Velho      | POINT(-8.72...
    Achaia                            | POINT(21.89...
    (5 rows)
    
  8. Finally, create a spatial index for the geometric column of the table:

    postgis_cookbook=> CREATE INDEX idx_firenews_geom ON chp01.firenews USING GIST (the_geom);
    

How it works...

This recipe showed you how to load nonspatial tabular data (in CSV format) in PostGIS using the COPY PostgreSQL command.

After creating the table and copying the CSV file rows to the PostgreSQL table, you updated the geometric column using one of the geometry constructor functions that PostGIS provides (ST_MakePoint and ST_PointFromText for bi-dimensional points).

These geometry constructors (in this case, ST_MakePoint and ST_PointFromText) must always provide the spatial reference system identifier (SRID) together with the point coordinates to define the point geometry.

Each geometric field added in any table in the database is tracked with a record in the geometry_columns PostGIS metadata view. In the previous PostGIS version (< 2.0), the geometry_fields view was a table and needed to be manually updated, possibly with the convenient AddGeometryColumn function.

For the same reason, to maintain the updated geometry_columns view, when dropping a geometry column or removing a spatial table in the previous PostGIS versions, there were the DropGeometryColumn and DropGeometryTable functions. With PostGIS 2.0, you don't need to use these functions any more, but you can safely remove the column or the table with the standard ALTER TABLE DROP COLUMN and DROP TABLE SQL commands.

In the last step of the recipe, you have created a spatial index on the table to improve performances. Please be aware that as in the case of alphanumerical database fields, indexes improve performances only when reading data using the SELECT command. In this case, you are making a number of updates on the table (INSERT, UPDATE, and DELETE); depending on the scenario, it could be less time consuming to drop and recreate the index after the updates.