Tuesday, April 30, 2013

GDAL/OGR 1.10.0 released

Finally, the new GDAL version has been released, and with it, the IRIS format driver that I coded some time ago for reading meteorological radar data from the Vaisala system.

For more information, you can read the post about the driver, and the new features list from the GDAL mailing list.

I hope that projects like ubuntugis update the version soon, so the users can easily install it.


Monday, April 1, 2013

Fast tip: Filtering features using OGR Python

Actualization

Take a look to the -where option in ogr2ogr. It does more or less the same, although coding it gives much more flexibility.


Original post

Playing with some files I found that I didn't know how to get only the features in a vectorial file fitting some condition.
So I made a little script.
You can find it at GitHub for download, or take a look at the code:

'''
Writes an ogr file from an input file, filtering the content
Roger Veciana, 23/mar/2013
'''
from osgeo import ogr
from os.path import exists
from os.path import basename
from os.path import splitext
from os import remove
import sys

def extract(in_file, out_file, filter_field, filter_values):
    '''
    Opens the input file, copies it into the oputput file, checking 
    the filter.
    '''
    print "Extracting data"
    
    in_ds = ogr.Open( in_file )
    if in_ds is None:
        print "Open failed.\n"
        sys.exit( 1 )
    in_lyr = in_ds.GetLayerByName( splitext(basename(in_file))[0] )
    if in_lyr is None:
        print "Error opening layer"
        sys.exit( 1 )


    ##Creating the output file, with its projection
    if exists(out_file):
        remove(out_file)
    driver_name = "ESRI Shapefile"
    drv = ogr.GetDriverByName( driver_name )
    if drv is None:
        print "%s driver not available.\n" % driver_name
        sys.exit( 1 )
    out_ds = drv.CreateDataSource( out_file )
    if out_ds is None:
        print "Creation of output file failed.\n"
        sys.exit( 1 )
    proj = in_lyr.GetSpatialRef()
    ##Creating the layer with its fields
    out_lyr = out_ds.CreateLayer( 
        splitext(basename(out_file))[0], proj, ogr.wkbMultiPolygon )
    lyr_def = in_lyr.GetLayerDefn ()
    for i in range(lyr_def.GetFieldCount()):
        out_lyr.CreateField ( lyr_def.GetFieldDefn(i) )
    
    
    ##Writing all the features
    in_lyr.ResetReading()

    for feat in in_lyr:
        value = feat.GetFieldAsString(feat.GetFieldIndex(filter_field))
        if filter_func(value, filter_values):
            out_lyr.CreateFeature(feat)
       

    in_ds = None
    out_ds = None


def filter_func(value, filter_values):
    '''
    The custom filter function. In this case, we check that the value is in the
    value list, stripping the white spaces. In the case of numeric values, a
    comparaison could be done
    '''
    if value.strip() in filter_values:
        return True
    else:
        return False

if __name__ == '__main__':
    extract('./recintos_etrs89.shp', './municipis.shp',
     'provincia', ('8', '25', '17', '43'))

The data I have used is from the CNIG, downloading the file LINEAS LIMITES MUNICIPALES. The file contains all the Spanish municipalities. The script will write the output file using the input data that matches as condition that the field provincia is one of the values 8, 25, 17 or 43, corresponding to the Catalan provinces.

How does the script works:

  • The main method just calls the function extract. (filter is a reserved name for python, so I haven't used it here)
  • The function opens the input file, and creates the output file with the same projection.
  • Then, at line 42, the first layer of the input file is replicated at the output file. This should be changed if more than one layer exists.
  •  At line 50, for each feature in the input layer, the filter function filter_func is applied, and if the result is true, the feature is written in the ouput file.
  • The filter function can be changed depending of the needs (comparing numbers, strings, or whatever). In this case, the white spaces are stripped (if not, some leading spaces that come with the file make the comparison fail), and then, the strings are compared. Quite simple.
Maybe there is better tool (a command-line tool, not a desktop GIS), but I didn't find it. If there is one, please tell me! If there is no one, may be could be a good idea to create a serious one, like gdal_calc.py for the raster calculations.