Actualization
Take a look to the -where option in ogr2ogr. It does more or less the same, although coding it gives much more flexibility.
Original post
Playing with some files I found that I didn't know how to get only the features in a vectorial file fitting some condition.
So I made a little script.
You can find it at
GitHub for download, or take a look at the code:
'''
Writes an ogr file from an input file, filtering the content
Roger Veciana, 23/mar/2013
'''
from osgeo import ogr
from os.path import exists
from os.path import basename
from os.path import splitext
from os import remove
import sys
def extract(in_file, out_file, filter_field, filter_values):
'''
Opens the input file, copies it into the oputput file, checking
the filter.
'''
print "Extracting data"
in_ds = ogr.Open( in_file )
if in_ds is None:
print "Open failed.\n"
sys.exit( 1 )
in_lyr = in_ds.GetLayerByName( splitext(basename(in_file))[0] )
if in_lyr is None:
print "Error opening layer"
sys.exit( 1 )
##Creating the output file, with its projection
if exists(out_file):
remove(out_file)
driver_name = "ESRI Shapefile"
drv = ogr.GetDriverByName( driver_name )
if drv is None:
print "%s driver not available.\n" % driver_name
sys.exit( 1 )
out_ds = drv.CreateDataSource( out_file )
if out_ds is None:
print "Creation of output file failed.\n"
sys.exit( 1 )
proj = in_lyr.GetSpatialRef()
##Creating the layer with its fields
out_lyr = out_ds.CreateLayer(
splitext(basename(out_file))[0], proj, ogr.wkbMultiPolygon )
lyr_def = in_lyr.GetLayerDefn ()
for i in range(lyr_def.GetFieldCount()):
out_lyr.CreateField ( lyr_def.GetFieldDefn(i) )
##Writing all the features
in_lyr.ResetReading()
for feat in in_lyr:
value = feat.GetFieldAsString(feat.GetFieldIndex(filter_field))
if filter_func(value, filter_values):
out_lyr.CreateFeature(feat)
in_ds = None
out_ds = None
def filter_func(value, filter_values):
'''
The custom filter function. In this case, we check that the value is in the
value list, stripping the white spaces. In the case of numeric values, a
comparaison could be done
'''
if value.strip() in filter_values:
return True
else:
return False
if __name__ == '__main__':
extract('./recintos_etrs89.shp', './municipis.shp',
'provincia', ('8', '25', '17', '43'))
The data I have used is from the
CNIG, downloading the file
LINEAS LIMITES MUNICIPALES. The file contains all the Spanish municipalities. The script will write the output file using the input data that matches as condition that the field
provincia is one of the values 8, 25, 17 or 43, corresponding to the Catalan
provinces.
How does the script works:
- The main method just calls the function extract. (filter is a reserved name for python, so I haven't used it here)
- The function opens the input file, and creates the output file with the same projection.
- Then, at line 42, the first layer of the input file is replicated at the output file. This should be changed if more than one layer exists.
- At line 50, for each feature in the input layer, the filter function filter_func is applied, and if the result is true, the feature is written in the ouput file.
- The filter function can be changed depending of the needs (comparing numbers, strings, or whatever). In this case, the white spaces are stripped (if not, some leading spaces that come with the file make the comparison fail), and then, the strings are compared. Quite simple.
Maybe there is better tool (a command-line tool, not a desktop GIS), but I didn't find it. If there is one, please tell me! If there is no one, may be could be a good idea to create a serious one, like
gdal_calc.py for the raster calculations.