Wednesday, August 28, 2013

Flag map with D3js

 In this example I'll show how to draw a map filling the polygons with images. In this case, Western Africa contries filled with their flags:

The zone represented is this one:

As usual, you can get all the code at GitHub:


Canvas

The code basically uses Mike Bostock's queue.js to download all the flag images from the wikipedia, the world borders and the contries names. Once everyting is donwloaded, a function is triggered, that draws the flags, clipping them by the country shape. The first version is written using Canvas instead of SVG, just to see how the clipping and loading of external images is done.

The code:



  • I make an array with all the image locations, indexing by name to make it easier to understand, even though the map file doesn't contain names but indexes.
  • Lines 37 to 47 configure the map projection and the Canvas element.
  • Lines 54 to 77 load all the elements. Since loading an image is not a function in D3js, a custom function has to be created, returning the error or the success value. If images are not preloaded, the code will try to paste them anyway, so the country flag won't be drawn
  • Function ready is triggered when everything is loaded, and it actually draws the map.
    • The error is handled at the first lines (very poorly, I know, a nodata image could be loaded, or something like that)
    • Until line 97, the base map is loaded (the land gets hidden later by the flags).
    • Lines 100 to 109 get the countries from the list, looks for them in the countries file by using a filter function and creates the images we are going to use later.
    • At line 111, all the flags are pasted on the map.
      • context.save();  Saves what has been drawn until the moment, so new things can happen
      • Then, the country border is drawn in a path, without using any color or fill, because then, context.clip(); is called to use this path as the clip path.
      • The image is pasted, using the drawImage method. The bounds are recalculated before for the countries that appera cut at the map.
      • context.restore(); frees the clipping.
    • Finally, at line 130, the borders are drawn to make the map prettier.

SVG

The SVG version is simplier, because no image preloading is needed, since the svg contains only the image source, and the browser will draw it when the image loads.
 


  • The code is almost the same, just that it doesn't preload the images with queue.js
  • Obviously, the canvas tag is replaced by SVG, and to draw, the usual SVG mode is used.
  • To clip, a clipPath is defined with the country shape and an id, so it can be assigned to an image. 
  • The image is added using an image tag
    • xlink:href sets the image link
    • clip-path sets the path used to clip the image
    • preserveAspectRatio has to be set as none, so the image ratio can be changed to make the flag fit the bounding box.

Notes:

To preload images using queue.js I asked this question at StackOverflow, kindly
answered by explunit.

Wednesday, August 14, 2013

Representing population density with D3js


Choropleth maps represent variables colouring (or filling with a pattern) areas in the map.
We were discussing a map with the local OGGeo group in Barcelona, and we all noticed that it had a problem.
The map was about the percentage of people in Catalonia whose home language was Catalan.It showed most of the areas coloured as mostly Catalan-speakers zone, but since the greatest share of the population in Catalonia is around the city of Barcelona, where Spanish is more spoken as primary language, the map could give a wrong impression to people that didn't know the population distribution in Catalonia.

Cartograms are thought to solve this problem, but the author of the map didn't like them, because if the population is very unevenly distributed, the result is really difficult to understand
Map taken from the Wikipedia's cartogram entry

So I tried a pair of things to see if the result was better. The examples code are in a GIST at GitHUB, so they can be visualized at http://bl.ocks.org:

Resizing polygons according to the population density

Every comarca (the regions Catalonia is divided in) is coloured according to the percentage of people able to speak Catalan (yes, I couldn't find the number of people who has Catalan as their first language). When the button is pressed, the coloured polygons shrink according to the population density of the region:
The only comarca that stays with the same size is Barcelonès, since is the one with the highest population density. You can see that the dark-green zones are low populated, so they shrink a lot.
It's very difficult to me to see the color when the polygon gets too small, even though I didn't shrink them linearly, but smoothing it with a square root, so I think that it's not the best solution.
Since the code is very similar to the one at Mike Bostok's example I won't comment it.

Filling with points according to the population density

This example at Kartograph web page gave me the idea for the second example, although I changed it a little using colors.
 In this case, every polygon is filled with circles. The color of the circle represents the magnitude we are showing (% of people able to speak Catalan, in this case), and the size represents the population density.
I think that the result is clearer in this case. The separation between circles and the maximum circle radius can be adjusted to get better results.
I will comment the code, since I had to write it from the scratch.

Basically, for each polygon, the bounding box is calculated, and the region filled with dots. Then, the polygon is used to clip the rectangle of circles.

Point density according to the population density







  •  The first lines set the pointSeparation and maxPointSize and set the color scale.
  • Lines 20 to 31 set the projection and the SVG element.
  • Lines 32 to 47 get the data about the number of speakers and population, and set the maximum and minimum values to be able to adjust colors and circle sizes.
  • Line 49 sets the color scale
  • Lines 53 to 60 sets the masks for each polygon by appending a clipPath
  • Then, from 63 to 69, the points are drawn by calling the function draw_circles for each polygon. To call the function, the each method is used. The clip-path attribute links the masks set before.
  • Lines 74 to 93 contain the function draw circles.
    • The bounding box and centroid are nneded. The first one to get the width and height, and the second to move it at he region center.
    • Once the size is set, the grid is drawn with the two loops
  • lines 95 to 100 draw the borders
  • lines 102 to 120 draw the legend.

Notes

I created a GIST to store all the data for both examples, since It's easier to re-use it.

Monday, August 5, 2013

Creating vectorial isobands with Python

GDAL has an utility program, gdal_contour, that generates contour lines from a raster file (take a look at his example post). GDAL provides access to this method through its API, so contour lines can be generated inside your GDAL python scripts.

Unfortunately, there is no way to generate isobands, that is, filled contour lines that use polygons instead of polylines. Creating isobands allows the creation of nice contour maps easily, colouring the zones between two values:
 
 
Using contour lines, the result is this one:



Both visualizations are drawn directly from QGIS.

So I created two scripts to get the result. As usual, you can download them from my GitHub account.
I created two scripts, since the one I like the most uses matplotlib, so sometimes it's not convenient to install. On the other hand, the other has less requirements (GDAL python), but the code is dirtier.

Both have the same usage, which is very similar to the gdal_contour usage:

    Usage: isobands_matplotlib.py [-h] [-b band] [-off offset] [-i interval]
    [-nln layer_name] [-a attr_name] [-f formatname]
    src_file out_file
    Calculates the isobands from a raster into a vector file

    positional arguments:
      src_file         The raster source file
      out_file         The vectorial out file
    optional arguments:
      -h, --help       show this help message and exit
      -b band          The band in the source file to process (default 1)
      -off offset      The offset to start the isobands (default 0)
      -i interval      The interval (default 0)
      -nln layer_name  The out layer name (default bands)
      -a attr_name     The out layer attribute name (default h)
      -f formatname    The output file format name (default ESRI Shapefile)
Just choose the input and output files, the input file band to process (1 by default), and the interval, and you will get the  vector file in the GIS format of your choice.

Now, the code of both script, step by step.

isobands_matplotlib.py

As it's name suggests,  this version uses matplotlib, and more specifically the contourf function, to create the isobands.

The script has a function called isobands, that actually generates the data, and a main part that reads the user input values:
if __name__ == "__main__":

    PARSER = ArgumentParser(
        description="Calculates the isobands from a raster into a vector file")
    PARSER.add_argument("src_file", help="The raster source file")
    PARSER.add_argument("out_file", help="The vectorial out file")
    PARSER.add_argument("-b", 
        help="The band in the source file to process (default 1)", 
        type=int, default = 1, metavar = 'band')
    PARSER.add_argument("-off", 
        help="The offset to start the isobands (default 0)", 
        type=float, default = 0.0, metavar = 'offset')
    PARSER.add_argument("-i", 
        help="The interval  (default 0)", 
        type=float, default = 0.0, metavar = 'interval')
    PARSER.add_argument("-nln", 
        help="The out layer name  (default bands)", 
        default = 'bands', metavar = 'layer_name')
    PARSER.add_argument("-a", 
        help="The out layer attribute name  (default h)", 
        default = 'h', metavar = 'attr_name')
    PARSER.add_argument("-f", 
        help="The output file format name  (default ESRI Shapefile)", 
        default = 'ESRI Shapefile', metavar = 'formatname')
    ARGS = PARSER.parse_args()

    isobands(ARGS.src_file, ARGS.b, ARGS.out_file, ARGS.f, ARGS.nln, ARGS.a, 
        ARGS.off, ARGS.i)
I used the argparse.ArgumentParser to read the script arguments, since is easy to use and it should come woth most of the updated python distributions (since python 2.7, I think).
This part of the code is quite easy to understand, it simply sets the arguments with their default value, sets if they are mandatory or not, etc.

The isobands function creates the vector file:
def isobands(in_file, band, out_file, out_format, layer_name, attr_name, 
    offset, interval, min_level = None):
    '''
    The method that calculates the isobands
    '''
    ds_in = gdal.Open(in_file)
    band_in = ds_in.GetRasterBand(band)
    xsize_in = band_in.XSize
    ysize_in = band_in.YSize

    geotransform_in = ds_in.GetGeoTransform()

    srs = osr.SpatialReference()
    srs.ImportFromWkt( ds_in.GetProjectionRef() )  

    #Creating the output vectorial file
    drv = ogr.GetDriverByName(out_format)

    if exists(out_file):
        remove(out_file)
    dst_ds = drv.CreateDataSource( out_file )
       
    dst_layer = dst_ds.CreateLayer(layer_name, geom_type = ogr.wkbPolygon, 
        srs = srs)

    fdef = ogr.FieldDefn( attr_name, ogr.OFTReal )
    dst_layer.CreateField( fdef )



    x_pos = arange(geotransform_in[0], 
        geotransform_in[0] + xsize_in*geotransform_in[1], geotransform_in[1])
    y_pos = arange(geotransform_in[3], 
        geotransform_in[3] + ysize_in*geotransform_in[5], geotransform_in[5])
    x_grid, y_grid = meshgrid(x_pos, y_pos)

    raster_values = band_in.ReadAsArray(0, 0, xsize_in, ysize_in)

    stats = band_in.GetStatistics(True, True)
    if min_level == None:
        min_value = stats[0]
        min_level = offset + interval * floor((min_value - offset)/interval)
   
    max_value = stats[1]
    #Due to range issues, a level is added
    max_level = offset + interval * (1 + ceil((max_value - offset)/interval)) 

    levels = arange(min_level, max_level, interval)

    contours = plt.contourf(x_grid, y_grid, raster_values, levels)

    

    for level in range(len(contours.collections)):
        paths = contours.collections[level].get_paths()
        for path in paths:

            feat_out = ogr.Feature( dst_layer.GetLayerDefn())
            feat_out.SetField( attr_name, contours.levels[level] )
            pol = ogr.Geometry(ogr.wkbPolygon)


            ring = None            
            
            for i in range(len(path.vertices)):
                point = path.vertices[i]
                if path.codes[i] == 1:
                    if ring != None:
                        pol.AddGeometry(ring)
                    ring = ogr.Geometry(ogr.wkbLinearRing)
                    
                ring.AddPoint_2D(point[0], point[1])
            

            pol.AddGeometry(ring)
            
            feat_out.SetGeometry(pol)
            if dst_layer.CreateFeature(feat_out) != 0:
                print "Failed to create feature in shapefile.\n"
                exit( 1 )

            
            feat_out.Destroy()
  • Lines 26 to 49 open the input file and create the output vector file.
  • Lines 53 to 57 set two matrices, containing the coordinates for each pixel in the input file. To do it, two vectors are created first with the coordinates calculated from the input geotransform, and then the meshgrid is called to generate the 2d matrices, needed by matplotlib to make its calculations.
  • Lines 59 to 70 read the input file data and calculates the levels inside it, from the interval and the offset. Since calculating isobands for levels not present in the file isn't efficient, the minimum and maximum values in the input file are calculated before the isobands.
  • The line 72 is where the isobands are actually calculated using the contourf function.
  •  Lines 76 to 105 create the polygons at the output vector file.
    • The generated contour has a property called collections, containing every isoband.
    • Each element in the collections has the paths of the isolines.
    • The property codes in the path indicates if a new ring has to be generated.
    • The other lines are to generate an OGR feature 

isobands_gdal.py

This script is quite dirtier, since I have used a trick to get the isobands from the ContourGenerate function.
Closing the contour lines, which is what has to be done to transform isolines to isobands, is not an easy task.
Looking at some D3js examples I got the idea from Jason Davies conrec.js. I can't find the original example now, but basicaly, the idea is to add some pixels around the image with a value lower than the lowest value, so all the isolines will be closed. The solution gets closer, but some problems must be solved:
  • Convert the isolines to polygons
  • Clip the vectors to the original extent
  • Substract the overlapping polygons, since when the lines are converted to polygons, all the area above the isoline value is contained in the polygon even though other lines with higher values are inside
When not clipped, the isolines will create strange effects at the borders

The main part code is exacly the same as in the other script, and the isobands function is:

def isobands(in_file, band, out_file, out_format, layer_name, attr_name, 
    offset, interval, min_level = None):
    '''
    The method that calculates the isobands
    '''    
    #Loading the raster file
    ds_in = gdal.Open(in_file)
    band_in = ds_in.GetRasterBand(band)
    xsize_in = band_in.XSize
    ysize_in = band_in.YSize

    stats = band_in.GetStatistics(True, True)
    
    if min_level == None:
        min_value = stats[0]
        min_level = ( offset + interval * 
            (floor((min_value - offset)/interval) - 1) )
    nodata_value = min_level - interval    



    geotransform_in = ds_in.GetGeoTransform()
    
    srs = osr.SpatialReference()
    srs.ImportFromWkt( ds_in.GetProjectionRef() )  

    data_in = band_in.ReadAsArray(0, 0, xsize_in, ysize_in)


    #The contour memory
    contour_ds = ogr.GetDriverByName('Memory').CreateDataSource('')
    contour_lyr = contour_ds.CreateLayer('contour', 
        geom_type = ogr.wkbLineString25D, srs = srs )
    field_defn = ogr.FieldDefn('ID', ogr.OFTInteger)
    contour_lyr.CreateField(field_defn)
    field_defn = ogr.FieldDefn('elev', ogr.OFTReal)
    contour_lyr.CreateField(field_defn)

    #The in memory raster band, with new borders to close all the polygons
    driver = gdal.GetDriverByName( 'MEM' )
    xsize_out = xsize_in + 2
    ysize_out = ysize_in + 2

    column = numpy.ones((ysize_in, 1)) * nodata_value
    line = numpy.ones((1, xsize_out)) * nodata_value

    data_out = numpy.concatenate((column, data_in, column), axis=1)
    data_out = numpy.concatenate((line, data_out, line), axis=0)

    ds_mem = driver.Create( '', xsize_out, ysize_out, 1, band_in.DataType)
    ds_mem.GetRasterBand(1).WriteArray(data_out, 0, 0)
    ds_mem.SetProjection(ds_in.GetProjection())
    #We have added the buffer!
    ds_mem.SetGeoTransform((geotransform_in[0]-geotransform_in[1],
        geotransform_in[1], 0, geotransform_in[3]-geotransform_in[5], 
        0, geotransform_in[5]))
    gdal.ContourGenerate(ds_mem.GetRasterBand(1), interval, 
        offset, [], 0, 0, contour_lyr, 0, 1)

    #Creating the output vectorial file
    drv = ogr.GetDriverByName(out_format)
    if exists(out_file):
        remove(out_file)
    dst_ds = drv.CreateDataSource( out_file )

      
    dst_layer = dst_ds.CreateLayer(layer_name, 
        geom_type = ogr.wkbPolygon, srs = srs)

    fdef = ogr.FieldDefn( attr_name, ogr.OFTReal )
    dst_layer.CreateField( fdef )


    contour_lyr.ResetReading()

    geometry_list = {}
    for feat_in in contour_lyr:
        value = feat_in.GetFieldAsDouble(1)

        geom_in = feat_in.GetGeometryRef()
        points = geom_in.GetPoints()

        if ( (value >= min_level and points[0][0] == points[-1][0]) and 
            (points[0][1] == points[-1][1]) ):
            if (value in geometry_list) is False:
                geometry_list[value] = []

            pol = ogr.Geometry(ogr.wkbPolygon)
            ring = ogr.Geometry(ogr.wkbLinearRing)

            for point in points:

                p_y = point[1]
                p_x = point[0]
                          
                if p_x < (geotransform_in[0] + 0.5*geotransform_in[1]):
                    p_x = geotransform_in[0] + 0.5*geotransform_in[1]
                elif p_x > ( (geotransform_in[0] + 
                    (xsize_in - 0.5)*geotransform_in[1]) ):
                    p_x = ( geotransform_in[0] + 
                        (xsize_in - 0.5)*geotransform_in[1] )
                if p_y > (geotransform_in[3] + 0.5*geotransform_in[5]):
                    p_y = geotransform_in[3] + 0.5*geotransform_in[5]
                elif p_y < ( (geotransform_in[3] + 
                    (ysize_in - 0.5)*geotransform_in[5]) ):
                    p_y = ( geotransform_in[3] + 
                        (ysize_in - 0.5)*geotransform_in[5] )

                ring.AddPoint_2D(p_x, p_y)
                
  
            pol.AddGeometry(ring)
            geometry_list[value].append(pol)



    values = sorted(geometry_list.keys())

    geometry_list2 = {}

    for i in range(len(values)):
        geometry_list2[values[i]] = []
        interior_rings = []
        for j in range(len(geometry_list[values[i]])):
            if (j in interior_rings) == False:
                geom = geometry_list[values[i]][j]
                
                for k in range(len(geometry_list[values[i]])):
                    
                    if ((k in interior_rings) == False and 
                        (j in interior_rings) == False):
                        geom2 = geometry_list[values[i]][k]
                        
                        if j != k and geom2 != None and geom != None:
                            if geom2.Within(geom) == True:
                                
                                geom3 = geom.Difference(geom2)
                                interior_rings.append(k)
                                geometry_list[values[i]][j] = geom3
                                                            
                            elif geom.Within(geom2) == True:
                                
                                geom3 = geom2.Difference(geom)
                                interior_rings.append(j)
                                geometry_list[values[i]][k] = geom3
                    
        for j in range(len(geometry_list[values[i]])):
            if ( (j in interior_rings) == False and 
                geometry_list[values[i]][j] != None ):
                geometry_list2[values[i]].append(geometry_list[values[i]][j])
    

    for i in range(len(values)):
        value = values[i]
        if value >= min_level:
            for geom in geometry_list2[values[i]]:
                
                if i < len(values)-1:

                    for geom2 in geometry_list2[values[i+1]]:
                        if geom.Intersects(geom2) is True:
                            geom = geom.Difference(geom2)
                
                feat_out = ogr.Feature( dst_layer.GetLayerDefn())
                feat_out.SetField( attr_name, value )
                feat_out.SetGeometry(geom)
                if dst_layer.CreateFeature(feat_out) != 0:
                    print "Failed to create feature in shapefile.\n"
                    exit( 1 )
                feat_out.Destroy()
  • Lines 23 to 44 read the input file, including the data, geotransform, etc.
  • Lines 47 to 54 create a vector file where the contour lines will be written. Since the data must be clipped later, the file is a temporary memory file.
  • Lines 56 to 71 create a memory raster  file with the input data, plus a buffer around it with a value lower than the lowest in the input data, so the contour lines will get closed.
  • Line 74 calls the ContourGenerate function, that will add the contour lines to the vectorial memory file.
  • Lines 77 to 91 create the actual output file
  • Lines 93 to 130 create a collection of geometries with the polygons.
    • The lines are transformed to polygons by creating a ring from the polyline and adding it to a polygon.
    • Lines 113 to 123 clip the polygons to the original geotransform, by moving the coordinates of the points outside the geotransform to the border.
  • Lines 138 to 167 stores the difference from the polygons that overlap, so the output polygons contain only the zones where the values are in the interval.
  • Lines 170 to 187 write the final polygons to the output file, the same way than the other script.

Notes:

An other option would be, of course, creating the polygons myself. I tried to do it, but as you can see at the marching squares wikipedia page, the algorithm is not easy when using polygons, and especially the matplotlib version of the scripts is quite clean.

The scripts have a problem when there is a nodata zone, since I haven't added the code to handle it.

Finally, as asked in the GDAL-dev mailing list, the best option would be modifying the ContourGenerate function to create isobands, but I think that is out of my scope.
The discussion at the GDAL-dev mailing list where I posted the first version. 
 
The data used for the examples is the same I used at the raster classification post.
 
Some links I have used:
http://matplotlib.org/api/path_api.html
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.contourf
http://matplotlib.org/basemap/api/basemap_api.html#mpl_toolkits.basemap.Basemap.contourf
http://fossies.org/dox/matplotlib-1.2.0/classmatplotlib_1_1contour_1_1QuadContourSet.html