Showing posts with label interpolation. Show all posts
Showing posts with label interpolation. Show all posts

Saturday, June 29, 2013

GDAL Performance II: Create a PNG from a raster file

After improving the raster classification script, in this entry, is the turn for the PNG creation script.

As usual, all the code is at GitHub

Problems in the original script

  • The PIL python module was used to draw the PNG image pixel by pixel. This can be changed using the SciPy function scipy.misc.imsave. Using it, all the iterations, comparisons, etc. are done at a lower level, so the script is cleaner and much faster.
  • GDAL blocks weren't used, so all the image was loaded into a huge array if the original raster was big. Using blocks as in the last entry example, performance improves a lot.
  • All the values in the color file were evaluated. If the minimum and maximum values of the raster are known, many of these comparisons can be avoided, so the performance there is also improved.

The new script

The new script works exactly the same way as the old one at the user level.
'''
Script to generate a png from a raster file.
It can be used from the command line or from an other python script
'''
from os.path import exists
from osgeo import gdal
from osgeo.gdalconst import GA_ReadOnly
from sys import argv
from sys import exit
from numpy import logical_and
from numpy import zeros
from numpy import uint8
import scipy

'''
Main function.
Requires the input file, the color file and the otuput file name. 
The raster band is 1 by default. 
If nothing is asked, the discrete colroscale image is created.
'''
def raster2png(raster_file, color_file, out_file_name, raster_band=1, discrete=True):

    if exists(raster_file) is False:
            raise Exception('[Errno 2] No such file or directory: \'' + raster_file + '\'')    
    
    dataset = gdal.Open(raster_file, GA_ReadOnly )
    if dataset == None:
        raise Exception("Unable to read the data file")
    
    band = dataset.GetRasterBand(raster_band)

    block_sizes = band.GetBlockSize()
    x_block_size = block_sizes[0]
    y_block_size = block_sizes[1]

    xsize = band.XSize
    ysize = band.YSize

    max_value = band.GetMaximum()
    min_value = band.GetMinimum()

    if max_value == None or min_value == None:
        stats = band.GetStatistics(0, 1)
        max_value = stats[1]
        min_value = stats[0]

    #Reading the color table
    color_table = readColorTable(color_file)
    #Adding an extra value to avoid problems with the last & first entry
    if sorted(color_table.keys())[0] > min_value:
        color_table[min_value - 1] = color_table[sorted(color_table.keys())[0]]

    if sorted(color_table.keys())[-1] < max_value:
        color_table[max_value + 1] = color_table[sorted(color_table.keys())[-1]]
    #Preparing the color table and the output file
    classification_values = color_table.keys()
    classification_values.sort()
    

    rgb = zeros((ysize, xsize, 4), dtype = uint8)

    for i in range(0, ysize, y_block_size):
        if i + y_block_size < ysize:
            rows = y_block_size
        else:
            rows = ysize - i
        
        for j in range(0, xsize, x_block_size):
            if j + x_block_size < xsize:
                cols = x_block_size
            else:
                cols = xsize - j
                
            
            values = band.ReadAsArray(j, i, cols, rows)
            r = zeros((rows, cols), dtype = uint8)
            g = zeros((rows, cols), dtype = uint8)
            b = zeros((rows, cols), dtype = uint8)
            a = zeros((rows, cols), dtype = uint8)

            for k in range(len(classification_values) - 1):
                #print classification_values[k]
                if classification_values[k] < max_value and (classification_values[k + 1] > min_value ):
                    mask = logical_and(values >= classification_values[k], values < classification_values[k + 1])
                    if discrete == True:
                        r = r + color_table[classification_values[k]][0] * mask
                        g = g + color_table[classification_values[k]][1] * mask
                        b = b + color_table[classification_values[k]][2] * mask
                        a = a + color_table[classification_values[k]][3] * mask
                    else:
                        v0 = float(classification_values[k])
                        v1 = float(classification_values[k + 1])

                        r = r + mask * (color_table[classification_values[k]][0] + (values - v0)*(color_table[classification_values[k + 1]][0] - color_table[classification_values[k]][0])/(v1-v0) )
                        g = g + mask * (color_table[classification_values[k]][1] + (values - v0)*(color_table[classification_values[k + 1]][1] - color_table[classification_values[k]][1])/(v1-v0) )
                        b = b + mask * (color_table[classification_values[k]][2] + (values - v0)*(color_table[classification_values[k + 1]][2] - color_table[classification_values[k]][2])/(v1-v0) )
                        a = a + mask * (color_table[classification_values[k]][3] + (values - v0)*(color_table[classification_values[k + 1]][3] - color_table[classification_values[k]][3])/(v1-v0) )
                     
            rgb[i:i+rows,j:j+cols, 0] = r 
            rgb[i:i+rows,j:j+cols, 1] = g 
            rgb[i:i+rows,j:j+cols, 2] = b
            rgb[i:i+rows,j:j+cols, 3] = a   

    scipy.misc.imsave(out_file_name, rgb)
    

def readColorTable(color_file):
    '''
    The method for reading the color file.
    * If alpha is not defined, a 255 value is set (no transparency).
    '''    

    color_table = {}
    if exists(color_file) is False:
        raise Exception("Color file " + color_file + " does not exist")
    
    fp = open(color_file, "r")
    for line in fp:
        if line.find('#') == -1 and line.find('/') == -1:
            entry = line.split()
            if len(entry) == 5:
                alpha = int(entry[4])
            else:
                alpha=255
            color_table[eval(entry[0])]=[int(entry[1]),int(entry[2]),int(entry[3]),alpha]
    fp.close()
    
    return color_table

'''
Usage explanation
'''
def Usage():
    print "Not enough arguments." 
    print "Usage:"
    print argv[0] + ' [-exact_color_entry] [-band=1] data_file color_file output_file'    
    exit()

'''
Program Mainline
'''
if __name__ == "__main__":
    
    file_name = None
    colorfile_name = None
    out_file_name = None
    discrete = False
    band = 1

    i = 1
    while i < len(argv):
        arg = argv[i]
        if arg == '-exact_color_entry':
            discrete = True
        elif arg == '-band':
            band = argv[i+1]
            i = i + 1
        elif file_name is None:
            file_name = arg
            file_name = file_name.replace("'","")
            file_name = file_name.replace('"','')
        elif colorfile_name is None:
            colorfile_name = arg
            colorfile_name = colorfile_name.replace("'","")
            colorfile_name = colorfile_name.replace('"','')
        elif out_file_name is None:
            out_file_name = arg
            out_file_name = out_file_name.replace("'","")
            out_file_name = out_file_name.replace('"','')
        i = i + 1   



    if len(argv) == 1 or file_name == None or colorfile_name == None or out_file_name == None: 
       Usage()     
    '''
    try:
        raster2png(file_name,colorfile_name,out_file_name,band,discrete)
    except Exception, ex:
        print "Error: " + str(ex)
    '''
    raster2png(file_name,colorfile_name,out_file_name,band,discrete)
  • The part of the code that reads the input (lines 130 - 182) hasn't changed, like the function readColorTable, that reads the color file so it can be processed. Only the main method raster2png has changed.
  • Lines 26 to 45 read the raster file metadata, including the extreme values in the file.
  • Then, the colorfile is read, and extra values are added if the raster file has values above the maximum value at the color file or below the minimum (lines 47-57). If this isn't done, some strange nodata zones will appear.
  • At line 60, an array is created, where the four bands (red, green, blue and alpha) of the output png will be stored.
  • Then, every block of data is evaluated (after line 62). Empty arrays are created for each band, and for every value in the colorfile, the affected pixels are chosen at line 84.
  • Depending on the user choice, the colour is assigned either directly from the colorfile value or interpolating from the two most close colors. (lines 85-97) Line 91 and 92 have a float cast, very important to get the right value.
  • Lines 99-103 assign the block to the right output array region (see how a region of a NumPy array is chosen)
  • Line 104 creates the png file.

Benchmarks

To check the results, I have used the same raster used at the original post, and for bigger stuff, the rasters I downloaded for the last post (read it for the big file creation).

Old script

Filewith -exact_color_entryContinuous colors
w001001.adf (1126x576px)9.363s13.329s
srtm_54_05.tif (6001x6001px)9m19s12m44s
big.tif (12001x12001px)ErrorError

New script

Filewith -exact_color_entryContinuous colors
w001001.adf (1126x576px)0.822s1.653s
srtm_54_05.tif (6001x6001px)22.503s1m07s
big.tif (12001x12001px)1m46s4m58s

The improvement is really big and, sometimes, the old version will crash because not enough memory will be available.

What's next

I'm not satisfied yet with the way the image is saved using SciPy, because all the image must be in a single array, instead of working by blocks. Maybe there's an alternative to do this more efficiently. With really big images, the script will fail even using the new script version.

Friday, July 6, 2012

Testing inverse of the distance interpolation with AEMet pressure data

I like testing the things I learn with actual data. Some time ago, I wrote an entry about how to use the inverse of the distance to interpolate data from scattered points. Now, I will show how to use it to draw a sea level pressure map from the data given by the Spanish Meteorology Agency. The Agency has a nice Open Data section that it's worth a visit if you are interested in meteorology as I am.

As usual, I have uploaded all the files you can need to check the entry.
Getting the data
The station data from the AEMet is available in their ftp server. 
In the example, an FTP connection to the AEMet server has to be done, and only the sea level pressure is parsed. If the actual pressure is used, the interpolation will be wrong, since the altitude won't be considered. AEMet gives the corrected pressure directly.
The file format is quite easy to understand and parse. It's a compressed CSV file, with the variable names indicated in each value. Not all the stations have pressure data, and the file's directory changes it's name with the date, so the script has to do some checks:


from ftplib import FTP
import gzip

'''
Gets the last sea level pressure values from the AEMet service.
'''
def get_data():
 ftp = FTP('ftpdatos.aemet.es')   # connect to host, default port
 ftp.login()               # user anonymous, passwd anonymous@
 ftp.cwd('datos_observacion/observaciones_diezminutales') #Getting in the 10 minute datadirectory

 files = ftp.nlst()  # list directory contents
 #Getting the newest day directory. It must include the word "diezminutales". The name includes the date, so ordering it by name, the last one is also the newest one.
 directories = []
 for file_name in files:
     if file_name.endswith('diezminutales'):
  directories.append(file_name)

 directories.sort()

 ftp.cwd(directories.pop())
        #Getting the last file, ordered by name.
 files = ftp.nlst()
 data_files = []
 for file_name in files:
     if file_name.endswith('gz'):
  data_files.append(file_name)

 data_files.sort()

 data_file = data_files.pop()
 ftp.retrbinary("RETR " + data_file, open(data_file, 'wb').write)
 
 #The files are compressed with gzip. Getting the contents into a variable.
 f = gzip.open(data_file, 'rb')

 data = {'lon':[],'lat':[],'press':[]}
        #Parsing the file line by line
 for line in f:
     station_values = line.split(',')
     #Some valeus are always in the same position
     lat = float(station_values[2])
     lon = float(station_values[3])
     #Other values must be found using their label name, and not all the stations give pressure data
     press = None
     for var in station_values:
  if ('PRES_nmar=' in var) is True and ('QPRES_nmar=' in var) is False:
      press = float(var.replace('PRES_nmar=',''))
     if press != None:
  data['lon'].append(lon)
  data['lat'].append(lat)
  data['press'].append(press)
 f.close()

 return data 
  • The script defines a function for using it from other files, as we will in the next section.
  • We will use the latest data, so is necessary to order the directory and file names to guess which file to download. To understand it better, just connect to the ftp and check how is organized. 
  • The data is compressed, so I use gzip to open it.
  • Finally, only one of the values is used, so it's chosen. Other maps could be done from the same data set, although interpolating some variables, like temperature,  this way doesn't make sense.

Viewing the data 

To represent the data in the example, we will create a GeoTiff layer for the interpolated raster, and a Shapefile fot the point data. Then, using QGIS, we will draw the map. The coasline has been downloaded from the NOAA coastline extractor.
from get_data import get_data
from invdistgis import invDist
from osgeo import osr
from osgeo import ogr


def writeShape(data,proj,outFile):
    driverName = "ESRI Shapefile"
    drv = ogr.GetDriverByName( driverName )
    dsOut = drv.CreateDataSource( outFile )  
    outLayer = dsOut.CreateLayer('PRESS', None, ogr.wkbPoint)  
    field = ogr.FieldDefn( "PRESS", ogr.OFTReal )
    outLayer.CreateField ( field )
    for i in range(0,len(data['press'])):
        #print str(i) + " - " + str(data['lon'][i])+" "+str(data['lat'][i])
        geometry = ogr.CreateGeometryFromWkt("POINT("+str(data['lon'][i])+" "+str(data['lat'][i])+")")
        feature = ogr.Feature( outLayer.GetLayerDefn())
        feature.SetField( 'PRESS', data['press'][i] )

        feature.SetGeometry(geometry)
        outLayer.CreateFeature(feature)        

    dsOut = None


    
def createMap():
    #Getting the data from the function
    data = get_data()
    driverName="GTiff"
    outFile="press.tiff"
    power=2
    smoothing=5
    xSize=200
    ySize=200
    proj = osr.SpatialReference()
    proj.SetFromUserInput(str("EPSG:4326"))  
    geotransform=[-10.35,0.07185,0,44.28,0,-0.0458]
    writeShape(data,proj,"press.shp")
    invDist(data['lon'],data['lat'],data['press'],geotransform,proj,xSize,ySize,power,smoothing,driverName,outFile)
  
    
if __name__ == "__main__":
    createMap()
  • I have created a function to draw the shapefile, so it's easier to understand how to do it.
  • If you run the script and a shapefile already exists, the GDAL will throw an error. I think that is a kind of bug... or feature!
  • To get the data, the script is using the function described above.
  • The map creation uses the inverse of the distance function used in the entry that explained how to do it.
The final result looks like that:

Saturday, May 26, 2012

Creating a grid from scattered data using inverse of the distance with python (gdal_grid approach)

OK, I have to admit that I was so happy when I found the  scipy rbf function that I went too fast writing the entry about inverse of the distance.
I swear that the images are real (you can check it yourself), but when you change a few parameters or the point position, the result is really strange, definitely not the inverse of the distance algorithm I was thinking about.
Looking a little more around the gdal documentation, I found this explanation, which is very easy and clear:
http://www.gdal.org/grid_tutorial.html
The problem is that, at least in my python gdal version, is not possible to call the function directly, so I've been coding a little to get it.
All the example code and files can be downloaded in a zip file.

First version: the algorithm

Putting all the code directly makes things difficult to understand, so first,  I'll reproduce the same program I wanted to show in the entrance, using matplotlib to show the results:

from math import pow
from math import sqrt
import numpy as np
import matplotlib.pyplot as plt

def pointValue(x,y,power,smoothing,xv,yv,values):
    nominator=0
    denominator=0
    for i in range(0,len(values)):
        dist = sqrt((x-xv[i])*(x-xv[i])+(y-yv[i])*(y-yv[i])+smoothing*smoothing);
        #If the point is really close to one of the data points, return the data point value to avoid singularities
        if(dist<0.0000000001):
            return values[i]
        nominator=nominator+(values[i]/pow(dist,power))
        denominator=denominator+(1/pow(dist,power))
    #Return NODATA if the denominator is zero
    if denominator > 0:
        value = nominator/denominator
    else:
        value = -9999
    return value

def invDist(xv,yv,values,xsize=100,ysize=100,power=2,smoothing=0):
    valuesGrid = np.zeros((ysize,xsize))
    for x in range(0,xsize):
        for y in range(0,ysize):
            valuesGrid[y][x] = pointValue(x,y,power,smoothing,xv,yv,values)
    return valuesGrid
    

if __name__ == "__main__":
    power=1
    smoothing=20

    #Creating some data, with each coodinate and the values stored in separated lists
    xv = [10,60,40,70,10,50,20,70,30,60]
    yv = [10,20,30,30,40,50,60,70,80,90]
    values = [1,2,2,3,4,6,7,7,8,10]
    
    #Creating the output grid (100x100, in the example)
    ti = np.linspace(0, 100, 100)
    XI, YI = np.meshgrid(ti, ti)

    #Creating the interpolation function and populating the output matrix value
    ZI = invDist(xv,yv,values,100,100,power,smoothing)


    # Plotting the result
    n = plt.normalize(0.0, 100.0)
    plt.subplot(1, 1, 1)
    plt.pcolor(XI, YI, ZI)
    plt.scatter(xv, yv, 100, values)
    plt.title('Inv dist interpolation - power: ' + str(power) + ' smoothing: ' + str(smoothing))
    plt.xlim(0, 100)
    plt.ylim(0, 100)
    plt.colorbar()

    plt.show()
    
Let's explain it step by step:
  1. The main method defines some sample data, calls the interpolation function and draws it using matplotlib. Just copy&paste, since it's long to explain.
  2. invDist function: For each point in the coordinates matrix, a value needs to be calculated. First, a xSize*ySize zeroes values is calculated. Then for each point coordinate (the pixel position, in this simple case), a function that calculates the value is called.
  3. This is the important part: We apply the formula: $V=\frac{\sum\limits_{i=1}^n \frac{v_{i}}{d_{i}^{p}}}{\sum\limits_{i=1}^n \frac{1}{d_{i}^{p}}}$ for each pixel.
    • V is the interpolated value. 
    • n is the number of points
    • d is the distance from the point to the pixel
    • v is the value of the point
    • p is the power we want to apply to the distance o weight it.
    The distance is the Cartesian one, plus the smoothing factor (it gives more distance than the actual one, lowering the influence around the points). If the distance is close to the precision of the float numbers, we give the data value instead of the interpolated one, to avoid strange results.
And that's all. Here, the results (you can compare them with the ones in the other entry) which are consistent to the inverse of the distance method:
 If the smoothing is zero and the power is high, the interpolation changes a lot around the points to give them their exact value.
If the smoothing is high and the power is one, the result is much smoother, but the values at the points are not maintained.

Adding the GIS stuff:

The example above doesn't write to a GIS file. The function invDist must be changed so it creates the file using a gdal driver:
def invDist(xv,yv,values,geotransform,proj,xSize,ySize,power,smoothing,driverName,outFile):
    #Transform geographic coordinates to pixels
    for i in range(0,len(xv)):
         xv[i] = (xv[i]-geotransform[0])/geotransform[1]
    for i in range(0,len(yv)):
         yv[i] = (yv[i]-geotransform[3])/geotransform[5]
    #Creating the file
    driver = gdal.GetDriverByName( driverName )
    ds = driver.Create( outFile, xSize, ySize, 1, gdal.GDT_Float32)
    if proj is not None:
        ds.SetProjection(proj.ExportToWkt())
    ds.SetGeoTransform(geotransform)
    valuesGrid = np.zeros((ySize,xSize))
    #Getting the interpolated values
    for x in range(0,xSize):
        for y in range(0,ySize):
            valuesGrid[y][x] = pointValue(x,y,power,smoothing,xv,yv,values)
    
    ds.GetRasterBand(1).WriteArray(valuesGrid)
    ds = None
    return valuesGrid
  1. Note that the function will  need the geographic parameters geotransform and proj, as well as the output file name and driver.
  2. First, the point positions must be changed to pixel locations in order to calculate distances
  3. Then, the output file is created with the size and projection asked by the user
  4. The values are calculated in the same way as in the other example
  5. The file is written
The data in the first example was set into three lists, but the correct way to do it would be read it from a geometry file with the OGR library. Another function is defined to do that:

def readPoints(dataFile,Zfield='Z'):
    data = {}
    xv=[]
    yv=[]
    values=[]
    ds = ogr.Open(dataFile)
    if ds is None:
       raise Exception('Could not open ' + dataFile)
    
    layer = ds.GetLayer()
    proj = layer.GetSpatialRef()
    extent = layer.GetExtent()

    feature = layer.GetNextFeature()
    if feature.GetFieldIndex(zField) == -1:
         raise Exception('zField is not valid: ' + zField)

    while feature:
        geometry = feature.GetGeometryRef()
        xv.append(geometry.GetX())
        yv.append(geometry.GetY())
        values.append(feature.GetField(zField))

        feature = layer.GetNextFeature()
    data['extent'] = extent 
    data['xv']=xv
    data['yv']=yv
    data['values']=values
    data['proj'] = proj
    ds = None
    return data

  1. The function reads the point position and values from the given file and field name.
  2. A dictionary is returned, to help retrieving all the values we may need to create the output raster. If no projection or geotransform is passed by the user, the input file extents and projection are used.
  3. In the example files, a file called points.shp has some sample points to test the functions.
Finally, the main method must get the  parameters from the user, and call the other two methods to create the raster. I have copied some of the input options from the gdal_grid program, so it's easier to test the results. Of course, not all the options are implemented, but it shouldn't be difficult to do it.

if __name__ == "__main__":
    #Setting the default values
    power=2
    smoothing=0
    
    zField='Z'
    dataFile=None
    outFile=None
    driverName='GTiff'
    proj=None

    geotransform = None
    xMin=None
    xMax=None
    yMin=None
    yMax=None
    xSize=100
    ySize=100

    #Parsing the command line
    argv = gdal.GeneralCmdLineProcessor( sys.argv )
    
    i = 1
    while i < len(argv):
        arg = argv[i]
        if arg == '-out_format':
         driverName = argv[i+1]
         driverName = driverName.replace("'","")
         driverName = driverName.replace('"','')
         i = i + 1
        elif arg == '-zfield':
         zField = argv[i+1]
         zField = zField.replace("'","")
         zField = zField.replace('"','')
         i = i + 1
        elif arg == '-a_srs':
         proj = argv[i+1]
         proj = proj.replace("'","")
         proj = proj.replace('"','')
         i = i + 1
        elif arg == '-outsize':
         xSize = argv[i+1]
         xSize = xSize.replace("'","")
         xSize = int(xSize.replace('"',''))
         ySize = argv[i+2]
         ySize = ySize.replace("'","")
         ySize = int(ySize.replace('"',''))
         i = i + 2
        elif arg == '-txe':
         xMin = argv[i+1]
         xMin = xMin.replace("'","")
         xMin = float(xMin.replace('"',''))
         xMax = argv[i+2]
         xMax = xMax.replace("'","")
         xMax = float(xMax.replace('"',''))
         i = i + 2
        elif arg == '-tye':
         yMin = argv[i+1]
         yMin = yMin.replace("'","")
         yMin = float(yMin.replace('"',''))
         yMax = argv[i+2]
         yMax = yMax.replace("'","")
         yMax = float(yMax.replace('"',''))
         i = i + 2
        elif dataFile is None:
         dataFile = arg
         dataFile = dataFile.replace("'","")
         dataFile = dataFile.replace('"','')
        elif outFile is None:
         outFile = arg
         outFile = outFile.replace("'","")
         outFile = outFile.replace('"','')
        i = i + 1



    if dataFile is None or outFile is None:
        usage()
    try:
        data = readPoints(dataFile,zField)
    except Exception,ex:
        print ex
        sys.exit(0)
    if xMin is None:
        xMin=data['extent'][0]
        xMax=data['extent'][1]
    if yMin is None:
        yMin=data['extent'][2]
        yMax=data['extent'][3]

    geotransform=[]
    geotransform.append(xMin)
    geotransform.append((xMax-xMin)/xSize)
    geotransform.append(0)
    geotransform.append(yMax)
    geotransform.append(0)
    geotransform.append((yMin-yMax)/ySize)

    if proj is None:
        proj = data['proj'] 
    else:
        try:
            proj = osr.SpatialReference()
            proj.SetFromUserInput(str(proj))  
        except Exception,ex:
            print ex 
            sys.exit(0)
    #Creating the interpolation function and populating the output matrix value
    try:
        ZI = invDist(data['xv'],data['yv'],data['values'],geotransform,proj,xSize,ySize,power,smoothing,driverName,outFile)
    except Exception,ex:
        print ex
        sys.exit(0)

  1. First, the default parameter values are set. Some of them are None, since they are mandatory.
  2. Then, the input arguments are read.
  3. After having the input arguments, all the parameters that can be set, are set.
  4. The function that reads the input file is run, and with it's result, the raster is created. Note that the geotransform is defined from the input file if no one is set by the user.
And the result is something like this when opened with QGis or any GIS:
The points are not coloured, but are the ones in the original file.
The whole code comes in the example file, with all the data you need to run the example.
To generate the picture above, the command is:
 python invdistgis.py -zfield value points.shp out.tiff

What's next

The original gdal_grid implements more interpolation methods (nearest value, and average). Doing the same in the script is as easy as adding other functions to use instead of pointValue.
Also, in the original program, an ellipse can be set to search only in some area around each pixel. This would be a little more difficult do implement, but is easy to find it in the gdal source code in a file called gdalgrid.cpp. I've put it in the example data.

As you can see, implementing this kind of stuff in python is not as difficult as it may seem at the beginning.

Saturday, March 3, 2012

Creating a grid from scattered data using inverse of the distance with python


Attention: I've found some problems in this post. Please read the new one about the same topic.


A usual problem is drawing a map from some scattered data. So it's necessary to set a value for each point in the map from the data in the points already known.
There are different methods used for this purpose, such as kriging or inverse of the distance. The second one is the one used in the example.
The python library scipy has a function called RBF that does that. The code:


import numpy as np
from scipy.interpolate import Rbf

import matplotlib.pyplot as plt

#Creating some data, with each coordinate and the values stored in separated lists
x = [10,60,40,70,10,50,20,70,30,60]
y = [10,20,30,30,40,50,60,70,80,90]
values = [1,2,2,3,4,6,7,7,8,10]

#Creating the output grid (100x100, in the example)
ti = np.linspace(0, 100.0, 100)
XI, YI = np.meshgrid(ti, ti)

#Creating the interpolation function and populating the output matrix value
rbf = Rbf(x, y, values, function='inverse')
ZI = rbf(XI, YI)

# Plotting the result
n = plt.normalize(0.0, 100.0)
plt.subplot(1, 1, 1)
plt.pcolor(XI, YI, ZI)
plt.scatter(x, y, 100, values)
plt.title('RBF interpolation')
plt.xlim(0, 100)
plt.ylim(0, 100)
plt.colorbar()

plt.show() 

When executing the script, the output is an image like this:
The Rbf function  takes as arguments the x and y axes, and a list of the values in the points.  A function to use has to be set as well. In our example, inverse of the distance is used, but there are other options, and even a custom function can be used.
Then an output matrix has to be defined. To do it, the new x and y axes must be defined. In our example, a 100x100 pixel matrix has been defined with the meshgrid function.
Finally, the output matrix is created with the function that was th output of the Rbf function. This ZI matrix is what we can use to draw a map.
In this example, there is only a simple plot, but in the next post a real case will be shown.