I like testing the things I learn with actual data. Some time ago, I wrote an entry about how to use the inverse of the distance to interpolate data from scattered points. Now, I will show how to use it to draw a sea level pressure map from the data given by the Spanish Meteorology Agency. The Agency has a nice Open Data section that it's worth a visit if you are interested in meteorology as I am.
As usual, I have uploaded all the files you can need to check the entry.
Getting the dataThe station data from the AEMet is available in their ftp server.
In the example, an FTP connection to the AEMet server has to be done, and only the sea level pressure is parsed. If the actual pressure is used, the interpolation will be wrong, since the altitude won't be considered. AEMet gives the corrected pressure directly.
The file format is quite easy to understand and parse. It's a compressed CSV file, with the variable names indicated in each value. Not all the stations have pressure data, and the file's directory changes it's name with the date, so the script has to do some checks:
from ftplib import FTP import gzip ''' Gets the last sea level pressure values from the AEMet service. ''' def get_data(): ftp = FTP('ftpdatos.aemet.es') # connect to host, default port ftp.login() # user anonymous, passwd anonymous@ ftp.cwd('datos_observacion/observaciones_diezminutales') #Getting in the 10 minute datadirectory files = ftp.nlst() # list directory contents #Getting the newest day directory. It must include the word "diezminutales". The name includes the date, so ordering it by name, the last one is also the newest one. directories = [] for file_name in files: if file_name.endswith('diezminutales'): directories.append(file_name) directories.sort() ftp.cwd(directories.pop()) #Getting the last file, ordered by name. files = ftp.nlst() data_files = [] for file_name in files: if file_name.endswith('gz'): data_files.append(file_name) data_files.sort() data_file = data_files.pop() ftp.retrbinary("RETR " + data_file, open(data_file, 'wb').write) #The files are compressed with gzip. Getting the contents into a variable. f = gzip.open(data_file, 'rb') data = {'lon':[],'lat':[],'press':[]} #Parsing the file line by line for line in f: station_values = line.split(',') #Some valeus are always in the same position lat = float(station_values[2]) lon = float(station_values[3]) #Other values must be found using their label name, and not all the stations give pressure data press = None for var in station_values: if ('PRES_nmar=' in var) is True and ('QPRES_nmar=' in var) is False: press = float(var.replace('PRES_nmar=','')) if press != None: data['lon'].append(lon) data['lat'].append(lat) data['press'].append(press) f.close() return data
- The script defines a function for using it from other files, as we will in the next section.
- We will use the latest data, so is necessary to order the directory and file names to guess which file to download. To understand it better, just connect to the ftp and check how is organized.
- The data is compressed, so I use gzip to open it.
- Finally, only one of the values is used, so it's chosen. Other maps could be done from the same data set, although interpolating some variables, like temperature, this way doesn't make sense.
Viewing the data
To represent the data in the example, we will create a GeoTiff layer for the interpolated raster, and a Shapefile fot the point data. Then, using QGIS, we will draw the map. The coasline has been downloaded from the NOAA coastline extractor.
from get_data import get_data from invdistgis import invDist from osgeo import osr from osgeo import ogr def writeShape(data,proj,outFile): driverName = "ESRI Shapefile" drv = ogr.GetDriverByName( driverName ) dsOut = drv.CreateDataSource( outFile ) outLayer = dsOut.CreateLayer('PRESS', None, ogr.wkbPoint) field = ogr.FieldDefn( "PRESS", ogr.OFTReal ) outLayer.CreateField ( field ) for i in range(0,len(data['press'])): #print str(i) + " - " + str(data['lon'][i])+" "+str(data['lat'][i]) geometry = ogr.CreateGeometryFromWkt("POINT("+str(data['lon'][i])+" "+str(data['lat'][i])+")") feature = ogr.Feature( outLayer.GetLayerDefn()) feature.SetField( 'PRESS', data['press'][i] ) feature.SetGeometry(geometry) outLayer.CreateFeature(feature) dsOut = None def createMap(): #Getting the data from the function data = get_data() driverName="GTiff" outFile="press.tiff" power=2 smoothing=5 xSize=200 ySize=200 proj = osr.SpatialReference() proj.SetFromUserInput(str("EPSG:4326")) geotransform=[-10.35,0.07185,0,44.28,0,-0.0458] writeShape(data,proj,"press.shp") invDist(data['lon'],data['lat'],data['press'],geotransform,proj,xSize,ySize,power,smoothing,driverName,outFile) if __name__ == "__main__": createMap()
- I have created a function to draw the shapefile, so it's easier to understand how to do it.
- If you run the script and a shapefile already exists, the GDAL will throw an error. I think that is a kind of bug... or feature!
- To get the data, the script is using the function described above.
- The map creation uses the inverse of the distance function used in the entry that explained how to do it.
- You can find the color scale used in the project files with the name colorscale.txt
- It could be a good idea to interpolate more pixels so the raster is smoother