tag:blogger.com,1999:blog-5034489627146372337.post814610395106437128..comments2024-03-21T09:51:03.150+01:00Comments on GeoExamples: GDAL performance: raster classification using NumPyAnonymoushttp://www.blogger.com/profile/12015764612166658132noreply@blogger.comBlogger25125tag:blogger.com,1999:blog-5034489627146372337.post-37817996741867043802017-08-17T08:47:10.522+02:002017-08-17T08:47:10.522+02:00Hi,
I am new to this , can you help me please . He...Hi,<br />I am new to this , can you help me please . Here the link and I wanted to get all the records into a DB ( https://www.fs.usda.gov/rds/fedora/objects/RDS:RDS-2015-0046/datastreams/RDS-2015-0046/content ) <br /><br />If you can help me that would be a greatAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-52581714136170993492016-09-26T12:50:23.425+02:002016-09-26T12:50:23.425+02:00It worked out. Thanks a lot for tipsIt worked out. Thanks a lot for tipsAnonymoushttps://www.blogger.com/profile/12636511306031516220noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-32793836596184108832016-09-26T11:22:36.140+02:002016-09-26T11:22:36.140+02:00I have created a output image that is "GTiff&...I have created a output image that is "GTiff" and I am trying to write the array. But my result is showing an empty image with only o values. Is it possible to write array to destination raster block wise? if yes why it is showing no value on it? I am trying it for last three day. It worked with smaller raster for write array but this case it is showing empty or I show save the destination raster to get the result. It is really hard for me to understand for me because I am no a programer. Anonymoushttps://www.blogger.com/profile/12636511306031516220noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-18918130447114684612016-09-23T15:23:10.151+02:002016-09-23T15:23:10.151+02:00Hi,
Does your file Aschhorn_mosaik.jpg have a worl...Hi,<br />Does your file Aschhorn_mosaik.jpg have a worldfile? I think that adding a projection to a jpeg file is not possible. You shoud create it yourself and change the line<br />ds.GetProjection()<br />and <br />ds.GetGeoTransform()Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-62349332790107676132016-09-23T15:16:40.958+02:002016-09-23T15:16:40.958+02:00Hi, I am tring to use your code to read my big ras...Hi, I am tring to use your code to read my big raster image about 8gb. May be I am doing something wrong. I do not need primarily any classification. I just want to get return the original raster values in the array. But it iy showing all values are only Zer. Could you kindly check my code where i did mistakes.<br />my code is here:<br />import numpy<br />from numpy import zeros<br />from numpy import logical_and<br />from osgeo import gdal<br />#import struct<br /><br /><br /> <br />in_file = r"F:\Geodaten\Eigene\Luftbild\Orthobilder\20120502\Aschhorn_mosaik.jpg"<br />ds = gdal.Open(in_file)<br />band = ds.GetRasterBand(1)<br /># Get "natural" block size, and total raster XY size. <br />block_sizes = band.GetBlockSize()<br />x_block_size = block_sizes[0]<br />y_block_size = block_sizes[1]<br />xsize = band.XSize<br />ysize = band.YSize<br /><br />format = "GTiff"<br />driver = gdal.GetDriverByName( format )<br />dst_ds = driver.Create("original_blocks.tif", xsize, ysize, 1, gdal.GDT_Byte )<br /><br />dst_ds.SetGeoTransform(ds.GetGeoTransform())<br />dst_ds.SetProjection(ds.GetProjection())<br /><br /># Function to read the raster as arrays for the chosen block size.<br />def read_raster(x_block_size, y_block_size):<br /> raster = r"F:\Geodaten\Eigene\Luftbild\Orthobilder\20120502\Aschhorn_mosaik.jpg"<br /> ds = gdal.Open(raster)<br /> band = ds.GetRasterBand(1)<br /> xsize = band.XSize<br /> ysize = band.YSize<br /> blocks = 0<br /> for y in xrange(0, ysize, y_block_size):<br /> if y + y_block_size < ysize:<br /> rows = y_block_size<br /> else:<br /> rows = ysize - y<br /> for x in xrange(0, xsize, x_block_size):<br /> if x + x_block_size < xsize:<br /> cols = x_block_size<br /> else:<br /> cols = xsize - x<br /> array = band.ReadAsArray(x, y, cols, rows)<br /> print array<br /> dst_ds.GetRasterBand(1).WriteArray(array)<br /> del array<br /> blocks += 1<br /> band = None<br /> ds = None<br />#dst_ds.GetRasterBand(1).WriteArray(data)<br /><br />read_raster(x_block_size, y_block_size)<br />band = None<br />ds = None<br />dst_ds = NoneAnonymoushttps://www.blogger.com/profile/12636511306031516220noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-51308393085531348402015-10-26T12:29:35.854+01:002015-10-26T12:29:35.854+01:00Hi,
I don't understand your question. You wan...Hi,<br /><br />I don't understand your question. You want the distance from each cell (or pixel) to where?<br /><br />The block method is transparent. You just have to take the correct pixels at the same time so the underlaying GDAL library can use only one of the blocks. But the distance will depend on the x and y positions in the same wayAnonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-63727287642334246972015-10-22T16:24:51.936+02:002015-10-22T16:24:51.936+02:00Hi Roger,
I've tried to use your code to calc...Hi Roger, <br />I've tried to use your code to calculate distance for the centroid of each grid cell (10000x10000) to one point and unfortunately I didn't manage to do it using the "block method". <br />I don't really understand how to get each centroid coordinates using blocks? <br />Do you think it is possible ? <br /><br />thanks <br />ppaschhttps://www.blogger.com/profile/13344256490349922019noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-76287973975311265062015-09-22T09:19:59.065+02:002015-09-22T09:19:59.065+02:00This comment has been removed by the author.findchris2dayhttps://www.blogger.com/profile/01861494270020436243noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-14417692536148873582015-02-12T14:48:35.964+01:002015-02-12T14:48:35.964+01:00Hi Benjamin,
You are right, there is a bug here. ...Hi Benjamin,<br /><br />You are right, there is a bug here. I'll change it when I have a moment!<br />I'm happy to know that somebody us using it.Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-23971934520353230512015-02-12T12:20:19.420+01:002015-02-12T12:20:19.420+01:00Hi Roger,
after using your code for a while I fo...Hi Roger, <br /><br />after using your code for a while I found possible problems with the max_value.<br />In my specific setup I use the classification for a large number of files. They do not always fill the complete range of classification_values, which normally is not problem. Recently I had a file where the max_value was exactly one of my classification values which lead to a 0 after the reclassification.<br />So I propose to change line 50 to:<br />if classification_values[k] <= max_value and (classification_values[k + 1] > min_value ):Benjaminhttps://www.blogger.com/profile/14361493015466375001noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-14749498907321307012015-01-22T15:12:08.687+01:002015-01-22T15:12:08.687+01:00Hi again. I don't remember why I put this valu...Hi again. I don't remember why I put this value, but your comment seems right. If the value is over the index k value, the output classification should be the k+1 index.<br />Thank you again for the comment!Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-91642464382860905642015-01-22T15:11:14.577+01:002015-01-22T15:11:14.577+01:00This comment has been removed by the author.Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-56529318926713574362015-01-22T14:31:27.950+01:002015-01-22T14:31:27.950+01:00As you can see it has also taken me sometimes to a...As you can see it has also taken me sometimes to answer. I'm happy if I could help to improve your work ! Thanks for sharing.<br /><br />However, there's something else I don't understand at line 53. We're just going out the (for) loop and we want to test the last old value at index k+1. Why do you set these pixels to the new value of index k ?<br />Huguesnoreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-40919122850851398522015-01-19T18:11:17.830+01:002015-01-19T18:11:17.830+01:00Well, finally, you were wight, Hughes, the line 44...Well, finally, you were wight, Hughes, the line 44 should be cols = xsize - j<br />I'll change the post so it's correct.<br />Thanks a lotAnonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-85620211735791713512015-01-19T18:07:11.065+01:002015-01-19T18:07:11.065+01:00Hi,
You are right. The docs say that the first tw...Hi,<br /><br />You are right. The docs say that the first two parameters are the offset, and then the size to read: http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray<br />Then, I don't understand what's happening! I swear it worked :) <br />I'm sorry I don't have many time these days, but I'll post the answer as soon as I can.Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-64883343820311453942015-01-13T15:38:35.612+01:002015-01-13T15:38:35.612+01:00Hi,
Thanks for your answer. Your explanation is v...Hi,<br /><br />Thanks for your answer. Your explanation is very clear but there's still something I don't understand. Before I understood that ReadAsArray allows us to give an offset and then extract the given number of pixels. We were not extracting from pixel n to x but x pixels from the n one.<br /><br />That is how I understood your first statement for rows at line 39. In my own script if I don't add "-j" to xsize to extract the last col's pixels it throws an error:<br />ERROR 5: band 1: Access window out of range in RasterIO(). Requested (9728,0) of size 9961x256 on raster of 9961x13081.Huguesnoreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-77305793878156600822015-01-13T11:33:11.970+01:002015-01-13T11:33:11.970+01:00Thank you Huges.
I think that the code is correct...Thank you Huges.<br /><br />I think that the code is correct. If the number of columns is smaller than the block size, we read all the columns at once (xsize).<br />When reading the data at line 46, we use<br />data = band.ReadAsArray(j, i, cols, rows) <br />that reads from j to cols. If the j + block size is less than the xsize, we will read from j to xsize, which is correct.<br /><br />I hope that the explanation is clearAnonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-42163646617911553252015-01-12T15:41:12.800+01:002015-01-12T15:41:12.800+01:00Hello,
Thanks for your work which is very useful ...Hello,<br /><br />Thanks for your work which is very useful to reclass rasters and which learnt me a lot about the use of blocks. However, I think line 44 should be cols = xsize - j, shouldn't it ?<br /><br />But if I'm not wrong no one noticed this ine the previous post. A question raises : is there something I don't understand ?Huguesnoreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-25084876325500790352014-09-29T09:50:31.749+02:002014-09-29T09:50:31.749+02:00Hi Flash,
I think that the main calculation you w...Hi Flash,<br /><br />I think that the main calculation you want to do is:<br />NDVI = (G - R) / (G + R)<br />So it's a good candidate to be used as in the post, since the result that you want depends only on the pixel values for each band (doesn't depens on neighbour pixels, for instance).<br />Look at the post's code from the lnie 35 to 46. Instead of reading all the band at once as you did bandar = canalr.ReadAsArray() , you can read it by block. You should do it for the green and red bands:<br /><br />data_r = canalr.ReadAsArray(j, i, cols, rows) <br />data_g = canalg.ReadAsArray(j, i, cols, rows) <br /><br />Then, do the calculation (by the way, you are doing it in a not-numpy style!):<br /><br />ndvi = (data_g - data_r) / (data_g + data_r)<br /><br />And you will have the result for this block. Now, you have to write it at the output file, as in my line 55:<br />dst_ds.GetRasterBand(1).WriteArray(ndvi,j,i) <br /><br />This should increase the performance a lot.<br />I hope this helps a little.<br /><br />Roger<br /><br />Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-1746677761807833822014-09-28T16:54:29.286+02:002014-09-28T16:54:29.286+02:00Hello, I'm from Brazil and I'm starting st...Hello, I'm from Brazil and I'm starting studies with GDAL and python, created the following scritp to qgis for NDVI classification on a raster <br /><br />This code works correctly for small images <br /><br />The problem happens with memory error when I try to process large raster <br /><br />This is very important to me, did not quite understand the logic of the code that you have implemented, I tried to adapt your code in my code, but I did not succeed. <br /><br />Could you help me create a code that performs the same function that I developed, only for large raster? <br /><br />thank you so much<br /><br />##Raster_de_entrada=raster<br />##NDVI=output raster<br /><br />import numpy as np<br />from osgeo import gdal<br />from numpy import *<br /><br />entrada = gdal.Open(Raster_de_entrada) <br />canalr = entrada.GetRasterBand(1) <br />bandar = canalr.ReadAsArray() <br />canalg = entrada.GetRasterBand(2) <br />bandag = canalg.ReadAsArray() <br /><br /># Inicio do calculo NDVI para imagens RGB<br /># NDVI = (G - R) / (G + R)<br /><br />bandag = array(bandag, dtype = float) <br />bandar = array(bandar, dtype = float)<br /><br />var1 = subtract(bandag, bandar) <br />var2 = add(bandag, bandar)<br /><br />ndvi = divide(var1,var2)<br />ndvi = ndvi*255<br /><br /><br />geo = entrada.GetGeoTransform() <br />proj = entrada.GetProjection() <br />shape = bandar.shape <br /><br /><br /><br />driver = gdal.GetDriverByName('GTiff')<br />dst_ds = driver.Create(NDVI, shape[1], shape[0], 1, gdal.GDT_Float32, ['TFW=YES', \<br /> 'COMPRESS=LZW', 'TILED=YES'] )<br /> <br />dst_ds.SetGeoTransform( geo ) <br />dst_ds.SetProjection( proj ) <br /><br />dst_ds.GetRasterBand(1).WriteArray(ndvi) <br />stat = dst_ds.GetRasterBand(1).GetStatistics(1,1) <br />dst_ds.GetRasterBand(1).SetStatistics(stat[0], stat[1], stat[2], stat[3]) Flashhttps://www.blogger.com/profile/10351923197408918254noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-80456480133830739122014-08-29T09:49:13.706+02:002014-08-29T09:49:13.706+02:00Hi Roger,
I got it now. Of course python calculat...Hi Roger,<br /><br />I got it now. Of course python calculated the correct numbers, the problem was the GDAL Create(). The last argument is a GDALDataType which was set to eight bit unsigned integer. Changing it to sixteen bit signed integer did the trick for me.<br />So the changed line of code is:<br /><br />dst_ds = driver.Create("original_blocks.tif", xsize, ysize, 1, gdal.GDT_Int16 )<br /><br />to be sure I also altered the numpy.zeros creation to (which in fact had no influence, but to be sure):<br /><br />r = zeros((rows, cols), numpy.int16)<br /><br />Thanks again for the classification code! <br />BenjaminBenjaminhttps://www.blogger.com/profile/14361493015466375001noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-24653887429078304662014-08-28T22:24:04.774+02:002014-08-28T22:24:04.774+02:00Hi Benjamin,
I'm not at home now, so I can...Hi Benjamin,<br /><br />I'm not at home now, so I can't test the code, but a negative number multiplied by True should give the negative value, since True acts as 1. <br />Maybe the problem is when using Numpy. I'll take a look, sorry for the delay.<br /><br />RogerAnonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-63316557875123775452014-08-28T15:20:31.359+02:002014-08-28T15:20:31.359+02:00Hi,
I am testing your code and it works like a c...Hi, <br /><br />I am testing your code and it works like a charm. I have only one question: How to get negaitve output values. Whatever I am trying, a negative value in the classification_output_values results in a zero on the output .tif.<br /><br />I guess the problem lies here <br /><br />r = r + classification_output_values[k] * logical_and(data >= classification_values[k], data < classification_values[k + 1])<br /><br />but I don't understand why a multiplication of a negative value and True results in zero. I am not a python expert so I am asking you for help.<br /><br />Cheers,<br />BenjaminBenjaminhttps://www.blogger.com/profile/14361493015466375001noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-14618492614867875052014-04-28T13:10:19.804+02:002014-04-28T13:10:19.804+02:00Hi mr Franklin :)
I have tested the scripts in se...Hi mr Franklin :)<br /><br />I have tested the scripts in several formats, but not with files as big as 60GB. The performance can change depending of the format, since each driver implements the rading block in a different way. GeoTIFF uses one line for each block, but you can define blocks of other sizes, let's say 256x256 pixels in the driver. The original example uses data from the usgs in AIG/Arc/Info Binary Grid, that uses 4 lines as a block. The driver that writes the output file can affect the result too for the same reasons, of course.Anonymoushttps://www.blogger.com/profile/12015764612166658132noreply@blogger.comtag:blogger.com,1999:blog-5034489627146372337.post-50099092861551639302014-04-28T06:39:05.369+02:002014-04-28T06:39:05.369+02:00I have found that combining gdalbuildvrt and gdal_...I have found that combining gdalbuildvrt and gdal_translate is much better for performance when merging the blocks back together then gdal_merge.py. It also avoids the memory problems that gdal_merge.py has. The only downside is that it adds an extra step (albeit a short and simple one).<br />e.g. instead of "gdal_merge.py -o big.tiff srtm*.tif" use the following two lines:<br />gdalbuildvrt srtm*.tif big.vrt<br />gdal_translate -of GTiff big.vrt big.tif<br /><br />I have been curious about the different types of rasters and their comparative read/write speeds for the intermediate steps (with gdal), have you used anything apart from GeoTIFFs for processing? I am working with 60GB compressed GeoTIFFs at the moment.Anonymoushttps://www.blogger.com/profile/05615706462957759958noreply@blogger.com