I am working on my thesis, for which I need to produce IDF curves from TRMM data. This means processing 52,000 images, and I would like to know which data format is best for this work and how to process the files.
Hi Javad, that's a lot of data! Unless you have enough memory to read all of it in one batch, you will probably want to create several spatial subsets first.
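To get a feeling for the numbers, here is a rough back-of-the-envelope estimate. The grid size is an assumption on my side (1440 x 400 cells, as in the 0.25-degree TRMM 3B42 product, stored as 32-bit floats), and the 100 x 100 tile is just an illustrative subset size, so plug in the values of your own files.

```python
# Assumptions (not from the question): a 1440 x 400 grid, as in the
# 0.25-degree TRMM 3B42 product, with values stored as 32-bit floats.
cols, rows = 1440, 400
bytes_per_value = 4
n_images = 52000

# Memory needed for one image and for the full stack of all images.
bytes_per_image = cols * rows * bytes_per_value
total_bytes = bytes_per_image * n_images
total_gib = total_bytes / 1024**3

# The same stack restricted to a hypothetical 100 x 100 pixel subset.
tile_bytes = 100 * 100 * bytes_per_value * n_images
tile_gib = tile_bytes / 1024**3

print(f"one image:   {bytes_per_image / 1e6:.1f} MB")
print(f"full stack:  {total_gib:.0f} GiB")
print(f"100x100 tile over all images: {tile_gib:.1f} GiB")
```

Under these assumptions the full stack is on the order of 100 GiB, while a 100 x 100 tile over all time steps is about 2 GiB, which is why subsetting first makes sense.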
Have a look at gdalwarp and its extent option (-te xmin ymin xmax ymax). You can write a shell script to create the subsets. Then you could use, for example, Python with the GDAL Python bindings. I recommend Anaconda (https://www.continuum.io/downloads) for this, since it lets you easily install the needed Python packages from the console (e.g. "conda install GDAL"). GDAL reads pretty much all common raster formats, including the netCDF you probably have for your TRMM data. Alternatively, consider using HDF files, which can let you read data directly from the file without first making a copy. In Python you can read an individual file like this:
from osgeo import gdal
raster = gdal.Open(filename) # open the image; filename is the path to one of your files
geoTransform = raster.GetGeoTransform() # get geotransform (i.e. origin and pixel size, from which the extent follows)
projection = raster.GetProjection() # get projection as WKT (e.g. UTM, WGS84)
data = raster.GetRasterBand(i).ReadAsArray() # get band 'i' (band numbers start at 1) and read it into a NumPy array
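The calls above read a single file. As a minimal sketch of how the same read could be looped over many files and stacked into one 3D array (rows, cols, time): the helper names read_band and stack_files and the reader argument are my own, not part of GDAL, and I am assuming every file has the same shape and the data sit in band 1.

```python
import numpy as np


def read_band(filename, band=1):
    """Read one band of one raster file as a 2D NumPy array using GDAL."""
    # Imported here so the stacking helper below also works without GDAL
    # when a different reader function is passed in.
    from osgeo import gdal

    raster = gdal.Open(filename)
    if raster is None:
        raise IOError("GDAL could not open " + filename)
    return raster.GetRasterBand(band).ReadAsArray()


def stack_files(filenames, reader=read_band):
    """Stack one 2D array per file into a (rows, cols, n_files) array.

    Assumes all files share the same number of rows and columns.
    """
    first = reader(filenames[0])
    # Allocate the whole stack once instead of growing it file by file.
    stack = np.empty(first.shape + (len(filenames),), dtype=np.float32)
    stack[:, :, 0] = first
    for k, name in enumerate(filenames[1:], start=1):
        stack[:, :, k] = reader(name)
    return stack


# Hypothetical usage, assuming your subsets are named subset_*.tif:
# import glob
# stack = stack_files(sorted(glob.glob("subset_*.tif")))
# series = stack[10, 20, :]  # pixel (10, 20) over the whole stack
```

Allocating the array once up front avoids the repeated copying you would get from appending each image to a growing array, which matters with this many files.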
The overall approach is to make one 3D array with NumPy by stacking all files along the third axis. Then you can simply run through your stack: array[i,j,:] is pixel (i,j) over the whole stack (the ":"), and you can apply the statistics you need for your IDF to it. With the projection and geoTransform information you can also save the stack. Here is an example for one band. To write the whole stack, add a loop around the ...WriteArray(raster) call, with raster being your layer i (i.e. data[:,:,layer]), and set the number of bands in the ...Create(...) line accordingly.
rows, cols = data.shape
output_raster = gdal.GetDriverByName('GTiff').Create(filename, cols, rows, 1, gdal.GDT_Float32) # create a new GeoTIFF; filename is the output path, the '1' is the number of bands, and the last argument is the pixel data type
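To tie the pieces together, here is a hedged sketch of one possible per-pixel statistic and of the multi-band write described above. It computes, for each pixel, the maximum moving-average intensity for several durations and writes one band per duration. The function names are my own, and this covers only the "maximum intensity per duration" step over the whole record; a full IDF analysis would normally take annual maxima and fit an extreme-value distribution on top of this. It also assumes there are no missing values (NaN) in the stack.

```python
import numpy as np


def max_mean_intensity(stack, window):
    """Per-pixel maximum of the moving-average intensity over `window` steps.

    stack: array of shape (rows, cols, time) holding intensities (e.g. mm/h).
    Returns an array of shape (rows, cols).
    """
    # Cumulative sum along time with a leading zero, so that the difference
    # of two entries `window` apart is the sum over one window.
    csum = np.cumsum(stack, axis=2, dtype=np.float64)
    csum = np.concatenate([np.zeros(stack.shape[:2] + (1,)), csum], axis=2)
    window_sums = csum[:, :, window:] - csum[:, :, :-window]
    return (window_sums / window).max(axis=2)


def intensity_layers(stack, windows):
    """One (rows, cols) layer per window length, stacked along the third axis."""
    return np.dstack([max_mean_intensity(stack, w) for w in windows])


def write_layers(out_filename, layers, geo_transform, projection):
    """Write a (rows, cols, n_bands) array as a multi-band GeoTIFF."""
    from osgeo import gdal  # only needed for the write step

    rows, cols, n_bands = layers.shape
    driver = gdal.GetDriverByName('GTiff')
    out = driver.Create(out_filename, cols, rows, n_bands, gdal.GDT_Float32)
    out.SetGeoTransform(geo_transform)
    out.SetProjection(projection)
    for layer in range(n_bands):
        # GDAL band numbers start at 1, NumPy indices at 0.
        out.GetRasterBand(layer + 1).WriteArray(layers[:, :, layer])
    out.FlushCache()
    out = None  # dereference to close the file


# Hypothetical usage, with geoTransform and projection read from an input file:
# layers = intensity_layers(stack, windows=[1, 2, 4, 8])
# write_layers("max_intensity.tif", layers, geoTransform, projection)
```

With 3-hourly data, a window of 1, 2, 4 or 8 steps would correspond to durations of 3, 6, 12 and 24 hours; adjust the window lengths to the time step of your own product.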