[MITgcm-devel] gluemncbig and many tiles
Oliver Jahn
jahn at MIT.EDU
Mon Jan 14 08:31:24 EST 2013
Hi Martin,
this can already be done by assigning to a slice of the variable object.
I have a version of gluemnc somewhere that operates like that and will
dig it out later. I have found, though, that on some systems, this
jumping around in the output file is part of what makes other scripts so
slow. The other reason is multiple writing of metadata (which also
causes jumping around). I think gluemncbig is so fast because it reads
and writes all files sequentially. I understand we will need to support
more tiles than we can have open files, but we should probably keep the
current loop order as an option, maybe even allowing a hybrid approach
where larger blocks (e.g., all tiles in x) are assembled in memory (with
the corresponding number of open files) and then written "locally".
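
A minimal sketch of that hybrid idea, not taken from gluemncbig: plain nested lists stand in for the data read from each tile file, and the names glue_tile_row and glue are hypothetical. Only the tiles of one tile row (all tiles in x) are held at once, and each assembled block is appended to the output in order.

```python
def glue_tile_row(tiles):
    """Join tiles that are adjacent in x into full-width rows.

    Each tile is a list of rows (lists of values); all tiles in one
    tile row must have the same number of rows.
    """
    ny = len(tiles[0])
    return [[v for tile in tiles for v in tile[j]] for j in range(ny)]


def glue(tile_grid):
    """Assemble a global field one tile row at a time.

    tile_grid[ty][tx] is the tile at tile position (tx, ty).  Only one
    tile row is needed at a time, so in a file-based version at most
    len(tile_grid[0]) input files would have to be open together, and
    the output would be written as consecutive blocks.
    """
    out = []
    for tile_row in tile_grid:
        out.extend(glue_tile_row(tile_row))  # one contiguous block of rows
    return out


# Example: a 4x4 field split into 2x2 tiles of size 2x2, then glued back.
field = [[4 * j + i for i in range(4)] for j in range(4)]
tile_grid = [[[row[tx:tx + 2] for row in field[ty:ty + 2]]
              for tx in (0, 2)] for ty in (0, 2)]
glued = glue(tile_grid)
```

In a real script each tile would come from one open input file, so the number of files open at once would be the number of tiles in x rather than the total number of tiles.
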
Oliver
On 2013-01-14 05:58, Martin Losch wrote:
> Oliver,
>
> I am not sure I understand everything in gluemncbig, but I think that, unless one wants to try to change the number of allowed open files to infinity, one could try to change the order of the loops. Instead of opening all files and then looping over the variables, I can imagine looping over the files and then over the variables, always reading the local bit and also writing it locally:
>
> for fname in files0:
>     nc = netcdf_file(fname, 'r', **readopts)
>     […]
>     for name,pos,irec in ncout.begins:
>         […]
>         data = nc.read_recvar(name, irecin)
>         ncout.write_recvar_sub(name, irec, iX, iY, data)
>
>     nc.close()
>
> where write_recvar_sub is something like the Fortran nf_put_vars_real (which I don't know yet how to write, but assume you do).
> That shouldn't be much slower than the current implementation. What do you think?
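
What such a sub-block write could look like is sketched below under simplifying assumptions: the output is a flat, row-major 2-D field of 4-byte big-endian floats rather than a real netCDF record variable, and write_sub is a hypothetical helper, not a function of gluemncbig. It needs one seek per tile row, which is the jumping around in the output file mentioned in the reply above.

```python
import io
import struct


def write_sub(out, nx, ix0, iy0, tile):
    """Write a 2-D tile into a row-major field of width nx stored in `out`.

    `out` is a seekable binary file object holding 4-byte big-endian
    floats; (ix0, iy0) is the position of the tile's first element in
    the global field.  One seek is issued per tile row.
    """
    for j, row in enumerate(tile):
        out.seek(4 * ((iy0 + j) * nx + ix0))
        out.write(struct.pack('>%df' % len(row), *row))


# Example: write four 2x2 tiles of a 4x4 field into an in-memory "file".
nx, ny = 4, 4
field = [[float(nx * j + i) for i in range(nx)] for j in range(ny)]
out = io.BytesIO(b'\0' * (4 * nx * ny))  # preallocated output
for iy0 in (0, 2):
    for ix0 in (0, 2):
        tile = [row[ix0:ix0 + 2] for row in field[iy0:iy0 + 2]]
        write_sub(out, nx, ix0, iy0, tile)
result = struct.unpack('>%df' % (nx * ny), out.getvalue())
```
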
>
> Martin
>
> On Jan 14, 2013, at 10:29 AM, Martin Losch <Martin.Losch at awi.de> wrote:
>
>> Hi Oliver,
>>
>> I am trying to use the great gluemncbig script to glue many (1120) netcdf files. After fixing this re.compile issue (although this may not be the best fix, and maybe {3,5} is even better):
>> tilepatt = re.compile(r'(\.t[0-9]{3,4}\.nc)$')
>> I am getting different error messages on different computers:
>>
>> 1. On hicegate.hlrn.de (www.hlrn.de) with python 2.4
>>
>> hicegate0:/gfs1/work/hbklosch/MITgcm/arctic_4km/run_jfnk_01120 $ gluemncbig -o diags2D.glob.nc diags2D.0*.t*.nc
>> Tiled dimensions: Yp1 Y X Xp1
>> Record dimension: T
>> Writing non-record variables
>> Traceback (most recent call last):
>> File "/home/h/hbklosch/bin/gluemncbig", line 1336, in ?
>> if progress and not verbose: sys.stderr.write('Writing {} records: '.format(nrec))
>> AttributeError: 'str' object has no attribute 'format'
>>
>> 2. On a local machine at AWI (uv100.awi.de) with python 2.6
>> uv100:run_jfnk_1120> gluemncbig -o diags2D.glob.nc diags2D.0001971000.t*
>> Tiled dimensions: Yp1 Y X Xp1
>> Record dimension: T
>> Traceback (most recent call last):
>> File "/uv/home1/mlosch/bin/gluemncbig", line 1299, in <module>
>> ncs[fname] = nc = netcdf_file(fname, 'r', **readopts)
>> File "/uv/home1/mlosch/bin/gluemncbig", line 260, in __init__
>> self.fp = open(self.filename, '%sb' % mode)
>> IOError: [Errno 24] Too many open files: 'diags2D.0001971000.t900.nc'
>>
>> I am absolutely new to python (started looking at python last Thursday), so my opinion on this is probably very unqualified, but I assume that the first error has to do with the python version, but that the second error can be fixed somehow, right? I didn't find any solution yet, but while I keep looking you probably know the answer right away.
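
The second error comes from the per-process limit on open file descriptors, the same limit the shell reports with `ulimit -n`. A small sketch of how a script could inspect that limit and try to raise it, using the standard `resource` module (Unix only); the helper names are hypothetical, and whether the limit may actually be raised depends on the system:

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(wanted):
    """Try to raise the soft limit to `wanted`, capped at the hard limit.

    Returns the soft limit in effect afterwards.  If the limit is already
    high enough, or the system refuses the change, the current soft limit
    is returned unchanged.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    if soft == resource.RLIM_INFINITY or wanted <= soft:
        return soft
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    except (ValueError, OSError):
        return soft  # the system did not accept the new limit
    return wanted


soft, hard = open_file_limits()
```

A call such as `raise_soft_limit(2048)` before opening the tile files would only help where the hard limit allows it; otherwise the files have to be processed in smaller groups.
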
>>
>> A more general question: I am so excited by the things python can do now (with matplotlib one can plot finite elements very easily and beautifully) that I want to use it more. Do you already have a set of scripts that would replace our matlab functions (I saw mds, mnc, but, e.g., plotting the cube etc.) that you are willing to share?
>>
>> Martin
>>
>> --
>>
>> Martin Losch
>> Alfred Wegener Institute for Polar and Marine Research
>> Postfach 120161, 27515 Bremerhaven, Germany;
>> Tel./Fax: ++49(0471)4831-1872/1797
>>
>>
>> _______________________________________________
>> MITgcm-devel mailing list
>> MITgcm-devel at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-devel