[MITgcm-devel] mnc and "global" files

Ed Hill ed at eh3.com
Wed Sep 7 22:01:37 EDT 2005


On Wed, 2005-09-07 at 21:35 -0400, Baylor Fox-Kemper wrote:
> Hi Ed,
>    A few points:
> 
> 1)  The underscore is overkill-- BASENAME.MYITER.fFACENUM.nc or 
> state.0000000000.f000001.nc suffices.  I hate having to hunt and peck 
> up in the top of the keyboard...
> 
> 2)  Whatever naming scheme is chosen, it should be IDENTICAL to the 
> per-processor files, except the per-processor files should have another 
> number.  Thus:
> 
> pickup.0000001440.f000001.nc
> 
> is a global file, which could be composed of the processor outputs:
> 
> pickup.0000001440.0000.f000001.nc
> pickup.0000001440.0001.f000001.nc
> pickup.0000001440.0002.f000001.nc
> pickup.0000001440.0003.f000001.nc
> 
> Or, perhaps more clearly, global files should replace the processor 
> number with a similar length symbol, e.g.,
> 
> pickup.0000001440.all.f000001.nc
> or
> pickup.0000001440.glob.f000001.nc
> 
> I personally find the latter much easier to pick out of an ls command, 
> as well as the advantage in easy globbing:
> 
> ls pickup.*all*.nc
> 
> or even
> 
> ls pick*a*.nc

Hi Baylor,

I think you partially missed the point of the filename syntax.  The
underscore is meaningless (so it's easily removed, as you suggest), but
the "t" and "f" characters ARE significant since they mean that the
number following them is either a *TILE* ("t") index or a *FACE* ("f")
index.  I don't intend to have three numbers.  Just two -- the first is
a model iteration number (the current value of myIter) and the second is
EITHER a tile or face index.  In the case where we have exactly one tile
per face, the tile and face indices should be identical -- and the tile
and face files are interchangeable.  But that's the simplest case.  In
general, they won't be interchangeable.
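
To make the two-number rule concrete, here is a rough Python sketch of
a parser for such names.  It is only an illustration, not code from mnc:
the helper name parse_mnc_name and the fixed field widths (10 digits for
the iteration, 6 for the tile/face index) are assumptions taken from the
example filenames in this thread.

```python
import re

# Hypothetical helper for illustration; not part of MITgcm or mnc.
# Assumed layout: BASENAME.MYITER.{t|f}INDEX.nc, with a 10-digit
# iteration number and a 6-digit tile or face index.
_NAME_RE = re.compile(
    r"^(?P<base>[^.]+)\.(?P<iter>\d{10})\.(?P<kind>[tf])(?P<index>\d{6})\.nc$"
)


def parse_mnc_name(filename):
    """Split a filename into (basename, myIter, 'tile' or 'face', index)."""
    m = _NAME_RE.match(filename)
    if m is None:
        raise ValueError("not a BASENAME.MYITER.{t|f}INDEX.nc name: " + filename)
    kind = "tile" if m.group("kind") == "t" else "face"
    return m.group("base"), int(m.group("iter")), kind, int(m.group("index"))
```

A name carrying a third number, such as
pickup.0000001440.0000.f000001.nc, is rejected by this pattern, which
matches the intent of having only two numbers.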


> 3) The matlab script I wrote should be easily adaptable to converting 
> back and forth for such files.

Yes!

> 4) The myiter is really an improvement.

Cool!

> 5) Don't forget our other CRITICAL improvement (which I just spent an 
> hour figuring out on some old outputs restarted with messy pickups).  
> We need to synch the output of pickup files with the output of <2GB 
> requirement!!!  As it currently exists, if one restarts from a pickup 
> file, there will be a few stragglers left behind in say, state.*.nc, so 
> that the new state.*.nc has repeated values from the old one.  I thus 
> recommend either,

I intend to completely drop the current "sequence" number and replace it
with the myIter number.  When a new file needs to be generated (either
because the 2GB file size limit has been hit or because the user has
requested a new file every, say, month) then that new file will be
generated with the *CURRENT* myIter value.  So a sequence of files for
tile #1 would look like:

  state.0017280000.t000001.nc   [ myIter = nIter0 = 17280000 ]
  state.0019958400.t000001.nc   [ 1 month = 31 days after nIter0 ]
  state.0022550400.t000001.nc   [ 2 months = 61 days after nIter0 ]
  state.0025228800.t000001.nc   [ 3 months = 92 days after nIter0 ]
  ...etc...

So, in this scheme, it seems that there's no need for the sequence
number.
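
For illustration, the sequence above can be reproduced with a few lines
of Python.  This is a sketch under assumptions: the function names are
hypothetical, and the time step of 1 second per iteration is inferred
from the listed numbers (31 days corresponds to 2678400 iterations); it
is not stated anywhere as a model setting.

```python
SECONDS_PER_DAY = 86400


def mnc_filename(base, my_iter, index, kind="t"):
    """Build BASENAME.MYITER.{t|f}INDEX.nc with zero-padded fields."""
    return f"{base}.{my_iter:010d}.{kind}{index:06d}.nc"


def iter_after_days(n_iter0, days, delta_t=1):
    """Iteration number reached `days` after n_iter0.

    delta_t is the time step in seconds (assumed to be 1 here).
    """
    return n_iter0 + days * SECONDS_PER_DAY // delta_t


n_iter0 = 17280000
names = [
    mnc_filename("state", iter_after_days(n_iter0, days), 1)
    for days in (0, 31, 61, 92)
]
for name in names:
    print(name)
```

Because the iteration field has a fixed width, a plain alphabetical
sort of these names is also a chronological sort, which is part of why
a separate sequence number adds nothing.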

Or is this not sufficiently general?  Can you think of any cases where
it won't work?  The only case I can think of where it fails is if
you can't fit even *one* iteration's worth of data into a single netCDF
file.  And that's a pathological example, anyway.

Ed

-- 
Edward H. Hill III, PhD
office:  MIT Dept. of EAPS;  Rm 54-1424;  77 Massachusetts Ave.
             Cambridge, MA 02139-4307
emails:  eh3 at mit.edu                ed at eh3.com
URLs:    http://web.mit.edu/eh3/    http://eh3.com/
phone:   617-253-0098
fax:     617-253-4464



