[MITgcm-support] mnc pickups

Ed Hill ed at eh3.com
Mon Sep 19 17:04:54 EDT 2005


Hi Jeff,

I've added comments in-line below:

On Mon, 2005-09-19 at 13:06 -0700, jeff polton wrote:
> In summary I tried the new mnc files on 3 similar runs which stopped 
> for 2 different reasons after a number of hours running (one of which I 
> don't understand). The diagnostics output files however all have the 
> same time stamp as the error files.
> I'll describe what happened.
> "cvs -n update" told me I needed 2 files from pkg/mnc so I got them
> $cvs update mnc_cw_readwrite.template
> $cvs update mnc_cw_cvars.F

This is good, but please check that you re-ran genmake2 when you
re-built things since the *.template files need to be "expanded" and
this is done by genmake2.  I think you did the right thing, but I just
want to be clear that the safest thing is to completely re-build your
executable.


> I restarted the runs from pickups, changing only the chkptFreq to 
> something nonzero and nIter0 and nTimeSteps.
> 
> The 1st to fail failed with the following error
> > forrtl: severe (66): output statement overflows record, unit -5, file Internal Formatted Write
> > Image              PC                  Routine            Line        Source
> > mitgcmuv           0x4000000000408ab0  Unknown               Unknown  Unknown

...snip...
 
> > Unknown
> > libc.so.6.1        0x20000000003b2970  Unknown               Unknown  Unknown
> and STDERR.0000:
> > (PID.TID 0000.0001) *** ERROR *** NetCDF ERROR: Numeric conversion not 
> > representable

The "internal formatted write" is a character string formatting issue.
And the "numeric conversion not representable" is an error reported by
the netCDF library.  It would be really helpful if I could somehow
reproduce your errors.  Any chance that I could have a set of your input
files that triggers the above error messages?
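
For context, "Numeric conversion not representable" is the text the netCDF library attaches to its NC_ERANGE status, which it returns when a value cannot be represented in the type it is being converted to (for example, a double whose magnitude is too large for a 32-bit float). Whether that is what happened in this particular run is not established. The Python sketch below only illustrates that kind of range check; the function name and the sample values are made up for illustration and are not part of MITgcm or netCDF.

```python
# Largest finite IEEE 754 single-precision value, computed exactly in double precision.
FLOAT32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def find_out_of_float32_range(values):
    """Return the indices of values whose magnitude exceeds the float32 maximum.

    Hypothetical helper for illustration only. Infinities are flagged;
    NaN is not flagged, because every comparison with NaN is False.
    """
    return [i for i, v in enumerate(values) if abs(v) > FLOAT32_MAX]


# Made-up sample data: the last two entries do not fit in a 32-bit float.
sample = [0.0, 1.5e3, -2.0e38, 1.0e40, float("inf")]
bad = find_out_of_float32_range(sample)
print(bad)
```

Running a check like this over a field before it is written as single precision would show whether any entries fall outside the range the file's variable type can hold.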


> I had previously encountered this error and resolved it ("circumnavigated" 
> might be more appropriate) by reducing the number of diagnostics in 
> data.diagnostics. Clearly that was not the real problem as that error 
> has arisen again.

Again, it's going to be hard to find the problem if I can't reproduce the
error(s).  Please help me by giving me a set of input files that causes
the above problem(s).


> The other 2 runs both died later because I filled my 500GB limit! Doh! 
> Though interestingly both made it further than the above run and both 
> completed a successful pickup dump. The above failed run completed 
> only a checkpoint dump.

Well, I'm glad that you're making some progress!  ;-)

Ed

-- 
Edward H. Hill III, PhD
office:  MIT Dept. of EAPS;  Rm 54-1424;  77 Massachusetts Ave.
             Cambridge, MA 02139-4307
emails:  eh3 at mit.edu                ed at eh3.com
URLs:    http://web.mit.edu/eh3/    http://eh3.com/
phone:   617-253-0098
fax:     617-253-4464



