[MITgcm-devel] netcdf on sx8

Martin Losch Martin.Losch at awi.de
Thu Nov 27 11:47:20 EST 2008


Hi again, sorry for this piecewise flow of information:

I have now figured out what distinguishes lab_sea from the other
experiments with netcdf: in lab_sea the diagnostics package writes 14
netcdf files. When I reduce this number to 6, the model finishes
without errors, leaving me with 30 files in the end:
2*(6 diagnostics + regular output + tave-output). Redirecting the
monitor output to netcdf opens additional files and the model stops
again. So apparently on our sx8 we can have only 30 netcdf files open
simultaneously.
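
One way to check whether the limit of 30 is specific to netcdf or is a
general per-process limit on open files is to open plain files until the
system refuses. The following is only a sketch in Python, not something
taken from MITgcm or the netcdf library; the function name
count_openable_files and its cap parameter are made up for illustration,
and the resource module it uses exists only on Unix-like systems (that it
would run on the sx8 front end is an assumption).

```python
import os
import resource
import tempfile


def count_openable_files(cap=64):
    """Open plain files simultaneously until the OS refuses or `cap` is hit.

    Returns the number of files that were open at the same time.
    `cap` is a hypothetical safety limit that keeps the probe bounded.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        handles = []
        try:
            for i in range(cap):
                path = os.path.join(tmpdir, "probe_%04d.tmp" % i)
                try:
                    handles.append(open(path, "wb"))
                except OSError as exc:
                    # At a descriptor limit this is typically EMFILE.
                    print("open failed after %d files: %s" % (len(handles), exc))
                    break
            return len(handles)
        finally:
            # Close everything before the temporary directory is removed.
            for handle in handles:
                handle.close()


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("descriptor limit: soft=%s hard=%s" % (soft, hard))
    print("plain files opened simultaneously:", count_openable_files())
```

If this reports a number well above 30 while the model still stops at 30
netcdf files, the limit is more likely inside the netcdf library (or in
the memory it needs per open file) than in the operating system's
descriptor limit.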

That's really odd, and I wonder if there's something that one can do
about this when compiling the netcdf libraries (that's why there's a
cc to Kerstin Fieg, who created the netcdf libraries).
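
Regarding the error code 12 mentioned in the quoted message below: as far
as I know, netcdf's own error codes are negative, and positive codes
returned by the library are system errno values whose text comes from the
system's strerror. If that holds here, 12 would be ENOMEM on most
Unix-like systems, and "Not enough space" would be about memory rather
than disk. The sketch below (Python, with a hypothetical helper
describe_errno) just looks up the symbolic name and message for a code on
the machine it runs on; the numbering and wording on the sx8 may differ.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message text) for a system errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, text = describe_errno(12)
    print("errno 12 -> %s: %s" % (name, text))
```

The message text is platform dependent: Linux words ENOMEM differently,
while some System V derived systems use "Not enough space" for it, so the
symbolic name is the more reliable thing to compare.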

Martin

On 27 Nov 2008, at 17:20, Martin Losch wrote:

> Following up on my own previous observation:
> the error for lab_sea has not gone away, and I still don't know  
> exactly what the problem is. But apparently, when mitgcmuv is  
> trying to create the file for the second tile, the netcdf library  
> routine NF_CREATE returns an error code (12) that translates into  
> "Not enough space". I still have no idea why this error should  
> arise. I have about 380GB of disk space available. The exact
> calling statement is also completely independent of the size of the
> problem: err = NF_CREATE(fname, NF_CLOBBER, fid). The only input is
> fname, which is a character string of length 500 (MNC_MAX_PATH).
>
> When I comment out the stop statement in mnc_handle_err, the model  
> finishes with many error messages from the mnc-package (mostly  
> invalid id) and produces a corrupted netcdf file for each of the  
> variables that are saved after the initial problem occurs.
>
> All of this happens for 2 tiles (1 tile is OK, obviously, because no
> second file is opened), regardless of whether I run on 1 or 2 CPUs
> (nSx=2 or nPx=2).
>
> To me this looks very much like a non-local problem with memory  
> array boundaries, but I have no clue why and where this should  
> happen. I have tried an array bound check with -eC, but that seemed  
> to be OK. Something really fishy ...
>
> Any comments are welcome,
>
> Martin
>
> cc to Jens-Olaf, although he cannot reply to this list.
>
> Oh yes, happy thanksgiving ...
>
> On 30 Jun 2008, at 10:28, Martin Losch wrote:
>
>> Hi all,
>>
>> I found a funny error with netcdf in my SX8 routine test: in  
>> lab_sea/run
>> I get this
>> > cat STDERR.*
>> (PID.TID 0001.0001) *** ERROR *** NetCDF ERROR:les
>> (PID.TID 0001.0001) *** ERROR *** MNC ERROR: opening 'phiHydLow.0000000000.t002.nc'
>> > cat STDOUT.0001
>>  NetCDF ERROR:
>>  ===
>>  Not enough space
>>  ===
>>  MNC ERROR: opening 'phiHydLow.0000000000.t002.nc'
>>
>> and in ideal_2D_oce
>> > cat STDERR.*
>> (PID.TID 0001.0001) *** ERROR *** NetCDF ERROR:
>> (PID.TID 0001.0001) *** ERROR *** MNC ERROR: opening 'flxDiag.0000036000.t004.nc'
>> > tail STDOUT.0001
>>  NetCDF ERROR:
>>  ===
>>  Not enough space
>>  ===
>>  MNC ERROR: opening 'flxDiag.0000036000.t004.nc'
>>
>> phiHydLow is not part of the diagnostics output, and flxDiag.* is
>> only the 4th output stream in data.diagnostics. By lucky accident
>> I found that the second error occurs when the model calls
>>> C       Update the record dimension by writing the iteration number
>>>         CALL MNC_CW_SET_UDIM(diag_mnc_bn, -1, myThid)
>>>         CALL MNC_CW_RL_W_S('D',diag_mnc_bn,0,0,'T',myTime,myThid)  <=======
>>>         CALL MNC_CW_SET_UDIM(diag_mnc_bn, 0, myThid)
>>>         CALL MNC_CW_I_W_S('I',diag_mnc_bn,0,0,'iter',myIter,myThid)
>>>
>> from diagnostics_out.F
>>
>> "Not enough space" cannot refer to disk space, as I am well below
>> my file-number and disk-space quotas.
>>
>> Any idea what could be going on? The other examples with netcdf
>> seem to be doing fine (and in our "production" runs we generally
>> don't have problems with MITgcm+netcdf ...)
>>
>> Martin
>> _______________________________________________
>> MITgcm-devel mailing list
>> MITgcm-devel at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>



