[MITgcm-devel] strange error messages from diagnostics pkg

Jean-Michel Campin jmc at ocean.mit.edu
Mon May 18 09:45:24 EDT 2015


Hi Martin,

I guess it's related to multi-threading (OpenMP), and it looks like
we are missing a few "BARRIER" calls in the code. I am currently
checking this and will let you know later.

Cheers,
Jean-Michel
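
To illustrate the kind of race a missing barrier can cause, here is a
minimal sketch in Python (not MITgcm code). It assumes that one thread
updates a shared package status from 10 to 20 and that the other
threads check that status right afterwards; the names run_step,
STATUS_INIT and STATUS_READY are hypothetical, and the meaning given to
the values 10 and 20 is only inferred from the error message quoted
below.

```python
import threading
import time

STATUS_INIT = 10    # assumed meaning: on/off switch not done yet
STATUS_READY = 20   # assumed meaning: fill calls are now allowed


def run_step(n_threads, use_barrier):
    """Return the status value each thread saw at its 'fill' check."""
    shared = {"status": STATUS_INIT}
    seen = [None] * n_threads
    barrier = threading.Barrier(n_threads)

    def worker(tid):
        if tid == 0:
            # The master thread is slow to reach the status update.
            time.sleep(0.05)
            shared["status"] = STATUS_READY
        if use_barrier:
            # Nobody proceeds until every thread, including the
            # master, has arrived here, i.e. after the update.
            barrier.wait()
        # Stand-in for the status check done inside a fill call.
        seen[tid] = shared["status"]

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print("with barrier:   ", run_step(4, use_barrier=True))
    print("without barrier:", run_step(4, use_barrier=False))
```

With the barrier, every thread sees the updated status. Without it, the
non-master threads will usually read the old value, and whether they do
depends on timing, which would fit errors that show up only now and
then.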

On Mon, May 18, 2015 at 10:14:29AM +0200, Martin Losch wrote:
> Hi there,
> 
> every now and then I am getting strange error messages from the
> diagnostics pkg:
> 
> (PID.TID 0121.0001) *** DIAGNOSTICS_STATUS_ERROR *** from: DIAGNOSTICS_FILL call
> (PID.TID 0121.0001) *** ERROR *** DIAGNOSTICS_FILL: diagName="ETAN    ", expectStatus= 20, pkgStatus= 10
> (PID.TID 0121.0001) *** ERROR *** DIAGNOSTICS_FILL: <== called from the WRONG place, i.e.
> (PID.TID 0121.0001) *** ERROR *** DIAGNOSTICS_FILL: before DIAGNOSTICS_SWITCH_ONOFF call in FORWARD_STEP
> 
> this time from processes 11 and 121 of a 624-CPU run (in fact I
> use nPx = 156 and nSx = 4 with OpenMP). The errors tend not to be
> reproducible (i.e. I rerun the same setup without any changes and
> without any problems), but they pop up every couple of runs in a
> "chain-job" on the Cray XC-30 (cca.ecmwf.int). This seems to happen
> when the model tries to store the first time slice of my first
> diagnostic (ETAN).
> The code is "vanilla" checkpoint65k plus a few days.
> 
> Have you seen that before? Is there any chance of debugging this?
> Could it have anything to do with OpenMP?
> 
> Martin
> 
> 
> -- 
> Martin Losch
> Alfred Wegener Institute for Polar and Marine Research
> Postfach 120161, 27515 Bremerhaven, Germany;
> Tel./Fax: ++49(0471)4831-1872/1797
> 
> 


