[MITgcm-support] model blew up but not terminated
Constantinos Evangelinos
ce107 at ocean.mit.edu
Fri Jul 8 13:27:15 EDT 2011
Sorry for the delayed response - this has been covered before:
http://forge.csail.mit.edu/pipermail/mitgcm-support/2006-April/003931.html
Constantinos
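
For anyone who prefers not to patch the model source, the NaN check asked about in the quoted thread can also be done from outside the run. Below is a minimal sketch, assuming the monitor output keeps the "%MON name = value" layout shown in the quoted STDOUT.0000 excerpt. The function name first_nan_monitor is hypothetical and not part of MITgcm. It can complement a threshold test on rhsMax, since under IEEE 754 arithmetic an ordered comparison against a value that is already NaN evaluates to false.

```python
import re

# Matches monitor lines such as
#   (PID.TID 0000.0001) %MON dynstat_eta_max = NaN
# and captures the field name and the printed value.
MON_LINE = re.compile(r"%MON\s+(\S+)\s*=\s*(\S+)")


def first_nan_monitor(lines):
    """Return (line_number, field_name) for the first %MON value that is NaN.

    `lines` is any iterable of strings, for example an open STDOUT.0000 file.
    Returns None when no monitor value is NaN. Hypothetical helper, not MITgcm code.
    """
    for number, line in enumerate(lines, start=1):
        match = MON_LINE.search(line)
        if match is None:
            continue
        # Some compilers print "nan" or "-NaN", so drop the sign and ignore case.
        value = match.group(2).lstrip("+-").lower()
        if value == "nan":
            return number, match.group(1)
    return None
```

A job script could call this periodically on STDOUT.0000 and cancel the job when it returns something other than None. How the job is cancelled depends on the batch system and is left out here.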
On Monday 27 June 2011 14:39:00 Matthew Mazloff wrote:
> Hi David,
>
> you can put something like:
>
> C     Write the state and stop if the cg2d right-hand side is very large
>       IF ( rhsMax .GE. 100000000. _d 0 ) THEN
>         CALL WRITE_STATE( endTime, 999999, myThid )
>         STOP 'CMM => ABNORMAL END: THE MODEL IS BLOWING UP!'
>       ENDIF
>
>
> right after
>
>       WRITE(standardmessageunit,'(A,1P2E22.14)')
>      &     ' cg2d: Sum(rhs),rhsMax = ', sumRHS,rhsMax
>
>
> in cg2d.F
>
> This usually triggers only when the model is actually blowing up.
>
> -Matt
>
> On Jun 27, 2011, at 7:09 AM, Daiwei (David) Wang wrote:
> > Hi,
> >
> > I found a few jobs of mine on beagle blew up, but kept running. By
> > blowup, I mean monitor statistics values became populated by NaN, for
> > example,
> >
> > $ grep dynstat_eta STDOUT.0000
> > (PID.TID 0000.0001) %MON dynstat_eta_max = 1.2697030002398E+00
> > (PID.TID 0000.0001) %MON dynstat_eta_min = -1.9409175554150E+00
> > (PID.TID 0000.0001) %MON dynstat_eta_mean = -1.0404669740582E-16
> > (PID.TID 0000.0001) %MON dynstat_eta_sd = 6.2681413921827E-01
> > (PID.TID 0000.0001) %MON dynstat_eta_del2 = 1.1009328164515E-04
> > (PID.TID 0000.0001) %MON dynstat_eta_max = 1.0775823955686E+00
> > (PID.TID 0000.0001) %MON dynstat_eta_min = -1.9467522542860E+00
> > (PID.TID 0000.0001) %MON dynstat_eta_mean = -2.9479897598316E-16
> > (PID.TID 0000.0001) %MON dynstat_eta_sd = 6.2739266263267E-01
> > (PID.TID 0000.0001) %MON dynstat_eta_del2 = 1.1075534674701E-04
> > (PID.TID 0000.0001) %MON dynstat_eta_max = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_min = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_mean = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_sd = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_del2 = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_max = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_min = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_mean = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_sd = NaN
> > (PID.TID 0000.0001) %MON dynstat_eta_del2 = NaN
> >
> > But the job kept running, in vain of course, until endTime. I wonder
> > if there is a flag to stop the run and write something to standard
> > error when NaN appears. I didn't find one.
> >
> > Thanks,
> > David
> >
> > _______________________________________________
> > MITgcm-support mailing list
> > MITgcm-support at mitgcm.org
> > http://mitgcm.org/mailman/listinfo/mitgcm-support