[MITgcm-support] model blew up but not terminated

Matthew Mazloff mmazloff at ucsd.edu
Mon Jun 27 14:39:00 EDT 2011


Hi David,

You can put something like:

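C       Abort when the cg2d right-hand side becomes unphysically large.
C       Dump the state first, tagged with iteration number 999999.
        IF ( rhsMax .GE.  100000000. _d 0) THEN
          CALL WRITE_STATE( endTime, 999999, myThid )
          STOP 'CMM => ABNORMAL END: THE MODEL IS BLOWING UP!'
        ENDIF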


right after

        WRITE(standardmessageunit,'(A,1P2E22.14)')
     &  ' cg2d: Sum(rhs),rhsMax = ', sumRHS,rhsMax


in cg2d.F

This check usually trips only on genuine model blow-ups.
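
If recompiling is not convenient, the monitor output can also be checked from outside the model. The following is only a sketch, assuming the monitor lines look like the `%MON ... = NaN` lines in the STDOUT.0000 excerpt quoted below, and that the compiler prints NaN as "NaN" in some capitalisation. The helper name first_nan_line is made up for this example and is not part of MITgcm.

```shell
#!/bin/sh
# Sketch only: find the first NaN in MITgcm monitor output.
# first_nan_line is a hypothetical helper, not an MITgcm tool.
#
# Usage: first_nan_line FILE
# Prints "LINENO: LINE" for the first %MON line whose value is NaN.
# Returns 0 if such a line exists, 1 otherwise.
first_nan_line() {
    awk '
        # Compare in lower case so "NaN", "nan" and "NAN" all match.
        tolower($0) ~ /%mon .*= *nan/ {
            print NR ": " $0
            found = 1
            exit            # stop reading; the END block still runs
        }
        END { exit !found } # exit status 0 only if a NaN line was seen
    ' "$1"
}
```

A job script could call `first_nan_line STDOUT.0000` every so often while the model runs and cancel the job when it returns 0. How to cancel depends on the batch system, so that part is left out here.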

-Matt



On Jun 27, 2011, at 7:09 AM, Daiwei (David) Wang wrote:

> Hi,
>
> I found that a few of my jobs on beagle blew up but kept running. By
> blow-up, I mean that the monitor statistics became populated with NaN,
> for example:
>
> $ grep dynstat_eta STDOUT.0000
> (PID.TID 0000.0001) %MON dynstat_eta_max              =   1.2697030002398E+00
> (PID.TID 0000.0001) %MON dynstat_eta_min              =  -1.9409175554150E+00
> (PID.TID 0000.0001) %MON dynstat_eta_mean             =  -1.0404669740582E-16
> (PID.TID 0000.0001) %MON dynstat_eta_sd               =   6.2681413921827E-01
> (PID.TID 0000.0001) %MON dynstat_eta_del2             =   1.1009328164515E-04
> (PID.TID 0000.0001) %MON dynstat_eta_max              =   1.0775823955686E+00
> (PID.TID 0000.0001) %MON dynstat_eta_min              =  -1.9467522542860E+00
> (PID.TID 0000.0001) %MON dynstat_eta_mean             =  -2.9479897598316E-16
> (PID.TID 0000.0001) %MON dynstat_eta_sd               =   6.2739266263267E-01
> (PID.TID 0000.0001) %MON dynstat_eta_del2             =   1.1075534674701E-04
> (PID.TID 0000.0001) %MON dynstat_eta_max              =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_min              =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_mean             =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_sd               =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_del2             =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_max              =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_min              =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_mean             =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_sd               =  NaN
> (PID.TID 0000.0001) %MON dynstat_eta_del2             =  NaN
>
> But the job kept running, in vain of course, until endTime. I wonder if
> there is a flag to stop the run and write something to standard error
> when NaN appears. I didn't find one.
>
> Thanks,
> David



