[MITgcm-support] model blew up but not terminated

Daiwei (David) Wang daiwei at MIT.EDU
Mon Jun 27 10:09:30 EDT 2011


Hi,

I found that a few of my jobs on beagle blew up but kept running. By 
"blew up" I mean that the monitor statistics became NaN, for 
example:

$ grep dynstat_eta STDOUT.0000
(PID.TID 0000.0001) %MON dynstat_eta_max              =   1.2697030002398E+00
(PID.TID 0000.0001) %MON dynstat_eta_min              =  -1.9409175554150E+00
(PID.TID 0000.0001) %MON dynstat_eta_mean             =  -1.0404669740582E-16
(PID.TID 0000.0001) %MON dynstat_eta_sd               =   6.2681413921827E-01
(PID.TID 0000.0001) %MON dynstat_eta_del2             =   1.1009328164515E-04
(PID.TID 0000.0001) %MON dynstat_eta_max              =   1.0775823955686E+00
(PID.TID 0000.0001) %MON dynstat_eta_min              =  -1.9467522542860E+00
(PID.TID 0000.0001) %MON dynstat_eta_mean             =  -2.9479897598316E-16
(PID.TID 0000.0001) %MON dynstat_eta_sd               =   6.2739266263267E-01
(PID.TID 0000.0001) %MON dynstat_eta_del2             =   1.1075534674701E-04
(PID.TID 0000.0001) %MON dynstat_eta_max              =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_min              =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_mean             =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_sd               =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_del2             =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_max              =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_min              =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_mean             =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_sd               =  NaN
(PID.TID 0000.0001) %MON dynstat_eta_del2             =  NaN

But the job kept running, in vain of course, until endTime. Is there a 
flag that stops the run and writes a message to standard error when 
NaN appears? I didn't find one.
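
In the meantime, an external check on the output file might serve as a 
workaround. Below is a minimal sketch, assuming the monitor output is 
written to STDOUT.0000 as in the grep example above. The function name 
has_nan_monitor is hypothetical and not part of MITgcm, and how to 
actually kill the job depends on the scheduler, so that part is left out.

```shell
#!/bin/sh
# Hypothetical helper, not part of MITgcm: returns success (exit status 0)
# if any monitor line of the form "%MON <name> = NaN" appears in the file.
has_nan_monitor() {
    grep -q '%MON.*=[[:space:]]*NaN' "$1"
}

# Example use: write a message to standard error when the check fires.
# STDOUT.0000 is the file name used in the grep example above.
if [ -f STDOUT.0000 ] && has_nan_monitor STDOUT.0000; then
    echo "NaN found in monitor statistics of STDOUT.0000" >&2
fi
```

Run periodically from the job script (or from cron), a check like this 
could be followed by whatever command cancels the job on the machine in 
use.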

Thanks,
David


