[MITgcm-support] model blew up but not terminated
Daiwei (David) Wang
daiwei at MIT.EDU
Mon Jun 27 10:09:30 EDT 2011
Hi,
A few of my jobs on beagle blew up but kept running. By "blew up" I
mean that the monitor statistics turned into NaN, for example:
$ grep dynstat_eta STDOUT.0000
(PID.TID 0000.0001) %MON dynstat_eta_max = 1.2697030002398E+00
(PID.TID 0000.0001) %MON dynstat_eta_min = -1.9409175554150E+00
(PID.TID 0000.0001) %MON dynstat_eta_mean = -1.0404669740582E-16
(PID.TID 0000.0001) %MON dynstat_eta_sd = 6.2681413921827E-01
(PID.TID 0000.0001) %MON dynstat_eta_del2 = 1.1009328164515E-04
(PID.TID 0000.0001) %MON dynstat_eta_max = 1.0775823955686E+00
(PID.TID 0000.0001) %MON dynstat_eta_min = -1.9467522542860E+00
(PID.TID 0000.0001) %MON dynstat_eta_mean = -2.9479897598316E-16
(PID.TID 0000.0001) %MON dynstat_eta_sd = 6.2739266263267E-01
(PID.TID 0000.0001) %MON dynstat_eta_del2 = 1.1075534674701E-04
(PID.TID 0000.0001) %MON dynstat_eta_max = NaN
(PID.TID 0000.0001) %MON dynstat_eta_min = NaN
(PID.TID 0000.0001) %MON dynstat_eta_mean = NaN
(PID.TID 0000.0001) %MON dynstat_eta_sd = NaN
(PID.TID 0000.0001) %MON dynstat_eta_del2 = NaN
(PID.TID 0000.0001) %MON dynstat_eta_max = NaN
(PID.TID 0000.0001) %MON dynstat_eta_min = NaN
(PID.TID 0000.0001) %MON dynstat_eta_mean = NaN
(PID.TID 0000.0001) %MON dynstat_eta_sd = NaN
(PID.TID 0000.0001) %MON dynstat_eta_del2 = NaN
But the job kept running, pointlessly of course, until endTime. Is
there a flag that stops the run and writes a message to standard error
as soon as NaN appears? I couldn't find one.
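As a stopgap outside the model, the check can be done on the monitor
output itself. The following is only a minimal sketch in Python, not an
MITgcm feature: the function names first_nan_stat and check_stdout are
made up for illustration, and it assumes each %MON name and value sit on
one line of STDOUT.0000, as in the NaN lines above.

```python
import re

# Matches monitor lines such as
#   (PID.TID 0000.0001) %MON dynstat_eta_max = NaN
# and captures the statistic name. Assumption: name and value share a line.
MON_NAN = re.compile(r"%MON\s+(\S+)\s*=\s*NaN\b", re.IGNORECASE)


def first_nan_stat(lines):
    """Return (line_number, stat_name) for the first NaN monitor value, else None."""
    for lineno, line in enumerate(lines, start=1):
        match = MON_NAN.search(line)
        if match:
            return lineno, match.group(1)
    return None


def check_stdout(path="STDOUT.0000"):
    """Scan a monitor output file (hypothetical helper); same return as above."""
    with open(path, errors="replace") as handle:
        return first_nan_stat(handle)
```

A job script could call check_stdout() every few minutes and cancel the
job when it returns something other than None; how the job is cancelled
depends on the batch system and is left out here.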
Thanks,
David