[MITgcm-support] running error!
Martin Losch
mlosch at awi-bremerhaven.de
Mon Aug 7 05:56:16 EDT 2006
Hi Van Thinh,
Since the model completed 46640 timesteps before it crashed, I can
think of two possibilities:
1. Some spontaneous problem with the hardware (unlikely with a
floating point exception). I sometimes have the model crash without
error messages, and when I rerun the same job it runs fine, so it's
worth retrying.
2. If the error is reproducible, then I assume that the model simply
"runs out of bounds", that is, something in the forcing or elsewhere
causes the model to explode or diverge. In that case you should be
able to see this happening in the monitor output (if monitorFreq is
small enough) or at least in the cg2d output that you get at every
timestep (if you did not set debugLevel < 0). These numbers (CFL
numbers or cg2d residuals) will probably diverge exponentially until
you get numbers that the compiled code cannot handle. I had this happen
after hundreds of years of integration (order 100000 timesteps) with
a coarse model configuration when the timestep was too large.
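To check for this kind of divergence, something like the following rough
sketch might help. It pulls one diagnostic out of the model's standard
output and estimates how fast it grows. The line format it assumes
("<key> = <number>", with advcfl_uvel_max as the default key) and both
function names are my own assumptions for illustration, so adjust the key
to whatever your output file actually contains.

```python
import math
import re


def extract_series(lines, key="advcfl_uvel_max"):
    """Collect the numbers from lines of the assumed form '<key> = <number>'.

    The default key is only a guess at a monitor diagnostic name;
    pass the name that actually appears in your output file.
    """
    pattern = re.compile(re.escape(key) + r"\s*=\s*(\S+)")
    values = []
    for line in lines:
        match = pattern.search(line)
        if match:
            try:
                values.append(float(match.group(1)))
            except ValueError:
                # Token after '=' is not a number (e.g. '****'); skip it.
                continue
    return values


def growth_rate(values, window=10):
    """Least-squares slope of log|value| over the last `window` samples.

    A clearly positive slope over many samples suggests roughly
    exponential growth. Returns None if fewer than three finite,
    nonzero samples are available.
    """
    tail = [abs(v) for v in values[-window:] if math.isfinite(v) and v != 0.0]
    if len(tail) < 3:
        return None
    logs = [math.log(v) for v in tail]
    n = len(logs)
    mean_x = (n - 1) / 2.0
    mean_y = sum(logs) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(logs))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

You would pass an open file (or a list of lines) to extract_series and
then call growth_rate on the result. A slope near zero means the quantity
is steady. A slope that stays positive right up to the crash points to
the "explode or diverge" case rather than a hardware problem.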
Hope that helps,
Martin
On Aug 3, 2006, at 8:53 PM, Van Thinh Nguyen wrote:
> Hi all,
>
> I have compiled & run MITgcm on a cluster (HP Linux XC 3.0). At
> time 23320 s (time step = 0.5 s), the program was terminated with
> this error:
>
> ------
> srun: error: req256: task[0-1]: Floating point exception (core dumped)
> srun: Terminating job
> ------
>
> Does anyone have any idea?
>
> Thanks a lot!
>
> Van Thinh