[MITgcm-support] Restarting the MITgcm

Martin Losch mlosch at awi-bremerhaven.de
Tue Aug 10 04:37:45 EDT 2004


Uli,

there are two run-time parameters in namelist PARM03 of the file "data":
chkptFreq and pChkptFreq. The latter sets the frequency of permanent 
checkpoint/restart files (they already have the correct time step 
number in their file name), the former the frequency of the rolling 
checkpoint/restart files (with ckptA or ckptB in their names).

The rolling checkpoint files are overwritten alternately at each new 
checkpoint. So if your chkptFreq corresponds to 5h (18000 seconds), 
then after 5h pickup.ckptA.001.001.data/meta are written, after 10h 
pickup.ckptB.001.001.data/meta, after 15h ckptA is overwritten, 
after 20h ckptB is overwritten, etc. That means that ckptA is NOT 
always the latest checkpoint file, only half of the time (Dear code 
czars, I hope that this is correct).
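
To make the alternation concrete, here is a small Python sketch of the 
scheme as I described it above. It is only an illustration of that 
description, not code taken from the model; the function names are made 
up for this example.

```python
def rolling_checkpoint_suffix(n):
    """Suffix written at the n-th rolling checkpoint (n = 1, 2, ...).

    Assumes the alternation described in this mail: A, B, A, B, ...
    """
    if n < 1:
        raise ValueError("checkpoint count starts at 1")
    return "ckptA" if n % 2 == 1 else "ckptB"


def latest_rolling_checkpoint(elapsed_seconds, chkpt_freq_seconds):
    """Return (suffix, model time in seconds) of the newest rolling
    checkpoint, or None if no checkpoint has been written yet."""
    n = int(elapsed_seconds // chkpt_freq_seconds)
    if n == 0:
        return None
    return rolling_checkpoint_suffix(n), n * chkpt_freq_seconds
```

With chkptFreq equivalent to 5h, calling 
latest_rolling_checkpoint(15 * 3600, 5 * 3600) picks ckptA, while the 
same call for 10h or 20h picks ckptB.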

If you want to restart from, say, ckptA, you have to rename 
pickup.ckptA.001.001.data to pickup.0000004800.001.001.data (don't 
forget the *.data, and note that the number has ten digits) and set 
nIter0 or startTime appropriately, as you have done. If there are 
other pickup_*.ckptA.001.001.data files, they have to be renamed, too.
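
If there are many tiles or several pickup_* files, the renaming can be 
scripted. The following Python sketch is one way to do it; the function 
name is hypothetical, and it assumes that the accompanying *.meta files 
should get the same new name as the *.data files and that every file to 
be renamed starts with "pickup". It refuses to overwrite existing files, 
but try it on a copy of the run directory first.

```python
from pathlib import Path


def rename_rolling_pickups(run_dir, suffix, iteration):
    """Rename pickup*.<suffix>.* so that <suffix> (e.g. "ckptA")
    becomes the ten-digit, zero-padded iteration number."""
    tag = f"{iteration:010d}"
    renamed = []
    for old in sorted(Path(run_dir).glob(f"pickup*.{suffix}.*")):
        # Replace only the first ".<suffix>." in the file name.
        new_name = old.name.replace(f".{suffix}.", f".{tag}.", 1)
        new = old.with_name(new_name)
        if new.exists():
            raise FileExistsError(f"refusing to overwrite {new}")
        old.rename(new)
        renamed.append((old.name, new.name))
    return renamed
```

For the case in this thread, rename_rolling_pickups(".", "ckptA", 4800) 
turns pickup.ckptA.001.001.data into pickup.0000004800.001.001.data.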

For the future, however, I would recommend that if you know how long 
you can and want to run your experiment and from where you want to 
restart it, you set
chkptFreq=0.0, (to turn off rolling checkpoints)
pChkptFreq=whatever this time is in seconds (for permanent checkpoint 
files)
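
Since pChkptFreq is given in seconds while the file names carry a time 
step number, a small conversion helps. The Python sketch below assumes 
that the run length is a whole number of time steps of length deltaT 
and that the pickup written at the end of the run is labelled with 
nIter0 plus the number of steps taken; the function name and the 
example numbers are made up for illustration.

```python
def permanent_checkpoint_plan(run_hours, delta_t, n_iter0=0):
    """Convert a run length in hours into checkpoint settings.

    delta_t is the model time step in seconds; n_iter0 is the
    iteration number the run starts from.
    """
    run_seconds = run_hours * 3600.0
    n_steps = round(run_seconds / delta_t)
    if abs(n_steps * delta_t - run_seconds) > 1e-6:
        raise ValueError("run length is not a whole number of time steps")
    return {
        "chkptFreq": 0.0,           # rolling checkpoints switched off
        "pChkptFreq": run_seconds,  # one permanent checkpoint at the end
        "nTimeSteps": n_steps,
        # Assumed label of the pickup file written at the end of the run.
        "pickup_tag": f"{n_iter0 + n_steps:010d}",
    }
```

For example, a 20h run with a 1200 second time step gives 
pChkptFreq=72000. and 60 time steps.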

If you still have problems, have a look at the standard output, which 
will tell you which restart files are not found (just to make sure that 
this is not the problem: for the restart you'll also need all the data 
files that you used for the run that led to the restart file).

Hope this helps,

Martin


> Hello again,
>
> I have started taking the MIT model through some longer runs and have
> tried to restart them to save time, but so far unsuccessfully.
>
> The runs produce two types of pickup files: pickup.ckptA.data and
> pickup.ckptB.data. I figured out that the 'A' file is a collection of
> the data after the final time step and I am assuming the 'B' file is
> from some time step before that.
>
> I have tried renaming pickup.ckptA.data to pickup.000004800 and set
> nIter0 to 4800 to start the model at this point, but it aborts almost
> immediately.
>
> What do I need to do to get it going successfully?!
> Thanks,
> Uli
Martin Losch // mailto:mlosch at awi-bremerhaven.de
Alfred-Wegener-Institut für Polar- und Meeresforschung
Postfach 120161, 27515 Bremerhaven, Germany
Tel./Fax: ++49(471)4831-1872/1797
http://www.awi-bremerhaven.de/People/show?mlosch




