[MITgcm-support] Restarting the MITgcm

Martin Losch mlosch at awi-bremerhaven.de
Tue Aug 10 10:15:53 EDT 2004


Uli,
I don't have much experience with multiprocessor runs, but in the 
past I had to rename ALL pickup files in the case of multiple tiles. 
So, if you use 2 processors, then both pickup.ckptA.001.001.data and 
pickup.ckptA.002.001.data have to be renamed to 
pickup.0000004800.001.001.data and pickup.0000004800.002.001.data so 
that each processor/tile has its own restart file. Have a look at 
STDOUT.0003 to see which file MDSREADFIELD tried to open last.
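
With many tiles, doing the renaming by hand gets error-prone, so here is a minimal Python sketch of the idea. It is not part of MITgcm; the function names, the dry_run switch and the choice to rename every file matching pickup*.ckptA.* (including any .meta files) are my own assumptions, and the iteration number 4800 is just the one from this thread.

```python
import re
from pathlib import Path


def plan_pickup_renames(filenames, tag="ckptA", iteration=4800):
    """Map rolling-checkpoint pickup names to iteration-numbered names.

    'pickup.ckptA.001.001.data' -> 'pickup.0000004800.001.001.data'
    Files that do not look like pickup*.<tag>.* are left out of the plan.
    """
    # Prefix such as 'pickup' or 'pickup_xx', then the tag, then the rest
    # (tile indices and the data/meta suffix).
    pattern = re.compile(r"^(pickup[^.]*)\." + re.escape(tag) + r"\.(.+)$")
    stamp = f"{iteration:010d}"  # ten-digit, zero-padded iteration number
    plan = {}
    for name in filenames:
        match = pattern.match(name)
        if match:
            plan[name] = f"{match.group(1)}.{stamp}.{match.group(2)}"
    return plan


def rename_pickups(run_dir=".", tag="ckptA", iteration=4800, dry_run=True):
    """Print the planned renames in run_dir; only rename if dry_run is False."""
    run_dir = Path(run_dir)
    names = sorted(p.name for p in run_dir.iterdir() if p.is_file())
    plan = plan_pickup_renames(names, tag, iteration)
    for old, new in plan.items():
        print(f"{old} -> {new}")
        if not dry_run:
            (run_dir / old).rename(run_dir / new)
    return plan
```

Calling rename_pickups("path/to/run") first only prints what would be renamed; pass dry_run=False once the list looks right.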

Martin

On Aug 10, 2004, at 4:06 PM, Uli Riemenschneider wrote:

> Hi Martin,
>
> thanks for the reply. I followed your instructions, but somehow it is 
> still not working. In file STDERR.0003 I find the following error 
> message:
>
> (PID.TID 0003.0001) *** ERROR *** MDSREADFIELD: File does not exist
>
> I am not even sure what file it is referring to here. Any idea?
> I am running the model on multiple processors, does that make a 
> difference to the restart procedure?
>
> Thanks for the help!
> Ciao
> Uli
>
> Martin Losch wrote:
>> Uli,
>> there are two run-time flags in data namelist PARM03:
>> chkptFreq and pChkptFreq, the latter sets the frequency for permanent 
>> checkpoint/restart files (they have the correct time step number 
>> already in their file name), the former the frequency for the rolling 
>> checkpoint/restart files (with ckptA,ckptB in their names).
>> The rolling checkpoint files are overwritten alternately at each new 
>> checkpoint. So if your chkptFreq = 5h, then after 5h 
>> pickup.ckptA.001.001.data/meta are written, after 10h 
>> pickup.ckptB.001.001.data/meta, and after 15h ckptA is overwritten, 
>> then after 20h ckptB is overwritten, etc. That means that ckptA is 
>> NOT always the latest checkpoint file, only half of the time (Dear 
>> code czars, I hope that this is correct).
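
The alternation described above can be written down as a small Python sketch. It only mirrors the description in this message (A after the 1st, 3rd, ... rolling checkpoint, B after the 2nd, 4th, ...); it is not taken from the MITgcm source, the function names are made up, and it assumes the run started at model time 0.

```python
def rolling_checkpoint_tag(model_time, chkpt_freq):
    """Tag of the rolling checkpoint written exactly at model_time, or None.

    Both arguments are in the same unit (e.g. seconds). Assumes the run
    started at time 0 and that the first rolling checkpoint goes to ckptA.
    """
    if chkpt_freq <= 0 or model_time <= 0:
        return None  # rolling checkpoints switched off, or nothing written yet
    count, remainder = divmod(model_time, chkpt_freq)
    if remainder != 0:
        return None  # no rolling checkpoint is due at this time
    return "ckptA" if int(count) % 2 == 1 else "ckptB"


def latest_rolling_tag(model_time, chkpt_freq):
    """Tag of the most recent rolling checkpoint at or before model_time."""
    if chkpt_freq <= 0:
        return None
    count = int(model_time // chkpt_freq)
    if count == 0:
        return None
    return "ckptA" if count % 2 == 1 else "ckptB"


hour = 3600
for t in (5, 10, 15, 20):
    print(f"{t:2d}h: {rolling_checkpoint_tag(t * hour, 5 * hour)}")
```

With chkptFreq = 5h, a run that stops after 12h would have its latest state in ckptB, not ckptA, which is why it is worth checking before renaming.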
>> If you want to restart from, say, ckptA, you have to rename 
>> pickup.ckptA.001.001.data to pickup.0000004800.001.001.data (don't 
>> forget the *.data) and set nIter0 or startTime appropriately, as you 
>> have done. In case there are some other pickup_*.ckptA.001.001.data 
>> files, they have to be renamed, too.
>> For the future, however, I would recommend that if you know how 
>> long you can and want to run your experiment and from where you want 
>> to restart it, you set
>> chkptFreq=0.0, (to turn off rolling checkpoints)
>> pChkptFreq=whatever this time is in seconds (for permanent checkpoint 
>> files)
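
To work out "whatever this time is in seconds", the arithmetic is just the number of time steps times the time step. A minimal Python sketch follows; the helper names are hypothetical, and the deltaT of 3600 s is an assumed example value, not taken from the run discussed here (only the 4800 iterations are).

```python
def permanent_checkpoint_interval(n_steps, delta_t):
    """Model time in seconds covered by n_steps time steps of length delta_t."""
    if n_steps <= 0 or delta_t <= 0:
        raise ValueError("n_steps and delta_t must be positive")
    return float(n_steps * delta_t)


def parm03_checkpoint_lines(n_steps, delta_t):
    """Format the two checkpoint settings as namelist-style lines."""
    p_chkpt_freq = permanent_checkpoint_interval(n_steps, delta_t)
    return [
        "chkptFreq=0.0,",  # turn off rolling checkpoints
        f"pChkptFreq={p_chkpt_freq:.1f},",  # permanent checkpoint interval
    ]


# Assumed example: a checkpoint every 4800 steps with deltaT = 3600 s.
for line in parm03_checkpoint_lines(4800, 3600.0):
    print(line)
```

The permanent pickup written at that point would then carry the ten-digit iteration number (here 0000004800) in its file name, so no renaming is needed before the restart.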
>> If you still have problems, have a look at the standard output; 
>> it will tell you which restart files are not found (just to make 
>> sure that this is not the problem: for the restart you'll also need 
>> all data files that you used for the run that led to the restart 
>> file).
>> Hope this helps,
>> Martin
>>> Hello again,
>>>
>>> I have started taking the MIT model through some longer runs and have
>>> tried to restart them to save time, but so far unsuccessfully.
>>>
>>> The runs produce two types of pickup files: pickup.ckptA.data and
>>> pickup.ckptB.data. I figured out that the 'A' file is a collection of
>>> the data after the final time step and I am assuming the 'B' file is
>>> from some time step before that.
>>>
>>> I have tried renaming pickup.ckptA.data to pickup.000004800 and set
>>> nIter0 to 4800 to start the model at this point, but it aborts almost
>>> immediately.
>>>
>>> What do I need to do to get it going successfully?!
>>> Thanks,
>>> Uli
>> Martin Losch // mailto:mlosch at awi-bremerhaven.de
>> Alfred-Wegener-Institut für Polar- und Meeresforschung
>> Postfach 120161, 27515 Bremerhaven, Germany
>> Tel./Fax: ++49(471)4831-1872/1797
>> http://www.awi-bremerhaven.de/People/show?mlosch
>> _______________________________________________
>> MITgcm-support mailing list
>> MITgcm-support at mitgcm.org
>> http://dev.mitgcm.org/mailman/listinfo/mitgcm-support
>
>
> -- 
> *********************************************************
> Ulrike Riemenschneider,  Postdoctoral Investigator
> Physical Oceanography Dept. MS #21
> Woods Hole Oceanographic Institution
> Woods Hole, MA 02543, USA
>
> Phone: (+1) 508 289 2916 Fax: (+1) 508 457 2181
> *********************************************************
>




