[MITgcm-support] old pickups, new model version -- fails on first timestep with STOP in CALC_R_STAR
Rose, Brian
brose at albany.edu
Thu Nov 6 14:16:08 EST 2014
Thanks Jody.
I have other evidence that a lot of my old files [rescued from an old cluster with filesystem problems] got corrupted.
Fortunately I discovered pristine copies of my old model output and pickups that I had stored elsewhere.
The model is now restarting and running normally.
Thanks again JMC for pointing me in the right direction!
- Brian
On Nov 6, 2014, at 11:32 AM, Jody Klymak <jklymak at uvic.ca> wrote:
Hi Brian,
On Nov 6, 2014, at 8:01 AM, Rose, Brian <brose at albany.edu> wrote:
You are right, this particular pickup has some very wacky values in it.
It looks like a lot of my old files from beagle got corrupted at some point.
Are you sure it's not just different endianness? In which case it's just a compiler option (http://en.wikipedia.org/wiki/Endianness)
try:
rdmds('pickup',424224000,'ieee-be')
rdmds('pickup',424224000,'ieee-le')
and see if one of them makes any sense.
Cheers, Jody
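For readers without MATLAB, the same test can be sketched in Python. This is only a minimal sketch and not part of the MITgcm utilities: the function names and the plausibility window (a value counts as plausible if it is zero or has a magnitude between 1e-12 and 1e6) are assumptions chosen for illustration. It decodes the same bytes as big-endian and as little-endian float64 and reports which reading looks more like a physical field.

```python
import math
import struct


def decode(raw, byteorder):
    """Decode raw bytes as float64; byteorder is '>' (big) or '<' (little)."""
    n = len(raw) // 8
    return struct.unpack(f"{byteorder}{n}d", raw[: n * 8])


def plausible_fraction(values, lo=1.0e-12, hi=1.0e6):
    """Share of values that are zero or have a magnitude between lo and hi."""
    if not values:
        return 0.0
    ok = sum(
        1 for v in values
        if math.isfinite(v) and (v == 0.0 or lo <= abs(v) <= hi)
    )
    return ok / len(values)


def guess_byteorder(raw):
    """Return the better-scoring byte order and the score of each."""
    scores = {bo: plausible_fraction(decode(raw, bo)) for bo in (">", "<")}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Synthetic stand-in for real pickup bytes (assumption: written big-endian).
    sample = [1.0 + 0.37 * i for i in range(50)]
    raw = struct.pack(">50d", *sample)
    print(guess_byteorder(raw))
```

If both readings score poorly, byte order is unlikely to be the problem and a damaged file becomes the more likely explanation.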
On Nov 5, 2014, at 3:00 PM, Jean-Michel Campin <jmc at ocean.mit.edu> wrote:
Hi Brian,
Looking closely at your error message, the few values of eta that are printed
are far too low (some as low as -8.601127+273). Also, the error comes from the first
call to CALC_R_STAR (called with argument myIter = -1), and at this stage
etaH has not been changed yet: it is exactly what has been read in from
the pickup file. Such low values are suspicious, since none of the fields
in the pickup file is supposed to reach such low values.
Could you check the content of the pickup files, especially the EtaH field
(i.e. the last one)? If you are using MATLAB:
[var,iters,M]=rdmds('pickup',424224000);
etaH=var(:,:,end);
and if you want to check the other fields:
nc=24; nr=15;
all3Dvar=reshape(var(:,:,1:nr*12),[nc*6 nc nr 12]);
all2Dvar=var(:,:,nr*12+1:end);
Cheers,
Jean-Michel
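The same inspection can be sketched without MATLAB. The following is a minimal Python sketch under stated assumptions, not a replacement for rdmds: it reads one 2-D float64 record from a single data file, assumes big-endian byte order by default, and does not assemble tiles into a global field. The names read_record and value_range are hypothetical helpers.

```python
import struct


def read_record(path, nx, ny, rec, byteorder=">"):
    """Return record `rec` (0-based) of a float64 file as ny rows of nx values."""
    count = nx * ny
    with open(path, "rb") as f:
        f.seek(rec * count * 8)
        raw = f.read(count * 8)
    if len(raw) != count * 8:
        raise ValueError(f"record {rec} is missing or truncated")
    flat = struct.unpack(f"{byteorder}{count}d", raw)
    # The first (x) index varies fastest in the file.
    return [list(flat[j * nx:(j + 1) * nx]) for j in range(ny)]


def value_range(field):
    """Smallest and largest value of a 2-D field given as a list of rows."""
    values = [v for row in field for v in row]
    return min(values), max(values)


if __name__ == "__main__":
    import os
    import tempfile

    # Demo on a synthetic 3-record file of 4 x 2 values, not a real pickup.
    nx, ny, nrec = 4, 2, 3
    data = [float(k) for k in range(nx * ny * nrec)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.data")
        with open(path, "wb") as f:
            f.write(struct.pack(f">{len(data)}d", *data))
        last = read_record(path, nx, ny, nrec - 1)
        print(value_range(last))
```

For the tile described by the .meta file quoted further down in this thread (12 x 24 points, 183 records), EtaH would be record 182, e.g. read_record('pickup.0424224000.001.001.data', 12, 24, 182). The .data file name is an assumption inferred from the .meta name.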
On Wed, Nov 05, 2014 at 03:48:45PM +0000, Rose, Brian wrote:
Sorry, I just realized that the checkpoint is clearly labelled in the STDOUT files that I have from the old runs:
(PID.TID 0000.0001) // MITgcmUV version: checkpoint61t
(PID.TID 0000.0001) // Build user: brose
(PID.TID 0000.0001) // Build host: compute-1-24.local
(PID.TID 0000.0001) // Build date: Thu Sep 30 16:17:04 EDT 2010
Bear with me! I'm a little rusty on my MITgcm skills!
Thanks
Brian
On Nov 5, 2014, at 10:42 AM, Rose, Brian <brose at albany.edu> wrote:
Hi Jean-Michel!
To answer your questions:
1) Yes, it's the ocean. The error appears in STDERR files for several (not all) of the ocean ranks.
2) Good question. The checkpoint information might be lost forever, as everything was on beagle.darwinproject.mit.edu <http://beagle.darwinproject.mit.edu/>, which has been taken offline. Looking back at my notes, I first started setting up and running the coupled model on beagle around August 2009.
3) The content of one pickup*.meta file and the data file (both for the ocean component) are appended below.
Thanks for any suggestions!
- Brian
[br546577@headnode Ocn]$ cat pickup.0424224000.001.001.meta
nDims = [ 2 ];
dimList = [
144, 1, 12,
24, 1, 24
];
dataprec = [ 'float64' ];
nrecords = [ 183 ];
timeStepNumber = [ 424224000 ];
/* modelTime = [ 1.527206400000E+12 ];*/
nFlds = [ 15 ];
fldList = {
'Uvel ' 'Vvel ' 'Theta ' 'Salt ' 'GuNm1 ' 'GuNm2 ' 'GvNm1 ' 'GvNm2 ' 'TempNm1 ' 'TempNm2 ' 'SaltNm1 ' 'SaltNm2 ' 'EtaN ' 'dEtaHdt ' 'EtaH '
};
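A quick way to test a pickup for truncation is to compare the size of the data file with what this header implies. The sketch below is a minimal example under stated assumptions: parse_meta and expected_bytes are hypothetical helpers, any precision other than 'float64' is treated as 4 bytes per value, and each dimList row is read as global size, first index, last index.

```python
import re

# Abbreviated copy of the header quoted above (fldList omitted).
META = """
nDims = [ 2 ];
dimList = [
144, 1, 12,
24, 1, 24
];
dataprec = [ 'float64' ];
nrecords = [ 183 ];
"""


def parse_meta(text):
    """Extract tile shape, record count and bytes per value from .meta text."""
    dims = re.search(r"dimList\s*=\s*\[(.*?)\]", text, re.S).group(1)
    nums = [int(n) for n in re.findall(r"\d+", dims)]
    # Each dimension is listed as: global size, first index, last index.
    shape = [last - first + 1 for _, first, last in zip(*[iter(nums)] * 3)]
    nrecords = int(re.search(r"nrecords\s*=\s*\[\s*(\d+)", text).group(1))
    prec = re.search(r"dataprec\s*=\s*\[\s*'(\w+)'", text).group(1)
    # Assumption: anything that is not float64 is stored as 4-byte values.
    return shape, nrecords, 8 if prec == "float64" else 4


def expected_bytes(text):
    """Size in bytes that the matching data file should have."""
    shape, nrecords, width = parse_meta(text)
    n = 1
    for s in shape:
        n *= s
    return n * nrecords * width


if __name__ == "__main__":
    print(parse_meta(META), expected_bytes(META))
```

The result can be compared with os.path.getsize() of the matching data file. The record count is also consistent with the field list: 12 three-dimensional fields on 15 levels plus 3 two-dimensional fields gives 12 x 15 + 3 = 183 records. A correct size does not rule out corrupted values, so this check complements reading the fields rather than replacing it.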
[br546577@headnode rank_1]$ cat data
# ====================
# | Model parameters |
# ====================
#
# Continuous equation parameters
&PARM01
tRef=16.5, 14.0, 13.5, 13.0, 12.0,
10.0, 6.7, 4.0, 2.2, 1.0,
0.2, -0.3, -0.7, -1.1, -1.4,
sRef=15*35.,
viscAh =3.E5,
viscAr =1.E-3,
bottomDragLinear=1.E-3,
diffKhT=0.,
diffK4T=0.,
diffKrT=3.E-5,
diffKhS=0.,
diffK4S=0.,
diffKrS=3.E-5,
gravity=9.81,
rhonil=1030.,
rhoConstFresh=1000.,
eosType='JMD95Z',
#allowFreezing=.TRUE.,
ivdc_kappa=10.,
implicitDiffusion=.TRUE.,
implicitFreeSurface=.TRUE.,
exactConserv=.TRUE.,
select_rStar=2,
nonlinFreeSurf=4,
hFacInf=0.2,
hFacSup=2.0,
useRealFreshWaterFlux=.TRUE.,
convertFW2Salt=34.,
temp_EvPrRn=0.,
hFacMin=.1,
hFacMinDr=20.,
vectorInvariantMomentum=.TRUE.,
staggerTimeStep=.TRUE.,
readBinaryPrec=64,
writeBinaryPrec=64,
#debugLevel=0,
#tempAdvScheme=20,
#saltAdvScheme=20,
&
# Elliptic solver parameters
&PARM02
cg2dMaxIters=200,
cg2dTargetResidual=1.E-9,
#cg2dTargetResWunit=1.E-14,
&
# Time stepping parameters
&PARM03
nIter0=424224000,
nTimeSteps=172800,
pChkptFreq=622080000.,
taveFreq=0.,
dumpFreq=622080000.,
monitorFreq=2592000.,
deltaTmom=3600.,
deltaTtracer=3600.,
deltaTClock =3600.,
#abEps=0.12,
#alph_AB=0.62,
#beta_AB=0.,
alph_AB=0.5,
beta_AB=0.281105,
doAB_onGtGs=.FALSE.,
forcing_In_AB=.FALSE.,
momDissip_In_AB=.FALSE.,
periodicExternalForcing=.TRUE.,
externForcingPeriod=2592000.,
externForcingCycle=31104000.,
pickupStrictlyMatch=.TRUE.,
&
# Gridding parameters
&PARM04
usingCurvilinearGrid=.TRUE.,
#horizGridFile='dxC1_dXYa',
#delR= 50., 70., 100., 140., 190.,
# 240., 290., 340., 390., 440.,
# 490., 540., 590., 640., 690.,
delR= 30., 40., 60., 80., 110.,
140., 160., 200., 220., 260.,
280., 310., 340., 370., 400.,
&
# Input datasets
&PARM05
bathyFile ='Ridge.c24.Bathy.3p0km.bin',
hydrogThetaFile='Rg3p0_Cpl289T_C24_z15.bin',
hydrogSaltFile ='Rg3p0_Cpl289S_C24_z15.bin',
&
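The time-stepping entries in PARM03 can be cross-checked against the pickup header. The sketch below assumes that model time is simply the iteration number times deltaTClock, which is consistent with the modelTime value in the quoted .meta file. The constant and function names are illustrative only.

```python
# Values copied from PARM03 above and from the quoted .meta file.
N_ITER0 = 424224000
N_TIME_STEPS = 172800
DELTA_T_CLOCK = 3600              # seconds
EXTERN_FORCING_CYCLE = 31104000   # seconds, i.e. 360 days of 86400 s
P_CHKPT_FREQ = 622080000          # seconds
META_MODEL_TIME = 1527206400000   # 1.527206400000E+12 s in the .meta file


def start_time(n_iter0, dt):
    """Model time in seconds at the restart iteration (assumed n_iter0 * dt)."""
    return n_iter0 * dt


def run_length(n_steps, dt):
    """Length of the requested run in seconds."""
    return n_steps * dt


if __name__ == "__main__":
    t0 = start_time(N_ITER0, DELTA_T_CLOCK)
    length = run_length(N_TIME_STEPS, DELTA_T_CLOCK)
    print("restart time matches .meta:", t0 == META_MODEL_TIME)
    print("forcing cycles completed at restart:", t0 // EXTERN_FORCING_CYCLE)
    print("forcing cycles in this run:", length // EXTERN_FORCING_CYCLE)
    print("one permanent checkpoint per run:", length == P_CHKPT_FREQ)
```

With these numbers the restart falls on a whole number of forcing cycles, and the run length equals both pChkptFreq and dumpFreq, so the time-stepping parameters and the pickup header agree with each other.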
On Nov 5, 2014, at 8:50 AM, Jean-Michel Campin <jmc at ocean.mit.edu> wrote:
Hi Brian,
The way MITgcm is able to restart, even after some code changes affecting
the content of pickup files, is by checking the content of all pickup*.meta files.
Now in your case:
1) Can you confirm that it's the ocean component that is stopping?
2) How old was your original code (date? checkpoint?)
3) What is the content of one of these oceanic pickup*.meta files,
and of your main parameter file "data"?
Since it's stopping even before starting the first iteration (i.e.,
during initialisation), it might not be too difficult to find what is
wrong.
Cheers,
Jean-Michel
On Wed, Nov 05, 2014 at 02:22:14AM +0000, Rose, Brian wrote:
Hi MITgcmers,
I'm trying to revive some calculations from several years ago. The model setup is coupled ocean + AIM atmosphere + thsice on the C24 grid, as used in a number of papers by Ferreira, Rose, Marshall, etc.
I have all pickup and configuration files from these old runs, but unfortunately no longer have access to the machine I used to run them on.
So, I have ported an up-to-date version of MITgcm to our new cluster here at U. Albany. I successfully built and ran the test case
MITgcm/verification/cpl_aim+ocn
But when I try to set up the C24 coupled model as close to my previous runs as possible and initialize with my previous pickup files, the ocean model crashes on the first time step with errors like this:
fail at i,j= 1 25 ; rStarFacC,H,eta = -3.398999 3.000000E+03 -1.319700E+04
fail at i,j= 1 25 ; rStarFacS,H,eta = -1.730615 3.000000E+03 -6.998053E+00 -1.319700E+04
fail at i,j= 3 25 ; rStarFacC,H,eta =********** 3.000000E+03 -8.601127+273
fail at i,j= 3 25 ; rStarFacS,H,eta =********** 3.000000E+03 -7.290877E+00 -8.601127+273
fail at i,j= 5 25 ; rStarFacC,H,eta =********** 3.000000E+03 -4.028258E+08
fail at i,j= 5 25 ; rStarFacS,H,eta =********** 3.000000E+03 -8.277906E+00 -4.028258E+08
fail at i,j= 6 25 ; rStarFacC,H,eta =********** 3.000000E+03 -3.952771+291
fail at i,j= 6 25 ; rStarFacS,H,eta =********** 3.000000E+03 -9.155281E+00 -3.952771+291
fail at i,j= 8 25 ; rStarFacC,H,eta =********** 3.000000E+03 -5.011502E+48
fail at i,j= 8 25 ; rStarFacS,H,eta =********** 3.000000E+03 -1.126187E+01 -5.011502E+48
fail at i,j= 11 25 ; rStarFacC,H,eta =********** 3.000000E+03 -8.416418+231
fail at i,j= 11 25 ; rStarFacS,H,eta =********** 3.000000E+03 -1.384315E+01 -8.416418+231
fail at i,j= 13 25 ; rStarFacC,H,eta =********** 3.000000E+03 -2.545287E+63
WARNING: r*FacC < hFacInf at 7 pts : bi,bj,Thid,Iter= 1 1 1 -1
WARNING: r*FacS < hFacInf at 6 pts : bi,bj,Thid,Iter= 1 1 1 -1
STOP in CALC_R_STAR : too SMALL rStarFac[C,W,S] !
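The numbers in this log can be read as follows. A value printed as -8.601127+273 is Fortran output for -8.601127E+273 (the E is dropped when the exponent needs three digits), and ********** marks a number too wide for its output field. The first line is reproduced if the factor is taken to be (H + eta)/H, which is an assumption about what CALC_R_STAR computes, checked here only against that one line. The sketch below applies that assumed formula together with the hFacInf=0.2 limit from the "data" file in this thread. The function names are hypothetical.

```python
H_FAC_INF = 0.2  # hFacInf from the "data" file in this thread


def r_star_fac_c(depth, eta):
    """Assumed form of the factor: (column depth + eta) / column depth."""
    return (depth + eta) / depth


def too_small(depth, eta, h_fac_inf=H_FAC_INF):
    """True when the factor falls below the lower limit hFacInf."""
    return r_star_fac_c(depth, eta) < h_fac_inf


if __name__ == "__main__":
    # First "fail" line of the log: H = 3.000000E+03, eta = -1.319700E+04.
    print(r_star_fac_c(3000.0, -1.3197e4), too_small(3000.0, -1.3197e4))
    # A surface displacement of about a metre keeps the factor close to 1.
    print(r_star_fac_c(3000.0, -1.0), too_small(3000.0, -1.0))
```

Under this reading, with H = 3000 m the factor only drops below 0.2 when eta is lower than -2400 m, so the values in the log point to bad input data rather than to a marginal numerical problem.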
I'm not sure where to begin in debugging this.
I found a few references to similar-sounding problems in the support archives, but no clear answers. Someone suggested that this problem could arise from heavy sea ice loading. That is certainly the case here. The pickup files are from a very cold simulation with some tens of meters of sea ice. However this never caused problems before.
Any hints or suggestions?
Thanks,
Brian
|--------------------------------------------------------------------------------------------------------------------------------------|
Brian E. J. Rose
Assistant Professor
Atmospheric & Environmental Sciences
University at Albany
Earth Sciences 315
(518) 442-4477
http://www.atmos.albany.edu/facstaff/brose/
|--------------------------------------------------------------------------------------------------------------------------------------|
_______________________________________________
MITgcm-support mailing list
MITgcm-support at mitgcm.org
http://mitgcm.org/mailman/listinfo/mitgcm-support