[MITgcm-devel] yesterday changes in global_ocean.cs32x15.viscA4

Jean-Michel Campin jmc at mit.edu
Tue May 10 07:43:08 EDT 2016


Hi Martin,

You are right: if the unzipped file is there (as was already the case with the 
other gzip file, pickup_nh.0000086400) and is not older (this is the new check), 
then it does not even try to unzip it.
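
For illustration, a minimal sketch of that kind of check. It assumes "not older" means "not older than the .gz file"; the function name `unzip_if_needed` is hypothetical and this is not the actual prepare_run code:

```shell
#!/bin/bash
# Hypothetical helper illustrating the check; NOT the actual prepare_run code.
# Usage: unzip_if_needed <name>   (expects <name>.gz in the current directory)
unzip_if_needed() {
    local f=$1
    if [ -f "$f" ] && ! [ "$f" -ot "$f.gz" ]; then
        # Unzipped copy exists and is not older than the archive: nothing to do.
        echo "skip: $f"
        return 0
    fi
    # Missing or stale copy: unzip to a temporary name first, so that a failed
    # gunzip (e.g. "command not found") does not leave an empty, newer file
    # behind that would be skipped on the next run.
    if gunzip -c "$f.gz" > "$f.tmp"; then
        mv "$f.tmp" "$f"
        echo "unzip: $f"
    else
        rm -f "$f.tmp"
        echo "failed: $f" >&2
        return 1
    fi
}
```

With such a check, a file that was gunzipped by hand is left alone for as long as the .gz file is not updated.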

A somewhat related question: do you have "gzip" on the compute node (testreport 
uses it after making the tar file of the output dir)?
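
One possible way to check, as a sketch only (the helper name `have_tool` is hypothetical and not part of testreport):

```shell
#!/bin/bash
# Hypothetical check, not part of testreport: report whether a tool is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in gzip gunzip; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```

To be meaningful it has to run on the compute node itself, i.e. be started the same way the model is started there, not on the front-end.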

Cheers,
Jean-Michel

On Tue, May 10, 2016 at 09:34:12AM +0200, Martin Losch wrote:
> Hi Jean-Michel,
> 
> last night's testreport seems to have worked. 
> I use one of your complicated schemes that checks whether files are there and up-to-date before downloading/updating/redoing the tr_run* directories, so I assume that I somehow gunzipped the file by hand, and as long as it does not change, it will not be removed and the problem is suppressed. But again, using "gunzip" on this machine is probably not the best of ideas. 
> 
> Martin
> 
> > On 09 May 2016, at 17:29, Martin Losch <Martin.Losch at awi.de> wrote:
> > 
> > Hi Jean-Michel,
> > 
> > here's the content of the directory and the error message:
> > stan1:tr_run.viscA4> ls -l pickup*
> > lrwxrwxrwx 1 mlosch CLIDYN      26 May 13  2015 pickup.0000072000 -> ../input/pickup.0000072000
> > lrwxrwxrwx 1 mlosch CLIDYN      31 May 13  2015 pickup.0000072000.meta -> ../input/pickup.0000072000.meta
> > lrwxrwxrwx 1 mlosch CLIDYN      17 May 13  2015 pickup.0000086400 -> pickup.0000072000
> > lrwxrwxrwx 1 mlosch CLIDYN      38 May 13  2015 pickup.0000086400.meta -> ../input.viscA4/pickup.0000086400.meta
> > -rw-r--r-- 1 mlosch CLIDYN 7520256 May  8 06:46 pickup.ckptA.data
> > -rw-r--r-- 1 mlosch CLIDYN     379 May  8 06:46 pickup.ckptA.meta
> > -rw-r--r-- 1 mlosch CLIDYN 1474560 Jul 16  2006 pickup_nh.0000086400
> > lrwxrwxrwx 1 mlosch CLIDYN      39 May 14  2015 pickup_nh.0000086400.gz -> ../input.viscA4/pickup_nh.0000086400.gz
> > -rw-r--r-- 1 mlosch CLIDYN  737280 May  8 20:37 pickup_ph.0000086400
> > stan1:tr_run.viscA4> cat STDERR.0000 | tail -5
> > (PID.TID 0000.0001) WARNING >> READ_PICKUP:  no field-list found
> > (PID.TID 0000.0001) WARNING >>  try to read pickup as it used to be written
> > (PID.TID 0000.0001) WARNING >>  until checkpoint59i (2007 Oct 22)
> > (PID.TID 0000.0001) *** ERROR ***  MDS_READ_FIELD: filename: pickup_ph.0000086400.data
> > (PID.TID 0000.0001) *** ERROR ***  MDS_READ_FIELD: File does not exist
> > 
> > So pickup_ph.0000086400 is there, but I also have this error message:
> > 
> > run_clean skipped!
> > linkdata from dirs: input.viscA4 input
> > ldir=input.viscA4: pickup_ph.0000086400.gz ;../input.viscA4/prepare_run: line 24: gunzip: command not found
> > unzip files: pickup_ph.0000086400 ;
> > ldir=input: eedata ; link files: from dir: ../../tutorial_held_suarez_cs/input
> > runmodel in global_ocean.cs32x15/tr_run.viscA4 ... failed (run: 1  end: 0 )
> > => output from running in global_ocean.cs32x15/tr_run.viscA4 :
> >> STOP ABNORMAL END: S/R MDS_READ_FIELD
> >> ./mitgcmuv(lang:f90): signal trap(SIGTERM: Software termination)
> >> stan-004: mpid: MPI process (universe = 0, rank = 1) terminated by exit(1)
> > 
> > I am not sure what's happening. Running the experiment alone with testreport, like this:
> > ./testreport -t global_ocean.cs32x15 -MPI 2 -of=../tools/build_options/SUPER-UX_SX-ACE_sxf90_awi -small_f -runonly -command "mpirun -np TR_NPROC ./mitgcmuv"
> > does not work (above error messages again). Then I ran it directly (mpirun -np 2 ./mitgcmuv), which worked, then ran it again with testreport, and it worked again.
> > I am not sure how to debug this. Definitely "gunzip" does not work on the compute nodes of stan; maybe that's the issue, because that's the only place where gunzip is used?
> > 
> > Martin
> > 
> > 
> > 
> >> On 09 May 2016, at 16:14, Jean-Michel Campin <jmc at mit.edu> wrote:
> >> 
> >> Hi Martin,
> >> 
> >> It seems that the changes I made yesterday (most likely in 
> >> verification/global_ocean.cs32x15/input.viscA4/prepare_run
> >> ) are causing problems for the test you run on stan1.
> >> The file pickup_ph.0000086400 is not found but should have been
> >> gunzipped when running "prepare_run".
> >> 
> >> When you have time, if you could take a look and tell me what is 
> >> wrong in "prepare_run", this would be nice.
> >> 
> >> Thanks,
> >> Jean-Michel
> >> 
> >> _______________________________________________
> >> MITgcm-devel mailing list
> >> MITgcm-devel at mitgcm.org
> >> http://mitgcm.org/mailman/listinfo/mitgcm-devel
> > 
> > 
> 
> 


