[MITgcm-support] Running a test in parallel

Klymak Jody jklymak at uvic.ca
Fri Feb 3 15:40:35 EST 2017


Sorry, my fault.  If a record can't be read from a file, it usually means the file is the wrong size.  Does nx times ny in the file equal what's in SIZE.h?
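
A minimal sketch of that size check in Python. It assumes the bathymetry is a single 2-D field stored as 4-byte reals (worth confirming against the precision set in the `data` namelist), and that the global grid is 90 x 40, as the experiment name suggests (in SIZE.h, Nx = sNx*nSx*nPx and Ny = sNy*nSy*nPy). The helper names are made up for illustration.

```python
import os


def expected_bytes(nx, ny, bytes_per_value=4):
    """Bytes in one 2-D record of nx*ny values (assumes 4-byte reals by default)."""
    return nx * ny * bytes_per_value


def check_file(path, nx, ny, bytes_per_value=4):
    """Compare the on-disk size of `path` with the size implied by the grid.

    Returns a tuple (actual_bytes, expected_bytes, sizes_match).
    """
    actual = os.path.getsize(path)
    expected = expected_bytes(nx, ny, bytes_per_value)
    return actual, expected, actual == expected


if __name__ == "__main__":
    # Assumed values: global grid 90 x 40 and a file in the current directory.
    path = "bathymetry.bin"
    if os.path.exists(path):
        actual, expected, ok = check_file(path, nx=90, ny=40)
        print(f"{path}: {actual} bytes on disk, {expected} expected, match={ok}")
    else:
        print(f"{path} not found (is it a dangling symlink?)")
```

If the sizes differ, the file on disk does not match the grid the executable was built for. If the file is reported missing even though `ls` lists it, the symlink may point at a location the compute nodes cannot see.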

Cheers.   Jody


> On Feb 3, 2017, at 11:09, Luis Cebamanos <l.cebamanos at epcc.ed.ac.uk> wrote:
> 
> Hi Jody,
> 
> I am sorry for the confusion. It is still on the same system, just
> different partitions. The computing nodes can only see some partitions
> of the system.
> I guess my question is, why is MITgcm failing to read bathymetry.bin
> although the file is created?
> 
> Thanks,
> Luis
>> On 03/02/2017 18:58, Jody Klymak wrote:
>> Hi Luis,
>> 
>> You must recompile `mitgcmuv` on the new computer.  Executable programs aren’t usually portable between architectures.  Hopefully there is a helpful file in `tools/build_options` to help w/ your genmake2 step.
>> 
>> Cheers,   Jody
>> 
>> 
>>> On 3 Feb 2017, at  10:53 AM, Luis Cebamanos <l.cebamanos at epcc.ed.ac.uk> wrote:
>>> 
>>> Hi all,
>>> 
>>> I am totally new to MITgcm, so apologies if I say any nonsense. I'm
>>> trying to run a test in parallel, so I started with
>>> global_ocean.90x40x15. I changed SIZE.h to run on 4 MPI processes
>>> and built it successfully. Now, I am working on a Cray system, meaning
>>> that it has to be run on a different location. So I copied over to the
>>> right place /work/MITgcm/ the following directories from
>>> global_ocean.90x40x15:
>>> 
>>> global_ocean.90x40x15/input
>>> global_ocean.90x40x15/build/mitgcmuv
>>> tutorial_global_oce_latlon/input
>>> 
>>> I then created the directory /work/MITgcm/run and ran the following:
>>> ln -s ../input/* .
>>> ../input/prepare_run
>>> ln -s ../mitgcmuv .
>>> 
>>> My script calls MITgcm:
>>> 
>>> aprun -n 4 -N 4 -d 1 ./mitgcmuv
>>> 
>>> It appears to start running properly but soon fails with the following
>>> errors:
>>> 
>>> lib-4016 : UNRECOVERABLE library error
>>> A READ operation tried to read a nonexistent record (721).
>>> 
>>> Encountered during a direct access unformatted READ from unit 9
>>> Fortran unit 9 is connected to a direct unformatted unblocked file:
>>> "bathymetry.bin"
>>> 
>>> lib-4016 : UNRECOVERABLE library error
>>> A READ operation tried to read a nonexistent record (730).
>>> 
>>> lib-4016 : UNRECOVERABLE library error
>>> A READ operation tried to read a nonexistent record (370).
>>> 
>>> 
>>> Could someone please help me to run a simple test case ?
>>> 
>>> Regards,
>>> Luis
>>> 
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>> _______________________________________________
>>> MITgcm-support mailing list
>>> MITgcm-support at mitgcm.org
>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>> --
>> Jody Klymak    
>> http://web.uvic.ca/~jklymak/