[MITgcm-support] Running a test in parallel
Luis Cebamanos
l.cebamanos at epcc.ed.ac.uk
Mon Feb 6 10:55:05 EST 2017
Hi Jody, all,
What you suggested works, thanks! However, I've noticed that this is a
very quick run on just 4 processes. My intention is to gather 2 or 3
benchmark test cases so that I can port them to the KNL architecture.
Is there a way to modify this run so that it lasts a few minutes?
Cheers,
Luis
On 03/02/17 22:21, Jody Klymak wrote:
> Hi Luis,
>
> Nx = sNx*nSx*nPx = 10*9*2 = 180.
>
> If you want two processors in x, set sNx=45, nSx=1, and nPx=2, which gives Nx = 45*1*2 = 90.
>
> (ahem, though be aware that some compilers w/ optimizations don’t like odd sNx)
>
> Cheers, Jody
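
The tiling arithmetic Jody describes can be sketched in a few lines of Python. This is only an illustrative check, not part of MITgcm: the function names are hypothetical, and the target of 90 points in x is an assumption taken from the experiment name global_ocean.90x40x15.

```python
def global_size(s_n, n_s, n_p):
    """Grid points along one axis: tile size * tiles per process * processes."""
    return s_n * n_s * n_p


def tile_size(n_global, n_s, n_p):
    """Tile size that makes s_n * n_s * n_p equal n_global exactly."""
    tiles = n_s * n_p
    if n_global % tiles != 0:
        raise ValueError(
            f"{n_global} points cannot be split evenly into {tiles} tiles"
        )
    return n_global // tiles


# Values from the SIZE.h quoted in this thread: sNx=10, nSx=9, nPx=2.
print(global_size(10, 9, 2))  # 180

# Suggested layout: one tile per process, two processes in x.
# The 90-point target is an assumption based on the experiment name.
print(tile_size(90, 1, 2))  # 45
```

The same check applies in y, using sNy, nSy and nPy.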
>
>
>
>> On 3 Feb 2017, at 14:16 PM, Luis Cebamanos <l.cebamanos at epcc.ed.ac.uk> wrote:
>>
>> Hi Jody,
>>
>> This sounds sensible. Where can I find the nx times ny value? My SIZE.h
>> under the global_ocean.90x40x15 directory looks like this:
>>
>> PARAMETER (
>> & sNx = 10,
>> & sNy = 10,
>> & OLx = 3,
>> & OLy = 3,
>> & nSx = 9,
>> & nSy = 4,
>> & nPx = 2,
>> & nPy = 2,
>> & Nx = sNx*nSx*nPx,
>> & Ny = sNy*nSy*nPy,
>> & Nr = 15)
>>
>>
>> Cheers,
>> Luis
>>
>> On 03/02/2017 20:40, Klymak Jody wrote:
>>> Sorry, my fault. If you can't read a record in a file, it usually means that the file is the wrong size. Does nx times ny in the file equal what's in SIZE.h?
>>>
>>> Cheers. Jody
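
A rough way to run the size check Jody suggests is to compare the byte count of the input file with Nx*Ny times the element size. The sketch below makes two assumptions that are not stated in this thread: the file holds a single 2-D field with no record markers, and each value is a 4-byte real (pass 8 for double-precision input). The function names are hypothetical.

```python
import os


def expected_bytes(nx, ny, bytes_per_value=4):
    """Size of one flat nx-by-ny binary field, assuming no record markers."""
    return nx * ny * bytes_per_value


def check_field_file(path, nx, ny, bytes_per_value=4):
    """Return (matches, actual_size, expected_size) for a binary field file."""
    actual = os.path.getsize(path)
    expected = expected_bytes(nx, ny, bytes_per_value)
    return actual == expected, actual, expected


# Nx and Ny implied by the SIZE.h quoted in this thread:
# 10*9*2 = 180 and 10*4*2 = 80.
print(expected_bytes(180, 80))  # 57600

# A 90 x 40 grid, as the experiment name suggests (an assumption).
print(expected_bytes(90, 40))  # 14400
```

For example, `check_field_file("bathymetry.bin", 180, 80)` returns False as its first element if the file was written for a smaller grid.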
>>>
>>> Sent from my iPhone
>>>
>>>> On Feb 3, 2017, at 11:09, Luis Cebamanos <l.cebamanos at epcc.ed.ac.uk> wrote:
>>>>
>>>> Hi Jody,
>>>>
>>>> I am sorry for the confusion. It is still on the same system, just
>>>> different partitions. The computing nodes can only see some partitions
>>>> of the system.
>>>> I guess my question is, why is MITgcm failing to read bathymetry.bin
>>>> although the file is created?
>>>>
>>>> Thanks,
>>>> Luis
>>>>> On 03/02/2017 18:58, Jody Klymak wrote:
>>>>> Hi Luis,
>>>>>
>>>>> You must recompile `mitgcmuv` on the new computer. Executable programs aren’t usually portable between architectures. Hopefully there is a helpful file in `tools/build_options` to help w/ your genmake2 step.
>>>>>
>>>>> Cheers, Jody
>>>>>
>>>>>
>>>>>> On 3 Feb 2017, at 10:53 AM, Luis Cebamanos <l.cebamanos at epcc.ed.ac.uk> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am totally new to MITgcm, so apologies if I say any nonsense. I am
>>>>>> trying to run a test in parallel, so I started with
>>>>>> global_ocean.90x40x15. I changed SIZE.h to run on 4 MPI processes
>>>>>> and built it successfully. I am working on a Cray system, which means
>>>>>> that it has to be run from a different location. So I copied over to the
>>>>>> right place /work/MITgcm/ the following directories from
>>>>>> global_ocean.90x40x15:
>>>>>>
>>>>>> global_ocean.90x40x15/input
>>>>>> global_ocean.90x40x15/build/mitgcmuv
>>>>>> tutorial_global_oce_latlon/input
>>>>>>
>>>>>> I then created the directory /work/MITgcm/run and ran the following:
>>>>>> ln -s ../input/* .
>>>>>> ../input/prepare_run
>>>>>> ln -s ../mitgcmuv .
>>>>>>
>>>>>> My script calls MITgcm:
>>>>>>
>>>>>> aprun -n 4 -N 4 -d 1 ./mitgcmuv
>>>>>>
>>>>>> It appears to start running properly but soon fails with the following
>>>>>> errors:
>>>>>>
>>>>>> lib-4016 : UNRECOVERABLE library error
>>>>>> A READ operation tried to read a nonexistent record (721).
>>>>>>
>>>>>> Encountered during a direct access unformatted READ from unit 9
>>>>>> Fortran unit 9 is connected to a direct unformatted unblocked file:
>>>>>> "bathymetry.bin"
>>>>>>
>>>>>> lib-4016 : UNRECOVERABLE library error
>>>>>> A READ operation tried to read a nonexistent record (730).
>>>>>>
>>>>>> lib-4016 : UNRECOVERABLE library error
>>>>>> A READ operation tried to read a nonexistent record (370).
>>>>>>
>>>>>>
>>>>>> Could someone please help me to run a simple test case?
>>>>>>
>>>>>> Regards,
>>>>>> Luis
>>>>>>
>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>> Scotland, with registration number SC005336.
>>>>>> _______________________________________________
>>>>>> MITgcm-support mailing list
>>>>>> MITgcm-support at mitgcm.org
>>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>>>>> --
>>>>> Jody Klymak
>>>>> http://web.uvic.ca/~jklymak/
>
> --
> Jody Klymak
> http://web.uvic.ca/~jklymak/