[MITgcm-support] error while writing pickup files with Cray compilers
Jody Klymak
jklymak at uvic.ca
Tue Apr 11 17:58:32 EDT 2017
Hopefully people who really understand the compiler issues will pipe up. I assume you are running with MPI, so maybe the parallel writing of the mds files is failing somehow? But if it is a compiler issue:
- Did you try running with the optimizations turned off? (Edit the `linux_ia64_cray_archer` optfile.)
- In `data` you should set `debugLevel=5`, or something similarly large, and see if there are clues in the output.
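
To see how far the write got before the abort, a quick check along these lines could be run from the run directory. This is only a rough sketch: the function name `pickup_report` and the grid sizes `nx=360, ny=160` are made up for illustration, and it assumes the pickup holds 8-byte values, so substitute your own numbers. Counting in whole horizontal levels is just a convenient unit for judging a partial file, not a statement about how the model lays out its records.

```python
import os


def pickup_report(run_dir, fname, nx, ny, bytes_per_value=8):
    """Summarise the state of a (possibly partial) pickup file.

    nx, ny and bytes_per_value are assumptions supplied by the caller;
    nothing here is read from the model configuration.
    """
    path = os.path.join(run_dir, fname)
    # Size of one horizontal level of the global grid, in bytes.
    level_bytes = nx * ny * bytes_per_value
    report = {
        "dir_writable": os.access(run_dir, os.W_OK),
        "exists": os.path.isfile(path),
        "size_bytes": 0,
        "whole_levels": 0,
        "leftover_bytes": 0,
    }
    if report["exists"]:
        size = os.path.getsize(path)
        report["size_bytes"] = size
        # A non-zero remainder suggests the write stopped part-way through a level.
        report["whole_levels"], report["leftover_bytes"] = divmod(size, level_bytes)
    return report


if __name__ == "__main__":
    # Hypothetical grid size; the file name is the one from the error message.
    print(pickup_report(".", "pickup.0001752000.data", nx=360, ny=160))
```

Comparing the reported size between the gnu and Cray builds for the same short test run would show whether the Cray run stops early or writes something of a different length.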
Good luck! Jody
> On 11 Apr 2017, at 13:39 PM, Laura Cimoli <laura.cimoli at physics.ox.ac.uk> wrote:
>
> Hi Jody,
>
> yes, it does start writing the pickup file.
> I also ran a few other tests (of course much shorter than 100 y!), and I always got the same error. Also, the configuration works with the gnu compiler, but if the Cray compiler is really 10x faster it would be nice to use it!
>
> I wonder if the pickup file is overwritten or if it is appended to...? Maybe it is doing something funny when trying to append to it?
>
> Thanks,
> Laura
>
>
> From: Jody Klymak [jklymak at uvic.ca]
> Sent: 11 April 2017 21:27
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] error while writing pickup files with Cray compilers
>
> Hi Laura,
>
> Are you sure the mitgcm can write to the directory it is trying to write to? Does it *start* to write the pickup file?
>
> These are just dumb questions. Maybe it truly is a compiler issue, but it seems more likely it is a configuration issue. Obviously, for testing I’d suggest writing a pickup file well before 100 y has passed.
>
> Good luck,
>
> Jody
>
>> On 11 Apr 2017, at 11:21 AM, Laura Cimoli <laura.cimoli at physics.ox.ac.uk> wrote:
>>
>> Hello Jody,
>>
>> sorry I forgot to mention that all my other outputs are in netcdf format, and they look fine.
>> The data file is attached.
>>
>> Thanks,
>> Laura
>>
>> From: Jody Klymak [jklymak at uvic.ca]
>> Sent: 11 April 2017 19:01
>> To: mitgcm-support at mitgcm.org
>> Subject: Re: [MITgcm-support] error while writing pickup files with Cray compilers
>>
>> Are you able to write any mds files? I.e., did the `T.0000000000.data` file get written? Can you supply your `data` file?
>>
>> Cheers, Jody
>>
>>
>>> On 11 Apr 2017, at 10:54 AM, Laura Cimoli <laura.cimoli at physics.ox.ac.uk> wrote:
>>>
>>> Hello,
>>>
>>> this question is relevant mainly for Archer users, but of course any help is appreciated!
>>>
>>> I have recently tried to use the Cray compilers instead of the gnu ones, since the model should run much faster according to what is stated here <http://www.archer.ac.uk/community/eCSE/eCSE03-09/eCSE03-09_White_Paper.pdf>. I have to admit I have not read that report in detail, but I hope that there are no particular constraints on the use of Cray compilers on Archer.
>>>
>>> I used the linux_ia64_cray_archer optfile, as indicated in the report.
>>>
>>> At first glance, the model compiles without any odd warnings and seems to run without any problem, but it crashes when writing the pickup file. This is the message I got (the whole error file is attached):
>>>
>>> lib-5058 : UNRECOVERABLE library error
>>> A read system call read less data than expected.
>>>
>>> Encountered during a direct access unformatted WRITE to unit 9
>>> Fortran unit 9 is connected to a direct unformatted unblocked file:
>>> "pickup.0001752000.data"
>>>
>>> _pmiu_daemon(SIGCHLD): [NID 02940] [c7-1c0s15n0] [Tue Apr 11 08:49:37 2017] PE RANK 69 exit signal Aborted
>>> [NID 02940] 2017-04-11 08:49:37 Apid 26123498: initiated application termination
>>>
>>>
>>> I am writing the permanent pickup file, and I don't have any temporary pickup file.
>>>
>>> The only weird warning I have noticed in the genmake.log file (attached) is below, but I don't know whether it is related to the problem reported above:
>>>
>>> running: check_HAVE_SIGREG()
>>> cc -c genmake_tc_1.c
>>> CC-513 craycc: WARNING File = genmake_tc_1.c, Line = 22
>>> A value of type "void *" cannot be assigned to an entity of type
>>> "void (*)(int, siginfo_t *, void *)".
>>> s.sa_sigaction = (void *)killhandler;
>>> ^
>>> Total warnings detected in genmake_tc_1.c: 1
>>> program hello
>>> integer anint
>>> common /iv/ anint
>>> external sigreg
>>> call sigreg(anint)
>>> end
>>> ftn -o genmake_tc genmake_tc_2.f genmake_tc_1.o
>>> --> set HAVE_SIGREG='t'
>>>
>>>
>>> Does anyone know why the Cray compilers return this error while writing the output binary file?
>>>
>>> Many thanks,
>>> Laura
>>> <genmake.log><output_000.e4441213>
>>> _______________________________________________
>>> MITgcm-support mailing list
>>> MITgcm-support at mitgcm.org
>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>> --
>> Jody Klymak
>> http://web.uvic.ca/~jklymak/
>> <data>
> --
> Jody Klymak
> http://web.uvic.ca/~jklymak/
--
Jody Klymak
http://web.uvic.ca/~jklymak/