[MITgcm-devel] model crashes when opening scratch units

Martin Losch Martin.Losch at awi.de
Tue Jul 18 08:50:30 EDT 2017


Sorry about that, I just realized that it is not allowed to specify a file name together with STATUS='SCRATCH' as in
      OPEN(UNIT=scrUnit1, FILE=scratchFile1, STATUS='SCRATCH')
but one could open the file with STATUS='UNKNOWN' and then use CLOSE(UNIT=scrUnit1,STATUS='DELETE').
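
A minimal, self-contained sketch of that alternative. This is an illustration rather than the actual change to eeset_parms.F: the unit number and myProcId are set by hand here (in the model they come from the execution environment), the I6.6 format is only the suggestion from my earlier mail, and the ' nTx=1,' record is placeholder content.

```fortran
      PROGRAM scratch_demo
      ! Sketch: emulate a scratch file with a per-process file name.
      IMPLICIT NONE
      INTEGER :: scrUnit1, myProcId
      CHARACTER(LEN=16) :: scratchFile1
      CHARACTER(LEN=80) :: record

      ! Assumed values for illustration only
      scrUnit1 = 11
      myProcId = 579

      ! Build a name that is unique per process: 'scratch1.' + 6 digits
      WRITE(scratchFile1,'(A,I6.6)') 'scratch1.', myProcId

      ! STATUS='UNKNOWN' allows FILE= to be given explicitly
      OPEN(UNIT=scrUnit1, FILE=scratchFile1, STATUS='UNKNOWN')

      ! Write a placeholder record, then read it back
      WRITE(scrUnit1,'(A)') ' nTx=1,'
      REWIND(scrUnit1)
      READ(scrUnit1,'(A)') record

      ! STATUS='DELETE' removes the file on close
      CLOSE(UNIT=scrUnit1, STATUS='DELETE')

      PRINT '(A)', TRIM(record)
      END PROGRAM scratch_demo
```

Because each process writes to its own named file, two ranks should not collide on the same name, and the explicit delete on close keeps the run directory free of leftover scratch* files as long as the run terminates normally.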

What do you think?
Martin

> On 18. Jul 2017, at 14:14, Martin Losch <Martin.Losch at awi.de> wrote:
> 
> Hi all,
> 
> I am following up on an old post. Is there a good reason for having
>       OPEN(UNIT=scrUnit1,STATUS='SCRATCH')
> (in open_copy_data_file.F and eeset_parms.F) without specifying a file name? In other words, wouldn’t it make sense to replace
> 
> # if defined (TARGET_BGL) || defined (TARGET_CRAYXT)
>      WRITE(scratchFile1,'(A,I4.4)') 'scratch1.', myProcId
>      WRITE(scratchFile2,'(A,I4.4)') 'scratch2.', myProcId
>      OPEN(UNIT=scrUnit1, FILE=scratchFile1, STATUS='UNKNOWN')
>      OPEN(UNIT=scrUnit2, FILE=scratchFile2, STATUS='UNKNOWN')
> # else
>      OPEN(UNIT=scrUnit1,STATUS='SCRATCH')
>      OPEN(UNIT=scrUnit2,STATUS='SCRATCH')
> # endif
> 
> by
>      WRITE(scratchFile1,'(A,I4.4)') 'scratch1.', myProcId
>      WRITE(scratchFile2,'(A,I4.4)') 'scratch2.', myProcId
>      OPEN(UNIT=scrUnit1, FILE=scratchFile1, STATUS='SCRATCH')
>      OPEN(UNIT=scrUnit2, FILE=scratchFile2, STATUS='SCRATCH')
> and get rid of the TARGET_CRAYXT altogether (I think it’s the only place where it is used in the code)? One would probably have to change the format for the integer to '(A,I6.6)' or so.
> Specifying the status “SCRATCH” obviously produces non-unique file names that cause the model to crash in some cases (I just had another one on our computer, leaving the beginning PhD student very puzzled).
> 
> Martin
> 
> 
>> On 17. Apr 2015, at 10:06, Martin Losch <Martin.Losch at awi.de> wrote:
>> 
>> Hi Matt,
>> 
>> thanks, that’s the flag I was looking for, it works for me too (except for all the “scratch*” files).
>> 
>> M.
>> 
>>> On 16 Apr 2015, at 18:56, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>>> 
>>> Hi Martin
>>> 
>>> Yes, I sometimes get this when running with lots of cores, so I don't think Cray_XC30 is the only platform where this is an issue.
>>> 
>>> However defining 
>>> #define TARGET_CRAYXT
>>> in CPP_EEOPTIONS.h
>>> always fixes it
>>> 
>>> Matt
>>> 
>>> On Apr 16, 2015, at 8:16 AM, Martin Losch <Martin.Losch at awi.de> wrote:
>>> 
>>>> Hi there,
>>>> 
>>>> this is probably not directly related to the MITgcm, but when I try to run the model on ECMWF’s Cray_XC30, I sometimes (depending a little on the number of processors) get this type of error message:
>>>> 
>>>>> lib-4051 : UNRECOVERABLE library error 
>>>>> The file must not exist prior to OPEN if STATUS is 'NEW'.
>>>>> 
>>>>> Encountered during an OPEN of unit 11
>>>>> Fortran unit 11 is not connected
>>>>> 
>>>>> lib-4051 : UNRECOVERABLE library error 
>>>>> The file must not exist prior to OPEN if STATUS is 'NEW'.
>>>>> 
>>>>> Encountered during an OPEN of unit 11
>>>>> Fortran unit 11 is not connected
>>>>> Application 56285485 is crashing. ATP analysis proceeding...
>>>>> 
>>>>> ATP Stack walkback for Rank 579 starting:
>>>>> _start at start.S:113
>>>>> __libc_start_main at libc-start.c:242
>>>>> main at main.f:4353
>>>>> eeboot_ at eeboot.f:1583
>>>>> eeset_parms_ at eeset_parms.f:1821
>>>>> _OPEN at 0xa3a47d
>>>>> __OPN at 0xa3a22c
>>>>> _f_open at 0xa380a4
>>>>> _ferr at 0xa33d6a
>>>>> abort at abort.c:92
>>>>> raise at pt-raise.c:42
>>>>> ATP Stack walkback for Rank 579 done
>>>> 
>>>> The line in eeset_parms.f is the one where the first scratch unit is opened.
>>>>   OPEN(UNIT=scrUnit1,STATUS='SCRATCH')
>>>>   OPEN(UNIT=scrUnit2,STATUS='SCRATCH')
>>>> 
>>>> Has anyone had this problem? Is this a hardware bug?
>>>> 
>>>> Martin


