[MITgcm-devel] [altMITgcm/MITgcm66h] Bugfix/scratch files (#11)

Martin Losch Martin.Losch at awi.de
Fri Aug 4 10:03:38 EDT 2017


Hi Jean-Michel,
I checked in a new eeset_parms.F. While I think that this version will not break any tests, it probably does not handle some special cases well (e.g. it will break SINGLE_DISK_IO, because I forgot to add a proper flag for the declaration of scratchFile1 and 2).
It’s Friday afternoon and my brain seems to be in weekend mode already, that’s why I am reluctant to check in anything without consulting with you. Here’s what I think I should do:
(1) remove the SINGLE_DISK_IO block, because now you always pass something meaningful in "procID" to eeboot_minimal.
(2) replace it with a
#ifdef SINGLE_DISK_IO
       IF ( procID .EQ. 0 ) THEN
#else
       IF ( .TRUE. ) THEN
#endif
       ELSE
…
       ENDIF

at the beginning of the default (if !defined USE_FORTRAN_SCRATCH_FILES) block.
I think that should work, what do you think?

Martin
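
For readers of the archive, the guard proposed above can be tried in isolation with a small stand-alone sketch. This is not the actual eeset_parms.F: the program name, the unit number, the file-name pattern, the text written to the file and the contents of both branches are assumptions made only for illustration. It opens one named scratch file per process inside the guard and closes it with STATUS='DELETE', the close option discussed further down the thread.

```fortran
C     Hypothetical stand-alone sketch, not the actual eeset_parms.F.
C     Needs preprocessing (e.g. a .F suffix) because of the #ifdef.
      PROGRAM SCRSKETCH
      IMPLICIT NONE
      INTEGER procID, scrUnit1, ioErr
      CHARACTER*(13) scratchFile1
      CHARACTER*(80) record

C     Assumed values; in the model procID is set by the caller.
      procID   = 0
      scrUnit1 = 11

#ifdef SINGLE_DISK_IO
      IF ( procID .EQ. 0 ) THEN
#else
      IF ( .TRUE. ) THEN
#endif
C       Assumed naming: one file per process, e.g. scratch1.0000
        WRITE(scratchFile1,'(A,I4.4)') 'scratch1.', procID
        OPEN(UNIT=scrUnit1, FILE=scratchFile1,
     &       STATUS='UNKNOWN', IOSTAT=ioErr)
        IF ( ioErr .NE. 0 ) STOP 'cannot open scratch file'
C       Write one record, then read it back from the same unit.
        WRITE(scrUnit1,'(A)') 'some parameter text'
        REWIND(scrUnit1)
        READ(scrUnit1,'(A)') record
        PRINT '(A)', record(1:19)
C       STATUS='DELETE' removes the file when the unit is closed.
        CLOSE(scrUnit1, STATUS='DELETE')
      ELSE
C       Placeholder for processes that skip the named file.
        PRINT '(A)', 'no scratch file on this process'
      ENDIF
      END
```

Compiling once with and once without -DSINGLE_DISK_IO, and changing the assumed procID, shows which processes would create a file; no scratch1.* file should remain after the program ends.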

> On 3. Aug 2017, at 15:10, Jean-Michel Campin <jmc at mit.edu> wrote:
> 
> Hi Martin,
> 
> Yes, the last changes are good, and you can proceed with the next step
> when you want.
> 
> Cheers,
> Jean-Michel
> 
> On Thu, Aug 03, 2017 at 12:54:56PM +0200, Martin Losch wrote:
>> Hi Jean-Michel,
>> 
>> I know you have been busy with other stuff, but it does not look like there are any problems with my changes to eeset_parms.F.
>> Should I now do the second step and change the default as suggested (just in eeset_parms.F for now; if it works, I can add the stuff to all namelists)?
>> 
>> Martin
>> 
>>> On 28. Jul 2017, at 14:57, Martin Losch <Martin.Losch at awi.de> wrote:
>>> 
>>> OK, then let's wait until Monday,
>>> 
>>> Martin
>>> 
>>>> On 28. Jul 2017, at 14:50, Jean-Michel Campin <jmc at mit.edu> wrote:
>>>> 
>>>> Hi Martin,
>>>> 
>>>> These experiments were already failing before, in the same way,
>>>> so I am not too worried.
>>>> Now some tests are not run every day (I alternate -fast and -devel),
>>>> so it might be good to wait at least another day (to pass more -devel tests).
>>>> 
>>>> Cheers,
>>>> Jean-Michel
>>>> 
>>>> On Fri, Jul 28, 2017 at 09:58:35AM +0200, Martin Losch wrote:
>>>>> Hi Jean-Michel,
>>>>> 
>>>>> it looks like some forward tests actually do fail since my change to eeset_parms.F, e.g. here:
>>>>> svante linux_amd64_pgf77+mth.fast (the corresponding linux_amd64_pgf77+mth.dvlp looks OK)
>>>>> 
>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   aim.5l_cs
>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   aim.5l_cs.thSI
>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   aim.5l_Equatorial_Channel
>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   aim.5l_LatLon
>>>>> 
>>>>> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   hs94.cs-32x32x5
>>>>> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   hs94.cs-32x32x5.impIGW
>>>>> 
>>>>> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . N/O   short_surf_wave
>>>>> 
>>>>> The compile-time error (hs94.cs-32x32x5, short_surf_wave) does not look related to me:
>>>>> 
>>>>> pgf77 -byteswapio -Ktrap=fp -mp -tp k8-64 -pc=64 -O2 -Mvect=sse  -c ini_dynvars.f
>>>>> PGFTN-F-0007-Subprogram too large to compile at this optimization level  (ini_dynvars.f)
>>>>> PGFTN/x86-64 Linux 16.9-0: compilation aborted
>>>>> Makefile:1653: recipe for target 'ini_dynvars.o' failed
>>>>> make[1]: *** [ini_dynvars.o] Error 2
>>>>> make[1]: Leaving directory '/net/fs09/d0/jm_c/test_svante/MITgcm_pgiMth/verification/hs94.cs-32x32x5/build'
>>>>> Makefile:1561: recipe for target 'fwd_exe_target' failed
>>>>> make: *** [fwd_exe_target] Error 2
>>>>> 
>>>>> but the aim.* experiments lose their threads.
>>>>> Error: _mp_pcpu_reset: lost thread
>>>>> Can that be related to closing some files?
>>>>> 
>>>>> Martin
>>>>> 
>>>>>> On 27. Jul 2017, at 00:22, Jean-Michel Campin <jmc at mit.edu> wrote:
>>>>>> 
>>>>>> Hi Martin,
>>>>>> 
>>>>>> two things:
>>>>>> 1) I've checked that MPI_COMM_RANK is not blocking (can be called
>>>>>> by only a subset of procs) so I added this call in the OASIS block
>>>>>> and added the argument "procId" to EESET_PARMS as suggested before.
>>>>>> This should make your coming set of changes simpler.
>>>>>> 2) the set of changes you propose seems good to me. And for now,
>>>>>> I would set this USE_FORTRAN_SCRATCH_FILES in CPP_EEOPTIONS.h 
>>>>>> and not worry about genmake_local.
>>>>>> 
>>>>>> Cheers,
>>>>>> Jean-Michel
>>>>>> 
>>>>>> On Wed, Jul 26, 2017 at 10:16:45AM +0200, Martin Losch wrote:
>>>>>>> Hi Jean-Michel,
>>>>>>> 
>>>>>>> I suggest testing this now as you say, i.e. check in an eeset_parms.F where only the appropriate close statements are amended with STATUS='DELETE' (which in my opinion should always work, since this option is F77 standard, but you never know ...), but also have (at least) one testreport verification experiment use the USE_FORTRAN_SCRATCH_FILES flag, so that it is always tested (that's a bit annoying, since it would be the only experiment with its own CPP_EEOPTIONS.h file, or can this be put into some genmake_local?)
>>>>>>> 
>>>>>>> Martin
>>>>>>> 
>>>>>>>> On 25. Jul 2017, at 18:17, Jean-Michel Campin <jmc at mit.edu> wrote:
>>>>>>>> 
>>>>>>>> Another thing:
>>>>>>>> Are we 100% sure that closing a scratch unit file with status "delete"
>>>>>>>> is completely standard on all platforms & compilers? If not, we could
>>>>>>>> test just this independently (i.e., check in and see how the daily tests run).
>>>>>>>> The reason is that when someone chooses to use USE_FORTRAN_SCRATCH_FILES
>>>>>>>> (which is not going to be the default and therefore not tested), we need to be
>>>>>>>> sure that the close instruction is OK.
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> MITgcm-devel mailing list
>>>>>>> MITgcm-devel at mitgcm.org
>>>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel


