[MITgcm-devel] [altMITgcm/MITgcm66h] Bugfix/scratch files (#11)
Jean-Michel Campin
jmc at mit.edu
Sat Aug 5 10:32:20 EDT 2017
Hi Martin,
Sorry to insist, but I've just tried to compile with:
#define SINGLE_DISK_IO
and the 3 versions of eeset_parms.F, 1.40, 1.41 and the latest 1.42
compile fine (no problem with myProcId).
The only one that does not compile is the latest (1.42) when I also set
#define USE_FORTRAN_SCRATCH_FILES
but the problem is not myProcId but missing declaration of
scratchFile1 & scratchFile2
Cheers,
Jean-Michel
On Sat, Aug 05, 2017 at 12:09:22PM +0200, Martin Losch wrote:
> Hi Jean-Michel,
>
> I hope that my messy checkin sequence produced something that you can live with. I think that it is pretty much in line with your last email, except that I changed one myProcId into procId, so that the code will compile with SINGLE_DISK_IO defined.
>
> Will add the test for the old default on Monday
>
> M.
>
> > On 4. Aug 2017, at 17:40, Jean-Michel Campin <jmc at mit.edu> wrote:
> >
> > Hi Martin,
> >
> >> On Fri, Aug 04, 2017 at 05:14:15PM +0200, Martin Losch wrote:
> >> OK,
> >> I didn't realize that we don't need that anymore, will remove it with the next version.
> >>
> >> About the single_disk_io: current code will not compile: myProcID was renamed into procID and I forgot to change,
> > No, I did it on purpose, and it's fine & safe, there is a stop.
> >
> > also we can have USE_FORTRAN_SCRATCH_FILES and SINGLE_DISK_IO defined at the same time (not sure if anyone would do that), in this case scratchfile1 and 2 are not defined. I suggest to replace lines 142-147:
> >> WRITE(scratchFile1,'(A)') 'scratch1'
> >> WRITE(scratchFile2,'(A)') 'scratch2'
> >> IF( procId .EQ. 0 ) THEN
> >> OPEN(UNIT=scrUnit1, FILE=scratchFile1, STATUS='UNKNOWN')
> >> OPEN(UNIT=scrUnit2, FILE=scratchFile2, STATUS='UNKNOWN')
> >> ENDIF
> >> with
> >> IF( procId .EQ. 0 ) THEN
> >> OPEN(UNIT=scrUnit1, FILE='scratch1', STATUS='UNKNOWN')
> >> OPEN(UNIT=scrUnit2, FILE='scratch2', STATUS='UNKNOWN')
> >> ENDIF
> >
> > Apart from the missing declaration of scratchFile1 & scratchFile2 in the case:
> > #define SINGLE_DISK_IO with #define USE_FORTRAN_SCRATCH_FILES
> > which needs to be fixed, I would not change anything in SINGLE_DISK_IO blocks
> > (as I wrote earlier).
> >
> >> Also I suggest to define SINGLE_DISK_IO in ideal_2D_oce/code/CPP_EEOPTIONS.h to test this code
> > I am not very much in favor of this, at least not now.
> >
> >> and USE_FORTRAN_SCRATCH_FILES in lab_sea/code_ad/CPP_EEOPTIONS.h
> >> This would avoid having to check in another version of CPP_EEOPTIONS.h (all other use the default)
> > This sounds good.
> >
> > Cheers,
> > Jean-Michel
> >
> >>
> >> Martin
> >>
> >>> On 4. Aug 2017, at 17:01, Jean-Michel Campin <jmc at mit.edu> wrote:
> >>>
> >>> Hi Martin,
> >>>
> >>> The changes you made seem complicated:
> >>> This part: line 155-160
> >>> IF ( .NOT.doReport ) THEN
> >>> C called from eeboot_minimal.F before myProcId is set, so we have to
> >>> C use scratch files and keep our fingers crossed
> >>> OPEN(UNIT=scrUnit1,STATUS='SCRATCH')
> >>> OPEN(UNIT=scrUnit2,STATUS='SCRATCH')
> >>> ELSE
> >>> is not needed + it relies on opening unit with STATUS='SCRATCH' that we would
> >>> like to avoid when USE_FORTRAN_SCRATCH_FILES is undef (and with this
> >>> IF ( .NOT.doReport ) THEN .. the procId argument that I added few days ago is
> >>> of no use).
> >>>
> >>> But I would not change anything regarding the SINGLE_DISK_IO block (there is a
> >>> stop there, for good reasons, and it already opens scrUnit 1 & 2
> >>> as real files, i.e., STATUS='UNKNOWN').
> >>>
> >>> Cheers,
> >>> Jean-Michel
> >>>
> >>>> On Fri, Aug 04, 2017 at 04:03:38PM +0200, Martin Losch wrote:
> >>>> Hi Jean-Michel,
> >>>> I checked in a new eeset_parms.F. While I think that this version will not break any tests, it is probably not very good in terms of some special cases (e.g. it will break SINGLE_DISK_IO, because I forgot to add a proper flag for the declaration of scratchFile1 and 2).
> >>>> It's Friday afternoon and my brain seems to be in weekend mode already, that's why I am reluctant to check in anything without consulting with you. Here's what I think I should do:
> >>>> (1) remove the SINGLE_DISK_IO block, because now you always pass something meaningful in "procID" to eeboot_minimal.
> >>>> (2) replace it with a
> >>>> #ifdef SINGLE_DISK_IO
> >>>> IF ( procID .EQ. 0 ) THEN
> >>>> #else
> >>>> IF ( .TRUE. ) THEN
> >>>> #endif
> >>>> ELSE
> >>>> ...
> >>>> ENDIF
> >>>>
> >>>> at the beginning of the default (if !defined USE_FORTRAN_SCRATCH_FILES) block.
> >>>> I think that should work, what do you think?
> >>>>
> >>>> Martin
> >>>>
> >>>>> On 3. Aug 2017, at 15:10, Jean-Michel Campin <jmc at mit.edu> wrote:
> >>>>>
> >>>>> Hi Martin,
> >>>>>
> >>>>> Yes, last changes are good, and you can proceed with next step
> >>>>> when you want.
> >>>>>
> >>>>> Cheers,
> >>>>> Jean-Michel
> >>>>>
> >>>>>> On Thu, Aug 03, 2017 at 12:54:56PM +0200, Martin Losch wrote:
> >>>>>> Hi Jean-Michel,
> >>>>>>
> >>>>>> I know you have been busy with other stuff, but it does not look like there are any problems with my changes to eeset_parms.F
> >>>>>> Should I now do the second step and change the default as suggested (just to eeset_parms.F, if it works, I can add the stuff to all namelists)?
> >>>>>>
> >>>>>> Martin
> >>>>>>
> >>>>>>> On 28. Jul 2017, at 14:57, Martin Losch <Martin.Losch at awi.de> wrote:
> >>>>>>>
> >>>>>>> OK, then let's wait until Monday,
> >>>>>>>
> >>>>>>> Martin
> >>>>>>>
> >>>>>>>> On 28. Jul 2017, at 14:50, Jean-Michel Campin <jmc at mit.edu> wrote:
> >>>>>>>>
> >>>>>>>> Hi Martin,
> >>>>>>>>
> >>>>>>>> These experiments were already failing before, in the same way,
> >>>>>>>> so I am not worried too much.
> >>>>>>>> Now some tests are not running everyday (I alternate -fast and -devel),
> >>>>>>>> so it might be good to wait at least another day (to pass more -devel tests).
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Jean-Michel
> >>>>>>>>
> >>>>>>>>> On Fri, Jul 28, 2017 at 09:58:35AM +0200, Martin Losch wrote:
> >>>>>>>>> Hi Jean-Michel,
> >>>>>>>>>
> >>>>>>>>> it looks like some forward tests actually do fail since my change to eeset_parms.F, e.g. here:
> >>>>>>>>> svante linux_amd64_pgf77+mth.fast ( the corresponding linux_amd64_pgf77+mth.dvlp looks OK)
> >>>>>>>>>
> >>>>>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O aim.5l_cs
> >>>>>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O aim.5l_cs.thSI
> >>>>>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O aim.5l_Equatorial_Channel
> >>>>>>>>> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O aim.5l_LatLon
> >>>>>>>>>
> >>>>>>>>> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O hs94.cs-32x32x5
> >>>>>>>>> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O hs94.cs-32x32x5.impIGW
> >>>>>>>>>
> >>>>>>>>> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . . . . . . . . . . . . . . . . . . . . N/O short_surf_wave
> >>>>>>>>>
> >>>>>>>>> The compile-time error (hs94.cs-32x32x5, short_surf_wave) does not look related to me:
> >>>>>>>>>
> >>>>>>>>> pgf77 -byteswapio -Ktrap=fp -mp -tp k8-64 -pc=64 -O2 -Mvect=sse -c ini_dynvars.f
> >>>>>>>>> PGFTN-F-0007-Subprogram too large to compile at this optimization level (ini_dynvars.f)
> >>>>>>>>> PGFTN/x86-64 Linux 16.9-0: compilation aborted
> >>>>>>>>> Makefile:1653: recipe for target 'ini_dynvars.o' failed
> >>>>>>>>> make[1]: *** [ini_dynvars.o] Error 2
> >>>>>>>>> make[1]: Leaving directory '/net/fs09/d0/jm_c/test_svante/MITgcm_pgiMth/verification/hs94.cs-32x32x5/build'
> >>>>>>>>> Makefile:1561: recipe for target 'fwd_exe_target' failed
> >>>>>>>>> make: *** [fwd_exe_target] Error 2
> >>>>>>>>>
> >>>>>>>>> but the aim.* experiments lose their threads.
> >>>>>>>>>>>> Error: _mp_pcpu_reset: lost thread
> >>>>>>>>> Can that be related to closing some files?
> >>>>>>>>>
> >>>>>>>>> Martin
> >>>>>>>>>
> >>>>>>>>>> On 27. Jul 2017, at 00:22, Jean-Michel Campin <jmc at mit.edu> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Martin,
> >>>>>>>>>>
> >>>>>>>>>> two things:
> >>>>>>>>>> 1) I've checked that MPI_COMM_RANK is not blocking (can be called
> >>>>>>>>>> by only a subset of procs) so I added this call in the OASIS block
> >>>>>>>>>> and added argument "procId" to EESET_PARMS as suggested before.
> >>>>>>>>>> This should make your coming set of changes simpler.
> >>>>>>>>>> 2) the set of changes you propose seems good to me. And for now,
> >>>>>>>>>> I would set this USE_FORTRAN_SCRATCH_FILES in CPP_EEOPTIONS.h
> >>>>>>>>>> and not worry about genmake_local.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Jean-Michel
> >>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 26, 2017 at 10:16:45AM +0200, Martin Losch wrote:
> >>>>>>>>>>> Hi Jean-Michel,
> >>>>>>>>>>>
> >>>>>>>>>>> I suggest testing this now as you say, i.e. check in an eeset_parms.F where only the appropriate close statements are amended with STATUS='DELETE' (which in my opinion should always work, since this option is F77 standard, but you never know ...), but also have (at least) one testreport-verification-experiment use the USE_FORTRAN_SCRATCH_FILES flag, so that it is always tested (that's a bit annoying, since it would be the only experiment with its own CPP_EEOPTIONS.h file, or can this be put into some genmake_local?)
> >>>>>>>>>>>
> >>>>>>>>>>> Martin
> >>>>>>>>>>>
> >>>>>>>>>>>> On 25. Jul 2017, at 18:17, Jean-Michel Campin <jmc at mit.edu> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Another thing:
> >>>>>>>>>>>> Are we 100% sure that closing a scratch unit file with status "delete"
> >>>>>>>>>>>> is completely standard on all platforms & compilers? If not, we could
> >>>>>>>>>>>> test just this independently (i.e., check in and see how the daily tests run).
> >>>>>>>>>>>> The reason is that when someone chooses to use USE_FORTRAN_SCRATCH_FILES
> >>>>>>>>>>>> (which is not going to be the default and therefore not tested), we need to be
> >>>>>>>>>>>> sure that the close instruction is OK.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> _______________________________________________
> >>>>>>>>>>> MITgcm-devel mailing list
> >>>>>>>>>>> MITgcm-devel at mitgcm.org
> >>>>>>>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel