[MITgcm-support] Error on Cheyenne HPC with diagnostics package
Uchida Takaya
tu2140 at columbia.edu
Mon Nov 19 09:48:13 EST 2018
Hi Martin,
Thank you for getting back to me.
I do have the eedata file in the run directory, and its contents are:
&EEPARMS
debugMode=.TRUE.,
&
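For comparison, the fuller eedata that ships with the verification experiments also sets the threading parameters; something like this (the nTx/nTy values below are just the single-threaded defaults, not anything specific to my setup):

 &EEPARMS
 nTx=1,
 nTy=1,
 debugMode=.TRUE.,
 &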
I also checked the lines around line 2141 in eeset_parms.f, but I cannot tell where the failure originates:
C--   Read namelist
      iUnit = scrUnit1
      REWIND(iUnit)
      READ(UNIT=iUnit,NML=EEPARMS,IOSTAT=errIO,err=3)
      IF ( errIO .GE. 0 ) GOTO 4
    3 CONTINUE
      IF ( doReport ) THEN
        WRITE(msgBuf,'(2A)') 'EESET_PARMS: ',
     &   'Error reading parameter file "eedata"'
        CALL PRINT_ERROR( msgBuf, 1 )
        CALL EEDATA_EXAMPLE
        eeBootError = .TRUE.
      ENDIF
    4 CONTINUE

C--   Execution Environment parameter file read
      CLOSE(iUnit,STATUS='DELETE')

      IF ( doReport .AND. .NOT.usingMPI ) THEN
        WRITE(msgBuf,'(2A)') 'EESET_PARMS: ',
     &   'in eedata: usingMPI=F conflicts'
        CALL PRINT_ERROR( msgBuf, 1 )
        WRITE(msgBuf,'(A)') 'EESET_PARMS: with #define ALWAYS_USE_MPI'
        CALL PRINT_ERROR( msgBuf, 1 )
        eeBootError = .TRUE.
      ENDIF
      usingMPI = .TRUE.
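For what it is worth, eeset_parms.F does not read "eedata" directly at this point: as far as I can tell it first copies the file to a scratch unit (scrUnit1, which is what the CLOSE with STATUS='DELETE' above removes), and the NML read acts on that copy. To check that the namelist itself parses, something like the following standalone sketch should do (my own test code, not MITgcm's; the program name, the unit number, and the fact that the group only holds debugMode are all my choices):

      PROGRAM RDEEDAT
C     Minimal check that the EEPARMS namelist in "eedata" parses:
C     read it the same way the excerpt above does, but report the
C     IOSTAT code instead of aborting.
      IMPLICIT NONE
      LOGICAL debugMode
      INTEGER errIO, iUnit
      NAMELIST /EEPARMS/ debugMode
      debugMode = .FALSE.
      iUnit = 11
      OPEN(UNIT=iUnit, FILE='eedata', STATUS='OLD', IOSTAT=errIO)
      IF ( errIO .NE. 0 ) STOP 'cannot open eedata'
      READ(UNIT=iUnit, NML=EEPARMS, IOSTAT=errIO)
      IF ( errIO .NE. 0 ) THEN
        WRITE(*,*) 'EEPARMS read failed, IOSTAT=', errIO
      ELSE
        WRITE(*,*) 'EEPARMS read OK, debugMode=', debugMode
      ENDIF
      CLOSE(iUnit)
      END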
What is weird to me is that the run works fine when the diagnostics package is turned off, but fails when it is turned on, with no errors that point to the package...
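For reference, by "turned on" I mean the standard runtime setup, i.e. useDiagnostics=.TRUE. in data.pkg together with a data.diagnostics along the following lines (the frequency, field, and file names here are placeholders rather than my actual configuration, which is in the repository linked in my earlier message below):

 &PACKAGES
 useDiagnostics=.TRUE.,
 &

 &DIAGNOSTICS_LIST
  frequency(1) = 86400.,
   fields(1,1) = 'THETA   ',
   filename(1) = 'diagsT',
 &

 &DIAG_STATIS_PARMS
 &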
Best,
Takaya
———————
PhD Candidate
Physical Oceanography
Columbia University in the City of New York
https://roxyboy.github.io/
> On Nov 19, 2018, at 5:28 AM, Martin Losch <Martin.Losch at awi.de> wrote:
>
> Hi Takaya,
>
> the job log says that the executable is looking for a file and cannot find it. And the traceback even gives a good clue as to which file it is:
> eeset_parms.F reads the file “eedata”. Are you sure that it is in your run directory?
>
> Martin
>
>> On 15. Nov 2018, at 23:48, Uchida Takaya <tu2140 at columbia.edu> wrote:
>>
>> Dear MITgcm support,
>>
>> I have a run on Cheyenne which runs fine with the diagnostics package turned off, but fails within seconds once I turn the package on, and it gives me no useful errors except for:
>>
>> MPT ERROR: MPI_COMM_WORLD rank 1589 has terminated without calling MPI_Finalize()
>> aborting job
>>
>> I use the same namelist files, which run fine on Columbia University's Habanero cluster, so my expectation was that everything should work here as well. I am a bit lost and would like to know whether others have had related issues running MITgcm on Cheyenne.
>>
>> The optfile I use (which, together with the namelist files, can also be found at https://github.com/roxyboy/ChannelMOC_Cheyenne/tree/master/SO_only-physics/channel_flat ) is:
>>
>> module load intel/17.0.1 mpt/2.15f netcdf/4.6.1
>>
>> FC=mpif90
>> CC=mpicc
>> F90C=mpif90
>>
>> DEFINES='-DALLOW_USE_MPI -DALWAYS_USE_MPI -DWORDLENGTH=4'
>> CPP='/lib/cpp -traditional -P'
>> EXTENDED_SRC_FLAG='-132'
>> OMPFLAG='-openmp'
>> CFLAGS='-fPIC'
>> LDADD='-shared-intel'
>>
>> LIBS="-L${MPI_ROOT}/lib"
>> INCLUDES="-I${MPI_ROOT}/include"
>> NOOPTFLAGS='-O0 -fPIC'
>>
>> #FFLAGS="-fPIC -convert big_endian -assume byterecl -align -xCORE-AVX2" # 4% slower with -O2
>> FFLAGS="-fPIC -convert big_endian -assume byterecl -align"
>> FDEBUG='-W0 -WB'
>> FFLAGS="$FDEBUG $FFLAGS"
>>
>> FOPTIM='-O3'
>> FOPTIM="$FOPTIM -ip -fp-model precise -traceback -ftz"
>>
>> Any advice would be appreciated.
>>
>> Thank you,
>> Takaya
>> ———————
>> PhD Candidate
>> Physical Oceanography
>> Columbia University in the City of New York
>> https://roxyboy.github.io/
>>