[MITgcm-support] cubeSphereExchange and MPI error?

Jean-Michel Campin jmc at ocean.mit.edu
Wed Jul 9 08:53:57 EDT 2014


Hi Anthony,

It is likely that something is not right in your main parameter file "data", probably in the PARM03 namelist, since that is the READ that fails.
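
As a rough way to narrow this down (a sketch only, not part of MITgcm): gfortran reports "Cannot match namelist object name" when an entry in the namelist being read is not a member of that namelist group. Listing what "data" actually sets in PARM03 and comparing it with the NAMELIST /PARM03/ declaration in ini_parms.F can therefore help. The helper name namelist_entries, the sample text, the misspelled entry deltaTT and the short "known" set below are all made up for illustration; the real list of allowed names has to come from your own copy of ini_parms.F.

```python
import re


def namelist_entries(text, group):
    """Return the parameter names assigned inside one namelist group.

    Assumes the usual layout of an MITgcm "data" file: a header line
    such as " &PARM03", one "name=value," entry per line, comment lines
    starting with '#', and a closing "&" or "/" line.
    """
    names = []
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if not inside:
            # look for the group header, ignoring case
            inside = line.split()[0].upper() == "&" + group.upper()
            continue
        if line.upper() in ("&", "/", "&END"):
            break  # end of the group
        # a name, an optional array index such as (1:15), then '='
        m = re.match(r"([A-Za-z_]\w*)\s*(?:\([^)]*\))?\s*=", line)
        if m:
            names.append(m.group(1))
    return names


# Illustrative input only; "deltaTT" is a deliberate misspelling.
sample = """\
# Time stepping parameters
 &PARM03
 nIter0=0,
 nTimeSteps=10,
 deltaT=1200.,
 deltaTT=1200.,
 &
"""

# Hand-typed subset for the example; take the full list from
# the NAMELIST /PARM03/ statement in ini_parms.F.
known = {"niter0", "ntimesteps", "deltat"}

unknown = [n for n in namelist_entries(sample, "PARM03")
           if n.lower() not in known]
print(unknown)  # ['deltaTT']
```

To use it on a real run directory, read the file with open("data").read() instead of the sample string. Namelist names are case-insensitive, which is why the comparison lowercases them.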

Cheers,
Jean-Michel

On Tue, Jul 08, 2014 at 03:13:08PM -0400, Anthony Coletti wrote:
> Hi Jean-Michel,
> 
> I realized that I made the mistake of not specifying -mods=../code in my batch script. That could be why none of my packages were installed, so you were right to point that out. Thank you. Unfortunately, another error has popped up and the model still crashes after ~6 seconds:
> 
> ModuleCmd_Switch.c(172):ERROR:152: Module 'PrgEnv-cray' is currently not loaded
> At line 2825 of file ini_parms.f (unit = 11, file = '/lustre/medusa/acoletti/gfortrantmpLb7lfy')
> Fortran runtime error: Cannot match namelist object name .
> At line 2825 of file ini_parms.f (unit = 11, file = '/lustre/medusa/acoletti/gfortrantmp0SUaGP')
> Fortran runtime error: Cannot match namelist object name .
> At line 2825 of file ini_parms.f (unit = 11, file = '/lustre/medusa/acoletti/gfortrantmpjDleuD')
> Fortran runtime error: Cannot match namelist object name .
> At line 2825 of file ini_parms.f (unit = 11, file = '/lustre/medusa/acoletti/gfortrantmp0mWCfy')
> 
> There are 20 (or so) more lines of this error each specifying a different gfortran tmp file.
> 
> Here is line 2825 of my ini_parms.f file, with the surrounding lines. I think it is having a problem with PARM03 in the 'data' file. I am not very familiar with Fortran, so I am not sure which namelist object it is referring to:
> 
> 2816 C--   Time stepping parameters
> 2817       rCD               = -1.D0
> 2818       epsAB_CD          = UNSET_RL
> 2819       latBandClimRelax  = UNSET_RL
> 2820       deltaTtracer      = 0.D0
> 2821       forcing_In_AB     = .TRUE.
> 2822       WRITE(msgBuf,'(A)') ' INI_PARMS ; starts to read PARM03'
> 2823       CALL PRINT_MESSAGE( msgBuf, standardMessageUnit,
> 2824      &                    SQUEEZE_RIGHT, myThid )
> 2825       READ(UNIT=iUnit,NML=PARM03) !,IOSTAT=errIO)
> 2826       IF ( errIO .LT. 0 ) THEN
> 2827        WRITE(msgBuf,'(A)')
> 2828      &  'S/R INI_PARMS: Error reading model parameter file "data"'
> 2829        CALL PRINT_ERROR( msgBuf, myThid )
> 2830        WRITE(msgBuf,'(A)') 'S/R INI_PARMS: Problem in namelist PARM03'
> 2831        CALL PRINT_ERROR( msgBuf, myThid )
> 2832        STOP 'ABNORMAL END: S/R INI_PARMS'
> 2833       ELSE
> 2834        WRITE(msgBuf,'(A)') ' INI_PARMS ; read PARM03 : OK'
> 2835        CALL PRINT_MESSAGE( msgBuf, standardMessageUnit,
> 2836      &                     SQUEEZE_RIGHT, myThid )
> 2837       ENDIF
> 
> Anthony
> 
> Anthony J. Coletti
> Climate System Research Center
> Department of Geosciences
> Morrill Building
> 611 N. Pleasant Street
> 233 Morrill Science Center
> University of Massachusetts-Amherst
> paleoclimate.org
> Email: ajcolett at geo.umass.edu
> http://blogs.umass.edu/ajcolett/
> http://necsc.umass.edu/people/anthony-coletti
> 
> “For me, I am driven by two main philosophies: know more today about the world than I knew yesterday and lessen the  suffering of others. You'd be surprised how far that gets you.” ― Neil deGrasse Tyson
> 
> 
> 
> 
> On Jul 8, 2014, at 8:45 AM, Anthony Coletti <ajcolett at geo.umass.edu> wrote:
> 
> > Hi Jean-Michel,
> > 
> > Let's start with the exch2 problem.
> > 
> > I checked PACKAGES_CONFIG.h and it seems you are right: for some reason, ALLOW_EXCH2 is undefined.
> > 
> > Here is a copy of my PACKAGES_CONFIG.h file:
> > 
> > /* Created by convert_cpp_cmd2defines with the following command line arguments:
> >  -bPACKAGES_CONFIG_H Disabled packages: -UALLOW_ADMTLM -UALLOW_AIM_V23 -UALLOW_ATM2D -UALLOW_ATM_COMMON
> >  -UALLOW_ATM_COMPON_INTERF -UALLOW_ATM_OCN_COUPLER -UALLOW_ATM_PHYS -UALLOW_AUTODIFF -UALLOW_BBL
> >  -UALLOW_BULK_FORCE -UALLOW_CAL -UALLOW_CD_CODE -UALLOW_CFC -UALLOW_CHEAPAML -UALLOW_CHRONOS
> >  -UALLOW_COMPON_COMMUNIC -UALLOW_COST -UALLOW_CTRL -UALLOW_DIAGNOSTICS -UALLOW_DIC -UALLOW_DOWN_SLOPE
> >  -UALLOW_EBM -UALLOW_ECCO -UALLOW_EMBED_FILES -UALLOW_EXCH2 -UALLOW_EXF -UALLOW_FIZHI -UALLOW_FLT
> >  -UALLOW_FRAZIL -UALLOW_GCHEM -UALLOW_GGL90 -UALLOW_GMREDI -UALLOW_GRDCHK -UALLOW_GRIDALT
> >  -UALLOW_ICEFRONT -UALLOW_KPP -UALLOW_LAND -UALLOW_LAYERS -UALLOW_LONGSTEP -UALLOW_MATRIX -UALLOW_MNC
> >  -UALLOW_MY82 -UALLOW_MYPACKAGE -UALLOW_OBCS -UALLOW_OCN_COMPON_INTERF -UALLOW_OFFLINE -UALLOW_OPENAD
> >  -UALLOW_OPPS -UALLOW_PP81 -UALLOW_PROFILES -UALLOW_PTRACERS -UALLOW_RBCS -UALLOW_REGRID
> >  -UALLOW_RUNCLOCK -UALLOW_SALT_PLUME -UALLOW_SBO -UALLOW_SEAICE -UALLOW_SHAP_FILT -UALLOW_SHELFICE
> >  -UALLOW_SHOWFLOPS -UALLOW_SMOOTH -UALLOW_SPHERE -UALLOW_STREAMICE -UALLOW_THSICE -UALLOW_TIMEAVE
> >  -UALLOW_ZONAL_FILT   Enabled packages: -DALLOW_DEBUG -DALLOW_GENERIC_ADVDIFF -DALLOW_MDSIO
> >  -DALLOW_MOM_COMMON -DALLOW_MOM_FLUXFORM -DALLOW_MOM_VECINV -DALLOW_MONITOR -DALLOW_RW
> > */
> > 
> > #ifndef PACKAGES_CONFIG_H
> > #define PACKAGES_CONFIG_H
> > /*  Disabled packages:  */
> > #undef  ALLOW_ADMTLM
> > #undef  ALLOW_AIM_V23
> > #undef  ALLOW_ATM2D
> > #undef  ALLOW_ATM_COMMON
> > #undef  ALLOW_ATM_COMPON_INTERF
> > #undef  ALLOW_ATM_OCN_COUPLER
> > #undef  ALLOW_ATM_PHYS
> > #undef  ALLOW_AUTODIFF
> > #undef  ALLOW_BBL
> > #undef  ALLOW_BULK_FORCE
> > #undef  ALLOW_CAL
> > #undef  ALLOW_CD_CODE
> > #undef  ALLOW_CFC
> > #undef  ALLOW_CHEAPAML
> > #undef  ALLOW_CHRONOS
> > #undef  ALLOW_COMPON_COMMUNIC
> > #undef  ALLOW_COST
> > #undef  ALLOW_CTRL
> > #undef  ALLOW_DIAGNOSTICS
> > #undef  ALLOW_DIC
> > #undef  ALLOW_DOWN_SLOPE
> > #undef  ALLOW_EBM
> > #undef  ALLOW_ECCO
> > #undef  ALLOW_EMBED_FILES
> > #undef  ALLOW_EXCH2
> > #undef  ALLOW_EXF
> > #undef  ALLOW_FIZHI
> > #undef  ALLOW_FLT
> > #undef  ALLOW_FRAZIL
> > #undef  ALLOW_GCHEM
> > #undef  ALLOW_GGL90
> > #undef  ALLOW_GMREDI
> > #undef  ALLOW_GRDCHK
> > #undef  ALLOW_GRIDALT
> > #undef  ALLOW_ICEFRONT
> > #undef  ALLOW_KPP
> > #undef  ALLOW_LAND
> > #undef  ALLOW_LAYERS
> > #undef  ALLOW_LONGSTEP
> > #undef  ALLOW_MATRIX
> > #undef  ALLOW_MNC
> > #undef  ALLOW_MY82
> > #undef  ALLOW_MYPACKAGE
> > #undef  ALLOW_OBCS
> > #undef  ALLOW_OCN_COMPON_INTERF
> > #undef  ALLOW_OFFLINE
> > #undef  ALLOW_OPENAD
> > #undef  ALLOW_OPPS
> > #undef  ALLOW_PP81
> > #undef  ALLOW_PROFILES
> > #undef  ALLOW_PTRACERS
> > #undef  ALLOW_RBCS
> > #undef  ALLOW_REGRID
> > #undef  ALLOW_RUNCLOCK
> > #undef  ALLOW_SALT_PLUME
> > #undef  ALLOW_SBO
> > #undef  ALLOW_SEAICE
> > #undef  ALLOW_SHAP_FILT
> > #undef  ALLOW_SHELFICE
> > #undef  ALLOW_SHOWFLOPS
> > #undef  ALLOW_SMOOTH
> > #undef  ALLOW_SPHERE
> > #undef  ALLOW_STREAMICE
> > #undef  ALLOW_THSICE
> > #undef  ALLOW_TIMEAVE
> > #undef  ALLOW_ZONAL_FILT
> > /*   */
> > /*  Enabled packages:  */
> > #define ALLOW_DEBUG
> > #define ALLOW_GENERIC_ADVDIFF
> > #define ALLOW_MDSIO
> > #define ALLOW_MOM_COMMON
> > #define ALLOW_MOM_FLUXFORM
> > #define ALLOW_MOM_VECINV
> > #define ALLOW_MONITOR
> > #define ALLOW_RW
> > #endif /* PACKAGES_CONFIG_H */
> > 
> > 
> > I wonder why that would be, considering that the packages I want to install are listed in my packages.config file.
> > 
> > This is the command I use for compiling the GCM which seems correct:
> > ../../../tools/genmake2 -optfile=../../../tools/build_options/linux_amd64_gfortran -mpi
> > 
> > 
> > And here are some lines from the genmake2 output:
> > 
> > ===  Processing options files and arguments  ===
> >   getting local config information:  none found
> > Warning: ROOTDIR was not specified ; try using a local copy of MITgcm found at "../../.."
> >   getting OPTFILE information:
> >     using OPTFILE="../../../tools/build_options/linux_amd64_gfortran"
> >   getting AD_OPTFILE information:
> >     using AD_OPTFILE="../../../tools/adjoint_options/adjoint_default"
> >   check makedepend (local: 0, system: 0, 0)
> >   Turning on MPI cpp macros
> > 
> > ===  Checking system libraries  ===
> >   Do we have the system() command using gfortran...  yes
> >   Do we have the fdate() command using gfortran...  yes
> >   Do we have the etime() command using gfortran...  no
> >   Can we call simple C routines (here, "cloc()") using gfortran...  yes
> >   Can we unlimit the stack size using gfortran...  yes
> >   Can we register a signal handler using gfortran...  yes
> >   Can we use stat() through C calls...  yes
> >   Can we create NetCDF-enabled binaries...  yes
> >   Can we create LAPACK-enabled binaries...  no
> >   Can we call FLUSH intrinsic subroutine...  yes
> > 
> > ===  Setting defaults  ===
> >   Adding MODS directories: 
> >   Making source files in eesupp from templates
> >   Making source files in pkg/exch2 from templates
> >   Making source files in pkg/regrid from templates
> > 
> > ===  Determining package settings  ===
> >   getting package dependency info from  ../../../pkg/pkg_depend
> >   getting package groups info from      ../../../pkg/pkg_groups
> >   checking list of packages to compile:
> >     before group expansion packages are: default_pkg_list
> >     replacing "default_pkg_list" with:  gfd
> >     replacing "gfd" with:  mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor
> >     after group expansion packages are:  mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor
> >   applying DISABLE settings
> >   applying ENABLE settings
> >     packages are:  debug generic_advdiff mdsio mom_common mom_fluxform mom_vecinv monitor rw
> >   applying package dependency rules
> >     packages are:  debug generic_advdiff mdsio mom_common mom_fluxform mom_vecinv monitor rw
> >   Adding STANDARDDIRS='eesupp model'
> >   Searching for *OPTIONS.h files in order to warn about the presence
> >     of "#define "-type statements that are no longer allowed:
> >     found CPP_EEOPTIONS="../../../eesupp/inc/CPP_EEOPTIONS.h"
> >     found CPP_OPTIONS="../../../model/inc/CPP_OPTIONS.h"
> >   Creating the list of files for the adjoint compiler.
> > 
> > ===  Creating the Makefile  ===
> >   setting INCLUDES
> >   Determining the list of source and include files
> >   Writing makefile: Makefile
> >   Add the source list for AD code generation
> >   Making list of "exceptions" that need ".p" files
> >   Making list of NOOPTFILES
> >   Add rules for links
> >   Adding makedepend marker
> > 
> > FYI: I am building a simulation using the 32x6x32x15 MPI run.
> > 
> > The makefile looks okay; it seems to be making the dependencies. I do get 2 or 3 sets of this warning:
> > 
> > gfortran -fconvert=big-endian -fimplicit-none -mcmodel=medium  -O0 -funroll-loops -c solve_tridiagonal.f
> > cat solve_uv_tridiago.F |  cpp -traditional -P -DWORDLENGTH=4 -DNML_TERMINATOR -DALLOW_USE_MPI -DALWAYS_USE_MPI -DALLOW_USE_MPI -DHAVE_SYSTEM -DHAVE_FDATE -DHAVE_CLOC -DHAVE_SETRLSTK -DHAVE_SIGREG -DHAVE_STAT -DHAVE_NETCDF -DHAVE_FLUSH  -I/opt/cray/netcdf-hdf5parallel/4.3.1/GNU/48/include -I/opt/cray/mpt/6.3.0/gni/mpich2-gnu/48/include | ../../../tools/set64bitConst.sh  > solve_uv_tridiago.f
> > gfortran -fconvert=big-endian -fimplicit-none -mcmodel=medium  -O0 -funroll-loops -c solve_uv_tridiago.f
> > solve_uv_tridiago.f:1076.72:
> > 
> >           DO bj=2,nSy                                                   
> >                                                                         1
> > Warning: DO loop at (1) will be executed zero times
> > solve_uv_tridiago.f:1103.24:
> > 
> >           DO bj=nSy-1,1,-1                                              
> >                         1
> > Warning: DO loop at (1) will be executed zero times
> > 
> > 
> > I am not sure if that means anything.
> > 
> > 
> > Attached is my ini_threading_environment.f (small f)
> > <ini_threading_environment.f>
> > 
> > 
> > Thanks for your fruitful insight!
> > Anthony
> > 
> > 
> > On Jul 7, 2014, at 11:30 PM, Jean-Michel Campin <jmc at ocean.mit.edu> wrote:
> > 
> >> PACKAGES_CONFIG.h
> > 
> > _______________________________________________
> > MITgcm-support mailing list
> > MITgcm-support at mitgcm.org
> > http://mitgcm.org/mailman/listinfo/mitgcm-support
> 
