[MITgcm-support] Run-time errors on Archer

Dan Jones dcjones.work at gmail.com
Tue Mar 11 10:48:25 EDT 2014


David - thanks very much for the GNU build options file!  It worked for me
as well.  My model is running and is producing reasonable-looking test
fields.

Martin - good suggestion, thanks!  I received a whole new crop of errors on
my last attempt to *compile* with Intel, so I think I will put Intel on the
back-burner for now.  I will revisit in the future once I have some
production runs in place.

Thanks again,
Dan


On Fri, Mar 7, 2014 at 4:12 PM, <mitgcm-support-request at mitgcm.org> wrote:

> Today's Topics:
>
>    1. Re: Run-time errors on Archer (Martin Losch)
>    2. Re: Run-time errors on Archer (David Ferreira)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 7 Mar 2014 14:52:59 +0100
> From: Martin Losch <Martin.Losch at awi.de>
> To: MITgcm Support <mitgcm-support at mitgcm.org>
> Subject: Re: [MITgcm-support] Run-time errors on Archer
> Message-ID: <91B8D918-54BC-40D1-9556-232F65038FEB at awi.de>
> Content-Type: text/plain; charset="windows-1252"
>
> Dan,
>
> maybe you can find out what's happening in your line 3331 in print.f (that
> should be within PRINT_LIST_I).
> There are a couple of internal writes of the type
>  WRITE(fmt1,'(A,I1,A)') 'some string',someInteger,'some string'
>
> For debugging I would identify the line, and print to screen whatever is
> written to the character variable. Maybe there is a type mismatch that
> hasn't been noticed before, because previous compilers have more mercy than
> yours? This is unlikely, but you never know. If this is not the problem,
> this debugging procedure may give you a hint of what may have gone wrong
> (maybe some type mismatch in the name list that wasn't caught, etc.).
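
A minimal sketch of the first step suggested here (finding what sits at line 3331 of print.f), assuming the preprocessed print.f is still in the build directory. The helper name show_context and the example path are hypothetical and not part of MITgcm.

```shell
#!/bin/bash
# show_context FILE LINE [RADIUS]
# Print the source lines around LINE, each prefixed with its line number.
# Hypothetical helper for illustration; not part of MITgcm.
show_context() {
  local file=$1 line=$2 radius=${3:-5}
  local start=$(( line - radius ))
  if [ "$start" -lt 1 ]; then start=1; fi
  local end=$(( line + radius ))
  # awk selects the window and prints "<line number>  <text>".
  awk -v s="$start" -v e="$end" \
    'NR >= s && NR <= e { printf "%6d  %s\n", NR, $0 }' "$file"
}

# Example call (the path is an assumption about the build layout):
#   show_context build/print.f 3331
```

Once the WRITE statement is located, the values passed to it can be printed to the screen just before it, as suggested above.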
>
> Martin
>
>
> On Mar 7, 2014, at 2:16 PM, Dan Jones <dcjones.work at gmail.com> wrote:
>
> > Correction:
> >
> > The snippet of source code listed above actually comes from yet *another*
> > error produced if the obcs package is included in packages.conf:
> >
> > forrtl: error (63): output conversion error, unit -5, file Internal Formatted Write
> > Image              PC                Routine            Line        Source
> > mitgcmuv           000000000302A3EA  Unknown               Unknown  Unknown
> > mitgcmuv           00000000023B5F85  print_list_i_            3331  print.f
> > mitgcmuv           0000000001A302E7  obcs_readparms_          2425  obcs_readparms.f
> > mitgcmuv           000000000297EBF0  packages_readparm        1907  packages_readparms.f
> > mitgcmuv           0000000002932168  initialise_fixed_        1874  initialise_fixed.f
> > mitgcmuv           0000000002B03175  the_model_main_          3052  the_model_main.f
> > mitgcmuv           00000000023A38F1  MAIN__                   4407  main.f
> > mitgcmuv           0000000000400F06  Unknown               Unknown  Unknown
> > mitgcmuv           00000000030A8124  Unknown               Unknown  Unknown
> > mitgcmuv           0000000000400DD1  Unknown               Unknown  Unknown
> >
> >       IF ( debugLevel.GE.debLevA ) THEN
> >         CALL PRINT_LIST_I( OB_Jnorth, 1, OBNS_Nx, INDEX_I,
> >     &                    .FALSE., .TRUE., standardMessageUnit )
> >
> > It does *not* come from ini_depths, but from obcs_readparms.  Sorry about
> > that.
> >
> > Dan
> >
> >
> > On Fri, Mar 7, 2014 at 1:12 PM, Dan Jones <dcjones.work at gmail.com> wrote:
> > Greetings:
> >
> > I am having trouble getting MITgcm to run on Archer.  I am using the Intel
> > compiler (14.0.1.106) with the following defines/flags in the build
> > options file:
> >
> > DEFINES='-DALLOW_USE_MPI -DALWAYS_USE_MPI -D_BYTESWAPIO -DWORDLENGTH=4 -DHAVE_FLUSH'
> > LIBS='-L${CRAY_MPICH2_DIR}/lib -L${HDF5_DIR}/lib -L$NETCDF_DIR/lib -lnetcdf -lnetcdff  -lhdf5 -lhdf5_hl'
> > INCLUDES='-I${CRAY_MPICH2_DIR}/include -I${HDF5_DIR}/include -I${NETCDF_DIR}/include -I${HDF5_INCLUDE_OPTS}'
> > FFLAGS='-h byteswapio -assume byterecl -convert big_endian -heap-arrays -O2 -g -traceback'
> >
> > The code compiles with no errors, but it does not run.  The code crashes
> > with the error:
> >
> > ABNORMAL END: S/R INI_THETA
> >
> > with no other information.  The initial theta file is fine and has been
> > used successfully in other MITgcm model setups.  When I turn on the
> > debugger (i.e. set debugMode=.TRUE. in input/eedata and set the
> > debugLevel=4 in input/data), I get a *different* error that appears to
> > occur in an *earlier* part of the code.  The code crashes as it tries to
> > read in the bathymetry file:
> >
> > forrtl: error (63): output conversion error, unit -5, file Internal Formatted Write
> >
> > Image       PC                Routine            Line  Source
> > mitgcmuv    00000000023CCCBC  print_maprs_       4981  print.f
> > mitgcmuv    000000000297FD81  plot_field_xyrs_   1841  plot_field.f
> > mitgcmuv    000000000276CDC2  ini_depths_        3271  ini_depths.f
> > mitgcmuv    0000000002932628  initialise_fixed_  1908  initialise_fixed.f
> > mitgcmuv    0000000002B03175  the_model_main     3052  the_model_main.f
> > mitgcmuv    00000000023A38F1  MAIN__             4407  main.f
> >
> > Again, the bathymetry file is fine and has been used successfully before.
> > The problem indicated above in ini_depths.f happens in this function:
> >
> >         CALL PRINT_LIST_I( OB_Jnorth, 1, OBNS_Nx, INDEX_I,
> >     &                    .FALSE., .TRUE., standardMessageUnit )
> >
> > I can suppress the output conversion error by re-compiling with a
> > -check nooutput_conversion flag, but the code quickly produces a
> > segmentation fault at about the same place (ini_depths.f and the functions
> > that call it):
> >
> > forrtl: severe (194): SIGSEGV, segmentation fault occurred
> > mitgcmuv           00000000006FFAE7  print_maprs_             4982  print.f
> > mitgcmuv           00000000007950D3  plot_field_xyrs_         1841  plot_field.f
> > mitgcmuv           0000000000760811  ini_depths_              3271  ini_depths.f
> > mitgcmuv           000000000078699A  initialise_fixed_        1908  initialise_fixed.f
> > mitgcmuv           00000000007AFA93  the_model_main_          3052  the_model_main.f
> > mitgcmuv           00000000006F5C41  MAIN__                   4407  main.f
> >
> > The fact that turning on the debugger produces an error *earlier* in the
> > code is the most interesting/distressing bit here.  Is this an I/O issue?
> > Has anyone else run into something like this?  I have contacted the Archer
> > support team, but I thought it would be worth asking around here as well.
> >
> > Many thanks,
> > Dan
> >
> > --
> > *************************************************
> >
> > Dr Dan Jones
> > Open Oceans Group
> > British Antarctic Survey
> > Cambridge, UK
> >
> > Phone: +44 (0)1223 221505
> > Fax: +44 (0)1223 362616
> >
> > *************************************************
> > _______________________________________________
> > MITgcm-support mailing list
> > MITgcm-support at mitgcm.org
> > http://mitgcm.org/mailman/listinfo/mitgcm-support
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 07 Mar 2014 16:12:41 +0000
> From: David Ferreira <dfer at mit.edu>
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] Run-time errors on Archer
> Message-ID: <5319EFF9.3060709 at mit.edu>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Hi,
> Just in case it helps with your problem (and in response to a previous
> e-mail asking about Archer): attached is an optfile for Archer with
> gfortran.
> (module swap PrgEnv-cray PrgEnv-gnu is required to compile/run.)
> I have run testreports successfully with it (with/without optimization,
> with/without MPI, netcdf, restart tests too).
> So it *seems* to work fine, but no warranty. Further testing would be
> useful.
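
A rough sketch of how the attached optfile might be used, written as a dry run that only prints the commands. The optfile name, the root directory and the ../code mods directory are assumptions about the local layout, and print_build_steps is a hypothetical helper; -mods, -optfile and -mpi are genmake2 options, with -mpi needed only for an MPI build.

```shell
#!/bin/bash
# print_build_steps OPTFILE [ROOTDIR]
# Print (do not run) the build steps for a given optfile.
# All paths here are assumptions about the local directory layout.
print_build_steps() {
  local optfile=$1 rootdir=${2:-../../..}
  # Switch to the GNU programming environment before compiling.
  echo "module swap PrgEnv-cray PrgEnv-gnu"
  # Generate the Makefile with the chosen optfile; drop -mpi for a serial build.
  echo "${rootdir}/tools/genmake2 -mods=../code -optfile=${optfile} -mpi"
  echo "make depend"
  echo "make"
}

# Hypothetical optfile name, for illustration only.
print_build_steps ./archer_gfortran_optfile
```

Printing the commands first makes it easy to check the paths before running anything on the login node.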
>
> I tried to make an optfile with the Cray compiler suite and Intel
> too, but it didn't work (some experiments broke with optimization,
> restart failed, ...). It was probably me doing something wrong but for
> some reason gfortran worked straight away.
>
> cheers,
> david
>
> PS: the comments in the optfile are irrelevant
>
>
> -------------- next part --------------
> #!/bin/bash
>
> #  $Header: /u/gcmpack/MITgcm/tools/build_options/linux_amd64_gfortran,v 1.23 2013/11/27 22:07:11 jmc Exp $
> #  $Name:  $
>
> # and on baudelaire.csail.mit.edu (FC13), using:
> #       export MPI_GCC_DIR=/srv/software/gcc/gcc-packages/gcc-4.4.5/mpich2/mpich2-1.3
> #       export MPI_INC_DIR=$MPI_GCC_DIR/include
> #       export PATH=$MPI_GCC_DIR/bin:$PATH
> #
> #-------
> # run with OpenMP: needs to set environment var. OMP_NUM_THREADS
> #    and generally, needs to increase the thread stack-size:
> #   -  sh,bash:
> #     > export OMP_NUM_THREADS=2
> #     > export GOMP_STACKSIZE=400m
> #   - csh,tcsh:
> #     > setenv OMP_NUM_THREADS 2
> #     > setenv GOMP_STACKSIZE 400m
> #-------
>
> CC=cc
> FC=ftn
> F90C=ftn
>
> DEFINES='-DWORDLENGTH=4 -DNML_TERMINATOR'
> EXTENDED_SRC_FLAG='-ffixed-line-length-132'
> F90FIXEDFORMAT='-ffixed-form'
> GET_FC_VERSION="--version"
> OMPFLAG='-fopenmp'
>
> NOOPTFLAGS='-O0'
> NOOPTFILES=''
>
> CFLAGS='-O0'
> #- Requires gfortran from 2006 onwards for -fconvert=big-endian
> FFLAGS="$FFLAGS -fconvert=big-endian -fimplicit-none"
> #- for big setups, compile & link with "-fPIC" or set memory-model to "medium":
> #CFLAGS="$CFLAGS -fPIC"
> #FFLAGS="$FFLAGS -fPIC"
> #-  with FC 19, need to use this without -fPIC (which cancels -mcmodel option):
>  CFLAGS="$CFLAGS -mcmodel=medium"
>  FFLAGS="$FFLAGS -mcmodel=medium"
> #- might want to use '-fdefault-real-8' for fizhi pkg:
> #FFLAGS="$FFLAGS -fdefault-real-8 -fdefault-double-8"
>
> INCLUDES='-I/opt/cray/mpt/6.1.1/gni/mpich2-gnu/48/include'
> LIBS='-L/opt/cray/mpt/6.1.1/gni/mpich2-gnu/48/lib'
>
> if test "x$IEEE" = x ; then     #- with optimisation:
>    #- can use -O2 (safe optimisation) to avoid Pb with some gcc version of -O3:
>     FOPTIM='-O3 -funroll-loops'
>     NOOPTFILES="$NOOPTFILES ini_masks_etc.F"
> else
>    # these may also be useful, but require specific gfortran versions:
>    # -Wno-tabs            for gfortran >= 4.3
>    #FFLAGS="$FFLAGS -Waliasing -Wampersand -Wsurprising -Wline-truncation"
>    #- or simply:
>     FFLAGS="$FFLAGS -Wall -Wno-unused-dummy-argument"
>    #- to get plenty of warnings: -Wall -Wextra (older form: -Wall -W) or:
>    #FFLAGS="$FFLAGS -Wconversion -Wimplicit-interface -Wunused-labels"
>   if test "x$DEVEL" = x ; then  #- no optimisation + IEEE :
>     FOPTIM='-O0'
>   else                          #- development/check options:
>     FOPTIM='-O0 -g -fbounds-check -ffpe-trap=invalid,zero,overflow -finit-real=inf'
>   fi
> fi
>
> F90FLAGS=$FFLAGS
> F90OPTIM=$FOPTIM
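
The optimisation branch of the optfile above can be read as a small decision table. The sketch below mirrors that logic in a standalone function so the three outcomes can be checked by hand. pick_foptim is a hypothetical name, and the assumption (to be checked against your genmake2 version) is that IEEE and DEVEL are set by genmake2's -ieee and -devel options.

```shell
#!/bin/bash
# pick_foptim IEEE DEVEL
# Echo the FOPTIM value the optfile above would choose for the given
# IEEE and DEVEL settings (an empty string means "not set").
# Hypothetical helper for illustration; the optfile sets FOPTIM directly.
pick_foptim() {
  local ieee=$1 devel=$2
  if test "x$ieee" = x ; then
    # IEEE not requested: full optimisation.
    echo "-O3 -funroll-loops"
  elif test "x$devel" = x ; then
    # IEEE requested, no development checks.
    echo "-O0"
  else
    # IEEE plus development/check options.
    echo "-O0 -g -fbounds-check -ffpe-trap=invalid,zero,overflow -finit-real=inf"
  fi
}

pick_foptim "" ""
pick_foptim true ""
pick_foptim true true
```

The development branch is the one most likely to help when chasing run-time errors like those above, since it enables bounds checking and floating-point traps.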
>
> ------------------------------
>
> End of MITgcm-support Digest, Vol 129, Issue 12
> ***********************************************
>



-- 
*************************************************

Dr Dan Jones
Open Oceans Group
British Antarctic Survey
Cambridge, UK

Phone: +44 (0)1223 221505
Fax: +44 (0)1223 362616

*************************************************