[MITgcm-support] Run-time errors on Archer
Dan Jones
dcjones.work at gmail.com
Tue Mar 11 10:48:25 EDT 2014
David - thanks very much for the Gnu build options file! It worked for me
as well. My model is running and is producing reasonable-looking test
fields.
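
For anyone reproducing this on Archer, the build steps with that optfile can be sketched roughly as below. The `module swap` line comes from David's message; the MITgcm checkout location, the optfile filename, and the `../code` mods directory are assumptions for illustration, and the function only prints the commands (a dry run) rather than running them.

```shell
#!/bin/bash
# Dry run: print the build steps for a given MITgcm root and optfile.
# Both paths are hypothetical; adjust them to your own setup.
mitgcm_build_steps() {
    local rootdir="$1"   # MITgcm checkout (assumed location)
    local optfile="$2"   # gfortran optfile from the list attachment, saved locally
    # Switch to the GNU programming environment, as David's message requires
    echo "module swap PrgEnv-cray PrgEnv-gnu"
    # Generate the Makefile with the optfile and MPI enabled
    echo "${rootdir}/tools/genmake2 -rootdir=${rootdir} -mods=../code -of=${optfile} -mpi"
    echo "make depend"
    echo "make"
}

mitgcm_build_steps "$HOME/MITgcm" "$HOME/archer_gfortran_optfile"
```

Running the printed commands from the experiment's build directory is the usual pattern; whether `-mpi` and the mods path fit a given setup depends on the experiment.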
Martin - good suggestion, thanks! I received a whole new crop of errors on
my last attempt to *compile* with Intel, so I think I will put Intel on the
back-burner for now. I will revisit in the future once I have some
production runs in place.
Thanks again,
Dan
On Fri, Mar 7, 2014 at 4:12 PM, <mitgcm-support-request at mitgcm.org> wrote:
> Today's Topics:
>
> 1. Re: Run-time errors on Archer (Martin Losch)
> 2. Re: Run-time errors on Archer (David Ferreira)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 7 Mar 2014 14:52:59 +0100
> From: Martin Losch <Martin.Losch at awi.de>
> To: MITgcm Support <mitgcm-support at mitgcm.org>
> Subject: Re: [MITgcm-support] Run-time errors on Archer
> Message-ID: <91B8D918-54BC-40D1-9556-232F65038FEB at awi.de>
> Content-Type: text/plain; charset="windows-1252"
>
> Dan,
>
> maybe you can find out what's happening in your line 3331 in print.f (that
> should be within PRINT_LIST_I).
> There are a couple of internal writes of the type
> WRITE(fmt1,'(A,I1,A)') 'some string',someInteger,'some string'
>
> For debugging I would identify the line, and print to screen whatever is
> written to the character variable. Maybe there is a type mismatch that
> hasn't been noticed before, because previous compilers have more mercy than
> yours? This is unlikely, but you never know. If this is not the problem,
> this debugging procedure may give you a hint of what may have gone wrong
> (maybe some type mismatch in the name list that wasn't caught, etc.).
>
> Martin
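
The first step of the debugging procedure above (identifying the line named in the traceback) can be sketched as a small helper. The traceback line numbers refer to the preprocessed `.f` file in the build directory, so that is the file to inspect; the function name `show_context` and the `build/print.f` path are assumptions for illustration.

```shell
#!/bin/bash
# Print numbered source lines around a line reported in a traceback,
# e.g. line 3331 of print.f. The third argument is an optional margin.
show_context() {
    local file="$1" line="$2" margin="${3:-5}"
    local start=$(( line - margin ))
    local end=$(( line + margin ))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    # awk prints each line number followed by the source text
    awk -v s="$start" -v e="$end" \
        'NR >= s && NR <= e { printf "%6d  %s\n", NR, $0 }' "$file"
}

# Example call (path is hypothetical):
# show_context build/print.f 3331
```

Once the internal WRITE is located, a temporary print statement for the format string and the integer being written would show whether the value fits the field width.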
>
>
> On Mar 7, 2014, at 2:16 PM, Dan Jones <dcjones.work at gmail.com> wrote:
>
> > Correction:
> >
> > The snippet of source code listed above actually comes from yet
> > *another* error produced if the obcs package is included in packages.conf:
> >
> > forrtl: error (63): output conversion error, unit -5, file Internal Formatted Write
> > Image     PC                Routine            Line     Source
> > mitgcmuv  000000000302A3EA  Unknown            Unknown  Unknown
> > mitgcmuv  00000000023B5F85  print_list_i_      3331     print.f
> > mitgcmuv  0000000001A302E7  obcs_readparms_    2425     obcs_readparms.f
> > mitgcmuv  000000000297EBF0  packages_readparm  1907     packages_readparms.f
> > mitgcmuv  0000000002932168  initialise_fixed_  1874     initialise_fixed.f
> > mitgcmuv  0000000002B03175  the_model_main_    3052     the_model_main.f
> > mitgcmuv  00000000023A38F1  MAIN__             4407     main.f
> > mitgcmuv  0000000000400F06  Unknown            Unknown  Unknown
> > mitgcmuv  00000000030A8124  Unknown            Unknown  Unknown
> > mitgcmuv  0000000000400DD1  Unknown            Unknown  Unknown
> >
> >       IF ( debugLevel.GE.debLevA ) THEN
> >         CALL PRINT_LIST_I( OB_Jnorth, 1, OBNS_Nx, INDEX_I,
> >      &                     .FALSE., .TRUE., standardMessageUnit )
> >
> > It does *not* come from ini_depths, but from obcs_readparms. Sorry
> > about that.
> >
> > Dan
> >
> >
> > On Fri, Mar 7, 2014 at 1:12 PM, Dan Jones <dcjones.work at gmail.com> wrote:
> > Greetings:
> >
> > I am having trouble getting MITgcm to run on Archer. I am using the
> > Intel compiler (14.0.1.106) with the following defines/flags in the build
> > options file:
> >
> > DEFINES='-DALLOW_USE_MPI -DALWAYS_USE_MPI -D_BYTESWAPIO -DWORDLENGTH=4 -DHAVE_FLUSH'
> > LIBS='-L${CRAY_MPICH2_DIR}/lib -L${HDF5_DIR}/lib -L$NETCDF_DIR/lib -lnetcdf -lnetcdff -lhdf5 -lhdf5_hl'
> > INCLUDES='-I${CRAY_MPICH2_DIR}/include -I${HDF5_DIR}/include -I${NETCDF_DIR}/include -I${HDF5_INCLUDE_OPTS}'
> > FFLAGS='-h byteswapio -assume byterecl -convert big_endian -heap-arrays -O2 -g -traceback'
> >
> > The code compiles with no errors, but it does not run. The code crashes
> > with the error:
> >
> > ABNORMAL END: S/R INI_THETA
> >
> > with no other information. The initial theta file is fine and has been
> > used successfully in other MITgcm model setups. When I turn on the
> > debugger (i.e. set debugMode=.TRUE. in input/eedata and set the
> > debugLevel=4 in input/data), I get a *different* error that appears to
> > occur in an *earlier* part of the code. The code crashes as it tries to
> > read in the bathymetry file:
> >
> > forrtl: error (63): output conversion error, unit -5, file Internal Formatted Write
> >
> > Image     PC                Routine            Line  Source
> > mitgcmuv  00000000023CCCBC  print_maprs_       4981  print.f
> > mitgcmuv  000000000297FD81  plot_field_xyrs_   1841  plot_field.f
> > mitgcmuv  000000000276CDC2  ini_depths_        3271  ini_depths.f
> > mitgcmuv  0000000002932628  initialise_fixed_  1908  initialise_fixed.f
> > mitgcmuv  0000000002B03175  the_model_main     3052  the_model_main.f
> > mitgcmuv  00000000023A38F1  MAIN__             4407  main.f
> >
> > Again, the bathymetry file is fine and has been used successfully
> > before. The problem indicated above in ini_depths.f happens in this
> > function:
> >
> >       CALL PRINT_LIST_I( OB_Jnorth, 1, OBNS_Nx, INDEX_I,
> >      &                   .FALSE., .TRUE., standardMessageUnit )
> >
> > I can suppress the output conversion error by re-compiling with a
> > -check nooutput_conversion flag, but the code quickly produces a
> > segmentation fault at about the same place (ini_depths.f and the
> > functions that call it):
> >
> > forrtl: severe (194): SIGSEGV, segmentation fault occurred
> > mitgcmuv  00000000006FFAE7  print_maprs_       4982  print.f
> > mitgcmuv  00000000007950D3  plot_field_xyrs_   1841  plot_field.f
> > mitgcmuv  0000000000760811  ini_depths_        3271  ini_depths.f
> > mitgcmuv  000000000078699A  initialise_fixed_  1908  initialise_fixed.f
> > mitgcmuv  00000000007AFA93  the_model_main_    3052  the_model_main.f
> > mitgcmuv  00000000006F5C41  MAIN__             4407  main.f
> >
> > The fact that turning on the debugger produces an error *earlier* in the
> > code is the most interesting/distressing bit here. Is this an I/O issue?
> > Has anyone else run into something like this? I have contacted the Archer
> > support team, but I thought it would be worth asking around here as well.
> >
> > Many thanks,
> > Dan
> >
> > --
> > *************************************************
> >
> > Dr Dan Jones
> > Open Oceans Group
> > British Antarctic Survey
> > Cambridge, UK
> >
> > Phone: +44 (0)1223 221505
> > Fax: +44 (0)1223 362616
> >
> > *************************************************
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 07 Mar 2014 16:12:41 +0000
> From: David Ferreira <dfer at mit.edu>
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] Run-time errors on Archer
> Message-ID: <5319EFF9.3060709 at mit.edu>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Hi,
> Just in case it helps with your problem (and in response to a previous
> e-mail asking about Archer): in attachment is an optfile for Archer with
> gfortran.
> (module swap PrgEnv-cray PrgEnv-gnu required to compile/run).
> I have run testreports successfully with it (with/without optimization,
> with/without MPI, netcdf, restart tests too).
> So it *seems* to work fine, but no warranty. Further testing would be
> useful.
>
> I tried to make an optfile with the Cray compiler suite and Intel
> too, but it didn't work (some experiments broke with optimization,
> restart failed, ...). It was probably me doing something wrong but for
> some reason gfortran worked straight away.
>
> cheers,
> david
>
> PS: the comments in the optfile are irrelevant
>
>
>
> -------------- next part --------------
> #!/bin/bash
>
> # $Header: /u/gcmpack/MITgcm/tools/build_options/linux_amd64_gfortran,v 1.23 2013/11/27 22:07:11 jmc Exp $
> # $Name: $
>
> # and on baudelaire.csail.mit.edu (FC13), using:
> # export MPI_GCC_DIR=/srv/software/gcc/gcc-packages/gcc-4.4.5/mpich2/mpich2-1.3
> # export MPI_INC_DIR=$MPI_GCC_DIR/include
> # export PATH=$MPI_GCC_DIR/bin:$PATH
> #
> #-------
> # run with OpenMP: needs to set environment var. OMP_NUM_THREADS
> # and generally, needs to increase the thread stack-size:
> # - sh,bash:
> # > export OMP_NUM_THREADS=2
> # > export GOMP_STACKSIZE=400m
> # - csh,tcsh:
> # > setenv OMP_NUM_THREADS 2
> # > setenv GOMP_STACKSIZE 400m
> #-------
>
> CC=cc
> FC=ftn
> F90C=ftn
>
> DEFINES='-DWORDLENGTH=4 -DNML_TERMINATOR'
> EXTENDED_SRC_FLAG='-ffixed-line-length-132'
> F90FIXEDFORMAT='-ffixed-form'
> GET_FC_VERSION="--version"
> OMPFLAG='-fopenmp'
>
> NOOPTFLAGS='-O0'
> NOOPTFILES=''
>
> CFLAGS='-O0'
> #- Requires gfortran from 2006 onwards for -fconvert=big-endian
> FFLAGS="$FFLAGS -fconvert=big-endian -fimplicit-none"
> #- for big setups, compile & link with "-fPIC" or set memory-model to "medium":
> #CFLAGS="$CFLAGS -fPIC"
> #FFLAGS="$FFLAGS -fPIC"
> #- with FC 19, need to use this without -fPIC (which cancels -mcmodel option):
> CFLAGS="$CFLAGS -mcmodel=medium"
> FFLAGS="$FFLAGS -mcmodel=medium"
> #- might want to use '-fdefault-real-8' for fizhi pkg:
> #FFLAGS="$FFLAGS -fdefault-real-8 -fdefault-double-8"
>
> INCLUDES='-I/opt/cray/mpt/6.1.1/gni/mpich2-gnu/48/include'
> LIBS='-L/opt/cray/mpt/6.1.1/gni/mpich2-gnu/48/lib'
>
> if test "x$IEEE" = x ; then #- with optimisation:
>     #- can use -O2 (safe optimisation) to avoid Pb with some gcc version of -O3:
>     FOPTIM='-O3 -funroll-loops'
>     NOOPTFILES="$NOOPTFILES ini_masks_etc.F"
> else
>     # these may also be useful, but require specific gfortran versions:
>     # -Wno-tabs for gfortran >= 4.3
>     #FFLAGS="$FFLAGS -Waliasing -Wampersand -Wsurprising -Wline-truncation"
>     #- or simply:
>     FFLAGS="$FFLAGS -Wall -Wno-unused-dummy-argument"
>     #- to get plenty of warnings: -Wall -Wextra (older form: -Wall -W) or:
>     #FFLAGS="$FFLAGS -Wconversion -Wimplicit-interface -Wunused-labels"
>     if test "x$DEVEL" = x ; then #- no optimisation + IEEE :
>         FOPTIM='-O0'
>     else #- development/check options:
>         FOPTIM='-O0 -g -fbounds-check -ffpe-trap=invalid,zero,overflow -finit-real=inf'
>     fi
> fi
>
> F90FLAGS=$FFLAGS
> F90OPTIM=$FOPTIM
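
The way the optfile above picks its optimisation flags from the `IEEE` and `DEVEL` variables can be tried in isolation with a sketch like the following. The function name `pick_foptim` is hypothetical, and it covers only the `FOPTIM` choice, not the `FFLAGS` or `NOOPTFILES` changes made in the same branches. As far as I know these variables are normally set by genmake2 (its `-ieee` and `-devel` options), but treat that as an assumption to check against your genmake2 version.

```shell
#!/bin/bash
# Reproduce the optfile's FOPTIM selection as a function of IEEE and DEVEL.
# An empty string stands for "variable not set".
pick_foptim() {
    local IEEE="$1" DEVEL="$2"
    if test "x$IEEE" = x ; then
        # optimised build
        echo '-O3 -funroll-loops'
    elif test "x$DEVEL" = x ; then
        # IEEE set, DEVEL unset: no optimisation
        echo '-O0'
    else
        # both set: development/check options
        echo '-O0 -g -fbounds-check -ffpe-trap=invalid,zero,overflow -finit-real=inf'
    fi
}

pick_foptim "" ""
pick_foptim true ""
pick_foptim true true
```

Note that `DEVEL` has no effect unless `IEEE` is also set, because the `DEVEL` test sits inside the `else` branch.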
>
> ------------------------------
>
> End of MITgcm-support Digest, Vol 129, Issue 12
> ***********************************************
>
--
*************************************************
Dr Dan Jones
Open Oceans Group
British Antarctic Survey
Cambridge, UK
Phone: +44 (0)1223 221505
Fax: +44 (0)1223 362616
*************************************************