[MITgcm-support] mpi problem after cvs update

m. r. schaferkotter schaferk at bellsouth.net
Sat Jan 23 14:01:14 EST 2010


thanks JM.

a) it was a clean make. i updated last night.

here is more info.

previous version:

(PID.TID 0000.0001) // MITgcmUV version:  checkpoint61v
(PID.TID 0000.0001) // Build user:        schaferk
(PID.TID 0000.0001) // Build host:        sapphire01
(PID.TID 0000.0001) // Build date:        Thu Jan 21 18:01:07 CST 2010
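
for comparing the old and new builds, the header lines above can be pulled out of each STDOUT.0000 with a few lines of python. this is only a sketch: the regex is inferred from the four lines shown here, and build_info is a made-up helper name, not part of MITgcm.

```python
import re

# Pattern inferred from the header lines quoted above (an assumption, not an
# official MITgcm format specification).
HEADER_RE = re.compile(
    r"^\(PID\.TID \d+\.\d+\) // "
    r"(MITgcmUV version|Build user|Build host|Build date):\s+(.*\S)\s*$"
)


def build_info(stdout_text):
    """Return a dict of the build header fields found in STDOUT text."""
    info = {}
    for line in stdout_text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            info[match.group(1)] = match.group(2)
    return info


# The header of the previous (working) build, copied from this message.
sample = """\
(PID.TID 0000.0001) // MITgcmUV version:  checkpoint61v
(PID.TID 0000.0001) // Build user:        schaferk
(PID.TID 0000.0001) // Build host:        sapphire01
(PID.TID 0000.0001) // Build date:        Thu Jan 21 18:01:07 CST 2010
"""

for key, value in build_info(sample).items():
    print(key, "=", value)
```

running the same function on the STDOUT.0000 of the new build would show which checkpoint it was built from.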

environment:

schaferk:sapphire01% uname -a
Linux sapphire01 2.6.16.54-0.2.12_1.0101.4789.0-ss #1 SMP Thu Nov 12 18:02:52 CST 2009 x86_64 x86_64 x86_64 GNU/Linux

build script:
linux_amd64_pgf90_sapphire

FC='ftn'
CC='cc'
CPP='cpp -P -traditional'

DEFINES='-DWORDLENGTH=4 -DNML_TERMINATOR -DALLOW_USE_MPI -DALWAYS_USE_MPI -DTARGET_CRAYXT'

INCLUDES="-I/opt/mpt/3.2.0/xt/mpich2-pgi64/include"

FFLAGS='-byteswapio -r8 -Mnodclchk -Mextend -fPIC'
FOPTIM='-O3 -fastsse -tp k8-64 -pc=64 -Msmart -Mipa=fast'

CFLAGS='-O3 -fastsse -fPIC'



packages:

data.pkg

# Packages
  &PACKAGES
  useOBCS=.TRUE.,
  useDiagnostics=.TRUE.,
  useMNC=.FALSE.,
  &

packages.conf

debug
generic_advdiff
kpp
mdsio
mom_fluxform
mom_vecinv
monitor
obcs
rw
timeave
cal
exf
diagnostics


comments:

nothing strange in the genmake_warnings other than remarks about  
netcdf includes, which i/m not using and did _not_ use in the previous  
build which runs.

this is the last part of the STDOUT.0000 file:
(PID.TID 0000.0001) // Model current state
(PID.TID 0000.0001) // =======================================================
(PID.TID 0000.0001)
(PID.TID 0000.0001) // =======================================================
(PID.TID 0000.0001) // Begin MONITOR dynamic field statistics
(PID.TID 0000.0001) // =======================================================
(PID.TID 0000.0001) %MON time_tsnumber                =                     0
(PID.TID 0000.0001) %MON time_secondsf                =   0.0000000000000E+00
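
the MPI_Allreduce line in the error stack quoted further down can be picked apart the same way. this sketch only extracts the fields that are printed in the message. the last field is a guess: it assumes the MPICH convention that builtin datatype handles carry the type size in bytes in bits 8-15, and i have not checked that this holds for the Cray MPT build used here. parse_allreduce is a made-up name.

```python
import re

# The failing call, copied from the error stack in this thread (line breaks
# from the mail wrapping removed).
ERROR_LINE = (
    "MPI_Allreduce(714).......: MPI_Allreduce(sbuf=0x7fffffffb28c, "
    "rbuf=0x7fffffffb2ec, count=1, dtype=0x4c00081b, MPI_SUM, "
    "MPI_COMM_WORLD) failed"
)

CALL_RE = re.compile(
    r"count=(\d+), dtype=(0x[0-9a-fA-F]+), (MPI_\w+), (MPI_\w+)\)"
)


def parse_allreduce(line):
    """Extract count, datatype handle, op and communicator from the line."""
    match = CALL_RE.search(line)
    if match is None:
        return None
    handle = int(match.group(2), 16)
    return {
        "count": int(match.group(1)),
        "dtype": handle,
        "op": match.group(3),
        "comm": match.group(4),
        # ASSUMPTION: MPICH-style builtin handle with the size in bytes
        # stored in bits 8-15; not verified for this MPI installation.
        "size_bytes_if_mpich_builtin": (handle >> 8) & 0xFF,
    }


print(parse_allreduce(ERROR_LINE))
```

if the assumption holds, the size field gives a hint about which fortran type the new build passes to the reduction, which can then be compared with what the old build passed.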


On Jan 23, 2010, at 11:49 AM, Jean-Michel Campin wrote:

> Hi Michael,
>
> One thing that could be useful to know is what was the version of
> the code you updated from (or when did you do the previous update).
> Otherwise, the code is tested every day, with and without MPI,
> and all the tests from last night went normally.
> This can be caused by:
> a) something in the build process. Did you try a full "make Clean"
> before making the new executable? And could you try to run one of
> the simple verification experiments with MPI?
> b) something in some pieces of code that we don't test (would need to know
> more about the type of set-up/packages/options you are using).
>
> Thanks,
> Jean-Michel
>
> On Sat, Jan 23, 2010 at 11:06:58AM -0600, m. r. schaferkotter wrote:
>> all;
>> i did cvs update yesterday (jan 22), and now i/m getting these error
>> messages after building and attempting to run my (moments earlier
>> successful) job.
>>
>> schaferk:sapphire01% more r.001.err
>> aborting job:
>> Fatal error in MPI_Allreduce: Invalid MPI_Op, error stack:
>> MPI_Allreduce(714).......: MPI_Allreduce(sbuf=0x7fffffffb28c,
>> rbuf=0x7fffffffb2ec, count=1, dtype=0x4c00081b, MPI_SUM,
>> MPI_COMM_WORLD) failed
>> MPIR_SUM_check_dtype(388): MPI_Op MPI_SUM operation not defined for this
>> datatype
>> aborting job:
>> Fatal error in MPI_Allreduce: Invalid MPI_Op, error stack:
>> MPI_Allreduce(714).......: MPI_Allreduce(sbuf=0x7fffffffb28c,
>> rbuf=0x7fffffffb2ec, count=1, dtype=0x4c00081b, MPI_SUM,
>> MPI_COMM_WORLD) failed
>>
>>
>> fortunately, i moved aside the old executable and the job runs with
>> that.
>>
>>
>> what/s up with this?
>>
>> michael schaferkotter
>>
>>
>> _______________________________________________
>> MITgcm-support mailing list
>> MITgcm-support at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-support



