[MITgcm-support] MPI on SGI
Ryan Abernathey
rpa at MIT.EDU
Mon Jun 13 14:52:17 EDT 2011
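There appears to be a line break between $CMD and your MITgcm executable (the last two lines of your job script). If so, the executable is not actually being launched through MPI: it starts as a single ordinary process, which would explain why INI_PROCS sees 1 process instead of 16.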
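
A minimal sketch of the difference, in case it helps. It is not your actual job script: `fake_mpiexec` is a hypothetical stand-in for the real launcher that only prints the arguments it receives, and `./mitgcmuv` and the hard-coded `NCPU=16` are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for mpiexec: prints its arguments instead of
# starting any processes, so the sketch runs without an MPI installation.
fake_mpiexec() { printf 'launcher args: %s\n' "$*"; }

MPIRUN=fake_mpiexec    # real script: MPIRUN=/usr/local/bin/mpiexec
NCPU=16                # real script: NCPU=`wc -l < $PBS_NODEFILE`
EXE=./mitgcmuv         # assumed executable name, for illustration only

CMD="$MPIRUN -n $NCPU"

# Broken form (shown as comments): the newline ends the launcher command,
# so the executable on the next line would run on its own, outside MPI.
#   $CMD
#   $EXE

# Intended form: launcher and executable on the same line, so the
# executable is passed to the launcher as an argument.
$CMD $EXE
```

With the real mpiexec in place of the stand-in, the last line is the one that should start all 16 processes.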
-R
On Jun 13, 2011, at 11:24, Nikki Lovenduski <uclanik at gmail.com> wrote:
> Hi Matt, Ryan, et al.,
>
> Thanks for your rapid responses.
>
> I'm compiling an older version of the model (checkpoint 58). I tried adding usingMPI =.TRUE., to eedata, but got the same result.
>
> Here's my optfile:
>
> ****************************
> #!/bin/bash
>
> FC=`which mpif77`
> CC=`which mpicc`
> DEFINES='-DALLOW_USE_MPI -DALWAYS_USE_MPI -D_BYTESWAPIO -DWORDLENGTH=4'
> CPP='cpp -traditional -P'
> MPI='true'
> INCLUDEDIRS="$MPIROOT/include $NETCDF/include"
> LIBDIRS=$MPIROOT/lib
> LIBS="-L/opt/torque/lib -L$MPIROOT/lib -L$NETCDF/lib -lnetcdf -lmpich"
> INCLUDES="-I$NETCDF/include"
> NOOPTFLAGS='-O0'
> S64='$(TOOLSDIR)/set64bitConst.sh'
> # For IEEE, use the "-ffloat-store" option
> if test "x$IEEE" = x ; then
> FFLAGS="-Wunused -Wuninitialized -DWORDLENGTH=4 -I$NETCDF/include"
> FOPTIM='-O3 -funroll-loops'
> else
> FFLAGS="-Wunused -ffloat-store -DWORDLENGTH=4 -I$NETCDF/include"
> FOPTIM='-O0 '
> fi
> ****************
>
> And here's my job submission script:
>
> ****************
> #!/bin/sh
> #PBS -l nodes=2:ppn=8
> #PBS -V
> #PBS -m ae
>
> NCPU=`wc -l < $PBS_NODEFILE`
> NNODES=`uniq $PBS_NODEFILE | wc -l`
>
> MPIRUN=/usr/local/bin/mpiexec
>
> CMD="$MPIRUN -n $NCPU"
>
> echo "--> Node file: " $PBS_NODEFILE
> echo "--> Running on nodes: " `uniq $PBS_NODEFILE`
> echo "--> Number of cpus: " $NCPU
> echo "--> Number of nodes: " $NNODES
> echo "--> Launch command: " $CMD
>
> cd test_16cpu_nl
> $CMD test_16cpu_nl/mitgcmuv_test_16cpu_nl
> *****************
>
> Any additional comments or suggestions would be most helpful!
>
> Thanks,
> Nikki
>
>
>
> ------------------------------------------------------
> Hi Nikki,
>
> not sure, but perhaps try adding
> usingMPI =.TRUE.,
> to eedata
>
>
> Though this error doesn't seem to be part of the new code -- what
> version are you using?
>
> -Matt
>
> -----------------------------------------------------
> Hi Nikki,
>
> It will be easier to figure out what's going on if you tell us which
> optfile you are using at build time (specified in your genmake2
> command) and also exactly how you are calling the executable at run
> time (your job submission script).
>
> -R
>
>
> On Jun 10, 2011, at 2:53 PM, Nikki Lovenduski wrote:
>
> > Hi all,
> >
> > I'm trying to run the sector model version of MITgcm on an SGI Altix
> > HPCC. The model runs fine on 1 processor. However, I am having
> > trouble running on 16 processors: The compilation works fine (I add
> > -mpi to my genmake2 command and specify nPx = 2 and nPy = 8 in
> > SIZE.h), but I get an error at execution in the routine INI_PROCS.
> >
> > *** ERROR *** S/R INI_PROCS: No. of processes not equal to
> > nPx*nPy 1 16
> >
> > This error message indicates that MPI assigns only 1 processor to my
> > run, whereas it should be running on nPx*nPy=16 processors.
> >
> > My job submission script specifies the correct number of processors
> > (-n 16).
> >
> > Any ideas what I'm doing wrong?
> >
> > Thanks,
> > Nikki
> >
> > ----------------------
> > Nikki Lovenduski
> > ATOC/INSTAAR
> > Univ. Colorado at Boulder
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support