[MITgcm-support] problem with LAM mpi + xlf
Dmitri Leonov
dleonov at ocean.washington.edu
Tue Dec 6 18:22:23 EST 2005
Hello all,
Again about darwinism and mpi:
I'm trying to use a dual G5 (2 CPUs on 1 node) with LAM 7.1.1 and IBM
xlf 8.1 (Xserve cluster under MacOS 10.3).
(This configuration is already used to run other models: POM and ROMS.)
The model either crashes (it suddenly starts to output 'NaN' values) or
reports an I/O error like this:
cg2d: Sum(rhs),rhsMax = 1.17979101234927E+02 5.263219991MPI_Recv:
message truncated: Input/output error (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Recv()
Rank (1, MPI_COMM_WORLD): - main()
With the non-mpi version and the same input, neither of the above happens.
Also, both with and without MPI, the output shows "usingMPI = F" (I don't
know whether that's normal).
Right now I'm using checkpoint57x_post.
In general, how sensitive is the model supposed to be to the number of
CPUs?
One of the examples is a modified version of the 'exp1' verification
experiment. The options and input files can be found at
http://orchard.ocean.washington.edu/dleonov/exp1mod.tgz
(120 KB)
Hopefully I'm doing something wrong.
Regards,
Dmitri