[MITgcm-devel] global_sum_ad.F
Constantinos Evangelinos
ce107 at ocean.mit.edu
Wed Dec 22 10:06:57 EST 2004
On Wednesday 22 December 2004 06:00, Martin Losch wrote:
> Hi,
> while looking at a different problem (adjoint of global_max is broken),
> I had a closer look at pkg/autodiff/global_sum_ad.F together with "our"
> MPI specialist. We have the feeling (and Patrick has confirmed this
> feeling) that the argument in MPI_Bcast shouldn't be myThid, but
> something like myProcessorId (myMPIid,myPid??). Otherwise threads and
> processors will be mixed. What's your opinion?
The issue may be a completely moot point, as the threaded code is broken (at
least my tests show that on an SGI IRIX box, both with SGI directives and
with OpenMP, the latter derived from an earlier OpenMP version of MITgcm that
never made it into the main tree). I've promised Chris to finish debugging the
threaded code (the problems I've seen so far are that some of the I/O has been
written outside of MASTER sections, so write conflicts arise).
However, the argument to MPI_Bcast in question should be a fixed integer and
not a variable such as the process ID. This argument is the root argument
(the rank of the broadcasting process) and must be the same on all processes
calling MPI_Bcast. As the code is run with one thread per process, myThid=0
always and the code works fine. A parallel-threaded version of it would also
work, as the call to MPI_Bcast is made within a MASTER section and thus
myThid=0 once again on all nodes. But this is just plain luck... If instead
of a MASTER section we were using an OpenMP SINGLE directive (which would
accomplish the same thing in this case), a different thread could call
MPI_Bcast on each process, the root values would no longer agree, and the
whole code would break down (the call would not even complete). Thus I
suggest replacing myThid with "0", and all should be fine.
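
To make the root-argument point concrete, here is a toy Python model (not
MPI, and not MITgcm code). The function simulate_bcast and the per-rank
thread ids are made up for illustration. Real MPI treats disagreeing roots as
an erroneous call that may simply never complete; this sketch raises an
exception instead so the mismatch is visible.

```python
def simulate_bcast(buffers, roots):
    """Toy model of a broadcast across len(buffers) ranks.

    buffers[r] is the local value on rank r; roots[r] is the root
    argument that rank r passed to the (hypothetical) broadcast.
    """
    if len(set(roots)) != 1:
        # Assumption of this sketch: report the mismatch instead of hanging.
        raise ValueError("ranks disagree on the root: %r" % sorted(set(roots)))
    root = roots[0]
    # Every rank ends up with the value held by the agreed root.
    return [buffers[root]] * len(buffers)


buffers = [10.0, 20.0, 30.0]

# Constant root, as suggested above: every rank passes 0.
constant_root = simulate_bcast(buffers, [0, 0, 0])

# MASTER section: the calling thread is always thread 0, so passing the
# thread id happens to give the same root on every rank.
master_root = simulate_bcast(buffers, [0, 0, 0])

# SINGLE directive: whichever thread arrives first makes the call, so the
# thread ids (hypothetical values here) can differ from rank to rank.
try:
    simulate_bcast(buffers, [0, 1, 0])
    single_ok = True
except ValueError:
    single_ok = False
```

The constant-root and MASTER cases behave identically only because the
thread id happens to equal the intended root; hard-coding the root removes
that dependence.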
Constantinos
--
Dr. Constantinos Evangelinos
Department of Earth, Atmospheric and Planetary Sciences
Massachusetts Institute of Technology