[MITgcm-devel] global_sum_ad.F
chris hill
cnh at mit.edu
Wed Dec 22 10:16:22 EST 2004
Martin,
The parallelism in max looks fine to me as is. The reason why it's in a
MASTER section is to ensure thread 0 does the BCAST. As Constantinos
points out, that is important.
If we do switch to use OpenMP SINGLE we will change it.
There is a subtle adjoint issue that I have never fully reconciled.
Since MAX(2,2,2) is somewhat arbitrary in which 2 it returns, in the
adjoint form it could notionally return a "different" 2 to the forward
run. As far as I know there is no code that would care which 2 it
gets, but in theory it's possible for this to cause a problem in reverse
computations (I think).
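
To make the tie concern concrete, here is a minimal Python sketch. It is
not the MITgcm code: max_forward, max_adjoint and the pick_last flag are
hypothetical names invented for this illustration. It assumes the adjoint
of max passes the output sensitivity to exactly one "winning" input, and
shows that two equally valid tie-breaking rules send that sensitivity to
different elements.

```python
def max_forward(values):
    """Return the maximum and the index of the element chosen as the winner."""
    winner = 0
    for i, v in enumerate(values):
        if v > values[winner]:  # strict '>' keeps the FIRST of equal maxima
            winner = i
    return values[winner], winner


def max_adjoint(values, out_bar, pick_last=False):
    """Route the output sensitivity out_bar to one input of max.

    pick_last is a hypothetical switch: when True, ties are resolved in
    favour of the LAST equal maximum instead of the first.
    """
    winner = 0
    for i, v in enumerate(values):
        if v > values[winner] or (pick_last and v == values[winner]):
            winner = i
    in_bar = [0.0] * len(values)
    in_bar[winner] = out_bar  # only the chosen winner receives sensitivity
    return in_bar
```

For the input [2.0, 2.0, 2.0] the forward value is 2.0 under either rule,
but the adjoint puts the sensitivity on the first element with
pick_last=False and on the last with pick_last=True. Without ties the two
rules agree.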
Chris
On Wed, 2004-12-22 at 10:06, Constantinos Evangelinos wrote:
> On Wednesday 22 December 2004 06:00, Martin Losch wrote:
>
> > Hi,
> > while looking at a different problem (adjoint of global_max is broken),
> > I had a closer look at pkg/autodiff/global_sum_ad.F together with "our"
> > MPI specialist. We have the feeling (and Patrick has confirmed this
> > feeling) that the argument in MPI_Bcast shouldn't be myThid, but
> > something like myProcessorId (myMPIid,myPid??). Otherwise threads and
> > processors will be mixed. What's your opinion?
>
> The issue may be a completely moot point as the threaded code is broken (at
> least my tests show that on an SGI IRIX box both with SGI directives as well
> as OpenMP (derived from an earlier OpenMP version of MITgcm that never made
> it into the main tree)). I've promised Chris to finish debugging the threaded
> code (the problems I've seen so far lie in that some of the I/O has been
> written outside of MASTER sections and write conflicts arise).
>
> However the argument to MPI_Bcast in question should be a fixed integer and
> not a variable such as the process ID. This argument is the root processor
> argument and should be the same on all processes calling MPI_Bcast. As the
> code is run with one thread per process, myThid=0 always and the code works
> fine. A parallel-threaded version of it would also work as the call to
> MPI_Bcast is done within a MASTER section and thus myThid=0 once again on all
> nodes. But this is just plain luck... If instead of a MASTER section we were
> using an OpenMP SINGLE directive (which would accomplish the same thing in
> this case) different threads would call MPI_Bcast on each process and the
> whole code would break down (the call would not even complete). Thus I
> suggest myThid is replaced by "0" and all should be fine.
>
> Constantinos
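
The root-argument rule described above can be illustrated with a toy model
in plain Python. This is only a sketch: it uses no MPI, and
simulated_bcast is a hypothetical helper invented for this illustration.
It assumes each process passes the id of whichever thread made the call as
the root, as the code does with myThid. Where a real MPI_Bcast with
mismatched roots would be erroneous and might never complete, the toy
raises an exception instead.

```python
def simulated_bcast(buffers, roots):
    """Toy stand-in for a broadcast over len(buffers) processes.

    buffers[r] is the value held by process r before the call.
    roots[r] is the root argument that process r passes to the call.
    A real broadcast requires every process to pass the same root.
    """
    if len(set(roots)) != 1:
        # Real MPI gives no such diagnostic; the call may simply hang.
        raise ValueError("processes disagree on the root: %r" % (roots,))
    root = roots[0]
    # Every process ends up with the root's value.
    return [buffers[root]] * len(buffers)
```

With a MASTER section the calling thread is 0 on every process, so the
roots are [0, 0, 0] and the call succeeds. With SINGLE any thread may make
the call, so the roots could be, say, [0, 1, 0], and the toy raises
ValueError. Passing the literal 0 removes the dependence on which thread
calls.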