[MITgcm-support] Coding help: random numerical errors while running in MPI
Martin Losch
Martin.Losch at awi.de
Thu Jun 5 02:47:17 EDT 2008
Christopher,
I did not have a close look at your routines, just lines 65-67. These
fall into a loop that I have had problems with on a vector machine.
Basically the expression
jG = mpi_myYGlobalLo(npe+1)-1+(bj-1)*sNy+j
global(iG,jG) = local(j,bi,bj)
broke during vectorization and I got segmentation faults, unless I
suppressed the vectorization. In my case, this is a compiler bug and
I am still waiting for an updated compiler, but since inserting a
write statement is exactly what suppresses vectorization/optimization
of this loop, it's my guess that you have an optimization issue as well.
Try reducing the optimization/vectorization level, either globally
via compiler flags (you can put this routine into the NOOPTFILES list
and only use lower optimization for those files), or via compiler
directives for this loop only (certainly more efficient).
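
To tell an optimization problem from an indexing mistake, it can help
to check the index arithmetic of the expression above in isolation. The
following is only a rough serial sketch in Python, not the actual
gather routine: the names gather_rows and local_by_proc, the 1-D layout
(one value per row j, as for a zonal average), and the example tile
sizes are assumptions made for illustration. Only the formula for jG
is taken from the expression quoted above.

```python
def global_j(y_global_lo, bj, sNy, j):
    """1-based global row index, mirroring
    jG = mpi_myYGlobalLo(npe+1)-1+(bj-1)*sNy+j from the Fortran loop."""
    return y_global_lo - 1 + (bj - 1) * sNy + j


def gather_rows(local_by_proc, y_global_lo, sNy, nSy):
    """Serially copy per-process, per-tile rows into one global list.

    local_by_proc[npe][bj-1][j-1] holds the value for row j of tile bj
    on process npe (hypothetical layout, for illustration only).
    """
    n_rows = len(local_by_proc) * nSy * sNy
    out = [None] * n_rows
    for npe, tiles in enumerate(local_by_proc):
        for bj in range(1, nSy + 1):
            for j in range(1, sNy + 1):
                jG = global_j(y_global_lo[npe], bj, sNy, j)
                out[jG - 1] = tiles[bj - 1][j - 1]  # shift to 0-based storage
    return out


# Assumed example: 2 processes, 2 tiles each in y, 3 rows per tile.
sNy, nSy = 3, 2
y_global_lo = [1, 7]  # first global row owned by each process
local_by_proc = [
    [[10, 11, 12], [13, 14, 15]],
    [[16, 17, 18], [19, 20, 21]],
]
result = gather_rows(local_by_proc, y_global_lo, sNy, nSy)
print(result)
```

If every global row is filled exactly once in a check like this, the
index mapping itself is consistent, which would point back towards the
optimization level rather than the arithmetic.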
Martin
On 5 Jun 2008, at 03:18, Christopher L. Wolfe wrote:
>
> Hi all,
>
> I'm trying to write a package for the MITgcm that does zonal
> averages along isopycnals. I've managed to get it working in a
> single tile configuration, but when I try to run it using multiple
> tiles using MPI, I occasionally (about 1 in 5 runs) get random
> errors in the output. These errors are on the order of 10% of the
> correct output values and they only seem to infest a few (randomly
> varying) grid points at a time. I mainly figured out how to do
> inter-tile communication by copying other routines in the MITgcm,
> so I may have made a mistake that would be obvious to someone more
> experienced with MPI. I was hoping someone might be able to help me
> find the problem.
>
> The only piece that makes explicit use of MPI is the output
> routine, which makes use of a modified version of gather_xy. (The
> averaged data don't fit onto the computational grid, so I couldn't
> figure out how to write the data with the standard IO routines.)
> These files are attached.
>
> The weird thing is that when the commented out write statement on
> lines 65--67 is uncommented, the code works fine.
>
> Thanks in advance for the help,
> Christopher
>
> <isoave_output.F>
>
> <isoave_gather.F>
>
> -----------------------------------------------------------
> Dr. Christopher L. Wolfe 858-534-4560
> Physical Oceanography Research Division OAR 357
> Scripps Institution of Oceanography, UCSD clwolfe at ucsd.edu
> -----------------------------------------------------------
>
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support