[MITgcm-support] Coding help: random numerical errors while running in MPI

Martin Losch Martin.Losch at awi.de
Thu Jun 5 02:47:17 EDT 2008


Christopher,

I did not have a close look at your routines, just lines 65-67. These  
fall into a loop that I have had problems with on a vector machine.  
Basically the expression
                   jG = mpi_myYGlobalLo(npe+1)-1+(bj-1)*sNy+j
                   global(iG,jG) = local(j,bi,bj)
broke during vectorization and I got segmentation faults unless I  
suppressed the vectorization. In my case, this is a compiler bug and  
I am still waiting for an updated compiler, but since inserting a  
write statement is exactly what suppresses vectorization/optimization  
of this loop, my guess is that you have an optimization issue as well.
Try reducing the optimization/vectorization level, either globally  
via compiler flags (you can put this routine into the NOOPTFILES list  
and use lower optimization only for those files), or via compiler  
directives for this loop only (certainly more efficient).
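
To illustrate the per-loop directive option, here is a minimal sketch, not  
the actual gather routine. It is a single-process stand-in with no MPI, and  
the tile sizes, the offsets myXGlobalLo/myYGlobalLo and the array shapes  
are made up for the example. The directive spelling is an assumption too:  
`!DIR$ NOVECTOR` is the Intel-style form, and other compilers have their  
own spelling or simply treat the line as a comment.

```fortran
! Sketch only: single-process stand-in for the local-to-global copy.
! Sizes, offsets and array shapes are hypothetical, not MITgcm's.
program gather_sketch
  implicit none
  integer, parameter :: sNx = 4, sNy = 3, nSx = 2, nSy = 2
  integer, parameter :: Nx = sNx*nSx, Ny = sNy*nSy
  integer, parameter :: myXGlobalLo = 1, myYGlobalLo = 1
  real(8) :: local(sNx, sNy, nSx, nSy), global(Nx, Ny)
  integer :: i, j, bi, bj, iG, jG

  ! Fill each tile point with a value that encodes its global index.
  do bj = 1, nSy
    do bi = 1, nSx
      do j = 1, sNy
        do i = 1, sNx
          local(i,j,bi,bj) = real((myXGlobalLo-1+(bi-1)*sNx+i) &
               + 100*(myYGlobalLo-1+(bj-1)*sNy+j), 8)
        end do
      end do
    end do
  end do
  global = -1.0d0

  ! Copy the tiles into the global array.
  do bj = 1, nSy
    do bi = 1, nSx
      do j = 1, sNy
        jG = myYGlobalLo-1+(bj-1)*sNy+j
        ! Assumed Intel-style directive: do not vectorize the next loop.
        !DIR$ NOVECTOR
        do i = 1, sNx
          iG = myXGlobalLo-1+(bi-1)*sNx+i
          global(iG,jG) = local(i,j,bi,bj)
        end do
      end do
    end do
  end do

  ! Every global point must hold the value that encodes its own index.
  do jG = 1, Ny
    do iG = 1, Nx
      if (global(iG,jG) /= real(iG + 100*jG, 8)) stop 1
    end do
  end do
  print *, 'gather sketch OK'
end program gather_sketch
```

If the random errors disappear with the directive in place (or with the  
whole file compiled at a lower optimization level), that points to the  
optimizer rather than to the MPI logic.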

Martin


On 5 Jun 2008, at 03:18, Christopher L. Wolfe wrote:

>
> Hi all,
>
> I'm trying to write a package for the MITgcm that does zonal  
> averages along isopycnals. I've managed to get it working in a  
> single-tile configuration, but when I try to run it with multiple  
> tiles under MPI, I occasionally (about 1 in 5 runs) get random  
> errors in the output. These errors are on the order of 10% of the  
> correct output values and they only seem to infest a few (randomly  
> varying) grid points at a time. I mainly figured out how to do  
> inter-tile communication by copying other routines in the MITgcm,  
> so I may have made a mistake that would be obvious to someone more  
> experienced with MPI. I was hoping someone might be able to help me  
> find the problem.
>
> The only piece that makes explicit use of MPI is the output  
> routine, which makes use of a modified version of gather_xy. (The  
> averaged data don't fit onto the computational grid, so I couldn't  
> figure out how to write the data with the standard IO routines.)  
> These files are attached.
>
> The weird thing is that when the commented-out write statement on  
> lines 65--67 is uncommented, the code works fine.
>
> Thanks in advance for the help,
> Christopher
>
> <isoave_output.F>
>
> <isoave_gather.F>
>
> -----------------------------------------------------------
> Dr. Christopher L. Wolfe                   858-534-4560
> Physical Oceanography Research Division    OAR 357
> Scripps Institution of Oceanography, UCSD  clwolfe at ucsd.edu
> -----------------------------------------------------------
>
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support
