[MITgcm-support] cg convergence vs processor count

Tue Jul 26 16:06:23 EDT 2005

On Tuesday 26 July 2005 14:34, Jason Goodman wrote:

> I'm working with a model of nonhydrostatic hydrothermal plume
> convection, and have noticed that the conjugate gradient parts of the
> code converge differently if the processor count changes.  For
> example, if I set up a 16-processor run in SIZE.h:
[snip]
> The 24-processor run eventually blows up, apparently due to cg2d
> convergence failure.
>
> My understanding was that the numerical solution should be
> independent of the underlying processor layout.  Is this correct?  

Alas, the non-associative nature of floating point arithmetic means that the 
numerical solution does depend on the order in which reductions (such as the 
residual calculation in CG) are evaluated, and that order changes with the 
processor layout. However, it is rather unusual for these differences to lead 
to such a large difference in solver behaviour (non-convergence vs. 
convergence). This is rather worrying and may indicate some other underlying 
problem with your setup. Large differences in summation results with differing 
reduction order tend to occur when values in significantly different exponent 
ranges are added together. 
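
To illustrate the effect, here is a minimal sketch in Python (not MITgcm 
code). The helper names naive_sum and blocked_sum and the four sample values 
are hypothetical, chosen only to mimic a global sum that is split into 
per-processor partial sums. It assumes IEEE 754 double precision.

```python
import math

def naive_sum(xs):
    # Plain left-to-right accumulation, as a simple summation loop would do.
    total = 0.0
    for x in xs:
        total += x
    return total

def blocked_sum(values, nblocks):
    # Hypothetical stand-in for a parallel reduction: each "processor" sums
    # its own contiguous block, then the partial sums are added together.
    size = -(-len(values) // nblocks)  # ceiling division
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)

# Values in very different exponent ranges; the exact sum is 2.0.
values = [1.0e16, 1.0, -1.0e16, 1.0]

one_block = blocked_sum(values, 1)   # one "processor"
two_blocks = blocked_sum(values, 2)  # two "processors"
exact = math.fsum(values)            # correctly rounded reference sum

print(one_block, two_blocks, exact)
```

With these values the one-block sum gives 1.0, the two-block sum gives 0.0, 
and the correctly rounded result is 2.0, so the same data summed in a 
different order yields a different answer. When the summands are of similar 
magnitude the discrepancy is normally confined to the last few bits.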

> If
> not, what do I need to change to get the 24-processor run working?
> If so, what the heck is going on?

Constantinos
-- 
Dr. Constantinos Evangelinos
Department of Earth, Atmospheric and Planetary Sciences
Massachusetts Institute of Technology