[MITgcm-support] cg convergence vs processor count
Ed Hill
ed at eh3.com
Wed Jul 27 12:54:49 EDT 2005
On Wed, 2005-07-27 at 11:36 -0400, Jason Goodman wrote:
> > Alas, the non-associative nature of floating-point arithmetic
> > ensures that the numerical solution actually depends on the order
> > in which reductions (such as the residual calculation in CG) are
> > evaluated. However, it is rather unusual for these differences to
> > lead to such huge differences in solver behaviour (non-convergence
> > vs. convergence). This is rather worrying and may be indicative of
> > some other underlying problem with your setup. Large differences
> > in summation results with differing reduction order tend to occur
> > when values in significantly different exponent ranges are added
> > together.
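(To make the reduction-order point concrete, here is a tiny
self-contained Python sketch, just an illustration and not MITgcm
code: summing the same four values in two different orders gives two
different answers once the exponents differ widely.)

    # Floating-point addition is not associative, so the order of a
    # (parallel) reduction changes the result; mixed exponent ranges
    # make the effect large.
    vals = [1.0e16, 1.0, -1.0e16, 1.0]

    left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]
    reordered = (vals[0] + vals[2]) + (vals[1] + vals[3])

    print(left_to_right, reordered)   # prints 1.0 2.0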
>
> Your point about floating-point errors is a good one, but I agree
> that the very big difference is odd.
>
> I've tried running the exp5 verification (rotating convection from
> widespread surface buoyancy loss) with varying numbers of
> processors, and don't have this problem, so I suspect there's a
> problem with my experimental setup rather than my hardware. I also
> notice that convergence seems to be slower for the point-source
> problem than a broad source with a similar model domain.
>
> I'm doing a point-source convection problem with a rather dense grid
> (200x200 horizontal, 133 vertical); the surface buoyancy forcing is
> isolated at a single gridpoint. Could the fact that most of the
> domain is "boring", in that initially only one point in a million has
> anything going on, cause problems with solving the pressure field?
>
> If so, is there any way to apply a preconditioner or some sort of
> weighting to encourage the CG algorithm to focus its effort on the
> buoyancy source, where the action is? My long-term goal is to use a
> grid with narrow spacing near the source and wider spacing farther
> away, but I'm having blowup problems with that so I'm trying to get
> the evenly-spaced grid working first.
Hi Jason,
For elliptic problems, conjugate-gradient methods usually do well in
the neighborhood of a discrete source, since they tend to quickly
correct shorter-wavelength errors. What can take a long time is
(perhaps surprisingly) the removal of longer-wavelength errors,
particularly those that approach the size of the problem domain: an
unpreconditioned CG iteration propagates information only about one
grid point per step through the nearest-neighbor stencil, so
domain-scale corrections need many iterations. Thus, it's entirely
possible that your "large boring area" is where the problem is slowly
converging. For a more rigorous discussion of this topic, please see
the many multi-grid references available at, for instance:
http://www.mgnet.org/mgnet-books-wesseling.html
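To see this behavior outside the model, here is a small sketch in
plain numpy (not the MITgcm solver, just an illustration):
unpreconditioned CG applied to a 1-D Poisson problem with a single
point source, analogous to your isolated forcing. The iteration count
grows with the grid size, which is the signature of the slowly
converging, domain-scale part of the error.

    import numpy as np

    def cg(A_mul, b, tol=1.0e-10, maxit=10000):
        # textbook conjugate-gradient iteration
        x = np.zeros_like(b)
        r = b - A_mul(x)
        p = r.copy()
        rs = r @ r
        for k in range(maxit):
            Ap = A_mul(p)
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                return x, k + 1
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x, maxit

    def laplace_1d(x):
        # -d2/dx2 with homogeneous Dirichlet boundaries, unit spacing
        y = 2.0 * x
        y[:-1] -= x[1:]
        y[1:] -= x[:-1]
        return y

    for n in (100, 200, 400):
        b = np.zeros(n)
        b[n // 2] = 1.0   # point source at one grid point
        _, iters = cg(laplace_1d, b)
        print(n, iters)

A multi-grid cycle attacks exactly those smooth, domain-scale error
components on coarser grids, which is why it is the standard cure.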
Chris Hill explained to me that some previous MITgcm versions had a
per-tile multi-grid conjugate-gradient pre-conditioner that, while it
(sometimes?) sped up convergence, also created or accentuated slight
discontinuities at the tile edges. Because of those annoying
side-effects, that implementation is no longer used.
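Roughly, the flavor of that idea can be written down as a
block-Jacobi-style preconditioner (only a sketch of the concept, not
the actual MITgcm code, and it uses a direct per-block solve where
the old implementation used multi-grid): each tile's sub-problem is
solved independently, and the couplings that cross tile edges are
simply dropped, which is exactly where slight edge discontinuities
can come from.

    import numpy as np

    def per_tile_preconditioner(A, tile_slices):
        # Invert each tile's diagonal block of the dense matrix A on
        # its own; the off-diagonal blocks coupling neighboring tiles
        # are ignored.
        inv_blocks = [np.linalg.inv(A[s, s]) for s in tile_slices]
        def apply(r):
            z = np.zeros_like(r)
            for s, Ainv in zip(tile_slices, inv_blocks):
                z[s] = Ainv @ r[s]   # per-tile solve, no halo exchange
            return z
        return apply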
I suspect that there are better parallel multi-grid/multi-level solvers
that can speed up MITgcm's elliptic problem. Someone just needs to
develop and test them. And, I suspect that such an approach could help
in many situations, including yours. But that's only intuition.
This would be a fun topic for a paper. Is anyone interested?
Ed
--
Edward H. Hill III, PhD
office: MIT Dept. of EAPS; Rm 54-1424; 77 Massachusetts Ave.
Cambridge, MA 02139-4307
emails: eh3 at mit.edu ed at eh3.com
URLs: http://web.mit.edu/eh3/ http://eh3.com/
phone: 617-253-0098
fax: 617-253-4464