[MITgcm-support] Single reduction CG solver
Christopher L. Wolfe
clwolfe at ucsd.edu
Thu Nov 19 12:49:35 EST 2009
I imagine the algorithm would work just as well for the 3D problem,
though the point where the single-reduction method becomes more
efficient than the standard method might be different. My impression
is that the 3D solver does a lot more work per reduction, so its time
to solution might not be as dominated by reduction overhead.
The two cg2d's give the same results per iteration to within round-off
error. My benchmarking code writes in single precision and the
difference between the two methods is always single-precision zero,
even after thousands of iterations. I ran it through some of the
verification tests using testreport and all the ones that passed with
the "vanilla" CG routine also passed with the single-reduce routine.
(Oddly, several of the verification tests fail even if I try to run
them with a fresh out-of-the-box MITgcm. Perhaps I don't have my
compiler configured correctly.)
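
To illustrate what "the same results per iteration to within round-off"
means, here is a minimal serial sketch in Python. It is not the cg2d_sr
routine: it uses the Chronopoulos-Gear style rearrangement, one well-known
way to fuse the two inner products of a CG iteration into a single
reduction, and it may differ in detail from the d'Azevedo, Eijkhout, and
Romine variant. The operator tridiag(-1, 3, -1), the right-hand side, and
all function names are assumptions chosen for illustration; each counted
"reduction" stands in for one MPI_Allreduce.

```python
# Illustrative sketch only; names and test problem are hypothetical.

def matvec(v):
    """Apply the SPD operator tridiag(-1, 3, -1) (assumed test operator)."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(3.0 * v[i] - left - right)
    return out


def dot(u, v):
    """Local inner product; in parallel this would need a global sum."""
    return sum(a * b for a, b in zip(u, v))


def cg_standard(b, iters):
    """Textbook CG: two separate reductions per iteration."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    gamma = dot(r, r)
    reductions = 1
    for _ in range(iters):
        s = matvec(p)
        alpha = gamma / dot(p, s)            # reduction 1
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        gamma_new = dot(r, r)                # reduction 2
        reductions += 2
        beta = gamma_new / gamma
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        gamma = gamma_new
    return x, reductions


def cg_single_reduction(b, iters):
    """Rearranged CG: both scalars come from one fused reduction."""
    n = len(b)
    x = [0.0] * n
    r = list(b)
    w = matvec(r)                            # w = A r
    gamma, delta = dot(r, r), dot(w, r)      # one fused reduction
    reductions = 1
    alpha, beta = gamma / delta, 0.0
    p = [0.0] * n
    s = [0.0] * n                            # s tracks A p by recurrence
    for _ in range(iters):
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        s = [wi + beta * si for wi, si in zip(w, s)]
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        w = matvec(r)
        gamma_new, delta = dot(r, r), dot(w, r)   # one fused reduction
        reductions += 1
        beta = gamma_new / gamma
        alpha = gamma_new / (delta - beta * gamma_new / alpha)
        gamma = gamma_new
    return x, reductions


b = [1.0 + 0.25 * (i % 5) for i in range(64)]
x_std, n_std = cg_standard(b, 15)
x_sr, n_sr = cg_single_reduction(b, 15)
max_diff = max(abs(a - c) for a, c in zip(x_std, x_sr))
print("max difference:", max_diff, "reductions:", n_std, "vs", n_sr)
```

On this small, well-conditioned problem the two solution vectors should
agree to roughly double-precision round-off, while the rearranged loop
performs one reduction per iteration instead of two.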
On Nov 19, 2009, at 7:07 AM, Martin Losch wrote:
> Hi Christopher,
> that sounds very good. Do you think that the effect will be similar
> for cg3d?
> Are the results between the two cg2d's different? If so, by how much?
> On Nov 19, 2009, at 1:45 AM, Christopher L. Wolfe wrote:
>> Hi all,
>> I've written an implementation of d'Azevedo, Eijkhout, and Romine's
>> (1999) single reduction CG solver for the MITgcm. This method uses a
>> rearrangement of the standard conjugate gradient method so that the
>> required scalars can be determined by a single call to
>> MPI_Allreduce. For problems running on a large number of processors,
>> the decreased MPI overhead can significantly increase the efficiency
>> of the conjugate gradient solver.
>> The attached figure shows the scaling of the 2D CG solver for a
>> fixed problem size of 1024x1024 as the number of processors increases
>> on the Cray XT4 "Franklin." For fewer than 2^6 processors, the
>> original cg2d solver is slightly more efficient than the single
>> reduction solver since the latter requires slightly more matrix-
>> vector multiplications. However, for processor counts over 2^7, the
>> single reduction solver's performance is significantly better than
>> the original cg2d.
>> I've attached a tarball of the new CG routine cg2d_sr plus the
>> modified files CG2D.h, ini_cg2d.F, and solve_for_pressure.F. To use
>> the single reduce solver, simply drop these files into an experiment
>> directory and compile with "ALLOW_CG2D_SR" defined. I've also
>> included a benchmarking suite, which can also be compiled like a
>> standard MITgcm experiment.
>> I've only implemented the single reduce solver for the 2D case since
>> I just use the hydrostatic model, but the 3D implementation should
>> be straightforward. A paper deriving the single reduction method can
>> be found at http://www.netlib.org/lapack/lawnspdf/lawn56.pdf
>> Feel free to contact me with any questions or comments.
>> Dr. Christopher L. Wolfe 858-534-4560
>> Climate, Atmospheric Science, and Physical Oceanography
>> Scripps Institution of Oceanography, UCSD clwolfe at ucsd.edu
>> MITgcm-support mailing list
>> MITgcm-support at mitgcm.org