[MITgcm-support] Single reduction CG solver

Sun Nov 29 06:05:04 EST 2009

Hi Christopher,

there is already a cg2d_bench in the MITgcm_contrib area, so I put  
your benchmark into my directory MITgcm_contrib/mlosch/cg2d_sr_bench  
for now.

As I said before, I don't know how long it will take for me/us to
tackle the cg3d solver. Currently cg3d is the bottleneck in our
DNS-type simulations, so we take any speedup that we can get. On the
other hand, many of these DNS simulations are carried out on vector
computers, so that the number of CPUs is usually small and the sr-code
will not help. We'll see how our computer resource situation evolves.

Anyhow, the code is in the repository and in the testing scheme
(verification/global_with_exf/input.yearly), and the benchmark is also
checked in. Thanks again for your contribution.

Martin

On Nov 28, 2009, at 5:27 PM, Christopher L.P. Wolfe wrote:

>
> Hi Martin,
>
> Please do put the benchmark code in the contrib area.
>
> I'll be interested to see how the 3D solver does.
>
> Cheers,
> Christopher
>
> On Nov 23, 2009, at 9:43 AM, Martin Losch wrote:
>
>> Christopher,
>> I have added your code (with minor modifications) to the repository.
>> I'll try to do the same for cg3d some time soon.
>>
>> We could also put your benchmark experiment into the contrib area,
>> what do you think?
>>
>> Martin
>>
>> On Nov 19, 2009, at 6:49 PM, Christopher L. Wolfe wrote:
>>
>>>
>>> Hi Martin,
>>>
>>> I imagine the algorithm would work just as well for the 3D problem,
>>> though the point where the single-reduction method becomes more
>>> efficient than the standard method might be different. My impression
>>> is that the 3D solver does a lot more work per reduction, so its time
>>> to solution might not be as dominated by reduction overhead.
>>>
>>> The two cg2d's give the same results per iteration to within
>>> round-off error. My benchmarking code writes in single precision and
>>> the difference between the two methods is always single-precision
>>> zero, even after thousands of iterations. I ran it through some of the
>>> verification tests using testreport and all the ones that passed
>>> with the "vanilla" CG routine also passed with the single-reduce
>>> routine. (Oddly, several of the verification tests fail even if I
>>> try to run them with a fresh out-of-the-box MITgcm. Perhaps I don't
>>> have my compiler configured correctly.)
>>>
>>> Cheers,
>>> Christopher
>>>
>>> On Nov 19, 2009, at 7:07 AM, Martin Losch wrote:
>>>
>>>> Hi Christopher,
>>>>
>>>> that sounds very good. Do you think that the effect will be similar
>>>> for cg3d?
>>>>
>>>> Are the results between the two cg2d's different? If so, by how much?
>>>>
>>>> Martin
>>>> On Nov 19, 2009, at 1:45 AM, Christopher L. Wolfe wrote:
>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I've written an implementation of d'Azevedo, Eijkhout, and Romine's
>>>>> (1999) single reduction CG solver for the MITgcm. This method uses a
>>>>> rearrangement of the standard conjugate gradient method so that the
>>>>> required scalars can be determined by a single call to
>>>>> MPI_Allreduce. For problems running on a large number of processors,
>>>>> the decreased MPI overhead can significantly increase the efficiency
>>>>> of the conjugate gradient solver.
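
The rearrangement described above can be illustrated with a minimal
sketch. It assumes the Chronopoulos-Gear style recurrence, in which the
two scalars (r,r) and (r,Ar) are formed together so that one global sum
per iteration is enough. This is an illustration only: it is not taken
from cg2d_sr and may differ from the formulation in the paper. The
names matvec, fused_dots and cg_single_reduction are hypothetical, and
the operator is a small 1D Laplacian chosen only for testing.

```python
# Sketch of a single-reduction conjugate gradient iteration (pure Python).
# Assumption: Chronopoulos-Gear style rearrangement, for illustration only;
# this is not the cg2d_sr source.

def matvec(v):
    """Apply a 1D Laplacian (2 on the diagonal, -1 off it); an SPD test operator."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def fused_dots(r, w):
    """Return (r.r, r.w) from one pass over the data.

    In an MPI code the two local partial sums would be packed into one
    buffer and combined with a single MPI_Allreduce call.
    """
    gamma = 0.0
    delta = 0.0
    for ri, wi in zip(r, w):
        gamma += ri * ri
        delta += ri * wi
    return gamma, delta


def cg_single_reduction(b, tol=1e-10, max_iter=1000):
    """Solve A x = b with A = matvec; return (x, iterations used)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)        # residual for the initial guess x = 0
    w = matvec(r)      # w = A r
    p = [0.0] * n      # search direction
    s = [0.0] * n      # s = A p, maintained by recurrence (no extra matvec)
    gamma_old = 1.0
    alpha = 1.0
    for k in range(max_iter):
        gamma, delta = fused_dots(r, w)   # the only reduction per iteration
        if gamma ** 0.5 < tol:
            return x, k
        if k == 0:
            beta = 0.0
            alpha = gamma / delta
        else:
            beta = gamma / gamma_old
            # (p, A p) rewritten in terms of already known scalars
            alpha = gamma / (delta - beta * gamma / alpha)
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        s = [wi + beta * si for wi, si in zip(w, s)]
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        w = matvec(r)
        gamma_old = gamma
    return x, max_iter
```

Calling cg_single_reduction([1.0] * 20) returns the solution vector and
the number of iterations used. The standard method needs two separate
global sums per iteration, one for (r,r) and one for (p,Ap); the sketch
trades the second sum for the extra vector recurrence on s.
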
>>>>>
>>>>> The attached figure shows the scaling of the 2D CG solver for a
>>>>> fixed problem size of 1024x1024 as the number of processors
>>>>> increases on the Cray XT4 "Franklin." For fewer than 2^6 processors,
>>>>> the original cg2d solver is slightly more efficient than the single
>>>>> reduction solver since the latter requires slightly more
>>>>> matrix-vector multiplications. However, for processor counts over
>>>>> 2^7, the single reduction solver's performance is significantly
>>>>> better than the original cg2d.
>>>>>
>>>>> I've attached a tarball of the new CG routine cg2d_sr plus the
>>>>> modified files CG2D.h, ini_cg2d.F, and solve_for_pressure.F. To use
>>>>> the single reduce solver, simply drop these files into an experiment
>>>>> directory and compile with "ALLOW_CG2D_SR" defined. I've also
>>>>> included a benchmarking suite, which can also be compiled like a
>>>>> standard MITgcm experiment.
>>>>>
>>>>> I've only implemented the single reduce solver for the 2D case since
>>>>> I just use the hydrostatic model, but the 3D implementation should
>>>>> be straightforward. A paper deriving the single reduction method can
>>>>> be found at http://www.netlib.org/lapack/lawnspdf/lawn56.pdf
>>>>>
>>>>> Feel free to contact me with any questions or comments.
>>>>>
>>>>> Cheers,
>>>>> Christopher
>>>>>
>>>>> -----------------------------------------------------------
>>>>> Dr. Christopher L. Wolfe              	   858-534-4560
>>>>> Climate, Atmospheric Science, and Physical Oceanography
>>>>> Scripps Institution of Oceanography, UCSD  clwolfe at ucsd.edu
>>>>> -----------------------------------------------------------
>>>>>
>>>>> <strong.eps>
>>>>>
>>>>>
>>>>> <cg2d_sr.tar.gz>
>>>>> <cg2d_sr_bench.tar.gz>
>>>>>
>>>>> _______________________________________________
>>>>> MITgcm-support mailing list
>>>>> MITgcm-support at mitgcm.org
>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>
> -----------------------------------------------------------
> Dr. Christopher L. Wolfe              	   858-534-4560
> Physical Oceanography Research Division    OAR 357
> Scripps Institution of Oceanography, UCSD  clwolfe at ucsd.edu
> -----------------------------------------------------------
>
>
>