Dimitris Menemenlis menemenlis at jpl.nasa.gov
Wed Jan 13 07:31:29 EST 2021

Martin, thanks so much for the detailed and helpful answer.  I am forwarding it to MITgcm Support.

Cheers, Dimitris

> Begin forwarded message:
> From: Martin Losch <Martin.Losch at awi.de>
> Date: January 13, 2021 at 12:47:20 AM PST
> To: Dimitris Menemenlis <menemenlis at jpl.nasa.gov>
> Cc: "Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC]" <daniel.s.kokron at nasa.gov>, Hong Zhang <hong.zhang at jpl.nasa.gov>
> Hi guys,
> the performance of the LSR solver is very machine dependent. The default code does not vectorize, which is a killer on pure vector machines, e.g. NEC SX8 or SX-ACE, but on Intel/AMD-like chips this can also lead to a performance loss, as these compilers have a (compared to vector computers, limited) vectorization capability.
> Defining SEAICE_VECTORIZE_LSR changes the algorithm slightly, and the solver usually requires more iterations to reach the same accuracy (governed by LSR_ERROR). If the vectorization on Pleiades' Intel processors is efficient enough, there may be a net performance increase in terms of "time to solution".
> Here are my recommendations:
> 1 cpp flags
> - As Dan found, SEAICE_VECTORIZE_LSR can speed up the code at the cost of more LSR iterations (usually only a few more).
> These two may not be interesting for the pure benchmarker, but may reduce the "time to solution":
> - defining SEAICE_LSR_ZEBRA may speed up the convergence of the solver (see Losch et al 2014), but I have never explored this in the context of pure Picard/LSR runs.
> - defining SEAICE_DELTA_SMOOTHREG changes the results: it replaces MAX(deltaC,deltaMin) by SQRT(deltaC**2 + deltaMin**2), which also helps the solver converge faster.
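> As a sketch, the three CPP flags above would be enabled in SEAICE_OPTIONS.h roughly like this (the flag names are the real MITgcm options discussed above; whether each one actually pays off is machine dependent, so treat this as an illustration, not a recommended default):

```c
/* SEAICE_OPTIONS.h (excerpt) -- LSR-related CPP flags discussed above */
#define SEAICE_VECTORIZE_LSR   /* vectorizable LSR code: more iterations, faster per sweep */
#define SEAICE_LSR_ZEBRA       /* zebra ordering: may speed up LSR convergence             */
#define SEAICE_DELTA_SMOOTHREG /* smooth Delta regularization: changes results slightly    */
```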
> 2 runtime parameters (will not change performance but may improve convergence, hence time to solution):
> - LSR_ERROR <= 1.e-5, because otherwise you'll have issues along the tile boundaries (having said that, it's also useful to keep the current-code default of non-zero SEAICE_OLx/y).
> - SEAICElinearIterMax = 200 is probably enough, but once the model is up and running, the number of LSR iterations will be well below that anyway (more like O(50)), so I don't think that this parameter is crucial.
> - SEAICEnonLinIterMax = 10: this will increase the cost of the solver by a factor of 5, because the default is 2, and it only makes sense if you use very recent code that includes a bug fix (PR #369, merged on Dec 15, 2020). You can probably get away with the default if performance is an issue.
> - you can also try SEAICEuseStrImpCpl=.TRUE., which may improve the convergence of the solver and hence reduce the number of iterations to reach the same LSR_ERROR.
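> For reference, a data.seaice fragment with the runtime values suggested above might look like the following (a sketch only, not a tested configuration; the parameter names are the real MITgcm ones, but pick SEAICEnonLinIterMax according to whether your code includes the PR #369 fix):

```fortran
# data.seaice (excerpt) -- runtime parameters discussed above
 &SEAICE_PARM01
# tighter linear-solver tolerance to avoid tile-boundary issues
 LSR_ERROR           = 1.E-5,
 SEAICElinearIterMax = 200,
# default of 2 is cheaper; 10 only with post-PR#369 code
 SEAICEnonLinIterMax = 2,
# may improve convergence and reduce iterations per LSR_ERROR
 SEAICEuseStrImpCpl  = .TRUE.,
 &
```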
> Martin
>> On 13. Jan 2021, at 04:19, Dimitris Menemenlis <menemenlis at jpl.nasa.gov> wrote:
>> Hi Dan, do you define LSR_ERROR and SEAICElinearIterMax in your data.seaice and if yes what are their values?
>> Given Martin’s comment below, I am curious why SEAICE_VECTORIZE_LSR would help on pleiades’ intel processors.
>> C     Use LSR vector code; not useful on non-vector machines, because it
>> C     slows down convergence considerably, but the extra iterations are
>> C     more than made up for by the much faster code on vector machines. For
>> C     the only regularly tested vector machine, these flags are specified
>> C     in the build options file SUPER-UX_SX-8_sxf90_awi, so we comment
>> C     them out here.
>> Cheers, Dimitris
>>> On Jan 8, 2021, at 9:28 AM, Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC] <daniel.s.kokron at nasa.gov> wrote:
>>> Dimitris,
>>> The current setting of SEAICE_VECTORIZE_LSR to undef is hurting performance on Xeon.  Profiling of the llc_540 case shows the code being executed is entirely scalar.  Setting SEAICE_VECTORIZE_LSR to define results in a 2x speedup for seaice_lsr_tridiagv() and a 25% speedup for seaice_lsr_tridiagu().  See attached.
>>> SEAICE_VECTORIZE_LSR=undef is on the left.  seaice_lsr_tridiagv() is highlighted in yellow.
>>> Daniel Kokron
>>> RedLine Performance Solutions
>>> SciCon/APP group
>>> -- 
>>> <LLC540_SeaIceLSRTriDiagV.png>

