<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Martin, thanks so much for the detailed and helpful answer. I am forwarding it to MITgcm Support.<div class=""><br class=""></div><div class="">Cheers, Dimitris</div><div class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">Begin forwarded message:</div><br class="Apple-interchange-newline"><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">From: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">Martin Losch <<a href="mailto:Martin.Losch@awi.de" class="">Martin.Losch@awi.de</a>><br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">Subject: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class=""><b class="">[EXTERNAL] Re: SEAICE_VECTORIZE_LSR</b><br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">Date: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">January 13, 2021 at 12:47:20 AM PST<br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" 
class=""><b class="">To: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">Dimitris Menemenlis <<a href="mailto:menemenlis@jpl.nasa.gov" class="">menemenlis@jpl.nasa.gov</a>><br class=""></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;" class=""><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif; color:rgba(0, 0, 0, 1.0);" class=""><b class="">Cc: </b></span><span style="font-family: -webkit-system-font, Helvetica Neue, Helvetica, sans-serif;" class="">"Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC]" <<a href="mailto:daniel.s.kokron@nasa.gov" class="">daniel.s.kokron@nasa.gov</a>>, Hong Zhang <<a href="mailto:hong.zhang@jpl.nasa.gov" class="">hong.zhang@jpl.nasa.gov</a>><br class=""></span></div><br class=""><div class=""><div class="">Hi guys,<br class=""><br class="">the performance of the LSR solver is very machine dependent. The default code does not vectorize, which is a killer on pure vector machines, e.g. NEC SX8 or SX-ACE, but on Intel/AMD-like chips this can also lead to a performance loss, as the compilers have some vectorization capability (limited compared to vector computers). <br class="">Defining SEAICE_VECTORIZE_LSR changes the algorithm slightly, and the solver usually requires more iterations to reach the same accuracy (governed by LSR_ERROR). 
If the vectorization on pleiades’ Intel processors is efficient enough, there may be a net performance increase in terms of "time to solution".<br class=""><br class="">Here are my recommendations:<br class=""><br class="">1 cpp flags<br class="">- As Dan found, SEAICE_VECTORIZE_LSR can speed up the code at the cost of more LSR iterations (usually only a few more)<br class=""><br class="">These two may not be interesting for the pure benchmarker, but may reduce the "time to solution":<br class="">- defining SEAICE_LSR_ZEBRA may speed up the convergence of the solver (see Losch et al. 2014), but I have never explored this in the context of pure Picard/LSR runs.<br class="">- defining SEAICE_DELTA_SMOOTHREG changes the results, as it replaces MAX(deltaC,deltaMin) by SQRT(deltaC**2+deltaMin**2), which also helps the solver converge faster<br class=""><br class="">2 runtime parameters (will not change performance but may improve convergence, hence time to solution):<br class="">- LSR_ERROR <= 1.e-5, because otherwise you’ll have issues along the tile boundaries (having said that, it’s also useful to keep the current code’s default of non-zero SEAICE_OLx/y)<br class="">- SEAICElinearIterMax = 200 is probably enough, but once the model is up and running, the number of LSR iterations will be well below that anyway (more like O(50)), so I don’t think that this parameter is crucial.<br class="">- SEAICEnonLinIterMax = 10; this will increase the cost of the solver by a factor of 5, because the default is 2, and it only makes sense if you use very recent code that includes a bug fix (PR #369, merged on Dec 15, 2020). You can probably get away with the default if performance is an issue.<br class="">- you can also try SEAICEuseStrImpCpl=.TRUE., which may improve the convergence of the solver and hence reduce the number of iterations to reach the same LSR_ERROR.<br class=""><br class=""><br class="">Martin<br class=""><br class=""><blockquote type="cite" class="">On 13. 
Jan 2021, at 04:19, Dimitris Menemenlis <<a href="mailto:menemenlis@jpl.nasa.gov" class="">menemenlis@jpl.nasa.gov</a>> wrote:<br class=""><br class="">Hi Dan, do you define LSR_ERROR and SEAICElinearIterMax in your data.seaice and if yes what are their values?<br class="">Given Martin’s comment below, I am curious why SEAICE_VECTORIZE_LSR would help on pleiades’ intel processors.<br class=""><br class="">C Use LSR vector code; not useful on non-vector machines, because it<br class="">C slows down convergence considerably, but the extra iterations are<br class="">C more than made up by the much faster code on vector machines. For<br class="">C the only regularly test vector machine these flags a specified<br class="">C in the build options file SUPER-UX_SX-8_sxf90_awi, so that we comment<br class="">C them out here.<br class=""># undef SEAICE_VECTORIZE_LSR<br class=""><br class=""><br class="">Cheers, Dimitris<br class=""><br class=""><br class=""><blockquote type="cite" class="">On Jan 8, 2021, at 9:28 AM, Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC] <<a href="mailto:daniel.s.kokron@nasa.gov" class="">daniel.s.kokron@nasa.gov</a>> wrote:<br class=""><br class="">Dimitris,<br class="">The current setting of SEAICE_VECTORIZE_LSR to undef is hurting performance on Xeon. Profiling of the llc_540 case shows the code being executed is entirely scalar. Setting SEAICE_VECTORIZE_LSR to define results in a 2x speedup for seaice_lsr_tridiagv() and a 25% speedup for seaice_lsr_tridiagu(). See attached.<br class=""><br class="">SEAICE_VECTORIZE_LSR=undef is on the left. 
seaice_lsr_tridiagv() is highlighted in yellow.<br class=""><br class="">grep SEAICE_VECTORIZE_LSR *.h<br class="">DEF_IN_MAKEFILE.h:#undef SEAICE_VECTORIZE_LSR<br class="">SEAICE_OPTIONS.h:C# define SEAICE_VECTORIZE_LSR<br class="">SEAICE_OPTIONS.h:C# ifdef SEAICE_VECTORIZE_LSR<br class=""><br class="">Daniel Kokron<br class="">RedLine Performance Solutions<br class="">SciCon/APP group<br class="">-- <br class=""><br class=""><LLC540_SeaIceLSRTriDiagV.png><br class=""></blockquote><br class=""></blockquote><br class=""></div></div></blockquote></div><br class=""></div></body></html>