[MITgcm-support] SOLVE_DIAGONAL_LOWMEMORY
Jean-Michel Campin
jmc at mit.edu
Thu Jan 7 22:20:07 EST 2021
Hi Dimitris,
In terms of efficiency, I would rank #define SOLVE_DIAGONAL_KINNER as the slowest,
then the default (just because accessing more 3-D arrays might take more time),
and finally #define SOLVE_DIAGONAL_LOWMEMORY as the fastest; but I would have guessed
that the differences between the last two options would be small.
Now that Dan is reporting significant differences, I still wonder whether the magnitude of
the improvement is platform/problem dependent.
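
For reference, the ratios implied by the min/max times quoted further down in this thread can be checked with a few lines of Python. The timings are copied from Dan's messages; the variable and function names are only illustrative, and the low-memory figures are assumed to come from the same llc_540 / 767-rank configuration as the other two.

```python
# (min, max) seconds spent in solve_tridiagonal() across ranks,
# as quoted in this thread. Assumption: all three rows were measured
# on the same llc_540 case and rank count.
timings = {
    "kinner": (147.07, 186.45),
    "fall_through": (60.37, 125.83),
    "low_mem": (26.1, 67.9),
}


def speedup(slow, fast):
    """Return (ratio of min times, ratio of max times) of `slow` over `fast`."""
    return tuple(s / f for s, f in zip(timings[slow], timings[fast]))


for slow, fast in [("kinner", "fall_through"), ("fall_through", "low_mem")]:
    r_min, r_max = speedup(slow, fast)
    print(f"{slow} / {fast}: min {r_min:.2f}x, max {r_max:.2f}x")
```

With these numbers the low-memory path comes out roughly 1.9 to 2.3 times faster than the default, consistent with "approximately twice as fast".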
And just for history (please correct me if I am wrong):
The original version was SOLVE_DIAGONAL_LOWMEMORY. It's not great for the adjoint
(there are some obvious reasons, but I thought it could have been fixed without major changes
in the inner part of the routine), so Gael wrote a new version, SOLVE_DIAGONAL_KINNER,
that is better for the adjoint, but slower and terrible in terms of efficiency on vector machines.
After that, the current default version was introduced (maybe Martin did it?) so that
we would have an adjointable version that is efficient on vector processors.
And after that, I messed up the SOLVE_DIAGONAL_LOWMEMORY version even more, to skip
a good half of the inversion computation if/when it's called a second time
with the same matrix but a different RHS.
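
To illustrate the idea of reusing work for a second right-hand side, here is a minimal sketch in Python of a tridiagonal (Thomas) solve split into a matrix-only step and a RHS-only step. This is not the MITgcm Fortran and does not claim to mirror how solve_tridiagonal is organized; the function names and array layout are hypothetical.

```python
def factor_tridiagonal(a, b, c):
    """Matrix-only part of the forward sweep.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused).
    Returns (cp, inv): modified super-diagonal and inverse pivots.
    """
    n = len(b)
    cp = [0.0] * n
    inv = [0.0] * n
    inv[0] = 1.0 / b[0]
    cp[0] = c[0] * inv[0]
    for k in range(1, n):
        inv[k] = 1.0 / (b[k] - a[k] * cp[k - 1])
        cp[k] = c[k] * inv[k]
    return cp, inv


def solve_with_factors(a, cp, inv, d):
    """RHS-only part: forward sweep on d, then back substitution."""
    n = len(d)
    y = [0.0] * n
    y[0] = d[0] * inv[0]
    for k in range(1, n):
        y[k] = (d[k] - a[k] * y[k - 1]) * inv[k]
    for k in range(n - 2, -1, -1):
        y[k] -= cp[k] * y[k + 1]
    return y


# Factor once, then solve for two different right-hand sides.
a = [0.0, 1.0, 1.0]
b = [4.0, 4.0, 4.0]
c = [1.0, 1.0, 0.0]
cp, inv = factor_tridiagonal(a, b, c)
x1 = solve_with_factors(a, cp, inv, [6.0, 12.0, 14.0])
x2 = solve_with_factors(a, cp, inv, [4.0, 0.0, -4.0])
```

The second call skips the pivot computation entirely, which is the kind of saving described above when the matrix is unchanged between calls.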
Cheers,
Jean-Michel
On Thu, Jan 07, 2021 at 11:45:27AM -0800, Dimitris Menemenlis wrote:
> Bonjour Jean-Michel, would you be aware of any possible issues with switching the hi-res LLC simulations to SOLVE_DIAGONAL_LOWMEMORY ?
> Right now we use the default settings of:
>
> C o Choices for implicit solver routines solve_*diagonal.F
> C The following has low memory footprint, but not suitable for AD
> #undef SOLVE_DIAGONAL_LOWMEMORY
> C The following one suitable for AD but does not vectorize
> #undef SOLVE_DIAGONAL_KINNER
>
> But Dan notes that the low-memory variant is approximately twice as fast as the default settings for the llc_540 set-up.
> I could not find any previous discussion of SOLVE_DIAGONAL_LOWMEMORY in mitgcm-support.
>
> Merci, Dimitris
>
>
> > Begin forwarded message:
> >
> > From: "Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC]" <daniel.s.kokron at nasa.gov>
> > Subject: Re: Performance analysis feedback
> > Date: January 7, 2021 at 10:49:23 AM PST
> > To: Dimitris Menemenlis <menemenlis at jpl.nasa.gov>
> > Cc: "Zhang, Hong (JPL-398K)[JPL Employee]" <hong.zhang at jpl.nasa.gov>
> >
> > Kinner 147.07/186.45
> > Fall through 60.37/125.83
> > Low mem 26.1/67.9
> >
> > See attached. Upper left is Kinner, upper right is fall through and lower middle is low mem. solve_tridiagonal() is highlighted in yellow.
> > Dan
> >
> > From: "Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC]" <daniel.s.kokron at nasa.gov>
> > Date: Thursday, January 7, 2021 at 10:05 AM
> > To: Dimitris Menemenlis <menemenlis at jpl.nasa.gov>
> > Cc: "Zhang, Hong (JPL-398K)[JPL Employee]" <hong.zhang at jpl.nasa.gov>
> > Subject: Re: Performance analysis feedback
> >
> > Comparing profiles with and without Kinner, Kinner is definitely slower than the fall-through path. I did not try using the lowmem path.
> >
> > There is a lot of load imbalance among the ranks so I've included min and max times (s) spent in the solve_tridiagonal() routine using the llc_540 case run on 767 ranks.
> >
> > Kinner 147.07/186.45
> > Fall through 60.37/125.83
> >
> > Fall through is activated with
> > #undef SOLVE_DIAGONAL_LOWMEMORY
> > #undef SOLVE_DIAGONAL_KINNER
> >
> >
> > From: Dimitris Menemenlis <menemenlis at jpl.nasa.gov>
> > Date: Wednesday, January 6, 2021 at 7:57 PM
> > To: "Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC]" <daniel.s.kokron at nasa.gov>
> > Cc: "Zhang, Hong (JPL-398K)[JPL Employee]" <hong.zhang at jpl.nasa.gov>
> > Subject: Re: Performance analysis feedback
> >
> > actually, it looks like we are "not" using the low-memory option in any of our set-ups:
> >
> > bash-3.2$ grep SOLVE_DIAGONAL_LOW */*/code*/*h */code*/*h
> > llc_540/tides_exp/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_90/tides_exps/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_1080/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_2160/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_270/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_270/code_ad/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_4320/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_540/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_90/code-async-noseaice/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> > llc_90/code/CPP_OPTIONS.h:#undef SOLVE_DIAGONAL_LOWMEMORY
> >
> > which happens to be the default for unmodified MITgcm CPP_OPTIONS.h
> >
> > bash-3.2$ grep SOLVE_DIAGONAL CPP_OPTIONS.h
> > #undef SOLVE_DIAGONAL_LOWMEMORY
> > #undef SOLVE_DIAGONAL_KINNER
> >
> > what do you recommend?
> >
> > Dimitris
> >
> >
> >
> >
> >
> >> On Jan 6, 2021, at 3:03 PM, Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC] <daniel.s.kokron at nasa.gov> wrote:
> >>
> >> Hong,
> >> The Kinner tri-diagonal solver code path is showing up in my profiling of the llc_540 case. None of the other user cases I have is using this path. Does your investigation require using the Kinner path?
> >> Dan
> >>
> >
> >
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support