[MITgcm-devel] pkg/layers verification exp.

Martin Losch Martin.Losch at awi.de
Thu Feb 7 09:45:06 EST 2013


Hi Ryan,

me again. I have now tested the new routine on my coarse global 2deg-model (I ran it for 720 time steps): the overall run is about 3 times faster than previously, the vectorization is perfect, and the results are the same as before (I use LaHs1RHO and LaVH1RHO to compute the overturning in density coordinates). So I guess it can go into the repository. I will make your version of the code the default and leave #ifdef TARGET_NEC_SX around my new code until you find the time to test the timing for your applications.
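
A minimal sketch of the bisection idea described in the quoted message below, written in Python purely for illustration. It is not the Fortran in layers_fluxcalc.F: the names find_layer and layer_bounds are hypothetical, and the sketch assumes the layer bounds are sorted in ascending order. It only shows why the search needs about log2(Nlayers) steps.

```python
def find_layer(layer_bounds, value):
    """Return index j with layer_bounds[j] <= value < layer_bounds[j + 1].

    Assumes layer_bounds is sorted in ascending order (an assumption of
    this sketch). Returns -1 if value lies below the first bound and
    len(layer_bounds) - 1 if it lies at or above the last bound.
    """
    lo = -1                  # index known to satisfy bounds[lo] <= value
    hi = len(layer_bounds)   # index known to satisfy bounds[hi] > value
    while hi - lo > 1:
        mid = (lo + hi) // 2  # halve the search interval each step
        if value >= layer_bounds[mid]:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical density bounds, for illustration only
layer_bounds = [1020.0, 1022.0, 1024.0, 1026.0, 1028.0]
layer_index = find_layer(layer_bounds, 1025.3)  # falls between bounds 2 and 3
```

Because the interval is halved on every pass, the number of iterations depends only on the number of layers and not on the data, which is the property that makes this kind of search easier to vectorize over grid points than a linear scan with an early exit.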

Is that OK with you?

Martin

On Feb 6, 2013, at 3:54 PM, Martin Losch <Martin.Losch at awi.de> wrote:

> Hi Ryan,
> 
> I have found an algorithm that is probably much faster than yours, plus it is much easier to vectorize. I have implemented a "bisection" routine that is supposed to find the appropriate layer within log2(Nlayers) steps. See Numerical Recipes in Fortran, 3.4 How to Search an Ordered Table, p111 <http://apps.nrbook.com/fortran/index.html>
> What do you think? I have not yet tested the routine on my problem (whether it gives the right results), but it's definitely faster by another factor of 2 (mainly because it vectorizes better). The loops contain a lot of if-statements, so their execution will always be slow (I only get 1509 MFLOPS in spite of a 99.45% vector operation ratio). Could you test this routine on your problem, and tell me
> a. if it gives the same results?
> b. if it is faster?
> 
> If it turns out to be slower for your problems, we can simply put it between #ifdef TARGET_NEC_SX and #endif.
> 
> Martin
> 
> On Feb 6, 2013, at 9:53 AM, Martin Losch <Martin.Losch at awi.de> wrote:
> 
> > Hi Ryan,
> > 
> > that's great, thanks.
> > 
> > For the vector machine the speedup is dramatic, as usual when scalar code is replaced by vector code. In my short tests (100 time steps with a coarse global model) I get this: the original scalar code takes 270 sec (101 MFLOPS, vector operation ratio = 6%, very bad); with my modifications this goes down to 107 sec (217 MFLOPS, vector operation ratio = 62%). This is still very bad, but definitely better than before. Typically the important MITgcm routines go up to 10,000 MFLOPS and have vector operation ratios of over 99%. The consequence is that layers_fluxcalc, even with my modifications, uses 75% of the total cpu time (before it was 86%), which is still unacceptable, but the total run time goes down from 5min13sec to 2min22sec.
> > 
> > I didn't test anything on non-vector machines, but I assume that there won't be much of a change.
> > 
> > I am afraid that this diagnostic cannot be completely vectorized, but I'll ask my vectorization guru (o:
> > 
> > Martin
> > 
> > 
> > On Feb 5, 2013, at 8:45 PM, Ryan Abernathey <ryan.abernathey at gmail.com> wrote:
> > 
> >> Martin,
> >> I finally got around to trying your changes to the layers code: the output looks fine to me. Please go ahead and check this in.
> >> Did you get a chance to do any benchmarking? Is the new code faster or slower?
> >> -Ryan
> >> 
> >> 
> >> On Wed, Jan 9, 2013 at 2:02 AM, Martin Losch <Martin.Losch at awi.de> wrote:
> >> OK, it's not exactly urgent anyway.
> >> 
> >> Martin
> >> 
> >> On Jan 8, 2013, at 11:33 PM, Jean-Michel Campin <jmc at ocean.mit.edu> wrote:
> >> 
> >>> Hi Ryan and others,
> >>> 
> >>> Will start to make changes in cfc_example to add a pkg/layers test
> >>> (working with David on this).
> >>> Will need to make a few changes in pkg/layers so that it compiles
> >>> in this set-up.
> >>> Martin, if you can postpone your changes until this is done,
> >>> it will make things easier.
> >>> 
> >>> Cheers,
> >>> Jean-Michel
> >>> 
> >>> On Tue, Jan 08, 2013 at 11:25:32AM -0500, Ryan Abernathey wrote:
> >>>> I am not super familiar with these experiments, but I think the cfc_example
> >>>> is a better choice. What would really be ideal is an eddying experiment,
> >>>> but this is probably not practical for the daily test reports. Certainly an
> >>>> experiment with a realistic stratification and overturning is a necessity.
> >>>> Someone else (Martin / David / Ross / Jean-Michel) would probably be better
> >>>> than me at setting this up. The fact is that I don't run realistic global
> >>>> models. I am still stuck in a channel with no salinity! Layers has evolved
> >>>> quite a bit beyond my original setup.
> >>>> 
> >>>> A nice accompaniment to this would be some real documentation. I have
> >>>> promised to work on this for quite some time, but it's one of those things
> >>>> that is hard to prioritize. ;)
> >>>> 
> >>>> I have many ambitions for the future of layers. For example, I would love
> >>>> to be able to accumulate all the tracer-budget diagnostics that are filled
> >>>> in gad_advection.F in layer space. This would permit, for example, the
> >>>> online calculation of water mass transformation with an unprecedented level
> >>>> of precision. I am very glad that all you numerical wizards are getting
> >>>> involved because I will need your help to go down that road! The final
> >>>> step, the momentum budget in layer space, is pretty intimidating. And at
> >>>> that point you are probably better off just running GOLD. ;)
> >>>> 
> >>>> Cheers,
> >>>> Ryan
> >>>> 
> >>>> 
> >>>> On Mon, Jan 7, 2013 at 9:46 AM, Jean-Michel Campin <jmc at ocean.mit.edu> wrote:
> >>>> 
> >>>>> Hi Ryan, David, Martin and others,
> >>>>> 
> >>>>> In response to Martin's concern about having an example that uses
> >>>>> pkg/layers,
> >>>>> I would propose to turn it on in one of the verification experiments.
> >>>>> 
> >>>>> Right now, it's already compiled in exp4, but given the simple
> >>>>> T,S structure (+ only 8 levels) of this experiment, I was wondering
> >>>>> if we should rather pick another experiment, maybe a realistic set-up?
> >>>>> 
> >>>>> In terms of realistic set-ups, I would propose cfc_example, 2.8 x 2.8 global
> >>>>> with 15 levels, starting from a pickup.
> >>>>> It is not too complicated (does not test too many critical features),
> >>>>> and adding pkg/layers will not make the "cfc example" part
> >>>>> less clear,
> >>>>> I think.
> >>>>> 
> >>>>> But if exp4 is good enough to test pkg/layers, could just go with this one.
> >>>>> 
> >>>>> What do you think ?
> >>>>> 
> >>>>> Cheers,
> >>>>> Jean-Michel
> >>>>> 
> >>>>> _______________________________________________
> >>>>> MITgcm-devel mailing list
> >>>>> MITgcm-devel at mitgcm.org
> >>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
> >>>>> 
> <layers_fluxcalc.F>



