[MITgcm-devel] Re: thsice is reeeeeeeeeally scalar!
Jens-Olaf Beismann
jbeismann at hpce.nec.com
Mon Oct 1 06:48:36 EDT 2007
Martin,
I just had a very brief look at your ftraces:
- on how many processors did you run these tests?
- in both tests the total number of procedure calls is very high
- in the THSICE case, thsice_get_exf and thsice_reshape_layers together
give approx. 25e6 calls. Can these be inlined, and might inlining improve
the vectorisation of the thsice routines you mentioned? Maybe
vectorising THSICE isn't that big a task after all.
- inlining should also be applied to other routines, cf. the ones I
listed in the cubed sphere case
- you might want to try to get rid of some "barrier" calls as well.
- regarding the advection routines, it would be helpful to compare the
corresponding compiler listings
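
To illustrate the inlining point, here is a minimal sketch in C rather than Fortran. It is not taken from the thsice sources: cell_update, update_with_calls and update_inlined are hypothetical names, and the arithmetic is a placeholder. It only shows the general pattern of a small routine called once per grid cell versus the same work written directly in the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cell helper (NOT actual thsice code): stands in for a
   small routine that is called once for every grid cell. */
static double cell_update(double h, double flux)
{
    return h + 0.5 * flux;
}

/* Variant A: one procedure call per cell. If the compiler does not inline
   cell_update (for instance when it lives in another source file), the
   call inside the loop can prevent vectorisation and adds call overhead. */
void update_with_calls(const double *h, const double *flux,
                       double *out, size_t n)
{
    assert(h && flux && out);
    for (size_t i = 0; i < n; ++i)
        out[i] = cell_update(h[i], flux[i]);
}

/* Variant B: the helper's body written directly in the loop. The loop now
   contains only array arithmetic, which compilers typically vectorise. */
void update_inlined(const double *h, const double *flux,
                    double *out, size_t n)
{
    assert(h && flux && out);
    for (size_t i = 0; i < n; ++i)
        out[i] = h[i] + 0.5 * flux[i];
}
```

Both variants give identical results. In this toy case an optimising compiler would most likely inline the static helper on its own; whether that happens for the real routines is best checked in the compiler listings.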
Cheers,
Jens-Olaf
> in my crusade to turn the MITgcm into a true vector code I noticed that
> the thsice package would require a lot of work. I have attached (in a
> gzipped tar-ball), the output of a comparison between runs with
> seaice+thsice and seaice only. The domain is 243x170x33 (Rüdiger Gerdes'
> Arctic Ocean configuration from AOMIP), and I integrate for 10 days with
> deltaT=900sec, so 960 timesteps.
> If you have a look at ftrace.txt_thsice and ftrace.txt_seaice (from flow
> trace analyses) you'll notice a few things:
> 1. mom_calc_visc is by far the most expensive routine, probably because
> I use the Leith scheme; I use a slightly lower optimization -Cvopt,
> instead of -Chopt for this routine, but I find this still surprising. I
> would have expected cg2d to be the top runner.
> 2. all routines that start with thsice_* have zero vector operation
> ratio, and from the MFLOPS you can see that they are really slow because
> of that.
> 3. an exception: seaice_advection (V. OP. Ratio = 83%) vectorises worse than
> thsice_advection (99.53%). I have no idea why.
> 4. everything else looks decent except for the exch_rl_send/recv
> routines. I am not touching them without detailed instructions.
>
> As a consequence the seaice+thsice is slower (692sec vs. 558sec,
> stdout.*). The excess time is spent in THSICE_MAIN (146.91sec, as
> opposed to seaice_growth+seaice_advdiff = 31.48-13.21=18.27sec).
>
> I don't want to undertake the huge task of vectorizing thsice, but why
> is seaice_advection so different from thsice_advection (Jean-Michel?).
>
> Martin
>
> CC to Jens-Olaf, although he cannot reply to this list, I guess (just
> MITgcm-support at mitgcm.org).
>
--
Dr. Jens-Olaf Beismann Benchmarking Analyst
NEC High Performance Computing Europe GmbH
Prinzenallee 11, D-40549 Duesseldorf, Germany
Tel: +49 4326 288859 (office) +49 160 183 5289 (mobile)
Fax: +49 4326 288861 http://www.hpce.nec.com