[MITgcm-devel] Re: thsice is reeeeeeeeeally scalar!

Jens-Olaf Beismann jbeismann at hpce.nec.com
Mon Oct 1 06:48:36 EDT 2007


Martin,

I just had a very brief look at your ftraces:

- on how many processors did you run these tests?
- in both tests the total number of procedure calls is very high
- in the THSICE case, thsice_get_exf and thsice_reshape_layers together 
give approx. 25e6 calls - can these be inlined, and might inlining improve 
the vectorisation of the thsice routines you mentioned? Maybe 
vectorising THSICE isn't that big a task after all.
- inlining should also be applied to other routines, cf. the ones I 
listed in the cubed sphere case
- you might want to try to get rid of some "barrier" calls as well.
- regarding the advection routines, it would be helpful to compare the 
corresponding compiler listings
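
To put the call count in perspective, here is a rough back-of-the-envelope sketch in Python. It uses the 960 time steps and the 243x170 horizontal grid from the quoted message below. That the approx. 25e6 calls are spread evenly over the time steps, and that the count refers to the whole run rather than to a single process, are assumptions and not something read from the ftrace output.

```python
# Rough estimate only. Assumption: the ~25e6 calls are spread evenly over
# all time steps and are counted for the whole run (not per process).
total_calls = 25e6        # thsice_get_exf + thsice_reshape_layers, approx.
n_timesteps = 960         # 10 days at deltaT = 900 s
nx, ny = 243, 170         # horizontal grid size of the Arctic configuration

calls_per_step = total_calls / n_timesteps
horizontal_points = nx * ny
calls_per_point_per_step = calls_per_step / horizontal_points

print(f"calls per time step:       {calls_per_step:.0f}")
print(f"horizontal grid points:    {horizontal_points}")
print(f"calls per point per step:  {calls_per_point_per_step:.2f}")
```

Under these assumptions the two routines are called somewhat less than once per horizontal grid point per time step, which would be consistent with calls made from inside the horizontal loops, i.e. the pattern where inlining could matter for vectorisation.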

Cheers,

Jens-Olaf

> in my crusade to turn the MITgcm into a true vector code I noticed that 
> the thsice package would require a lot of work. I have attached (in a 
> gzipped tar-ball) the output of a comparison between runs with 
> seaice+thsice and seaice only. The domain is 243x170x33 (Rüdiger Gerdes' 
> Arctic Ocean configuration from AOMIP), and I integrate for 10 days with 
> deltaT=900sec, so 960 timesteps.
> If you have a look at ftrace.txt_thsice and ftrace.txt_seaice (from flow 
> trace analyses) you'll notice a few things:
> 1. mom_calc_visc is by far the most expensive routine, probably because 
> I use the Leith scheme; I use a slightly lower optimization -Cvopt, 
> instead of -Chopt for this routine, but I find this still surprising. I 
> would have expected cg2d to be the top runner.
> 2. all routines that start with thsice_* have zero vector operation 
> ratio, and from the MFLOPS you can see that they are really slow because 
> of that.
> 3. one exception: thsice_advection vectorises well (V. OP. Ratio = 
> 99.53%), better in fact than seaice_advection (83%). I have no idea why.
> 4. everything else looks decent except for the exch_rl_send/recv 
> routines. I am not touching them without detailed instructions.
> 
> As a consequence the seaice+thsice run is slower (692sec vs. 558sec, 
> stdout.*). The excess time is spent in THSICE_MAIN (146.91sec, as 
> opposed to seaice_growth+seaice_advdiff = 31.48-13.21=18.27sec).
> 
> I don't want to undertake the huge task of vectorizing thsice, but why 
> is seaice_advection so different from thsice_advection (Jean-Michel)?
> 
> Martin
> 
> CC to Jens-Olaf, although he cannot reply to this list, I guess (just 
> MITgcm-support at mitgcm.org).
> 
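
The timings quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch in Python: the numbers are copied from the message, while the variable names and the idea of expressing the THSICE excess as a share of the total slowdown are illustrative assumptions.

```python
# Timings in seconds, copied from the quoted message.
total_with_thsice = 692.0        # seaice+thsice run
total_seaice_only = 558.0        # seaice-only run
thsice_main = 146.91             # time spent in THSICE_MAIN
seaice_thermo = 31.48 - 13.21    # seaice_growth+seaice_advdiff, as quoted

# Extra wall-clock time of the seaice+thsice run.
extra_total = total_with_thsice - total_seaice_only

# Extra time of THSICE_MAIN over the routines it is compared with.
extra_thsice = thsice_main - seaice_thermo

print(f"extra run time:            {extra_total:.2f} s")
print(f"extra time in THSICE_MAIN: {extra_thsice:.2f} s")
print(f"relative slowdown:         {extra_total / total_seaice_only:.1%}")
print(f"share explained by THSICE: {extra_thsice / extra_total:.1%}")
```

With these numbers, nearly all of the additional run time is accounted for by THSICE_MAIN, in line with the statement in the message.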


-- 
Dr. Jens-Olaf Beismann           Benchmarking Analyst
NEC High Performance Computing Europe GmbH
Prinzenallee 11, D-40549 Duesseldorf, Germany
Tel: +49 4326 288859 (office)  +49 160 183 5289 (mobile)
Fax: +49 4326 288861              http://www.hpce.nec.com



