[MITgcm-devel] vectorizing seaice and exf
Martin Losch
Martin.Losch at awi.de
Thu Sep 27 10:11:42 EDT 2007
Hi Patrick, Jean-Michel,
I am all for retiring one of the two bulkformulae.
Currently exf_bulkformulae.F seems to have more options than the
other one, right? I am not sure how similar they are, so it's your call.
In the future one may want to replace exf_bulkformulae with a bulk
formulation of her/his own choice, e.g. the COARE formulae.
Martin
On 27 Sep 2007, at 16:03, Patrick Heimbach wrote:
>
> Hi Martin,
>
> On Sep 27, 2007, at 9:46 AM, Martin Losch wrote:
>
>> Hi Patrick,
>>
>> in
>> /u/u0/mlosch/scratch/MITgcm/pkg/exf/exf_bulkformulae.F
>> you can find a version of exf_bulkformulae that does vectorize
>> (haven't checked the vector operation ratio yet, but the compiler
>> now vectorizes the i-loop(s)). Tell me what you think (I get
>> identical results, but had to make 8 fields two dimensional:
>> tstar, ustar, qstar, tau, rdn, t0, delq, deltap).
>
> I'll have a look.
> 2-dim fields are not too much of a problem for me, as I am a big memory
> consumer anyway.
> Not sure what other people think.
>
>> Now your questions (I'll refer to the NEC sxf90, which is relevant
>> for anyone running at DKRZ as well):
>> you can make the sxf90 inline functions automatically, but only
>> those that are < 50 lines, and whose code is in the same file, so
>> for exf_bulkformulae you'd have to put these functions into the
>> same *.F file (for automatic inlining). This works for, e.g.,
>> exf_interp.F and function lagran (which is part of exf_interp.F).
>> you can also make sxf90 inline functions > 50 lines and from other
>> files by explicitly specifying at compile time, e.g. '-pi
>> exp=timestep_tracer expin=timestep_tracer.F',
>> BUT that's complicated and does not work with genmake2 (because
>> the compiler is looking for the specified files and crashes if it
>> cannot find them, see my earlier email to Jean-Michel. My solution
>> to this, an extra flag, is not general either). You'd have to
>> modify your makefile each time you run make makefile for this to
>> work. So until this is solved I would like to rely on
>> automatic inlining only.
>
> OK. That seems a good solution.
> The Cd functions are one-to-three liners anyways,
> and I'll just append them in the exf_bulkformulae.F file.
>
>> statements like
>> if ( atemp(i,j,bi,bj) .ne. 0. _d 0 ) then
>> don't seem to be a problem for the compiler
>
> Hmm, strange. Armin claims it to be a problem.
>
> Another thing:
> Jean-Michel and I were thinking of
> retiring one routine, either
> exf_bulkformulae.F or exf_bulk_largeyeager04.F
>
> They are extremely similar, and the main difference is in the choice
> of the drag coeff., which we may handle differently.
> What do you (and others) think?
>
> -p.
>
>
>
>> Martin
>>
>>
>> On 27 Sep 2007, at 14:46, Patrick Heimbach wrote:
>>
>>>
>>> Hi Martin,
>>>
>>> On Sep 27, 2007, at 8:14 AM, Martin Losch wrote:
>>>
>>>> Hi,
>>>>
>>> ...
>>>
>>>> The bottleneck (from a vectorization point of view) is in
>>>> exf_bulkformulae (and exf_bulk_largeyeager04, I expect), because
>>>> of the iteration as the innermost loop. Jens-Olaf's solution is
>>>> unsatisfactory: he noticed that niter_bulk = 2 always (per
>>>> parameter definition) and just copied the loop body twice, thus
>>>> removing the iteration loop. I am still looking for a directive
>>>> that can do that automatically, since the loop range is known at
>>>> compile time. But I am afraid that the only real solution would
>>>> be "loop pushing": move the iteration out of the ij-loops. This
>>>> means redefining a few fields (the ones that are updated in the
>>>> iteration, which are they?) as 2D.
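For illustration, the loop pushing described above can be sketched as follows. This is a minimal sketch in Python rather than the Fortran of exf_bulkformulae.F. The update function, the first guess and the input field wind are hypothetical stand-ins, not the actual bulk formulae. It only shows why moving the iteration outside the ij-loops turns the iterated quantity (here ustar) from a scalar into a 2D field, while every grid point still sees the same sequence of operations.

```python
import math

NITER_BULK = 2  # fixed iteration count, as stated in the thread


def update(ustar, wind):
    """Hypothetical stand-in for one iteration step of a bulk formula."""
    return 0.5 * (ustar + math.sqrt(wind) / (1.0 + ustar))


def iteration_innermost(wind):
    """Original ordering: iteration inside the ij-loops, ustar is a scalar."""
    ny, nx = len(wind), len(wind[0])
    out = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            ustar = 0.1 * wind[j][i]  # scalar first guess (assumed)
            for _ in range(NITER_BULK):
                ustar = update(ustar, wind[j][i])
            out[j][i] = ustar
    return out


def iteration_pushed_out(wind):
    """Loop pushing: iteration outermost, ustar stored as a 2D field."""
    ny, nx = len(wind), len(wind[0])
    ustar = [[0.1 * wind[j][i] for i in range(nx)] for j in range(ny)]
    for _ in range(NITER_BULK):
        for j in range(ny):
            # The innermost i-loop now carries no dependence between
            # iterations, which is what a vectorizing compiler needs.
            for i in range(nx):
                ustar[j][i] = update(ustar[j][i], wind[j][i])
    return ustar


wind = [[1.0 + i + 3 * j for i in range(4)] for j in range(3)]
assert iteration_innermost(wind) == iteration_pushed_out(wind)
```

The price of the second ordering is the extra storage: every quantity updated inside the iteration needs its own 2D array.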
>>>
>>> First, quick question:
>>> Do you know whether your compiler can inline function calls
>>> to improve vectorization?
>>> I've removed the three exf_bulk functions about a month ago
>>> to help with vectorization, but would be inclined to put some back in
>>> (to be able to switch between different drag coeff. schemes).
>>> Good compilers would just inline them.
>>>
>>> A second thing to check is the
>>> if ( atemp(i,j,bi,bj) .ne. 0. _d 0 ) then
>>> statement which likely hinders good vectorization.
>>>
>>>> I would like to do this, but what's the best way: have two
>>>> versions of exf_bulkformulae, one as it is and one for vector
>>>> machines? Or can we live with one version that is vectorizable
>>>> at the cost of having a few extra 2d fields?
>>>
>>> A few that are currently non-fields:
>>> tau
>>> tstar
>>> ustar
>>> qstar
>>>
>>> -p.
>>>
>>>
>>>
>>>>
>>>> Martin
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> MITgcm-devel mailing list
>>>> MITgcm-devel at mitgcm.org
>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>>>
>>> ---
>>> Patrick Heimbach | heimbach at mit.edu | http://www.mit.edu/~heimbach
>>> MIT | EAPS 54-1518 | 77 Massachusetts Ave | Cambridge MA 02139 USA
>>> FON +1-617-253-5259 | FAX +1-617-253-4464 | SKYPE patrick.heimbach
>>>
>>>
>
>