[MITgcm-devel] global sum
Martin Losch
Martin.Losch at awi.de
Mon Nov 26 11:53:47 EST 2012
I guess the answer is "no". This will not work. I have to think about this some more, but if you have suggestions, I'd be happy to hear them.
Martin
On Nov 26, 2012, at 5:50 PM, Martin Losch wrote:
> And I need to store dtempTile in a local common block, right?
>
> M.
>
> On Nov 26, 2012, at 5:48 PM, Martin Losch wrote:
>
>> Hi Jean-Michel,
>>
>> in the context of seaice_fgmres.F (S/R scalprod, beginning at line 551) I then need to define a local array dtempTile(nSx,nSy) and pass bi, bj down to this routine, right?
>> Then I compute dtempTile(bi,bj) per tile, the way I now compute dtemp, and call
>> CALL GLOBAL_SUM_TILE_RL( dtempTile, dtemp, myThid )
>>
>> correct?
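>>
>> To be concrete, something like this is what I have in mind (the x, y
>> arrays and the loop ranges are only illustrative, I have not checked
>> them against the actual scalprod arguments):
>>
>>       _RL dtempTile(nSx,nSy)
>>       _RL dtemp
>>       INTEGER bi, bj, i, j
>> C     per-tile partial sums of the scalar product
>>       DO bj = myByLo(myThid), myByHi(myThid)
>>        DO bi = myBxLo(myThid), myBxHi(myThid)
>>         dtempTile(bi,bj) = 0. _d 0
>>         DO j = 1, sNy
>>          DO i = 1, sNx
>>           dtempTile(bi,bj) = dtempTile(bi,bj)
>>      &                     + x(i,j,bi,bj)*y(i,j,bi,bj)
>>          ENDDO
>>         ENDDO
>>        ENDDO
>>       ENDDO
>> C     sum over all tiles (and MPI processes); dtemp is then the global
>> C     scalar product, valid on every thread
>>       CALL GLOBAL_SUM_TILE_RL( dtempTile, dtemp, myThid )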
>>
>> Martin
>>
>> On Nov 26, 2012, at 5:40 PM, Jean-Michel Campin wrote:
>>
>>> Hi Martin,
>>>
>>> I recommend using global_sum_tile rather than _GLOBAL_SUM, for two reasons:
>>> 1) With the default CPP_EEOPTIONS.h it uses the same MPI calls as
>>> _GLOBAL_SUM (so the same speed), but it offers the option (with #define GLOBAL_SUM_SEND_RECV),
>>> for a given domain decomposition into tiles (i.e., fixed tile size),
>>> to get a result that is independent of how the tiles are distributed among processors
>>> (you can change nSx,nSy,nPx,nPy and the result stays identical).
>>> This is useful for checking that the code is right (but it is slower).
>>> 2) It's easier to use because it "always" works, whether the argument is
>>> shared (e.g., in a common block) or local. By contrast, _GLOBAL_SUM
>>> does not work with multiple threads if the argument is shared (i.e., in a common block).
>>> And since this issue is not so obvious to everyone, it's easy to forget about it
>>> and end up with pieces of code that do not work with multiple threads (a short
>>> sketch of the safe pattern follows below).
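>>>
>>> Concretely, the thread-safe pattern looks roughly like this (phiTile is
>>> just an illustrative per-tile partial sum, not an existing variable):
>>>
>>>       _RL phiTile(nSx,nSy)
>>>       _RL sumPhi
>>>       INTEGER bi, bj
>>> C     each thread accumulates only over its own tiles, into a variable
>>> C     that is local to the routine (NOT in a common block)
>>>       sumPhi = 0. _d 0
>>>       DO bj = myByLo(myThid), myByHi(myThid)
>>>        DO bi = myBxLo(myThid), myBxHi(myThid)
>>>         sumPhi = sumPhi + phiTile(bi,bj)
>>>        ENDDO
>>>       ENDDO
>>> C     _GLOBAL_SUM combines the per-thread values (and the MPI processes);
>>> C     if sumPhi sat in a common block, all threads would update the same
>>> C     storage and the multi-threaded result would be wrong
>>>       _GLOBAL_SUM_RL( sumPhi, myThid )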
>>>
>>> And regarding global_sum_single_cpu: it's much slower, so it cannot be used
>>> as the default; as a consequence, it requires more specific coding (with
>>> one version calling global_sum_single_cpu and a default version calling
>>> global_sum_tile). But the advantage of this additional coding is that it offers
>>> the option to get results that are independent of the domain decomposition into tiles.
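>>>
>>> In seaice_fgmres.F that would give a structure roughly like this (the
>>> CPP flag name is only an example, sumTile/sumAll are placeholders, and
>>> the actual argument list of global_sum_single_cpu needs to be taken
>>> from its source in eesupp):
>>>
>>> #ifdef SEAICE_FGMRES_SINGLECPU_SUM
>>> C     slower, but the result is independent of the domain decomposition
>>> C     into tiles: pass the full 2-D field to global_sum_single_cpu here
>>> #else
>>> C     default: fast per-tile sum
>>>       CALL GLOBAL_SUM_TILE_RL( sumTile, sumAll, myThid )
>>> #endif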
>>>
>>> Cheers,
>>> Jean-Michel
>>>
>>> On Mon, Nov 26, 2012 at 09:44:33AM +0100, Martin Losch wrote:
>>>> Hi there,
>>>>
>>>> I didn't follow the development of the global-sum code. Under which circumstances should I use which of these variants:
>>>> _GLOBAL_SUM (as, e.g., in seaice_lsr for the residuals)
>>>> call global_sum_tile
>>>> call global_sum_single_cpu
>>>>
>>>> Currently, I would like to figure out whether I can actually fix the multithreading of seaice_fgmres.F.
>>>> In there I use "stolen" code that I don't quite know how to handle. In particular, there is a scalar product that I adjusted to use MPI, but now I would like it to use one of the global-sum variants.
>>>>
>>>> Martin
>>>>
>>>> PS: Related to this file, should I actually use LAPACK routines (actually BLAS) if HAVE_LAPACK is defined?