[MITgcm-devel] global sum

Mon Nov 26 11:48:39 EST 2012

Hi Jean-Michel,

in the context of seaice_fgmres.F (S/R scalprod, beginning at line 551), I then need to define a local array dtemptile(nSx,nSy) and pass bi,bj down to this routine, right?
Then I compute dtemptile(bi,bj) the same way I now compute dtemp, and then call
 CALL GLOBAL_SUM_TILE_RL( dtemptile,dtemp,myThid )
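
A minimal sketch of that pattern, not the actual seaice_fgmres.F code. The routine name SCALPROD_SKETCH, its argument list and the per-tile vector layout dx(n,nSx,nSy) are assumptions made only for illustration. The sketch also assumes that the tile loop sits inside the routine (rather than bi,bj being passed in), so that the global sum is called once, outside any tile loop. It is a fragment that needs the usual MITgcm headers and build to compile:

```fortran
#include "SEAICE_OPTIONS.h"

      SUBROUTINE SCALPROD_SKETCH( n, dx, dy, dtemp, myThid )
C     Hypothetical routine, for illustration only.
      IMPLICIT NONE
#include "SIZE.h"
#include "EEPARAMS.h"
C     n      :: number of vector entries per tile (assumed layout)
C     dx, dy :: input vectors, stored tile by tile (assumed layout)
C     dtemp  :: global scalar product (output)
C     myThid :: my thread Id number
      INTEGER n
      _RL dx(n,nSx,nSy), dy(n,nSx,nSy)
      _RL dtemp
      INTEGER myThid
C     dtemptile :: per-tile partial sums, held in a local array
      _RL dtemptile(nSx,nSy)
      INTEGER i, bi, bj

C     Each thread fills in only the tiles that it owns
      DO bj = myByLo(myThid), myByHi(myThid)
       DO bi = myBxLo(myThid), myBxHi(myThid)
        dtemptile(bi,bj) = 0. _d 0
        DO i = 1, n
         dtemptile(bi,bj) = dtemptile(bi,bj)
     &                    + dx(i,bi,bj)*dy(i,bi,bj)
        ENDDO
       ENDDO
      ENDDO

C     Combine the per-tile values over all tiles, threads and processes
      CALL GLOBAL_SUM_TILE_RL( dtemptile, dtemp, myThid )

      RETURN
      END
```

The point of the per-tile array is that each thread writes only to its own (bi,bj) entries, and the summation over tiles is left entirely to GLOBAL_SUM_TILE_RL.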

correct?

Martin

On Nov 26, 2012, at 5:40 PM, Jean-Michel Campin wrote:

> Hi Martin,
> 
> I recommend using global_sum_tile rather than _GLOBAL_SUM, for 2 reasons:
> 1) with the default CPP_EEOPTIONS.h, it uses the same MPI calls as
> _GLOBAL_SUM (so same speed), but offers the option (with #define GLOBAL_SUM_SEND_RECV),
> for a given domain decomposition in tiles (i.e., fixed tile size),
> to get a result which is independent of how tiles are distributed among processors
> (can change nSx,nSy,nPx,nPy and the result stays identical).
> This is useful for checking that the code is right (but is slower).
> 2) it's easier to use because it "always" works, whether the arguments are
> shared (e.g., in a common block) or local. By contrast, _GLOBAL_SUM
> does not work with multi-threads if the argument is shared (is in a common block).
> And since this issue is not so obvious to everyone, it's easy to forget about it
> and end up with pieces of code which do not work with multi-threads.
> 
> And regarding global_sum_single_cpu, it's much slower, so it cannot be used 
> as the default; and as a consequence, it requires more specific coding (with 
> 1 version calling global_sum_single_cpu and a default version calling 
> global_sum_tile). But the advantage of this additional coding is to offer
> the option to get results independent of domain decomposition in tiles.
> 
> Cheers,
> Jean-Michel
> 
> On Mon, Nov 26, 2012 at 09:44:33AM +0100, Martin Losch wrote:
>> Hi there,
>> 
>> I didn't follow the development of the global-sum code. Under which circumstances should I use which of these variants:
>> _GLOBAL_SUM (as, e.g., in seaice_lsr for the residuals)
>> call global_sum_tile
>> call global_sum_single_cpu
>> 
>> Currently, I would like to figure out if I can actually fix the multithreading of the file seaice_fgmres.F.
>> In there I use "stolen" code, which I don't quite know how to handle. In particular, there is a scalar product that I adjusted to use MPI, but now I would like it to use a global-sum variant.
>> 
>> Martin
>> 
>> PS. Related to this file, should I actually use LAPACK routines (actually BLAS), if HAVE_LAPACK is defined?
>> _______________________________________________
>> MITgcm-devel mailing list
>> MITgcm-devel at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-devel