[MITgcm-support] OpenMP and multithreading

Patrick Heimbach heimbach at MIT.EDU
Thu Oct 18 13:42:58 EDT 2007


On Oct 18, 2007, at 1:30 PM, Paola Cessi wrote:

> Thanks, Constantinos. That was very helpful.
>
> In your example below, do you mean nSx=nSy=2 (not sNx=sNy=2), and
> nTx=2, nTy=1, right?

Yes, nSx=nSy=2 was meant.

> Also, where do we find "the daily testreports"?

http://mitgcm.org/testing.html

-p.



> Thanks again,
>
> Paola
>
> On Thu, 18 Oct 2007, Constantinos Evangelinos wrote:
>
>> On Wednesday 17 October 2007 8:22:28 pm Dimitris Menemenlis wrote:
>>> Paola, the total size of the domain is Nx*Ny where
>>>
>>>       &           Nx  = sNx*nSx*nPx,
>>>       &           Ny  = sNy*nSy*nPy,
>>>
>>> nPx*nPy is the total number of (MPI) processes,
>>> nSx*nSy is the total number of tiles per process, and
>>> sNx*sNy is the dimension of each tile.
>>>
>>> For shared-memory threaded code, each one of the nSx*nSy tiles will
>>> be handled by a different thread.  If nSx*nSy=1, then you will have
>>> only one thread per MPI process.
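
To make the decomposition concrete, here is a minimal SIZE.h sketch (the
numbers are illustrative only, not taken from this thread): a 90x40 grid
split into nSx*nSy = 2*2 = 4 tiles of 45x20 points each, run as a single
MPI process (nPx=nPy=1) so that the four tiles can be shared among threads.

C     Sketch of a SIZE.h decomposition (illustrative values only):
C     Nx=90, Ny=40 split into nSx*nSy=4 tiles of sNx*sNy=45x20 points,
C     run as a single MPI process (nPx=nPy=1).
      INTEGER sNx, sNy, OLx, OLy, nSx, nSy, nPx, nPy, Nx, Ny, Nr
      PARAMETER (
     &           sNx =  45,
     &           sNy =  20,
     &           OLx =   3,
     &           OLy =   3,
     &           nSx =   2,
     &           nSy =   2,
     &           nPx =   1,
     &           nPy =   1,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =  15 )
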
>>
>> Actually the number of threads is set in eedata and is different for the
>> X (nTx) and Y (nTy) directions (with their product nTx*nTy assumed to be
>> equal to OMP_NUM_THREADS for OpenMP code). The obvious restriction is
>> that the number of threads in X should be a divisor of sNx and the
>> number of threads in Y should be a divisor of sNy. So it is entirely
>> possible to have sNx=sNy=2 and nTx=2, nTy=1.
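
For that configuration (nSx=nSy=2 tiles with nTx=2, nTy=1 threads, as
clarified above), the thread counts go into eedata. A minimal sketch,
assuming the standard EEPARMS namelist layout; the illustrative values
are not from this thread, and OMP_NUM_THREADS would need to be set to
nTx*nTy = 2 in the run environment:

 &EEPARMS
 nTx=2,
 nTy=1,
 &

Since nTx divides nSx and nTy divides nSy here, each of the two threads
would handle a 1x2 stack of tiles.
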
>>
>>> Some words of caution:
>>>
>>> 1) shared-memory threaded code is not as well supported as MPI code,
>>> especially in the packages where careless programmers (like myself)
>>> sometimes (accidentally) introduce constructs that break the threading.
>>
>> You can look at the daily testreports with multithreading turned on to
>> see which of the test cases appear to work. On some platforms (e.g.,
>> Linux on PPC with the IBM XL compilers) we have a complete lack of
>> success, for some as-yet-unknown reason.
>>
>>> 2) with some exceptions there is very little gain in using threaded
>>> code vs MPI code, even on shared-memory platforms. For example, on the
>>> SGI Origin and Altix we typically use MPI rather than threaded code,
>>> even though they are shared-memory platforms.
>>
>> With dual- and quad-core processors we may need to revisit that
>> question. For the time being, however, quad-core seems to be suffering
>> from a lack of memory bandwidth, and OpenMP would not help there.
>>
>> Constantinos
>> --
>> Dr. Constantinos Evangelinos
>> Department of Earth, Atmospheric and Planetary Sciences
>> Massachusetts Institute of Technology
>>

---
Patrick Heimbach | heimbach at mit.edu | http://www.mit.edu/~heimbach
MIT | EAPS 54-1518 | 77 Massachusetts Ave | Cambridge MA 02139 USA
FON +1-617-253-5259 | FAX +1-617-253-4464 | SKYPE patrick.heimbach




