[MITgcm-devel] further vectorization

Patrick Heimbach heimbach at MIT.EDU
Thu Nov 1 21:27:34 EDT 2007


Hi there,

if possible don't tamper too much with exch2 for now
until we ascertain the working of the exch2 adjoint
(unless the changes are critical, of course).

Cheers
-Patrick

On Nov 1, 2007, at 8:46 PM, Jean-Michel Campin wrote:

> Hi Martin,
>
> OK, I had a quick look at EXCH_RX_RECV_GET_X,
> and the thing you propose regarding the BARRIER does not look
> right to me:
> At the top of the S/R:
> doingSingleThreadedComms is a local flag which is set to false,
> and only the master thread turns it to true (if multi-threaded).
> Now at the end, we can't move the BARRIER inside
> the block: IF ( doingSingleThreadedComms ) THEN / ENDIF
> since all the other threads will skip it and the master
> thread will wait (until the other threads reach the next barrier),
> causing the threads to get out of sync.
>
> Chris, do you confirm this interpretation ?
>
> Jean-Michel
>
> On Thu, Nov 01, 2007 at 05:07:30PM +0100, Martin Losch wrote:
>> That's OK for me too, but why not move the barrier into the if-block?
>> do we need to wait for every thread before executing "if
>> (dosinglethreadedcomms)"?
>>
>> M.
>> On 31 Oct 2007, at 16:32, Jean-Michel Campin wrote:
>>
>>> Hi Martin,
>>>
>>> Chris is looking at your suggestions.
>>> Regarding the BARRIER thing, I was wondering if something
>>> like:
>>>      IF ( nSx.NE.1 .OR. nSy.NE.1 ) THEN
>>>       _BARRIER
>>>      ENDIF
>>> would do it. The compiler should know (since nSx & nSy are
>>> parameters) that it can remove those barriers when it's safe
>>> (both nSx & nSy = 1).
>>> And when nSx > 1 or nSy > 1, we can still use the same
>>> executable for single-threaded or multi-threaded runs
>>> just by changing eedata.
>>>
>>> Jean-Michel
>>>
>>> On Wed, Oct 31, 2007 at 08:59:08AM +0100, Martin Losch wrote:
>>>> Hi all,
>>>>
>>>> Jens-Olaf has identified (and fixed) another (small) bottleneck in
>>>> exch_rl_recv_get_x and exch_rl_send_put_x (and all the other files
>>>> that are created from the corresponding template):
>>>>
>>>> The problem: the inner loop is always over i, but for the *_x
>>>> routines this loop is very short (basically Olx). Because the loop
>>>> boundaries are not available at compile time (iMin and iMax are set
>>>> earlier in the routine), only the inner loop is vectorized, resulting
>>>> in slow code (vectorization is at 20%): the routines are among the 20
>>>> most expensive ones.
>>>> This is his suggestion (2 instances, for east and west buffers):
>>>>>         DO K=1,myNz
>>>>> !CDIR NOLOOPCHG
>>>>>          DO I=iMin,iMax
>>>>>           DO J=1,sNy
>>>>>            iB = iB + 1
>>>>>            array(I,J,K,bi,bj) = eastRecvBuf_RL(iB,eBl,bi,bj)
>>>>>           ENDDO
>>>>>          ENDDO
>>>>>         ENDDO
>>>> that is, exchange the loop order and add a directive so that the
>>>> compiler does not change the order back. I would suggest putting the
>>>> directive inside #ifdef TARGET_NEC_SX/#endif.
>>>>
>>>> Also, at the end of the routine there are two barrier calls (each
>>>> call costs about 30% of routine runtime). Can these be moved into the
>>>> IF block like this?
>>>>
>>>>> c     _BARRIER
>>>>>     IF ( doingSingleThreadedComms ) THEN
>>>>>    _BARRIER
>>>>> C      Restore saved settings that were stored to allow
>>>>> C      single thread comms.
>>>>>      _BEGIN_MASTER(myThid)
>>>>>       DO I=1,nThreads
>>>>>        myBxLo(I) = myBxLoSave(I)
>>>>>        myBxHi(I) = myBxHiSave(I)
>>>>>        myByLo(I) = myByLoSave(I)
>>>>>        myByHi(I) = myByHiSave(I)
>>>>>       ENDDO
>>>>>      _END_MASTER(myThid)
>>>>>    _BARRIER
>>>>>     ENDIF
>>>>> c     _BARRIER
>>>>
>>>> If you agree with these changes I will implement and test them. I am
>>>> asking because I do not feel too comfortable with this part of the
>>>> code. Please let me know.
>>>>
>>>> (There is probably something similar in the corresponding exch2
>>>> routines, but I haven't tried that yet.)
>>>>
>>>> Martin
>>>> _______________________________________________
>>>> MITgcm-devel mailing list
>>>> MITgcm-devel at mitgcm.org
>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel

---
Patrick Heimbach | heimbach at mit.edu | http://www.mit.edu/~heimbach
MIT | EAPS 54-1518 | 77 Massachusetts Ave | Cambridge MA 02139 USA
FON +1-617-253-5259 | FAX +1-617-253-4464 | SKYPE patrick.heimbach




