[Mitgcm-support] Re: MPI with ifc broken ?

mitgcm-support at dev.mitgcm.org
Wed Jul 9 15:53:24 EDT 2003


Sounds great!
-p.

Chris Hill wrote:
> P.S. Dimitris will be at 33 Cameron Avenue this Friday. How about we all
> go out for pizza!
> 
> Chris
> 
> 
> 
>>-----Original Message-----
>>From: Patrick Heimbach [mailto:heimbach at MIT.EDU] 
>>Sent: Monday, May 12, 2003 11:00 AM
>>To: support at mitgcm.org
>>Cc: Dimitris Menemenlis
>>Subject: Re: MPI with ifc broken ?
>>
>>
>>Alistair,
>>
>>Dimitris is at a conference this week.
>>I had trouble setting up the parallel global state estimation on an
>>AER cluster too; maybe it is the same problem.
>>Can we fix it here?
>>
>>-p.
>>
>>support at mitgcm.org wrote:
>>
>>>Dimitri,
>>>
>>>Hate to complain BUT...
>>>
>>>The stuff you added to ini_procs.F (for gather/scatter) seems to be
>>>breaking the parallel version of the model.
>>>
>>>Looking at what you've done, you're using blocking sends without
>>>corresponding receives, so the model can't get past those sends. I
>>>think you need non-blocking sends followed by receives. Can you undo
>>>those changes pronto and then fix it?
>>>
>>>Thanks,
>>>
>>>A.
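
For reference, a minimal sketch of the pattern suggested above (post
non-blocking sends first, then do the blocking receives, then wait on
the sends) could look like the code below. This is only an
illustration, not the fix that actually went into the repository, and
the request-array dimension MAX_NO_PROCS is a placeholder name, not a
parameter taken from the MITgcm headers.

      INTEGER reqs(MAX_NO_PROCS)
      INTEGER stats(MPI_STATUS_SIZE,MAX_NO_PROCS)
C--   Post all sends without blocking, so no process sits in a send
C     waiting for a matching receive before reaching its own receives.
      DO npe = 0, numberOfProcs-1
         CALL MPI_ISEND (myXGlobalLo, 1, MPI_INTEGER,
     &        npe, mpiMyId, MPI_COMM_MODEL, reqs(npe+1), ierr)
      ENDDO
C--   The blocking receives can now complete against the sends above.
      DO npe = 0, numberOfProcs-1
         CALL MPI_RECV (itemp, 1, MPI_INTEGER,
     &        npe, npe, MPI_COMM_MODEL, istatus, ierr)
         mpi_myXGlobalLo(npe+1) = itemp
      ENDDO
C--   Wait for all outstanding sends before reusing the send buffer
C     (the same pattern would be repeated for myYGlobalLo).
      CALL MPI_WAITALL (numberOfProcs, reqs, stats, ierr)
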
>>>
>>>
>>>
>>>>-----Original Message-----
>>>>From: Jean-Michel Campin [mailto:jmc at gulf.mit.edu]
>>>>Sent: Friday, May 09, 2003 10:04 AM
>>>>To: support at mitgcm.org
>>>>Subject: Re: MPI with ifc broken ?
>>>>
>>>>
>>>>Hi again,
>>>>
>>>>
>>>>
>>>>>It seems that something is broken in the code
>>>>>for MPI with ifc (Linux, like the myrinet-3 cluster) between 
>>>>>checkpoint48e (still working) and checkpoint48f: no output (all 
>>>>>STDOUT + STDERR are empty) and the error message is (on 2 cpus):
>>>>
>>>>Finally, it turns out to be related to the modifications for
>>>>scatter_2d.F and gather_2d.F: the problem is in ini_procs.F
>>>>(eesup/src) and is due to the changes between 1.14 and 1.15:
>>>>
>>>>
>>>>
>>>>>C--   To speed-up mpi gather and scatter routines, myXGlobalLo
>>>>>C     and myYGlobalLo from each process are transferred to
>>>>>C     a common block array.  This allows process 0 to know
>>>>>C     the location of the domains controlled by each process.
>>>>>      DO npe = 0, numberOfProcs-1
>>>>>         CALL MPI_SEND (myXGlobalLo, 1, MPI_INTEGER,
>>>>>    &         npe, mpiMyId, MPI_COMM_MODEL, ierr)
>>>>>      ENDDO
>>>>>      DO npe = 0, numberOfProcs-1
>>>>>         CALL MPI_RECV (itemp, 1, MPI_INTEGER,
>>>>>    &         npe, npe, MPI_COMM_MODEL, istatus, ierr)
>>>>>         mpi_myXGlobalLo(npe+1) = itemp
>>>>>      ENDDO
>>>>>      DO npe = 0, numberOfProcs-1
>>>>>         CALL MPI_SEND (myYGlobalLo, 1, MPI_INTEGER,
>>>>>    &         npe, mpiMyId, MPI_COMM_MODEL, ierr)
>>>>>      ENDDO
>>>>>      DO npe = 0, numberOfProcs-1
>>>>>         CALL MPI_RECV (itemp, 1, MPI_INTEGER,
>>>>>    &         npe, npe, MPI_COMM_MODEL, istatus, ierr)
>>>>>         mpi_myYGlobalLo(npe+1) = itemp
>>>>>      ENDDO   
>>>>
>>>>When I comment out those lines, it works fine.
>>>>
>>>>Jean_Michel
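
As an aside, the exchange quoted above could also be expressed as a
single collective call per variable, which avoids the send/receive
ordering problem entirely. A hedged sketch using MPI_ALLGATHER on the
same variables (again, just an illustration, not necessarily what was
committed) would be:

C--   Every process contributes its own myXGlobalLo / myYGlobalLo and
C     receives the full table, so process 0 (and every other process)
C     knows the location of the domain controlled by each process.
      CALL MPI_ALLGATHER (myXGlobalLo, 1, MPI_INTEGER,
     &     mpi_myXGlobalLo, 1, MPI_INTEGER,
     &     MPI_COMM_MODEL, ierr)
      CALL MPI_ALLGATHER (myYGlobalLo, 1, MPI_INTEGER,
     &     mpi_myYGlobalLo, 1, MPI_INTEGER,
     &     MPI_COMM_MODEL, ierr)
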
>>>
>>>
>>>
>>
>>-- 
>>_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
>>Patrick Heimbach ............................. MIT
>>FON: +1/617/253-5259 .......... EAPS, Room 54-1518
>>FAX: +1/617/253-4464 ..... 77 Massachusetts Avenue
>>mailto:heimbach at mit.edu ....... Cambridge MA 02139
>>http://www.mit.edu/~heimbach/ ................ USA
>>
> 
> 


-- 
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Patrick Heimbach ............................. MIT
FON: +1/617/253-5259 .......... EAPS, Room 54-1518
FAX: +1/617/253-4464 ..... 77 Massachusetts Avenue
mailto:heimbach at mit.edu ....... Cambridge MA 02139
http://www.mit.edu/~heimbach/ ................ USA



