[MITgcm-support] Problem with parallel build: No. of processes not equal to nPx*nPy

Jean-Michel Campin jmc at ocean.mit.edu
Sat Nov 12 10:38:41 EST 2011


Hi Yuan,

Just a clarification regarding this point:
> The implementation of parallel in longitude had problems,
I think the only part of the code that does not support
domain decomposition in the X direction is pkg/zonal_filt
(Fourier filtering in the zonal direction).
So if you are not using pkg/zonal_filt, you can use domain decomposition
in the X and/or Y direction, and this should work.
If it does not work, please report the problem.
 
Thanks,
Jean-Michel
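
The constraints discussed in this thread (nPx*nPy equal to the number of MPI processes, Nx = sNx*nSx*nPx, Ny = sNy*nSy*nPy) can be explored with a small script. This is only a sketch, not part of MITgcm: the function name `tile_layouts` is hypothetical, and the grid size Nx=80, Ny=42 is taken from the SIZE.h quoted below (40*1*2 and 21*1*2). It assumes one tile per process (nSx=nSy=1) unless told otherwise.

```python
def tile_layouts(Nx, Ny, nprocs, nSx=1, nSy=1):
    """List (nPx, nPy, sNx, sNy) with nPx*nPy == nprocs that tile the grid exactly.

    Hypothetical helper for illustration; it only checks the integer
    relations Nx = sNx*nSx*nPx and Ny = sNy*nSy*nPy.
    """
    layouts = []
    for nPx in range(1, nprocs + 1):
        if nprocs % nPx:
            continue  # nPx must divide the process count
        nPy = nprocs // nPx
        if Nx % (nSx * nPx) or Ny % (nSy * nPy):
            continue  # tile sizes would not be integers
        layouts.append((nPx, nPy, Nx // (nSx * nPx), Ny // (nSy * nPy)))
    return layouts


if __name__ == "__main__":
    for nprocs in (2, 4):
        for nPx, nPy, sNx, sNy in tile_layouts(80, 42, nprocs):
            print(f"np={nprocs}: nPx={nPx} nPy={nPy} sNx={sNx} sNy={sNy}")
```

For this grid, a decomposition in Y only (nPx=1, sNx=80) appears among the 2-process layouts, while the 4-process layouts all split the X direction, which is fine as long as pkg/zonal_filt is not used.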

On Sat, Nov 12, 2011 at 04:18:41AM -0800, lian at ashimaresearch.com wrote:
> Ok, I meant sNx=80 in your SIZE.h
>
> Yuan
>
> On Nov 12, 2011 4:17 AM, lian at ashimaresearch.com wrote:
> Try nPx=1 and sNx=40. The implementation of parallel in longitude had problems,
> not sure if it has some last changes though.
>
> Yuan
>
> On Nov 12, 2011 3:15 AM, Chun-Yan Zhou <c.zhou at dundee.ac.uk> wrote:
>
> Hi Gus,
>
> Yes, it is the 'SIZE.h' file, a typing mistake. Any other ideas?
>
> chunyan
>
> ----------------------------------------------------------------------
>
> Did you call the file "size.h" as your email says?
> I think the file "SIZE.h" is what is compiled.
> Note that the name is all upper case.
>
> Yes, nPx * nPy should be equal to the number of processors in your mpirun command.
>
> Gus Correa
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 11 Nov 2011 17:51:38 +0000
> From: "Chun-Yan Zhou" <c.zhou at dundee.ac.uk>
> To: <mitgcm-support at mitgcm.org>
> Subject: [MITgcm-support] Problem with parallel build: No. of
>         processes not equal to nPx*nPy
> Message-ID: <4EBD60AA0200003200009B2C at ia-gw-6.dundee.ac.uk>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Martin and Gustavo,
> I took Martin's first solution and added 'libmpi_f77.so' to the
> LD_LIBRARY_PATH setup in my .bash_profile. It worked, finally! But another funny error occurred.
>
> S/R EEBOOT_MINIMAL: No. of processes not equal to nPx*nPy     1     4
> STOP ABNORMAL END: PROGRAM MAIN
>
> In this message I believe the first column (1) is how many processors the code
> recognizes it should run on. The second column (4) is the number of procs I
> request in my MPI command. Correct?
>
> I noticed that the same error happened in 2009:
> http://mitgcm.org/pipermail/mitgcm-support/2009-April/006011.html
> But I didn't see a solution there except for the genmake2 change.
> Any idea about the problem?
>
> I also tried deleting the files SIZE.h_mpi and CPP_EEOPTIONS.h_mpi, but still got the same error message.
> The SIZE.h is as follows.
>
>       INTEGER sNx
>       INTEGER sNy
>       INTEGER OLx
>       INTEGER OLy
>       INTEGER nSx
>       INTEGER nSy
>       INTEGER nPx
>       INTEGER nPy
>       INTEGER Nx
>       INTEGER Ny
>       INTEGER Nr
>       PARAMETER (
>      &           sNx =  40,
>      &           sNy =  21,
>      &           OLx =   3,
>      &           OLy =   3,
>      &           nSx =   1,
>      &           nSy =   1,
>      &           nPx =   2,
>      &           nPy =   2,
>      &           Nx  = sNx*nSx*nPx,
>      &           Ny  = sNy*nSy*nPy,
>      &           Nr  =   8)
>
> C     MAX_OLX :: Set to the maximum overlap region size of any array
> C     MAX_OLY    that will be exchanged. Controls the sizing of exch
> C                routine buffers.
>       INTEGER MAX_OLX
>       INTEGER MAX_OLY
>       PARAMETER ( MAX_OLX = OLx,
>      &            MAX_OLY = OLy )
>
> BTW, Gustavo, you are right. MPI_INC_DIR is a *directory*, so I just added the line
> MPI_INC_DIR=/usr/include/openmpi-x86_64
> in my case.
>
> Best wishes!
> chunyan
>   
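
The check behind the quoted EEBOOT_MINIMAL message can be mimicked before launching a run. The sketch below is an illustration only and is not how MITgcm performs the test (that happens in Fortran at start-up). The helper names `read_int_params` and `check_process_count` are hypothetical, and the reading of the two printed numbers as "processes seen at run time" and "nPx*nPy" is an assumption based on the discussion in this thread.

```python
import re

# PARAMETER block copied from the SIZE.h quoted in this thread.
SIZE_H = """\
      PARAMETER (
     &           sNx =  40,
     &           sNy =  21,
     &           OLx =   3,
     &           OLy =   3,
     &           nSx =   1,
     &           nSy =   1,
     &           nPx =   2,
     &           nPy =   2,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =   8)
"""


def read_int_params(text):
    """Collect 'name = integer' assignments; expressions such as Nx are skipped."""
    pairs = re.findall(r"\b(\w+)\s*=\s*(\d+)\b", text)
    return {name: int(value) for name, value in pairs}


def check_process_count(text, nprocs):
    """Return (matches, nPx*nPy) for a requested number of MPI processes."""
    params = read_int_params(text)
    expected = params["nPx"] * params["nPy"]
    return nprocs == expected, expected


if __name__ == "__main__":
    for nprocs in (1, 4):
        ok, expected = check_process_count(SIZE_H, nprocs)
        status = "OK" if ok else "mismatch"
        print(f"processes={nprocs} nPx*nPy={expected} -> {status}")
```

With the values above, nPx*nPy is 4, so a run that sees only one process fails the comparison, which is consistent with the "1 4" pair in the error message under the stated assumption.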

> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support



