[MITgcm-support] Problem with parallel build: No. of processes not equal to nPx*nPy

Gustavo Correa gus at ldeo.columbia.edu
Fri Nov 11 13:24:06 EST 2011


Did you really name the file "size.h", as your email says?
I think the file that actually gets compiled is "SIZE.h".
Note that the name is all upper case.
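
File names are case-sensitive on Linux, so differently capitalized spellings of the header are distinct files, and only the one the sources include gets compiled. A small shell sketch along these lines may help to see which spelling is actually present. The helper name `list_size_headers` is made up for illustration, and the directory `code` in the usage comment is only a guess at your layout.

```shell
#!/bin/sh
# Sketch: list every file in a directory whose name starts with "size.h",
# ignoring capitalization (so a suffix such as "_mpi" is listed too).
# list_size_headers is a hypothetical helper name, not part of MITgcm.
list_size_headers() {
    dir=${1:-.}
    # grep -i matches case-insensitively; names are printed exactly as stored.
    ls "$dir" | grep -i '^size\.h'
}

# Example (the directory name is an assumption):
#   list_size_headers code
```

If more than one spelling shows up, it is probably worth checking which one the build directory really picked up.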

Yes, nPx*nPy must be equal to the number of processes you request in your mpirun command.
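
One way to cross-check the two numbers before launching is sketched below in shell. It assumes the PARAMETER block is laid out as in the listing quoted further down, with one `nPx = N,` style assignment per line. The function name `tile_procs` and the path `code/SIZE.h` are illustrative assumptions, not MITgcm tools or fixed locations.

```shell
#!/bin/sh
# Sketch: read nPx and nPy from a SIZE.h-style file and print their product.
# Assumes each value is set by a literal "nPx = <integer>" assignment.
# tile_procs is a hypothetical helper name, not part of MITgcm.
tile_procs() {
    size_file=$1
    # Keep only the digits of the first "nPx = <int>" / "nPy = <int>" assignment.
    npx=$(sed -n 's/.*[^A-Za-z0-9_]nPx *= *\([0-9][0-9]*\).*/\1/p' "$size_file" | head -n 1)
    npy=$(sed -n 's/.*[^A-Za-z0-9_]nPy *= *\([0-9][0-9]*\).*/\1/p' "$size_file" | head -n 1)
    if [ -z "$npx" ] || [ -z "$npy" ]; then
        echo "could not find nPx/nPy assignments in $size_file" >&2
        return 1
    fi
    echo $((npx * npy))
}

# Example (the path is an assumption about your setup):
#   mpirun -np "$(tile_procs code/SIZE.h)" ./mitgcmuv
```

Keep in mind that this only reads the header text. The executable uses whatever values were in the header when it was compiled, so a rebuild is needed after changing them.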

Gus Correa
On Nov 11, 2011, at 12:51 PM, Chun-Yan Zhou wrote:

> Hi Martin and Gustavo,
> I took Martin's first suggestion and added the directory containing 'libmpi_f77.so' to
> LD_LIBRARY_PATH in my .bash_profile. It worked, finally! But another funny error occurred.
>  
> S/R EEBOOT_MINIMAL: No. of processes not equal to nPx*nPy     1     4
> STOP ABNORMAL END: PROGRAM MAIN
> 
> In this message, I believe the first column (1) is the number of processes the code
> thinks it should run on, and the second column (4) is the number of processes I request
> in my MPI command. Correct?
> 
> I noticed that the same error came up in 2009: http://mitgcm.org/pipermail/mitgcm-support/2009-April/006011.html
> But I didn't see a solution there except for the genmake2 change. Any idea about the problem?
> 
> I also tried deleting the files Size.h_mpi and CPP_EEOPTIONS.h_mpi, but I still got the same error message.
> My size.h is as follows.
> 
>       INTEGER sNx
>       INTEGER sNy
>       INTEGER OLx
>       INTEGER OLy
>       INTEGER nSx
>       INTEGER nSy
>       INTEGER nPx
>       INTEGER nPy
>       INTEGER Nx
>       INTEGER Ny
>       INTEGER Nr
>       PARAMETER (
>      &           sNx =  40,
>      &           sNy =  21,
>      &           OLx =   3,
>      &           OLy =   3,
>      &           nSx =   1,
>      &           nSy =   1,
>      &           nPx =   2,
>      &           nPy =   2,
>      &           Nx  = sNx*nSx*nPx,
>      &           Ny  = sNy*nSy*nPy,
>      &           Nr  =   8)
> 
> C     MAX_OLX :: Set to the maximum overlap region size of any array
> C     MAX_OLY    that will be exchanged. Controls the sizing of exch
> C                routine buffers.
>       INTEGER MAX_OLX
>       INTEGER MAX_OLY
>       PARAMETER ( MAX_OLX = OLx,
>      &            MAX_OLY = OLy )
> 
> 
> BTW, Gustavo, you are right. MPI_INC_DIR is a *directory*, so in my case I just added the line
> MPI_INC_DIR=/usr/include/openmpi-x86_64
> 
> Best wishes!
> chunyan
> >>> <mitgcm-support-request at mitgcm.org> 11/11/2011 5:00 PM >>>
> Today's Topics:
> 
>    1. Problem with parallel build (Chun-Yan Zhou)
>    2. Re: Problem with parallel build (Martin Losch)
>    3. Re: Problem with parallel build (Gustavo Correa)
>    4. Re: Problem with parallel build (Chun-Yan Zhou)
>    5. Re: Problem with parallel build (Martin Losch)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Fri, 11 Nov 2011 12:02:07 +0000
> From: "Chun-Yan Zhou" <c.zhou at dundee.ac.uk>
> To: <mitgcm-support at mitgcm.org>
> Subject: [MITgcm-support] Problem with parallel build
> Message-ID: <4EBD0EBF0200003200009AFF at ia-gw-6.dundee.ac.uk>
> Content-Type: text/plain; charset="us-ascii"
> 
> 
> Dear Gustavo,
>  
> Thanks for your help. Yes, I found mpif77 in the path '/usr/lib64/openmpi/bin/' and mpif.h in the path '/usr/include/openmpi-x86_64/', but I still can't find mpiof.h, and the administrator told me that he doesn't (yet) know which package mpiof.h belongs to. Could you confirm that this is the correct filename?
> Anyway, I tried to modify 'linux_amd64_gfortran+mpi_generic' as follows:
>  
>     MPI_HEADER_FILES='mpif.h mpiof.h'
>     MPI_HEADER_FILES_INC='/usr/include/openmpi-x86_64/mpif.h'
> 
> ../../../tools/cyrus-imapd-makedepend/makedepend: warning:  mdsio_readvector.F (reading EESUPPORT.h, line 175): cannot find include file "mpif.h"
> not in mpif.h
> not in mpif.h
> not in /usr/include/mpif.h
> 
> It seems makedepend did not look in the mpif.h path I set.
> 
> What's the problem? Any suggestions? I feel frustrated about the MPI thing; it is too complicated for me. Help me out, please.
> Best wishes.
> chunyan
> The University of Dundee is a registered Scottish charity, No: SC015096
> 
> ------------------------------
> 
> Message: 2
> Date: Fri, 11 Nov 2011 13:57:56 +0100
> From: Martin Losch <Martin.Losch at awi.de>
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] Problem with parallel build
> Message-ID: <D771CA38-E45D-4C2B-8A8F-9980829A3749 at awi.de>
> Content-Type: text/plain; CHARSET=US-ASCII
> 
> Chunyan,
> 
> I do not know what mpiof.h is either (but I have to admit that I do not "speak MPI" very well), and I could not find any reference to "mpiof.h" in the code:
> grep mpiof.h */inc/* */src/* pkg/*/*
> did not give any hits.
> 
> Based on the most recent code (some of the build-option files have been merged or moved to 'unsupported', including linux_amd64_gfortran+mpi_generic), I suggest that you start from linux_amd64_gfortran
> and add a line
> MPI_INC_DIR=/usr/include/openmpi-x86_64/mpif.h
> and you replace the lines
>   CC=${CC:=mpicc}
>   FC=${FC:=mpif77}
>   F90C=${F90C:=mpif90}
> with
>   CC=/usr/lib64/openmpi/bin/mpicc
>   FC=/usr/lib64/openmpi/bin/mpif77
>   F90C=/usr/lib64/openmpi/bin/mpif90
> 
> If that still does not work, you should verify for yourself that your MPI installation is working properly: compile and run a simple program such as this one:
> <http://www.rcac.purdue.edu/userinfo/resources/common/run/pbs/examples/hello77_fortran.html>
> with mpif77 or mpif90.
> 
> Martin
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Fri, 11 Nov 2011 10:30:25 -0500
> From: Gustavo Correa <gus at ldeo.columbia.edu>
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] Problem with parallel build
> Message-ID: <51B07CC8-20DF-44EE-9E0F-DEB8F9DABD41 at ldeo.columbia.edu>
> Content-Type: text/plain; charset=us-ascii
> 
> Hi Chun-Yan, Martin
> 
> I think the macros MPI_INC_DIR and MPI_HEADER_FILES_INC should probably point to
> a *directory*, not to a file.
> However, in my admittedly old build options files I don't have these macros.
> I have the "INCLUDES" macro, which again points to the MPI include *directory*, not to a file:
> INCLUDES=/path/to/mpi/include
> 
> I don't think there is any mpiof.h file in MPI either.
> It sounds like a typo.
> I would just remove it from the options file and see how it goes.
> 
> In addition, I second Martin's suggestion
> that you use the FC=mpif77 or FC=mpif90 compiler wrapper,
> instead of the compiler itself (ifort, gfortran, whatever).
> The compiler wrapper will be able to find the include files.
> 
> I hope this helps,
> Gus Correa
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Fri, 11 Nov 2011 15:54:39 +0000
> From: "Chun-Yan Zhou" <c.zhou at dundee.ac.uk>
> To: <mitgcm-support at mitgcm.org>
> Subject: Re: [MITgcm-support] Problem with parallel build
> Message-ID: <4EBD453F0200003200009B16 at ia-gw-6.dundee.ac.uk>
> Content-Type: text/plain; charset="us-ascii"
> 
> 
> Dear Martin,
> Thanks for your help. I followed your suggestion, and now I can build the mitgcmuv executable without any errors. However, there is an error when I run it:
> 
> ./mitgcmuv: error while loading shared libraries: libmpi_f77.so.1: cannot open shared object file: No such file or directory.
>  
> But I can find the file libmpi_f77.so.1 in the path '/usr/lib64/openmpi/lib/l', so it seems like I need to add some directory somewhere? What do you think? BTW, I tested that the MPI installation is working properly.
> 
> chunyan
> 
> ------------------------------
> 
> Message: 5
> Date: Fri, 11 Nov 2011 17:15:17 +0100
> From: Martin Losch <Martin.Losch at awi.de>
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] Problem with parallel build
> Message-ID: <AFF8BBBE-92A4-40BA-A4BD-D12E9D287AD0 at awi.de>
> Content-Type: text/plain; CHARSET=US-ASCII
> 
> Hi Chunyan,
> 
> I am glad you can compile now.
> 
> libmpi_f77.so is a dynamic library. Dynamic libraries are loaded at run time (the lib*.a files are static libraries, which are linked into the executable when it is built). This is something that you need to discuss with your sys-admin, i.e., do you need to set environment paths such as LD_LIBRARY_PATH?
> 
> Alternatively you can try to compile with "-static-libgfortran" (if the static libraries exist), see "man gfortran".
> 
> Martin
> 
> 
> ------------------------------
> 
> 
> End of MITgcm-support Digest, Vol 101, Issue 13
> ***********************************************
> 
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support



