[MITgcm-support] Problem with parallel build: No. of processes not equal to nPx*nPy

Chun-Yan Zhou c.zhou at dundee.ac.uk
Fri Nov 11 12:51:38 EST 2011


Hi Martin and Gustavo,

I followed Martin's first suggestion and added the directory containing 'libmpi_f77.so' to LD_LIBRARY_PATH in my .bash_profile. It finally worked! But then another odd error occurred:
  

S/R EEBOOT_MINIMAL: No. of processes not equal to nPx*nPy     1     4 
STOP ABNORMAL END: PROGRAM MAIN 

In this message, I believe the first number (1) is how many processors the code thinks it should run on, and the second number (4) is the number of processes I requested in my MPI command. Is that correct?

I noticed that the same error was reported in 2009: http://mitgcm.org/pipermail/mitgcm-support/2009-April/006011.html
But I didn't see a solution there apart from the genmake2 change. Any idea what the problem is?

I also tried deleting the files SIZE.h_mpi and CPP_EEOPTIONS.h_mpi, but still got the same error message.
My SIZE.h is as follows.

      INTEGER sNx 
      INTEGER sNy 
      INTEGER OLx 
      INTEGER OLy 
      INTEGER nSx 
      INTEGER nSy 
      INTEGER nPx 
      INTEGER nPy 
      INTEGER Nx 
      INTEGER Ny 
      INTEGER Nr 
      PARAMETER ( 
     &           sNx =  40, 
     &           sNy =  21, 
     &           OLx =   3, 
     &           OLy =   3, 
     &           nSx =   1, 
     &           nSy =   1, 
     &           nPx =   2, 
     &           nPy =   2, 
     &           Nx  = sNx*nSx*nPx, 
     &           Ny  = sNy*nSy*nPy, 
     &           Nr  =   8) 

C     MAX_OLX :: Set to the maximum overlap region size of any array 
C     MAX_OLY    that will be exchanged. Controls the sizing of exch 
C                routine buffers. 
      INTEGER MAX_OLX 
      INTEGER MAX_OLY 
      PARAMETER ( MAX_OLX = OLx, 
     &            MAX_OLY = OLy )  
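
For reference, the comparison behind this message can be reproduced outside the model. The following is only a minimal sketch, assuming SIZE.h uses the "nPx =   2," layout shown above; the helper names size_param and check_procs are made up for illustration and are not part of MITgcm.

```shell
#!/bin/bash
# Hypothetical helpers (not part of MITgcm): read nPx and nPy from a
# SIZE.h file and compare nPx*nPy with the number of MPI processes.

# Print the integer assigned to parameter $2 in file $1, e.g. "nPx =   2,".
size_param() {
    sed -n "s/^ *& *$2 *= *\([0-9][0-9]*\).*/\1/p" "$1" | head -n 1
}

# Succeed only if the process count $2 equals nPx*nPy read from file $1.
check_procs() {
    local npx npy need
    npx=$(size_param "$1" nPx)
    npy=$(size_param "$1" nPy)
    need=$((npx * npy))
    if [ "$2" -eq "$need" ]; then
        echo "OK: $2 processes match nPx*nPy = $need"
    else
        echo "Mismatch: $2 processes requested, nPx*nPy = $need"
        return 1
    fi
}
```

With the SIZE.h above, nPx*nPy = 2*2 = 4, so "check_procs SIZE.h 4" succeeds and "check_procs SIZE.h 1" reports a mismatch.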


BTW, Gustavo, you are right. MPI_INC_DIR is a *directory*, so in my case I just added the line
MPI_INC_DIR=/usr/include/openmpi-x86_64
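
A quick way to confirm that a candidate value really is a directory holding mpif.h is sketched below. The helper name check_mpi_inc_dir is hypothetical and is not part of genmake2 or the build-options files; it only tests the path it is given.

```shell
#!/bin/bash
# Hypothetical helper (not part of genmake2): check that a candidate
# MPI_INC_DIR is a directory and that it contains mpif.h.
check_mpi_inc_dir() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1"
        return 1
    fi
    if [ ! -f "$1/mpif.h" ]; then
        echo "mpif.h not found in: $1"
        return 1
    fi
    echo "MPI_INC_DIR=$1"
}
```

For example, "check_mpi_inc_dir /usr/include/openmpi-x86_64" prints the line to add, while passing the full path to mpif.h itself fails with "not a directory".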

Best wishes! 
chunyan
 
>>> <mitgcm-support-request at mitgcm.org> 11/11/2011 5:00 PM >>>

Today's Topics:

   1. Problem with parallel build (Chun-Yan Zhou)
   2. Re: Problem with parallel build (Martin Losch)
   3. Re: Problem with parallel build (Gustavo Correa)
   4. Re: Problem with parallel build (Chun-Yan Zhou)
   5. Re: Problem with parallel build (Martin Losch)


----------------------------------------------------------------------

Message: 1
Date: Fri, 11 Nov 2011 12:02:07 +0000
From: "Chun-Yan Zhou" <c.zhou at dundee.ac.uk>
To: <mitgcm-support at mitgcm.org>
Subject: [MITgcm-support] Problem with parallel build
Message-ID: <4EBD0EBF0200003200009AFF at ia-gw-6.dundee.ac.uk>
Content-Type: text/plain; charset="us-ascii"


Dear Gustavo,
 
Thanks for your help. Yes, I found mpif77 in the path '/usr/lib64/openmpi/bin/' and mpif.h in the path '/usr/include/openmpi-x86_64/', but I still can't find mpiof.h, and the administrator told me that he doesn't (yet) know which package mpiof.h belongs to. Could you confirm that this is the correct filename?
Anyway, I tried to modify 'linux_amd64_gfortran+mpi_generic' as follows:
 
    MPI_HEADER_FILES='mpif.h mpiof.h'
    MPI_HEADER_FILES_INC='/usr/include/openmpi-x86_64/mpif.h'

../../../tools/cyrus-imapd-makedepend/makedepend: warning:  mdsio_readvector.F (reading EESUPPORT.h, line 175): cannot find include file "mpif.h"
not in mpif.h
not in mpif.h
not in /usr/include/mpif.h

It seems it didn't use the mpif.h path I set.

What's the problem? Any suggestions? I feel frustrated about the MPI thing; it is too complicated for me. Help me out, please.
Best wishes.
chunyan
The University of Dundee is a registered Scottish charity, No: SC015096

------------------------------

Message: 2
Date: Fri, 11 Nov 2011 13:57:56 +0100
From: Martin Losch <Martin.Losch at awi.de>
To: mitgcm-support at mitgcm.org
Subject: Re: [MITgcm-support] Problem with parallel build
Message-ID: <D771CA38-E45D-4C2B-8A8F-9980829A3749 at awi.de>
Content-Type: text/plain; CHARSET=US-ASCII

Chunyan,

I do not know what mpiof.h is either (but I have to admit that I do not "speak mpi" very well), and I could not find any reference to "mpiof.h" in the code:
grep mpiof.h */inc/* */src/* pkg/*/*
did not give any hits.

Based on the most recent code (some of the build-option files have been merged or moved to 'unsupported', including linux_amd64_gfortran+mpi_generic), I suggest that you try to start from linux_amd64_gfortran
and add a line
MPI_INC_DIR=/usr/include/openmpi-x86_64/mpif.h
and you replace the lines
  CC=${CC:=mpicc}
  FC=${FC:=mpif77}
  F90C=${F90C:=mpif90}
with
  CC=/usr/lib64/openmpi/bin/mpicc
  FC=/usr/lib64/openmpi/bin/mpif77
  F90C=/usr/lib64/openmpi/bin/mpif90

If that still does not work, you should verify for yourself that your MPI installation is working properly: compile and run a simple program such as this one
<http://www.rcac.purdue.edu/userinfo/resources/common/run/pbs/examples/hello77_fortran.html>
with mpif77 or mpif90.

Martin






------------------------------

Message: 3
Date: Fri, 11 Nov 2011 10:30:25 -0500
From: Gustavo Correa <gus at ldeo.columbia.edu>
To: mitgcm-support at mitgcm.org
Subject: Re: [MITgcm-support] Problem with parallel build
Message-ID: <51B07CC8-20DF-44EE-9E0F-DEB8F9DABD41 at ldeo.columbia.edu>
Content-Type: text/plain; charset=us-ascii

Hi Chun-Yan, Martin

I think the macros MPI_INC_DIR and MPI_HEADER_FILES_INC should probably point to
a *directory*, not to a file.
However, my admittedly old build options files don't have these macros.
I have the "INCLUDES" macro, which again points to the MPI include *directory*, not to a file:
INCLUDES=/path/to/mpi/include

I don't think there is any mpiof.h file in MPI either.
It sounds like a typo.
I would just remove it from the options file and see how it goes.

In addition, I second Martin's suggestion
that you use the FC=mpif77 or FC=mpif90 compiler wrapper,
instead of the compiler itself (ifort, gfortran, whatever).
The compiler wrapper will be able to find the include files.

I hope this helps,
Gus Correa





------------------------------

Message: 4
Date: Fri, 11 Nov 2011 15:54:39 +0000
From: "Chun-Yan Zhou" <c.zhou at dundee.ac.uk>
To: <mitgcm-support at mitgcm.org>
Subject: Re: [MITgcm-support] Problem with parallel build
Message-ID: <4EBD453F0200003200009B16 at ia-gw-6.dundee.ac.uk>
Content-Type: text/plain; charset="us-ascii"


Dear Martin,
Thanks for your help. I followed your suggestion, and I can now build the mitgcmuv executable without any errors. However, there is an error when I run it:

./mitgcmuv: error while loading shared libraries: libmpi_f77.so.1: cannot open shared object file: No such file or directory.
 
But I can find the file libmpi_f77.so.1 in the path '/usr/lib64/openmpi/lib/l', so it seems I need to add a directory somewhere? What do you think? BTW, I tested the MPI installation and it works properly.

chunyan

------------------------------

Message: 5
Date: Fri, 11 Nov 2011 17:15:17 +0100
From: Martin Losch <Martin.Losch at awi.de>
To: mitgcm-support at mitgcm.org
Subject: Re: [MITgcm-support] Problem with parallel build
Message-ID: <AFF8BBBE-92A4-40BA-A4BD-D12E9D287AD0 at awi.de>
Content-Type: text/plain; CHARSET=US-ASCII

Hi Chunyan,

I am glad you can compile now.

libmpi_f77.so is a dynamic library. Dynamic libraries are loaded at run time (the lib*.a files are static libraries, which are included in the executable at link time). This is something that you need to discuss with your sysadmin, i.e., do you need to set environment variables such as LD_LIBRARY_PATH?

Alternatively you can try to compile with "-static-libgfortran" (if the static libraries exist), see "man gfortran".
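
If LD_LIBRARY_PATH does turn out to be the answer, something along the following lines could go into .bash_profile. This is only a sketch: the helper name add_lib_dir is made up for illustration, and the directory in the commented example is a placeholder, not a path taken from this thread.

```shell
#!/bin/bash
# Hypothetical helper: prepend directory $1 to LD_LIBRARY_PATH unless it
# is already listed, so the run-time loader searches it for shared libraries.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Example call (placeholder path; use the directory that actually
# contains libmpi_f77.so.1 on your system):
# add_lib_dir /path/to/openmpi/lib
```

Afterwards, "ldd ./mitgcmuv" lists the shared libraries the executable needs and marks any that the loader still cannot resolve as "not found".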

Martin





------------------------------



End of MITgcm-support Digest, Vol 101, Issue 13
***********************************************
