[MITgcm-support] Problem with MPI execution

Abbas Dorostkar abbas.dorostkar at ce.queensu.ca
Sun May 25 12:46:39 EDT 2008


Thanks so much Martin!
That was the problem.
Abbas


-----Original Message-----
From: mitgcm-support-bounces at mitgcm.org
[mailto:mitgcm-support-bounces at mitgcm.org] On Behalf Of
Martin.Losch at awi.de
Sent: May 22, 2008 12:19 PM
To: mitgcm-support at mitgcm.org
Subject: Re: RE: [MITgcm-support] Problem with MPI execution

Hi Abbas,

I don't have a chance to look at the code closely, but it looks like you
have a compiler that expects "/" as a namelist terminator.
Try the flag -DNMLTERMINATOR (or so, I don't recall the exact name) in the
build options file, and then the code will automatically change all
namelists on the fly (otherwise you'll have to do it by hand).
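
For the by-hand route, a minimal shell sketch could look like the
following. It assumes each namelist group ends with a lone "&" on a line
of its own, as in the eedata file quoted further down;
fix_nml_terminator is a hypothetical helper name, not part of MITgcm.
In the MITgcm build options files the define appears to be spelled
-DNML_TERMINATOR, but check your own optfile before relying on that.

```shell
#!/bin/sh
# Sketch: rewrite lone "&" namelist terminators as "/".
# fix_nml_terminator is a hypothetical helper, not an MITgcm tool.
fix_nml_terminator() {
    # Reads a namelist file on stdin and writes the result to stdout.
    # Only lines that contain nothing but "&" (plus optional blanks)
    # are changed; group headers such as "&EEPARMS" and "#" comment
    # lines that merely mention "&" are passed through untouched.
    sed 's|^[[:space:]]*&[[:space:]]*$| /|'
}

# Example use: write a converted copy and keep the original file.
# fix_nml_terminator < eedata > eedata.slash
```

Writing to a separate file rather than editing in place keeps the
original around for comparison; the same filter would have to be applied
to every namelist file in the run directory (data, data.pkg, and so on).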

M.


Martin Losch
Alfred Wegener Institute 
Postfach 120161, 27515 Bremerhaven, Germany; 
Tel./Fax: ++49(0471)4831-1872/1797



----- Original Message -----
From: Abbas Dorostkar <abbas.dorostkar at ce.queensu.ca>
Date: Thursday, May 22, 2008 4:41 pm
Subject: RE: [MITgcm-support] Problem with MPI execution

> Hi Martin and Dimitris,
> Thanks so much!
> 
> I did use the option "-mpi" when I generated the Makefile with genmake2.
> The non-MPI executable also runs without problems.
> Martin, you were right! I just noticed that when I use the command
> "ln -s ../input/* .", it doesn't link eedata into my run folder properly.
> Anyway, I copied eedata into the run folder and now I get a new error.
> What do you mean by "maybe the read permissions are different for
> mpirun"? How can I change them? Your comments are much appreciated!
> 
> I have attached the new error as follows:
> 
> STOP ABNORMAL END: S/R EESET_PARMS
> STOP ABNORMAL END: S/R EESET_PARMS
> 
> (PID.TID 0001.0001) *** ERROR *** S/R EESET_PARMS
> (PID.TID 0001.0001) *** ERROR *** Error reading execution environment
> (PID.TID 0001.0001) *** ERROR *** parameter file "eedata"
> 
> 
> (PID.TID 0000.0001) // ======================================================
> (PID.TID 0000.0001) //                      MITgcm UV
> (PID.TID 0000.0001) //                      =========
> (PID.TID 0000.0001) // ======================================================
> (PID.TID 0000.0001) // execution environment starting up...
> (PID.TID 0000.0001)
> (PID.TID 0000.0001) // MITgcmUV version:  checkpoint59j
> (PID.TID 0000.0001) // Build user:        Abbas
> (PID.TID 0000.0001) // Build host:        LEO.CiVil.QueensU.Ca
> (PID.TID 0000.0001) // Build date:        Thu May 22 09:09:53 EDT 2008
> (PID.TID 0000.0001)
> (PID.TID 0000.0001) // =======================================================
> (PID.TID 0000.0001) // Execution Environment parameter file "eedata"
> (PID.TID 0000.0001) // =======================================================
> (PID.TID 0000.0001) ># Example "eedata" file
> (PID.TID 0000.0001) ># Lines beginning "#" are comments
> (PID.TID 0000.0001) ># nTx - No. threads per process in X
> (PID.TID 0000.0001) ># nTy - No. threads per process in Y
> (PID.TID 0000.0001) > &EEPARMS
> (PID.TID 0000.0001) > nTx=1,
> (PID.TID 0000.0001) > nTy=1,
> (PID.TID 0000.0001) > usingMPI=.TRUE.
> (PID.TID 0000.0001) > &
> (PID.TID 0000.0001) ># Note: Some systems use & as the
> (PID.TID 0000.0001) ># namelist terminator. Other systems
> (PID.TID 0000.0001) ># use a / character (as shown here).
> (PID.TID 0000.0001)
> (PID.TID 0000.0001) // Shown below is an example "eedata" file.
> (PID.TID 0000.0001) // To use this example copy and paste the
> (PID.TID 0000.0001) // ">" lines. Then remove the text up to
> (PID.TID 0000.0001) // and including the ">".
> (PID.TID 0000.0001) ># Example "eedata" file
> (PID.TID 0000.0001) ># Lines beginning "#" are comments
> (PID.TID 0000.0001) ># nTx - No. threads per process in X
> (PID.TID 0000.0001) ># nTy - No. threads per process in Y
> (PID.TID 0000.0001) >&EEPARMS
> (PID.TID 0000.0001) >nTx=1,nTy=1
> (PID.TID 0000.0001) >/
> (PID.TID 0000.0001) ># Note: Some systems use & as the
> (PID.TID 0000.0001) ># namelist terminator. Other systems
> (PID.TID 0000.0001) ># use a / character (as shown here).
> 
> 
> 
> Thanks again 
> Abbas
> 
> 
> 
> -----Original Message-----
> From: mitgcm-support-bounces at mitgcm.org
> [mailto:mitgcm-support-bounces at mitgcm.org] On Behalf Of Martin Losch
> Sent: May 22, 2008 9:25 AM
> To: mitgcm-support at mitgcm.org
> Subject: Re: [MITgcm-support] Problem with MPI execution
> 
> I don't think that the MPI within the MITgcm is the problem. Try to
> generate a non-mpi executable and see if the problem goes away.
> 
> The error message is clear: eedata cannot be opened. Is it in the
> correct directory? What about read permissions? Maybe the read
> permissions are different for mpirun? This is the code where it
> happens:
> >       OPEN(UNIT=eeDataUnit,FILE='eedata',STATUS='OLD',
> >      &     err=1,IOSTAT=errIO)
> >       IF ( errIO .GE. 0 ) GOTO 2
> >     1 CONTINUE
> >        WRITE(msgBuf,'(A)')
> >      &  'S/R EESET_PARMS'
> >        CALL PRINT_ERROR( msgBuf , 1)
> >        WRITE(msgBuf,'(A)')
> >      &  'Unable to open execution environment'
> >        CALL PRINT_ERROR( msgBuf , 1)
> >        WRITE(msgBuf,'(A)')
> >      &  'parameter file "eedata"'
> >        CALL PRINT_ERROR( msgBuf , 1)
> >        CALL EEDATA_EXAMPLE
> >        STOP 'ABNORMAL END: S/R EESET_PARMS'
> 
> no other ideas on this side of the Atlantic ...
> 
> Martin
> 
> On 22 May 2008, at 15:06, Dimitris Menemenlis wrote:
> 
> > Abbas, you probably have already done this but to be sure:
> > did you use option "-mpi" when you generated the Makefile with  
> > genmake2 ?
> >
> > Dimitris Menemenlis <menemenlis at sbcglobal.net>
> > 5056 Oakwood Ave, La Canada, CA 91011-2450
> > tel/fax: 818-790-6735;  cell: 818-625-6498
> >
> > On May 22, 2008, at 6:02 AM, Abbas Dorostkar wrote:
> >
> >> Hi Martin,
> >> Thanks for quick reply.
> >> It is strange because I have this file in the folder I run  
> >> mitgcmuv. I
> >> have attached it:
> >>
> >> # Example "eedata" file
> >> # Lines beginning "#" are comments
> >> # nTx - No. threads per process in X
> >> # nTy - No. threads per process in Y
> >> &EEPARMS
> >> nTx=1,
> >> nTy=1,
> >> usingMPI=.TRUE.
> >> &
> >> # Note: Some systems use & as the
> >> # namelist terminator. Other systems
> >> # use a / character (as shown here).
> >>
> >> Do you have any other ideas? I have been trying to fix this problem
> >> for a while.
> >> Abbas
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: mitgcm-support-bounces at mitgcm.org
> >> [mailto:mitgcm-support-bounces at mitgcm.org] On Behalf Of Martin Losch
> >> Sent: May 22, 2008 3:42 AM
> >> To: mitgcm-support at mitgcm.org
> >> Subject: Re: [MITgcm-support] Problem with MPI execution
> >>
> >> Abbas,
> >>
> >> you are missing the file "eedata", as the error message clearly tells
> >> you. I won't tell you how often I made this mistake!!!
> >>
> >> Martin
> >>
> >> On 21 May 2008, at 22:46, Abbas Dorostkar wrote:
> >>
> >>> Dear all,
> >>>
> >>>
> >>>
> >>> I have been trying to run exp1 with MPI on my desktop (ia32_linux
> >>> equipped with one dual 1.5 processor) before running my own model
> >>> on a node with 72 dual-core processors and 570 GB RAM. I didn't get
> >>> any errors during compilation. However, when I run mitgcmuv with the
> >>> command "mpirun -np 2 ./mitgcmuv", I get the following error:
> >>>
> >>>
> >>>
> >>> STOP ABNOSTOP ABNORMAL END: S/R EESET_PARMS
> >>> RMAL rank 0 in job 1  LEO.CiVil.QueensU.Ca_55941   caused
> >>> collective abort of all ranks
> >>>  exit status of rank 0: return code 0
> >>>
> >>> (PID.TID 0000.0001) *** ERROR *** S/R EESET_PARMS
> >>> (PID.TID 0000.0001) *** ERROR *** Unable to open execution  
> >>> environment
> >>> (PID.TID 0000.0001) *** ERROR *** parameter file "eedata"
> >>>
> >>>
> >>>
> >>> I have successfully run some simple "Hello World"-type MPI programs,
> >>> which shows that my MPI (MPICH2) install is working correctly. I don't
> >>> know what I am missing. Could someone suggest a solution? Here I
> >>> have attached size.h, eedata and the optfile:
> >>>
> >>>
> >>>      PARAMETER (
> >>>     &           sNx =  60,
> >>>     &           sNy =  60,
> >>>     &           OLx =   2,
> >>>     &           OLy =   2,
> >>>     &           nSx =   1,
> >>>     &           nSy =   1,
> >>>     &           nPx =   2,
> >>>     &           nPy =   1,
> >>>     &           Nx  = sNx*nSx*nPx,
> >>>     &           Ny  = sNy*nSy*nPy,
> >>>     &           Nr  =   4)
> >>> ---------------------------------------
> >>> # Example "eedata" file
> >>> # Lines beginning "#" are comments
> >>> # nTx - No. threads per process in X
> >>> # nTy - No. threads per process in Y
> >>> &EEPARMS
> >>> nTx=1,
> >>> nTy=1,
> >>> usingMPI=.TRUE.
> >>> &
> >>> # Note: Some systems use & as the
> >>> # namelist terminator. Other systems
> >>> # use a / character (as shown here).
> >>> ---------------------------------------
> >>> #!/bin/bash
> >>> #
> >>> #  $Header: /u/gcmpack/MITgcm/tools/build_options/linux_ia32_g77+mpi_cg01,v 1.6 2006/03/24 22:34:43 edhill Exp $
> >>> #  $Name:  $
> >>>
> >>> FC='/usr/local/bin/mpif77'
> >>> CC='/usr/local/bin/mpicc'
> >>> DEFINES='-DALLOW_USE_MPI -DALWAYS_USE_MPI -D_BYTESWAPIO -DWORDLENGTH=4'
> >>> INCLUDEDIRS='/usr/local/include'
> >>> INCLUDES='-I/usr/local/include'
> >>> CPP='/lib/cpp  -traditional -P'
> >>> NOOPTFLAGS='-O0'
> >>>
> >>> if test "x$IEEE" = x ; then
> >>>    #  No need for IEEE-754
> >>>    FFLAGS='-Wimplicit -Wunused -Wuninitialized'
> >>>    FOPTIM='-O3 -malign-double -funroll-loops'
> >>> else
> >>>    #  Try to follow IEEE-754
> >>>    FFLAGS='-Wimplicit -Wunused -ffloat-store'
> >>>    FOPTIM='-O0 -malign-double'
> >>> fi
> >>>
> >>> # netcdf
> >>> #LIBS="-lnetcdf"
> >>>
> >>> Your help would be much appreciated.
> >>>
> >>> Thanks a lot
> >>>
> >>> Abbas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> MITgcm-support mailing list
> >>> MITgcm-support at mitgcm.org
> >>> http://mitgcm.org/mailman/listinfo/mitgcm-support