[MITgcm-support] Internal-Wave: Whether I could enable MPI with using only one tile001.mitgrid (LLC grid)

陈忱 chen636489 at gmail.com
Wed Jun 8 18:11:28 EDT 2016


Hi, Jean,

Thank you very much! pkg/exch2 works for me: it lets me run the internal_wave
experiment with only one input grid file. Here is what I have done:

1. /code/packages.conf

gfd
obcs
kl10
timeave
exch2
mdsio

2. /code/SIZE.h

      PARAMETER (
     &           sNx =   15,
     &           sNy =   1,
     &           OLx =   2,
     &           OLy =   1,
     &           nSx =   4,
     &           nSy =   1,
     &           nPx =   1,
     &           nPy =   1,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =  20)

C     MAX_OLX  - Set to the maximum overlap region size of any array
C     MAX_OLY    that will be exchanged. Controls the sizing of exch
C                routine buffers.
      INTEGER MAX_OLX
      INTEGER MAX_OLY
      INTEGER NOBCS
      PARAMETER ( MAX_OLX = OLx,
     &            MAX_OLY = OLy,
     &            NOBCS = 2 )

(PS: the parts highlighted in blue are the lines I added. Building with or
without them gives the same error.)
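For reference, the arithmetic behind this SIZE.h can be sketched in a few lines of Python. This is only an illustration and not part of MITgcm: the helper name `tile_ranges` is made up, and it assumes the tiles are laid out left to right along x.

```python
# Illustration only: reproduce the SIZE.h arithmetic for this setup.
# tile_ranges is a hypothetical helper, not an MITgcm routine.

def tile_ranges(sNx, nSx, nPx):
    """Return (Nx, ranges), where ranges holds the 1-based global
    (first_i, last_i) of each tile, assuming a left-to-right layout in x."""
    n_tiles = nSx * nPx
    Nx = sNx * n_tiles
    ranges = [(t * sNx + 1, (t + 1) * sNx) for t in range(n_tiles)]
    return Nx, ranges

# Values from the SIZE.h above: sNx = 15, nSx = 4, nPx = 1.
Nx, ranges = tile_ranges(sNx=15, nSx=4, nPx=1)
print("Nx =", Nx)
for tile, (i0, i1) in enumerate(ranges, start=1):
    print(f"tile {tile}: i = {i0}..{i1}")
```

The same total, Nx = 60, results from (nSx, nPx) = (1, 4), which is the parallel layout used earlier in this thread.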

3. /input/data

 &PARM04
 usingCurvilinearGrid=.TRUE.,
 delZ=20*10.,

4. /input/data.exch2

 &W2_EXCH2_PARM01

  preDefTopol= 1,
  W2_mapIO   = 1,
  dimsFacets = 60, 1,
 &
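As a sanity check on the single grid file, its expected size can be estimated as below. The sketch uses what is stated in this thread, namely 16 fields each of size (Nx+1) x (Ny+1) = 61 x 2. It additionally assumes 8-byte reals with no record markers, which is not verified here, and `expected_mitgrid_bytes` is a made-up helper name.

```python
import os

# Hypothetical helper: expected size of one .mitgrid file.
# Assumption (not verified here): 8-byte reals, no record markers.
def expected_mitgrid_bytes(nx, ny, n_fields=16, bytes_per_value=8):
    return n_fields * (nx + 1) * (ny + 1) * bytes_per_value

nx, ny = 60, 1  # same values as dimsFacets = 60, 1
expected = expected_mitgrid_bytes(nx, ny)
print("expected size:", expected, "bytes")

# Compare against the real file only if it is present in this directory.
path = "tile001.mitgrid"
if os.path.exists(path):
    actual = os.path.getsize(path)
    status = "match" if actual == expected else "MISMATCH"
    print("actual size:", actual, "bytes", f"({status})")
```

A mismatch would suggest that the file does not hold the full 61 x 2 fields, or that the precision assumption above is wrong.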

============================================
MITgcm successfully reads the single tile001.mitgrid as 4 tiles; see
STDOUT.0000:

(PID.TID 0000.0001) SET_PARMS: done
(PID.TID 0000.0001) Enter INI_VERTICAL_GRID: setInterFDr=    T ; setCenterDr=    F
(PID.TID 0000.0001) tile:   1 ; Read from file tile001.mitgrid
(PID.TID 0000.0001)   => xC yC dxF dyF rA xG yG dxV dyU rAz dxC dyC rAw rAs dxG dyG
(PID.TID 0000.0001) tile:   2 ; Read from file tile001.mitgrid
(PID.TID 0000.0001)   => xC yC dxF dyF rA xG yG dxV dyU rAz dxC dyC rAw rAs dxG dyG
(PID.TID 0000.0001) tile:   3 ; Read from file tile001.mitgrid
(PID.TID 0000.0001)   => xC yC dxF dyF rA xG yG dxV dyU rAz dxC dyC rAw rAs dxG dyG
(PID.TID 0000.0001) tile:   4 ; Read from file tile001.mitgrid
(PID.TID 0000.0001)   => xC yC dxF dyF rA xG yG dxV dyU rAz dxC dyC rAw rAs dxG dyG
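With many more tiles, checking these messages by eye becomes tedious. A small sketch like the following could summarize which tiles were read from which grid file. It is an illustration only: the regular expression is modelled on the log lines quoted above, and `tiles_per_file` is a made-up helper name.

```python
import re
from collections import defaultdict

# Pattern modelled on lines such as:
#   (PID.TID 0000.0001) tile:   1 ; Read from file tile001.mitgrid
TILE_READ = re.compile(r"tile:\s*(\d+)\s*;\s*Read from file\s+(\S+)")

def tiles_per_file(log_text):
    """Map each grid file name to the list of tile numbers read from it."""
    result = defaultdict(list)
    for match in TILE_READ.finditer(log_text):
        result[match.group(2)].append(int(match.group(1)))
    return dict(result)

sample = """\
(PID.TID 0000.0001) tile:   1 ; Read from file tile001.mitgrid
(PID.TID 0000.0001) tile:   2 ; Read from file tile001.mitgrid
(PID.TID 0000.0001) tile:   3 ; Read from file tile001.mitgrid
(PID.TID 0000.0001) tile:   4 ; Read from file tile001.mitgrid
"""
print(tiles_per_file(sample))
```

For a real run, the text of STDOUT.0000 would be passed in place of `sample`.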

============================================
However, the model crashes while initializing the open boundaries; see STDERR.0000:

(PID.TID 0000.0001) // =================================
(PID.TID 0000.0001) // END OF FIELD                                          =
(PID.TID 0000.0001) // ==================================

(PID.TID 0000.0001) GAD_INIT_FIXED: GAD_OlMinSize=  1  0  1
(PID.TID 0000.0001)
(PID.TID 0000.0001) // ===================================
(PID.TID 0000.0001) // GAD parameters :
(PID.TID 0000.0001) // ===================================
(PID.TID 0000.0001) tempAdvScheme =   /* Temp. Horiz.Advection scheme selector */
(PID.TID 0000.0001)                       2
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) tempVertAdvScheme =   /* Temp. Vert. Advection scheme selector */
(PID.TID 0000.0001)                       2
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) tempMultiDimAdvec =   /* use Muti-Dim Advec method for Temp */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) tempSOM_Advection = /* use 2nd Order Moment Advection for Temp */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) AdamsBashforthGt = /* apply Adams-Bashforth extrapolation on Gt */
(PID.TID 0000.0001)                   T
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) AdamsBashforth_T = /* apply Adams-Bashforth extrapolation on Temp */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) saltAdvScheme =   /* Salt. Horiz.advection scheme selector */
(PID.TID 0000.0001)                       2
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) saltVertAdvScheme =   /* Salt. Vert. Advection scheme selector */
(PID.TID 0000.0001)                       2
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) saltMultiDimAdvec =   /* use Muti-Dim Advec method for Salt */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) saltSOM_Advection = /* use 2nd Order Moment Advection for Salt */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) AdamsBashforthGs = /* apply Adams-Bashforth extrapolation on Gs */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) AdamsBashforth_S = /* apply Adams-Bashforth extrapolation on Salt */
(PID.TID 0000.0001)                   F
(PID.TID 0000.0001)     ;
(PID.TID 0000.0001) // ===================================
(PID.TID 0000.0001) OBCS_INIT_FIXED: Setting OB indices in Overlap
(PID.TID 0000.0001)  Sets OBW(j,bi,bj=    0,  1,  1)=    1
(PID.TID 0000.0001) OBCS_INIT_FIXED: Setting OB indices in Overlap <= done

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f807efc766f in ???
#1  0x519d09 in ???
#2  0x632295 in ???
#3  0x6231f0 in ???
#4  0x650ef5 in ???
#5  0x582911 in ???
#6  0x401b26 in ???
#7  0x7f807efb3b14 in ???
#8  0x401b50 in ???
#9  0xffffffffffffffff in ???
Segmentation fault (core dumped)
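Since every frame in the backtrace shows up as ???, one possible next step is to pass the addresses to binutils' addr2line. The sketch below only builds the command line from the backtrace text. It rests on several assumptions: the executable is called ./mitgcmuv, addresses below 2**32 belong to the executable rather than to shared libraries, and `addr2line_command` is a made-up helper name.

```python
import re
import shlex

# Matches unresolved frames such as:  #1  0x519d09 in ???
FRAME = re.compile(r"^#\d+\s+(0x[0-9a-fA-F]+)\s+in\s+\?\?\?", re.MULTILINE)

def addr2line_command(backtrace, executable="./mitgcmuv"):
    """Build an addr2line command for the low addresses in a backtrace.

    Assumption: addresses below 2**32 lie inside the executable itself;
    higher ones (shared libraries, 0xffff...) are skipped.
    """
    addrs = [a for a in FRAME.findall(backtrace) if int(a, 16) < 2**32]
    parts = ["addr2line", "-f", "-e", executable] + addrs
    return " ".join(shlex.quote(p) for p in parts)

sample = """\
#0  0x7f807efc766f in ???
#1  0x519d09 in ???
#2  0x632295 in ???
#3  0x6231f0 in ???
#4  0x650ef5 in ???
#5  0x582911 in ???
#6  0x401b26 in ???
#7  0x7f807efb3b14 in ???
#8  0x401b50 in ???
#9  0xffffffffffffffff in ???
"""
cmd = addr2line_command(sample)
print(cmd)
```

addr2line reports source line numbers only when the executable was built with debugging information (for example with -g); without it, only function names may be available.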

Is there anything I am doing wrong? I also tried disabling pkg/obcs, but then
the run hits an IEEE_DIVIDE_BY_ZERO error and crashes at the second
timestep. Could you give me a clue as to where the bug might be?

Thank you again for your help!

Chen


On Sun, Jun 5, 2016 at 4:54 PM, Jean-Michel Campin <jmc at mit.edu> wrote:

> Hi Chen,
>
> I think you are right, but it might not be easy to fix for something
> that not many people use.
>
> I suggest compiling pkg/exch2 (which means you will
> use EXCH2) and trying again with just one grid file, tile001.mitgrid,
> that contains the full grid (with all fields 61x2).
> With default pkg/exch2 parameters, you don't need to provide a
> parameter file "data.exch2" and it should work just like with EXCH1.
>
> Cheers,
> Jean-Michel
>
> On Sat, Jun 04, 2016 at 11:48:14AM -0500, 陈忱 wrote:
> > Hi All,
> >
> > I am interested in the internal_wave experiment and want to experiment with
> > the input grid file. Here is what I have already tried and the problem I
> > ran into.
> >
> > step0: grid input from /input/data, &PARM04, delxvar.bin (as originally
> > provided) ==> serial run succeeds.
> >
> > step1: grid input from /input/data, &PARM04, tile001.mitgrid (generated)
> > ==> serial run succeeds.
> >
> >      /code/SIZE.h
> >
> >      &           sNx =  60,
> >      &           sNy =   1,
> >      &           OLx =   2,
> >      &           OLy =   2,
> >      &           nSx =   1,
> >      &           nSy =   1,
> >      &           nPx =   1,
> >      &           nPy =   1,
> >      &           Nx  = sNx*nSx*nPx,
> >      &           Ny  = sNy*nSy*nPy,
> >      &           Nr  =  20)
> >
> > step2: input tile001.mitgrid to tile004.mitgrid (generated) ==> serial run succeeds.
> >
> >      /code/SIZE.h
> >
> >      &           sNx =   15,
> >      &           sNy =    1,
> >      &           OLx =    2,
> >      &           OLy =    2,
> >      &           nSx =    4,
> >      &           nSy =    1,
> >      &           nPx =    1,
> >      &           nPy =    1,
> >      &           Nx  = sNx*nSx*nPx,
> >      &           Ny  = sNy*nSy*nPy,
> >      &           Nr  =  20)
> >
> > step3: input tile001.mitgrid to tile004.mitgrid (generated) ==> parallel run succeeds.
> >
> >      /code/SIZE.h
> >
> >      &           sNx =   15,
> >      &           sNy =    1,
> >      &           OLx =    2,
> >      &           OLy =    2,
> >      &           nSx =    1,
> >      &           nSy =    1,
> >      &           nPx =    4,
> >      &           nPy =    1,
> >      &           Nx  = sNx*nSx*nPx,
> >      &           Ny  = sNy*nSy*nPy,
> >      &           Nr  =  20)
> >
> > =====================================
> > Here comes the problem. Eventually I will have a grid with
> > (nSx,nSy,nPx,nPy)=(1,1,20,21) or (nSx,nSy,nPx,nPy)=(20,21,1,1), and
> > generating 420 (20 x 21) tile grid files seems crazy. So in this experiment I
> > would like to use the SIZE.h of step2 (or step3) with only one input file,
> > tile001.mitgrid. This is what I have already tried:
> >
> > 1. Serial run with only one tile001.mitgrid (each of the 16 fields in this
> > grid file has size 61 x 2, e.g. XC(61,2), YC(61,2), etc.).
> > In /input/data, &PARM01: add useSingleCpuIO=.TRUE. and/or
> > useSingleCpuInput=.TRUE., with the SIZE.h from step2:
> > /code/SIZE.h
> >
> >      &           sNx =   15,
> >      &           sNy =    1,
> >      &           OLx =    2,
> >      &           OLy =    2,
> >      &           nSx =    4,
> >      &           nSy =    1,
> >      &           nPx =    1,
> >      &           nPy =    1,
> >      &           Nx  = sNx*nSx*nPx,
> >      &           Ny  = sNy*nSy*nPy,
> >      &           Nr  =  20)
> >
> > ===============> error: At line 696 of file mdsio_facef_read.f (unit = 9)
> > Fortran runtime error: Cannot open file 'tile002.mitgrid': No such file or
> > directory
> >
> >
> >
> > 2. Parallel run with only one tile001.mitgrid (each of the 16 fields in this
> > grid file has size 61 x 2, e.g. XC(61,2), YC(61,2), etc.).
> > In /input/data, &PARM01: add useSingleCpuIO=.TRUE. and/or
> > useSingleCpuInput=.TRUE., with the SIZE.h from step3:
> > /code/SIZE.h
> >
> >      &           sNx =   15,
> >      &           sNy =    1,
> >      &           OLx =    2,
> >      &           OLy =    2,
> >      &           nSx =    1,
> >      &           nSy =    1,
> >      &           nPx =    4,
> >      &           nPy =    1,
> >      &           Nx  = sNx*nSx*nPx,
> >      &           Ny  = sNy*nSy*nPy,
> >      &           Nr  =  20)
> >
> > ===============> error: At line 696 of file mdsio_facef_read.f (unit = 9)
> > Fortran runtime error: Cannot open file 'tile002.mitgrid': No such file or
> > directory
> >
> >
> > 3. Added globalFiles=.TRUE., to &PARM01. Both the serial and the parallel
> > run give the same error.
> >
> >
> > 4. Compared against the SIZE.h and /input/data of the case in
> > MITgcm_contrib/atnguyen/llc_270/aste_270X450X180, but did not notice any
> > flag related to grid input.
> >
> > Could anyone give me any ideas about how to enable this?
> >
> > Thank you in advance! I do appreciate it!
> >
> >
> > Chen CHEN
> > Graduate Student in ICES&EM
> > POB 3.402J
> > The University of Texas at Austin
> >
> > Phone: +1 512-968-2126
> > Email : chen636489 at utexas.edu
>
> > _______________________________________________
> > MITgcm-support mailing list
> > MITgcm-support at mitgcm.org
> > http://mitgcm.org/mailman/listinfo/mitgcm-support
>
>
>



-- 
Chen CHEN
Graduate Student in ICES&EM
POB 3.402J
The University of Texas at Austin

Phone: +1 512-968-2126
Email : chen636489 at utexas.edu