[MITgcm-support] Out of memory: optfile for IBM AIX with compiler xlf90

michael schaferkotter schaferk at bellsouth.net
Thu Sep 4 11:43:30 EDT 2014


greetings;

the problem may or may not be the optfile (probably not).



what i would like to see are the code/SIZE.h and code/DIAGNOSTICS_SIZE.h files.

using the standard distribution code/DIAGNOSTICS_SIZE.h caused similar problems for me with a largish domain (a 330-million-point computational grid).

re: diagnostics pkg.

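though very useful in many cases, the diagnostics package can make the executable large, because its storage arrays are sized at compile time by the parameters in DIAGNOSTICS_SIZE.h.

the diff below shows how code/DIAGNOSTICS_SIZE.h was altered to make the executable smaller; the lines marked < carry the smaller numDiags and diagSt_size values.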

diff DIAGNOSTICS_SIZE.h 
1c1
< C $Header: /u/gcmpack/MITgcm/pkg/diagnostics/DIAGNOSTICS_SIZE.h,v 1.5 2008/02/05 15:31:19 jmc Exp $
---
> C $Header: /u/gcmpack/MITgcm/pkg/diagnostics/DIAGNOSTICS_SIZE.h,v 1.4 2006/01/23 22:24:28 jmc Exp $
24,25c24,25
<       PARAMETER( numlists = 10, numperlist = 50, numLevels=2*Nr )
<       PARAMETER( numDiags = 1*Nr )
---
>       PARAMETER( numlists = 6, numperlist = 10, numLevels=Nr )
>       PARAMETER( numDiags = 60*Nr )
27c27
<       PARAMETER( diagSt_size = 10*Nr )
---
>       PARAMETER( diagSt_size = 60*Nr )

recompile and run again.
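
To get a feel for how much numDiags matters, here is a rough back-of-the-envelope sketch. It assumes the main diagnostics storage array is dimensioned (sNx+2*OLx, sNy+2*OLy, numDiags, nSx, nSy) and holds 8-byte reals; the function name qdiag_bytes and all tile sizes below are hypothetical, so substitute the values from your own SIZE.h.

```shell
# Rough size in bytes of the diagnostics storage array for one MPI process.
# Assumption: array shape (sNx+2*OLx, sNy+2*OLy, numDiags, nSx, nSy), 8-byte reals.
# Arguments: sNx sNy OLx OLy nSx nSy Nr factor, where numDiags = factor*Nr.
# The body runs in a subshell so the variables stay local to the function.
qdiag_bytes() (
    sNx=$1; sNy=$2; OLx=$3; OLy=$4; nSx=$5; nSy=$6; Nr=$7; factor=$8
    numDiags=$(( factor * Nr ))
    echo $(( (sNx + 2*OLx) * (sNy + 2*OLy) * numDiags * nSx * nSy * 8 ))
)

# Hypothetical tile: 200 x 200 points, overlap 3, one tile per process, 50 levels.
big=$(qdiag_bytes 200 200 3 3 1 1 50 60)    # numDiags = 60*Nr
small=$(qdiag_bytes 200 200 3 3 1 1 50 1)   # numDiags = 1*Nr
echo "numDiags = 60*Nr: $(( big / 1048576 )) MiB per process"
echo "numDiags = 1*Nr:  $(( small / 1048576 )) MiB per process"
```

Under these assumptions the array scales linearly with numDiags, so going from 60*Nr to 1*Nr shrinks it by a factor of 60. This is only an estimate of one array, not of the whole executable.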

you can disable the use of diagnostics by commenting out useDiagnostics in data.pkg:

[mach:DOMAIN/expt_num/run] me% more data.pkg
# Packages
 &PACKAGES
 useOBCS=.TRUE.,
#useDiagnostics=.TRUE.,
 useMNC=.FALSE.,
 useEXF=.TRUE.,
#useEcco=.TRUE.,
 &
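
A quick way to confirm the switch is really off before submitting the job is sketched below. It assumes data.pkg sits in the current run directory and that lines starting with '#' are comments, as in the listing above; the helper name diagnostics_enabled is made up for this example.

```shell
# Succeeds (exit status 0) if the given namelist file has an active,
# uncommented useDiagnostics=.TRUE. line; commented lines do not match
# because the pattern only allows whitespace before the parameter name.
diagnostics_enabled() {
    grep -Eiq '^[[:space:]]*useDiagnostics[[:space:]]*=[[:space:]]*\.TRUE\.' "$1" 2>/dev/null
}

if diagnostics_enabled data.pkg; then
    echo "diagnostics package is ON"
else
    echo "diagnostics package is OFF (or data.pkg was not found)"
fi
```

Note that this only checks the run-time switch; the arrays sized in DIAGNOSTICS_SIZE.h are still part of the executable if the package was compiled in.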

The test should take approximately 10 minutes, plus the time spent waiting in the batch queue.


On Sep 3, 2014, at 7:59 PM, 王刚 wrote:

> Dear all,
> 
> Can someone help me check my optfile for IBM with AIX 6.1 as the operating system and xlf90 as the compiler? I can get through compilation successfully and obtain the executable file mitgcmuv. However, the job finished after a short run with an error message like:
> exec(): 0509-036 Cannot load program ./mitgcmuv because of the following errors:
>         0509-026 System errors: There is not enough memory available now
> 
> My job uses 8 CPUs, but the machine still has at least 100 CPUs free! I think the problem is due to wrong parameters in the optfile script. My optfile looks like:
> 
> #!/bin/bash
> #
> # $Name: checkpoint65b $
> #  using the following invocation:
> #    ../../../tools/genmake2 -mpi -mods=../code -of=../../../tools/build_options/IBM_AIX_xlf90+mpi
> 
> S64='$(TOOLSDIR)/set64bitConst.sh'
> MAKEDEPEND=makedepend
> DEFINES='-DTARGET_AIX -DALLOW_USE_MPI -DALWAYS_USE_MPI  -DWORDLENGTH=4'
> 
> INCLUDES='-I/usr/local64/include -I/usr/lpp/ppe.poe/include/thread64'
> CPP='/usr/lib/cpp -P'
> CC='mpcc -q64'
> FC='mpxlf90 -q64'
> LINK='mpxlf90 -q64'
> MPI='true'
> LIBS="-L/usr/lib64 -L/usr/local64/lib -L/usr/lpp/ppe.poe/lib64 -lmpi  -lnetcdf"
> FFLAGS='-qfixed=132'
> if test "x$IEEE" = x ; then
>     #  No need for IEEE-754
>     FOPTIM='-O3 -qarch=pwr7 -qtune=pwr7 -qhot'
>     #CFLAGS='-O3 -Q -qarch=auto -qtune=auto -qcache=auto -qmaxmem=-1'
> else
>     #  Try to follow IEEE-754
>     FOPTIM='-O3 -qstrict -Q -qarch=auto -qtune=auto -qcache=auto -qmaxmem=-1'
>     #CFLAGS='-O3 -qstrict -Q -qarch=auto -qtune=auto -qcache=auto -qmaxmem=-1'
> fi
> FC_NAMEMANGLE="#define FC_NAMEMANGLE(X) X"
> 
> 
> I submit my task using another script: 
> 
> #!/usr/bin/ksh
> #@job_type=parallel
> #@job_name=task1
> #@ class = normal
> #@ group = group2
> #@node    =1
> #@tasks_per_node=8
> #@output=$(job_name).out
> #@error=$(job_name).err
> #@queue
> poe ./mitgcmuv 
> 
> I'll appreciate your help very much!
> 
> 