[MITgcm-support] Out of memory: optfile for IBM AIX with compiler xlf90
michael schaferkotter
schaferk at bellsouth.net
Thu Sep 4 11:43:30 EDT 2014
greetings;
the problem may or may not be the optfile (probably not).
what i would like to see are the code/SIZE.h and code/DIAGNOSTICS_SIZE.h files.
using the standard distribution code/DIAGNOSTICS_SIZE.h caused similar problems for me with a largish domain (a computational grid of 330 million points).
re: diagnostics pkg.
though very useful in many cases, the diagnostics package can make the executable large, because its storage arrays are sized at compile time by the parameters in DIAGNOSTICS_SIZE.h.
here is how code/DIAGNOSTICS_SIZE.h was altered to make the executable smaller:
diff DIAGNOSTICS_SIZE.h
1c1
< C $Header: /u/gcmpack/MITgcm/pkg/diagnostics/DIAGNOSTICS_SIZE.h,v 1.5 2008/02/05 15:31:19 jmc Exp $
---
> C $Header: /u/gcmpack/MITgcm/pkg/diagnostics/DIAGNOSTICS_SIZE.h,v 1.4 2006/01/23 22:24:28 jmc Exp $
24,25c24,25
< PARAMETER( numlists = 10, numperlist = 50, numLevels=2*Nr )
< PARAMETER( numDiags = 1*Nr )
---
> PARAMETER( numlists = 6, numperlist = 10, numLevels=Nr )
> PARAMETER( numDiags = 60*Nr )
27c27
< PARAMETER( diagSt_size = 10*Nr )
---
> PARAMETER( diagSt_size = 60*Nr )
recompile and run again.
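to get a feel for how much numDiags matters, here is a rough sketch in shell arithmetic. it assumes the main diagnostics storage array holds one 8-byte real per element and is dimensioned as (tile plus overlaps) x numDiags x (tiles per process); check that against DIAGNOSTICS.h in your own source tree. the tile sizes, overlaps and Nr below are hypothetical placeholders, not values from this thread; take the real ones from code/SIZE.h.

```shell
#!/bin/sh
# Rough estimate of the diagnostics storage array size per MPI process.
# All numbers below are hypothetical; take the real ones from code/SIZE.h
# and code/DIAGNOSTICS_SIZE.h.
sNx=60; sNy=60      # tile size (hypothetical)
OLx=3;  OLy=3       # overlap widths (hypothetical)
nSx=1;  nSy=1       # tiles per process (hypothetical)
Nr=50               # number of vertical levels (hypothetical)

diag_bytes() {
    # $1 = numDiags; assumes one 8-byte real per array element
    echo $(( (sNx + 2*OLx) * (sNy + 2*OLy) * $1 * nSx * nSy * 8 ))
}

small=$(diag_bytes $(( 1 * Nr )))     # numDiags = 1*Nr
large=$(diag_bytes $(( 60 * Nr )))    # numDiags = 60*Nr

echo "numDiags =  1*Nr: $small bytes ($(( small / 1048576 )) MiB)"
echo "numDiags = 60*Nr: $large bytes ($(( large / 1048576 )) MiB)"
```

under that assumption the array grows linearly with numDiags, so 60*Nr costs 60 times what 1*Nr does, on every process.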
you can also disable the diagnostics package altogether by commenting out useDiagnostics in data.pkg:
[mach:DOMAIN/expt_num/run] me% more data.pkg
# Packages
&PACKAGES
useOBCS=.TRUE.,
#useDiagnostics=.TRUE.,
useMNC=.FALSE.,
useEXF=.TRUE.,
#useEcco=.TRUE.,
&
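before resubmitting, it can help to confirm which package flags are really active in the run directory. the sketch below is only an illustration: pkg_enabled is a hypothetical helper, not an MITgcm tool, and it assumes that lines starting with '#' are comments (as in the listing above) and that an enabled flag is written as .TRUE. in any letter case.

```shell
#!/bin/sh
# Report whether a package flag is switched on in a namelist file such as
# data.pkg. Lines starting with '#' are treated as comments.
# pkg_enabled is a hypothetical helper, not part of MITgcm.
pkg_enabled() {
    # $1 = flag name (e.g. useDiagnostics), $2 = file (default: data.pkg)
    grep -v '^[[:space:]]*#' "${2:-data.pkg}" |
        grep -qi "^[[:space:]]*$1[[:space:]]*=[[:space:]]*\.TRUE\."
}

# Only report if a data.pkg exists in the current directory.
if [ -f data.pkg ]; then
    for flag in useDiagnostics useOBCS useMNC useEXF; do
        if pkg_enabled "$flag"; then
            echo "$flag: on"
        else
            echo "$flag: off or commented out"
        fi
    done
fi
```

run it from the run directory; with the data.pkg shown above, useDiagnostics and useMNC should be reported as off or commented out.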
The estimated time to complete the test is approximately 10 minutes, plus the time spent waiting in the batch queue.
On Sep 3, 2014, at 7:59 PM, 王刚 wrote:
> Dear all,
>
> Can someone help me check my optfile for an IBM machine with AIX 6.1 as the operating system and xlf90 as the compiler? The code compiles successfully and I get the executable file mitgcmuv. However, the job finishes after a short run with an error message like:
> exec(): 0509-036 Cannot load program ./mitgcmuv because of the following errors:
> 0509-026 System errors: There is not enough memory available now
>
> My job uses 8 CPUs, but the machine still has at least 100 CPUs free! I think the problem is due to wrong parameters in the optfile script. My optfile looks like:
>
> #!/bin/bash
> #
> # $Name: checkpoint65b $
> # using the following invocation:
> # ../../../tools/genmake2 -mpi -mods=../code -of=../../../tools/build_options/IBM_AIX_xlf90+mpi
>
> S64='$(TOOLSDIR)/set64bitConst.sh'
> MAKEDEPEND=makedepend
> DEFINES='-DTARGET_AIX -DALLOW_USE_MPI -DALWAYS_USE_MPI -DWORDLENGTH=4'
>
> INCLUDES='-I/usr/local64/include -I/usr/lpp/ppe.poe/include/thread64'
> CPP='/usr/lib/cpp -P'
> CC='mpcc -q64'
> FC='mpxlf90 -q64'
> LINK='mpxlf90 -q64'
> MPI='true'
> LIBS="-L/usr/lib64 -L/usr/local64/lib -L/usr/lpp/ppe.poe/lib64 -lmpi -lnetcdf"
> FFLAGS='-qfixed=132'
> if test "x$IEEE" = x ; then
> # No need for IEEE-754
> FOPTIM='-O3 -qarch=pwr7 -qtune=pwr7 -qhot'
> #CFLAGS='-O3 -Q -qarch=auto -qtune=auto -qcache=auto -qmaxmem=-1'
> else
> # Try to follow IEEE-754
> FOPTIM='-O3 -qstrict -Q -qarch=auto -qtune=auto -qcache=auto -qmaxmem=-1'
> #CFLAGS='-O3 -qstrict -Q -qarch=auto -qtune=auto -qcache=auto -qmaxmem=-1'
> fi
> FC_NAMEMANGLE="#define FC_NAMEMANGLE(X) X"
>
>
> I submit my task using another script:
>
> #!/usr/bin/ksh
> #@job_type=parallel
> #@job_name=task1
> #@ class = normal
> #@ group = group2
> #@node =1
> #@tasks_per_node=8
> #@output=$(job_name).out
> #@error=$(job_name).err
> #@queue
> poe ./mitgcmuv
>
> I'll appreciate your help very much!