[MITgcm-support] MITgcm with PGI and Ubuntu

Stefano Querin squerin at ogs.trieste.it
Tue Jul 5 11:45:24 EDT 2011


Dear MITgcmers,

we are still trying to understand what is wrong with the new SGI node
(H2106-G7, 2 Opteron 6172 CPUs, 12 cores each, 2.1 GHz, 12 MB L3 cache,
15.68 GB RAM) on our cluster. We are seeing very poor performance,
including poor scalability (see the previous thread "[MITgcm-support]
Scalability on a new Sgi node"). Most likely (as Constantinos told us),
there is a problem with memory access/bandwidth.
To isolate the problem, we removed the node from the cluster and set it
up as a standalone machine, since the rest of the cluster has older CPUs
and an older compiler version (PGI 6.1). We installed Ubuntu 11.04
(DISTRIB_CODENAME=natty) on the node, with a trial version of the
current PGI compiler (11.6, linux86-64).
We get errors and warnings during the build. In particular, running
genmake2 gives:

> launching MITgcm genmake2 with project VECTOR and code codeG4_24p ...
>
> GENMAKE :
>
> A program for GENerating MAKEfiles for the MITgcm project.  For a
> quick list of options, use "genmake -h" or for more detail see:
>
>   http://mitgcm.org/devel_HOWTO/
>
> ===  Processing options files and arguments  ===
>   getting local config information:  none found
> grep: write error: Broken pipe

I don't know why...

>   getting OPTFILE information:
>     using OPTFILE="/home/squerin/MIT_home/VECTOR/build_options/linux_amd64_pgi+mpich+nobyteswap_sgi2106e"
>   getting AD_OPTFILE information:
>     using AD_OPTFILE="/home/squerin/MITgcm/tools/adjoint_options/adjoint_default"
>
> ===  Checking system libraries  ===
>   Do we have the system() command using /opt/pgi/linux86-64/2011/mpi/mpich/bin/mpif77...  yes
>   Do we have the fdate() command using /opt/pgi/linux86-64/2011/mpi/mpich/bin/mpif77...  yes
>   Do we have the etime() command using /opt/pgi/linux86-64/2011/mpi/mpich/bin/mpif77...  yes
>   Can we call simple C routines (here, "cloc()") using /opt/pgi/linux86-64/2011/mpi/mpich/bin/mpif77...  yes
>   Can we unlimit the stack size using /opt/pgi/linux86-64/2011/mpi/mpich/bin/mpif77...  yes
>   Can we register a signal handler using /opt/pgi/linux86-64/2011/mpi/mpich/bin/mpif77...  no

This check usually passes...

>   Can we use stat() through C calls...  yes
>   Can we create NetCDF-enabled binaries...  no

This is OK since we don't use NetCDF.

> ===  Setting defaults  ===
>   Adding MODS directories: /home/squerin/MIT_home/VECTOR/codeG4_24p
>   Making source files in eesupp from templates
>   Making source files in pkg/exch2 from templates
>   Making source files in pkg/regrid from templates
>
> ===  Determining package settings  ===
>   getting package dependency info from  /home/squerin/MITgcm/pkg/pkg_depend
>   checking default package list:
>     using PDEFAULT="/home/squerin/MIT_home/VECTOR/pkg/pkg_default_DARWIN"
>     before group expansion packages are:  DARWIN
>     replacing "DARWIN" with:   gfd gmredi kpp timeave obcs exf cal diagnostics ptracers gchem darwin
>     replacing "gfd" with:   mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor
>     after group expansion packages are:   mom_common mom_fluxform mom_vecinv generic_advdiff debug mdsio rw monitor gmredi kpp timeave obcs exf cal diagnostics ptracers gchem darwin
>   applying DISABLE settings
>   applying ENABLE settings
>     packages are:   cal darwin debug diagnostics exf gchem generic_advdiff gmredi kpp mdsio mom_common mom_fluxform mom_vecinv monitor obcs ptracers rw timeave
>   applying package dependency rules
>     packages are:   cal darwin debug diagnostics exf gchem generic_advdiff gmredi kpp mdsio mom_common mom_fluxform mom_vecinv monitor obcs ptracers rw timeave
>   Adding STANDARDDIRS
>   Searching for *OPTIONS.h files in order to warn about the presence
>     of "#define "-type statements that are no longer allowed:
>     found CPP_OPTIONS="/home/squerin/MIT_home/VECTOR/codeG4_24p/CPP_OPTIONS.h"
>     found CPP_EEOPTIONS="/home/squerin/MITgcm/eesupp/inc/CPP_EEOPTIONS.h"
>   Creating the list of files for the adjoint compiler.
>
> ===  Creating the Makefile  ===
>   setting INCLUDES
>   Determining the list of source and include files
>   Writing makefile: Makefile
>   Add the source list for AD code generation
>   Making list of "exceptions" that need ".p" files
>   Making list of NOOPTFILES
>   Add rules for links
>   Adding makedepend marker
>
> ===  Done  ===

I am also attaching the "genmake_warnings" and "genmake_state" files.

Running "make depend" ends with:

> /home/squerin/MITgcm/tools/f90mkdepend >> Makefile
> /bin/sh: /home/squerin/MITgcm/tools/f90mkdepend: not found
> make: *** [depend] Error 127

even though we passed -rootdir=/home/squerin/MITgcm to genmake2.
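
One way to narrow down the "not found" error above is to check whether the script exists, whether it is executable, and whether the interpreter named on its "#!" line is present. With some shells (for example dash, which is /bin/sh on Ubuntu), a missing interpreter is also reported as "not found" with exit status 127. The sketch below is only an illustration: check_script is a hypothetical helper, not part of MITgcm, and the actual cause on this node may be something else.

```shell
#!/bin/sh
# Hypothetical helper (not part of MITgcm): report why running a script
# might fail with "not found" (exit status 127).
check_script() {
    script=$1
    if [ ! -e "$script" ]; then
        echo "missing"            # the path itself does not exist
        return 0
    fi
    if [ ! -x "$script" ]; then
        echo "not-executable"     # exists, but has no execute permission
        return 0
    fi
    # If the first line is a "#!" line, extract the interpreter path
    # (text after "#!" up to the first whitespace) and check it.
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!'*)
            interp=$(printf '%s\n' "$first_line" |
                sed -e 's/^#![[:space:]]*//' -e 's/[[:space:]].*$//')
            if [ ! -x "$interp" ]; then
                echo "interpreter-missing: $interp"
                return 0
            fi
            ;;
    esac
    echo "ok"
}
```

For the case above it could be called as: check_script /home/squerin/MITgcm/tools/f90mkdepend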

After that, "mitgcmuv" builds without warnings, but the executable is
extremely slow.
We have never seen these warnings/errors before, including on other
HPC systems.
This looks like a problem with the system libraries or environment, but
I'm not a computer scientist, so it could be something completely different.
Has anybody tested Ubuntu 11.04? Should we try an older release (8
or 9)? I'm stuck...

Thanks for any suggestions!

Cheers,

Stefano


-------------- next part --------------
A non-text attachment was scrubbed...
Name: genmake_state
Type: application/octet-stream
Size: 14962 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-support/attachments/20110705/19a48c1a/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: genmake_warnings
Type: application/octet-stream
Size: 4594 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-support/attachments/20110705/19a48c1a/attachment-0003.obj>

