[MITgcm-support] MNC oddities on Cray T3E

Tue Mar 27 03:26:10 EDT 2007

On Tue, 27 Mar 2007 17:22:39 +1200 Mark Hadfield
<m.hadfield at niwa.co.nz> wrote:

> Having just spent a couple of happy (?!) days getting the MNC package 
> working on our Cray T3E, I thought I should report my success to the 
> mailing list. Furthermore, my solution is a bit clunky at the moment
> and I'd appreciate suggestions for doing things better.

Hi Mark,

Wow, a T3E ?

I played with SHMEM for the first time on a T3D at NCSC/MCNC ca. 1995.
I don't think I've had any access to a T3D or T3E since the 1990s.
I'm a little surprised to hear that there is one still in service
somewhere.  The ones at NCSC/MCNC are long gone and PSC retired their 
T3E three years ago:

  http://www.psc.edu/machines/cray/t3e/t3e.html

Will you be reporting bugs for MITgcm on a CM-5 next?  :-)

> The biggest difficulty relates to the compiler's idiosyncratic
> handling of floating-point number precision. By default,
> single-precision real numbers use 8 bytes (64 bits), as do
> double-precision real numbers, that is
> 
>   REAL = DOUBLE = REAL*8
> 
> The compiler also supports a 4-byte floating-point type via a REAL*4 
> declaration; it does not have a REAL*16 type.

OK, that's unfortunate.  MNC was written with the assumption that the
target machine had usable single and double precision types.  But I
think there is more than one way to work around that little oddity.

> The first problem I ran into was that the function MNCCDIR is 
> unavailable, because genmake2 found that linking with C routines was 
> broken and so disabled it. I worked around this by commenting out the 
> call. I suspect that the C-linking problem arises from type
> mismatches between C and Fortran, but haven't got around to checking
> this out yet.

With a little luck, you should be able to figure out how to call C
routines from Fortran.  It's usually just a silly name-mangling issue
where you need to add some leading and/or trailing underscores to the C
function name.  Using "nm some_file.o" should help you determine the
name mangling convention.

And, as you point out, commenting it out is also a perfectly good
work-around if you don't need the directory-creation feature.
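
To make the name-mangling point concrete, here is a minimal sketch in C.
The macro MY_FC_NAME, the flags FC_UPPERCASE and FC_NO_UNDERSCORE, and
the routine addone are all hypothetical names made up for illustration;
they are not taken from MITgcm or genmake2.  The idea is just to pick
whichever spelling "nm" shows your Fortran compiler emitting, and to
remember that Fortran passes arguments by reference:

```c
/* Hypothetical macro: choose the branch that matches what
 * "nm some_file.o" shows for symbols from your Fortran compiler. */
#if defined(FC_UPPERCASE)
#  define MY_FC_NAME(lower, UPPER) UPPER        /* e.g. ADDONE  */
#elif defined(FC_NO_UNDERSCORE)
#  define MY_FC_NAME(lower, UPPER) lower        /* e.g. addone  */
#else
#  define MY_FC_NAME(lower, UPPER) lower##_     /* e.g. addone_ */
#endif

/* Hypothetical routine meant to be callable from Fortran.
 * Fortran passes arguments by reference, so the C side takes a pointer. */
void MY_FC_NAME(addone, ADDONE)(int *n)
{
    *n += 1;
}
```

From Fortran this would be called as CALL ADDONE(n) with n an INTEGER,
assuming the default INTEGER matches C int on your platform, which is
worth checking on a machine with unusual type sizes.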

> The second problem occurs in subroutine MNC_CW_RL_W_OFFSET, defined
> in pkg/mnc/MNC_CW_READWRITE_RL.F. When the model writes out
> information to the state file, it crashes with a floating point
> exception when writing out u-velocity data. The problem arises
> because the corresponding variable (U) in the netCDF file has the
> FLOAT data type. In this situation, MNC_CW_RL_W_OFFSET copies the
> data into a REAL*4 array, resh_r, then puts the data with the netCDF
> library's NF_PUT_VARA_REAL function. But NF_PUT_VARA_REAL expects the
> data to be of type REAL, which on this platform is REAL*8, so it
> reads data from beyond the area that's been initialised, leading to
> general mayhem.
> 
> My work-around is to declare resh_r as
> 
> #ifdef TARGET_T3E
>       REAL  resh_r( MNC_MAX_BUFF )
> #else
>       REAL*4  resh_r( MNC_MAX_BUFF )
> #endif
> 
> This fixes things nicely. So I'm happy for the time being, but I
> wonder if this problem can be solved more elegantly. Furthermore,
> I'm a bit puzzled about the approach taken by MNC_CW_RL_W_OFFSET. The
> heart of the subroutine is as follows
> 
>         IF (stype(1:1) .EQ. 'D') THEN
>           ...
>           err = NF_PUT_VARA_DOUBLE(fid, idv, vstart, vcount, resh_d)
>         ELSEIF (stype(1:1) .EQ. 'R') THEN
>           ...
>           err = NF_PUT_VARA_REAL(fid, idv, vstart, vcount, resh_r)
>         ELSEIF (stype(1:1) .EQ. 'I') THEN
>           ...
>           err = NF_PUT_VARA_INT(fid, idv, vstart, vcount, resh_i)
>         ENDIF
> 
> where resh_d is REAL*8, resh_r is REAL*4 and resh_i is INTEGER. So
> the code goes to the trouble of creating and populating a data array
> that matches the netCDF variable. But this is unnecessary, surely, as
> the netCDF library will do this for us, see
> 
> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f77/Type-Conversion.html#Type-Conversion

That sounds right.  It's been a while since I looked at that code but I
think you're correct and that one can rely on netCDF to do the type
conversions.  And it might or might not be faster.
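
For a rough picture of what "letting the library convert" means, the
sketch below narrows a buffer of doubles into floats and counts the
values that do not fit.  It is only a stand-in written for this message:
narrow_to_float is a hypothetical helper, not netCDF code, and the
clamping of out-of-range values is my own choice for illustration, not
a statement of what the library stores.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical stand-in for the double-to-float narrowing that happens
 * when double data is written to a FLOAT variable.  Not netCDF code.
 * Returns how many values lie outside the float range (infinities
 * included); those are clamped here purely for illustration. */
size_t narrow_to_float(const double *src, float *dst, size_t n)
{
    size_t out_of_range = 0;
    for (size_t i = 0; i < n; i++) {
        double v = src[i];
        if (v > FLT_MAX || v < -FLT_MAX) {
            out_of_range++;
            dst[i] = (v > 0.0) ? FLT_MAX : -FLT_MAX;
        } else {
            dst[i] = (float)v;   /* in range (or NaN): plain conversion */
        }
    }
    return out_of_range;
}
```

If the library does this step itself, the caller can always hand over
the 8-byte buffer and the separate REAL*4 copy in MNC_CW_RL_W_OFFSET is
no longer needed.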

Also, if your machine basically has no REAL*4 type (it's the same as
REAL*8) then one could add a #define that would have it treat all the
REAL*4 variables the same as the REAL*8 ones.

Ed

-- 
Edward H. Hill III, PhD  |  ed at eh3.com  |  http://eh3.com/