[MITgcm-support] MNC oddities on Cray T3E
Constantinos Evangelinos
ce107 at ocean.mit.edu
Wed Mar 28 13:45:26 EDT 2007
On Tue, 27 Mar 2007 17:22:39 +1200
Mark Hadfield <m.hadfield at niwa.co.nz> wrote:
> Having just spent a couple of happy (?!) days getting the MNC package
> working on our Cray T3E, I thought I should report my success to the
> mailing list. Furthermore, my solution is a bit clunky at the moment and
> I'd appreciate suggestions for doing things better.
>
> The biggest difficulty relates to the compiler's idiosyncratic handling
> of floating-point number precision. By default, single-precision real
> numbers use 8 bytes (64 bits), as do double-precision real numbers, that is
>
> REAL = DOUBLE = REAL*8
>
> The compiler also supports a 4-byte floating-point type via a REAL*4
> declaration; it does not have a REAL*16 type.
That is true for the Fortran compiler only for reasons of compatibility with
the Cray tradition of having single precision on its' vector systems mean
64 bit Cray arithmetic (that is not the same as the 64bit IEEE 754 floating
point arithmetic we all know - and love :-> but uses a smaller mantissa and a
wider exponent range and has several other differences). For those that are
interested you can read more about it at
http://www.stanford.edu/class/ee486/doc/lecture3.pdf
On the Cray vector systems (up to the SV1, with the exception of the T90 which
also supported IEEE arithmetic) double precision was in fact 128bit arithmetic
(and slow, though faster than on most other systems that offer quad
precision).
The T3E's Alpha processor internally supported only IEEE and various VAX
floating point formats so while there was no real reason to stick to the
real=real*8 rule on the T3Es Cray decided to insist on it so as not to break
customer's vector codes that were being ported to the machine. As the Cray C
compiler understood float as 32bit IEEE and double as 64bit IEEE this made the
use of mixed-language code (such as MITgcm) more problematic.
For a summary of what a Cray vector system, a T3E and an IEEE (big endian)
system with support for IEEE quad precision numbers understand for the various
fortran types and resulting data conversion issues have a look at:
http://jumpdoc.fz-juelich.de/ibmsc/usage/user/DataConversion/t3e_fortran.html
> The compiler accepts a "-s default32" switch (which I didn't know about
> until today) that "Adjusts the default sizes as follows: real, integer,
> and logical are set to 32 bits; complex and double precision are set to
> 64 bits; double complex is set to 128 bits."
This was a switch added to later versions of the T3E compilers to help
non-legacy users with making sense of their code. It should indeed solve your
problems.
> Using this switch might
> solve the problems I'll describe below, but would bring in the
> complication that other libraries (notably the netCDF library) have been
> built without it. So I'll leave that for another day.
You are correct that you would need to rebuild any library you use with this
same flag - similarly on systems that support 64bit integers building with
-i8 or equivalent requires that all libraries be built with -i8 etc.
> The first problem I ran into was that the function MNCCDIR is
> unavailable, because genmake2 found that linking with C routines was
> broken and so disabled it. I worked around this by commenting out the
> call. I suspect that the C-linking problem arises from type mismatches
> between C and Fortran, but haven't got around to checking this out yet.
That is because the standard way that MITgcm handles Fortran-C interlanguage
calling using the FC_NAMEMANGE() preprocessor macro is only capable of
handling name mangling that involves the possible addition of underscores
to the function/subroutine name. Cray's UNICOS (and older versions of HP-UX)
capitalize Fortran objects instead and this preprocessor macro technique
cannot
handle that. One can fix this by handcoding the capitalized name in the
C source code for the T3E.
> The second problem occurrs in subroutine MNC_CW_RL_W_OFFSET, defined in
> pkg/mnc/MNC_CW_READWRITE_RL.F. When the model writes out information to
> the state file, it crashes with a floating point exception when writing
> out u-velocity data. The problem arises because the corresponding
> variable (U) in the netCDF file has the FLOAT data type. In this
> situation, MNC_CW_RL_W_OFFSET copies the data into a REAL*4 array,
> resh_r, then puts the data with the netCDF library's NF_PUT_VARA_REAL
> function. But NF_PUT_VARA_REAL expects the data to be of type REAL,
> which on this platform is REAL*8, so it reads data from beyond the area
> that's been initialised, leading to general mayhem.
>
> My work-around is to declare resh_r as
>
> #ifdef TARGET_T3E
> REAL resh_r( MNC_MAX_BUFF )
> #else
> REAL*4 resh_r( MNC_MAX_BUFF )
> #endif
>
> This fixes things nicely.
Only it uses twice the storage space one would expect it to use for (32bit)
single precision.
> So I'm happy for the time being, but I wonder
> if this problem can be solved more elegantly. Furthermore, I'm a bit
> puzzled about the approach taken by MNC_CW_RL_W_OFFSET. The heart of the
> subroutine is as follows
>
> IF (stype(1:1) .EQ. 'D') THEN
> ...
> err = NF_PUT_VARA_DOUBLE(fid, idv, vstart, vcount, resh_d)
> ELSEIF (stype(1:1) .EQ. 'R') THEN
> ...
> err = NF_PUT_VARA_REAL(fid, idv, vstart, vcount, resh_r)
> ELSEIF (stype(1:1) .EQ. 'I') THEN
> ...
> err = NF_PUT_VARA_INT(fid, idv, vstart, vcount, resh_i)
> ENDIF
>
> where resh_d is REAL*8, resh_r is REAL*4 and resh_i is INTEGER. So the
> code goes to the trouble of creating and populating a data array that
> matches the netCDF variable. But this is unnecessary, surely, as the
> netCDF library will do this for us, see
>
>
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f77/Type-Conversion.html#Type-Conversion
Such conversion comes at the price of loss of precision and we'd like
to be in control of that.
Constantinos
--
Dr. Constantinos Evangelinos
Department of Earth, Atmospheric and Planetary Sciences
Massachusetts Institute of Technology
--
Dr. Constantinos Evangelinos
Department of Earth, Atmospheric and Planetary Sciences
Massachusetts Institute of Technology
More information about the MITgcm-support
mailing list