[MITgcm-support] MITgcm on AS4, kernel 2.6

Hezi Gildor hezi.gildor at weizmann.ac.il
Fri Oct 7 00:33:51 EDT 2005


Hi Chris, Ed,

We do think it is an OS problem because it works fine on Alpha Tru64 and
on Linux 2.4.  We wonder if anyone has tested the code on Linux kernel
2.6.

I will prepare a tar file with the experiment we were running and will
put it on our web site. We were using *exactly* the same executable and
*exactly* the same inputs, on the same machine, and still didn't get
exactly the same answer.
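
One way to quantify such run-to-run differences is a point-by-point
comparison of the same output field from two runs. Below is a minimal
sketch in Python, assuming the fields are raw big-endian binary files of
4-byte reals (pass prec=8 if the output was written as 8-byte reals). The
function names and the file names in the usage comment are hypothetical,
not part of MITgcm.

```python
import struct


def read_field(path, prec=4):
    """Read a raw big-endian binary field as a flat list of floats.

    prec is the number of bytes per value: 4 (float32) or 8 (float64).
    Both the byte order and the precision are assumptions about the output.
    """
    code = {4: "f", 8: "d"}[prec]
    with open(path, "rb") as f:
        raw = f.read()
    n = len(raw) // prec
    return list(struct.unpack(">%d%s" % (n, code), raw[: n * prec]))


def compare_fields(a, b):
    """Return (count of differing points, max abs difference, flat indices)."""
    if len(a) != len(b):
        raise ValueError("fields have different sizes")
    diffs = [(i, abs(x - y)) for i, (x, y) in enumerate(zip(a, b)) if x != y]
    max_diff = max((d for _, d in diffs), default=0.0)
    return len(diffs), max_diff, [i for i, _ in diffs]


# Usage (hypothetical file names from two runs of the same executable):
#   s1 = read_field("run1/S.0000000100.data")
#   s2 = read_field("run2/S.0000000100.data")
#   count, max_diff, where = compare_fields(s1, s2)
#   print(count, "points differ; max abs difference", max_diff)
```

The returned flat indices can be mapped back to grid coordinates to see
whether the differing points cluster, for example along tile boundaries.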

Thanks, Hezi.



Chris Hill said:
> Hi Hezi,
>
>   Is there any chance it could be a hardware or OS problem?
>
> Chris
>
> Ed Hill wrote:
>> On Thu, 2005-10-06 at 18:54 +0300, Hezi Gildor wrote:
>>
>>>Hi Ed,
>>>
>>>thanks for the prompt reply.
>>>
>>>The OS was AS4.0 and AS4U1. The compilers we tried are gcc and pgi (5.2
>>>and 6.0). We tried LAM, MPICH (from pgi as well as built from the tar
>>>ball), and MPICH2. We used many compiler options, from the defaults
>>>to -O3, -fast, -fastsse, and -kieee (the last 3 for pgi).
>>>
>>>Communication is over the internet (no private network for now).
>>>
>>>Dual-opteron machines with
>>>
>>>AMD Opteron(tm) Processor 250
>>>
>>>The code is based on MITgcm_ss_20050719, checkpoint571_post. We have a
>>>few hard-wired changes to the code, but it runs OK on an Alpha machine
>>>and on a cluster with Linux kernel 2.4, so I don't think the problem is
>>>with the code.
>>>
>>>When we compare fields such as salinity between two runs with the same
>>>executable, the difference is zero at most grid points, but at many
>>>isolated points scattered throughout the domain it is nonzero (of order
>>>10^-6).
>>
>>
>> Hi Hezi,
>>
>> If you are using *exactly* the same executable and *exactly* the same
>> inputs then, assuming you are running it on exactly the same machine, I
>> have no idea why you're not seeing exactly the same answer.  Of course,
>> if anything changes from one run to another (eg. different compiler,
>> different OS, different hardware, etc.) then you won't, in general, see
>> exactly identical ("zero difference") output due to differences in
>> numerical roundoff, etc.
>>
>> And, to be fair, your description is not enough information for us to
>> have a good idea of what you're doing (or trying to do).  So, can you
>> narrow this problem down to a (hopefully simple and) repeatable example?
>> That is, can you create a small but complete setup that includes *all*
>> the code changes that you've made and *all* your input files?
>>
>> Once you can create an example that repeatably demonstrates a problem,
>> then we can work with you to try to determine the reason.  But, without
>> such an example, I'm afraid that all we can offer you is guesses...
>>
>> Ed
>>
>



