[MITgcm-support] mnc output

Yosef Ashkenazy ashkena at bgu.ac.il
Thu Jan 26 16:11:19 EST 2006


Hi Martin,

I ran the model with the -Kieee option without success. So, maybe this 
problem is not a precision problem as Chris suggested.
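
For reference, the floating-point spacing at the model times discussed below can be checked with a few lines of Python. This is only a sketch under stated assumptions: the helper name `spacing` is made up for illustration, the 365-day year is inferred from the numbers in Chris's test program, and the 24-bit case stands for a hypothetical REAL*4 value somewhere in the chain (nothing in this thread confirms that one exists).

```python
import math

def spacing(x, significand_bits):
    """Gap between adjacent floating-point numbers near x (x > 0, normal range).

    Hypothetical helper: 53 bits models REAL*8, 24 bits models REAL*4.
    """
    _, exponent = math.frexp(x)          # x = m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - significand_bits)

SECONDS_PER_YEAR = 365 * 86400           # assumption: 365-day year, as in freq = 3153600000 s
my_time = 20000 * SECONDS_PER_YEAR       # about 6.3e11 s, the "20kys" discussed in the thread
step = 36000.0                           # time step used in the test program

print("myTime [s]          :", my_time)
print("REAL*8 spacing [s]  :", spacing(my_time, 53))
print("REAL*4 spacing [s]  :", spacing(my_time, 24))
print("REAL*4 resolves step:", spacing(my_time, 24) < step)
```

Under these assumptions the 64-bit spacing near 6.3e11 s is about 1.2e-4 s, far below the time step, while a 32-bit value would be spaced 65536 s apart, which is more than one 36000 s step.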

Yossi

chris hill wrote:

> Hi Martin,
>
>  I don't think there is a precision problem.
>  Attached is a test program with Yossi's numbers.
>  It seems to work as expected (as long as everything stays in 64-bit 
> values). Are we missing something?
>
> Chris
>
> P.S. We tested here with g77 and ifort.
>
> Martin Losch wrote:
>
>> Hi Yossi,
>>
>> I may be wrong about this, but I think the fact that the problem goes 
>> away when you increase the time step by a factor of 10 is another 
>> piece of evidence that my guess is not so bad. I have not fully 
>> understood how the routine different_multiple works, but it 
>> evaluates differences between myTime (a huge number) and deltaTclock 
>> (a relatively small number). Now you have increased deltaTclock by one 
>> order of magnitude. If you run your experiment for 200kys with the 
>> longer time step, you'll have the same problem as before, I am pretty 
>> sure. I can't see why the absolute number of time steps should matter; 
>> what matters is the relative size of myTime and deltaTclock.
>>
>> Have you tried enforcing IEEE arithmetic? With pgf77 the option 
>> is -Kieee.
>>
>> Martin
>> On Jan 26, 2006, at 7:10 AM, Yosef Ashkenazy wrote:
>>
>>> Hi Martin,
>>>
>>> I ran the simulation without optimization (i.e., without the -O3 
>>> option) for the eesupp/src/different_multiple.F routine and got 
>>> the same results. Thus I'm not sure that the problem is related to 
>>> the myTime variable.
>>> In addition, when I used a time step 10 times larger than the 
>>> original one, I didn't observe this problem. So maybe the problem is 
>>> related to the actual number of time steps rather than the time in 
>>> seconds.
>>>
>>> Yours,
>>>
>>> Yossi
>>>
>>> mlosch at awi-bremerhaven.de wrote:
>>>
>>>> Hi Yossi and Ed,
>>>>
>>>> Without having looked at your example, I would guess that at 20kys 
>>>> you run into precision problems. The model time "myTime" is counted 
>>>> in seconds. 20kys are approximately 6e11 seconds.
>>>>
>>>> If the following expression is true:
>>>> DIFFERENT_MULTIPLE(dumpFreq,myTime,deltaTClock)
>>>> the model writes output.
>>>>
>>>> This is the comment in eesupp/src/different_multiple.F
>>>> C     !DESCRIPTION:
>>>> C     *==========================================================*
>>>> C     | LOGICAL FUNCTION DIFFERENT\_MULTIPLE
>>>> C     | o Checks if a multiple of freq exist
>>>> C     |   around val1 +/- step/2
>>>> C     *==========================================================*
>>>> C     | This routine is used for diagnostic and other periodic
>>>> C     | operations. It is very sensitive to arithmetic precision.
>>>> C     | For IEEE conforming arithmetic it works well but for
>>>> C     | cases where short cut arithmetic  is used it may not work
>>>> C     | as expected. To overcome this issue compile this routine
>>>> C     | separately with no optimisation.
>>>> C     *==========================================================*
>>>>
>>>> So maybe, if you try compiling this routine without optimisation, 
>>>> it will work for another 20000kys (at which point you will approach 
>>>> machine precision).
>>>>
>>>> However, if this is really the problem and if long integrations 
>>>> like that become the rule (and not the exception), we might have to 
>>>> consider an option for setting the time units explicitly; e.g., 
>>>> hours would give another 3 orders of magnitude, etc.
>>>> Martin
>>>>
>>>> Martin Losch
>>>> Alfred Wegener Institute Postfach 120161, 27515 Bremerhaven, 
>>>> Germany; Tel./Fax: ++49(0471)4831-1872/1797
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Yosef Ashkenazy <ashkena at bgu.ac.il>
>>>> Date: Wednesday, January 25, 2006 9:04 am
>>>> Subject: Re: [MITgcm-support] mnc output
>>>>
>>>>
>>>>> Hi Ed,
>>>>>
>>>>> Thank you very much for your suggestions. Indeed, it is possible to
>>>>> break the long run into pieces and to start each time from a pickup
>>>>> file. However, since my simulations are really long (~80kyr), it 
>>>>> is not so convenient.
>>>>> The problem is not related to large memory size since my output
>>>>> files are much smaller than 2GB.
>>>>> I built a very simple configuration to test this problem; a simple 
>>>>> 4x4 box with two vertical layers. It turns out that the problem is 
>>>>> related to the time step rather than to the actual time of the 
>>>>> simulation. Please have a look at this test case:
>>>>> http://www.bgu.ac.il/~ashkena/temp/
>>>>> You will find there a very short matlab code that demonstrates the
>>>>> problem and the entire configuration with the results. I hope we 
>>>>> can solve this strange problem.
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> Yossi
>>>>>
>>>>> Ed Hill wrote:
>>>>>
>>>>>
>>>>>> On Tue, 2006-01-24 at 09:20 +0200, Yosef Ashkenazy wrote:
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Currently I'm performing long runs of the MITgcm. I'm using the mnc
>>>>>>> package. After a relatively long simulation time (~19200 years) the
>>>>>>> model starts to produce the same output three or four times at each
>>>>>>> output point (instead of a single output). Has anyone faced this
>>>>>>> problem before? Any suggestions?
>>>>>>>
>>>>>>>
>>>>>> Hi Yossi,
>>>>>>
>>>>>> I've never seen what you describe.  That's odd.
>>>>>>
>>>>>> Could you try the following:
>>>>>>
>>>>>> - Run the model from a pickup so that the simulation time per 
>>>>>> run is shorter
>>>>>>
>>>>>> - Are your output files growing close to or perhaps beyond 2GB 
>>>>>> in size?  Large files can cause problems with netCDF and other 
>>>>>> libraries if they aren't compiled with the proper 
>>>>>> large-file support.  There are various settings that you can use 
>>>>>> to decrease the number of time steps saved in each file, such 
>>>>>> as mnc_max_fsize:
>>>>>>
>>>>>> http://mitgcm.org/r2_web_testing/latest/online_documents/node254.html
>>>>>>
>>>>>> If none of the above helps, the best way to track down this problem is
>>>>>> to create a scenario that repeatedly triggers the problem.  So if all
>>>>>> else fails, please create a setup that triggers this problem and then we
>>>>>> can (hopefully) have you send it to us and we'll pick it apart.
>>>>>>
>>>>>> Ed
>>>>>>
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> MITgcm-support mailing list
>>>>> MITgcm-support at mitgcm.org
>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>
>------------------------------------------------------------------------
>
>
>      PROGRAM MAIN
>      LOGICAL  DIFFERENT_MULTIPLE
>      EXTERNAL DIFFERENT_MULTIPLE
>      LOGICAL  A
>
>      REAL*8 freq, val1, step, valbase
>
>      freq = 3153600000.d0
>      step = 36000.d0
>      valbase = 788400000000.d0
>
>C     Set val1 before printing it (it was printed uninitialised).
>      val1 = valbase
>      PRINT *, val1, val1+step, val1/step, val1/freq
>
>      val1 = valbase-step
>      A = DIFFERENT_MULTIPLE( freq, val1, step )
>      PRINT *, A
>
>      val1 = valbase
>      A = DIFFERENT_MULTIPLE( freq, val1, step )
>      PRINT *, A
>
>      val1 = valbase+step
>      A = DIFFERENT_MULTIPLE( freq, val1, step )
>      PRINT *, A
>      
>      END
>CBOP
>C     !ROUTINE: DIFFERENT_MULTIPLE
>
>C     !INTERFACE:
>      LOGICAL FUNCTION DIFFERENT_MULTIPLE( freq, val1, step )
>      IMPLICIT NONE
>
>C     !DESCRIPTION:
>C     *==========================================================*
>C     | LOGICAL FUNCTION DIFFERENT\_MULTIPLE
>C     | o Checks if a multiple of freq exist
>C     |   around val1 +/- step/2
>C     *==========================================================*
>C     | This routine is used for diagnostic and other periodic
>C     | operations. It is very sensitive to arithmetic precision.
>C     | For IEEE conforming arithmetic it works well but for
>C     | cases where short cut arithmetic  is used it may not work
>C     | as expected. To overcome this issue compile this routine
>C     | separately with no optimisation.
>C     *==========================================================*
>
>C     !INPUT PARAMETERS:
>C     == Routine arguments ==
>C     freq       :: Frequency by which time is divided.
>C     val1       :: time that is checked
>C     step       :: length of time interval (around val1) that is checked
>      REAL*8  freq, val1, step
>
>C---+----1----+----2----+----3----+----4----+----5----+----6----+----7-|--+----|
>
>C     !LOCAL VARIABLES:
>C     == Local variables ==
>C     v1, v2, v3, v4 :: Temp. for holding time
>C     d1, d2, d3     :: Temp. for holding differences
>      REAL*8  v1, v2, v3, v4, d1, d2, d3
>CEOP
>
>C     o Do easy cases first.
>      DIFFERENT_MULTIPLE = .FALSE.
>
>      IF ( freq .NE. 0. ) THEN
>        IF ( ABS(step) .GT. freq ) THEN
>         DIFFERENT_MULTIPLE = .TRUE.
>        ELSE
>
>C         o This case is more complex because of round-off error
>          v1 = val1
>          v2 = val1 - step
>          v3 = val1 + step
>
>C         Test v1 to see if its a "closest multiple"
>          v4 = NINT(v1/freq)*freq
>          d1 = v1-v4
>          d2 = v2-v4
>          d3 = v3-v4
>          PRINT *, ' v4 = ', v4
>          PRINT *, ' d1 = ', d1
>          PRINT *, ' d2 = ', d2
>          PRINT *, ' d3 = ', d3
>          IF ( ABS(d1) .LT. ABS(d2) .AND. ABS(d1) .LE. ABS(d3) )
>     &        DIFFERENT_MULTIPLE = .TRUE.
>
>        ENDIF
>      ENDIF
>
>      RETURN
>      END
>
>  
>
>------------------------------------------------------------------------
>
>  
>
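
The attached test program quoted above can also be transcribed into Python to check Chris's result independently of any Fortran compiler. This is a sketch, not MITgcm code: the names `nint` and `different_multiple` are illustrative, the debug PRINT statements are dropped, and Python floats are assumed to be IEEE 64-bit doubles (the REAL*8 case only).

```python
import math

def nint(x):
    """Round half away from zero, like Fortran NINT (Python's round() differs on ties)."""
    return math.copysign(math.floor(abs(x) + 0.5), x)

def different_multiple(freq, val1, step):
    """Python transcription of the quoted DIFFERENT_MULTIPLE, without the debug PRINTs."""
    if freq == 0.0:
        return False
    if abs(step) > freq:
        return True
    v4 = nint(val1 / freq) * freq        # closest multiple of freq to val1
    d1 = val1 - v4
    d2 = (val1 - step) - v4
    d3 = (val1 + step) - v4
    return abs(d1) < abs(d2) and abs(d1) <= abs(d3)

freq = 3153600000.0                      # 100 years of 365 days, in seconds
step = 36000.0
valbase = 788400000000.0                 # 250 * freq

# The three calls made by the test program: one step before, at, and after valbase.
results = [different_multiple(freq, valbase + k * step, step) for k in (-1, 0, 1)]
print(results)

# Count how often the test fires over 201 consecutive steps centred on valbase.
hits = sum(different_multiple(freq, valbase + k * step, step) for k in range(-100, 101))
print(hits)
```

With these numbers the script prints [False, True, False] and then 1, i.e. the test fires exactly once around the multiple of freq, which is consistent with Chris's observation that the routine behaves as expected as long as everything stays in 64-bit values.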



