[MITgcm-support] mnc output
Yosef Ashkenazy
ashkena at bgu.ac.il
Thu Jan 26 16:11:19 EST 2006
Hi Martin,
I ran the model with the -Kieee option without success. So maybe this
is not a precision problem, as Chris suggested.
Yossi
chris hill wrote:
> Hi Martin,
>
> I don't think there is a precision problem.
> Attached is a test program with Yossi's numbers.
> It seems to work as expected (as long as everything stays in 64-bit
> values). Are we missing something?
>
> Chris
>
> P.S. We tested here with g77 and ifort.
>
> Martin Losch wrote:
>
>> Hi Yossi,
>>
>> I may be wrong about this, but I think the fact that the problem goes
>> away when you increase the time step by a factor of 10 is another
>> piece of evidence that my guess is not so bad. I have not fully
>> understood how the routine different_multiple works, but it
>> evaluates differences between myTime (a huge number) and deltaTclock
>> (a relatively small number). Now you have increased deltaTclock by one
>> order of magnitude. If you run your experiment for 200kys with the
>> longer time step, you'll have the same problem as before, I am pretty
>> sure. I can't see why the absolute number of time steps should matter;
>> what matters is the relative size of myTime and deltaTclock.
>>
>> Have you tried enforcing IEEE arithmetic? With pgf77 the option
>> is -Kieee.
>>
>> Martin
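
Martin's argument about the relative size of myTime and deltaTclock can be made concrete by comparing the time step with the spacing between neighbouring floating-point numbers near myTime. The following is only a Python sketch, not part of MITgcm: the values are valbase and step from Chris's test program, and the helper names to_f32 and spacing_f32 are made up for the illustration.

```python
import math
import struct

def to_f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def spacing_f32(x):
    """Gap between neighbouring 32-bit floats near x (normal range only)."""
    _, exponent = math.frexp(to_f32(x))    # x = m * 2**exponent, 0.5 <= |m| < 1
    return math.ldexp(1.0, exponent - 24)  # 32-bit floats carry a 24-bit significand

my_time = 788400000000.0  # valbase from the test program, in seconds
delta_t = 36000.0         # step from the test program, in seconds

ulp64 = math.ulp(my_time)     # spacing of 64-bit floats near my_time (Python 3.9+)
ulp32 = spacing_f32(my_time)  # spacing of 32-bit floats near my_time

for dt in (delta_t, 10.0 * delta_t):
    # Do two consecutive model times map onto the same stored value?
    same64 = (my_time - dt) == my_time
    same32 = to_f32(my_time - dt) == to_f32(my_time)
    print(f"dt={dt:9.1f}  64-bit collide: {same64}  32-bit collide: {same32}")

print("spacing near my_time:", ulp64, "s (64-bit),", ulp32, "s (32-bit)")
```

In 64-bit arithmetic the spacing near this model time is about 1.2e-4 s, so both step sizes are resolved easily. In 32-bit arithmetic the spacing is 65536 s, which is larger than the 36000 s step but smaller than a step ten times as long. This does not show that a 32-bit value is involved anywhere in the model; it only illustrates the scales at which the two step sizes would behave differently.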
>> On Jan 26, 2006, at 7:10 AM, Yosef Ashkenazy wrote:
>>
>>> Hi Martin,
>>>
>>> I ran the simulation without optimization (i.e., without the -O3
>>> option...) for the eesupp/src/different_multiple.F routine and got
>>> the same results. Thus I'm not sure that the problem is related to
>>> the myTime variable.
>>> In addition, when I used a time step 10 times larger than the
>>> original one, I didn't observe this problem. So maybe the problem is
>>> related to the actual number of time steps rather than the time in
>>> seconds.
>>>
>>> Yours,
>>>
>>> Yossi
>>>
>>> mlosch at awi-bremerhaven.de wrote:
>>>
>>>> Hi Yossi and Ed,
>>>>
>>>> Without having looked at your example, I would guess that at 20kys
>>>> you run into precision problems. The model time "myTime" is counted
>>>> in seconds; 20kys are approximately 6e11 seconds.
>>>>
>>>> If the following expression is true:
>>>> DIFFERENT_MULTIPLE(dumpFreq,myTime,deltaTClock)
>>>> the model writes output.
>>>>
>>>> This is the comment in eesupp/src/different_multiple.F
>>>> C !DESCRIPTION:
>>>> C *==========================================================*
>>>> C | LOGICAL FUNCTION DIFFERENT\_MULTIPLE
>>>> C | o Checks if a multiple of freq exist
>>>> C | around val1 +/- step/2
>>>> C *==========================================================*
>>>> C | This routine is used for diagnostic and other periodic
>>>> C | operations. It is very sensitive to arithmetic precision.
>>>> C | For IEEE conforming arithmetic it works well but for
>>>> C | cases where short cut arithmetic is used it may not work
>>>> C | as expected. To overcome this issue compile this routine
>>>> C | separately with no optimisation.
>>>> C *==========================================================*
>>>>
>>>> So maybe, if you try compiling this routine without optimisation,
>>>> it will work for another 20000kys (when you will approach
>>>> machine precision).
>>>>
>>>> However, if this is really the problem and if long integrations
>>>> like that become the rule (and not the exception), we might have to
>>>> consider an option for setting the time units explicitly, e.g., in
>>>> hours, which would give another 3 orders of magnitude, etc.
>>>> Martin
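
The figures in Martin's message can be checked with a few lines of arithmetic. This is a Python sketch that assumes a 365-day model year; that assumption is not stated in the thread, but it is consistent with freq = 3153600000 s in Chris's test program being exactly 100 such years. The names SECONDS_PER_YEAR and years_to_seconds are invented for the example.

```python
import math

SECONDS_PER_YEAR = 365 * 86400  # assumption: 365-day year, 86400 s per day

def years_to_seconds(years):
    """Convert model years to model seconds under the 365-day assumption."""
    return years * SECONDS_PER_YEAR

t20k = years_to_seconds(20000)    # the "20kys" in the message
freq = years_to_seconds(100)      # matches freq in the test program
valbase = years_to_seconds(25000) # matches valbase in the test program

# Counting time in hours instead of seconds shrinks myTime by a factor of 3600
gain_in_orders = math.log10(3600)

print(t20k, freq, valbase)
print("hours for 20 kyr:", t20k // 3600)
print("orders of magnitude gained:", round(gain_in_orders, 2))
```

Under that assumption 20 kyr is 6.3072e11 s, in line with the "approximately 6e11 seconds" above, and switching to hours gains a factor of 3600, which is roughly three and a half orders of magnitude.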
>>>>
>>>> Martin Losch
>>>> Alfred Wegener Institute Postfach 120161, 27515 Bremerhaven,
>>>> Germany; Tel./Fax: ++49(0471)4831-1872/1797
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Yosef Ashkenazy <ashkena at bgu.ac.il>
>>>> Date: Wednesday, January 25, 2006 9:04 am
>>>> Subject: Re: [MITgcm-support] mnc output
>>>>
>>>>
>>>>> Hi Ed,
>>>>>
>>>>> Thank you very much for your suggestions. Indeed, it is possible to
>>>>> break the long run into pieces and to start each time from a pickup
>>>>> file. However, since my simulations are really long (~80kyr), it
>>>>> is not so convenient.
>>>>> The problem is not related to large memory size since my output
>>>>> files are much smaller than 2GB.
>>>>> I built a very simple configuration to test this problem; a simple
>>>>> 4x4 box with two vertical layers. It turns out that the problem is
>>>>> related to the time step rather than to the actual time of the
>>>>> simulation. Please have a look at this test case:
>>>>> http://www.bgu.ac.il/~ashkena/temp/
>>>>> You will find there a very short matlab code that demonstrates the
>>>>> problem and the entire configuration with the results. I hope we
>>>>> can solve this strange problem.
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> Yossi
>>>>>
>>>>> Ed Hill wrote:
>>>>>
>>>>>
>>>>>> On Tue, 2006-01-24 at 09:20 +0200, Yosef Ashkenazy wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Currently I'm performing long runs of the MITgcm. I'm using the
>>>>>>> mnc package. After a relatively long simulation time (~19200
>>>>>>> years) the model starts to produce the same output three or four
>>>>>>> times at each output point (instead of a single output). Has
>>>>>>> anyone faced this problem before? Any suggestions?
>>>>>>>
>>>>>>>
>>>>>> Hi Yossi,
>>>>>>
>>>>>> I've never seen what you describe. That's odd.
>>>>>>
>>>>>> Could you try the following:
>>>>>>
>>>>>> - Run the model from a pickup so that the simulation time per
>>>>>> run is shorter
>>>>>>
>>>>>> - Are your output files growing close to or perhaps beyond 2GB
>>>>>> in size? Large files can cause problems with netCDF and other
>>>>>> libraries if they aren't compiled with the proper
>>>>>> large-file-support. There are various settings that you can use
>>>>>> to decrease the number of time steps saved in each file such
>>>>>> as: mnc_max_fsize
>>>>>>
>>>>>> http://mitgcm.org/r2_web_testing/latest/online_documents/node254.html
>>>>>>
>>>>>> If none of the above helps, the best way to track down this
>>>>>> problem is to create a scenario that repeatedly triggers the
>>>>>> problem. So if all else fails, please create a setup that triggers
>>>>>> this problem and then we can (hopefully) have you send it to us
>>>>>> and we'll pick it apart.
>>>>>>
>>>>>> Ed
>>>>>>
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> MITgcm-support mailing list
>>>>> MITgcm-support at mitgcm.org
>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>>>>>
>>>>>
>
>------------------------------------------------------------------------
>
>
>      PROGRAM MAIN
>      LOGICAL DIFFERENT_MULTIPLE
>      EXTERNAL DIFFERENT_MULTIPLE
>      LOGICAL A
>
>      REAL*8 freq, val1, step, valbase
>
>      freq = 3153600000.d0
>      step = 36000.0d0
>      valbase=788400000000.0d0
>
>C     Set val1 before it is printed
>      val1 = valbase
>      PRINT *, val1, val1+step, val1/step, val1/freq
>
>C     One step before a multiple of freq
>      val1 = valbase-step
>      A = DIFFERENT_MULTIPLE( freq, val1, step )
>      PRINT *, A
>
>C     Exactly at a multiple of freq
>      val1 = valbase
>      A = DIFFERENT_MULTIPLE( freq, val1, step )
>      PRINT *, A
>
>C     One step after a multiple of freq
>      val1 = valbase+step
>      A = DIFFERENT_MULTIPLE( freq, val1, step )
>      PRINT *, A
>
>      END
>CBOP
>C     !ROUTINE: DIFFERENT_MULTIPLE
>
>C     !INTERFACE:
>      LOGICAL FUNCTION DIFFERENT_MULTIPLE( freq, val1, step )
>      IMPLICIT NONE
>
>C     !DESCRIPTION:
>C     *==========================================================*
>C     | LOGICAL FUNCTION DIFFERENT\_MULTIPLE
>C     | o Checks if a multiple of freq exist
>C     |   around val1 +/- step/2
>C     *==========================================================*
>C     | This routine is used for diagnostic and other periodic
>C     | operations. It is very sensitive to arithmetic precision.
>C     | For IEEE conforming arithmetic it works well but for
>C     | cases where short cut arithmetic is used it may not work
>C     | as expected. To overcome this issue compile this routine
>C     | separately with no optimisation.
>C     *==========================================================*
>
>C     !INPUT PARAMETERS:
>C     == Routine arguments ==
>C     freq :: Frequency by which time is divided.
>C     val1 :: time that is checked
>C     step :: length of time interval (around val1) that is checked
>      REAL*8 freq, val1, step
>
>C---+----1----+----2----+----3----+----4----+----5----+----6----+----7-|--+----|
>
>C     !LOCAL VARIABLES:
>C     == Local variables ==
>C     v1, v2, v3, v4 :: Temp. for holding time
>C     d1, d2, d3     :: Temp. for hold difference
>      REAL*8 v1, v2, v3, v4, d1, d2, d3
>CEOP
>
>C     o Do easy cases first.
>      DIFFERENT_MULTIPLE = .FALSE.
>
>      IF ( freq .NE. 0. ) THEN
>        IF ( ABS(step) .GT. freq ) THEN
>          DIFFERENT_MULTIPLE = .TRUE.
>        ELSE
>
>C     o This case is more complex because of round-off error
>          v1 = val1
>          v2 = val1 - step
>          v3 = val1 + step
>
>C     Test v1 to see if its a "closest multiple"
>          v4 = NINT(v1/freq)*freq
>          d1 = v1-v4
>          d2 = v2-v4
>          d3 = v3-v4
>          PRINT *, ' v4 = ', v4
>          PRINT *, ' d1 = ', d1
>          PRINT *, ' d2 = ', d2
>          PRINT *, ' d3 = ', d3
>          IF ( ABS(d1) .LT. ABS(d2) .AND. ABS(d1) .LE. ABS(d3) )
>     &        DIFFERENT_MULTIPLE = .TRUE.
>
>        ENDIF
>      ENDIF
>
>      RETURN
>      END
>
>
>
>------------------------------------------------------------------------
>
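
For readers without a Fortran compiler at hand, the logic of the attached test can be followed in a short Python port. This is a sketch for illustration only, not MITgcm code: the function names different_multiple and nint are chosen here, Python floats are always 64-bit, and nint is written to round halves away from zero as Fortran's NINT does.

```python
import math

def nint(x):
    """Nearest integer, halves rounded away from zero (like Fortran NINT)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def different_multiple(freq, val1, step):
    """Python port of the decision logic in the attached routine."""
    if freq == 0.0:
        return False
    if abs(step) > freq:
        return True
    v4 = nint(val1 / freq) * freq  # multiple of freq closest to val1
    d1 = val1 - v4                 # distance of this step from the multiple
    d2 = (val1 - step) - v4        # distance of the previous step
    d3 = (val1 + step) - v4        # distance of the next step
    return abs(d1) < abs(d2) and abs(d1) <= abs(d3)

# Values from the attached test program
freq = 3153600000.0
step = 36000.0
valbase = 788400000000.0

# One step before, exactly at, and one step after a multiple of freq
results = [different_multiple(freq, valbase + k * step, step) for k in (-1, 0, 1)]
print(results)  # [False, True, False]
```

With these numbers every intermediate value is an integer well below 2**53, so the 64-bit arithmetic is exact and only the step that lands on the multiple of freq returns true. That matches Chris's remark that the test works as expected as long as everything stays in 64-bit values.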