[MITgcm-devel] advection routines: strange results in flow trace analysis
Martin Losch
Martin.Losch at awi.de
Thu Apr 10 11:37:17 EDT 2008
Thanks!
MULTIDIM_OLD_VERSION does not change anything for
global_ocean.cs32x15!
It looks like I can use it for my configuration (although it does
change the results in the 7th to 8th digit after 216 timesteps).
Maybe I can make it the default for TARGET_NEC_SX, like this in
gad_advection.F:
#ifdef TARGET_NEC_SX
# define MULTIDIM_OLD_VERSION
#else
# undef MULTIDIM_OLD_VERSION
#endif
What do you think?
Martin
On 10 Apr 2008, at 17:12, Jean-Michel Campin wrote:
> Martin,
>
>> where exactly is this figure you are talking about, I can't find
>> it ...
> In section 2.28.4, Multi-dimensional advection:
> Figure 2.18: Multi-dimensional advection time-stepping with
> Cubed-Sphere topology
>
> Jean-Michel
>
> On Thu, Apr 10, 2008 at 05:04:21PM +0200, Martin Losch wrote:
>> Hi Jean-Michel,
>> does that mean you do not recommend the use of MULTIDIM_OLD_VERSION?
>> How severe is this non-conservation?
>>
>> It does speed up my short (216 timestep) run by over 10%, mostly
>> because the time spent in BLOCKING_EXCHANGES is reduced from 15%
>> to 2%.
>>
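The two figures quoted above are consistent with each other under a simple assumption: if 15% and 2% are each the share of that run's total time spent in BLOCKING_EXCHANGES, and the time spent everywhere else is unchanged, the expected runtime ratio follows directly. A minimal Python sketch of that arithmetic (the function name is made up for this illustration, and the assumption may not hold exactly for the real run):

```python
def runtime_ratio(frac_before, frac_after):
    """New/old total runtime, assuming the non-exchange time is unchanged.

    frac_before, frac_after: share of total time spent in exchanges
    in the old and new run respectively (assumed interpretation).
    """
    return (1.0 - frac_before) / (1.0 - frac_after)


ratio = runtime_ratio(0.15, 0.02)
print(f"new/old runtime = {ratio:.3f}, about {100 * (1 - ratio):.0f}% shorter")
# prints: new/old runtime = 0.867, about 13% shorter
```

That estimate of roughly 13% agrees with the reported speed-up of "over 10%".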
>> where exactly is this figure you are talking about, I can't find
>> it ...
>> Martin
>>
>> On 10 Apr 2008, at 15:37, Jean-Michel Campin wrote:
>>> Hi Martin,
>>>
>>> On Thu, Apr 10, 2008 at 09:13:27AM +0200, Martin Losch wrote:
>>>> Me again,
>>>>
>>>> can I still use #define MULTIDIM_OLD_VERSION in gad_advection.F?
>>>> will that fix my performance problem? and at what cost?
>>>
>>> The MULTIDIM_OLD_VERSION does not conserve the total tracer
>>> amount.
>>>
>>>> I guess I never realized (and probably never will) how many
>>>> complications arise with the cubed sphere configuration.
>>>
>>> I added a figure in the manual, but the description & legend
>>> are still missing!
>>>
>>> And regarding the other suggestion (3 calls for every tile,
>>> even if 1 call is not needed at all): you will get more flops,
>>> but it is unlikely to really "speed up" the run much. And it
>>> will definitely slow down some other setups we have.
>>>
>>> Jean-Michel
>>>
>>>>
>>>> Martin
>>>>
>>>> On 9 Apr 2008, at 18:51, Martin.Losch at awi.de wrote:
>>>>> OK, I can see now where this comes from:
>>>>> C-    CubedSphere : pass 3 times, with partial update of local tracer field
>>>>>       IF (ipass.EQ.1) THEN
>>>>>        overlapOnly  = MOD(nCFace,3).EQ.0
>>>>>        interiorOnly = MOD(nCFace,3).NE.0
>>>>>        calc_fluxes_X = nCFace.EQ.6 .OR. nCFace.EQ.1 .OR. nCFace.EQ.2
>>>>>        calc_fluxes_Y = nCFace.EQ.3 .OR. nCFace.EQ.4 .OR. nCFace.EQ.5
>>>>>       ELSEIF (ipass.EQ.2) THEN
>>>>>        overlapOnly  = MOD(nCFace,3).EQ.2
>>>>>        interiorOnly = MOD(nCFace,3).EQ.1
>>>>>        calc_fluxes_X = nCFace.EQ.2 .OR. nCFace.EQ.3 .OR. nCFace.EQ.4
>>>>>        calc_fluxes_Y = nCFace.EQ.5 .OR. nCFace.EQ.6 .OR. nCFace.EQ.1
>>>>>       ELSE
>>>>>        interiorOnly = .TRUE.
>>>>>        calc_fluxes_X = nCFace.EQ.5 .OR. nCFace.EQ.6
>>>>>        calc_fluxes_Y = nCFace.EQ.2 .OR. nCFace.EQ.3
>>>>>       ENDIF
>>>>>
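The per-face call counts implied by the three-pass logic quoted above can be tabulated by hand or with a few lines of code. The following is a Python transcription of only the calc_fluxes_X / calc_fluxes_Y conditions (the function and variable names are hypothetical, and it assumes each true flag corresponds to one call of the x or y advection routine per pass):

```python
from collections import Counter


def fluxes_for(ipass, face):
    """Return (calc_fluxes_X, calc_fluxes_Y) for cubed-sphere face 1..6."""
    if ipass == 1:
        return face in (6, 1, 2), face in (3, 4, 5)
    if ipass == 2:
        return face in (2, 3, 4), face in (5, 6, 1)
    return face in (5, 6), face in (2, 3)


calls_x, calls_y = Counter(), Counter()
idle_in_pass = {1: [], 2: [], 3: []}
for ipass in (1, 2, 3):
    for face in range(1, 7):
        do_x, do_y = fluxes_for(ipass, face)
        calls_x[face] += int(do_x)
        calls_y[face] += int(do_y)
        if not (do_x or do_y):
            idle_in_pass[ipass].append(face)

print("face  x-calls  y-calls  total")
for face in range(1, 7):
    print(face, calls_x[face], calls_y[face], calls_x[face] + calls_y[face])
print("faces with no flux call, by pass:", idle_in_pass)
```

Under that reading, faces 2 and 6 get two x-direction calls, faces 3 and 5 get two y-direction calls, and faces 1 and 4 have no flux call in the third pass, which is where the 2-versus-3 asymmetry between faces comes from.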
>>>>>
>>>>> I assume that this is the minimum number of calls to
>>>>> gad_${advscheme}_adv_x/y that is possible? Why is it not symmetric
>>>>> for all faces? I wonder if the load imbalance on the cpus (because
>>>>> of waiting in the exchange routines) is more severe than calling
>>>>> gad_${advscheme}_adv_x/y for two more faces, so that the load is
>>>>> nearly the same for all faces. Currently the four
>>>>> exch2_send/recv_rl1/2 routines take up over 20% of the total time
>>>>> (mostly because they wait).
>>>>>
>>>>> Martin
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Jean-Michel Campin <jmc at ocean.mit.edu>
>>>>> Date: Wednesday, April 9, 2008 6:30 pm
>>>>> Subject: Re: [MITgcm-devel] advection routines: strange results in
>>>>> flow trace analysis
>>>>>
>>>>>> Hi Martin,
>>>>>>
>>>>>> MultiDim advection on the CS-grid has special handling depending on
>>>>>> which face is computed (2 or 3 calls to the advection S/R, npass=3);
>>>>>> it's very likely that it comes from there.
>>>>>>
>>>>>> Jean-Michel
>>>>>>
>>>>>> On Wed, Apr 09, 2008 at 04:29:44PM +0200, Martin Losch wrote:
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I have an unexpected result in flow trace analysis (see below). I am
>>>>>>> running the high resolution cubed sphere configuration (CS510) with 16
>>>>>>> passive tracers on 24 CPUs of an SX8-R; my advection scheme is 7 (os7mp)
>>>>>>> for the tracers and 33 (dst3fl) for the seaice variables. As expected,
>>>>>>> the advection routines use most of the time (18 tracers). The flow
>>>>>>> trace analysis below gives the cumulative/average values in the first
>>>>>>> line, and then the values for the individual processes in the following
>>>>>>> 24 lines. However, if you look closely, you'll see that on some (8)
>>>>>>> CPUs the advection routine is called twice as often as on the
>>>>>>> remaining 16 CPUs. This is true both for gad_os7mp_adv_x/y (which is
>>>>>>> called from gad_advection in this case) and gad_dst3fl_adv_x/y (which
>>>>>>> is called from seaice_advection in this case, not shown). We (Jens-Olaf
>>>>>>> and I) suspect that this imbalance is responsible for the terrible
>>>>>>> performance of the exch2 routines in this run that Chris and I talked
>>>>>>> about in February, because 16 CPUs have to wait for 8 all the time in
>>>>>>> the exchange routines.
>>>>>>>
>>>>>>> All other routines seem to be called with the same frequency on all
>>>>>>> CPUs.
>>>>>>>
>>>>>>> What is the explanation for this?
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> *--------------------------*
>>>>>>>> FLOW TRACE ANALYSIS LIST
>>>>>>>> *--------------------------*
>>>>>>>>
>>>>>>>> Execution : Wed Apr 9 10:54:36 2008
>>>>>>>> Total CPU : 13:40'31"663
>>>>>>>>
>>>>>>>>
>>>>>>>> FREQUENCY  EXCLUSIVE        AVER.TIME     MOPS  MFLOPS   V.OP  AVER.    VECTOR  I-CACHE  O-CACHE     BANK  PROG.UNIT
>>>>>>>>            TIME[sec]( % )      [msec]                   RATIO  V.LEN      TIME     MISS     MISS     CONF
>>>>>>>>
>>>>>>>>   6220800  6776.032( 13.8)      1.089  25809.1  7541.0  99.88  256.0  6774.327   0.1500   0.1642  10.0670  gad_os7mp_adv_x
>>>>>>>>    194400   211.705             1.089  25814.8  7542.6  99.88  256.0   211.646   0.0084   0.0104   0.2688  0.0
>>>>>>>>    194400   211.692             1.089  25816.3  7543.1  99.88  256.0   211.641   0.0013   0.0027   0.2656  0.1
>>>>>>>>    194400   211.890             1.090  25792.1  7536.0  99.88  256.0   211.838   0.0014   0.0021   0.4195  0.10
>>>>>>>>    194400   211.907             1.090  25790.2  7535.4  99.88  256.0   211.852   0.0020   0.0024   0.4203  0.11
>>>>>>>>    194400   211.706             1.089  25814.6  7542.5  99.88  256.0   211.648   0.0059   0.0064   0.2785  0.12
>>>>>>>>    194400   211.698             1.089  25815.6  7542.8  99.88  256.0   211.644   0.0011   0.0019   0.2743  0.13
>>>>>>>>    194400   211.720             1.089  25812.8  7542.0  99.88  256.0   211.654   0.0171   0.0173   0.2838  0.14
>>>>>>>>    194400   211.713             1.089  25813.8  7542.3  99.88  256.0   211.658   0.0034   0.0041   0.2903  0.15
>>>>>>>>    194400   211.673             1.089  25818.7  7543.8  99.88  256.0   211.615   0.0096   0.0076   0.2493  0.16
>>>>>>>>    194400   211.681             1.089  25817.7  7543.5  99.88  256.0   211.613   0.0181   0.0169   0.2498  0.17
>>>>>>>>    194400   211.645             1.089  25822.0  7544.7  99.88  256.0   211.590   0.0041   0.0044   0.2292  0.18
>>>>>>>>    194400   211.650             1.089  25821.4  7544.6  99.88  256.0   211.596   0.0042   0.0045   0.2281  0.19
>>>>>>>>    194400   211.684             1.089  25817.3  7543.4  99.88  256.0   211.628   0.0050   0.0063   0.2656  0.2
>>>>>>>>    388800   423.306             1.089  25821.1  7544.4  99.88  256.0   423.206   0.0061   0.0076   0.4798  0.20
>>>>>>>>    388800   423.311             1.089  25820.8  7544.4  99.88  256.0   423.208   0.0057   0.0064   0.4736  0.21
>>>>>>>>    388800   423.306             1.089  25821.1  7544.4  99.88  256.0   423.204   0.0024   0.0031   0.4841  0.22
>>>>>>>>    388800   423.321             1.089  25820.2  7544.2  99.88  256.0   423.218   0.0017   0.0024   0.4834  0.23
>>>>>>>>    194400   211.678             1.089  25818.0  7543.5  99.88  256.0   211.625   0.0101   0.0112   0.2526  0.3
>>>>>>>>    388800   423.756             1.090  25793.7  7536.4  99.88  256.0   423.648   0.0025   0.0048   0.8570  0.4
>>>>>>>>    388800   423.705             1.090  25796.7  7537.3  99.88  256.0   423.595   0.0079   0.0095   0.8142  0.5
>>>>>>>>    388800   423.733             1.090  25795.1  7536.8  99.88  256.0   423.660   0.0159   0.0174   0.8413  0.6
>>>>>>>>    388800   423.742             1.090  25794.5  7536.7  99.88  256.0   423.647   0.0024   0.0039   0.8203  0.7
>>>>>>>>    194400   211.906             1.090  25790.2  7535.4  99.88  256.0   211.845   0.0124   0.0089   0.4195  0.8
>>>>>>>>    194400   211.904             1.090  25790.4  7535.5  99.88  256.0   211.849   0.0012   0.0019   0.4183  0.9
>>>>>>>>   6220800  6482.742( 13.2)      1.042  27041.7  7882.1  99.88  256.0  6480.721   0.5066   0.1471   7.7018  gad_os7mp_adv_y
>>>>>>>>    194400   202.452             1.041  27059.5  7887.3  99.88  256.0   202.387   0.0137   0.0075   0.2022  0.0
>>>>>>>>    194400   202.439             1.041  27061.3  7887.8  99.88  256.0   202.380   0.0073   0.0014   0.1965  0.1
>>>>>>>>    388800   405.687             1.043  27007.3  7872.1  99.88  256.0   405.582   0.0103   0.0025   0.6622  0.10
>>>>>>>>    388800   405.711             1.043  27005.7  7871.6  99.88  256.0   405.568   0.0446   0.0029   0.6483  0.11
>>>>>>>>    194400   202.487             1.042  27054.9  7886.0  99.88  256.0   202.422   0.0128   0.0065   0.2192  0.12
>>>>>>>>    194400   202.461             1.041  27058.3  7887.0  99.88  256.0   202.401   0.0084   0.0013   0.2061  0.13
>>>>>>>>    194400   202.497             1.042  27053.5  7885.6  99.88  256.0   202.425   0.0237   0.0147   0.2160  0.14
>>>>>>>>    194400   202.519             1.042  27050.6  7884.7  99.88  256.0   202.423   0.0424   0.0033   0.2155  0.15
>>>>>>>>    388800   404.770             1.041  27068.5  7889.9  99.88  256.0   404.664   0.0198   0.0146   0.3562  0.16
>>>>>>>>    388800   404.766             1.041  27068.7  7890.0  99.88  256.0   404.653   0.0310   0.0243   0.3585  0.17
>>>>>>>>    388800   404.796             1.041  27066.7  7889.4  99.88  256.0   404.692   0.0125   0.0057   0.3540  0.18
>>>>>>>>    388800   404.830             1.041  27064.4  7888.8  99.88  256.0   404.696   0.0435   0.0060   0.3553  0.19
>>>>>>>>    194400   202.459             1.041  27058.6  7887.1  99.88  256.0   202.396   0.0111   0.0046   0.1934  0.2
>>>>>>>>    194400   202.395             1.041  27067.1  7889.5  99.88  256.0   202.338   0.0094   0.0032   0.1811  0.20
>>>>>>>>    194400   202.400             1.041  27066.4  7889.3  99.88  256.0   202.342   0.0092   0.0030   0.1827  0.21
>>>>>>>>    194400   202.406             1.041  27065.6  7889.1  99.88  256.0   202.347   0.0074   0.0012   0.1768  0.22
>>>>>>>>    194400   202.438             1.041  27061.4  7887.9  99.88  256.0   202.347   0.0376   0.0008   0.1773  0.23
>>>>>>>>    194400   202.473             1.042  27056.8  7886.5  99.88  256.0   202.392   0.0463   0.0091   0.1846  0.3
>>>>>>>>    194400   202.854             1.043  27005.8  7871.7  99.88  256.0   202.791   0.0090   0.0011   0.3488  0.4
>>>>>>>>    194400   202.871             1.044  27003.6  7871.0  99.88  256.0   202.806   0.0125   0.0041   0.3407  0.5
>>>>>>>>    194400   202.796             1.043  27013.6  7873.9  99.88  256.0   202.748   0.0174   0.0081   0.3077  0.6
>>>>>>>>    194400   202.880             1.044  27002.5  7870.7  99.88  256.0   202.788   0.0424   0.0012   0.3204  0.7
>>>>>>>>    388800   405.695             1.043  27006.7  7871.9  99.88  256.0   405.581   0.0246   0.0181   0.6572  0.8
>>>>>>>>    388800   405.660             1.043  27009.1  7872.6  99.88  256.0   405.551   0.0097   0.0021   0.6412  0.9
>>>>>>>>   4572288  5253.314( 10.7)      1.149  26038.1  7946.7  99.88  255.2  5251.591   0.1115   0.1259   8.4362  gad_os7mp_adv_r
>>>>>>>>    190512   218.247             1.146  26114.6  7970.0  99.88  255.2   218.173   0.0070   0.0075   0.2710  0.0
>>>>>>>>    190512   218.203             1.145  26119.9  7971.6  99.88  255.2   218.134   0.0012   0.0023   0.2445  0.1
>>>>>>>>    190512   219.379             1.152  25979.8  7928.9  99.88  255.2   219.309   0.0017   0.0023   0.4233  0.10
>>>>>>>>    190512   219.377             1.152  25980.0  7929.0  99.88  255.2   219.305   0.0019   0.0021   0.4169  0.11
>>>>>>>>    190512   218.427             1.147  26093.0  7963.4  99.88  255.2   218.354   0.0051   0.0057   0.2925  0.12
>>>>>>>>    190512   218.376             1.146  26099.1  7965.3  99.88  255.2   218.305   0.0010   0.0017   0.2655  0.13
>>>>>>>>    190512   218.430             1.147  26092.7  7963.4  99.88  255.2   218.352   0.0144   0.0143   0.2949  0.14
>>>>>>>>    190512   218.424             1.147  26093.4  7963.6  99.88  255.2   218.353   0.0029   0.0035   0.2948  0.15
>>>>>>>>    190512   218.846             1.149  26043.1  7948.2  99.88  255.2   218.771   0.0073   0.0077   0.3413  0.16
>>>>>>>>    190512   218.903             1.149  26036.3  7946.1  99.88  255.2   218.826   0.0131   0.0131   0.3785  0.17
>>>>>>>>    190512   218.789             1.148  26049.9  7950.3  99.88  255.2   218.716   0.0035   0.0039   0.3222  0.18
>>>>>>>>    190512   218.737             1.148  26056.1  7952.2  99.88  255.2   218.665   0.0036   0.0037   0.3009  0.19
>>>>>>>>    190512   218.213             1.145  26118.6  7971.3  99.88  255.2   218.141   0.0043   0.0052   0.2531  0.2
>>>>>>>>    190512   219.118             1.150  26010.8  7938.3  99.88  255.2   219.046   0.0034   0.0044   0.3932  0.20
>>>>>>>>    190512   219.104             1.150  26012.3  7938.8  99.88  255.2   219.032   0.0031   0.0039   0.3778  0.21
>>>>>>>>    190512   219.107             1.150  26012.0  7938.7  99.88  255.2   219.034   0.0018   0.0023   0.3809  0.22
>>>>>>>>    190512   219.113             1.150  26011.3  7938.5  99.88  255.2   219.040   0.0011   0.0014   0.3814  0.23
>>>>>>>>    190512   218.109             1.145  26131.1  7975.1  99.88  255.2   218.042   0.0086   0.0092   0.2149  0.3
>>>>>>>>    190512   219.504             1.152  25965.0  7924.4  99.88  255.2   219.429   0.0017   0.0031   0.4977  0.4
>>>>>>>>    190512   219.464             1.152  25969.7  7925.8  99.88  255.2   219.390   0.0043   0.0052   0.4517  0.5
>>>>>>>>    190512   219.381             1.152  25979.6  7928.8  99.88  255.2   219.328   0.0083   0.0088   0.4260  0.6
>>>>>>>>    190512   219.394             1.152  25978.0  7928.4  99.88  255.2   219.327   0.0017   0.0027   0.4132  0.7
>>>>>>>>    190512   219.319             1.151  25987.0  7931.1  99.88  255.2   219.243   0.0091   0.0097   0.3975  0.8
>>>>>>>>    190512   219.351             1.151  25983.1  7929.9  99.88  255.2   219.277   0.0013   0.0022   0.4027  0.9
>>>>>>>>
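The FREQUENCY column of the trace above can be cross-checked with a short script. The sketch below is Python and rests on two assumptions that the trace itself does not state: that the last column of each per-process row ("0.0" to "0.23") is the process number 0 to 23, and that processes are assigned to cubed-sphere faces in consecutive blocks of four. The helper names are made up for this illustration.

```python
TOTAL = 6220800                  # cumulative FREQUENCY for adv_x and adv_y
SINGLE, DOUBLE = 194400, 388800  # the two per-process values in the trace

# Process numbers whose rows show the doubled call count (read off the trace).
doubled_x = {4, 5, 6, 7, 20, 21, 22, 23}
doubled_y = {8, 9, 10, 11, 16, 17, 18, 19}


def frequencies(doubled, nproc=24):
    """Per-process call counts given the set of processes with doubled calls."""
    return [DOUBLE if p in doubled else SINGLE for p in range(nproc)]


def doubled_blocks(doubled, block=4):
    """1-based index of each block of `block` consecutive processes (assumed = face)."""
    return sorted({p // block + 1 for p in doubled})


for name, doubled in (("adv_x", doubled_x), ("adv_y", doubled_y)):
    freq = frequencies(doubled)
    print(name, sum(freq) == TOTAL, doubled_blocks(doubled))
# prints:
# adv_x True [2, 6]
# adv_y True [3, 5]
```

The per-process counts add up to the cumulative line (16 x 194400 + 8 x 388800 = 6220800), and under the block-of-four assumption the doubled x calls sit on the second and sixth block and the doubled y calls on the third and fifth. For gad_os7mp_adv_r every process reports 190512 calls, so the imbalance is confined to the horizontal routines.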
>>>>>>> _______________________________________________
>>>>>>> MITgcm-devel mailing list
>>>>>>> MITgcm-devel at mitgcm.org
>>>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel