[MITgcm-devel] advection routines: strange results in flow trace analysis
Martin Losch
Martin.Losch at awi.de
Thu Apr 10 03:13:27 EDT 2008
Me again,
can I still use #define MULTIDIM_OLD_VERSION in gad_advection.F?
Will that fix my performance problem? And at what cost?
I guess I never realized (and probably never will) how many
complications arise with the cubed-sphere configuration.
Martin
On 9 Apr 2008, at 18:51, Martin.Losch at awi.de wrote:
> OK, I can see now where this comes from:
> C-    CubedSphere : pass 3 times, with partial update of local tracer field
>       IF (ipass.EQ.1) THEN
>        overlapOnly  = MOD(nCFace,3).EQ.0
>        interiorOnly = MOD(nCFace,3).NE.0
>        calc_fluxes_X = nCFace.EQ.6 .OR. nCFace.EQ.1 .OR. nCFace.EQ.2
>        calc_fluxes_Y = nCFace.EQ.3 .OR. nCFace.EQ.4 .OR. nCFace.EQ.5
>       ELSEIF (ipass.EQ.2) THEN
>        overlapOnly  = MOD(nCFace,3).EQ.2
>        interiorOnly = MOD(nCFace,3).EQ.1
>        calc_fluxes_X = nCFace.EQ.2 .OR. nCFace.EQ.3 .OR. nCFace.EQ.4
>        calc_fluxes_Y = nCFace.EQ.5 .OR. nCFace.EQ.6 .OR. nCFace.EQ.1
>       ELSE
>        interiorOnly = .TRUE.
>        calc_fluxes_X = nCFace.EQ.5 .OR. nCFace.EQ.6
>        calc_fluxes_Y = nCFace.EQ.2 .OR. nCFace.EQ.3
>       ENDIF
>
>
> I assume that this is the minimum number of calls to
> gad_${advscheme}_adv_x/y that is possible? Why is it not symmetric for
> all faces? I wonder if the load imbalance on the CPUs (because of
> waiting in the exchange routines) is more severe than the cost of
> calling gad_${advscheme}_adv_x/y for two more faces, so that the
> load is nearly the same for all faces. Currently the four
> exch2_send/recv_rl1/2 routines take up over 20% of the total time
> (mostly because they wait).
>
> Martin
>
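The pass logic quoted above can be tallied per face to see where the asymmetry comes from. The following is only a sketch in Python, not MITgcm code: the function names `fluxes_for` and `calls_per_face` are made up for illustration, and the face sets are copied from the quoted `calc_fluxes_X`/`calc_fluxes_Y` conditions.

```python
# Sketch of the quoted 3-pass logic: which cube faces compute X and Y
# fluxes in each pass. Faces are numbered 1..6, like nCFace in the quote.
# The helper names are hypothetical and do not exist in MITgcm.

def fluxes_for(ipass, face):
    """Return (calc_fluxes_X, calc_fluxes_Y) for one pass and one face."""
    if ipass == 1:
        return face in (6, 1, 2), face in (3, 4, 5)
    if ipass == 2:
        return face in (2, 3, 4), face in (5, 6, 1)
    return face in (5, 6), face in (2, 3)


def calls_per_face(npass=3, nfaces=6):
    """Count adv_x and adv_y calls per face, summed over all passes."""
    counts = {}
    for face in range(1, nfaces + 1):
        nx = sum(fluxes_for(p, face)[0] for p in range(1, npass + 1))
        ny = sum(fluxes_for(p, face)[1] for p in range(1, npass + 1))
        counts[face] = (nx, ny)
    return counts


counts = calls_per_face()
for face, (nx, ny) in counts.items():
    print(f"face {face}: adv_x {nx}  adv_y {ny}  total {nx + ny}")

totals = [nx + ny for nx, ny in counts.values()]
# Extra face-calls needed if every face did as many calls as the busiest one.
extra = sum(max(totals) - t for t in totals)
print("extra calls to equalize:", extra)
```

Under this reading, faces 2 and 6 call adv_x twice and faces 3 and 5 call adv_y twice, while faces 1 and 4 call each routine once. Equalizing the load would take two additional face-calls, which matches the "two more faces" mentioned above. Assuming the 24 processes are spread evenly over the six faces (four per face, which is an assumption here), that would also be consistent with 8 processes showing a doubled call count for each routine.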
> ----- Original Message -----
> From: Jean-Michel Campin <jmc at ocean.mit.edu>
> Date: Wednesday, April 9, 2008 6:30 pm
> Subject: Re: [MITgcm-devel] advection routines: strange results in
> flow trace analysis
>
>> Hi Martin,
>>
>> MultiDim advection on the CS-grid has special handling depending on
>> which face is computed (2 or 3 calls to the advection S/R, npass=3);
>> it's very likely that it comes from there.
>>
>> Jean-Michel
>>
>> On Wed, Apr 09, 2008 at 04:29:44PM +0200, Martin Losch wrote:
>>> Hi there,
>>>
>>> I have an unexpected result in flow trace analysis (see below). I am
>>> running the high-resolution cubed-sphere configuration (CS510) with 16
>>> passive tracers on 24 CPUs of an SX8-R; my advection scheme is 7 (os7mp)
>>> for the tracers and 33 (dst3fl) for the sea-ice variables. As expected,
>>> the advection routines use most of the time (18 tracers). The flow
>>> trace analysis below gives the cumulative/average values in the first
>>> line, and then the values for the individual processes in the following
>>> 24 lines. However, if you look closely, you'll see that on some (8)
>>> CPUs the advection routine is called twice as often as on the
>>> remaining 16 CPUs. This is true both for gad_os7mp_adv_x/y (which is
>>> called from gad_advection in this case) and gad_dst3fl_adv_x/y (which
>>> is called from seaice_advection in this case, not shown). We (Jens-Olaf
>>> and I) suspect that this imbalance is responsible for the
>>> terrible performance of the exch2 routines in this run that Chris and
>>> I talked about in February, because 16 CPUs have to wait for 8 all
>>> the time in the exchange routines.
>>>
>>> All other routines seem to be called with the same frequency on all
>>> CPUs.
>>>
>>> What is the explanation for this?
>>>
>>> Martin
>>>
>>>
>>>
>>>> *--------------------------*
>>>> FLOW TRACE ANALYSIS LIST
>>>> *--------------------------*
>>>>
>>>> Execution : Wed Apr 9 10:54:36 2008
>>>> Total CPU : 13:40'31"663
>>>>
>>>>
>>>> FREQUENCY  EXCLUSIVE        AVER.TIME     MOPS  MFLOPS   V.OP  AVER.    VECTOR  I-CACHE  O-CACHE     BANK  PROG.UNIT
>>>>            TIME[sec]( % )      [msec]                   RATIO  V.LEN      TIME     MISS     MISS     CONF
>>>>
>>>>   6220800  6776.032( 13.8)      1.089  25809.1  7541.0  99.88  256.0  6774.327   0.1500   0.1642  10.0670  gad_os7mp_adv_x
>>>>    194400   211.705             1.089  25814.8  7542.6  99.88  256.0   211.646   0.0084   0.0104   0.2688  0.0
>>>>    194400   211.692             1.089  25816.3  7543.1  99.88  256.0   211.641   0.0013   0.0027   0.2656  0.1
>>>>    194400   211.890             1.090  25792.1  7536.0  99.88  256.0   211.838   0.0014   0.0021   0.4195  0.10
>>>>    194400   211.907             1.090  25790.2  7535.4  99.88  256.0   211.852   0.0020   0.0024   0.4203  0.11
>>>>    194400   211.706             1.089  25814.6  7542.5  99.88  256.0   211.648   0.0059   0.0064   0.2785  0.12
>>>>    194400   211.698             1.089  25815.6  7542.8  99.88  256.0   211.644   0.0011   0.0019   0.2743  0.13
>>>>    194400   211.720             1.089  25812.8  7542.0  99.88  256.0   211.654   0.0171   0.0173   0.2838  0.14
>>>>    194400   211.713             1.089  25813.8  7542.3  99.88  256.0   211.658   0.0034   0.0041   0.2903  0.15
>>>>    194400   211.673             1.089  25818.7  7543.8  99.88  256.0   211.615   0.0096   0.0076   0.2493  0.16
>>>>    194400   211.681             1.089  25817.7  7543.5  99.88  256.0   211.613   0.0181   0.0169   0.2498  0.17
>>>>    194400   211.645             1.089  25822.0  7544.7  99.88  256.0   211.590   0.0041   0.0044   0.2292  0.18
>>>>    194400   211.650             1.089  25821.4  7544.6  99.88  256.0   211.596   0.0042   0.0045   0.2281  0.19
>>>>    194400   211.684             1.089  25817.3  7543.4  99.88  256.0   211.628   0.0050   0.0063   0.2656  0.2
>>>>    388800   423.306             1.089  25821.1  7544.4  99.88  256.0   423.206   0.0061   0.0076   0.4798  0.20
>>>>    388800   423.311             1.089  25820.8  7544.4  99.88  256.0   423.208   0.0057   0.0064   0.4736  0.21
>>>>    388800   423.306             1.089  25821.1  7544.4  99.88  256.0   423.204   0.0024   0.0031   0.4841  0.22
>>>>    388800   423.321             1.089  25820.2  7544.2  99.88  256.0   423.218   0.0017   0.0024   0.4834  0.23
>>>>    194400   211.678             1.089  25818.0  7543.5  99.88  256.0   211.625   0.0101   0.0112   0.2526  0.3
>>>>    388800   423.756             1.090  25793.7  7536.4  99.88  256.0   423.648   0.0025   0.0048   0.8570  0.4
>>>>    388800   423.705             1.090  25796.7  7537.3  99.88  256.0   423.595   0.0079   0.0095   0.8142  0.5
>>>>    388800   423.733             1.090  25795.1  7536.8  99.88  256.0   423.660   0.0159   0.0174   0.8413  0.6
>>>>    388800   423.742             1.090  25794.5  7536.7  99.88  256.0   423.647   0.0024   0.0039   0.8203  0.7
>>>>    194400   211.906             1.090  25790.2  7535.4  99.88  256.0   211.845   0.0124   0.0089   0.4195  0.8
>>>>    194400   211.904             1.090  25790.4  7535.5  99.88  256.0   211.849   0.0012   0.0019   0.4183  0.9
>>>>
>>>>   6220800  6482.742( 13.2)      1.042  27041.7  7882.1  99.88  256.0  6480.721   0.5066   0.1471   7.7018  gad_os7mp_adv_y
>>>>    194400   202.452             1.041  27059.5  7887.3  99.88  256.0   202.387   0.0137   0.0075   0.2022  0.0
>>>>    194400   202.439             1.041  27061.3  7887.8  99.88  256.0   202.380   0.0073   0.0014   0.1965  0.1
>>>>    388800   405.687             1.043  27007.3  7872.1  99.88  256.0   405.582   0.0103   0.0025   0.6622  0.10
>>>>    388800   405.711             1.043  27005.7  7871.6  99.88  256.0   405.568   0.0446   0.0029   0.6483  0.11
>>>>    194400   202.487             1.042  27054.9  7886.0  99.88  256.0   202.422   0.0128   0.0065   0.2192  0.12
>>>>    194400   202.461             1.041  27058.3  7887.0  99.88  256.0   202.401   0.0084   0.0013   0.2061  0.13
>>>>    194400   202.497             1.042  27053.5  7885.6  99.88  256.0   202.425   0.0237   0.0147   0.2160  0.14
>>>>    194400   202.519             1.042  27050.6  7884.7  99.88  256.0   202.423   0.0424   0.0033   0.2155  0.15
>>>>    388800   404.770             1.041  27068.5  7889.9  99.88  256.0   404.664   0.0198   0.0146   0.3562  0.16
>>>>    388800   404.766             1.041  27068.7  7890.0  99.88  256.0   404.653   0.0310   0.0243   0.3585  0.17
>>>>    388800   404.796             1.041  27066.7  7889.4  99.88  256.0   404.692   0.0125   0.0057   0.3540  0.18
>>>>    388800   404.830             1.041  27064.4  7888.8  99.88  256.0   404.696   0.0435   0.0060   0.3553  0.19
>>>>    194400   202.459             1.041  27058.6  7887.1  99.88  256.0   202.396   0.0111   0.0046   0.1934  0.2
>>>>    194400   202.395             1.041  27067.1  7889.5  99.88  256.0   202.338   0.0094   0.0032   0.1811  0.20
>>>>    194400   202.400             1.041  27066.4  7889.3  99.88  256.0   202.342   0.0092   0.0030   0.1827  0.21
>>>>    194400   202.406             1.041  27065.6  7889.1  99.88  256.0   202.347   0.0074   0.0012   0.1768  0.22
>>>>    194400   202.438             1.041  27061.4  7887.9  99.88  256.0   202.347   0.0376   0.0008   0.1773  0.23
>>>>    194400   202.473             1.042  27056.8  7886.5  99.88  256.0   202.392   0.0463   0.0091   0.1846  0.3
>>>>    194400   202.854             1.043  27005.8  7871.7  99.88  256.0   202.791   0.0090   0.0011   0.3488  0.4
>>>>    194400   202.871             1.044  27003.6  7871.0  99.88  256.0   202.806   0.0125   0.0041   0.3407  0.5
>>>>    194400   202.796             1.043  27013.6  7873.9  99.88  256.0   202.748   0.0174   0.0081   0.3077  0.6
>>>>    194400   202.880             1.044  27002.5  7870.7  99.88  256.0   202.788   0.0424   0.0012   0.3204  0.7
>>>>    388800   405.695             1.043  27006.7  7871.9  99.88  256.0   405.581   0.0246   0.0181   0.6572  0.8
>>>>    388800   405.660             1.043  27009.1  7872.6  99.88  256.0   405.551   0.0097   0.0021   0.6412  0.9
>>>>
>>>>   4572288  5253.314( 10.7)      1.149  26038.1  7946.7  99.88  255.2  5251.591   0.1115   0.1259   8.4362  gad_os7mp_adv_r
>>>>    190512   218.247             1.146  26114.6  7970.0  99.88  255.2   218.173   0.0070   0.0075   0.2710  0.0
>>>>    190512   218.203             1.145  26119.9  7971.6  99.88  255.2   218.134   0.0012   0.0023   0.2445  0.1
>>>>    190512   219.379             1.152  25979.8  7928.9  99.88  255.2   219.309   0.0017   0.0023   0.4233  0.10
>>>>    190512   219.377             1.152  25980.0  7929.0  99.88  255.2   219.305   0.0019   0.0021   0.4169  0.11
>>>>    190512   218.427             1.147  26093.0  7963.4  99.88  255.2   218.354   0.0051   0.0057   0.2925  0.12
>>>>    190512   218.376             1.146  26099.1  7965.3  99.88  255.2   218.305   0.0010   0.0017   0.2655  0.13
>>>>    190512   218.430             1.147  26092.7  7963.4  99.88  255.2   218.352   0.0144   0.0143   0.2949  0.14
>>>>    190512   218.424             1.147  26093.4  7963.6  99.88  255.2   218.353   0.0029   0.0035   0.2948  0.15
>>>>    190512   218.846             1.149  26043.1  7948.2  99.88  255.2   218.771   0.0073   0.0077   0.3413  0.16
>>>>    190512   218.903             1.149  26036.3  7946.1  99.88  255.2   218.826   0.0131   0.0131   0.3785  0.17
>>>>    190512   218.789             1.148  26049.9  7950.3  99.88  255.2   218.716   0.0035   0.0039   0.3222  0.18
>>>>    190512   218.737             1.148  26056.1  7952.2  99.88  255.2   218.665   0.0036   0.0037   0.3009  0.19
>>>>    190512   218.213             1.145  26118.6  7971.3  99.88  255.2   218.141   0.0043   0.0052   0.2531  0.2
>>>>    190512   219.118             1.150  26010.8  7938.3  99.88  255.2   219.046   0.0034   0.0044   0.3932  0.20
>>>>    190512   219.104             1.150  26012.3  7938.8  99.88  255.2   219.032   0.0031   0.0039   0.3778  0.21
>>>>    190512   219.107             1.150  26012.0  7938.7  99.88  255.2   219.034   0.0018   0.0023   0.3809  0.22
>>>>    190512   219.113             1.150  26011.3  7938.5  99.88  255.2   219.040   0.0011   0.0014   0.3814  0.23
>>>>    190512   218.109             1.145  26131.1  7975.1  99.88  255.2   218.042   0.0086   0.0092   0.2149  0.3
>>>>    190512   219.504             1.152  25965.0  7924.4  99.88  255.2   219.429   0.0017   0.0031   0.4977  0.4
>>>>    190512   219.464             1.152  25969.7  7925.8  99.88  255.2   219.390   0.0043   0.0052   0.4517  0.5
>>>>    190512   219.381             1.152  25979.6  7928.8  99.88  255.2   219.328   0.0083   0.0088   0.4260  0.6
>>>>    190512   219.394             1.152  25978.0  7928.4  99.88  255.2   219.327   0.0017   0.0027   0.4132  0.7
>>>>    190512   219.319             1.151  25987.0  7931.1  99.88  255.2   219.243   0.0091   0.0097   0.3975  0.8
>>>>    190512   219.351             1.151  25983.1  7929.9  99.88  255.2   219.277   0.0013   0.0022   0.4027  0.9
>>>>
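To get a rough feel for the combined horizontal load, the exclusive times of adv_x and adv_y can be added per process. This is only a sketch in Python: the numbers are copied from the listing above for one process out of each group of four (0.0, 0.4, 0.8, 0.12, 0.16, 0.20), and the "slack" it prints is simply the difference to the busiest process in these two routines, not a measured wait time in the exchange routines.

```python
# Exclusive times [s] taken from the flow trace listing, one process per
# group of four; keys are the process numbers after the "0." prefix.
adv_x = {0: 211.705, 4: 423.756, 8: 211.906, 12: 211.706, 16: 211.673, 20: 423.306}
adv_y = {0: 202.452, 4: 202.854, 8: 405.695, 12: 202.487, 16: 404.770, 20: 202.395}


def horizontal_load(x_times, y_times):
    """Sum adv_x and adv_y exclusive time for each process."""
    return {proc: x_times[proc] + y_times[proc] for proc in x_times}


load = horizontal_load(adv_x, adv_y)
slowest = max(load.values())
for proc, seconds in sorted(load.items()):
    slack = slowest - seconds
    print(f"process 0.{proc}: x+y = {seconds:7.3f} s, slack = {slack:7.3f} s")
```

With these numbers, processes 0.0 and 0.12 spend roughly two thirds of the time that the other four sampled processes spend in the two horizontal routines combined. Whether that difference shows up as waiting in the exch2 routines depends on where the synchronization points are, which this sketch does not model.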
>>> _______________________________________________
>>> MITgcm-devel mailing list
>>> MITgcm-devel at mitgcm.org
>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel