[MITgcm-devel] advection routines: strange results in flow trace analysis

Martin Losch Martin.Losch at awi.de
Thu Apr 10 11:04:21 EDT 2008


Hi Jean-Michel,
Does that mean you do not recommend using MULTIDIM_OLD_VERSION?
How severe is this non-conservation?

It does speed up my short (216 timestep) run by over 10%, mostly  
because the time spent in BLOCKING_EXCHANGES is reduced from 15% to 2%.
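(A rough consistency check, assuming the exchange time saved is simply removed and not spent elsewhere: if BLOCKING_EXCHANGES drops from 15% to 2% of the run time, then

    T_new ~ (1 - 0.15 + 0.02) * T_old = 0.87 * T_old,

i.e. about a 13% reduction in total time, which fits the "over 10%" speedup.)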

Where exactly is this figure you are talking about? I can't find it ...
Martin

On 10 Apr 2008, at 15:37, Jean-Michel Campin wrote:
> Hi Martin,
>
> On Thu, Apr 10, 2008 at 09:13:27AM +0200, Martin Losch wrote:
>> Me again,
>>
>> can I still use #define MULTIDIM_OLD_VERSION in gad_advection.F?
>> will that fix my performance problem? and at what cost?
>
> The MULTIDIM_OLD_VERSION does not conserve the total tracer
> amount.
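(For reference, the switch in question is a plain CPP flag; a minimal sketch of what enabling it would look like, assuming it sits near the other CPP options at the top of gad_advection.F as the thread suggests, with the exact default and placement possibly differing in the source:

   #define MULTIDIM_OLD_VERSION

)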
>
>> I guess I never realized (and probably never will) how many
>> complications arise with the cubed sphere configuration.
>
> I added a figure to the manual, but the description & legend
> are still missing!!!
>
> And regarding the other suggestion (3 calls for every tile,
> even if 1 call is not needed at all): you would get more flops,
> but it is unlikely to really "speed up" our run much. And it
> will definitely slow down some other setups we have.
>
> Jean-Michel
>
>>
>> Martin
>>
>> On 9 Apr 2008, at 18:51, Martin.Losch at awi.de wrote:
>>> OK, I can see now where this comes from:
>>> C-    CubedSphere : pass 3 times, with partial update of local tracer field
>>>       IF (ipass.EQ.1) THEN
>>>        overlapOnly  = MOD(nCFace,3).EQ.0
>>>        interiorOnly = MOD(nCFace,3).NE.0
>>>        calc_fluxes_X = nCFace.EQ.6 .OR. nCFace.EQ.1 .OR. nCFace.EQ.2
>>>        calc_fluxes_Y = nCFace.EQ.3 .OR. nCFace.EQ.4 .OR. nCFace.EQ.5
>>>       ELSEIF (ipass.EQ.2) THEN
>>>        overlapOnly  = MOD(nCFace,3).EQ.2
>>>        interiorOnly = MOD(nCFace,3).EQ.1
>>>        calc_fluxes_X = nCFace.EQ.2 .OR. nCFace.EQ.3 .OR. nCFace.EQ.4
>>>        calc_fluxes_Y = nCFace.EQ.5 .OR. nCFace.EQ.6 .OR. nCFace.EQ.1
>>>       ELSE
>>>        interiorOnly = .TRUE.
>>>        calc_fluxes_X = nCFace.EQ.5 .OR. nCFace.EQ.6
>>>        calc_fluxes_Y = nCFace.EQ.2 .OR. nCFace.EQ.3
>>>       ENDIF
>>>
>>>
>>> I assume that this is the minimum possible number of calls to
>>> gad_${advscheme}_adv_x/y? Why is it not symmetric for all faces?
>>> I wonder if the load imbalance on the CPUs (because of waiting in
>>> the exchange routines) is more costly than calling
>>> gad_${advscheme}_adv_x/y for two more faces, so that the load
>>> would be nearly the same for all faces. Currently the four
>>> exch2_send/recv_rl1/2 routines take up over 20% of the total time
>>> (mostly because they wait).
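(A small standalone tally, not part of MITgcm, that just re-evaluates the calc_fluxes_X/Y flags above over the 3 passes for each of the 6 faces. It assumes that whenever a flag is set the corresponding gad_${advscheme}_adv_x/y routine is entered once per sweep, and it ignores overlapOnly/interiorOnly, which presumably only restrict the range that is computed:

      PROGRAM COUNT_ADV_CALLS
C     Count how often the X- and Y-direction advection routines would
C     be entered per cube face in one multi-dimensional sweep,
C     re-using the per-pass flag logic quoted above.
      IMPLICIT NONE
      INTEGER nCFace, ipass, nX, nY
      LOGICAL calc_fluxes_X, calc_fluxes_Y
      DO nCFace = 1, 6
       nX = 0
       nY = 0
       DO ipass = 1, 3
        IF ( ipass.EQ.1 ) THEN
         calc_fluxes_X = nCFace.EQ.6 .OR. nCFace.EQ.1 .OR. nCFace.EQ.2
         calc_fluxes_Y = nCFace.EQ.3 .OR. nCFace.EQ.4 .OR. nCFace.EQ.5
        ELSEIF ( ipass.EQ.2 ) THEN
         calc_fluxes_X = nCFace.EQ.2 .OR. nCFace.EQ.3 .OR. nCFace.EQ.4
         calc_fluxes_Y = nCFace.EQ.5 .OR. nCFace.EQ.6 .OR. nCFace.EQ.1
        ELSE
         calc_fluxes_X = nCFace.EQ.5 .OR. nCFace.EQ.6
         calc_fluxes_Y = nCFace.EQ.2 .OR. nCFace.EQ.3
        ENDIF
        IF ( calc_fluxes_X ) nX = nX + 1
        IF ( calc_fluxes_Y ) nY = nY + 1
       ENDDO
       WRITE(*,'(A,I2,A,I2,A,I2)') ' face', nCFace,
     &       ': adv_x calls =', nX, ', adv_y calls =', nY
      ENDDO
      END

Under these assumptions, faces 2 and 6 enter the X routine twice per sweep, faces 3 and 5 enter the Y routine twice, and faces 1 and 4 enter each once. If the 24 processes map four to a face, that would put the doubled adv_x counts on 8 processes and the doubled adv_y counts on 8 others, which would match the 388800 vs. 194400 call frequencies in the trace below.)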
>>>
>>> Martin
>>>
>>> ----- Original Message -----
>>> From: Jean-Michel Campin <jmc at ocean.mit.edu>
>>> Date: Wednesday, April 9, 2008 6:30 pm
>>> Subject: Re: [MITgcm-devel] advection routines: strange results in
>>> flow trace analysis
>>>
>>>> Hi Martin,
>>>>
>>>> MultiDim advection on the CS-grid has special treatment depending on
>>>> which face is computed (2 or 3 calls to the advection S/R, npass=3);
>>>> it's very likely that it comes from there.
>>>>
>>>> Jean-Michel
>>>>
>>>> On Wed, Apr 09, 2008 at 04:29:44PM +0200, Martin Losch wrote:
>>>>> Hi there,
>>>>>
>>>>> I have an unexpected result in a flow trace analysis (see below). I am
>>>>> running the high-resolution cubed sphere configuration (CS510) with 16
>>>>> passive tracers on 24 CPUs of an SX8-R; my advection scheme is 7 (os7mp)
>>>>> for the tracers and 33 (dst3fl) for the seaice variables. As expected,
>>>>> the advection routines use most of the time (18 tracers). The flow
>>>>> trace analysis below gives the cumulative/average values in the first
>>>>> line, and then the values for the individual processes in the following
>>>>> 24 lines. However, if you look closely, you'll see that on some (8)
>>>>> CPUs the advection routine is called twice as often as on the
>>>>> remaining 16 CPUs; this is true both for gad_os7mp_adv_x/y (which is
>>>>> called from gad_advection in this case) and gad_dst3fl_adv_x/y (which
>>>>> is called from seaice_advection in this case, not shown). We (Jens-Olaf
>>>>> and I) suspect that this imbalance is responsible for the terrible
>>>>> performance of the exch2 routines in this run that Chris and I talked
>>>>> about in February, because 16 CPUs have to wait for 8 all the time in
>>>>> the exchange routines.
>>>>>
>>>>> All other routines seem to be called with the same frequency on all
>>>>> CPUs.
>>>>>
>>>>> What is the explanation for this?
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>>
>>>>>> *--------------------------*
>>>>>> FLOW TRACE ANALYSIS LIST
>>>>>> *--------------------------*
>>>>>>
>>>>>> Execution : Wed Apr  9 10:54:36 2008
>>>>>> Total CPU : 13:40'31"663
>>>>>>
>>>>>>
>>>>>> FREQUENCY  EXCLUSIVE       AVER.TIME    MOPS  MFLOPS V.OP  AVER.   VECTOR I-CACHE O-CACHE    BANK  PROG.UNIT
>>>>>>            TIME[sec](  % )    [msec]                 RATIO V.LEN     TIME    MISS    MISS    CONF
>>>>>>
>>>>>> 6220800  6776.032( 13.8)     1.089 25809.1  7541.0 99.88 256.0  6774.327  0.1500  0.1642 10.0670  gad_os7mp_adv_x
>>>>>>  194400   211.705            1.089 25814.8  7542.6 99.88 256.0   211.646  0.0084  0.0104  0.2688   0.0
>>>>>>  194400   211.692            1.089 25816.3  7543.1 99.88 256.0   211.641  0.0013  0.0027  0.2656   0.1
>>>>>>  194400   211.890            1.090 25792.1  7536.0 99.88 256.0   211.838  0.0014  0.0021  0.4195   0.10
>>>>>>  194400   211.907            1.090 25790.2  7535.4 99.88 256.0   211.852  0.0020  0.0024  0.4203   0.11
>>>>>>  194400   211.706            1.089 25814.6  7542.5 99.88 256.0   211.648  0.0059  0.0064  0.2785   0.12
>>>>>>  194400   211.698            1.089 25815.6  7542.8 99.88 256.0   211.644  0.0011  0.0019  0.2743   0.13
>>>>>>  194400   211.720            1.089 25812.8  7542.0 99.88 256.0   211.654  0.0171  0.0173  0.2838   0.14
>>>>>>  194400   211.713            1.089 25813.8  7542.3 99.88 256.0   211.658  0.0034  0.0041  0.2903   0.15
>>>>>>  194400   211.673            1.089 25818.7  7543.8 99.88 256.0   211.615  0.0096  0.0076  0.2493   0.16
>>>>>>  194400   211.681            1.089 25817.7  7543.5 99.88 256.0   211.613  0.0181  0.0169  0.2498   0.17
>>>>>>  194400   211.645            1.089 25822.0  7544.7 99.88 256.0   211.590  0.0041  0.0044  0.2292   0.18
>>>>>>  194400   211.650            1.089 25821.4  7544.6 99.88 256.0   211.596  0.0042  0.0045  0.2281   0.19
>>>>>>  194400   211.684            1.089 25817.3  7543.4 99.88 256.0   211.628  0.0050  0.0063  0.2656   0.2
>>>>>>  388800   423.306            1.089 25821.1  7544.4 99.88 256.0   423.206  0.0061  0.0076  0.4798   0.20
>>>>>>  388800   423.311            1.089 25820.8  7544.4 99.88 256.0   423.208  0.0057  0.0064  0.4736   0.21
>>>>>>  388800   423.306            1.089 25821.1  7544.4 99.88 256.0   423.204  0.0024  0.0031  0.4841   0.22
>>>>>>  388800   423.321            1.089 25820.2  7544.2 99.88 256.0   423.218  0.0017  0.0024  0.4834   0.23
>>>>>>  194400   211.678            1.089 25818.0  7543.5 99.88 256.0   211.625  0.0101  0.0112  0.2526   0.3
>>>>>>  388800   423.756            1.090 25793.7  7536.4 99.88 256.0   423.648  0.0025  0.0048  0.8570   0.4
>>>>>>  388800   423.705            1.090 25796.7  7537.3 99.88 256.0   423.595  0.0079  0.0095  0.8142   0.5
>>>>>>  388800   423.733            1.090 25795.1  7536.8 99.88 256.0   423.660  0.0159  0.0174  0.8413   0.6
>>>>>>  388800   423.742            1.090 25794.5  7536.7 99.88 256.0   423.647  0.0024  0.0039  0.8203   0.7
>>>>>>  194400   211.906            1.090 25790.2  7535.4 99.88 256.0   211.845  0.0124  0.0089  0.4195   0.8
>>>>>>  194400   211.904            1.090 25790.4  7535.5 99.88 256.0   211.849  0.0012  0.0019  0.4183   0.9
>>>>>> 6220800  6482.742( 13.2)     1.042 27041.7  7882.1 99.88 256.0  6480.721  0.5066  0.1471  7.7018  gad_os7mp_adv_y
>>>>>>  194400   202.452            1.041 27059.5  7887.3 99.88 256.0   202.387  0.0137  0.0075  0.2022   0.0
>>>>>>  194400   202.439            1.041 27061.3  7887.8 99.88 256.0   202.380  0.0073  0.0014  0.1965   0.1
>>>>>>  388800   405.687            1.043 27007.3  7872.1 99.88 256.0   405.582  0.0103  0.0025  0.6622   0.10
>>>>>>  388800   405.711            1.043 27005.7  7871.6 99.88 256.0   405.568  0.0446  0.0029  0.6483   0.11
>>>>>>  194400   202.487            1.042 27054.9  7886.0 99.88 256.0   202.422  0.0128  0.0065  0.2192   0.12
>>>>>>  194400   202.461            1.041 27058.3  7887.0 99.88 256.0   202.401  0.0084  0.0013  0.2061   0.13
>>>>>>  194400   202.497            1.042 27053.5  7885.6 99.88 256.0   202.425  0.0237  0.0147  0.2160   0.14
>>>>>>  194400   202.519            1.042 27050.6  7884.7 99.88 256.0   202.423  0.0424  0.0033  0.2155   0.15
>>>>>>  388800   404.770            1.041 27068.5  7889.9 99.88 256.0   404.664  0.0198  0.0146  0.3562   0.16
>>>>>>  388800   404.766            1.041 27068.7  7890.0 99.88 256.0   404.653  0.0310  0.0243  0.3585   0.17
>>>>>>  388800   404.796            1.041 27066.7  7889.4 99.88 256.0   404.692  0.0125  0.0057  0.3540   0.18
>>>>>>  388800   404.830            1.041 27064.4  7888.8 99.88 256.0   404.696  0.0435  0.0060  0.3553   0.19
>>>>>>  194400   202.459            1.041 27058.6  7887.1 99.88 256.0   202.396  0.0111  0.0046  0.1934   0.2
>>>>>>  194400   202.395            1.041 27067.1  7889.5 99.88 256.0   202.338  0.0094  0.0032  0.1811   0.20
>>>>>>  194400   202.400            1.041 27066.4  7889.3 99.88 256.0   202.342  0.0092  0.0030  0.1827   0.21
>>>>>>  194400   202.406            1.041 27065.6  7889.1 99.88 256.0   202.347  0.0074  0.0012  0.1768   0.22
>>>>>>  194400   202.438            1.041 27061.4  7887.9 99.88 256.0   202.347  0.0376  0.0008  0.1773   0.23
>>>>>>  194400   202.473            1.042 27056.8  7886.5 99.88 256.0   202.392  0.0463  0.0091  0.1846   0.3
>>>>>>  194400   202.854            1.043 27005.8  7871.7 99.88 256.0   202.791  0.0090  0.0011  0.3488   0.4
>>>>>>  194400   202.871            1.044 27003.6  7871.0 99.88 256.0   202.806  0.0125  0.0041  0.3407   0.5
>>>>>>  194400   202.796            1.043 27013.6  7873.9 99.88 256.0   202.748  0.0174  0.0081  0.3077   0.6
>>>>>>  194400   202.880            1.044 27002.5  7870.7 99.88 256.0   202.788  0.0424  0.0012  0.3204   0.7
>>>>>>  388800   405.695            1.043 27006.7  7871.9 99.88 256.0   405.581  0.0246  0.0181  0.6572   0.8
>>>>>>  388800   405.660            1.043 27009.1  7872.6 99.88 256.0   405.551  0.0097  0.0021  0.6412   0.9
>>>>>> 4572288  5253.314( 10.7)     1.149 26038.1  7946.7 99.88 255.2  5251.591  0.1115  0.1259  8.4362  gad_os7mp_adv_r
>>>>>>  190512   218.247            1.146 26114.6  7970.0 99.88 255.2   218.173  0.0070  0.0075  0.2710   0.0
>>>>>>  190512   218.203            1.145 26119.9  7971.6 99.88 255.2   218.134  0.0012  0.0023  0.2445   0.1
>>>>>>  190512   219.379            1.152 25979.8  7928.9 99.88 255.2   219.309  0.0017  0.0023  0.4233   0.10
>>>>>>  190512   219.377            1.152 25980.0  7929.0 99.88 255.2   219.305  0.0019  0.0021  0.4169   0.11
>>>>>>  190512   218.427            1.147 26093.0  7963.4 99.88 255.2   218.354  0.0051  0.0057  0.2925   0.12
>>>>>>  190512   218.376            1.146 26099.1  7965.3 99.88 255.2   218.305  0.0010  0.0017  0.2655   0.13
>>>>>>  190512   218.430            1.147 26092.7  7963.4 99.88 255.2   218.352  0.0144  0.0143  0.2949   0.14
>>>>>>  190512   218.424            1.147 26093.4  7963.6 99.88 255.2   218.353  0.0029  0.0035  0.2948   0.15
>>>>>>  190512   218.846            1.149 26043.1  7948.2 99.88 255.2   218.771  0.0073  0.0077  0.3413   0.16
>>>>>>  190512   218.903            1.149 26036.3  7946.1 99.88 255.2   218.826  0.0131  0.0131  0.3785   0.17
>>>>>>  190512   218.789            1.148 26049.9  7950.3 99.88 255.2   218.716  0.0035  0.0039  0.3222   0.18
>>>>>>  190512   218.737            1.148 26056.1  7952.2 99.88 255.2   218.665  0.0036  0.0037  0.3009   0.19
>>>>>>  190512   218.213            1.145 26118.6  7971.3 99.88 255.2   218.141  0.0043  0.0052  0.2531   0.2
>>>>>>  190512   219.118            1.150 26010.8  7938.3 99.88 255.2   219.046  0.0034  0.0044  0.3932   0.20
>>>>>>  190512   219.104            1.150 26012.3  7938.8 99.88 255.2   219.032  0.0031  0.0039  0.3778   0.21
>>>>>>  190512   219.107            1.150 26012.0  7938.7 99.88 255.2   219.034  0.0018  0.0023  0.3809   0.22
>>>>>>  190512   219.113            1.150 26011.3  7938.5 99.88 255.2   219.040  0.0011  0.0014  0.3814   0.23
>>>>>>  190512   218.109            1.145 26131.1  7975.1 99.88 255.2   218.042  0.0086  0.0092  0.2149   0.3
>>>>>>  190512   219.504            1.152 25965.0  7924.4 99.88 255.2   219.429  0.0017  0.0031  0.4977   0.4
>>>>>>  190512   219.464            1.152 25969.7  7925.8 99.88 255.2   219.390  0.0043  0.0052  0.4517   0.5
>>>>>>  190512   219.381            1.152 25979.6  7928.8 99.88 255.2   219.328  0.0083  0.0088  0.4260   0.6
>>>>>>  190512   219.394            1.152 25978.0  7928.4 99.88 255.2   219.327  0.0017  0.0027  0.4132   0.7
>>>>>>  190512   219.319            1.151 25987.0  7931.1 99.88 255.2   219.243  0.0091  0.0097  0.3975   0.8
>>>>>>  190512   219.351            1.151 25983.1  7929.9 99.88 255.2   219.277  0.0013  0.0022  0.4027   0.9
>>>>>>



