[MITgcm-devel] advection routines: strange results in flow trace analysis

Martin Losch Martin.Losch at awi.de
Thu Apr 10 03:13:27 EDT 2008


Me again,

can I still use #define MULTIDIM_OLD_VERSION in gad_advection.F?
Will that fix my performance problem, and at what cost?

I guess I never realized (and probably never will) how many
complications arise with the cubed sphere configuration.

Martin

On 9 Apr 2008, at 18:51, Martin.Losch at awi.de wrote:
> OK, I can see now where this comes from:
> C-    CubedSphere : pass 3 times, with partial update of local tracer field
>        IF (ipass.EQ.1) THEN
>         overlapOnly  = MOD(nCFace,3).EQ.0
>         interiorOnly = MOD(nCFace,3).NE.0
>         calc_fluxes_X = nCFace.EQ.6 .OR. nCFace.EQ.1 .OR. nCFace.EQ.2
>         calc_fluxes_Y = nCFace.EQ.3 .OR. nCFace.EQ.4 .OR. nCFace.EQ.5
>        ELSEIF (ipass.EQ.2) THEN
>         overlapOnly  = MOD(nCFace,3).EQ.2
>         interiorOnly = MOD(nCFace,3).EQ.1
>         calc_fluxes_X = nCFace.EQ.2 .OR. nCFace.EQ.3 .OR. nCFace.EQ.4
>         calc_fluxes_Y = nCFace.EQ.5 .OR. nCFace.EQ.6 .OR. nCFace.EQ.1
>        ELSE
>         interiorOnly = .TRUE.
>         calc_fluxes_X = nCFace.EQ.5 .OR. nCFace.EQ.6
>         calc_fluxes_Y = nCFace.EQ.2 .OR. nCFace.EQ.3
>        ENDIF
>
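To see the asymmetry at a glance, the IF block quoted above can be tallied per face. This is only a sketch: the function name and the tally are mine, the overlapOnly/interiorOnly flags are ignored, and it counts flux calculations only, not their actual cost.

```python
from collections import Counter

FACES = range(1, 7)

def fluxes_for(ipass, face):
    """Return (calc_fluxes_X, calc_fluxes_Y) for one pass on one face,
    transcribed from the quoted IF block (nCFace -> face)."""
    if ipass == 1:
        return face in (6, 1, 2), face in (3, 4, 5)
    if ipass == 2:
        return face in (2, 3, 4), face in (5, 6, 1)
    return face in (5, 6), face in (2, 3)

x_calls = Counter()
y_calls = Counter()
idle_in_pass = {}  # faces with no flux calculation at all in a given pass
for ipass in (1, 2, 3):
    idle_in_pass[ipass] = []
    for face in FACES:
        do_x, do_y = fluxes_for(ipass, face)
        x_calls[face] += int(do_x)
        y_calls[face] += int(do_y)
        if not (do_x or do_y):
            idle_in_pass[ipass].append(face)

print("face  x  y  total")
for face in FACES:
    print(f"{face:4d} {x_calls[face]:2d} {y_calls[face]:2d} {x_calls[face] + y_calls[face]:6d}")
print("faces without flux calls per pass:", idle_in_pass)
```

Faces 2 and 6 get two x sweeps, faces 3 and 5 get two y sweeps, and faces 1 and 4 get one of each; passes 1 and 2 keep every face busy, while faces 1 and 4 have no flux calculation in pass 3. If the 24 processes are mapped four per face in consecutive order (an assumption, not stated in the thread), this matches the trace: adv_x is doubled on processes 4-7 and 20-23, adv_y on processes 8-11 and 16-19.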
>
> I assume that this is the minimum possible number of calls to
> gad_${advscheme}_adv_x/y? Why is it not symmetric for all faces? I
> wonder whether the load imbalance on the CPUs (because of waiting in
> the exchange routines) is more severe than the cost of calling
> gad_${advscheme}_adv_x/y for two more faces, so that the load is
> nearly the same for all faces. Currently the four
> exch2_send/recv_rl1/2 routines take up over 20% of the total time
> (mostly because they wait).
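A rough way to put numbers on this trade-off, using only figures from the flow trace (about 194400 calls per CPU for one x or y sweep, at 1.089 ms and 1.042 ms per call) and the sweep counts per face implied by the quoted IF block. The model is my own simplification: four CPUs per face, the slowest face sets the pace, and an extra sweep on faces 1 and 4 costs the same as an ordinary x sweep. All names are illustrative.

```python
# Rough cost model; numbers are taken from the flow trace quoted in this thread.
CALLS_PER_SWEEP = 194400                 # FREQUENCY of one x or y sweep on one CPU
MS_PER_CALL = {"x": 1.089, "y": 1.042}   # AVER.TIME [msec]
CPUS_PER_FACE = 4                        # assumption: 24 CPUs spread evenly over 6 faces

# Sweeps per face over the three passes, read off the quoted IF block.
current = {1: {"x": 1, "y": 1}, 2: {"x": 2, "y": 1}, 3: {"x": 1, "y": 2},
           4: {"x": 1, "y": 1}, 5: {"x": 1, "y": 2}, 6: {"x": 2, "y": 1}}

def seconds_per_cpu(sweeps):
    """Seconds one CPU spends in adv_x/y for the given sweep counts."""
    return sum(CALLS_PER_SWEEP * n * MS_PER_CALL[d] for d, n in sweeps.items()) / 1000.0

def summarize(plan):
    """Return (slowest face time, total busy CPU seconds, total idle CPU seconds)."""
    per_face = {face: seconds_per_cpu(s) for face, s in plan.items()}
    slowest = max(per_face.values())
    busy = CPUS_PER_FACE * sum(per_face.values())
    idle = CPUS_PER_FACE * sum(slowest - t for t in per_face.values())
    return slowest, busy, idle

# Hypothetical balanced plan: one extra x sweep on faces 1 and 4.
balanced = {face: dict(s) for face, s in current.items()}
for face in (1, 4):
    balanced[face]["x"] += 1

for name, plan in (("current", current), ("balanced", balanced)):
    slowest, busy, idle = summarize(plan)
    print(f"{name:9s} slowest={slowest:7.1f}s busy={busy:8.1f}s idle={idle:7.1f}s")
```

In this model the slowest face stays at about 626 s either way, so the two extra sweeps would turn roughly 1700 CPU seconds of waiting into computing without shortening the critical path by themselves. Whether less waiting inside the exchange routines helps in other ways is beyond what this estimate can show.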
>
> Martin
>
> ----- Original Message -----
> From: Jean-Michel Campin <jmc at ocean.mit.edu>
> Date: Wednesday, April 9, 2008 6:30 pm
> Subject: Re: [MITgcm-devel] advection routines: strange results in  
> flow trace analysis
>
>> Hi Martin,
>>
>> MultiDim advection on CS-grid has special stuff depending on
>> which face is computed (2 or 3 calls to advection S/R, npass=3);
>> It's very likely that it comes from there.
>>
>> Jean-Michel
>>
>> On Wed, Apr 09, 2008 at 04:29:44PM +0200, Martin Losch wrote:
>>> Hi there,
>>>
>>> I have an unexpected result in flow trace analysis (see below). I am
>>> running the high-resolution cubed sphere configuration (CS510) with 16
>>> passive tracers on 24 CPUs of an SX8-R; my advection scheme is 7 (os7mp)
>>> for the tracers and 33 (dst3fl) for the seaice variables. As expected,
>>> the advection routines use most of the time (18 tracers). The flow
>>> trace analysis below gives the cumulative/average values in the first
>>> line, and then the values for the individual processes in the following
>>> 24 lines. However, if you look closely, you'll see that on some (8)
>>> CPUs the advection routine is called twice as often as on the
>>> remaining 16 CPUs. This is true both for gad_os7mp_adv_x/y (which is
>>> called from gad_advection in this case) and gad_dst3fl_adv_x/y (which
>>> is called from seaice_advection in this case, not shown). We (Jens-Olaf
>>> and I) suspect that this imbalance is responsible for the
>>> terrible performance of the exch2 routines in this run that Chris and
>>> I talked about in February, because 16 CPUs have to wait for 8 all
>>> the time in the exchange routines.
>>>
>>> All other routines seem to be called with the same frequency on all
>>> CPUs.
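The call counts in the trace below can be cross-checked with a few lines of arithmetic. This sketch uses only the FREQUENCY values reported for gad_os7mp_adv_x/y/r; the variable names are mine.

```python
N_CPUS = 24

# FREQUENCY values copied from the flow trace below.
base, doubled = 194400, 388800          # per-CPU counts seen for adv_x and adv_y
total_xy = 6220800                      # cumulative count for adv_x (same for adv_y)
per_cpu_r, total_r = 190512, 4572288    # adv_r: identical on every CPU

# 16 CPUs at the base count plus 8 CPUs at twice that reproduce the total.
assert 16 * base + 8 * doubled == total_xy
# The vertical routine is called equally often everywhere.
assert N_CPUS * per_cpu_r == total_r

mean_xy = total_xy / N_CPUS
print("mean calls per CPU:", mean_xy)
print("busiest / mean:    ", doubled / mean_xy)
print("lightest / mean:   ", base / mean_xy)
```

Note that the CPUs with doubled adv_x counts (0.4 to 0.7 and 0.20 to 0.23) are not the ones with doubled adv_y counts (0.8 to 0.11 and 0.16 to 0.19).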
>>>
>>> What is the explanation for this?
>>>
>>> Martin
>>>
>>>
>>>
>>>> *--------------------------*
>>>>  FLOW TRACE ANALYSIS LIST
>>>> *--------------------------*
>>>>
>>>> Execution : Wed Apr  9 10:54:36 2008
>>>> Total CPU : 13:40'31"663
>>>>
>>>>
>>>> FREQUENCY  EXCLUSIVE       AVER.TIME    MOPS  MFLOPS V.OP  AVER. VECTOR I-CACHE O-CACHE    BANK  PROG.UNIT
>>>>           TIME[sec](  % )    [msec]                 RATIO V.LEN TIME   MISS    MISS      CONF
>>>>
>>>>  6220800  6776.032( 13.8)     1.089 25809.1  7541.0 99.88 256.0 6774.327  0.1500  0.1642 10.0670  gad_os7mp_adv_x
>>>>   194400   211.705            1.089 25814.8  7542.6 99.88 256.0  211.646  0.0084  0.0104  0.2688   0.0
>>>>   194400   211.692            1.089 25816.3  7543.1 99.88 256.0  211.641  0.0013  0.0027  0.2656   0.1
>>>>   194400   211.890            1.090 25792.1  7536.0 99.88 256.0  211.838  0.0014  0.0021  0.4195   0.10
>>>>   194400   211.907            1.090 25790.2  7535.4 99.88 256.0  211.852  0.0020  0.0024  0.4203   0.11
>>>>   194400   211.706            1.089 25814.6  7542.5 99.88 256.0  211.648  0.0059  0.0064  0.2785   0.12
>>>>   194400   211.698            1.089 25815.6  7542.8 99.88 256.0  211.644  0.0011  0.0019  0.2743   0.13
>>>>   194400   211.720            1.089 25812.8  7542.0 99.88 256.0  211.654  0.0171  0.0173  0.2838   0.14
>>>>   194400   211.713            1.089 25813.8  7542.3 99.88 256.0  211.658  0.0034  0.0041  0.2903   0.15
>>>>   194400   211.673            1.089 25818.7  7543.8 99.88 256.0  211.615  0.0096  0.0076  0.2493   0.16
>>>>   194400   211.681            1.089 25817.7  7543.5 99.88 256.0  211.613  0.0181  0.0169  0.2498   0.17
>>>>   194400   211.645            1.089 25822.0  7544.7 99.88 256.0  211.590  0.0041  0.0044  0.2292   0.18
>>>>   194400   211.650            1.089 25821.4  7544.6 99.88 256.0  211.596  0.0042  0.0045  0.2281   0.19
>>>>   194400   211.684            1.089 25817.3  7543.4 99.88 256.0  211.628  0.0050  0.0063  0.2656   0.2
>>>>   388800   423.306            1.089 25821.1  7544.4 99.88 256.0  423.206  0.0061  0.0076  0.4798   0.20
>>>>   388800   423.311            1.089 25820.8  7544.4 99.88 256.0  423.208  0.0057  0.0064  0.4736   0.21
>>>>   388800   423.306            1.089 25821.1  7544.4 99.88 256.0  423.204  0.0024  0.0031  0.4841   0.22
>>>>   388800   423.321            1.089 25820.2  7544.2 99.88 256.0  423.218  0.0017  0.0024  0.4834   0.23
>>>>   194400   211.678            1.089 25818.0  7543.5 99.88 256.0  211.625  0.0101  0.0112  0.2526   0.3
>>>>   388800   423.756            1.090 25793.7  7536.4 99.88 256.0  423.648  0.0025  0.0048  0.8570   0.4
>>>>   388800   423.705            1.090 25796.7  7537.3 99.88 256.0  423.595  0.0079  0.0095  0.8142   0.5
>>>>   388800   423.733            1.090 25795.1  7536.8 99.88 256.0  423.660  0.0159  0.0174  0.8413   0.6
>>>>   388800   423.742            1.090 25794.5  7536.7 99.88 256.0  423.647  0.0024  0.0039  0.8203   0.7
>>>>   194400   211.906            1.090 25790.2  7535.4 99.88 256.0  211.845  0.0124  0.0089  0.4195   0.8
>>>>   194400   211.904            1.090 25790.4  7535.5 99.88 256.0  211.849  0.0012  0.0019  0.4183   0.9
>>>>  6220800  6482.742( 13.2)     1.042 27041.7  7882.1 99.88 256.0 6480.721  0.5066  0.1471  7.7018  gad_os7mp_adv_y
>>>>   194400   202.452            1.041 27059.5  7887.3 99.88 256.0  202.387  0.0137  0.0075  0.2022   0.0
>>>>   194400   202.439            1.041 27061.3  7887.8 99.88 256.0  202.380  0.0073  0.0014  0.1965   0.1
>>>>   388800   405.687            1.043 27007.3  7872.1 99.88 256.0  405.582  0.0103  0.0025  0.6622   0.10
>>>>   388800   405.711            1.043 27005.7  7871.6 99.88 256.0  405.568  0.0446  0.0029  0.6483   0.11
>>>>   194400   202.487            1.042 27054.9  7886.0 99.88 256.0  202.422  0.0128  0.0065  0.2192   0.12
>>>>   194400   202.461            1.041 27058.3  7887.0 99.88 256.0  202.401  0.0084  0.0013  0.2061   0.13
>>>>   194400   202.497            1.042 27053.5  7885.6 99.88 256.0  202.425  0.0237  0.0147  0.2160   0.14
>>>>   194400   202.519            1.042 27050.6  7884.7 99.88 256.0  202.423  0.0424  0.0033  0.2155   0.15
>>>>   388800   404.770            1.041 27068.5  7889.9 99.88 256.0  404.664  0.0198  0.0146  0.3562   0.16
>>>>   388800   404.766            1.041 27068.7  7890.0 99.88 256.0  404.653  0.0310  0.0243  0.3585   0.17
>>>>   388800   404.796            1.041 27066.7  7889.4 99.88 256.0  404.692  0.0125  0.0057  0.3540   0.18
>>>>   388800   404.830            1.041 27064.4  7888.8 99.88 256.0  404.696  0.0435  0.0060  0.3553   0.19
>>>>   194400   202.459            1.041 27058.6  7887.1 99.88 256.0  202.396  0.0111  0.0046  0.1934   0.2
>>>>   194400   202.395            1.041 27067.1  7889.5 99.88 256.0  202.338  0.0094  0.0032  0.1811   0.20
>>>>   194400   202.400            1.041 27066.4  7889.3 99.88 256.0  202.342  0.0092  0.0030  0.1827   0.21
>>>>   194400   202.406            1.041 27065.6  7889.1 99.88 256.0  202.347  0.0074  0.0012  0.1768   0.22
>>>>   194400   202.438            1.041 27061.4  7887.9 99.88 256.0  202.347  0.0376  0.0008  0.1773   0.23
>>>>   194400   202.473            1.042 27056.8  7886.5 99.88 256.0  202.392  0.0463  0.0091  0.1846   0.3
>>>>   194400   202.854            1.043 27005.8  7871.7 99.88 256.0  202.791  0.0090  0.0011  0.3488   0.4
>>>>   194400   202.871            1.044 27003.6  7871.0 99.88 256.0  202.806  0.0125  0.0041  0.3407   0.5
>>>>   194400   202.796            1.043 27013.6  7873.9 99.88 256.0  202.748  0.0174  0.0081  0.3077   0.6
>>>>   194400   202.880            1.044 27002.5  7870.7 99.88 256.0  202.788  0.0424  0.0012  0.3204   0.7
>>>>   388800   405.695            1.043 27006.7  7871.9 99.88 256.0  405.581  0.0246  0.0181  0.6572   0.8
>>>>   388800   405.660            1.043 27009.1  7872.6 99.88 256.0  405.551  0.0097  0.0021  0.6412   0.9
>>>>  4572288  5253.314( 10.7)     1.149 26038.1  7946.7 99.88 255.2 5251.591  0.1115  0.1259  8.4362  gad_os7mp_adv_r
>>>>   190512   218.247            1.146 26114.6  7970.0 99.88 255.2  218.173  0.0070  0.0075  0.2710   0.0
>>>>   190512   218.203            1.145 26119.9  7971.6 99.88 255.2  218.134  0.0012  0.0023  0.2445   0.1
>>>>   190512   219.379            1.152 25979.8  7928.9 99.88 255.2  219.309  0.0017  0.0023  0.4233   0.10
>>>>   190512   219.377            1.152 25980.0  7929.0 99.88 255.2  219.305  0.0019  0.0021  0.4169   0.11
>>>>   190512   218.427            1.147 26093.0  7963.4 99.88 255.2  218.354  0.0051  0.0057  0.2925   0.12
>>>>   190512   218.376            1.146 26099.1  7965.3 99.88 255.2  218.305  0.0010  0.0017  0.2655   0.13
>>>>   190512   218.430            1.147 26092.7  7963.4 99.88 255.2  218.352  0.0144  0.0143  0.2949   0.14
>>>>   190512   218.424            1.147 26093.4  7963.6 99.88 255.2  218.353  0.0029  0.0035  0.2948   0.15
>>>>   190512   218.846            1.149 26043.1  7948.2 99.88 255.2  218.771  0.0073  0.0077  0.3413   0.16
>>>>   190512   218.903            1.149 26036.3  7946.1 99.88 255.2  218.826  0.0131  0.0131  0.3785   0.17
>>>>   190512   218.789            1.148 26049.9  7950.3 99.88 255.2  218.716  0.0035  0.0039  0.3222   0.18
>>>>   190512   218.737            1.148 26056.1  7952.2 99.88 255.2  218.665  0.0036  0.0037  0.3009   0.19
>>>>   190512   218.213            1.145 26118.6  7971.3 99.88 255.2  218.141  0.0043  0.0052  0.2531   0.2
>>>>   190512   219.118            1.150 26010.8  7938.3 99.88 255.2  219.046  0.0034  0.0044  0.3932   0.20
>>>>   190512   219.104            1.150 26012.3  7938.8 99.88 255.2  219.032  0.0031  0.0039  0.3778   0.21
>>>>   190512   219.107            1.150 26012.0  7938.7 99.88 255.2  219.034  0.0018  0.0023  0.3809   0.22
>>>>   190512   219.113            1.150 26011.3  7938.5 99.88 255.2  219.040  0.0011  0.0014  0.3814   0.23
>>>>   190512   218.109            1.145 26131.1  7975.1 99.88 255.2  218.042  0.0086  0.0092  0.2149   0.3
>>>>   190512   219.504            1.152 25965.0  7924.4 99.88 255.2  219.429  0.0017  0.0031  0.4977   0.4
>>>>   190512   219.464            1.152 25969.7  7925.8 99.88 255.2  219.390  0.0043  0.0052  0.4517   0.5
>>>>   190512   219.381            1.152 25979.6  7928.8 99.88 255.2  219.328  0.0083  0.0088  0.4260   0.6
>>>>   190512   219.394            1.152 25978.0  7928.4 99.88 255.2  219.327  0.0017  0.0027  0.4132   0.7
>>>>   190512   219.319            1.151 25987.0  7931.1 99.88 255.2  219.243  0.0091  0.0097  0.3975   0.8
>>>>   190512   219.351            1.151 25983.1  7929.9 99.88 255.2  219.277  0.0013  0.0022  0.4027   0.9
>>>>
>>> _______________________________________________
>>> MITgcm-devel mailing list
>>> MITgcm-devel at mitgcm.org
>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>>
