[MITgcm-support] Baroclinic instability with MPI run

Jean-Michel Campin jmc at ocean.mit.edu
Thu Jan 22 09:09:35 EST 2015


Hi Noriyuki,

Which optfile are you compiling with?

Otherwise, a few other things here:

1) Although I asked Chris for a full report about the set-up in order to reproduce it
 (easy, since I have access to the same computer), to my knowledge
 the "Independ Tiling" problem has never been reproducible.

2) One potential problem could be compiler optimisation. 
 To clarify this point, you could:
a) with the same compiler, MPI library, and optfile, try to run a few simple
 verification experiments (e.g., exp4) and compare the output
 with the reference output (e.g., exp4/results/output.txt).
 There is a script (verification/testreport) that does that for all
 or a subset of the experiments and is not too difficult to use
 ("testreport -h" gives the list of options); see the command sketch below, after b).
b) you could try to lower the level of compiler optimisation.
 The default is "-O2" (from the linux_amd64_ifort11 optfile); you could try
 "-O1" (it will run slower) and "-O0" (even slower), as sketched below.
 If "-O0" fixes the problem, then we should try to find which
 routine causes the problem and compile just that one with "-O0"
 (since "-O0" for all the source code is far too slow).
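
As a concrete sketch of a) and b) (the directory layout, the optfile copy name
 "my_ifort11_lowOpt" and the "-mods ../code" argument below are just examples;
 adjust the paths to your own set-up):

   # a) from the top of the MITgcm tree: build, run and compare one
   #    verification experiment against its reference output
   cd verification
   ./testreport -t exp4 -of ../tools/build_options/linux_amd64_ifort11

   # b) from the top of the MITgcm tree: copy the optfile, change "-O2" in
   #    its FOPTIM line to "-O1" or "-O0", then rebuild from scratch
   cp tools/build_options/linux_amd64_ifort11 my_ifort11_lowOpt
   cd build
   ../tools/genmake2 -mpi -mods ../code -of ../my_ifort11_lowOpt
   make depend
   make

For each experiment, testreport reports how many digits of the monitor
 statistics match the reference output, so a clear mismatch with the
 standard "-O2" build already points towards a compiler/optimisation issue.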

3) Another source of problem could be the code itself. This is not
 very likely with most standard options and pkgs (since they are
 tested on a regular basis) but can definitely happen.
a) you can check whether it is due to a tiling problem or an MPI problem
 simply by running with the same sNx but a smaller nPx and a larger nSx
 (to maintain the same number of tiles = nSx*nPx).
 If you compile with "#define GLOBAL_SUM_SEND_RECV" in CPP_EEOPTIONS.h
 (slower, but it makes the "global-sum" results independent of the number
 of processors while still dependent on the domain tiling) and run the
 two cases (with different nPx), you can expect to get the same results;
 see the SIZE.h sketch below, after b).
b) if none of the previous suggestions helps, you could provide a
 copy of your set-up (checkpoint64u is fairly recent) so that we can
 try to reproduce it. You could start with your customized code dir
 and set of parameter files (data*).
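
To make a) above more concrete, here is a sketch of the SIZE.h settings for
 two runs that keep the same domain tiling but use a different number of
 processes (all the numbers, sNx=16, sNy=50, Nr=30, etc., are just examples;
 keep your own values and only trade nPx against nSx):

C-- SIZE.h, case 1: 16 processes in x, 1 tile per process (nSx*nPx = 16)
      PARAMETER (
     &           sNx =  16,
     &           sNy =  50,
     &           OLx =   3,
     &           OLy =   3,
     &           nSx =   1,
     &           nSy =   1,
     &           nPx =  16,
     &           nPy =   4,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =  30)

C-- SIZE.h, case 2: same sNx, but 4 processes in x and 4 tiles per process
C   (still nSx*nPx = 16, so the domain tiling is unchanged); only these
C   two lines change:
     &           nSx =   4,
     &           nPx =   4,

and, in CPP_EEOPTIONS.h, switch

#undef  GLOBAL_SUM_SEND_RECV

 to

#define GLOBAL_SUM_SEND_RECV

before recompiling both executables. As said above, the global sums are then
 independent of the number of processes but still depend on the tiling, which
 is identical in the two cases, so the two runs should give the same results.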

Cheers,
Jean-Michel

On Thu, Jan 22, 2015 at 09:02:18PM +0900, Noriyuki Yamamoto wrote:
> Hi all,
> 
> I'm running into a problem with an MPI run.
> Outputs from the MPI and non-MPI runs differ qualitatively.
> 
> This seems to be the same problem reported in the "Independ Tiling"
> thread (http://forge.csail.mit.edu/pipermail/mitgcm-support/2014-March/009017.html).
> Is there any progress on it?
> If not, I hope this information will add some clues for fixing it.
> 
> The model is a zonally periodic channel forced by a westerly
> wind and by temperature restoring along the northern and southern walls
> at mid-to-high latitudes.
> I tested some cases with different topography.
> In the flat-bottomed case without MPI, baroclinic instability develops
> and cascades up to larger scales.
> But with MPI, the zonal wavenumber of the baroclinic instability stays
> locked to nPx during the 3000-day integration (I tested nPx = 16, 20)
> and doesn't cascade up.
> In the case with zonally wavy topography (a sinusoidal wave whose
> wavenumber k is not nPx), run with MPI using the same MPI executable
> (compiled with genmake -mpi) as the flat-bottomed case, the baroclinic
> instability cascades up and the result seems similar to that of the
> non-MPI run, though I checked only the early surface temperature
> distribution.
> 
> In the MPI runs I tried two decompositions: (nPx, nPy) = (16, 4) and (20, 4).
> I compiled the MITgcm code with the Intel compiler 13.1.3 and the Cray MPI
> library 6.3.0 on SUSE Linux Enterprise Server 11 (x86_64).
> The MITgcm version is checkpoint64u (sorry that it's not the latest version).
> If necessary, I will attach data and SIZE.h files later.
> 
> Sorry for my poor English.
> Noriyuki.
> 
> -- 
> Noriyuki Yamamoto
> PhD Student - Physical Oceanography Group
> Division of Earth and Planetary Sciences,
> Graduate School of Science, Kyoto University.
> Mail:nymmto at kugi.kyoto-u.ac.jp
> Tel:+81-75-753-3924
> 
> 


