[MITgcm-devel] REAL4_IS_SLOW is broken

Jean-Michel Campin jmc at ocean.mit.edu
Sun Oct 26 17:54:45 EDT 2008


Martin, 

I will look at those files.

For the verification, I would prefer to avoid global_ocean.90x40x15
(tested 2 times, and the only test for quasiHydrostatic & down_slope pkg).
Also, I wonder how it will behave with a different compiler on
a different platform than g77-faulks.
And for mnc, we can always add it if we want.

I propose another set of 2:
 solid-body.cs-32x32x1
 natl_box (+mnc)
 or inverted_barometer (+mnc) ?
but I would first try them with different compilers on faulks
(ifort/g77/pgi/gfortran) to see how this goes.

Cheers,
Jean-Michel

PS: I tend to prefer pure ascii text (not html) with shorter lines ...
easier to grep through.
 

On Sun, Oct 26, 2008 at 09:47:55PM +0200, Martin Losch wrote:
> Hi Jean-Michel,
>
> - with my personal copy of CPP_EEMACROS.h and CPP_EEOPTIONS.h (see
>   faulks:~mlosch/scratch/MITgcm/eesupp/inc) I get almost all
>   experiments to run (fizhi* don't work and one more, see attached
>   summary file). So a large part of the code can work with undef
>   REAL4_IS_SLOW.
>
> - Unfortunately, I could not see any speed improvement on the machine
>   I have available, so maybe the whole stuff is unnecessary, but at
>   least the flag should work, otherwise it is not worth having it at
>   all.
>
> - verification: One should use the mnc pkg, that was broken too, and
>   one should be cubed sphere, so why not global_ocean.90x40x15 and
>   adjustment.cs-32x32x1? Your suggestions don't have mnc.
>
> - did you have a look at my solutions for diags_oceanic_flux_surf.F
>   and monitor.F? Not very pretty, but works.
>
> Maybe you can have a quick look at the CPP_EEMACROS.h in the above
> directory and I will check it in, if you approve. It does not break
> the current testreports when REAL4_IS_SLOW is defined (default).
>
> Martin
>
> PS. A few things are probably still fishy, e.g., I had a model with
> _RS=real*4 blow up on me on 32 CPUs with NaNs in the middle of tiles
> (always at the same position) after the first time step. I changed
> the compiler (from ifort+intel MPI to pgf77+MPICH) and everything was
> OK. Probably only an optimization issue with ifort (I've had that
> happen often), but who knows.
>
> ----- Original Message -----
> From: Jean-Michel Campin <jmc at ocean.mit.edu>
> Date: Sunday, October 26, 2008 21:28
> Subject: Re: [MITgcm-devel] REAL4_IS_SLOW is broken
> To: MITgcm-devel at mitgcm.org
>
> > Hi Martin,
> >
> > Congratulations for getting on board to fight with this _RL / _RS
> > stuff.
> > I have 2 comments:
> > 1) a lot of recent pieces of code are all in _RL (but some parts
> > might work as well with _RS with undef REAL4_IS_SLOW). It's tricky
> > to figure out which part always needs real*8 and which part does
> > not. And in addition, assuming 1 piece of code works well with
> > real*4, if this part is not so important for the full model speed,
> > we could decide to keep it in _RL . In summary, not easy to push
> > forward.
> > 2) regarding what is there now:
> > - pkg/diagnostics: will see if something easy can be done.
> > - would be good to test this #undef REAL4_IS_SLOW ;
> >   I would try it on 1 or 2 experiments; which one ? I would avoid
> >   the tutorials and also the ones which are the only test for
> >   something we care about (e.g., lab_sea, global_ocean.cs). What
> >   about natl_box ? dome ? adjustment.cs-32x32x1 (or another one to
> >   test CS-grid) ?
> >
> > Cheers,
> > Jean-Michel
> >
> > On Sat, Oct 25, 2008 at 08:59:56PM +0200, Martin Losch wrote:
> > > Hi again,
> > >
> > > the model runs with my modifications and seems to pass the
> > > testreport (not quite done yet, but I don't see why it should
> > > fail for the last few experiments) on eddy.csail.mit.edu,
> > > however,
> > >
> > > both mnc (for grid variables in write_grid.F) and diagnostics,
> > > regardless of mnc, for the _RS variables (e.g. surface forcing)
> > > are broken if REAL4_IS_SLOW is undefined. As far as I can see,
> > > the diagnostics package is not set up for the RL/RS and always
> > > assumes real*8 variables. I can fix write_grid.F and also the
> > > monitor of the surface fields (requires mon_printstats_rl to be
> > > replaced by rs-version, not a big deal), but the diagnostics is
> > > hopeless I guess.
> > >
> > > Should I go ahead and do these changes? If so, what can be done
> > > about the diagnostics? Issue a warning, if the critical
> > > variables are active (in data.diagnostics)? I don't know how to
> > > do that.
> > >
> > > Even if the diagnostics stuff cannot be sorted out, I still
> > > think that fixing the REAL4_IS_SLOW flag for the rest of the
> > > code is a good thing, so I would opt to do the changes. What do
> > > you think?
> > >
> > > Martin
> > >
> > > On 25 Oct 2008, at 16:59, Martin Losch wrote:
> > >
> > >> Hi Chris,
> > >>
> > >> I think I found the problem: in CPP_EEMACROS.h the expansion of
> > >> the _EXCH_*RS/4 macros is independent of REAL4_IS_SLOW, that is
> > >> they always expand into RL like this:
> > >> #define _EXCH_XY_RS(a,b) CALL EXCH_XY_RL ( a, b )
> > >> #define _EXCH_XYZ_RS(a,b) CALL EXCH_XYZ_RL ( a, b )
> > >> #define _EXCH_XY_R4(a,b) CALL EXCH_XY_RL ( a, b )
> > >> #define _EXCH_XYZ_R4(a,b) CALL EXCH_XYZ_RL ( a, b )
> > >>
> > >> I fixed this by putting them into ifdefs and expanding them to
> > >> _RS in the case of REAL4_IS_SLOW. That did the trick. I will
> > >> certainly check, if this breaks the testreports, but in case it
> > >> doesn't break the tests, can I just check this in (I am asking,
> > >> because I don't feel too comfortable futzing around in this
> > >> area; maybe you or anybody else wants to have a look at my
> > >> modifications first?)
> > >>
> > >> Martin
> > >>
> > >> PS. Should we have a test that checks this functionality, or is
> > >> it not so important?
> > >>
> > >>
> > >> On 25 Oct 2008, at 07:36, Martin Losch wrote:
> > >>
> > >>> Hi Chris,
> > >>>
> > >>> after our telephone conversation yesterday, I quickly tried
> > >>> #undef REAL4_IS_SLOW, to see whether this has any effect on
> > >>> the quadcore performance. It compiles without any problems,
> > >>> but the code stops at the temperature == 0 test in
> > >>> ini_theta.F, some THETA values are zero. Actually many values
> > >>> are zero, not only temperature, because after exchange there
> > >>> are stripes of zeros along the tile edges in all grid fields
> > >>> (most of them are now real*4, but theta is still real*8).
> > >>>
> > >>> Now, I did expect this to work right away, but since you and
> > >>> Dimitris have already gone through using the _RS -> real*4
> > >>> business, maybe you remember what you did exactly, or maybe
> > >>> this is documented somewhere?
> > >>>
> > >>> Our domain here at Weizmann is a cartesian grid, so I am NOT
> > >>> using exch2, as you might have with Dimitris. Do you have any
> > >>> idea, what's going on? If it's not so complicated to do I am
> > >>> willing to fix this in the code (never good to have a feature
> > >>> that does not work, right?), otherwise we should include a
> > >>> comment in CPP_EEOPTIONS.h that this flag REAL4_IS_SLOW always
> > >>> needs to be defined.
> > >>>
> > >>> Martin
> > >>> _______________________________________________
> > >>> MITgcm-devel mailing list
> > >>> MITgcm-devel at mitgcm.org
> > >>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>
> Martin Losch
> Alfred Wegener Institute
> Postfach 120161, 27515 Bremerhaven, Germany;
> Tel./Fax: ++49(0471)4831-1872/1797
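
The macro problem described in the quoted message of 25 Oct 2008, 16:59
can be pictured with a small model. The sketch below is not the
CPP_EEMACROS.h source; it is a Python toy of the ifdef logic, under two
assumptions: that the RS exchange is wanted only when REAL4_IS_SLOW is
undefined (that is, when _RS is real*4), and that RS routines named
EXCH_XY_RS and EXCH_XYZ_RS exist, as "expanding them to _RS" suggests.
The function name exch_macro_target is made up for illustration, and the
_R4 macros are left out because the thread does not say how they were
changed.

```python
def exch_macro_target(macro: str, real4_is_slow: bool, fixed: bool = True) -> str:
    """Return the routine name a _EXCH_*_RS macro would expand to.

    macro          -- macro name such as "_EXCH_XY_RS"
    real4_is_slow  -- True if REAL4_IS_SLOW is defined (_RS is real*8)
    fixed          -- False models the reported behaviour, where the
                      expansion ignores REAL4_IS_SLOW and is always RL
    """
    if not (macro.startswith("_EXCH_") and macro.endswith("_RS")):
        raise ValueError(f"not an _EXCH_*_RS macro: {macro!r}")

    # "_EXCH_XY_RS" -> "EXCH_XY": drop the leading "_" and the "_RS" suffix.
    stem = macro[1:-3]

    # Assumed intent of the fix: use the RS routine only when _RS is
    # really real*4, i.e. when REAL4_IS_SLOW is undefined.
    suffix = "RS" if (fixed and not real4_is_slow) else "RL"
    return f"{stem}_{suffix}"


if __name__ == "__main__":
    for slow in (True, False):
        for fixed in (False, True):
            target = exch_macro_target("_EXCH_XY_RS", slow, fixed)
            print(f"REAL4_IS_SLOW defined={slow}, fixed={fixed}: CALL {target}")
```

In this model the two variants differ only when REAL4_IS_SLOW is
undefined, which fits the report that the default testreports were not
affected by the change.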

> Sat Oct 25 16:51:30 EDT 2008
> run: ./testreport
> on : Linux eddy 2.6.20-1.2320.fc5 #1 Tue Jun 12 18:50:38 EDT 2007 i686 i686 i386 GNU/Linux
> 
> No "OPTFILE" was explicitly specified by testreport,
>    so the genmake default will be used.
> 
> default 13  ----T-----  ----S-----  ----U-----  ----V-----  --PTR 01--  --PTR 02--  --PTR 03--  --PTR 04--  --PTR 05--
> G D M    c        m  s        m  s        m  s        m  s        m  s        m  s        m  s        m  s        m  s
> E p a R  g  m  m  e  .  m  m  e  .  m  m  e  .  m  m  e  .  m  m  e  .  m  m  e  .  m  m  e  .  m  m  e  .  m  m  e  .
> N n k u  2  i  a  a  d  i  a  a  d  i  a  a  d  i  a  a  d  i  a  a  d  i  a  a  d  i  a  a  d  i  a  a  d  i  a  a  d
> 2 d e n  d  n  x  n  .  n  x  n  .  n  x  n  .  n  x  n  .  n  x  n  .  n  x  n  .  n  x  n  .  n  x  n  .  n  x  n  .
>  
> Y Y Y Y> 5<16 16 16  0 22 22 22 22 22 22 22 22  5  2  5  5 FAIL  adjustment.128x64x1
> Y Y Y Y> 7<16 16 16  0 22 22 22 22  7  7  7  7  7  7  3  7 FAIL  adjustment.cs-32x32x1
> Y Y Y Y> 6<16 16 16  0 22 22 22 22  7  7  0  8  7  7  0  8 FAIL  adjustment.cs-32x32x1.nlfs
> Y Y Y Y --  9  7 10> 8< 5  5  5  5  7  8  8  9  7  8  8  9 FAIL  advect_cs
> Y Y Y Y --  6  6  6> 6<16 16 16  9 16 16 16 22 16 16 16 22 FAIL  advect_xy
> Y Y Y Y --  6  7  7> 7< 9  9 13 10 16 16 16 22 16 16 16 22 FAIL  advect_xy.ab3_c4
> Y Y Y Y --  6  7  9> 7< 5  5  5  5 16 16  0 10 22 22 22 22 FAIL  advect_xz
> Y Y Y Y --  5  6  8> 7< 5  6  8  7 16 16  0 10 22 22 22 22 FAIL  advect_xz.os7mp
> Y Y Y Y> 6< 8  9 10  9  8  9  9  9  8  8  8  9  8  8  8  9 FAIL  aim.5l_cs
> Y Y Y Y> 7<10 10 10  9  5  8  8  8  7  7  6  7  7  7  6  7 FAIL  aim.5l_cs.thSI
> Y Y Y Y> 6<11 10  9  9  8  9  8  9  8  8  8  8  8  8  6  8 FAIL  aim.5l_Equatorial_Channel
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O   aim.5l_LatLon
> Y Y Y Y> 6< 9  8  9  9 10  9 11  9  9  8  8  9  8  8  6  8 10  8  9  9  8  8  9  9 FAIL  cfc_example
> Y Y Y Y> 5<16 16 13  8 22 22 22 22  5  5  3  6  5  6  2  6 FAIL  deep_anelastic
> Y Y Y Y> 6<16 13  8 10 16 16 12  8  6  7  6  7 16  5  5  7 FAIL  dome
> Y Y Y Y> 6< 7  7  8  8  9  9 11  8  7  7  6  7  7  8  4  7 FAIL  exp2
> Y Y Y Y> 6< 6  8  8  9 10 10 11  9  7  7  5  7  8  7  0  8 FAIL  exp2.rigidLid
> Y Y Y Y> 5< 7  7  8  9 16 16 12  9  5  5  5  5  5  5  3  5 16 16 12  9 FAIL  exp4
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O   fizhi-cs-32x32x40
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O   fizhi-cs-aqualev20
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O   fizhi-gridalt-hs
> Y Y Y Y> 4< 9  9  9  9  8 10  9  9  6  6  3  7  5  5  1  6 FAIL  front_relax
> Y Y Y Y> 4<10  9  9  8 11 11 10  9  6  6  0  6  5  5  0  5 FAIL  front_relax.mxl
> Y Y Y Y> 2< 6  8  9  9  9  9 11  9  7  6  5  6  7  5  2  5 FAIL  global_ocean.90x40x15
> Y Y Y Y> 2< 6  8  9  9  9  9 11  9  7  6  5  6  7  5  2  5  8 10 11  9 FAIL  global_ocean.90x40x15.dwnslp
> Y Y Y Y> 1< 5  5  5  5  9  7  9  6  6  5  3  5  5  4  3  5 FAIL  global_ocean.cs32x15
> Y Y Y Y> 3< 7  7  5  5 10  8  8  6  6  6  3  5  5  5  3  5 FAIL  global_ocean.cs32x15.icedyn
> Y Y Y Y> 3< 5  7  5  5  8  7  8  6  6  5  3  5  5  4  3  5 FAIL  global_ocean.cs32x15.thsice
> Y Y Y Y> 1< 7  7  5  5 10  9  9  6  5  5  3  4  6  4  3  5 FAIL  global_ocean.cs32x15.viscA4
> Y Y Y Y> 7< 8  9  9  9 10 10 11  9  7  7  8  8  8  8  7  8 FAIL  global_ocean_ebm
> Y Y Y Y> 7< 7  9 10  9 10  9 11  9  7  7  7  8  7  7  7  7 FAIL  global_with_exf
> Y Y Y Y> 7< 7  9 10  9 10  9 11  9  7  8  7  8  7  7  7  7 FAIL  global_with_exf.yearly
> Y Y Y Y> 0< 0  0  2  0 22 22 22 22  0  0  0  0  0  0  0  0 FAIL  hs94.128x64x5
> Y Y Y Y> 6< 8  9 10  9 22 22 22 22  6  8  8  7  7  8  0  8 FAIL  hs94.1x64x5
> Y Y Y Y> 6< 9  9 10 10 22 22 22 22  7  7  0  9  7  7  0  8 FAIL  hs94.cs-32x32x5
> Y Y Y Y> 6<10  9 10 10 22 22 22 22  8  8  8  9  8  8  8  9 FAIL  hs94.cs-32x32x5.impIGW
> Y Y Y Y> 2< 9  8  8  8 16 16 16  0  9  7  5  6  7  6  0  6 FAIL  ideal_2D_oce
> Y Y Y Y> 1< 9  8  8  8 22 16 10  9  6  6  1  7 22 22 22 22 FAIL  internal_wave
> Y Y Y Y> 5< 9  9 13 10 16 16 16 22  5  5  0  5  5  5  0  5 FAIL  inverted_barometer
> Y Y Y Y> 0<10 12 11  0 11 13 12  0  0  0  0  0  0  0  0  0 FAIL  isomip
> Y Y Y Y> 0<10 16 12  0 12 16 13  0  0  0  0  0  0  0  0  0 FAIL  isomip.htd
> Y Y Y Y> 5< 4  4  7  6  5 11  8  6  3  5  3  5  5  5  3  5 FAIL  lab_sea
> Y Y Y Y> 3< 4  5  7  6  5 11  8  6  3  3  3  4  2  5  4  5 FAIL  lab_sea.hb87
> Y Y Y Y> 5< 4  4  7  6  5 11  8  6  5  5  4  6  6  5  4  6 FAIL  lab_sea.lsr
> Y Y Y Y> 4< 1  5  6  5  5  9  7  5  2  2  2  3  3  2  2  3 FAIL  lab_sea.salt_plume
> Y Y Y Y> 0<16 16 16 22 16 16 16 22  4  4  0  4  5  4  0  4 FAIL  matrix_example
> Y Y Y Y> 4<16 16 11 11 16 16 16 13  6  6  0  7  6  6  0  7 FAIL  MLAdjust
> Y Y Y Y> 4<16 16 11 11 16 16 13 13  6  6  0  7  6  6  0  7 FAIL  MLAdjust.0.leith
> Y Y Y Y> 4<16 16 11 11 16 16 16 13  6  6  0  7  6  6  0  7 FAIL  MLAdjust.0.leithD
> Y Y Y Y> 4<16 16 11 11 16 16 16 13  6  6  0  7  6  6  0  7 FAIL  MLAdjust.0.smag
> Y Y Y Y> 6<13 16 11  9 16 16 13 10  7  7  3  8  7  7  0  8 FAIL  MLAdjust.1.leith
> Y Y Y Y> 3< 8  3  7  7  9  6  9  7  1  1  3  2  1  0  3  2 FAIL  natl_box
> Y Y Y Y -- 16 16  8  9 16 16 11  8 22 22 22 22 22 22 22 22  0  8  0> 1<FAIL  offline_exf_seaice
> Y Y Y Y -- 16 16  8  9 16 16 11  8 22 22 22 22 22 22 22 22 16 16  3  3 16 16  3> 3<FAIL  offline_exf_seaice.seaicetd
> Y Y Y Y> 5< 9 12 11  7 16 16 14  0  3  4  0  5  5  5  0  5 FAIL  rotating_tank
> Y Y Y Y> 4<16 16  7  5 16 12  8  5  2  1  4  3 16  3  5  2 FAIL  seaice_obcs
> Y Y Y Y> 5<16 16 16  0  5  7  7  7  6  7  7  7  6  7  7  7 FAIL  solid-body.cs-32x32x1
> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O   tutorial_advection_in_gyre
> Y Y Y Y> 6<11 11 16 10 16 16 16  0  7  7  7  7  7  6  1  7 FAIL  tutorial_baroclinic_gyre
> Y Y Y Y> 4<16 16 16 22 16 16 16 22  4  4  2  5  5  5  0  5 FAIL  tutorial_barotropic_gyre
> Y Y Y Y -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 10  9  9> 9<10 10  9  9 FAIL  tutorial_cfc_offline
> Y Y Y Y> 7<11 12 13  9 16 16 16 22  8  8  4  8  8  8  4  8 FAIL  tutorial_deep_convection
> Y Y Y Y> 7< 8  8  9 10 10  9 11  9  9  9  8  8  8  8  6  8 12 12 11  9 12 11 11  9 10 12  9  9  8 10  9 10 10  8 10  9 FAIL  tutorial_global_oce_biogeo
> Y Y Y Y> 2< 7  8  7  8  8  9 10  8  5  5  5  6  6  5  4  7 FAIL  tutorial_global_oce_in_p
> Y Y Y Y> 4< 7  8  9  9  8  9 11  8  5  5  4  6  5  5  4  6  5  8 10  8 FAIL  tutorial_global_oce_latlon
> Y Y Y Y> 6<10  9 11 10 22 22 22 22  7  8  8  9  8  8  8  9 FAIL  tutorial_held_suarez_cs
> Y Y Y Y> 5< 5  9  9  8 16 16 16  0  6  6  4  7 22 22 22 22 FAIL  tutorial_plume_on_slope
> Y Y Y Y> 6<12 10 11 10 16 16 16  0  6  7  6  7  7  6  7  7 FAIL  vermix
> Y Y Y Y> 5<12 13 11  9 16 16 16  0  6  8  6  7  7  6  7  8 FAIL  vermix.ggl90
> Y Y Y Y> 6<12 13 11 10 16 16 16  0  6  7  6  7  7  7  7  7 FAIL  vermix.my82
> Y Y Y Y> 5<12 13 11 10 16 16 16  0  6  7  6  7  7  6  7  7 FAIL  vermix.opps
> Y Y Y Y> 6<12 12 11 11 16 16 16  0  6  7  6  7  7  7  7  7 FAIL  vermix.pp81
> Start time:  Sat Oct 25 16:51:30 EDT 2008
> End time:    Sat Oct 25 20:33:18 EDT 2008
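
To get a quick count of how many experiments ran and how many did not,
the summary lines above could be tallied with a short script. This is
only a sketch, assuming the layout seen in this attachment: four Y/N
flags (which the vertical header labels Gen2, Dpnd, Make, Run), then the
comparison columns, then a status word (FAIL or N/O) and the experiment
name. The function names are made up for illustration and are not part
of testreport.

```python
import re
from collections import Counter

# Four leading Y/N flags: Gen2, Dpnd, Make, Run (read from the header).
_FLAGS_RE = re.compile(r"^([YN]) ([YN]) ([YN]) ([YN])")
# Status word followed by the experiment name at the end of the line.
# search() is used because the status can be fused to the previous
# column, as in "> 1<FAIL  offline_exf_seaice".
_STATUS_RE = re.compile(r"(FAIL|N/O)\s+(\S+)\s*$")


def parse_summary_line(line):
    """Return (flags, status, experiment), or None for non-result lines."""
    text = line.lstrip("> ").rstrip()  # drop the mail quote prefix
    flags = _FLAGS_RE.match(text)
    status = _STATUS_RE.search(text)
    if not (flags and status):
        return None
    return flags.groups(), status.group(1), status.group(2)


def tally(lines):
    """Count result lines per status word."""
    counts = Counter()
    for line in lines:
        parsed = parse_summary_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "> Y Y Y Y> 3< 8  3  7  7  9  6  9  7  1  1  3  2  1  0  3  2 FAIL  natl_box",
        "> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O   fizhi-gridalt-hs",
        "> Start time:  Sat Oct 25 16:51:30 EDT 2008",
    ]
    print(tally(sample))
```

Lines that do not start with four Y/N flags (the header, the start and
end times) are skipped, so the whole attachment can be fed in unchanged.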





