[MITgcm-devel] REAL4_IS_SLOW is broken
Jean-Michel Campin
jmc at ocean.mit.edu
Sun Oct 26 17:54:45 EDT 2008
Martin,
I will look at those files.
For the verification, I would prefer to avoid global_ocean.90x40x15
(tested 2 times, and the only test for quasiHydrostatic & down_slope pkg).
Also, I wonder how it will behave with a different compiler on
a different platform than g77-faulks.
And for mnc, we can always add it if we want.
I propose another set of 2:
solid-body.cs-32x32x1
natl_box (+mnc)
or inverted_barometer (+mnc) ?
but I would first try with different compilers on faulks (ifort/g77/pgi/gfortran)
to see how this goes.
Cheers,
Jean-Michel
PS: I tend to prefer pure ASCII text (not HTML) with shorter lines ...
easier to grep through.
On Sun, Oct 26, 2008 at 09:47:55PM +0200, Martin Losch wrote:
> Hi Jean-Michel,
>
> - With my personal copy of CPP_EEMACROS.h and CPP_EEOPTIONS.h (see
>   faulks:~mlosch/scratch/MITgcm/eesupp/inc) I get almost all experiments
>   to run (fizhi* don't work, and one more; see attached summary file).
>   So a large part of the code can work with undef REAL4_IS_SLOW.
>
> - Unfortunately, I could not see any speed improvement on the machine I
>   have available, so maybe the whole stuff is unnecessary, but at least
>   the flag should work; otherwise it is not worth having it at all.
>
> - Verification: one should use the mnc pkg, which was broken too, and one
>   should be cubed sphere, so why not global_ocean.90x40x15 and
>   adjustment.cs-32x32x1? Your suggestions don't have mnc.
>
> - Did you have a look at my solutions for diags_oceanic_flux_surf.F and
>   monitor.F? Not very pretty, but they work.
>
> Maybe you can have a quick look at the CPP_EEMACROS.h in the above
> directory, and I will check it in if you approve. It does not break the
> current testreports when REAL4_IS_SLOW is defined (default).
>
> Martin
>
> PS. A few things are probably still fishy; e.g., I had a model with
> _RS=real*4 blow up on me on 32 CPUs with NaNs in the middle of tiles
> (always at the same position) after the first time step. I changed the
> compiler (from ifort + Intel MPI to pgf77 + MPICH) and everything was OK.
> Probably only an optimization issue with ifort (I've had that happen
> often), but who knows.
>
> ----- Original Message -----
> From: Jean-Michel Campin <jmc at ocean.mit.edu>
> Date: Sunday, October 26, 2008 21:28
> Subject: Re: [MITgcm-devel] REAL4_IS_SLOW is broken
> To: MITgcm-devel at mitgcm.org
>
> > Hi Martin,
> >
> > Congratulations on getting on board to fight with this _RL / _RS stuff.
> > I have 2 comments:
> > 1) A lot of recent pieces of code are all in _RL (but some parts might
> >    work as well with _RS with undef REAL4_IS_SLOW). It's tricky
> >    to figure out which part always needs real*8 and which part does not.
> >    In addition, even assuming 1 piece of code works well with real*4,
> >    if this part is not so important for the full model speed, we could
> >    decide to keep it in _RL. In summary, not easy to push forward.
> > 2) Regarding what is there now:
> >    - pkg/diagnostics: will see if something easy can be done.
> >    - It would be good to test this #undef REAL4_IS_SLOW;
> >      I would try it on 1 or 2 experiments; which ones? I would avoid
> >      the tutorials and also the ones which are the only test for
> >      something we care about (e.g., lab_sea, global_ocean.cs).
> >      What about natl_box? dome? adjustment.cs-32x32x1 (or another
> >      one to test the CS-grid)?
> >
> > Cheers,
> > Jean-Michel
> >
> > On Sat, Oct 25, 2008 at 08:59:56PM +0200, Martin Losch wrote:
> > > Hi again,
> > >
> > > the model runs with my modifications and seems to pass the testreport
> > > (not quite done yet, but I don't see why it should fail for the last
> > > few experiments) on eddy.csail.mit.edu. However,
> > >
> > > both mnc (for grid variables in write_grid.F) and diagnostics,
> > > regardless of mnc, for the _RS variables (e.g. surface forcing) are
> > > broken if REAL4_IS_SLOW is undefined. As far as I can see, the
> > > diagnostics package is not set up for RL/RS and always assumes
> > > real*8 variables. I can fix write_grid.F and also the monitor of the
> > > surface fields (requires mon_printstats_rl to be replaced by the
> > > rs-version, not a big deal), but the diagnostics is hopeless, I guess.
> > >
> > > Should I go ahead and make these changes? If so, what can be done
> > > about the diagnostics? Issue a warning if the critical variables are
> > > active (in data.diagnostics)? I don't know how to do that.
> > >
> > > Even if the diagnostics stuff cannot be sorted out, I still think
> > > that fixing the REAL4_IS_SLOW flag for the rest of the code is a good
> > > thing, so I would opt to make the changes. What do you think?
> > >
> > > Martin
> > >
> > > On 25 Oct 2008, at 16:59, Martin Losch wrote:
> > >
> > >> Hi Chris,
> > >>
> > >> I think I found the problem: in CPP_EEMACROS.h the expansion of the
> > >> _EXCH_*RS/4 macros is independent of REAL4_IS_SLOW, that is, they
> > >> always expand into RL like this:
> > >> #define _EXCH_XY_RS(a,b) CALL EXCH_XY_RL ( a, b )
> > >> #define _EXCH_XYZ_RS(a,b) CALL EXCH_XYZ_RL ( a, b )
> > >> #define _EXCH_XY_R4(a,b) CALL EXCH_XY_RL ( a, b )
> > >> #define _EXCH_XYZ_R4(a,b) CALL EXCH_XYZ_RL ( a, b )
> > >>
> > >> I fixed this by putting them into ifdefs and expanding them to _RS
> > >> in the case of REAL4_IS_SLOW. That did the trick. I will certainly
> > >> check whether this breaks the testreports, but in case it doesn't
> > >> break the tests, can I just check this in? (I am asking because I
> > >> don't feel too comfortable futzing around in this area; maybe you or
> > >> anybody else wants to have a look at my modifications first?)
> > >>
> > >> Martin
> > >>
> > >> PS. Should we have a test that checks this functionality, or is it
> > >> not so important?
> > >>
> > >> On 25 Oct 2008, at 07:36, Martin Losch wrote:
> > >>
> > >>> Hi Chris,
> > >>>
> > >>> after our telephone conversation yesterday, I quickly tried #undef
> > >>> REAL4_IS_SLOW to see whether this has any effect on the quadcore
> > >>> performance. It compiles without any problems, but the code stops
> > >>> at the temperature == 0 test in ini_theta.F: some THETA values are
> > >>> zero. Actually many values are zero, not only temperature, because
> > >>> after exchange there are stripes of zeros along the tile edges in
> > >>> all grid fields (most of them are now real*4, but theta is still
> > >>> real*8).
> > >>>
> > >>> Now, I did expect this to work right away, but since you and
> > >>> Dimitris have already gone through using the _RS -> real*4
> > >>> business, maybe you remember what you did exactly, or maybe this is
> > >>> documented somewhere?
> > >>>
> > >>> Our domain here at Weizmann is a Cartesian grid, so I am NOT using
> > >>> exch2, as you might have with Dimitris. Do you have any idea what's
> > >>> going on? If it's not so complicated to do, I am willing to fix
> > >>> this in the code (never good to have a feature that does not work,
> > >>> right?); otherwise we should include a comment in CPP_EEOPTIONS.h
> > >>> that this flag REAL4_IS_SLOW always needs to be defined.
> > >>>
> > >>> Martin
> > >>> _______________________________________________
> > >>> MITgcm-devel mailing list
> > >>> MITgcm-devel at mitgcm.org
> > >>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>
> Martin Losch
> Alfred Wegener Institute
> Postfach 120161, 27515 Bremerhaven, Germany;
> Tel./Fax: ++49(0471)4831-1872/1797
>
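As a side illustration of the mismatch discussed in the quoted thread (real*4 fields handed to exchange macros that always expand to the _RL versions), here is a minimal Python sketch. It is not MITgcm code and does not reproduce the stripes of zeros; the name reinterpret_real4_as_real8 is made up for this example. It only shows, under the assumption of little-endian IEEE storage, that the same bytes read with the wrong element width give fewer, unrelated values.

```python
import struct


def reinterpret_real4_as_real8(values):
    """Pack values as 4-byte reals, then read the same bytes as 8-byte reals.

    Hypothetical helper for illustration only; little-endian layout is assumed.
    """
    raw = struct.pack(f"<{len(values)}f", *values)
    n8 = len(raw) // 8  # number of whole 8-byte elements in the buffer
    return list(struct.unpack(f"<{n8}d", raw[: n8 * 8]))


# A tiny stand-in for a real*4 field.
field = [1.0, 2.0, 3.0, 4.0]

# What a routine written for real*8 would "see" in the same memory.
misread = reinterpret_real4_as_real8(field)

print(len(field), "real*4 values read as", len(misread), "real*8 values:")
print(misread)
```

The point is only that an _RS field has to reach a routine written for the matching precision, which is what making the _EXCH_*RS macros depend on REAL4_IS_SLOW is meant to achieve.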
> Sat Oct 25 16:51:30 EDT 2008
> run: ./testreport
> on : Linux eddy 2.6.20-1.2320.fc5 #1 Tue Jun 12 18:50:38 EDT 2007 i686 i686 i386 GNU/Linux
>
> No "OPTFILE" was explicitly specified by testreport,
> so the genmake default will be used.
>
> default 13 ----T----- ----S----- ----U----- ----V----- --PTR 01-- --PTR 02-- --PTR 03-- --PTR 04-- --PTR 05--
> G D M c m s m s m s m s m s m s m s m s m s
> E p a R g m m e . m m e . m m e . m m e . m m e . m m e . m m e . m m e . m m e .
> N n k u 2 i a a d i a a d i a a d i a a d i a a d i a a d i a a d i a a d i a a d
> 2 d e n d n x n . n x n . n x n . n x n . n x n . n x n . n x n . n x n . n x n .
>
> Y Y Y Y> 5<16 16 16 0 22 22 22 22 22 22 22 22 5 2 5 5 FAIL adjustment.128x64x1
> Y Y Y Y> 7<16 16 16 0 22 22 22 22 7 7 7 7 7 7 3 7 FAIL adjustment.cs-32x32x1
> Y Y Y Y> 6<16 16 16 0 22 22 22 22 7 7 0 8 7 7 0 8 FAIL adjustment.cs-32x32x1.nlfs
> Y Y Y Y -- 9 7 10> 8< 5 5 5 5 7 8 8 9 7 8 8 9 FAIL advect_cs
> Y Y Y Y -- 6 6 6> 6<16 16 16 9 16 16 16 22 16 16 16 22 FAIL advect_xy
> Y Y Y Y -- 6 7 7> 7< 9 9 13 10 16 16 16 22 16 16 16 22 FAIL advect_xy.ab3_c4
> Y Y Y Y -- 6 7 9> 7< 5 5 5 5 16 16 0 10 22 22 22 22 FAIL advect_xz
> Y Y Y Y -- 5 6 8> 7< 5 6 8 7 16 16 0 10 22 22 22 22 FAIL advect_xz.os7mp
> Y Y Y Y> 6< 8 9 10 9 8 9 9 9 8 8 8 9 8 8 8 9 FAIL aim.5l_cs
> Y Y Y Y> 7<10 10 10 9 5 8 8 8 7 7 6 7 7 7 6 7 FAIL aim.5l_cs.thSI
> Y Y Y Y> 6<11 10 9 9 8 9 8 9 8 8 8 8 8 8 6 8 FAIL aim.5l_Equatorial_Channel
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O aim.5l_LatLon
> Y Y Y Y> 6< 9 8 9 9 10 9 11 9 9 8 8 9 8 8 6 8 10 8 9 9 8 8 9 9 FAIL cfc_example
> Y Y Y Y> 5<16 16 13 8 22 22 22 22 5 5 3 6 5 6 2 6 FAIL deep_anelastic
> Y Y Y Y> 6<16 13 8 10 16 16 12 8 6 7 6 7 16 5 5 7 FAIL dome
> Y Y Y Y> 6< 7 7 8 8 9 9 11 8 7 7 6 7 7 8 4 7 FAIL exp2
> Y Y Y Y> 6< 6 8 8 9 10 10 11 9 7 7 5 7 8 7 0 8 FAIL exp2.rigidLid
> Y Y Y Y> 5< 7 7 8 9 16 16 12 9 5 5 5 5 5 5 3 5 16 16 12 9 FAIL exp4
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O fizhi-cs-32x32x40
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O fizhi-cs-aqualev20
> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O fizhi-gridalt-hs
> Y Y Y Y> 4< 9 9 9 9 8 10 9 9 6 6 3 7 5 5 1 6 FAIL front_relax
> Y Y Y Y> 4<10 9 9 8 11 11 10 9 6 6 0 6 5 5 0 5 FAIL front_relax.mxl
> Y Y Y Y> 2< 6 8 9 9 9 9 11 9 7 6 5 6 7 5 2 5 FAIL global_ocean.90x40x15
> Y Y Y Y> 2< 6 8 9 9 9 9 11 9 7 6 5 6 7 5 2 5 8 10 11 9 FAIL global_ocean.90x40x15.dwnslp
> Y Y Y Y> 1< 5 5 5 5 9 7 9 6 6 5 3 5 5 4 3 5 FAIL global_ocean.cs32x15
> Y Y Y Y> 3< 7 7 5 5 10 8 8 6 6 6 3 5 5 5 3 5 FAIL global_ocean.cs32x15.icedyn
> Y Y Y Y> 3< 5 7 5 5 8 7 8 6 6 5 3 5 5 4 3 5 FAIL global_ocean.cs32x15.thsice
> Y Y Y Y> 1< 7 7 5 5 10 9 9 6 5 5 3 4 6 4 3 5 FAIL global_ocean.cs32x15.viscA4
> Y Y Y Y> 7< 8 9 9 9 10 10 11 9 7 7 8 8 8 8 7 8 FAIL global_ocean_ebm
> Y Y Y Y> 7< 7 9 10 9 10 9 11 9 7 7 7 8 7 7 7 7 FAIL global_with_exf
> Y Y Y Y> 7< 7 9 10 9 10 9 11 9 7 8 7 8 7 7 7 7 FAIL global_with_exf.yearly
> Y Y Y Y> 0< 0 0 2 0 22 22 22 22 0 0 0 0 0 0 0 0 FAIL hs94.128x64x5
> Y Y Y Y> 6< 8 9 10 9 22 22 22 22 6 8 8 7 7 8 0 8 FAIL hs94.1x64x5
> Y Y Y Y> 6< 9 9 10 10 22 22 22 22 7 7 0 9 7 7 0 8 FAIL hs94.cs-32x32x5
> Y Y Y Y> 6<10 9 10 10 22 22 22 22 8 8 8 9 8 8 8 9 FAIL hs94.cs-32x32x5.impIGW
> Y Y Y Y> 2< 9 8 8 8 16 16 16 0 9 7 5 6 7 6 0 6 FAIL ideal_2D_oce
> Y Y Y Y> 1< 9 8 8 8 22 16 10 9 6 6 1 7 22 22 22 22 FAIL internal_wave
> Y Y Y Y> 5< 9 9 13 10 16 16 16 22 5 5 0 5 5 5 0 5 FAIL inverted_barometer
> Y Y Y Y> 0<10 12 11 0 11 13 12 0 0 0 0 0 0 0 0 0 FAIL isomip
> Y Y Y Y> 0<10 16 12 0 12 16 13 0 0 0 0 0 0 0 0 0 FAIL isomip.htd
> Y Y Y Y> 5< 4 4 7 6 5 11 8 6 3 5 3 5 5 5 3 5 FAIL lab_sea
> Y Y Y Y> 3< 4 5 7 6 5 11 8 6 3 3 3 4 2 5 4 5 FAIL lab_sea.hb87
> Y Y Y Y> 5< 4 4 7 6 5 11 8 6 5 5 4 6 6 5 4 6 FAIL lab_sea.lsr
> Y Y Y Y> 4< 1 5 6 5 5 9 7 5 2 2 2 3 3 2 2 3 FAIL lab_sea.salt_plume
> Y Y Y Y> 0<16 16 16 22 16 16 16 22 4 4 0 4 5 4 0 4 FAIL matrix_example
> Y Y Y Y> 4<16 16 11 11 16 16 16 13 6 6 0 7 6 6 0 7 FAIL MLAdjust
> Y Y Y Y> 4<16 16 11 11 16 16 13 13 6 6 0 7 6 6 0 7 FAIL MLAdjust.0.leith
> Y Y Y Y> 4<16 16 11 11 16 16 16 13 6 6 0 7 6 6 0 7 FAIL MLAdjust.0.leithD
> Y Y Y Y> 4<16 16 11 11 16 16 16 13 6 6 0 7 6 6 0 7 FAIL MLAdjust.0.smag
> Y Y Y Y> 6<13 16 11 9 16 16 13 10 7 7 3 8 7 7 0 8 FAIL MLAdjust.1.leith
> Y Y Y Y> 3< 8 3 7 7 9 6 9 7 1 1 3 2 1 0 3 2 FAIL natl_box
> Y Y Y Y -- 16 16 8 9 16 16 11 8 22 22 22 22 22 22 22 22 0 8 0> 1<FAIL offline_exf_seaice
> Y Y Y Y -- 16 16 8 9 16 16 11 8 22 22 22 22 22 22 22 22 16 16 3 3 16 16 3> 3<FAIL offline_exf_seaice.seaicetd
> Y Y Y Y> 5< 9 12 11 7 16 16 14 0 3 4 0 5 5 5 0 5 FAIL rotating_tank
> Y Y Y Y> 4<16 16 7 5 16 12 8 5 2 1 4 3 16 3 5 2 FAIL seaice_obcs
> Y Y Y Y> 5<16 16 16 0 5 7 7 7 6 7 7 7 6 7 7 7 FAIL solid-body.cs-32x32x1
> Y Y N N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O tutorial_advection_in_gyre
> Y Y Y Y> 6<11 11 16 10 16 16 16 0 7 7 7 7 7 6 1 7 FAIL tutorial_baroclinic_gyre
> Y Y Y Y> 4<16 16 16 22 16 16 16 22 4 4 2 5 5 5 0 5 FAIL tutorial_barotropic_gyre
> Y Y Y Y -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 10 9 9> 9<10 10 9 9 FAIL tutorial_cfc_offline
> Y Y Y Y> 7<11 12 13 9 16 16 16 22 8 8 4 8 8 8 4 8 FAIL tutorial_deep_convection
> Y Y Y Y> 7< 8 8 9 10 10 9 11 9 9 9 8 8 8 8 6 8 12 12 11 9 12 11 11 9 10 12 9 9 8 10 9 10 10 8 10 9 FAIL tutorial_global_oce_biogeo
> Y Y Y Y> 2< 7 8 7 8 8 9 10 8 5 5 5 6 6 5 4 7 FAIL tutorial_global_oce_in_p
> Y Y Y Y> 4< 7 8 9 9 8 9 11 8 5 5 4 6 5 5 4 6 5 8 10 8 FAIL tutorial_global_oce_latlon
> Y Y Y Y> 6<10 9 11 10 22 22 22 22 7 8 8 9 8 8 8 9 FAIL tutorial_held_suarez_cs
> Y Y Y Y> 5< 5 9 9 8 16 16 16 0 6 6 4 7 22 22 22 22 FAIL tutorial_plume_on_slope
> Y Y Y Y> 6<12 10 11 10 16 16 16 0 6 7 6 7 7 6 7 7 FAIL vermix
> Y Y Y Y> 5<12 13 11 9 16 16 16 0 6 8 6 7 7 6 7 8 FAIL vermix.ggl90
> Y Y Y Y> 6<12 13 11 10 16 16 16 0 6 7 6 7 7 7 7 7 FAIL vermix.my82
> Y Y Y Y> 5<12 13 11 10 16 16 16 0 6 7 6 7 7 6 7 7 FAIL vermix.opps
> Y Y Y Y> 6<12 12 11 11 16 16 16 0 6 7 6 7 7 7 7 7 FAIL vermix.pp81
> Start time: Sat Oct 25 16:51:30 EDT 2008
> End time: Sat Oct 25 20:33:18 EDT 2008
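The summary table above is easy to scan by script. The sketch below is one hypothetical way to pull the experiment name and status out of each row, assuming the status (FAIL or N/O, the only two that appear above) is the second-to-last field, possibly glued to a preceding "<" as in the offline_exf_seaice row. The names parse_summary_line and experiments_with_status are illustrative and not part of testreport.

```python
def parse_summary_line(line):
    """Return (experiment, status) for one summary row, or None for other lines."""
    # Drop the mail quote prefix ("> ") before splitting on whitespace.
    parts = line.lstrip("> ").split()
    if len(parts) < 2:
        return None
    name, before = parts[-1], parts[-2]
    for status in ("FAIL", "N/O"):
        # endswith() also matches tokens like "1<FAIL" where columns run together.
        if before.endswith(status):
            return name, status
    return None


def experiments_with_status(lines, wanted):
    """List experiment names whose status equals `wanted`."""
    parsed = (parse_summary_line(line) for line in lines)
    return [p[0] for p in parsed if p is not None and p[1] == wanted]


# A few rows copied from the summary above.
SAMPLE = [
    "> Y Y Y Y> 5<16 16 16 0 22 22 22 22 22 22 22 22 5 2 5 5 FAIL adjustment.128x64x1",
    "> Y Y Y N .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. N/O fizhi-cs-32x32x40",
    "> Y Y Y Y -- 16 16 8 9 16 16 11 8 22 22 22 22 22 22 22 22 0 8 0> 1<FAIL offline_exf_seaice",
    "> Start time: Sat Oct 25 16:51:30 EDT 2008",
]

print(experiments_with_status(SAMPLE, "N/O"))
```

Fed the full table, experiments_with_status(lines, "N/O") would list the experiments that did not run at all, which is the quickest thing to check after switching the flag.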