[MITgcm-devel] verification experiment for pkg/bbl

Menemenlis, Dimitris (3248) Dimitris.Menemenlis at jpl.nasa.gov
Fri Apr 6 19:33:19 EDT 2012


Hi Oliver,

Well, fixing the bug did resolve the discrepancy between
MPI and non-MPI for global_with_exf.yearly.
However, when I run this experiment with MPI and compare the
STDOUT.0000 (and in fact all the output files) before and after
fixing the bug, they are identical. The real problem was the
reference output.yearly.txt (generated without MPI, 2 tiles per
proc), which was wrong because of the bug and which I updated
after fixing it.
And I wanted to mention this because I guess some people at JPL are 
using pkg/bbl, and if they use a set-up with MPI and 1 tile per proc, 
the results produced before fixing the bug are still OK.

Sorry for the confusion, and I hope it's clearer now.

Cheers,
Jean-Michel

On Apr 4, 2012, at 9:42 AM, Menemenlis, Dimitris (3248) wrote:

> Jean-Michel, your check-in yesterday appears to have fixed the pkg/bbl problem that Oliver reported.
> 
> Thanks!
> 
> Dimitris
> 
> (I am transferring this discussion to the devel list for the long-term record.
> Hopefully it will be added to the "verification experiment for pkg/bbl" thread.)
> 
>> Hi Jean-Michel,
>> 
>> I'm sorry I don't understand either... why does testreport pass now both 
>> with and without mpi if the changes you made don't touch the mpi issue? 
>> I thought there was a discrepancy between the two before.
>> 
>> Cheers,
>> Oliver
>> 
>> On 2012-04-03 15:15, Jean-Michel Campin wrote:
>>> Dimitris,
>>> 
>>> I was not clear enough:
>>> It's fine the way you generated the output.yearly.txt file,
>>> but this morning, after fixing the problem in bbl_calc_rho.F,
>>> I had to generate a new one (because the fix affects the results
>>> when running without MPI ==> 2 tiles per proc).
>>> But my point was that, when running with MPI, there is no
>>> difference between the STDOUT.0000 I got last night and the new
>>> one (after the bug was fixed), since with MPI this test is
>>> single tile per proc (1 tile for each proc).
>>> However, since I updated the reference output.yearly.txt file,
>>> now "testreport -mpi" will pass (but was failing @ 4 digits last night).
>>> 
>>> And Oliver's remark was also about testreport with MPI.
>>> 
>>> And I forgot to mention that the little modifications I made
>>> yesterday to pkg/bbl were just minor: it's because it's a
>>> "clean" package that I care about removing unused variables;
>>> with some pkgs like profiles & fizhi, I am getting too many
>>> of these warnings so I don't care.
>>> 
>>> Cheers,
>>> Jean-Michel
>>> 
>>> On Tue, Apr 03, 2012 at 10:52:15AM -0700, Menemenlis, Dimitris (3248) wrote:
>>>> Jean-Michel, I did not fully understand message below.
>>>> 
>>>> When I generated the output.txt file, I used the default compiler
>>>> and configuration on baudelaire, that is, I used the results of:
>>>> testreport -t global_with_exf
>>>> 
>>>> Should I be generating output.txt using MPI or with a
>>>> different compiler than the default?
>>>> Or are you saying that output.txt changed because of fixes to
>>>> the multi-tile-per-proc code?
>>>> 
>>>> In either case, I will wait for tonight's tests, and if it's still
>>>> a problem I will have another look at pkg/bbl tomorrow.
>>>> 
>>>> Thanks again for fixes and sorry for careless coding :-(
>>>> 
>>>> Dimitris Menemenlis
>>>> 
>>>> On Apr 3, 2012, at 10:14 AM, Jean-Michel Campin wrote:
>>>> 
>>>>> Hi Dimitris,
>>>>> 
>>>>> I think the fix will not change the results (I mean, STDOUT.0000)
>>>>> of the MPI global_with_exf tests (since with MPI, this exp. is tested
>>>>> with 1 tile per proc and the fix was for multi-tiles per proc).
>>>>> But of course, now the reference output.yearly.txt has been changed
>>>>> so it is likely to pass (at least it does now on baudelaire with mpi,
>>>>> 16 digits).
>>>>> 
>>>>> Cheers,
>>>>> Jean-Michel
>>>>> 
>>>>> On Tue, Apr 03, 2012 at 09:52:43AM -0700, Menemenlis, Dimitris (3248) wrote:
>>>>>> Jean-Michel, thanks for the multiple fixes to pkg/bbl over the last two days.
>>>>>> And sorry for the mistakes.
>>>>>> Could the last check-in, the one to bbl_calc_rho.F be responsible
>>>>>> for the MPI bug that Oliver notes below?
>>>>>> 
>>>>>> Dimitris Menemenlis
>>>>>> 
>>>>>> On Apr 3, 2012, at 9:27 AM, Oliver Jahn wrote:
>>>>>> 
>>>>>>> Hi Dimitris,
>>>>>>> 
>>>>>>> seems that this exp still doesn't get more than 4 digits with mpi (any
>>>>>>> compiler, any machine, really, like gfortran, ifort, open64, pgf).
>>>>>>> Looks like a bug?
>>>>>>> 
>>>>>>> http://mitgcm.org/testing/results/2012_04/tr_baudelaire_20120403_7/summary.txt
>>>>>>> http://mitgcm.org/testing/results/2012_04/tr_beagle-ifort-mpi_20120403_0/summary.txt
>>>>>>> 
>>>>>>> Oliver



