[MITgcm-devel] Re: it's probably my faults again, but ...

chris hill cnh at mit.edu
Mon Feb 26 13:18:17 EST 2007


martin,

  can you check one of your setups with holes (code/, input/, etc.) into 
MITgcm_contrib/mlosch?
  then we can try it here.

chris
Martin Losch wrote:
> Hi there,
> 
> I have further investigated the issue with domains with holes and I have 
> narrowed down the problem to checkpoint58u_post. All checkpoints prior 
> to this one seem to work with my cs32 setup with 91 tiles. This is what 
> supposedly happened in 58u:
>> checkpoint58u_post
>> o new test-exp: fizhi-cs-32x32x40 (40 levels) to replace the 10 levels.
>> o move call to INI_FORCING from PACKAGES_INIT_VARIABLES to 
>> INITIALISE_VARIA.
>> o testreport: add option "-skipdir" to skip some test.
>> o exf: when input wind-stress (#undef ALLOW_ATM_WIND), by-pass turbulent
>>   momentum calculation.
>> o gad_advection: fix vertAdvecScheme (if different from advectionScheme)
>> o some cleaning: usePickupBeforeC35 no longer supported ; remove this 
>> option.
>>   remove checkpoint.F and the_correction_step.F (no longer used);
>>   do the k loop inside CYCLE_TRACER (supposed to be more efficient).
>> o add option (linFSConserveTr) to correct for tracer source/sink due to
>>   Linear Free surface
>> o pkg/seaice: fix a bug in the flooding algorithm: turn off the snow 
>> machine
>> o pkg/thsice: fix reading mnc-pickups
> 
> I don't see anything that could affect exchanges. Do you? I also did a 
> diff on all files that I actually compile, which I attach (I removed 
> everything that is just a white space or spelling difference in 
> comments etc.), but I don't see anything that could explain the NaNs in 
> version 58t.
> 
> Martin
> 
> On 20 Feb 2007, at 21:04, Martin Losch wrote:
> 
>> rats, forgot to include blanklist.txt:
>> 31
>> 34
>> 35
>> 47
>> 79
>>
>> M.
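
[Editorial note] The tile count quoted in this thread can be checked with a little arithmetic. The sketch below assumes that cs32 denotes a cubed-sphere grid with 6 faces of 32x32 points, split into 8x8 tiles, and that each entry in blanklist.txt removes exactly one tile. The function name `cubed_sphere_tiles` is hypothetical and is not part of MITgcm.

```python
def cubed_sphere_tiles(face_size, tile_size, n_faces=6):
    """Number of square tiles needed to cover all faces of a cubed-sphere grid."""
    if face_size % tile_size != 0:
        raise ValueError("tile size must divide the face size evenly")
    tiles_per_side = face_size // tile_size
    return n_faces * tiles_per_side * tiles_per_side


# Tile numbers listed in blanklist.txt in the message above.
blank_tiles = [31, 34, 35, 47, 79]

total = cubed_sphere_tiles(32, 8)      # all tiles, including blank ones
active = total - len(blank_tiles)      # tiles that are actually computed
print(total, active)
```

Under these assumptions, 96 tiles minus the 5 blank ones gives the 91 tiles mentioned in the thread.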
>>
>> On 20 Feb 2007, at 21:01, Martin Losch wrote:
>>
>>> Hi,
>>> I have run a quick test with the appended stuff (cs32 with 91 8x8 
>>> tiles) and the model produces NaNs right away, in the first timestep. 
>>> So there appears to be something broken in exch2? (after my great 
>>> lapse last week I dare not claim anything anymore).
>>>
>>> Martin
>>> <s91t.tgz>
>>>
>>> On 20 Feb 2007, at 18:44, Martin Losch wrote:
>>>
>>>> Dimitris,
>>>>
>>>> the reason why I think it's the sea-ice model (but the problem 
>>>> may very well have nothing to do with the sea-ice model, but only 
>>>> show up in the sea-ice model first) is that the monitor output has 
>>>> values of > 1e173 for u/vice_del2 while all other variables look ok 
>>>> for time step 1440 (which is the time step of the pickup, that is, at 
>>>> this point nothing has happened so far, and the do_the_model_io and 
>>>> monitor packages are called from initialise_varia).
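
[Editorial note] The diagnosis above rests on spotting implausibly large values (> 1e173) in monitor statistics. A minimal sketch of such a check follows, assuming the statistics have already been extracted into a plain list of floats. The function name `find_suspect_values` and the threshold `limit` are hypothetical choices for illustration, not anything provided by MITgcm.

```python
import math


def find_suspect_values(values, limit=1e30):
    """Return (index, value) pairs that are NaN, infinite, or exceed limit in magnitude."""
    suspects = []
    for i, v in enumerate(values):
        # math.isfinite is False for both NaN and +/-inf.
        if not math.isfinite(v) or abs(v) > limit:
            suspects.append((i, v))
    return suspects


# Made-up sample values; only the 1e173 magnitude is taken from the thread.
sample = [0.01, -0.3, 2.5e173, float("nan"), 1.2]
for index, value in find_suspect_values(sample):
    print(index, value)
```

A check like this only flags where bad values first appear; as noted in the message, the field that shows the problem first is not necessarily its cause.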
>>>>
>>>> which one is the cs32 test, so I can try it, too?
>>>>
>>>> Martin
>>>>
>>>> On 20 Feb 2007, at 18:32, Dimitris Menemenlis wrote:
>>>>
>>>>> Martin, I am transferring your e-mail to devel list as Chris or 
>>>>> others may have comments.  What makes you think that it is the 
>>>>> sea-ice model that causes trouble?  I have used the 
>>>>> s1500t_17x51/SIZE.h_500 configuration in the past successfully but 
>>>>> have not done so in a very long time.  What is special about this 
>>>>> configuration is that there are holes in the domain, i.e., no 
>>>>> computations take place over some of the land.
>>>>>
>>>>> >>> MY QUESTION TO THE DEVEL LIST IS WHETHER ANYONE ELSE HAS USED
>>>>> >>> DOMAINS WITH HOLES RECENTLY?
>>>>>
>>>>> I don't think that this part of the exch2 package is tested on a 
>>>>> regular basis, so it may be broken.  I am in the process of rearranging 
>>>>> the description of experiments, etc., as per your suggestions and 
>>>>> there is a small test with holes on the 32x6x32 domain that I plan 
>>>>> to test and get back.  D.
>>>>>
>>>>>
>>>>>> ... just to make sure that I am not trying something stupid: After having
>>>>>> made one day of integration on 216 CPUs (and actually picking up and running
>>>>>> a second day with 216 CPUs), I have tried using a higher number of CPUs
>>>>>> (which is more effective on the machine that I am running on), that is the
>>>>>> SIZE.h_500 in the s1500t directory. So I replaced SIZE.h with SIZE.h_500 and
>>>>>> w2_ee2setup.F and W2_EXCH_TOPOLOGY.h and recompiled. Then I tried to restart
>>>>>> from the same pickup, from which I have already successfully started with
>>>>>> 216 CPUs. The model starts (after waiting 4 days in the queue) and seems to
>>>>>> pick up fine, at least the model part.
>>>>>> [ now it's definitely my fault that I sent this email prematurely, sorry for
>>>>>> that ]
>>>>>> I am attaching the stdout, and you can see that something is wrong with the
>>>>>> seaice model and then the first timestep is already very wrong. My eedata is
>>>>>> correct this time. And I am using the MULTICATEGORY seaice (all the way, it's
>>>>>> also in the pickup files), but that shouldn't make a difference, should it?
>>>>>> So my question is really: Do you regularly use this configuration at all or
>>>>>> have I made one of my famous mistakes again?
>>>>>> Martin
>>>>
>>>> _______________________________________________
>>>> MITgcm-devel mailing list
>>>> MITgcm-devel at mitgcm.org
>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>>>
> 
> 



