[MITgcm-devel] Re: it's probably my faults again, but ...
chris hill
cnh at mit.edu
Mon Feb 26 13:18:17 EST 2007
martin,
can you check one of your holes setups (code/, input/, etc.) into
MITgcm_contrib/mlosch?
Then we can try it here.
chris
Martin Losch wrote:
> Hi there,
>
> I have further investigated the issue with domains with holes and I have
> narrowed down the problem to checkpoint58u_post. All checkpoints prior
> to this one seem to work with my cs32 setup with 91 tiles. This is what
> supposedly happened in 58u:
>> checkpoint58u_post
>> o new test-exp: fizhi-cs-32x32x40 (40 levels) to replace the 10 levels.
>> o move call to INI_FORCING from PACKAGES_INIT_VARIABLES to
>> INITIALISE_VARIA.
>> o testreport: add option "-skipdir" to skip some test.
>> o exf: when input wind-stress (#undef ALLOW_ATM_WIND), by-pass turbulent
>> momentum calculation.
>> o gad_advection: fix vertAdvecScheme (if different from advectionScheme)
>> o some cleaning: usePickupBeforeC35 no longer supported ; remove this
>> option.
>> remove checkpoint.F and the_correction_step.F (no longer used);
>> do the k loop inside CYCLE_TRACER (supposed to be more efficient).
>> o add option (linFSConserveTr) to correct for tracer source/sink due to
>> Linear Free surface
>> o pkg/seaice: fix a bug in the flooding algorithm: turn off the snow
>> machine
>> o pkg/thsice: fix reading mnc-pickups
>
> I don't see anything that could affect exchanges. Do you? I also did a
> diff on all files that I actually compile, which I attach (I removed
> everything that is just white space or spelling differences in
> comments etc.), but I don't see anything that could explain the NaN's in
> version 58t.
>
> Martin
>
> On 20 Feb 2007, at 21:04, Martin Losch wrote:
>
>> rats, forgot to include blanklist.txt:
>> 31
>> 34
>> 35
>> 47
>> 79
>>
>> M.
>>
>> On 20 Feb 2007, at 21:01, Martin Losch wrote:
>>
>>> Hi,
>>> I have run a quick test with the appended stuff, (cs32 with 91 8x8
>>> tiles) and the model produces NaNs right away, in the first timestep.
>>> So there appears to be something broken in exch2? (after my great
>>> lapse last week I dare not claim anything anymore).
>>>
>>> Martin
>>> <s91t.tgz>
>>>
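The tile count quoted above can be checked with a little arithmetic: a cs32 grid has six 32x32 faces, so 8x8 tiles give 16 tiles per face, and removing the five entries of blanklist.txt leaves 91. The following is a minimal sketch of that bookkeeping, not code from the exch2 package; the function and variable names are made up for illustration, and 1-based tile numbering is an assumption.

```python
# Tile bookkeeping for a cubed-sphere grid split into square tiles.
FACES = 6        # a cubed sphere has six faces
FACE_SIZE = 32   # cs32: each face is 32 x 32 cells
TILE_SIZE = 8    # 8 x 8 tiles, as in the setup described in the thread

# Tile numbers from blanklist.txt as posted in the thread.
BLANK_TILES = [31, 34, 35, 47, 79]


def count_tiles(face_size, tile_size, faces=FACES):
    """Return the total number of tiles before any are blanked out."""
    if face_size % tile_size != 0:
        raise ValueError("tile size must divide the face size evenly")
    tiles_per_side = face_size // tile_size
    return faces * tiles_per_side * tiles_per_side


total_tiles = count_tiles(FACE_SIZE, TILE_SIZE)
active_tiles = total_tiles - len(set(BLANK_TILES))

# Assumption: tiles are numbered from 1 to total_tiles.
assert all(1 <= t <= total_tiles for t in BLANK_TILES)

print(total_tiles, active_tiles)
```

A mismatch between this count and the tile count the model is built for would be one quick thing to rule out before looking at the exchange code itself.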
>>> On 20 Feb 2007, at 18:44, Martin Losch wrote:
>>>
>>>> Dimitris,
>>>>
>>>> the reason why I think it's the seaice model (but the problem
>>>> may very well have nothing to do with the seaice model, but only
>>>> show up in the seaice model first) is that the monitor output has
>>>> values of > 1e173 for u/vice_del2 while all other variables look ok
>>>> for time step 1440 (which is the time step of the pickup, that is, at
>>>> this point nothing has happened so far, and the do_the_model_io and
>>>> monitor packages are called from initialise_varia).
>>>>
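Spotting values like the > 1e173 mentioned above by eye is tedious in a long stdout, so a small filter can help. This is a rough sketch under stated assumptions: it assumes monitor lines have the shape "%MON <name> = <value>", the field names in the sample are hypothetical, and the threshold is arbitrary. It also assumes that a Fortran-style value may be printed with the "E" dropped for three-digit exponents and repairs that case before parsing.

```python
import math
import re

# Assumed shape of a monitor line: "... %MON <name> = <value>".
MON_LINE = re.compile(r"%MON\s+(\S+)\s*=\s*(\S+)")
# Three-digit exponent printed without the "E", e.g. "1.0+173".
DROPPED_E = re.compile(r"(?<=\d)([+-]\d{3})$")


def suspicious_monitor_values(lines, limit=1e30):
    """Return (name, value) pairs that are non-finite or exceed `limit` in magnitude."""
    hits = []
    for line in lines:
        match = MON_LINE.search(line)
        if match is None:
            continue
        name, text = match.groups()
        text = DROPPED_E.sub(r"E\1", text)  # restore a missing "E" if needed
        try:
            value = float(text)
        except ValueError:
            continue  # not a number we can parse; skip the line
        if not math.isfinite(value) or abs(value) > limit:
            hits.append((name, value))
    return hits


# Hypothetical sample lines, made up for illustration only.
sample = [
    "(PID.TID 0000.0001) %MON seaice_uice_del2 = 1.0000000000000E+173",
    "(PID.TID 0000.0001) %MON seaice_vice_del2 = NaN",
    "(PID.TID 0000.0001) %MON dynstat_theta_mean = 3.5000000000000E+00",
]
for name, value in suspicious_monitor_values(sample):
    print(name, value)
```

Running the filter on the stdout of the first time step would show whether the sea-ice fields are the only ones affected or merely the first to be reported.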
>>>> which one is the cs32 test, so I can try it, too?
>>>>
>>>> Martin
>>>>
>>>> On 20 Feb 2007, at 18:32, Dimitris Menemenlis wrote:
>>>>
>>>>> Martin, I am transferring your e-mail to devel list as Chris or
>>>>> others may have comments. What makes you think that it is the
>>>>> sea-ice model that causes trouble? I have used the
>>>>> s1500t_17x51/SIZE.h_500 configuration in the past successfully but
>>>>> have not done so in a very long time. What is special about this
>>>>> configuration is that there are holes in the domain, i.e., no
>>>>> computations take place over some of the land.
>>>>>
>>>>> >>> MY QUESTION TO THE DEVEL LIST IS WHETHER ANYONE ELSE HAS USED
>>>>> >>> DOMAINS WITH HOLES RECENTLY?
>>>>>
>>>>> I don't think that this part of the exch2 package is tested on a
>>>>> regular basis, so it may be broken. I am in the process of rearranging
>>>>> the description of experiments, etc., as per your suggestions and
>>>>> there is a small test with holes on the 32x6x32 domain that I plan
>>>>> to test and get back. D.
>>>>>
>>>>>
>>>>>> ... just to make sure that I am not trying something stupid: After
>>>>>> having made one day of integration on 216 CPUs (and actually picking
>>>>>> up and running a second day with 216 CPUs), I have tried using a
>>>>>> higher number of CPUs (which is more effective on the machine that I
>>>>>> am running on), that is the SIZE.h_500 in the s1500t directory. So I
>>>>>> replaced SIZE.h with SIZE.h_500 and w2_ee2setup.F and
>>>>>> W2_EXCH_TOPOLOGY.h and recompiled. Then I tried to restart from the
>>>>>> same pickup, from which I have already successfully started with
>>>>>> 216 CPUs. The model starts (after waiting 4 days in the queue) and
>>>>>> seems to pick up fine, at least the model part.
>>>>>> [ now it's definitely my fault that I sent this email prematurely,
>>>>>> sorry for that ]
>>>>>> I am attaching the stdout, and you can see that something is wrong
>>>>>> with the seaice model and then the first timestep is already very
>>>>>> wrong. My eedata is correct this time. And I am using the
>>>>>> MULTICATEGORY seaice (all the way, it's also in the pickup files),
>>>>>> but that shouldn't make a difference, should it?
>>>>>> So my question is really: Do you regularly use this configuration at
>>>>>> all or have I made one of my famous mistakes again?
>>>>>> Martin
>>>>
>>>> _______________________________________________
>>>> MITgcm-devel mailing list
>>>> MITgcm-devel at mitgcm.org
>>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel