[MITgcm-devel] (not so) funny things happen in seaice_lsr and pickups

Wed Feb 18 10:58:41 EST 2009

Hi Martin,

A short comment about restart:
1) it's tested every day (and I mean not only the cg2d output: we
 also do a diff of the real*8 binary pickup files, including
 pickup_seaice, and we get zero diff), with the pgi compiler
 and gfortran, but always with zero optimisation.
2) it does not guarantee that the seaice restart is always OK:
 what I found with thsice (and it could be the same with 
 seaice) is that there are so many different cases that
 we would need to run for ~1 year to be sure that all
 the pieces of code have really been tested.
3) there is always a possibility that the optimisation level
 you are using on SX8 is not safe for some part of the seaice 
 code.
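
For reference, the kind of pickup comparison mentioned in 1) can be sketched in Python as below. This is only an illustration, not the script used in the daily tests. It assumes the pickups are flat, headerless real*8 files written big-endian; the function name compare_real8 and the byteorder argument are made up for this example. Pass "<" if your files are little-endian.

```python
import struct


def compare_real8(path_a, path_b, byteorder=">"):
    """Compare two flat real*8 binary files value by value.

    Returns (n_values, n_differing, max_abs_diff). Values are compared
    by bit pattern, so a difference in the last bit is counted even
    though it would not show up in a 15-digit printout.
    byteorder is ">" (big-endian, assumed here) or "<" (little-endian).
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        raw_a, raw_b = fa.read(), fb.read()
    if len(raw_a) != len(raw_b) or len(raw_a) % 8 != 0:
        raise ValueError("files differ in size or are not whole real*8 values")

    n = len(raw_a) // 8
    a = struct.unpack(f"{byteorder}{n}d", raw_a)
    b = struct.unpack(f"{byteorder}{n}d", raw_b)

    n_diff = 0
    max_abs = 0.0
    for i in range(n):
        # Compare the raw 8 bytes, not the floats, to catch every bit flip.
        if raw_a[8 * i:8 * i + 8] != raw_b[8 * i:8 * i + 8]:
            n_diff += 1
            max_abs = max(max_abs, abs(a[i] - b[i]))
    return n, n_diff, max_abs
```

Calling compare_real8 on the pickup_seaice data file written at the end of one long run and on the one written by the restarted run should return zero differing values if the restart is exact. A nonzero count with a tiny max_abs_diff would point to the kind of last-digit difference that the monitor output cannot show.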

Jean-Michel

On Wed, Feb 18, 2009 at 12:21:41PM +0100, Martin Losch wrote:
> Hi all,
>
> just to let you know that we are experiencing problems with the LSR sea
> ice solver on the C-grid: At unpredictable points of the integration, it
> appears to become unstable and blows up. I have not been able to isolate
> this in all cases, because a small issue with pickups hampers this:
>
> Apparently, starting from pickup is NOT exact. We have tried the famous 
> 2+2=4 test with our 8CPU job on our SX8 (cc to Olaf, who's been mostly 
> involved in this) and found no difference between the cg2d output (and 
> other output). However, when we run an experiment for a longer time, the 
> same test fails, e.g., 2160+2160 != 4320 (we can provide plots if 
> required). I assume that this is expected, because double precision is
> not more than double precision and in the cg2d output (and other monitor
> output) there are always only 15 digits, and we don't know about the 16th
> one, correct? Anyway, this tiny pickup issue hinders me from approaching
> the point of model crash with pickups, because after starting from a
> pickup, the model integrates beyond the problem and crashes (sometimes) at
> a much later time. This is to say that the problem in seaice_lsr (the
> problem only appears when the C-LSR solver is used) is very sensitive; the
> code crashes without any warning from one time step to the next. A while
> ago, in a different case, I was able to get close enough to the point of
> crashing to do some diagnostics, but it's almost impossible to identify
> why the model explodes. I am assuming that for random pathological cases
> one or more matrix entries are nearly zero, which then prevents the 
> solver from converging.
>
> Any comments? Any similar experience?
>
> I run this code in so many different configurations, and I have these  
> problems only very seldom/randomly, so I am a little at a loss where I  
> should continue looking, so any hint is appreciated.
>
> Martin