[MITgcm-support] diagnosing problems with the adjoint

Wed Aug 12 13:57:02 EDT 2009

The confusion may be coming from the fact that the discussion was of  
two independent setups --

(1) my North Atlantic setup (for debugging purposes running for 1  
month at timestep of 3600 sec)
problem: adxx values are on the order of 10^16 and optim.x crashes

(2) a clean checkout of tutorial_global_oce_optim with only 2 changes  
(synchronous time stepping at 1800 sec and 1 year execution)
problem: with a current version of the GCM and tutorial, cost  
reduction fails at iteration 1
[David was able to see cost reductions over 10+ iterations when he ran  
this 1 year ago and I've tested with a March version of the GCM and  
had better results too]
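
For setup (2), the message printed by optim.x (quoted further down) is 'the search direction is not a descent one'. As a reminder of what that condition means in general, here is a minimal Python sketch. It assumes only the textbook definition: a direction d is a descent direction when the dot product of the cost gradient g with d is strictly negative. The function names and numbers are made up for illustration; this is not how lsopt or optim.x implement the check.

```python
def directional_derivative(gradient, direction):
    """Return g . d, the first-order rate of change of the cost along d."""
    if len(gradient) != len(direction):
        raise ValueError("gradient and direction must have the same length")
    return sum(g * d for g, d in zip(gradient, direction))


def is_descent_direction(gradient, direction):
    """A direction is a descent direction only if g . d is strictly negative."""
    return directional_derivative(gradient, direction) < 0.0


# Toy two-element control vector (made-up numbers, not model output).
gradient = [2.0, -1.0]
steepest_descent = [-g for g in gradient]  # g . (-g) = -|g|^2, always negative
bad_direction = [1.0, 0.5]                 # g . d = 1.5, points uphill

print(is_descent_direction(gradient, steepest_descent))
print(is_descent_direction(gradient, bad_direction))
```

In a real run the quantity to inspect would be the dot product between the gradient passed to the optimizer and the search direction it proposes, which may help narrow down whether the gradient or the line search is at fault.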

For (1) I am trying Matt's advice on turning off packages.  I could  
also move to a faster machine and run for 1 year if that might help.
For (2) I can try running with asynchronous time-stepping, though I  
thought it was inadvisable for the adjoint.
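
While packages are being switched off for setup (1), it may help to quantify how large the sensitivities actually are after each change. The sketch below is one possible way to do that in Python, under loudly stated assumptions: the sensitivity field is stored as a headerless file of big-endian 64-bit floats (the precision and layout of a given adxx file should be checked against its own setup), and the 1.0e10 threshold is an arbitrary choice. The function names are hypothetical.

```python
import math
import struct


def read_big_endian_float64(path):
    """Read a headerless file of big-endian 64-bit floats (assumed layout)."""
    with open(path, "rb") as handle:
        raw = handle.read()
    count = len(raw) // 8
    return list(struct.unpack(f">{count}d", raw[: count * 8]))


def summarize_sensitivity(values, threshold=1.0e10):
    """Report the size of a sensitivity field and how much of it looks blown up."""
    finite = [v for v in values if math.isfinite(v)]
    max_abs = max((abs(v) for v in finite), default=0.0)
    n_large = sum(1 for v in finite if abs(v) > threshold)
    return {
        "n_values": len(values),
        "n_nonfinite": len(values) - len(finite),
        "max_abs": max_abs,
        "n_above_threshold": n_large,
    }


# Made-up values standing in for one 2-d sensitivity field.
example = [1.2e-3, -4.0e-2, 3.0e16, float("nan")]
print(summarize_sensitivity(example))
```

Comparing the summary before and after disabling a package gives a rough indication of whether that package contributes to the growth.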

The west coast is always the right coast, which means I'm on the wrong  
coast!
Holly

On Aug 12, 2009, at 1:30 PM, Patrick Heimbach wrote:

>
> Holly,
>
> I am a bit confused now on what works in your setup and what doesn't.
> Maybe rather than sensitivities blowing up, there's a bug in the  
> code(?)
>
> David's suggestion of starting from a clean tutorial setup is a good  
> one.
> Also indicate over which time span you run (20 timesteps, a year, 10  
> years?).
> Finally, try doing a tangent linear test, to see whether there are  
> problems
> with store directives.
>
> Sorry for being on the wrong coast right now.
>
> -p.
>
> On Aug 12, 2009, at 1:17 PM, Holly Dail wrote:
>
>> I don't know how to check the testreport, but I did run the  
>> tutorial.  I had to make two changes -- I switched to synchronous  
>> time stepping (1800 sec) based on Patrick's advice, and I set the  
>> tutorial to run for a year.
>>
>> The adjoint doesn't blow up, but I am still struggling with some  
>> inconsistencies.
>> - if I use a March 2009 version of the GCM, optim.x ends with iter0  
>> at a cost of 14.66 and iter1 at a cost of 12.04; optim.x stops  
>> based on maximal number of iterations reached which seems to be in  
>> line with what you got when you developed this tutorial.
>>
>> - if I use a current version of the GCM, optim.x ends with iter0  
>> again at a cost of 14.66, but iter1 fails to reduce cost with a  
>> message 'the search direction is not a descent one'
>>
>> I did recompile and retest the March version this week to make sure  
>> this was the case even with fresh compiles of both, and indeed it  
>> is.  Not sure what could have changed in the GCM; I couldn't find  
>> anything significant in the tutorial code or config, lsopt, optim,  
>> or any of the packages I thought to check.
>>
>> Thanks,
>> Holly
>>
>> On Aug 12, 2009, at 12:49 PM, David Ferreira wrote:
>>
>>> Holly,
>>> Just to be sure: does the testreport of tutorial_global_oce_optim  
>>> run fine for you?
>>> david
>>>
>>>
>>> Holly Dail wrote:
>>>> Thanks for the advice Matt.
>>>>
>>>> I'm not using the divided adjoint, but I'll try the  
>>>> autodiff_inadmode_set.F approach.
>>>>
>>>> Here are the viscosities / diffusivities (chosen to be almost  
>>>> exactly those used in ECCO):
>>>> viscAz=1.E-3,
>>>> viscAh=1.E4,
>>>> diffKhT=100.,
>>>> diffKzT=2.E-5,
>>>> diffKhS=100.,
>>>> diffKzS=1.E-5,
>>>>
>>>> I used your advection scheme based on your earlier advice, but  
>>>> haven't tried
>>>>> multiDimAdvection=.FALSE.,
>>>> Will try that too.
>>>>
>>>> My time step is 3600 - again same as ECCO.
>>>>
>>>> Thanks -
>>>> Holly
>>>>
>>>>
>>>> On Aug 12, 2009, at 11:42 AM, Matthew Mazloff wrote:
>>>>
>>>>> Hi Holly,
>>>>>
>>>>> Your adjoint is definitely blowing up (how many timesteps is  
>>>>> your grad check? It's blowing up fast).  Try turning off  
>>>>> packages when you run the adjoint and see if that helps.  Are  
>>>>> you using the divided adjoint?  If so you can just change some  
>>>>> things to false in data.pkg when it's about to start.  Turn off  
>>>>> KPP and GMREDI and packages of that nature.  If you are not  
>>>>> using the divided adjoint then you have to use  
>>>>> autodiff_inadmode_set.F to turn these things off.  In this file  
>>>>> just set
>>>>>    usePtracers  = .FALSE.
>>>>>    useKPP = .FALSE.
>>>>>    useGMREDI = .FALSE.
>>>>>    useSEAICE = .FALSE.
>>>>>
>>>>> Then try again
>>>>>
>>>>> -Matt
>>>>>
>>>>> ps> out of curiosity, what viscosity and diffusivity are you  
>>>>> trying to run the adjoint with?
>>>>>
>>>>> Oh, and also some of the advection schemes may not be stable.  I  
>>>>> am using
>>>>> multiDimAdvection=.FALSE.,
>>>>> tempAdvScheme=30,
>>>>> saltAdvScheme=30,
>>>>>
>>>>>
>>>>> pps> of course the real expert is just upstairs from you -- bug  
>>>>> him :o)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Aug 12, 2009, at 8:14 AM, Holly Dail wrote:
>>>>>
>>>>>> Hello all -
>>>>>>
>>>>>> I'd like to use optimization with a regional North Atlantic  
>>>>>> setup.  As a first case, I started with the approach laid out  
>>>>>> in tutorial_global_oce_optim --
>>>>>> - cost based on (1) divergence of annual mean surface  
>>>>>> temperatures in the model from climatology and (2) reasonable  
>>>>>> magnitude of control vector
>>>>>> - control is a time-mean heat flux correction (2-d field)
>>>>>>
>>>>>> My sensitivities are astronomical (i.e. adxx = 10^16), the  
>>>>>> gradient check seems to fail (as shown below, finite difference  
>>>>>> gradients seem okay, adjoint gradients not so much), and  
>>>>>> optim.x fails with message 'the linesearch failed'.
>>>>>>
>>>>>> (PID.TID 0000.0001) grdchk output:   procId         I  ITILEPOS  JTILEPOS     LAYER             X(I)       X(I)+/-EPS
>>>>>> (PID.TID 0000.0001) grdchk output:               FC              FC1              FC2  FC1-FC2/(2*EPS)     ADJ GRAD(FC)    1-FDGRD/ADGRD
>>>>>> (PID.TID 0000.0001) grdchk output:        0         1        56        35         1  0.000000000D+00  -.100000000D+00
>>>>>> (PID.TID 0000.0001) grdchk output:  0.261232434D+02  0.261232444D+02  0.261232340D+02  0.523051129D-04  -.115313924+108  0.100000000D+01
>>>>>>
>>>>>> I suppose this may mean the adjoint is blowing up?  I've tried  
>>>>>> reducing my time step and increasing viscosity and I checked  
>>>>>> that my climatology & error fields are defined at all wet  
>>>>>> points; are there other fixes folks have had success with?   
>>>>>> Also if you have scripts that you use to diagnose your  
>>>>>> optimization runs that would be really appreciated.
>>>>>>
>>>>>> Thanks -
>>>>>> Holly
>>>>>> _______________________________________________
>>>>>> MITgcm-support mailing list
>>>>>> MITgcm-support at mitgcm.org
>>>>>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>
> ---
> Patrick Heimbach | heimbach at mit.edu | http://www.mit.edu/~heimbach
> MIT | EAPS 54-1518 | 77 Massachusetts Ave | Cambridge MA 02139 USA
> FON +1-617-253-5259 | FAX +1-617-253-4464 | SKYPE patrick.heimbach
>
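
The gradient check quoted in the original message can be re-derived by hand, which is a quick way to see which of the two gradients is implausible. The Python sketch below does this with the numbers from that grdchk output. It rests on a few assumptions: that FC1 and FC2 are the cost at X(I) plus and minus EPS, that the perturbation size is the magnitude of the X(I)+/-EPS column (0.1), and that a value printed as -.115313924+108 is a Fortran real whose exponent letter was dropped because the exponent has three digits. The helper names are hypothetical and not part of MITgcm.

```python
import re

# Fortran may print D exponents ("0.26D+02") and drops the exponent letter
# when the exponent needs three digits ("-.115313924+108").
_FORTRAN_REAL = re.compile(
    r"^([+-]?(?:\d+\.?\d*|\.\d+))(?:[DdEe]?([+-]\d+))?$"
)


def fortran_float(token):
    """Convert a Fortran-formatted real such as '0.1D+01' to a Python float."""
    match = _FORTRAN_REAL.match(token.strip())
    if match is None:
        raise ValueError(f"not a Fortran real: {token!r}")
    mantissa, exponent = match.groups()
    return float(f"{mantissa}e{exponent or '+0'}")


def central_difference(fc_plus, fc_minus, eps):
    """Centered finite-difference estimate of the gradient."""
    return (fc_plus - fc_minus) / (2.0 * eps)


def gradient_mismatch(fd_grad, adj_grad):
    """Return 1 - FD/ADJ; values near 0 mean the two gradients agree."""
    return 1.0 - fd_grad / adj_grad


# Values copied from the grdchk output in this thread.
fc1 = fortran_float("0.261232444D+02")
fc2 = fortran_float("0.261232340D+02")
eps = abs(fortran_float("-.100000000D+00"))  # assumed perturbation size
adj_grad = fortran_float("-.115313924+108")

fd_grad = central_difference(fc1, fc2, eps)
print(f"finite-difference gradient: {fd_grad:.3e}")
print(f"adjoint gradient:           {adj_grad:.3e}")
print(f"1 - FD/ADJ:                 {gradient_mismatch(fd_grad, adj_grad):.3f}")
```

With these inputs the finite-difference gradient comes out near 5.2e-5, consistent with the logged value, while the adjoint gradient is of order 1e107, so the mismatch of 1 is driven entirely by the adjoint side.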