[MITgcm-support] optim_m1qn3 line search getting stuck

Martin Losch Martin.Losch at awi.de
Tue Feb 12 11:07:25 EST 2019


Hi Andrew,
this usually means that there is something wrong with the gradient that mitgcm_ad computes. I am not sure that modifying the search algorithm will help you; I would try cold restarts instead.
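
If you want to test the gradient directly, a finite-difference check along a few random directions usually exposes the problem. Here is a rough Python sketch (cost and grad stand in for running your forward and adjoint models; they are placeholders, not MITgcm routines):

  import numpy as np

  def check_gradient(cost, grad, x0, eps=1.0e-4, n_dirs=5, seed=0):
      # Compare the adjoint gradient with central finite differences
      # along a few random unit directions through the control vector x0.
      rng = np.random.default_rng(seed)
      g = grad(x0)
      for _ in range(n_dirs):
          d = rng.standard_normal(x0.shape)
          d /= np.linalg.norm(d)
          fd = (cost(x0 + eps * d) - cost(x0 - eps * d)) / (2.0 * eps)
          ad = np.vdot(g, d)
          print("fd: %12.5e  adjoint: %12.5e  rel. err: %8.1e"
                % (fd, ad, abs(fd - ad) / max(abs(fd), 1.0e-30)))

If the two columns do not agree to several digits for a sensible eps, then the gradient, and not the line search, is the culprit.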

m1qn3 has been around for a long time and is most likely correct. That doesn't mean that my adaptation of it is correct, let alone the “driver” code, but again, the most likely problem is the MITgcm gradient being inaccurate and pointing in the wrong direction.

Martin

> On 12. Feb 2019, at 13:02, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> 
> This is using the optim_m1qn3 package from mitgcm_contrib.
> 
> Quite often, the algorithm runs a few steps, then gets stuck in a line search.  E.g., after 3 good iterations, I get
> 
>  m1qn3: iter 4, simul 4, f= 3.25818816D+00, h'(0)=-2.84510D+00
> 
>  m1qn3: line search
> 
>      mlis3       fpn=-2.845D+00 d2= 9.74D+01  tmin= 5.72D-07 tmax= 1.00D+20
>      mlis3                                      1.000D+00  8.992D-01  1.438D+00
>      mlis3                                      2.468D-01  1.155D-01  6.129D-01
>      mlis3                                      2.468D-03  7.656D-04  2.227D+02
>      mlis3                                      2.345D-03  7.240D-04  1.019D+01
>      mlis3                                      1.759D-03  5.496D-04  2.970D+00
>      mlis3                                      1.231D-03  3.855D-04 -3.000D+00
>      mlis3                                      8.618D-04  2.713D-04  3.105D-01
>      mlis3                                      6.032D-04  1.894D-04  2.202D-01
>      mlis3                                      4.223D-04  1.341D-04  2.223D+00
>      mlis3                                      2.956D-04  9.450D-05  3.954D-01
>      mlis3                                      2.069D-04  6.755D-05  3.121D-01
>      mlis3                                      6.207D-05  1.948D-05  3.642D-01
> 
> This is the included tutorial_global_oce_optim experiment, run for 1 year, with mult_hflux_tut set to 0.2 rather than 2.
> 
> To be honest, I'm not even sure which quantities are being printed, but I suspect one of the first two numbers is the step-size multiplier. That is, the algorithm is trying smaller and smaller steps, but these are still being rejected.
> 
> I'm about to dive in with gdb and see what's going on, but my hypothesis is that the second Wolfe test (the curvature condition) is being violated. Roughly speaking, this forces the gradient to decrease by 10% at each accepted step (at least, the component of the gradient along the descent direction). That makes sense once the algorithm is in the basin near the minimizer of the cost function, but there is no a priori reason for it to hold further away.
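> 
> For concreteness, here is roughly what I understand the two Wolfe tests to be, as a Python sketch (the constants are illustrative defaults, not necessarily the values hard-wired in m1qn3):
> 
>   M1 = 1.0e-4   # Armijo (sufficient-decrease) parameter
>   M2 = 0.9      # curvature parameter: the slope must flatten by ~10%
> 
>   def wolfe_ok(f0, g0d, ft, gtd, t):
>       # f0 = h(0), g0d = h'(0) < 0 for a descent direction;
>       # ft = h(t), gtd = h'(t) along the same direction.
>       armijo    = ft <= f0 + M1 * t * g0d   # enough decrease in f
>       curvature = gtd >= M2 * g0d           # the second Wolfe test
>       return armijo and curvature
> 
> Allowing some gradient steepening would amount to loosening M2 here.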
> 
> Is anything likely to go wrong if I modify this check to allow (some) gradient steepening? I guess I'll find out...


