[MITgcm-support] tutorial_global_oce_optim optimisation failed

Wed May 2 13:34:42 EDT 2018

Thanks for this.

Just as a sanity check, before I involve optim_m1qn3 again, the output of
my ./testreport -t tutorial_global_oce_optim -oad includes

There were 16 decimal places of similarity for "ADM CostFct"
There were 16 decimal places of similarity for "ADM Ad Grad"
There were 0 decimal places of similarity for "ADM FD Grad"

Should I be concerned about this?

E.g. lines 2116-2118 of my output_oadm.txt file are

(PID.TID 0000.0001)  ADM  ref_cost_function      =  6.20023228182329E+00
(PID.TID 0000.0001)  ADM  adjoint_gradient       = -2.69091500991183E-06
(PID.TID 0000.0001)  ADM  finite-diff_grad       =  0.00000000000000E+00

But at least my cost function value is the same:

(PID.TID 0000.0001)   local fc =  0.620023228182329D+01
(PID.TID 0000.0001)  global fc =  0.620023228182329D+01
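
For anyone repeating this check, here is a minimal sketch that pulls these numbers out of the log and compares them. It assumes the line layouts quoted above; the helper names fortran_float and parse_log are made up for illustration and are not part of MITgcm or testreport.

```python
import re

# Sample lines copied from the output quoted above.
SAMPLE = """\
(PID.TID 0000.0001)  ADM  ref_cost_function      =  6.20023228182329E+00
(PID.TID 0000.0001)  ADM  adjoint_gradient       = -2.69091500991183E-06
(PID.TID 0000.0001)  ADM  finite-diff_grad       =  0.00000000000000E+00
(PID.TID 0000.0001)   local fc =  0.620023228182329D+01
(PID.TID 0000.0001)  global fc =  0.620023228182329D+01
"""

# A number with an E or D exponent, as printed by Fortran.
NUM = r"([-+]?\d+\.\d+[EeDd][-+]?\d+)"
ADM_RE = re.compile(r"ADM\s+(\S+)\s*=\s*" + NUM)
FC_RE = re.compile(r"(local|global) fc\s*=\s*" + NUM)


def fortran_float(text):
    """Convert Fortran-style output such as 0.62D+01 to a Python float."""
    return float(text.replace("D", "E").replace("d", "e"))


def parse_log(text):
    """Return {name: value} for the ADM and fc lines (last occurrence wins)."""
    values = {}
    for line in text.splitlines():
        m = ADM_RE.search(line)
        if m:
            values[m.group(1)] = fortran_float(m.group(2))
            continue
        m = FC_RE.search(line)
        if m:
            values[m.group(1) + " fc"] = fortran_float(m.group(2))
    return values


vals = parse_log(SAMPLE)

# Suspicious case: non-zero adjoint gradient but zero finite-difference gradient.
fd_is_zero = vals["finite-diff_grad"] == 0.0 and vals["adjoint_gradient"] != 0.0

# The cost function reported by the gradient check should match the global fc.
fc_matches = abs(vals["global fc"] - vals["ref_cost_function"]) < 1e-12

print(vals)
print("FD gradient is zero while adjoint gradient is not:", fd_is_zero)
print("global fc matches ref_cost_function:", fc_matches)
```

To run it on a real log, replace SAMPLE with the contents of output_oadm.txt. The D-to-E replacement is needed because Python's float() does not accept the Fortran double-precision exponent marker.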

Andrew

On 2 May 2018 at 10:34, Martin Losch <Martin.Losch at awi.de> wrote:

> Hi Andrew,
>
> I won’t be able to help you much with the optim/lsopt code, because I
> would have to get it running again myself. But I do recommend using the
> MITgcm_contrib/mlosch/optim_m1qn3 code. It’s not very well documented,
> but I am attaching a skeleton script to illustrate how to use it. Please
> give it a try and if you find it useful, I can add this script to the
> repository.
>
> The two versions of the optimization routine are similar: both implement
> the same optimization algorithm (BFGS), but optim_m1qn3 uses a later
> version of the m1qn3 code. I think it’s easier to compile (only one
> Makefile), and I believe (though there’s debate about this) that it does
> the right thing, unlike the optim/lsopt variant, which somehow truncates
> the optimization in each iteration. Having said that, I have used both in
> parallel, and the reduction of the cost function (which is really all we
> care about) is sometimes better with optim_m1qn3 and sometimes better
> with optim/lsopt. The optim_m1qn3 code is closer to the idea of the
> original m1qn3 code.
>
> Let me know if you can use my attached instructions.
>
> Martin
>
>
>
> > On 1. May 2018, at 00:00, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> >
> > Right, but the cost function is the same value each time, the norm of x
> > is 0 each time, and the norm of g is the same each time.  This suggests
> > nothing is happening.  It's a bit ridiculous that one of the core tutorials
> > simply isn't working out of the box...
> >
> > I will have a go at debugging.
> >
> > Andrew
> >
> > On 30 April 2018 at 22:54, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
> > Well, you are correct that it’s not actually taking a step, because the
> > norm of the control vector is 0:
> >>> norm of x................... 0.00000000E+00
> > meaning the controls are all still 0.
> >
> > However the gradients are non-zero
> >>> norm of g................... 0.12730927E-01
> > so the linesearch should step and
> > ecco_ctrl_MIT_CE_000.opt0001
> > should not be all zero.
> >
> > To debug this, you could put a print statement in optim_writedata.F to
> > see what it is writing.
> >
> > I don’t know enough about this tutorial to be of more help, sorry.
> >
> > Matt
> >
> >
> >> On Apr 30, 2018, at 2:50 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> >>
> >> Yes, I did.
> >>
> >> On 30 April 2018 at 22:42, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
> >> This is still iteration 0. You have to update data.optim to tell it you
> >> are now at iteration 1.
> >>
> >> Matt
> >>
> >>
> >>> On Apr 30, 2018, at 2:38 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> >>>
> >>> I tried a few steps of this, but the output of optim.x always has
> >>>
> >>>   cost function............... 0.62002323E+01
> >>>   norm of x................... 0.00000000E+00
> >>>   norm of g................... 0.12730927E-01
> >>>
> >>> near the end, with no decrease in the cost function.  So I guess it's
> >>> not actually taking the step?
> >>>
> >>> Andrew
> >>>
> >>> On 27 April 2018 at 18:04, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> >>> !!!  Okay...
> >>>
> >>> Yes, it produced the .opt0001 file.  I'll see how this goes.
> >>>
> >>> Thanks,
> >>> Andrew
> >>>
> >>> On 27 April 2018 at 17:57, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
> >>> Hello
> >>>
> >>> It’s been a while, but I am pretty sure that is the normal output. It
> >>> says “fail”, but it did give you a new ecco_ctrl_MIT_CE_000.opt0001
> >>> (correct?), and if you unpack and run, the cost will likely descend.
> >>>
> >>> I think it worked correctly. lsopt/optim are just confusing… but I
> >>> think it’s working. I think all is good!
> >>>
> >>> Matt
> >>>
> >>>
> >>>
> >>>> On Apr 27, 2018, at 8:25 AM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> >>>>
> >>>> Just separating this from the other thread, I got the bundled MITgcm
> >>>> optim routine built (having made these changes, based on this thread from
> >>>> 2010 and this one from 2016).
> >>>>
> >>>> I use OpenAD to create the adjoint.
> >>>>
> >>>> My steps are:
> >>>> 1) in the build directory, run ../../../tools/genmake2 -oad -mods=../code_oad
> >>>> 2) run make depend and make adAll
> >>>> 3) copy input_oad/ into a new folder scratch/
> >>>> 4) within scratch/, run ./prepare_run
> >>>> 5) copy mitgcmuv_ad from build/ into scratch/, copy optim.x into scratch/OPTIM/
> >>>> 6) run ./mitgcmuv_ad
> >>>> 7) in scratch/OPTIM, create symlinks to ../data.optim and ../data.ctrl
> >>>> 8) copy the files ecco_cost_MIT_CE_000.opt0000 and ecco_ctrl_MIT_CE_000.opt0000 into the OPTIM subdirectory
> >>>> 9) run ./optim.x within the subdirectory
> >>>>
> >>>> The full output is attached, but I assume the optimisation failed
> >>>> since the last lines are
> >>>>
> >>>>   optimization stopped because :
> >>>>   ifail =   4    the search direction is not a descent one
> >>>>
> >>>> Any ideas?  (I guess this isn't something that is tested in the daily
> >>>> builds?)
> >>>>
> >>>> In the meantime, I'll try the m1qn3 routine as in the other thread,
> >>>> which should help distinguish between a problem with the optimisation
> >>>> routine and a problem with the gradient generated by mitgcmuv_ad.
> >>>>
> >>>> Andrew
> >>>> <out.txt>
> >>>> _______________________________________________
> >>>> MITgcm-support mailing list
> >>>> MITgcm-support at mitgcm.org
> >>>> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support