[MITgcm-support] tutorial_global_oce_optim optimisation failed

Thu May 24 09:40:04 EDT 2018

Hi All,
I am kind of new to MITgcm, so please let me know if this is the right place to ask.
OpenAD worked for me with halfpipe_streamice, but I couldn't recreate the settings with tutorial_global_oce_optim. I managed to get grdchk to work by adding OPENAD_OPTIONS.h with ALLOW_OPENAD_ACTIVE_READ_XY. The make then failed, so I manually edited OpenAD_active_read_xy in externalDummies_cb2m_oad.f to use active_var as a real array rather than type(active). After that the make succeeded and the output seemed OK.
Can someone with a better understanding of the build system help me understand how externalDummies_cb2m_oad.f is created?
Thanks in advance,
Ron

On 05/03/18 11:46, Martin Losch wrote:

Hi Andrew,

The FD gradient is used for checking the AD gradient in a few (very few!) places. I don’t know why it is zero in your case, but assuming that the AD gradient is correct, you don’t need the FD gradient at all (I would actually strongly recommend turning off the grdchk pkg for any optimization exercise).

Make sure you do a "cvs update" on the optim_m1qn3 directory, because I added a fix for the funny cost function value yesterday.

Martin

On 2. May 2018, at 19:34, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:

Thanks for this.

Just as a sanity check, before I involve optim_m1qn3 again, the output of my ./testreport -t tutorial_global_oce_optim -oad includes

There were 16 decimal places of similarity for "ADM CostFct"
There were 16 decimal places of similarity for "ADM Ad Grad"
There were 0 decimal places of similarity for "ADM FD Grad"

Should I be concerned about this?

E.g. lines 2116-2118 of my output_oadm.txt file are

(PID.TID 0000.0001)  ADM  ref_cost_function      =  6.20023228182329E+00
(PID.TID 0000.0001)  ADM  adjoint_gradient       = -2.69091500991183E-06
(PID.TID 0000.0001)  ADM  finite-diff_grad       =  0.00000000000000E+00

But at least my cost function value is the same:

(PID.TID 0000.0001)   local fc =  0.620023228182329D+01
(PID.TID 0000.0001)  global fc =  0.620023228182329D+01
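The three grdchk lines quoted above can also be compared programmatically. The following is only a sketch: the helper names `parse_adm_lines` and `gradient_mismatch` are made up for illustration, and the regular expression assumes the "ADM  name = value" layout shown in this message rather than any documented output format.

```python
import re

# Sample text copied from the output_oadm.txt lines quoted in this message.
GRDCHK_SAMPLE = """\
(PID.TID 0000.0001)  ADM  ref_cost_function      =  6.20023228182329E+00
(PID.TID 0000.0001)  ADM  adjoint_gradient       = -2.69091500991183E-06
(PID.TID 0000.0001)  ADM  finite-diff_grad       =  0.00000000000000E+00
"""

# Assumed layout: "ADM", a name, "=", then a Fortran-style number (E or D exponent).
_ADM_LINE = re.compile(r"ADM\s+(\S+)\s*=\s*([-+0-9.]+[ED][-+]\d+)")


def parse_adm_lines(text):
    """Return a dict mapping each ADM quantity name to its float value."""
    values = {}
    for name, number in _ADM_LINE.findall(text):
        values[name] = float(number.replace("D", "E"))
    return values


def gradient_mismatch(values):
    """Relative difference |fd - ad| / |ad|, or None if the AD gradient is zero."""
    ad = values["adjoint_gradient"]
    fd = values["finite-diff_grad"]
    if ad == 0.0:
        return None
    return abs(fd - ad) / abs(ad)


adm = parse_adm_lines(GRDCHK_SAMPLE)
print(adm)
print("relative mismatch:", gradient_mismatch(adm))
```

A mismatch of 1.0 simply restates that the finite-difference gradient is exactly zero while the adjoint gradient is not; it does not say which of the two is wrong.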

Andrew

On 2 May 2018 at 10:34, Martin Losch <Martin.Losch at awi.de> wrote:
Hi Andrew,

I won’t be able to help you much with the optim/lsopt code, because I would have to get it running again myself. But I do recommend using the MITgcm_contrib/mlosch/optim_m1qn3 code. It’s not very well documented, but I am attaching a skeleton script to illustrate how to use it. Please give it a try and if you find it useful, I can add this script to the repository.

The two versions of the optimization routine are similar: both implement the same optimization algorithm (BFGS). However, optim_m1qn3 uses a later version of the m1qn3 code, I think it’s easier to compile (only one Makefile), and I believe (but there’s debate about this) that it does the right thing, as opposed to the optim/lsopt variant, which somehow truncates the optimization in each iteration. Having said that, I have used both in parallel, and the reduction of the cost function (which is really all we care about) is sometimes better with the optim_m1qn3 code and sometimes better with the optim/lsopt code. The optim_m1qn3 code is closer to the idea of the original m1qn3 code.
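For readers unfamiliar with the method named above, here is a minimal textbook sketch of dense BFGS with a backtracking line search on a toy quadratic cost. It is not the m1qn3 or optim/lsopt code (m1qn3 is a limited-memory variant); the names `bfgs_minimize` and `toy_cost_and_grad` and the toy cost itself are assumptions made purely for illustration.

```python
def toy_cost_and_grad(x):
    """Hypothetical cost with its minimum at (1, -2); stands in for a model run."""
    f = (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
    g = [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]
    return f, g


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def bfgs_minimize(cost_and_grad, x0, max_iter=100, gtol=1e-8):
    n = len(x0)
    x = list(x0)
    # H approximates the inverse Hessian; start from the identity.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    f, g = cost_and_grad(x)
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < gtol:
            break
        d = [-dot(H[i], g) for i in range(n)]   # quasi-Newton search direction
        slope = dot(d, g)                       # negative for a descent direction
        t = 1.0
        while True:                             # backtracking (Armijo) line search
            x_new = [xi + t * di for xi, di in zip(x, d)]
            f_new, g_new = cost_and_grad(x_new)
            if f_new <= f + 1e-4 * t * slope or t < 1e-12:
                break
            t *= 0.5
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, y)
        # Update H only when the curvature condition s.y > 0 holds safely.
        if sy > 1e-10 * (dot(s, s) * dot(y, y)) ** 0.5:
            rho = 1.0 / sy
            Hy = [dot(H[i], y) for i in range(n)]
            yHy = dot(y, Hy)
            for i in range(n):
                for j in range(n):
                    H[i][j] += (-rho * (s[i] * Hy[j] + Hy[i] * s[j])
                                + (rho * rho * yHy + rho) * s[i] * s[j])
        x, f, g = x_new, f_new, g_new
    return x, f


x_opt, f_opt = bfgs_minimize(toy_cost_and_grad, [0.0, 0.0])
print(x_opt, f_opt)
```

In the offline setup discussed in this thread, the cost and gradient come from a separate mitgcmuv_ad run on each iteration rather than from an in-process function call, so the sketch only mirrors the mathematical idea.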

Let me know if you can use my attached instructions.

Martin

On 1. May 2018, at 00:00, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:

Right, but the cost function is the same value each time, the norm of x is 0 each time, and the norm of g is the same each time.  This suggests nothing is happening.  It's a bit ridiculous that one of the core tutorials simply isn't working out of the box...

I will have a go at debugging.

Andrew

On 30 April 2018 at 22:54, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
Well, you are correct that it's not actually taking a step, because the norm of the control vector is 0:

norm of x................... 0.00000000E+00

meaning the controls are all 0 still.

However the gradients are non-zero

norm of g................... 0.12730927E-01

so the linesearch should step and
ecco_ctrl_MIT_CE_000.opt0001
should not be all zero.

To debug this, you could put a print statement in optim_writedata.F to see what it is writing.

I don’t know enough about this tutorial to be of more help, sorry.

Matt

On Apr 30, 2018, at 2:50 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:

Yes, I did.

On 30 April 2018 at 22:42, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
This is still iteration 0. You have to update data.optim to tell it you are now at iteration 1.
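As a sketch of the update described above, the snippet below increments the iteration counter in the text of a data.optim namelist. It assumes the counter is the `optimcycle` entry; the sample file contents and the helper name `bump_optimcycle` are hypothetical, so check your own data.optim before relying on it.

```python
import re

# Hypothetical data.optim contents, for illustration only.
SAMPLE_DATA_OPTIM = """\
 &OPTIM
 optimcycle=0,
 numiter=1,
 &
"""

# Assumption: the iteration counter is a line of the form "optimcycle = <integer>".
_OPTIMCYCLE = re.compile(r"^(\s*optimcycle\s*=\s*)(\d+)", re.IGNORECASE | re.MULTILINE)


def bump_optimcycle(namelist_text):
    """Return the namelist text with optimcycle incremented by one."""
    def _increment(match):
        return match.group(1) + str(int(match.group(2)) + 1)

    new_text, count = _OPTIMCYCLE.subn(_increment, namelist_text, count=1)
    if count == 0:
        raise ValueError("no optimcycle entry found in namelist text")
    return new_text


print(bump_optimcycle(SAMPLE_DATA_OPTIM))
```

Working on a string rather than editing the file in place makes it easy to inspect the result before writing it back.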

Matt

On Apr 30, 2018, at 2:38 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:

I tried a few steps of this, but the output of optim.x always has

  cost function............... 0.62002323E+01
  norm of x................... 0.00000000E+00
  norm of g................... 0.12730927E-01

near the end, with no decrease in the cost function.  So I guess it's not actually taking the step?
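To track these three numbers across iterations, one could pull them out of the optim.x output with a short script. This is a sketch only: `parse_optim_summary` and `controls_untouched` are hypothetical helper names, and the pattern assumes the dotted "label....... value" layout quoted above.

```python
import re

# Sample text copied from the optim.x output quoted in this message.
OPTIM_SAMPLE = """\
  cost function............... 0.62002323E+01
  norm of x................... 0.00000000E+00
  norm of g................... 0.12730927E-01
"""

# Assumed layout: a known label, a run of dots, then a Fortran-style number.
_SUMMARY_LINE = re.compile(
    r"^\s*(cost function|norm of x|norm of g)\.+\s*([-+0-9.]+[ED][-+]\d+)",
    re.MULTILINE,
)


def parse_optim_summary(text):
    """Return a dict with the last reported value for each summary label."""
    summary = {}
    for label, number in _SUMMARY_LINE.findall(text):
        summary[label] = float(number.replace("D", "E"))
    return summary


def controls_untouched(summary):
    """True when the control vector is still zero although the gradient is not."""
    return summary["norm of x"] == 0.0 and summary["norm of g"] > 0.0


summary = parse_optim_summary(OPTIM_SAMPLE)
print(summary)
print("controls still zero with non-zero gradient:", controls_untouched(summary))
```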

Andrew

On 27 April 2018 at 18:04, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
!!!  Okay...

Yes, it produced the .opt0001 file.  I'll see how this goes.

Thanks,
Andrew

On 27 April 2018 at 17:57, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
Hello

It's been a while, but I am pretty sure that is the normal output. It says "fail", but it did give you a new ecco_ctrl_MIT_CE_000.opt0001 (correct?), and if you unpack it and run again, the cost will likely decrease.

I think it worked correctly. lsopt/optim are just confusing… but I think it's working. I think all is good!

Matt

On Apr 27, 2018, at 8:25 AM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:

Just separating this from the other thread, I got the bundled MITgcm optim routine built (having made these changes, based on this thread from 2010 and this one from 2016).

I use OpenAD to create the adjoint.

My steps are:
1) in the build directory, run ../../../tools/genmake2 -oad -mods=../code_oad
2) run make depend and make adAll
3) copy input_oad/ into a new folder scratch/
4) within scratch/, run ./prepare_run
5) copy mitgcmuv_ad from build/ into scratch/, copy optim.x into scratch/OPTIM/
6) run ./mitgcmuv_ad
7) in scratch/OPTIM, create symlinks to ../data.optim and ../data.ctrl
8) copy the files ecco_cost_MIT_CE_000.opt0000 and ecco_ctrl_MIT_CE_000.opt0000 into the OPTIM subdirectory
9) run ./optim.x within the subdirectory
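The nine steps above can be written down as a dry-run plan that only prints the commands, so the sequence can be checked before anything is executed. This is a sketch under several assumptions: the directory names `build` and `scratch`, the placeholder `path/to/optim.x` (the message does not say where optim.x was built), and the assumption that the .opt0000 files are produced in scratch/ are all illustrative, not taken from the MITgcm documentation.

```python
def optim_run_plan(build_dir="build", scratch_dir="scratch",
                   optim_exe="path/to/optim.x"):
    """Return (working_directory, command) pairs mirroring steps 1-9.

    optim_exe is a hypothetical placeholder: the source of optim.x is
    not specified in the message.
    """
    optim_dir = scratch_dir + "/OPTIM"
    return [
        (build_dir, "../../../tools/genmake2 -oad -mods=../code_oad"),   # step 1
        (build_dir, "make depend"),                                      # step 2
        (build_dir, "make adAll"),                                       # step 2
        (".", "cp -r input_oad " + scratch_dir),                         # step 3
        (scratch_dir, "./prepare_run"),                                  # step 4
        (".", "cp " + build_dir + "/mitgcmuv_ad " + scratch_dir + "/"),  # step 5
        (".", "cp " + optim_exe + " " + optim_dir + "/"),                # step 5
        (scratch_dir, "./mitgcmuv_ad"),                                  # step 6
        (optim_dir, "ln -s ../data.optim ."),                            # step 7
        (optim_dir, "ln -s ../data.ctrl ."),                             # step 7
        # Assumption: the .opt0000 files are written into scratch/ by step 6.
        (scratch_dir, "cp ecco_cost_MIT_CE_000.opt0000 "
                      "ecco_ctrl_MIT_CE_000.opt0000 OPTIM/"),            # step 8
        (optim_dir, "./optim.x"),                                        # step 9
    ]


plan = optim_run_plan()
for workdir, command in plan:
    print("(cd " + workdir + " && " + command + ")")
```

Nothing is run here; each printed line can be pasted into a shell once the paths have been adjusted to the actual experiment layout.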

The full output is attached, but I assume the optimisation failed since the last lines are

  optimization stopped because :
  ifail =   4    the search direction is not a descent one
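For reference, a search direction d is conventionally called a descent direction at a point with gradient g when the dot product d·g is negative. The sketch below shows that textbook test on made-up vectors; it is not taken from the lsopt or m1qn3 sources, and `is_descent_direction` is a hypothetical helper name.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def is_descent_direction(direction, gradient):
    """Textbook test: d is a descent direction when d . g < 0."""
    return dot(direction, gradient) < 0.0


# Made-up gradient; the steepest-descent direction is its negative.
toy_gradient = [0.3, -1.2, 0.5]
steepest_descent = [-g for g in toy_gradient]

print(is_descent_direction(steepest_descent, toy_gradient))  # True
print(is_descent_direction(toy_gradient, toy_gradient))      # False
```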

Any ideas?  (I guess this isn't something that is tested in the daily builds?)

In the meantime, I'll try the m1qn3 routine as in the other thread, which should help distinguish between a problem with the optimisation routine and a problem with the gradient generated by mitgcmuv_ad.

Andrew
<out.txt>
_______________________________________________
MITgcm-support mailing list
MITgcm-support at mitgcm.org
http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support
