[MITgcm-support] tutorial_global_oce_optim optimisation failed
Daniel Goldberg
dngoldberg at gmail.com
Thu May 24 12:12:03 EDT 2018
Hi Ron
The file externalDummies_cb2m_oad.f is created on the fly from several
source files and templates. This is part of a facility for replacing
OpenAD-generated code with user-written code. That is needed, for instance,
when some code would be problematic for OpenAD to compile/transform and the
mathematical adjoint of the operation is easier to write by hand, or to
control "side effects" such as writing adjoint fields to I/O.
externalDummies.F (
https://github.com/MITgcm/MITgcm/blob/master/pkg/openad/externalDummies.F)
is transformed by OpenAD, but the hand-written code then replaces the
corresponding transformed code via the templates in this directory:
https://github.com/MITgcm/MITgcm/tree/master/tools/OAD_support
So the active_read() definition in externalDummies.F serves only to tell
the dependency analysis that certain variables (like active_var) should be
of type "active". I can't say for sure why your build failed or why changing
it to real made it compile. From what I can see, the tutorial does not seem
to be broken:
http://mitgcm.org/testing/results/2018_05/tr_baudelaire-a_20180524_5/summary.txt
Did you have an interest in halfpipe_streamice, or did you just want to see
if the OpenAD experiment would work?
dan
On Thu, May 24, 2018 at 2:40 PM, Ron Goldman <ron at ocean.org.il> wrote:
> Hi All,
> I am kind of new to MITgcm, so please let me know if this is the right place
> to address this.
> OpenAD worked for me with halfpipe_streamice but I couldn't recreate the
> settings with tutorial_global_oce_optim. I managed to get the grdchk to
> work by adding OPENAD_OPTIONS.h with ALLOW_OPENAD_ACTIVE_READ_XY. The make
> failed for me but I manually edited OpenAD_active_read_xy in
> externalDummies_cb2m_oad.f to use active_var as a real array and not
> type(active). The make succeeded after that and the output seemed ok.
> Can someone with a better understanding of the build system help me
> understand how externalDummies_cb2m_oad.f is created?
> Thanks in advance,
> Ron
>
> On 05/03/18 11:46, Martin Losch wrote:
>
> Hi Andrew,
>
> The FD gradient is used for checking the AD gradient in a few (very few!) places. I don’t know why it is zero in your case, but assuming that the AD gradient is correct, you don’t need the FD gradient at all (I would actually strongly recommend turning off the grdchk pkg for any optimization exercise).
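
As an illustration of the kind of comparison a gradient check performs, here is a minimal Python sketch. It is not MITgcm or grdchk code: the toy quadratic cost, the function names and the step size are all assumptions chosen for the example. It compares a hand-coded gradient (standing in for the AD gradient) with a central finite-difference estimate.

```python
def cost(x):
    """Toy cost function (hypothetical): sum of squared control values."""
    return sum(xi * xi for xi in x)


def adjoint_gradient(x):
    """Exact gradient of the toy cost, standing in for the AD gradient."""
    return [2.0 * xi for xi in x]


def fd_gradient(f, x, i, eps=1.0e-4):
    """Central finite-difference estimate of df/dx[i]."""
    xp = list(x)
    xm = list(x)
    xp[i] += eps
    xm[i] -= eps
    return (f(xp) - f(xm)) / (2.0 * eps)


x0 = [0.5, -1.0, 2.0]  # hypothetical control vector
for i in range(len(x0)):
    ad = adjoint_gradient(x0)[i]
    fd = fd_gradient(cost, x0, i)
    print(f"component {i}: adjoint = {ad:.6e}, finite-diff = {fd:.6e}")
```

If the two columns agree, the hand-coded gradient is consistent with the cost function; a finite-difference value of exactly zero next to a non-zero gradient would suggest that the perturbation never reached the cost.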
>
> Make sure you do a “cvs update” on the optim_m1qn3 directory, because I added a fix for the funny cost function value yesterday.
>
> Martin
>
>
> On 2. May 2018, at 19:34, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>
> Thanks for this.
>
> Just as a sanity check, before I involve optim_m1qn3 again, the output of my ./testreport -t tutorial_global_oce_optim -oad includes
>
> There were 16 decimal places of similarity for "ADM CostFct"
> There were 16 decimal places of similarity for "ADM Ad Grad"
> There were 0 decimal places of similarity for "ADM FD Grad"
>
> Should I be concerned about this?
>
> E.g. lines 2116-2118 of my output_oadm.txt file are
>
> (PID.TID 0000.0001) ADM ref_cost_function = 6.20023228182329E+00
> (PID.TID 0000.0001) ADM adjoint_gradient = -2.69091500991183E-06
> (PID.TID 0000.0001) ADM finite-diff_grad = 0.00000000000000E+00
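
For anyone wanting to pull these three values out of an output file automatically, here is a small Python sketch. It assumes only the line format quoted above; the regular expression and the function name are made up for the example, and the check at the end is just one possible sanity test.

```python
import re

# The three lines quoted above from output_oadm.txt.
SAMPLE = """\
(PID.TID 0000.0001) ADM ref_cost_function = 6.20023228182329E+00
(PID.TID 0000.0001) ADM adjoint_gradient = -2.69091500991183E-06
(PID.TID 0000.0001) ADM finite-diff_grad = 0.00000000000000E+00
"""

# Matches "ADM <label> = <number>"; accepts Fortran-style D exponents too.
PATTERN = re.compile(r"ADM\s+(\S+)\s*=\s*([-+0-9.eEdD]+)")


def parse_grdchk(text):
    """Return a dict mapping each ADM label to its float value."""
    values = {}
    for line in text.splitlines():
        m = PATTERN.search(line)
        if m:
            number = m.group(2).replace("D", "E").replace("d", "e")
            values[m.group(1)] = float(number)
    return values


vals = parse_grdchk(SAMPLE)
ad = vals["adjoint_gradient"]
fd = vals["finite-diff_grad"]
if fd == 0.0 and ad != 0.0:
    print("finite-diff gradient is exactly zero; adjoint gradient is", ad)
```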
>
> But at least my cost function value is the same:
>
> (PID.TID 0000.0001) local fc = 0.620023228182329D+01
> (PID.TID 0000.0001) global fc = 0.620023228182329D+01
>
> Andrew
>
> On 2 May 2018 at 10:34, Martin Losch <Martin.Losch at awi.de> wrote:
> Hi Andrew,
>
> I won’t be able to help you much with the optim/lsopt code, because I would have to get it running again myself. But I do recommend using the MITgcm_contrib/mlosch/optim_m1qn3 code. It’s not very well documented, but I am attaching a skeleton script to illustrate how to use it. Please give it a try and if you find it useful, I can add this script to the repository.
>
> The two versions of the optimization routine are similar: both implement the same optimization algorithm (BFGS), but optim_m1qn3 uses a later version of the m1qn3 code. I think it’s easier to compile (only one Makefile), and I believe (but there’s debate about this) that it does the right thing, as opposed to the optim/lsopt variant, which somehow truncates the optimization in each iteration. Having said that, I have used both in parallel, and the reduction of the cost function (which is really all we care about) is sometimes better with the optim_m1qn3 code and sometimes better with the optim/lsopt code. The optim_m1qn3 code is closer to the idea of the original m1qn3 code.
>
> Let me know if you can use my attached instructions.
>
> Martin
>
>
>
>
> On 1. May 2018, at 00:00, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>
> Right, but the cost function is the same value each time, the norm of x is 0 each time, and the norm of g is the same each time. This suggests nothing is happening. It's a bit ridiculous that one of the core tutorials simply isn't working out of the box...
>
> I will have a go at debugging.
>
> Andrew
>
> On 30 April 2018 at 22:54, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
> Well, you are correct that it’s not actually taking a step, because the dot product of the control is 0:
>
> norm of x................... 0.00000000E+00
>
> meaning the controls are all 0 still.
>
> However the gradients are non-zero
>
> norm of g................... 0.12730927E-01
>
> so the linesearch should step and
> ecco_ctrl_MIT_CE_000.opt0001
> should not be all zero.
>
> To debug this, you could put a print statement in optim_writedata.F to see what it is writing.
>
> I don’t know enough about this tutorial to be a bigger help, sorry.
>
> Matt
>
>
>
> On Apr 30, 2018, at 2:50 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>
> Yes, I did.
>
> On 30 April 2018 at 22:42, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
> This is still iteration 0. You have to update data.optim to tell it you are now at iteration 1.
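
A minimal Python sketch of such an update follows. It assumes that the iteration counter in data.optim is a namelist entry called optimcycle inside an &OPTIM group; the thread does not name the parameter, so check your own data.optim before relying on this. The helper name is hypothetical.

```python
import re


def set_optimcycle(namelist_text, cycle):
    """Return namelist_text with the (assumed) optimcycle entry set to cycle."""
    new_text, count = re.subn(
        r"(?im)^(\s*optimcycle\s*=\s*)\d+",
        lambda m: m.group(1) + str(cycle),
        namelist_text,
    )
    if count == 0:
        raise ValueError("no optimcycle entry found")
    return new_text


# Assumed minimal content of data.optim, for illustration only.
example = " &OPTIM\n optimcycle=0,\n /\n"
print(set_optimcycle(example, 1))
```

To apply it to a real file, read data.optim into a string, pass it through set_optimcycle, and write the result back.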
>
> Matt
>
>
>
> On Apr 30, 2018, at 2:38 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>
> I tried a few steps of this, but the output of optim.x always has
>
> cost function............... 0.62002323E+01
> norm of x................... 0.00000000E+00
> norm of g................... 0.12730927E-01
>
> near the end, with no decrease in the cost function. So I guess it's not actually taking the step?
>
> Andrew
>
> On 27 April 2018 at 18:04, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
> !!! Okay...
>
> Yes, it produced the .opt0001 file. I'll see how this goes.
>
> Thanks,
> Andrew
>
> On 27 April 2018 at 17:57, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
> Hello
>
> It’s been a while, but I am pretty sure that is the normal output. It says “fail”, but it did give you a new ecco_ctrl_MIT_CE_000.opt0001 (correct?), and if you unpack and run, the cost will likely descend.
>
> I think it worked correctly. lsopt/optim are just confusing… but I think it’s working. I think all is good!
>
> Matt
>
>
>
>
> On Apr 27, 2018, at 8:25 AM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>
> Just separating this from the other thread, I got the bundled MITgcm optim routine built (having made these changes, based on this thread from 2010 and this one from 2016).
>
> I use OpenAD to create the adjoint.
>
> My steps are:
> 1) in the build directory, run ../../../tools/genmake2 -oad -mods=../code_oad
> 2) run make depend and make adAll
> 3) copy input_oad/ into a new folder scratch/
> 4) within scratch/, run ./prepare_run
> 5) copy mitgcmuv_ad from build/ into scratch/, copy optim.x into scratch/OPTIM/
> 6) run ./mitgcmuv_ad
> 7) in scratch/OPTIM, create symlinks to ../data.optim and ../data.ctrl
> 8) copy the files ecco_cost_MIT_CE_000.opt0000 and ecco_ctrl_MIT_CE_000.opt0000 into the OPTIM subdirectory
> 9) run ./optim.x within the subdirectory
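
The nine steps above can be sketched as a small Python driver. This is only an outline under stated assumptions: build/, code_oad/, input_oad/ and scratch/ are taken to be siblings inside the experiment directory, scratch/OPTIM is taken to exist by the time it is used, the ecco_* files are taken to be written into scratch/ by mitgcmuv_ad, and the location of optim.x is not given in the thread, so it is left as a placeholder. By default the script only prints the plan.

```python
import subprocess

# The thread does not say where optim.x is built; fill this in yourself.
OPTIM_X = "<path-to>/optim.x"  # hypothetical placeholder

# (working directory relative to the experiment directory, command)
STEPS = [
    ("build", "../../../tools/genmake2 -oad -mods=../code_oad"),   # step 1
    ("build", "make depend"),                                      # step 2
    ("build", "make adAll"),                                       # step 2
    (".", "cp -r input_oad scratch"),                              # step 3
    ("scratch", "./prepare_run"),                                  # step 4
    ("scratch", "cp ../build/mitgcmuv_ad ."),                      # step 5
    ("scratch", f"cp {OPTIM_X} OPTIM/"),                           # step 5
    ("scratch", "./mitgcmuv_ad"),                                  # step 6
    ("scratch/OPTIM", "ln -s ../data.optim ../data.ctrl ."),       # step 7
    ("scratch/OPTIM",
     "cp ../ecco_cost_MIT_CE_000.opt0000 "
     "../ecco_ctrl_MIT_CE_000.opt0000 ."),                         # step 8
    ("scratch/OPTIM", "./optim.x"),                                # step 9
]


def run_steps(steps, dry_run=True):
    """Print each step; execute it as well when dry_run is False."""
    for cwd, cmd in steps:
        print(f"[{cwd}] {cmd}")
        if not dry_run:
            subprocess.run(cmd, shell=True, cwd=cwd, check=True)


run_steps(STEPS)  # dry run: only prints the plan
```

Calling run_steps(STEPS, dry_run=False) from the experiment directory would execute the commands and stop at the first failure.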
>
> The full output is attached, but I assume the optimisation failed since the last lines are
>
> optimization stopped because :
> ifail = 4 the search direction is not a descent one
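
For reference, the usual meaning of "descent direction" in a line search is that the directional derivative g.d is negative, so that a small step along d reduces the cost. The Python sketch below illustrates that condition only; it is not the test used inside lsopt or m1qn3, and the vectors are made up.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def is_descent_direction(gradient, direction):
    """True if the directional derivative g.d is strictly negative."""
    return dot(gradient, direction) < 0.0


g = [0.3, -0.1, 0.2]            # hypothetical gradient
steepest = [-gi for gi in g]    # steepest descent direction: d = -g

print(is_descent_direction(g, steepest))  # True
print(is_descent_direction(g, g))         # False
```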
>
> Any ideas? (I guess this isn't something that is tested in the daily builds?)
>
> In the meantime, I'll try the m1qn3 routine as in the other thread, which should help distinguish between a problem with the optimisation routine or the gradient generated by mitgcmuv_ad.
>
> Andrew
> <out.txt>
>
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support
>
>
--
Daniel Goldberg, PhD
Lecturer in Glaciology
School of Geosciences, University of Edinburgh
Geography Building, Drummond Street, Edinburgh EH8 9XP
em: dan.goldberg at ed.ac.uk
web: https://www.geos.ed.ac.uk/homes/dgoldber