[MITgcm-support] tutorial_global_oce_optim optimisation failed

Andrew McRae andrew.mcrae at physics.ox.ac.uk
Thu Jun 21 17:52:54 EDT 2018


By optimcycle8, the cost function decreases to

(PID.TID 0000.0001)   local fc =  0.272541476364971D+01

Hopefully this matches other people's results?  (Only just below 50% of the
initial cost function, so not a huge decrease overall, but it stagnated
once it got below 3)
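
For reference, the ratio can be checked with a few lines of Python.  This is
only a sketch: it assumes the initial cost is the 1-year value 0.60514949E+01
that optim.x reported further down this thread, and fortran_to_float is a
hypothetical helper of mine, not part of MITgcm.

```python
def fortran_to_float(text):
    """Convert a Fortran-formatted real such as '0.27D+01' to a Python float."""
    return float(text.strip().upper().replace("D", "E"))


# Assumption: initial cost taken from the optim.x output of the 1-year run.
fc_initial = fortran_to_float("0.60514949E+01")
# Cost reported by mitgcmuv_ad at optimcycle 8.
fc_cycle8 = fortran_to_float("0.272541476364971D+01")

ratio = fc_cycle8 / fc_initial

if __name__ == "__main__":
    print(f"fc(optimcycle 8) / fc(initial) = {ratio:.3f}")
```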

At optimcycle9, mitgcmuv_ad starts spewing out NaN :-\

(PID.TID 0000.0001) // =======================================================
(PID.TID 0000.0001) // Model current state
(PID.TID 0000.0001) // =======================================================
(PID.TID 0000.0001)
 EXTERNAL_FIELDS_LOAD, it=         0 : Reading new data, i0,i1=   12    1 (prev=   12    0 )
(PID.TID 0000.0001) SOLVE_FOR_PRESSURE: putPmEinXvector =    F
 cg2d: Sum(rhs),rhsMax =   7.77156117237610E-15  2.54952604282385E+00
 cg2d: Sum(rhs),rhsMax =   1.11022302462516E-16  2.39842634121661E+08
 cg2d: Sum(rhs),rhsMax =   2.77555756156289E-16  1.08361659495383E+10
 cg2d: Sum(rhs),rhsMax =  -8.32667268468867E-16  2.07131315985965E+35
 cg2d: Sum(rhs),rhsMax =  -1.66533453693773E-16  5.11065205331300E+80
 cg2d: Sum(rhs),rhsMax =   5.55111512312578E-17  5.12555505568672+242
 cg2d: Sum(rhs),rhsMax =                    NaN  0.00000000000000E+00
 cg2d: Sum(rhs),rhsMax =                    NaN  0.00000000000000E+00
 cg2d: Sum(rhs),rhsMax =                    NaN  0.00000000000000E+00

Expected?

Andrew
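
To pin down the step at which rhsMax starts to run away, the cg2d lines can be
scanned with a short Python script.  This is a minimal sketch, assuming the
STDOUT format shown above; parse_fortran_float, scan_cg2d, first_bad_step and
the 1e30 threshold are illustrative choices of mine, not MITgcm tools.  Note
that Fortran drops the 'E' when the exponent needs three digits
(5.12555505568672+242), so a plain float() call fails on that line.

```python
import math
import re

# Matches the solver diagnostics printed once per time step.
CG2D_LINE = re.compile(r"cg2d: Sum\(rhs\),rhsMax =\s+(\S+)\s+(\S+)")
# A mantissa directly followed by a signed exponent, i.e. the 'E' was dropped.
MISSING_E = re.compile(r"([-+]?\d*\.?\d+)([-+]\d+)")

SAMPLE = """\
 cg2d: Sum(rhs),rhsMax =   7.77156117237610E-15  2.54952604282385E+00
 cg2d: Sum(rhs),rhsMax =   1.11022302462516E-16  2.39842634121661E+08
 cg2d: Sum(rhs),rhsMax =   2.77555756156289E-16  1.08361659495383E+10
 cg2d: Sum(rhs),rhsMax =  -8.32667268468867E-16  2.07131315985965E+35
 cg2d: Sum(rhs),rhsMax =  -1.66533453693773E-16  5.11065205331300E+80
 cg2d: Sum(rhs),rhsMax =   5.55111512312578E-17  5.12555505568672+242
 cg2d: Sum(rhs),rhsMax =                    NaN  0.00000000000000E+00
""".splitlines()


def parse_fortran_float(token):
    """Convert a Fortran-formatted real (E, D, or missing exponent letter)."""
    text = token.strip().upper().replace("D", "E")
    if text == "NAN":
        return math.nan
    match = MISSING_E.fullmatch(text)
    if match:
        # Re-insert the exponent letter that Fortran left out.
        text = match.group(1) + "E" + match.group(2)
    return float(text)


def scan_cg2d(lines):
    """Return one (sum_rhs, rhs_max) pair per cg2d line, in file order."""
    pairs = []
    for line in lines:
        match = CG2D_LINE.search(line)
        if match:
            pairs.append((parse_fortran_float(match.group(1)),
                          parse_fortran_float(match.group(2))))
    return pairs


def first_bad_step(pairs, limit=1e30):
    """Index of the first cg2d line that is non-finite or exceeds `limit`.

    `limit` is an arbitrary illustrative threshold; returns None if no line
    qualifies.
    """
    for index, (sum_rhs, rhs_max) in enumerate(pairs):
        if not (math.isfinite(sum_rhs) and math.isfinite(rhs_max)):
            return index
        if abs(rhs_max) > limit:
            return index
    return None


if __name__ == "__main__":
    steps = scan_cg2d(SAMPLE)
    print("first suspicious cg2d line:", first_bad_step(steps))
    print("first non-finite cg2d line:", first_bad_step(steps, limit=math.inf))
```

In practice the SAMPLE list would be replaced by the lines of the run's
STDOUT.0000 file.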

On 21 June 2018 at 18:02, Andrew McRae <andrew.mcrae at physics.ox.ac.uk>
wrote:

> Ron,
>
> Fantastic, removing the %vs and %ds seems to work, and I get
> sensible-looking decreases of the cost function once I switch the tutorial
> to integrate for a year.
>
> Thanks,
> Andrew
>
> On 21 June 2018 at 11:31, Andrew McRae <andrew.mcrae at physics.ox.ac.uk>
> wrote:
>
>> Okay, thanks, I'll give this a try.
>>
>> I read your earlier email more closely and realised this was exactly the
>> problem I had a few weeks later!  I should read more carefully...
>>
>> "I am still not sure what makes openAD decide if active_var is
>> type(active) or real." -- abstractly, any variable that is both dependent
>> on the independent variable xx_hfluxm
>>
>>
>>
>> # ifdef ALLOW_HFLUXM_CONTROL
>> c$openad INDEPENDENT(xx_hfluxm)
>> # endif
>>
>> and is a dependency of the dependent variable fc
>>
>>
>>
>>
>> # ifdef ALLOW_OPENAD
>> c$openad DEPENDENT(fc)
>> # endif /* ALLOW_OPENAD */
>>
>> should be turned into type(active).
>>
>> Andrew
>>
>> On 21 June 2018 at 10:15, Ron Goldman <ron at ocean.org.il> wrote:
>>
>>> Hi Andrew,
>>> It compiled, and grdchk returned output that matched the finite
>>> difference. I recall that optim reduced the norm by little but I don't
>>> recall if the change in OPENAD_OPTIONS.h was needed for that.
>>> Ron
>>>
>>>
>>> On 06/21/18 10:22, Andrew McRae wrote:
>>>
>>> Hi Ron,
>>>
>>> "It worked" = it compiled, or it compiled + everything now seems to work
>>> (including the optimization)?
>>>
>>> Andrew
>>>
>>> On 21 June 2018 at 05:57, Ron Goldman <ron at ocean.org.il> wrote:
>>>
>>>> Hi Andrew,
>>>> I've been having the same issue. It worked when I changed the code by
>>>> dropping the %v %d.
>>>> Changing tools/OAD_support/ad_template.active_read_xy.F will propagate
>>>> the changes to externalDummies_cb2m_oad.f.
>>>> I am still not sure what makes openAD decide if active_var is
>>>> type(active) or real.
>>>> Best regards,
>>>> Ron
>>>>
>>>>
>>>> On 06/20/18 20:28, Andrew McRae wrote:
>>>>
>>>> Damn.  After doing this, the gradient written into ecco_cost seems to
>>>> be all 0.0.  Help?
>>>>
>>>> Andrew
>>>>
>>>> On 19 June 2018 at 15:37, Andrew McRae <andrew.mcrae at physics.ox.ac.uk>
>>>> wrote:
>>>>
>>>>> Okay, I have
>>>>> 1) copied OPENAD_OPTIONS.h from pkg/openad to the code_oad/ subfolder
>>>>> of the tutorial, changing it to define ALLOW_OPENAD_ACTIVE_READ_XY
>>>>>
>>>>> Good news: the main body of tools/OAD_support/ad_template.active_read_xy.F
>>>>> (which is wrapped in #ifdef ALLOW_OPENAD_ACTIVE_READ_XY) now appears in
>>>>> externalDummies_cb2m_oad.f
>>>>>
>>>>> Bad news: this gives a compile error in externalDummies_cb2m_oad.f of
>>>>> "Error: Unexpected '%' for nonderived-type variable 'active_var'".  This
>>>>> seems to be because active_var is declared as a REAL(w2f__8) in
>>>>> externalDummies_cb2m_oad.f, not a type(active).  The lines of code
>>>>> corresponding to
>>>>>
>>>>>       active_var = dummy + active_var
>>>>>       dummy = active_var(1,1,1,1) + dummy
>>>>>
>>>>> don't appear in the post-processed code [optimized out by the OpenAD
>>>>> toolchain, or something else?], which is probably why active_var doesn't
>>>>> become an active variable.  Therefore, I....
>>>>>
>>>>> 2) change the type of active_var to type(active) in the post-processed
>>>>> file (yuck).  make adAll continues from where it left off, and mitgcmuv_ad
>>>>> now compiles :)
>>>>>
>>>>> (I tried changing the type of this variable in
>>>>> pkg/openad/externalDummies.F
>>>>> <https://github.com/MITgcm/MITgcm/blob/master/pkg/openad/externalDummies.F#L285>,
>>>>> but this leads to a bork in the OpenAD toolchain)
>>>>>
>>>>> I can confirm the cost function changes from iteration to iteration,
>>>>> and I'll now test if the optimization works.  Hopefully you can find a more
>>>>> permanent solution to the above.
>>>>>
>>>>> Andrew
>>>>>
>>>>> On 19 June 2018 at 13:43, Andrew McRae <andrew.mcrae at physics.ox.ac.uk>
>>>>> wrote:
>>>>>
>>>>>> The active_read_xy routine used in OpenAD mode looks suspicious:
>>>>>> https://github.com/MITgcm/MITgcm/blob/master/pkg/openad/externalDummies.F#L269-L296
>>>>>>
>>>>>> 1) ALLOW_OPENAD_ACTIVE_READ_XY isn't defined for
>>>>>> tutorial_global_oce_optim; I guess it should be?
>>>>>>
>>>>>> 2) This routine seems to be basically a no-op anyway?  I guess
>>>>>> active_var_file should be read into active_var, or similar?
>>>>>>
>>>>>> Andrew
>>>>>>
>>>>>> On 18 June 2018 at 18:04, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>
>>>>>>> Not sure if you've had a chance to look at this yet... the only time
>>>>>>> I can see tmpfld2d being written to (and not just initialised to 0.0 or
>>>>>>> 1.0) is in pkg/admtlm/bypassad.F line 96.  Presumably that package isn't
>>>>>>> switched on here.  I can't see xx_hfluxm being written to at all.
>>>>>>>
>>>>>>> A few lines above, active_read_xy is called with xx_hfluxm_dummy as
>>>>>>> the last argument... should this have been xx_hfluxm, perhaps?
>>>>>>> (xx_hfluxm_dummy is a single variable, while xx_hfluxm is an array, so this
>>>>>>> probably won't work as-is...)
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>> On 13 June 2018 at 23:18, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>
>>>>>>>> Okay, thank you.  If you do have any advice on debugging this, do
>>>>>>>> say.  I guess you already got as far as spotting that all the terms on the
>>>>>>>> RHS of https://github.com/MITgcm/MITgcm/blob/master/pkg/ctrl/ctrl_map_forcing.F#L259
>>>>>>>> are zero.
>>>>>>>>
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>> On 13 June 2018 at 21:36, Patrick Heimbach <heimbach at mit.edu>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Andrew,
>>>>>>>>>
>>>>>>>>> I have not been able to look into this due to various other
>>>>>>>>> commitments over the last couple of months.
>>>>>>>>>
>>>>>>>>> I'll be grounded for a while in Austin starting next week, and
>>>>>>>>> this will be near the top of my ToDo list.
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>> > On Jun 13, 2018, at 12:56 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >
>>>>>>>>> > MITgcm built with OpenAD is not making use of the ecco_ctrl
>>>>>>>>> > files for optimcycle >= 1.  The file apparently gets read in, but the
>>>>>>>>> > contents get dropped on the floor somewhere.
>>>>>>>>> >
>>>>>>>>> > Andrew
>>>>>>>>> >
>>>>>>>>> > On 13 June 2018 at 18:51, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>>>>>>>>> > Hello
>>>>>>>>> >
>>>>>>>>> > Sorry, I lost track. What needs to be debugged? Can you please
>>>>>>>>> > reiterate the problem?
>>>>>>>>> >
>>>>>>>>> > Thanks
>>>>>>>>> > Matt
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >> On Jun 13, 2018, at 10:14 AM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >>
>>>>>>>>> >> Hi Patrick,
>>>>>>>>> >>
>>>>>>>>> >> Were you able to make any progress with this?  If not, do you
>>>>>>>>> >> have any advice on debugging this?  (I'm getting lost in ctrl_unpack as to
>>>>>>>>> >> which variable the control vector is even read into)
>>>>>>>>> >>
>>>>>>>>> >> Thanks,
>>>>>>>>> >> Andrew
>>>>>>>>> >>
>>>>>>>>> >> On 5 May 2018 at 20:12, Patrick Heimbach <heimbach at mit.edu> wrote:
>>>>>>>>> >> A quick update:
>>>>>>>>> >>
>>>>>>>>> >> This tutorial works as advertised (in the manual), but not as "hoped".
>>>>>>>>> >> What I mean is that it has been developed and only ever fully
>>>>>>>>> >> tested and used in optimization mode with TAF-generated code (and that's
>>>>>>>>> >> what's documented in the manual).
>>>>>>>>> >>
>>>>>>>>> >> Of course, it should not make a difference whether we use
>>>>>>>>> >> TAF vs. OpenAD as long as gradients are correct. But as it turns out, with
>>>>>>>>> >> the OpenAD code there appears to be a little glitch. Gradient seems
>>>>>>>>> >> correct, and iteration 1 update is properly read in, but then not used
>>>>>>>>> >> (instead it is reset to zero). Oh well. I'll need to check where that
>>>>>>>>> >> happens, so stay tuned.
>>>>>>>>> >>
>>>>>>>>> >> p.
>>>>>>>>> >>
>>>>>>>>> >> > On May 4, 2018, at 10:11 AM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> > And, still no luck(?)
>>>>>>>>> >> >
>>>>>>>>> >> > Running for a year (switching the commented and uncommented
>>>>>>>>> >> > nTimeSteps and lastinterval declarations in data and data.cost), optim.x
>>>>>>>>> >> > (lsopt+optim, not optim_m1qn3) now gives the output
>>>>>>>>> >> >
>>>>>>>>> >> >   cost function............... 0.60514949E+01
>>>>>>>>> >> >   norm of x................... 0.00000000E+00
>>>>>>>>> >> >   norm of g................... 0.23235517E+00
>>>>>>>>> >> >
>>>>>>>>> >> >   optimization stopped because :
>>>>>>>>> >> >   ifail =   4    the search direction is not a descent one
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > On 4 May 2018 at 13:58, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > On 4 May 2018 at 06:04, Patrick Heimbach <heimbach at mit.edu> wrote:
>>>>>>>>> >> > Hi Matt,
>>>>>>>>> >> >
>>>>>>>>> >> > as you indicated, all is still good, and I suspect the same
>>>>>>>>> >> > you did regarding what might be at issue.
>>>>>>>>> >> >
>>>>>>>>> >> > I just downloaded latest MITgcm, re-ran adjoint, and
>>>>>>>>> >> > conducted 2 iterations (using lsopt).
>>>>>>>>> >> >
>>>>>>>>> >> > It still works "out of the box" ... if one realizes that a
>>>>>>>>> >> > manual is part of that "box", and section 3.18 (old manual prior to
>>>>>>>>> >> > readthedocs) has some description of this tutorial, thanks to dfer
>>>>>>>>> >> > (admittedly somewhat out of date, but still mostly relevant). In particular
>>>>>>>>> >> > it says there that the optimization has been conducted for a 1-year
>>>>>>>>> >> > simulation.
>>>>>>>>> >> >
>>>>>>>>> >> > Okay, thanks.  I interpreted the manual footnote as "running
>>>>>>>>> >> > a 1-year simulation will reproduce the scientifically-interesting graphs in
>>>>>>>>> >> > the manual", not as "the default parameters are only useful for verifying
>>>>>>>>> >> > correctness of the adjoint, but will break the optimisation routine".  I'll
>>>>>>>>> >> > see if I have more success with the longer run.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > Since we do not want to conduct 1-year integrations for *any*
>>>>>>>>> >> > of the tutorials within our regression tests (these tests consist of 90
>>>>>>>>> >> > forward, 24 adjoint/TAF, 10 adjoint/OpenAD, and 16 tangent-linear/TAF
>>>>>>>>> >> > configurations, each needing to be compiled and executed) we have shortened
>>>>>>>>> >> > the number of time steps to 10 (= 10 days) to perform efficient nightly
>>>>>>>>> >> > regression tests of the adjoint. Not changing the number of time steps
>>>>>>>>> >> > leads to optimizing in the noise - in fact cost function goes up in that
>>>>>>>>> >> > case.
>>>>>>>>> >> >
>>>>>>>>> >> > That the user's cost function does not change at all suggests
>>>>>>>>> >> > a more basic problem though (hard to speculate what it might be).
>>>>>>>>> >> >
>>>>>>>>> >> > I made a quick test by extending nTimeSteps from 10 to 90
>>>>>>>>> >> > days, which leads to cost reduction as desired, namely, for:
>>>>>>>>> >> >  numiter=1,
>>>>>>>>> >> >  nfunc=3,
>>>>>>>>> >> >  fmin=5.74,
>>>>>>>>> >> > (values in data.optim that comes with tutorial_global_oce_optim)
>>>>>>>>> >> > I obtain following costs:
>>>>>>>>> >> > iter. 0: fc =  0.184199260445164D+02
>>>>>>>>> >> > iter. 1: fc =  0.130860446841901D+02
>>>>>>>>> >> > iter. 2: fc =  0.979374136987667D+01
>>>>>>>>> >> >
>>>>>>>>> >> > I did that test "by hand", i.e. not using the script cycsh
>>>>>>>>> >> > also provided (see manual). Doing so by hand requires two more lines in
>>>>>>>>> >> > data.ctrl:
>>>>>>>>> >> >  &CTRL_PACKNAMES
>>>>>>>>> >> >  costname='ecco_cost',
>>>>>>>>> >> >  ctrlname='ecco_ctrl',
>>>>>>>>> >> >
>>>>>>>>> >> > Since gradients produced with TAF are extremely similar (10+
>>>>>>>>> >> > digits?) to those produced with OpenAD (see results/ directory which has
>>>>>>>>> >> > both TAF and OpenAD reference results), I expect it to work with OpenAD too
>>>>>>>>> >> > (have not tested it right now).
>>>>>>>>> >> >
>>>>>>>>> >> > -Patrick
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > > > On May 2, 2018, at 12:34 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > >
>>>>>>>>> >> > > Thanks for this.
>>>>>>>>> >> > >
>>>>>>>>> >> > > Just as a sanity check, before I involve optim_m1qn3 again,
>>>>>>>>> >> > > the output of my ./testreport -t tutorial_global_oce_optim -oad includes
>>>>>>>>> >> > >
>>>>>>>>> >> > > There were 16 decimal places of similarity for "ADM CostFct"
>>>>>>>>> >> > > There were 16 decimal places of similarity for "ADM Ad Grad"
>>>>>>>>> >> > > There were 0 decimal places of similarity for "ADM FD Grad"
>>>>>>>>> >> > >
>>>>>>>>> >> > > Should I be concerned about this?
>>>>>>>>> >> > >
>>>>>>>>> >> > > E.g. lines 2116-2118 of my output_oadm.txt file are
>>>>>>>>> >> > >
>>>>>>>>> >> > > (PID.TID 0000.0001)  ADM  ref_cost_function      =  6.20023228182329E+00
>>>>>>>>> >> > > (PID.TID 0000.0001)  ADM  adjoint_gradient       = -2.69091500991183E-06
>>>>>>>>> >> > > (PID.TID 0000.0001)  ADM  finite-diff_grad       =  0.00000000000000E+00
>>>>>>>>> >> > >
>>>>>>>>> >> > > But at least my cost function value is the same:
>>>>>>>>> >> > >
>>>>>>>>> >> > > (PID.TID 0000.0001)   local fc =  0.620023228182329D+01
>>>>>>>>> >> > > (PID.TID 0000.0001)  global fc =  0.620023228182329D+01
>>>>>>>>> >> > >
>>>>>>>>> >> > > Andrew
>>>>>>>>> >> > >
>>>>>>>>> >> > > On 2 May 2018 at 10:34, Martin Losch <Martin.Losch at awi.de> wrote:
>>>>>>>>> >> > > Hi Andrew,
>>>>>>>>> >> > >
>>>>>>>>> >> > > I won’t be able to help you much with the optim/lsopt code,
>>>>>>>>> >> > > because I would have to get it running again myself. But I do recommend
>>>>>>>>> >> > > using the MITgcm_contrib/mlosch/optim_m1qn3 code. It’s not very
>>>>>>>>> >> > > well documented, but I am attaching a skeleton script to illustrate how to
>>>>>>>>> >> > > use it. Please give it a try and if you find it useful, I can add this
>>>>>>>>> >> > > script to the repository.
>>>>>>>>> >> > >
>>>>>>>>> >> > > The two versions of the optimization routine are similar,
>>>>>>>>> >> > > both implement the same optimization algorithm (BFGS), but optim_m1qn3 uses
>>>>>>>>> >> > > a later version of the m1qn3 code, I think it’s easier to compile (only one
>>>>>>>>> >> > > Makefile) and I believe (but there’s debate about this) that it does the
>>>>>>>>> >> > > right thing as opposed to the optim/lsopt variant, which somehow truncates
>>>>>>>>> >> > > the optimization in each iteration. Having said that, I have used both in
>>>>>>>>> >> > > parallel, and the reduction of the cost function (which is really all we
>>>>>>>>> >> > > care about) is sometimes better with the optim_m1qn3 code, sometimes it is
>>>>>>>>> >> > > better with the optim/lsopt code. The optim_m1qn3 code is closer to the
>>>>>>>>> >> > > idea of the original m1qn3 code.
>>>>>>>>> >> > >
>>>>>>>>> >> > > Let me know if you can use my attached instructions.
>>>>>>>>> >> > >
>>>>>>>>> >> > > Martin
>>>>>>>>> >> > >
>>>>>>>>> >> > >
>>>>>>>>> >> > >
>>>>>>>>> >> > > > On 1. May 2018, at 00:00, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > Right, but the cost function is the same value each time,
>>>>>>>>> >> > > > the norm of x is 0 each time, and the norm of g is the same each time.
>>>>>>>>> >> > > > This suggests nothing is happening.  It's a bit ridiculous that one of the
>>>>>>>>> >> > > > core tutorials simply isn't working out of the box...
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > I will have a go at debugging.
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > Andrew
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > On 30 April 2018 at 22:54, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>>>>>>>>> >> > > > Well you are correct that it's not actually taking a step
>>>>>>>>> >> > > > because the dot product of the control is 0:
>>>>>>>>> >> > > >>> norm of x................... 0.00000000E+00
>>>>>>>>> >> > > > meaning the controls are all 0 still.
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > However the gradients are non-zero
>>>>>>>>> >> > > >>> norm of g................... 0.12730927E-01
>>>>>>>>> >> > > > so the linesearch should step and
>>>>>>>>> >> > > > ecco_ctrl_MIT_CE_000.opt0001
>>>>>>>>> >> > > > should not be all zero.
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > To debug this you could put a print statement in
>>>>>>>>> >> > > > optim_writedata.F to see what it is writing...
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > I don’t know enough about this tutorial to be a bigger
>>>>>>>>> >> > > > help, sorry
>>>>>>>>> >> > > >
>>>>>>>>> >> > > > Matt
>>>>>>>>> >> > > >
>>>>>>>>> >> > > >
>>>>>>>>> >> > > >> On Apr 30, 2018, at 2:50 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > > >>
>>>>>>>>> >> > > >> Yes, I did.
>>>>>>>>> >> > > >>
>>>>>>>>> >> > > >> On 30 April 2018 at 22:42, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>>>>>>>>> >> > > >> This is still iteration 0. You have to update data.optim
>>>>>>>>> >> > > >> to tell it you are now at iteration 1
>>>>>>>>> >> > > >>
>>>>>>>>> >> > > >> Matt
>>>>>>>>> >> > > >>
>>>>>>>>> >> > > >>
>>>>>>>>> >> > > >>> On Apr 30, 2018, at 2:38 PM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> I tried a few steps of this, but the output of optim.x
>>>>>>>>> >> > > >>> always has
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>>   cost function............... 0.62002323E+01
>>>>>>>>> >> > > >>>   norm of x................... 0.00000000E+00
>>>>>>>>> >> > > >>>   norm of g................... 0.12730927E-01
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> near the end, with no decrease in the cost function.
>>>>>>>>> >> > > >>> So I guess it's not actually taking the step?
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> Andrew
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> On 27 April 2018 at 18:04, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > > >>> !!!  Okay...
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> Yes, it produced the .opt0001 file.  I'll see how this
>>>>>>>>> >> > > >>> goes.
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> Thanks,
>>>>>>>>> >> > > >>> Andrew
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> On 27 April 2018 at 17:57, Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>>>>>>>>> >> > > >>> Hello
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> It's been a while, but I am pretty sure that is the
>>>>>>>>> >> > > >>> normal output. It says "fail", but it did give you a new
>>>>>>>>> >> > > >>> ecco_ctrl_MIT_CE_000.opt0001 (correct?) and if you unpack and run likely
>>>>>>>>> >> > > >>> the cost will descend.
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> I think it worked correctly. lsopt/optim are just
>>>>>>>>> >> > > >>> confusing... but I think it's working. I think all is good!
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>> Matt
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>>
>>>>>>>>> >> > > >>>> On Apr 27, 2018, at 8:25 AM, Andrew McRae <andrew.mcrae at physics.ox.ac.uk> wrote:
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> Just separating this from the other thread, I got the
>>>>>>>>> >> > > >>>> bundled MITgcm optim routine built (having made these changes, based on
>>>>>>>>> >> > > >>>> this thread from 2010 and this one from 2016).
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> I use OpenAD to create the adjoint.
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> My steps are:
>>>>>>>>> >> > > >>>> 1) in the build directory, run ../../../tools/genmake2 -oad -mods=../code_oad
>>>>>>>>> >> > > >>>> 2) run make depend and make adAll
>>>>>>>>> >> > > >>>> 3) copy input_oad/ into a new folder scratch/
>>>>>>>>> >> > > >>>> 4) within scratch/, run ./prepare_run
>>>>>>>>> >> > > >>>> 5) copy mitgcmuv_ad from build/ into scratch/, copy
>>>>>>>>> >> > > >>>>    optim.x into scratch/OPTIM/
>>>>>>>>> >> > > >>>> 6) run ./mitgcmuv_ad
>>>>>>>>> >> > > >>>> 7) in scratch/OPTIM, create symlinks to ../data.optim
>>>>>>>>> >> > > >>>>    and ../data.ctrl
>>>>>>>>> >> > > >>>> 8) copy the files ecco_cost_MIT_CE_000.opt0000 and
>>>>>>>>> >> > > >>>>    ecco_ctrl_MIT_CE_000.opt0000 into the OPTIM subdirectory
>>>>>>>>> >> > > >>>> 9) run ./optim.x within the subdirectory
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> The full output is attached, but I assume the
>>>>>>>>> >> > > >>>> optimisation failed since the last lines are
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>>   optimization stopped because :
>>>>>>>>> >> > > >>>>   ifail =   4    the search direction is not a descent one
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> Any ideas?  (I guess this isn't something that is
>>>>>>>>> >> > > >>>> tested in the daily builds?)
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> In the meantime, I'll try the m1qn3 routine as in the
>>>>>>>>> >> > > >>>> other thread, which should help distinguish between a problem with the
>>>>>>>>> >> > > >>>> optimisation routine or the gradient generated by mitgcmuv_ad.
>>>>>>>>> >> > > >>>>
>>>>>>>>> >> > > >>>> Andrew
>>>>>>>>> >> > > >>>> <out.txt>
>>>>>>>>> >> > > >>>> _______________________________________________
>>>>>>>>> >> > > >>>> MITgcm-support mailing list
>>>>>>>>> >> > > >>>> MITgcm-support at mitgcm.org
>>>>>>>>> >> > > >>>> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support