[MITgcm-devel] optim_m1qn3

Tue May 8 02:39:47 EDT 2012

Dear optimizers,

I have fiddled some more with the code in MITgcm_contrib/mlosch/optim_m1qn3 and I am fairly confident that the code is now working properly. That is, in the "testbed" that I constructed I get identical results with the original "online" m1qn3 and with my modification, m1qn3_offline.F.

"Real-life" tests give mixed results. I am attaching 5 figures. In all plots the "b" experiment (the green line) uses optim_m1qn3 and the blue one uses the standard optim/lsopt. A "simulation" is one full run with "mitgcmuv_ad". The 4 cmp_cf_opt*.png experiments use a regional model with open boundaries (control parameters) and ice shelf cavities. My problem with this experiment is that lsopt often returns a control vector that is too extreme (see e.g. opt28 and opt29) for the model to swallow, and the model explodes in the forward integration (sooner or later). All cmp_cf_opt* experiments have this problem, and the blue line stops whenever this happens. In one case this happens after 90 simulations, but in opt18 it happens after only a few simulations. optim_m1qn3, on the other hand, does much better in opt18 and opt27 (although it seems to get stuck: all simulations are spent on the line search and very little improvement is achieved) and not so well in opt28 and opt29, where it seems to get stuck well above the lowest cost values found with lsopt. But all experiments are still running, and there is some hope that the cost function will go down some more.

The 5th figure, cmp_cf_MOM17.png, shows a run with a global cs32 simulation with seaice/gmredi/kpp (gmredi/kpp and seaice_dynamics are turned off in the adjoint, I think). There are 4 experiments: MOM17 (blue line) uses lsopt and nfunc=7; MOM17a (red) uses lsopt and nfunc=1 (so here I really show lsopt only one simulation per iteration, and lsopt knows about it); MOM17c (black) uses lsopt and nfunc=100 (just a test); and MOM17b (green) uses optim_m1qn3. Obviously lsopt with nfunc=7 and nfunc=1 is doing much better (we only allowed it to do 20 simulations) than m1qn3. It is interesting that nfunc=1 seems to be the better choice in this case.

optim_m1qn3 gets stuck and terminates the optimization with "output mode 6", which usually means that the gradient is not accurate enough to find a new descent direction (plausible, as we only compute an approximate gradient with gmredi etc. turned off). It can also mean that my fiddling with m1qn3 broke it and I still need to spend more time on that, but I could not find a simple case (cost function) where m1qn3_offline fails.

A different interpretation is that with optim_m1qn3 I quickly arrive in a local minimum and get stuck. I think that lsopt actually breaks the BFGS algorithm: the line search always uses the same simulation in the nfunc-loop, i.e., for each new x+t*dx that lsline offers, simul returns the same cost function and gradient. With lsopt the model therefore accidentally gets pushed out of local minima, whereas with m1qn3 it does not, because m1qn3 is more accurate and tries to stay close to a minimum (even a small one) once it has found it.
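To make the point about the line search concrete, here is a toy sketch in Python. It is not the lsopt or m1qn3 code: the 1-D quadratic cost, the Armijo backtracking rule and all names (cost_and_grad, backtracking, stale_simul) are assumptions chosen for illustration only. It just shows that a line search which gets the same cost and gradient for every trial step t cannot tell one step length from another.

```python
def cost_and_grad(x):
    """Toy 1-D cost f(x) = 0.5 * x**2 and its gradient g(x) = x."""
    return 0.5 * x * x, x


def backtracking(simul, x, f0, g0, dx, t=4.0, c1=1e-4, max_trials=20):
    """Halve t until the Armijo sufficient-decrease test passes.

    Returns (t, number_of_trials, accepted).
    """
    slope = g0 * dx  # directional derivative; negative for a descent direction
    for trial in range(1, max_trials + 1):
        f_trial, _ = simul(x + t * dx)
        if f_trial <= f0 + c1 * t * slope:
            return t, trial, True
        t *= 0.5
    return t, max_trials, False


x0 = 2.0
f0, g0 = cost_and_grad(x0)
dx = -g0  # steepest-descent direction


def stale_simul(_x_trial):
    """Ignores the trial point and always returns the values computed at x0."""
    return f0, g0


fresh = backtracking(cost_and_grad, x0, f0, g0, dx)
stale = backtracking(stale_simul, x0, f0, g0, dx)
print("fresh evaluations:", fresh)
print("stale evaluations:", stale)
```

With fresh evaluations the search accepts a step after a few trials. With the stale values the sufficient-decrease test never passes, so whatever step is finally taken is not chosen on the basis of the cost function. How lsopt actually picks its step in that situation is not modelled here.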

I am not sure how to continue. Obviously there are cases when lsopt fails, but there are also cases when m1qn3 is not doing too well, plus since we are sometimes working with approximate gradients, we might not want to be told that the gradient is too inaccurate for further descent.
It would be good to see a few other examples. Maybe you can try to run your problems with optim_m1qn3. Instructions are below.

Martin

cvs co MITgcm_contrib/mlosch/optim_m1qn3
Compiling is simpler than for lsopt/optim: edit the Makefile to match your system/compiler, change the include path to point to your build directory (just as in optim/Makefile), then run: make depend && make

The resulting optim.x (same name) takes the same input files, and most of the variables in data.optim can stay as they are (for now; I plan to have a separate data.m1qn3 and stop using data.optim, but keeping it makes comparisons easier). There are only TWO things that require attention:
- numiter (in data.optim) must be larger than one. It is now the number of optimization iterations that you are going to allow ***in total***. I'd put something large, like 1000.
- m1qn3 produces an output file (m1qn3_output.txt, I hard-coded the name for now) that is reopened each time you run optim.x, so make sure that you keep or restore a copy in the working directory. One could modify optim_sub.F to redirect the m1qn3 output to stdout, but I like it better the way it is implemented now.
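
If your run scripts clean out the working directory between simulations, something like the following Python sketch could take care of the second point. Only the file name m1qn3_output.txt comes from the code; the helper names and the idea of a separate backup directory are assumptions for illustration, so adapt them to your own setup.

```python
import shutil
from pathlib import Path

M1QN3_OUTPUT = "m1qn3_output.txt"  # name hard-coded in optim_m1qn3


def _copy(src_dir, dst_dir, name):
    """Copy src_dir/name to dst_dir/name; return the new path, or None if missing."""
    src = Path(src_dir) / name
    if not src.is_file():
        return None
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dst_dir / name))


def save_m1qn3_output(work_dir, backup_dir, name=M1QN3_OUTPUT):
    """Call after optim.x has finished: keep a copy of the m1qn3 output file."""
    return _copy(work_dir, backup_dir, name)


def restore_m1qn3_output(work_dir, backup_dir, name=M1QN3_OUTPUT):
    """Call before the next optim.x run: put the saved copy back in place."""
    return _copy(backup_dir, work_dir, name)
```

Both helpers return None when there is nothing to copy, which is what you would expect before the very first optim.x run.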

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmp_cf_opt18.png
Type: image/png
Size: 30844 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-devel/attachments/20120508/be721d08/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmp_cf_opt27.png
Type: image/png
Size: 36754 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-devel/attachments/20120508/be721d08/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmp_cf_opt28.png
Type: image/png
Size: 48003 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-devel/attachments/20120508/be721d08/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmp_cf_opt29.png
Type: image/png
Size: 48780 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-devel/attachments/20120508/be721d08/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmp_cf_MOM17.png
Type: image/png
Size: 68152 bytes
Desc: not available
URL: <http://mitgcm.org/pipermail/mitgcm-devel/attachments/20120508/be721d08/attachment-0009.png>