[MITgcm-support] Help OpenAD adjfiles
Patrick Heimbach
heimbach at mit.edu
Mon May 9 14:56:04 EDT 2016
Hi Heriberto,
well, the good news is that you report the last adjoint value (going backwards) to be practically the same for both tools, and the gradient checks seemed OK too (not sure why you had to tweak the gradient check package; it is implemented, and used, for both tools). Almost the only way the last values can be OK is for the intermediate values to be OK too, so most likely there is some issue with the I/O of the intermediate values.
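
One way to narrow down such an I/O issue is to compare the dumped adjoint fields from the two tools record by record and see where they start to disagree. Below is a minimal sketch in Python, assuming the dumps are flat big-endian binary files (the MITgcm default) with a known number of values per record. The function names `read_records` and `compare_records` are hypothetical helpers for illustration, not part of MITgcm or OpenAD.

```python
import struct


def read_records(path, n_per_record, precision=32):
    """Read a flat big-endian binary file into a list of records.

    precision is 32 (real*4) or 64 (real*8); which one applies depends on
    how the model wrote the file, so it is an assumption to verify.
    """
    fmt_char = "f" if precision == 32 else "d"
    size = 4 if precision == 32 else 8
    with open(path, "rb") as fh:
        raw = fh.read()
    n_total = len(raw) // size
    values = struct.unpack(">%d%s" % (n_total, fmt_char), raw[:n_total * size])
    # Split the flat value list into complete records; a trailing partial
    # record, if any, is dropped.
    return [list(values[i:i + n_per_record])
            for i in range(0, n_total - n_per_record + 1, n_per_record)]


def compare_records(recs_a, recs_b):
    """For each record pair return (max |a - b|, least-squares scale s).

    s minimises |b - s*a|^2, so s close to 1 means the fields agree, while
    a constant s different from 1 across records hints at a scaling issue.
    """
    report = []
    for a, b in zip(recs_a, recs_b):
        max_diff = max(abs(x - y) for x, y in zip(a, b))
        aa = sum(x * x for x in a)
        scale = sum(x * y for x, y in zip(a, b)) / aa if aa > 0.0 else float("nan")
        report.append((max_diff, scale))
    return report
```

If the scale factor drifts from record to record while the final record agrees, that would be consistent with the intermediate fields being written at the wrong point in the backward sweep rather than with a wrong adjoint.
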
Is your setup easily accessible somewhere (on a common machine, e.g. TACC or NCAR)?
Could take a look at some point soon - although currently a bit swamped.
-Patrick
On May 9, 2016, at 1:11 PM, Heriberto Vazquez <heriberto1mx at gmail.com> wrote:
> Hello MITgcm developers
>
> I have a question, particularly for those who have used OpenAD in their configurations. Hopefully someone can shed some light on this problem...
>
> Right now I am doing a sensitivity analysis in the Gulf of Mexico using OpenAD, working with the MITgcm_c65k version. In order to get the sensitivities (the ADJ files in TAF) I need to modify the code in openad_dumpAdjoint.F, located in .../pkg/openad/, and of course modify forward_step.F to call openad_dumpAdjoint at the same place where the ADJ files are written for TAF (dummy_in_stepping.F, or its adjoint version addummy_in_stepping.F). The outputs I am getting make perfect sense to me: they are what I would expect, and they agree with some theories.
>
> Here at Scripps we can use TAF, so in order to compare the outputs I did the same implementation using TAF instead. Everything looked great, with similar patterns and structures in both sensitivity analyses. However, the differences in the values are quite big (e.g. adjetan ranges from -15 to 10 for TAF and from -60 to 50 for OpenAD). There are big differences in the other files as well (e.g. adjuvel, adjvvel, adjsalt, etc.), and the differences are not constant but evolve in time.
>
> Because of these differences, I was told to use the gradient check package and find out which one was wrong. After some tweaks to generate the proper dependencies, I could use the gradient check package with both TAF and OpenAD. The results told me that both of them were correct; the TAF results were better than the OpenAD ones, but both were correct.
>
> The gradient check package works with the adxx files, so I compared the adxx files from TAF and OpenAD. The results are not exactly the same but are quite similar, which led me to think that the only problem was in openad_dumpAdjoint.F when writing out the ADJ files.
>
> I checked addummy_in_stepping.F and built the same exchanges in my version of openad_dumpAdjoint.F, using the OpenAD-specific subroutines (for example, openad_exch_rl instead of adexch_rl). The compilation finished successfully, but at run time I got a segmentation fault. This was the error:
>
> [compas-1-5:02282] *** Process received signal ***
> [compas-1-5:02282] Signal: Segmentation fault (11)
> [compas-1-5:02282] Signal code: Address not mapped (1)
> [compas-1-5:02282] Failing at address: 0x563af148
> Once I got it when the backward integration started, and other times after one or two days of forward integration.
>
> I am not sure whether the exchanges can be the cause of the differences in the ADJ files, because when I take out the OpenAD exchanges the run is OK and the results make sense. It is the values in the evolution from the first backward integration step to the last one that do not make sense.
>
> Thinking that a normalization of the values might be generating these differences, I checked the last value (in the backward integration) of the ADJ files, and it is practically the same as in the adxx file (comparing adxx_uvel with adjuvel, and likewise for vvel), so it is not a normalization issue.
>
> By the way, the forward run is almost exactly the same, with differences of only -8e-13 to 7e-13 in the values, and the cost function is also exactly the same. It seems to me that only the adjoint integration gives different results.
>
> Because of the above, I decided to comment out all parameters in the data.autodiff file and the definitions in inadmode_set and _unset, just in case one of the tools (most likely TAF) was changing parameters during the backward integration. But it did not help: the differences are still the same, and too big.
>
> So, I still cannot figure out why those differences exist in the ADJ files. Has someone faced the same problem? Any ideas about what could be causing those differences? How can I solve it? I hope the problem description is clear enough; if not, please feel free to ask.
>
> Thank you very much
>
> Best regards,
>
> Heriberto
> Postdoc SIO-UCSD
>
> __
> We cannot solve problems by using the same kind of thinking we used when we created them...
> Einstein
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support
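
For readers unfamiliar with the gradient check mentioned in the quoted message: the idea is to compare the adjoint-derived gradient of the cost function against a finite-difference estimate for selected control-vector elements. The following is a toy illustration in Python, not the MITgcm grdchk package itself. The cost function, its hand-derived gradient, and the name `gradient_check` are assumptions made purely for the example.

```python
def cost(x):
    """Toy cost function standing in for the model's cost J(x)."""
    return sum(0.5 * xi * xi + xi ** 3 for xi in x)


def adjoint_gradient(x):
    """Hand-derived gradient of cost(); plays the role of the adxx field."""
    return [xi + 3.0 * xi * xi for xi in x]


def gradient_check(x, index, eps=1.0e-4):
    """Return (fd, adj, 1 - fd/adj) for one control-vector element.

    fd is a central finite difference obtained by perturbing x[index] by
    +/- eps; adj is the adjoint gradient at the same element.
    """
    xp = list(x)
    xp[index] += eps
    xm = list(x)
    xm[index] -= eps
    fd = (cost(xp) - cost(xm)) / (2.0 * eps)
    adj = adjoint_gradient(x)[index]
    return fd, adj, 1.0 - fd / adj


if __name__ == "__main__":
    x0 = [1.0, -0.5, 2.0]
    for i in range(len(x0)):
        fd, adj, ratio = gradient_check(x0, i)
        print(i, fd, adj, ratio)
```

A relative mismatch close to zero for both tools, as reported in the quoted message, indicates that the final gradient is right, but it says nothing about whether intermediate adjoint fields were dumped correctly.
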
--------------------------------------------------------
Patrick Heimbach, Ph.D. | http://heimbach.wordpress.com
* The University of Texas at Austin *
The Institute for Computational Engineering and Sciences
Institute for Geophysics | Jackson School of Geosciences
201 East 24th Street, POB 4.232 | Austin, TX 78712 | USA
FON: +1-512-232-7694 | Email: heimbach at utexas.edu
* Massachusetts Institute of Technology *
Department of Earth, Atmospheric, and Planetary Sciences
77 Massachusetts Ave, 54-1420 | Cambridge MA 02139 | USA
FON: +1-617-253-5259 | Email: heimbach at mit.edu
--------------------------------------------------------