[MITgcm-support] [EXTERNAL] mitgcmuv_ad explodes during tape computations

Wang, Ou (US 329B) ou.wang at jpl.nasa.gov
Thu Aug 27 12:18:39 EDT 2020


Hi Martin,

I'd think your interpretation is correct. 

One possible source of inaccuracy in the tape computations, as in ECCOv4, is that fields are stored to the tapes in single precision, whereas the forward code generally uses double precision. To rule this out, set doSinglePrecTapelev=.FALSE. in data.ctrl.
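
To give a feel for the size of that effect, here is a minimal Python sketch (not MITgcm code) that stores a double-precision number as a 4-byte float and reads it back, roughly what a single-precision tape does to each stored field value. The salinity value and the helper name roundtrip_single are made up for illustration.

```python
import struct


def roundtrip_single(x: float) -> float:
    """Pack x as a 4-byte IEEE float and unpack it again.

    This mimics writing a double-precision value to a single-precision
    tape and reading it back for the recomputation.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical salinity value held in double precision in the forward run.
salt = 34.567890123456789

restored = roundtrip_single(salt)
rel_err = abs(restored - salt) / abs(salt)

# The relative error of one round trip is bounded by 2**-24 (about 6e-8).
print(f"original = {salt!r}")
print(f"restored = {restored!r}")
print(f"relative error = {rel_err:.2e}")
```

A perturbation of this size is harmless in a stable run, but it could be enough to tip a marginally stable state when the forward model is restarted from the tapes.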

Ou

On 8/27/20, 3:27 AM, "MITgcm-support on behalf of Martin Losch" <mitgcm-support-bounces at mitgcm.org on behalf of Martin.Losch at awi.de> wrote:

    Hi there,

    We are running an ecco-v4-like llc90 experiment. Some parameter options are different from the true ecco-v4, and we have turned off many of the problematic code bits in the adjoint (seaice, ggl90/kpp, gmredi, saltplume).

    In a two-year simulation with an objective function that is basically the mean salt content in the inner Arctic in the last month of the integration, the model blows up some 16.5 days (394 timesteps) into the reverse part of the simulation (with S/R CALC_R_STAR stopping the simulation). A closer inspection leads us to believe that this actually happens during tape computations (i.e. forward simulations), because (a) the error is triggered by CALC_R_STAR (too SMALL rStarFac[C,W,S]), which is only called from forward_step and forward_stepmd, and (b) we have output (adjDumpFreq) every 5 timesteps and the stop happens 57 timesteps earlier than the last ADJ${var} is written.

    Short test simulations (order 100 timesteps) are fine.

    My interpretation is that the forward simulation itself is fine (which it is), though maybe marginally unstable, and that the restarts from the tapes are somehow incorrect or inaccurate, so that the forward part is pushed across its stability limits. Has this happened to anyone before? Do you have any suggestions for how we can debug this problem?

    Martin


    PS. We use 
    #define ALLOW_AUTODIFF_WHTAPEIO (very useful!!!!)
    #define ECCO_CTRL_DEPRECATED
    #define ALLOW_THETA0_CONTROL
    #define ALLOW_SALT0_CONTROL
    #define ALLOW_ATEMP_CONTROL
    #define ALLOW_AQH_CONTROL
    #define ALLOW_UWIND_CONTROL
    #define ALLOW_VWIND_CONTROL
    #define ALLOW_PRECIP_CONTROL
    #define ALLOW_RUNOFF_CONTROL (with our own additions to make it work, see <https://github.com/mjlosch/MITgcm/tree/ctrl_runoff> if you are interested)

          parameter( nchklev_1      =    4 )
          parameter( nchklev_2      =   30 )
          parameter( nchklev_3      =   73 )



