[MITgcm-support] verification_other case global_oce_cs32 fails at runtime in adjoint mode

Dan Jones dcjones.work at gmail.com
Tue Sep 22 10:52:03 EDT 2020


Hi Martin,

Thanks for your quick reply! I can't get the serial case to run on ARCHER,
unfortunately. I think for now I'm stuck testing in parallel. When I run
"grep m_boxmean_theta *.f", I get exactly the same results as you. I'm also
using MITgcm/verification_other from GitHub.

Here is a bit more of the output, which will hopefully help. This is from a
run in which I tried "m_horflux_vol" instead of "m_boxmean_theta". I'm using
data.ecco from the input_ad.sens directory.

(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: xx_kapgm.0000000000.data
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: xx_kapredi.0000000000.data
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: xx_diffkr.0000000000.data
(PID.TID 0000.0001)  --> f_gencost =-0.348173207824978E+08 2
(PID.TID 0000.0001)  --> f_genarr3d = 0.000000000000000E+00 1
(PID.TID 0000.0001)  --> f_genarr3d = 0.000000000000000E+00 2
(PID.TID 0000.0001)  --> f_genarr3d = 0.000000000000000E+00 3
(PID.TID 0000.0001)  --> fc               =-0.348173207824978E+08
(PID.TID 0000.0001)   early fc =  0.000000000000000E+00
(PID.TID 0000.0001)   local fc =  0.000000000000000E+00
(PID.TID 0000.0001)  global fc = -0.348173207824978E+08
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: xx_diffkr.0000000000.data
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: adxx_diffkr.0000000000.data
(PID.TID 0000.0001)  MDS_WRITE_FIELD: it,rec,kS,kL,kH=       0     1  50 1  50 file=adxx_diffkr.0000000000
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: xx_kapredi.0000000000.data
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: adxx_kapredi.0000000000.data
(PID.TID 0000.0001)  MDS_WRITE_FIELD: it,rec,kS,kL,kH=       0     1  50 1  50 file=adxx_kapredi.0000000000
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: xx_kapgm.0000000000.data
(PID.TID 0000.0001)  MDS_READ_FIELD: opening global file: adxx_kapgm.0000000000.data
(PID.TID 0000.0001)  MDS_WRITE_FIELD: it,rec,kS,kL,kH=       0     1  50 1  50 file=adxx_kapgm.0000000000
(PID.TID 0000.0001)  MDS_READ_FIELD: filename: adm_horflux_vol.0000000000.data
(PID.TID 0000.0001)  MDS_READ_FIELD: File does not exist
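
For anyone who wants to pull the missing file names out of a long STDOUT or STDERR file, here is a rough Python sketch. It assumes the log format shown in the excerpt above (a "MDS_READ_FIELD: filename:" line, with the name either on the same line or wrapped onto the next one, followed by "File does not exist"). The function name missing_files is made up for illustration and is not part of MITgcm.

```python
import re

# Assumed log format, taken from the excerpt above: a "filename:" line,
# then a "File does not exist" line. The name may wrap onto the next line.
FILENAME_RE = re.compile(r"MDS_READ_FIELD: filename:\s*(\S*)")


def missing_files(log_text):
    """Return the file names that MDS_READ_FIELD reported as missing."""
    missing = []
    candidate = None
    awaiting_name = False  # True when the name wrapped onto the next line
    for raw in log_text.splitlines():
        # Drop email quote markers and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        if awaiting_name and line and not line.startswith("(PID.TID"):
            candidate = line
            awaiting_name = False
            continue
        awaiting_name = False
        match = FILENAME_RE.search(line)
        if match:
            candidate = match.group(1) or None
            awaiting_name = candidate is None
        elif "File does not exist" in line and candidate:
            if candidate not in missing:
                missing.append(candidate)
            candidate = None
    return missing
```

Calling missing_files(open(path).read()) on a saved log should return each missing name once, even if the same message appears in both the normal and the "*** ERROR ***" form.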


So the failure occurs after the cost function has been calculated, as the
model is getting ready to perform the adjoint steps. It can read and write
the files for the existing controls (kapgm, kapredi, diffkr), but it
apparently does not create an "ad" file for the general objective function
(gencost) term "m_horflux_vol". That's why I was wondering whether I should
manually create a blank file first, as an ad-hoc fix. Any thoughts?
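
To make the "blank file" idea concrete, here is a minimal Python sketch of how such a dummy file could be written. This is untested against the model and rests on assumptions: that a flat binary file of zeros is what the read expects, and that the number of values and the word size (4 or 8 bytes per value) match the field being read. I don't know the right size for adm_horflux_vol, so n_values below is a placeholder, and write_zero_field is a made-up helper name. Whether the adjoint run would then give meaningful gradients is a separate question.

```python
import os


def write_zero_field(path, n_values, bytes_per_value=4):
    """Write n_values zeros as raw binary and return the file size in bytes.

    All-zero bytes decode to 0.0 for IEEE floats in either byte order,
    so no endianness choice is needed here.
    """
    if bytes_per_value not in (4, 8):
        raise ValueError("bytes_per_value must be 4 or 8")
    if n_values < 0:
        raise ValueError("n_values must be non-negative")
    with open(path, "wb") as handle:
        handle.write(bytes(n_values * bytes_per_value))
    return os.path.getsize(path)


# Hypothetical usage; the value count is a placeholder, not a known size:
# write_zero_field("adm_horflux_vol.0000000000.data", n_values=...)
```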

Best wishes,
Dan

On Mon, Sep 21, 2020 at 8:37 PM <mitgcm-support-request at mitgcm.org> wrote:

> Today's Topics:
>
>    1. verification_other case global_oce_cs32 fails at  runtime in
>       adjoint mode (Dan Jones)
>    2. Re: verification_other case global_oce_cs32 fails at runtime
>       in adjoint mode (Martin Losch)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 21 Sep 2020 10:00:47 +0100
> From: Dan Jones <dcjones.work at gmail.com>
> To: mitgcm-support at mitgcm.org
> Subject: [MITgcm-support] verification_other case global_oce_cs32
>         fails at        runtime in adjoint mode
> Message-ID:
>         <CAPj3iHRxhUOCDT5m7H8uj8cg9dc=_
> oYVssQnvYhEA+_ALjeR6w at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hello.
>
> Apologies for the cross-posting - I've posted this as a GitHub issue, but I
> thought I should put it here as well.
>
> I am trying to build and test the global_oce_cs32 verification_other
> exercise using the code in the input_ad.sens directory. The forward case
> compiles and runs without error. The adjoint case (built using TAF)
> compiles without error, but at runtime I receive the following error in
> STDOUT:
>
> (PID.TID 0000.0001)  MDS_READVEC_LOC: open file: south30_maskT
> (PID.TID 0000.0001)  MDS_RD_REC_RL: iRec,Dim =         9          1
> (PID.TID 0000.0001)  MDS_READ_FIELD: filename: adm_boxmean_theta.0000000000.data
> (PID.TID 0000.0001)  MDS_READ_FIELD: File does not exist
>
> and this error in STDERR:
>
> (PID.TID 0000.0001) *** ERROR ***  MDS_READ_FIELD: filename: adm_boxmean_theta.0000000000.data
> (PID.TID 0000.0001) *** ERROR ***  MDS_READ_FIELD: File does not exist
>
> My MITgcm source code is up-to-date with the master. I am running on
> archer.ac.uk <https://www.archer.ac.uk/> in parallel mode using 24 cores.
>
> What should I try here? I haven't run into this error before using other
> adjoint setups, at least not that I can recall. Should I just create an
> empty "dummy" file to start with? Thanks in advance for any help/guidance.
>
> Best regards,
> Dan
>
>
> --------------------------------------------------------------
> Dr Dan Jones / British Antarctic Survey
> danjonesocean.com <http://www.danjonesocean.com> / @DanJonesOcean
> --------------------------------------------------------------
>
> ------------------------------
>
> Message: 2
> Date: Mon, 21 Sep 2020 14:56:23 +0200
> From: Martin Losch <Martin.Losch at awi.de>
> To: MITgcm Support <mitgcm-support at mitgcm.org>
> Subject: Re: [MITgcm-support] verification_other case global_oce_cs32
>         fails at runtime in adjoint mode
> Message-ID: <FF2DD1AB-462E-4089-90CA-89B9552DF7D8 at awi.de>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Dan,
>
> I tried this on my linux box without MPI and I cannot reproduce your
> problem (I used MITgcm/verification_other.git and not the CVS
> MITgcm_contrib/verification_other, which appears to be out of date). I
> grepped the code for "m_boxmean_theta" and only found this:
>
> (base) bkli04l006::build (master)> grep m_boxmean_theta *.f
> ad_input_code_ad.f:     $'m_boxmean_theta') then
> ad_input_code_ad.f:     $'m_boxmean_theta') then
> ad_input_code_ad.f:     $'m_boxmean_theta') then
> ad_input_code_ad.f:     $'m_boxmean_theta') then
> ad_input_code.f:            if (gencost_barfile(kgen)(1:15).EQ.'m_boxmean_theta') then
> ad_taf_output.f:     $'m_boxmean_theta') then
> ad_taf_output.f:     $'m_boxmean_theta') then
> ad_taf_output.f:     $'m_boxmean_theta') then
> ad_taf_output.f:     $'m_boxmean_theta') then
> ecco_check.f:     & (gencost_barfile(k)(1:15).EQ.'m_boxmean_theta').OR.
> ecco_phys.f:            if (gencost_barfile(kgen)(1:15).EQ.'m_boxmean_theta') then
>
> (and I made sure that this is really just m_boxmean_theta). Where
> in your code (which routine) does the model try to read adm_boxmean_theta?
>
> Martin