[MITgcm-support] optim.x build problems on ARCHER, using MITgcm c65i and OpenAD
Martin Losch
martin.losch at awi.de
Mon Sep 12 19:07:01 EDT 2016
Hi Dan,
When I run this on my Apple Book, I get this at the end:
OPTIM_STORE_M1QN3: restoring the state of m1qn3 from OPWARM.opt0026
OPTIM_STORE_M1QN3: saving the state of m1qn3 in OPWARM.opt0027
stoptheloop
ml-offline_driver: niter = 18
ml-offline_driver: nsim = 27
ml-offline_driver: omode = 6
stoptheloop
omode 6 means that m1qn3 cannot complete, usually because of accuracy issues (the computed gradient does not lead to a smaller cost function). I guess that’s OK as long as you are getting the same from just running “./driver” in testbed/runscript.sh. runscript.sh does both “online” and “offline” runs to check that both give the same result.
Your optim.x needs to be compiled with the correct header files (this is no different from the original optim.x in MITgcm/optim). The path to these needs to be specified in optim_m1qn3/Makefile (INCLUDEDIRS needs to point to the directory where you’ve compiled your mitgcmuv_ad), along with other things that you might need for your compiler (I haven’t tried this on an XC30 yet). Then you need to run “make CLEAN && make depend && make” in “optim_m1qn3”.
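A rough sketch of that build sequence as a shell function. The function name and both directory arguments are placeholders, not part of MITgcm, and it assumes INCLUDEDIRS in optim_m1qn3/Makefile has already been edited by hand:

```shell
# Sketch only: build_optim and its arguments are hypothetical names.
# Assumes INCLUDEDIRS in optim_m1qn3/Makefile already points at the
# directory where mitgcmuv_ad was compiled.
build_optim() {
    optim_dir=$1     # your optim_m1qn3 directory
    adbuild_dir=$2   # directory where mitgcmuv_ad was compiled

    # optim.x is compiled against the headers of the adjoint build,
    # so stop early if that directory holds no header files at all.
    if ! ls "$adbuild_dir"/*.h >/dev/null 2>&1; then
        echo "no header files found in $adbuild_dir" >&2
        return 1
    fi

    # Same sequence as in the text: clean, regenerate dependencies, compile.
    ( cd "$optim_dir" && make CLEAN && make depend && make )
}
```

It would be called as build_optim followed by your own two paths; any compiler settings for your machine still have to go into the Makefile itself.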
Afterwards you can copy optim.x to where your model runs. It needs to be in the same directory as where you execute mitgcmuv_ad (so compiling and running need not be in the same place). After that, you need to make sure that all files are saved so that you can restart.
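For a setup where compiling and running happen on different filesystems, the copy step could look roughly like this. The function name and the directory arguments are placeholders for your own paths:

```shell
# Sketch only: stage_optim and its arguments are hypothetical names.
stage_optim() {
    optim_dir=$1   # where optim.x was compiled
    run_dir=$2     # where mitgcmuv_ad is executed

    # Refuse to continue if the executable was never built.
    if [ ! -x "$optim_dir/optim.x" ]; then
        echo "optim.x not found in $optim_dir" >&2
        return 1
    fi

    # Put optim.x next to mitgcmuv_ad, keeping its permissions.
    mkdir -p "$run_dir" && cp -p "$optim_dir/optim.x" "$run_dir/"
}
```

The files that optim.x writes in the run directory (for example the OPWARM.opt files) are the ones to keep for a restart.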
It looks like you don’t have a data.optim in your directory?
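A small check along these lines, run before optim.x, would catch a missing or empty data.optim before the namelist read fails. The function name is made up for illustration and the run directory is a placeholder:

```shell
# Sketch only: check_data_optim is a hypothetical helper name.
check_data_optim() {
    run_dir=$1   # directory where optim.x is executed

    # -s is true only if the file exists and is not empty.
    if [ ! -s "$run_dir/data.optim" ]; then
        echo "data.optim is missing or empty in $run_dir" >&2
        return 1
    fi
    echo "found data.optim in $run_dir"
}
```

This only tests that the file is present and non-empty; it does not check the namelist contents.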
Martin
> On 12 Sep 2016, at 08:22, Dan Jones <dcjones.work at gmail.com> wrote:
>
> Hi Martin,
>
> Thank you for the help! I was able to compile and run optim.x in the "optim_m1qn3/testbed" directory. It produced 27 "OPWARM.optim" files and the following output:
> OPTIM_STORE_M1QN3: restoring the state of m1qn3 from OPWARM.opt0026
> OPTIM_STORE_M1QN3: saving the state of m1qn3 in OPWARM.opt0027
> stoptheloop
> ml-offline_driver: niter = 18
> ml-offline_driver: nsim = 27
> ml-offline_driver: omode = 6
>
> Is that what we should expect? Unfortunately, when I tried to apply optim.x to my model run, it returned this output/error:
>
> OPTIM_READPARMS: Control options have been read.
> lib-4001 : UNRECOVERABLE library error
> A READ operation tried to read past the end-of-file.
> Encountered during a namelist READ from unit 11
> Fortran unit 11 is connected to a sequential formatted text file:
> "/tmp/F011.BAAa06574"
> Aborted
>
> So something is wrong with my specific configuration. I can think of a possible source of error, but I don't know whether or not it's relevant here. I'm following the ARCHER-recommended procedure of compiling on the "home" filesystem while running executables on the separate "work" filesystem. You can copy files from one filesystem to another, but jobs can only see the "work" filesystem. I get the impression that MITgcm/optim works best on a single filesystem, i.e. where compiling and running is done in the same place. Have I understood this correctly?
>
> Thanks!
> Dan