[MITgcm-support] optim_m1qn3 on ARCHER supercomputer

Martin Losch Martin.Losch at awi.de
Wed Apr 10 12:02:45 EDT 2019

Hi Dan,

I don’t have access to ARCHER, but I have a Cray CS400 with a somewhat dysfunctional Cray compiler. I followed the instructions in the README.md on https://github.com/mjlosch/optim_m1qn3 and was able to run tutorial_global_oce_optim for the zeroth iteration (so I can assume that in this case my compiler is not totally off). I could also compile and run optim.x with these flags:

	-DMAX_INDEPEND=1000000		\
	-D_RL='real*8'	\
	-D_RS='real*4'	\

#                -DMAX_INDEPEND=293570968        \
# FORTRAN compiler and its flags copied from the opt file, or rather the Makefile of tutorial_global_oce_optim
FC              = ftn
FFLAGS     =  -h byteswapio -hnoomp -O0 -hfp0
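Since byte order of unformatted I/O is the usual suspect when a file written on one machine cannot be read on another, here is a minimal sketch of a check that can be run on a file such as ecco_cost_MIT_CE_000.opt0000 before calling optim.x. It assumes the file is Fortran sequential unformatted with 4-byte record-length markers before and after each record (a common layout, but an assumption here, not something verified for the Cray runtime). The function names probe_first_record and write_demo_file are hypothetical helpers, not part of optim_m1qn3 or MITgcm.

```python
import os
import struct


def probe_first_record(path):
    """Check whether the first record frames consistently in each byte order.

    Assumes 4-byte record markers: <len><payload of len bytes><len>.
    Returns e.g. {'little': False, 'big': True}.
    """
    size = os.path.getsize(path)
    result = {}
    with open(path, "rb") as f:
        head = f.read(4)
        for name, fmt in (("little", "<i"), ("big", ">i")):
            ok = False
            if len(head) == 4:
                (nbytes,) = struct.unpack(fmt, head)
                end = 4 + nbytes  # offset of the trailing marker
                if nbytes >= 0 and end + 4 <= size:
                    f.seek(end)
                    # The trailing marker must repeat the leading one.
                    ok = f.read(4) == head
            result[name] = ok
    return result


def write_demo_file(path, byteorder):
    """Write one record of three float64 values with 4-byte markers (demo only)."""
    prefix = "<" if byteorder == "little" else ">"
    payload = struct.pack(prefix + "3d", 1.0, 2.0, 3.0)
    marker = struct.pack(prefix + "i", len(payload))
    with open(path, "wb") as f:
        f.write(marker + payload + marker)
```

If neither byte order gives a consistent first record, the file probably does not use 4-byte markers at all, and the problem is something other than a simple byte swap.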

Everything looks good until m1qn3_offline is called, unfortunately.
I get many wrong numbers (somethingE+/-317, and even NaN), but also useful numbers in xx after m1qn3_offline has been called. This looks like something more severe. I’ll look into that, but if you can get this far, that would be good.
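For what it is worth, values of order E-317 are subnormal doubles (smaller than the smallest normal double, about 2.2e-308), which is typically what you see when bytes that are not a valid double in the assumed byte order or type are read as one. Below is a minimal sketch for counting how much of a vector is affected. The helper names classify and misread are hypothetical, and misread only illustrates one way such values can arise (a byte-order mismatch); I am not claiming this is what happens inside m1qn3_offline.

```python
import math
import struct
import sys

TINY = sys.float_info.min  # smallest positive normal double, about 2.2e-308


def classify(values):
    """Count NaN, Inf, subnormal, and ordinary entries in a sequence of floats."""
    counts = {"nan": 0, "inf": 0, "subnormal": 0, "ok": 0}
    for v in values:
        if math.isnan(v):
            counts["nan"] += 1
        elif math.isinf(v):
            counts["inf"] += 1
        elif v != 0.0 and abs(v) < TINY:
            counts["subnormal"] += 1
        else:
            counts["ok"] += 1
    return counts


def misread(value):
    """Pack a double as big-endian, then read the same bytes as little-endian."""
    return struct.unpack("<d", struct.pack(">d", value))[0]


if __name__ == "__main__":
    # A byte-swapped 1.0 lands in the subnormal range.
    print(classify([1.0, misread(1.0), float("nan")]))
```

A large subnormal count mixed with sensible values, as in xx here, would point to only part of the data being misinterpreted.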

Please use the github version.


> On 10. Apr 2019, at 16:30, GOLDBERG Daniel <Dan.Goldberg at ed.ac.uk> wrote:
> Hello Martin (or anyone who has used optim_m1qn3 on ARCHER)
> I have used optim_m1qn3 previously, but not on the ARCHER UK supercomputer (a Cray architecture). The setup I am using (which makes use of STREAMICE/OpenAD/optim_m1qn3) has been working well on the MIT engaging cluster, but I am now trying to run it on ARCHER. Following the recommendations of others, I have built MITgcm using Cray compilers, and I modified mlosch/m1qn3_optim/Makefile only to point to my build directory.
> The first call to optim.x yields the error
> ============================================================
>   OPTIM_READDATA: opened file ecco_cost_MIT_CE_000.opt0000                                                                
> At line 1295 of file optim_readdata.f (unit = 20, file = 'ecco_cost_MIT_CE_000.opt0000')
> Fortran runtime error: End of file
> ============================================================ 
> which suggests the binary file is written in a format that optim_m1qn3 is not expecting? 
> Other tests I did:
> 1) Ran the same experiment (i.e. same code and input, but with GNU compilers) on the engaging cluster. Ran fine.
> 2) Called optim.x (compiled on ARCHER) with the ecco_cost_MIT_CE_000.opt0000 produced on engaging. Ran fine.
> 3) Called optim.x (compiled on engaging) with the ecco_cost_MIT_CE_000.opt0000 produced on ARCHER. 
> Hence, either mitgcmuv_ad is writing a corrupted file when built with a Cray compiler, or the Makefile of m1qn3_optim should be modified to reflect that its input files are being produced by a Cray-compiled executable, but I do not know how to do this. I am now attempting to build and run MITgcm/OAD using ifort, but may run into trouble for different reasons.
> Any guidance you could give on this topic would be much appreciated.
> Best
> Dan
> -- 
> Daniel Goldberg, PhD
> Sr. Lecturer in Glaciology
> School of Geosciences, University of Edinburgh
> Geography Building, Drummond Street, Edinburgh EH8 9XP
> em: dan.goldberg at ed.ac.uk
> web: https://www.geos.ed.ac.uk/homes/dgoldber
> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
