[MITgcm-support] optim_m1qn3 on ARCHER supercomputer

Martin Losch Martin.Losch at awi.de
Wed Apr 10 12:02:45 EDT 2019


Hi Dan,

I don’t have access to ARCHER, but I have a Cray CS400 with a somewhat dysfunctional Cray compiler. I followed the instructions in the README.md at https://github.com/mjlosch/optim_m1qn3 and was able to run tutorial_global_oce_optim for the zeroth iteration (so I can assume that in this case my compiler is not totally off). I could also compile and run optim.x with these flags:


CPPFLAGS = -DREAL_BYTE=4		\
	-DMAX_INDEPEND=1000000		\
	-D_RL='real*8'	\
	-D_RS='real*4'	\
	-D_d='d'

#                -DMAX_INDEPEND=293570968        \
# FORTRAN compiler and its flags copied from the opt file, or rather the Makefile of tutorial_global_oce_optim
FC              = ftn
FFLAGS     =  -h byteswapio -hnoomp -O0 -hfp0
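A note on -h byteswapio: with the Cray compiler this flag swaps the byte order of unformatted I/O, so the reader and the writer of a binary file must agree on it. To see which byte order a file was actually written with, you can walk its record markers. The sketch below is only an illustration and makes several assumptions: the file is Fortran sequential unformatted, each record is framed by 4-byte length markers (the gfortran default), and the function names are hypothetical, not part of MITgcm or optim_m1qn3.

```python
import struct

def scan_records(data, byteorder):
    """Walk 4-byte record markers in `data` using the given struct byte-order
    prefix ('<' or '>'). Return the list of record lengths, or None if the
    leading and trailing markers are inconsistent for this byte order."""
    fmt = byteorder + "i"
    pos, lengths = 0, []
    while pos < len(data):
        if pos + 4 > len(data):
            return None                      # truncated leading marker
        (n,) = struct.unpack_from(fmt, data, pos)
        end = pos + 4 + n                    # offset of the trailing marker
        if n < 0 or end + 4 > len(data):
            return None                      # record would run past end of file
        (m,) = struct.unpack_from(fmt, data, end)
        if m != n:
            return None                      # markers disagree
        lengths.append(n)
        pos = end + 4
    return lengths

def guess_byte_order(data):
    """Try both byte orders and report the record lengths found for each."""
    return {"little": scan_records(data, "<"), "big": scan_records(data, ">")}

def make_record(payload, byteorder):
    """Frame `payload` the way a sequential unformatted write would
    (assumption: 4-byte markers before and after the payload)."""
    marker = struct.pack(byteorder + "i", len(payload))
    return marker + payload + marker

if __name__ == "__main__":
    # Synthetic big-endian file with two records: 3 integers, then 2 real*8.
    demo = (make_record(struct.pack(">3i", 1, 2, 3), ">")
            + make_record(struct.pack(">2d", 1.0, 2.0), ">"))
    print(guess_byte_order(demo))
```

To apply it to a real file, read the file in binary mode (for example `open(name, "rb").read()`) and pass the bytes to `guess_byte_order`. If only one byte order gives a consistent list of record lengths, that is probably the one the writer used; if neither does, the file may not be sequential unformatted or may use a different marker size.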

Everything looks good until m1qn3_offline is called, unfortunately.
I get many wrong numbers (somethingE+/-317, and even NaN), but also useful numbers, in xx after m1qn3_offline has been called. This looks like something more severe. I’ll look into that, but if you can get this far, that would be good.
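For readers wondering where magnitudes like E-317 can come from: one generic way to get them is to interpret bytes as real*8 that were not written as real*8 in that layout, for example byte-swapped doubles or integer data. The sketch below only illustrates that mechanism; it does not establish the cause of the wrong numbers seen here, and the function names are hypothetical.

```python
import struct

def reinterpret_swapped(x):
    """Pack x as a big-endian real*8 and read the same bytes as little-endian."""
    return struct.unpack("<d", struct.pack(">d", x))[0]

def int_as_real8(n):
    """Read the bytes of a 64-bit integer as a little-endian real*8."""
    return struct.unpack("<d", struct.pack("<q", n))[0]

if __name__ == "__main__":
    # Ordinary values turn into tiny subnormal numbers when byte-swapped.
    for x in (1.0, 2.5, -3.0):
        print(x, "->", reinterpret_swapped(x))
    # An integer of a few million read as real*8 is also subnormal.
    print(2_000_000, "->", int_as_real8(2_000_000))
```

If the bad entries of xx look like the values printed here, comparing the byte order and the declared types on the writing and reading sides would be a reasonable first check.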

Please use the github version.

Martin



> On 10. Apr 2019, at 16:30, GOLDBERG Daniel <Dan.Goldberg at ed.ac.uk> wrote:
> 
> Hello Martin (or anyone who has used optim_m1qn3 on ARCHER)
> 
> I have used optim_m1qn3 previously but not on the ARCHER UK supercomputer (a Cray architecture). The setup I am using (making use of STREAMICE/OpenAD/optim_m1qn3) has been working well on the MIT engaging cluster, but I am now trying to run it on ARCHER. Following the recommendations of others I have built MITgcm using Cray compilers; and I modified mlosch/m1qn3_optim/Makefile, save to point to my build directory.
> 
> The first call to optim.x yields the error
> 
> ============================================================
>   OPTIM_READDATA: opened file ecco_cost_MIT_CE_000.opt0000                                                                
> At line 1295 of file optim_readdata.f (unit = 20, file = 'ecco_cost_MIT_CE_000.opt0000')
> Fortran runtime error: End of file
> ============================================================ 
> 
> which suggests the binary file is written in a format that optim_m1qn3 is not expecting? 
> 
> Other tests I did:
> 
> 1) Ran the same experiment (i.e. same code and input, but with gnu compilers) on the engaging cluster. Ran fine.
> 2) Called optim.x (compiled on Archer) with the ecco_cost_MIT_CE_000.opt0000 produced on engaging. Ran fine.
> 3) Called optim.x (compiled on engaging) with the ecco_cost_MIT_CE_000.opt0000 produced on ARCHER. 
> 
> Hence, either mitgcmuv_ad is writing a corrupted executable when built with a Cray compiler, or the Makefile of m1qn3_optim should be modified to reflect that its input files are being produced by a Cray-compiled executable -- but I do not know how to do this. I am attempting now to build and run MITgcm/OAD using iFort, but may run into trouble for different reasons.
> 
> Any guidance you could give on this topic would be much appreciated.
> 
> Best
> Dan
> 
> -- 
> 
> Daniel Goldberg, PhD
> Sr. Lecturer in Glaciology
> School of Geosciences, University of Edinburgh
> Geography Building, Drummond Street, Edinburgh EH8 9XP
> 
> 
> em: dan.goldberg at ed.ac.uk
> web: https://www.geos.ed.ac.uk/homes/dgoldber
> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support


