[MITgcm-support] is this bug in mitgcm code?
Dimitris Menemenlis
menemenlis at sbcglobal.net
Fri May 9 12:40:07 EDT 2008
Suneet, I had a similar problem. What worked for me was to turn on
both the useSingleCpuIO and globalFiles options.
This gives a warning, which can be ignored, but the obcs files are
then written as one global file rather than scattered across
multiple per-tile files. The reason is that useSingleCpuIO is not
implemented for vector quantities.
A quick fix in the code would be to default to globalFiles for vector
output whenever useSingleCpuIO is .TRUE.
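
For reference, a minimal sketch of what turning both options on might look
like. It assumes both flags are set in the PARM01 namelist of the run-time
parameter file "data" and that the namelist is closed with " &", as in
setups of this vintage. Check the spelling and the namelist group against
your own checkout; only the two flag lines are the point here.

```
# data (run-time parameters), sketch only
 &PARM01
# ... existing PARM01 settings stay as they are ...
 useSingleCpuIO=.TRUE.,
 globalFiles=.TRUE.,
 &
```

With both flags on, expect the warning mentioned above at start-up.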
D.
Dimitris Menemenlis <menemenlis at sbcglobal.net>
5056 Oakwood Ave, La Canada, CA 91011-2450
tel/fax: 818-790-6735; cell: 818-625-6498
On May 9, 2008, at 9:12 AM, Suneet Dwivedi wrote:
> Hi Everyone,
> When I tried to run the MITgcm adjoint model with the obcs package on,
> I ended up with the following error message:
> -------------------------------------------------------------------------------------------------------------------------------
> (PID.TID 0000.0001) *** ERROR *** MDSREADFIELD_XZ_GL: File does not exist
> --------------------------------------------------------------------------------------------------------------------------------
> My model stopped at:
> --------------------------------------------------------------------------------------------------------------------------------
> (PID.TID 0000.0001) MDSREADFIELD_XZ_GL: opening file: maskobcsn.001.001.data
> (PID.TID 0000.0001) MDSREADFIELD_XZ_GL: filename: maskobcsn.002.001.data
> ----------------------------------------------------------------------------------------------------------------------------------
> The detailed description of the problem and the solution (that worked
> for me) is as follows:
> (i) The model works fine (stops with 'normal end') when I use one or
> two processors on a single node.
> (ii) The model crashes with the above error message when I use
> 16 processors spread over maybe 10 nodes.
>
> When I looked for the missing files 'maskobcsn.002.001.data',
> 'maskobcsn.003.001.data' and so on, I found that they had been
> written on nodes other than the master node, even though I used
> "useSingleCpuIo=.TRUE.". I then copied all these files manually to
> the master node. After that the model stopped at a different place,
> looking for "adxx_obcsn.0000000000.002.001.data", then for
> "xx_obcsn.0000000000.002.001.data" and
> "pickup_obE.ckptA.002.002.data". So I copied all the files associated
> with obcs control to the master node and reran the model, and this
> time it worked fine (stopped with normal end). This suggests that
> useSingleCpuIo=.TRUE. does not work for obcs control files. Is this
> a bug in MITgcm when using obcs along with ecco? Or am I missing
> some setting required for the obcs package to write output on a
> single CPU when running the model on multiple processors? Please
> help me sort out this problem.
> Hoping for a reply,
> Cheers,
> Suneet