[MITgcm-support] file size issue with mpi
THOMAS HAINE
thomas.haine at jhu.edu
Mon Aug 23 18:14:25 EDT 2004
Hi Folks,
I've hit a problem that is confusing me. If I run an MPI job (on my Opteron SuSE cluster with g77 and mpich-1.2.5), I get incorrectly sized file dumps, depending on how I set up MPI.
For example, if I write to local disk:
mpirun -np 9 -nolocal -machinefile node_list run/mitgcmuv -p4wd /tmp/twnh/scratch
Files are written on each node's scratch disk, but most of them are the wrong size:
for node in `cat node_list`; do echo $node; ssh $node 'ls -alt /tmp/twnh/scratch/U.*.data'; done
gives:
node2
-rw-r--r-- 1 twnh users 96545664 Aug 23 06:30 /tmp/twnh/scratch/U.0000000000.data
node5
-rw-r--r-- 1 twnh users 96212736 Aug 23 06:33 /tmp/twnh/scratch/U.0000000000.data
node14
-rw-r--r-- 1 twnh users 96544512 Aug 23 02:21 /tmp/twnh/scratch/U.0000000000.data
node15
-rw-r--r-- 1 twnh users 96546816 Aug 23 06:29 /tmp/twnh/scratch/U.0000000000.data
node16
-rw-r--r-- 1 twnh users 96213888 Aug 23 06:31 /tmp/twnh/scratch/U.0000000000.data
The correct size is 96546816 bytes, so node15 is right and the others are too small (node15 ran processes 3 and 8 of 9). Similar problems occur with 2-D files. The same thing happens if I write to disk on the master node, although it seems a bit better there.
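(The sanity check I'm using is just the field size in bytes - a sketch only, with placeholder grid dimensions nx, ny, nr and assuming the usual 64-bit output, so one 3-D field should be nx*ny*nr*8 bytes:

nx=100; ny=100; nr=20   # placeholders - substitute the real grid dimensions
echo $(( nx * ny * nr * 8 ))
for node in `cat node_list`; do ssh $node 'stat -c %s /tmp/twnh/scratch/U.0000000000.data'; done

The echo value is what every node's file should report.)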
These cases are with globalFiles=.TRUE. If I set it to .FALSE. to get tiled output, every file is the right size when I write to local scratch disk.
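(For reference, this is the switch I'm toggling in the input "data" namelist - shown as a sketch, assuming the flag is read from the PARM01 block as in the verification experiments:

 &PARM01
  globalFiles=.TRUE.,
 &

)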
Any ideas what's going on here and how I should fix it? Have I missed something?
Thanks, Tom.