[Aces-support] abnormal exit

Shihong Chi shihong at erl.mit.edu
Wed Dec 15 10:15:58 EST 2004


Sorry to bother you all.

My program ran for 39 hrs and then exited due to a file open error. I do not
know whether it is due to a disk limitation or a file handle shortage. It
happened around 6 AM today.
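
One way to tell these two causes apart would be to report errno at the point
where the open fails. The following is only a minimal sketch in C, assuming the
snapshot file is opened with fopen(); the names open_failure_hint and
open_snapshot and the message wording are hypothetical and not taken from the
actual program.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map an errno value from a failed open
 * onto the two suspected causes. */
const char *open_failure_hint(int err)
{
    switch (err) {
    case EMFILE:            /* per-process descriptor limit reached */
    case ENFILE:            /* system-wide file table is full */
        return "file handle shortage";
    case ENOSPC:            /* no space left on device */
#ifdef EDQUOT
    case EDQUOT:            /* disk quota exceeded */
#endif
        return "disk limitation";
    default:
        return "other";
    }
}

/* Hypothetical wrapper: open a snapshot file for writing and,
 * on failure, print the reason to stderr. Returns NULL on failure. */
FILE *open_snapshot(const char *path)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL) {
        int err = errno;    /* save before another call can overwrite it */
        fprintf(stderr, "snapshot file %s not opened: %s (%s)\n",
                path, strerror(err), open_failure_hint(err));
    }
    return fp;
}
```

If the reported cause is a file handle shortage, it may also be worth checking
that every snapshot file opened earlier in the run is closed with fclose().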

The end of the stdout file reads:

step 4499 of 8000: snapshot 9 out
snapshot file set1snapshots9vx.bin not opened, whch = 0
error opening snapshot file
MPI_Recv: process in local group is dead (rank 1, comm 3)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD):  - MPI_Recv()
Rank (2, MPI_COMM_WORLD):  - MPI_Barrier()
Rank (2, MPI_COMM_WORLD):  - MPI_Barrier()
Rank (2, MPI_COMM_WORLD):  - main()

The working directory is:
/home/shihong/PMLFS/Stretch/TunnelQ

Thanks.

Shihong


