[Aces-support] ao jobs quit immediately

William Boos billboos at MIT.EDU
Tue Apr 3 16:05:04 EDT 2007


hi greg,
ao jobs are quitting immediately again -- 59213.ao is one such job.  i'm
guessing it's a problem with a54-1727-073, because jobs quit immediately when
i qsub to this node in particular, but not to other nodes.
-bill


-----Original Message-----
From: aces-support-bounces at mitgcm.org
[mailto:aces-support-bounces at mitgcm.org] On Behalf Of
aces-admin at techsquare.com
Sent: Monday, April 02, 2007 3:06 PM
To: ACES-support at mitgcm.org
Subject: Re: [Aces-support] ao jobs quit immediately

hello billboos-

both of these jobs failed to stage-in correctly. 
could have been slooooow network, but most likely local problem on
a54-1727-077 (which was the execution node for both jobs). 

i've offlined the node and will check on it as soon as the current round of
jobs has run (or tried to).

[greg]


> From: "William Boos" <billboos at mit.edu>
> Date: Mon, 2 Apr 2007 13:40:48 -0400
> MIME-Version: 1.0
> Reply-To: ACES-support at mitgcm.org
> 
> When I submit a job to ao it starts and then quits immediately, for 
> both interactive and scripted jobs.  Examples are 58728.ao (a 1-node 
> interactive job) and 58729.ao (a 24-node job).
> 
> Also, I've been trying to run on ao because my jobs typically run 
> there in half the time they take on geo.  Anyone know why this would 
> be?  Are the scratch disks mounted local to a particular cluster, 
> thereby making disk writes quicker on some clusters?
> 
> -Bill
> 
> _______________________________________________
> Aces-support mailing list
> Aces-support at acesgrid.org
> http://acesgrid.org/mailman/listinfo/aces-support
> 