[Aces-support] why are my jobs not being executed?

Einat Lev einatlev at MIT.EDU
Fri Aug 5 15:05:27 EDT 2005


My problem is actually more complicated...
I have a post-processing script that takes as input a list of the processors 
on which the original program ran and collects the output files from 
them. So I am supposed to be asking for specific nodes...
This brings up a general file-system question: I can see all the 
output files when I am logged in to the main node, but I guess that the 
actual files reside on the local disks of the nodes. Is that right?
Thanks,
Einat


Ed Hill wrote:

>On Fri, 2005-08-05 at 14:42 -0400, Einat Lev wrote:
>  
>
>>they get stuck in the queue, no matter which queue I submit them to 
>>(four/long).
>>It seems like only 57 nodes are allocated to current jobs, so there 
>>should be enough room for everybody.
>>    
>>
>
>Hi Einat,
>
>Your script:
>
>  /home/einatlev/CitcomS/inputsample/regional/pbs_citcom
>
>specifies that it only wants nodes with two processors:
>
>  #PBS -l nodes=4:ppn=2:gigabit
>
>and it wants both of those processors.  Only single-processor
>machines are currently available for use, so it's waiting.
>
>So, there are a few different things that you can do including:
>
>  1) try using:  #PBS -l nodes=4:ppn=1:gigabit
>
>  2) try using:  #PBS -l nodes=4:gigabit
>
>  3) run things on ao.acesgrid.org or itrda.acesgrid.org instead 
>     of geojr since they're less busy
>
>Ed
>
>  
>
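
For reference, here is a minimal sketch of how a job script could use the
relaxed node request from option 1 and still obtain the list of nodes it
actually ran on. It assumes a PBS/Torque setup in which the scheduler sets
PBS_NODEFILE inside a running job, with one host name per allocated
processor. The helper name unique_nodes and the echo placeholder are
hypothetical and are not taken from the pbs_citcom script or the
post-processing script discussed above.

```shell
#!/bin/sh
#PBS -l nodes=4:ppn=1:gigabit

# unique_nodes FILE: print each host name listed in FILE once, sorted.
# (Hypothetical helper, for illustration only.)
unique_nodes() {
    sort -u "$1"
}

# PBS_NODEFILE is only set inside a running PBS job, so do nothing
# when this script is run outside of one.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    for node in $(unique_nodes "$PBS_NODEFILE"); do
        # Placeholder: a real script would pass $node to its own
        # post-processing step here.
        echo "job ran on node: $node"
    done
fi
```

With something like this, the post-processing step can read the node list
that the scheduler assigned instead of requiring specific nodes to be
requested in advance.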


