[Aces-support] PBS question
aces-admin at techsquare.com
Thu Aug 4 15:04:44 EDT 2005
hello lcampo-
Torque is a quickly evolving beast.
Which version are you running?
[greg]
> Date: Thu, 4 Aug 2005 12:31:20 -0400
> From: Lorenzo Campo <lcampo at mit.edu>
> MIME-Version: 1.0
> Cc:
> Reply-To: ACES-support at mitgcm.org
>
> Hi,
> This is not a question about the acesgrid cluster, but any suggestion would be
> highly appreciated.
>
> I'm trying to build a 16-processor cluster in my department (University of
> Florence, Italy). I installed everything with the OSCAR package, which set up
> every node and all the necessary daemons without problems. The problem is that
> PBS reports (via the pbsnodes -a command) every node as "state-unknown,down",
> except the master node, which is "free", for no apparent reason.
>
> Communication between the nodes and the master is not blocked (iptables is
> stopped, and I ran several communication tests), PBS files like "nodes" and
> "server" contain the right IPs and hostnames, the pbs_mom daemon is running on
> each node and on the master, and MAUI doesn't seem to have any problems (or so
> I guess). I created two queues with qmgr, both enabled and started, and I
> defined all 15 nodes when configuring the server (again in qmgr). Still, every
> time I use qsub to launch a job that needs more than one processor, it stays
> queued indefinitely, and qstat -s says that "there are not enough resources".
> Any ideas?
>
> Thank you very much, and sorry for the off-topic question.
> Lorenzo
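
To track which nodes stay in the "state-unknown,down" state described in the quoted message while testing changes, a small script could summarize the output of pbsnodes -a. This is only a rough sketch, not part of Torque or OSCAR: the function names are hypothetical, and it assumes the usual output layout, where each node name starts at the beginning of a line and is followed by indented "key = value" attribute lines, one of which is "state".

```python
import subprocess


def parse_pbsnodes(text):
    """Map each node name to its list of state flags.

    Assumes the layout described above: an unindented node name,
    then indented "key = value" lines (an assumption about the format).
    """
    states = {}
    node = None
    for line in text.splitlines():
        if not line.strip():
            # Blank lines separate node records.
            node = None
            continue
        if not line[0].isspace():
            # Unindented line: start of a new node record.
            node = line.strip()
            states[node] = []
        elif node is not None:
            key, sep, value = line.partition("=")
            if sep and key.strip() == "state":
                states[node] = [flag.strip() for flag in value.split(",")]
    return states


def unreachable_nodes(states):
    """Return the sorted names of nodes flagged down, offline, or state-unknown."""
    bad = {"down", "offline", "state-unknown"}
    return sorted(name for name, flags in states.items() if bad & set(flags))


if __name__ == "__main__":
    # Requires pbsnodes on the PATH, typically on the master node.
    result = subprocess.run(
        ["pbsnodes", "-a"], capture_output=True, text=True, check=True
    )
    for name in unreachable_nodes(parse_pbsnodes(result.stdout)):
        print(name)
```

Run on the master after each configuration change, it prints one line per node that the server still cannot use. Empty output would mean every node is reported in a usable state.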