[Aces-support] weird behaviour of cluster...

aces-admin at techsquare.com aces-admin at techsquare.com
Fri Aug 12 14:56:11 EDT 2005


hello einatlev-

sounds like an environment problem. from which machines
are you having problems? even the failed job IDs
would be enough for me to track this down.

[greg]

ps. the tech answer: there are not enough low ports free for rsh
    to work. that's what "poll: protocol failure in circuit setup"
    usually means.
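
    a rough way to check this on a node, as a sketch only: rsh clients
    normally bind a privileged ("low") source port for each connection,
    and the code below assumes the range is 512-1023 and that the node
    exposes a Linux-style /proc/net/tcp table. the function name and the
    sample table are made up for illustration; nothing here comes from
    the cluster itself.

```python
# Sketch only: the 512-1023 range is an assumption about how the rsh client
# picks its source port; check rcmd(3)/rresvport(3) on your platform.
RESERVED_LOW, RESERVED_HIGH = 512, 1023


def reserved_ports_in_use(proc_net_tcp_text):
    """Return the set of local TCP ports in the reserved range that appear
    in text formatted like Linux's /proc/net/tcp."""
    used = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 2 or ":" not in fields[1]:
            continue
        # local_address is HEXIP:HEXPORT, so parse the part after the colon
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if RESERVED_LOW <= port <= RESERVED_HIGH:
            used.add(port)
    return used


# Hypothetical sample table (made-up addresses, not taken from the cluster).
SAMPLE = """\
  sl  local_address rem_address   st
   0: 0100007F:03FF 0200007F:0202 01
   1: 0100007F:03FE 0200007F:0202 06
   2: 00000000:0016 00000000:0000 0A
   3: 0100007F:1F90 0200007F:C350 01
"""

used = reserved_ports_in_use(SAMPLE)
free = (RESERVED_HIGH - RESERVED_LOW + 1) - len(used)
print(sorted(used), "in use;", free, "reserved ports free")
```

    on a real Linux node the same function could be fed
    open("/proc/net/tcp").read() instead of SAMPLE. if the free count
    is near zero while jobs are starting, that would be consistent with
    the rsh failure above.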


> Date: Fri, 12 Aug 2005 14:36:09 -0400
> From: Einat Lev <einatlev at mit.edu>
> MIME-Version: 1.0
> Reply-To: ACES-support at mitgcm.org
> 
> Hi
> 
> I am submitting jobs using a PBS script. Despite using exactly the same 
> script file, the result varies. Sometimes it works fine, and sometimes I 
> get error messages like:
> "
> poll: protocol failure in circuit setup
> /bin/rm: cannot remove 
> `/home/einatlev/CitcomS-2.0.0/packages/CitcomS/examples/Cookbook4/8767/PI685': 
> No such file or directory
> umount: /mnt/pvfs2raid: not mounted
> pvfs2-client: no process killed
> /var/torque/mom_priv/pvfs-stop: line 5: rmmod: command not found
> "
> The file mentioned did exist in the beginning, but then disappeared...
> 
> What is causing this instability?
> 
> Thanks
> Einat


