Monitor PBS clusters with Jenkins

This is not exactly an idea; it is more of a requirement for BioUno. A common way of administering batch tasks in computer clusters is to use a batch server such as PBS. Submitting a job consists of executing a shell script with special “meta” comments that tell PBS about your job’s priority, the CPUs it needs, etc.
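To make the “meta” comments concrete, here is a minimal Python sketch that builds such a job script. The `#PBS` directives used (`-N`, `-l nodes=...:ppn=...`, `-l walltime=...`, `-p`) are the common TORQUE/PBS ones, but the function names, the job name and the command are made up for illustration, and your cluster may expect different resource strings.

```python
import subprocess


def build_pbs_script(name, command, nodes=1, ppn=1,
                     walltime="01:00:00", priority=0):
    """Return the text of a PBS job script (hypothetical helper).

    The lines starting with "#PBS" are the "meta" comments: the shell
    ignores them, but the PBS server reads them at submission time.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",                   # job name
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and CPUs per node
        f"#PBS -l walltime={walltime}",      # maximum run time
        f"#PBS -p {priority}",               # job priority
        'cd "$PBS_O_WORKDIR"',               # directory qsub was run from
        command,
    ]
    return "\n".join(lines) + "\n"


def submit(script_text):
    """Submit the script by piping it to qsub; returns the job id.

    Assumes qsub is on the PATH of the machine running this code.
    """
    result = subprocess.run(["qsub"], input=script_text, text=True,
                            capture_output=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Only prints the script; call submit() on a real cluster.
    print(build_pbs_script("align-sample", "./run_alignment.sh", ppn=4))
```

Keeping script generation separate from submission means the generated text can be inspected or tested on a machine that has no PBS installed.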

Jenkins has an awesome remoting API that has been reworked since its first version in Hudson (it was part of the core, IIRC) and can be used for other things. There is a Wiki page about Monitoring external jobs.

What we would need is to query the PBS server by running qstat with some parameters locally on the cluster machine. So, the requirements as I see them:

  • Learn more about the remoting API in Jenkins (see http://kohsuke.org/tag/remoting/ too)
  • Rework pbs4java or use a wrapper around an existing C API (I’ve read about one in the past; the Python PBS module is a wrapper for this API too)
  • Learn about the technique for monitoring external jobs described on that Wiki page
  • Write glue code if needed
  • Test with a PBS cluster
  • Report the findings (paper?)
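As a rough idea of the qstat step mentioned before the list, here is a minimal Python sketch that runs `qstat -f` and turns its output into dictionaries. It assumes the TORQUE-style full listing (a `Job Id:` header line, indented `key = value` attributes, and tab-indented continuation lines); the function names are mine, and other PBS flavours may format the output differently.

```python
import subprocess


def parse_qstat_full(text):
    """Parse `qstat -f` style output into a list of dicts (one per job)."""
    jobs = []
    current = None
    last_key = None
    for line in text.splitlines():
        if line.startswith("Job Id:"):
            # A new job record starts here.
            current = {"Job_Id": line.split(":", 1)[1].strip()}
            jobs.append(current)
            last_key = None
        elif current is None or not line.strip():
            # Skip blank lines and anything before the first job.
            continue
        elif line.startswith("\t") and last_key:
            # Long values are wrapped onto tab-indented lines.
            current[last_key] += line.strip()
        elif " = " in line:
            key, value = line.strip().split(" = ", 1)
            current[key] = value
            last_key = key
    return jobs


def query_jobs():
    """Run qstat on the local machine (assumes qstat is on the PATH)."""
    result = subprocess.run(["qstat", "-f"], capture_output=True,
                            text=True, check=True)
    return parse_qstat_full(result.stdout)


if __name__ == "__main__":
    for job in query_jobs():
        print(job["Job_Id"], job.get("Job_Name"), job.get("job_state"))
```

Something like this could run on the cluster machine and hand the parsed job states back to Jenkins; how that hand-off would work over the remoting API is exactly what still needs to be investigated.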

Just food for thought :D

Edit: There is a label in the Jenkins Wiki with several plug-ins that could be used for reference.