Posts in blog

Monitor PBS clusters with Jenkins

Oct 02, 2012 in jenkins, bioinformatics | blog

It is not exactly an idea; it is more of a requirement for BioUno. A common way of administering batch tasks in computer clusters is to use batch servers such as PBS. To submit a job, you hand a shell script to the PBS server (usually with qsub). The script carries special “meta” comments that tell PBS about your job priority, the number of CPUs needed, etc.
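To make the “meta” comments concrete, here is a minimal sketch in Python that writes such a script. The `#PBS` lines use common PBS/Torque options (`-N`, `-l`, `-p`), but the function name, default values and the example command are made up for illustration, and your cluster may expect different resource strings.

```python
def build_pbs_script(name, command, nodes=1, ppn=1,
                     walltime="01:00:00", priority=0):
    """Return the text of a PBS job script (illustrative helper).

    Lines starting with '#PBS' are plain comments for the shell,
    but qsub reads them as job options.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",                   # job name
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # maximum run time
        f"#PBS -p {priority}",               # job priority
        "",
        "cd $PBS_O_WORKDIR",                 # directory where qsub was called
        command,                             # the actual work (hypothetical)
    ]
    return "\n".join(lines) + "\n"


script = build_pbs_script("demo-job", "echo hello", ppn=4, priority=10)
print(script)
```

The resulting text would be saved to a file and submitted with qsub on a machine that has the PBS client tools installed.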

Jenkins has an awesome remoting API that has been reworked since its first version in Hudson (it was part of the core, IIRC) and can be used for other things. There is a Wiki page about Monitoring external jobs.

What we would need is to query the PBS server by running qstat (plus parameters) locally on the cluster machine. The requirements that I see:

  • Learn more about the remoting API in Jenkins (see http://kohsuke.org/tag/remoting/ too)
  • Rework pbs4java or use a wrapper to an existing C API (I’ve read in the past about one, the Python PBS module is a wrapper for this API too)
  • Learn about this technique for monitoring external jobs mentioned in that Wiki page
  • Write glue code if needed
  • Test with a PBS cluster
  • Report the findings (paper?)
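As a first piece of glue code, here is a rough sketch of reading the output of `qstat -f` in Python. It assumes the usual "Job Id:" header followed by indented "key = value" lines, ignores wrapped continuation lines, and uses made-up sample data, so treat it as a starting point rather than a finished parser.

```python
import subprocess


def parse_qstat_full(text):
    """Turn `qstat -f` style output into a list of dicts, one per job."""
    jobs = []
    current = None
    for line in text.splitlines():
        if line.startswith("Job Id:"):
            current = {"Job_Id": line.split(":", 1)[1].strip()}
            jobs.append(current)
        elif current is not None and " = " in line:
            key, value = line.split(" = ", 1)
            current[key.strip()] = value.strip()
    return jobs


def query_pbs():
    """Run qstat locally; only works where the PBS client is installed."""
    result = subprocess.run(["qstat", "-f"], capture_output=True,
                            text=True, check=True)
    return parse_qstat_full(result.stdout)


# Made-up sample so the parser can be tried without a cluster.
SAMPLE = """\
Job Id: 101.cluster
    Job_Name = demo-job
    Job_Owner = alice@cluster
    job_state = R
    queue = batch

Job Id: 102.cluster
    Job_Name = other-job
    Job_Owner = bob@cluster
    job_state = Q
    queue = batch
"""

jobs = parse_qstat_full(SAMPLE)
for job in jobs:
    print(job["Job_Id"], job["Job_Name"], job["job_state"])
```

A Jenkins-side monitor could call something like `query_pbs()` on the cluster machine and send the resulting list back over the remoting channel.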

Just food for thought :D

Edit: There is a label in Jenkins Wiki with several plug-ins that could be used for reference

Use Jenkins plug-ins API in Apache Nutch

Oct 02, 2012 in nutch, jenkins, ideas | blog

I’m working on an Apache Nutch project that involves some new plug-ins and customization of existing parts of Nutch. After reading Nutch’s code base and learning about its plug-in architecture, I believe someone could use part of the Jenkins API to enhance the plug-in API in Nutch.

Nutch uses a similar concept, with the same name as in Jenkins: Extension Points. However, it’s quite hard to create a plug-in project separate from the core project (it uses Ivy, and plug-ins have some dependencies on the core project). You also have to extend certain classes and configure XML files to prepare your plug-in.

Part of this could be done automatically with inheritance + Java annotations. I’ll have a cycle for Open Source in the next few days, and will give it a try to see if that really makes sense.
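To show the shape of the idea without a full Java project, here is a small sketch in Python, where a decorator plays the role that an annotation (such as Jenkins' `@Extension`) would play in Java. Every name in it (`ExtensionPoint`, `Parser`, `extension`, `UpperCaseParser`) is hypothetical and is not part of the Nutch or Jenkins APIs.

```python
# Registry: extension point class -> list of registered plug-in classes.
EXTENSIONS = {}


class ExtensionPoint:
    """Marker base class for extension points (hypothetical)."""


class Parser(ExtensionPoint):
    """An example extension point that plug-ins can implement."""

    def parse(self, content):
        raise NotImplementedError


def extension(cls):
    """Register cls under each extension point it inherits from.

    This replaces the hand-written XML descriptor: inheritance says
    which point is extended, the decorator does the registration.
    """
    points = [base for base in cls.__mro__[1:]
              if issubclass(base, ExtensionPoint)
              and base is not ExtensionPoint]
    if not points:
        raise TypeError(f"{cls.__name__} does not extend an extension point")
    for point in points:
        EXTENSIONS.setdefault(point, []).append(cls)
    return cls


@extension
class UpperCaseParser(Parser):
    def parse(self, content):
        return content.upper()


def extensions_for(point):
    """Instantiate every plug-in registered for the given point."""
    return [cls() for cls in EXTENSIONS.get(point, [])]


for plugin in extensions_for(Parser):
    print(type(plugin).__name__, plugin.parse("hello"))
```

In Java the registration would more likely happen at compile time or at class-loading time through annotation processing, but the division of labour is the same.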

Redesigning Speak Like A Brazilian

Aug 29, 2012 in products, slbr | blog

Speak Like A Brazilian started as a weekend project submitted to Hacker News and Reddit. The feedback was amazing, so we decided to dedicate some time to a new version of SLBR. It is a web site to share Brazilian Portuguese expressions with other users. Anybody can submit expressions, like or dislike expressions and share them with friends via e-mail or social networks.

TL;DR We added support for expressions with many definitions, simplified the design of the landing page and removed the need to register to submit expressions. Part of the work was based on user feedback, and another part was based on the way that Urban Dictionary works.

What was done?

We started with the database model. Initially, expressions were thought of as a single entity: an expression had a text and its description. However, some users pointed out that there could be more than one definition for the same expression. We had to break the table into two, expressions and definitions.
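The split looks roughly like the sketch below, written with Python's built-in sqlite3 module. The column names and the `add_definition` helper are assumptions for illustration and not the actual SLBR schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expressions (
    id   INTEGER PRIMARY KEY,
    text TEXT NOT NULL UNIQUE
);
CREATE TABLE definitions (
    id            INTEGER PRIMARY KEY,
    expression_id INTEGER NOT NULL REFERENCES expressions(id),
    description   TEXT NOT NULL
);
""")


def add_definition(conn, text, description):
    """Attach a definition to an expression, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO expressions (text) VALUES (?)",
                 (text,))
    (expression_id,) = conn.execute(
        "SELECT id FROM expressions WHERE text = ?", (text,)).fetchone()
    conn.execute(
        "INSERT INTO definitions (expression_id, description) VALUES (?, ?)",
        (expression_id, description))


# One expression, two definitions submitted by different users.
add_definition(conn, "legal", "cool, nice")
add_definition(conn, "legal", "OK, agreed")

rows = conn.execute("""
    SELECT e.text, d.description
    FROM expressions e
    JOIN definitions d ON d.expression_id = e.id
    ORDER BY d.id
""").fetchall()
print(rows)
```

With a single table, the second submission would have needed either a duplicate expression row or an overwritten description.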

Then it was time to rethink the landing page design. There are several articles about landing pages ([1], [2], [3]). Here are some basic guidelines that we followed when redesigning SLBR.

  • Limit action
  • Buttons and call actions standing out
  • Enable sharing

Limit action

The old version had too much text, a video and a box with tweets, as well as the letters menu. We removed this content, leaving only the letters menu, minus the Home and Login menu items. Now the user is presented with the search box, the letters menu and the top ten expressions.

Buttons and call actions standing out

With fewer actions on the screen, the user experience in SLBR should be better. When users see the landing page now, the search button stands out, with a bigger font size. There is also a menu with letters for simple navigation, and some expressions that give users a picture of how the web site works.

Enable sharing

This was a recommendation in one of the web sites mentioned earlier, but UD already did it too. It is a very simple idea, and very easy to put into practice. We added a form that lets users share an expression via e-mail or via social networks. Hopefully it will increase the number of expressions in SLBR.

Final thoughts

We have also made several changes to the layout of items, and some CSS and JS enhancements, like using hover in some elements, modal windows to share and code to prevent a form being submitted twice.

But one of the greatest enhancements was allowing users to submit expressions without having to log in. Before this, users were required to register on the web site in order to submit an expression. That is just boring. After playing with UD, we decided to give it a try.

We will monitor the web site stats to check the results of this redesign, but so far we are very happy with this new version. The next step is probably to use jQuery Mobile to create a simple version of the web site for mobile users, and to promote the web site at local hostels.

What do you think?

References

[1] http://blog.hubspot.com/blog/tabid/6307/bid/7177/What-Is-a-Landing-Page-and-Why-Should-You-Care.aspx
[2] http://unbounce.com/landing-page-examples/your-landing-page-sucks/
[3] http://support.google.com/adwords/bin/answer.py?hl=en&answer=2404197

BOSC 2012 and BioUno

Aug 10, 2012 in jenkins, bioinformatics, events | blog

A few weeks ago TupiLabs participated in BOSC 2012, with a talk about BioUno. BioUno is a project that applies techniques and tools from Continuous Integration to Bioinformatics, in particular Jenkins. In the very beginning, the project was a “hey, look what I can do with Jenkins”. Later on we decided that we would like to create biology pipelines with Jenkins, using plug-ins. This was the topic of our talk at BOSC 2012: Creating biology pipelines with Jenkins. The main advantages of using Jenkins are its distributed environment with master and slaves, the plug-ins, and the documentation and community available.

In the talk, we showed a demo using vanilla Jenkins, MrBayes with mrbayes-plugin, and R and ape for plotting, using r-plugin. It was a very simple pipeline, without any Cloud services or NGS tools. What was great at BOSC 2012 is that we could catch up with people who work on Galaxy, Taverna, Mobyle and other workflow management systems, as well as lads from other interesting projects, like CloudGene (MapReduce interface for researchers, pretty cool). When we returned to Brazil, we had more items to include in the BioUno TODO list, but also a realization: BioUno is not only a biology workflow management system. Its role intersects with Galaxy, Taverna and Mobyle, as a workflow management system, but also with BioHPC, as a bioinformatics computer management system.

With Jenkins you can start and monitor slaves remotely (in your local network or in a cloud), execute parts of your build on one machine and serialize the results back to the master, display graphs, monitor usage and do other things that let you use Jenkins to create very customized pipelines. Sometimes a researcher has to use tools like stacks, samtools, structure, beast and so on. But sometimes they need a very specific routine, maybe for plotting something or adjusting the data output from one tool before inputting it into the next tool in the pipeline. These routines are not always worth a dedicated tool, as they would be used very rarely. Jenkins makes this possible. Or a job may demand five computers. Common computer facilities would delegate the machine provisioning to cluster management systems, like PBS, LSF or some cloud-based system. With Jenkins, you can manage your computers yourself, maybe even using Puppet to help you.

We have a long way ahead. We are reorganizing our servers at Linode to install JIRA and Confluence, and to have a more Jenkins-like web site (as this is the principal tool in BioUno). And we are still creating plug-ins. If you have any interest in the project, feel free to join us; your help will be very welcome :-)

Comparison of PBS cluster monitoring applications

Jun 30, 2012 in jenkins, bioinformatics | blog

While we worked on our small internal cluster setup, we used PBS for running structure jobs in batch. This post is a simple comparison of web applications that can be used for monitoring a PBS cluster. It is a very simple comparison, and the information may not be sufficient for you to decide which of them, if any, to use in your computer facility.

We found the tools with a simple query on SourceForge.net. We searched for pbs and selected three tools: PBS Viz Cluster 0.6a, Torque Web Monitor 1.0 and myJAM 2.4.7-3191. The hardware used is not taken into account in this comparison but, for what it is worth, we have a quad-core Core i5 with 6GB of memory and a 400GB disk. The operating system is Debian Squeeze, and the web server is Apache.