Posts tagged with jenkins

BOSC 2012 and BioUno

Aug 10, 2012 in jenkins, bioinformatics, events | blog

A few weeks ago TupiLabs participated in BOSC 2012 with a talk about BioUno. BioUno is a project that applies Continuous Integration techniques and tools, in particular Jenkins, to bioinformatics. In the very beginning the project was a “hey, look what I can do with Jenkins”. Later on we decided that we wanted to create biology pipelines with Jenkins, using plug-ins. That was the topic of our talk at BOSC 2012: Creating biology pipelines with Jenkins. The main advantages of using Jenkins are its distributed environment with master and slaves, its plug-ins, and the documentation and community available.

In the talk we showed a demo using vanilla Jenkins: MrBayes through the mrbayes-plugin, and R with ape for plotting, through the r-plugin. It was a very simple pipeline, without any cloud services or NGS tools. What was great about BOSC 2012 was catching up with people who work on Galaxy, Taverna, Mobyle and other workflow management systems, as well as folks from other interesting projects, like CloudGene (a MapReduce interface for researchers, pretty cool). When we returned to Brazil we had more items for the BioUno TODO list, but also a realization: BioUno is not only a biology workflow management system. Its role intersects with Galaxy, Taverna and Mobyle as a workflow management system, but it also intersects with BioHPC as a bioinformatics computer management system.
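For the curious, outside Jenkins the demo boils down to two commands like the ones below; the input file and plotting script names are hypothetical, and in the actual demo each step was a Jenkins job driven by the plug-ins above:

```bash
# MrBayes reads the MCMC settings from a mrbayes block inside the nexus file.
mb primates.nex

# Plot the resulting consensus tree with R and ape (the output file name
# depends on the MrBayes version; plot_tree.R is a small hypothetical ape script).
Rscript plot_tree.R primates.nex.con.tre
```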

With Jenkins you can start and monitor slaves remotely (on your local network or in a cloud), execute parts of your build on one machine and serialize the results back to the master, display graphs, monitor usage, and do other things that let you create very customized pipelines. Sometimes a researcher has to use tools like stacks, samtools, structure, beast and so on. But sometimes they need a very specific routine, maybe to plot something or to adjust the output of one tool before feeding it into the next tool in the pipeline. These routines are not always worth turning into a proper tool, as they would be used very rarely, and Jenkins handles them nicely (there is a tiny example at the end of this post). Or a job may demand five computers: most computing facilities would delegate machine provisioning to cluster management systems like PBS, LSF or some cloud-based system, while with Jenkins you can manage the computers yourself, maybe even with Puppet to help you.

We still have a long way ahead. We are reorganizing our servers at Linode to install JIRA and Confluence and to have a more Jenkins-like web site (as Jenkins is the principal tool in BioUno), and we are still creating plug-ins. If you have any interest in the project, feel free to join us, your help will be very welcome :-)
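As promised, here is what one of those glue routines could look like as a shell step in a Jenkins job. The file names and the awk filter are made up for illustration; the point is only that a ten-second script does not deserve its own published tool:

```bash
# Hypothetical glue step between two pipeline tools: drop comment lines and
# keep only the two columns the next tool expects. File names are illustrative.
awk '!/^#/ { print $1 "\t" $4 }' tool_a_output.txt > tool_b_input.txt
```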

Comparison of PBS cluster monitoring applications

Jun 30, 2012 in jenkins, bioinformatics | blog

While we were working on our small internal cluster setup, we used PBS to run structure jobs in batch. This post presents a simple comparison of web applications that can be used to monitor a PBS cluster. It is a very simple comparison, and the information may not be enough for you to decide whether or not to use one of these tools in your computing facility.
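For context, the jobs we submit look roughly like the sketch below; the job name, resource request and structure parameters are illustrative, not our actual configuration:

```bash
#!/bin/bash
# Illustrative Torque/PBS job script for a single structure run.
#PBS -N structure-k3          # job name
#PBS -l nodes=1:ppn=1         # one core on one node
#PBS -j oe                    # merge stdout and stderr into one log

cd "$PBS_O_WORKDIR"
structure -m mainparams -e extraparams -K 3 -i input.txt -o output_k3
```

Submitted with qsub, jobs like this are what the tools below are supposed to let us monitor.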

We found the tools with a simple search for pbs on SourceForge.net and selected three projects: PBS Viz Cluster 0.6a, Torque Web Monitor 1.0 and myJAM 2.4.7-3191. The hardware is not taken into account in this comparison, but for what it is worth, we have a quad-core Core i5 with 6GB of memory and a 400GB disk. The operating system is Debian Squeeze, and the web server is Apache.

Creating a PBS/MPI cluster for bioinformatics – Part 3

Jun 21, 2012 in jenkins, bioinformatics | tutorial

This is the third and last part of this blog series. In this post we will install Structure (Pritchard Lab) and Torque PBS. We will configure a simple run in Structure using the two machines in our cluster.

Installing Structure

Installing structure is very simple. Download the latest version from the Pritchard Lab page, decompress it and move the executables to a folder in your $PATH (or use symlinks). Here I’m using /usr/local/bin, but to keep things in order I renamed the console folder (extracted from structure_linux_console.tar.gz) to structure-console-2.3.3, because I like to know the tool and its version without having to browse into it. Then I moved it under /opt/biouno, where I keep the executables used by the cluster. Finally, I created the symlink /usr/local/bin/structure pointing to /opt/biouno/structure-console-2.3.3/structure.
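Assuming the tarball name above, the whole thing boils down to a handful of commands (the paths follow the layout just described):

```bash
# Decompress the console distribution and give the folder a versioned name.
tar xzf structure_linux_console.tar.gz
mv console structure-console-2.3.3

# Keep cluster executables under /opt/biouno and expose structure on the PATH.
sudo mv structure-console-2.3.3 /opt/biouno/
sudo ln -s /opt/biouno/structure-console-2.3.3/structure /usr/local/bin/structure
```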

Creating a PBS/MPI cluster for bioinformatics – Part 2

Apr 29, 2012 in jenkins, bioinformatics | tutorials

In the previous post of this series we saw how to configure a basic network for our small cluster. Now it is time to work on the MPI part. Our cluster will be a Beowulf cluster, a kind of cluster composed of commodity computers connected over the network, sharing resources and programs. In the next post we will see how to add a batch queuing system to control resource utilization in our cluster.

MPI is not a library; it is actually a standard. It is like REST for web applications: REST is only a standard, and there are different libraries that implement it (Jersey, JBoss RESTEasy, CodeIgniter + REST controllers, and so on). With MPI it is no different: there are several implementations of the MPI specification. We suggest you read this tutorial; although it was written in 2009, it uses Debian (the operating system we are using in the BioUno cluster) and is very well written and concise. Basically, you have to install OpenMPI, one of the existing MPI implementations. And if you followed the instructions in part 1 of this series, you already have SSH correctly configured on your machines.
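On Debian that boils down to something like the commands below; the package names are the stock Debian OpenMPI packages, and the hostnames master and slave1 are assumptions standing in for the machines configured in part 1:

```bash
# Install OpenMPI on every node of the cluster.
sudo apt-get install openmpi-bin libopenmpi-dev

# Tell MPI which nodes it can use (hostnames are the ones from part 1).
cat > ~/mpi_hostfile <<EOF
master slots=1
slave1 slots=1
EOF

# Quick smoke test: run hostname on both nodes through MPI.
mpirun --hostfile ~/mpi_hostfile -np 2 hostname
```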