BOSC 2012 and BioUno

A few weeks ago TupiLabs participated in BOSC 2012, with a talk about BioUno. BioUno is a project that applies Continuous Integration techniques and tools, in particular Jenkins, to Bioinformatics. In the very beginning, the project was a "hey, look what I can do with Jenkins". Later on we decided that we would like to create biology pipelines with Jenkins, using plug-ins. This was the topic of our talk at BOSC 2012: Creating biology pipelines with Jenkins. The main advantages of using Jenkins are its distributed environment with master and slaves, its plug-ins, and the documentation and community available.

In the talk, we showed a demo using vanilla Jenkins, MrBayes (with mrbayes-plugin), and R and ape for plotting (using r-plugin). It was a very simple pipeline, without any Cloud services or NGS tools. What was great about BOSC 2012 is that we could catch up with people who work on Galaxy, Taverna, Mobyle and other workflow management systems, as well as lads from other interesting projects, like CloudGene (a MapReduce interface for researchers, pretty cool). When we returned to Brazil, we had more items to include in the BioUno TODO list, but also a realization: BioUno is not only a biology workflow management system. Its role intersects with Galaxy, Taverna and Mobyle, as a workflow management system, but also with BioHPC, as a bioinformatics computer management system.

With Jenkins you can start and monitor slaves remotely (in your local network or in a cloud), execute parts of your build on one machine and serialize the results back to the master, display graphs, monitor usage and do other things that let you use Jenkins to create very customized pipelines. Sometimes a researcher has to use tools like stacks, samtools, structure, beast and so on. But sometimes they need a very specific routine, maybe for plotting something or adjusting the output of one tool before feeding it into the next tool in the pipeline. These routines are not always worth turning into a tool, as they would be used very rarely. With Jenkins, such a routine can still be part of the pipeline. Or a job may demand five computers. Common computing facilities would delegate machine provisioning to cluster management systems, like PBS, LSF or some cloud-based system. With Jenkins, you can manage your computers yourself, maybe even using Puppet to help you.

We have a long way ahead: we are reorganizing our servers at Linode to install JIRA and Confluence, and to have a more Jenkins-like web site (as this is the principal tool in BioUno). And we are still creating plug-ins. If you have any interest in the project, feel free to join us; your help will be very welcome :-)