mpiBLAST: Open-Source Parallel BLAST









 

mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. By efficiently utilizing distributed computational resources through database fragmentation, query segmentation, intelligent scheduling, and parallel I/O, mpiBLAST improves NCBI BLAST performance by several orders of magnitude while scaling to hundreds of processors. mpiBLAST is also portable across many different platforms and operating systems. Lastly, a renewed focus on the project and the consolidation of its many codebases have positioned mpiBLAST to remain highly useful to the bioinformatics community.

Key Features of mpiBLAST:
  • Database fragmentation

  • Query segmentation

  • Exact NCBI e-value scores

  • Increased per-query throughput

  • Improved query response time

  • Portable across all major operating systems

  • Integrated advanced job scheduling

  • Parallel input/output

  • Fault tolerant

  • High-performance on desktops, clusters, and HPC systems
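
The first two features, database fragmentation and query segmentation, can be illustrated with a small sketch. This is not mpiBLAST's code: the function names (`split_round_robin`, `make_work_units`), the round-robin partitioning, and the toy sequence and query labels are all assumptions chosen for illustration. It only shows the general idea that splitting the database into fragments and the query set into segments yields independent (segment, fragment) work units that a scheduler could hand to different processors.

```python
from itertools import product


def split_round_robin(items, parts):
    """Distribute items over `parts` buckets in round-robin order (illustrative choice)."""
    buckets = [[] for _ in range(parts)]
    for i, item in enumerate(items):
        buckets[i % parts].append(item)
    return buckets


def make_work_units(db_sequences, queries, n_fragments, n_segments):
    """Pair every query segment with every database fragment.

    Each (segment, fragment) pair is an independent search task, so the
    pairs can be assigned to different workers.
    """
    fragments = split_round_robin(db_sequences, n_fragments)  # database fragmentation
    segments = split_round_robin(queries, n_segments)         # query segmentation
    return list(product(segments, fragments))


# Hypothetical toy data: five database sequences and three queries.
units = make_work_units(
    ["s1", "s2", "s3", "s4", "s5"],
    ["q1", "q2", "q3"],
    n_fragments=2,
    n_segments=3,
)

for segment, fragment in units:
    print("search", segment, "against", fragment)
```

Every query is still compared against every database sequence exactly once; the work is merely regrouped into smaller pieces. A real parallel search must also merge the per-fragment hits for each query, which this sketch leaves out.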

Recent News:


mpiBLAST-1.5.0-PIO Released 2008-01-17
After almost a year in development, we are pleased to release version 1.5.0-PIO for production use. Although labeled "PIO", this version runs on both parallel (e.g. PVFS) and serial (e.g. NFS) file systems, achieving significant performance improvements over prior versions of mpiBLAST. Additionally, this release can quickly generate exact e-value scores for large query sets. Also of interest, this version was used in our winning entry for the SC|07 Storage Challenge. Get mpiBLAST-1.5.0-PIO here.

mpiBLAST Turns 5 Years Old 2007-12-31
Today is an important day for the mpiBLAST project as it marks 5 years since the initial 1.0 release back in 2002. The mpiBLAST project would like to thank all its users and developers for helping make mpiBLAST what it is today: one of the most popular parallel BLAST applications in use and the de facto standard against which all parallel BLAST applications are compared.

WINNER SC|07 Storage Challenge: ParaMEDIC Environment for mpiBLAST 2007-11-15
The collaboration between Virginia Tech (mpiBLAST), Argonne National Laboratory (MPICH2), and North Carolina State University (mpiBLAST-PIO) was chosen as the Winner of the SC|07 Storage Challenge. ParaMEDIC used 12,000 cores, performed 256 Trillion searches, and generated 1 Petabyte of data. The winning announcement is here: SC07 Award Winners

ParaMEDIC Enables Worldwide Supercomputer for Bioinformatics 2007-11-08
Utilizing the combined resources of five supercomputer centers distributed across the continental United States and a single high-performance storage center more than 10,000 kilometers away in Tokyo, Japan, a worldwide supercomputer to benefit genomics performed more than 256 Trillion searches and generated 1 Petabyte of data. Newswise reports that this high-performance worldwide supercomputer and ParaMEDIC, a general software-based framework for large-scale distributed computing developed by Argonne National Laboratory (ANL) and Virginia Tech, will have a significant impact on the study of genomics.

mpiBLAST-2.0 presented at the Microsoft eScience Workshop 2007 at RENCI 2007-10-21
As part of the poster session highlighting novel research, mpiBLAST was presented at the Microsoft eScience Workshop 2007 at RENCI. Showcasing the novel mixin layers software architecture, the poster was well received and fostered future collaborations with large medical and bioinformatics institutions. The poster is available for download on the Publications page.

Software Architecture of mpiBLAST-2.0 presented at IEEE ICSM 2007-10-02
The software engineering methodology and analysis of mpiBLAST-2.0 were presented at the 23rd IEEE International Conference on Software Maintenance (ICSM 2007). The presentation was well received and catalyzed several future collaborations. The paper and presentation are available here.

SC|07 Storage Challenge Finalist: ParaMEDIC Environment for mpiBLAST 2007-08-28
MPICH2 (Argonne National Laboratory) and mpiBLAST (Virginia Tech) collaborated using the ParaMEDIC framework to land a finalist slot in the SC|07 Storage Challenge. ParaMEDIC, short for Parallel Meta-data Environment for Distributed I/O and Computing, accelerates the I/O in mpiBLAST by as much as 25-fold in a distributed I/O and computing environment. For additional information, see the SC07 entry here.

Next-generation mpiBLAST framework presented at IEEE EMBC 2007-08-23
A high-level overview of a new pluggable software architecture for mpiBLAST was presented at the International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2007) in Lyon, France. The publication can be found here.

mpiBLAST project and website rejuvenated 2007-07-07
Just as the website sports a newly revamped look, the mpiBLAST project is also undergoing a renaissance. For the first time since its inception, several developers are co-located, which is spurring development. Furthermore, a new development model is emerging, one that more closely resembles the successful open-source models of established projects.

Massively Parallel Sequence Search presented at ACM-CF 2007-05-07
"Parallel Genomic Sequence-Search on a Massively Parallel System" is presented by Virginia Tech, IBM, and North Carolina State University at the ACM International Conference on Computing Frontiers. This paper describes a novel approach to executing mpiBLAST-PIO on IBM BlueGene/L that improves scalability to 8192 nodes (with 2 processors per node). The modifications will either be released as a patch or incorporated into mpiBLAST-1.5.0-PIO shortly.

GreenGene Ad-hoc Grid presented at SC|06 2006-11-15
Virginia Tech, the University of Utah, and North Carolina State University present the following Best Paper Nominee at SC06: "Parallel Genomic Sequence-Searching on an Ad-Hoc Grid: Experiences, Lessons Learned, and Implications". The paper describes a multi-disciplinary effort from SC05 that used 458 lines of Perl code to form an ad-hoc grid to run mpiBLAST-1.4.0-PIO. The "GreenGene" ad-hoc grid used over 3000 CPU cores to compare the NT database against itself.

 
 