Frequently Asked Questions (FAQ) for mpiBLAST
- Does mpiBLAST support PSI-BLAST, PHI-BLAST, RPS-BLAST, etc.?
- Does mpiBLAST support Mega-BLAST?
- Can mpiBLAST run without local storage?
- Can mpiBLAST run without a shared filesystem?
- Can mpiBLAST be run on a single processor system for testing purposes?
- I benchmarked mpiBLAST but I don't see super-linear speedup! Why?!
- Does mpiBLAST run on Mac OS X?
- How do I compile mpiBLAST from SVN?
- How do I format a huge database?
- How accurate are the E-value statistics?
Does mpiBLAST support PSI-BLAST, PHI-BLAST, RPS-BLAST, etc.?
No. Although it may be possible to parallelize these search algorithms using database segmentation, our preliminary studies indicate that they would benefit less from that scheme than the other BLAST search types do.
Does mpiBLAST support Mega-BLAST?
No. We are focusing our efforts on blastn, blastp, blastx, tblastn, and tblastx.
Can mpiBLAST run without local storage?
Yes. On systems without local storage, turn on the use-virtual-frags option for better performance.
Can mpiBLAST be run on a single processor system for testing purposes?
Yes. Start the desired number of MPI processes on the one machine using the -np flag of your MPI launcher (for example, mpirun). At least three processes are required, so the minimum is -np 3.
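As a rough sketch of such a test run, the helper below assembles a single-machine command line and rejects process counts below three. The function name build_mpiblast_cmd, the mpirun launcher, and the database and file names (nt, query.fas, results.txt) are placeholders assumed for illustration; the command is printed rather than executed so it can be inspected first.

```shell
# Hypothetical helper: print (do not run) an mpiBLAST test command.
# Assumes an MPI launcher named mpirun; nt, query.fas and results.txt
# are placeholder names, not files shipped with mpiBLAST.
build_mpiblast_cmd() {
  np="$1"
  # The FAQ states that -np 3 is the minimum.
  if [ "$np" -lt 3 ]; then
    echo "mpiBLAST needs at least 3 MPI processes (got $np)" >&2
    return 1
  fi
  echo "mpirun -np $np mpiblast -p blastn -d nt -i query.fas -o results.txt"
}

build_mpiblast_cmd 3
```

Once the printed command looks right, it can be copied to the prompt and run as is.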
I benchmarked mpiBLAST but I don't see super-linear speedup! Why?!
mpiBLAST yields super-linear speedup only when the database being searched is significantly larger than the core memory of an individual node. The super-linear speedup results published in the ClusterWorld 2003 paper describing mpiBLAST are measurements of mpiBLAST v0.9 searching a 1.2GB (compressed) database on a cluster where each node has 640MB of RAM. Because that database does not fit in one node's memory, a single-node search results in heavy disk I/O and a long search time. If your database already fits in the memory of a single node, there is no such I/O penalty to remove.
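The condition can be checked roughly before benchmarking. The sketch below compares a database size with the memory of one node, both given in whole megabytes. The function name and the plain greater-than test are assumptions for illustration (the answer above only says "significantly larger"), and the sample figures are the ones quoted above, with 1.2GB rounded to 1200MB.

```shell
# Hypothetical check: does the database exceed one node's RAM?
# Both arguments are whole megabytes. A plain comparison is used here;
# how much larger counts as "significantly larger" is left to the reader.
db_exceeds_node_ram() {
  db_mb="$1"
  ram_mb="$2"
  [ "$db_mb" -gt "$ram_mb" ]
}

# Figures from the measurement quoted above: about 1200MB of (compressed)
# database against 640MB of RAM per node.
if db_exceeds_node_ram 1200 640; then
  echo "database exceeds node memory: super-linear speedup is plausible"
else
  echo "database fits in node memory: super-linear speedup is unlikely"
fi
```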
Does mpiBLAST run on Mac OS X?
Yes, mpiBLAST versions 1.3.0 or later support Mac OS X.
How do I compile mpiBLAST from SVN?
Please see the instructions on the development page.
How do I format a huge database?
Large databases like nt can consume several gigabytes of disk space and it is preferable to store them in compressed form. Starting with mpiBLAST 1.4.0 it is possible to pipe FastA formatted sequence data into mpiformatdb. This feature provides the ability to directly format a compressed (gzip/bzip etc.) database using command line syntax like:
zcat nt.gz | mpiformatdb -i stdin -N 100 -t nt -p F
mpiformatdb needs the -t <title> and -p <T|F> options to format a database piped via standard input.
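Since the compressed input may be gzip or bzip, a small wrapper can choose the decompression command before piping into mpiformatdb. The helper below is a sketch only: the function name decompressor_for and the extension-to-tool mapping are assumptions for illustration, and the mpiformatdb options are the ones from the example above.

```shell
# Hypothetical helper: pick a decompression command from the file extension.
decompressor_for() {
  case "$1" in
    *.gz)  echo "zcat" ;;
    *.bz2) echo "bzcat" ;;
    *)     echo "cat" ;;   # assume the file is already uncompressed FastA
  esac
}

# Intended use (requires mpiformatdb from mpiBLAST 1.4.0 or later on the PATH):
#   $(decompressor_for nt.bz2) nt.bz2 | mpiformatdb -i stdin -N 100 -t nt -p F
decompressor_for nt.bz2
```

Streaming the data this way means the uncompressed database never has to be written to disk before formatting.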
How accurate are the E-value statistics?
In mpiBLAST 1.3 or later, E-values are exact for all supported search types. In versions 1.2.1 and earlier, E-values for blastn were loosely approximated using a linear equation, and those for blastp, blastx, tblastn, and tblastx were inaccurate. Note that by "exact" we mean exactly the same as those generated by NCBI-BLAST with the traditional search engine. As of 2009, NCBI is still refining the E-value calculations in its BLAST implementation.