mpiBLAST: Open-Source Parallel BLAST


User's Guide

In order to perform a search with mpiBLAST, the target BLAST database must first be formatted and segmented using mpiformatdb. Then, mpiexec can be used to execute mpiblast in parallel on several cluster nodes.

Formatting a database

Before mpiBLAST can process queries, the sequence database must be formatted with mpiformatdb. The command line syntax looks like this:
mpiformatdb -N 25 -i nt -o T -p F

The above command formats the nt database into 25 fragments, ideally one for each of 25 worker nodes. Because nt is a nucleotide database, -p F is given; formatdb assumes protein input by default. mpiformatdb accepts the same command line options as NCBI's formatdb. See the README.formatdb file that comes with the NCBI BLAST distribution for more details.

mpiformatdb reads the ~/.ncbirc file and creates the formatted database fragments in the shared storage directory.
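
As a rough illustration of what that configuration might contain, the sketch below writes a sample file to a scratch location rather than to ~/.ncbirc. The [mpiBLAST] section with Shared and Local keys is an assumption based on the mpiBLAST installation notes, and both paths and the ncbirc.example filename are hypothetical placeholders; check them against the documentation for your installed version.

```shell
# Write a sample configuration to a scratch file (not to ~/.ncbirc).
# Assumption: mpiBLAST reads a [mpiBLAST] section with Shared and Local keys.
# Both paths below are hypothetical placeholders.
cat > ncbirc.example <<'EOF'
[mpiBLAST]
Shared=/path/to/shared/storage
Local=/path/to/local/storage
EOF

# Show which directories the sample file points at.
grep -E '^(Shared|Local)=' ncbirc.example
```

Here Shared would be the directory visible to every node, where mpiformatdb places the fragments, and Local would be each node's own scratch directory.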

Querying the database

The mpiblast command line syntax is nearly identical to that of NCBI's blastall program. See the README.bls file included in the BLAST distribution for details. Running a query with 25 worker processes (27 processes in total) would look like:
mpiexec -n 27 mpiblast -p blastn -d nt -i blast_query.fas -o blast_results.txt

The above command queries the sequences in blast_query.fas against the nt database and writes the results to the blast_results.txt file in the current working directory. By default, mpiBLAST reads configuration information from ~/.ncbirc. An optional --config-file argument can be used to specify a different configuration file for a particular run. mpiBLAST needs at least 3 processes to perform a search: one process writes the output, another schedules search tasks, and every remaining process is a worker that performs the searches. This is why the 25-worker example above requests 27 processes.
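
The process-count rule can be captured in a small helper. This is a minimal sketch, not part of mpiBLAST: mpiblast_nprocs is a hypothetical function name, and it only encodes the rule stated above (workers plus one output process plus one scheduler).

```shell
# Hypothetical helper: print the value to pass to mpiexec -n for a given
# number of worker processes (workers + 1 output writer + 1 scheduler).
mpiblast_nprocs() {
    workers=$1
    if [ "$workers" -lt 1 ]; then
        echo "need at least 1 worker" >&2
        return 1
    fi
    echo $((workers + 2))
}

# Print (rather than run) the command line for a 25-worker search.
echo "mpiexec -n $(mpiblast_nprocs 25) mpiblast -p blastn -d nt -i blast_query.fas -o blast_results.txt"
```

Printing the command instead of executing it keeps the sketch safe to try on a machine where mpiexec is not installed.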

Extra options to mpiblast

Please refer to the README file in the PIO package for extra options to mpiBLAST.

Removing a database

The --removedb command line option causes mpiBLAST to do all of its work in a temporary directory, which is removed from each node's local storage directory upon successful termination. For example:
mpiexec -n 16 mpiblast -p blastx -d yeast.aa -i ech_10k.fas -o results.txt --removedb

The above command performs a 16-process (14-worker) search of the yeast.aa database, writing the output to results.txt. Upon completion, the worker nodes delete the yeast.aa database fragments from their local storage.

Databases can also be removed without performing a search in the following manner:
mpiexec -n 16 mpiblast_cleanup

 
 