Module:Bio::Index::Blast

This module indexes BLAST reports (not an NCBI BLAST or WU-BLAST indexed sequence file), giving random access to the report for a particular sequence in a database of reports.

Setting Up the BioPerl Indices (Bio::Index::*)

If you want to use BioPerl indices of FASTA, EMBL, Swissprot .dat, SwissPfam, GenBank, or BLAST files, the bp_fetch.pl and bp_index.pl scripts (named bp_fetch.PLS and bp_index.PLS in the source distribution) are a good place to start. Reading the scripts also shows how to use the BioPerl indexing modules. You can find both scripts in the scripts/index directory (see Bioperl scripts for a complete list of BioPerl scripts).

bp_fetch.pl and bp_index.pl coordinate through two environment variables:

  • BIOPERL_INDEX - directory where the indices are kept
  • BIOPERL_INDEX_TYPE - type of DBM file to use for the index (see AnyDBM_File)

For example, in csh-style shells (e.g. tcsh):

  setenv BIOPERL_INDEX /nfs/datadisk/bioperlindex/
  setenv BIOPERL_INDEX_TYPE SDBM_File

Or in sh-style shells (e.g. bash):

  export BIOPERL_INDEX=/nfs/datadisk/bioperlindex/
  export BIOPERL_INDEX_TYPE=SDBM_File
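
The sh-style setup above can be wrapped in a small snippet for a shell startup file. This is only a sketch: the default directory $HOME/bioperlindex is a hypothetical location chosen for illustration, and creating the directory up front assumes that it should exist before any index is built.

```shell
# Keep any values already set in the environment; otherwise fall back to
# defaults. $HOME/bioperlindex is a hypothetical location, not a BioPerl default.
: "${BIOPERL_INDEX:=$HOME/bioperlindex}"
: "${BIOPERL_INDEX_TYPE:=SDBM_File}"
export BIOPERL_INDEX BIOPERL_INDEX_TYPE

# Assumption: the index directory should exist before indexing starts.
mkdir -p "$BIOPERL_INDEX"
```

Because the `:=` form only assigns when a variable is unset or empty, the snippet can be sourced repeatedly without overriding a site-specific setting.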

Once BIOPERL_INDEX has been set up, the basic way to index a database is to run

 bp_index.pl <index-name> <filenames as full path>

For example, for FASTA files:

 bp_index.pl est /nfs/somewhere/fastafiles/est*.fa

Or, for EMBL/Swissprot files:

 bp_index.pl -fmt=EMBL swiss /nfs/somewhere/swiss/swissprot.dat
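
Since the index is keyed on BIOPERL_INDEX and the usage line asks for full paths, a thin wrapper can check both before calling the script. The following is a sketch, not part of BioPerl: the function name index_files and the BP_INDEX override variable are hypothetical, and the wrapper only covers the default-format invocation (no -fmt option).

```shell
# index_files <index-name> <file>...
# Hypothetical helper: refuses to run unless BIOPERL_INDEX is set and every
# file is given as a full (absolute) path, then hands off to bp_index.pl.
index_files() {
    index_name=$1
    shift
    if [ -z "${BIOPERL_INDEX:-}" ]; then
        echo "BIOPERL_INDEX is not set" >&2
        return 1
    fi
    for f in "$@"; do
        case $f in
            /*) ;;                                   # absolute path, accepted
            *) echo "not a full path: $f" >&2
               return 1 ;;
        esac
    done
    # BP_INDEX is a hypothetical override, useful for dry runs (BP_INDEX=echo).
    "${BP_INDEX:-bp_index.pl}" "$index_name" "$@"
}
```

With this in place, `index_files est /nfs/somewhere/fastafiles/est*.fa` behaves like the first example above, while a relative file name is rejected before anything is indexed.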

Retrieving Sequences

To retrieve sequences from the index, use:

 bp_fetch.pl <index-name>:<id>

For example:

 bp_fetch.pl est:AA01234

or

 bp_fetch.pl swiss:VAV_HUMAN
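
To retrieve several entries from the same index, the command can be looped over a list of identifiers. This is a minimal sketch: fetch_ids and the BP_FETCH override variable are hypothetical names, and it assumes that bp_fetch.pl writes the entry to standard output and exits with a non-zero status when an identifier cannot be retrieved.

```shell
# fetch_ids <index-name> <id>...
# Hypothetical helper: calls bp_fetch.pl once per identifier and keeps going
# after a failure, reporting it on standard error.
fetch_ids() {
    index_name=$1
    shift
    status=0
    for id in "$@"; do
        # BP_FETCH is a hypothetical override for testing (e.g. BP_FETCH=echo).
        if ! "${BP_FETCH:-bp_fetch.pl}" "$index_name:$id"; then
            echo "could not fetch $index_name:$id" >&2
            status=1
        fi
    done
    return $status
}
```

For example, `fetch_ids est AA01234 AA01235 > subset.fa` would collect both entries in one file, with any failed identifiers listed on standard error.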

bp_fetch.pl also has other options to connect to GenBank across the network.

Other Modules

See Bio::Index::Fasta, Bio::Index::GenBank, Bio::Index::Blast, Bio::Index::Hmmer, Bio::Index::EMBL, Bio::Index::SwissPfam, and Bio::Index::Swissprot for more.

Bio::DB::Fasta also provides flat-file indexing of FASTA files, along with some functionality absent from Bio::Index::Fasta.
