SilkDB

BLAST2: with gaps (Altschul, Madden, Schaeffer, Zhang, Miller, Lipman)



your e-mail

Blast program



Sequence File : please enter either :
  1. the name of a file:

    Please input the sequence in Fasta format or silkworm Gene ID.

    Note: If you input a Gene ID, the line must start with "BmID:", and each gene name must end with "-TA" to indicate a nucleotide sequence or "-PA" to indicate a protein sequence.

    For example, BmID:BGIBMGA012615-PA,BGIBMGA012616-PA,BGIBMGA012617-PA

  2. or the actual data here:
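
The Gene ID convention noted under option 1 can be checked before submitting. The following is a minimal sketch, assuming IDs are separated by commas as in the example; the function name `parse_bmid_line` is hypothetical and not part of SilkDB.

```python
def parse_bmid_line(line):
    """Split a 'BmID:' line into (gene_id, kind) pairs.

    Assumption: IDs are comma-separated, as in the example on this page.
    """
    prefix = "BmID:"
    if not line.startswith(prefix):
        raise ValueError("a Gene ID line must start with 'BmID:'")
    results = []
    for gene_id in line[len(prefix):].split(","):
        gene_id = gene_id.strip()
        if gene_id.endswith("-TA"):
            kind = "nucleotide"   # -TA marks a nucleotide sequence
        elif gene_id.endswith("-PA"):
            kind = "protein"      # -PA marks a protein sequence
        else:
            raise ValueError(f"{gene_id!r} lacks a -TA or -PA suffix")
        results.append((gene_id, kind))
    return results


for gene_id, kind in parse_bmid_line("BmID:BGIBMGA012615-PA,BGIBMGA012616-PA"):
    print(gene_id, kind)
```

A protein ID list like this one would normally be paired with blastp or tblastn, and a "-TA" list with blastn, blastx or tblastx.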




Start of required region in query sequence (-L)

End of required region in query sequence (-L)

protein db

nucleotide db

Filtering and masking options

Selectivity options

Scoring options

Translation options

Report options

Other Options


Filtering and masking options

Filter query sequence (DUST with blastn, SEG with others) (-F)

Filtering options (-F must be true)

Use lower case filtering (-U)



[Return to the main part with your favorite browser's Back function]


Selectivity options

Expect: upper bound on the expected frequency of chance occurrence of a set of HSPs (-e)

Word Size (-W) (zero invokes default behavior)

Multiple Hits window size (zero for single hit algorithm) (-A)

Threshold for extending hits (-f)

X dropoff for blast extension in bits (0.0 invokes default behavior) (-y)

Number of best hits from region to keep (-K)

Perform gapped alignment (not available with tblastx) (-g)

X dropoff value for gapped alignment (in bits) (-X)

X dropoff value for final alignment (in bits) (-Z)





Scoring options

Penalty for a nucleotide mismatch (blastn) (-q)

Reward for a nucleotide match (blastn) (-r)

Matrix (-M)

Cost to open a gap (-G)

Cost to extend a gap (-E)





Translation options

Query Genetic code to use (blastx) (-Q)

DB Genetic code (for tblast[nx] only) (-D)

Query strand to search against database (for blastn, blastx or tblastx) (-S): 1 = top, 2 = bottom, 3 = both





Report options

How many short descriptions? (-v)

How many alignments? (-b)

Alignment view options (not with blastx/tblastx) (-m)

Show GI's in deflines (only available for NCBI db such as nrprot) (-I)

SeqAlign file (-J option must be true) (-O)

Believe the query defline (-J)

Html output

HTML output options (html4blast)





HTML output options (html4blast)

Use external web sites for databases entries retrieval links (-e instead of -s)

Draw one HSP per line in image instead of putting all HSP in one line (-l)

Generate images names based on corresponding query (-q)





Other Options



Restrict search of database to GI's in file (-l) : please enter either :
  1. the name of a file:
  2. or the actual data here:





PSI-TBLASTN checkpoint file (-R) : please enter either :
  1. the name of a file:
  2. or the actual data here:









Some explanations about the options



Main parameters
Blast program
The five BLAST programs described here perform the following tasks:
. blastp compares an amino acid query sequence against a protein sequence database;
. blastn compares a nucleotide query sequence against a nucleotide sequence database;
. blastx compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database;
. tblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands);
. tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
In addition, psitblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands), using a position-specific matrix created by PSI-BLAST.
enter either the name of a file or the actual data
If you are using Netscape 2.x or later, you can select a file by typing its name or, better, by selecting it with the Netscape file browser (Browse button).
Alternatively, you can type your data in the text area, or cut and paste it from another application.
Do not supply both a file and pasted data.
protein db
Choose a protein db for blastp or blastx.
nucleotide db
Choose a nucleotide db for blastn, tblastn or tblastx.
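
The pairing of program, query type and database type described above can be summarised as a lookup table. This is only an illustrative sketch; the names `PROGRAMS` and `programs_for` are hypothetical and do not correspond to anything in the SilkDB interface.

```python
# program -> (query type, database type), as listed in the descriptions above
PROGRAMS = {
    "blastp": ("protein", "protein"),
    "blastn": ("nucleotide", "nucleotide"),
    "blastx": ("nucleotide", "protein"),
    "tblastn": ("protein", "nucleotide"),
    "tblastx": ("nucleotide", "nucleotide"),
    "psitblastn": ("protein", "nucleotide"),
}


def programs_for(query_type, db_type):
    """Return the programs that accept this query/database combination."""
    return sorted(
        name for name, pair in PROGRAMS.items() if pair == (query_type, db_type)
    )


print(programs_for("nucleotide", "protein"))
print(programs_for("protein", "nucleotide"))
```

Looking up the database type first makes it clear which of the two database menus (protein db or nucleotide db) applies to the chosen program.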


Report options
How many short descriptions? (-v)
Maximum number of database sequences for which one-line descriptions will be reported (-v).
How many alignments? (-b)
Maximum number of database sequences for which high-scoring segment pairs will be reported (-b).
Show GI's in deflines (only available for NCBI db such as nrprot) (-I)
Causes NCBI gi identifiers to be shown in the output, in addition to the accession and/or locus name.
Warning: only available for NCBI db such as nrprot.
SeqAlign file (-J option must be true) (-O)
SeqAlign is in ASN.1 format, so that it can be read with NCBI tools (such as sequin). This allows one to view the results in different formats.


Scoring options
Cost to open a gap (-G)
default is 5 for blastn, 10 for blastp, blastx and tblastn
Cost to extend a gap (-E)
default is 2 for blastn, 1 for blastp, blastx and tblastn
Only a limited set of gap existence and extension values is supported for blastp, blastx and tblastn. Some supported and suggested values are:
Existence   Extension
10          1
10          2
11          1
8           2
9           2
(source: NCBI Blast page)
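
As a rough illustration of how the defaults and the table above fit together, the sketch below fills in missing values with the stated defaults and reports whether a pair appears in the table. The names are hypothetical, and because the table lists only "some" supported values, a pair missing from it is not necessarily rejected by BLAST.

```python
# Defaults as stated above: (cost to open a gap, cost to extend a gap)
DEFAULT_GAP_COSTS = {
    "blastn": (5, 2),
    "blastp": (10, 1),
    "blastx": (10, 1),
    "tblastn": (10, 1),
}

# (existence, extension) pairs listed in the table above; not exhaustive
LISTED_GAP_COSTS = {(10, 1), (10, 2), (11, 1), (8, 2), (9, 2)}


def gap_costs(program, existence=None, extension=None):
    """Return the (-G, -E) pair, using the program default for omitted values."""
    default_open, default_extend = DEFAULT_GAP_COSTS[program]
    return (
        default_open if existence is None else existence,
        default_extend if extension is None else extension,
    )


def is_listed(pair):
    """True if the pair is one of the suggested values in the table."""
    return pair in LISTED_GAP_COSTS


pair = gap_costs("blastp", existence=11)
print(pair, is_listed(pair))
```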


Selectivity options
The programs blastn and blastp offer fully gapped alignments. blastx and tblastn have 'in-frame' gapped alignments and use sum statistics to link alignments from different frames. tblastx provides only ungapped alignments.
Expect: upper bound on the expected frequency of chance occurrence of a set of HSPs (-e)
The statistical significance threshold for reporting matches against database sequences; the default value is 10, such that 10 matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the statistical significance ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable.
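
The reporting rule can be pictured as a simple filter: a match is kept only if its E-value does not exceed the EXPECT threshold. The snippet is a sketch with made-up hits, not BLAST output; `reportable_hits` is a hypothetical name.

```python
def reportable_hits(hits, expect=10.0):
    """Keep hits whose E-value does not exceed the EXPECT threshold (-e)."""
    return [(name, evalue) for name, evalue in hits if evalue <= expect]


# Hypothetical hits as (subject name, E-value) pairs
hits = [("seqA", 1e-30), ("seqB", 0.5), ("seqC", 12.0)]

print(reportable_hits(hits))                # default threshold of 10
print(reportable_hits(hits, expect=0.001))  # more stringent threshold
```

Lowering the threshold shortens the report, which matches the statement that lower EXPECT values are more stringent.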
Threshold for extending hits (-f)
BLAST first seeks short word pairs whose aligned score reaches at least this value (default for blastp is 11). This is T in the NAR paper and in BLAST 1.4.
Number of best hits from region to keep (-K)
If this option is used a value of 100 is recommended.
X dropoff value for gapped alignment (in bits) (-X)
This value controls the region of the path graph explored by BLAST during a gapped extension (Xg in the NAR paper). The default for blastp is 15.




HTML output options (html4blast)
Use external web sites for databases entries retrieval links (-e instead of -s)
The -s option uses SRS for database entry retrieval links, whereas -e uses links to the original database sites.
Draw one HSP per line in image instead of putting all HSP in one line (-l)
Useful for genome searches, where there is only one sequence in the database.
Generate images names based on corresponding query (-q)
Useful when you only want to keep the image.


Filtering and masking options
BLAST 2.0 and 2.1 use the dust low-complexity filter for blastn and seg for the other programs. Both 'dust' and 'seg' are integral parts of the NCBI toolkit and are accessed automatically.
If one uses '-F T', normal filtering by seg (or by dust for blastn) occurs; likewise, '-F F' means no filtering whatsoever.
This option also takes a string as an argument. One may use such a string to change the specific parameters of seg or to invoke other filters. Please see 'Filtering options' (below) for details.
Filtering options (-F must be true)
The -F argument can take a string as input specifying that seg should be run with certain values or that other non-standard filters should be used.
A coiled-coil filter, based on the work of Lupas et al. (Science, vol 252, pp. 1162-4 (1991)) and written by John Kuzio (Wilson et al., J Gen Virol, vol. 76, pp. 2923-32 (1995)), may be invoked by specifying: -F 'C'
One may also run both seg and the coiled-coil filter together by using a ';': -F 'C;S'
Filtering by dust may also be specified by: -F 'D'
It is possible to specify that the masking should only be done during the process of building the initial words by starting the filtering command with 'm', e.g.: -F 'm S' which specifies that seg (with default arguments) should be used for masking, but that the masking should only be done when the words are being built.
If the -U option (to mask any lower-case sequence in the input FASTA file) is used and one does not wish any other filtering, but does wish to mask when building the lookup tables then one should specify: -F 'm'
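
The filtering strings described above follow a small pattern that can be sketched as a helper. The function name `filter_argument` is hypothetical, and combining the 'm' prefix with more than one filter is an assumption that the text above does not confirm.

```python
def filter_argument(filters=(), mask_words_only=False):
    """Build the value passed to -F from filter letters.

    'S' = seg, 'C' = coiled-coil, 'D' = dust, as described above.
    """
    allowed = {"S", "C", "D"}
    for letter in filters:
        if letter not in allowed:
            raise ValueError(f"unknown filter {letter!r}")
    body = ";".join(filters)  # e.g. 'C;S' runs both filters
    if mask_words_only:
        # 'm S' masks only while the words are built; a bare 'm' is the
        # form used with -U when no other filtering is wanted.
        # Assumption: 'm' followed by several filters (e.g. 'm C;S') is
        # not shown in the text above.
        return ("m " + body).strip()
    return body if body else "F"  # '-F F' means no filtering


print(filter_argument(["C", "S"]))
print(filter_argument(["S"], mask_words_only=True))
print(filter_argument(mask_words_only=True))
```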
Use lower case filtering (-U)
This option specifies that any lower-case letters in the input FASTA file should be masked.
Sequence format
The sequence will be automatically converted to the format needed by the program, provided you enter it either in plain (raw) sequence format or in one of the following known formats:
IG, GenBank, NBRF, EMBL, GCG, DNAStrider, Fitch, fasta, Phylip, PIR, MSF, ASN, PAUP, CLUSTALW
You may enter in the text area a database entry code, or an accession number, in this form:

database:entry_name

or:

database:accession.

References:

Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaeffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25:3389-3402.