Some explanations about the options
- Blast program
- The five BLAST programs described here perform the following tasks:
- . blastp compares an amino acid query sequence against a protein sequence database;
- . blastn compares a nucleotide query sequence against a nucleotide sequence database;
- . blastx compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database;
- . tblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands);
- . tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
- In addition to these five programs, psitblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands), using a position-specific matrix created by PSI-BLAST.
- Enter either the name of a file or the actual data, but not both.
- If you are using Netscape 2.x or later, you can select a file by typing its name or, better, by choosing it with the Netscape file browser (Browse button).
- Alternatively, you can type your data in the next area, or cut and paste it from another application.
- protein db
- Choose a protein db for blastp or blastx.
- nucleotide db
- Choose a nucleotide db for blastn, tblastn or tblastx.
- How many short descriptions? (-v)
- Maximum number of database sequences for which one-line descriptions will be reported (-v).
- How many alignments? (-b)
- Maximum number of database sequences for which high-scoring segment pairs will be reported (-b).
- Show GIs in deflines (only available for NCBI db such as nrprot) (-I)
- Causes NCBI gi identifiers to be shown in the output, in addition to the accession and/or locus name.
- Warning: only available for NCBI db such as nrprot.
- SeqAlign file (-J option must be true) (-O)
- SeqAlign is in ASN.1 format, so that it can be read with NCBI tools (such as sequin). This allows one to view the results in different formats.
- Cost to open a gap (-G)
- default is 5 for blastn, 10 for blastp, blastx and tblastn
- Cost to extend a gap (-E)
- default is 2 for blastn, 1 for blastp, blastx and tblastn
- Only a limited set of gap existence and extension values is supported for blastp, blastx and tblastn. Some supported and suggested values are:
- Existence Extension
- 10 1
- 10 2
- 11 1
- 8 2
- 9 2
- (source: NCBI Blast page)
- The programs blastn and blastp offer fully gapped alignments. blastx and tblastn have 'in-frame' gapped alignments and use sum statistics to link alignments from different frames. tblastx provides only ungapped alignments.
- Expect: upper bound on the expected frequency of chance occurrence of a set of HSPs (-e)
- The statistical significance threshold for reporting matches against database sequences; the default value is 10, such that 10 matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the statistical significance ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable.
- Threshold for extending hits (-f)
- BLAST first seeks short word pairs whose aligned score is at least this value (T in the NAR paper and in BLAST 1.4); the default for blastp is 11.
- Number of best hits from region to keep (-K)
- If this option is used a value of 100 is recommended.
- X dropoff value for gapped alignment (in bits) (-X)
- This value controls the region of the path graph explored by BLAST during a gapped extension (Xg in the NAR paper); the default for blastp is 15.
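As an illustration of how the options above fit together, here is a minimal Python sketch that checks the program against the database type and assembles a command line. It assumes the legacy NCBI blastall executable; the flags -p (program), -d (database) and -i (query file) are not described on this page and are taken from that assumption. The function name, the parameter names and the file name query.fasta are hypothetical. psitblastn is left out because its extra PSI-BLAST input is not covered here.

```python
# Query and database sequence types for each program, as listed above.
PROGRAMS = {
    "blastp": ("protein", "protein"),
    "blastn": ("nucleotide", "nucleotide"),
    "blastx": ("nucleotide", "protein"),
    "tblastn": ("protein", "nucleotide"),
    "tblastx": ("nucleotide", "nucleotide"),
}


def build_blastall_args(program, database, db_type, query_file,
                        descriptions=None, alignments=None,
                        gap_open=None, gap_extend=None,
                        expect=None, threshold=None, x_dropoff=None):
    """Return an argument list; options left as None are omitted so that
    the program's own defaults apply."""
    if program not in PROGRAMS:
        raise ValueError(f"unknown program: {program}")
    expected_db = PROGRAMS[program][1]
    if db_type != expected_db:
        raise ValueError(f"{program} needs a {expected_db} database")

    # Assumption: -p, -d and -i are the legacy blastall flags for the
    # program, the database and the query file.
    args = ["blastall", "-p", program, "-d", database, "-i", query_file]

    # Flags documented on this page, in a fixed order.
    optional = [
        ("-v", descriptions),  # one-line descriptions
        ("-b", alignments),    # alignments
        ("-G", gap_open),      # cost to open a gap
        ("-E", gap_extend),    # cost to extend a gap
        ("-e", expect),        # Expect threshold
        ("-f", threshold),     # threshold for extending hits
        ("-X", x_dropoff),     # X dropoff for gapped alignment (bits)
    ]
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]
    return args


if __name__ == "__main__":
    print(" ".join(build_blastall_args(
        "blastp", "nrprot", "protein", "query.fasta",
        expect=0.001, gap_open=11, gap_extend=1)))
```

Returning a list rather than a single string keeps each value as a separate argument, which avoids quoting problems if the list is later passed to a process launcher.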
HTML output options (html4blast)
- Use external web sites for database entry retrieval links (-e instead of -s)
- The -s option uses SRS for database entry retrieval links, whereas -e uses links to the original database sites.
- Draw one HSP per line in the image instead of putting all HSPs on one line (-l)
- Useful for genome searches, where there is only one sequence in the database.
- Generate image names based on the corresponding query (-q)
- Useful when you only want to keep the image.
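The three html4blast choices above map onto a small set of flags. The Python sketch below shows one way to derive them from the form's checkboxes. The function and parameter names are hypothetical, and passing -s explicitly when external links are not requested is an assumption. Input and output arguments of html4blast are not described here and are deliberately left out.

```python
def html4blast_flags(external_links=False, one_hsp_per_line=False,
                     query_based_names=False):
    """Return the html4blast flags matching the three options above."""
    # -e links to the original database sites, -s links through SRS.
    flags = ["-e" if external_links else "-s"]
    if one_hsp_per_line:
        flags.append("-l")  # one HSP per line in the image
    if query_based_names:
        flags.append("-q")  # image names based on the query
    return flags
```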
Filtering and masking options
- BLAST 2.0 and 2.1 use the dust low-complexity filter for blastn and seg for the other programs. Both 'dust' and 'seg' are integral parts of the NCBI toolkit and are accessed automatically.
- If one uses '-F T' then normal filtering by seg or dust (for blastn) occurs (likewise '-F F' means no filtering whatsoever).
- This option also takes a string as an argument. One may use such a string to change the specific parameters of seg or to invoke other filters. Please see 'Filtering options' (below) for details.
- Filtering options (-F must be true)
- The -F argument can take a string as input specifying that seg should be run with certain values or that other non-standard filters should be used.
- A coiled-coil filter, based on the work of Lupas et al. (Science, vol. 252, pp. 1162-4 (1991)) and written by John Kuzio (Wilson et al., J Gen Virol, vol. 76, pp. 2923-32 (1995)), may be invoked by specifying: -F 'C'
- One may also run both seg and the coiled-coil filter together by using a ';': -F 'C;S'
- Filtering by dust may also be specified by: -F 'D'
- It is possible to specify that the masking should only be done during the process of building the initial words by starting the filtering command with 'm', e.g.: -F 'm S' which specifies that seg (with default arguments) should be used for masking, but that the masking should only be done when the words are being built.
- If the -U option (to mask any lower-case sequence in the input FASTA file) is used and one does not wish any other filtering, but does wish to mask when building the lookup tables then one should specify: -F 'm'
- Use lower case filtering (-U)
- This option specifies that any lower-case letters in the input FASTA file should be masked.
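The filtering strings quoted above ('C', 'C;S', 'D', 'm S', 'm') follow a simple pattern, which the Python sketch below reproduces. The function and parameter names are hypothetical. Only the exact strings quoted above are confirmed by this page. Any other combination the function can produce, such as 'm' together with several filters, is an assumption that should be checked against the BLAST documentation before use.

```python
def filter_string(seg=False, coiled_coil=False, dust=False,
                  lookup_only=False):
    """Build a value for -F from the chosen filters.

    lookup_only corresponds to the leading 'm': masking is applied only
    while the initial words (lookup tables) are being built.
    """
    codes = []
    if coiled_coil:
        codes.append("C")
    if seg:
        codes.append("S")
    if dust:
        codes.append("D")
    body = ";".join(codes)  # filters are joined with ';', as in 'C;S'

    if lookup_only:
        # 'm S' with a filter, or a bare 'm' when only -U masking is wanted.
        return ("m " + body).strip()
    # With no filter selected, fall back to 'F' (no filtering whatsoever).
    return body if body else "F"
```

For example, filter_string(coiled_coil=True, seg=True) gives 'C;S', and filter_string(lookup_only=True) gives the bare 'm' intended for use with -U.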
- Sequence format
- The sequence will be automatically converted to the format needed by the program,
- provided you enter a sequence either:
- in plain (raw) sequence format or in one of the following known formats:
- You may enter in the text area a database entry code, or an accession number, in this form:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25:3389-3402.
The Institute of Sericulture and Systems Biology, Southwest University
Tel:+86-23-68250793 Address:216 Tiansheng Rd., Beibei District, Chongqing, China 400716