blastncbi

Generate a remote BLAST request

Syntax

blastncbi(Seq, Program, 'PropertyName', PropertyValue...)
RID = blastncbi(Seq, Program)
[RID, RTOE] = blastncbi(Seq, Program)

blastncbi(..., 'Database', DatabaseValue)
blastncbi(..., 'Descriptions', DescriptionsValue)
blastncbi(..., 'Alignments', AlignmentsValue)
blastncbi(..., 'Filter', FilterValue)
blastncbi(..., 'Expect', ExpectValue)
blastncbi(..., 'Word', WordValue)
blastncbi(..., 'Matrix', MatrixValue)
blastncbi(..., 'GapOpen', GapOpenValue)
blastncbi(..., 'ExtendGap', ExtendGapValue)
blastncbi(..., 'Inclusion', InclusionValue)
blastncbi(..., 'Pct', PctValue)

Arguments

Seq

Nucleotide or amino acid sequence. Enter a GenBank or RefSeq accession number, GI, FASTA file, URL, string, character array, or a MATLAB structure that contains the field Sequence.

Program

BLAST program. Enter 'blastn', 'blastp', 'psiblast', 'blastx', 'tblastn', 'tblastx', or 'megablast'.

Database

Property to select a database. Compatible databases depend upon the type of sequence submitted and program selected. The nonredundant database, 'nr', is the default value for both nucleotide and amino acid sequences.

For nucleotide sequences, enter 'nr', 'est', 'est_human', 'est_mouse', 'est_others', 'gss', 'htgs', 'pat', 'pdb', 'month', 'alu_repeats', 'dbsts', 'chromosome', or 'wgs'. The default value is 'nr'.

For amino acid sequences, enter 'nr', 'swissprot', 'pat', 'pdb', or 'month'. The default value is 'nr'.

Descriptions

Property to specify the number of short descriptions. The default value is normally 100, and for Program = 'psiblast', the default value is 500.

Alignments

Property to specify the number of sequences for which high-scoring segment pairs (HSPs) are reported. The default value is normally 100, and for Program = 'psiblast', the default value is 500.

Filter

Property to select a filter. Enter 'L' (low-complexity), 'R' (human repeats), 'm' (mask for lookup table), or 'lcase' (to turn on the lowercase mask). The default value is 'L'.

Expect

Property to select the statistical significance threshold. Enter a real number. The default value is 10.

Word

Property to select a word length. For amino acid sequences, Word can be 2 or 3 (3 is the default value), and for nucleotide sequences, Word can be 7, 11, or 15 (11 is the default value). If Program = 'megablast', Word can be 11, 12, 16, 20, 24, 28, 32, 48, or 64, with a default value of 28.
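
As an illustration only, the word sizes listed above can be checked locally before a request is submitted. The sketch below is not part of blastncbi: the names allowedWord and isValidWord are hypothetical, and the lists are copied from the Word description.

```matlab
% Word sizes copied from the Word description above.
% The field names are illustrative labels, not blastncbi program names.
allowedWord = struct( ...
    'protein',    [2 3], ...
    'nucleotide', [7 11 15], ...
    'megablast',  [11 12 16 20 24 28 32 48 64]);

% Hypothetical helper: true when w is a listed word size for the given kind.
isValidWord = @(kind, w) ismember(w, allowedWord.(kind));

okNucleotide = isValidWord('nucleotide', 11);  % 11 is listed for nucleotides
okProtein    = isValidWord('protein', 7);      % 7 is not listed for proteins
```

A check like this only mirrors the documented lists; the NCBI server remains the authority on which values it accepts.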

Matrix

Property to select a substitution matrix for amino acid sequences. Enter 'PAM30', 'PAM70', 'BLOSUM80', 'BLOSUM62', or 'BLOSUM45'. The default value is 'BLOSUM62'.

Inclusion

Property for PSI-BLAST searches to define the statistical significance threshold. The default value is 0.005.

Pct

Property to select the percent identity. Enter None, 99, 98, 95, 90, 85, 80, 75, or 60. Match and mismatch scores are automatically selected. The default value is 99 (percent identity 99, match score 1, mismatch score -3).

Description

The Basic Local Alignment Search Tool (BLAST) offers a fast and powerful comparative analysis of interesting protein and nucleotide sequences against known structures in existing online databases.

blastncbi(Seq, Program) sends a BLAST request against a sequence (Seq) to NCBI using a specified program (Program).
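
The two-output form from the Syntax section might be used as in the following sketch. It assumes an Internet connection to NCBI and reuses the accession number from the Examples section. This page does not describe RTOE, so the comment about it is an assumption rather than documented behavior.

```matlab
% Protein accession number taken from the Examples section of this page.
seq = 'AAA59174';

% Submit the request and keep both outputs.
% RID identifies the request at NCBI.
% ASSUMPTION: RTOE is NCBI's estimate of how long the search will take.
[RID, RTOE] = blastncbi(seq, 'blastp');
disp(RID)
disp(RTOE)

% Parse the report once it is available. The report may not be ready
% immediately after submission; wait and try again if this call fails.
report = blastread(RID);
```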

blastncbi uses the NCBI default values for the optional arguments: 'nr' for the database, 'L' for the filter, and 10 for the expectation threshold. The default values for the remaining optional arguments depend on which program is used. For help in selecting an appropriate BLAST program, visit

http://www.ncbi.nlm.nih.gov/BLAST/producttable.shtml

Information for all of the optional parameters can be found at

http://www.ncbi.nlm.nih.gov/blast/html/blastcgihelp.html

blastncbi(..., 'Database', DatabaseValue) selects a database for the alignment search.

blastncbi(..., 'Descriptions', DescriptionsValue), when the function is called without output arguments, specifies the number of short descriptions returned.

blastncbi(..., 'Alignments', AlignmentsValue), when the function is called without output arguments, specifies the number of sequences for which high-scoring segment pairs (HSPs) are reported.

blastncbi(..., 'Filter', FilterValue) selects the filter to be applied to the query sequence.

blastncbi(..., 'Expect', ExpectValue) provides a statistical significance threshold for matches against database sequences. You can learn more about the statistics of local sequence comparison at

http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html#head2

blastncbi(..., 'Word', WordValue) selects a word size. The allowed values depend on the sequence type and program, as listed under Word in the Arguments section.

blastncbi(..., 'Matrix', MatrixValue) selects the substitution matrix for amino acid sequences only. This matrix assigns the score for a possible alignment of two amino acid residues.

blastncbi(..., 'GapOpen', GapOpenValue) selects a gap penalty for amino acid sequences. Allowable values for a gap penalty vary with the selected substitution matrix. For information about allowed gap penalties for matrices other than the BLOSUM62 matrix, see

http://www.ncbi.nlm.nih.gov/blast/html/blastcgihelp.html

blastncbi(..., 'ExtendGap', ExtendGapValue) defines the penalty for extending a gap by more than one position.
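
The matrix and gap properties are usually set together, because the allowed gap penalties depend on the matrix. The following is a minimal sketch, assuming that an opening penalty of 9 with an extension penalty of 2 is an accepted pair for BLOSUM62; confirm the pair against the NCBI help page above before relying on it. The variable name opts is illustrative.

```matlab
% Collect the property/value pairs in a cell array so they are easy to reuse.
% ASSUMPTION: GapOpen = 9 with ExtendGap = 2 is an accepted pair for BLOSUM62.
opts = {'Matrix', 'BLOSUM62', ...
        'GapOpen', 9, ...
        'ExtendGap', 2, ...
        'Expect', 1e-10};

% Accession number taken from the Examples section of this page.
RID = blastncbi('AAA59174', 'blastp', opts{:});
```

Keeping the pairs in one cell array makes it harder to change the matrix and forget to update the gap penalties that go with it.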

blastncbi(..., 'Inclusion', InclusionValue), for PSI-BLAST only, defines the statistical significance threshold (InclusionValue) for including a sequence in the position-specific scoring matrix (PSSM) created by PSI-BLAST for the subsequent iteration. The default value is 0.005.
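
For example, a PSI-BLAST request with a stricter inclusion threshold than the default might look like the sketch below. The value 0.001 is an arbitrary illustration, and the program name 'psiblast' is assumed to be the one that selects PSI-BLAST.

```matlab
% A smaller threshold admits fewer sequences into the PSSM than the
% default of 0.005. The value 0.001 is illustrative, not a recommendation.
inclusionThreshold = 0.001;

% Accession number taken from the Examples section of this page.
RID = blastncbi('AAA59174', 'psiblast', 'Inclusion', inclusionThreshold);
```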

blastncbi(..., 'Pct', PctValue), when Program is 'megablast', selects the percent identity and the corresponding match and mismatch scores for matching existing sequences in a public database.
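
A MegaBLAST request that sets both the percent identity and the word size could be sketched as follows. The nucleotide string is made up for illustration and is not a real database sequence; substitute your own sequence or accession number.

```matlab
% Made-up 60-nucleotide query, for illustration only.
ntSeq = ['ATGGCGTACGTTAGCCTAGGCTTAACCGGA' ...
         'TCGATCGGCTAAGCTTGCATGCCTGCAGGT'];

% 95 and 16 are taken from the lists of allowed Pct and Word values
% for 'megablast' in the Arguments section.
RID = blastncbi(ntSeq, 'megablast', 'Pct', 95, 'Word', 16);
```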

Examples

% Get a sequence from the Protein Data Bank and create 
% a MATLAB structure 
S = getpdb('1CIV')

% Use the structure as input for a BLAST search with an
% expectation of 1e-10.
blastncbi(S,'blastp','expect',1e-10)

% Click the URL link (Link to NCBI BLAST Request) to go
% directly to the NCBI request.

% You can also try a search directly with an accession 
% number and an alternative scoring matrix.
RID = blastncbi('AAA59174','blastp','matrix','PAM70',...
                             'expect',1e-10)

% The results based on the RID are at
% http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi

% or pass the RID to BLASTREAD to parse the report and
% load it into a MATLAB structure.
blastread(RID)

See Also

Bioinformatics Toolbox function blastread


© 1994-2005 The MathWorks, Inc.