seqlogo

Display sequence logo for nucleotide and amino acid sequences

Syntax

seqlogo(Seqs, 'PropertyName', PropertyValue ...)
DisplayInfo = seqlogo(Seqs)
DisplayInfo = seqlogo(..., 'Displaylogo', DisplaylogoValue)
seqlogo(..., 'Alphabet', AlphabetValue)
seqlogo(..., 'Startat', StartatValue)
seqlogo(..., 'Endat', EndatValue)
seqlogo(..., 'SSCorrection', SSCorrectionValue)

Arguments

Seqs           Set of pairwise or multiply aligned amino acid or nucleotide sequences. Enter an array of strings, a cell array of strings, or an array of structures with the field Sequence.
Displaylogo    Property to control drawing a sequence logo. Enter either true or false.

Description

seqlogo(Seqs, 'PropertyName', PropertyValue ...) displays a sequence logo for a set of aligned sequences (Seqs). The logo graphically displays the sequence conservation, measured in bits, at each position in the alignment. The maximum sequence conservation per site is log2(4) = 2 bits for nucleotide sequences and log2(20), approximately 4.32 bits, for amino acid sequences.
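For reference, the conservation measure from Schneider and Stephens (1990), cited in the Reference section, can be sketched as follows. This is the uncorrected quantity only. The symbols s, f_{b,l}, H_l, and R_l are notation chosen here for illustration, not names used by seqlogo.

```latex
H_l = -\sum_{b=1}^{s} f_{b,l}\,\log_2 f_{b,l},
\qquad
R_l = \log_2 s - H_l
```

Here s is the alphabet size (4 for nucleotides, 20 for amino acids), f_{b,l} is the relative frequency of symbol b at position l, and terms with f_{b,l} = 0 are taken as 0. At a fully conserved position H_l = 0, so R_l reaches the maximum of log2(s) bits stated above.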

The alphabet for nucleic acids is colored as follows:

A       Green
C       Blue
G       Yellow
T, U    Red

The alphabet for proteins is colored according to chemical property as follows:

G S T Y C Q N (Polar) — Green
A V L I P W F M (Hydrophobic) — Orange
D E (Acidic) — Red
K R H (Basic) — Blue

Ambiguous symbols not in the list above are added to the logo and colored purple.

DisplayInfo = seqlogo(Seqs) returns a cell array containing the sequence position count matrix, the array of unique symbols in the sequences, and the information weight matrix used to draw the logo.

DisplayInfo = seqlogo(..., 'Displaylogo', DisplaylogoValue) controls whether the sequence logo is drawn. When Displaylogo is false, seqlogo returns the display information but does not draw the logo.
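A minimal sketch of retrieving the display information without drawing the logo. The variable names are illustrative. Because the exact layout of the returned cell array is not spelled out above, the loop only reports the class and size of each cell rather than assuming an order.

```matlab
% Aligned nucleotide sequences (same data as in the Examples section).
S = {'ATTATAGCAAACTA', ...
     'AACATGCCAAAGTA', ...
     'ATCATGCAAAAGGA'};

% Request the display information only; no figure should be drawn.
DisplayInfo = seqlogo(S, 'Displaylogo', false);

% Inspect each cell without assuming which one holds which item.
for k = 1:numel(DisplayInfo)
    fprintf('Cell %d: class %s, size %s\n', k, ...
            class(DisplayInfo{k}), mat2str(size(DisplayInfo{k})));
end
```

Inspecting the cells this way may help identify the count matrix, the symbol list, and the weight matrix before indexing into them in later code.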

seqlogo(..., 'Alphabet', AlphabetValue) selects the alphabet for nucleotide sequences ('NT') or amino acid sequences ('AA'). The default is 'NT'. If you provide amino acid sequences to seqlogo, you must select 'AA' for the Alphabet.
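A short sketch of the amino acid case. The three peptide fragments below are made up for illustration and are not taken from any real protein; only the 'Alphabet' property and its 'AA' value come from the description above.

```matlab
% Hypothetical aligned amino acid sequences, all of length 10.
P = {'MKVLAAGIVG', ...
     'MKVLSAGIIG', ...
     'MRVLAAGLVG'};

% The default alphabet is 'NT', so 'AA' is selected explicitly.
seqlogo(P, 'Alphabet', 'AA');
```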

seqlogo(..., 'Startat', StartatValue) specifies the starting position for the sequences (Seqs). The default starting position is 1.

seqlogo(..., 'Endat', EndatValue) specifies the ending position for the sequences (Seqs). The default ending position is the maximum length of the sequences (Seqs).
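As an illustration, the two properties can be combined to draw a logo for part of an alignment. The positions 3 and 10 are arbitrary values picked for this sketch, and the reading that the logo is restricted to that range is an interpretation of the descriptions above.

```matlab
% Aligned nucleotide sequences (same data as in the Examples section).
S = {'ATTATAGCAAACTA', ...
     'AACATGCCAAAGTA', ...
     'ATCATGCAAAAGGA'};

% Draw the logo for alignment positions 3 through 10 only.
seqlogo(S, 'Startat', 3, 'Endat', 10);
```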

seqlogo(..., 'SSCorrection', SSCorrectionValue) controls the small-sample correction applied when estimating the number of bits. A simple calculation of bits tends to overestimate the conservation at a particular position. To compensate, when SSCorrection is true, a rough estimate is applied as an approximate correction; when SSCorrection is false, no correction is applied. The correction works better when the number of sequences is greater than 50. The default is true.
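To make the effect of such a correction concrete, the sketch below computes per-position bits by hand, with and without the approximate small-sample term e = (s - 1) / (2 ln(2) n) given by Schneider and Stephens (1990). It does not call seqlogo, and it is an assumption that seqlogo applies exactly this term. All variable names are illustrative.

```matlab
% Aligned nucleotide sequences as a character matrix
% (same data as in the Examples section).
S = ['ATTATAGCAAACTA'; ...
     'AACATGCCAAAGTA'; ...
     'ATCATGCAAAAGGA'];
symbols = 'ACGT';
[n, len] = size(S);          % n sequences, len alignment positions
s = numel(symbols);          % alphabet size

% Relative frequency of each symbol at each position (s-by-len).
freq = zeros(s, len);
for b = 1:s
    freq(b, :) = sum(S == symbols(b), 1) / n;
end

% Shannon entropy per position, treating 0*log2(0) as 0.
logf = zeros(size(freq));
logf(freq > 0) = log2(freq(freq > 0));
H = -sum(freq .* logf, 1);

% Approximate small-sample correction (assumed form, see lead-in).
e = (s - 1) / (2 * log(2) * n);

bitsRaw       = log2(s) - H;        % no correction
bitsCorrected = log2(s) - (H + e);  % with the approximate correction
```

With only three sequences the correction term is large relative to the 2-bit maximum, and some corrected values can fall below zero, which is consistent with the note that the correction works better for larger sets. The two settings can be compared in seqlogo itself by calling it once with 'SSCorrection' set to true and once with it set to false.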

Reference

Schneider, T.D., Stephens, R.M., "Sequence Logos: A new way to display consensus sequences," Nucleic Acids Research, Vol. 18, pp. 6097-6100, 1990.

Examples

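Display the sequence logo for a series of aligned sequences:

% Three aligned nucleotide sequences of equal length
S = {'ATTATAGCAAACTA',...
     'AACATGCCAAAGTA',...
     'ATCATGCAAAAGGA'};
% Draw the logo using the default nucleotide alphabet
seqlogo(S)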

See Also

Bioinformatics Toolbox functions seqconsensus, seqdisp, seqprofile


© 1994-2005 The MathWorks, Inc.