Bioinformatics Toolbox
Graphically display the words in a sequence
seqshowwords(Seq, Word, 'PropertyName', PropertyValue)
seqshowwords(...,'Color', ColorValue)
seqshowwords(...,'Columns', ColumnsValue)
| Seq | Enter either a nucleotide or amino acid sequence. You can also enter a structure with the field Sequence. |
| Word | Enter a short character sequence. |
| ColorValue | Property to select the color for highlighted characters. Enter a 1-by-3 RGB vector specifying the intensity (0–255) of the red, green, and blue components, or enter a character from the following list: 'b' (blue), 'g' (green), 'r' (red), 'c' (cyan), 'm' (magenta), or 'y' (yellow). The default color is red, 'r'. |
| ColumnsValue | Property to specify the number of characters in a line. The default value is 64. |
seqshowwords(Seq, Word) displays the sequence with all occurrences of a word highlighted, and returns a structure with the start and stop positions for all occurrences of the word in the sequence.
seqshowwords(...,'Color', ColorValue) selects the color used to highlight the words in the output display.
seqshowwords(...,'Columns', ColumnsValue) specifies how many columns per line to use in the output.
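As a usage sketch that combines both properties, the call below passes a structure with a Sequence field, highlights matches in blue, and wraps the display at 10 characters per line. The variable names s and hits are illustrative, and the example assumes that Bioinformatics Toolbox is installed.

```matlab
% Illustrative input: a structure with a Sequence field
% (the variable names s and hits are assumptions, not toolbox names).
s.Sequence = 'GCTATAACGTATATATATA';

% Highlight 'TATA' in blue and show 10 characters per line.
hits = seqshowwords(s, 'TATA', 'Color', 'b', 'Columns', 10);

% The returned structure holds the start and stop positions of each match.
disp(hits.Start)
disp(hits.Stop)
```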
If Word contains nucleotide or amino acid symbols that represent multiple possible symbols (ambiguous characters), seqshowwords shows all matches. For example, the symbol R represents either G or A (purines), so if Word is 'ART', seqshowwords matches both 'AAT' and 'AGT'. The following example shows two matches, 'TAGT' and 'TAAT', for the word 'BART', where B represents C, G, or T.
seqshowwords('GCTAGTAACGTATATATAAT','BART')
ans =
Start: [3 17]
Stop: [6 20]
000001 GCTAGTAACGTATATATAAT
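One way to picture the ambiguous-character matching is to expand each ambiguity code into a regular-expression character class and search with the MATLAB regexp function. This is only a sketch that uses the standard IUPAC nucleotide codes; the lookup lists codes and classes are hypothetical helpers, and nothing here is taken from the seqshowwords implementation.

```matlab
% Hypothetical lookup: IUPAC nucleotide ambiguity codes and the
% character classes they stand for (not taken from the toolbox source).
codes   = {'R','Y','K','M','S','W','B','D','H','V','N'};
classes = {'[AG]','[CT]','[GT]','[AC]','[CG]','[AT]', ...
           '[CGT]','[AGT]','[ACT]','[ACG]','[ACGT]'};

seq  = 'GCTAGTAACGTATATATAAT';
word = 'BART';

% Replace each ambiguous symbol in the word with its character class.
% The classes contain only A, C, G, and T, so later passes leave them alone.
pattern = word;
for k = 1:numel(codes)
    pattern = strrep(pattern, codes{k}, classes{k});
end

% Find the start and end index of every non-overlapping match.
[startPos, stopPos] = regexp(seq, pattern, 'start', 'end');

disp(pattern)    % [CGT]A[AG]T
disp(startPos)   % 3 17
disp(stopPos)    % 6 20
```

The positions agree with the Start and Stop fields shown in the example above.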
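seqshowwords does not highlight overlapping patterns multiple times. The following example finds three non-overlapping occurrences of 'TATA': the first at positions 3 to 6, and two adjacent ones at positions 10 to 13 and 14 to 17, which together cover the 'TATATATA' immediately after 'CG'. The display therefore shows two highlighted regions. The final 'TA' is not highlighted because the preceding 'TA' is part of an already matched pattern.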
seqshowwords('GCTATAACGTATATATATA','TATA')
ans =
Start: [3 10 14]
Stop: [6 13 17]
000001 GCTATAACGTATATATATA
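The non-overlapping behavior can be compared with base MATLAB functions. In this sketch, regexp scans left to right and resumes after each match, which reproduces the positions reported above, while strfind returns every starting index, including overlapping ones. That seqshowwords follows the same scanning rule as regexp is an assumption based on the output shown, not a statement about its implementation.

```matlab
seq = 'GCTATAACGTATATATATA';

% Non-overlapping matches: the search resumes after the end of each match.
[startPos, stopPos] = regexp(seq, 'TATA', 'start', 'end');

% All occurrences, including those that overlap an earlier match.
allStarts = strfind(seq, 'TATA');

disp(startPos)    % 3 10 14
disp(stopPos)     % 6 13 17
disp(allStarts)   % 3 10 12 14 16
```

The occurrences starting at positions 12 and 16 overlap earlier matches, so they do not appear in the non-overlapping result.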
To highlight every run of two or more consecutive TA repeats in full, use the regular expression 'TA(TA)*TA'.
seqshowwords('GCTATAACGTATATATATA','TA(TA)*TA')
ans =
Start: [3 10]
Stop: [6 19]
000001 GCTATAACGTATATATATA
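The same regular expression can be checked with the MATLAB regexp function. The sketch below also derives how many TA units each run contains from the match positions. The variable names are illustrative, and the example uses regexp directly instead of seqshowwords.

```matlab
seq = 'GCTATAACGTATATATATA';

% Greedy match of two or more consecutive 'TA' units.
[startPos, stopPos] = regexp(seq, 'TA(TA)*TA', 'start', 'end');

% Each unit is two characters long, so the run length in units is
% the match length divided by 2.
repeatCounts = (stopPos - startPos + 1) / 2;

disp(startPos)       % 3 10
disp(stopPos)        % 6 19
disp(repeatCounts)   % 2 5
```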
Bioinformatics Toolbox functions: palindromes, restrict, seqdisp, seqshoworfs
MATLAB functions: findstr, regexp
© 1994-2005 The MathWorks, Inc.