Bioinformatics Toolbox
Convert sequence with ambiguous characters to regular expression
seq2regexp(Seq)
| Argument | Description |
|---|---|
| Seq | Nucleotide or amino acid sequence. |
Nucleotide Conversions
| Nucleotide Letter | Regular Expression | Description |
|---|---|---|
| A | A | Adenosine |
| C | C | Cytosine |
| G | G | Guanine |
| T | T | Thymidine |
| U | U | Uridine |
| R | [GA] | Purine |
| Y | [TC] | Pyrimidine |
| K | [GT] | Keto |
| M | [AC] | Amino |
| S | [GC] | Strong |
| W | [AT] | Weak |
| B | [GTC] | |
| D | [GAT] | |
| H | [ACT] | |
| V | [GCA] | |
| N | [AGCT] | Any nucleotide |
| - | - | Gap of indeterminate length |
| ? | ? | Unknown |
Amino Acid Conversion
| Amino Acid Letter | Regular Expression | Description |
|---|---|---|
| B | [DN] | Aspartic acid or asparagine |
| Z | [EQ] | Glutamic acid or glutamine |
| X | [ARNDCQEGHILKMFPSTWYV] | Any amino acid |
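seq2regexp(Seq) converts the ambiguous nucleotide or amino acid symbols in Seq into a regular expression. Each IUB/IUPAC ambiguity code is replaced with a bracketed character class listing the residues it can stand for, as shown in the tables above; unambiguous letters are left unchanged.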
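The substitution described above can be sketched with base MATLAB alone. This is an illustration under stated assumptions, not the toolbox's implementation: the variable names codes and classes are hypothetical, only the uppercase nucleotide codes from the table are covered, and the sketch relies on regexprep accepting paired cell arrays of expressions and replacements.

```matlab
% Illustrative sketch only; not the source of seq2regexp.
% Ambiguity codes and their character classes, copied from the
% nucleotide table. The names codes and classes are hypothetical.
codes   = {'R','Y','K','M','S','W','B','D','H','V','N'};
classes = {'[GA]','[TC]','[GT]','[AC]','[GC]','[AT]', ...
           '[GTC]','[GAT]','[ACT]','[GCA]','[AGCT]'};

% regexprep applies each code/class pair in turn. This is safe here
% because the replacement text contains only A, C, G, and T, none of
% which appears in the list of ambiguity codes.
pattern = regexprep('ACWTMAN', codes, classes);

disp(pattern)   % AC[AT]T[AC]A[AGCT]
```

Letters that are not in codes, such as A, C, G, T, and U, pass through unchanged, which matches the first rows of the nucleotide table.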
Convert a nucleotide sequence into a regular expression.
r = seq2regexp('ACWTMAN')
r =
AC[AT]T[AC]A[AGCT]
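The resulting expression can then be passed to the MATLAB function regexp to locate matching sites in another sequence. The following is a minimal sketch: the target sequence is made up for illustration, and the pattern is entered literally so that the snippet runs without the toolbox.

```matlab
% Pattern produced by seq2regexp('ACWTMAN') in the example above.
r = 'AC[AT]T[AC]A[AGCT]';

% Hypothetical target sequence, invented for this illustration.
target = 'GGACATCAGTTACTTAACC';

% Request the start position and the text of every match.
[startIdx, hits] = regexp(target, r, 'start', 'match');

disp(startIdx)   % 3    12
disp(hits)       % 'ACATCAG'    'ACTTAAC'
```

Use regexpi in place of regexp if the target sequence may contain lowercase letters.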
Bioinformatics Toolbox functions: restrict, seqwordcount
MATLAB functions: regexp, regexpi
© 1994-2005 The MathWorks, Inc.