seq2regexp

Convert sequence with ambiguous characters to regular expression

Syntax

seq2regexp(Seq)

Arguments

Seq

Nucleotide or amino acid sequence.

Nucleotide Conversions

Nucleotide Letter   Conversion   Nucleotide
A                   A            Adenosine
C                   C            Cytosine
G                   G            Guanine
T                   T            Thymidine
U                   U            Uridine
R                   [GA]         (Purine)
Y                   [TC]         (Pyrimidine)
K                   [GT]         (Keto)
M                   [AC]         (Amino)
S                   [GC]         (Strong)
W                   [AT]         (Weak)
B                   [GTC]
D                   [GAT]
H                   [ACT]
V                   [GCA]
N                   [AGCT]       Any nucleotide
-                   -            Gap of indeterminate length
?                   ?            Unknown

Amino Acid Conversions

Amino Acid Letter   Conversion                Description
B                   [DN]                      Aspartic acid or asparagine
Z                   [EQ]                      Glutamic acid or glutamine
X                   [ARNDCQEGHILKMFPSTWYV]    Any amino acid

Description

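seq2regexp(Seq) replaces each ambiguous nucleotide or amino acid symbol in Seq, as defined by the IUB/IUPAC codes, with a regular expression character class that lists the symbols it can stand for. Unambiguous symbols are returned unchanged.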
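The substitution can be pictured as a per-character table lookup. The following is a minimal sketch of that idea for nucleotide input only, using the conversions listed in the nucleotide table. It is not the toolbox implementation; the variable names codes, classes, and seq are chosen here for illustration.

```matlab
% Illustrative sketch only, not the seq2regexp source code.
% Ambiguous nucleotide letters and their character classes,
% taken from the Nucleotide Conversions table.
codes   = 'RYKMSWBDHVN';
classes = {'[GA]','[TC]','[GT]','[AC]','[GC]','[AT]', ...
           '[GTC]','[GAT]','[ACT]','[GCA]','[AGCT]'};

seq = 'ACWTMAN';   % sample input
r = '';
for k = 1:numel(seq)
    idx = find(codes == seq(k), 1);   % position of the letter in the lookup, if any
    if isempty(idx)
        r = [r seq(k)];               % unambiguous symbol: copy unchanged
    else
        r = [r classes{idx}];         % ambiguous symbol: substitute its class
    end
end
```

For the sample input, r ends up as AC[AT]T[AC]A[AGCT]. The sketch ignores amino acid input and letter case, both of which a full implementation would have to handle.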

Examples

Convert a nucleotide sequence into a regular expression.

r = seq2regexp('ACWTMAN')

r =
AC[AT]T[AC]A[AGCT]
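The returned string can be passed directly to regexp to search another sequence. The following sketch assumes a short target sequence that was made up for this illustration; the variable names pattern, target, and starts are also illustrative.

```matlab
% Build the search pattern from the ambiguous sequence.
pattern = seq2regexp('ACWTMAN');

% Hypothetical target sequence, invented for this example.
target = 'TTACATCAGGG';

% The first output of regexp is the start index of each match.
starts = regexp(target, pattern);
```

Here the substring ACATCAG, which begins at position 3 of target, matches the pattern. Use regexpi in place of regexp when the target may contain lowercase letters.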

See Also

Bioinformatics Toolbox functions restrict, seqwordcount

MATLAB functions regexp, regexpi


© 1994-2005 The MathWorks, Inc.