Bioinformatics Toolbox
Convert nucleotide sequence from letter to integer representation
SeqInt = nt2int(SeqNT, 'PropertyName', PropertyValue)
nt2int(..., 'Unknown', UnknownValue)
nt2int(..., 'ACGTOnly', ACGTOnlyValue)
SeqNT | Nucleotide sequence represented with letters. Enter a character string from the table Mapping Nucleotide Letters to Integers below. Integers are arbitrarily assigned to IUB/IUPAC letters. If the property ACGTOnly is true, you can only enter the characters A, C, T, G, and U. |
UnknownValue | Property to select the integer for unknown characters. Enter an integer. Maximum value is 255. Default value is 0. |
ACGTOnlyValue | Property to control the use of ambiguous nucleotides. Enter either true or false. Default value is false. |
Mapping Nucleotide Letters to Integers
Base | Code | Base | Code | Base | Code |
|---|---|---|---|---|---|
Adenosine | A—1 | A, G (purine) | R—6 | T, G, C | B—12 |
Cytidine | C—2 | T, C (pyrimidine) | Y—7 | A, T, G | D—13 |
Guanine | G—3 | G, T (keto) | K—8 | A, T, C | H—14 |
Thymidine | T—4 | A, C (amino) | M—9 | A, G, C | V—15 |
Uridine | U—4 | G, C (strong) | S—10 | Gap of indeterminate length | - —16 |
A, T, G, C (any) | N—5 | A, T (weak) | W—11 | Unknown (default) | *—0 |
nt2int(SeqNT, 'PropertyName', PropertyValue) converts a character string of nucleotides to a 1-by-N array of integers using the table Mapping Nucleotide Letters to Integers above. Unknown characters (characters not in the table) are mapped to 0. Gaps represented with hyphens are mapped to 16.
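As a quick check of the table, the following sketch converts a short sequence that contains a gap and a uridine. The input string is made up for illustration, and the value in the comment is read off the mapping table rather than taken from a recorded session.

```matlab
% Hypothetical input: A, C, a gap (hyphen), G, and U.
% Per the table: A -> 1, C -> 2, '-' -> 16, G -> 3, U -> 4 (same code as T).
s = nt2int('AC-GU')   % expected from the table: 1 2 16 3 4
```

Because T and U share the integer 4, the integer form alone does not record whether the original sequence was DNA or RNA.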
nt2int(SeqNT,'Unknown',UnknownValue) defines the number used to represent unknown nucleotides. The default value is 0.
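A minimal sketch of the Unknown property, assuming that 'Z' counts as an unknown character because it does not appear in the mapping table. The input string and the value 99 are arbitrary choices for illustration, and the commented results follow from the description above rather than from a recorded session.

```matlab
% 'Z' is not in the mapping table, so it is treated as an unknown character.
s_default = nt2int('ACZT')                   % expected: 1 2 0 4
s_custom  = nt2int('ACZT', 'Unknown', 99)    % expected: 1 2 99 4
```

Choosing a value that no table entry uses (anything from 17 to 255) keeps unknown characters distinguishable from every valid code.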
nt2int(SeqNT, 'ACGTOnly', ACGTOnlyValue) controls how ambiguous nucleotide characters are converted. If ACGTOnly is true, the ambiguous nucleotide characters (N, R, Y, K, M, S, W, B, D, H, and V) are represented by the unknown nucleotide number.
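The effect of ACGTOnly can be sketched by converting the same sequence with and without the property. The input string is an arbitrary illustration, and the commented results are derived from the mapping table and the default unknown value of 0, not from a recorded session.

```matlab
% N, R, and Y are ambiguous codes (5, 6, and 7 in the mapping table).
s_all  = nt2int('ACGTNRY')                     % expected: 1 2 3 4 5 6 7
% With ACGTOnly set to true, the ambiguous codes collapse to the
% unknown value, which defaults to 0.
s_acgt = nt2int('ACGTNRY', 'ACGTOnly', true)   % expected: 1 2 3 4 0 0 0
```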
Convert a nucleotide sequence with letters to integers.
s = nt2int('ACTGCTAGC')
s =
1 2 4 3 2 4 1 3 2
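As a follow-up sketch, the integer vector can be converted back to letters with int2nt, the inverse function from the same toolbox. This assumes int2nt is called with its default settings, which return uppercase DNA letters.

```matlab
s = nt2int('ACTGCTAGC');   % integer representation, as in the example above
seq = int2nt(s)            % expected to return 'ACTGCTAGC'
```

The round trip is exact here only because the sequence contains no U and no unknown characters; both T and U map to 4, so that distinction is not stored in the integers.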
See also the Bioinformatics Toolbox functions aa2int, baselookup, int2aa, and int2nt.
© 1994-2005 The MathWorks, Inc.