Bioinformatics Toolbox
Convert nucleotide sequence from integer to letter representation
SeqChar = int2nt(SeqInt, 'PropertyName', PropertyValue...)
int2nt(..., 'Alphabet', AlphabetValue)
int2nt(..., 'Unknown', UnknownValue)
int2nt(..., 'Case', CaseValue)
| Argument | Description |
|---|---|
| SeqInt | Nucleotide sequence represented by integers. Enter a vector of integers from the table Mapping Nucleotide Integers to Letters below. The array does not have to be of an integer type, but it must contain only whole-number values. The integers are arbitrarily assigned to IUB/IUPAC letters. |
| Alphabet | Property to select the nucleotide alphabet. Enter either 'DNA' or 'RNA'. The default value is 'DNA'. |
| Unknown | Property to select the character that represents an unknown nucleotide base. Enter a character to use for integers that do not correspond to a nucleotide or gap symbol in the table below. The character must not be one of the nucleotide characters A, T, C, G or the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, or V. The default character is *. |
| Case | Property to select the letter case for the nucleotide sequence. Enter either 'upper' or 'lower'. The default value is 'upper'. |
Mapping Nucleotide Integers to Letters
| Nucleotide Base | Code | Nucleotide Base | Code | Nucleotide Base | Code |
|---|---|---|---|---|---|
| Adenosine | 1–A | R - A, G (purine) | 6–R | B - T, G, C | 12–B |
| Cytosine | 2–C | Y - T, C (pyrimidine) | 7–Y | D - A, T, G | 13–D |
| Guanine | 3–G | K - G, T (keto) | 8–K | H - A, T, C | 14–H |
| Thymidine with Alphabet = 'DNA' | 4–T | M - A, C (amino) | 9–M | V - A, G, C | 15–V |
| Uridine with Alphabet = 'RNA' | 4–U | S - G, C (strong) | 10–S | - Gap of indeterminate length | 16– - |
| N - A, T, G, C (any) | 5–N | W - A, T (weak) | 11–W | * Unknown (default) | 0–* |
int2nt(SeqInt, 'PropertyName', PropertyValue...) converts a 1-by-N array of integers to a character string using the table Mapping Nucleotide Integers to Letters above.
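The table lookup described above can be pictured with a few lines of plain MATLAB. This is only an illustrative sketch, not the toolbox implementation: the variable names `lookup`, `unknownChar`, and `valid` are hypothetical, and it assumes the DNA alphabet with the default unknown character.

```matlab
% Illustrative sketch only; the actual int2nt implementation may differ.
% lookup(k) holds the letter for integer k (1 to 16), following the
% table Mapping Nucleotide Integers to Letters (DNA alphabet).
lookup = 'ACGTNRYKMSWBDHV-';
unknownChar = '*';                      % default unknown character

seqInt = [1 2 4 20 2 4 40 3 2];

% Mark every position as unknown first, then fill in the valid codes.
seqChar = repmat(unknownChar, size(seqInt));
valid = seqInt >= 1 & seqInt <= numel(lookup);
seqChar(valid) = lookup(seqInt(valid));

disp(seqChar)                           % ACT*CT*GC
```

Any value outside 1 to 16, including 0, falls through to the unknown character in this sketch.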
int2nt(..., 'Alphabet', AlphabetValue) defines the nucleotide alphabet to use. The default value is 'DNA', which uses the symbols A, T, C, and G. If Alphabet is set to 'RNA', the symbols A, C, U, G are used instead.
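As a brief illustration of the Alphabet property, the following sketch decodes the same integer vector with each alphabet. It assumes the Bioinformatics Toolbox is installed; the variable names are arbitrary. According to the mapping table, only the integer 4 should differ between the two results (T for DNA, U for RNA).

```matlab
% Decode one integer vector with both alphabets.
seqInt = [1 2 4 3 2 4 1 3 2];

dna = int2nt(seqInt)                      % default alphabet is 'DNA'
rna = int2nt(seqInt, 'Alphabet', 'RNA')   % 4 maps to U instead of T
```

With the mapping above, `dna` should be ACTGCTAGC and `rna` should be ACUGCUAGC.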
int2nt(..., 'Unknown', UnknownValue) defines the character to represent an unknown nucleotide base. The default character is '*'.
int2nt(..., 'Case', CaseValue) sets the output case of the nucleotide string. The default is uppercase.
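A short sketch of the Case property follows, again assuming the Bioinformatics Toolbox is available. The second call combines two property/value pairs, which the general syntax with repeated 'PropertyName', PropertyValue arguments suggests is allowed.

```matlab
% Request lowercase output, alone and together with the RNA alphabet.
seqInt = [1 2 4 3 2 4 1 3 2];

lowerSeq = int2nt(seqInt, 'Case', 'lower')
rnaLower = int2nt(seqInt, 'Alphabet', 'RNA', 'Case', 'lower')
```

Based on the mapping table, `lowerSeq` should be actgctagc and `rnaLower` should be acugcuagc.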
Enter a sequence of integers as a MATLAB vector (space or comma-separated list with square brackets).
s = int2nt([1 2 4 3 2 4 1 3 2])

s =
ACTGCTAGC
Define a symbol for unknown integers, that is, values with no nucleotide or gap entry in the mapping table (here, 20 and 40).
si = [1 2 4 20 2 4 40 3 2];
s = int2nt(si, 'unknown', '#')

s =
ACT#CT#GC
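Because int2nt is the inverse of nt2int, a round trip through both functions is a quick way to check a conversion. The sketch below assumes the Bioinformatics Toolbox is installed and that both functions use their default settings; the variable names are arbitrary.

```matlab
% Round trip: letters -> integers -> letters.
original = 'ACTGCTAGC';

codes    = nt2int(original);    % integer codes for each letter
restored = int2nt(codes);       % back to uppercase letters by default

isequal(original, restored)     % expected to be true for an uppercase DNA string
```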
See also the Bioinformatics Toolbox functions aa2int, baselookup, int2aa, and nt2int.
© 1994-2005 The MathWorks, Inc.