A nucleotide sequence includes regulatory sequences before and after the protein coding section. By analyzing this sequence, you can determine the nucleotides that code for the amino acids in the final protein.
After you have a list of genes you are interested in studying, you can determine their protein coding sequences. This procedure uses the human gene HEXA and the mouse gene HEXA as examples.
If you did not retrieve gene data from the Web, you can load example data from a MAT-file included with the Bioinformatics Toolbox. In the MATLAB Command Window, type
load hexosaminidase
MATLAB loads the structures humanHEXA and mouseHEXA into the MATLAB workspace.
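Before searching for ORFs, it can help to confirm that both structures loaded and to check how long each sequence is. The following is a minimal sketch that relies only on the Sequence field used in this procedure; the variable names humanLength and mouseLength are introduced here for illustration.

```matlab
% Load the example data (skip this line if the structures are already loaded).
load hexosaminidase

% List the two structures with their sizes and classes.
whos humanHEXA mouseHEXA

% Number of nucleotides in each sequence.
% humanLength and mouseLength are illustrative names, not part of the toolbox.
humanLength = length(humanHEXA.Sequence)
mouseLength = length(mouseHEXA.Sequence)
```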
Look for open reading frames in the human gene. For example, for the human gene HEXA, type
humanORFs = seqshoworfs(humanHEXA.Sequence)
seqshoworfs creates the output structure humanORFs. This structure gives the position of the start and stop codons for all open reading frames (ORFs) on each reading frame.
humanORFs =
1x3 struct array with fields:
Start
Stop
The Help browser opens and lists the three reading frames, with the ORFs highlighted in blue, red, and green. Notice that the longest ORF is on the third reading frame.
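To make the idea of start and stop positions on each reading frame concrete, the following sketch scans a short toy sequence using only base MATLAB. It is an illustration under stated assumptions, not the algorithm seqshoworfs uses: the sequence is made up, ATG is the only start codon considered, and positions are reported as the first nucleotide of each codon. seqshoworfs may apply additional rules and may report positions differently.

```matlab
% Toy sequence (hypothetical, not taken from HEXA).
seq = 'ATGGCTTAACCATGAAATGA';
stopCodons = {'TAA', 'TAG', 'TGA'};

% One struct per reading frame, mirroring the Start and Stop fields above.
toyORFs = struct('Start', cell(1, 3), 'Stop', cell(1, 3));

for frame = 1:3
    orfStart = [];
    orfStop = [];
    inORF = false;
    % Step through the sequence one codon at a time in this frame.
    for pos = frame:3:(length(seq) - 2)
        codon = seq(pos:pos+2);
        if ~inORF && strcmp(codon, 'ATG')
            % First start codon since the last stop: an ORF opens here.
            inORF = true;
            orfStart(end+1) = pos;
        elseif inORF && any(strcmp(codon, stopCodons))
            % First in-frame stop codon closes the open ORF.
            inORF = false;
            orfStop(end+1) = pos;
        end
    end
    toyORFs(frame).Start = orfStart;
    toyORFs(frame).Stop = orfStop;
end
```

In this toy sequence, the second reading frame contains a start codon with no in-frame stop codon after it, so its Start field has one more entry than its Stop field. Code that pairs starts with stops should allow for that case.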

Locate open reading frames (ORFs) on the mouse gene. Type
mouseORFs = seqshoworfs(mouseHEXA.Sequence)
seqshoworfs creates the structure mouseORFs.
mouseORFs =
1x3 struct array with fields:
Start
Stop
The mouse gene shows the longest ORF on the first reading frame.
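You can also check the longest ORF numerically instead of reading it from the display. The sketch below assumes that humanORFs and mouseORFs exist in the workspace from the previous steps, and that the i-th entry of Start pairs with the i-th entry of Stop within each reading frame. ORFs with a start position but no matching stop position are ignored. The names orfSets, labels, longest, maxLen, and bestFrame are introduced here for illustration.

```matlab
% Collect both results so the same loop handles each gene.
orfSets = {humanORFs, mouseORFs};
labels = {'human', 'mouse'};

for k = 1:numel(orfSets)
    orfs = orfSets{k};
    longest = zeros(1, numel(orfs));   % longest complete ORF per frame
    for frame = 1:numel(orfs)
        % Use only ORFs that have both a start and a stop position.
        n = min(numel(orfs(frame).Start), numel(orfs(frame).Stop));
        if n > 0
            lens = orfs(frame).Stop(1:n) - orfs(frame).Start(1:n);
            longest(frame) = max(lens);
        end
    end
    % Reading frame holding the longest complete ORF for this gene.
    [maxLen, bestFrame] = max(longest);
    fprintf('%s: longest complete ORF is on reading frame %d (%d nt from start to stop position)\n', ...
        labels{k}, bestFrame, maxLen);
end
```

Compare the reported reading frames with the colored listings to confirm that they agree.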

© 1994-2005 The MathWorks, Inc.