Bioinformatics Toolbox
The sequence and function of many genes are conserved during the evolution of species through homologous genes. Homologous genes are genes that have a common ancestor and similar sequences. One goal of searching a public database is to find similar genes. If you can locate a sequence in a database that is similar to your unknown gene or protein, the function and characteristics of the known and unknown genes are likely to be similar.
After finding the nucleotide sequence for a human gene, you can run a BLAST search, or search the genome of another organism, for the corresponding gene. This procedure uses the mouse genome as an example.
Open the MATLAB Help browser to the NCBI Web site. In the MATLAB Command Window, type
web('http://www.ncbi.nlm.nih.gov')
Search the nucleotide database for the gene or protein you are interested in studying. For example, from the Search list, select Nucleotide, and in the for box enter hexosaminidase A.

The search returns entries for the mouse and human genomes. The NCBI reference for the mouse gene HEXA has accession number AK080777.

Get sequence information for the mouse gene into MATLAB. Type
mouseHEXA = getgenbank('AK080777')
The mouse gene sequence is loaded into the MATLAB workspace as a structure.
mouseHEXA =
    LocusName: 'AK080777'
    LocusSequenceLength: '1839'
    LocusNumberofStrands: ''
    LocusTopology: 'linear'
    LocusMoleculeType: 'mRNA'
    LocusGenBankDivision: 'HTC'
    LocusModificationDate: '05-DEC-2002'
    Definition: [1x67 char]
    Accession: [1x201 char]
    Version: ' AK080777.1'
    GI: '26348756'
    Keywords: 'HTC; CAP trapper.'
    Segment: []
    Source: [1x93 char]
    SourceOrganism: [2x66 char]
    Reference: {1x6 cell}
    Comment: [12x66 char]
    Features: [31x79 char]
    BaseCount: [1x1 struct]
    Sequence: [1x1839 char]
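Once the record is in the workspace, individual fields can be read with dot notation. The following is a minimal sketch of that idea, not part of the original procedure. So that it runs without a network connection, it uses a hypothetical stand-in structure named demo with a few of the fields shown above and a made-up 12-base placeholder sequence. With the real record you would use mouseHEXA in place of demo.

```matlab
% Stand-in structure with a few of the fields shown above. The sequence
% is a made-up placeholder, NOT the real AK080777 sequence.
demo = struct('LocusName', 'AK080777', ...
              'LocusSequenceLength', '12', ...
              'LocusMoleculeType', 'mRNA', ...
              'Sequence', 'atggccggctga');

% The length field is stored as text (note the quotes in the display),
% so convert it to a number before comparing or doing arithmetic.
declaredLength = str2double(demo.LocusSequenceLength);

% The sequence itself is a 1-by-N character array.
actualLength = length(demo.Sequence);

% Sanity check: the declared length should match the bases retrieved.
lengthsAgree = (declaredLength == actualLength);

fprintf('%s (%s): %d bases, lengths agree = %d\n', ...
        demo.LocusName, demo.LocusMoleculeType, actualLength, lengthsAgree);
```

The same pattern applies to the other text fields in the structure, such as LocusModificationDate or Definition.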
© 1994-2005 The MathWorks, Inc.