getembl

Retrieve sequence information from EMBL database

Syntax

Data = getembl('AccessionNumber',
               'PropertyName', PropertyValue...)

getembl(..., 'ToFile', ToFileValue)
getembl(..., 'SequenceOnly', SequenceOnlyValue)

Arguments

AccessionNumber

Unique identifier for a sequence record. Enter a unique combination of letters and numbers.

ToFile

Property to specify the location and filename for saving data. Enter either a filename or a path and filename supported by your system. The data is saved as an ASCII text file.

SequenceOnly

Property to control whether only the sequence is retrieved, without the metadata. Enter true or false.

Description

getembl retrieves information from the European Molecular Biology Laboratory (EMBL) database for nucleotide sequences. This database is maintained by the European Bioinformatics Institute (EBI). For more details about the EMBL-Bank database, see

http://www.ebi.ac.uk/embl/Documentation/index.html

Data = getembl('AccessionNumber', 'PropertyName', PropertyValue...) searches for the accession number in the EMBL database (http://www.ebi.ac.uk/embl) and returns a MATLAB structure containing the following fields:

Comments
Identification
Accession
SequenceVersion
DateCreated
DateUpdated
Description
Keyword
OrganismSpecies
OrganismClassification
Organelle
Reference
DatabaseCrossReference
Feature
BaseCount
Sequence

getembl(..., 'ToFile', ToFileValue) returns a structure containing information about the sequence and saves the information to a file in EMBL data format. If you do not give a location or path to the file, the file is stored in the MATLAB current directory. Read an EMBL-formatted file back into MATLAB using the function emblread.

getembl(..., 'SequenceOnly', SequenceOnlyValue), when SequenceOnlyValue is true, returns only the sequence information without the metadata.

Examples

Retrieve data for the rat liver apolipoprotein A-I.

emblout = getembl('X00558')
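
The fields listed in the Description can be read from the returned structure with dot notation. The following is a minimal sketch, assuming network access to the EMBL database and that the record is returned with the field names listed above; the variable name seqLength is illustrative only.

```matlab
% Retrieve the full record (requires network access to the EMBL database).
emblout = getembl('X00558');

% List the names of the fields in the returned structure.
disp(fieldnames(emblout))

% Read individual fields using dot notation.
disp(emblout.Accession)
disp(emblout.Description)

% Count the characters in the Sequence field.
seqLength = numel(emblout.Sequence);
disp(seqLength)
```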

Retrieve data for the rat liver apolipoprotein and save it in the file rat_protein.txt in the directory c:\project. If a filename is given without a path, the file is stored in the current directory.

Data = getembl('X00558','ToFile','c:\project\rat_protein.txt')
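
A saved file can be read back with emblread, as noted in the Description. The following sketch assumes network access and writes to a temporary file (the name built with tempname is illustrative) instead of c:\project, so that it does not depend on that directory existing. It then checks whether the sequence read from the file matches the sequence returned by getembl.

```matlab
% Build a temporary filename so the example does not depend on a fixed path.
fname = [tempname '.txt'];

% Retrieve the record and save it to the file in EMBL data format.
Data = getembl('X00558', 'ToFile', fname);

% Read the saved EMBL-formatted file back into a structure.
FromFile = emblread(fname);

% Compare the sequence from the database with the one read from the file.
sameSequence = isequal(Data.Sequence, FromFile.Sequence);
disp(sameSequence)

% Remove the temporary file.
delete(fname);
```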

Retrieve only the sequence for the rat liver apolipoprotein.

Seq = getembl('X00558','SequenceOnly',true)
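
When only the sequence is needed, the result can be passed directly to other sequence functions. The following sketch assumes network access and that, with SequenceOnly set to true, the sequence is returned as a character array; basecount is the Bioinformatics Toolbox function for counting nucleotides.

```matlab
% Retrieve only the sequence, without the metadata.
Seq = getembl('X00558', 'SequenceOnly', true);

% Check the type and length of the returned sequence.
disp(ischar(Seq))
disp(numel(Seq))

% Count the occurrences of each nucleotide in the sequence.
counts = basecount(Seq);
disp(counts)
```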

See Also

Bioinformatics Toolbox functions emblread, getgenbank, getgenpept, getpdb, getpir


© 1994-2005 The MathWorks, Inc.