Bioinformatics Toolbox 1.0 Release Notes


Introduction

The Bioinformatics Toolbox is a collection of tools built on the MATLAB numeric computing environment. It extends MATLAB with basic sequence analysis and gene expression analysis functions and supports a wide range of common tasks, from accessing Web-based databases, to sequence alignment, to microarray normalization and visualization.

The Bioinformatics Toolbox depends on many functions from the Statistics Toolbox, including some that are available only in the latest version, Statistics Toolbox 4.1. We recommend that you install the latest version of the Statistics Toolbox before running the Bioinformatics Toolbox.

Features

This section introduces the features of the Bioinformatics Toolbox 1.0. The toolbox has more than 100 functions, implemented as M-files. For a complete list of functions, type the following in the MATLAB Command Window:

help bioinfo

Data I/O

The toolbox provides functions to directly access many standard Web-based databases such as GenBank, EMBL, PIR, and PDB. There are also functions to read many standard file formats, including FASTA and PDB. For microarray data, there are functions to read Affymetrix, GenePix, and SPOT format data, and a function to access data directly from the NCBI Gene Expression Omnibus Web site.
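
As a rough illustration of what reading a FASTA file involves, the sketch below parses FASTA-formatted text into (header, sequence) pairs. It is written in Python for illustration only and is not the toolbox's own reader; the name parse_fasta is hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples.

    Hypothetical helper for illustration; not part of the toolbox.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = ">seq1 demo\nACGT\nTTGA\n>seq2\nGGCC\n"
records = parse_fasta(example)
```

Each header line starts with ">" and the sequence lines that follow it are joined into a single string.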

Sequence Alignment

The toolbox has functions for pairwise sequence alignment and for hidden Markov model-based sequence profile alignment, including efficient MATLAB implementations of the Needleman-Wunsch and Smith-Waterman algorithms. In addition to the alignment functions, there are several tools for visualizing sequence alignments. The toolbox also provides many standard scoring matrices, including the PAM and BLOSUM families.
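
For readers unfamiliar with global alignment, the following is a minimal sketch of the Needleman-Wunsch dynamic program. It is written in Python for illustration and is not the toolbox's MATLAB implementation. The function name and the simple match/mismatch/gap scores are assumptions; a real analysis would normally use a scoring matrix such as one from the PAM or BLOSUM families.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not toolbox defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


best, aligned_a, aligned_b = needleman_wunsch("ACGT", "ACT")
```

The Smith-Waterman local alignment algorithm uses the same recurrence, but it clamps each cell at zero and starts the traceback from the highest-scoring cell instead of the corner.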

Sequence Utilities and Statistics

The toolbox contains many functions for working with sequences. There are functions for converting DNA sequences to RNA or amino acid sequences, functions that report various statistics about sequences, and functions to search for patterns within a sequence. There are also functions for creating random sequences and for performing in silico digestion of sequences with restriction enzymes and proteases.
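
The operations described above are simple to state precisely. The sketch below shows them in Python for illustration; it does not use the toolbox's functions, and all function names are hypothetical. Translation uses the standard genetic code, and the digestion example assumes the EcoRI recognition site GAATTC, which is cut after the first G.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def dna_to_rna(dna):
    """Convert a DNA sequence to RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate DNA to amino acids in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def base_count(dna):
    """Count the occurrences of each base in the sequence."""
    return dict(Counter(dna.upper()))


def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site, cut_offset bases into the site."""
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


rna = dna_to_rna("GATTACA")
protein = translate("ATGGCCTAA")
counts = base_count("GATTACA")
fragments = digest("AAGAATTCTT")
```

Joining the fragments returned by digest reproduces the original sequence, which is a convenient sanity check for any cutting rule.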

Microarray Normalization and Visualization

The toolbox contains a number of functions for normalizing microarray data, including lowess normalization, global mean normalization, and median absolute deviation (MAD) normalization. The toolbox provides several functions for visualizing microarray data, including spatial heat maps, box plots, log-log plots, and I-R (intensity versus ratio) plots. The toolbox also uses functions from the Statistics Toolbox to perform cluster analysis and to visualize the results.
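
To make two of these normalizations concrete, the sketch below shows one common reading of each: global mean normalization divides every value by the mean, and MAD normalization divides every value by the median absolute deviation. It is written in Python for illustration, the function names are hypothetical, and the toolbox's own functions may use different conventions (for example, different centering or scaling targets).

```python
from statistics import mean, median


def global_mean_normalize(values):
    """Scale values so that their mean becomes 1 (assumed convention)."""
    m = mean(values)
    return [v / m for v in values]


def mad(values):
    """Median absolute deviation: the median of |v - median(values)|."""
    med = median(values)
    return median([abs(v - med) for v in values])


def mad_normalize(values):
    """Scale values so that their MAD becomes 1 (assumed convention)."""
    d = mad(values)
    return [v / d for v in values]


intensities = [100.0, 200.0, 300.0, 400.0, 1000.0]
by_mean = global_mean_normalize(intensities)
by_mad = mad_normalize(intensities)
```

Because the MAD is based on medians, a single bright outlier such as the last intensity above affects it far less than it affects the mean.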

Protein Structure Analysis

In addition to the standard sequence analysis functions, the toolbox includes a graphical user interface (GUI), proteinplot, for visualizing properties of protein sequences.

Tutorial Demonstrations

The toolbox includes several tutorial examples that demonstrate how to use its functions. These tutorials are a good place to start.


© 1994-2005 The MathWorks, Inc.