Database Management,
GIS, and
Information Exchange
Presentation at Portland, Maine
May 27, 2003
Robert C. Groman

Database Management System
A suite of programs which typically manage large structured sets of persistent data, offering ad hoc query facilities to many users.
It controls the organization, storage, and retrieval of data (fields, records, and files) in a database. It also controls the security and integrity of the database.
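A minimal sketch of the two roles named above, ad hoc querying and integrity control, using Python's built-in sqlite3 module. The table name, column names, and values are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the schema below is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation ("
    " station TEXT NOT NULL,"
    " depth_m REAL NOT NULL CHECK (depth_m >= 0),"  # integrity rule enforced by the DBMS
    " temp_c REAL)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?)",
    [("A", 10, 12.0), ("A", 40, 10.0), ("B", 10, 14.0), ("B", 100, 6.0)],
)

# Ad hoc query: mean temperature above 50 m, per station.
shallow_means = conn.execute(
    "SELECT station, AVG(temp_c) FROM observation"
    " WHERE depth_m < 50 GROUP BY station ORDER BY station"
).fetchall()

# The DBMS rejects a record that violates the CHECK constraint.
try:
    conn.execute("INSERT INTO observation VALUES ('C', -5, 9.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The query was not planned when the table was created, which is the sense of "ad hoc" here.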

Database
One or more large structured sets of persistent data, usually associated with software to update and query the data. A simple database might be a single file containing many records, each of which contains the same set of fields, each field having a fixed width.
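A minimal sketch of the simple single-file database described above, in which every record has the same fixed-width fields. The layout (field names and widths) is a hypothetical example, not a real file format.

```python
# Hypothetical fixed-width layout: (field name, width in characters).
LAYOUT = [("station", 8), ("lat", 8), ("lon", 9), ("depth_m", 6)]

def format_record(record):
    """Pad (or truncate) each field to its width so all records have equal length."""
    return "".join(str(record[name]).ljust(width)[:width] for name, width in LAYOUT)

def parse_record(line):
    """Slice one fixed-width line back into a dict of stripped field values."""
    record, start = {}, 0
    for name, width in LAYOUT:
        record[name] = line[start:start + width].strip()
        start += width
    return record

sample = format_record(
    {"station": "GB001", "lat": "41.25", "lon": "-67.50", "depth_m": "85"}
)
parsed = parse_record(sample)
```

Because every record has the same length, the n-th record can be located by position alone, which is what makes this layout workable without any database software.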

Geographic Information System
A computer system for capturing, storing, checking, integrating, manipulating, analyzing and displaying data related to positions on the Earth's surface. Typically, a GIS is used for handling maps of one kind or another. These might be represented as several different layers where each layer holds data about a particular kind of feature (e.g. roads).
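A minimal sketch of the layer idea, assuming each layer is simply a list of named features with longitude/latitude points. The layer names, feature names, and coordinates are made up for illustration; a real GIS stores far richer geometry.

```python
# Each layer holds one kind of feature; points are (lon, lat) pairs.
layers = {
    "roads": [
        {"name": "Route 1", "points": [(-70.3, 43.6), (-70.2, 43.7)]},
    ],
    "stations": [
        {"name": "Buoy A", "points": [(-69.9, 43.5)]},
        {"name": "Buoy B", "points": [(-66.0, 42.0)]},
    ],
}

def features_in_box(layers, layer_names, west, south, east, north):
    """Return (layer, feature name) for features with any point inside the box."""
    hits = []
    for layer_name in layer_names:
        for feature in layers[layer_name]:
            if any(west <= lon <= east and south <= lat <= north
                   for lon, lat in feature["points"]):
                hits.append((layer_name, feature["name"]))
    return hits

hits = features_in_box(layers, ["roads", "stations"], -71.0, 43.0, -69.0, 44.0)
```

Keeping each kind of feature in its own layer lets a query or a map display switch layers on and off independently.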

Data Dictionary
A data structure that stores meta-data, i.e. data about data. The term "data dictionary" has several uses.

Most generally it is a set of data descriptions that can be shared by several applications.

Usually it means a table in a database that stores the names, field types, length, and other characteristics of the fields in the database tables.
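A minimal sketch of a data dictionary built from a database's own catalog, using SQLite's sqlite_master table and PRAGMA table_info. The two example tables are hypothetical. SQLite does not record field lengths, so this sketch lists only names, declared types, and constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables, for illustration only.
conn.execute(
    "CREATE TABLE station (station_id TEXT PRIMARY KEY,"
    " lat REAL NOT NULL, lon REAL NOT NULL)"
)
conn.execute(
    "CREATE TABLE sample (sample_id INTEGER PRIMARY KEY,"
    " station_id TEXT, depth_m REAL)"
)

def data_dictionary(conn):
    """List every field of every table with its declared type and constraints."""
    entries = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
                f'PRAGMA table_info("{table}")'):
            entries.append({
                "table": table,
                "field": name,
                "type": col_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
            })
    return entries

dictionary = data_dictionary(conn)
```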

Data Model
A data model says what information is to be contained in a database, how the information will be used, and how the items in the database will be related to each other.  (To combine and synthesize disparate data)
Simple table(s), hierarchical, 3rd Normal Form. “One of the most widely used methods for developing data models is the entity-relationship model. The relational model is the most widely used type of data model. Another example is NIAM.”
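A minimal sketch of a relational data model in roughly third normal form, using sqlite3. Stations and taxa are each described once in their own tables, and observations refer to them by key. All table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE station (
    station_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    lat REAL,
    lon REAL
);
CREATE TABLE taxon (
    taxon_id INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL UNIQUE
);
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    station_id INTEGER NOT NULL REFERENCES station(station_id),
    taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
    abundance INTEGER
);
""")
conn.execute("INSERT INTO station VALUES (1, 'Georges Bank', 41.25, -67.5)")
conn.execute("INSERT INTO taxon VALUES (1, 'Calanus finmarchicus')")
conn.execute("INSERT INTO observation VALUES (1, 1, 1, 120)")

# Relationships are followed with joins rather than by repeating station or taxon details.
joined = conn.execute(
    "SELECT s.name, t.scientific_name, o.abundance"
    " FROM observation o"
    " JOIN station s ON s.station_id = o.station_id"
    " JOIN taxon t ON t.taxon_id = o.taxon_id"
).fetchall()

# An observation pointing at a station that does not exist is refused.
try:
    conn.execute("INSERT INTO observation VALUES (2, 99, 1, 5)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```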

Data Model (more)
Scientific Information Model for distributed data in relational databases and web links established at the NOAA Pacific Marine Environmental Lab (Wright et al., 1997)
See also Data Management for Marine Geology and Geophysics, Tools for Archiving, Analysis, and Visualization, 2001, and others of Wright’s references

Meta-data
Data about data.
For example, meta-data would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about the data set itself (where it is located, how it is associated, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data.
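A minimal sketch of a meta-data record covering the three levels listed above: data elements, the record structure, and the data set as a whole. Every name and value here, including the file location and owner, is a hypothetical placeholder.

```python
import json

metadata = {
    # Data about data elements (name, size, data type).
    "elements": [
        {"name": "depth_m", "data_type": "float", "size_bytes": 8, "units": "meters"},
        {"name": "temp_c", "data_type": "float", "size_bytes": 8, "units": "degrees Celsius"},
    ],
    # Data about the record structure (fields, length).
    "record": {"fields": ["depth_m", "temp_c"], "length_bytes": 16},
    # Data about the data set (location, ownership, quality).
    "dataset": {
        "location": "example/ctd_casts.dat",
        "owner": "Example Lab",
        "quality": "preliminary, not yet calibrated",
    },
}

def check_consistency(meta):
    """Verify that the record description agrees with the element descriptions."""
    names = [element["name"] for element in meta["elements"]]
    total = sum(element["size_bytes"] for element in meta["elements"])
    record = meta["record"]
    return names == record["fields"] and total == record["length_bytes"]

is_consistent = check_consistency(metadata)
as_json = json.dumps(metadata, indent=2)  # a plain-text form that travels with the data
```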

Detailed Meta-data
Pros – required for full understanding of data within a DBMS.  Required if others want to use the data
Cons – pain in the neck to prepare, maintain, and enter

Database Scheme Appropriate?
Does it fit the current needs?  Access; ease of use; speed requirements; interface to applications; scalability;
Can it evolve?  Add new data; accommodate new requirements and applications
Cost; platform dependencies; care and feeding costs (people and $)

How Transferable from GMBIS?
Is it typically easy to transfer from a database into someone’s own database using different software?  No.  But it is usually “do-able.”
Are there steps we should be taking to make sure it’s possible?  Yes.  Follow standards when they exist, or otherwise adopt what looks good  (DIF, OBIS, FGDC, DODS->NVODS->OPeNDAP, DiGIR, ….)
FGDC is very complex; for meta-data only
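A minimal sketch of one "do-able" transfer route: dumping a table to a delimited flat file with a header row, which most other database software can import. It uses sqlite3 and the csv module; the table and its contents are hypothetical, and this is not a procedure taken from GMBIS itself.

```python
import csv
import io
import sqlite3

def export_table(conn, table, out):
    """Write every row of `table` to `out` as CSV, with column names as the header."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)

# Hypothetical source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cast_summary (station TEXT, depth_m REAL)")
conn.executemany("INSERT INTO cast_summary VALUES (?, ?)", [("A", 10.0), ("B", 100.0)])

buffer = io.StringIO()
export_table(conn, "cast_summary", buffer)
flat_file = buffer.getvalue()
```

A flat file carries the values but not the meaning of the fields, so it is most useful when it travels with meta-data describing each column.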

GIS – looking at geographically attributed data
MapServer – public domain version of ArcInfo
GMT – text files as a GIS at some level
UIRSA – public domain
MapInfo – low end
Strategic GIS
ArcInfo/ArcGIS – ESRI
EASy – “no analysis tools, just serving data”
Delorme (ME); GeoZUI3D (UNH); JGOFS (!)
Learning curve: lots of options, capabilities

Other Efforts and Alphabet Soup
LabNet – consortium of marine organizations to make their data available (uses 4D Geobrowser “index cards”)
NetCDF (self documenting format liked by DODS, and others)
OBIS – “portal” (aggregation server) for biological data (using Darwin Core 2 – OBIS)

Other Efforts, continued
ZOPE – object oriented application server
PLONE – another tool
Ocean Data View (ODV) – CVS, Matlab, ODV (1 and 2)
JGViews – Needs JRE software
LAS – web-based, active-image based data interface for registered data

Other Efforts, continued
uBio – (Universal Biological Indexer and Organizer) a networked information service for biological information resources based on the Taxonomic Name Server (TNS), a thesaurus
A system for indexing data that is associated with living or once living organisms

Other Efforts, continued
Hexacoral – biggest user in OBIS; uses DiGIR
DiGIR – uses an XML protocol to get the data.  Extends XML to do queries.  Uses a PHP software package to execute the code.  Supports 14 or 15 databases, e.g. SQL-based ones.  Two options for JGOFS: export to a flat file or write our own Perl script to interface directly to DiGIR (ZooGene -> OBIS)

Other Efforts, continued
Oregon State University, Randy Keller and Paul Johnson, mapping specialist at HMRG
Steve Hankin, “An Implementation Plan for the Data and Communication Subsystem of the U.S. Integrated Ocean Observing System”
Margo Edwards at HIG and Dawn Wright at OSU

Other Efforts, continued
Protocols: RIDGE, petrological data. Endeavor Observatory website, Lamont’s PetDB
SIO Ocean Exploration data portal,
University of Washington’s Endeavor GIS and Portal to Endeavor Data (PED)

Educational “Tools”
Virtual Research Vessel, University of Oregon and Oregon State University
REVEL, University of Washington
Dive and Discover, WHOI

Portal for biological data using Darwin Core 2 as modified by OBIS
Difficulties with attribution and handling duplicate data (Add date last modified for each datum has been suggested.)
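A minimal sketch of how a date-last-modified value on each datum could help with duplicates: when the same record arrives from more than one provider, keep the most recently modified copy. The field names are illustrative and are not claimed to match the Darwin Core 2 schema as modified by OBIS.

```python
from datetime import date

# Hypothetical records; two providers supply the same catalog number.
records = [
    {"catalog_number": "Z-001", "scientific_name": "Calanus finmarchicus",
     "source": "provider A", "date_last_modified": date(2002, 5, 1)},
    {"catalog_number": "Z-001", "scientific_name": "Calanus finmarchicus",
     "source": "provider B", "date_last_modified": date(2003, 1, 15)},
    {"catalog_number": "Z-002", "scientific_name": "Centropages typicus",
     "source": "provider A", "date_last_modified": date(2001, 9, 30)},
]

def keep_latest(records, key="catalog_number"):
    """For each key value, keep only the record with the newest modification date."""
    latest = {}
    for record in records:
        current = latest.get(record[key])
        if current is None or record["date_last_modified"] > current["date_last_modified"]:
            latest[record[key]] = record
    return list(latest.values())

deduplicated = keep_latest(records)
```

Keeping the source field on the surviving record preserves attribution, which is the other difficulty noted above.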

DODS (Cornillon) -> http; likes NetCDF format
DiGIR -> uses XML, but too verbose for physical data.  OBIS will use DODS instead
JGOFS -> http
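A minimal sketch of why XML can be too verbose for physical data: the same four temperature values are encoded once as tagged XML text and once as packed 32-bit floats. The element names and values are hypothetical, and neither encoding is the actual DiGIR or DODS wire format.

```python
import struct
import xml.etree.ElementTree as ET

temperatures = [12.5, 12.4, 12.1, 11.8]  # made-up values

# XML: every value carries its own tag and attribute text.
root = ET.Element("profile")
for value in temperatures:
    ET.SubElement(root, "temperature", units="degC").text = str(value)
xml_bytes = ET.tostring(root)

# Binary: four little-endian 32-bit floats, 4 bytes each.
binary_bytes = struct.pack(f"<{len(temperatures)}f", *temperatures)

overhead_ratio = len(xml_bytes) / len(binary_bytes)
```

For a handful of biological records the tagging overhead matters little, but for long numeric series it multiplies the volume of every transfer.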

My apologies for those references I’ve missed.
There are many other efforts underway in all these areas.