Database Management, GIS, and Information Exchange
Presentation at Portland, Maine | |
May 27, 2003 | |
Robert C. Groman |
Database management system (DBMS): a suite of programs that typically manages large structured sets of persistent data, offering ad hoc query facilities to many users. |
A DBMS controls the organization, storage and retrieval of data (fields, records and files) in a database. It also controls the security and integrity of the database. |
Database: one or more large structured sets of persistent data, usually associated with software to update and query the data. A simple database might be a single file containing many records, each of which contains the same set of fields, where each field has a certain fixed width. |
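The "single file of fixed-width records" idea can be sketched in a few lines of Python. This is only an illustration: the field names and widths in FIELDS and the two sample records are hypothetical, not taken from any real data set.

```python
# Hypothetical field layout (name, width in characters); not from any real file.
FIELDS = [("station", 6), ("date", 10), ("depth_m", 5), ("temp_c", 6)]
RECORD_WIDTH = sum(width for _, width in FIELDS)

def format_record(record):
    """Pad (or truncate) each field to its fixed width and join into one line."""
    return "".join(str(record[name]).ljust(width)[:width] for name, width in FIELDS)

def parse_record(line):
    """Slice one fixed-width line back into a dict of field values."""
    record, pos = {}, 0
    for name, width in FIELDS:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record

# Made-up sample records, used only to show the round trip.
records = [
    {"station": "ST001", "date": "2003-05-27", "depth_m": "10", "temp_c": "12.5"},
    {"station": "ST002", "date": "2003-05-27", "depth_m": "250", "temp_c": "4.1"},
]
lines = [format_record(r) for r in records]      # what would be written to the file
parsed = [parse_record(line) for line in lines]  # what a reader would recover
```

Every line has the same length, so a reader needs only the field layout to recover the data. The file itself carries no description of that layout.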
Geographic Information System (GIS): a computer system for capturing, storing, checking, integrating, manipulating, analyzing and displaying data related to positions on the Earth's surface. Typically, a GIS is used for handling maps of one kind or another. These might be represented as several different layers where each layer holds data about a particular kind of feature (e.g. roads). |
Data dictionary: a data structure that stores meta-data, i.e. data about data. The term "data dictionary" has several uses. Most generally it is a set of data descriptions that can be shared by several applications. Usually it means a table in a database that stores the names, field types, lengths, and other characteristics of the fields in the database tables. |
A data model says what information is to be contained in a database, how the information will be used, and how the items in the database will be related to each other. (To combine and synthesize disparate data) | |
Simple table(s), hierarchical, 3rd Normal Form. “One of the most widely used methods for developing data models is the entity-relationship model. The relational model is the most widely used type of data model. Another example is NIAM.” |
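To make "3rd Normal Form" a little more concrete, here is a minimal sketch using Python's sqlite3 module. The ship, cruise and station tables and their contents are invented for illustration. The point is only that each fact (for example, a ship's name) is stored once and recovered through joins, not repeated in every station row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, for illustration only.
conn.executescript("""
CREATE TABLE ship (ship_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE cruise (
    cruise_id TEXT PRIMARY KEY,
    ship_id INTEGER NOT NULL REFERENCES ship(ship_id)
);
CREATE TABLE station (
    station_id INTEGER PRIMARY KEY,
    cruise_id TEXT NOT NULL REFERENCES cruise(cruise_id),
    lat REAL,
    lon REAL
);
INSERT INTO ship VALUES (1, 'Example Vessel');
INSERT INTO cruise VALUES ('C-01', 1);
INSERT INTO station VALUES (1, 'C-01', 43.6, -70.2);
INSERT INTO station VALUES (2, 'C-01', 43.7, -70.1);
""")

# The ship name lives in one row; a join attaches it to each station.
rows = conn.execute("""
SELECT s.station_id, c.cruise_id, sh.name
FROM station AS s
JOIN cruise AS c ON s.cruise_id = c.cruise_id
JOIN ship AS sh ON c.ship_id = sh.ship_id
ORDER BY s.station_id
""").fetchall()
```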
Scientific Information Model for distributed data in relational databases and web links established at the NOAA Pacific Marine Environmental Lab (Wright et al., 1997) | |
See also Data Management for Marine Geology and Geophysics: Tools for Archiving, Analysis, and Visualization (2001) and Wright’s other references |
Data about data. | |
For example, meta-data would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about the data itself (where it is located, how it is associated, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data. |
Pros – required for full understanding of data within a DBMS. Required if others want to use the data | |
Cons – pain in the neck to prepare, maintain, and enter |
Does it fit the current needs? Access; ease of use; speed requirements; interface to applications; scalability; | |
Can it evolve? Add new data; accommodate new requirements and applications | |
Cost; platform dependencies; care and feeding costs (people and $) |
Is it typically easy to transfer from a database into someone’s own database using different software? No. But it is usually “do-able.” | |
Are there steps we should be taking to make sure it’s possible? Yes. Follow standards where they exist, or otherwise whatever looks good (DIF, OBIS, FGDC, DODS->NVODS->OpenDap, digir, ….) | |
FGDC is very complex; for meta-data only |
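One "do-able" transfer path is a flat file. The sketch below, using only Python's standard csv and sqlite3 modules, exports a table to CSV and loads it into a second database. The station table and the helper names export_table and import_table are hypothetical. This is one possible approach, not a description of how any of the systems named here work.

```python
import csv
import io
import sqlite3

def export_table(conn, table, out):
    """Write a header row of column names, then every row of the table, as CSV."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)

def import_table(conn, table, src):
    """Create an untyped table from the CSV header and load the remaining rows."""
    reader = csv.reader(src)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)

# Hypothetical source database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE station (station_id TEXT, lat REAL)")
source.executemany("INSERT INTO station VALUES (?, ?)", [("ST001", 43.6), ("ST002", 43.7)])

flat_file = io.StringIO()  # stands in for a file on disk
export_table(source, "station", flat_file)
flat_file.seek(0)

target = sqlite3.connect(":memory:")
import_table(target, "station", flat_file)
rows = target.execute("SELECT * FROM station ORDER BY station_id").fetchall()
```

In this sketch the latitudes arrive in the target database as text, because a CSV file carries column names but no types. That loss is the kind of gap that meta-data standards are meant to close.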
GIS – looking at geographically attributed data
MapServer – public domain version of ArcInfo | |
GMT – text files as a GIS at some level | |
UIRSA – public domain | |
MapInfo – low end | |
Strategic GIS | |
ArcInfo/ArcGIS – ESRI | |
EASy – “no analysis tools, just serving data” | |
Delorme (ME); GeoZUI3D (UNH); JGOFS (!) | |
Learning curve: lots of options, capabilities |
Other Efforts and Alphabet Soup
LabNet – consortium of marine organizations to make their data available (uses 4D Geobrowser “index cards”) | |
NetCDF (self documenting format liked by DODS, and others) | |
OBIS – “portal” (aggregation server) for biological data (using Darwin Core 2 – OBIS) |
ZOPE – object oriented application server | |
PLONE – another tool | |
Ocean Dataview (ODV) – CVS, Matlab, ODV (1 and 2) | |
JGViews – Needs JRE software | |
LAS – web-based, active-image based data interface for registered data |
uBio – (Universal Biological Indexer and Organizer) a networked information service for biological information resources based on the Taxonomic Name Server (TNS), a thesaurus | |
A system for indexing data that is associated with living or once living organisms |
Hexacoral – biggest user in OBIS; uses digir | |
Digir – uses an XML-based protocol to get the data and extends XML to express queries. Uses a PHP software package to execute the code. Supports 14 or 15 databases, e.g., SQL-based ones. Two options for JGOFS: export to a flat file, or write our own Perl script to interface directly with digir (ZooGene -> OBIS) |
Oregon State University, Randy Keller and Paul Johnson, mapping specialist at HMRG | |
Steve Hankin, “An Implementation Plan for the Data and Communication Subsystem of the U.S. Integrated Ocean Observing System” | |
Margo Edwards at HIG and Dawn Wright at OSU |
Protocols: RIDGE, petrological data. Endeavor Observatory website, Lamont’s PetDB | |
SIO Ocean Exploration data portal, http://sioexplorer.ucsd.edu | |
University of Washington’s Endeavor GIS and Portal to Endeavor Data (PED) |
Virtual Research Vessel, University of Oregon and Oregon State University | |
REVEL, University of Washington | |
Dive and Discover, WHOI |
Portal for biological data using Darwin Core 2 as modified by OBIS | |
Difficulties with attribution and with handling duplicate data (adding a date last modified for each datum has been suggested) | |
DODS (Cornillon) -> http; likes NetCDF format | |
Digir -> uses XML; but too verbose for physical data. OBIS will use DODS instead | |
JGOFS -> http |
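The "date last modified" suggestion for handling duplicate data can be sketched as follows. The record layout (record_id, taxon, last_modified), the provider lists and the function name merge_latest are all hypothetical. This shows one way an aggregator might use such a date, not how OBIS actually does it.

```python
from datetime import date

def merge_latest(*sources):
    """Combine records from several providers, keeping the newest copy of each ID."""
    latest = {}
    for source in sources:
        for record in source:
            key = record["record_id"]
            kept = latest.get(key)
            if kept is None or record["last_modified"] > kept["last_modified"]:
                latest[key] = record
    return [latest[key] for key in sorted(latest)]

# Made-up records from two hypothetical providers; "obs-2" appears in both.
provider_a = [
    {"record_id": "obs-1", "taxon": "Calanus finmarchicus", "last_modified": date(2003, 1, 15)},
    {"record_id": "obs-2", "taxon": "Calanus", "last_modified": date(2002, 6, 1)},
]
provider_b = [
    {"record_id": "obs-2", "taxon": "Calanus finmarchicus", "last_modified": date(2003, 3, 2)},
]
merged = merge_latest(provider_a, provider_b)
```

This only works if providers share a stable record identifier. Without one, duplicates cannot be recognized in the first place, which is part of the attribution difficulty noted above.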
My apologies for any references I’ve missed. | |
There are many other efforts underway in all these areas. |