Instructions for adding data

Data Organization

Some guidelines, based on the recommendations of the U.S. GLOBEC Data Policy report and suggestions from G. Flierl and others, describe how to organize your data for the data management system. The main strength of the JGOFS software system is that it can accommodate almost any kind of data with a suitably written method (program). However, when several people collect similar data sets, following common guidelines makes it easier for others to retrieve the data for subsequent review and analysis.

These guidelines are as follows:

For more information or when you are ready to add data, contact the Data Management Office.

Nitty Gritty Details

To add data to the JGOFS distributed database system used by the U.S. GLOBEC Georges Bank Program, take the following steps:

  1. Decide where data will reside, either on your own server or the GLOBEC server.
  2. Decide from which program type (broad-scale, modeling, mooring, process, or satellite) it will be served. This will determine which .remoteobjects file will reference your data. (See more about this below.)
  3. Create an entry for the .remoteobjects file as follows:

    where data_id_a and data_id_b are names identifying the specific data you are referencing. The two names may be the same but need not be. However, data_id_a must be unique among the object names in the same directory. Note that a future release of the JGOFS software will require data_id_a and data_id_b to be the same in order to take advantage of new capabilities.

    "Server.computer.name" is the internet name of the computer system serving the data such as mycomputer.univ.edu.

    "Directory_path" is the path to the .objects file beginning from the "object root" (defined for each data server). (For example the "object root" might look like /data/objects/ and the "Directory_path" sits below this level.)

    SI_name is the name of the science investigator responsible for the data. Limit the "Brief description" of the object to about 40 characters or fewer; an optional URL can be included if you want extended text, or even images, referenced here.

    The format of these lines is important: the dash (-) must be the first character on the second and third lines, and each dash must be followed by a space.

    (There can be qualifiers on the remote name as well but these are not discussed here.)

  4. Pass this information along to the U.S. GLOBEC Georges Bank office so that the central database directory can be updated and the data made available to everyone. Until you complete this step, only people who know that the data exist and know the name (data_id_a) will be able to access the data.
  5. The .objects file contains an entry for each file (or object) you will be serving from your machine to the user community.

    The .objects file resides only on the system which serves the data. This permits the data contributor to move the data to another location or change the data file name without affecting other users of the database.

    If you will be serving the data yourself, create an entry for the .objects file in the following form:

  6. Create the information necessary for the descript.html file. This file contains information about all the data referenced in the .remoteobjects file. Provide the Data Management Office with this information so it can be entered into the system. The information should be sent in plain text (sending via e-mail is fine). A sample of such a file follows.

    eventlog
    < /h1 > [Note: Only the closing tag is necessary to highlight the name.]
    
    PI:       A. Smith
    Dataset:  CTD data
    Ship:     R/V Albatross IV
    Cruise:   9312-IV

    Parameter     Description                      Units
    ----------------------------------------------------------
    eventno       event/operation number           nd
    instrument    instrument used to collect data  nd
    cast          cast number                      nd
    station       station number                   nd
    .
    .
    lat           latitude                         decimal degrees
    lon           longitude                        decimal degrees
    wdepth        depth of the water               meters
    depth         depth of the cast                meters
    
    eventlog
    
    

    This information is repeated for each object defined in the .remoteobjects file, with each description bracketed by the name of the object. The specific formatting of the information can vary. This information is made available when a person clicks on the highlighted third entry in the .remoteobjects file.
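The dash rule for .remoteobjects entries in step 3 (a dash as the first character of the second and third lines, followed by a space) is easy to get wrong when editing by hand. The following is a minimal Python sketch of a check for that one rule only. The function name check_dash_lines and the placeholder entry text are hypothetical, and the real syntax of the first line is not checked because it is not reproduced here.

```python
from typing import List


def check_dash_lines(entry: str) -> List[str]:
    """Return a list of problems with the dash rule for one entry.

    Only the rule stated in step 3 is checked: lines 2 and 3 of the
    entry must begin with a dash followed by a space.
    """
    lines = entry.splitlines()
    problems = []
    if len(lines) < 3:
        problems.append("entry has fewer than three lines")
        return problems
    for number in (2, 3):
        line = lines[number - 1]
        if not line.startswith("- "):
            problems.append(
                f"line {number} must start with a dash followed by a space"
            )
    return problems


# Placeholder text only (hypothetical): not a real .remoteobjects entry.
good = "FIRST LINE PLACEHOLDER\n- second line placeholder\n- third line placeholder"
bad = "FIRST LINE PLACEHOLDER\n -second line placeholder\n-third line placeholder"

print(check_dash_lines(good))
print(check_dash_lines(bad))
```

An empty list means the entry satisfies the dash rule; it says nothing about whether the rest of the entry is valid.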

Note 1: If the above instructions were not clear, click here for another version of these instructions.

Note 2: The JGOFS/GLOBEC system is pretty flexible in how it deals with actual data values. "Data" are typically ASCII text, either numbers or text, although "data" can also be URL links to images and movies. We have found that, with some precautions, the following symbols can be used in the data:

? . : / * - + [ ] { } ' ~ @ ^

However, the following symbols should not be used in the data:

commas, tabs, or any of the following: ; ( ) \ " ` ! # $ % & = < >
A semicolon appearing in the data can cause problems when selections are specified.
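Before submitting data, it may help to scan the values for the symbols listed above as ones to avoid. The following is a minimal Python sketch of such a scan. The names FORBIDDEN and find_forbidden are hypothetical and are not part of the JGOFS software; the character set is copied from the list in Note 2.

```python
# Characters Note 2 says should not be used in the data:
# comma, tab, and ; ( ) \ " ` ! # $ % & = < >
FORBIDDEN = set(',\t;()\\"`!#$%&=<>')


def find_forbidden(value: str) -> list:
    """Return the sorted, de-duplicated forbidden characters in value."""
    return sorted({ch for ch in value if ch in FORBIDDEN})


# Sample values taken from the descript.html example, plus one bad value.
for value in ["R/V Albatross IV", "9312-IV", "lat=41.5;lon"]:
    found = find_forbidden(value)
    status = "ok" if not found else f"contains {found}"
    print(f"{value!r}: {status}")
```

The symbols Note 2 allows "with some precautions" are deliberately not flagged, so a clean result from this scan does not guarantee that a value will be handled without trouble.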


Last modified: September 4, 2009