Real-Time Data Assimilation
The Real-Time Data Assimilation (RTDA) project for Phase 3 USGLOBEC-Georges
Bank was discussed at length at the SI meeting at UNH, Sept. 8 through
12, 1998. Participants included
-
Cabell Davis, Charles Hannah, Craig Lewis, Gregory Lough, Daniel
Lynch, James Manning, Dennis McGillicuddy, Christopher Naimie, John
Quinlan, and Francisco Werner
Discussions over the course of the workshop included refinement of
objectives, development of a preliminary data assimilation protocol,
discussions with the observational PI's about shipboard experimental
planning, and software distribution and testing.
Hypotheses
Three interrelated hypotheses have been articulated:
- Real-Time DA can improve ocean sampling: Water, Dye, Planktonic Animals
- A practical nowcast/forecast system can be constructed from
conventional and novel instruments, and delivered to shipboard
scientists. (Practical = better than climatology and fast)
- A 3-D forecast is better than 1-D observation i.e. direct use of data
(example dye forecast by progressive vector diagram from ADCP at site of release).
Fundamental is the recognition that
- there are imperfections in all models, in all data, and in all sampling
plans; and
- an optimal solution to ocean state estimation and
forecasting is infeasible today
The emphasis of the project is therefore on developing a practical
procedure which can be implemented in today's technology; and defining
its limits.
Targets
The goal is to produce useful information about the movement of
- Water
- Dye
- Larval Fish
- Zooplankton
in both Nowcast and Forecast modes. A fundamental constraint is that
interactive display and computational capability is to be delivered to
scientists at sea, in order to allow for real-time adjustment of
experimental protocol. Due to present bandwith limitations, this
requires the concentration of the computational talent and hardware on
ship, with shore-based support.
Data
Data expected to be available for assimilation were reviewed:
- Hydrography
- CTD
- SST
- Underway T,S
- Atmospheric
- Wind - Measured
- Wind - Forecast
- Heat Flux
- Velocity
- ADCP
- Drifters
- Moored Timeseries (T,S,V,P)
- Experimental
- VPR (Calanus/Pseudocalanus, plus T,S)
- Dye/Tracer
A Data
Inventory was initiated via conversations with other investigators,
covering frequency of
observation, delay, the nature of the data product(s), type of quality assurance,
and processing required between data product and model.
QA by data providers will be essential and may limit data availability.It was discovered that moored
velocity timeseries are unlikely to be available in real-time. Opportunistic
reliance on drifters (which are available R-T) and other ships' ADCP data (more problematic
from a network perspecitve) will be necessary to provide more than local velocity sampling.
Data Processing
A key activity in the next several months will be the definition of file standards. Every
data product will require a standard format for presentation of the space-time
location of the data, averaging performed, the data itself,
and some informaton about the expected error:
- (t,x,y,z, .... , variance)
Software required includes
- Readers and Writers for all standard data files, to/from Matlab and Fortran
- Objective Analysis (Optimal Interpolation) -- the BIO package OAX is the
project standard for this task
- Kriging -- A new Matlab-based library with graphical user interface was
introduced by Dezhang Chu. This alternative requires a simple extension to
handle unstructured estimation points (the .nod file). This effort was
launched during the meeting. This software is available via
anonymous ftp;
see attached email
from Chu.
A recognized problem is space-time dealiasing of (x,y,z,t)
observations of fields (e.g. CTD, VPR, tracer) which are transported in
the tidal flow. The present strategy is to assimilate these
observations as synoptic Initial Conditions. There is no formal
assimilation strategy for these fields which accounts for the
non-synopticity. The implied assumptions are: either the phenomena
represented are slowly evolving and therefore the errors are small;
and/or the insitu modeled tidal-time transport processes will correct
the sampling deficiency. These assumptions need to be revisited
routinely and will be critically examined during the planned pre-cruise OSSE's.
Models
A list of models we anticipate being used follows:
- Physics
- Fundy5 -- Complete, mature
- Quoddy4 -- Needs additional work on the advection
algorithm
- Saco -- Linearized version of Quoddy4 (BC's only); a first
realization has been tested
- Truxton -- Inverse of Fundy; complete, needs work on I/O
modules i.e. User Subroutines to handle data flow.
- Moody -- Inverse of Saco; in design stage
- Casco (Saco/Moody iterative combo) -- in design stage
- Coupled Physical/Biological
- Drog3DDT (clouds of particles) -- complete, mature
- Trophodynamic IBM -- complete, under regular revision for
research purposes; needs a few standard versions
- Vineyard (Miller Vector-Based IBM) -- complete in 2-D case
- Acadia2D -- complete, reasonably mature
- Scotia (Acadia adjoint) -- complete; operational nature not
clear
Post-Processing
Most models are configured to accept(demand) Fortran source code in the
form of user-built subroutines which conform to a standard
specification. Example: subroutines OUTPUT in Fundy, Quoddy, Truxton.
There is a need to collect a library of useful subroutines which can be
exercised from these subroutines. Example: compile a report on velocity
across a given transect at a certain time. Many such utilities have
been used in the Quoddy Users Group; they need to be assembled.
Obtaining tidal constituents from simulations has historically been
done with Fourier Transforms if forcing is simple M2 and residual. For
this project, we will be using multi-constituent simulations (standard
spectrum must include at least M2, N2, S2, O1, K1 plus compound tides.
Fourier Analysis will not work; a least squares fit is needed. The
Foreman software will be the project standard. (See QUG homepage).
Advanced versions of this problem for data assimilation involve
sampling and averaging simulation output and comparing with
observations. Some software has been built for this during the Truxton
development. It will need to be developed and generalized.
Display
The project standard will be MATLAB. A library of scripts will be
assembled. As in the QUG, economy of effort is critical to success.
Procedure
At sea, we anticipate daily updates to the hindcast/nowcast/forecast
circulation estimates on a limited-area domain, roughly to the 150m
isobath.
The hindcast will be arrived at as an inverse calculation, fitting a
forward simulation model to observations. The forward simulation
(Quoddy) will be initialized from observed broad-scale
hydrography and/or climatology, and forced by observed wind and heating and
best-estimate (prior) boundary conditions. The forward model will be
sampled and the modeled velocities compared with observations. These
velocity discrepancies (errors) will drive inverse model(s) (Truxton, Casco)
which deduce improvements in the open-water boundary conditions which
reduce the discrepancies.
The key unknowns in the hindcast are the open-water boundary conditions.
They will be expressed and assembled as the sum of several parts:
- a prior estimate including tidal and subtidal components
(Climatology or inversion of a previous Broad Scale
Survey); plus
- a tidal correction (Truxton); plus
- a shelf-scale wind-band correction
(Q4 or Saco on G2S with regional wind); plus
- refined wind response + oceanic + other (Casco); plus
- random noise
The wind-band correction will be obtained from a regional (including
GoM and Scotian Shelf) simulation forced with wide-area winds and Halifax
Sea Level. This model will run in a continuous simulation mode,
assimilating only observed regional wind as a forcing function. The
essential idea is to approximate the shelf-scale pressure response to wind.
Imperfections in this estimate will be corrected with the Casco inversion.
The nowcast will be the terminal condition of the hindcast.
The forecast system will be initialized from the nowcast, and run
forward with forecasted atmospheric interaction from the internet and
persistent tidal and subtidal boundary condition perturbations. There may
be a need to develop a forecast tool (e.g. a linear regression) for the
shelf-scale wind response to complement the forecast wind stress.
Approximately one-day turnaround is the target,i.e. data assimilation will
be lagged by one day from its availability.
At-Sea Timeline
An approximate time-line from pre-cruise preparation through at-sea
experimentation follows
Time (Days) |
Activity |
Computational Product |
-28 |
BSS Start |
Prior Estimate |
-14 |
BSS End |
Hindcast #1 |
-5 |
Update with SST |
Hindcast #2 |
-1 |
Leave Port |
Simulation Forecast Mode |
0 |
Arrive GB,Begin Survey of Front |
Simulation Forecast Mode |
3 |
Survey Complete |
Hindcast #3 |
3-14 |
Conduct Experiments |
Daily Update: Hindcast, Nowcast, Forecast |
14 |
Leave Bank |
Archive all results |
Throughout this timeline, we assume continuous availability of the
following data streams, updated in near-real-time according to the data
inventory:
- Drifter Locations
- Heat Flux
- Wind Stress
- SST
- Moorings
- Boundary Pressure
The last item is a a conputational data product which we will produce
ourselves. It is an approximate contribution to the Georges Bank (mesh
Bank150) boundary pressure due to the full-shelf wind response. It is
obtained from a continuous shelf-scale simulation with regional atmospheric
forcing as described above.
Overall this strategy implies four separate compuational tasks, running
simultaneously:
- Continuous Wind-driven Simulation on the Shelf-Scale mesh
- Forecast Simulation for the Bank-Scale mesh
- Hindcast Inversions
- Postprocessing of the archived Hindcasts and Forecasts.
The latter task will require an hourly archive of the latest products, plus a
browser which needs to be built. Matlab is the likely platform. A Bank150
.inq file requires ~32MB ASCII, or ~4MB binary.
It is likely that as many as four dedicated CPU's will need to be devoted to the task.
Software Exercises: TRUXTON1
In the latter half of the meeting the group worked with the TRUXTON software.
This software inverts velocity data to deduce harmonic open-water boundary
conditions. It defines a velocity input file, '.m3d', which is a project
standard for recording velocity observations and/or errors. It assumes point
measurements in(x,y,t) but partial vertical averaging. In the course of
discussing this file standard, it became clear that a project standard for
(x,y,t) coordinates is necessary. Horizontal locations will be Cartesian,
referred to the Boston tide gage with the Dartmouth standard Mercator
projection. Time will be reported in UTC decimal days.
TRUXTON is archived at the Dartmouth web server,
http://www-nml.dartmouth.edu/Software/truxton/. Included there is a draft
manual (very sparse overheads) and the software and data files required to
invert the ADCP from EN265, as reported in Lynch et al 1998 ("Hindcasting... Part I:
Detiding").
(Long
postscript document here.)
Two additional test cases were created during the meeting. Both were original
ADCP observations provided by Craig Lee from SeaSoar cruises. These cases
illustrate the conversion of foreign ADCP data to the .m3d file format, and its
subequent Truxton inversion. These cases were distributed and will be archived
with the TRUXTON software.
There is a need to standardize three TRUXTON-related modules:
- ADCP-->Truxton: create an .m3d file from incoming ADCP data, multiple
sources.
- Quoddy-->Truxton: create an .m3d file representing the difference between
a Quoddy simulation and an existing .m3d file containing ADCP data; also
create friction files (viscosity, bottom stress coefficient) representing
linearization of Quoddy turbulence regime for Truxton use.
- Truxton-->Quoddy: create input to Quoddy's Subroutine BC to allow the
resimulation via Quoddy with Truxton-estimated Boundary Condition supplements.
The first module is self explanatory. The second module is necessary to
subtract a prior estimate based on simulation from ADCP data and prepare for
Truxton inversion. The third module is needed to operate Quoddy/Truxton
iteratively, in order to make a nonlinear inversion.
Instances of these modules exist from the Truxton development work:
- reading an m3d or o3d file
- ~channah/public_html/da_results/src/read_obs/read_obs.f
- ~channah/matlab/read_obs.m
- writing an m3d or o3d file
- ~channah/matlab/write_obs.m
- unix script to throw out the m3d header and write the data to standard output
- ~channah/public_html/da_results/src/cat_m3d
These modules will be standardized in the coming months, and a nonlinear
iteration test case will be created and put in the Truxton archives. A goal
for all seagoing PI's is to achieve fluency with the Quoddy/Truxton interface,
sufficient to perform a nonlinear inversion, before 25 November.
Budget
Below we summarize the Phase 3 budget history from proposal to present.
Included is modeling project (Lynch et al) as well as the three
experimental projects which were hosting the real-time modeling. There
are 4 levels of refinement:
- Proposed: as proposed to NSF, Dec. 15 1997
- Taylor Memo: the outcome of the Panel and Mail Reviews. The
program totals at this point were over budget and this memo was
the basis for further PI reductions.
- 20 July: the outcome of voluntary reductions at the 20 July PI
meeting at WHOI. Not all PI's attended this meeting. The
outcome was a partial reduction of the budget gap, with a
recommendation that absent PI's cut 14% in year 2 on average.
If that were done, then the program would be in balance in
years 1 and 2 and have a 6% surplus in year 3.
- Recommended: This budget is shown for the modeling effort
only. It reflects the PI-recommended 14% cut in year 2 and 6%
increase in year 3. These adjustments pertain to
the Dartmouth, WHOI, and NMFS portions of the modeling; the
UNC budget had already been cut 20% in year 2 at the 20 July
meeting.
Yr 1 Yr 2 Yr 3
Lough et al.
Proposed 597. 467. 468.
Taylor Memo 481. 413. 395.
20 July 474. 407. 395.
Ledwell
Proposed 181. 223. 136.
Taylor Memo 339. 181. 125.
20 July 312. 170. 125.
Davis
Proposed 245. 156. 164.
Taylor Memo 50. 50. 50.
20 July 50. 50. 50.
Lynch, Werner, McGillicuddy, Lough
Proposed 473. 359. 336.
Taylor Memo 18. 359. 336.
20 July 18. 345. 336.
Recommended 18. 305. 356.
It was felt that a limited Real-Time project could be mounted with this
budget limitation, provided that Phase 2 monies were temporarily
reprogrammed to cover the year 1 expense implied by the above
workplan. (Without a year 1 activity, the project has no real-time
content and would revert to pure hindcasting, missing a major
opportunity for GLOBEC.) Phase 2 activities would be deferred to
years 2 and 3 of Phase 3. It was not considered desirable to abandon
any of the present Phase 2 goals, but their delay was considered
acceptable.
Key problems identified were a) the necessity of purchasing computer
power during year 1; and b) salary support for PI C. Davis.
The original budget
submission presumed that Davis' modeling time would be supported under the
separate VPR proposal. However, the level at which the VPR proposal is
presently supported is only sufficient for data acquisition. It is therefore
critical to restore this salary line under the WHOI budget to facilitate
his participation in the modeling activities.
The breakdown of the Recommended Budget, by institution, is then as
follows:
Lynch et al: Yr 1 Yr 2 Yr 3
Dartmouth 0. 148,046. 221,384.
UNC 0. 70,004. 51,282.
WHOI 0. 67,840. 68,424.
NMFS 18,000. 19,110. 14,910.
Total 18,000. 305,000. 356,000.
Key allocations for years 2 and 3:
- NMFS as proposed;
- WHOI as proposed
with additional 30K/year to support C. Davis;
- UNC as proposed;
- Dartmouth cut by 36% year 2, 5% year 3. The allocation among
years 2 and 3 will be smoothed by the
reprogramming of phase 2 expenditures.
Combined with the 100% reductions in year 1, we arrive at an overall
project cut of 42%, from 1,168,000. proposed to 679,000. This
achieves the stated program goals re voluntary reductions
which were announced at the 20 July PI meeting at WHOI.
Next Meeting and Work Assignments
Assuming the above funding is authorized, the next meeting will
occur (tentatively) at the North Carolina Supercomputing Center
(UNC-Chapel Hill) with Prof. Werner hosting. Target meeting time
is late fall -- between Nov 25 and Dec 25, 1998. It is
essential that all seagoing PI's be in attendance at this meeting.
Work assignments between now and then:
- General
- Lynch: establish/maintain a project website
- Lynch: report plans to Phil Taylor and Beth Turner; coordinate funding discussions
- Lewis: coordinate all entries to project archives on the www-nml site
- Lewis: initiate and maintain a Project Standards document (supplement to
the NML GoM standards). Include (x,y,z,t) conventions, filenames, etc.
- Manning: coordinate all discussions with data providers; maintain the data
inventory
- Manning, Lough: investigate the networking capability, especially
ship-to-ship; keep the Program Services Office appraised of plans,
problems
- Davis, Lough: coordinate ship scheduling; liason with Ledwell, Houghton.
- Simulation and Forecasting:
- ALL: develop/maintain fluency with QUODDY4 and DROG3Ddt
- Lynch: rework Smagorinsky routines
- Werner, Blanton: standardize the turbulent kick feature in DROG software
- Werner, Quinlan: package a few standard trophdynamics models for DROG use.
- Davis, Manning: simulate a dye release with Q4, climatology, and a cloud of
passive particles with turbulent kicks
- Manning, McGillicuddy: investigate real-time forecast wind products and
their interface to Q4.
- Lynch: examine strategy for shelf-scale wind-band simulation.
- Inversion, Data Assimilation
- TRUXTON
- ALL: achieve operational capability with T1 alone and iteratively with Q4;
implied is full familiarity with the ADCP data processing. By next meeting,
Truxton is behind us.
- Werner, Lewis: archive the three Truxton cases discussed at the UNH
meeting
- Naimie: distrubute sampler.f for use with Q4 subroutine output. This is
the Quoddy-->Truxton interface via .m3d file.
- Lewis: refine/distribute Truxton-->Quoddy software (via subroutine BC).
This inserts BC perturbations into Quoddy.
- Lewis: refine/distribute ADCP-->Truxton (.m3d) software; prepare for multiple
ADCP formats
- Werner: compare archived software T1 with beta-version distributed at UNH;
report on same; make sure all are using T1 from website.
- Werner: upgrade the T1 Users' Manual.
- Lewis: define iterative Q4/T1 test problem; archive, distribute.
- OA
- Manning, Lough, Davis: become fluent with Kriging software; interact with Chu.
- McGillicuddy, Manning, Lough: consider fearture model for shelf/slope front
and procedure for its assimilation
- McGillicuddy, Davis: OA procedure for VPR data
- Naimie, Lynch, Lewis: Software package to go from CTD data to initial hydrographic
data by OAXing difference from climatology.
Must deal with
how to sample the climatology
in time, and
mean offset.
- CASCO
- Lynch/Hannah/Naimie: develop to beta-level
- Hannah: develop new mesh > Bank150
- Lynch: distribute adjoint notes in advance of next meeting
- Archival Browsing & Postprocessing
- Lewis: define archival strategy
- Naimie: adapt Q4 sampling software to sample archives (Strategy: subroutine
output is the gateway. An ARCHIVE READER interpolates to given times and calls
output. Output makes use of standard sampling routines. Essentially, READER
takes the place of Q4.
- Blanton: finalize upgrade of FEDAR to current Matlab edition (5.2).
- Next Meeting
- McGillicuddy, Manning, Davis: coordinate design of OSSE's (including VPR)
- Lynch: coordinate agenda
- Werner: coordinate logistics
Compiled by DRL, 21 Sept 98.
Back to RTDA home page.