Real-Time Data Assimilation

The Real-Time Data Assimilation (RTDA) project for Phase 3 of U.S. GLOBEC Georges Bank was discussed at length at the SI meeting at UNH, Sept. 8 through 12, 1998. Participants included
Cabell Davis, Charles Hannah, Craig Lewis, Gregory Lough, Daniel Lynch, James Manning, Dennis McGillicuddy, Christopher Naimie, John Quinlan, and Francisco Werner.
Discussions over the course of the workshop included refinement of objectives, development of a preliminary data assimilation protocol, discussions with the observational PIs about shipboard experimental planning, and software distribution and testing.

Hypotheses

Three interrelated hypotheses have been articulated. The emphasis of the project is on developing a practical procedure which can be implemented with today's technology, and on defining its limits.

Targets

The goal is to produce useful information about the movement of water and organisms in both Nowcast and Forecast modes. A fundamental constraint is that interactive display and computational capability must be delivered to scientists at sea, in order to allow real-time adjustment of the experimental protocol. Given present bandwidth limitations, this requires concentrating the computational talent and hardware aboard ship, with shore-based support.

Data

Data expected to be available for assimilation were reviewed:
Hydrography
	CTD
	SST
	Underway T,S
Atmospheric
	Wind - Measured
	Wind - Forecast
	Heat Flux
Velocity
	ADCP
	Drifters
	Moored Timeseries (T,S,V,P)
Experimental
	VPR (Calanus/Pseudocalanus, plus T,S)
	Dye/Tracer
A Data Inventory was initiated via conversations with other investigators, covering frequency of observation, delay, the nature of the data product(s), the type of quality assurance, and the processing required between data product and model. QA by the data providers will be essential and may limit data availability. It was discovered that moored velocity timeseries are unlikely to be available in real time. Opportunistic reliance on drifters (which are available in real time) and on other ships' ADCP data (more problematic from a network perspective) will be necessary to provide more than local velocity sampling.

Data Processing

A key activity in the next several months will be the definition of file standards. Every data product will require a standard format for presentation of the space-time location of the data, the averaging performed, the data itself, and some information about the expected error:
(t,x,y,z, .... , variance)
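
As a minimal sketch of what one such record might look like in MATLAB (the field list and formats here are assumptions; the actual standard remains to be defined):

    % Minimal sketch of one record in the anticipated standard form.
    % The fields (t, x, y, z, value, variance) are placeholders.
    rec = [245.52083, 12500.0, -34200.0, -15.0, 0.312, 0.004];
    fid = fopen('obs.dat', 'a');              % append to the data file
    fprintf(fid, '%12.5f %12.1f %12.1f %9.2f %10.4f %10.5f\n', rec);
    fclose(fid);
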
A recognized problem is the space-time dealiasing of (x,y,z,t) observations of fields (e.g. CTD, VPR, tracer) which are transported in the tidal flow. The present strategy is to assimilate these observations as synoptic Initial Conditions; there is no formal assimilation strategy for these fields which accounts for the non-synopticity. The implied assumptions are that either the phenomena represented are slowly evolving, so that the errors are small, or that the in situ modeled tidal-time transport processes will correct the sampling deficiency. These assumptions need to be revisited routinely and will be critically examined during the planned pre-cruise OSSEs.

Models

A list of the models we anticipate using follows:

Post-Processing

Most models are configured to accept (indeed, demand) Fortran source code in the form of user-built subroutines which conform to a standard specification; examples are the OUTPUT subroutines in Fundy, Quoddy, and Truxton. There is a need to collect a library of useful routines which can be exercised from these subroutines -- for example, compiling a report on the velocity across a given transect at a certain time (a sketch follows). Many such utilities have been used in the Quoddy Users Group; they need to be assembled.
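
As an illustration of the kind of utility envisioned, here is a hypothetical MATLAB sketch of a transect report. The gridded field below is fake; the real utilities would operate on the unstructured finite-element mesh:

    % Hypothetical sketch: report the velocity across a transect at one
    % time. A fake gridded field stands in for model output.
    [x, y] = meshgrid(0:100, 0:100);            % km, illustrative grid
    u = 0.20 + 0.10*sin(x/10);                  % fake east velocity (m/s)
    v = 0.05*cos(y/15);                         % fake north velocity (m/s)
    xs = linspace(20, 80, 50);                  % transect endpoints are
    ys = linspace(30, 70, 50);                  %   assumptions
    ut = interp2(x, y, u, xs, ys);              % sample u on the transect
    vt = interp2(x, y, v, xs, ys);              % sample v on the transect
    dx = gradient(xs);  dy = gradient(ys);      % local transect direction
    un = (ut.*dy - vt.*dx) ./ hypot(dx, dy);    % cross-transect component
    fprintf('mean cross-transect velocity: %.3f m/s\n', mean(un));
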

Obtaining tidal constituents from simulations has historically been done with Fourier transforms when the forcing is simple M2 plus residual. For this project we will be using multi-constituent simulations (the standard spectrum must include at least M2, N2, S2, O1, and K1, plus compound tides), so Fourier analysis will not work; a least-squares fit is needed. The Foreman software will be the project standard (see the QUG homepage). A minimal illustration of the least-squares fit follows.
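
The sketch below fits a mean level plus cosine/sine pairs for five constituents to a synthetic record. The constituent periods are standard values; the actual project analysis will use the Foreman package, not this toy:

    % Sketch of least-squares harmonic analysis of a model time series.
    T = [12.4206; 12.6583; 12.0000; 25.8193; 23.9345];   % M2 N2 S2 O1 K1, hours
    omega = 2*pi ./ T;                                   % rad/hour
    t = (0:0.5:720)';                       % sample times (hours)
    u = 0.3 + 0.5*cos(omega(1)*t - 1.2) ... % synthetic "simulation" record
          + 0.1*cos(omega(3)*t);
    A = ones(numel(t), 1);                  % design matrix: mean term ...
    for k = 1:numel(omega)
        A = [A, cos(omega(k)*t), sin(omega(k)*t)];  % ... plus cos/sin pairs
    end
    c = A \ u;                              % least-squares solution
    amp   = hypot(c(2:2:end), c(3:2:end));  % constituent amplitudes
    phase = atan2(c(3:2:end), c(2:2:end));  % constituent phases (rad)
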

Advanced versions of this problem for data assimilation involve sampling and averaging simulation output and comparing it with observations. Some software for this was built during the Truxton development; it will need to be generalized and further developed.

Display

The project standard will be MATLAB. A library of scripts will be assembled. As in the QUG, economy of effort is critical to success.

Procedure

At sea, we anticipate daily updates to the hindcast/nowcast/forecast circulation estimates on a limited-area domain, roughly to the 150m isobath.

The hindcast will be arrived at via an inverse calculation, fitting a forward simulation model to observations. The forward simulation (Quoddy) will be initialized from observed broad-scale hydrography and/or climatology, and forced by observed wind and heating and by best-estimate (prior) boundary conditions. The forward model will be sampled and the modeled velocities compared with observations. These velocity discrepancies (errors) will drive inverse model(s) (Truxton, Casco) which deduce improvements in the open-water boundary conditions so as to reduce the discrepancies. A toy analogue of this iteration is sketched below.
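
In this MATLAB sketch a made-up linear map G stands in for the Quoddy forward model, and a regularized least-squares step stands in for the Truxton/Casco inversion; it is an analogue of the procedure, not the actual code:

    % Toy linear analogue of the hindcast inversion.
    rng(0);
    G      = randn(40, 5);                % forward map: 5 BC dof -> 40 obs
    bcTrue = [1; -0.5; 0.3; 0; 0.8];      % "true" boundary conditions
    uObs   = G*bcTrue + 0.05*randn(40,1); % observed velocities, with noise
    bc     = zeros(5, 1);                 % prior (first-guess) estimate
    for iter = 1:3                        % iteration matters when nonlinear
        uMod = G*bc;                              % forward simulation
        err  = uMod - uObs;                       % velocity misfit
        dbc  = -(G'*G + 1e-3*eye(5)) \ (G'*err);  % inverse: BC correction
        bc   = bc + dbc;                          % update boundary conditions
    end
    disp([bcTrue bc])                     % recovered vs. true BCs
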

The key unknowns in the hindcast are the open-water boundary conditions. They will be expressed and assembled as the sum of several parts.

The wind-band correction will be obtained from a regional (including GoM and Scotian Shelf) simulation forced with wide-area winds and Halifax Sea Level. This model will run in a continuous simulation mode, assimilating only observed regional wind as a forcing function. The essential idea is to approximate the shelf-scale pressure response to wind. Imperfections in this estimate will be corrected with the Casco inversion.

The nowcast will be the terminal condition of the hindcast.

The forecast system will be initialized from the nowcast and run forward with forecast atmospheric forcing obtained from the internet and with persisted tidal and subtidal boundary-condition perturbations. There may be a need to develop a forecast tool (e.g. a linear regression) for the shelf-scale wind response to complement the forecast wind stress; a sketch follows. Approximately one-day turnaround is the target, i.e. data assimilation will lag data availability by one day.
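
A minimal MATLAB sketch of such a regression tool, under the assumption that the continuous regional simulation provides an archive of wind stress and boundary pressure (the synthetic data below stand in for that archive):

    % Hypothetical regression forecast tool for the shelf-scale wind
    % response; all data here are synthetic.
    n    = 200;
    taux = randn(n,1);  tauy = randn(n,1);          % wind stress history
    pbdy = 0.8*taux - 0.3*tauy + 0.02*randn(n,1);   % simulated boundary pressure
    beta = [ones(n,1) taux tauy] \ pbdy;            % least-squares fit
    pFcst = [1, 0.5, -0.2] * beta;      % apply to a forecast wind stress
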

At-Sea Timeline

An approximate time-line from pre-cruise preparation through at-sea experimentation follows:

Time (days)  Activity                          Computational Product
  -28        BSS Start                         Prior Estimate
  -14        BSS End                           Hindcast #1
   -5        Update with SST                   Hindcast #2
   -1        Leave Port                        Simulation Forecast Mode
    0        Arrive GB, Begin Survey of Front  Simulation Forecast Mode
    3        Survey Complete                   Hindcast #3
 3-14        Conduct Experiments               Daily Update: Hindcast, Nowcast, Forecast
   14        Leave Bank                        Archive all results

Throughout this timeline, we assume continuous availability of the following data streams, updated in near-real-time according to the data inventory:

The last item is a computational data product which we will produce ourselves: an approximate contribution to the Georges Bank (mesh Bank150) boundary pressure due to the full-shelf wind response, obtained from a continuous shelf-scale simulation with regional atmospheric forcing as described above. Overall this strategy implies four separate computational tasks, running simultaneously. The last of these will require an hourly archive of the latest products, plus a browser which needs to be built; Matlab is the likely platform. A Bank150 .inq file requires ~32MB as ASCII, or ~4MB as binary.

It is likely that as many as four dedicated CPUs will need to be devoted to the task.



Software Exercises: TRUXTON1

In the latter half of the meeting the group worked with the TRUXTON software. This software inverts velocity data to deduce harmonic open-water boundary conditions. It defines a velocity input file, '.m3d', which is a project standard for recording velocity observations and/or errors. It assumes point measurements in (x,y,t) but allows partial vertical averaging. In the course of discussing this file standard, it became clear that a project standard for (x,y,t) coordinates is necessary. Horizontal locations will be Cartesian, referred to the Boston tide gage with the Dartmouth standard Mercator projection. Time will be reported in UTC decimal days. Illustrative sketches of both conventions follow.
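
The MATLAB sketch below shows a Mercator projection referenced to the Boston tide gage and a decimal-day time stamp. The gage coordinates, Earth radius, scaling, and decimal-day epoch are illustrative assumptions; the official Dartmouth projection parameters govern:

    % Sketch of a Mercator projection referenced to the Boston tide gage.
    R    = 6378137;                    % Earth radius (m), sphere assumed
    lat0 = 42.355;  lon0 = -71.05;     % Boston tide gage (deg), approximate
    d2r  = pi/180;
    lon  = -67.5;   lat  = 41.0;       % example point on Georges Bank
    x = R*cos(lat0*d2r) * (lon - lon0)*d2r;              % east (m)
    y = R*cos(lat0*d2r) * log(tan(pi/4 + lat*d2r/2) / ...
                              tan(pi/4 + lat0*d2r/2));   % north (m)
    % UTC decimal days: day-of-year plus day fraction (epoch assumed)
    tUTC = 251 + (14 + 30/60)/24;      % Sept 8, 1998, 14:30 UTC -> 251.6042
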

TRUXTON is archived at the Dartmouth web server, http://www-nml.dartmouth.edu/Software/truxton/. Included there are a draft manual (very sparse overheads) and the software and data files required to invert the ADCP data from EN265, as reported in Lynch et al. 1998 ("Hindcasting... Part I: Detiding"), available there as a long PostScript document.

Two additional test cases were created during the meeting. Both were built from original ADCP observations provided by Craig Lee from SeaSoar cruises. These cases illustrate the conversion of foreign ADCP data to the .m3d file format and its subsequent Truxton inversion. The cases were distributed and will be archived with the TRUXTON software.

There is a need to standardize three TRUXTON-related modules:

1. ADCP-->Truxton: create an .m3d file from incoming ADCP data, multiple sources.
2. Quoddy-->Truxton: create an .m3d file representing the difference between a Quoddy simulation and an existing .m3d file containing ADCP data; also create friction files (viscosity, bottom stress coefficient) representing a linearization of the Quoddy turbulence regime for Truxton use.
3. Truxton-->Quoddy: create input to Quoddy's Subroutine BC to allow resimulation via Quoddy with Truxton-estimated Boundary Condition supplements.
The first module is self-explanatory. The second is necessary to subtract a prior simulation-based estimate from ADCP data in preparation for Truxton inversion; a sketch of its core differencing step follows. The third is needed to operate Quoddy/Truxton iteratively, in order to perform a nonlinear inversion.
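
A hypothetical MATLAB sketch of the second module's differencing step. The column layout (t, x, y, ztop, zbot, u, v, variance) and file names are assumptions; the actual .m3d format governs, and headers must be handled separately:

    % Hypothetical Quoddy-->Truxton differencing step.
    obs = load('adcp.m3d.stripped');          % observations, header removed
    sim = load('quoddy_sampled.dat');         % simulation at the same points
    err = obs;
    err(:,6:7) = sim(:,6:7) - obs(:,6:7);     % model-minus-data velocity misfit
    save('misfit.m3d.stripped', 'err', '-ascii');  % input to Truxton inversion
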

Instances of these modules exist from the Truxton development work:

reading an m3d or o3d file
writing an m3d or o3d file
unix script to throw out the m3d header and write the data to standard output

These modules will be standardized in the coming months, and a nonlinear-iteration test case will be created and placed in the Truxton archives. A goal for all seagoing PIs is to achieve fluency with the Quoddy/Truxton interface, sufficient to perform a nonlinear inversion, before 25 November.



Budget

Below we summarize the Phase 3 budget history from proposal to present. Included are the modeling project (Lynch et al.) and the three experimental projects which will host the real-time modeling. There are four levels of refinement:

Proposed: as proposed to NSF, Dec. 15, 1997

Taylor Memo: the outcome of the Panel and Mail Reviews. The program totals at this point were over budget and this memo was the basis for further PI reductions.

20 July: the outcome of voluntary reductions at the 20 July PI meeting at WHOI. Not all PIs attended this meeting. The outcome was a partial reduction of the budget gap, with a recommendation that absent PIs cut 14% in year 2 on average. If that were done, the program would be in balance in years 1 and 2 and have a 6% surplus in year 3.

Recommended: This budget is shown for the modeling effort only. It reflects the PI-recommended 14% cut in year 2 and 6% increase in year 3. These adjustments pertain to the Dartmouth, WHOI, and NMFS portions of the modeling; the UNC budget had already been cut 20% in year 2 at the 20 July meeting.
			Yr 1	Yr 2	Yr 3	(amounts in $K)
		
Lough et al. 	
	Proposed	597.	467.	468.
	Taylor Memo	481.	413.	395.
	20 July		474.	407.	395.
	
Ledwell 
	Proposed	181.	223.	136.
	Taylor Memo	339.	181. 	125.
	20 July		312.	170.	125.
	
Davis
	Proposed	245.	156.	164.
	Taylor Memo	 50.	 50.	 50.
	20 July		 50.	 50.	 50.
	
Lynch, Werner, McGillicuddy, Lough
	Proposed	473.	359.	336.
	Taylor Memo	 18.	359.	336.
	20 July		 18.	345.	336.
	Recommended	 18.	305.	356.
It was felt that a limited Real-Time project could be mounted within this budget limitation, provided that Phase 2 monies were temporarily reprogrammed to cover the year 1 expense implied by the above workplan. (Without a year 1 activity, the project would have no real-time content and would revert to pure hindcasting, missing a major opportunity for GLOBEC.) Phase 2 activities would be deferred to years 2 and 3 of Phase 3. It was not considered desirable to abandon any of the present Phase 2 goals, but their delay was considered acceptable.

Key problems identified were (a) the necessity of purchasing computing power during year 1, and (b) salary support for PI C. Davis. The original budget submission presumed that Davis' modeling time would be supported under the separate VPR proposal. However, the VPR proposal is presently supported only at a level sufficient for data acquisition. It is therefore critical to restore this salary line under the WHOI budget to facilitate his participation in the modeling activities.

The breakdown of the Recommended Budget, by institution, is then as follows:

Lynch et al:	Yr 1		Yr 2		Yr 3

Dartmouth   	     0.		148,046.	221,384.       
UNC   		     0.     	 70,004.	 51,282.        
WHOI    	     0.	 	 67,840.	 68,424.
NMFS		18,000.		 19,110.	 14,910.    

Total 		18,000.		305,000.	356,000.      
Key allocations for years 2 and 3:
NMFS as proposed;
WHOI as proposed with additional 30K/year to support C. Davis;
UNC as proposed;
Dartmouth cut by 36% in year 2 and 5% in year 3. The allocation between years 2 and 3 will be smoothed by the reprogramming of Phase 2 expenditures.

Combined with the 100% reduction in year 1, we arrive at an overall project cut of 42%, from the 1,168,000 proposed to 679,000. This achieves the stated program goals regarding the voluntary reductions announced at the 20 July PI meeting at WHOI.



Next Meeting and Work Assignments

Assuming the above funding is authorized, the next meeting will occur (tentatively) at the North Carolina Supercomputing Center (UNC-Chapel Hill) with Prof. Werner hosting. Target meeting time is late fall -- between Nov 25 and Dec 25, 1998. It is essential that all seagoing PIs attend this meeting.

Work assignments between now and then:



Compiled by DRL, 21 Sept 98.
