March 11, 2010 Meeting Summary
Altug Aksoy from AOML/HRD gave a presentation titled, "Vortex-Scale Hurricane Data Assimilation: Preliminary Results with Airborne Doppler Radar and Dropsondes Using NOAA/AOML/HRD's HWRF Ensemble Data Assimilation System (HEDAS)." Altug began by explaining HRD's strategy for taking a system from research to operations and noting that the main goal at HRD is to better understand the inner-core structure of a hurricane through observations. Data assimilation (DA) is the best method for assessing the impact of observations quantitatively and objectively. HRD started their major efforts with an ensemble-based DA system back in 2008 and today, preliminary coding and system design are completed while testing with observation system simulation experiments (OSSEs) is ongoing. In addition, acquisition of airborne Doppler radar and dropsonde data as well as pre-processing of this data is underway. Altug mentioned that this DA system will be tested in semi-real-time as part of the NOAA HFIP demo system, and that a comparison between 3dVar (GSI) and EnKF will be made to assess the differences and system advantages as well as set the foundations for a hybrid Var-ensemble system.
Next, Altug addressed how HEDAS research could provide feedback to the operational HWRF. By transitioning to a common code repository for HWRF and HEDAS, collaboration and code development would be faster and easier, while HEDAS would also provide a framework for the evaluation of model performance and diagnosis of model structure in observation space. This would also lead to the creation of many ensemble- and TC-structure-related diagnostic tools all contributed to the repository. The use of an ensemble will also allow a straightforward evaluation of model error while the use of airborne observations will allow the initialization of the inner-core TC structure. The forecast model used for HEDAS is the HWRF-X (explained in detail by Gopal in his presentation given February 4, 2010: http://www.emc.ncep.noaa.gov/HWRF/weeklies/FEB10/FEB042010.html), which has two nested domains (at 9 and 3km) and 42 vertical levels. There is also a static inner-nest for covariance computations between ensemble members in space, which means that the inner nest doesn't move for 4-6 hours giving ensemble members the same inner nest location. Ferrier microphysics and explicit convection in the nest are also used. The ensemble system is initialized from GEFS ensemble member analyses with a total of 30 members. For DA, a square root EnKF is used with covariance localization, and only inner-core NOAA P-3 aircraft data is assimilated on the inner nest.
Then Altug gave an overview of DA with two variables, only one of which is observed. He explained that when there are unobserved variables, information is propagated through covariances among variables. For a background (in blue) consisting of x1 and x2, with both variables normally distributed, we assume the two variables are correlated, as evidenced by the tilt of the distribution. If x1 is directly observed (and normally distributed) as y0 (in orange), and the observation error for y0 is smaller than the background error for x1, then the analysis, x1a (in red), is closer to y0 than to the background. The analysis error s1a (in red) is smaller than both s1f (in blue) and sy (in orange), and the covariance (in pink) between x1 and x2 relates changes in x1 to changes in x2. The joint analysis probability distribution (red ellipse) is narrower, indicating improved estimates for both x1 and x2. Next, Altug described the general process for ensemble-based DA, where sample covariances are computed from an ensemble of forecasts. First, instead of a single state representing the atmosphere's initial state, there is an ensemble of states (or ensemble members) to better represent the initial uncertainty about the "mean" state. At t=t0+delta(t), the members have spread apart because errors grow with time. Next, for the assimilation of observations, covariances sampled from the ensemble of forecasts are used. The EnKF update reduces the divergence among members, indicating a smaller uncertainty. This analysis uncertainty becomes the initial condition uncertainty for a new forecast cycle, which is initialized from the previous analysis ensemble.
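The two-variable case described above can be worked through numerically. The following is a minimal sketch of the standard Kalman update for two correlated variables when only x1 is observed; the function name and the numbers are illustrative assumptions and do not come from the presentation.

```python
def update_two_variables(x1f, x2f, var1f, var2f, cov12f, y0, var_y):
    """Kalman update of (x1, x2) when only x1 is observed as y0."""
    # Weight for the observed variable: background variance over total variance.
    k1 = var1f / (var1f + var_y)
    # The unobserved variable is adjusted only through its covariance with x1.
    k2 = cov12f / (var1f + var_y)
    innovation = y0 - x1f
    x1a = x1f + k1 * innovation
    x2a = x2f + k2 * innovation
    # Analysis (co)variances shrink relative to the background.
    var1a = (1.0 - k1) * var1f
    var2a = var2f - k2 * cov12f
    cov12a = (1.0 - k1) * cov12f
    return x1a, x2a, var1a, var2a, cov12a


# Illustrative numbers (assumed): background error variance 4, observation error variance 1.
x1a, x2a, var1a, var2a, cov12a = update_two_variables(
    x1f=10.0, x2f=5.0, var1f=4.0, var2f=9.0, cov12f=3.0, y0=12.0, var_y=1.0
)
print(x1a, x2a, var1a, var2a, cov12a)
```

With these numbers the analysis x1a lies closer to the observation than to the background, the analysis variance of x1 is smaller than both the background and the observation error variances, and x2 is adjusted even though it was never observed.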
Next, Altug presented some advantages of using ensemble-based DA. Since background covariances are sampled from the forecast ensemble, the covariances are flow-dependent. Ensemble-based DA also provides a natural basis for probabilistic forecasts, and these systems are easy to implement and maintain, with no adjoint needed. Ensemble-based DA systems are also fairly straightforward to apply to a domain with multiple nests, easily lend themselves to parallelization, and have a performance that Altug found to be comparable to variational schemes. System characteristics of HEDAS were then explained. The EnKF system is coded in Fortran 90 with critical parts of the EnKF code parallelized using OpenMP, which has a relatively simple design. Acquisition of real data will be done through operational channels, with plans to assimilate airborne Doppler wind, dropsonde, and flight-level data. Also, a wrapper bash script is used to manage the workflow during cycling of the model. In the chart illustrating the HEDAS EnKF workflow, the boxes with dashed borders represent components parallelized in OpenMP. First, in the model and observation I/O preprocessing step, ensemble members from a previous model run are read in and a state vector for domain 2 is constructed. Next, domain 1 is written to the restart files, and finally, observations are read in and an observation array is constructed. Then, in the EnKF update step, prior diagnostics are performed and written out, and observations are looped over, with state points within each observation's influence region being updated. Finally, posterior diagnostics are performed and written out. In the last step, model I/O postprocessing, the updated domain 2 state vector ensemble perturbations are converted to individual member model states and domain 2 is written to the restart files.
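The EnKF update step described above (loop over observations, update only state points inside the influence region) can be sketched as follows. This is not the HEDAS Fortran code. It assumes a serial ensemble square root update of the commonly used reduced-gain form, a one-dimensional state, observations that directly measure one state point, and a hypothetical linear taper for localization; all names are illustrative.

```python
import math


def mean(values):
    return sum(values) / len(values)


def localization(distance, radius):
    """Hypothetical linear taper: 1 at the observation, 0 at `radius` and beyond."""
    return max(0.0, 1.0 - distance / radius)


def ensrf_update(ensemble, positions, observations, radius):
    """Serially assimilate observations into `ensemble` (modified in place).

    ensemble:     list of members, each a list of state values
    positions:    coordinate of each state point
    observations: list of (state_index, value, error_variance) tuples; each
                  observation is assumed to measure one state point directly
    """
    n_mem = len(ensemble)
    for obs_index, y, r in observations:
        # Prior ensemble in observation space.
        hx = [member[obs_index] for member in ensemble]
        hx_mean = mean(hx)
        hx_pert = [h - hx_mean for h in hx]
        var_hx = sum(p * p for p in hx_pert) / (n_mem - 1)
        # Reduced gain factor applied to perturbations (square root form).
        alpha = 1.0 / (1.0 + math.sqrt(r / (var_hx + r)))
        for j in range(len(positions)):
            rho = localization(abs(positions[j] - positions[obs_index]), radius)
            if rho == 0.0:
                continue  # outside the influence region: leave untouched
            xj = [member[j] for member in ensemble]
            xj_mean = mean(xj)
            cov = sum((xj[m] - xj_mean) * hx_pert[m] for m in range(n_mem)) / (n_mem - 1)
            gain = rho * cov / (var_hx + r)
            new_mean = xj_mean + gain * (y - hx_mean)
            for m in range(n_mem):
                ensemble[m][j] = new_mean + (xj[m] - xj_mean) - alpha * gain * hx_pert[m]
    return ensemble


# Three members, three state points; one observation of the first point.
members = [[0.0, 1.0, 4.0], [1.0, 2.0, 5.0], [2.0, 3.0, 6.0]]
ensrf_update(members, positions=[0.0, 1.0, 5.0], observations=[(0, 2.0, 1.0)], radius=2.0)
print(members)
```

In this toy case the third state point lies outside the influence region and is left unchanged, while the mean of the first point moves halfway toward the observation because the prior variance equals the observation error variance. The inner loop over state points is independent for each point, which is the kind of loop that lends itself to shared-memory parallelization.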
Altug next gave a brief overview of the P-3 and G-IV aircraft used to observe the hurricane environment and vortex structure. The P-3 is used at lower altitudes for hurricane eye penetrations, while the G-IV flies at higher altitudes and is used for synoptic surveillance. The EnKF work focuses primarily on the dropsonde, Doppler radar, and flight-level observations provided by the aircraft. Then, Altug described the simulation of dropwindsonde observations. As a plane flies near 700 mb, a dropsonde is released and advected by the winds. As the dropsonde falls, model quantities like u, v, T, and Q are observed every 50 mb or so until the dropsonde reaches the surface. The aircraft fly one complete leg (500 km) per assimilation cycle (every hour), and drop points are determined based on the starting point, track direction, and release point resolution (which is 25 km). The track direction is then rotated by 50 degrees before another leg is started. Next, the simulation of airborne Doppler radar wind observations was described; the same flight track is used as for the dropsondes, with the simulated aircraft mimicking the radar geometry. As the plane moves forward, it performs a forward scan at a 20-degree tilt and then a backward scan at a 20-degree tilt. However, the plane doesn't always fly exactly in the east-west direction and tends to tilt up and down as it flies through a storm. This requires a coordinate transformation from plane-relative to Earth-relative coordinates to compute the Doppler wind.
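The placement of simulated drop points along a leg can be illustrated with a short sketch. The leg length (500 km), release spacing (25 km), and 50-degree rotation come from the summary; everything else is an assumption made for illustration: a flat x/y plane in km, a track angle measured counterclockwise from east, drops at both ends of the leg, and the next leg starting where the previous one ended.

```python
import math


def drop_points(start_xy, track_deg, leg_km=500.0, spacing_km=25.0):
    """Return (x, y) release points in km along one straight flight leg."""
    n_points = int(round(leg_km / spacing_km)) + 1  # include both leg ends (assumed)
    ux = math.cos(math.radians(track_deg))
    uy = math.sin(math.radians(track_deg))
    x0, y0 = start_xy
    return [(x0 + i * spacing_km * ux, y0 + i * spacing_km * uy) for i in range(n_points)]


# First leg flown eastward through a storm centered at the origin (assumed layout).
leg1 = drop_points((-250.0, 0.0), track_deg=0.0)
# The track direction is rotated by 50 degrees before the next leg.
leg2 = drop_points(leg1[-1], track_deg=0.0 + 50.0)
print(len(leg1), leg1[0], leg1[-1])
```

Under these assumptions each leg yields 21 release points. Advection of the falling sonde by the wind, which the simulation also accounts for, is not modeled in this sketch.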
Altug then presented a test case for Hurricane Paloma, which occurred from November 7-9, 2008. For this case, there was a nature run, which used the same model configuration as the DA ensemble and was initialized from one GEFS ensemble member on November 7 at 00Z, with observations extracted every hour between 00-06Z on the 8th. There was also a DA run, which was initialized from GEFS ensemble member analyses on November 7 at 18Z (18 hr later than the nature run) with a 6-hr spin-up before the DA cycle. In the DA run, observations were assimilated every hour for a 6-hr analysis cycle between 00-06Z on the 8th, and only vortex-scale observations were assimilated on the inner nest (which had a resolution of 3 km). Altug then presented a diagram of what 1 hr of observations from the nature run looks like. The red dots, which look like lines, indicate Doppler wind observations, while the blue dots indicate drop locations for dropsondes.
Next, an evaluation of the ensemble forecast without DA (also called the CTRL) was shown. First, graphics were shown for the 6-hr evolution of the vortex on November 8th from 00 to 06Z. Altug noted that at 00Z, the nature run has a max 10-m surface wind that is 10 m/s stronger and an MSLP that is 8 mb lower than that of the ensemble mean. By 06Z, the storm in the nature run had intensified, while the storm in the ensemble mean had not. Looking at the ensemble spread at the initial time (00Z) on the 8th, the pressure field showed the most spread (shown in shaded contours) around the mean center, which is likely due to position error; the spread in the zonal wind field is also attributed to position error. Observation space statistics were then shown using prior innovation distributions at 00Z on November 8th. Altug noted that for most variables the distribution is well-behaved, except for u- and v-wind. By taking a closer look at the u/v biases at 00Z through the difference between the nature run and the CTRL (nature minus CTRL), Altug showed a positive bias for the u-wind (indicated by warmer colors), which was the result of position error. The v-wind, in contrast, showed a negative bias, indicated by the cool colors.
Altug explained that for real-data applications, verification in observation space is necessary, meaning that model-equivalent observation values must be computed at observation locations through forward operators. Examples of observation space statistics were then presented, first for Doppler wind. In the RMS, mean innovation, and spread plots (on the left), the saw-tooth pattern is a result of innovations increasing during the forecast and decreasing during the analysis. The spread ratio (on the right) is the ratio of the forecast ensemble spread to the "optimal" ensemble spread, which ideally would be close to 1. For the Doppler wind assimilated and evaluated tracks, this value is close to 0.4. Since this value is less than 1, it indicates insufficient spread. For the dropsonde observation space statistics, there is a decrease in RMS error (red dashed line) for u-wind but not for v-wind. The spread ratios for u- and v-wind are between 0.4 and 0.6. Then, a 06Z analysis of storm structure for the test case was shown, comparing the CTRL with the nature run and the analysis. Here, the analysis is closer to the nature run in MSLP and 10-m wind speed. Looking at the R-Z mean primary and secondary circulations for the CTRL, nature run, and analysis, the storm size in the analysis is more similar to the nature run than that in the CTRL. This is evidence of the impact of assimilating Doppler winds.
Next, Altug addressed the importance of ensemble spread in DA. Ensemble spread, or prior variance, determines the amount of confidence in the prior (background) state. If ensemble spread is smaller than observation error, the DA system places more confidence in the background than in the observation. Thus, the smaller the ensemble spread, the less impact an observation has on the background, which means that the ensemble mean may begin to drift away from the observations, a problem called filter divergence. Sampling errors, which are due to the limited size of an ensemble and tend to result in the underestimation of background errors, should be addressed to improve ensemble spread. Model error should also be addressed, as an under-representation of model error will lead to spread deficiency. HEDAS has experimented with GFS-EnKF initial/boundary conditions and perturbed-parameter model physics to deal with the ensemble spread problem.
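The link between ensemble spread and observation impact can be made concrete with the scalar Kalman weight. The sketch below assumes a directly observed scalar variable; the function names and numbers are illustrative and do not come from the presentation.

```python
def observation_weight(spread, obs_error):
    """Fraction of the innovation applied to the background (scalar Kalman gain)."""
    return spread ** 2 / (spread ** 2 + obs_error ** 2)


def analysis_mean(background, observation, spread, obs_error):
    """Background mean nudged toward the observation by the Kalman weight."""
    weight = observation_weight(spread, obs_error)
    return background + weight * (observation - background)


# Same observation error (2.0, assumed units) with decreasing ensemble spread.
for spread in (4.0, 2.0, 0.5):
    print(spread, observation_weight(spread, 2.0), analysis_mean(40.0, 50.0, spread, 2.0))
```

As the spread falls below the observation error, the weight drops toward zero and the analysis barely moves from the background, which is how an under-dispersive ensemble can drift away from the observations.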
Altug concluded by presenting a HEDAS development plan for Fall/Winter 2010/2011. This plan included full coupling with the global ensemble square root filter system and assimilation of new observation platforms with the potential to include ocean data. There are also plans for diagnostics and assimilation of satellite data in the inner-core region of storms, model-error-related parameter estimation, and experimentation with non-linear update techniques. The plan also included experimentation with storm-centered covariances and filter updates and finally code enhancements and more parallelization.