Regional Reanalysis Questions and Answers

Partial Index

  1. tropopause definition
  2. winds are different from Eta/NAM/WRF
  3. where is the snowfall
  4. categorical rain doesn't agree with accumulations
  5. no assimilation of surface land temperatures
  6. surface drag, surface wind stress, Cd, friction velocity
  7. zero 30 m winds
  8. creation of land-water masks, and topography
  9. soil moisture output
  10. evaporation and latent heat flux different

Note: The answers to many questions can be found in Office Note 438, the NCEP Eta Post Processor, which has many details about how many items are defined in the model. You can access that document here.

How is the tropopause defined?

From Office Note 438: The post processor can generate the following tropopause level fields: pressure, temperature (ambient and potential), horizontal winds, and vertical wind shear. The greatest difficulty was coding an algorithm to locate the tropopause above each mass point. The procedure used in the Eta post processor is based on that in the NGM. Above each mass point a surface-up search is made for the first occurrence of three adjacent layers over which the lapse rate is less than a critical lapse rate. In both the NGM and Eta model the critical lapse rate is 2 K/km. The midpoint (in log pressure) of these two layers is identified as the tropopause. A lower bound of 500 mb is enforced on the tropopause pressure. If no two layer lapse rate satisfies the above criteria the model top is designated the tropopause. Very strong horizontal pressure gradients result from this algorithm. Horizontal averaging over neighboring grid points prior to or during the tropopause search might minimize this effect. To date, this alternative has not been coded. It might be more accurate to describe the current algorithm as one locating the lowest tropopause fold above 500 mb.

Linear interpolation in log pressure from the model layers above and below the tropopause provides the temperature. Recall that velocity points are staggered with respect to mass points. Winds at the four velocity points surrounding each mass point are averaged to provide a mass point wind. These mass point winds are used in the vertical interpolation to tropopause level. Vertical differencing between horizontal wind field above and below the tropopause provides an estimate of vertical wind shear at the tropopause.

Why do the wind fields seem incompatible with operational Eta/NAM/WRF winds?

RR winds are written so that they are earth-relative. That means that a vector that is due east on the RR grid is due east on the earth. The operational NCEP forecasts, all use grid-relative winds. For lat-lon and Gaussian grids, the grid-relative and earth-relative winds are the same. But for the Lambert conformal and polar stereographic grids, one needs a rotation to convert from grid-relative to earth relative. This can cause problems as codes designed to read the operational grids usually apply a rotation to convert the winds to earth relative. However, we felt that it would be easier to convert the winds to earth relative rather than teach people to rotate the winds. BTW if you would like to convert grid-relative winds to earth-relative, use this code.

Why is the snowfall missing from the RR archive? How can I calculate snowfall?

Unfortunately, there is no straightforward way to obtain the snowfall from NARR output. Regretably, no procedures were instituted in NARR to accumulate the precipitation into a separate field of snowfall when the precipitation was deemed to be non-liquid. We are presently investigating candidate procedures to accurately but indirectly infer the accumulated snowfall from other NARR output fields, and we will post a description of a suitable procedure if we can determine one. In the meantime, we remind users that snowfall is not needed to compute the land surface water budget, though it is needed to compute the water budget of just the snowpack itself. Hence the water budget of the snowpack itself cannot be analyzed from the NARR output at this time. In this context, we warn users NOT to assume that the NARR 3-hour forecast of snowpack state is an accumulated 3-hour snowfall (it is not). To steer users clear of the latter erroneous assumption, we offer the following specifics. The standard NARR output at each 3-hour analysis time consists of the large "Merged A" file and the small "Merged B" file. Both files contain a field named WEASD, which is the instantaneous state of the water equivalent snowpack on the ground -- in units of millimeters, or equivalently Kg/m**2). In the Merged A file, the WEASD field is the instantaneous snowpack state of the ANALYSIS, at the analysis time (time of the filename). In the Merged B file, the WEASD field is the instantaneous snowpack state of the 3-hour FORECAST, valid 3-hours AFTER the analysis time (time of the filename). For completeness here, we now explain the purpose of the WEASD forecast field in the Merged B file. During any given 24-hour interval, the NARR assimilation consists of eight analysis times. These eight analysis times are linked by eight 3-hour forecasts. For example, the starting point for the 03Z NARR analysis is the 3-hour NARR forecast from the previous 00Z NARR analysis. (By "starting point" we mean the reference or "background" field into which observations are assimilated.) In the case of the snowpack state, of the eight analysis times per day, only the 00Z analysis in NARR assimilates external snowpack information. Hence only at the 00Z valid time is the analysis of the snowpack state different from the 3-hour forecast of snowpack state from the previous analysis time (in this case 21Z). At all other analysis times besides 00Z, the analysis of snowpack state is taken from (and thus is identical to) the 3-hour forecast of snowpack state from the previous analysis. The amount of the snowpack analysis increment at the 00Z analysis time must be included in the land surface water budget. This daily 00Z snowpack analysis increment is calculated by taking the 00Z WEASD analysis of the Merged A file of 00Z and substracting from it the 3-hour WEASD forecast from the Merged B file of the prior 21Z. This latter surface water budget application is the purpose of carrying the 3-hour WEASD forecast record in the Meged B file.

Why are sometimes the categorical precipitation flags set to no precipitation when the accumulated precipitation is larger than 0?

RR includes flags that type the precipitation into liquid, snow, ice pellets and freezing rain. It was noted that not all the (0-3 hour) accumulated precipitation was typed. The problem is that the flags are instantaneous and the precipitation is accumulated. So some regions with accumulated precipitation will not have any precipitation when the precipitation flags are set.

Why is the snowfall larger than the total precipitation at times?

The ETA code is structured in such a way that the accumulated WEASD is offset by one physical time step. Consequently the snowfall can exceed the total precipitation in some isolated areas. This is a problem with the operational ETA model (6/2003).

Why does the RR not assimilate surface temperatures over land?

Land surface temperature observations cannot be assimilated successfully in the 3DVAR scheme used for the regional reanalysis. The background error contains implicit synoptic scale balance information, but knows nothing about boundary layer dynamics. Because of large diurnal variation of temperature over land, which exists at any time of year, observation - background residuals are regularly quite large. The balance relationship in the 3DVAR erroneously connects these large surface residuals to the wind field, which has a relatively large vertical correlation length (around 250-300mb). As a result, the large surface residuals have an undesirable impact on lower-tropospheric temperatures and lower- to mid-tropospheric winds, instead of being limited to the boundary layer. Newer versions of 3DVAR with more sophisticated background error definition will eventually eliminate this problem, but no mature version existed at the time the reanalysis was started.

Why does one observe discontinuities in the precipitation field occasionally on coastlines or near the Canadian border?

The analyzed precipitation used for precipitation assimilation comes from four sources. Over the continental United States, the analysis comes from a gage dataset and is analyzed onto a 1/8-degree grid using the PRISM mountain mapper and a least-squares distance weighting scheme. Over Mexico and Canada, the analysis comes from gage datasets analyzed onto 1-degree grids. Neither of those datasets has PRISM or a least-squares weighting scheme. Over the rest of the domain, the CMAP dataset is used. CMAP is a combination of gage and satellites, analyzed onto a 1-degree grid. Despite an attempt at a smooth blending, we still see discontinuities between the different datasets, especially around coastlines and borders of countries.

Why does there appear to be a diurnal variation in snowcover?

It has been a well-known problem for several years or more, but less so in RR than in current ops Eta, and much less so than before the major Eta/Noah land-surface upgrade in July 2001. Specifically, going back to before July 2001, about HALF of the magnitude of the fast snowmelt bias in the Eta was solved with the major Eta/Noah land-surface upgrade in July 2001. Of the remaining bias, about half of that was solved in the RR snow albedo and snow fraction upgrades put into the RR shortly before the RR production phase. The remaining fast snowmelt bias in the RR is due I feel to the high bias in solar insolation in the Eta and RR. The Eta/RR has a high solar insolation bias because the Eta-RR solar radiation fails to attenuate enough radiation through cloud cover -- a problem quite substantial in the high cloudiness regime of winter season. So, the Eta/RR has a widespread tendency to melt off snow cover too fast during the day (not always, depends on air temperature and solar insolation), and then the daily update of the snow cover from the external snow cover analysis tends to put the snow cover back. Part of the problem also is the unreliability of the snow cover product itself (from USAF), which can persist as unchanged snow cover that is actually changing under obscuring cloud cover (because the visible satellite imagery used by the USAF product cannot see the snow cover change if obscuring cloud cover is present).

We worked hard to minimize the "saw-tooth" nature of this daily melt/analysis-replacement tendency by replacing the direct insertion of the USAF daily snowdepth analysis with a "poorman" assimilation, in which the model background snowdepth state was not changed if it was a factor of two within the daily ingested USAF snowdepth product.

Why does it appear that the coastline has shrinked when masked plots are produced?

The land/sea mask was defined over land and water, so you get a cleaner representation. Items such as soil bottom temperature with masked points over the water, has a shrinked land mass because Grads needs 4 points to surround a point to make it land.

Why is 2-m pressure sometimes higher than the surface pressure?

The 2-m pressure interpolated using nearest neighbor, while the surface pressure uses bilinear interpolation. By the way, there is a surface pressure interpolated using nearest neighbor as well.

Why does the climatological 2-m temperature and latent heat flux over the Gulf of California seem to have discontinuties on the month boundaries?

The sea-surface temperatures (SST) over the Gulf of California were interpolated using an algorithm developed by David Stensrud. The value of the SST is set to the mean surface value near Guaymas (Ripa and Marionone, 1989, Quart. J. Roy. Meteor. Soc., 115, 887-913).

The problem is that there was a bug which set the daily SST value to the climatological value from the first of the month. The routine is meant to interpolate between the first of the month and the first of the next month, but that interpolation was not done thanks to the bug, making the SST from the last of the month to the first of the month inconsistent with each other. This bug was corrected before the NARR began 1 January 2004, but 1979-2003 was run with the bug in place.

(Where is/what is wrong) with the surface drag, Cd, wind stress, friction velocity?

We store data in GRIB as integers multiplied by a preset scaling factor. This allows considerable control over the precision of the numbers and the resulting size of the GRIB fiel. We spent considerable time modifying the precision of the GRIB files in order to save disk space for the on-line archives as well as to make sure the data had enough precision for most applications. However, we goofed when it came to the surface stress/friction velocity/Cd.

For the surface CD, we didn't save the data with enough precision. The surface stress was also not saved with enough precision and the stress was removed from the data sent to NCDC and others. Our error makes estimating the friction velocity more difficult.

Why are 30-m winds around coastlines sometimes exactly zero?

We have identified the problem to have been caused by a code error in the following manner. 30 m winds were derived on the model's native E-grid using the same algorithm as that used to obtain the 10 m winds (wgrib abbreviations also UGRD, VGRD, merged files record #s 293, 294). Both of these sets of winds are calculated by the Eta code at mass points, using the algorithm based on mid-layer winds at the four neighboring wind points.

If the Eta bottom layer depth is greater than 30 m, the 30 m winds thus obtained are left as is. But if it is less, a linear vertical interpolation in z is performed between the lowest layer wind, and the wind of the layer above, to obtain the 30 m winds. In doing this, our code erroneously was not interpolating the winds derived for the mass point, but was doing the interpolation using values at a neighboring wind point, the one to the east or to the west of the mass point.

Given that the lowest Eta layer of our RR code is thinner than 30 m, this means that all our 30 m winds at Eta points with bottom elevation at sea level are affected by the code error. But the error is minor, we believe for most purposes negligible, if this neighboring wind point carries a "live" wind, wind that is greater than zero. The error will not be negligible if this wind is forced to be zero by blocking resulting from one or more of the three neighboring topography points. This will happen if one of these points has its ground topography defined at the nearest interface above sea level, or still higher.

At some of these points with the lowest wind that is used for the vertical interpolation blocked and equal to zero, also the wind above will happen to be blocked and equal to zero. At these points, resulting 30 m wind will be zero also. Preferred places for this to happen are coastlines with non-negligible topography.

The native grid winds are interpolated to the Lambert grid, used for the "AWIPS" and also "merged" files. The 30 m wind interpolation being nearest neighbor, some of these erroneous zeros will arrive to the Lambert grid as zeros as well. Some might not be used and thus will not affect the merged file values.

But note that some of the AWIPS grid values with 30 m winds different from zero will be wrong too, if the lowest layer wind used for the linear vertical interpolation was a blocked wind, and the wind above not. These wrong winds will tend to be smaller in magnitude than they ought to be.

Thus, 30 m winds over land points with topography - nearest neighbor interpolated - above sea level are not affected by the error. But points at sea level elevation might have 30 m winds wrong, either zeros, or different from zero.

We are generating code to identify Lambert points with erroneous 30 m winds, and will post the result on our web page. We are also generating correct code to be used in R-CDAS. Finally, we will consider possible remedial actions re NARR files we have.

Why are the units of fluxes, given by the Grib decoder I use, kg/m, and not kg/m/s?

What wgrib decoder, version v1.8.0.9f says (other decoders?), is that eight various fields that are water vapor and condensate fluxes, have units [kg/m]. This does not look right, because fluxes have units [kg/m/s]. The reason for the discrepancy is that variables in these fields are in fact not fluxes, transports (kg) across unit distance, per second, but are transports in 3 hours. Note the "0-3hr acc" in wgrib printout. Thus, the units kg/m are correct. At the time of this writing, we are in the process of changing wgrib to refer to those variables as transports, as opposed to fluxes.

As to the data prepared for Global Reanalysis, is the NARR using the data files prepared for the NCEP/NCAR ("R1") or for the NCEP/DOE ("R2") reanalysis?

Neither one, strictly speaking. After R1 was completed, two pre-processing errors were identified which afflicted the R1 datasets post-1979. By that we mean the errors were made in converting original sources (from NCAR for instance) into NCEP reanalysis formats (i.e. PREPBUFR). The problems consisted of about 10 years of mis-located PAOBs (Australian bogused sea level pressure data, SH only), and about 10 years when some mainly Scandinavian radiosonde stations were systematically not included.

For the R2 reanalysis, the causes of these errors were corrected, and the reanalysis formatted datasets were re-made from the same original sources. Subsequently, it was found that supplemental sig-level winds from some ECMWF decoded radiosondes which had been included in R1, Aug 1989 through Oct. 1991, were left out of the R2 data. These were added to the R2 data for NARR data preparation.

How are the NARR topography and land/water masks created?

A: The NARR topography and land/water masks are created in the same way as those of the then operational Eta model, following the procedure as follows.

Each model grid box is split into 2 x 2 subboxes, and terrain elevation read off terrain data (USGS 30 s data where available), collecting values and percentages of water that happen to fall within each of the subboxes. Using these, mean subbox elevation and percentage of water is obtained. Subsequently, for each grid box, mean and silhouette elevations are calculated. Silhouette elevation is calculated by looking at each pair of subboxes as seen from two directions perpendicular to sides of the box, amounting to four pairs total, each time taking the higher value, and subsequently averaging these four higher values. Following this, nine-point Laplacian of the mean orography is calculated for every box. Where the Laplacian is positive, mean is used for the box elevation. Where it is negative, silhouette is used (Mesinger, Bull. Amer. Meteor. Soc., 1996, p. 2646-2647).

Subsequently, an effort is made to minimize closing up of significant valleys and mountain passes that may have happened within the silhouette part of the so obtained topography. This is done by looking for points for which in at least in one of the four possible directions the average elevation of three nearest neighbors in that direction, centered on the point considered, is less than both the average elevation of the three nearest points on one side of these three points, and also of the three nearest neighbors on the other side of the three points. If so, irrespective of the sign of the Laplacian, the mean elevation is used for that point. Points are declared water if more than 50 percent of their area is covered by water according to the topography data read.

This is followed by rounding off to reference interface elevation, and elimination of "windless" points created as a result. Land points that have winds at all four of their vertices blocked are raised as needed to reach the lowest unblocked wind. Isolated water points, or water points that have only one of their nearest neighbors water and are at sea level, are also raised to reach the lowest wind, and are declared land. Otherwise, land is removed at one of the water point's four corners, to free one of the blocked winds. The corner chosen for land removal is one that has the smallest three-point averaged elevation.

Elevation of water points above sea level is checked for presence of neighboring water points with a different elevation, and, if so, elevation of all neighboring water points is made equal.

Why does the change in precipitable water not exactly match the sum of moisture convergence and evaporation, minus precipitation?

There are a number of reasons why this does not happen, why the "water budget" cannot be "closed" using the terms mentioned above, and even using all of the NARR fields available. The most obvious reason is the "analysis increment", change in the total water of the atmospheric column (precipitable water, PWAT) resulting form the 3D-Var analysis step. Note that the analysis increment has been monitored and is obtainable from NARR files, since both PWAT entering the 3D-Var at the end of the 3 hr forecast, and PWAT out of the 3D-Var are available, within fields of the merged files B and A, respectively.

Yet other budget terms that have been monitored and are available are water vapor and water condensate increments (WVINC, WCINC) that are made within the precipitation assimilation during the 3-h forecasts in between the consecutive 3D-Var steps. Note that, for example, if the precipitation analysis being assimilated at a grid point shows precipitation, and the model at that location does not have enough moisture for precipitation according to its precipitation schemes, moisture is added to the model to achieve consistency (Lin et al. 1999).

Note also that moisture convergence in NARR consists of two fields, water vapor flux convergence and water condensate flux convergence, that need to be added to obtain the total moisture convergence.

But even if all of these NARR fields are accounted for the budget will not exactly close for several reasons. One is the difference between the local column moisture change resulting from the model's moisture horizontal advection scheme, and the change that apparently should take place due to the columnā€™s moisture flux convergence. Namely, the model's moisture horizontal advection scheme (Janjic 1997) involves non-local moisture adjustments aimed at achieving several objectives (e.g., Mesinger and Jovic 2002, NCEP Office note 439, available online at These adjustments redistribute moisture horizontally. However, since conserving the total moisture is one of the objectives pursued by these adjustments they should not have much impact on area-averaged budget calculations. The same kind of an impact has the horizontal diffusion of moisture.

Finally, interpolation from the model's native grid to the Lambert output grid is yet another reason preventing the exact closure of the budget. To reduce the contribution of this interpolation to the resulting residual, moisture flux convergence has been carefully calculated and vertically integrated on the model's native grid, and only subsequently interpolated to the Lambert grid. Thus, users should take care to use this NARR native grid calculated moisture flux convergence (WVCONV, WCCONV), as opposed to calculating the moisture flux convergence on the Lambert grid themselves using the interpolated velocity and moisture fields, and thus needlessly introducing errors in their budget calculations additional to those that cannot be avoided.

How many soil layers are there? At what part of the layers are the temperature and moisture valid?

All four NARR soil layers (0-10, 10-40, 40-100, 100-200 cm layer thicknesses) are available in the standard NARR output at NCDC. For each soil layer, three prognostic soil states are available in the records of the grib file: 1) soil temperature at the mid-point of the soil layer, 2) total soil moisture (sum of liquid plus frozen soil moisture), and 3) liquid soil moisture (that part of the total soil moisture that is not frozen). Substracting the amount of liquid soil moisture from the total soil moisture yields the amount of frozen soil moisture (soil ice content).

All three soil state variables above are considered valid at the mid-point of the soil layer. Units of soil temperature is Kelvin. Units of soil moisture is volumetric fraction (fraction of unit volume of soil that is occupied by soil moisture -- typical saturation value of volumetric soil moisture is 0.45, except for sandy soils the saturation value is more typically around 0.35). Super-cooled liquid water can and does easily exist in the soil state, so even at sub-freezing temperatures the NARR soil states contain some liquid soil moisture (water), but the amount of such liquid soil moisture relative to the total soil moisture decreases as the soil temperature falls further and further below the freezing point.

Why are the values of evaporation and latent heat flux different even after accounting for the difference in units?

Given that in the Eta code the evaporated water vapor is not added explicitly to the atmosphere but instead the latent heat flux is used as a boundary condition for the vertical diffusion of moisture, the total column water vapor needs to be calculated before and after the diffusion loop, and the evaporation obtained as the difference between the two. Therefore, the values of the evaporation and latent heat fluxes in NARR are not the same even after accounting for the difference in units.