Regional Reanalysis Questions and Answers
- tropopause definition
- winds are different from Eta/NAM/WRF
- where is the snowfall
- categorical rain doesn't agree with accumulations
- no assimilation of surface land temperatures
- surface drag, surface wind stress, Cd, friction velocity
- zero 30 m winds
- creation of land-water masks, and topography
- soil moisture output
- evaporation and latent heat flux different
Note: The answers to many questions can be found in Office Note 438,
the NCEP Eta Post Processor, which has many details about how many
items are defined in the model. You can access that document here.
How is the tropopause defined?
From Office Note 438: The post processor can generate the following
tropopause level fields: pressure, temperature (ambient and potential),
horizontal winds, and vertical wind shear. The greatest difficulty was
coding an algorithm to locate the tropopause above each mass point. The
procedure used in the Eta post processor is based on that in the NGM.
Above each mass point a surface-up search is made for the first
occurrence of three adjacent layers over which the lapse rate is less
than a critical lapse rate. In both the NGM and Eta model the critical
lapse rate is 2 K/km. The midpoint (in log pressure) of these two
layers is identified as the tropopause. A lower bound of 500 mb is
enforced on the tropopause pressure. If no two layer lapse rate
satisfies the above criteria the model top is designated the
tropopause. Very strong horizontal pressure gradients result from this
algorithm. Horizontal averaging over neighboring grid points prior to
or during the tropopause search might minimize this effect. To date,
this alternative has not been coded. It might be more accurate to
describe the current algorithm as one locating the lowest tropopause
fold above 500 mb.
Linear interpolation in log pressure from the model layers above and
below the tropopause provides the temperature. Recall that velocity
points are staggered with respect to mass points. Winds at the four
velocity points surrounding each mass point are averaged to provide a
mass point wind. These mass point winds are used in the vertical
interpolation to tropopause level. Vertical differencing between
horizontal wind field above and below the tropopause provides an
estimate of vertical wind shear at the tropopause.
Why do the wind fields seem incompatible with operational Eta/NAM/WRF winds?
RR winds are written so that they are earth-relative. That means that a vector
that is due east on the RR grid is due east on the earth. The operational
NCEP forecasts, all use grid-relative winds. For lat-lon and Gaussian grids, the grid-relative
and earth-relative winds are the same. But for the Lambert conformal and polar stereographic
grids, one needs a rotation to convert from grid-relative to earth relative. This can
cause problems as codes designed to read the operational grids usually apply a rotation
to convert the winds to earth relative. However, we felt that it would be easier to convert
the winds to earth relative rather than teach people to rotate the winds. BTW if
you would like to convert grid-relative winds to earth-relative,
use this code.
Why is the snowfall missing from the RR archive? How can I calculate
Unfortunately, there is no straightforward way to obtain the snowfall
from NARR output. Regretably, no procedures were instituted in NARR to
accumulate the precipitation into a separate field of snowfall when the
precipitation was deemed to be non-liquid. We are presently
investigating candidate procedures to accurately but indirectly infer
the accumulated snowfall from other NARR output fields, and we will
post a description of a suitable procedure if we can determine one. In
the meantime, we remind users that snowfall is not needed to compute
the land surface water budget, though it is needed to compute the water
budget of just the snowpack itself. Hence the water budget of the
snowpack itself cannot be analyzed from the NARR output at this time.
In this context, we warn users NOT to assume that the NARR 3-hour
forecast of snowpack state is an accumulated 3-hour snowfall (it is
not). To steer users clear of the latter erroneous assumption, we offer
the following specifics. The standard NARR output at each 3-hour
analysis time consists of the large "Merged A" file and the small
"Merged B" file. Both files contain a field named WEASD, which is the
instantaneous state of the water equivalent snowpack on the ground --
in units of millimeters, or equivalently Kg/m**2). In the Merged A
file, the WEASD field is the instantaneous snowpack state of the
ANALYSIS, at the analysis time (time of the filename). In the Merged B
file, the WEASD field is the instantaneous snowpack state of the 3-hour
FORECAST, valid 3-hours AFTER the analysis time (time of the filename).
For completeness here, we now explain the purpose of the WEASD forecast
field in the Merged B file. During any given 24-hour interval, the NARR
assimilation consists of eight analysis times. These eight analysis
times are linked by eight 3-hour forecasts. For example, the starting
point for the 03Z NARR analysis is the 3-hour NARR forecast from the
previous 00Z NARR analysis. (By "starting point" we mean the reference
or "background" field into which observations are assimilated.) In the
case of the snowpack state, of the eight analysis times per day, only
the 00Z analysis in NARR assimilates external snowpack information.
Hence only at the 00Z valid time is the analysis of the snowpack state
different from the 3-hour forecast of snowpack state from the previous
analysis time (in this case 21Z). At all other analysis times besides
00Z, the analysis of snowpack state is taken from (and thus is
identical to) the 3-hour forecast of snowpack state from the previous
analysis. The amount of the snowpack analysis increment at the 00Z
analysis time must be included in the land surface water budget. This
daily 00Z snowpack analysis increment is calculated by taking the 00Z
WEASD analysis of the Merged A file of 00Z and substracting from it the
3-hour WEASD forecast from the Merged B file of the prior 21Z. This
latter surface water budget application is the purpose of carrying the
3-hour WEASD forecast record in the Meged B file.
Why are sometimes the categorical precipitation flags set to
no precipitation when the accumulated precipitation is larger than 0?
RR includes flags that type the precipitation into liquid, snow,
ice pellets and freezing rain. It was noted that not all the
(0-3 hour) accumulated precipitation was typed. The problem is
that the flags are instantaneous and the precipitation is accumulated.
So some regions with accumulated precipitation will not have any
precipitation when the precipitation flags are set.
Why is the snowfall larger than the total precipitation at times?
The ETA code is structured in such a way that the accumulated
WEASD is offset by one physical time step. Consequently the snowfall
can exceed the total precipitation in some isolated areas. This is
a problem with the operational ETA model (6/2003).
Why does the RR not assimilate surface temperatures over land?
Land surface temperature observations cannot be assimilated
successfully in the 3DVAR scheme used for the regional reanalysis.
The background error contains implicit synoptic scale balance
information, but knows nothing about boundary layer dynamics.
Because of large diurnal variation of temperature over land,
which exists at any time of year, observation - background
residuals are regularly quite large. The balance relationship
in the 3DVAR erroneously connects these large surface residuals
to the wind field, which has a relatively large vertical correlation
length (around 250-300mb). As a result, the large surface residuals
have an undesirable impact on lower-tropospheric temperatures
and lower- to mid-tropospheric winds, instead of being limited to
the boundary layer. Newer versions of 3DVAR with more
sophisticated background error definition will eventually
eliminate this problem, but no mature version existed at the time the
reanalysis was started.
Why does one observe discontinuities in the precipitation field
occasionally on coastlines or near the Canadian border?
The analyzed precipitation used for precipitation assimilation comes
from four sources. Over the continental United States, the analysis comes
from a gage dataset and is analyzed onto a 1/8-degree grid using the
PRISM mountain mapper and a least-squares distance weighting scheme.
Over Mexico and Canada, the analysis comes from gage datasets analyzed onto
1-degree grids. Neither of those datasets has PRISM or a least-squares
weighting scheme. Over the rest of the domain, the CMAP dataset is used.
CMAP is a combination of gage and satellites, analyzed onto a 1-degree grid.
Despite an attempt at a smooth blending, we still see discontinuities
between the different datasets, especially around coastlines and borders
Why does there appear to be a diurnal variation in snowcover?
It has been a well-known problem for several years or more, but less so in RR
than in current ops Eta, and much less so than before the major Eta/Noah
land-surface upgrade in July 2001. Specifically, going back to before July 2001,
about HALF of the magnitude of the fast snowmelt bias in the Eta was solved with
the major Eta/Noah land-surface upgrade in July 2001. Of the remaining bias,
about half of that was solved in the RR snow albedo and snow fraction upgrades
put into the RR shortly before the RR production phase. The remaining fast
snowmelt bias in the RR is due I feel to the high bias in solar insolation in the
Eta and RR. The Eta/RR has a high solar insolation bias because the Eta-RR solar
radiation fails to attenuate enough radiation through cloud cover -- a problem
quite substantial in the high cloudiness regime of winter season.
So, the Eta/RR has a widespread tendency to melt off snow cover too fast
during the day (not always, depends on air temperature and solar insolation), and
then the daily update of the snow cover from the external snow cover analysis
tends to put the snow cover back. Part of the problem also is the unreliability
of the snow cover product itself (from USAF), which can persist as unchanged snow
cover that is actually changing under obscuring cloud cover (because the visible
satellite imagery used by the USAF product cannot see the snow cover change if
obscuring cloud cover is present).
We worked hard to minimize the "saw-tooth" nature of this daily
melt/analysis-replacement tendency by replacing the direct insertion of the USAF
daily snowdepth analysis with a "poorman" assimilation, in which the model
background snowdepth state was not changed if it was a factor of two within the
daily ingested USAF snowdepth product.
Why does it appear that the coastline has shrinked when masked plots are produced?
The land/sea mask was defined over land and water, so you get a cleaner
representation. Items such as soil bottom temperature with masked
points over the water, has a shrinked land mass because Grads needs 4
points to surround a point to make it land.
Why is 2-m pressure sometimes higher than the surface pressure?
The 2-m pressure interpolated using nearest neighbor, while the surface
pressure uses bilinear interpolation. By the way, there is a surface
pressure interpolated using nearest neighbor as well.
Why does the climatological 2-m temperature and latent heat flux
over the Gulf of California seem to have discontinuties on the month
The sea-surface temperatures (SST) over the Gulf of California were
interpolated using an algorithm developed by David Stensrud. The value
of the SST is set to the mean surface value near Guaymas (Ripa and
Marionone, 1989, Quart. J. Roy. Meteor. Soc., 115, 887-913).
The problem is that there was a bug which set the daily SST value to
the climatological value from the first of the month. The routine is
meant to interpolate between the first of the month and the first of
the next month, but that interpolation was not done thanks to the bug,
making the SST from the last of the month to the first of the month
inconsistent with each other. This bug was corrected before the NARR
began 1 January 2004, but 1979-2003 was run with the bug in place.
(Where is/what is wrong) with the surface drag, Cd, wind stress, friction velocity?
We store data in GRIB as integers multiplied by a preset scaling factor.
This allows considerable control over the precision of the numbers and the
resulting size of the GRIB fiel. We spent considerable time modifying the precision
of the GRIB files in order to save disk space for the on-line archives as well as to
make sure the data had enough precision for most applications. However,
we goofed when it came to the surface stress/friction velocity/Cd.
For the surface CD, we didn't save the data with enough precision.
The surface stress was also not saved with enough precision and the stress
was removed from the data sent to NCDC and others. Our error makes
estimating the friction velocity more difficult.
Why are 30-m winds around coastlines sometimes exactly zero?
We have identified the problem to have been caused by a code error in
the following manner. 30 m winds were derived on the model's native
E-grid using the same algorithm as that used to obtain the 10 m winds
(wgrib abbreviations also UGRD, VGRD, merged files record #s 293, 294).
Both of these sets of winds are calculated by the Eta code at mass
points, using the algorithm based on mid-layer winds at the four
neighboring wind points.
If the Eta bottom layer depth is greater than 30 m, the 30 m winds thus
obtained are left as is. But if it is less, a linear vertical
interpolation in z is performed between the lowest layer wind, and the
wind of the layer above, to obtain the 30 m winds. In doing this, our
code erroneously was not interpolating the winds derived for the mass
point, but was doing the interpolation using values at a neighboring
wind point, the one to the east or to the west of the mass point.
Given that the lowest Eta layer of our RR code is thinner than 30 m,
this means that all our 30 m winds at Eta points with bottom elevation
at sea level are affected by the code error. But the error is minor, we
believe for most purposes negligible, if this neighboring wind point
carries a "live" wind, wind that is greater than zero. The error will
not be negligible if this wind is forced to be zero by blocking
resulting from one or more of the three neighboring topography points.
This will happen if one of these points has its ground topography
defined at the nearest interface above sea level, or still higher.
At some of these points with the lowest wind that is used for the
vertical interpolation blocked and equal to zero, also the wind above
will happen to be blocked and equal to zero. At these points, resulting
30 m wind will be zero also. Preferred places for this to happen are
coastlines with non-negligible topography.
The native grid winds are interpolated to the Lambert grid, used for
the "AWIPS" and also "merged" files. The 30 m wind interpolation being
nearest neighbor, some of these erroneous zeros will arrive to the
Lambert grid as zeros as well. Some might not be used and thus will not
affect the merged file values.
But note that some of the AWIPS grid values with 30 m winds different
from zero will be wrong too, if the lowest layer wind used for the
linear vertical interpolation was a blocked wind, and the wind above
not. These wrong winds will tend to be smaller in magnitude than they
ought to be.
Thus, 30 m winds over land points with topography - nearest neighbor
interpolated - above sea level are not affected by the error. But
points at sea level elevation might have 30 m winds wrong, either
zeros, or different from zero.
We are generating code to identify Lambert points with erroneous 30 m
winds, and will post the result on our web page. We are also generating
correct code to be used in R-CDAS. Finally, we will consider possible
remedial actions re NARR files we have.
Why are the units of fluxes, given by the Grib decoder I use, kg/m, and not kg/m/s?
What wgrib decoder, version v18.104.22.168f says (other decoders?), is that
eight various fields that are water vapor and condensate fluxes, have
units [kg/m]. This does not look right, because fluxes have units
[kg/m/s]. The reason for the discrepancy is that variables in these
fields are in fact not fluxes, transports (kg) across unit distance,
per second, but are transports in 3 hours. Note the "0-3hr acc" in
wgrib printout. Thus, the units kg/m are correct. At the time of this
writing, we are in the process of changing wgrib to refer to those
variables as transports, as opposed to fluxes.
As to the data prepared for Global Reanalysis, is the NARR using
the data files prepared for the NCEP/NCAR ("R1") or for the NCEP/DOE
Neither one, strictly speaking. After R1 was completed, two
pre-processing errors were identified which afflicted the R1 datasets
post-1979. By that we mean the errors were made in converting original
sources (from NCAR for instance) into NCEP reanalysis formats (i.e.
PREPBUFR). The problems consisted of about 10 years of mis-located
PAOBs (Australian bogused sea level pressure data, SH only), and about
10 years when some mainly Scandinavian radiosonde stations were
systematically not included.
For the R2 reanalysis, the causes of these errors were corrected, and
the reanalysis formatted datasets were re-made from the same original
sources. Subsequently, it was found that supplemental sig-level winds
from some ECMWF decoded radiosondes which had been included in R1, Aug
1989 through Oct. 1991, were left out of the R2 data. These were added
to the R2 data for NARR data preparation.
How are the NARR topography and land/water masks created?
A: The NARR topography and land/water masks are created in the same way
as those of the then operational Eta model, following the procedure as
Each model grid box is split into 2 x 2 subboxes, and terrain elevation
read off terrain data (USGS 30 s data where available), collecting
values and percentages of water that happen to fall within each of the
subboxes. Using these, mean subbox elevation and percentage of water is
obtained. Subsequently, for each grid box, mean and silhouette
elevations are calculated. Silhouette elevation is calculated by
looking at each pair of subboxes as seen from two directions
perpendicular to sides of the box, amounting to four pairs total, each
time taking the higher value, and subsequently averaging these four
Following this, nine-point Laplacian of the mean orography is
calculated for every box. Where the Laplacian is positive, mean is used
for the box elevation. Where it is negative, silhouette is used
(Mesinger, Bull. Amer. Meteor. Soc., 1996, p. 2646-2647).
Subsequently, an effort is made to minimize closing up of significant
valleys and mountain passes that may have happened within the
silhouette part of the so obtained topography. This is done by looking
for points for which in at least in one of the four possible directions
the average elevation of three nearest neighbors in that direction,
centered on the point considered, is less than both the average
elevation of the three nearest points on one side of these three
points, and also of the three nearest neighbors on the other side of
the three points. If so, irrespective of the sign of the Laplacian, the
mean elevation is used for that point. Points are declared water if
more than 50 percent of their area is covered by water according to the
topography data read.
This is followed by rounding off to reference interface elevation, and
elimination of "windless" points created as a result. Land points that
have winds at all four of their vertices blocked are raised as needed
to reach the lowest unblocked wind. Isolated water points, or water
points that have only one of their nearest neighbors water and are at
sea level, are also raised to reach the lowest wind, and are declared
land. Otherwise, land is removed at one of the water point's four
corners, to free one of the blocked winds. The corner chosen for land
removal is one that has the smallest three-point averaged elevation.
Elevation of water points above sea level is checked for presence of
neighboring water points with a different elevation, and, if so,
elevation of all neighboring water points is made equal.
Why does the change in precipitable water not exactly match the sum
of moisture convergence and evaporation, minus precipitation?
There are a number of reasons why this does not happen, why the "water
budget" cannot be "closed" using the terms mentioned above, and even
using all of the NARR fields available. The most obvious reason is the
"analysis increment", change in the total water of the atmospheric
column (precipitable water, PWAT) resulting form the 3D-Var analysis
step. Note that the analysis increment has been monitored and is
obtainable from NARR files, since both PWAT entering the 3D-Var at the
end of the 3 hr forecast, and PWAT out of the 3D-Var are available,
within fields of the merged files B and A, respectively.
Yet other budget terms that have been monitored and are available are
water vapor and water condensate increments (WVINC, WCINC) that are
made within the precipitation assimilation during the 3-h forecasts in
between the consecutive 3D-Var steps. Note that, for example, if the
precipitation analysis being assimilated at a grid point shows
precipitation, and the model at that location does not have enough
moisture for precipitation according to its precipitation schemes,
moisture is added to the model to achieve consistency (Lin et al. 1999).
Note also that moisture convergence in NARR consists of two fields,
water vapor flux convergence and water condensate flux convergence,
that need to be added to obtain the total moisture convergence.
But even if all of these NARR fields are accounted for the budget will
not exactly close for several reasons. One is the difference between
the local column moisture change resulting from the model's moisture
horizontal advection scheme, and the change that apparently should take
place due to the columnā€™s moisture flux convergence. Namely, the
model's moisture horizontal advection scheme (Janjic 1997) involves
non-local moisture adjustments aimed at achieving several objectives
(e.g., Mesinger and Jovic 2002, NCEP Office note 439, available online
at http://wwwt.emc.ncep.noaa.gov/officenotes). These adjustments
redistribute moisture horizontally. However, since conserving the total
moisture is one of the objectives pursued by these adjustments they
should not have much impact on area-averaged budget calculations. The
same kind of an impact has the horizontal diffusion of moisture.
Finally, interpolation from the model's native grid to the Lambert
output grid is yet another reason preventing the exact closure of the
budget. To reduce the contribution of this interpolation to the
resulting residual, moisture flux convergence has been carefully
calculated and vertically integrated on the model's native grid, and
only subsequently interpolated to the Lambert grid. Thus, users should
take care to use this NARR native grid calculated moisture flux
convergence (WVCONV, WCCONV), as opposed to calculating the moisture
flux convergence on the Lambert grid themselves using the interpolated
velocity and moisture fields, and thus needlessly introducing errors in
their budget calculations additional to those that cannot be avoided.
How many soil layers are there? At what part of the layers are the temperature and moisture valid?
All four NARR soil layers (0-10, 10-40, 40-100, 100-200 cm layer thicknesses) are available in the standard NARR output at NCDC. For each soil layer, three prognostic soil states are available in the records of the grib file: 1) soil temperature at the mid-point of the soil layer, 2) total soil moisture (sum of liquid plus frozen soil moisture), and 3) liquid soil moisture (that part of the total soil moisture that is not frozen). Substracting the amount of liquid soil moisture from the total soil moisture yields the amount of frozen soil moisture (soil ice content).
All three soil state variables above are considered valid at the mid-point of the soil layer. Units of soil temperature is Kelvin. Units of soil moisture is volumetric fraction (fraction of unit volume of soil that is occupied by soil moisture -- typical saturation value of volumetric soil moisture is 0.45, except for sandy soils the saturation value is more typically around 0.35). Super-cooled liquid water can and does easily exist in the soil state, so even at sub-freezing temperatures the NARR soil states contain some liquid soil moisture (water), but the amount of such liquid soil moisture relative to the total soil moisture decreases as the soil temperature falls further and further below the freezing point.
Why are the values of evaporation and latent heat flux different even after accounting for the difference in units?
Given that in the Eta code the evaporated water vapor is not added explicitly to the atmosphere
but instead the latent heat flux is used as a boundary condition for the vertical diffusion of
moisture, the total column water vapor needs to be calculated before and after the diffusion loop,
and the evaporation obtained as the difference between the two. Therefore, the values of the
evaporation and latent heat fluxes in NARR are not the same even after accounting for the
difference in units.