Objective Evaluation of QPF and PQPF Forecasts Based on NCEP Ensemble
Yuejian Zhu and Zoltan Toth
Environmental Modeling Center
National Centers for Environmental Prediction
A subjective comparison of probabilistic quantitative precipitation
forecasts (PQPF) based on the operational T62 resolution ensemble
of forecasts (Toth and Kalnay, 1997) and the T126 control quantitative
precipitation forecasts (QPF) at the National Centers for Environmental
Prediction (NCEP) indicates that the time period of skillful precipitation
forecasts can be extended by a day or two by using PQPF information derived
from an ensemble (Zhu et al. 1998). In this study, we objectively evaluate
QPF and PQPF forecasts based on the lower resolution T62 ensemble and the
higher resolution T126 control forecasts over the continental United States
by using 24-hour accumulated precipitation analyses from hourly gauge data.
QPF information will be derived from the ensemble-based PQPF probability
distribution using the median (the precipitation amount that is predicted
to be exceeded with 50% probability), or other percentiles (the
precipitation amounts predicted to be exceeded with other levels of
probability, such as 30%, 70% and so on).
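As a rough illustration of the percentile idea, the sketch below reads an exceedance value directly from the sorted ensemble members at one grid point, with linear interpolation between members. This is an empirical stand-in, not a fitted parametric distribution, and the function name `amount_exceeded_with_probability` and the sample values are hypothetical.

```python
def amount_exceeded_with_probability(members, prob):
    """Return the amount that the ensemble exceeds with probability `prob`.

    `members` holds one precipitation amount (mm) per ensemble member.
    An exceedance probability p corresponds to the (1 - p) quantile of
    the members, estimated here by linear interpolation.
    """
    if not members:
        raise ValueError("at least one ensemble member is required")
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie between 0 and 1")
    ordered = sorted(members)
    position = (1.0 - prob) * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


# Hypothetical 24-hour amounts (mm) from five members at one grid point.
members = [0.0, 1.0, 2.0, 3.0, 4.0]
median_qpf = amount_exceeded_with_probability(members, 0.5)
wet_qpf = amount_exceeded_with_probability(members, 0.3)  # larger amount
dry_qpf = amount_exceeded_with_probability(members, 0.7)  # smaller amount
```

Note that a lower exceedance probability selects a larger amount, so the 30% value is wetter than the 70% value.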
To represent the PQPF distributions, the three-parameter Pearson Type III
(PE3) distribution (Lehman 1997) will be used. QPF forecasts based on the
control and on the ensemble will be compared using Equitable Threat Scores
(ETS) and Standard Bias Scores (SBS). The evaluation period is the full
winter season of 1997-1998 (December 1997, January 1998 and February 1998).
The observations are 24-hour analyses accumulated from hourly gauge data
(Baldwin and Mitchell 1996).
Corresponding Author: Yuejian Zhu, EMC/NCEP/NOAA, Rm. 204, 5200 Auth Road,
Camp Springs, MD 20746
ETS is defined as the ratio of (HIT - EXP) to (OBS + FCS - HIT - EXP),
counted over all amounts greater than the threshold precipitation value,
where OBS is the number of points at which the threshold was observed to be
exceeded, FCS is the number of points at which it was forecast to be
exceeded, HIT is the number of correct forecasts (hits), and EXP is
(FCS * OBS / TOT), where TOT is the total number of verifiable points. SBS
is the ratio of FCS to OBS. In general, higher ETS values represent more
skilful forecasts, while SBS values close to 1 represent forecasts with no
quantitative bias in terms of over- or under-forecasting.
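The two scores can be sketched in a few lines of Python. This is a minimal illustration of the definitions above, not the verification code used in the study; the function names and the sample forecast and observation values are hypothetical, and cases with a zero denominator are returned as NaN by assumption.

```python
def contingency_counts(forecasts, observations, threshold):
    """Count FCS, OBS, HIT and TOT for amounts greater than `threshold`."""
    fcs = obs = hit = tot = 0
    for forecast, observed in zip(forecasts, observations):
        forecast_yes = forecast > threshold
        observed_yes = observed > threshold
        fcs += forecast_yes
        obs += observed_yes
        hit += forecast_yes and observed_yes
        tot += 1
    return fcs, obs, hit, tot


def equitable_threat_score(fcs, obs, hit, tot):
    """ETS = (HIT - EXP) / (OBS + FCS - HIT - EXP), EXP = FCS * OBS / TOT."""
    if tot == 0:
        return float("nan")
    exp = fcs * obs / tot
    denominator = obs + fcs - hit - exp
    if denominator == 0:
        return float("nan")
    return (hit - exp) / denominator


def standard_bias_score(fcs, obs):
    """SBS = FCS / OBS; 1 means no over- or under-forecasting."""
    return fcs / obs if obs else float("nan")


# Hypothetical 24-hour amounts (mm) at six verification points.
forecasts = [0.0, 3.0, 6.0, 12.0, 0.5, 8.0]
observations = [0.0, 6.0, 7.0, 2.0, 0.0, 9.0]
fcs, obs, hit, tot = contingency_counts(forecasts, observations, 5.0)
ets = equitable_threat_score(fcs, obs, hit, tot)
sbs = standard_bias_score(fcs, obs)
```

For this sample and a 5 mm threshold, FCS = 3, OBS = 3, HIT = 2 and TOT = 6, so EXP = 1.5, ETS = 0.5 / 2.5 = 0.2 and SBS = 1.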
Based on the ETS and SBS scores from the full winter season, Figure 1
evaluates the seasonal average 24-hour accumulated precipitation forecasts
at 4 days (84-108 hours) lead time for the T126 control forecast (MRF), the
T62 low resolution control forecast (T62) and the ensemble median (50th
percentile) forecast (P50). For lower threshold values (from 0.2 mm to
10.0 mm per day), the median ensemble forecast exhibits the highest ETS and
best (closest to 1) SBS scores. For higher threshold values (at and beyond
25.0 mm per day), both control forecasts exhibit better scores than the
ensemble median. One should note, however, that at those high threshold
values there is only a very small sample of forecasts to evaluate. Fig. 1
shows that all the forecasts, and especially the median ensemble forecast,
exhibit a positive area bias at low precipitation amounts and a negative
bias at high precipitation amounts. The ensemble, however, offers an easy
way of eliminating much of this bias. Instead of looking only at the median
ensemble forecast, one can evaluate forecasts associated with different
percentile values, such as 30%, 40%, etc. (Fig. 2). We can see from this
figure that in order to produce unbiased QPF forecasts from the ensemble,
one needs to use different percentile values for different precipitation
amount thresholds. For example, for the 2, 5, 10, 15 and 25 mm threshold
values the 70, 60, 50, 40, and 30% values yield close to unbiased forecasts.
Note that these percentile values also correspond to ETS scores that
are typically higher than those corresponding to the control forecasts.
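One way to picture the selection of a percentile per threshold is to compute SBS for each candidate level and keep the level whose score is closest to 1. The sketch below does this for a tiny sample; it is an illustration under assumed inputs, not the procedure used in the study, and the names `bias_score` and `least_biased_level` and all sample values are hypothetical.

```python
def bias_score(forecasts, observations, threshold):
    """SBS: forecast count over observed count above `threshold`."""
    fcs = sum(1 for amount in forecasts if amount > threshold)
    obs = sum(1 for amount in observations if amount > threshold)
    return fcs / obs if obs else float("nan")


def least_biased_level(forecasts_by_level, observations, threshold):
    """Return the exceedance level (%) whose SBS is closest to 1."""
    best_level, best_gap = None, float("inf")
    for level, forecasts in sorted(forecasts_by_level.items()):
        score = bias_score(forecasts, observations, threshold)
        if score != score:  # NaN: no observed events at this threshold
            continue
        gap = abs(score - 1.0)
        if gap < best_gap:
            best_level, best_gap = level, gap
    return best_level


# Hypothetical amounts (mm) at six points; keys are exceedance levels in %.
observations = [0.0, 3.0, 6.0, 1.0, 12.0, 0.0]
forecasts_by_level = {
    30: [2.5, 5.0, 7.0, 4.0, 14.0, 1.5],
    50: [1.0, 3.5, 5.5, 2.5, 10.0, 0.5],
    70: [0.5, 2.5, 4.0, 0.5, 8.0, 0.0],
}
level_for_2mm = least_biased_level(forecasts_by_level, observations, 2.0)
level_for_10mm = least_biased_level(forecasts_by_level, observations, 10.0)
```

In this made-up sample the 70% level is least biased at 2 mm and the 30% level at 10 mm, which mirrors the direction of the result described above.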
Tests will be performed using forecasts from independent seasons
to assess whether percentiles for optimal QPF forecasts can be chosen
independently. The verification statistics from this study confirm our
earlier subjective evaluation, suggesting that the ensemble-based PQPF
forecasts should
provide more skilful guidance than those available when using only control
forecasts. Beyond improving QPF forecasts, the other potential advantage
of using ensembles is that they can be used to generate PQPF forecasts.
The results of this study will also be used in designing the calibration
algorithm for the ensemble-based PQPF forecasts at NCEP. This algorithm
will ensure that the forecast probabilities and the observed frequencies
will match over large statistical samples.
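The match between forecast probabilities and observed frequencies can be checked with a simple reliability table. The sketch below groups forecasts into probability bins and compares the mean forecast probability with the observed frequency in each bin. It is only an illustration of the check, not the NCEP calibration algorithm; the function name `reliability_table`, the bin count and the sample data are hypothetical.

```python
def reliability_table(probabilities, outcomes, n_bins=5):
    """Return (mean forecast probability, observed frequency, count) per bin.

    `probabilities` are forecast probabilities in [0, 1] of exceeding a
    threshold; `outcomes` are True where the threshold was exceeded.
    Empty bins are omitted.
    """
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for probability, occurred in zip(probabilities, outcomes):
        index = min(int(probability * n_bins), n_bins - 1)
        sums[index] += probability
        hits[index] += 1 if occurred else 0
        counts[index] += 1
    return [
        (sums[i] / counts[i], hits[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i]
    ]


# Hypothetical forecasts: four at 10% and two at 90%.
probabilities = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9]
outcomes = [False, False, False, True, True, True]
rows = reliability_table(probabilities, outcomes, n_bins=2)
```

A well-calibrated system gives rows in which the first two entries are close to each other, provided the sample in each bin is large.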
Lehman, R., 1997: Modeling precipitation data with Pearson Type III
Distributions: Overview of new interactive software "PFIT". Preprints of
the 13th Conference on Hydrology, 2-7 February 1997, Long Beach, California,
Baldwin, M., and K. Mitchell, 1996: The NCEP hourly multisensor U. S.
precipitation analysis. Preprints, 11th NWP Conf. Norfolk, VA, 19-23 Aug.
Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the
breeding method. Mon. Wea. Rev., 125, p. 3297-3319.
Toth, Z., Y. Zhu, T. Marchok, S. Tracton and E. Kalnay, 1998: Verification
of the NCEP Global Ensemble Forecasts, Preprint, 12th Conf. on Numerical
Weather Prediction, 11-16 January 1998, Phoenix, Arizona, p 286-289.
Zhu, Y., Z. Toth, E. Kalnay and S. Tracton, 1998: Probabilistic Quantitative Precipitation Forecasts based on the NCEP global ensemble. Preprints, 14th Int. Conf. on Interactive Information and Processing Systems for Meteorology, Oceanography, and Hydrology, 11-16 January 1998, Phoenix, Arizona, p. J8-J11.
Zhu, Y., G. Iyengar, Z. Toth, S. Tracton and T. Marchok, 1996: Objective evaluation of the NCEP global ensemble forecasting system. Preprints, 15th Conference on Weather Analysis and Forecasting, 19-23 August 1996, Norfolk, Virginia, p. J79-J82.