NCEP Verification System User Guide

Geoff DiMego, Hui-Ya Chuang, and Mary Hart

INTRODUCTION:

The NCEP Verification System, which generates the Verification Statistics Data Base (VSDB), is divided into three parts:
1) "editbufr", which thins the observation PREPBUFR files,
2) "prepfits", which interpolates model forecast GRIB files to the observation sites, and
3) "gridtobs", which computes and generates VSDB records.

The VSDB is in a straightforward, self-documenting ASCII format. The format and the database are described at http://www.emc.ncep.noaa.gov/mmb/papers/brill/VSDBformat.txt . The database is a collection of flat files that can be left as individual files or concatenated into larger files. These ASCII records contain the raw numbers from which many final statistics can be computed. NCEP calls these numbers partial sums. The format can also record final statistics for any domain or time period that the self-documenting format allows.

DOWNLOAD:

The tar file "NCEPVERIF.tar", which contains the source codes, running scripts, libraries, and parameter files needed to run the NCEP Verification System, is available by anonymous ftp:
1. Ftp to the EMC public server by typing ftp ftpprd.ncep.noaa.gov . Use "anonymous" as your user id and your e-mail address as the password.
2. Change the directory to /pub/emc/mmb/WRFtesting/verif/

Un-tarring "NCEPVERIF.tar" creates four directories:
1) sorc/ contains the source codes for editbufr, prepfits, and gridtobs, as well as the makefiles (named build) used to build the corresponding executables on the IBM. On platforms other than the IBM, the makefiles will need to be modified.
2) lib/ contains the libraries needed to compile the source codes. Most of these libraries only work on big-endian computers. Also note that the version of bufrlib included in NCEPVERIF.tar differs slightly from the standard NCEP bufrlib because it was modified to accommodate prepfits.
3) parm/ contains the parameter files, which users can modify to control how the verification is performed. More detail on these control files is provided later in the overview of the three programs.
4) scripts/ contains the sample scripts NCEP uses to run these three programs on the IBM.

INSTALLATION:

The steps to run the NCEP Verification System for the first time are as follows:
1. Build the libraries;
2. Compile the source codes;
3. Create an ASCII file called newdate containing the string yyyymmdd00, where yyyymmdd is one day before the day to be verified;
4. Modify the control file;
5. Obtain the PREPBUFR files;
6. Modify the scripts to reflect the correct paths for the scripts, executables, parameter files, model forecast GRIB files, PREPBUFR files, etc. Submit the script.

NOTE: At NCEP, the verification system is typically run once per 24-hour cycle at 00Z to verify the model forecasts from the previous day (${DATE}). Each time, the main running script (runpfit.retro.mangr) calls for simultaneous execution of a series of eight scripts (runfits${VH}.pll.ll, where ${VH} stands for the verifying hour) to verify the model forecasts every three hours at 00Z, 03Z, 06Z, ..., 21Z for that whole day (${DATE}). Our sample script, exfits.${VH}z.pll.sh, collects and verifies the model forecasts, at 12-hour intervals, that are all valid at ${VH}Z on ${DATE}. Each run of exfits.${VH}z.pll.sh executes editbufr, prepfits, and gridtobs in sequence. When all eight scripts have completed, all the VSDB records for the same day are combined into one file.

OVERVIEW:

The following is a description of the three components of the verification system and their input files.

1. The code editbufr reads in, thins, and writes out observations in the Operational PREPBUFR format, which is based on the common international standard BUFR format. As part of the process of generating a PREPBUFR file, some platform-specific QC is applied to the data contained in the PREPBUFR file.
Dennis Keyser describes PREPBUFR processing in greater detail at http://www.emc.ncep.noaa.gov/mmb/papers/keyser/data_processing . [For WRF, PREPBUFR will be the format not only for the verifying observations but also for the WRF 3DVAR analysis.] We have also restricted the data to just those pieces of information we actually use in the analysis / data assimilation. THEREFORE, we only have wind, temperature, height and moisture. Each piece of data also has an associated observation pressure, which is used along with latitude and longitude to locate the ob in three dimensions in the atmosphere. Sea-level pressure obs are the only pressures that are not used for this ob location function.

The editbufr step thins the complete obs collection contained in the Operational PREPBUFR down to just those data to be used for verification, and creates a temporary output file. The thinning saves time and space in the next step, prepfits, where most of the computer work is actually done.

The output file uses the standard PREPBUFR format with one difference. NCEP's standard PREPBUFR allows each observation to have stored with it one value of the first guess or background (generated at the location and time of the observation). The temporary output file from editbufr is identical to PREPBUFR in all respects except that it allows multiple backgrounds to be stored. The backgrounds are added in the next step, and for that reason we call this our PREPFITS format to distinguish it from the PREPBUFR format.

The decision to include data in the output file is based on an input control file called keeplist, which allows specification of the time window, areal extent, and observation types to be included. A sample keeplist file is shown below.
-------------------------------------------------------------------
IRETGRID - GRID NUMBER OF THE RETENTION AREA
104
YYMMDD - DATE OR TIME WINDOW INDICATOR
-75
OBTYP - UP TO 20 OB TYPES TO BE RETAINED
120
220
221
122
222
223
224
133
233
180
280
181
281
182
282
183
284
-------------------------------------------------------------------

As shown in the sample above, the data to be retained can be controlled by specifying:
line 2: geographic location, using the AWIPS grid number,
line 4: time window, in hundredths of an hour, within which you wish to keep the data, and
lines 6-22: observation types.

Eric Rogers shows the myriad of output grids needed from the Operational Meso Eta (and the RUC too) at his webpage: http://www.emc.ncep.noaa.gov/mmb/etagrids/ . The definition of observation types can be found at: http://www.emc.ncep.noaa.gov/mmb/papers/keyser/prepbufr.doc/table_4.htm

2. The code prepfits reads in the observations from the temporary output file created by editbufr, and adds to each piece of data background values from one or more forecasts that are valid at the time of the observations. ONCE PER DAY you will be able to verify all the model forecasts that are valid at the time of the data collected / selected in the input prepfits file. You can run this code for just one forecast or for many.

The background values (model forecasts) are generated by horizontal and vertical bi-linear interpolation from the standard GRIB grid representations. This code deals with AVN, NGM, RUC and Eta model fields, so it was written to perform the vertical interpolation from the standard pressure level output of those models. In the case of the Eta model, this method does not introduce very much uncertainty (3-5%) compared to performing vertical interpolation from native model levels, AS LONG AS we have isobaric model data AT LEAST every 50 mb. Vertical interpolation is linear in ln p for everything except specific humidity, for which ln q is interpolated linearly in ln p.
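
The vertical interpolation rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prepfits source: the function names (interp_lnp, interp_q_lnp) and the sample values are hypothetical, and the sketch assumes the two bracketing isobaric levels have already been found.

```python
import math


def interp_lnp(p_obs, p_lo, p_hi, v_lo, v_hi):
    """Interpolate a value linearly in ln(p) between two bracketing levels.

    p_lo/p_hi are the pressures (mb) of the two model levels and v_lo/v_hi
    the model values there. Names are hypothetical, chosen for illustration.
    """
    weight = (math.log(p_obs) - math.log(p_lo)) / (math.log(p_hi) - math.log(p_lo))
    return v_lo + weight * (v_hi - v_lo)


def interp_q_lnp(p_obs, p_lo, p_hi, q_lo, q_hi):
    """Interpolate specific humidity: ln(q) is linear in ln(p).

    Assumes q_lo and q_hi are strictly positive so the logarithm exists.
    """
    ln_q = interp_lnp(p_obs, p_lo, p_hi, math.log(q_lo), math.log(q_hi))
    return math.exp(ln_q)


if __name__ == "__main__":
    # Observation at 600 mb, bracketed by model levels at 900 mb and 400 mb.
    # 600 mb is the geometric mean of the two, i.e. halfway in ln(p).
    print(interp_lnp(600.0, 900.0, 400.0, 280.0, 250.0))    # close to 265.0
    print(interp_q_lnp(600.0, 900.0, 400.0, 0.008, 0.002))  # close to 0.004
```

Because the weight is computed in ln p, a level halfway between two isobaric surfaces in pressure is not weighted one half; for moisture the result is a geometric rather than arithmetic blend of the two q values.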
The moisture variable in PREPBUFR and PREPFITS is specific humidity q.

Two options are available for verifying forecasts at the surface. The first option is to directly compare the post-processed surface variables of the model (2 meter temperature & moisture and 10 meter winds) with the observations. This method only involves a horizontal interpolation. There is no adjustment for discrepancies between the elevation of the observations and the model terrain height. AFTER ALL, forecasters don't do this calculation when using these fields! The second option is to perform a 3-dimensional interpolation of the post-processed fields from the model (which ALWAYS extend to 1000 mb) to the observed pressure. This option performs the necessary adjustment for elevation differences between the observation and the model terrain. It also reflects what the forecasters will see in the below-terrain isobaric fields coming out of the model post-processor, like the 1000 mb and even 850 mb fields under the Rockies.

The prepfits job can be run multiple times to add additional model forecasts to the PREPFITS file. Remember, the verifying observations remain the same - we are simply adding different forecasts with a common valid time. In practice, we would do this if we were going to archive this PREPFITS file, but we generate so much output that we do not have enough archive space to do that.

There are four input files for prepfits (sample files for each of them can be found in the directory parm/):
1) levcat: allows specification of the number of levels and the data categories;
   line 1: number of levels to read in from the GRIB file,
   line 2: logical flag to choose the observation data categories to fit.
   NOTE: There are 10 data categories: 1) surface, 2) mandatory level, 3) significant level temperature & moisture, 4) winds by pressure, 5) winds by height, 6) tropopause, 7) any single level for aircraft data, 8) auxiliary, 9) not used, and 10) not used;
2) data00: the thinned-down BUFR file from editbufr;
3) prepfits.tab: a BUFR table defining the BUFR mnemonics;
4) prepfits.in00: contains the name of the experiment (e.g., WRF), the name of the model forecast GRIB file, and the name of the index file of the model forecast GRIB file. The file prepfits.in00 is read in as standard input and is generated within our sample script exfits.${VH}z.pll.sh.

3. The code gridtobs generates VSDB records containing the desired partial sums. The code takes a brute-force, brain-dead approach to the problem. We read in instructions from a control file (e.g., gridtobs.wrf0012) that contains essentially the bounding parameters over which the partial sums are to be accumulated. This bounding parameter info also facilitates generation of the 9 headers of the VSDB record. A sample control file is shown below:

------------------------------------------------------------------------
V01 10
1 WRF/212
6 00
12
24
36
48
60
1 19
1 ADPUPA
1 G236
1 SL1L2
4 Z
T
RH
VWND
11 P1000
P850
P700
P500
P400
P300
P250
P200
P150
P100
P50
------------------------------------------------------------------------

The sample control file contains specifications for:
line 1: version number of the verification system (which is always version one at NCEP), and unit number for the input BUFR file.
line 2: number of models to verify, and the names of the models/grid numbers. One can specify up to 6 different models.
lines 3-8: number of forecast hours (has to be less than 20) to read from the PREPFITS file, followed by all the two-digit forecast hours (00h to 60h every 12 h).
line 9: number of verification dates, and the 10-digit yyyymmddhh.
If only one date is verified at a time, as at NCEP, the date of the PREPFITS input file (the output of the prepfits step) is used instead. In the sample control file, because only one verification date is used, the value of 19 is ignored.
line 10: number of verifying ob types (maximum 10 types), and the names of the ob types.
line 11: number of verification areas over which the partial sums are computed, and the grid numbers of the areas.
line 12: number of statistics types, and the types of statistics.
lines 13-16: number of variable types to verify (max 6 types), and the names of the variables. In the sample control file, four types of variables are verified: height, temperature, relative humidity, and wind.
lines 17-27: number of levels to verify, and the names of the levels. For example, P1000 represents the 1000 mb pressure level and SFC represents the surface.

Two additional input files are regions and grid#104. These two files are used to divide grid 104 into sub-regions so that statistics may also be computed within each sub-region. The file regions defines the two-digit number and three-letter abbreviation associated with each sub-region. The file grid#104 assigns each grid point in NCEP grid 104 a sub-region number that is consistent with the definitions in the file regions. Currently, 29 sub-regions are defined within NCEP grid 104.

TEST DATA:

If you wish to verify your model forecasts, one month's worth of PREPBUFR observation files for August 2001 is available on the EMC public server under /pub/emc/mmb/WRFtesting/data/ . The directory also contains data sets of Eta analysis and forecast GRIB files that can be used to initialize WRF for the same period. There are four types of files in this directory:
(i) yyyymmddhh.INPUT.tar: analysis and forecast Eta GRIB files (every 3 h from 0 h to 48 h) initialized at the yyyymmddhh cycle.
(ii) yyyymmddhh.INPUT.list: a list of the file names in yyyymmddhh.INPUT.tar.
(iii) yyyymmdd.prepbufr.tar: 8 PREPBUFR files verifying from yyyymmdd00 to yyyymmdd21 every 3 hours.
(iv) yyyymmdd.prepbufr.list: a list of the file names in yyyymmdd.prepbufr.tar.

DISPLAY OF VSDB:

In addition to the NCEP Verification System described above, NCEP also has the Forecast Verification System (FVS) software, which Keith Brill wrote to process the VSDB, accumulate the partial sums into final sums, compute the requested statistics, and display the results. This software reads the control file with the user's requests, constructs a list of the records it needs, and scans the VSDB for matching records (using the basic UNIX file names given to the records AND their contents). It uses some GEMLIB (GEMPAK library) entities to help with this name/record matching and to display the resulting final statistics. The FVS is also capable of writing the final statistics out as a VSDB record. At present, we are still storing a number of individual records (e.g., for each cycle's run for each day), but this facility of FVS could be used to condense the numbers down into weekly, monthly, seasonal or annual statistics.

PRECIPITATION VERIFICATION SYSTEM:

Information on the Precipitation Verification System will be provided in the near future.
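
As a closing illustration of the idea described under DISPLAY OF VSDB, the Python sketch below accumulates several sets of partial sums into one and computes two final statistics from the result. It is not the FVS code. It ASSUMES that an SL1L2 record carries a data count plus the means of f, o, f*o, f*f and o*o (f = forecast, o = observation); consult VSDBformat.txt for the authoritative definition. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PartialSums:
    """Assumed SL1L2 content: a count and five means (hypothetical layout)."""
    n: float    # number of forecast/observation pairs
    f: float    # mean forecast
    o: float    # mean observation
    fo: float   # mean of forecast * observation
    ff: float   # mean of forecast squared
    oo: float   # mean of observation squared


def from_pairs(pairs):
    """Build partial sums from raw (forecast, observation) pairs."""
    n = len(pairs)
    return PartialSums(
        n,
        sum(f for f, _ in pairs) / n,
        sum(o for _, o in pairs) / n,
        sum(f * o for f, o in pairs) / n,
        sum(f * f for f, _ in pairs) / n,
        sum(o * o for _, o in pairs) / n,
    )


def combine(records):
    """Accumulate records into one by count-weighted averaging of each mean."""
    total = sum(r.n for r in records)
    if total == 0:
        raise ValueError("no data to combine")

    def weighted(name):
        return sum(r.n * getattr(r, name) for r in records) / total

    return PartialSums(total, weighted("f"), weighted("o"),
                       weighted("fo"), weighted("ff"), weighted("oo"))


def bias(r):
    """Mean error: mean forecast minus mean observation."""
    return r.f - r.o


def rmse(r):
    """Root-mean-square error from the second-moment partial sums."""
    return math.sqrt(max(r.ff + r.oo - 2.0 * r.fo, 0.0))


if __name__ == "__main__":
    day1 = from_pairs([(1.0, 2.0), (3.0, 3.0)])
    day2 = from_pairs([(5.0, 4.0)])
    both = combine([day1, day2])
    print(both.n, bias(both), rmse(both))
```

Storing means together with a count is what makes the records combinable: any set of daily records can be condensed into weekly or monthly statistics without returning to the original observations.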