PREPBUFR PROCESSING AT NCEP

Dennis Keyser - NOAA/NWS/NCEP/EMC
(Last Revised 2/12/2018 - should be up to date)

Please take a moment to read the Disclaimer for this non-operational web page.

1. INTRODUCTION

The "PREPBUFR" processing is the final step in preparing the majority of conventional observational data for assimilation into the various NCEP analyses, including the North American Model (NAM) unified grid-point statistical interpolation (GSI) analysis (the "NAM" network), the Global Forecast System (GFS) and Global Data Assimilation System (GDAS) unified grid-point statistical interpolation (GSI) analysis (the "GFS" and "GDAS" networks), the Rapid Refresh (RAP) unified grid-point statistical interpolation (GSI) analysis [the "RAP" network, shared with the High-Resolution Rapid Refresh (HRRR)], the Real Time Mesoscale Analysis (RTMA) and UnRestricted Mesoscale Analysis (URMA) unified grid-point statistical interpolation (GSI) analysis (the "RTMA" and "URMA" networks, respectively), and the Climate Data Assimilation System (CDAS) spectral statistical interpolation (SSI) analysis (the "CDAS" network). This step involves the execution of a series of programs designed to assemble observations dumped from a number of on-line decoder databases, encode information about the observational error for each data type as well as the background (first guess) interpolated to each data location, perform both rudimentary multi-platform quality control and more complex platform-specific quality control, and store the output in a monolithic BUFR file, known as PREPBUFR. The background guess information is used by certain quality control programs, while the observation error is used by the analysis to weight the observations. The structure of the BUFR file is such that each PREPBUFR processing step which changes a datum (either the observation itself, or its quality marker) records the change as an "event" with a program code and a reason code. Each time an event is stored, the previous events for the datum are "pushed down" in the stack. In this way, the PREPBUFR file contains a complete history of changes to the data throughout all of the PREPBUFR processing.
The most recent changes are always at the top of the stack and are thus read first by any subsequent data decoder routine. It is expected that the data at the top of the stack are of the highest quality. Once the PREPBUFR job has completed, a separate PREPBUFR post processing job is initiated.
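
The event stack described above can be pictured with a small sketch. This is only an illustration of the "push down" idea, not the actual BUFR encoding: the class names, the field names and the sample quality marker and reason code values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    value: Optional[float]  # observation value after this step
    quality_marker: int     # quality marker after this step (sample values only)
    program_code: int       # identifies the processing step that made the change
    reason_code: int        # identifies why the step made the change (sample values only)


@dataclass
class Datum:
    # events[0] is the top of the stack, i.e. the most recent event
    events: List[Event] = field(default_factory=list)

    def record(self, event: Event) -> None:
        # A new event goes on top; all earlier events are pushed down.
        self.events.insert(0, event)

    def current(self) -> Event:
        # A decoder reads the top of the stack first.
        return self.events[0]


# Hypothetical temperature datum touched by two processing steps.
temperature = Datum()
temperature.record(Event(value=285.4, quality_marker=2, program_code=1, reason_code=1))
temperature.record(Event(value=285.4, quality_marker=13, program_code=5, reason_code=3))
```

Here `temperature.current()` returns the event written by the later step, while the full `events` list still holds the earlier one, which is how the history of changes is preserved.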

2. PREPBUFR PROCESSING PROGRAMS

A. PREPOBS_PREPDATA

     Purpose: To read in and consolidate observations dumped from individual BUFR DATA databases, perform rudimentary checks on the data, and organize upper-air data by decreasing pressure. For all networks except the CDAS, and to some extent the RTMA and URMA, also performs a number of tasks under the name GBLEVENTS: 1) Adds forecast background (first guess) interpolated to each observation location; 2) Adds observational error (read in from a look-up table) to each observation; 3) Performs some rough quality control checks on surface pressure (vs. the background); and 4) Converts dry bulb temperature to virtual and dewpoint temperature to specific humidity for surface data. All of these GBLEVENTS functions are performed by the program PREPOBS_PREVENTS in the CDAS network. In the RTMA and URMA networks, the only GBLEVENT function performed is item 4. Output is stored in a monolithic BUFR file called PREPBUFR.
     Input: Various BUFR data dump files including (based on the network): ADPUPA (rawinsonde, pibal, dropwinsonde, reconnaissance), AIRCAR (MDCRS-ACARS aircraft), AIRCFT (AIREP, PIREP, AMDAR, and TAMDAR aircraft), SATWND (GOES satellite derived cloud winds from NESDIS, EUMETSAT, GMS, INSAT as well as POES winds from Aqua/Terra MODIS), PROFLR [wind profiler and acoustic sounder (SODAR) winds by height], VADWND (Vertical Azimuth Display winds by height at U.S. NEXRAD radar sites), ADPSFC (surface land synoptic and METAR), SFCSHP (surface marine ships, buoys, C-MAN platforms, tide gauges, and splash-level dropwinsondes), GOESND (GOES 4-layer precipitable water retrievals, sounder radiances, and cloud-top data from NESDIS), ATOVS (temperature soundings from NESDIS), RASSDA [Radio Acoustic Sounding System (RASS) vertical profiles of virtual temperature], GPSIPW (GPS Integrated Precipitable Water retrievals), MSONET (Mesonet data from a myriad of providers, mostly over the U.S.), WDSATR [reprocessed, SUPEROBed (optional) WindSAT scatterometer derived oceanic wind speed and direction (NOTE: WindSAT data have not been processed since August 2012 due to a format change in the raw files; in all likelihood these data will not be restored)], and ASCATW [reprocessed, SUPEROBed (optional) ASCAT scatterometer derived oceanic wind speed and direction]. Also reads in (based on the network) the global nemsio first guess file valid at the PREPBUFR center time [in the GFS and GDAS (only) when tropical storms are present, this guess is updated by the program RELOCATE_MV_NVORTEX in the relocation step of the upstream tropical cyclone processing], the observational error table (text) file, the BUFR mnemonic table file (more about this later), and network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A file known as PREPBUFR, containing observations with state variables, sensible weather elements, and other ancillary information needed by the analyses, as well as (depending upon the network) forecast background and (depending upon the network) observation errors ¹. At this point the only quality control on the data consists of the rudimentary limit checks applied by this program, the checks of surface pressure observations compared to the background (except for CDAS, RTMA and URMA network runs), and those applied in the upstream observational dumping process: the interactive NCEP NCO Implementation and Data Services Branch purge or keep flags on data types such as rawinsonde, aircraft, satellite wind, surface land, surface marine, wind profiler/SODAR and Vertical Azimuth Display winds, and the interactive quality markers generated by NCEP's Ocean Prediction Center (OPC) on marine ship and buoy data.
     Note 1: In the NAM and RAP networks, the background used is the global nems first guess (subject to tropical cyclone relocation in the NAM network; note that this no longer applies in the NAM after March 2017). This is only used by the subsequent quality control programs. No background is encoded in the RTMA or URMA network as no quality control programs are run here. Also, observational errors are not encoded in the PREPBUFR file in the RAP, RTMA and URMA networks ¹.
     Note 2: In all networks except the RTMA and URMA, this program is multi-tasked amongst 3 nodes on the WCOSS machine to speed up processing time. In order to load-balance the run streams, each of the input data dump files in this case is divided into 3 equal parts by the program PREPOBS_MPCOPYBUFR. This is analogous to a card game where all of the cards in the deck are dealt out to 3 players. Next, PREPOBS_PREPDATA runs in 3 parallel run streams, with each run using the mini-dump files as input. Each run stream uses all of the dump types, but for each type only one third of the original dump is processed. A program called PREPOBS_LISTHEADERS runs immediately after PREPOBS_PREPDATA in each run stream, reordering all message types in each "mini" PREPBUFR file according to the order specified in the BUFR mnemonic table. This is necessary because when all 3 run streams of PREPOBS_MPCOPYBUFR/PREPOBS_PREPDATA/PREPOBS_LISTHEADERS have completed, the program PREPOBS_MONOPREPBUFR concatenates the 3 mini-PREPBUFR files into a monolithic PREPBUFR file ready for subsequent processing.
Note 3: In all networks except for the CDAS, the data read in from the SATWND dump file and encoded into the PREPBUFR files is not read by the subsequent analysis. Instead, the GSI reads in this same SATWND dump file directly and ignores the SATWND data in the PREPBUFR files.

¹The observation errors encoded into the PREPBUFR files (in the GFS/GDAS and NAM) are no longer read by the GSI. Instead, the GSI reads observation errors (for all networks) from an external file in the network fixed file directory. However, the CDAS SSI still reads in the observation errors encoded into the CDAS PREPBUFR file.
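
The card-dealing analogy in Note 2 above can be sketched as a round-robin split followed by a concatenation. This is a minimal illustration only: the function names are hypothetical, the unit being dealt (shown here as generic items) is an assumption, and the reordering done by PREPOBS_LISTHEADERS is not modeled.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def deal(items: Sequence[T], n_streams: int = 3) -> List[List[T]]:
    """Deal items out to n_streams run streams like cards to players."""
    streams: List[List[T]] = [[] for _ in range(n_streams)]
    for index, item in enumerate(items):
        streams[index % n_streams].append(item)
    return streams


def concatenate(streams: Sequence[Sequence[T]]) -> List[T]:
    """Join the per-stream outputs end to end into one monolithic list."""
    return [item for stream in streams for item in stream]


# Ten hypothetical dump items split across three run streams.
mini_dumps = deal(list(range(10)))
monolithic = concatenate(mini_dumps)
```

Dealing in turn keeps the stream sizes within one item of each other, which is what makes the three run streams take roughly the same time.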

B. SYNDAT_SYNDATA (NOTE: No longer runs in NAM after March 2017. Thus it only runs in the GFS and GDAS networks. However, the synthetic bogus winds it generates are no longer assimilated by the GFS/GDAS GSI after July 2017; they are only monitored. At some point SYNDAT_SYNDATA may be modified to no longer generate bogus winds in the GFS and GDAS in order to save time.)

     Purpose: Performs three distinct functions:
1) Reads in quality controlled tropical storm position records from the tcvitals file valid at the PREPBUFR center time and uses them, along with other observations in the PREPBUFR file, to generate synthetic (bogus) wind mandatory level profile reports (throughout the depth of the storm) in the vicinity of the storm(s) to better define tropical systems for the analysis. In the NAM network, a synthetic mass report at all tropical cyclone center locations is also generated with a surface pressure based on the global sigma first guess pressure (from the relocated global sigma guess) adjusted according to the storm category (from the Saffir-Simpson Hurricane Scale), and with specific humidity values generated on mandatory levels throughout the depth of the storm from the relocated global first guess temperatures and an assumption of 99% relative humidity. The synthetic wind reports are then appended to the PREPBUFR file in all networks and assimilated by the analysis. The synthetic mass reports generated in the NAM network are currently being tested and are not yet assimilated by the GSI analysis. The forecast background (first guess) interpolated to each observation location and the observational error, read in from a look-up table, are also encoded in the PREPBUFR file in all networks where this runs (now only GFS and GDAS).
2) Flags mass data in observations sufficiently "close" to all storms in the tcvitals file list (i.e., within the lat/lon boundary for which bogus reports are generated). These data will then not be assimilated.
3) Flags wind data in dropwinsonde reports sufficiently "close" to all storms in the tcvitals file list (i.e., within a distance to storm center of the larger of 111 km or three times the radius of maximum surface wind). These data will then not be assimilated.
Input: Quality-controlled tropical storm position and intensity field (tcvitals) file (in the NAM network in functions 1 and 2 above and in all networks in function 3 above, this is the so-called operational file generated by the program SYNDAT_QCTROPCY in the quality control step of the upstream tropical cyclone processing, while in the GFS and GDAS networks in function 1 and function 2 above this is the file created by the program RELOCATE_MV_NVORTEX in the relocation step of the upstream tropical cyclone processing) valid at the PREPBUFR center time. Also, the PREPBUFR file output from the previous program PREPOBS_PREPDATA, network-specific parm (data) cards which control processing through namelist variable switches, the global nems first guess file valid at the PREPBUFR center time [in the GFS and GDAS (only) when tropical storms are present, this guess is updated by the program RELOCATE_MV_NVORTEX in the relocation step of the upstream tropical cyclone processing], and the observational error table (text) file.
     Output: A PREPBUFR file with synthetic reports added (observations as well as the background first guess and observation errors), as well as mass reports (from all sources) and dropwinsonde wind reports flagged in the vicinity of each storm in the tcvitals file.
     Note: This program does not run in the CDAS, RAP, RTMA and URMA, and after March 2017, NAM networks. It will only run to generate bogus data and flag mass data near storms in the GFS and GDAS networks if tropical storm data are available in the input tcvitals file created by the program RELOCATE_MV_NVORTEX in the relocation step of the upstream tropical cyclone processing (most likely not the case). It will only run to generate bogus data and flag mass data near storms in the NAM network and to flag dropwinsonde wind data near storms in the GFS, GDAS and NAM networks if tropical storm data are available in the input tcvitals file generated by the program SYNDAT_QCTROPCY in the quality control step of the upstream tropical cyclone processing.
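
The proximity rule in function 3 above (flag dropwinsonde winds within the larger of 111 km or three times the radius of maximum surface wind) can be sketched as follows. This is an illustration under stated assumptions, not the SYNDAT_SYNDATA code: the function names are hypothetical, the distance is computed with a haversine formula on a spherical Earth of radius 6371 km, and the boundary is treated as inclusive.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flag_radius_km(rmw_km: float) -> float:
    """Larger of 111 km or three times the radius of maximum surface wind."""
    return max(111.0, 3.0 * rmw_km)


def is_close_to_storm(ob_lat: float, ob_lon: float,
                      storm_lat: float, storm_lon: float, rmw_km: float) -> bool:
    """True if the observation falls inside the flagging radius of the storm."""
    distance = great_circle_km(ob_lat, ob_lon, storm_lat, storm_lon)
    return distance <= flag_radius_km(rmw_km)
```

For a small storm (radius of maximum wind below 37 km) the 111 km floor applies; for larger storms the flagging radius grows with the storm.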

C. PREPOBS_GLERLADJ

     Purpose: This runs only in the URMA network to perform the NOAA Great Lakes Environmental Research Laboratory (GLERL) adjustment to surface land and marine data in the Great Lakes region. The goal is to create a smooth wind analysis over the Great Lakes that can be used to initialize the Great Lakes Wave model. New, GLERL-adjusted (pseudo-) reports are generated based on certain existing report observations as well as water temperature (if the existing report is over water, the new report is over land and vice versa). The original, existing report is retained at its original location. Other reports are moved from land to water (or vice versa) in order to align properly with the RTMA land/sea mask over the Great Lakes region. In this case, only latitude and longitude are changed. Here, the original report is not retained.
     Input: The PREPBUFR file output from the previous program PREPOBS_PREPDATA (if the program SYNDAT_SYNDATA did not run and recall that it does not run in the URMA network). Observations in PREPBUFR message types "ADPSFC", "SFCSHP" and "SFCSHP" are read.
     Output: A PREPBUFR file containing adjusted surface reports over, and inland of, the Great Lakes.

D. PREPOBS_PREVENTS

     Purpose: This runs only in the CDAS network to add the forecast background (first guess from the CDAS itself) interpolated to each observation location and the observational error (read in from a look-up table) associated with each observation to the PREPBUFR file. It also performs some rough quality control checks on surface pressure (vs. the background), and converts dry bulb temperature to virtual and dewpoint temperature to specific humidity for surface data.
     Input: The PREPBUFR file output from the previous program PREPOBS_PREPDATA (if the programs SYNDAT_SYNDATA and PREPOBS_GLERLADJ did not run and recall that neither runs in the CDAS network). Observations in all PREPBUFR message types are read. Also reads in the CDAS spectral (sigma) first guess file valid at the PREPBUFR center time and the observational error table (text) file, as well as network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file containing the forecast background and observation errors along with surface virtual temperature and specific humidity added.
     Note: In all networks other than CDAS, the "PREVENTS" function is performed within the PREPOBS_PREPDATA and SYNDAT_SYNDATA programs.
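
The two surface conversions named above (dewpoint to specific humidity, dry bulb to virtual temperature) can be sketched with common textbook approximations. The formulas below are an assumption for illustration (Bolton's saturation vapor pressure fit and the usual 0.61 virtual temperature factor); the operational code may use different constants or formulations, and the function names are hypothetical.

```python
import math

EPSILON = 0.622  # ratio of the gas constants of dry air and water vapor


def vapor_pressure_hpa(dewpoint_c: float) -> float:
    """Vapor pressure (hPa) from dewpoint (deg C), Bolton (1980) approximation."""
    return 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))


def specific_humidity(dewpoint_c: float, pressure_hpa: float) -> float:
    """Specific humidity (kg/kg) from dewpoint (deg C) and pressure (hPa)."""
    e = vapor_pressure_hpa(dewpoint_c)
    return EPSILON * e / (pressure_hpa - (1.0 - EPSILON) * e)


def virtual_temperature_k(temperature_k: float, q: float) -> float:
    """Virtual temperature (K) from dry bulb temperature (K) and specific humidity."""
    return temperature_k * (1.0 + 0.61 * q)


# Hypothetical surface report: 20 deg C, dewpoint 10 deg C, 1000 hPa.
q_sfc = specific_humidity(10.0, 1000.0)
tv_sfc = virtual_temperature_k(293.15, q_sfc)
```

Because moist air is less dense than dry air at the same temperature and pressure, the virtual temperature is always at least as large as the dry bulb temperature.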

E. PREPOBS_CQCBUFR

     Purpose: Performs complex quality control on rawinsonde height and temperature data to identify or correct erroneous observations that arise from location, transcription or communications errors. Attempts are made, when appropriate, to correct commonly occurring types of errors. Erroneous data that cannot be corrected are flagged and will not be considered by the analyses. The checks used are: hydrostatic, increment, horizontal statistical, vertical statistical, temporal (in the CDAS network only), baseline and lapse rate. These multiple checks are based upon differences from the six-hour Global Data Assimilation System (GDAS) forecast (the usual background first guess). This program also applies intersonde (radiation) corrections to the quality controlled rawinsonde height and temperature data. The degree of correction is a function of the rawinsonde instrument type, the sun angle and the vertical pressure level. Finally, this program converts rawinsonde and dropwinsonde dry bulb temperature to virtual and rawinsonde and dropwinsonde dewpoint temperature to specific humidity.
     Input: The PREPBUFR file output from the previous program PREPOBS_PREPDATA (if the program SYNDAT_SYNDATA did not run), or from the previous program SYNDAT_SYNDATA, or from the previous program PREPOBS_PREVENTS in the case of the CDAS network (recall that the upstream program PREPOBS_GLERLADJ runs only in the URMA network where PREPOBS_CQCBUFR does not run). In all cases, observations in PREPBUFR message type "ADPUPA" and their background guess are read. (In the case of the CDAS network, where temporal checking is performed, PREPBUFR files valid 24 hours previous, 12 hours previous, 12 hours subsequent, and 24 hours subsequent are also input.) Also reads in network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file with quality controlled rawinsonde data, intersonde corrections applied to rawinsonde temperature and height, and virtual temperature and specific humidity added to rawinsonde and dropwinsonde data. Text files are also output containing various informative results from the running of this program. These files are made available to the NCEP SDM.
     Note: This program does not run in the RTMA or URMA network.

F. PREPOBS_PROFCQC

     Purpose: Performs complex quality control on wind profiler and acoustic sounder (SODAR) data in order to identify erroneous data and remove it from consideration by the analyses. The checks used are: increment, vertical statistical, temporal statistical, and combined vertical-temporal. These multiple checks are based upon differences from the six-hour Global Data Assimilation System (GDAS) forecast (the usual background first guess).
     Input: The PREPBUFR file output from the previous program PREPOBS_CQCBUFR (observations in PREPBUFR message type "PROFLR" and their background guess are read), and network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file with quality controlled wind profiler/SODAR data.
     Note: This program does not run in the RTMA or URMA network.

G. PREPOBS_CQCVAD

     Purpose: Performs complex quality control on Vertical Azimuth Display (VAD) winds from WSR-88D radars in order to identify erroneous data and remove it from consideration by the analyses. The checks used are: increment, vertical statistical, temporal statistical, and combined vertical-temporal. These multiple checks are based upon differences from the six-hour Global Data Assimilation System (GDAS) forecast (the usual background first guess). In addition, there is an algorithm to account for contamination due to the seasonal migration of birds.
     Input: The PREPBUFR file output from the previous program PREPOBS_PROFCQC (observations in PREPBUFR message type "VADWND" and their background guess are read).
     Output: A PREPBUFR file with quality controlled VAD wind data.
     Note: This program does not run in the RTMA or URMA network.

H. PREPOBS_PREPACQC

    Purpose: Performs comprehensive quality control on conventional AIREP, PIREP, AMDAR (Aircraft Report, Pilot Report, Aircraft Meteorological Data Relay), TAMDAR (Tropospheric Airborne Meteorological Data Reporting) and MDCRS (Meteorological Data Collection and Reporting System) aircraft wind, temperature and, where applicable, moisture data. Checks include: duplicate, spike, invalid report, stuck value, gross value, inconsistent position, ordering, suspect data, reject list. A detailed flight track check is performed. The basic quality control algorithms were written by Dr. Patricia Pauley at the Naval Research Laboratory (NRL). Optionally, also creates a pseudo-PREPBUFR file containing quality controlled aircraft profiles (ascents and descents) constructed from the single level reports, along with an estimated instantaneous altitude rate on each profile level. Flight level reports are also included here. In this pseudo-PREPBUFR profile file, the mass and wind information are combined. Table 21 contains the code tables of pseudo-PREPBUFR aircraft profile report types currently valid in all applicable networks.
     Input: The PREPBUFR file output from the previous program PREPOBS_CQCVAD (observations in PREPBUFR message type "AIRCFT" and "AIRCAR" are read), and network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file with quality controlled conventional (AIREP, PIREP, AMDAR, TAMDAR, MDCRS) aircraft data. A pseudo-PREPBUFR file containing quality controlled aircraft profiles (ascents and descents) as well as flight level reports where the mass and wind information are combined. A text file listing all reports (output to both the PREPBUFR file and the mini-PREPBUFR file containing profiles) with detailed quality mark information. Listings from each individual quality control check as well as an overall log of the complete quality control check.
     Note: This program does not run in the RTMA or URMA network.

I. PREPOBS_OIQCBUFR

     Purpose: Performs an optimum interpolation based quality control on the complete set of observations in the PREPBUFR file. As with the complex quality control procedures, this program operates in a parallel rather than a serial mode. That is, a number of independent checks (horizontal, vertical, geostrophic) are performed using all admitted observations. Each observation is subjected to the optimum interpolation formalism using all observations except itself in each check. A final quality decision (keep, toss, or reduced confidence weight) is made based on the results from all prior platform-specific quality checks (see B.-I. above) and from any manual quality marks attached to the data. The results from all the checks are kept in an annotated observational database. One other responsibility of this program is to perform a multivariate surface wind analysis and assign the analyzed direction to the SSM/I oceanic wind speed observation in order to produce a wind vector for these data.
     Input: The PREPBUFR file output by the previous program PREPOBS_PREPACQC (observations in all PREPBUFR message types and their background guess are read). Also, an observational error table (text file) tuned specifically for this program.
     Output: A PREPBUFR file with final OI-based quality control applied to all data. Text files are also output containing various informative results from the running of this program. These files are made available to the NCEP SDM.
     Note: This program runs only in the CDAS network. The GFS/GDAS, NAM, RAP, RTMA and URMA GSI run their own internal variational quality control on the observations (in addition to other quality control within the GSI itself).

3. THE STRUCTURE OF THE PREPBUFR FILE

The PREPOBS_PREPDATA program reads in a BUFR table text file which lays out the BUFR descriptors and their defined sequence for each type of report. Every descriptor and sequence is represented by a unique mnemonic in order to make the NCEP form of BUFR more user-friendly. This BUFR table is stored in the first messages of the output PREPBUFR file. The PREPBUFR file is thus self-defining - all subsequent codes that read it are able to parse the table directly out of the PREPBUFR file itself. The current BUFR mnemonic table is found in Table 1.a-1.e.

The highest level mnemonic sequences in the PREPBUFR file are known as the "Table A Entries" because they refer to a unique BUFR Table A data category as defined in Section 1 of the BUFR message. These mnemonic sequences will be referred to as PREPBUFR "message types". See Table 1.a for the current list of message types along with their number (BUFR descriptor) and description. The last 3 digits in the descriptor number are the Table A data entries in Section 1.

Each PREPBUFR message type consists of either mnemonic sequences known as "Table D" entries, or mnemonics representing a single datum known as "Table B" entries. Each Table D sequence consists of either other Table D sequences or of Table B data descriptors. Thus, every PREPBUFR message type can be broken down finer and finer until it consists of a string of Table B descriptors. See Table 1.b for the current list of Table D entries, their descriptor number and description. Table 1.c contains the current list of Table B entries along with their descriptor number and description. Table 1.c also contains the scaling, reference value, number of bits and units associated with each Table B entry.

The current layout of Table D and Table B entries that comprise each report by PREPBUFR message type is shown in Table 1.d. There are special characters around some of the Table D sequences in Table 1.d. These refer to replication descriptors.

- Curly brackets "{" and "}" around a sequence mnemonic indicate that 8-bit delayed replication is possible on the sequence. This is generally found for sequences which replicate as levels, such as in upper-air data. The replication is delayed because the number of levels is not known ahead of time. There is a maximum of 255 replicated levels here.

- Parentheses "(" and ")" around a sequence mnemonic indicate that 16-bit delayed replication is possible on the sequence. This is generally found for sequences which replicate as radiance channels, such as in AIRS 1B satellite radiance data. The replication is delayed because the number of channels is not known ahead of time. There is a maximum of 65535 replicated channels here.

- Angle brackets "<" and ">" around a sequence indicate that a 1-bit replication descriptor is acting on this sequence. If every Table B descriptor in the sequence is missing, then only one bit is needed to represent the data (and the bit is set to zero). If one or more Table B descriptors in the sequence are present, the bit is set to one indicating that all of the Table B descriptors in the sequence are represented bit-wise. This is useful for sequences which may often be missing, since only one bit is needed in this case.

- Square brackets "[" and "]" around a sequence indicate that this sequence is subject to event stacking. Here, the replication is the number of events associated with the sequence. Recall from the first paragraph of this document that the PREPBUFR file is structured such that each PREPBUFR processing step which changes a datum (either the observation itself, or its quality marker) records the change as an "event" with a program code and a reason code. Each time an event is stored, the previous events for the datum are "pushed down" in the stack. In this way, the PREPBUFR file contains a complete history of changes to the data throughout all of the PREPBUFR processing. Table 1.d shows that square brackets are only found around sequences which consist of the observation value itself (pressure, specific humidity, temperature, height, wind, or total or layer precipitable water), the observation value's quality marker, the program code for the event changing either the observation or its quality marker, and the reason code within this program code. There is a maximum of 255 replicated events here.
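
The four bracket conventions above can be summarized in a small lookup. This is only a sketch of how the markers in Table 1.d might be interpreted; the function name and the sequence name `SEQA` are hypothetical, and the maximum counts simply restate the limits given above (255 is the largest value of an 8-bit count, 65535 of a 16-bit count).

```python
from typing import Tuple

# (opening, closing) marker -> (meaning, maximum number of replications)
REPLICATION = {
    ("{", "}"): ("8-bit delayed replication", 2 ** 8 - 1),
    ("(", ")"): ("16-bit delayed replication", 2 ** 16 - 1),
    ("<", ">"): ("1-bit replication", 1),
    ("[", "]"): ("event stacking", 255),
}


def describe(token: str) -> Tuple[str, str, int]:
    """Return (sequence name, replication kind, maximum count) for a table token."""
    if len(token) >= 2 and (token[0], token[-1]) in REPLICATION:
        kind, max_count = REPLICATION[(token[0], token[-1])]
        return token[1:-1], kind, max_count
    # No surrounding marker: the sequence appears exactly once.
    return token, "no replication", 1
```

For example, `describe("{SEQA}")` reports a sequence that may repeat up to 255 times, while `describe("<SEQA>")` reports one that is either present once or absent.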

Table 1.e contains the list of Table D entries that define the PREPBUFR processing steps that can act to generate events. It should be noted that a step is not necessarily the same as a program here. Some programs consist of more than one step, while some steps can appear in more than one program. The description defines the program(s) associated with each Table D entry here. The "program code" mentioned in the previous paragraph is unique for a particular step here and is determined by the last 3 digits in the descriptor number.

The PREPBUFR file contains a number of Table B entries which are code or flag tables (see Table 1.c). Links are provided to web pages which define these tables. In general, the code and flag tables for those variables defined with WMO descriptors can be found in the WMO BUFR Code and Flag Tables and Common Code Tables. The code tables for most of the more common variables defined with local descriptors are discussed next.

The reports in the PREPBUFR file are differentiated by both the PREPBUFR report type (mnemonic "TYP" in the PREPBUFR file) and by an input "dump" report type, loosely based on the obsolete NMC Office Note 29 and NMC Office Note 124 report types (mnemonic "T29" in the PREPBUFR file). Reports are split into mass and wind pieces at the current time. All mass reports contain PREPBUFR report types in the range 100-199, while all wind reports contain PREPBUFR report types in the range 200-299. These report types are used by the various assimilation systems to identify the reports in the PREPBUFR file. Table 2, Table 3, Table 4, Table 5 and Table 19 contain the code tables of PREPBUFR report types currently valid in the GFS/GDAS, CDAS, NAM, RAP and RTMA/URMA networks, respectively. In addition, more detailed information on the usage of each variable in each PREPBUFR report type can be found HERE for the GFS/GDAS and HERE for the NAM (Note: this information may need to be updated). The input "dump" report type defines the report more precisely than the PREPBUFR report type (e.g., PREPBUFR report type 180 consists of marine ship, buoy and C-MAN platform reports, all of which contain a unique input report type). Table 6 defines the code table of input "dump" report types (the same for all networks). The input report type is not used by any assimilation system at the current time.
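
The mass/wind split by report type range can be expressed in a few lines. The function name is hypothetical and the sketch assumes the "TYP" value has already been decoded to an integer.

```python
def report_piece(typ: int) -> str:
    """Classify a PREPBUFR report type (mnemonic TYP) as a mass or wind piece."""
    if 100 <= typ <= 199:
        return "mass"
    if 200 <= typ <= 299:
        return "wind"
    raise ValueError(f"report type {typ} is outside the mass and wind ranges")
```

With this, report type 180 (the marine example above) is classified as a mass piece, and any value in 200-299 as a wind piece.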

Most of the observation types in the PREPBUFR file are associated with quality markers (e.g., mnemonics "PQM", "TQM", "WQM", etc.). These are used by the various analyses to place a weight on the data based on its quality. Table 7 contains the code table of quality markers. These quality markers apply to all observation types in the PREPBUFR file.

The program codes (e.g., mnemonics "PPC", "TPC", "WPC", etc.) associated with the PREPBUFR processing steps in Table 1.e and the reason codes associated with a particular program code (e.g., mnemonics "PRC", "TRC", "WRC", etc.) together define the "events" associated with changes in the observation itself or in its quality marker in the course of the PREPBUFR processing. Table 8.a contains the code table of reason codes for step "PREPRO" (program code "001"). Table 8.b contains a code table of possible future reason codes based on events currently occurring in the PREPRO step. Table 8.c contains a list of other unrecorded events in the PREPRO step that result in originally reported observational data not being encoded into the PREPBUFR file. Table 9 contains the code table of reason codes for step "SYNDATA" (program code "002"). Table 10 contains the code table of reason codes for step "PREVENT" (program code "004"). Table 11 contains the code table of reason codes for step "CQCHT" (program code "005"). Table 12 contains the code table of reason codes for step "RADCOR" (program code "006"). Table 20 contains the code table of reason codes for step "NRLACQC" (program code "015"). Table 14 contains the code table of reason codes for step "VIRTMP" (program code "008"). Table 15 contains the code table of reason codes for step "CQCPROF" (program code "009"). Table 16 contains the code table of reason codes for step "OIQC" (program code "010"). Table 17 contains the code table of reason codes for step "CQCVAD" (program code "011"). Table 22 contains the code table of reason codes for step "GLERL" (program code "017"). The steps "CLIMO", "SSI", and "R3DVAR" currently do not run, and the step "GSI" currently does not generate or store reason codes. The step "PREPACQC" pertains to the aircraft quality control step that was used prior to 17 July 2012. Step "DEFAULT" is designed to handle events written out by any non-defined program/step (useful for non-operational runs).
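
The step names and program codes listed above can be collected into a lookup table, together with the rule that the program code is given by the last 3 digits of the step's descriptor number. This is a sketch only: the function names are hypothetical, the table contains just the codes quoted in this section, and the descriptor string used in the usage note is a made-up placeholder rather than an entry from Table 1.e.

```python
from typing import Dict

# Program code -> processing step, as quoted in the text above.
STEP_BY_PROGRAM_CODE: Dict[int, str] = {
    1: "PREPRO",
    2: "SYNDATA",
    4: "PREVENT",
    5: "CQCHT",
    6: "RADCOR",
    8: "VIRTMP",
    9: "CQCPROF",
    10: "OIQC",
    11: "CQCVAD",
    15: "NRLACQC",
    17: "GLERL",
}


def program_code_from_descriptor(descriptor: str) -> int:
    """Program code is the last 3 digits of the step's descriptor number."""
    return int(descriptor[-3:])


def step_name(program_code: int) -> str:
    """Name of the step for a program code, or DEFAULT if it is not in the table."""
    return STEP_BY_PROGRAM_CODE.get(program_code, "DEFAULT")
```

For instance, a placeholder descriptor such as `"999017"` yields program code 17, which `step_name` maps to "GLERL". Falling back to "DEFAULT" for unknown codes is an assumption made here to mirror the role of the "DEFAULT" step.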

Additional documentation on the structure of NCEP BUFR files in general can be found at http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB/, while additional documentation on the structure of PREPBUFR files in particular can be found at http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB/toc/prepbufr/.

4. OPERATIONAL DATA THAT DO NOT PASS THROUGH PREPBUFR PROCESSING

The NAM, GFS/GDAS, RAP and RTMA/URMA GSI analyses also assimilate (or at least monitor) many observation types directly from their BUFR dump files or from other non-PREPBUFR sources. Such types include satellite radiances (including RARS and DBRTN), satellite retrieved products (including ozone and GOES cloud data from NESDIS and LaRC), GPS radio occultation, WSR-88D NEXRAD radial wind data, lightning, satellite-derived winds, and other products.

Table 18 summarizes the current usage of these data in the various NCEP assimilation systems. It also lists data types that are monitored (but not used) in any of the NAM, GFS/GDAS, RAP or RTMA/URMA GSI analyses.

5. EXAMPLE PROGRAM TO DECODE NCEP PREPBUFR FILES

A sample program at http://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/decode_prepbufr_example demonstrates how the contents of the PREPBUFR file can be decoded using routines in the NCEP BUFR library. In this particular example, every report is decoded and listed to output files specific to the BUFR Table A message type. This program also merges the mass and wind report "pieces" into a common output report for listing. In subroutine READPB, there is a logical variable single_msgtyp which controls the amount of processing. If it is set to FALSE, then the entire PREPBUFR file is decoded. If it is set to TRUE, then only reports in the Table A message type indicated by the variable msgtyp_process are decoded.
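
The selection logic controlled by single_msgtyp and msgtyp_process can be sketched as follows. This is a minimal Python illustration of the control flow only. It does not call the NCEP BUFR library and is not taken from the sample program. The function name group_reports and the representation of reports as (message type, report) pairs are assumptions made for the example.

```python
from collections import defaultdict


def group_reports(reports, single_msgtyp=False, msgtyp_process="ADPUPA"):
    """Group already-decoded reports by Table A message type.

    reports is assumed to be an iterable of (msgtyp, report) pairs; in
    the real program the reports come from the PREPBUFR file itself.
    """
    grouped = defaultdict(list)
    for msgtyp, report in reports:
        # When single_msgtyp is True, keep only the requested message type.
        if single_msgtyp and msgtyp != msgtyp_process:
            continue
        grouped[msgtyp].append(report)
    return dict(grouped)
```

With single_msgtyp left at False, every message type present in the input gets its own list, mirroring the one-output-file-per-message-type behavior described above. With single_msgtyp set to True, only the list for msgtyp_process is returned.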

6. POST-PROCESSING OF NCEP PREPBUFR FILES

The completion of the PREPBUFR job triggers a job which performs post-processing on the PREPBUFR files just created. This job does not produce any output necessary to the successful completion of the analysis/forecast network (indeed, it runs either simultaneously with, or after, the analysis job which is also triggered by the completion of the PREPBUFR job).

The first job step executes the program BUFR_REMOREST, which removes or masks, from the PREPBUFR file, certain data types that are restricted (either by the data producers themselves or by the WMO) from redistribution outside of NCEP. NCEP/NCO maintains a very strict policy on who may or may not have access to restricted data. The resulting PREPBUFR file, stripped of all restricted data, is given a suffix qualifier of ".nr" in the network-specific /com directories on the NCEP-WCOSS.

The next PREPBUFR post-processing job step identifies upper-air "TimeTwins" (duplications in current rawinsonde, pibal or dropwinsonde wind report "parts" vs. those over the past 35 days). This is only executed in the GDAS network (but for every cycle).

The next job step reformats received, selected, and assimilated data counts (for both satellite and non-satellite types) for all four GDAS network cycles for the current day and saves the result in the monthly archive directory. On the second day of each month, a monthly summary is run, for the previous month, and posted to the web. This job step is only executed in the 18z GDAS network.
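
The timing rule for the monthly summary can be expressed as a short date calculation. The Python sketch below assumes that the summary is produced by the 18z GDAS run on the second day of the month. The text states the two conditions separately, so combining them is an interpretation, and the function name monthly_summary_target is hypothetical.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def monthly_summary_target(cycle_date: date,
                           cycle_hour: int) -> Optional[Tuple[int, int]]:
    """Return (year, month) to summarize, or None if no summary is due.

    Assumption: the summary runs only in the 18z cycle on day 2.
    """
    if cycle_hour != 18 or cycle_date.day != 2:
        return None
    # Step back from the first of the current month into the previous one.
    last_of_previous = cycle_date.replace(day=1) - timedelta(days=1)
    return (last_of_previous.year, last_of_previous.month)
```

Stepping back from the first of the month handles the year boundary without special cases: a run on 2 January summarizes December of the preceding year.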

The next job step generates a table of received, selected, and assimilated satellite data counts for all four GDAS network cycles. This is only executed in the GDAS network (but for every cycle).

The final PREPBUFR post-processing job step updates the Master Ship Station List based on any new information read from the updated VOS ship list from NDBC. The Master Ship Station List is read within job JRW1 at 12z on the first day of each month in order to generate marine monthly statistics. This job step is only executed in the 18z GDAS network.