PREPBUFR PROCESSING AT NCEP

Dennis Keyser - NOAA/NWS/NCEP/EMC
(Last Revised 1/22/2013)



Please take a moment to read the Disclaimer for this non-operational web page.

1.  INTRODUCTION
 

     The "PREPBUFR" processing is the final step in preparing the majority of conventional observational data for assimilation into the various NCEP analyses. These include the North American Model (NAM) and NAM Data Assimilation System (NDAS) unified grid-point statistical interpolation (GSI) analysis (the "NAM" and "NDAS" networks), the Global Forecast System (GFS) and Global Data Assimilation System (GDAS) GSI analysis (the "GFS" and "GDAS" networks), the Rapid Refresh (RAP) GSI analysis (the "RAP" network), the Real Time Mesoscale Analysis (RTMA) GSI analysis (the "RTMA" network), and the Climate Data Assimilation System (CDAS) spectral statistical interpolation (SSI) analysis (the "CDAS" network). This step involves the execution of a series of programs designed to assemble observations dumped from a number of on-line decoder databases, encode information about the observational error for each data type as well as the background (first guess) interpolated to each data location, perform both rudimentary multi-platform quality control and more complex platform-specific quality control, and store the output in a monolithic BUFR file, known as PREPBUFR. The background guess information is used by certain quality control programs while the observation error is used by the analysis to weight the observations. The structure of the BUFR file is such that each PREPBUFR processing step which changes a datum (either the observation itself, or its quality marker) records the change as an "event" with a program code and a reason code. Each time an event is stored, the previous events for the datum are "pushed down" in the stack. In this way, the PREPBUFR file contains a complete history of changes to the data throughout all of the PREPBUFR processing. The most recent changes are always at the top of the stack and are thus read first by any subsequent data decoder routine. It is expected that the data at the top of the stack are of the highest quality.  Once the PREPBUFR job has completed, a separate PREPBUFR post processing job is initiated.
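
     The push-down behavior of the event stack can be illustrated with a minimal Python sketch. This is not the NCEP BUFR library interface: the class names, field names, and the quality marker and reason code values used in the usage note are hypothetical and chosen only for illustration. The limit of 255 events per datum is taken from the description of event replication in section 3.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_EVENTS = 255  # replication limit for event sequences (see section 3)


@dataclass
class Event:
    """One recorded change to a datum (hypothetical field names)."""
    value: Optional[float]   # observation value after this step
    quality_marker: int      # quality marker after this step
    program_code: int        # which processing step made the change
    reason_code: int         # why the step made the change


@dataclass
class Datum:
    """A single observed quantity together with its event history."""
    events: List[Event] = field(default_factory=list)

    def push(self, event: Event) -> None:
        """Store a new event on top; earlier events are pushed down."""
        if len(self.events) >= MAX_EVENTS:
            raise OverflowError("event stack is full")
        self.events.insert(0, event)

    def top(self) -> Event:
        """Return the most recent event, which a decoder reads first."""
        return self.events[0]
```

     For example, a temperature first stored by one step and later modified by another keeps both entries: after two calls to push(), top() returns the later event and events[1] still holds the earlier one, so the full history remains available.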
 
 

2.  PREPBUFR PROCESSING PROGRAMS


A.  PREPOBS_PREPDATA
 

     Purpose: To read in and consolidate observations dumped from individual BUFR DATA databases, perform rudimentary checks on the data, and organize upper-air data by decreasing pressure. For all networks except the CDAS, and to some extent the RTMA, also performs a number of tasks under the name GBLEVENTS: 1) Adds forecast background (first guess) interpolated to each observation location; 2) Adds observational error (read in from a look-up table) to each observation; 3) Performs some rough quality control checks on surface pressure (vs. the background); and 4) Converts dry bulb temperature to virtual and dewpoint temperature to specific humidity for surface data.  All of these GBLEVENTS functions are performed by the program PREPOBS_PREVENTS in the CDAS network.  In the RTMA network, the only GBLEVENT function performed is item 4.  Output is stored in a monolithic BUFR file called PREPBUFR.
     Input: Various BUFR data dump files including (based on the network): ADPUPA (rawinsonde, pibal, dropwinsonde, reconnaissance), AIRCAR (MDCRS-ACARS aircraft), AIRCFT (AIREP, PIREP, AMDAR, and TAMDAR aircraft), SATWND (GOES satellite derived cloud winds from NESDIS, EUMETSAT, GMS, INSAT as well as POES winds from Aqua/Terra MODIS), PROFLR [wind profiler and acoustic sounder (SODAR) winds by height], VADWND (Vertical Azimuth Display winds by height at U.S. NEXRAD radar sites), ADPSFC (surface land synoptic and METAR), SFCSHP (surface marine ships, buoys, C-MAN platforms, tide gauges, and splash-level dropwinsondes), GOESND (GOES 4-layer precipitable water retrievals, sounder radiances, and cloud-top data from NESDIS), ATOVS (temperature soundings from NESDIS), RASSDA [Radio Acoustic Sounding System (RASS) vertical profiles of virtual temperature], GPSIPW (GPS Integrated Precipitable Water retrievals), MSONET (Mesonet data from a myriad of providers, mostly over the U.S.), WDSATR [reprocessed, SUPEROBed (optional) WindSAT scatterometer derived oceanic wind speed and direction] and ASCATW [reprocessed, SUPEROBed (optional) ASCAT scatterometer derived oceanic wind speed and direction].  Also reads in (based on the network) the global spectral first guess file valid at the current time (either the operational file or the one created by the program RELOCATE_MV_NVORTEX in the relocation step of the previous tropical cyclone processing job), the observational error table (text) file, the BUFR mnemonic table file (more about this later), and network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A file known as PREPBUFR, containing observations with state variables, sensible weather elements and other ancillary information needed by the analyses as well as (depending upon the network) forecast background and observation errors.  At this point the only quality control on the data are the rudimentary limit checks applied by this program, the checks of surface pressure observations compared to the background (except for CDAS and RTMA network runs), and those applied in the upstream observational dumping process: the interactive NCEP NCO Production Management Branch purge or keep flags on data types such as rawinsonde, aircraft, satellite wind, surface land, surface marine, wind profiler/SODAR and Vertical Azimuth Display winds, and the interactive quality markers generated by NCEP's Ocean Prediction Center (OPC) on marine ship and buoy data.
     Note 1: In the NAM, NDAS, and RAP networks, the background used is the global first guess (subject to tropical cyclone relocation in the NAM and NDAS networks). This is only used by the subsequent quality control programs. No background is encoded in the RTMA network as no quality control programs are run here. Also, observational errors are not encoded in the PREPBUFR file in the RAP and RTMA networks as the RAP and RTMA analyses do not make use of this information.
      Note 2: Global spectral first guess files are valid only at cycle times which are a multiple of three hours.  For RAP network runs at cycle times that are not a multiple of three hours, two global spectral first guess files which span the run cycle time are read in.  A linear time interpolation of the coefficients is then performed to generate a first guess valid at the run cycle time.
     Note 3: In all networks except the RTMA, this program is multi-tasked amongst 12 nodes on the IBM-SP machine to speed up processing time.  In order to load-balance the run streams, each of the input data dump files is divided into 12 equal parts by the program PREPOBS_MPCOPYBUFR.  This is analogous to a card game where all of the cards in the deck are dealt out to 12 players.  Next, PREPOBS_PREPDATA runs in 12 parallel run streams, with each run using the mini-dump files as input.  Each run stream uses all of the dump types, but for each type only 1/12th of the original dump is processed.  A program called PREPOBS_LISTHEADERS runs immediately after PREPOBS_PREPDATA in each run stream, reordering all message types in each "mini" PREPBUFR file according to the order specified in the BUFR mnemonic table.  This is necessary because when all 12 run streams of PREPOBS_MPCOPYBUFR/PREPOBS_PREPDATA/PREPOBS_LISTHEADERS have completed, the program PREPOBS_MONOPREPBUFR concatenates the 12 mini-PREPBUFR files into a monolithic PREPBUFR file ready for subsequent processing.
    Note 4: In the GFS and GDAS networks, the data read in from the SATWND dump file and encoded into the PREPBUFR files is not read by the subsequent GFS/GDAS GSI analysis.  Instead, the GSI reads in this same SATWND dump file directly and ignores the SATWND data in the PREPBUFR files.
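
     The card-dealing analogy in Note 3 can be sketched in Python as a round-robin split. This only illustrates the load-balancing idea; it is an assumption that the items are dealt one at a time in order, and the function name deal and the unit being dealt (report or BUFR message) are hypothetical, not taken from PREPOBS_MPCOPYBUFR.

```python
def deal(items, n_streams=12):
    """Deal items out to n_streams run streams, one at a time, like cards."""
    streams = [[] for _ in range(n_streams)]
    for i, item in enumerate(items):
        streams[i % n_streams].append(item)
    return streams


if __name__ == "__main__":
    # 30 stand-in items from one dump file, dealt to 12 run streams.
    mini_dumps = deal(list(range(30)))
    print([len(s) for s in mini_dumps])
```

     Dealing in this way keeps the mini-dump sizes within one item of each other, so no run stream receives a disproportionate share of any dump type.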


B.  SYNDAT_SYNDATA
 

     Purpose: Performs three distinct functions:
          1)  Reads in quality controlled tropical storm position records from the tcvitals file valid at the current time and uses them, along with other observations in the PREPBUFR file, to generate synthetic (bogus) wind mandatory level profile reports (throughout the depth of the storm) in the vicinity of the storm(s) to better define tropical systems for the analysis. In the NAM and NDAS networks, a synthetic mass report at all tropical cyclone center locations is also generated with a surface pressure based on the global sigma first guess pressure (from the relocated global sigma guess) adjusted according to the storm category (from the Saffir-Simpson Hurricane Scale), and with specific humidity values generated on mandatory levels throughout the depth of the storm from the relocated global first guess temperatures and an assumption of 99% relative humidity.  The synthetic wind reports are then appended to the PREPBUFR file in all networks and assimilated by the analysis.  The synthetic mass reports generated in the NAM and NDAS networks are currently being tested and are not yet assimilated by the GSI analysis.  The forecast background (first guess) interpolated to each observation location and the observational error, read in from a look-up table, are also encoded in the PREPBUFR file in all networks where this runs.
          2)  Flags mass data in observations sufficiently "close" to all storms in the tcvitals file list (i.e., within the lat/lon boundary for which bogus reports are generated).  These data will then not be assimilated.
          3)  Flags wind data in dropwinsonde reports sufficiently "close" to all storms in the tcvitals file list (i.e., within a distance to storm center of the larger of 111 km or three times the radius of maximum surface wind).  These data will then not be assimilated.
     Input: Quality-controlled tropical storm position and intensity field (tcvitals) file (in the NAM and NDAS networks in functions 1 and 2 above and in all networks in function 3 above, this is the so-called operational file generated by the program SYNDAT_QCTROPCY in the quality control step of the previous tropical cyclone processing job, while in the GFS and GDAS networks in functions 1 and 2 above this is the file created by the program RELOCATE_MV_NVORTEX in the relocation step of the previous tropical cyclone processing job) valid at the current time.  Also, the PREPBUFR file output from the previous program PREPOBS_PREPDATA, network-specific parm (data) cards which control processing through namelist variable switches, the global spectral first guess file valid at the current time (either the operational file or the one created by the program RELOCATE_MV_NVORTEX in the relocation step of the previous tropical cyclone processing job), and the observational error table (text) file.
     Output: A PREPBUFR file with synthetic reports added (observations as well as the background first guess and observation errors), as well as mass reports (from all sources) and dropwinsonde wind reports flagged in the vicinity of each storm in the tcvitals file.
     Note: This program does not run in the CDAS, RAP and RTMA networks.  It will only run to generate bogus data and flag mass data near storms in the GFS and GDAS networks if tropical storm data are available in the input tcvitals file created by the program RELOCATE_MV_NVORTEX in the relocation step of the previous tropical cyclone processing job (most likely not the case).  It will only run to generate bogus data and flag mass data near storms in the NAM and NDAS networks and to flag dropwinsonde wind data near storms in the GFS, GDAS, NAM and NDAS networks if tropical storm data are available in the input tcvitals file generated by the program SYNDAT_QCTROPCY in the quality control step of the previous tropical cyclone processing job.
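
     The distance test in function 3 above (the larger of 111 km or three times the radius of maximum surface wind) can be sketched in Python as follows. This is an illustration only: the function names, the haversine distance formula, the Earth radius value, and the choice to treat a report exactly on the boundary as "close" are all assumptions, not details taken from SYNDAT_SYNDATA.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean Earth radius
MIN_RADIUS_KM = 111.0      # lower bound on the flagging radius (from the text)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flag_radius_km(rmw_km):
    """Larger of 111 km or three times the radius of maximum surface wind."""
    return max(MIN_RADIUS_KM, 3.0 * rmw_km)


def dropsonde_wind_flagged(ob_lat, ob_lon, storm_lat, storm_lon, rmw_km):
    """True if the dropwinsonde lies within the flagging radius of the storm."""
    distance = great_circle_km(ob_lat, ob_lon, storm_lat, storm_lon)
    return distance <= flag_radius_km(rmw_km)
```

     With a radius of maximum wind of 20 km the 111 km floor applies, while a 50 km radius of maximum wind widens the flagged region to 150 km.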


C.  PREPOBS_PREVENTS
 

     Purpose: This runs only in the CDAS network to add the forecast background (first guess) interpolated to each observation location and the observational error (read in from a look-up table) associated with each observation to the PREPBUFR file. It also performs some rough quality control checks on surface pressure (vs. the background), and converts dry bulb temperature to virtual and dewpoint temperature to specific humidity for surface data.
     Input: The PREPBUFR file output from the previous program PREPOBS_PREPDATA (if the program SYNDAT_SYNDATA did not run) or from the previous program SYNDAT_SYNDATA (recall that SYNDAT_SYNDATA currently does not run in the CDAS network).  Observations in all PREPBUFR message types are read. Also reads in the global spectral first guess file valid at the current time and the observational error table (text) file, as well as network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file containing the forecast background and observation errors along with surface virtual temperature and specific humidity added.
     Note: In all networks other than CDAS, the "PREVENTS" function is performed within the PREPOBS_PREPDATA and SYNDAT_SYNDATA programs.
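
     The two surface conversions named above (dry bulb to virtual temperature, dewpoint to specific humidity) can be sketched in Python with standard textbook approximations. The saturation vapor pressure formula and the constants below are common choices and are assumptions here; the formulas actually used in the PREVENTS/GBLEVENTS code are not given in this document and may differ. All function names are hypothetical.

```python
import math

EPSILON = 0.622  # approximate ratio of molecular weights, water vapor to dry air


def vapor_pressure_hpa(dewpoint_c):
    """Vapor pressure (hPa) from dewpoint (deg C), Magnus-type approximation."""
    return 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))


def specific_humidity(dewpoint_c, pressure_hpa):
    """Specific humidity (kg/kg) from dewpoint (deg C) and pressure (hPa)."""
    e = vapor_pressure_hpa(dewpoint_c)
    return EPSILON * e / (pressure_hpa - (1.0 - EPSILON) * e)


def virtual_temperature_k(temperature_c, dewpoint_c, pressure_hpa):
    """Virtual temperature (K) from dry bulb and dewpoint (deg C), pressure (hPa)."""
    q = specific_humidity(dewpoint_c, pressure_hpa)
    return (temperature_c + 273.15) * (1.0 + 0.608 * q)
```

     For a surface report of 20 deg C with a 10 deg C dewpoint at 1000 hPa, these approximations give a specific humidity of roughly 7.7 g/kg and a virtual temperature a little over 1 K warmer than the dry bulb temperature.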


D.  PREPOBS_CQCBUFR
 

     Purpose: Performs complex quality control on rawinsonde height and temperature data to identify or correct erroneous observations that arise from location, transcription or communications errors.  Attempts are made, when appropriate, to correct commonly occurring types of errors.  Erroneous data that cannot be corrected are flagged and will not be considered by the analyses.  The checks used are: hydrostatic, increment, horizontal statistical, vertical statistical, temporal (in the CDAS network only), baseline and lapse rate. These multiple checks are based upon differences from the six-hour Global Data Assimilation System (GDAS) forecast (the usual background first guess).  This program also applies intersonde (radiation) corrections to the quality controlled rawinsonde height and temperature data. The degree of correction is a function of the rawinsonde instrument type, the sun angle and the vertical pressure level. Finally, this program converts rawinsonde and dropwinsonde dry bulb temperature to virtual and rawinsonde and dropwinsonde dewpoint temperature to specific humidity.
     Input: The PREPBUFR file output from the previous program PREPOBS_PREPDATA (if the program SYNDAT_SYNDATA did not run), or from the previous program SYNDAT_SYNDATA, or from the previous program PREPOBS_PREVENTS in the case of the CDAS network (in all cases, observations in PREPBUFR message type "ADPUPA" and their background guess are read). (In the case of the CDAS network, where temporal checking is performed, PREPBUFR files valid 24 hours previous, 12 hours previous, 12 hours subsequent, and 24 hours subsequent are also input.)  Also reads in network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file with quality controlled rawinsonde data, intersonde corrections applied to rawinsonde temperature and height, and virtual temperature and specific humidity added to rawinsonde and dropwinsonde data. Text files are also output containing various informative results from the running of this program. These files are made available to the NCEP SDM.
     Note: This program does not run in the RTMA network. 


E.  PREPOBS_PROFCQC
 

     Purpose: Performs complex quality control on wind profiler and acoustic sounder (SODAR) data in order to identify erroneous data and remove it from consideration by the analyses.  The checks used are: increment, vertical statistical, temporal statistical, and combined vertical-temporal.  These multiple checks are based upon differences from the six-hour Global Data Assimilation System (GDAS) forecast (the usual background first guess).
     Input: The PREPBUFR file output from the previous program PREPOBS_CQCBUFR (observations in PREPBUFR message type "PROFLR" and their background guess are read), and network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file with quality controlled wind profiler/SODAR data.
     Note: This program does not run in the RTMA network. 


F.  PREPOBS_CQCVAD
 

     Purpose: Performs complex quality control on Vertical Azimuth Display (VAD) winds from WSR-88D radars in order to identify erroneous data and remove it from consideration by the analyses.  The checks used are: increment, vertical statistical, temporal statistical, and combined vertical-temporal.  These multiple checks are based upon differences from the six-hour Global Data Assimilation System (GDAS) forecast (the usual background first guess).  In addition, there is an algorithm to account for contamination due to the seasonal migration of birds.
     Input: The PREPBUFR file output from the previous program PREPOBS_PROFCQC (observations in PREPBUFR message type "VADWND" and their background guess are read).
     Output: A PREPBUFR file with quality controlled VAD wind data.
     Note: This program does not run in the RTMA network. 


G.  PREPOBS_PREPACQC
 

     Purpose: Performs quality control on conventional AIREP, PIREP, AMDAR (Aircraft Report, Pilot Report, Aircraft Meteorological Data Relay) and TAMDAR (Tropospheric Airborne Meteorological Data Reporting) aircraft wind and temperature data.  The flight tracks are checked, with reports failing the check flagged and duplicate reports removed.  In addition, AIREP and PIREP reports are quality controlled in two ways: isolated reports are compared to the first guess with outliers flagged, and groups of reports in close geographical proximity are inter-compared using both a vertical wind shear check and a temperature lapse check.  Finally, there is also an option to superob collocated AIREP and PIREP reports; however, this is no longer performed in any operational NCEP networks.
     Input: The PREPBUFR file output from the previous program PREPOBS_CQCVAD (observations in PREPBUFR message type "AIRCFT" and their background guess are read), the aircraft waypoints file, a land-sea mask if geographical filtering of the data is to be performed (which it is not in production), and network-specific parm (data) cards which control processing through namelist variable switches.
     Output: A PREPBUFR file with quality controlled conventional (AIREP, PIREP, AMDAR, TAMDAR) aircraft data, a text file listing isolated reports that failed the quality control tests and a text file listing collated reports with mean wind vectors that varied significantly from the first guess. The text files are made available to the NCEP SDM.
     Note: This program does not run in the RTMA network. 


H.  PREPOBS_ACARSQC
 

     Purpose: Performs rudimentary and gross quality control checks on MDCRS-ACARS aircraft wind and temperature data.  Reports failing the quality control checks are flagged.
     Input: The PREPBUFR file output from the previous program PREPOBS_PREPACQC (observations in PREPBUFR message type "AIRCAR" and their background guess are read) and a land-sea mask if geographical filtering of the data is to be performed (which it is not in production).
     Output: A PREPBUFR file with quality controlled conventional MDCRS-ACARS aircraft data and a text file listing all reports that failed the quality control tests as well as those with mean wind vectors that varied significantly from the first guess. The text files are made available to the NCEP SDM. 
     Note: This program does not run in the RTMA network. 

I.  PREPOBS_OIQCBUFR
 

     Purpose: Performs an optimum interpolation based quality control on the complete set of observations in the PREPBUFR file.  As with the complex quality control procedures, this program operates in a parallel rather than a serial mode.  That is, a number of independent checks (horizontal, vertical, geostrophic) are performed using all admitted observations.  Each observation is subjected to the optimum interpolation formalism using all observations except itself in each check.  A final quality decision (keep, toss, or reduced confidence weight) is made based on the results from all prior platform-specific quality checks (see B.-I. above) and from any manual quality marks attached to the data.  The results from all the checks are kept in an annotated observational database.  One other responsibility of this program is to perform a multivariate surface wind analysis and assign the analyzed direction to the SSM/I oceanic wind speed observation in order to produce a wind vector for these data.
     Input: The PREPBUFR file output by the previous program PREPOBS_ACARSQC (observations in all PREPBUFR message types and their background guess are read). Also, an observational error table (text file) tuned specifically for this program.
     Output: A PREPBUFR file with final OI-based quality control applied to all data. Text files are also output containing various informative results from the running of this program. These files are made available to the NCEP SDM.
     Note: This program runs only in the GFS, GDAS, and CDAS networks.  However, as of 1200 UTC 24 February 2009, the GFS and GDAS GSI analyses no longer honor the decisions made in this program.  Instead, the GFS/GDAS GSI now runs its own internal variational quality control on the observations.  The NAM, NDAS, RAP and RTMA networks also run their own form of quality control on the observations within the analysis program itself.
 
 

3.  THE STRUCTURE OF THE PREPBUFR FILE
 

     The PREPOBS_PREPDATA program reads in a BUFR table text file which lays out the BUFR descriptors and their defined sequence for each type of report. Every descriptor and sequence is represented by a unique mnemonic in order to make the NCEP form of BUFR more user-friendly. This BUFR table is stored in the first messages of the output PREPBUFR file. The PREPBUFR file is thus self-defining - all subsequent codes that read it are able to parse the table directly out of the PREPBUFR file itself. The current BUFR mnemonic table is found in Table 1.a-1.e.

     The highest level mnemonic sequences in the PREPBUFR file are known as the "Table A Entries" because they refer to a unique BUFR Table A data category as defined in Section 1 of the BUFR message. These mnemonic sequences will be referred to as PREPBUFR "message types". See Table 1.a for the current list of message types along with their number (BUFR descriptor) and description. The last 3 digits in the descriptor number are the Table A data entries in Section 1.

     Each PREPBUFR message type consists of either mnemonic sequences known as "Table D" entries, or mnemonics representing a single datum known as "Table B" entries. Each Table D sequence consists of either other Table D sequences or of Table B data descriptors. Thus, every PREPBUFR message type can be broken down finer and finer until it consists of a string of Table B descriptors. See Table 1.b for the current list of Table D entries, their descriptor number and description. Table 1.c contains the current list of Table B entries along with their descriptor number and description. Table 1.c also contains the scaling, reference value, number of bits and units associated with each Table B entry.

     The current layout of Table D and Table B entries that comprise each report by PREPBUFR message type is shown in Table 1.d. There are special characters around some of the Table D sequences in Table 1.d. These refer to replication descriptors.

          - Curly brackets "{" and "}" around a sequence mnemonic indicate that 8-bit delayed replication is possible on the sequence. This is generally found for sequences which replicate as levels, such as in upper-air data. The replication is delayed because the number of levels is not known ahead of time.  There is a maximum of 255 replicated levels here.

          - Parentheses "(" and ")" around a sequence mnemonic indicate that 16-bit delayed replication is possible on the sequence. This is generally found for sequences which replicate as radiance channels, such as in AIRS 1B satellite radiance data. The replication is delayed because the number of replications (here, channels) is not known ahead of time.  There is a maximum of 65535 replications here.

          - Angle brackets "<" and ">" around a sequence indicate that a 1-bit replication descriptor is acting on this sequence. If every Table B descriptor in the sequence is missing, then only one bit is needed to represent the data (and the bit is set to zero). If one or more Table B descriptors in the sequence are present, the bit is set to one indicating that all of the Table B descriptors in the sequence are represented bit-wise. This is useful for sequences which may often be missing, since only one bit is needed in this case.

          - Square brackets "[" and "]" around a sequence indicate that this sequence is subject to event stacking. Here, the replication is the number of events associated with the sequence. Recall from the first paragraph of this document that the PREPBUFR file is structured such that each PREPBUFR processing step which changes a datum (either the observation itself, or its quality marker) records the change as an "event" with a program code and a reason code. Each time an event is stored, the previous events for the datum are "pushed down" in the stack. In this way, the PREPBUFR file contains a complete history of changes to the data throughout all of the PREPBUFR processing. Table 1.d shows that square brackets are only found around sequences which consist of the observation value itself (either pressure, specific humidity, temperature, height, wind, and total or layer precipitable water), the observation value's quality marker, the program code for the event changing either the observation or its quality marker and the reason code within this program code.  There is a maximum of 255 replicated events here.
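
     The four bracket conventions above can be summarized in a small Python lookup that classifies a sequence token as it might appear in Table 1.d. This is only a reading aid: the function name is hypothetical, the token "SEQ_A" used in the usage note is a placeholder rather than a real PREPBUFR mnemonic, and the limit of one replication for the 1-bit case is inferred from the description (the sequence is either absent or present once).

```python
# opening character -> (closing character, kind of replication, maximum count)
REPLICATION = {
    "{": ("}", "8-bit delayed replication", 255),
    "(": (")", "16-bit delayed replication", 65535),
    "<": (">", "1-bit delayed replication", 1),
    "[": ("]", "event stacking", 255),
}


def classify_sequence(token):
    """Return (mnemonic, replication kind, maximum replications) for a token."""
    token = token.strip()
    if token and token[0] in REPLICATION:
        closing, kind, max_reps = REPLICATION[token[0]]
        if len(token) < 3 or token[-1] != closing:
            raise ValueError("unbalanced replication brackets: " + token)
        return token[1:-1], kind, max_reps
    # No special characters: the sequence appears exactly once.
    return token, "no replication", None
```

     For instance, classify_sequence("[SEQ_A]") reports event stacking with at most 255 events, while classify_sequence("SEQ_A") reports no replication.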

     Table 1.e contains the list of Table D entries that define the PREPBUFR processing steps that can act to generate events. It should be noted that a step is not necessarily the same as a program here. Some programs consist of more than one step, while some steps can appear in more than one program. The description defines the program(s) associated with each Table D entry here. The "program code" mentioned in the previous paragraph is unique for a particular step here and is determined by the last 3 digits in the descriptor number.

     The PREPBUFR file contains a number of Table B entries which are code or flag tables (see Table 1.c).  Links are provided to web pages which define these tables.  In general, the code and flag tables for those variables defined with WMO descriptors can be found in the WMO BUFR Code and Flag Tables and Common Code Tables.  The code tables for most of the more common variables defined with local descriptors are discussed next.

     The reports in the PREPBUFR file are differentiated by both the PREPBUFR report type (mnemonic “TYP” in the PREPBUFR file) and by an input "dump" report type, loosely based on the obsolete NMC Office Note 29 and NMC Office Note 124 report types (mnemonic “T29” in the PREPBUFR file).  Reports are split into mass and wind pieces at the current time.  All mass reports contain PREPBUFR report types in the range 100-199, while all wind reports contain PREPBUFR report types in the range 200-299.  These report types are used by the various assimilation systems to identify the reports in the PREPBUFR file.  Table 2, Table 3, Table 4, Table 5 and Table 19 contain the code tables of PREPBUFR report types currently valid in the GFS/GDAS, CDAS, NAM/NDAS, RAP and RTMA networks, respectively.  In addition, more detailed information on the usage of each variable in each PREPBUFR report type can be found HERE for the GFS/GDAS and HERE for the NAM/NDAS.  The input "dump" report type defines the report more precisely than the PREPBUFR report type (e.g., PREPBUFR report type 180 consists of marine ship, buoy and C-MAN platform reports, all of which contain a unique input report type). Table 6 defines the code table of input "dump" report types (the same for all networks).  The input report type is not used by any assimilation system at the current time.
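
     The mass/wind split by report type range can be expressed as a short Python check. The function name report_kind is hypothetical; only the two ranges come from the text above.

```python
def report_kind(typ):
    """Classify a PREPBUFR report type (TYP) as a mass or wind piece."""
    if 100 <= typ <= 199:
        return "mass"
    if 200 <= typ <= 299:
        return "wind"
    raise ValueError("report type outside 100-299: %r" % (typ,))
```

     Report type 180, the marine example cited above, falls in the mass range, so report_kind(180) returns "mass".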

     Most of the observation types in the PREPBUFR file are associated with quality markers (e.g., mnemonics “PQM”, “TQM”, “WQM”, etc.).  These are used by the various analyses to place a weight on the data based on its quality.  Table 7 contains the code table of quality markers.  These quality markers apply to all observation types in the PREPBUFR file.

     The program codes (e.g., mnemonics “PPC”, “TPC”, “WPC”, etc.) associated with the PREPBUFR processing steps in Table 1.e and the reason codes associated with a particular program code (e.g., mnemonics “PRC”, “TRC”, “WRC”, etc.) together define the “events” associated with changes in the observation itself or in its quality marker in the course of the PREPBUFR processing.  Table 8.a contains the code table of reason codes for step “PREPRO” (program code “001”).  Table 8.b contains a code table of possible future reason codes based on events currently occurring in the PREPRO step.  Table 8.c contains a list of other unrecorded events in the PREPRO step that result in originally reported observational data not being encoded into the PREPBUFR file.  Table 9 contains the code table of reason codes for step “SYNDATA” (program code “002”).  Table 10 contains the code table of reason codes for step “PREVENT” (program code “004”).  Table 11 contains the code table of reason codes for step “CQCHT” (program code “005”).  Table 12 contains the code table of reason codes for step “RADCOR” (program code “006”).  Table 13 contains the code table of reason codes for step “PREPACQC” (program code “007”).  Table 14 contains the code table of reason codes for step “VIRTMP” (program code “008”).  Table 15 contains the code table of reason codes for step “CQCPROF” (program code “009”).  Table 16 contains the code table of reason codes for step “OIQC” (program code “010”).  Table 17 contains the code table of reason codes for step “CQCVAD” (program code “011”).  The steps “CLIMO”, “SSI”, “R3DVAR” and “NRLACQC” currently do not run, and the step “GSI” currently does not generate or store reason codes.  Step “DEFAULT” is designed to handle events written out by any non-defined program/step (useful for non-operational runs).
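
     The program codes listed in this paragraph, together with the rule that the program code is given by the last 3 digits of the step's Table D descriptor number, can be collected into a small Python lookup. The step names and codes are those given above; the function names are hypothetical, and the descriptor string "363008" in the usage note is a made-up example rather than a value taken from Table 1.e. Steps whose codes are not stated in this paragraph are deliberately left out.

```python
# Program code -> processing step, as listed in the paragraph above.
STEP_BY_PROGRAM_CODE = {
    1: "PREPRO",
    2: "SYNDATA",
    4: "PREVENT",
    5: "CQCHT",
    6: "RADCOR",
    7: "PREPACQC",
    8: "VIRTMP",
    9: "CQCPROF",
    10: "OIQC",
    11: "CQCVAD",
}


def program_code_from_descriptor(descriptor):
    """Program code = last 3 digits of the step's descriptor number."""
    return int(str(descriptor)[-3:])


def step_for_program_code(program_code):
    """Step name for a program code, or None if it is not listed here."""
    return STEP_BY_PROGRAM_CODE.get(program_code)
```

     Under these assumptions, step_for_program_code(program_code_from_descriptor("363008")) yields "VIRTMP", and a code that is not in the list yields None rather than a guessed step name.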

     Additional documentation on the structure of NCEP BUFR files in general can be found at http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB/, while additional documentation on the structure of PREPBUFR files in particular can be found at http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB/toc/prepbufr/.
 
 

4.  OPERATIONAL DATA THAT DO NOT PASS THROUGH PREPBUFR PROCESSING
 

     The NAM/NDAS, GFS/GDAS GSI and RAP GSI analyses also assimilate (or at least monitor) some observation types directly from their BUFR dump files or from other non-PREPBUFR sources. Such types include satellite radiances [including RARS (GFS/GDAS only)], satellite retrieved products [including TRMM/TMI and ozone (both GFS/GDAS only) and GOES cloud data from LaRC (RAP only)], GPS radio occultation (NAM/NDAS and GFS/GDAS only), WSR-88D NEXRAD radial wind data [NAM/NDAS and RAP only (RAP also uses radar mosaic to generate reflectivity data)] and lightning (RAP only).  In addition, the GFS/GDAS GSI (only) directly assimilates satellite derived winds from NESDIS/GOES [infrared, imager water vapor (cloud-top)], JMA/MTSAT [infrared, visible, imager water vapor (cloud-top)], EUMETSAT/METEOSAT (infrared, visible), and MODIS/AQUA-TERRA [infrared, imager water vapor (deep-layer and cloud-top)] by reading the same SATWND dump file that is read in by PREPOBS_PREPDATA and encoded into the GFS and GDAS PREPBUFR files (the GFS/GDAS GSI skips over these data when reading these PREPBUFR files).

     Table 18 summarizes the current usage of these data in the various NCEP assimilation systems.  It also lists data types that are monitored (but not used) in either the NAM/NDAS, GFS/GDAS or RAP GSI.
 
 

5.  EXAMPLE PROGRAM TO DECODE NCEP PREPBUFR FILES
 

     A sample program at http://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/decode_prepbufr_example demonstrates how the contents of the PREPBUFR file can be decoded using routines in the NCEP BUFR library.  In this particular example,  every report is decoded and listed to output files specific to the BUFR Table A message type.  This program also merges the mass and wind report "pieces" into a common output report for listing.  In subroutine READPB, there is a logical variable single_msgtyp which controls the amount of processing.  If it is set to FALSE, then the entire PREPBUFR file is decoded.  If it is set to TRUE, then only reports in the Table A message type indicated by the variable msgtyp_process are decoded. 
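
     The control flow described above (optional filtering by message type, then merging of mass and wind pieces) can be sketched in Python. This is not a translation of the sample program and does not call the NCEP BUFR library: reports are represented as plain dictionaries, the keys "msgtyp", "station_id", "obs_time", "typ" and "data" are hypothetical, and it is an assumption that the two pieces of a report can be matched on message type, station and time. Only the names single_msgtyp and msgtyp_process come from the text.

```python
def decode(reports, single_msgtyp=False, msgtyp_process="ADPUPA"):
    """Merge mass and wind pieces, optionally for one message type only."""
    merged = {}
    for rep in reports:
        # When single_msgtyp is True, skip every other message type.
        if single_msgtyp and rep["msgtyp"] != msgtyp_process:
            continue
        key = (rep["msgtyp"], rep["station_id"], rep["obs_time"])
        out = merged.setdefault(key, {
            "msgtyp": rep["msgtyp"],
            "station_id": rep["station_id"],
            "obs_time": rep["obs_time"],
            "mass": None,
            "wind": None,
        })
        # Report types 100-199 are mass pieces, 200-299 are wind pieces.
        piece = "mass" if rep["typ"] // 100 == 1 else "wind"
        out[piece] = rep["data"]
    return list(merged.values())
```

     Called with single_msgtyp=False, every report is processed; with single_msgtyp=True and msgtyp_process="ADPUPA", only the upper-air message type is returned, with each station's mass and wind pieces combined into one entry.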

 
 

6.  POST-PROCESSING OF NCEP PREPBUFR FILES
 

     The completion of the PREPBUFR job triggers a job which performs post-processing on the PREPBUFR files just created.  This job does not produce any output necessary to the successful completion of the analysis/forecast network (indeed, it runs either simultaneously with, or after, the analysis job which is also triggered by the completion of the PREPBUFR job).

     The first job step executes the program BUFR_REMOREST which removes or masks, from the PREPBUFR file, certain data types that are restricted (either by the data producers themselves or by the WMO) from redistribution outside of NCEP.  NCEP/NCO has created a very strict policy on who may or may not have access to restricted data.  The resulting PREPBUFR file, stripped of all restricted data, is given a suffix qualifier of ".nr" in the network-specific /com directories on the NCEP-CCS.  [Note: The last step of the PREPBUFR job in the GFS, GDAS and CDAS networks generates an unblocked version of the PREPBUFR file and copies it to the /com directories.  This unblocked form of the PREPBUFR file is also stripped of all restricted data here (and is given the suffix qualifier ".nr").  The unblocked, non-restricted PREPBUFR file is then copied to servers for use by organizations outside of NCEP.  (The native blocking on the IBM-SP machine is Fortran 77.)  Unblocked, restricted PREPBUFR files are not copied to these servers.]

     The next PREPBUFR post-processing job step identifies upper-air "TimeTwins" (duplications in current rawinsonde, pibal or dropwinsonde wind report "parts" vs. those over the past 35 days).  This is only executed in the GDAS network (but for every cycle).

     The next job step reformats received, selected, and assimilated data counts (for both satellite and non-satellite types) for all four GDAS network cycles for the current day and saves the result in the monthly archive directory.  On the second day of each month, a monthly summary is run, for the previous month, and posted to the web.  This job step is only executed in the 18z GDAS network.

     The final PREPBUFR post-processing job step generates a table of received, selected, and assimilated satellite data counts for all four GDAS network cycles.  This is only executed in the GDAS network (but for every cycle).