Observational Data Dumping at NCEP
Dennis Keyser - NOAA/NWS/NCEP/EMC
(Last Revised 2/12/2018 - should be up to date!)
Please take a moment to read the Disclaimer for this non-operational web page.

The dumping of observational data is the first step in each NCEP network production suite. At the appropriate network data cutoff time, up to two dump jobs are executed simultaneously. Once the two dump jobs have completed, a separate dump post-processing job is initiated.

===> Dump Job 1 performs the following steps in sequence:

A. Copying Files For Later Use By Analyses

In the Global Forecast System (GFS) and Global Data Assimilation System (GDAS) network runs, GRIB files containing current analyses of snow depth, ice distribution, and sea-surface temperature from NESDIS are copied from the NCEP Weather and Climate Operational Supercomputing System (WCOSS) /dcom database into network-specific ("/com") directories. These fields will be read later by the Global Gridpoint Statistical Interpolation (GSI) analysis. (Note: If the current day's files are not available, then files between one-day old and ten-days old, depending upon the product, are copied.)

In the North American Model (NAM) network runs, GRIB files containing current 16th-mesh (24 km) analyses of snow/sea-ice coverage produced by the NOAA/NESDIS Interactive Multisensor Snow and Ice Mapping System (IMS) and 8th-mesh (48 km) Northern Hemisphere snow-depth/sea-ice analyses produced by the Air Force are copied from the NCEP WCOSS /dcom database into network-specific ("/com") directories. These fields will be read later by the Regional Gridpoint Statistical Interpolation (GSI) analysis. (Note: If the current day's files are not available, then one-day old files are copied.)

In the full-cycle Rapid Refresh (RAP) network, GRIB files containing current 96th-mesh (4 km) analyses of snow/sea-ice coverage produced by the NOAA/NESDIS Interactive Multisensor Snow and Ice Mapping System (IMS) are copied from the NCEP WCOSS /dcom database into network-specific ("/com") directories. These fields will be read later by the RAP Gridpoint Statistical Interpolation (GSI) analysis. (Note: If the current day's files are not available, then one-day old files are copied.)
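The "use today's file, otherwise fall back to an older one" copy logic described above can be sketched roughly as follows. This is only an illustration: the directory layout, file naming, and the copy_latest_analysis function are hypothetical stand-ins, not the operational /dcom or /com conventions.

    # Sketch of copying the most recent available analysis file, with fallback.
    # Paths and file names are hypothetical; the real /dcom and /com layouts differ.
    import shutil
    from datetime import date, timedelta
    from pathlib import Path

    def copy_latest_analysis(product: str, dcom: Path, com: Path, max_age_days: int = 10) -> Path:
        """Copy the newest available GRIB analysis for 'product' into the
        network-specific directory, looking back up to max_age_days days."""
        for age in range(max_age_days + 1):
            day = date.today() - timedelta(days=age)
            candidate = dcom / day.strftime("%Y%m%d") / f"{product}.grb"  # hypothetical name
            if candidate.exists():
                dest = com / f"{product}.grb"
                shutil.copy2(candidate, dest)
                return dest
        raise FileNotFoundError(f"no {product} file within {max_age_days} days")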
B. Dumping of BUFR Observational Data (excluding WSR-88D Level II radial wind and reflectivity - see Dump Job 2 for the dumping of these data)

The process of accessing the observational database and retrieving a select set of observational data is accomplished in several stages by a number of FORTRAN codes. This retrieval process is run in all of the operational networks many times a day to assemble "dump" data for model assimilation. The script that manages the retrieval of observations provides users with a wide range of options. These include observational date/time windows, specification of geographic regions for filtering (via either a lat/lon box, a center point lat/lon and radius, or a lat/lon grid point mask), data specification and combination, duplicate checking and bulletin "part" merging, and parallel processing.

The primary retrieval software performs the initial stage of all data dumping by retrieving subsets of the NCEP WCOSS BUFR /dcom database that contain all of the database messages valid for the data type, geographical filter and time window requested by a user. (Recall that the /dcom database is continuously updated with new data as the GTS decoder and satellite ingest jobs run.) The retrieval software looks only at the date in Section One of the BUFR message to determine which messages to copy for a particular data type. This results in an observing set containing possibly more data than was requested, but allows the software to function very efficiently. The second stage of the process performs a final "winnowing" of the data to an observing set with the exact time window requested.1 This is done within the codes which remove exact- or near-duplicate reports (the nature of which is data-type dependent) and merge bulletin parts for upper-air reports from the TAC feed.

1 Normally, the six-hour cycle GFS, GDAS, and Climate Data Assimilation System (CDAS) network runs dump BUFR data globally over a six-hour time window centered on the analysis time. The six-hour cycle NAM network runs normally dump data within the expanded WRF-NMM-model domain over a six-hour time window centered on the analysis time [the NAM runs six hourly-update ("catchup") cycles to assimilate data hourly between 6-hours and 1-hour prior to cycle time]. The one-hour full-cycle RAP, partial-cycle RAP (RAP_PCYC), early-cycle RAP (RAP_ERLY), early-HRRR-cycle RAP (RAP_EHRRR), RTMA and URMA network runs normally dump BUFR data within the expanded WRF-NMM-model domain (a superset of the RAP, RTMA and URMA domains) over a one-hour time window centered on the analysis time. The 15-minute Rapid-Update RTMA (RTMA_RU) network runs normally dump BUFR data within the expanded WRF-NMM-model domain over a one-hour time window centered on the analysis times of 00, 15, 30 and 45 minutes past each hour. The RTMA and URMA dump domain includes Guam and surrounding waters. (Note: The full-cycle and early-cycle RAP share their PREPBUFR files with the HRRR. The early-cycle-HRRR RAP dumps WSR-88D Level II radial wind and reflectivity specifically for the HRRR.)
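A rough sketch of the two-stage selection just described: stage one keeps every database message whose Section 1 date falls inside a padded (coarse) window, so it may over-select, and stage two winnows the individual reports to the exact requested window and drops duplicates. The report structure, padding, and duplicate key below are hypothetical stand-ins for what the operational FORTRAN codes actually do.

    # Two-stage dump selection sketch (hypothetical data structures, not the operational code).
    from datetime import datetime, timedelta
    from typing import Iterable, List, NamedTuple

    class Report(NamedTuple):
        msg_date: datetime   # Section 1 date of the parent BUFR message
        obs_time: datetime   # exact observation time carried inside the report
        station: str
        lat: float
        lon: float

    def stage_one(messages: Iterable[Report], center: datetime, half_width: timedelta) -> List[Report]:
        # Coarse filter on the *message* date; cheap, but may retain more than requested.
        pad = timedelta(hours=1)  # illustrative padding
        return [r for r in messages if abs(r.msg_date - center) <= half_width + pad]

    def stage_two(reports: Iterable[Report], center: datetime, half_width: timedelta) -> List[Report]:
        # Exact winnowing to the requested window, plus a simple duplicate check
        # (the real duplicate criteria are data-type dependent).
        seen, kept = set(), []
        for r in sorted(reports, key=lambda r: r.obs_time):
            if abs(r.obs_time - center) > half_width:
                continue
            key = (r.station, r.obs_time, round(r.lat, 2), round(r.lon, 2))
            if key in seen:
                continue
            seen.add(key)
            kept.append(r)
        return kept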
Each data type selected for dumping is associated with a unique mnemonic string which represents a particular BUFR type and subtype in the /dcom database. The complete list of BUFR data types is shown in Table 1.a. This includes obsolete data types, future data types, and current data types which are not dumped in any network job. In order to limit the number of output dump files in the operational network jobs, like data types are grouped together and represented by sequence or group mnemonics. The data group mnemonics used to generate dump files in the various NCEP networks (including obsolete types) are read either by the subsequent PREPBUFR processing steps, by the subsequent analysis codes, or by neither, depending on the network. See Table 1.b for a listing of data group mnemonic dumps read by the PREPBUFR processing steps and Table 1.c for a listing of data group mnemonic dumps read by the analysis codes.

C. Re-processing of BUFR Observational Data Dump Files

Some of the BUFR data dump files are re-processed into new BUFR files so that they can be used properly by the subsequent PREPBUFR processing or analysis programs.

1. SSM/I data - all network runs (NOTE: The SSM/I data went bad in November 2009, resulting in no data being processed. All processing here was permanently turned off in October 2010): The "reports" in the SSM/I products BUFR dump files (group mnemonics "ssmip" or "ssmipn", see Table 1.b) consist of orbital scans, each of which contains 64 retrieval footprints of one or more products. The program PREPOBS_PREPSSMI unpacks selected products out of the scans, superobs them onto a one-degree latitude/longitude grid (optional in some network runs), then encodes them as individual "reports" in the output, re-processed BUFR file which contains only those data needed for subsequent PREPBUFR processing. The output filename contains the qualifier "spssmi" (see Table 1.b, key for superscript 2 in "NET" column). The GDAS, GFS and CDAS network runs superob the "operational" rainfall rate product generated at FNMOC, and the surface ocean wind speed and total column precipitable water products generated using a Neural-Net 3 algorithm (OMBNN3) developed by the Marine Modeling Branch of NCEP/EMC. The NAM network runs superob the "operational" surface ocean wind speed and total column precipitable water products generated at FNMOC. The upper-air RUC network run processes the same products as the NAM network runs but does not superob the data.

2. QuikSCAT data - NAM, GFS, GDAS and CDAS network runs (NOTE: The QuikSCAT data went bad in November 2009, resulting in no data being processed. All processing here was permanently turned off in October 2010): Each "report" in the QuikSCAT BUFR dump file (group mnemonic "qkscat", see Table 1.b) consists of four sets of nudged wind vectors and other raw scatterometer information. The program WAVE_DCODQUIKSCAT unpacks each report, checking the report date for realism, selecting the proper nudged wind vector, and excluding reports over land, reports with a missing nudged wind vector, reports with missing model wind direction and speed, reports with probability of rain greater than 10%, and reports at the edges of the orbital swath. Reports passing checks are then superobed onto a one-half degree lat/lon grid according to satellite id and encoded into the output, re-processed BUFR file which contains only those data needed for subsequent PREPBUFR processing. The output filename contains the qualifier "qkswnd" (see Table 1.b, key for superscript 1 in "NET" column).

3. TRMM TMI data - GFS, GDAS and CDAS network runs (NOTE: The TRMM TMI data went bad in April 2015, resulting in no data being processed. All processing here was permanently turned off at that time): Each "report" in the TRMM TMI BUFR dump file (group mnemonic "trmm", see Table 1.c) is at full footprint resolution. The program BUFR_SUPERTMI unpacks each report, checking the validity of the satellite id, observation date and total precipitation observation. Reports passing checks are then superobed onto a one-degree lat/lon grid according to satellite id and encoded into the output, re-processed BUFR file. The output filename contains the qualifier "sptrmm" (see Table 1.c, key for superscript 1 in "NET" column). The Global GSI analysis (GFS and GDAS network runs only) reads the superobed data directly from the reprocessed "sptrmm" BUFR dump file (these data do not pass through the PREPBUFR processing steps).
4. WindSat data - NAM, RAP, RAP_PCYC, RAP_ERLY, RTMA, URMA, RTMA_RU, GFS, GDAS and CDAS network runs: Each "report" in the WindSat BUFR dump file (group mnemonic "wndsat", see Table 1.b) consists of four sets of nudged wind vectors and other raw scatterometer information. The program BUFR_DCODWINDSAT unpacks each report, checking the report date for realism, selecting the proper nudged wind vector, and excluding reports not explicitly over ocean, reports with a missing nudged wind vector, reports with missing model wind direction and speed, and reports with a "bad" or "no retrieval" EDR quality flag. Reports passing checks are then superobed onto a one-degree lat/lon grid according to satellite id (GFS, GDAS and CDAS networks only) and encoded into the output, re-processed BUFR file which contains only those data needed for subsequent PREPBUFR processing. The output filename contains the qualifier "wdsatr" (see Table 1.b, key for superscript 5 in "NET" column). (NOTE: WindSat data have not been processed since August 2012 due to a format change in the raw files. In all likelihood these data will not be restored.)
5. ASCAT data - NAM, RAP, RAP_PCYC, RAP_ERLY, RTMA, URMA, RTMA_RU, GFS, GDAS and CDAS network runs: Each "report" in the ASCAT BUFR dump file (group mnemonic "ascatt", see Table 1.b) consists of two sets of nudged wind vectors and other raw scatterometer information. The program WAVE_DCODQUIKSCAT unpacks each report, checking the report date for realism, selecting the proper nudged wind vector, and excluding reports over land, reports with a missing nudged wind vector, reports with missing model wind direction and speed, and reports with one or more "critical" wind vector cell quality flags set. Reports passing checks are then encoded (without superobing) into the output, re-processed BUFR file which contains only those data needed for subsequent PREPBUFR processing. The output filename contains the qualifier "ascatw" (see Table 1.b, key for superscript 6 in "NET" column).
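Several of the re-processing programs above (PREPOBS_PREPSSMI, WAVE_DCODQUIKSCAT, BUFR_SUPERTMI, BUFR_DCODWINDSAT) superob accepted retrievals onto a coarse lat/lon grid, in some networks keyed by satellite id. The following is a minimal illustration of that idea only, using a hypothetical in-memory retrieval record; the operational codes read and write BUFR and apply type-specific quality screening first.

    # Superobbing sketch: average retrievals that fall in the same lat/lon box.
    # Grid size and record layout are illustrative, not the operational values.
    from collections import defaultdict
    from statistics import mean

    def superob(retrievals, box_deg=1.0, by_satellite=True):
        """retrievals: iterable of dicts with 'lat', 'lon', 'value' and 'satid' keys."""
        bins = defaultdict(list)
        for r in retrievals:
            i = int(r["lat"] // box_deg)
            j = int(r["lon"] % 360.0 // box_deg)
            key = (r["satid"], i, j) if by_satellite else (i, j)
            bins[key].append(r)
        superobs = []
        for members in bins.values():
            superobs.append({
                "satid": members[0]["satid"] if by_satellite else None,
                "lat": mean(m["lat"] for m in members),
                "lon": mean(m["lon"] for m in members),
                "value": mean(m["value"] for m in members),
                "n": len(members),   # number of retrievals combined into the superob
            })
        return superobs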
===> Dump Job 2, running simultaneously with Dump Job 1, performs the following single step:

Dumping of WSR-88D Level II radial wind and reflectivity BUFR Data

This currently runs only in the NAM network. The processing is identical to that described in Dump Job 1, Step B above. The dumping of WSR-88D Level II radial wind and reflectivity data is performed in a separate job from the dumping of all other data in the NAM network in order to save computation time, since it takes almost as long to dump Level II data here as it takes to dump all other observational data in Dump Job 1.

===> Tropical Cyclone Processing Job, running simultaneously with Dump Job 1 and, in the NAM network, Dump Job 2 (NOTE: no longer in the NAM after March 2017), performs the following steps in sequence:

A. Quality Control of Tropical Cyclone Bulletin Data (NOTE: No longer runs in the NAM after March 2017. After July 2017, the GFS and GDAS perform the quality control of tropical cyclone bulletin data in a process that is now upstream and separate from obs processing.)
In the GFS and GDAS network runs, tropical cyclone bulletins valid for the current cycle from the Joint Typhoon Warning Center (JTWC) and Fleet Numerical Meteorology and Oceanography Center (FNMOC) are read from the NCEP WCOSS /dcom database and merged into the proper record structure by the program SYNDAT_GETJTBUL. Next, tropical cyclone bulletins valid for the current cycle from the NCEP/Tropical Prediction Center (TPC) are read from the TPC directory on the NCEP WCOSS (these are already in the proper record format). Finally, manually generated tropical cyclone bulletins are read from the NCEP WCOSS database. The latter can be generated by the NCEP/NCO Senior Duty Meteorologist (SDM) in the event that data from other sources are not available.
Next, the program SYNDAT_QCTROPCY runs in order to merge the tropical cyclone records from the various sources and perform quality control on tropical cyclone position and intensity information. Some of the checks performed include duplicate records, appropriate date/time, proper record structure, storm name/id number, records from multiple institutions, secondary variables (e.g., central pressure), and storm position and direction/speed. The emphasis is on internal consistency between the reported storm location and prior motion. The output tropical cyclone vital statistics (tcvitals) file is then copied to the network-specific /com directories on the NCEP WCOSS. This file is read in the next tropical cyclone relocation step in the GFS and GDAS networks.
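One of the checks emphasized above is internal consistency between the reported storm position and the storm's prior motion. A toy version of that idea is sketched below: extrapolate the previous fix forward along the reported direction/speed and flag the new position if it lands too far away. The tolerance, time interval handling, and flat-earth distance approximation are assumptions for illustration only, not SYNDAT_QCTROPCY's actual criteria.

    # Position-vs-prior-motion consistency sketch (illustrative thresholds and geometry).
    import math

    def position_consistent(prev_lat, prev_lon, dir_deg, speed_kt, hours,
                            new_lat, new_lon, tol_nm=120.0):
        """Extrapolate the previous fix along the reported heading/speed and
        compare with the newly reported position (small-angle approximation)."""
        dist_nm = speed_kt * hours
        dlat = dist_nm / 60.0 * math.cos(math.radians(dir_deg))
        dlon = (dist_nm / 60.0 * math.sin(math.radians(dir_deg))
                / max(math.cos(math.radians(prev_lat)), 0.1))
        exp_lat, exp_lon = prev_lat + dlat, prev_lon + dlon
        err_nm = 60.0 * math.hypot(new_lat - exp_lat,
                                   (new_lon - exp_lon) * math.cos(math.radians(new_lat)))
        return err_nm <= tol_nm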
B. Relocation of Tropical Cyclone Vortices in the Global Sigma (First) Guess (NOTE: After March 2017 the NAM, and after July 2017 the GFS and GDAS, perform relocation of their own first guess in a process that is now upstream and separate from obs processing.)

In the GFS and GDAS network runs, the quality-controlled tropical storm position and intensity (tcvitals) file valid at the current time (output by the previous tropical cyclone record q.c. step), along with the tcvitals files valid 12- and 6-hours prior to the current time, and the "best" global sigma first guess and global pressure GRIB files valid 6-hours prior to the current time, 3-hours prior to the current time, at the current time, and 3-hours after the current time, are input to a series of programs (SUPVIT, GETTRK, RELOCATE_MV_NVORTEX). These programs relocate one or more tropical cyclone (or hurricane) vortices in the global sigma first guess files valid 3-hours prior to the current time, at the current time, and 3-hours after the current time.

The updated global sigma guess file for the current time is later read in the PREPBUFR processing by the program PREPOBS_PREPDATA and used by the various quality control programs in the PREPBUFR processing stream. In the GFS and GDAS networks, the updated global sigma guess files for all three times (the current time, 3-hours prior to the current time, and 3-hours after the current time) are read by the subsequent Global GSI analysis.

This processing may also (but usually does not) generate an updated tcvitals file valid at the current time. This file, if generated, contains only records for "weak" vortices which could not be used to update the global sigma first guess here. It would be read later in the PREPBUFR processing by the program SYNDAT_SYNDATA in the GFS and GDAS networks in order to generate tropical cyclone bogus wind reports. If this file is empty, no bogus reports will be generated by SYNDAT_SYNDATA. This updated tcvitals file is not considered in the NAM network runs, as the original tcvitals file, output by the previous tropical cyclone record q.c. step, is always input to SYNDAT_SYNDATA. Although tropical cyclone relocation is not used in the NAM runs (other than to provide a better guess for the PREPBUFR quality control programs), the t-06 NAM does start with a global sigma guess which reflects tropical cyclone relocation.

Note 1: This job runs only in the GFS and GDAS networks, and only if TPC and/or JTWC/FNMOC tropical storm records are originally present and valid at the current time.
===> Dump Post-Processing Job, running after both Dump Job 1 and Dump Job 2 have completed, performs the following single step:

Post-processing of BUFR Observational Data Dump Files

The completion of the data dump job(s) triggers a job which performs post-processing on the data dump files just created. This job does not produce any output necessary to the successful completion of the analysis/forecast network [indeed, it runs simultaneously with the PREPBUFR Processing Job, which is also triggered by the completion of the data dump job(s)].

The first job step prepares a table of data counts for the various reports just dumped via the execution of the program BUFR_DATACOUNT. These counts are compared to the running average over the past 30 days for each report type for the particular network and cycle time. If the current dump count for a particular type is considered abnormally low (for most report types this means more than 50% below the 30-day average), a dump alert is generated. The action taken for low dump counts depends upon the report type. For those types considered "critical" to the subsequent assimilation system, a low dump count generates diagnostics and triggers a code failure and a return code of 6 in the dump alert job. For those types considered "moderately-critical" (all types that are assimilated which are not in the "critical" category), a low dump count generates diagnostics and a non-fatal return code of 5 in the dump alert job. For those types considered "non-critical" (all types that are not assimilated in the particular network), a low dump count generates diagnostics and a non-fatal return code of 4 in the dump alert job. In all cases, a complete listing of dump counts vs. the 30-day average, along with those types which are either low or high (for most report types, high means more than 200% above the 30-day average), is sent to the SDM. High dump counts do not generate non-zero return codes in the dump alert job, but they do generate diagnostics. Trends in the 30-day averages vs. those for 3-, 6-, 9- and 12-months ago are also recorded for the SDM (report types trend low vs. one of these previous averaging periods if the current 30-day average is more than 20% below the 30-day average for that period, or trend high if the current 30-day average is more than 20% above the 30-day average for that period). Currently this dump count and alert processing runs only in the NAM (tm00 only), GFS and GDAS networks.
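The alert logic just described (low is roughly more than 50% below the 30-day average, high is roughly more than 200% above it, and the job's return code depends on how critical the type is) can be sketched as follows. The function name, report-type flags and default thresholds are placeholders for illustration; BUFR_DATACOUNT and the operational alert job implement this differently.

    # Dump-count alert sketch (thresholds as stated in the text; everything else illustrative).
    def classify_dump(count, avg30, critical, assimilated, low_frac=0.5, high_frac=2.0):
        """Return (status, return_code) for one report type in one network/cycle.
        'critical' and 'assimilated' describe the type for this network."""
        if avg30 <= 0:
            return "no-history", 0
        if count < low_frac * avg30:              # more than 50% below the 30-day average
            if critical:
                return "LOW (critical)", 6        # fatal in the alert job
            if assimilated:
                return "LOW (moderately-critical)", 5
            return "LOW (non-critical)", 4
        if count > (1.0 + high_frac) * avg30:     # more than 200% above the 30-day average
            return "HIGH", 0                      # diagnostics only, no non-zero return code
        return "OK", 0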
The next job step executes the program BUFR_REMOREST, which removes or masks, from the appropriate dump files, certain data types that are restricted (either by the data producers themselves or by the WMO) from redistribution outside of NCEP. NCEP/NCO has established a very strict policy on who may or may not have access to restricted data. The resulting dump files, gleaned of all restricted data, are given a suffix qualifier of ".nr" in the network-specific ("/com") directories on the NCEP WCOSS.
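A schematic of the kind of screening BUFR_REMOREST performs: produce a ".nr" counterpart of a dump in which message types on a restricted list are dropped (a masking variant would blank selected fields instead). The in-memory message representation and the restricted list below are hypothetical; the real program operates on BUFR messages and follows the NCEP/NCO restricted-data policy tables.

    # Restricted-data screening sketch (hypothetical in-memory messages, not real BUFR I/O).
    RESTRICTED_SUBTYPES = {("255", "161"), ("000", "007")}   # placeholder type/subtype pairs

    def write_nr_dump(messages):
        """Return the non-restricted ('.nr') version of a dump: restricted
        message types are dropped; everything else passes through unchanged."""
        cleaned = []
        for msg in messages:   # msg: dict with 'type', 'subtype' and data fields
            if (msg["type"], msg["subtype"]) in RESTRICTED_SUBTYPES:
                continue
            cleaned.append(msg)
        return cleaned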
The next dump post-processing job step executes the program BUFR_LISTDUMPS, which generates files containing text listings of all reports in the various BUFR data dump files. These text files are then copied to the network-specific ("/com") directories on the WCOSS in order to provide diagnostic information for troubleshooting problems in the data. Files containing listings of dump files that have been stripped of all restricted data are given the suffix qualifier ".nr".

The post-processing job also contains a step which generates unblocked versions of the BUFR data dump files and copies them to the ("/com") directories (again, files containing unblocked forms of dump files that have been stripped of all restricted data are given the suffix qualifier ".nr"). The unblocked files are then copied to servers for use by organizations outside of NCEP. (The native blocking on the IBM-SP machine is Fortran 77.) Restricted data are not copied to these servers. (NOTE: This step is no longer invoked after the migration to WCOSS because BUFR files are unblocked by default.)

Finally, in all networks, the final post-processing job of the day performs a data averaging step via the execution of the program BUFR_AVGDATA. This updates the 30-day running average for each report type dumped, for each cycle for which a dump is generated. These "current" 30-day averages are saved in text files, according to the network, in either the "/com/arch/prod/avgdata" [NAM (tm00 only), RAP], "/com2/arch/prod/avgdata" (RTMA, URMA), or "/gpfs/hps/nco/ops/com/gfs/prod/sdm_rtdm/avgdata" (GFS, GDAS) directory on the NCEP WCOSS. These files are used by the dump alert processing in the NAM (tm00 only), GFS and GDAS networks in order to generate alerts for high or low dump counts for the current dump vs. the current 30-day average (see paragraph two in this section). For the final post-processing job of a particular month, the current 30-day average for the NAM (tm00 only), GFS and GDAS networks is saved off in a separate file for that month in the same "/com" directory as the current 30-day average files. These past-month 30-day average files are used to check for high and low trends in the current NAM (tm00 only), GFS or GDAS 30-day average for a particular report type vs. the 30-day average for 3-, 6-, 9- and 12-months ago (again, see paragraph two in this section). Only the most recent 12 months of 30-day averages are saved here for the NAM (tm00 only), GFS and GDAS networks.
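The bookkeeping described here (maintain a 30-day average per report type and cycle, archive it monthly, and compare the current average with those archived 3, 6, 9 and 12 months ago using a 20% threshold) might look roughly like the sketch below. The incremental update rule and function names are assumptions for illustration; BUFR_AVGDATA's actual method and file formats are not shown.

    # 30-day average and trend sketch (update rule and names are illustrative).
    def update_avg30(old_avg, new_count, n_days=30):
        """Simple incremental update standing in for a true 30-day running mean."""
        return old_avg + (new_count - old_avg) / n_days

    def trend(current_avg30, past_avg30, frac=0.20):
        """Compare the current 30-day average with one archived 3/6/9/12 months ago."""
        if past_avg30 <= 0:
            return "no-history"
        if current_avg30 < (1.0 - frac) * past_avg30:
            return "trending low"
        if current_avg30 > (1.0 + frac) * past_avg30:
            return "trending high"
        return "steady"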
The NCEP production suite schedule, for those networks which originate with a dump of observational data, is shown in Table 2. "DUMP" indicates the name of Dump Job 1, "DUMP2" indicates the name of Dump Job 2, "DPOST" indicates the name of the Dump Post-processing Job, "PREP" (and "PREP1" and "PREP2" in the CDAS network) indicates the name of the PREPBUFR Processing Job, "ANAL" indicates the name of the Analysis Job, "FCST" (and "FCSTH" and "FCSTL" in the GFS network) indicates the name of the Forecast Job, "PPOST" (and "PPOST1" and "PPOST2" in the CDAS network) indicates the name of the PREPBUFR Post-processing Job, "GESS" in the RTMA and URMA networks indicates the name of the job which retrieves the first guess, and "APOST" in the RTMA and URMA networks indicates the name of the Analysis Post-processing Job. The initiation of the dump jobs ("DUMP" and "DUMP2") and the tropical cyclone processing job ("TROPCY") is triggered by the clock at the times indicated. All subsequent jobs run in sequence. "RAP_PCYC" refers to the partial-cycle Rapid Refresh network runs. "RAP_ERLY" refers to the early-cycle Rapid Refresh network runs. "RAP_EH" refers to the early-cycle-HRRR Rapid Refresh network runs. "RTMA_RU" refers to the Rapid-Update RTMA runs.