Observational Data Dumping at NCEP
Dennis Keyser - NOAA/NWS/NCEP/EMC
(Last Revised 7/18/2012)
Please take a moment to read the Disclaimer for this non-operational web page.

The dumping of observational data is the first step in each NCEP network production suite. At the appropriate network data cutoff time, up to three separate jobs are executed simultaneously: two dump jobs and a tropical cyclone processing job (the latter job is now actually submitted 5 minutes prior to the two dump jobs). Once the two dump jobs have completed, a separate dump post-processing job is initiated.

===> Dump Job 1 performs the following steps in sequence:

A. Copying Files For Later Use By Analyses

In the Global Forecast System (GFS) and Global Data Assimilation System (GDAS) network runs, GRIB files containing current analyses of snow depth, ice distribution, and sea-surface temperature from NESDIS are copied from the NCEP IBM Central Computer System (IBM-CCS) /dcom database into network-specific (“/com”) directories. These fields will be read later by the Global Gridpoint Statistical Interpolation (GSI) analysis. (Note: If the current files are not available, then one-day old files are copied.)

In the North American Model (NAM) and North American Data Assimilation System (NDAS) network runs, GRIB files containing current 16th mesh (24 km) analyses of snow/sea ice coverage produced by the NOAA/NESDIS Interactive Multisensor Snow and Ice Mapping System (IMS) are copied from the NCEP IBM-CCS /dcom database into network-specific /com directories. These fields will be read later by the Regional Gridpoint Statistical Interpolation (GSI) analysis. (Note: If the current files are not available, then one-day old files are copied.)

In the full-cycle Rapid Refresh (RAP) network, GRIB files containing current 96th mesh (4 km) analyses of snow/sea ice coverage produced by the NOAA/NESDIS Interactive Multisensor Snow and Ice Mapping System (IMS) are copied from the NCEP IBM-CCS /dcom database into network-specific /com directories.
These fields will be read later by the RAP Gridpoint Statistical Interpolation (GSI) analysis. (Note: If the current files are not available, then one-day old files are copied.)

B. Dumping of BUFR Observational Data (excluding WSR-88D Level II radial wind and reflectivity; see Dump Job 2 for the dumping of these data)

The process of accessing the observational database and retrieving a select set of observational data is accomplished in several stages by a number of FORTRAN codes. This retrieval process is run in all of the operational networks many times a day to assemble “dump” data for model assimilation. The script that manages the retrieval of observations provides users with a wide range of options. These include observational date/time windows, specification of geographic regions for filtering (via either a lat/lon box, a center point lat/lon and radius, or a lat/lon grid point mask), data specification and combination, duplicate checking and bulletin “part” merging, and parallel processing.

The primary retrieval software performs the initial stage of all data dumping by retrieving subsets of the NCEP IBM-CCS BUFR /dcom database that contain all of the database messages valid for the data type, geographical filter and time window requested by a user. (Recall that the /dcom database is continuously updated with new data as the GTS data decoder and satellite ingest jobs run.) The retrieval software looks only at the date in Section One of the BUFR message to determine which messages to copy for a particular data type. This results in an observing set that may contain more data than was requested, but it allows the software to function very efficiently.

The second stage of the process performs a final 'winnowing' of the data to an observing set with the exact time window requested [1]. This is done within the codes which remove exact- or near-duplicate reports (the nature of which is data type dependent) and merge bulletin parts for upper-air reports.
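The two-stage retrieval described above can be sketched in a few lines of Python. This is only an illustration, not the operational FORTRAN: the `Report` and `Message` classes, the function names, and the assumption that the Section One stamp is truncated to the hour are all hypothetical, and the geographic box is applied in the second stage purely for simplicity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Report:
    obs_time: datetime
    lat: float
    lon: float


@dataclass
class Message:
    section1_time: datetime  # coarse date/time stamp for the whole message
    reports: list = field(default_factory=list)


def select_messages(messages, center, half_width):
    """Stage 1: copy whole messages using only their Section One stamp."""
    # Assumption: stamps are truncated to the hour, so the lower bound is too.
    lo = (center - half_width).replace(minute=0, second=0, microsecond=0)
    hi = center + half_width
    return [m for m in messages if lo <= m.section1_time <= hi]


def winnow(messages, center, half_width, box=None):
    """Stage 2: keep only reports inside the exact window (and optional box)."""
    lo, hi = center - half_width, center + half_width
    kept = []
    for msg in messages:
        for rep in msg.reports:
            if not lo <= rep.obs_time <= hi:
                continue
            if box is not None:
                # box = (south, north, west, east); no date-line wrapping here
                south, north, west, east = box
                if not (south <= rep.lat <= north and west <= rep.lon <= east):
                    continue
            kept.append(rep)
    return kept
```

Stage 1 deliberately over-selects: a message stamped 10Z is copied for a window that starts at 10:30Z, and the reports before 10:30Z are only discarded in stage 2.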
[1] Normally, the six-hour cycle GFS, GDAS, and Climate Data Assimilation System (CDAS) network runs dump BUFR data globally over a six-hour time window centered on the analysis time. The six-hour cycle NAM and NDAS network runs normally dump data within the expanded WRF-NMM-model domain over a six-hour time window centered on the analysis time (the NDAS assimilates data and updates every three hours). The one-hour full-cycle RAP, partial-cycle RAP (RAP_PCYC) and RTMA network runs normally dump BUFR data within the expanded WRF-NMM-model domain (a superset of the RAP and RTMA domains) over a one-hour time window centered on the analysis time. The RTMA dump domain includes Guam and surrounding waters. The one-hour cycle surface RUC network runs normally dump BUFR data within the expanded RUC domain over a one-hour time window centered on the analysis time.

Each data type selected for dumping is associated with a unique mnemonic string which represents a particular BUFR type and subtype in the /dcom database. The complete list of BUFR data types is shown in Table 1.a. This includes obsolete data types, future data types, and current data types which are not presently dumped in any network job. In order to limit the number of output dump files in the operational network jobs, like data types are grouped together and represented by sequence or group mnemonics. The dump files generated from the data group mnemonics in the various NCEP networks (including obsolete types) are read by the subsequent PREPBUFR processing steps, by the subsequent analysis codes, or by neither, depending on the network. See Table 1.b for a listing of data group mnemonic dumps read by the PREPBUFR processing steps and Table 1.c for a listing of data group mnemonic dumps read by the analysis codes.

C. Re-processing of BUFR Observational Data Dump Files

Some of the BUFR data dump files are re-processed into new BUFR files such that they can be used properly by the subsequent PREPBUFR processing or analysis programs.

1. SSM/I data - all network runs (NOTE: The SSM/I data went bad in November 2009 resulting in no data being processed. All processing here was permanently turned off in October 2010): The “reports” in the SSM/I products BUFR dump files (group mnemonics “ssmip” or “ssmipn”, see Table 1.b) consist of orbital scans, each of which contains 64 retrieval footprints of one or more products. The program PREPOBS_PREPSSMI unpacks selected products out of the scans, superobs them onto a one-degree latitude/longitude grid (optional in some network runs), then encodes them as individual “reports” in the output, re-processed BUFR file, which contains only those data needed for subsequent PREPBUFR processing. The output filename contains the qualifier “spssmi” (see Table 1.b, key for superscript 2 in “NET” column).
The GDAS, GFS and CDAS network runs superob the “operational” rainfall rate product generated at FNMOC, and the surface ocean wind speed and total column precipitable water products generated using a Neural-Net 3 algorithm (OMBNN3) developed by the Marine Modeling Branch of NCEP/EMC. The NAM and NDAS network runs superob the “operational” surface ocean wind speed and total column precipitable water products generated at FNMOC. The upper-air RUC network run processes the same products as the NAM and NDAS network runs but it does not superob the data.

2. QuikSCAT data - NAM, NDAS, GFS, GDAS and CDAS network runs (NOTE: The QuikSCAT data went bad in November 2009 resulting in no data being processed. All processing here was permanently turned off in October 2010): Each “report” in the QuikSCAT BUFR dump file (group mnemonic “qkscat”, see Table 1.b) consists of four sets of nudged wind vectors and other raw scatterometer information. The program WAVE_DCODQUIKSCAT unpacks each report, checking the report date for realism, selecting the proper nudged wind vector, and excluding reports over land, reports with missing nudged wind vector, reports with missing model wind direction and speed, reports with probability of rain greater than 10%, and reports at the edges of the orbital swath. Reports passing checks are then superobed onto a one-half degree lat/lon grid according to satellite id and encoded into the output, re-processed BUFR file which contains only those data needed for subsequent PREPBUFR processing. The output filename contains the qualifier “qkswnd” (see Table 1.b, key for superscript 1 in “NET” column).

3. TRMM TMI data - GFS, GDAS and CDAS network runs: Each “report” in the TRMM TMI BUFR dump file (group mnemonic “trmm”, see Table 1.c) is at full footprint resolution. The program BUFR_SUPERTMI unpacks each report, checking the validity of the satellite id, observation date and total precipitation observation. Reports passing checks are then superobed onto a one-degree lat/lon grid according to satellite id and encoded into the output, re-processed BUFR file. The output filename contains the qualifier “sptrmm” (see Table 1.c, key for superscript 1 in “NET” column). The Global GSI analysis (GFS and GDAS network runs only) reads the superobed data directly from the re-processed “sptrmm” BUFR dump file (these data do not pass through the PREPBUFR processing steps).

4.
WindSat data - NAM, NDAS, RAP, RAP_PCYC, RTMA, GFS,
GDAS and CDAS network runs: Each
“report”
in the WindSat BUFR dump file (group mnemonic
“wndsat”, see
Table
1.b)
consists of four sets of
nudged wind vectors and other raw
scatterometer
information. The program BUFR_DCODWINDSAT
unpacks each
report
checking the report date for realism, selecting the proper nudged wind
vector, and excluding reports not explicitly over ocean, reports with
missing nudged
wind
vector, reports with missing model wind direction and
speed, and
reports with a "bad" or "no retrieval" EDR quality flag.
Reports
passing checks are then superobed onto a one-degree lat/lon grid
according to satellite id (GFS, GDAS and CDAS networks only) and
encoded into the
output,
re-processed BUFR file which contains only those data needed for
subsequent
PREPBUFR
processing.
The output
filename contains the qualifier
“wdsatr”
(see
Table
1.b, key for superscript 5 in “NET” column).
5. ASCAT data - NAM, NDAS, RAP,
RAP_PCYC, RTMA, GFS, GDAS and CDAS network runs:
Each
“report”
in the ASCAT BUFR dump file (group mnemonic
“ascatt”,
see
Table
1.b)
consists of
two sets of nudged wind vectors and other raw
scatterometer
information. The program WAVE_DCODQUIKSCAT
unpacks each
report
checking the report date for realism, selecting the proper nudged wind
vector, and excluding reports over land, reports with missing nudged
wind
vector, reports with missing model wind direction and speed, and
reports with one or more "critical" wind vector cell quality
flags
set. Reports passing checks are then encoded (without
superobing)
into the
output,
re-processed BUFR file which contains only those data needed for
subsequent
PREPBUFR
processing.
The output
filename contains the qualifier
“ascatw”
(see
Table
1.b, key for superscript 6 in “NET” column).
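
The screening and superobing pattern shared by the re-processing steps above can be illustrated with a short Python sketch. It is not the logic of PREPOBS_PREPSSMI, WAVE_DCODQUIKSCAT, BUFR_SUPERTMI or BUFR_DCODWINDSAT: the dictionary field names are hypothetical, and averaging the wind as u/v components within each grid box is an assumption made here for illustration, since the text does not say how the operational codes combine reports.

```python
from collections import defaultdict
from math import floor


def passes_checks(rep, max_rain_prob=0.10):
    """Screening in the spirit of the QuikSCAT checks; field names are hypothetical."""
    if rep["over_land"]:
        return False
    if rep["u"] is None or rep["v"] is None:
        return False
    return rep.get("rain_prob", 0.0) <= max_rain_prob


def superob(reports, cell_deg=1.0):
    """Average reports sharing a satellite id and a cell_deg x cell_deg grid box."""
    boxes = defaultdict(list)
    for rep in reports:
        key = (
            rep["sat_id"],
            floor(rep["lat"] / cell_deg),
            floor(rep["lon"] / cell_deg),
        )
        boxes[key].append(rep)

    out = []
    for key in sorted(boxes):
        members = boxes[key]
        n = len(members)
        out.append({
            "sat_id": key[0],
            # Assumption: simple arithmetic means of position and wind components.
            "lat": sum(m["lat"] for m in members) / n,
            "lon": sum(m["lon"] for m in members) / n,
            "u": sum(m["u"] for m in members) / n,
            "v": sum(m["v"] for m in members) / n,
            "count": n,
        })
    return out
```

Calling `superob(kept, cell_deg=0.5)` mimics the one-half degree grid mentioned for QuikSCAT, while skipping the call altogether corresponds to the ASCAT case, where reports are encoded without superobing.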
===> Dump Job 2 performs the following single step:

Dumping of WSR-88D Level II radial wind and reflectivity BUFR Data

This currently runs in only the NAM and NDAS networks. The processing is identical to that described in Dump Job 1, Step B above. The dumping of WSR-88D Level II radial wind and reflectivity data is performed in a separate job from the dumping of all other data in the NAM and NDAS networks in order to save computation time, since it takes almost as long to dump Level II data here as it takes to dump all other observational data in Dump Job 1.

===> Tropical Cyclone Processing Job, running simultaneously with Dump Job 1 and, in the NAM and NDAS networks, Dump Job 2, performs the following steps in sequence:

A. Quality Control of Tropical Cyclone Bulletin Data

In
the
GFS, GDAS, NAM and NDAS
network runs,
tropical
cyclone bulletins
valid
for
the current cycle from the
Joint
Typhoon Warning Center (JTWC)
and
Fleet
Numerical Meteorology and Oceanography Center (FNMOC)
are read from
the NCEP IBM-CCS /dcom database and merged into the proper record
structure
by the program SYNDAT_GETJTBUL. Next,
tropical cyclone
bulletins
valid for the current cycle from the
NCEP/Tropical
Prediction Center
(TPC) are read
from the TPC directory on the NCEP
IBM-CCS (these are already in the proper record format).
Finally,
manually generated
tropical
cyclone bulletins
are read from
the NCEP
IBM-CCS
database. The latter can be generated by the NCEP/NCO Senior
Duty
Meteorologist (SDM) in the event that data from other sources are not
available.
Next, the program SYNDAT_QCTROPCY runs in order to merge the tropical cyclone records from the various sources and perform quality control on tropical cyclone position and intensity information. Some of the checks performed include duplicate records, appropriate date/time, proper record structure, storm name/id number, records from multiple institutions, secondary variables (e.g. central pressure), storm position and direction/speed. The emphasis is on internal consistency between the reported storm location and prior motion. The output tropical cyclone vital statistics (tcvitals) file is then copied to the network-specific /com directories in the NCEP IBM-CCS. This file is read in the next tropical cyclone relocation step in all networks and also later in the PREPBUFR processing by the program SYNDAT_SYNDATA in the NAM and NDAS networks in order to generate tropical cyclone bogus wind reports.

B. Relocation of Tropical Cyclone Vortices in the Global Sigma (First) Guess

In the GFS,
GDAS, NAM and NDAS network runs, the quality-controlled
tropical storm
position
and intensity field (tcvitals)
file valid at the current time
(output by the previous
tropical
cyclone record
q.c. step), along
with the
tcvitals
files
valid 12- and 6-hours prior to the current
time, and the "best" global sigma first guess and global
pressure
grib files valid 6-hours prior to the current time, 3-hours
prior
to the current time,
at the current time, and 3-hours after the current time are
input
to a series of programs (SUPVIT,
GETTRK, RELOCATE_MV_NVORTEX).
These programs relocate
one or more tropical cyclone (or hurricane) vortices in the
global
sigma first guess files valid 3-hours prior to the current time, at the
current time, and
3-hours after the current time. The updated global
sigma
guess file for the current time is later read
in the
PREPBUFR
processing by
the program
PREPOBS_PREPDATA
and used by the various quality control programs in the PREPBUFR
processing stream. In the GFS and GDAS networks, the
updated global sigma guess files for all three times
(current
time, for 3-hours prior to the current time, and for 3-hours after the
current time) are read by the subsequent Global
GSI analysis.
This processing may also (but usually does not)
generate an
updated
tcvitals
file valid
at the current time.
This file, if generated, contains only records for "weak"
vortices
which could not be used to update the global sigma first guess here.
It would be read later in the
PREPBUFR
processing
by the
program
SYNDAT_SYNDATA
in the GFS and GDAS networks in order to generate tropical cyclone
bogus wind reports. If this file is empty, no bogus reports
will
be generated by
SYNDAT_SYNDATA.
This updated
tcvitals
file is not considered in the NAM and
NDAS network runs as the original tcvitals file, output by the previous
tropical
cyclone record q.c. step,
is always
input to
SYNDAT_SYNDATA.
Although tropical cyclone relocation is not used in the NAM and NDAS
runs (other than to provide a better guess
for PREPBUFR
quality control programs), the t-12 NDAS does start
with a
global sigma guess which reflects tropical cyclone
relocation.
Note 1: This
job runs only in the GFS, GDAS,
NAM and NDAS
networks, and only if TPC and/or JTWC/FNMOC tropical storm records are
originally present and valid at the current time.
Note 2: In
the NDAS network, Step A runs alone as a job only one time
for each cycle (00, 06, 12, 18Z), four hours after cycle time.
This is much later than the corresponding
NDAS cycle's
series of four dump jobs and four Step B jobs (running
relocation
only) for cycle time minus 12-hours, cycle time minus 9-hours,
cycle time minus 6-hours and cycle time minus 3-hours. The
dump
and relocation jobs run simultaneously for each of the four
NDAS
cycle processing times. The Step A job generates post-dated
tcvitals
files which
will be
read by future NDAS Step B jobs.
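
The timing relationships in Step B and in the NDAS note above can be written out explicitly. The Python sketch below only restates the offsets given in the text; the function names and the dictionary keys are hypothetical.

```python
from datetime import datetime, timedelta


def ndas_processing_times(cycle):
    """Valid times of the four NDAS dump / Step B (relocation) jobs for a cycle."""
    return [cycle - timedelta(hours=h) for h in (12, 9, 6, 3)]


def ndas_qc_job_time(cycle):
    """Step A (tcvitals q.c.) runs once per NDAS cycle, four hours after cycle time."""
    return cycle + timedelta(hours=4)


def relocation_inputs(valid):
    """Valid times of the files Step B reads and updates for one processing time."""
    return {
        # tcvitals files: 12 and 6 hours prior, plus the current time
        "tcvitals": [valid - timedelta(hours=h) for h in (12, 6, 0)],
        # global sigma first guess and pressure GRIB files read as input
        "guess_in": [valid + timedelta(hours=h) for h in (-6, -3, 0, 3)],
        # global sigma first guess files in which vortices are relocated
        "guess_relocated": [valid + timedelta(hours=h) for h in (-3, 0, 3)],
    }
```

For a 12Z NDAS cycle, for example, `ndas_processing_times` yields 00Z, 03Z, 06Z and 09Z of the same day, while the Step A job does not run until 16Z, which is why its tcvitals output can only serve later Step B jobs.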
===> Dump Post-Processing Job, running after both Dump Job 1 and Dump Job 2 have completed, performs the following single step:

Post-processing of BUFR Observational Data Dump Files

The completion of the data dump job(s) triggers a job which performs post-processing on the data dump files just created. This job does not produce any output necessary to the successful completion of the analysis/forecast network [indeed it runs simultaneously with the PREPBUFR Processing Job which is also triggered by the completion of the data dump job(s)].

The
first job step prepares a table of data counts for the
various
reports just dumped via the execution of the program
BUFR_DATACOUNT.
These counts
are compared to the running
average over the past 30 days for each report type for the particular
network and cycle time. If the current dump count for a
particular type is considered abnormally low (for most report types
this means more than 50% below the 30 day average), a dump alert is
generated. The action taken for low dump counts depends upon
the
report type. For those types considered "critical" to the
subsequent assimilation system, a low dump count generates diagnostics
and triggers a code failure and a return code of 6 in the dump alert
job. For those types considered
"moderately-critical" (all
types that are assimilated which are not in the "critical" category), a
low dump count generates diagnostics and a non-fatal return code of 5
in the dump alert job. For those types considered
"non-critical"
(all types that are not assimilated in the particular network), a low
dump count generates diagnostics and a non-fatal return code of 4 in
the dump alert job. In all cases, a complete listing of dump
counts vs. the 30 day average, along with those types which are either
low or high (for most report types this means more than 200% above the
30 day average) is sent to the SDM. High dump counts do not
generate non-zero return codes in the dump alert job but they do
generate diagnostics. Trends in the 30 day averages vs. those
for
3-, 6-, 9- and 12-months ago are also recorded for the SDM (report
types trend low vs. one of these previous averaging periods if the
current 30 day average is more than 20% below the 30 day average for
that period, or report types trend high vs. one of these previous
averaging periods if
the current 30 day average is more than 20% above the 30 day average
for
that period). Currently this dump count and alert processing
runs
only in the NAM, GFS and GDAS networks.

The next job step executes the program BUFR_REMOREST which removes or masks, from the appropriate dump files, certain data types that are restricted (either by the data producers themselves or by the WMO) from redistribution outside of NCEP. NCEP/NCO has created a very strict policy on who may or may not have access to restricted data. The resulting dump files, stripped of all restricted data, are given a suffix qualifier of ".nr" in the network-specific /com directories on the NCEP IBM-CCS.

The next dump post-processing job step executes the program BUFR_LISTDUMPS which generates files containing text listings of all reports in the various BUFR data dump files. These text files are then copied to the network-specific /com directories on the IBM-CCS in order to provide diagnostic information for troubleshooting problems in the data, etc. Files containing listings of dump files that have been stripped of all restricted data are given the suffix qualifier ".nr".

The
post-processing job also contains a step which generates
unblocked versions of the
BUFR
data dump files and copies them to the /com directories (again, files
containing
unblocked forms of dump files that have been stripped of all restricted
data are given the suffix qualifier ".nr"). The unblocked
files
are
then copied to servers for use by organizations outside of
NCEP.
(The native blocking on the IBM-SP machine is Fortran
77.)
Restricted data are not copied to these servers.

Finally, in all networks, the final post-processing job of
the
day performs a data average processing step via the execution of the
program BUFR_AVGDATA.
This updates the 30 day
running average for each report type dumped, for each cycle for which a
dump is generated. These "current" 30 day averages are saved
in
text files, according to the network, in the
"/com/arch/prod/avgdata" directory on the NCEP CCS.
These
files are used by the dump alert processing in the NAM, GFS and GDAS
networks in order to generate alerts for high or low dump counts for
the current dump vs. the current 30 day average (see paragraph two in
this section). For the final post-processing job of a
particular
month, the current 30 day average for the NAM, GFS and GDAS networks is
saved off in a separate file for that month in the same
"/com"
directory as the current 30 day average files. These past
month
30 day average files are used to check for high and low trends in the
current NAM, GFS or GDAS 30 day average for a particular report vs. the
30 day average for 3-, 6-, 9- and 12-months ago (again, see paragraph
two in this section). Only the most recent 12 months of 30 day averages are saved here for the NAM, GFS and GDAS networks.

The NCEP production suite schedule, for those networks which originate with a dump of observational data, is shown in Table 2. "DUMP" indicates the name of Dump Job 1, "DUMP2" indicates the name of Dump Job 2, "TROPCY" indicates the name of the Tropical Cyclone Processing Job (with "TROPCY1" for the relocation part only and "TROPCY2" for the q.c. part only in the NDAS network), "DPOST" indicates the name of the Dump Post-processing Job, "PREP" (and "PREP1" and "PREP2" in the CDAS network) indicates the name of the PREPBUFR Processing Job, "ANAL" indicates the name of the Analysis Job, "FCST" (and "FCSTH" and "FCSTL" in the GFS network) indicates the name of the Forecast Job, "PPOST" (and "PPOST1" and "PPOST2" in the CDAS network) indicates the name of the PREPBUFR Post-processing Job, "GESS" in the RTMA network indicates the name of the job which retrieves the first guess, and "APOST" in the RTMA network indicates the name of the Analysis Post-processing Job. The initiation of the dump jobs ("DUMP" and "DUMP2") and the tropical cyclone processing job ("TROPCY", or "TROPCY1" in the NDAS network) is triggered by the clock at the times indicated. All subsequent jobs run in sequence. "RAP_PCYC" refers to the partial-cycle Rapid Refresh network runs.
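
As an illustration of the dump count and alert rules described in the Dump Post-Processing Job section above, the following Python sketch applies the stated thresholds (more than 50% below or more than 200% above the 30 day average, and 20% for trends). It is not the BUFR_DATACOUNT code: the function names are hypothetical, and taking the highest per-type code as the overall job return code is an assumption made here, since the text does not say how codes for several low types are combined.

```python
LOW_RETURN_CODES = {
    "critical": 6,             # triggers a failure in the dump alert job
    "moderately-critical": 5,  # non-fatal
    "non-critical": 4,         # non-fatal
}


def classify_dump_count(count, avg30, category, low_frac=0.5, high_frac=2.0):
    """Return (status, return_code) for one report type."""
    if avg30 <= 0:
        return ("no-average", 0)
    if count < avg30 * (1.0 - low_frac):    # more than 50% below the average
        return ("low", LOW_RETURN_CODES[category])
    if count > avg30 * (1.0 + high_frac):   # more than 200% above the average
        return ("high", 0)                  # diagnostics only, no return code
    return ("normal", 0)


def trend(current_avg30, past_avg30, frac=0.2):
    """Compare the current 30 day average with one from 3, 6, 9 or 12 months ago."""
    if past_avg30 <= 0:
        return "unknown"
    if current_avg30 < past_avg30 * (1.0 - frac):
        return "low"
    if current_avg30 > past_avg30 * (1.0 + frac):
        return "high"
    return "steady"


def job_return_code(results):
    """Assumption: the job reports the most severe per-type return code."""
    return max((rc for _status, rc in results), default=0)
```

With these defaults most report types follow the 50% and 200% limits quoted in the text; types with different limits would pass their own `low_frac` and `high_frac`.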
NOAA / National Weather Service
National Centers for Environmental Prediction
Environmental Modeling Center
5200 Auth Road
Camp Springs, Maryland 20746-4304
Page Author: EMC Webmaster






