April 9, 2009 Meeting Summary
Zhan Zhang presented a few slides on his most recent work on differing results between P5 and P6 caused by round-off errors. For Hurricane Fay, HWRF runs on P6 (Cirrus) produced lower intensity errors than runs on P5 (Dew). To determine why, Zhan performed runs with all four combinations: P5 executable on P5, P5 executable on P6, P6 executable on P5, and P6 executable on P6. All four combinations initially produced different results. The differences could be caused by compiler differences between P5 and P6, runtime library differences, or bugs in the model. Zhan noted that the discrepancy was found only within the HWRF model. Using the subroutine BASE_STATE_PARENT and the variable Z3d as an example, Zhan detailed how he eliminated intermediate computational steps to get consistent results between P5 and P6. A comparison at a sample grid point (first time step) showed different values of Z3d on P5 and P6, with the modified Z3d matching either the P5 value or the P6 value without any apparent pattern. Zhan offered three possible solutions: 1) modifying all affected source code to eliminate intermediate computational steps, which could prove difficult; 2) using the compilation option -qautodbl=dbl, which promotes real*4 to real*8 and real*8 to real*16 to gain extra precision, but which so far does not work because HWRF uses external libraries; and 3) noting the problem and leaving the source code as is. Overall, this work aims to eliminate the differences between P5 and P6 in order to find bugs. Work on this issue is ongoing.
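As a rough illustration of why intermediate computational steps matter for round-off, the Python sketch below compares a single-precision sum that is rounded after every step with the same sum rounded only once at the end. This is not the HWRF code or the BASE_STATE_PARENT calculation; the function names and input values are hypothetical and were chosen only to make the effect visible. Keeping intermediates in double precision here loosely mirrors the kind of precision promotion described in option 2.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_rounding_each_step(a: float, b: float, c: float) -> float:
    """Add three numbers, storing the intermediate result in single precision."""
    partial = to_f32(to_f32(a) + to_f32(b))  # intermediate step: rounded to real*4
    return to_f32(partial + to_f32(c))       # final result: rounded to real*4 again


def sum_rounding_once(a: float, b: float, c: float) -> float:
    """Add three numbers with the intermediate kept in double precision."""
    # No single-precision intermediate: only the final result is rounded.
    return to_f32(to_f32(a) + to_f32(b) + to_f32(c))


if __name__ == "__main__":
    # Hypothetical values: near 1e8 the spacing between single-precision numbers is 8,
    # so adding 1.0 is lost when the intermediate is stored in single precision.
    print(sum_rounding_each_step(1e8, 1.0, -1e8))  # 0.0
    print(sum_rounding_once(1e8, 1.0, -1e8))       # 1.0
    # Reordering the same operands also changes the single-precision answer.
    print(sum_rounding_each_step(1e8, -1e8, 1.0))  # 1.0
```

The same three inputs give 0.0 or 1.0 depending only on where rounding happens and in what order the operands are combined. Differences of this kind between two compilers or runtime libraries would be one possible source of the P5/P6 discrepancies described above.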
Vijay Tallapragada presented his work on the HWRF transition to Cirrus and the differences between P5 and P6. Vijay ran end-to-end experiments for Gustav, Hanna, and Ike from the 2008 Atlantic hurricane season. Track errors for Gustav, Hanna, and Ike combined showed very small differences between P5 (in red) and P6 (in green); however, intensity errors for P6 were higher than those for P5 at all forecast times, by as much as 2 kts. For Gustav alone, track errors for P6 were a little higher than those for P5 from 48h onward, and intensity errors alternated, with P6 worse than P5 at some forecast times and better at others. For Hanna, track errors were very similar between P5 and P6, but P6 had higher intensity errors. For Ike, track errors were almost identical between P5 and P6, while intensity errors once again alternated between being higher for P5 and higher for P6. Between P5 and P6, there was a 5-7% variance for these three storms, which comprised about 1/3 of the total Atlantic sample. These differences could be attributed to pre-processing, specifically differences in initialization (for both SI and Qingfu's initialization), in addition to the round-off issue described by Zhan in his presentation. Finally, Vijay mentioned runtime issues. For example, using 80 processors on both P5 and P6, the run on P6 was slower by approximately 13 minutes for the 126h forecast. For the stand-alone model, the runtime difference was somewhat smaller and in the opposite direction, with P6 running about 6 minutes faster than P5. This suggests that the runtime issue could be associated with the coupler. More MPMD-related options, such as co-scheduling, also need to be explored. Work on this issue will continue.