Probabilistic and Deterministic Forecasting using Evolutionary Program Ensembles

Paul J. Roebber
UW Milwaukee
Noon August 11 in Room 2155

Charles Darwin wrote: “Can it … be thought improbable … that other variations useful in some way to each being in the great and complex battle of life, should sometimes occur in the course of thousands of generations? If such do occur, can we doubt … that individuals having any advantage, however slight … would have the best chance of surviving and of procreating their kind?” This is the conceptual basis of evolutionary programming (EP), a process in which simulated evolution is used to find solutions to problems as diverse as sorting numbers and forecasting minimum temperature. Despite a history in computer science dating back to the 1960s, the application of this idea to meteorological studies is relatively new. Recently, EP has been adapted to the weather domain in order to generate large-member ensemble forecasts for minimum temperature, maximum temperature, wind power, and heavy rainfall (Roebber 2013; Roebber 2015a,b,c). These studies have shown that the method can provide greater probabilistic and deterministic skill, particularly at the extremes, than post-processed numerical weather prediction (NWP) ensembles. Further research has shown that this skill advantage persists out to longer ranges, where the forecast signal is presumably weaker.

The method can be understood as follows. Suppose that we have a well-defined problem with a clear measure of success (e.g., root-mean-square error), and for which we can construct solutions by performing various mathematical operations on a set of inputs. In this case, it is possible to develop a single computer program that generates algorithms that solve the defined problem by applying various operators and coefficients to the inputs. The level of success or "fitness" of a particular solution can then be measured. The idea of fitness invokes evolutionary principles and suggests that if one starts from a very large set of random initial algorithms and allows fit algorithms to propagate some portion of their components to the next generation, then it may be possible to produce improved algorithms over time. The culling of the population in favor of fitter individuals, together with the exchange of "genetic material" between fit algorithms, drives the progress towards improved solutions. Since weather forecast problems are nonlinear with non-unique solutions, evolved programs are a new means for generating a set of skillful but independent solutions. The algorithms resemble multiple linear or nonlinear regression equations, but with conditionals that allow for special circumstances to be accounted for as a routine outcome of the data search (e.g., the impact of snow cover on temperature under conditions of clear skies and light winds; Roebber 2010).
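The loop described above (random initial algorithms, a fitness measure, culling, and exchange of components between fit algorithms) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the cited studies: the algorithm form (a linear combination of three inputs plus one IF-THEN adjustment), the synthetic training data, the population size, and all function names (`random_algorithm`, `crossover`, `mutate`, `evolve`, `make_data`) are hypothetical choices made here for illustration.

```python
import math
import random

N_INPUTS = 3  # assumed number of predictors (illustrative only)

def random_algorithm(rng):
    """One candidate: linear terms plus a single IF-THEN adjustment."""
    return {
        "bias": rng.uniform(-1, 1),
        "coef": [rng.uniform(-1, 1) for _ in range(N_INPUTS)],
        "cond_var": rng.randrange(N_INPUTS),  # input tested by the conditional
        "cond_thresh": rng.uniform(-1, 1),
        "cond_adj": rng.uniform(-1, 1),       # added when the condition holds
    }

def predict(algo, x):
    y = algo["bias"] + sum(c * xi for c, xi in zip(algo["coef"], x))
    if x[algo["cond_var"]] > algo["cond_thresh"]:
        y += algo["cond_adj"]
    return y

def rmse(algo, data):
    """Fitness measure: lower root-mean-square error means fitter."""
    return math.sqrt(sum((predict(algo, x) - y) ** 2 for x, y in data) / len(data))

def crossover(a, b, rng):
    """Exchange of 'genetic material': each component comes from one parent."""
    child = {k: (a if rng.random() < 0.5 else b)[k] for k in a if k != "coef"}
    child["coef"] = [ca if rng.random() < 0.5 else cb
                     for ca, cb in zip(a["coef"], b["coef"])]
    return child

def mutate(algo, rng, scale=0.1):
    """Small random perturbation of the numeric components."""
    out = dict(algo, coef=[c + rng.gauss(0, scale) for c in algo["coef"]])
    for key in ("bias", "cond_thresh", "cond_adj"):
        out[key] += rng.gauss(0, scale)
    return out

def evolve(data, pop_size=100, generations=30, keep=0.2, seed=0):
    rng = random.Random(seed)
    pop = [random_algorithm(rng) for _ in range(pop_size)]
    history = []  # best RMSE at the start of each generation, plus the final one
    for _ in range(generations):
        pop.sort(key=lambda a: rmse(a, data))
        history.append(rmse(pop[0], data))
        survivors = pop[: int(keep * pop_size)]  # cull the less fit
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    pop.sort(key=lambda a: rmse(a, data))
    history.append(rmse(pop[0], data))
    return pop, history

def make_data(n=100, seed=1):
    """Synthetic training set with a conditional effect (purely illustrative)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(N_INPUTS)]
        y = 0.5 + 0.8 * x[0] - 0.3 * x[1] + (0.6 if x[2] > 0.2 else 0.0)
        data.append((x, y))
    return data

if __name__ == "__main__":
    population, best_rmse = evolve(make_data())
    print("best RMSE, first vs. last generation:", best_rmse[0], best_rmse[-1])
```

Because the fittest survivors are carried forward unchanged in this sketch, the best RMSE cannot increase from one generation to the next. The final sorted population can also be read as an ensemble: its leading members are distinct algorithms that each fit the training data reasonably well.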

In this talk, I will discuss the EP concept and its most recent meteorological forms, including examples from various applications of the method. Roebber (2015a,b,c) modified the technique to incorporate various forms of genetic exchange, disease, mutation, and the training of solutions within ecological niches, and to produce an adaptive form that can account for changing local conditions (such as changing flow regimes) as well as improved forecast inputs; thus, once initial training is completed, the ensemble will adapt automatically as forecasts are produced. I will outline efforts to mitigate the tendency for EP ensembles, like NWP ensembles, to be underdispersive, and describe the concept of balancing the minimization of root-mean-square error with the maximization of ensemble diversity. I will then conclude with a discussion of outstanding questions regarding the method and future research directions.
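For readers unfamiliar with ensemble dispersion, one common diagnostic compares the average ensemble spread with the root-mean-square error of the ensemble mean; a spread noticeably smaller than the error suggests too little dispersion. The sketch below illustrates that generic diagnostic only. It is not taken from the cited studies, and the function name `spread_and_error` and the toy numbers are assumptions made for illustration.

```python
import math

def spread_and_error(forecasts, observations):
    """Return (mean ensemble spread, RMSE of the ensemble mean).

    forecasts: one list of member forecasts per case (at least 2 members each).
    observations: one verifying value per case.
    """
    total_var, total_sq_err = 0.0, 0.0
    for members, obs in zip(forecasts, observations):
        m = len(members)
        mean = sum(members) / m
        total_var += sum((f - mean) ** 2 for f in members) / (m - 1)
        total_sq_err += (mean - obs) ** 2
    n = len(observations)
    return math.sqrt(total_var / n), math.sqrt(total_sq_err / n)

if __name__ == "__main__":
    # Toy values: two cases, three members each (illustrative only).
    spread, error = spread_and_error([[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]], [4.0, 2.0])
    print("spread:", spread, "rmse of mean:", error, "ratio:", spread / error)
```

In this toy case the spread is half the error of the ensemble mean, which is the pattern the term underdispersion refers to.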