Probabilistic future climate and climate impacts prediction workshop
25-26 September 2006
Biomathematics and Statistics Scotland - University of Edinburgh
Summary
Day 1 involved presentations by four groups on current work in the area of probabilistic climate impacts modelling.
Jonathan Rougier presented results from a screening experiment to assess parameter uncertainty in the GENIE-I model as part of the RAPID project. The model contains sixteen unknown inputs, for which maximum and minimum values have been elicited. Carefully selected Box-Cox transformations were used to transform each of the parameters onto the range [-1,1]. Parameter values were sampled using a variant of a Latin Hypercube design within the transformed parameter space, modified so that it achieved maximum space filling for those parameters that were regarded a priori as being the most important. Jonathan informed us that the screening experiment had yielded a high proportion of runs for which the model failed to evaluate, and presented a `pre-calibration' approach that he had developed in order to detect which of the model runs would have had a high probability of failure. The approach is based upon a logistic regression of success/failure upon the model inputs, and so identifies those regions of the parameter space that tend to be associated with high and low probabilities of failure. The results suggest that these regions are complicated, being defined by multiple complex interactions between seven of the input variables. Jonathan argued that the pre-calibration approach provides an initial attempt to compare model outputs against a mix of expert prior information and real data, and noted that there is much scope for the development of more sophisticated approaches within this area - for example, it can be extended to distinguish between physically plausible and implausible outputs, as well as between model runs which complete and those which do not.
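The two ingredients described above, a Latin Hypercube design on [-1,1] and a logistic regression of failure upon the inputs, can be sketched as follows. This is a minimal illustration and not the analysis that was presented: it uses a plain (unmodified) Latin Hypercube, three inputs rather than sixteen, main effects only rather than interactions, and a made-up failure rule standing in for the real simulator. All function names are hypothetical.

```python
import math
import random

def latin_hypercube(n_runs, n_inputs, rng):
    """Plain Latin hypercube on [-1, 1]^n_inputs: each input's range is cut
    into n_runs equal strata and every stratum is sampled exactly once."""
    columns = []
    for _ in range(n_inputs):
        strata = list(range(n_runs))
        rng.shuffle(strata)
        columns.append([-1.0 + 2.0 * (s + rng.random()) / n_runs for s in strata])
    # Transpose so that each row is one model run.
    return [list(row) for row in zip(*columns)]

def linear_predictor(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def failure_probability(beta, x):
    return 1.0 / (1.0 + math.exp(-linear_predictor(beta, x)))

def fit_logistic(X, y, steps=2000, rate=0.5):
    """Logistic regression of failure (1) / success (0) on the inputs,
    fitted by gradient ascent on the log-likelihood (intercept first)."""
    n, d = len(X), len(X[0])
    beta = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            resid = yi - failure_probability(beta, xi)
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

rng = random.Random(0)
design = latin_hypercube(n_runs=200, n_inputs=3, rng=rng)

# Assumption for illustration only: runs fail more often when input 0 is large.
failed = [1 if rng.random() < 1.0 / (1.0 + math.exp(-4.0 * x[0])) else 0
          for x in design]

beta = fit_logistic(design, failed)
p_high = failure_probability(beta, [0.9, 0.0, 0.0])
p_low = failure_probability(beta, [-0.9, 0.0, 0.0])
```

The fitted probabilities can then be used to flag regions of the parameter space where a run is likely to fail before it is submitted. Capturing the kind of interactions reported in the talk would require adding product terms to the regression.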
Mark New discussed the development of a risk-based framework to handle probabilistic climate change in the context of water resources and biodiversity. He presented results from an analysis of water resources in the Thames basin based upon driving the CATCHMOD hydrology model with climate projections derived from Phase 1 of the climateprediction.net project. cp.net is a distributed public computing project, and Phase 1 involves repeatedly running an equilibrium experiment to investigate the impact of a doubling of CO2 for a wide range of input values (physical parameterisations, initial conditions and climate forcings). The Thames basin is small relative to the size of a GCM grid cell, and the outputs from cp.net do not provide enough information to perform any realistic form of downscaling, so the mismatch in the spatial resolution is dealt with through the use of change factors (i.e. by assuming that the proportional changes in climate variables that are seen within the GCM grid cell will also apply at finer spatial resolutions). The results reveal evidence of bimodality in the response of precipitation to climate change, with the two modes hypothesised to relate to whether the model predicts the autumn to be generally wet or generally dry. The results are able to incorporate the impacts of climate uncertainty and hydrological uncertainty, leading to relatively high levels of estimated uncertainty for the impacts of climate change.
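The change factor idea can be illustrated with a short sketch. The numbers and function names below are invented for illustration and are not taken from the Thames analysis: the proportional change seen in the GCM grid cell is simply applied to observations at the finer (catchment) scale.

```python
def change_factors(gcm_baseline, gcm_future):
    """Proportional change in each month between the GCM baseline run
    and the GCM future (e.g. doubled CO2) run for one grid cell."""
    return [future / base for base, future in zip(gcm_baseline, gcm_future)]

def apply_change_factors(observed, factors):
    """Scale fine-resolution observations by the grid-cell change factors."""
    return [obs * factor for obs, factor in zip(observed, factors)]

# Hypothetical monthly precipitation totals (mm) for three months.
gcm_baseline = [80.0, 50.0, 100.0]
gcm_future = [100.0, 40.0, 100.0]
catchment_observed = [60.0, 30.0, 90.0]

factors = change_factors(gcm_baseline, gcm_future)
catchment_future = apply_change_factors(catchment_observed, factors)
```

Repeating this for every member of a climate model ensemble gives a set of perturbed catchment series, each of which could be passed to a hydrological model.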
Ruth Doherty and Adam Butler outlined preliminary results from an analysis of the impact of climate uncertainty upon the assessment of trends in the outputs from the LPJ model for terrestrial ecosystems, funded as part of the ALARM project. Ruth outlined the structure of the LPJ model, discussed the various sources of uncertainty within it, and outlined the results of the paper by Zaehle et al. (2005), which attempts to quantify the level of parameter uncertainty within LPJ. The current analysis focused on the impact of climate uncertainty upon LPJ, and Ruth explained how publicly available climate projections based upon 18 runs from 9 GCMs were used to provide climatic inputs to the vegetation model. Adam presented a statistical approach for using hierarchical time series models to analyse trends in aggregated output from LPJ, and discussed possible implementations of this approach within a Bayesian context. Adam also discussed some of the statistical and computational difficulties associated with performing a fully probabilistic assessment of uncertainty in the context of an impacts model such as LPJ. David Cameron noted that this work does not capture the feedback effects of vegetation upon climate, and outlined current work that attempts to deal with the relationship between vegetation and climate within a unified framework, albeit in the context of a much simplified climate model.
Hayley Fowler presented results from the AquaTerra project that attempt to link probabilistic climate scenarios with downscaling methods for hydrological impact studies. Her analysis involved running four Regional Climate Models, each driven by one or more different GCMs. Each of the RCM runs is used as input to the EARWIG weather generator, and subsequently to a rainfall-runoff model, in order to produce simulated time series of hydrological variables. The approach of Tebaldi et al. (2004) is used to estimate weights which quantify the performance of the different RCMs in describing aggregate properties of the true climate in each season, and these weights are then used in resampling from the set of simulated hydrological time series. Hayley raised more general questions about when and how we should attempt to weight different climate models, and about whether it was possible/necessary to perform this weighting in a spatially explicit fashion. Robin Hankin pointed out that the RCM runs are far from independent, leading to a more general discussion on the causes and effects of non-independence in the responses of different climate models.
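The resampling step can be sketched as below. The weights here are invented placeholders: the sketch does not implement the method of Tebaldi et al. (2004) for estimating them, and the model names, series and function name are all hypothetical.

```python
import random

def weighted_resample(series_by_model, weights, n_draws, rng):
    """Draw simulated series with replacement, choosing each model
    with probability proportional to its weight."""
    names = list(series_by_model)
    picks = rng.choices(names, weights=[weights[name] for name in names], k=n_draws)
    return [(name, series_by_model[name]) for name in picks]

# Hypothetical simulated annual runoff series (one short series per RCM run).
series_by_model = {
    "rcm_a": [310.0, 295.0, 330.0],
    "rcm_b": [280.0, 300.0, 290.0],
    "rcm_c": [350.0, 340.0, 360.0],
    "rcm_d": [260.0, 255.0, 270.0],
}
# Placeholder weights; in the analysis described above these would be estimated.
weights = {"rcm_a": 0.5, "rcm_b": 0.3, "rcm_c": 0.15, "rcm_d": 0.05}

rng = random.Random(1)
draws = weighted_resample(series_by_model, weights, n_draws=10000, rng=rng)
counts = {name: sum(1 for picked, _ in draws if picked == name)
          for name in series_by_model}
```

Note that sampling models independently in this way takes no account of the non-independence between RCM runs that was raised in the discussion.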
Day 2 consisted of a structured discussion which centred upon four key themes:
Emulation provides a generic statistical approach for the analysis of output from computationally expensive models, enabling us to draw probabilistic inferences about the values of the model for parameter values at which it has not yet been evaluated. Robin Hankin presented the basic mathematical details of the methodology, and demonstrated how the BACCO bundle of packages can be used to implement them within R. There was debate about how emulators should be viewed: Robin argued that an emulator is simply a computationally cheap approximation to a random function, whilst Jonathan Rougier and John Paul Gosling argued that it is essentially a statement of belief about values of the model for those parameter values at which it has not yet been evaluated. Jonathan emphasised the distinction between a surrogate and an emulator, noting that statisticians define the former as a fast approximation to an expensive model whilst defining the latter as a tool for drawing probabilistic inferences about the expensive model based upon the use of a fast approximation to it. John Paul noted that the emulator reflects our prior beliefs about the smoothness with which model outputs vary across the parameter space, and argued that the regressor terms within the emulator should reflect our knowledge of the physical processes which are encapsulated within the expensive model.
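The distinction between a surrogate and an emulator can be made concrete with a small Gaussian process sketch. This is not the BACCO interface (which is in R) but a self-contained illustration under simplifying assumptions: one input, a zero prior mean with no regressor terms, a squared exponential covariance with a fixed length scale, and a sine function standing in for the expensive model. The predictive mean alone would serve as a surrogate; the emulator also returns a variance expressing uncertainty about the model at inputs where it has not been run.

```python
import math

def sq_exp(a, b, length=0.3, var=1.0):
    """Squared exponential covariance: encodes a prior belief in smoothness."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def emulate(x_design, y_design, x_new, nugget=1e-8):
    """Predictive mean and variance at x_new given runs at x_design."""
    n = len(x_design)
    K = [[sq_exp(x_design[i], x_design[j]) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [sq_exp(x_new, xi) for xi in x_design]
    mean = sum(k * a for k, a in zip(k_star, solve(K, y_design)))
    variance = sq_exp(x_new, x_new) - sum(
        k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, max(variance, 0.0)

# Stand-in for an expensive simulator, evaluated at five design points.
x_design = [0.0, 0.25, 0.5, 0.75, 1.0]
y_design = [math.sin(2.0 * math.pi * x) for x in x_design]

mean_at_run, var_at_run = emulate(x_design, y_design, 0.25)   # already evaluated
mean_between, var_between = emulate(x_design, y_design, 0.625)  # between runs
mean_far, var_far = emulate(x_design, y_design, 3.0)            # far from any run
```

The variance is close to zero at an input where the model has been run, grows between design points, and returns to the prior variance far from the design, which is the probabilistic statement that a plain surrogate does not provide.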
Assessment of model reliability constitutes a key aspect of probabilistic climate and climate impact prediction. Robin Hankin emphasised that there are two elements to the assessment of model performance - verification (ensuring that we solve the equations correctly) and validation (ensuring that we solve the right equations). It was noted that assessments of reliability ought to relate to the purposes for which the model was constructed, but that many climate and climate impact models are sufficiently generic that in practice this is often difficult or impossible to achieve. David Cameron noted that models are typically assessed relative to other models, or, more usually, to earlier versions of the same model. Hayley Fowler pointed out that current comparisons of model outputs with data tend to focus upon mean values, whilst levels of variability and extreme values may actually be much more indicative of the performance of the model in reproducing the true system. Adam Butler argued that observational data on impacts are often sparse, and that they may also be subject to systematic biases which must be accounted for within the assessment of model performance. Jonathan Rougier suggested that assessments should attempt to include data from as many distinct sources as possible, and this led to a discussion on the validity of including palaeoclimatic data in the assessment of climate model performance. Jonathan also noted that any assessment of model performance must incorporate prior information about the extent to which the model is regarded as accurate.
Spatial scales for which impact assessments are required tend to be much finer than those for which GCM outputs are available, necessitating either the use of a Regional Climate Model or the use of some kind of statistical downscaling technique. It was also noted that there is a tendency for end users to prefer assessments at fine spatial scales, because these tend to produce more plausible looking outputs ('pretty picture syndrome'), but that assessments at coarser scales may often be of greater scientific accuracy and utility. Jonathan Rougier argued that it is possible to devise sophisticated techniques for downscaling, but that the practical utility of the methods is typically limited by our ability to validate the accuracy of their outputs against real data. Roger Street suggested that synoptic scale identification of weather patterns can be used to provide a low dimensional descriptor of the climate, and that this can then be used to simplify the process of downscaling and upscaling; Mark New pointed out that synoptic identification is already widely performed, for example within the context of climateprediction.net.
The accurate communication of probabilistic assessments to policymakers and the public was universally agreed to be a vital and difficult task. Roger Street and Ruth Doherty emphasised the importance of good communication between the climate modelling and impact assessment communities, with Ruth outlining the ways in which impact assessments can yield useful information about the performance of the climate models that have been used to drive the assessment. Roger Street argued that many probabilistic assessments of impacts can most usefully be phrased in terms of the probability of exceeding a threshold. The nature of the IPCC scenarios was discussed, but there was disagreement about whether it would be meaningful and/or useful to assign probabilities to the different scenarios. It was pointed out that there is no great divergence between the scenarios in the period up until 2040, so that scenario uncertainty is likely to be a relatively minor consideration for decision making with respect to most impacts. Roger Street outlined how the UKCIP scenarios are likely to be generated in future, and pointed out that probabilistic information on climate change in the UK will be available from 2008 onwards. Finally, there was an emphasis upon the importance of ensuring thorough documentation of scientific work: Jonathan Rougier demonstrated the wiki which the RAPID group have been using to record their work with the GENIE model, whilst Robin Hankin argued for the widespread use of open source software - notably R - as a means of ensuring reproducibility.
[Webpage maintained by Adam Butler, last modified 6 October 2006]