Tools for handling changes of support

Combining data sources where system variables are observed at different locations or scales.

The need for more information about natural processes often motivates combining observations from different data sources, which are typically collected at different times, locations and scales. Reconciling such mismatched observations so that valid inferences can be drawn is challenging, and is known as a “change of support problem” (COSP).

While the statistical literature on COSPs is well developed, many conventional change of support models rely on estimating fine-scale properties of the spatio-temporal patterns, and so may not work well for sparse data. Furthermore, accessible and versatile software implementations are still lacking, as is guidance on how to fit these models. The approach taken here sought to address these two limitations by using multiple likelihoods to analyse different data streams jointly, while allowing information to be shared between them through a hierarchical Generalized Additive Model (GAM) framework. A benefit of these models is the ability to decompose spatio-temporal variation into distinct components whose mean, smoothness and (unstructured) error variance may or may not be shared between data sources. This gives the analyst substantial flexibility to represent different assumptions about the natural processes and the observation error underpinning each data source, and to draw inferences about the respective sources of variation.
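As an illustration of the decomposition described above, one possible form of such a joint model is sketched below. The notation is an assumption made for this sketch (it is not taken from the project's vignettes): $j$ indexes data sources, $i$ indexes observations within a source, $\mathcal{D}_j$ is the likelihood chosen for source $j$, and $g_j$ is its link function.

```latex
\begin{align}
  y_{ij} \mid \mu_{ij} &\sim \mathcal{D}_j\!\left(\mu_{ij}, \phi_j\right),
      && j = 1, \dots, J, \quad i = 1, \dots, n_j, \\
  g_j(\mu_{ij}) &= \alpha_j + f(\mathbf{s}_{ij}) + f_j(\mathbf{s}_{ij}),
\end{align}
% alpha_j : source-specific mean (set alpha_j = alpha to share it)
% f       : smooth spatio-temporal surface shared by all sources
% f_j     : smooth source-specific deviation, with its own smoothing
%           parameter lambda_j (or a common lambda if smoothness is shared)
% phi_j   : unstructured error variance / dispersion of source j
```

Under these assumptions, dropping $f_j$ forces all sources to follow the same surface, dropping $f$ gives each source an independent surface, and constraining $\alpha_j$, $\lambda_j$ or $\phi_j$ to be equal across sources expresses the belief that the corresponding component is shared.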

The current implementation assumes either that all measurements were taken over units small enough to be treated as points in space or time without significant loss of information, or that the data are too sparse for information on the shape and extent of the observation areas to be helpful. This assumption could, however, be relaxed within the same framework with relatively modest extensions to the models.
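To make the point-support assumption concrete, the sketch below contrasts it with one way an areal observation could be represented. This is an assumed formulation for illustration, not the project's stated extension: $\eta(\mathbf{s})$ denotes the linear predictor surface (intercept plus smooth terms), $A_k$ the area over which observation $k$ was taken, and an identity link is assumed so that averaging commutes with the predictor.

```latex
\begin{align}
  \text{point support:} \quad
    \mu_k &= \eta(\mathbf{s}_k), \\
  \text{areal support:} \quad
    \mu_k &= \frac{1}{|A_k|} \int_{A_k} \eta(\mathbf{s}) \, \mathrm{d}\mathbf{s}
          \;\approx\; \sum_{q=1}^{Q} w_{kq} \, \eta(\mathbf{s}_{kq}),
    \qquad \sum_{q=1}^{Q} w_{kq} = 1,
\end{align}
% s_kq : quadrature (integration) points placed inside A_k
% w_kq : quadrature weights, e.g. w_kq = 1/Q for a regular grid of points
```

Because the areal mean is then a weighted sum of the same smooth evaluated at several locations, it remains linear in the smooth's coefficients. As $A_k$ shrinks towards a single location, the areal expression reduces to the point-support one.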

The outputs of the project include the `ascot` R package for simulating and sampling from realistic spatio-temporal scenarios (including a Shiny app for intuitive visualization of different spatio-temporal processes), as well as vignettes that demonstrate how areal and point data can be modelled in the same analysis, using hierarchical Generalized Additive Models implemented with the popular `mgcv` R package.

[Figure: three panels over the same square area. Left: a heatmap in varying shades of yellow and green. Middle: point data sampled on a regular grid. Right: small square areas sampled at random.]

Left: underlying “true” spatial field (simulated). Middle and right: regular point sampling and random areal sampling of the same spatial field respectively, leading to two data types with different properties. Simulated with the `ascot` package.
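The two sampling schemes in the figure can be mimicked with a few lines of code. The following is a minimal, language-agnostic sketch in plain Python; it does not use the `ascot` package or reproduce its interface, and the field `true_field`, the function names and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def true_field(x, y):
    """Hypothetical smooth 'true' field on the unit square (illustrative only)."""
    return math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y) + x


def sample_points(n_side, noise_sd, rng):
    """Point support: noisy field values at the cell centres of a regular grid."""
    obs = []
    for i in range(n_side):
        for j in range(n_side):
            x = (i + 0.5) / n_side
            y = (j + 0.5) / n_side
            obs.append((x, y, true_field(x, y) + rng.gauss(0.0, noise_sd)))
    return obs


def block_mean(x0, y0, side, n_quad=10):
    """Mean of the field over a square block, by midpoint quadrature."""
    total = 0.0
    for i in range(n_quad):
        for j in range(n_quad):
            total += true_field(x0 + (i + 0.5) * side / n_quad,
                                y0 + (j + 0.5) * side / n_quad)
    return total / n_quad ** 2


def sample_areas(n_blocks, side, noise_sd, rng):
    """Areal support: noisy block means over randomly placed small squares."""
    obs = []
    for _ in range(n_blocks):
        # Lower-left corner chosen so the block stays inside the unit square.
        x0 = rng.uniform(0.0, 1.0 - side)
        y0 = rng.uniform(0.0, 1.0 - side)
        value = block_mean(x0, y0, side) + rng.gauss(0.0, noise_sd)
        obs.append((x0, y0, side, value))
    return obs


rng = random.Random(1)  # fixed seed so the sketch is reproducible
points = sample_points(n_side=8, noise_sd=0.1, rng=rng)
areas = sample_areas(n_blocks=20, side=0.1, noise_sd=0.05, rng=rng)
print(len(points), "point observations;", len(areas), "areal observations")
```

The two data sets differ in support (a location versus a square), in sampling design (regular versus random) and, in this sketch, in noise level, which is the kind of mismatch a change of support model has to reconcile.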

Contextual information, theory and vignettes can be found in the “Slides” and “Vignettes” menus of the delivery workshop webpage, under the section “Project 1: Data Fusion With Change Of Support”. The software is available as the ascot R package (stable version as of April 2024, or ongoing development version).

Investigators: Ana Couto (BioSS), Fergus Chadwick (formerly BioSS, now University of St Andrews), Dave L. Miller (BioSS, UKCEH), Thomas Cornulier (BioSS), Janice Scheffler (UKCEH), Peter Levy (UKCEH), Jackie Potts (BioSS)