
788 | Nature | Vol 628 | 25 April 2024

Article

Revealing uncertainty in the status of biodiversity change

T. F. Johnson1 ✉, A. P. Beckerman1, D. Z. Childs1, T. J. Webb1, K. L. Evans1, C. A. Griffiths1,10, P. Capdevila2,3,4, C. F. Clements2, M. Besson2,11, R. D. Gregory5,6, G. H. Thomas1, E. Delmas1,7,8 & R. P. Freckleton1,9

Biodiversity faces unprecedented threats from rapid global change1. Signals of biodiversity change come from time-series abundance datasets for thousands of species over large geographic and temporal scales. Analyses of these biodiversity datasets have pointed to varied trends in abundance, including increases and decreases. However, these analyses have not fully accounted for spatial, temporal and phylogenetic structures in the data. Here, using a new statistical framework, we show across ten high-profile biodiversity datasets2–11 that increases and decreases under existing approaches vanish once spatial, temporal and phylogenetic structures are accounted for. This is a consequence of existing approaches severely underestimating trend uncertainty and sometimes misestimating the trend direction. Under our revised average abundance trends that appropriately recognize uncertainty, we failed to observe a single increasing or decreasing trend at 95% credible intervals in our ten datasets. This emphasizes how little is known about biodiversity change across vast spatial and taxonomic scales. Despite this uncertainty at vast scales, we reveal improved local-scale prediction accuracy by accounting for spatial, temporal and phylogenetic structures. Improved prediction offers hope of estimating biodiversity change at policy-relevant scales, guiding adaptive conservation responses.

Accelerating rates of species extinction are driving global changes in biodiversity, threatening ecosystems and the services they provide1. In an attempt to reverse biodiversity declines, world leaders, policymakers and academics have called for action12.
Evidence-based actions require long-term datasets and rigorous modelling to reliably detect and attribute biodiversity change through time13,14. At present, some of the most influential estimates of biodiversity change are calculated using datasets such as BioTIME2, the Living Planet15 or the North
American Breeding Bird Survey3. Inferences from these abundance datasets have shaped policy16 and are considered by some to be a key pillar of global biodiversity monitoring17.

Biodiversity datasets are complex and typically subject to one or more sources of non-independence across the axes of time, space and evolution. This presents a challenge for analysis, as omission of even one of these sources of non-independence from a statistical model can lead to underestimation of uncertainty, incorrect trends and poorly resolved prediction, and ultimately undermines current interpretation of wildlife abundance trends18–20. A unifying feature of previous studies is that they are characterized by the consistent omission of one or more of these dependencies from their analysis. This imposes a risk that past estimates of abundance change (pointing to declines15,21, no net change18,22,23 and recovery24) may be unreliable.

Non-independence can be classified in a variety of ways, which we split into two core types: hierarchical, for which observations are pseudoreplicated or nested (for example, multiple trends for a given species, site or region in time); and correlative, for which observations become increasingly correlated (sometimes termed autocorrelation) when close in time25, space26 or phylogeny27. Under correlative non-independence, we may expect sequential abundance values in a time series to be more similar, and trends should be similar when near in space or in closely related species (Fig. 1). Although studies commonly account for hierarchical non-independence using features such as random effects in mixed models, a literature review covering hundreds of papers published in high-impact journals since 2010 revealed that studies rarely account for correlative non-independence across space (accounted for in 7% of studies), phylogeny (14%) or time (32%; Supplementary Table 1).
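To make the idea of correlative non-independence concrete, the following is a minimal sketch of two correlation structures of the kind described above. It is an illustration under stated assumptions, not the authors' implementation: the function names, the exponential distance-decay form, the `length_scale` parameter and all input values are hypothetical.

```python
import math

def ar1_correlation(rho, n_times):
    """Correlation between observations i and j under a first-order
    autoregressive process: rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]

def distance_decay_correlation(distances, length_scale):
    """Assumed exponential decay: correlation falls from 1 towards 0
    as pairwise distance grows."""
    return [[math.exp(-d / length_scale) for d in row] for row in distances]

# Hypothetical inputs, for illustration only.
temporal = ar1_correlation(rho=0.5, n_times=6)
site_km = [[0.0, 10.0, 500.0],
           [10.0, 0.0, 495.0],
           [500.0, 495.0, 0.0]]
spatial = distance_decay_correlation(site_km, length_scale=100.0)

# Adjacent time points are more strongly correlated than distant ones,
# and nearby sites more strongly than distant sites.
print(temporal[0][1], temporal[0][5])
print(spatial[0][1], spatial[0][2])
```

The same distance-decay idea could in principle be applied to pairwise phylogenetic distances between species; the kernel used here is only one of several possible choices.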
Further, no biodiversity model has yet been formalized to account for all three sources of correlative non-independence at the same time.

https://doi.org/10.1038/s41586-024-07236-z
Received: 23 November 2022
Accepted: 26 February 2024
Published online: 27 March 2024
Open access

1School of Biosciences, Ecology and Evolutionary Biology, University of Sheffield, Sheffield, UK.
2School of Biological Sciences, Biosciences, University of Bristol, Bristol, UK.
3Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals, Universitat de Barcelona (UB), Barcelona, Spain.
4Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain.
5RSPB Centre for Conservation Science, The Lodge, Sandy, UK.
6Centre for Biodiversity & Environment Research, Department of Genetics, Evolution and Environment, University College London, London, UK.
7Habitat, Montreal, Quebec, Canada.
8Institut des Sciences de la Forêt Tempérée, Université du Québec en Outaouais, Ripon, Quebec, Canada.
9Debrecen Biodiversity Centre, University of Debrecen, Debrecen, Hungary.
10Present address: Swedish University of Agricultural Sciences, Department of Aquatic Resources, Institute of Marine Research, Lysekil, Sweden.
11Present address: Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins, BIOM, Banyuls-sur-Mer, France.
✉e-mail: [email protected]

Here we show that ignoring non-independence has serious consequences for inference of biodiversity trends. We introduce the correlated effect model, which incorporates hierarchical non-independence and all three sources of correlative non-independence, and apply it to ten high-profile, multi-species datasets that have been used to infer abundance trends in global biodiversity2–11.
Combined, these datasets describe the abundance (including relative abundance and densities) patterns of more than 30,000 populations, representing about 3,100 species and about 6,000 unique locations, and are considered some of the best biodiversity monitoring datasets available.

Non-independence increases uncertainty

We compared our correlated effect model with two mixed-effect modelling frameworks that are commonly used and account only for hierarchical non-independence: random intercept and random slope (both described in Fig. 1). Across the 44 relevant studies identified in a literature search spanning 282 published papers, 43% (n = 19) used a version of the random intercept model and 50% (n = 22) used a version of the random slope model (Supplementary Table 1). Comparing these commonly applied approaches to the correlated effect model, we detect a pronounced shift in collective abundance trends (that is, the model-derived average rate of change in abundance across all species and locations), and show that existing approaches underestimate collective trend uncertainty and can misestimate direction (Fig. 2).

Collective abundance trend uncertainty (that is, the standard deviation (s.d.) around the abundance–time coefficient) was underestimated in all ten datasets in both the random intercept and random slope models. These underestimates are large, with uncertainty in the correlated effect model 26 times greater [95% confidence interval (CI): 14–47] than that in the random intercept model and 3.4 times greater [95% CI: 1.8–6.2] than that in the random slope model. Further, after accounting for correlative non-independence, we find instances in which the trend direction shifts and even reverses (for example, from negative to positive).
For instance, in the Living Planet dataset, a decreasing trend in the random intercept model shifts to a stable trend in the random slope model, before shifting back to a sharp albeit uncertain decrease after accounting for correlative non-independence. In three databases (the Living Planet, RivFishTIME and Atlantic reef fishes), the mean trends were more extreme under the correlated effect model, shifting away from zero (that is, no net change in abundance), although still highly uncertain. Across the three models, we observed complete agreement in trend direction and significance status (50% credible intervals) in only four of the datasets. At 95% credible intervals, we found no instances in which models agreed on trend direction and significance status.

Fig. 1 | Impact of correlative non-independence on collective abundance trends. The text and images show the objective, implicit and key features of large-scale abundance datasets, current approaches for analysis, the problem, its implications and the solution.

Objective: The average rate of change in population abundances across all species and locations (the collective trend) is vital to our understanding of biodiversity change.

Data: The collective trend is derived from datasets describing abundance patterns over time for multiple species and sites. These datasets are expected to contain phylogenetic, spatial and temporal structures. For instance:
Phylogeny: trends are shaped by species traits, a product of evolution, so closely related species should have more similar trends. Ma, million years ago.
Space: biodiversity threats are spatially clustered, so trends should be more similar when near in space.
Time: neighbouring abundance observations are likely to be more similar. For example, the abundance in point 1 is more similar to that in point 2 than in point 6.

Current approaches: Mixed models are commonly applied to derive the collective trend. The two main types (random intercept and random slope; see Methods) use a mixed modelling framework to account for variation in populations, species, genera, location and regions. At their core, both regress the log of abundance against time, but with key differences in random effects.
Random intercept: mean abundances vary for each population, species and location with a common trend.
Random slope: mean abundances and trends vary for each population, species and location.
Nested random effects are used in the random intercept and random slope models to recognize the implicit phylogenetic, spatial and temporal structures of biodiversity data (for example, species > genus > family). Correctly specified, this nesting can address pseudoreplication and produce valid inference.

Problem: However, nested random effects are probably a poor proxy for the complex phylogenetic, spatial and temporal structures in the data, potentially violating model assumptions around independence.

Implications: When phylogenetic, spatial and temporal structures are poorly characterized, violating independence assumptions, inference can be distorted, potentially misestimating the collective trend direction and uncertainty. For instance, recognizing the phylogenetic structure in site 1, the three species trends become two clade-level trends. At the level of the collective trend, ignoring this phylogenetic structure leads to the false detection of a significant increase, which vanishes once the phylogeny is included.

Solution: Tools already exist to capture these phylogenetic, spatial and temporal structures. However, the tools are able to account for only one or (in rare cases) two sources of non-independence, but to fairly represent biodiversity change, it is vital that phylogenetic, spatial and temporal non-independences are captured simultaneously. We introduce the correlated effect model, which builds three critical components into the hierarchical random slope model (the simultaneous capture of phylogenetic, spatial and temporal structures in one model), addressing non-independence and offering improved inference and prediction. We specify the following:
Species trends co-vary according to pairwise distance in phylogenetic branch lengths.
Spatial site-level trends co-vary according to pairwise distance (km) between sites.
Abundance observations exhibit first-order autoregressive temporal autocorrelation.
Correctly specifying these implicit data structures offers improved inference and prediction, which is critical to understanding biodiversity change.

Collective abundance trend uncertainty is likely to be underestimated when hierarchical terms (for example, random effects) fail to effectively represent the complex spatial, phylogenetic and temporal structures in the data (Extended Data Fig. 1). This is an apparently common phenomenon given that all ten datasets underestimate uncertainty, and across the ten datasets, we find that correlative terms proportionally account for approximately one-third of the variation in the data (spatial: mean = 0.34, s.d. = 0.3; phylogeny: mean = 0.41, s.d.
= 0.28), relative to the combined variance captured by the respective hierarchical and correlated terms. There is no comparable metric for the temporal term that describes the correlation between abundances instead of covariance between trends. Notably, the stark increase in uncertainty is not a consequence of simply introducing additional correlated terms. This is because uncertainty tends to increase substantially only when the correlated terms are capturing a high proportion of variance (β = 1.00, 95% CI: −0.19 to 2.21, P = 0.09; Extended Data Fig. 1). Through iteratively introducing the correlated terms into the random slope model (exploring six further model structures), it is apparent that uncertainty increases most after the inclusion of spatial correlation (Extended Data Fig. 2).

Predicting biodiversity change

Counterintuitively, accounting for correlative non-independence improves our capacity to make predictions 'out of sample' (that is, for a withheld subset of data not used to develop the model), despite the large uncertainty at the level of the collective trend. Part of the value of these abundance trends is that they can be used to estimate which species and locations are likely to be declining or recovering, and when.
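As a rough illustration of what an out-of-sample check involves, the sketch below withholds the final observation of a single time series, fits a trend to the remaining data and forecasts the withheld value. It assumes a simple ordinary least squares fit of log abundance against year as a stand-in for the mixed models discussed here; the function name and the abundance values are hypothetical.

```python
import math

def fit_log_linear(years, abundances):
    """Ordinary least squares fit of log(abundance) = intercept + slope * year."""
    logs = [math.log(a) for a in abundances]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical population time series (made-up values).
years = [2000, 2001, 2002, 2003, 2004, 2005]
abundances = [120.0, 115.0, 111.0, 104.0, 101.0, 96.0]

# Withhold the final observation, fit on the rest, then forecast it.
train_years, test_year = years[:-1], years[-1]
train_abund, test_abund = abundances[:-1], abundances[-1]
intercept, slope = fit_log_linear(train_years, train_abund)
forecast = math.exp(intercept + slope * test_year)

print(f"withheld: {test_abund}, forecast: {forecast:.1f}")
```

The withheld value never informs the fit, so the gap between forecast and observation measures predictive rather than in-sample accuracy.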
To evaluate whether the correlated effect model improves our ability to make local-scale predictions, we tested each model's ability to forecast new abundance observations and estimate population trends. For each dataset, we removed the final abundance observation in 50% of the population abundance time series and then evaluated each of the three models' ability to predict this value. Next, we conducted leave-one-out cross-validation to assess trend prediction, removing a population time series (that is, trend) from each dataset and testing each model's ability to recover this population's abundance trend. We repeated this cross-validation 50 times for each of the 10 datasets. In each dataset, we report predictive accuracy for each of these approaches as the percentage error (PE), a metric describing the median of the absolute percentage difference between predicted and observed values; for example, with a 5% error, an abundance on the log scale of 1 would become 1.05. Summarizing across datasets, we report the mean and s.d.

[Figure 2: ten panels, one per dataset (BioTIME, Living Planet, Breeding Birds, FishGlob, RivFishTIME, UK riverine fishes, Atlantic reef fishes, German vegetation, European biodiversity, Large carnivores), each plotting projected abundance against year for the random intercept, random slope and correlated effect models. The legend distinguishes significant increase, non-significant increase, non-significant decrease and significant decrease.]

Fig. 2 | Widely used statistical models misrepresent biodiversity abundance trends. Abundance trend projections across ten high-profile datasets under three different models. Circles represent the collective trend (the coefficient describing the change in abundance over time averaged across all species and locations) for each dataset in our three models (from left to right): random intercept, random slope and the correlated effect model that simultaneously accounts for temporal, spatial and phylogenetic correlative non-independence. We specify four categories of trend: significant increase (coefficient is positive and significant); non-significant increase (coefficient is positive but not significant; that is, no detectable change); non-significant decrease (coefficient is negative but not significant; that is, no detectable change); significant decrease (coefficient is negative and significant). Significance indicates that the coefficient does not overlap zero at a 50% credible interval. Coefficients and 95% credible intervals are available in Supplementary Table 4. We use the collective trend coefficient and 50% credible intervals (represented by shading) to produce abundance projections for each model in each dataset from an arbitrary baseline abundance of 100. Abundance projections cover the time span of the observed data and are presented on the log10 scale.

Across the 10 datasets, the correlated effect model estimated the final abundance observation with 16.1% error (s.d. = 7.5%), 1.51 times more accurately than the random intercept model (mean = 24.4%, s.d. = 16.2%) and 1.13 times more accurately than the random slope model (mean = 18.3%; s.d. = 10.5%).
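The percentage error metric can be sketched as follows. This is a minimal reading of the definition given in the text (median of the absolute percentage difference between predicted and observed values); the function name and the example values are hypothetical, and the authors' exact computation may differ.

```python
from statistics import median

def percentage_error(predicted, observed):
    """Median absolute percentage difference between predicted and observed values."""
    return median(abs(p - o) / abs(o) * 100.0 for p, o in zip(predicted, observed))

# Hypothetical log-scale abundances (made-up values, not from the datasets).
observed = [1.0, 2.0, 4.0]
predicted = [1.05, 2.2, 3.8]
pe = percentage_error(predicted, observed)  # individual errors are about 5%, 10% and 5%

# Ratio of two reported mean errors (random intercept versus correlated effect).
ratio = 24.4 / 16.1  # roughly 1.5

print(f"PE = {pe:.1f}%, relative accuracy = {ratio:.2f}")
```

Because the reported means are rounded, ratios recomputed from them will only approximate the published relative accuracies.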
The correlated effect model also performed best when estimating missing population trends, with an error of 18.3% (s.d. = 11.6%), 1.35 times more accurate than the random slope model (mean = 28.9%; s.d. = 25.5%). In one case, using the correlated effect model to capture the spatial, temporal and phylogenetic structures halved the trend prediction error, relative to the random slope model. The random slope model had a lower prediction error than the correlated effect model in only one dataset in the abundance assessment, and two datasets in the trend assessment.

[Figure 3: rows a–c (one per model) by four columns: Population level, Site level, Collective level and Projected trend. Population-level panels show M. daubentonii, M. emarginatus and M. nattereri at site 46.6° N, 2.4° W. Axes are abundance or projected abundance against year. Legend: all site-level trends; collective trend under random intercept, random slope and correlated effect.]

Fig. 3 | More complex models better represent population dynamics and improve the validity of conclusions across ecological scales. a–c, Example of how the three models (random intercept (a), random slope (b) and correlated effect (c)) describe abundance patterns at different ecological scales (finer ecological scales on the left).
The population-level column showcases how each of the three models produces different estimates of abundance trends (lines are the median values with 95% credible interval shading) for all three bat species (genus Myotis) with data in a given location, with data points representing the observed abundance values. The site-level column depicts how the species' trends, under different models, influence the site-level trend (that is, a trend for a given location; black), in which the line and 95% credible intervals describe the median trend and variability in trend (respectively) across all species in the given location. At the collective level, the median trend for each unique site is represented by a faded grey line, and the median collective trend coefficient and 95% credible intervals are depicted by the coloured line and shading. At the site and collective levels, credible intervals solely describe uncertainty in the main parameter of interest, the rate of change coefficient, not the intercept. The final column describes how a hypothetical population would change under the median collective trend coefficient and 50% credible intervals projected from a relative baseline abundance of 100. This example is based on data in the Living Planet. In each plot, we restrict the time frame to the temporal extent of the population-level trends (1987–2015), instead of the total temporal extent of our Living Planet sample.
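The projected-trend panels can be understood through a short sketch: starting from a baseline abundance of 100, a trend coefficient on the log scale is compounded over the years of interest. The code below assumes a natural-log scale and a constant annual rate; the function name, the coefficient and the interval bounds are hypothetical values chosen for illustration, not estimates from the paper.

```python
import math

def project_abundance(trend, years, baseline=100.0):
    """Project abundance from a baseline, assuming log abundance changes
    linearly with time at rate `trend` per year (natural-log scale assumed)."""
    start = years[0]
    return [baseline * math.exp(trend * (y - start)) for y in years]

years = list(range(1987, 2016))

# Hypothetical median coefficient and 50% credible interval bounds.
median_trend, lower, upper = -0.005, -0.02, 0.01

median_proj = project_abundance(median_trend, years)
lower_proj = project_abundance(lower, years)
upper_proj = project_abundance(upper, years)

print(f"{years[-1]}: {median_proj[-1]:.1f} "
      f"({lower_proj[-1]:.1f} to {upper_proj[-1]:.1f})")
```

An interval that spans zero, as in this made-up example, yields projections that fan out on both sides of the baseline, which is why such a trend would be read as showing no detectable change.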
