
Article https://doi.org/10.1038/s41467-024-53046-2
Nature Communications (2024) 15:8855

No universal mathematical model for thermal performance curves across traits and taxonomic groups

Dimitrios-Georgios Kontopoulos 1,2,3, Arnaud Sentis 4, Martin Daufresne 4, Natalia Glazman 1, Anthony I. Dell 5,6 & Samraat Pawar 1

Received: 18 September 2023
Accepted: 27 September 2024

1 Department of Life Sciences, Imperial College London, Silwood Park, Ascot, Berkshire, UK. 2 LOEWE Centre for Translational Biodiversity Genomics, Frankfurt, Germany. 3 Senckenberg Research Institute, Frankfurt, Germany. 4 INRAE, Aix Marseille University, UMR RECOVER, Aix-en-Provence Cedex 5, France. 5 National Great Rivers Research and Education Center, East Alton, Illinois, USA. 6 Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA. e-mail: [email protected]

In ectotherms, the performance of physiological, ecological and life-history traits universally increases with temperature to a maximum before decreasing again. Identifying the most appropriate thermal performance model for a specific trait type has broad applications, from metabolic modelling at the cellular level to forecasting the effects of climate change on population, ecosystem and disease transmission dynamics. To date, numerous mathematical models have been designed, but a thorough comparison among them is lacking. In particular, we do not know if certain models consistently outperform others and how factors such as sampling resolution and trait or organismal identity influence model performance. To fill this knowledge gap, we compile 2,739 thermal performance datasets from diverse traits and taxa, to which we fit a comprehensive set of 83 existing mathematical models. We detect remarkable variation in model performance that is not primarily driven by sampling resolution, trait type, or taxonomic information. Our results reveal a surprising lack of well-defined scenarios in which certain models are more appropriate than others. To aid researchers in selecting the appropriate set of models for any given dataset or research objective, we derive a classification of the 83 models based on the average similarity of their fits.

All physiological, ecological and life history traits of ectotherms, from cellular metabolic rates to population growth rates and species interactions, are strongly influenced by temperature. The relationship between trait performance and temperature is known as the “thermal performance curve” (TPC) or the “thermal reaction norm” (Fig. 1a) and is unimodal and usually asymmetric1,2. Determining the most appropriate TPC model for a given trait dataset and how the shape of the TPC varies across trait types, taxonomic groups, and environments have a wide range of applications across biological systems and levels of organisation, from cellular metabolic modelling3–5 and ontogenetic growth6–8, to population9–12, community13–15, ecosystem16–18, and disease dynamics19–21. TPCs have also recently become integral to predicting the effects of climatic warming as well as thermal fluctuations on biological systems22–25.

A wide variety of unimodal TPC models have been developed since the first two (one symmetric and one asymmetric) were proposed by Janisch back in 1925 26. TPC models span the spectrum of phenomenological (e.g., based on a modification of the Gaussian distribution27,28) to mechanistic (based on biochemical kinetics29–33) mathematical equations. Many of the phenomenological models were initially developed for specific traits or species groups (e.g., for the development rate of arthropods34–36).

This smorgasbord of TPC models prompts the question of whether certain models are more appropriate than others for specific trait data.
For example, mechanistic models derived from biochemical kinetic principles were most often developed under the assumption that a single rate-limiting enzyme governs the shape of the TPC29–33,37,38. Because of this, such models are commonly assumed to strongly outperform phenomenological models for fundamental physiological traits (e.g., respiration, photosynthesis). In contrast, the TPCs of more “emergent” traits, such as overall dynamic body acceleration, resource consumption rate, and growth rate, cannot be straightforwardly linked to the activity of a single rate-limiting enzyme and are expected to be better described by phenomenological models whose parameters more accurately capture various features of TPC shape (e.g., Tmin, Tpk, and Tmax; Fig. 1a). It is worth noting that, even for fundamental physiological traits such as metabolic rate, the aforementioned assumption of a single rate-limiting enzyme is an approximation39–42. Another assumption is that for high-resolution datasets (numerous temperature treatments over a sufficiently wide range43), parameter-rich models will systematically outperform simple alternatives, e.g., revealing two different gradients at the rising part of the TPC43–45. Such fine-scale variation in TPC shape is assumed to be generally present but hard to detect because most thermal performance datasets tend to be small46.

Whether such assumptions are valid has yet to be objectively and systematically determined, with all previous such studies being limited in their scope in terms of the number and types of models compared, types of traits, and diversity of taxonomic groups27,47–52 (Supplementary Table 1 in Supplementary Note 1). Even in recent studies that introduce new general models (expected to apply well to wide classes of datasets)31–33,53, a thorough performance comparison between the new model and pre-existing ones is typically missing.
As a result, in studies quantifying the temperature dependence of experimental trait data, TPC models are typically chosen (semi-)arbitrarily and with limited justification. This practice could occasionally worsen the signal-to-noise ratio of a dataset or, worse, introduce model-specific biases that could be hard to control for51 (see Fig. 1b). For example, consider a hypothetical trait whose TPC is strictly symmetric. For such a trait, models that force the fall of the TPC to be steeper than its rise will necessarily yield misleading estimates of the gradients before and after the thermal optimum. Furthermore, some datasets may not allow adequate statistical power to objectively estimate all parameters of a complex and parameter-rich model (“parameter unidentifiability”), in which case multiple sets of parameter estimates will produce effectively identical curves. One could then mistake this spurious variation in parameter estimates for real variation, rather than a purely statistical artefact, and attempt to come up with mechanistic explanations for it.

To fill this gap, here we compile an extensive set of TPC models that covers practically all models that have been proposed to date, which we fit to multiple experimentally-determined thermal performance datasets from the literature. We then compare models on the basis of (i) how well they fit experimental data and (ii) how similar their fits tend to be. This allows us to address three key questions:

Are there models that consistently outperform others across any dataset?
Do models with more parameters tend to outperform simpler alternatives as the sampling resolution of the dataset increases?
Do mechanistic models fit fundamental physiological traits better than phenomenological models?

Results and discussion

Thermal performance datasets analysed in this study

We compiled 3598 previously published datasets of thermal performance (i.e., measurements of trait performance vs temperature for a single taxon from a specific location and study), with at least four distinct temperatures per dataset (see “Methods”). These were filtered down to 2739 datasets54, spanning more than 100 traits from all seven kingdoms55 and from 39 phyla (Fig. 2). For almost all datasets, trait performance measurements were obtained under constant rather than fluctuating temperatures, even though the latter should offer a closer approximation to the conditions naturally experienced by a given species.
Even for datasets with constant temperatures, the time allowed for acclimation to each experimental temperature57 was often not reported explicitly. We should emphasise here that the aforementioned data gaps did not prevent us from addressing the three key questions of the present study. Nevertheless, future studies could investigate whether the timescale of temperature shifts systematically influences (a) the shape of the TPCs of diverse traits and taxa and (b) the performance of alternative mathematical models.

Fig. 1 | The thermal performance curve (TPC) illustrates the influence of temperature on the performance of a biological trait of an individual ectotherm, a population, or even a whole ecosystem. TPCs can be estimated by fitting a wide range of nonlinear mathematical models to trait measurements obtained at multiple temperatures. Some parameters commonly included in such models are explicitly shown in panel (a). Tmin and Tmax are the temperatures above and below which, respectively, trait values are positive. Tpk is the temperature at which the TPC reaches its maximum height (Bpk). B0 is the performance at a reference temperature (Tref) below Tpk, often quantified for comparison of baseline performance across TPCs from different individuals or species63. Panel (b) shows three different models (see Supplementary Note 7) fitted to net photosynthesis rate measurements of the eastern daisy fleabane flowering plant (Erigeron annuus)79. Note that the resulting TPCs differ in their Tpk (dashed lines) and Bpk values, as well as in their degree of skewness, with the TPC of the Janisch I model being fully symmetric. Source data for panel (b) are provided as a Source Data file.

Comparison of model performance across datasets

We collected published TPC models that capture the entire TPC (i.e., not just its rise or fall) using Google Scholar (see “Methods”). We did not include models that can only be fitted to multiple thermal performance datasets at the same time (e.g., through a hierarchical Bayesian approach) and not to individual datasets. We also ensured that each included model could adequately fit (see “Methods”) at least a single dataset from a representative subset of our data compilation. This process yielded 83 distinct TPC models (see Supplementary Note 7 for their equations, free parameters, and references).

We fitted the 83 models separately to each thermal performance dataset and, for each fit, calculated the value of R² and Akaike Information Criterion, using the correction for small sample size (AICc58; see Box 1). After removing model fits with an R² value below 0.5 (see Supplementary Figs. 1 and 2 in Supplementary Note 2 for information on retained model fits), we calculated the AICc weight of each model for each thermal performance dataset. We also constructed a matrix of median pairwise Euclidean distances among models, from which we inferred a dendrogram of models to identify those that generally produce very similar fits (Fig. 3).

There is no universal model. The first question that we addressed was whether some models systematically outperform others, irrespective of the dataset. This was not the case as model performance varied strongly across thermal performance datasets, with no model consistently reaching high AICc weights throughout (Fig. 3). It is worth noting, however, that certain models (e.g., Gaussian, simplified Briere I, second-order polynomial, Mitchell-Angilletta, Eubank) performed relatively well (with AICc weights greater than 0.2) across many datasets.

Parameter-rich models do not strongly outperform simpler models as sampling resolution increases. To understand if complex models

BOX 1: Model performance metrics used in this study

R²: A measure of the absolute performance of a model against a given dataset. It represents the proportion of variance in the response variable that is explained by the model. R² can take values between 0 and 1, albeit negative values can occur in some cases.
Adding extra parameters to a given model generally increases the R² but can lead to overfitting, i.e., treating the noise in the data as true signal. When overfitting occurs, the model will reach a very high R² against the data on which it was trained, but much lower R² values against new observations.

AICc: A measure of the relative performance of alternative models against a given dataset. It penalises models with large numbers of parameters and, therefore, can be used to avoid cases of overfitting. AICc values can range from −∞ to ∞, and when comparing alternative models, the most appropriate model for a given dataset would be the one with the lowest AICc value. Note that AICc values are specific to a given dataset. For objectively comparing the performance of different models across multiple datasets, AICc weights can be used instead.

AICc weights: A standardised measure of the relative performance of alternative models against a given dataset. AICc weights take values between 0 and 1, with their sum being equal to 1. In the case where a single model vastly outperforms all fitted alternatives, that model will have an AICc weight of ≈1, whereas all other models will have AICc weights of ≈0. At the opposite extreme, if all fitted models perform equally well against a dataset, their AICc weights will all be equal to 1 divided by the model count. It is worth stressing that, in contrast to the R², AICc weights cannot be used to evaluate the absolute performance of a model against a dataset. This makes AICc weights meaningless in the case where no model can fit the data well enough. For this reason, an R² cutoff (e.g., 0.5 in the present study) is typically used to exclude datasets that are poorly fitted by all candidate models.
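The three metrics in Box 1 can be illustrated with a short Python sketch. This is not the authors' code: it assumes least-squares fits with Gaussian errors, for which AIC is commonly written as n·ln(RSS/n) + 2k, and it uses the standard small-sample correction and Akaike-weight formulas. The function names and the convention for counting k (the number of estimated parameters) are assumptions made for illustration only.

```python
import math


def r_squared(observed, predicted):
    """Proportion of variance explained: 1 - RSS / TSS.

    Can be negative when the model fits worse than the mean of the data.
    """
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - rss / tss


def aicc_least_squares(rss, n, k):
    """AICc for a least-squares fit (assumed Gaussian errors).

    rss: residual sum of squares, n: number of data points,
    k: number of estimated parameters (counting convention is an assumption).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    if rss == 0:
        # A perfect fit gives ln(0), i.e. AICc = -infinity.
        return float("-inf")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def aicc_weights(aicc_values):
    """Convert the AICc values of competing models (same dataset) to weights.

    Weights lie between 0 and 1 and sum to 1.
    """
    best = min(aicc_values)
    if math.isinf(best):
        raise ValueError("weights are undefined when a model has AICc = -inf")
    relative = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


if __name__ == "__main__":
    # Four equally good models share the weight equally (1 / model count).
    print(aicc_weights([10.0, 10.0, 10.0, 10.0]))
    # One model far better than the other: its weight approaches 1.
    print(aicc_weights([0.0, 100.0]))
```

Because the weights are computed relative to the best model for a dataset, they say nothing about whether that best model fits well in absolute terms, which is why an R² cutoff is applied first.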
Fig. 2 | Characteristics of thermal performance datasets included in this study. Panels (a, b, and c) show the breakdown of the most common kingdoms, phyla, and traits, respectively, in our data compilation. Panel (d) shows the number of thermal performance datasets to which various models could be fitted, as well as datasets that were excluded from this study. For more details, see the Methods section. Source data are provided as a Source Data file.
Panel (d) legend: datasets quantified with models with 3, ≤ 4, ≤ 5, ≤ 6, or ≤ 7 parameters; datasets excluded because of model fit(s) with AICc = −∞ or no model fits with R² ≥ 0.5.
Fig. 3 | Comparison of the 83 TPC models across 2739 thermal performance datasets. Left: dendrogram of models based on the similarity of the fits that they produce. Branch lengths represent median Euclidean distances among resulting TPC fits (see “Methods”). Models that branch off very close to each other typically produce nearly identical fits, whereas models separated by large distances tend to produce highly distinct fits. Models based on biochemical kinetics (mechanistic) are shown in pink, whereas phenomenological models are in black. Right: The AICc weight (from 0 to 1) of each model (rows) for each dataset (columns). Note that columns have been ordered so that datasets with similar patterns of AICc weights are within close proximity of each other. The figure was rendered using the ggtree80 (v. 2.2.4) and the ComplexHeatmap81 (v. 2.12.1) R packages. Source data are provided as a Source Data file.
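The distance matrix behind the dendrogram in Fig. 3 can be sketched as follows. This is a minimal Python illustration, not the authors' procedure (which is described in their Methods): it assumes that every fitted curve has already been evaluated on a shared temperature grid, and rescaled if needed, so that curves from the same dataset are directly comparable. The function names and the layout of the `fits` dictionary are hypothetical.

```python
import math
from statistics import median


def euclidean(curve_a, curve_b):
    """Euclidean distance between two curves sampled at the same temperatures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)))


def median_distance_matrix(fits):
    """Median pairwise Euclidean distance among models across datasets.

    fits: {model_name: [curve_for_dataset_0, curve_for_dataset_1, ...]}
    Each curve is a list of predicted trait values; use None where a model
    was not fitted (or its fit was discarded) for that dataset.
    Returns a nested dict: matrix[model_a][model_b] -> median distance.
    """
    names = sorted(fits)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Only datasets where both models have a retained fit contribute.
            distances = [
                euclidean(curve_a, curve_b)
                for curve_a, curve_b in zip(fits[a], fits[b])
                if curve_a is not None and curve_b is not None
            ]
            d = median(distances) if distances else float("nan")
            matrix[a][b] = matrix[b][a] = d
    return matrix


if __name__ == "__main__":
    # Toy example: three hypothetical models, two datasets, two grid points.
    toy_fits = {
        "model_A": [[0.0, 0.0], [1.0, 1.0]],
        "model_B": [[3.0, 4.0], [1.0, 1.0]],
        "model_C": [[0.0, 0.0], [1.0, 2.0]],
    }
    for name, row in median_distance_matrix(toy_fits).items():
        print(name, row)
```

A matrix of this kind could then be passed to a hierarchical clustering routine to obtain a dendrogram; the specific clustering method is not stated in this section, so none is assumed here.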
