Early-warning signals (potentially) reduce uncertainty in forecasted timing of critical shifts

. Despite the identification of early-warning signals precluding ecosystems regime shifts, limited evidence exists that they can be used to forecast the actual timing of a critical shift. Here, we propose a probabilistic Bayesian approach to forecast the timing of a shift by combining uncertain prior information about the ecosystem dynamics (parameters and drivers) and sampled spatial and temporal correlation and variance of ecosystem states, which are well known early-warning signals. For an ecosystem of logistically growing vegetation under linear increase in grazing pressure, we show that the use of sampled early-warning signals results in lower prediction uncertainty in forecasted timing of shifts compared to forecasts made with sampled mean state variables. In addition, we show that uncertainty in ecosystem parameters decreases well ahead of a shift. An important conclusion of our study is that the use of early-warning signals in forecasting of shifts is promising, provided that a large number of samples are collected ( n ’ 10 4 in our study). This explains the limited success of finding early warning signals from field studies of real world ecosystems.


INTRODUCTION
Non-linear systems may undergo abrupt transitions as a result of a gradual change in system drivers. Such regime shifts, or critical transitions, have been described for a large range of spatiotemporal systems, including ecosystems (Scheffer et al. 2009). Regime shifts are considered undesirable because they cause large changes that may be irreversible. For designing adaptation strategies to avoid shifts, the ability to forecast the probability and timing of a shift is essential (Andersen et al. 2009, Biggs et al. 2009, Contamin and Ellison 2009.
However, it is hard to detect upcoming regime shifts because mean values of observable system state variables show little change before a transition occurs (Scheffer et al. 2009). This problem has sparked research focused on finding alternative properties of a system that show a more marked change before a transition is coming. It turns out that such early-warning signals exist and include various higher order statistics of state variables (Kleinen et al. 2003, Brock and Carpenter 2006, Kéfi et al. 2007, Chisholm and Filotas 2009, Dakos et al. 2009, Fernández and Fort 2009, Takimoto 2009, Drake and Griffen 2010, Carpenter et al. 2011. Despite the identification of early-warning signals, limited evidence exists that they can be used to actually forecast the timing of a shift (Biggs et al. 2009, Contamin andEllison 2009). This is because the use of early-warning signals in forecasting is hampered by a number of issues. One is that changes in higher order statistics sufficiently well ahead of a shift are often considered too small to be detected by sampling schemes (Biggs et al. 2009). Also, absolute values of observed early-warning signals are considered to have limited value in estimating the proximity of a shift (Van Nes and Scheffer 2007, Biggs et al. 2009, Kéfi et al. 2010. This is because changes in early-warning signals are considered too gradual to provide information on the distance to the shift or absolute values depend on ecosystem characteristics that are generally unknown. In addition, the uncertainty of sampled earlywarning signals may hamper their use to forecast shifts. Higher order statistics calculated from samples have a high uncertainty compared to first order statistics (de Gruijter et al. 2006). This may seriously reduce their information content for forecasting shifts even when early-warning signals show considerable changes ahead of a shift. It might even be the case that uncertainty in early-warning signals calculated from observations overshadows in many cases observational error, making them virtually useless in detecting shifts in ecosystems. Surprisingly, when analyzing the value of higher order statistics as predictors of regime shifts, this has not been considered thus far.
These issues raise the following questions: Is it possible to quantify the timing of a shift using early-warning signals calculated from a sampled ecosystem? And, what is the associated uncertainty in this forecasted timing, relative to the case when (notably small) trends in mean state variables are used as indicators instead of earlywarning signals? To examine these questions we use a probabilistic Particle Filter (Van Leeuwen 2009, Karssenberg et al. 2010) framework that in principle allows forecasting the timing of a shift, including the associated uncertainty in the timing. We encode prior, but limited, ecosystem knowledge as a dynamical ecosystem model of logistic vegetation growth (Noy-Meir 1975, Dakos et al. 2009, Fernández and Fort 2009, Guttal and Jayaprakash 2009 where uncertainty in all parameters and drivers is defined by probability distributions. These uncertainties lead to a high forecast uncertainty of the timing of a shift if no additional observational data were available. To reduce this uncertainty, however, we take samples of a key ecosystem state variable (biomass) at given moments in time, taking into account sampling error caused by limited sampling. As our aim is a proof of concept, we use a synthetic (model generated) world from which we sample. Early-warning signals calculated from these samples are combined with the ecosystem model using a numerical solution of Bayes' theorem.

Sequential Bayesian filtering
When forecasting the timing of a regime shift in a particular study area, we can rely on two sources of information: the prior information consisting of a dynamical ecosystem model representing our understanding of the ecosystem and the observational data providing site-specific information on how the ecosystem changes over time. In sequential Bayesian Filtering, these two sources of information are combined using Bayes' theorem. Here we provide a short explanation of the statistical framework, for a full account the reader is referred to Appendix A.
The prior knowledge is described by a forward model and its parameter values. In cases when the ecological process behind the regime shift is partly unknown, one would need to represent the prior knowledge by a set of possible model structures (Clark et al. 2011). As existing statistical methods (e.g., Beven 2002, Ajami et al. 2007, Bulygina and Gupta 2010 dealing with a prior defined by multiple model structures do not fully address simultaneous identification of the model structure and parameters of spatial forward models in a Bayesian framework, it is assumed here that the mechanism behind the regime shift is known. So, the model structure of the forward model is kept fixed. The parameters, drivers and state variables in the model, however, are considered to be uncertain, and the full state of the model is given by a joint probability density of all variables (state variables, parameters, drivers), denoted by p(x t ), with a time index t, t ¼ 1, 2, . . . , 3500. The system is monitored by collecting biomass samples at observation times t. Observation error (e.g., instrument error), and errors due to limited sampling are inherent in the v www.esajournals.org 2 February 2012 v Volume 3(2) v Article 15 calculation of early-warning signals y t , resulting in a joint probability density of all early-warning signals, denoted as p(y t ). Below, y t is referred to as ''observations''.
While monitoring, probabilistic forecasts of the full state p(x t ) are made by running the ecosystem model forward in time. At observation times, the full state is adjusted by assimilating observations into the model using Bayes' Theorem: with p(x t jy t ) the posterior model state, i.e., the joint probability density function of the full model state x t given the observations y t , and p(y t jx t ) the joint probability density function of the observations y t given the model state In Eq. 2, H is the measurement operator transforming the model state to the observation, which is here one because no transformation is required. The matrix R t is the covariance matrix of the measurement error at t. It represents all sources of uncertainty in the observations y t , including correlation (covariances) between the observation errors. Next, a new set of N realizations is created that consists of exact copies of a subset of realizations. This is done with Residual Resampling Chen 1998, Weerts andEl Serafy 2006), which copies each realization a number of times proportional to its posterior probability p x ðiÞ t jy t (Eq. 2). After resampling, the new set of realizations is integrated to the next observation time step using the system model, and the same algorithm is applied. An intuitive explanation of the procedure is that it removes at each observation time those model realizations that do not match the observations available for that time step, while the remaining model realizations are duplicated. At the last observation time, all observations have been incorporated into the ecosystem model resulting in the posterior probability distributions of the full model state and the forecasted timing of the ecosystem shift.

Prior information: the ecosystem model and its drivers
The prior information of the system is assumed to be a spatial model of logistically growing vegetation (Noy-Meir 1975). This system shows a critical transition towards lower biomass when a grazing threshold is exceeded. We selected this system because higher order statistics as earlywarning signals have been described elsewhere (Dakos et al. 2009, Fernández and Fort 2009, Guttal and Jayaprakash 2009. Biomass is modelled in response to a linear increase in grazing over time. Biomass B x,y,t at cell coordinates (x, y), with x ¼ 1, 2, . . . , 60, y ¼ 1, 2, . . . , 60 is: with D x;y;t ¼ B x;yÀ1;t þ B x;yþ1;t þ B xþ1;y;t þ B xÀ1;y;t À 4B x;y;t ; ð4Þ using B x,y,t ¼ k at t ¼ 0. We consider situations with limited prior information, thus all drivers and parameters are taken uncertain, represented by spatially uniform random variables with v www.esajournals.org a uniform probability distribution. In most cases, prior ranges of parameter values can be estimated from literature, calibration against historic data, or direct observation in the field. These prior ranges of values can be used to define the prior probability distributions of the parameters. Here we assume prior information is available that can narrow the prior distribution as follows: in Eq. 3, r is the growth rate taken uniform in [0.6, 1.0], k is the carrying capacity with a prior in [9,11], c 0 is the initial grazing pressure with a prior in [0 Results not shown here indicate that increasing the interval of the priors implies smaller information content, and thus slightly increases the forecast uncertainty and the range of the posterior distributions of the parameters for all scenarios described in the paper. The relative performance of scenarios remains approximately the same. In Eq 3, e x,y,t is a random variable that is independent for each location and time step with e x,y,t ; N(0, 0.64), representing model structural error. For details on the numerical solution of Eqs. 3-4 and closed boundary conditions, see Appendix B. In all calculations, the timing of the shift is the time t at which the average biomass of the grid cells falls below 2.0.
Observational data: sampled early-warning signals As real world observations are not used, we generate a synthetic world by creating one realization of the model (Eqs. 3 and 4), using deterministic model input values corresponding to the median values of the prior probability distributions, i.e., r ¼ 0.
Let V x,y,t be the time series of synthetic world biomass created in this way.
To mimic actual sampling, we add observation errors to retrieve the random variable biomass C x,y,t , calculated as C x,y,t ¼ V x,y,t þ S þ R x,y,t , with, S the systematic error, a random variable with S ; N(0,r 2 s ); R x,y,t , the random error, a random variable that is spatially independent with R x,y,t ; N(0,r 2 r ). Note that S is systematic and thus represents an offset (due to instrument error) in the observed value that is constant and spatially uniform. R x,y,t is random error and thus its offset is different for each time step and measurement location. Observed early-warning signals are retrieved by sampling C x,y,t . We evaluate the use of spatial and temporal correlation and variance, given as semivariance values, because these are known early-warning signals for our ecosystem (Dakos et al. 2009, Fernández and Fort 2009, Guttal and Jayaprakash 2009. To compare spatial and temporal semivariance as earlywarning signals, a sampling scheme is applied that can be used to calculate both types of earlywarning signals, using the same number of samples. To retrieve the value of these earlywarning signals at a particular observation time t, biomass is sampled for each time step t in a period of 50 time steps preceding t. Each of these 50 time steps, 40 samples are taken on a regular grid with a distance of 5 grid cells between samples. In addition, 10 samples are taken at a distance of one grid cell in the x or y direction away from randomly selected sampling locations on the regular grid, in order to calculate shortrange spatial variation (Appendix C). This scheme is feasible and efficient for calculation of spatial semivariance (de Gruijter et al. 2006). Let this set of 50Á50 ¼ 2500 sampling locations in the temporal and spatial domains at a particular observation time t be the domain D(t).
Because combining the modelled ecosystem state and sampled early-warning signals requires the covariance matrix R t (Eq. 2) of the error in the sampled early-warning signals, we perform a Monte Carlo simulation to calculate R t . A number of M ¼ 500 realizations C ðmÞ x;y;t with m ¼ 1, 2, . . . , M, are drawn from C x,y,t by randomly offsetting the regular sampling grid for each realization (see Appendix C). By this procedure R t includes observation error represented by the systematic error and random error incorporated in C x,y,t , and sampling error (de Gruijter et al. 2006), i.e., the error caused by finite sampling from the reality. For each realization m, a number of early-warning signals are calculated, which are semivariance values at different separation intervals in the spatial or temporal direction. At a given observation time v www.esajournals.org t, the semivariance c ðmÞ t ðhÞ of sample realization m is calculated within the domain D(t) as (Goovaerts 1997): C ðmÞ ðs i Þ À C ðmÞ ðs i þ hÞ 2 À r 2 r ; In Eq. 5, c ðmÞ t ðhÞ is the semivariance, calculated for time step t; s i is a location (x, y, t). The semivariance is calculated from biomass at L pairs of locations that have a separation distance between h min and h max , and represent semivariance in either the spatial or the temporal direction, with h the average separation distance of the L pairs, also referred to as the lag.
The calculation of semivariance at observation time steps allows to assimilate these earlywarning signals into the ecosystem model as described in the previous sections, which provides a means to evaluate their use in forecasting the timing of a shift. To compare the use of these early-warning signals with the alternative approach where an observed mean state variable, instead of higher order statistics such as semivariance, is used, we also calculate mean biomass at observation time steps. To make sure these mean values are calculated using the same number of observational data, which is required to make an unbiased comparison with the use of early-warning signals, the mean biomass M ðmÞ t is calculated in the same domain D(t) which was also used in Eq. 5 for the calculation of semivariance values. In the Bayesian filter used to assimilate the observations into the ecosystem model, the median of c ðmÞ t ðhÞ at different separation distances in space and time and/or the median of M ðmÞ t are taken as the observation vector y t (Eq. 2), while R t is calculated using the m ¼ 1, 2, . . . , M Monte Carlo samples.

Relations between patterns and timing of transition
To explore the relation between early-warning signals and timing of a shift, we first describe the results of a single realization of the ecosystem model (Fig. 1). This realization was also used as synthetic data in the sequential Bayesian filtering procedure that will be described in the second part of the results. The linear increase in grazing pressure over time results in a small reduction in biomass until the abrupt shift towards a much lower biomass occurs at t ¼ 1866 (Fig. 1A). On top of this trend, biomass varies randomly in response to the small random addition or removal of biomass e x,y,t at each time step (Eq. 3). This random variation in biomass increases towards the shift, both when biomass is observed at a single location over time (Fig. 1B) and when biomass is observed at a single time step over the spatial dimension (Fig. 1C). Also, the patterns in biomass change; periods with high or low biomass become longer (Fig. 1B) and patches become larger towards the shift (Fig. 1C). These trends are captured by the early-warning signals used. The increase in variation in biomass towards the shift results in an increase in semivariance at all separation distances, both for the spatial and the temporal semivariance (Fig. 1D, E). The increase in patch size (in space and time), results in an increase in correlation length as shown by the variograms in Fig. 1E, and a decrease in the ratio of the semivariance at separation distance one over semivariance at separation distance 25, indicative for a change in correlation length (Fig. 1D). The results confirm the results found by (Dakos et al. 2009, Fernández and Fort 2009, Guttal and Jayaprakash 2009, who studied the same system, albeit with a different input error model, and without transient inputs. Although semivariance values change well ahead of the transition as shown above, measurements of semivariance can only be used for forecasting the timing of a shift if semivariance observed at a particular time step is related to the timing of the shift. To explore this relation, we run the ecosystem model in Monte Carlo mode (Liu and Chen 1998) resulting in alternative realizations of the model. These realizations show a non-linear statistical dependence between the timing of the shift and the semivariance observed at a particular time step before the shift (cf. Appendix D). This is the case for each of the individual separation distances of the semivariance (spatial and temporal), and for the ratio of semivariance at separation distance 1 and 25. The statistical dependence is already strong at time step 500 and tends to increase towards the shift.
This exploratory data analysis shows that semivariances are potential early-warning signals that can be used to forecast the timing of a shift. However, their predictive power does not only depend on the correlation with timing of the shift, but also on the uncertainty in the measured semivariance (due to observation and sampling error). This uncertainty is represented, and thus taken into account, by the covariance matrix R t (Eq. 2) that is used in the Bayesian forecast.

Sequential Bayesian forecasting
In order to mimic an ecosystem that is almost continuously monitored, we sample the synthetic world with a time interval of 300, resulting in observation times t ¼ 200, 500, . . . , 1400. For each observation time, 2500 samples of biomass are collected according to the sampling scheme described in the Methods section. We assume that the standard deviations of the systematic observation error r s and the random observation error r r are both equal to 0.1. At each observation time, the samples are used to calculate the spatial and the temporal semivariance at four separation intervals (h ' 1, 6, 15, and 25 grid cells), the mean biomass, and the covariance matrix R t (Eq. 2). Associated with an increase in semivariance values, the error variances of the semivariance values increase towards the shift (for details, see Appendix E). The error variances of mean biomass stay constant, because the mean biomass does not change much towards the shift. Errors are approximately Gaussian distributed, which is required for Eq. 2.
To evaluate the value of different early-warning signals in forecasting shifts, scenarios of different combinations of observed early-warning signals are assimilated into the system model. These are (1) ''temporal semivariance'', using observed temporal semivariance at 4 separation distances, (2) ''spatial semivariance'', idem, using spatial semivariances, (3) ''temporal and spatial semivariance'', using spatial and temporal semivariances, i.e., 8 semivariance values, (4) ''mean'', using observed mean biomass, (5) ''all information'', using observed temporal, spatial semivariance values and the mean biomass values, (6) ''no observations'', without assimilation of measurements into the model, i.e., a free Monte Carlo run. The ''no observations'' scenario refers to a forecast that only relies on prior estimates of the model parameters and the system driver (grazing pressure). Fig. 2 shows summary statistics calculated from individual realizations (Appendix F) of the system model. In the ''no observations'' scenario, the large uncertainty defined by the prior probability distributions of the system model and its parameters results in a large uncertainty in forecasted biomass and timing of the shift ( Fig.  2A). This uncertainty is strongly reduced when observed temporal semivariance is assimilated into the model, even when the forecast is made at a moment well in advance of the shift, for instance at t ¼ 500 (Fig. 2B). Particularly at the first observation time, at t ¼ 200, the Bayesian update strongly reduces the uncertainty in biomass. As a result, the uncertainty in the timing of the forecasted shifts is reduced as well. The assimilation of observed semivariance into the system model results in a larger reduction of uncertainty compared to assimilation of the observed mean biomass. For the forecast made at t ¼ 500 (Fig. 2, left panels), the temporal semivariance scenario results in a standard deviation in the timing of the shift of 204, while assimilating mean biomass results in a standard deviation of 365. Results for the spatial semivariance scenario (not shown) are comparable to those for temporal semivariance. The combination of semivariances and mean biomass further reduces uncertainty, with a standard deviation of 114 (Fig. 2E, left panel). The forecasts made at t ¼ 1400 (Fig. 2, right panels) show smaller uncertainty compared to t ¼ 500 forecasts due to the larger number of data assimilated (5Á2500 samples versus 2Á2500 samples for the forecast at t ¼ 500) and a shorter lead-time. Importantly, in addition to a reduced uncertainty in the forecasted timing of a shift, the assimilation of measured semivariances provides improved knowledge about the ecosystem itself, and its drivers. This is shown by the posterior probability densities of parameters and inputs, i.e., the probability densities found by assimilation of past-observed semivariance into the system model. The posterior probability densities are considerably narrower compared to the prior densities (for details, see Appendix G).

Effect of observation error and number of observations on forecast uncertainty
To evaluate the effect of observation error on forecast uncertainty, we run our scenarios using two different settings for the observation error. One is an increased systematic error, i.e., r s ¼ 0.3 instead of 0.1. As the systematic error introduces the same bias to all biomass measurements, it does not affect the uncertainty in observed semivariance. However, the error variance of the observed mean biomass does increase, with approximately 800% (for details, see Appendix E). As a result, the uncertainty in forecasted timing of the shift increases for the scenario that assimilates only mean biomass, compared to the original scenario (Fig. 3, see Appendix H for detailed results). The relative increase is largest when the forecast is made at t ¼ 1400. The increased systematic error also increases the uncertainty of the forecasted timing for the ''all information'' scenario, because this also uses mean biomass. However, the effect is less pronounced because the all ''information scenario'' also uses observed semivariance.
Unlike increasing systematic error, increasing the random error mainly affects the uncertainty of observed semivariance. With a random error r r of 0.3 (instead of 0.1), the variance of the uncertainty in observed semivariances increases with approximately 1000% while the variance of observed mean biomass only increases with approximately 10% (Appendix E). As a result, an increase in random error gives a marked increase in forecast uncertainty for the scenarios that assimilate semivariance into the model as shown in Fig. 3 (for details, see Appendix H). The scenario that uses only mean biomass is hardly affected by the increased random error.
To study the effect of the number of samples, we run the original scenarios with a number of 17 sampling locations per time step instead of 50, which was used in the original scenarios. The adjusted scheme was created by randomly selecting 14 samples at grid locations of the original scheme and 3 samples at random positions directly next to grid locations. The scheme results in 17Á50 ¼ 850 sampling locations at a particular observation time t. The observation error was kept the same as used in the standard scenario (r s ¼ r r ¼ 0.1). The effect of this scenario (Fig. 3, fewer samples run) on the forecast uncertainty is comparable to the scenario with larger random error.
The forecast uncertainty of the scenario that uses observed semivariance is further reduced when a smaller number of samples (17 sampling locations) and increased random error (r r ¼ 0.3) is used (Fig. 3). With this sampling scheme and observation error, the advantage of using observed semivariance instead of mean observed biomass disappears, and the scenario that uses observed mean biomass has a forecast uncertainty that is considerably smaller than the scenario that uses observed semivariance.

DISCUSSION
Although the literature proposes a wide range of higher order statistics of system variables that show a marked increase well before an upcoming regime shift, the value of these early-warning signals for making quantitative inferences on the timing of a shift and its associated uncertainty remains largely untested. The limited number of studies (Biggs et al. 2009, Contamin andEllison 2009) that evaluate early-warning signals for forecasting used deterministic ecosystem models with arbitrarily defined thresholds for earlywarning indicators signaling a shift, without a theoretical framework to include observation and sampling errors of early-warning signals. Also, a number of current studies states that quantitative values of early-warning signals alone have limited value in diagnosing an impending shift, let alone the calculation of the timing of the shift (Van Nes and Scheffer 2007, Dakos et al. 2009, Kéfi et al. 2010. We have shown that the calculation of the absolute timing of the shift (and its uncertainty) is possible using observed early-warning signals. This is the case even when the ecosystem model is largely undefined in its parameters and drivers, as was the case in our runs. The results show that our assumed uncertainty in spatial dispersion of biomass (expressed by the probability distribution of the ecosystem model parameter d ) does not hamper the forecast of the timing of the shift. This is notable, as spatial dispersion directly affects semivariance at a particular distance to a shift (Dakos et al. 2009, Fernández andFort 2009), and this uncertainty could overshadow the trend in semivariance towards a shift, which is not the case here.
The use of spatial or temporal variance in biomass calculated from samples results in a forecast uncertainty of the timing of the shift that is low compared to the case when mean biomass is used. Thus, spatial and temporal variance can indeed be a better leading indicator of forthcoming shifts compared to measured mean state variables, i.e., mean biomass. The best results are found when both semivariance and mean biomass, also referred to as a composite indicator (Contamin and Ellison 2009), are used.
However, a large sample size with small observation error is required. The runs with increased random sampling error and/or a reduced number of samples show that the forecast uncertainty of the timing of the shift strongly depends on the random sampling error and the number of samples used. In general, the estimation of higher order statistics (e.g., semivariance) from samples is more difficult, i.e., results in larger uncertainty, compared to the estimation of first order statistics (de Gruijter et al. 2006). This is confirmed by our calculations on biomass (Appendix E). Thus, the lower uncertainty in forecasted timing of the shift obtained when using the semivariance compared to using the mean, disappears when random observation errors are large and sampling densities are too low. Changing systematic sampling error has the opposite effect of changing random sampling error: the uncertainty in forecasted timing of the shift shows a marked increase when mean biomass is assimilated, while the uncertainty for v www.esajournals.org the scenarios that use semivariance is not affected. So, the use of early-warning signals to forecast shifts is promising provided that random sampling errors are small, and large data sets are assimilated.
It is difficult to provide quantitative guidelines for the required sampling density to make earlywarning signals useful in practice. This is because the effect of early-warning signals on forecast uncertainty depends on many factors, including the type of ecosystem studied, the magnitude of random disturbances of the system, the prior distribution of parameters and inputs including spatial variation in these values due to landscape heterogeneity, the type of early-warning signal used, the magnitude of random and systematic sampling error, the number of samples, and the sampling scheme. A number of these factors were fixed and taken spatially uniform in our study. We do however expect that the problem of high uncertainty in observed semivariance can be extrapolated to other situations, and our study provides a strong indication that dense sampling networks with low measurement error are required to make early-warning signals useful in forecasting shifts. With a smaller sample size, the use of mean state variables in forecasting the shift may give better results in terms of forecast uncertainty.
Our results show the relevance of studying early-warning signals in the context of sampling error. An indicator with a marked increase well ahead of an upcoming shift may still be relatively useless for the purpose of diagnosing an upcoming shift or forecasting its timing when observed values of the indicator have a high uncertainty. Unfortunately this is often the case because ecological sampling networks mostly consist of a limited number of samples. This is also important in the light of studies that test the hypothesis that early-warning signals exist in real world ecosystems. As these studies rely often on relative small sampling networks with inherent large observation errors, relationships between early-warning signals and distance to a shift may be overshadowed by the relatively high error in observed early-warning signals.

ACKNOWLEDGMENTS
We thank two anonymous reviewers for their constructive comments. v www.esajournals.org

Introduction
When forecasting the timing of a regime shift in a particular study area, we can rely on two sources of information: prior information and observational data. The prior information in-cludes our understanding of how the ecosystem works, in particular how it changes over time leading to a possible regime shift. For quantitative predictions, this knowledge needs to be encoded in a dynamical (temporal) ecosystem model, for instance a vegetation growth model, as is used in our study. In addition, the prior information includes estimates of the magnitude and direction of change in drivers that push the ecosystem towards a regime shift. These drivers are represented as external inputs to the ecosystem model. An example of a driver would be grazing pressure, as is used here. When calculating the forecast of the timing of a regime shift, initial estimates, so called priors, are made of model parameters and drivers. Values of model parameters are estimated from literature or by calibration against historic data. Future changes in drivers of the ecosystem are for instance estimated by extrapolating historic trends.
As these prior estimates rely on assumptions or uncertain data, they need adjustment using observational data of ecosystem state variables collected in the study area, in the period of time before a potential regime shift. Here, we use time series of observations of biomass. This is our second source of information for forecasting the timing of a regime shift. The temporal sampling interval and spatial resolution of the observations need to be sufficiently small to derive temporal changes in values of early-warning signals. Observations that can be used are for instance field samples of vegetation taken at a regular time interval, or time series of biomass derived from daily remote sensing images.
As both prior information (model parameters and drivers) and observational data collected in the study area are uncertain, they need to be described in probabilistic terms, i.e., using probability distributions for prior model parameters and observations. This allows weighting both sources of information when making the forecast of the timing of the shift. Here, prior information and observations are combined by a sequential Bayesian filter that numerically solves Bayes' Theorem. This results in updated values, referred to as posteriors, for model parameters and drivers, and a forecast of the timing of the shift and its uncertainty, calculated using all available information and observations. The approach is to calculate this forecast using different sets of observational data, i.e., either using early-warning signals (i.e., higher order statistics) calculated from samples of ecosystem state variables or using mean values of ecosystem state variables as observational data. If the forecast uncertainty is indeed lowest when early-warning signals are used, our study confirms the assumed value of early-warning signals to predict regime shifts.

Sequential Bayesian filtering
Sequential Bayesian filtering combines two sources of information: prior knowledge of the system and observational data of the state of the system. These sources of information are combined by using a numerical solution of the Bayes' theorem. The prior knowledge of the system is given by a forward stochastic model: In Eq. A.1, x t , is the full state of the system, described by a joint probability density of all variables (state, parameters, drivers), denoted by p(x t ), with the time index t, t ¼ 0, 1, . . . , T, and f is the system transition function that represents real-world processes. Eq. A.1 of this appendix is represented by Eqs. 2-3 in the main text of the article. Observational data are collected by monitoring the real state of the system. This is done by sampling one or more state variables of the system. Observations y t are taken at observation times t m , with t m , a subset of t.
These sources of information are combined as follows, as illustrated in Fig. A1A. At t ¼ 0, which is generally a moment just before the first observation time, the system model is initialized with prior probability density functions of state variables, parameters, and drivers. The system is integrated forward in time with the system model (Eq. A.1) up to the first observation time. At this time step, the observational data are used to adjust the full state of the model. From this adjusted state, the system is propagated forward in time with the system model, up to the next observation time where a similar adjustment of the full state is made. This procedure is followed up to the last time step T.
The adjustment of the full state of the model at an observation time is done by solving Bayes' theorem: pðx t jy t Þ ¼ pðy t jx t Þpðx t Þ pðy t Þ : ðA:2Þ In Eq. A.2, p(x t ) is the joint probability density of the system variables at t calculated by propagating the system model from the previous observation time up to the current observation time. As it refers to the density at the observation time before assimilating observations collected at the observation time, it is referred to as the prior probability density at the observation time. The term p(y t jx t ) is the joint probability density of the observations at t given the model state, also referred to as the likelihood of the observations given the model. The term p(y t ) is the joint probability density function of the observations at t. The term p(x t jy t ) is the adjusted joint probability density of the model. As it refers to the density at the observation time after assimv www.esajournals.org ilating observations collected at the observation time, it is referred to as the posterior probability density.
The Particle Filter solves this sequential Bayesian scheme with Monte Carlo simulation (Fig.  A1B). At t ¼ 0, a number of N independent model realizations (also referred to as particles) x function that rounds to the nearest integer toward zero. In the second step, residual probabilities r ðiÞ t are calculated: ðA:9Þ A number of N À P N j¼1 k ðjÞ t additional copies of samples are made using the residual probabilities r ðiÞ t . This is done in a similar way as in the basic resampling technique by uniform sampling from a cumulative distribution function constructed now from the residual weights r ðiÞ t . The samples copied in step one and two result again in N samples representing the posterior probability density of the full model state.
A potential well-known problem of the particle filter is that the number of unique realizations (particles) that is propagated through time becomes too small to provide a precise representation of the probability density of the system. Fig. F1 shows that a large number of unique realizations is retained in the system, and for all scenarios the number of unique realizations is sufficient to approximate probability distributions (Table A1). Note that the two sampling schemes used in the paper use only the values at measurement times t ¼ 200, 500 and t ¼ 200, 500, . . . , 1400, respectively. In each panel, the three lines represent: S, main (standard) scenario with standard deviation of systematic observation error r s ¼ 0.1 and standard deviation of the random observation error r r ¼ 0.1; ISE, increased systematic observation error scenario with r s ¼ 0.3, r r ¼ 0.1; IRE, increased random observation error scenario with r s ¼ 0.1, r r ¼ 0.3. (A), error variance of temporal semivariance at separation distance 1, (B) error variance of temporal semivariance at separation distance 25, (C) error covariance between temporal semivariance at separation distance 1 and 25, (D) error variance of mean biomass. Error variances and covariances for spatial semivariance, and other separation intervals show comparable trends (not shown). The observation errors of the semivariance values increase towards the shift, which can be explained by an increase in mean semivariance. The error variances of mean biomass, however, stay constant. Errors are approximately Gaussian distributed, which is required for Eq. 2 (main text).

APPENDIX G
The posterior probability densities of model parameters and drivers (Figs. G1 and G2) are considerably narrower compared to the prior densities. The ranking in the performance of the different scenarios is comparable to what we found for the uncertainty in the forecasted timing of the shift, with the sampled semivariance resulting in lower uncertainties than using mean biomass. Parameter and input uncertainty reduces when a larger number of observations is assimilated, as shown by the difference in the results between t ¼ 500 and t ¼ 1400.