Statistical inference for a multivariate diffusion model of an ecological time series

. Diffusion processes are natural modelling frameworks for many stochastic phenomena. However since inference for nonlinear diffusion models is difficult, the diffusion models used within the ecological literature are predominantly linear thus ignoring the more complex forces that many populations and environments are subject to. We suggest a parameter estimation procedure that is computationally efficient and valid for a wide class of diffusion models. This includes diffusions that are nonlinear and/or multivariate. The procedure is used to fit a diffusion model to a time series with three variables: the plankton abundances for both Pseudo-nitzschia australis and Prorocentrum micans in addition to the surface water temperature. Pseudo-nitzschia australis are shown to achieve maximal growth with a water temperature in the region of 7.8–16.0 8 C. The corresponding optimum temperature region for Prorocentrum micans is 12.5–24.5 8 C with the dinoflagellate appearing to be less sensitive to sub-optimal temperatures. Both species are governed by r-type strategies and are subject to regulated growth. Though Prorocentrum micans are known to feed on diatoms, we are unable to find evidence of feeding on Pseudo-nitzschia australis . We thus demonstrate that by fitting a diffusion model to an ecological dataset, one may learn much about the factors that affect the ecosystem. A good understanding of these factors is vital in order to manage an ecosystem appropriately.


INTRODUCTION
Diffusion processes are powerful instruments for modelling continuous-time phenomena. Not only can these models account for a wide range of stochastic behavior, but they also enable one to infer the dynamics that govern the evolution of a process from discretely-sampled data. Such inference remains straightforward even when the data is irregularly spaced. Consequently diffusion models have found widespread appli-cation in many different fields of scientific endeavor.
In ecology, diffusion processes are used to model the movements of organisms and hence to explain the distribution and dispersal of a species (Tufto et al. 2012). They are also widely used for analyzing a population's viability (Takimoto 2009). Furthermore, diffusion models can be used to study the impact of established ecological concepts-such as the Allee effect-on a population (Takimoto 2009). Generalizations of tradi-tional diffusion processes to include time delayed input have also found application in population modelling (Mao et al. 2005).
Diffusion processes may also serve as natural extensions to the ordinary differential equations often used in ecology. Mao et al. (2002) showed how the addition of a noise component to an ordinary differential equation, thus reformulating the model in terms of a stochastic differential equation (SDE), can suppress potential population explosions. Indeed diffusion processes can approximate many well-known population models. This includes branching processes (Allen and Allen 2003), age-structured population models (Vindenes et al. 2011) as well as birth and death processes (Varughese 2009). Diffusion processes have also been used to model environmental variables such as phosphorous levels in lake water and sediment (Carpenter and Brock 2006). The usage of diffusion processes in both population and environmental modelling suggests that higher dimensional diffusion processes can provide a unified framework for inferring relationships between populations and the environment (Varughese 2011).
Compared to the many ecological papers studying the population behavior implied by a prescribed diffusion model, relatively few perform statistical inference using a diffusion model. Within an ecological context, parameter estimation has been primarily performed for simple diffusion models. For example, Gaston and Nicholls (1995) fitted a Wiener process with drift to bird populations in order to estimate their times to extinction. Some extensions have been made to more complex diffusion models. Knape and de Valpine (2012) fitted a more complex quadratic diffusion model to kangaroo data by combining Markov Chain Monte Carlo (MCMC) with particle filters. The authors point out that this is a brute-force method that is computationally intensive. Humbert et al. (2009) showed how one may estimate the parameters of a simple diffusion process with observation error. Ovaskainen (2004) developed a diffusion model of species movement across a heterogeneous landscape and used a finite element scheme to estimate the model parameters. Ananthasubramaniam et al. (2011) focused on the estimation of the volatility of a diffusion process subsequently enabling their study of the behavior of a size-structured population.
This article demonstrates a recently developed, computationally efficient parameter estimation procedure that is applicable for a wide class of diffusion models (Varughese 2013). The methodology allows quite arbitrary model specification thus widening the scope of inference substantially: whether the aim is to measure specific phenomena or to encapsulate multiple points of interest in a model, the method remains the same. This allows for the fitting of more realistic diffusion models of an ecosystem in a computationally efficient manner. The method is used to fit a diffusion model to a plankton time series. The flexibility of the parameter estimation procedure enables us to fit a diffusion model that simultaneously allows for species self-regulation, interspecific interactions and environmental drivers. The fitted model can subsequently be used to better understand the dynamics of plankton fluctuations. As such, the purpose of this paper is threefold: firstly, introduce the diffusion process and its properties; secondly, introduce an efficient parameter estimation method that is valid for a wide class of diffusion processes and finally, fit a model to an ecological dataset and report on the subsequent findings.

DIFFUSION PROCESSES
A multivariate diffusion process can be represented by the following stochastic differential equation (SDE): where U(t) ¼ (U i (t)) i¼1, . . . ,m is the process vector at time t and dU(t) ¼ U(t þ dt) À U(t). U(t) can for example represent a collection of species abundances and environmental factors. The parameter vector h ¼ (h i ) i¼1, . . . ,p of the diffusion system may represent ecological parameters such as a crowding coefficient or environmental sensitivity. l(U(t), . For a general introduction to stochastic differential equations consult Stirzaker (2005). Table 1 gives a summary of all notation used.
v www.esajournals.org The stochastic component of the diffusion process U(t) is driven by the m independent Brownian motions. Brownian motion B(t) is a continuous-time stochastic process characterized by the following properties: firstly, B(0) ¼ 0; secondly, the function mapping t!B(t) is continuous and lastly, B(t) has independent increments with B(t) -B(s) ; Normal(0, t À s).
From Eq. 1, it is easy to see that the diffusion system is characterized by l(U(t), t; h) and R(U(t), t; h) as they govern the evolution of U(t) through time. Their individual elements are defined as: Over an infinitesimally small interval, the diffusion system is distributed (Stirzaker 2005): Uðt þ DtÞ À UðtÞ ; Multivariate Normal l UðtÞ; t; h Dt; Since the drift vector and diffusion matrix only depend on the current process value U(t), the process is Markov. Furthermore, as the time interval (t; t þ Dt) lengthens, the mean and variance of U(t þ Dt) À U(t) only remain proportional to Dt for a special case (a Wiener process with drift). In general, the evolution of the probability density function of U(t) ¼ (/ i (t)) i¼1, . . . ,m through time-which we denote as p(U(t))-is governed by the Kolmogorov forward equation (Cox and Miller 1977): This is also known as the Fokker-Planck equation. There is a corresponding partial differential equation for the moment generating function (MGF) (for more on MGFs consult Mood et al. The parameter vector for the diffusion system.
The ith diffusion process.
The multivariate diffusion process vector.

B(t)
A univariate standard Brownian motion process.
The conditional instantaneous expectation for / i (t).
The drift vector for the diffusion system.
The conditional instantaneous covariance between / i (t) and / j (t).
The diffusion matrix for the diffusion system.
Transitional probability density function for U(t).
Approximate transitional probability density function for U(t).
The parameters for the MGF.
Set of differential operations that act on the MGF.

M(K, t)
The moment generating function (MGF).

L(h)
The likelihood function.
The temperature.
The deterministic component of the temperature.

C(t)
The random component of the temperature.

A(t)
The log of Pseudo-nitzschia australis abundances.
v www.esajournals.org of the system. In addition, let K* ¼ (]/]m 1 , . . . , ]/ ]m m ). The MGF is governed by the following partial differential equation (Varughese 2013): Unless required, we may suppress any dependence on the parameter vector h within our notation. Though the terms l i (K, t) and r ij (K, t) denote functions of K and t, the terms l i (K*, t) and r ij (K*, t) represent differential operators that act on the MGF M ¼ M(K, t). So, for example, if we have a 2-dimensional diffusion system with drift term l 1 (/ 1 , is a differential operator and represents the action of the operator on the MGF M(K, t). In general, neither Eq. 3 nor Eq. 4 are analytically tractable.
Since a diffusion process is Markovian, the likelihood of a diffusion system sampled at discrete time points (t 1 , t 2 , . . . , t N ) can be decomposed multiplicatively (Ait-Sahalia 2002). That is, Asymptotically, for large N, the term p(U(t 1 )) may be ignored whilst the transitional distribution p(U(t i )jU(t iÀ1 )) is the solution to Eq. 3 at time t i with the boundary condition that p(U(t)) is given by the Dirac delta function centered around U(t iÀ1 ) at time t iÀ1 . The Dirac delta function is zero everywhere except at U(t iÀ1 ), where it must be infinite in order to ensure the probability density integrates to one. Using this probability density at time t iÀ1 reflects the fact that we are conditioning on U(t iÀ1 ) in Eq. 5 so there is no uncertainty about the process value at time t iÀ1 . Unfortunately, except for a few special cases, Eq. 3 (and hence also the likelihood) is analytically intractable. A number of approaches have been suggested to approximate the likelihood when Eq. 5 is intractable. One may use Bayesian imputation to augment the data so that the distribution of the increments is approximated by Eq. 2 (Roberts and Stramer 2001). Alternatively, Monte Carlo methods may be used to approximate the likelihood (Kleinhans and Friedrich 2007). One can also solve Eq. 3 numerically. Wojtkiewicz and Bergman (2000) discretized the spatial domain and solved the partial differential equation numerically at each point on the lattice. Another possibility is to seek a closed form approximation to the transitional probability. Ait-Sahalia (2002) approximated the solution to Eq. 3 by a Hermite polynomial expansion.
Instead of approximating the transitional distribution directly, this paper first seeks to approximate the conditional moments of the diffusion system. These may be subsequently used to approximate the likelihood. Huang (2012) approximated the first two conditional moments using a Wagner-Platen approximation. Varughese (2013) approximated the conditional moments to an arbitrary order using the cumulant truncation procedure. This method has a number of features which makes it suitable for ecological datasets: the method is applicable for multivariate diffusion systems and it can be naturally incorporated within a MCMC procedure. Furthermore, in comparison to small-time expansions like the Hermite and the Wagner-Platen approximations, the method is more robust to the large lapses between observations that are prevalent with ecological datasets. Consequently, we apply the method to fit a diffusion model to an ecological dataset.

THE DATASET
We introduce the methodology of the present paper by analyzing part of the seminal dataset compiled by W. E. Allen during 1917-1939 on the abundance of phytoplankton and the state of relevant environmental variables. Thomas et al. (2001) give a very detailed analysis of how the data was collected and managed. This time series v www.esajournals.org contains dense information over long periods of time, warranting the analysis of a diffusion representation of the underlying population dynamics and environmental factors. We give a brief description of the data collection scheme and then discuss the attributes of the species to be modelled.
Observations on the number of each species present in water samples, taken from sites along the Southern California coastline, were collected daily and at the end of each week compiled into weekly averages by mixing the samples for each site and re-analyzing the content for observations (Thomas et al. 2001). Our focus will be on weekly observations taken from the Scripps Institute of Oceanography station for the dinoflagellate Prorocentrum micans and the diatom Pseudo-nitzschia australis as well as the accompanying surface water temperatures for the time period 1930 to 1937. We focus on the post 1930 data as the data collection method changed in this period. Fig. 1 displays log-population numbers together with the surface water temperature. The dataset consists of observations made at non-equidistant intervals, up to 17 weeks apart.
Mixotrophic behavior among dinoflagellates is now a well-established phenomenon in marine biology (Jeong et al. 2010). Burkholder et al. (2008) give evidence of the ingestion of prey by Prorocentrum micans together with the morphological adaptations required to support this trophic mode. These phenomena suggest the possible existence of a predator-prey relationship between Prorocentrum micans and Pseudo-nitzschia australis which lead to interesting possibilities from a population dynamics perspective, given that both species are known to exhibit blooming dynamics caused by changes in ambient media (Cochlan et al. 2008), as well as the use of photosynthesis as a primary method of energy production. Thus the rules that govern the underlying dynamics are potentially more complex than those of specialist predators and their prey, where predation components dominate the underlying dynamics. Smayda (2002) explore contrasts in the blooming behavior of dinoflagellates and diatoms in general, citing specific adaptations and strategies (in this instance transience in trophic modes and nutrient uptake) under ambient changes as significant population driving components. The preference of species to water temperature is difficult to establish, with Pseudo-nitzschia australis and other harmful Pseudo-nitzschia species being cosmopolites (Lundholm and Moestrup 2006), showing up in temperate coastal waters (Guiry 2012). Prorocentrum micans also tolerates a wide range of temperatures (Guiry 2012) and shows vast geographical distribution. Temperature does still however play a crucial role in growth rates and contribute to conditions that result in blooms (Rose and Caron 2007).

A multivariate model of the data
Our attention now turns towards the development of an appropriate model for this dataset. An ecological model should be capable of mimicking the behavior of the primary forces that affect the system. By fitting such a model to the plankton data, one can infer the system's dynamics. This however is constrained by the need for model parsimony. Parsimonious models are usually easier to interpret. By definition, the parameter space of a parsimonious model will be of a lower dimension than that for a complex model. This is desirable as not only is it easier to find the global maximum of the resulting model's posterior distribution but it is also less likely that the model will suffer from model degeneracies (that is, having large regions in the parameter space for which the posterior distribution is very close to its maximum). Difficulties arising from high dimensionality get compounded in multivariate diffusion models making parsimony all the more important.
In this paper, the focus is not only on the nature of the interactions between the two plankton species, but also on the impact of the fluctuating water temperature on their abundances. The surface water temperature T(t) consists of a deterministic and a random component. Let: where w(t) represents the deterministic fluctuation in temperatures and T(t) is the residual component. w(t) is given by the sum of the annual seasonal variation in temperatures (assumed to be sinusoidal) and a longer term cyclical component (assumed to be a polynomial v www.esajournals.org of order p) that is driven by the Southern Oscillation Index. That is: The cyclical component is estimated by fitting a degree seven polynomial to the rolling mean of the water temperature. By subsequently subtracting the resulting polynomial from the data, the seasonal component can be fitted using the method of least squares. Fig. 2 shows the deterministic component w(t) superimposed onto the temperatures.
The residual temperature C(t) is modelled as a diffusion process.
where q . 0 is the instantaneous volatility of the temperature and h , 0. Keeping h , 0 ensures that the temperature tends to drift towards the deterministic component w(t). We may derive the corresponding stochastic differential equation (SDE) for T(t) through Ito's lemma. Given that a diffusion process X(t) behaves according to the SDE dX(t) ¼ l(t, X(t))dt þ r(t, X(t))dB(t), Ito's lemma gives the corresponding SDE for a transformation g(t: X(t)): By applying Eq. 9 to Eqs. 6 and 8, we obtain: The SDE is nonlinear in time since w(t) is a nonlinear function. In addition to the SDE for water temperature, the SDEs for the two species must also be specified. Due to the very large range in the species abundances, we take their log transforms. Let A(t) and C(t) be the logs of species abundances for Pseudo-nitzschia australis and Prorocentrum micans, respectively. Note that: dAðtÞ ¼ log P: australis abundance at time t þ dt P: australis abundance at time t

:
That is both dA(t) and dC(t) may be interpreted as the logs of the instantaneous growth rates for the two species. We model these as follows: dAðtÞ ¼ ½a 1 þ b 1;1 AðtÞ þ b 1;2 TðtÞ þ c 1 TðtÞ 2 þ b 1;3 CðtÞ dt þ ffiffiffiffi ffi d 1 p dB 1 ðtÞ dCðtÞ ¼ ½a 2 þ b 2;1 CðtÞ þ b 2;2 TðtÞ þ c 2 TðtÞ 2 þ b 2;3 AðtÞ dt þ ffiffiffiffi ffi d 2 p dB 2 ðtÞ: The terms b 1,1 and b 2,1 represent intraspecies competition coefficients. As such, both b 1,1 and b 2,1 will be negative hence ensuring that the growth rates of the two species are regulated. By making the diffusion's drift terms quadratic in T(t), we allow their relationship to water temperature to be concave with the degree of concavity controlled by the parameters c 1 and c 2 . Consequently, the optimal temperatures for maximal growth of P. australis and P. micans are Àb 1,2 /2c 1 and Àb 2,2 /2c 2 respectively with c 1 and c 2 representing the respective species' sensitivity to departures from the optimal temperature. Interactions between the two species are allowed v www.esajournals.org through the parameters b 1,3 and b 2,3 . This enables us to look for possible predation upon P. australis by P. micans. Finally, the parameters d 1 and d 2 represent the respective volatilities in the growth rates of the two species.
In total, there are 14 parameters that need to be estimated: a 1 , a 2 , b 1,1 , b 1,2 , b 1,3 , b 2,1 , b 2,2 , b 2,3 ,  c 1 , c 2 , d 1 , d 2 , h and q. These are subsumed within the parameter vector h. The value of h determines the transitional trivariate distribution p(U(t i )jU(t iÀ1 )) for U(t) ¼ (A(t), C(t), T(t)) and hence also the value of the likelihood through Eq. 5. We may thus estimate h by maximizing the likelihood.

Parameter estimation
We fit the plankton model to the W. E. Allen dataset. As the model is multivariate and nonlinear, neither Eq. 3 nor the corresponding likelihood are analytically tractable consequently making inference difficult. However the parameter estimation method suggested by Varughese (2013) can still be used to analyze the data. The resulting analysis within this section is didactic, showcasing the application of this estimation procedure. Except where it may be unclear, any functional dependence on K and t is suppressed within the notation.
The first step in the procedure is to derive a differential equation governing the evolution of the model's MGF M(K,t) ¼ E[expfm 1 A(t) þ m 2 C(t) þ m 3 T(t)g] through time. By substituting Eqs. 10 and 11 within Eq. 4, we obtain: It is subsequently possible to derive the corresponding partial differential equation for the cumulant generating function (CGF) K(K,t) ¼ ln M(K,t) (for more on the CGF, consult Mood et al. (1974)). In order to do so, the following relation is needed: where x is a variable that M and K depend on. One can for example show: From Eq. 12, we thus have: Neither Eq. 12 nor Eq. 13 are analytically tractable. Indeed, even the numerical solution of these equations is difficult as they are both nonlinear partial differential equations. Instead, we apply the cumulant truncation procedure (Daniels 1954) as it provides approximate, computationally efficient solutions to Eq. 13. Note that the CGF K(K, t) ¼ lnE[expfm 1 / 1 (t) þ . . . þ m m / m (t)g] may be written as a series expansion in terms of the model cumulants.
where: j r 1 ;r 2 ;...;r m ðtÞ v www.esajournals.org f(U(t iþ1 )jU(t i )). This is an implementation of Eq. 5. v. Set i ¼ i þ 1 and if i , N (where N is the number of data points) go to step ii. 5. Calculate the acceptance ratio R. This is given by: is the prior density. We then accept the proposed parameter values h new with probability min(1, R). That is, set h new ¼ h old with probability 1 À min(1, R). Otherwise leave h new unchanged. 6. Go back to step 3.
The proposal distributions used within the MCMC are normally distributed and hence symmetric. In addition, we specify priors for the model parameters h. These priors are shown in Table 2. The priors reflect initial expectations on the behavior of the modelled system, including the optimal temperatures for the two species as well as species competition.
MCMC chains of length 300,000 are run and the first 10,000 steps are trimmed as part of the burn-in period. It took 2 hours, 23 minutes to run such a chain, implemented in R, on an Intel i5, 2.53GHz processor. However, if instead of using a Runge-Kutta scheme to solve the system of equations in 16 we rather used a Euler scheme, the run time drops to 43 minutes. Despite its slower speed, we recommend using the Runge-Kutta algorithm as it is more robust against severe non-linearities. Most scientific software will have built-in procedures to solve a system of ordinary differential equations. We provide our R code as a Supplement.
In comparison to the cumulant truncation method, using a well-known computationally efficient data augmentation method (Kalogeropoulos et al. 2011) to evaluate the likelihood resulted in the MCMC chain taking 1 hour, 41 minutes to run. The two methods are comparable in speed. In general the speed of the imputation algorithm is tied to the resolution at which the imputation is performed. The appropriate resolution to use for the imputation is determined by the dataset so it is difficult to draw general conclusions about the relative speeds of the two methods. It should however be pointed out that, unlike the imputation algorithm, the cumulant truncation algorithm introduced allows for nonzero off-diagonal terms in the model's diffusion matrix R(U(t), t; h). Such off-diagonal terms may be required to model an ecosystem of interest.

RESULTS
By applying the MCMC algorithm to the dataset, parameter estimates were obtained. These estimates together with the corresponding 90% credibility intervals are shown in Table 3. A quick comparison of d 1 and d 2 reveals that Pseudo-nitzschia australis is the more volatile species, which is perhaps not surprising as it tends to be the more numerous species. In addition the parameters b 1,2 and b 2,1 are negative. This implies that the lower the current species numbers are, the greater the instantaneous growth rate will be-all else being equal. This serves to regulate the growth of both species.
We also find no evidence of a standard predator-prey interaction between the two species as neither of the parameters b 1,3 and b 2,3 are significantly non-zero. Their estimated posterior distributions are shown in Fig. 3. Any possible predator-prey link between Prorocentrum micans and Pseudo-nitzschia australis may be obscured by competition for nutrients and/or by the species having different ecological niches. This is a topic for future research.
Perhaps more interestingly, the model enables us to find the optimal temperatures for species growth. The optimal temperatures T* may be found by maximizing the instantaneous growth rates in Eq. 11 with respect to the temperature. We thus have: We can find the optimal temperatures sampled by the MCMC chain. Our estimate for the optimum temperature for Pseudo-nitzschia aus- tralis is 12.38C with a 90% credibility interval of (7.88C; 16.08C). For Pseudo-nitzschia australis, the estimated optimum temperature is 17.58C with a 90% credibility interval of (12.58C; 24.58C). Fig. 4 shows the estimated posterior distributions for the two species. Prorocentrum micans seems to prefer warmer temperatures than Pseudo-nitzschia australis. Furthermore note that c 1 and c 2 controls the concavity of the species relationships to the temperature. As such c 1 and c 2 represent their respective sensitivities to temperature. Pseudonitzschia australis appears to be more sensitive to unfavorable temperatures. This is further evidenced by the narrower credibility interval for its optimal temperature.

DISCUSSION
Diffusion models are capable of capturing the behavior of a vast array of ecological phenomena (Drake and Griffen 2009). It is demonstrated how a nonlinear, multivariate diffusion model may be fitted to an ecological time series through the use of the cumulant truncation procedure. The method remains applicable even with large, non-equal time lapses between observations within a time series (Varughese 2013). This makes it suitable for many ecological datasets, including the plankton dataset investigated in this paper.
The bloom dynamics exhibited by both species investigated are typical for r-selected species (Smayda and Reynolds 2001). In addition, there  v www.esajournals.org is evidence that the growth of the two species is regulated as the parameters b 1,1 and b 2,1 are negative. We also find no evidence of predation of the diatom Pseudo-nitzschia australis by the dinoflagellate Prorocentrum micans. Any possible predation could possibly be obscured by competition or a more complex interaction between the two species. The results however suggest that the dinoflagellate prefers warmer temperatures than the diatom. Pseudo-nitzschia australis is the more volatile species and appears to be more vulnerable to unfavorable temperatures. This suggests that the two species occupy different niches within the marine environment (Smayda and Reynolds 2001). By performing statistical inference upon the diffusion model, we were able to learn much about the species relationship to surface water temperature as well as the nature of inter-and intra-species interactions. Diffusion models may also be used to assess boundary crossing events and hence are useful in population viability analysis (Gaston and Nicholls 1995). They are also useful for forecasting population trajectories under varying conditions. This makes diffusion models highly valuable tools in the management and understanding of ecosystems.