Predicting coastal algal blooms in southern California

. The irregular appearance of planktonic algae blooms off the coast of southern California has been a source of wonder for over a century. Although large algal blooms can have significant negative impacts on ecosystems and human health, a predictive understanding of these events has eluded science, and many have come to regard them as ultimately random phenomena. However, the highly nonlinear nature of ecological dynamics can give the appear- ance of randomness and stress traditional methods — such as model fitting or analysis of vari-ance — to the point of breaking. The intractability of this problem from a classical linear standpoint can thus give the impression that algal blooms are fundamentally unpredictable. Here, we use an exceptional time series study of coastal phytoplankton dynamics at La Jolla, CA, with an equation-free modeling approach, to show that these phenomena are not random, but can be understood as nonlinear population dynamics forced by external stochastic drivers (so-called “ stochastic chaos ” ). The combination of this modeling approach with an extensive dataset allows us to not only describe historical behavior and clarify existing hypotheses about the mechanisms, but also make out-of-sample predictions of recent algal blooms at La Jolla that were not included in the model development.

Red tides off the coast of southern California have captured the attention of scientists and non-scientists for over a century. They occur when blooms of marine phytoplankton reach such high concentrations that the sea surface becomes noticeably discolored. In La Jolla, CA, blooms are often caused by bioluminescent dinoflagellates (e.g., Lingulodinium polyedrum). When concentrations of these organisms reach the tens of thousands per liter, the effect after dark is spectacular-lightning flashes in the breaking waves, eerie blue trails linger behind surf fish, and beach sand glows underfoot. But on the darker side, red tides and other algal blooms in southern California and throughout the world can be caused by toxic algal species, resulting in fish and shellfish mortality (Lackey and Clendenning 1965, Curtiss et al. 2008, Lewitus et al. 2016, food-borne illness (Hoagland et al. 2002, Lewitus et al. 2016, and even respiratory irritation and failure (Hallegraeff 1993, Pierce et al. 2003. Regardless of toxicity, the sheer concentrations of organisms can cause significant operational issues (e.g., clogging) to intakes at power and desalinization plants (Caron et al. 2010) and can lead to oxygen depletion and fish kills when blooms persist over extended periods (Hallegraeff 1993, Lewitus et al. 2016). Thus, while studying algal blooms can satisfy our innate curiosity for the dramatic, it also addresses important economic and human health concerns.
Prediction is a central concern with harmful algal blooms (HABs) and red tides. The ability to predict blooms even just a week in advance would allow public health officials to respond to human health concerns and provide time for facilities that use coastal water to shut down before equipment is damaged (Caron et al. 2010, Lewitus et al. 2016. The modern sampling program (begun in 1983) at the Scripps Institution of Oceanography (SIO) pier in La Jolla, CA, offers a long, finely sampled (twice-weekly to weekly), and nearly continuous record of chlorophyll-a (Chl-a) with many physical and nutrient covariates. Chlorophyll-a is a standard proxy for phytoplankton abundance that can be combined with occasional species-specific cell counts to verify the predominance of HAB species during blooms. Thus, the modern record offers a unique opportunity to take a detailed quantitative approach to studying coastal blooms not previously possible.
That being said, the best quantitative approach to prediction depends on the nature of the system being studied. One approach is to use parametric models. However, the coastal ecosystem in southern California is an open system, subject to oceanic conditions (McGowan et al. 1998, Checkley and Barth 2009, Kim et al. 2009). That is, unlike previous studies of red tides in sheltered environments like harbors and bays (Lee et al. 2003, Wong et al. 2007), these blooms occur in a highly dynamic physical environment that is extremely difficult to model Manuscript received 29 December 2016; revised 16 February 2017; accepted 6 February 2017. Corresponding Editor: Elizabeth E.  explicitly. Indeed, modeling studies and observations of the Southern California Bight find this region to be highly energetic with respect to currents, with substantial temporal variability across multiple spatial and temporal scales (Di Lorenzo 2003, Capet et al. 2008. Another approach is to use statistical methods such as cross-correlation or principal components analysis. However, these approaches can only reliably determine interactions when the effects of different driving variables are separable (i.e., each variable acts independently from the influence of the others). Such separability is a characteristic of linear systems, whereas numerous studies have suggested that the dynamics of marine populations, ranging from phytoplankton to fishes, are nonlinear (Sugihara and May 1990, Ascioti et al. 1993, Hsieh et al. 2005, Glaser et al. 2014. In a nonlinear system, the system variables act interdependently and so the effect of one driver (e.g., nitrate) cannot be understood separately from the effect of other variables (phosphate, silicate, stratification, phytoplankton abundance, etc.).
Indeed, attempts to understand blooms by applying correlation to chlorophyll-a and environmental covariates have met with little success. Rounsefell and Dragovich (1966) studied red tides in southwest Florida and found no evidence for linear correlations between environmental drivers and blooms. However, their analysis hinted that environmental factors were interacting nonlinearly to produce blooms. Similarly, Kim et al. (2009) used linear correlation to look for environmental drivers (e.g., wind-driven upwelling) of blooms at the SIO Pier, but found little evidence of significant relationships. Another statistical analysis of the SIO Pier data (unpublished) found a distinct linear correlation between chlorophyll-a and intrusions of cold water onto the shelf for the 1986 to 1994 time period (Fig. 1, top). However, this apparent correlation has completely disappeared with subsequent observations (Fig. 1, bottom). These "mirage correlations"-linear relationships that appear and disappear or even change sign as time goes on-are known to arise in nonlinear systems , thus indicating the need for nonlinear tools to study this problem.
Early work on nonlinear dynamics in ecology focused mainly on endogenously (internally) driven population dynamics (May 1976). However, nonlinear dynamics can also arise from a mixture of deterministic endogenous dynamics and stochastic exogenous (external) drivers like rainfall or temperature. This combination can produce stochastic chaos (Sugihara 1994, Sugihara 2011, where episodic events arise as nonlinear perfect storms of chance events (Dixon et al. 1999). In such systems, populations and exogenous drivers may show inconsistent correlation or none at all, even when there is a strong interaction. This accords with the work of W.E. Allen in the early 20th century (Allen 1943), who hypothesized that no single factor controlled the appearance of red tides, but rather that a "whole series of favorable combinations" must happen together. In light of all this, the lack of simple linear correlations between environmental variables and chlorophyll-a at the SIO Pier observed by Kim et al. (2009) is unsurprising.
Here, we reexamine the issues of detecting environmental drivers and prediction of algal blooms at the SIO Pier using empirical dynamic modeling (EDM). EDM is a non-parametric framework ("equation-free" as described by DeAngelis and Yurek (2015)) for understanding and predicting ecosystem behavior when linear statistics and parametric models do not apply (Sugihara and May 1990, Deyle et al. 2013, Ye et al. 2015b). Descriptions of the methods are given below, and in more detail in Appendix S1, but the key concepts are best illustrated in a brief introductory video (http://tinyurl.com/ EDM-intro). Although EDM is based on methods originally conceived to study deterministic dynamics, it has been extended to explicitly account for stochastic variables in systems where endogenous dynamics are convolved with stochastic forcing (Stark et al. 2003).
Our work builds on a long history of observation and research in La Jolla and southern California. The first report of a red tide in La Jolla dates back to 1901 (Torrey 1902). Following the founding of SIO and construction of the SIO Pier in La Jolla, W.E. Allen observed several red tides (Allen 1933(Allen , 1938(Allen , 1943 during careful, daily sampling of the phytoplankton at the pier over more than 20 yr. However, because the only environmental variables measured during this era were temperature and salinity, any investigation of exogenous drivers is limited. In the decades following Allen's sampling, there were only sporadic records of red tide events until regular biological sampling at the SIO Pier resumed in 1983. The modern sampling program includes twice-weekly measurements of chlorophyll-a. Weekly cell count measures are also available for some years but sadly these tend to be less consistent reflecting the focus of the funding available at the time. Although different from the direct cell-count measurements, chlorophyll-a measurements are thought to be a good proxy for measuring red tides, and as described in the Results to follow, the vast majority of samples during large algal blooms are dominated by dinoflagellates and specifically those considered HAB species. Thus, these data can be used to address the more specific problems of dinoflagellate blooms and HABs. Moreover, the modern sampling not only exceeds Allen's original observations in duration, but exists alongside concurrent measurements of phytoplankton cell counts, dissolved nutrient concentrations, water temperature, salinity, rainfall, and wind, as well as derived quantities, such as surface density. These data comprise a long-term record of hydrography, weather, ocean chemistry, and algal dynamics-information that could be used to study blooms as nonlinear events. The blooms at the SIO Pier are representative of the southern California coast in general. McGowan et al. (1998) found considerable regional coherence in the physical variability observed at the SIO Pier and other monitoring stations along the southern California coast.
A preliminary study of chlorophyll and cells counts across the 5 HAB monitoring stations in the Southern California Ocean Observing System (SCCOOS) indicates blooms are correlated over distances of 100 to 180 km, and show higher agreement among species specific comparisons vs. using chlorophyll-a concentration (Carter et al. 2014). Appendix S2: Table S1 shows the spatial coherence for cell counts of Lingulodinium polyedrum (the most extensive bloom-forming dinoflagellate in southern California) between the SIO Pier and other stations. These correlations suggest that the physical and biological activity at the SIO Pier reflect regional scale variability across the southern California coast. Consequently, the modern sampling program offers an exceptional opportunity for a detailed quantitative study of regional-scale coastal blooms, in order to better identify mechanisms and to develop a predictive model for the La Jolla coast and the surrounding region.
One hypothesized necessary condition for blooms is a stable or stratified water column (Holmes et al. 1967, Horner et al. 1997). This can result in a depletion of surface nutrients, creating a vertical separation between light and nutrientsconditions that should give motile dinoflagellates a competitive advantage. However, there are multiple mechanisms that can lead to stratification: sunny and calm weather, the meeting of different watermasses, heavy rainfall, or inputs of fresh water from land (Wyatt and Horwood 1973). Thus, there are many variables that can be indicative of stratified conditions: wind speed, vertical temperature gradients, or even just surface density. Indeed, several studies found a strong offshore thermocline during blooms in the vicinity of La Jolla (Holmes et al. 1967, Wilton and Barham 1968, Cullen et al. 1982, though it is important to note that a steep thermocline is not always necessary, as vertical stratification and weak mixing could also result from a lack of surface winds. A second possible driver of blooms is nutrient input, as nutrients are necessary for phytoplankton growth. Earlier studies suggested that the red tides in La Jolla Bay are fueled by nutrient inputs from upwelling (Holmes et al. 1967, Cullen et al. 1982. Classic coastalupwelling theory would then suggest North-to-South wind speed as a predictive indicator for blooms. However, contrary to the general theory, North-to-South wind speed has been shown to have limited utility for explaining coastal currents and upwelling in the Southern California Bight (Lentz andWinant 1986, Lerczak et al. 2003). Instead, the internal wave field appears to have a much stronger influence on upwelling in this region (Winant 1974, Lucas et al. 2011. Thus, wind speed may be a better indicator of calm conditions that support stratification, rather than an indicator for upwelled nutrients.
Dissolved nutrient concentrations can also be difficult to interpret. For example, a near-zero reading of nitrate could reflect: (1) very low supply and low primary productivity, (2) a balance between supply and uptake with high primary productivity, or (3) a shift to recycled forms of nitrogen, like urea. That is, dissolved inorganic nutrients like nitrate and phosphate may be taken up so quickly during bloom events that their concentrations in the euphotic zone are actually very low. Studies in other areas have suggested that terrestrial run-off (Horner et al. 1997) or rapid nutrient recycling (Smith 1978) can provide the nutrient fuel for red tides. This murky relationship between blooms and dissolved nutrients is further complicated by mixotrophy in many dinoflagellate species (Burkholder et al. 2008). Finally, it should be mentioned that the appearance of red tides at the coast could simply reflect the physical concentration of offshore organisms or the advection of a bloom from a remote formation zone (Blasco 1977, Huntley 1982, Horner et al. 1997. Clearly there is considerable disagreement and uncertainty as to the basic mechanisms involved. Thus, our historical inability to predict the red tides in southern California should come as no surprise. Here, we show that despite these challenges, a predictive understanding of chlorophyll blooms in La Jolla is possible by treating them explicitly as nonlinear events. Our initial analysis of the chlorophyll-a time series by itself reveals that it exhibits nonlinear dynamics, but that the endogenous behavior of chlorophyll-a alone cannot explain bloom events well. Preliminary data exploration reaffirms an absence of simple statistical relationships between chlorophyll-a and other variables, but hints that there are important environmental thresholds. Taken together, these findings motivate the application of a multivariate nonlinear approach where environmental variables (i.e., exogenous drivers like wind) are explicitly taken into account. A similar approach was used to understand episodic spikes in damselfish larval supply where a perfect storm of co-occurring stochastic (exogenous) and deterministic (endogenous) factors produced pulses of larval fish (Dixon et al. 1999). Here we show that integrating nonlinear endogenous dynamics with relevant exogenous variables in this way not only gives mechanistic insight into historic blooms, but can also produce out-of-sample forecasts of a recent red tide in La Jolla.

Data
Chlorophyll-a, cell-count, and nutrient time series are compiled from measurements collected at SIO Pier (32°52.02 0 N, 117°15.42 0 W) as part of the SCCOOS Harmful Algal Bloom Monitoring Program. Sampling methods are described in Hatch et al. (2013). Chlorophyll-a and nutrient measurements are typically collected twice weekly on Monday and Thursday. To create time series with fixed time steps, we filter the data to observations that were 3-4 d apart (~1/2 week intervals) or 6-8 d apart (~1 week intervals), and restrict the data to contiguous segments of at least 20 points.
Temperature (°C) and salinity (PSU) are taken at the 0 m (sea surface) and 5 m depth as part of the Shore Stations Program (http://shorestation.ucsd.edu/data/in dex_data.html). Kim et al. (2009) outline these measurement procedures. Density (kg/m 3 ) at both depths is calculated from temperature and salinity using the equation of state for seawater. Previous work defined a bottom temperature anomaly (T Ã b ) to indicate cold-water run-ups as: where T b is the observed water temperature at 5 m depth and T s is the mean year-day surface temperature from 1927-1994. Positive values indicate an absence of coldwater run-up and are therefore ignored by fixing the value at 0. Finally, wind and temperature data from the offshore buoy (station 46025) are taken from NOAA's National Data Buoy Center online database (http:// www.ndbc.noaa.gov/station_page.php?station=46025).
Except in the case of the anomaly defined above, SIO Pier measurements for temperature, salinity, and density are averaged across the two depths (0, 5 m), and all environmental variables are converted to the running mean over the previous 7 d. All analyses but the final out-of-sample test are performed on data collected beginning in 1983 and up through the end of December 2010. The final out-of-sample prediction test uses data collected subsequently (January 2011 through March 2012).

Empirical dynamic modeling
Empirical dynamic modeling is an approach for studying systems that focuses on a dynamic attractor rather than a set of parametric equations (Sugihara and May 1990, DeAngelis and Yurek 2015, Ye et al. 2015b, Deyle et al. 2016. Like parametric equations, an attractor is a description of the rules that govern the system dynamics. Unlike parametric equations, however, attractors can be recovered directly from time series. Consequently, it avoids many a priori assumptions that go into parametric modeling (Ye et al. 2015b), as well as problems that can arise in parameter estimation (Wood and Thomas 1999, Perretti et al. 2012. Below we give a description of the methods used in this paper. The general algorithms are implemented in the "rEDM" package available in the CRAN repository with examples in the associated tutorial; here, we use version 0.4.7 of the package. Attractor reconstruction.-A dynamic attractor (also referred to as "attractor" or "embedding") can be directly recovered from time series observations by re-plotting the data in a multi-dimensional state space (see http://tinyurl.com/EDM-intro from the Supplemental Information of Sugihara et al. 2012).
As an example, consider a simple plankton model that has three state variables: nutrients (N), phytoplankton (P), and zooplankton (Z). The state of the ecosystem at time t is simply the vector of values for N, P, and Z: x t = <N(t), P(t), Z(t)>, which is a point in the three-dimensional state-space. The specific values of N, P, and Z describe the system state. As the values of N, P, and Z change (according to the model equations), the system state also changes, and the sequence of states traces out a trajectory. The geometric pattern formed by the trajectories constitutes a dynamic attractor and is an empirical description of how the system behaves at different locations in the state-space. That is, the attractor describes how the system variables co-vary (positively, negatively, or not at all) as a function of the current location in state space. For example, the attractor could describe how competition between two species spikes when a limiting resource is scarce but is otherwise undetectable when resources are abundant (Deyle et al. 2016). In this way the attractor provides mechanistic insight into how variables interact.
In this study, we have measurements of chlorophyll-a (as a proxy for phytoplankton abundance) and some data on nutrient concentrations, but lack other potentially important ecosystem variables that may act as endogenous system variables such as zooplankton grazer abundances. However, an incomplete set of variables can be offset by the fact that dynamic attractors can be constructed using lags of a single variable as proxy coordinates for the missing system variables (Packard et al. 1980, Takens 1981, Sauer et al. 1991). An attractor constructed in this rather abstract way can be viewed as a transformation (a coordinate transformation) of the original multivariate attractor, and is the basis of univariate (single time series) models.
For the simple NPZ model, this means that instead of representing the state explicitly and multivariately as x t = <N(t), P(t), Z(t)>, the state vector can be represented in terms of lags of phytoplankton alone, Here, E is the embedding dimension (i.e., the number of coordinates or time lags), and s is the time lag between successive coordinates. In practice, E is chosen to maximize forecast skill, and in most ecological applications, sampling is sparse, so that s is by default set to the sampling interval.
Generalizations of this approach permit multivariate representations of the system state using multiple time series (Sauer et al. 1991, Deyle andSugihara 2011). For example, if both nutrients and phytoplankton were observed for the NPZ system, the state vector could be represented with a mixture of lags of P and N as x 00 t = <P(t), P(t-s), . . ., N(t), N(t-s), . . .>. Note that with multiple time series there are many different possible embeddings (even for a fixed E). For example, models with coordinates Such multivariate representations can often provide a better description of the system than using just a single time series and its lags. This is especially true if the system is strongly affected by an exogenous stochastic driver, such as temperature. To be reasonably accounted for, external stochastic drivers must be represented explicitly as coordinates in the reconstructed attractor (Stark et al. 2003).
Prediction.-Reconstructed attractors are validated by their ability to make skillful forecasts. To differentiate "prediction" from "fitting", forecasts are made out-ofsample. This is done by either: 1) dividing the data into a separate "library set" for constructing the attractor and "prediction set" for which predictions are made or 2) by applying leave-one-out, cross-validation over the whole data set (Appendix S1).
Here, we use two forecasting algorithms: simplex projection and S-maps. Simplex projection (Sugihara and May 1990) involves following nearest neighbor analogues (points that are close to each other on the attractor) to see where they moved next. Nearby points on the attractor correspond to similar ecosystem states. Specifically, simplex projection computes a forecast by following the E + 1 nearest neighbors (where E is the embedding dimension of the attractor) and taking a weighted average of their predictions (details in Appendix S1).
The S-map method (Sugihara 1994) involves computing a different locally-weighted linear regression to make each forecast. Points closest to the current state on the attractor are given the highest weight when computing the linear map. The nonlinear parameter h controls the strength of this weighting and hence the degree of nonlinearity or state-dependence. When h = 0, all states on the attractor are given equal weight, and the S-map is equivalent to a global linear model (a standard autoregressive model). As h increases, nearby states are weighted more strongly, and predictions become increasingly sensitive to specific location on the attractori.e., increasingly nonlinear or state dependent (details in Appendix S1).
In this work, forecast skill is primarily measured by Pearson's correlation (q) between observed and predicted values. The correlation coefficient is desirable because of its sensitivity to extreme values (i.e., blooms). Where appropriate, other metrics such as mean absolute error (MAE) and binary classifier performance (e.g., true positive rate, false negative rate) are also used.

Testing for endogenous dynamics
Univariate analysis is used to test if the chlorophyll-a time series contains endogenous nonlinear dynamics. Following Glaser et al. (2014) we use simplex projection to determine the optimal embedding dimension, E, i.e., the number of lags that best unfolds the attractor and maximizes forecast skill (q). This involves using crossvalidation to make forecasts for the whole time series with a prediction step tp = 1 week in the future. The optimal embedding dimension is determined by varying E over the range 1-15 with a time lag s = 1/2 week. Note that an upper limit of 15 is conservatively large (corresponding to a maximum lag of 7.5 weeks) as measurements older than 5 weeks were found not to be relevant for the diatoms in Allen's earlier data (Sugihara and May 1990).
S-maps are used to test for nonlinear dynamics. Nonlinearity is inferred if forecast skill improves for h > 0 (i.e., if the nonlinear model produces better forecasts than the linear equivalent). As above, forecasts are made with cross-validation using time lag s = 1/2 week, prediction step tp = 1 week in the future, with E fixed at 4 (the optimal value determined from simplex projection).

Identifying environmental drivers
Environmental drivers of chlorophyll dynamics are identified using convergent cross mapping (CCM), a nonlinear causality test . CCM operates by measuring how well a driving variable can be cross-mapped (i.e., predicted) from the affected variable. For example, the causal effect of a nutrient is established if its concentration can be predicted from the chlorophyll-a time series. That is, if there is recoverable information about nutrients in the chlorophyll-a time series, then nutrients are causal. The ability to identify causal links without relying on correlation is important because interacting variables in nonlinear systems are often uncorrelated.
Operationally, CCM is a two-part criterion for causality: (1) cross-map skill (q, correlation between observed and predicted values) must be statistically significant and (2) cross-map skill should increase with the amount of data used to make predictions up to a practical limit of error and uncertainty in the data (i.e., "convergence" occurs with increasing "library size"). Statistical significance is determined using a randomization test involving surrogate time series (Tsonis et al. 2015, van Nes et al. 2015. These are created by randomly shuffling the order of time points so that the dynamics are destroyed but linear cross-correlation is preserved. Convergence is demonstrated by computing cross-map skill using random subsamples of different sizes (library size = {20, 40, . . ., 160 points}). For these tests, predictions are made using cross-validation for the time points in the bloom periods only (described next).
We focus the analysis on "bloom periods" to understand the mechanisms specific to these events. These are operationally defined as time series segments containing at least one bloom observation (above the observed 95% quantile of chlorophyll-a, 7.73 mg/m 3 ) together with points up to two observations prior (to capture bloom initiation) and two observations after (to capture bloom termination). Cross-map predictions are made with cross-validation using E = 4 (the optimal univariate E) and s = 1/2 week. Because environmental drivers can act with a time delay, we fit the prediction time (between 0 and À5 weeks) that maximizes cross-map skill (q) (Ye et al. 2015a).

Multivariate analysis
As described earlier, with multiple environmental drivers to choose from, there are many ways to construct multivariate models (Ye and Sugihara 2016). We limit our models to a maximum of four coordinates to keep the analysis tractable and to be consistent with the optimal E determined for the univariate model. Fixing the first coordinate to be the unlagged chlorophyll-a concentration, we consider up to three additional coordinates, where each other coordinate is a 0-, 1-, or 2-week lag of either an environmental variable or of chlorophyll-a. We use all environmental variables that CCM identifies as significant. Choosing combinations of the six causal environmental variables listed in Table 1 generates 1351 possible multivariate models. Note that the lags occur in 1-week intervals. This maximizes the usable data given that some nutrients were only sampled weekly. Just as above, we examined predictions for "bloom periods" that contain points up to two observations before and after days when chlorophyll-a exceeded 7.73 mg/m 3 . Finally, because the variables have different numerical scales, they are normalized to have mean = 0, and standard deviation = 1.
It is important to know whether model performance is consistent (stationary) when selecting multivariate 1424 JOHN A. MCGOWAN ET AL. Ecology,Vol. 98,No. 5 models to use for forecasting. That is, how well does the ranking of models based on their performance on the first half of the data reflect their ranking on the second half ? The constancy of this ranking will depend on the specific performance metric (q, MAE, or a combination thereof). To investigate this question, we divide the data into two equal halves, corresponding to observations before and after 09 June 1996 up to December 2010. We then use S-maps (h = 1 with leave-one-out cross-validation) to compute forecasts for each multivariate model on the first half of the data and compare this performance with out-of-sample forecasts on the second half. The first half of the data series is the "library" or training set for this test. Finally, S-map analysis is applied to the multivariate case to test whether the environmental drivers behave nonlinearly. The procedure is the same as the univariate S-map.

Temporal out-of-sample test
The model selection procedure is simplified in the final out-of-sample test by choosing a single overall h. This reduces the degrees of freedom and thereby reduces the chance of over-fitting in selecting a multivariate model. A simple heuristic for choosing h is to average the predictability of the top 50 embeddings at each h and select a value of h that gives the highest average q (Appendix S2).
To keep the out-of-sample test as transparent as possible we focus in detail on just three multivariate models: one model selected at each E (E = 2, E = 3, E = 4) with the best combined q and MAE rank (the metric found to give the most stationary model performance in the test described above). Model performance is computed using cross-validation on all the bloom-period data up through December 2010 and S-maps with the single value of h determined above. These models are then used to forecast chlorophyll-a from 3 January 2011 to 29 March 2012a more recent set of observations that was not available to the development or selection of the models.

Binary prediction
Binary prediction skill ("bloom" or "no bloom") is evaluated by how well bloom predictions (5% largest predicted values) correspond to bloom observations (5% largest observed values). As an example of a way to tune the sensitivity of such predictions we investigate an adhoc multi-model method where a bloom is predicted when at least a certain fraction, q, of the top 100 models (ranked by q and MAE) predicts a bloom. For example, by setting q to a small value (e.g., q = 10%), a bloom is predicted even if only a small fraction (10%) of the individual models predicts a bloom.

RESULTS
As seen in Fig. 2, concentrations of chlorophyll-a during the modern sampling period (1983-2010) are typically around 1-2 mg/m 3 , with occasional large excursions that can exceed 100 mg/m 3 . Chlorophyll follows an extreme-value distribution (Appendix S2: Fig. S1A), and exhibits non-normality even after logtransformation (Appendix S2: Fig. S1B, C). Because the distribution does not exhibit bimodality or any natural discontinuity that could be used to objectively mark a bloom event, we define a bloom operationally with a statistical threshold, as earlier studies have done (e.g., Kim et al. 2009). In our case a bloom is defined as chlorophyll-a level that meets or exceeds the 95th percentile of all observed values (i.e., >7.73 mg/m 3 ). It is noteworthy that although this definition of a bloom is arbitrary, previous studies in southern California have found a corresponding dominance by dinoflagellates above similar statistically defined thresholds of chlorophyll-a (Cullen et al. 1982). In the modern SIO pier data (1983-), where species composition data are available, we find that the majority of statistically defined blooms are dominated by dinoflagellates (87.3% of samples), and many are specifically dominated by HAB species: Lingulodinium polyedrum (58.7% of samples), Prorocentrum spp. (7.9%), Akashiwo sanguinea (7.1%), Cochlodinium spp. (3.2%), and Gymnodinium spp. (1.6%). This validates the earlier claim that studying the chlorophyll-a time series can give insight into the specific problem of red tides and HABs.

Testing for endogenous dynamics
We examine the entire chlorophyll-a time series (including blooms and non-bloom periods) to see if it is possible to uncover deterministic attractor dynamics using lags of chlorophyll-a alone. Fig. 3A shows the results of simplex projection demonstrating that chlorophyll-a does indeed have an underlying attractor that can explain much of its behavior. Prediction is best when using 4 lags to unfold the trajectories (E = 4). Fig. 3B shows that, with S-maps, the dynamics are nonlinear and state dependent with an overall predictability of q = 0.44. Thus, we can conclude that taken as a whole, chlorophyll-a dynamics are indeed predictable (exhibit a measure of determinism) and are clearly nonlinear.
However if we focus specifically on bloom events we find that predictability is rather different. Using the best univariate S-map model (E = 4 and h = 1.8), predictability is substantially worse for bloom periods, where q = 0.29, as compared to q = 0.50 for non-bloom periods (Fig. 3C). This indicates that the mechanisms driving blooms are not fully captured by lag-coordinates of the chlorophyll-a time series alone. In other words, blooms are not fully described by endogenous chlorophyll-a dynamics, but may depend on exogenous stochastic drivers; thus a multivariate approach that explicitly includes coordinates that are exogenous may be required.

Identifying environmental drivers
In Fig. 4A we see that blooms (open diamonds) are associated with lower water densities, with the largest events occurring during the lowest densities. Interestingly, blooms also occur during periods of low surface nutrient concentrations (Fig. 4B)conditions that can arise up to 3 weeks in advance (Appendix S2: Fig. S2). Thus, low nutrient levels appear to be a precondition for blooms rather than a result. Notice also that blooms  Ecology,Vol. 98,No. 5 tend to occur when the North-South wind speed is low (Fig. 4C). Thus, low wind speed could be another possible precondition for bloom formation. Although these associations point to possible mechanisms, there are many more times when conditions such as low density occur, but no bloom is observed; which suggests that they are at best necessary, but not sufficient predictors. So while these variables clearly have some relation to chlorophyll-a, it is not a simple linear relationship.
Given the evidence that we are dealing with a nonlinear phenomenon, we apply a nonlinear causality test, convergent cross-mapping (CCM) to identify causal environmental drivers. CCM applies even where there is no simple correlation. Table 1 shows cross-map skill for the tested environmental variables during bloom periods. CCM convergence (the 2nd criterion for inferring causation) is shown in Appendix S2: Fig. S4. Note that significant linear correlations are rare, but that CCM identifies many of the potential environmental drivers as being significantly causal. Interestingly, the significant linear correlation with salinity is identified by CCM as in fact not being causal. The significant causal drivers identified by CCM include density, nutrient concentrations, temperature and wind speed. Many of these variables also appear to be important during "normal," non-bloom periods (Appendix S2: Table S2). However there are exceptions. For example, phosphate has a significant effect overall (Appendix S2: Table S2), but not when the data are restricted to bloom periods (Table 1). Conversely, wind appears to be important only during bloom periods. These differences suggest the possibility that bloom dynamics may involve environmental drivers in different combinations or different relationships from those operating during non-bloom conditions. We note that the optimal prediction lag identified in CCM is not an exact indicator of the actual delay in causal effects (Ye et al. 2015a); thus the lag for each variable is determined separately for the multivariate analysis (below).

Multivariate analysis
The six environmental variables that show significant causal effects during bloom periods (Table 1) are used to construct the 1351 multivariate models: SIO Pier density, nitrate, silicate, nitrite, SIO Pier temperature, and wind speed (see methods for details).
With so many possible multivariate models there is a risk of overfitting to models (variable combinations) that produce good in-sample fits to the data but which lack out-of-sample predictive power. We address this possibility by examining whether the performance of the models is indeed stationary. Fig. 5 compares forecast skill for all 1351 multivariate models on the first-half of the data (x-axis) to that obtained on the second-half of the data (y-axis). While both q and MAE individually show stationarity in model performance, the combined metric (the sum of q and MAE ranks) shows more consistency. Thus models selected with this metric are more likely to perform well out-of-sample.
Nonlinear interdependence between environmental drivers was hypothesized early on by Allen (1943), and is suggested in Fig. 4. We test this idea quantitatively by applying the S-map analysis to the 1351 multivariate models. Fig. 6A shows the improvement in predictions (Dq, the increase in q relative to the q at h = 0) as a function of h. All models show improved forecast skill for h > 0, indicating that the drivers are indeed acting nonlinearly (interdependently with respect to chlorophyll). Fig. 6B shows the distribution of forecast skill across all embeddings at h = 8; the value of h chosen to simplify model selection (see methods and Appendix S2: Fig. S3). Note that the distribution has a large tail. Although PREDICTING COASTAL ALGAL BLOOMS 1427 many arbitrary combinations of variables are not very good, some variable combinations can predict blooms extremely well, suggesting that very specific variables or combinations are required to describe the mechanisms driving blooms. This is consistent with the idea that blooms are caused by multiple exogenous factors occurring together, but are less well understood by individual factorsthe nonlinear "perfect storms" hypothesized by Allen (1943) and shown for damselfish larval supply by Dixon et al. (1999). The three models for final testing are based on the combined q and MAE performance over in-sample bloom periods up through December 2010. The best 2dimensional model is <Chl-a(t), v wind (t-2 week)>, the best 3-dimensional model is <Chl-a, v wind (t), v wind (t-2 week)>, and the best 4-dimensional model is <Chl-a(t), Si(t), D(t), Si(t-1 week)>, where Chl-a is chlorophyll-a, v wind is the North-South wind speed, Si is silicate, and D is water density.

Temporal out-of-sample test
As a demonstration of true out-of-sample forecast skill, we compute one-week ahead predictions for data that were not available to any of the above analysis (the period from January 2011 through March 2012 see Methods for details). Again, the three models are constructed and selected on data only up through 27 December 2010. The results given in Fig. 7 show that all models make statistically significant out-of-sample predictions (P < 0.05). The best forecasts come from the 4-dimensional model, showing a very strong performance of q = 0.63. Note that E = 4 was the optimal embedding dimension from the univariate analysis.

DISCUSSION
Previous attempts to understand coastal algal blooms in southern California have not been able to establish For the 1351 multivariate models, the relative rank in forecast skill (q or MAE) is compared for predictions over the first-half vs. second-half of the data. Note that smaller ranks correspond to better performance (higher q, lower MAE). (C) Combining the two forecast metrics (q and MAE) gives a more stationary metric, and is therefore more robust for selecting a model to test out-of-sample. consistent mechanistic relationships between chlorophyll-a and environmental drivers (Kim et al. 2009). In some cases, mirage correlations have appeared for a period of time, only to later vanish (Fig. 1). Our results demonstrate that blooms are nonlinear phenomena and therefore such ephemeral correlations are to be expected. Thus, they require an approach that addresses nonlinear and state-dependent interactions among variables. Here we use an empirical dynamic approach that determines variable relationships as a function of position on an attractor in a state space. We note that this approach differs fundamentally from linear methods that invoke the term "state-space", such as Kalman filters and MARSS (Holmes et al. 2012, Hampton et al. 2013. These treat different system components as having independent effects, and model time-variation as being essentially stochastic. While this can be useful for shadowing slow change, it involves phenomenological fitting that is not suitable for predicting rapid nonlinear change. These linear methods would treat the transitions between normal chlorophyll-a dynamics and bloom conditions as random rather than the result of predictable ecological mechanisms.
Studying the SIO pier data with a nonlinear perspective, however, provides a path for understanding the mechanisms involved by identifying causal variables and their meaningful combinations. In particular, the results are consistent with the hypothesis that blooms in this region arise in conditions that give dinoflagellates a competitive edge over other phytoplanktonnamely, when the water column is stable and surface nutrients are depleted. These conditions generally favor motile dinoflagellates, which can assimilate nutrients at depth, over non-motile phytoplankton, such as diatoms. For example, the three example models selected include either water density or wind speed (Fig. 7). Both are indirect indicators of water column stability (Holmes et al. 1967, Horner et al. 1997. Additionally, Fig. 4B shows that blooms tend to occur when dissolved surface nutrients are very low; and CCM (Table 1) confirms that there is indeed a causal effect of nitrogen on bloom dynamics. While none of the three models include measurements of nitrogen or phosphorus, the inclusion of silicate in the 4-dimensional model suggests that nutrient concentrations could also be informative when predicting blooms. Unlike nitrate or phosphate, high to moderate (rather than low) concentrations of silicate are associated with bloom formation (Appendix S2: Fig. S5). In this region it is well-known that high concentrations of silicate at the water surface can indicate a water mass history involving recent nutrient enrichment, and that silicate is a better indicator of nutrient history than dissolved nitrogen or phosphorous which are rapidly accumulated into biomass (Holmes et al. 1967  1 9 9 1 1 9 9 2 1 9 9 3 1 9 9 5 1 9 9 6 1 9 9 7 1 9 9 8 1 9 9 9 2 0 0 0 2 0 0 5 2 0 0 6 2 0 0 7 2 0 0 8 2 0 1 0  Ecology,Vol. 98,No. 5 concentration during blooms could also be a general indicator of ocean conditions that favor dinoflagellates over diatoms (e.g., stratification), as diatoms would have depleted the silicate if they were present. Thus, it is plausible that silicate could be an indirect indicator of stratification as well as of nutrient history. It is also worth noting that the analyses show no evidence for an effect of rainfall on blooms, which suggests that nutrient runoff, although an important mechanism for blooms in other areas (Horner et al. 1997), is less important for blooms at Scripps Pier and within southern California. The importance of water column stability as a predictive indicator for blooms becomes more apparent when we examine the performance of the 1140 E = 4 models (ranked by q and MAE) across three categories: (1) variable combinations that include only nutrient variables (i.e., nitrate, nitrite, or silicate); (2) variable combinations that include only indicators of water column stability (i.e., temperature, density, or wind speed); and (3) variable combinations that include both. Here, models that include only lags of nutrient concentrations (in addition to chlorophyll-a) are substantially worse at predicting individual bloom events than models that include indicators of water column stability (P < 0.0168) or models that include variables suggestive of both mechanisms (P < 0.0010) (Fig. 8A). In other words, it appears that models that include measures of water column stability are more predictive than models that do not. The bottom line is that both water column stability and nutrients are causal in bloom formation but that variables related to water column stability are stronger predictors than those related exclusively to nutrients.
While each of the three example models predicts high chlorophyll during some portion of the large bloom in 2011 (Fig. 7A), none capture everything. When looking historically through December 2010, we find that each model predicts only a portion of the observed blooms. In a sense, each model is a caricature providing a particular viewpoint on bloom dynamics based on different indirect indicators of the mechanism. This is illustrated in Fig. 8B which shows that among the 84 historical bloom days, 80 are successfully predicted by at least one of top 100 E = 4 models but that no single model predicts all of the blooms well. This raises two interesting possibilities: (1) the environmental variables studied are imperfect in the sense that different variables are better proxies for the same basic phenomena at different times (e.g., either low wind speed or low surface density can be a better proxy for water column stability at different times (Wyatt and Horwood 1973)) or (2) different bloom events simply have different underlying specific causes (e.g., because of differences in species composition).
Sorting out these two possibilities sets an agenda for future work, and additional and more-direct oceanographic measurements of the physical environment offer a potential path. Specifically, newly underway highresolution physical measurements in conjunction with ongoing observations from other shore stations along the California coast present an opportunity to better understand how well the currently available indicators reflect mechanisms (Lucas et al. 2011)e.g., how well wind, temperature, and density act as proxies for water column stability.
Alternatively, even without additional data, predictions could be improved by developing multi-model approaches that combine bloom predictions across separate models. Fig. 9 demonstrates a simple example for doing this where a bloom is predicted if at least a certain fraction of the individual multivariate models predicts a bloom (see methods for details). This increases the overall model sensitivity for predicting blooms, and has the desirable characteristic that the true positive rate increases faster than the false positive rate. Despite having more false-positives, higher sensitivity may be valuable to management scenarios where critical values of chlorophyll-a concentration incur large costs (e.g., power plant intakes, beach closures, delaying military training exercises or threatening protected wildlife).
Despite the challenge of understanding and predicting rare events arising from biological dynamics in a stochastic environment, we have shown how the equation-free framework of EDM can improve our understanding of coastal algal blooms and their predictionan analysis made possible by the long-term monitoring at the SIO pier and subsequent long time series. The combination of better-resolved data to test these mechanistic hypotheses and more sophisticated analytical tools to make use of these data is the obvious path forward for future research. Thus, as the first century of studying of these phenomena in Southern California comes to a close (1) we have a general understanding that blooms arise as nonlinear perfect storms of environment and biology (stochastic chaos), (2) we have field evidence for the

Constant Predictor
Model sensitivity 1% 3% 5% 10% 25% 50% FIG. 9. ROC curve for a simple multi-model approach (see methods for details). Overall, as the sensitivity of the model increases, the true positive rate increases more quickly than the false positive rate. The constant predictor (prediction of Chl-a(t + 1 week) is just Chl-a(t)) is shown for comparison.

May 2017
PREDICTING COASTAL ALGAL BLOOMS 1431 importance of water column stability and nutrients, (3) and finally, we have the ability to predict blooms (albeit imperfectly) as well as a path toward better prediction through new measurements and new methods of analysis.