Detecting species at low densities: a new theoretical framework and an empirical test on an invasive zooplankton

Species often occur at low abundance and, as a result, may evade detection during sampling. Here, we develop a general theoretical model, based on sampling statistics, of how the probability of detecting a species in a discrete sample varies with its mean density at that location. We show that detection probability at low densities can be approximated with a simple one-parameter saturating function: The probability of detection (P) should increase with the mean number of individuals in a discrete sample (N) as P ~ aN/(1 + aN). We further show how the parameter a will be affected by species spatial aggregation, within-sample observability, and gear catchability. We use the case of the invasive zooplankter, spiny water flea (Bythotrephes longimanus), to demonstrate that this theoretical model fits the empirical pattern of detectability. In Lake Mendota, WI, Bythotrephes went undetected despite rigorous long-term zooplankton monitoring (possibly as long as 14 yr with 15 sampling dates/yr). Using our modeling framework, we found that the likelihood of detecting Bythotrephes was low unless peak annual densities exceeded 0.1 individuals/m³ over multiple years. Despite using models based on Bythotrephes ecology to identify ways to increase detectability, detection probabilities remained low at low densities, such that cases of missed detection such as Lake Mendota are possible despite rigorous monitoring effort. As such, our theoretical model provides a simple rule of thumb for estimating the sampling effort required to detect rare species when densities are expected to be low: The minimum number of samples (S) required to reliably detect a species can be approximated as S ~ (1 + N)/N.


INTRODUCTION
Species frequently occur at low local densities over the majority of their geographical distributions (Brown 1984, Sheldon 1988, Brown et al. 1995). Most ecological sampling methods detect species by intensive sampling of small areas of a larger region (Magurran and McGill 2011). Such sampling regimes can fail to detect species that are present, as a given sample may not include any individuals of a species. This issue is compounded if the sampling approach does not capture every individual in a given volume (imperfect catchability), or if it is possible to miss individuals captured in a sample or subsample (imperfect observability; see MacKenzie et al. 2017 for a review of recent work on factors affecting detectability when estimating occupancy). The probability of detecting at least one individual in a sample given that the species occurs in a region (i.e., detectability) is one of the most important parameters to understand when trying to measure occupancy rates of a species across a landscape (MacKenzie et al. 2002, 2017, Royle and Nichols 2003, Martin et al. 2005) and for invasive species monitoring programs where detecting invasion early on is critical (Hui et al. 2011, Epanchin-Niell et al. 2012, Whittle et al. 2013). Royle and Nichols (2003) identified local abundance (and therefore density) as the key factor in determining detectability, as species that are locally rare are also unlikely to be detected by a survey. To model the functional relationship between abundance and detectability, Royle and Nichols (2003) proposed assuming that abundances are Poisson-distributed between sampling locations, an approach followed by a wide range of papers examining the connection between abundance and detectability (Epanchin-Niell et al. 2012, Kéry and Royle 2016). The Poisson distribution, however, assumes that individuals are randomly distributed across the region of interest.
Any factor that leads to species being aggregated at spatial scales as large as or larger than the scale of the sampling scheme will affect the shape of the density-detectability relationship; the more aggregated the species is, the more likely it is for a given sample to miss an aggregation (assuming a random sampling scheme across the region).
Several different approaches have been suggested to allow for the effects of aggregation on the density-detectability relationship. Royle and Nichols (2003) proposed using a negative binomial distribution for abundances within sampling sites instead of a Poisson distribution. Delaney and Leung (2010) used logistic regression as a flexible empirical model of how detection probability varied with density (allowing for varying effects of aggregation), and McCarthy et al. (2013) developed a theoretical model of how detectability should vary with search effort and showed that the mean time to first detection should decrease linearly with density for randomly distributed species and slower than linearly for aggregated species. However, all of these approaches require an aggregation parameter that needs to be empirically estimated and will depend on the species, sampling gear, volume sampled, and sampling design. Further, any given model of the density-detectability curve will typically only be valid for some range of densities, and it is unclear whether the relationships suggested above will remain valid at very low densities. Given that invasive species can often increase very rapidly after spending substantial amounts of time at low density (Crooks 2005), it is important for any sampling program to account for the probability of missing species that are present at low densities. This paper focuses on the question of how detectability depends on density at very low densities, and how likely it is for a monitoring program to miss rare species.
The first part of this paper develops a theory of how species detectability should vary with density and reveals that the probability of detecting a species present in a region follows a simple saturating function of density in the limit of low densities, regardless of the spatial distribution of the species, as long as sampling effort is allocated randomly across the region. We further show how this function varies with species aggregation and density-independent catchability and observability. The second part of this paper applies this theory to the detection of the invasive zooplankter, spiny water flea (Bythotrephes longimanus), to demonstrate that this model matches observed patterns in species detectability. Finally, we use this framework to improve our understanding of Bythotrephes detection, providing an example of how this modeling framework can aid in designing surveillance programs.

THEORY: THE RELATIONSHIP BETWEEN DENSITY AND DETECTION
Invasive monitoring programs often involve spatial subsampling-taking samples from small areas to infer invasive presence or absence in the larger region. However, given that individuals are discrete and only found in a finite area, a given sample may not capture any individual of that species, even when it is present in the larger region (e.g., a lake or forest). As such, detection at low densities becomes a question of probabilities: How often will a species at a given density be found in a sample? The number of individuals found in a sample, and thus the probability of detection, will depend on the mean density per unit area of the species in the region (Royle and Nichols 2003), the spatial distribution of the species among locales within the larger region (e.g., sampling sites within a lake or forest; McCarthy et al. 2013), sampling effort in the locale, the volume and geometry of the sampling gear (Pedersen and Guichard 2016), and any habitat effects or behavioral responses of the species that change the likelihood of an individual getting caught by a given gear (i.e., catchability; Walters and Martell 2004). The probability of detecting individuals that were caught by a given gear will in turn depend on the observability of each individual in a given sample. For monitoring approaches that depend on passive observation, such as observers counting individuals observed along transects, only observability, not catchability, will affect detectability and will depend on habitat characteristics and observer ability (Tyre et al. 2003). Both catchability and observability are measured on a per-individual basis but contribute to detectability; even if local abundance is high, detectability may be low if each individual in the population has low catchability or observability.
We first focus on how detectability depends on mean density at low densities and then on how this result will depend on patterns of spatial aggregation, catchability, and observability of a focal organism.
The distribution of a population of individuals at any given time spread out across an area of space can be described as a spatial point pattern (Illian et al. 2008), and we can view a given discrete sample as counting the number of individuals from that point pattern found in the sampled volume (ignoring imperfect gear catchability and observer error for now). If all the individuals in the region are randomly distributed throughout the whole regional volume (so the point pattern of individuals in the water corresponds to a realization of a homogeneous spatial Poisson process; Illian et al. 2008, see also Berec et al. 2015), then the number of individuals found (N) in a sample of volume V should follow a Poisson distribution with a mean of N = V·λ, where λ is the mean density of individuals per unit volume in the region. For instance, when sampling zooplankton with net tows, increasing net diameter or tow length increases N by increasing the total volume of the tow (V = π × r² × l; r = net mouth radius and l = tow length). The probability (P) of detection in a tow is then just one minus the probability of observing zero individuals from a Poisson distribution with a mean N, or 1 − e^(−N) (Sharov et al. 1998, Bogich et al. 2008, Epanchin-Niell et al. 2012). As V increases with net diameter, N also increases, improving the probability of capturing at least one individual in a tow. Note that N will only increase linearly with tow length (and thus sample volume) if the species is also randomly distributed vertically in the water column, or tows are conducted horizontally at a single depth; more complex vertical distributions will result in a more complicated relationship between l and N, and if a species does not occur below a certain depth, sampling below this depth will not increase detection probability.
For sampling programs where all samples are taken from a single depth, this issue can be avoided by instead focusing on abundance per unit of surface area (instead of per unit volume), that is, looking at the number of individuals per unit area integrating across all sampled depths. In this case, λ would equal the mean number of individuals per unit area found above depth l, and N would increase proportionally to λ times sample area. The theory developed here applies equally well in this case.
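As a minimal numerical sketch of these quantities (in Python; the net size and density used here are illustrative values, not measurements from the monitoring program):

```python
import math

def tow_volume(radius_m, length_m):
    """Volume of water swept by a cylindrical net tow: V = pi * r^2 * l."""
    return math.pi * radius_m ** 2 * length_m

def poisson_detection_prob(density_per_m3, volume_m3):
    """Probability of catching at least one individual, P = 1 - exp(-N),
    where N = V * lambda is the expected count per tow."""
    expected_count = volume_m3 * density_per_m3
    return 1.0 - math.exp(-expected_count)

# Illustrative values: a 50-cm diameter net (r = 0.25 m) towed 10 m
V = tow_volume(0.25, 10.0)          # about 1.96 m^3 swept per tow
P = poisson_detection_prob(0.1, V)  # density of 0.1 individuals/m^3
```

At 0.1 individuals/m³, a single such tow detects the species less than 20% of the time, which motivates the sample-size reasoning developed below.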

Detection probabilities at low densities
As we are interested in how detection probability varies at low density, we examined how the probability of detection behaved at low expected catches (N < 1 individual per sample). For any relationship, when density goes to zero so does the probability of detection, yet we need to find a reasonable definition of low density that also allows us to find a useful functional relationship between density and detection probability. If we instead look at the relationship between the log (of base b) of mean individuals per sample (log_b(N)) and the logit of detection probability (ln(P/(1 − P))), we can approximate the relationship between log-density and logit-detection for a given distribution by using a linear Taylor approximation (Bolker 2008) around a given point (log_b(N_0)) and look at the limit of this approximation as log_b(N_0) goes to −∞ (corresponding to the expected count per sample going to zero). If the limit of the slope and the intercept of the linear Taylor approximation go to a constant value as log_b(N_0) goes to −∞, then the Taylor approximation will become a better and better approximation of the true function at low densities (see Appendix S1 for details of this approach).
When N is small, the function relating the logit of the probability of detection to the log (base b) of N is well approximated by a log-linear relationship with a slope of ln(b) and an intercept of zero for the Poisson distribution (see Appendix S1: Fig. S1 for details on this derivation). When looking at the natural log of N (ln(N)), this simplifies to logit(P) ≈ ln(N), or, after applying the inverse logit transform:

P ≈ N/(1 + N)    (Eq. 1)

This approximation has the useful properties of always being a true probability (bounded between zero and one) and of going to zero as the expected count per sample goes to zero and to one as the expected count goes to infinity; when N equals one (i.e., an expected count of one individual per sample), this approximation implies that detection probability should equal 0.5. Further, we can estimate the mean number of samples (S) needed to detect an individual as S = 1/P ≈ (1 + N)/N (Appendix S1).
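The quality of this approximation at small N can be checked directly (a sketch; the function names are ours):

```python
import math

def exact_P(N):
    """Exact Poisson detection probability: 1 - exp(-N)."""
    return 1.0 - math.exp(-N)

def approx_P(N):
    """Low-density approximation (Eq. 1): N / (1 + N)."""
    return N / (1.0 + N)

def mean_samples(N):
    """Mean samples needed for one detection: S = 1/P ~ (1 + N)/N."""
    return (1.0 + N) / N

# The two curves converge as N -> 0, and approx_P(1) = 0.5 exactly
errors = {N: exact_P(N) - approx_P(N) for N in (1.0, 0.1, 0.01)}
```

The absolute error shrinks rapidly as N declines, and already at N = 0.1 a species requires on the order of mean_samples(0.1) ≈ 11 tows for one expected detection.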
This approximation may seem to be unnecessary, as the exact probability of detecting rare individuals for the Poisson distribution (P = 1 − e^(−N)) is also quite simple and straightforward to understand (e.g., as used in Royle and Nichols 2003, Epanchin-Niell et al. 2012). To demonstrate why this approximation is useful, we next show that this relationship (or a simple extension to it) should hold at very low densities for any spatial distribution of a species across the region, as long as all locations within the region have an equal chance of being sampled by the monitoring program.
Eq. 1 was derived using the assumptions that all individuals are randomly distributed across the larger region, that catchability is perfect (so that if an individual of the focal species is present in the swept volume of water, it will get captured by the gear), and that observation error is zero, so that if a sample catches at least one individual, that sample will always get observed as a detection. Violations of the assumption of random distribution will generally be driven by aggregation in the target species. Here, we define aggregation as any mechanism that leads to individuals being on average closer to one another in space at a given point in time than would be expected given the mean density in the region (Illian et al. 2008; pure variation in density over time does not need to be treated as a separate form of temporal aggregation in this approach; instead, this can be modeled as time-varying density, λ_t). Aggregation may be due to spatial variation in habitat and environmental conditions in the region or to conspecific aggregation due to factors such as low dispersal following reproductive events or conspecifics actively moving toward one another.
There are countless ways that species may aggregate in space, and typically how aggregated a species is will depend on the spatial scale of measurement, which will correspond to a different probability distribution of counts among samples. However, for the purposes of detection, the only probability that matters is the probability of observing more than zero individuals: P. Since we have been assuming that the expected number of individuals per sample is defined as N, P is uniquely defined as P = N/E(n|n > 0, λ, V, M), where E(n|n > 0, λ, V, M) is the expected number of individuals found in a sample of volume V that has at least one individual, for regional density λ and M, an index specifying all the other ecological conditions affecting the distribution of individuals in space (see Appendix S1 for details). Therefore, the fundamental question for any given distribution when considering the behavior of detectability in the limit of low densities is as follows: As regional density declines, how does the mean number of individuals across all samples with at least one individual present change, when measured at the spatial scale of the sampling gear? There are two possibilities: This mean may decline to one as regional density declines (this will happen, e.g., if individuals preferentially inhabit specific habitats but move independently of one another, so at low regional densities, there are few individuals even in desirable habitats), or this mean may asymptote to a value greater than one (this might occur with colonial or schooling species). Appendix S1 describes the proof for how the density-detectability relationship changes in both these cases. The main results from Appendix S1 follow.
In the first case, when aggregation declines with regional density, the relationship derived for the Poisson distribution will hold in the limit of low densities: P ≈ N/(1 + N). This is because when densities get very low, it becomes increasingly difficult to distinguish any given distribution from a binomial distribution, as most samples, even those occurring in areas of high relative abundance, will have at most one individual present. This does not imply that there is no benefit to concentrating sampling efforts on areas of relatively high abundance within a region (doing so will in effect increase the average density λ and thus N in the sample); however, it does imply that, absent specific knowledge of where to sample in a given region, Eq. 1 is a reasonable approximation to use to estimate the detection effort required for a sampling program.
In the second case, where aggregation (at the spatial scale of the sampling gear) stays constant as density declines:

P ≈ N/(d + N)    (Eq. 2)

where d equals the expected number of individuals in a sample, conditional on finding at least one individual. This will occur for any species that tends to aggregate in groups regardless of average density, as would occur with clonal species. This implies that aggregative behavior, to the extent that it stays constant as density declines, will effectively decrease the average density per sample, and thus the detectability.
The other major factors that can affect detectability are catchability of the target species by the sampling gear and observer error. If catchability of individuals is not perfect, so a given sample taken from a volume might not capture all the individuals in the volume, or if observers are not guaranteed to see every individual in a sample, then on average the detectability at any given density will be lower than the above theoretical limit. If catchability or observability is density-independent (i.e., if the probability of a given individual getting caught by the gear is not affected by local density in the area being sampled, or if the probability of observing an individual in a sample is independent of the number within that sample), then this will affect the regional density-detectability relationship the same way that fixed-size aggregation does:

P ≈ QON/(1 + QON)    (Eq. 3)

Here, Q is the probability of catching an individual if it is in the sampled area, and O is the probability of observing an individual in a given sample. Density-dependent catchability and observability will lead to more complex relationships between regional density and detectability, but the line defined by Eq. 1 will still be an upper bound on this relationship, in the limit of low λ. Therefore, we can think of Eq. 1 as a best-case relationship; no monitoring program will have a greater chance of detecting an invader at low densities than that given by Eq. 1, in the absence of a sampling program designed to target areas or time periods of high abundance for that species. Equivalently, when densities are low, the mean number of samples required to detect an invader at density λ with gear of volume V is at least S = (1 + λV)/(λV) samples (Appendix S1).
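These best-case bounds translate directly into sampling-effort estimates. The sketch below combines Eqs. 1-3 in one function; note that the combined form, with both d and Q·O applied at once, is our own extrapolation rather than an equation stated in the text:

```python
def detection_prob(lam, V, d=1.0, Q=1.0, O=1.0):
    """Approximate per-sample detection probability at low density.
    lam: mean regional density; V: sample volume; d: fixed aggregate
    size (Eq. 2); Q, O: per-individual catchability and observability
    (Eq. 3). With d = Q = O = 1, this reduces to Eq. 1."""
    n_eff = Q * O * lam * V
    return n_eff / (d + n_eff)

def min_samples(lam, V, d=1.0, Q=1.0, O=1.0):
    """Mean number of samples for one expected detection, S = 1/P."""
    return 1.0 / detection_prob(lam, V, d, Q, O)
```

At low densities, halving catchability or doubling aggregate size each roughly doubles the sampling effort required, which is the practical meaning of Eqs. 2 and 3.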

Detection probabilities at higher densities
This theoretical analysis focused on how detectability via sampling depended on the species' density in the limit of low density. The next question is then how well this approximation works to estimate detectability at higher densities. To determine this, we compared our approximation of detection to the expected probability of detection across a range of population densities for three different distributions:

1. The Poisson distribution (representing random variation of individuals in space).
2. A negative binomial distribution with mean N and size parameter b equal to 2, 1, 0.5, and 0.25, representing four levels of aggregation for a population whose aggregation declines with density (the variance of the negative binomial equals N + N²/b; b is often below one for ecological samples, indicating relatively strong over-dispersion; Bolker 2008).
3. A population where individuals are aggregated into groups of fixed size d ≥ 1, with the number of these groups caught following a Poisson distribution. We refer to this as the cluster-aggregated distribution. This distribution approximates a colonial species, where individuals are found in groups even when regional densities are low. The probability of detection for this distribution will be equal to that of a Poisson distribution with mean λV/d (when d = 1, this distribution reduces to a Poisson). We tested three levels of aggregation for this distribution: d = 2, d = 4, and d = 8.
We plotted both the exact probability for each distribution and the approximate probability for all distributions to observe the relationship between density and detection at higher densities (Fig. 1). In the negative binomial case (Fig. 1B), the log-linear approximation under-estimates the probability of detection when b > 1 (the Poisson distribution, plotted in red, corresponds to a negative binomial with b = ∞) and over-estimates the probability of detection when b < 1, but the magnitude of both under- and overestimation diminishes rapidly at mean densities of less than one individual per sample. When b = 1, the approximation is exact for all densities (derivation not shown). For the cluster-aggregated distribution (Fig. 1C), the approximation underestimated the probability of detection for all levels of aggregation, but this error is relatively small when the mean number of individuals per tow is <1, and this approximation error declines rapidly with decreasing population density and increasing levels of aggregation.

APPLICATION: IMPROVING DETECTION OF AN AQUATIC INVASIVE SPECIES, SPINY WATER FLEA
We use the case of the detection of the invasive predatory zooplankter, spiny water flea (Bythotrephes longimanus), to explore empirically how detectability should vary with density. Nonnative species are often found at low densities (Labra et al. 2005, Hansen et al. 2013), but detecting nonnative populations at low densities is resource-intensive (Rew et al. 2006, Harvey et al. 2009, Hoffman et al. 2011), demonstrating the need to better understand the factors influencing their detection. Reports of adverse effects of Bythotrephes have been accumulating since its detection in the Laurentian Great Lakes (Kerfoot et al. 2016, Walsh et al. 2016a), but its detection has proven challenging, often occurring well after Bythotrephes establishment and population growth (Walsh et al. 2016b, Branstrator et al. 2017). In the case of the well-monitored Lake Mendota, Wisconsin, USA, Bythotrephes persisted and avoided detection for perhaps a decade or more prior to erupting to high densities (>150 individuals/m³) in fall 2009, an eruption that drove a large decline in lake water clarity (Walsh et al. 2016a, b). As Lake Mendota is monitored roughly 15 times per year by the North Temperate Lakes Long Term Ecological Research Program (NTL-LTER), this case highlights the challenges of invasive species detection.
In the following section, we apply our theoretical analysis to the problem of Bythotrephes detection and model the probability that NTL-LTER monitoring would have detected a low-density population under varying scenarios of Bythotrephes population density, seasonal population dynamics, year of establishment, and sampling strategy. From these simulations, we highlight how understanding the factors that influence Bythotrephes detection can be used to improve invasive species monitoring and surveillance programs.

Empirical relationship between density and detection
To test the effectiveness of the theoretical approximation above, we compiled data from zooplankton sampling conducted from 2010 to 2016 in Lake Mendota. Sampling occurred at five pelagic sites (one site at the maximum depth, Z_max = 25 m, and four sites at 15 m). Zooplankton samples were taken using a 50-cm zooplankton net with 150-μm mesh, which has a larger sampling area than the smaller 30-cm NTL-LTER zooplankton net. Vertical tow lengths extended from 2 m above the lake bottom to the surface and thus varied with site maximum depth. We counted each sample in its entirety for Bythotrephes. We estimated population density per sampling event as the mean number of Bythotrephes found across all tows on a given day, and we used the presence or absence of Bythotrephes in each tow as the response variable in the model of the relationship between detection and the mean number of individuals per sample.
We modeled the relationship between density and detection using a generalized linear model (GLM, fit using the glm function in R; R Core Team 2016) with binomial errors and a logit link function, regressing Bythotrephes presence or absence in each net on log10-transformed population density after removing sampling days where no Bythotrephes were found in any tow (as we could not estimate densities for these days). While removing sampling days without any observations can bias the observed detection-density relationship (since the lowest frequency of detection possible on a given day is then one divided by the number of nets sampled that day), this bias should not be strong for any day with at least four samples taken (see Appendix S2). After filtering, there were 735 net tows included in the regression. We evaluated model fit by calculating randomized quantile (RQ) residuals (R package statmod; Dunn and Smyth 1996, Giner and Smyth 2016) for the fitted detection model. The RQ residual approach uses the fitted model and specified error distribution (in this case, the binomial) to transform the observed data into residuals that should be independent and identically distributed following a normal distribution with a mean of zero and a standard deviation of one, as long as the model correctly captures the data-generating process. We used RQ residuals because it is simpler to detect unmodeled patterns with them than with raw residuals (Dunn and Smyth 1996). We compared the observed RQ residuals to RQ residuals drawn from simulated data, by randomly drawing from a Poisson distribution with N equal to what would be expected from the observed raw data (e.g., if observed λ equals 0.1 individuals/m³, then N is roughly equal to 0.2 individuals expected in a single tow), then fitting a binomial GLM regressing the simulated probability of detection in a sample on log-density.
Since our theoretical results only lead us to expect the linear relationship between logit-detection and log-density to hold at low densities, the fitted model is expected to fit more poorly at high densities. By comparing observed RQ residuals to simulated RQ residuals, we could determine whether any systematic patterns of misfit between the model and RQ residuals for the field data were larger than we would expect from random sampling.
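This fitting procedure can be mimicked outside R. The sketch below (in Python with NumPy rather than R's glm; the simulated densities and variable names are ours) generates Poisson-driven detections at low densities and fits the binomial GLM by iteratively reweighted least squares, the algorithm glm() uses internally; the recovered slope should sit near the theoretical ln(10) ≈ 2.3, drifting upward as higher densities enter the fit:

```python
import numpy as np

rng = np.random.default_rng(1)
n_tows = 4000

# Simulate expected counts N at low densities (10^-2 to 10^-0.5 per tow);
# detection is a Bernoulli draw from the exact Poisson probability.
log10_N = rng.uniform(-2.0, -0.5, size=n_tows)
N = 10.0 ** log10_N
detected = (rng.random(n_tows) < 1.0 - np.exp(-N)).astype(float)

# Binomial GLM with logit link, fit by iteratively reweighted least squares
X = np.column_stack([np.ones(n_tows), log10_N])
beta = np.zeros(2)
for _ in range(25):
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))       # inverse logit
    w = mu * (1.0 - mu)                   # IRLS working weights
    z = eta + (detected - mu) / w         # working response
    beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * z))

intercept, slope = beta  # theory: slope near ln(10), intercept near 0
```

The upward drift of the slope above ln(10) at the high end of the density range is exactly the misfit at higher densities that the RQ residual comparison is designed to detect.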
Simulating a low-density, seasonally variable Bythotrephes population
We were interested in modeling factors influencing Bythotrephes detection at low densities (e.g., <1 individual/m³) that are often unobservable under standard sampling methods (e.g., 3-5 50-cm zooplankton net tows per day, as used here). As Bythotrephes populations begin each year by hatching from the egg bank, then advance through multiple generations in a year, Bythotrephes densities are highly seasonal in lakes. This seasonality is likely to affect the probability of detection. Therefore, we simulated low-density populations with seasonal variability.
To reproduce the average seasonal population dynamics of a Bythotrephes population under varying low-density scenarios (e.g., the low-density Bythotrephes population prior to its detection), we fit a temporally varying model to observed Bythotrephes densities from multiple lakes for which we had abundance time series: Lake Mendota (obtained from sampling here from 2010 to 2016 and NTL-LTER data), Trout Lake (NTL-LTER data), Lake Monona (sampling here and NTL-LTER data), Lake Waubesa (sampling here), Lake Kegonsa (sampling here), Gile Flowage, and Stormy Lake in Wisconsin, as well as Lake Gogebic in Michigan's Upper Peninsula (data from these final three lakes were obtained from WDNR, courtesy of Carol Warden). These lakes span a range of seasonal Bythotrephes dynamics, from Lake Mendota, where the population appears later and crashes during high summer surface water temperatures, to Gile Flowage (Iron County, Wisconsin, USA), a northern Wisconsin reservoir, where Bythotrephes peaks earlier and persists longer than in any other lake surveyed in Wisconsin. We modeled observed population densities in each lake as a function of day of the year using generalized additive models with a thin plate regression spline smoother and a Gaussian error distribution. We estimated a single seasonal smoother plus a separate smoother for each lake, using a factor smooth (fs) basis function (Wood 2017) to allow for inter-lake heterogeneity in seasonal dynamics. Smooth terms were selected by the generalized cross-validation criterion (package mgcv in R; Wood 2015). We were primarily interested in highlighting the general seasonal patterns across study lakes, but with varying population densities. Therefore, we relativized the seasonal density curves to their maxima to obtain curves ranging from 0 to 1. Relativizing the seasonal curve allowed us to multiply by a value (λ_max = peak annual population density) to scale the same seasonal population dynamics to different low-density scenarios.
This approach allowed us to test how detectability might vary both across seasons and with varying maximum abundances, assuming the relative seasonal pattern of Bythotrephes does not differ with maximum density.
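The relativize-and-rescale step can be sketched as follows (the Gaussian-shaped curve below is a hypothetical stand-in for the fitted GAM smoother, not the actual Lake Mendota curve):

```python
import numpy as np

days = np.arange(1, 366)

# Stand-in seasonal curve: a late-summer peak around day 240,
# loosely mimicking a seasonal Bythotrephes density trajectory
seasonal = np.exp(-0.5 * ((days - 240) / 30.0) ** 2)

# Relativize to [0, 1], then rescale by a peak-density scenario value
relative = seasonal / seasonal.max()
lam_max = 0.1                  # peak annual density (individuals/m^3)
density = lam_max * relative   # lambda(t) for this low-density scenario
```

The same relative curve can then be multiplied by any λ_max to generate a whole family of low-density scenarios sharing one seasonal shape.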

Estimating the probability of detecting a Bythotrephes population
We used our approximation (Eq. 1; P ≈ λV/(1 + λV)) to estimate the probability of detecting Bythotrephes for four net diameters (representing four different sampling schemes): 15-cm (a commercially available net, common in citizen science; V = 0.017·d m³), 30-cm (net used by NTL-LTER; V = 0.070·d m³), 50-cm (net used here; V = 0.19·d m³), and 100-cm (net used to capture macrozooplankton like mysids; V = 0.78·d m³), across a range of possible invader densities (λ) and a constant tow length of d = 10 m (based on the fact that Bythotrephes is strongly phototactic and primarily lives in the upper water column during the day, so 10-m tows would effectively sample the entire population; J. R. Walsh, unpublished data). We assumed the probability of detection was defined by the low-density unaggregated approximation (Eq. 1).
To estimate the probability of detecting Bythotrephes in Lake Mendota under the NTL-LTER monitoring protocol, we used the 211 sampling events between 1995 and 2008. On average, NTL-LTER sampled once every other month from January through March, twice per month from April through August, and between one and 1.5 times per month from September through November (no sample trips were recorded in December from 1995 to 2008).
We simulated the low-density Bythotrephes population as described above using relativized GAM model estimates of the seasonal curve from Lake Mendota. This produced an estimate of the seasonal dynamics exhibited by a low-density population that we could use to understand seasonal changes in detectability. We then simulated the overall probability of repeatedly failing to detect the population on each sampling trip on the NTL-LTER sampling schedule from the year of its invasion through 2008, the year before it was detected at high densities (e.g., >150 individuals/m³). We conducted these simulations using both the empirically fitted log-linear density-detection curve for the 50-cm net and the theoretical density-detection curves (Eq. 1) for the 15-, 30-, 50-, and 100-cm nets.
We calculated the overall probability of detecting a population in each time span as one minus the product of the annual probabilities of detection failure for each possible year of invasion (i) to the year Bythotrephes was actually detected (retroactively in a sample from 2008), with initial invasion years ranging from 1995 to 2008 (Eq. 1). For example, if a low-density population invaded in 1995 and the annual probability of detecting that population was 0.1, then the probability that the population would have been detected over the 14-yr period (1995-2008) would be 1 − (1 − 0.1)^14 ≈ 0.8. The annual probability of detecting the population was calculated as one minus the product of the single-trip probabilities of detection failure: one minus the probability of detection given a population density (N) on the day of that trip (j; Eq. 2).

P(detection by 2008) = 1 − ∏_{y=i}^{2008} ∏_{j ∈ trips(y)} (1 − P_j)
Finally, we conducted the detection probability simulation for a range of sampling schedules (altering the annual frequency and timing of j), using the empirical (50-cm net) and theoretical density-detection relationships under two different seasonally dynamic scenarios (early and late season population peaks) to explore the association among sampling strategy, population seasonality, and Bythotrephes detection.
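The failure-product calculation described above can be sketched as follows; the seasonal density values and the 14-trip schedule are hypothetical stand-ins for the GAM-derived curve and the NTL-LTER calendar.

```python
import math

def trip_detection_prob(k, V):
    # Eq. 1: per-tow probability of capturing at least one individual
    return k * V / (1 + k * V)

def prob_detected(trip_densities, V):
    # One minus the product of per-trip failure probabilities
    p_fail = 1.0
    for k in trip_densities:
        p_fail *= 1.0 - trip_detection_prob(k, V)
    return 1.0 - p_fail

# Hypothetical densities (individuals/m^3) for 14 trips in one year,
# peaking in fall; V for a 30-cm net and a 10 m tow.
V_30cm = math.pi * 0.15 ** 2 * 10.0
season = [0.0001] * 8 + [0.001, 0.005, 0.01, 0.01, 0.005, 0.001]
p_annual = prob_detected(season, V_30cm)

# The worked example from the text: annual detection probability 0.1
# compounded over the 14-yr span 1995-2008.
print(round(1 - (1 - 0.1) ** 14, 2))  # 0.77, i.e. roughly 0.8 as quoted
```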

Factors influencing detection probability of Bythotrephes
Overall, the probability of detecting Bythotrephes in a single zooplankton net tow increased rapidly with increasing population density (Fig. 2A); the estimated slope of the relationship between logit-detection and log10-count was 2.8 (±0.28 standard error [SE]) and the intercept was 0.24 (±0.21 SE). The slope and intercept were close to the theoretical predictions of ln(10) ≈ 2.3 and 0, respectively (from Eq. 1). The GLM fit the observed pattern of detections well, with no evidence of a non-linear relationship between density and the RQ residual values (Fig. 2B), and with RQ residuals showing a close to normal distribution (based on QQ plots), except for a slightly larger number of negative residuals and slightly fewer positive residuals (Fig. 2D). These patterns are consistent with those from Poisson-distributed simulated data (Fig. 2C, E). For the 50-cm net, the theoretical relationship between population density and probability of detection was similar to the empirical one (Fig. 2A).

[Fig. 2 caption: Red lines represent the theoretical relationship between density and detection derived from approximating zooplankton captures in a net tow using Eq. 1. The binomial regression model estimating the probability of detecting Bythotrephes (raw detection data are denoted by ticks above and below the fitted curve) as a function of population density (note log10-transformed x-axis) using a 50-cm net is plotted in solid black, and the shaded gray area represents ±1 standard error (A). Average density versus model randomized quantile residuals for both observed (B, D) and Poisson-simulated data (C, E), with generalized additive model smooths (solid lines with dotted 95% confidence intervals; GAM P-values indicated in panel). QQ plots comparing the distribution of residuals from the empirical model (D) and Poisson simulation (E) to the quantiles from a standard normal distribution.]
Fitted detection probability reached 0.5 (corresponding to roughly 0.95 after four tows averaging 10 m in length) at 0.82 individuals/m³, and the theoretical detection probability reached 0.5 at 0.51 individuals/m³ for the 50-cm net. The theoretical detection probabilities of the 15-, 30-, and 100-cm nets reached 0.5 at 5.7, 1.4, and 0.12 individuals/m³, respectively (assuming an average 10 m tow in all cases).
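These half-detection densities follow directly from Eq. 1: P = 0.5 exactly when kV = 1, that is, at k = 1/V. A quick numerical check against the values quoted above (10 m tows throughout):

```python
import math

# Density at which Eq. 1 gives P = 0.5 for each net diameter
for d_cm, reported in [(15, 5.7), (30, 1.4), (50, 0.51), (100, 0.12)]:
    V = math.pi * (d_cm / 200.0) ** 2 * 10.0  # swept volume, m^3
    k_half = 1.0 / V
    print(f"{d_cm}-cm net: k = {k_half:.2f} individuals/m^3 (reported {reported})")
```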
The number of samples necessary to reliably detect Bythotrephes (S) increased with decreasing population density (k) and was extremely close to theoretical predictions using the formula derived here, S = 1/P ≈ (1 + N)/N = (1 + kV)/(kV), where V is calculated using a 10 m tow of a 50-cm net (Fig. 3). As increasing volume (V) effectively increases the probability of detection (P), increasing net diameter decreases S for a given k (Fig. 3).
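The rule of thumb S = (1 + kV)/(kV) can be evaluated directly; the sketch below assumes the 50-cm net and 10 m tow used for Fig. 3.

```python
import math

def samples_needed(k, diameter_m=0.50, tow_length_m=10.0):
    # S = 1/P = (1 + kV)/(kV): tows needed for roughly one expected detection
    V = math.pi * (diameter_m / 2.0) ** 2 * tow_length_m
    return (1.0 + k * V) / (k * V)

# Required sampling effort rises steeply as density falls
for k in (0.001, 0.01, 0.1, 1.0):
    print(f"k = {k} individuals/m^3: S = {samples_needed(k):.0f} tows")
```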

Simulating detection rates in Lake Mendota
Using GAM models to simulate detection at different low-density (k_max) scenarios (Appendix S3: Fig. S1), we found that detecting Bythotrephes was unlikely when population density was low (e.g., between 0.001 and 0.01 individuals/m³), but this probability increased with increasing zooplankton net diameter and time since invasion (Fig. 4). Using Eq. (1) to approximate detectability for the 30-cm net, we found that detection of a low-density population in any given net tow was unlikely (<50% chance of detection) for populations that peaked at densities (k_max) ranging from 0.001 to 0.01 individuals/m³ across all simulated invasion years (Fig. 4B). At peak densities greater than 0.01 individuals/m³, the probability of a detection rapidly increased, and increased further with an earlier invasion year and higher peak population densities (Fig. 4B). The minimum density needed to detect a population in 50% of tows varied widely among net sizes: It was nearly ten times higher for the 15-cm net (on the order of 10⁰ individuals/m³; Fig. 4A) and ten times lower for the 100-cm net (10⁻² individuals/m³; Fig. 4D) relative to the 30-cm net (10⁻¹ individuals/m³; Fig. 4B).
Bythotrephes would be more likely to go undetected under several scenarios: (1) when using a small net (Fig. 4), (2) when sampling a lake with low summer population densities (but relatively high fall densities, as in Lake Mendota; Appendix S3: Fig. S1), and (3) when restricting sampling efforts to mid-summer months (Fig. 5). In Lake Mendota, doubling fall sampling effort (increasing the total number of trips from 14 to 18) was as effective at increasing the probability of detection as doubling the sampling effort over an entire year (increasing from 14 trips to 28 total trips; Fig. 5A). Increasing sampling effort in both summer and fall was effective in the Gile Flowage (Fig. 5C). In both cases, increasing sampling frequency or volume increased the probability of detection, though detection remained low in the lowest density scenario (0.001 individuals/m³; Fig. 5B, D).

[Fig. 3 caption: The relationship between population density (k in individuals/m³) and the mean number of samples needed for detection (S) using the empirical relationship (GLM fits in black and gray) between k and the probability of detection (P, where S = 1/P), as well as the theoretical relationship between k and S derived from Eq. 1. The theoretical relationship is estimated for four net sizes ranging from 15 to 100 cm in diameter (D, red lines).]

DISCUSSION
Detecting and studying species at low densities is a key challenge in understanding species distributions. Here, we applied a theoretical framework for detecting organisms at low densities to a case study of repeated detection failure in a low-density invasive population. We derived a theoretical expectation for how likely a given sample is to detect an invader at low densities, regardless of the ecological details of the sampled species, and found that this theoretical approach matched the case study of Bythotrephes detection well. Finally, we used our theoretical findings to improve Bythotrephes detection, demonstrating the value of this approach for designing species surveillance programs.

Theoretical implications
Our theoretical results imply that, as densities get very low, the density-detectability relationship should be essentially linear under any ecological circumstances. One intuitive way to understand this result is that, when densities are very low, nearly every sample contains either zero or one individual, so the measured average density per sample will very nearly equal the empirical per-sample detection probability, and the relationship must therefore be linear. In effect, at very low average counts, it is very hard to distinguish an arbitrary discrete distribution from a Bernoulli distribution. Aggregation only changes this story if individuals remain in groups at very low densities, and even then it would only increase the fraction of samples containing zero individuals; it would not change the fundamental shape of the relationship.
Our approximation differs from that derived in McCarthy et al. (2013), who modeled the density-detectability relationship in discrete samples as P = 1 − exp(−kλ^b), where k is an empirically estimated proportionality constant and b is an estimated aggregation parameter ≤1 (b = 1 corresponding to a random spatial distribution). This formulation differs from ours in that it can account for the effect of aggregation on detectability at higher densities but does not have the correct limiting behavior as population densities go to zero. In their formulation, in the limit of low densities, the slope of the ln(density)-logit(detection) relationship will equal b rather than one, as in ours (results not shown, but they are derivable through the method used in Appendix S1). This difference in slope implies that their relationship will tend to overestimate detectability at very low densities. The two results likely differ due to the regime around which the density-detectability function is approximated. In our case, we focused on the very low-density regime, and so were able to eliminate the effect of aggregation on detectability, which came at the cost of being able to model aggregation effects at higher densities. McCarthy et al. focused on a power-law approximation that can capture the effects of aggregation on detectability at higher densities but fails to reduce to the linear relationship at low densities. One approximation that might reconcile these two approaches would be to use the probability of detection for a negative binomial distribution as a more general approximation. For a negative binomial distribution with an average number of individuals per sample N and a size parameter b, the detection probability is as follows: P = 1 − (b/(N + b))^b (Fig. 1). This relationship would have the appropriate limiting behavior at low densities, and the effect of increasing aggregation on detectability at greater densities could be modeled by varying b (Bolker 2008).
While this detectability function has not been derived from any natural limit, its relationship to both our Eq. 1 (i.e., equal to our approximation when b equals one) and to randomly distributed data (i.e., it reduces to the detection probability for Poisson-distributed data when b goes to infinity) suggests that it may be a useful approximation when estimating density-detectability relationships for higher densities in the presence of spatial or temporal aggregation.
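The limiting behavior of this negative binomial detection function is easy to verify numerically: It reduces to Eq. 1 at b = 1, approaches the Poisson result 1 − exp(−N) as b grows large, and falls with increasing aggregation (smaller b) at a fixed mean.

```python
import math

def p_detect_negbin(N, b):
    # P = 1 - (b / (N + b))**b for a negative binomial with mean N, size b
    return 1.0 - (b / (N + b)) ** b

N = 0.5  # mean individuals per sample
print(p_detect_negbin(N, 1.0), N / (1.0 + N))           # b = 1 recovers Eq. 1
print(p_detect_negbin(N, 1e8), 1.0 - math.exp(-N))      # b -> inf: Poisson limit
print(p_detect_negbin(N, 0.1) < p_detect_negbin(N, 1.0))  # aggregation lowers P
```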

Detecting Bythotrephes at low densities
Our results support previous suggestions that Bythotrephes occurred at low abundances in Lake Mendota for an extended period without being detected. We found that Bythotrephes in Lake Mendota was unlikely to be detected at low densities (<0.01 individuals/m³), but the probability of detection could be increased by altered invasion scenarios and sampling strategies at moderately low densities (between 0.01 and 1 individuals/m³). We also found that detection failures were less likely when sampling targeted seasonal peaks in Bythotrephes population density (e.g., fall in Lake Mendota or summer in the Gile Flowage). Applying this information in our simulations, we found that detection probability in southern lakes increased by the same amount whether sampling effort was doubled only in the fall (+4 trips) or over the entire year (+14 trips). Our findings highlight the benefit of knowing species' life histories, but they also suggest these patterns may vary substantially among or even within systems (e.g., Appendix S3: Fig. S1). Given this variability among systems, a more effective approach to Bythotrephes surveillance may be to extend the sampling season (e.g., summer and fall) rather than attempt to predict in which season Bythotrephes is likely to be abundant (e.g., summer or fall).
The probability of detecting Bythotrephes was influenced primarily by population density. This probability increased dramatically between 0.10 and 1.0 individuals/m³, yet these densities are low compared with lakes with documented Bythotrephes invasions (Young et al. 2011, Kerfoot et al. 2016). At densities above 1.0 individuals/m³, Bythotrephes is readily detectable, and detectability is largely independent of net diameter (Boudreau and Yan 2004). Larger zooplankton nets were more effective at detecting Bythotrephes, effectively increasing the sampled volume. Note that here we account only for the differences in tow volume (i.e., search volume) among zooplankton net sizes, not differences in catchability between nets. Catchability is lower for smaller nets because stronger-swimming zooplankton (including Bythotrephes) can sense the resistance created above the net and swim away, evading capture. As a result, we likely overestimated the detection probability for smaller nets using Eq. 1; however, empirical estimates of catchability could be accounted for in a monitoring plan using Eq. 3.
While population density was the key predictor of the probability of detecting Bythotrephes, densities varied considerably within a year. Field surveys for invasive species typically occur during the summer, and while increasing summer effort improved detection probability in the Gile Flowage, it had little effect in Lake Mendota where Bythotrephes summer population densities were much lower. Population maxima are reached later in the year for Lake Mendota, resulting in a substantially improved probability of detecting Bythotrephes by increasing sampling effort from September to December.
Low-density populations are likely to persist undetected when the population density required to sustain a population is lower than species detection limits, which may be the case for Bythotrephes in North America. For example, Bythotrephes would have been detectable in Lake Mendota (211 sampling efforts) at peak densities higher than 0.01 individuals/m³, but previous work has shown that Bythotrephes might be capable of persisting at much lower densities (0.001 individuals/m³; Drake 2004). Lake Mendota is not the first report of Bythotrephes persisting in a system at low densities prior to its detection. Branstrator and his colleagues have shown this to be the case in Island Lake, Minnesota, USA. Sediment cores have revealed that Island Lake could have been the earliest invaded lake inland of the Laurentian Great Lakes in North America, with Bythotrephes establishing as early as 1982 (the first year of its detection in the Laurentian Great Lakes), prior to its detection in 1990 (Branstrator et al. 2017). Here, we suggest that Bythotrephes may be more likely to persist undetected due to the need for specialized equipment for detection (e.g., a relatively large zooplankton net, which is not easily available to citizens) and its highly seasonal population dynamics.
Many cladoceran zooplankton increase encounter rates for sexual reproduction at low population densities by aggregation or sexual swarming (Young 1978). While swarming behavior could partially explain Bythotrephes' capacity to persist at low densities, it would also reduce the probability of detection at any given density. Aggregation leads to larger volumes of water without any Bythotrephes, especially if Bythotrephes are aggregated at spatial scales similar to or smaller than the scale of sampling (Pedersen and Guichard 2016). The high degree of larger-scale variability in density implies that it would be better to sample more spatially disparate sites rather than increasing sampling time at any given site, as this would increase the chances of finding locations with high densities of invaders. However, our theoretical results (Eq. 2) indicate that this should not have been a driving factor in reducing detectability at low densities, given that the degree of aggregation in this species declines with density (Appendix S4: Fig. S1). Furthermore, even at extremely low densities, an established Bythotrephes population would be extremely large lakewide (e.g., in Lake Mendota, 0.01 individuals/m³ could correspond to ~3 million individuals, assuming a lake volume of 500 million m³ and roughly 60% habitat suitability given mid-summer oxygen conditions). In the earliest stages of an invasion, when lakes are not fully colonized, populations are likely much smaller and the ratio of a zooplankton tow to lake volume is infinitesimally small. Therefore, our recommendation to trade multiple samples at a single site for multiple samples at multiple sites is likely most useful in the former case (i.e., a low-density population that has fully colonized a lake).
Bythotrephes has also been shown to migrate vertically through the water column in a diel cycle, which could affect our results in lakes where Bythotrephes migrates to deeper waters during the day. The lake-by-lake variation in diel vertical migration is likely the outcome of the trade-offs between light, temperature, prey availability, and predator presence, and these trade-offs often lead to dampened diel vertical migration in Bythotrephes (Young and Yan 2008). Nonetheless, if Bythotrephes is migrating into deep waters to avoid visual (or warm-water) predators, this could affect our results here as all tows were taken during daylight hours. While Bythotrephes is positively phototactic (migrates toward light) in Lake Mendota (J. R. Walsh unpublished data), this could vary by lake. In cases of positively phototactic populations, this behavior could be used to improve sampling designs by aggregating the population toward a light source at night and sampling in the aggregated area (e.g., a light trap).

Management implications
Our theoretical results indicate that, when planning sampling efforts, it is essential to identify what would qualify as the minimum density of invaders before a region is considered invaded, and plan sampling efforts based on this level; given Eq. 1, a useful rule of thumb would be that at least (kV + 1)/kV samples are needed to have a reasonable chance of detecting an invader at density k (Fig. 3). This could be achieved by either using a large number of samples or by using monitoring approaches that effectively sample higher volumes of water (such as eDNA, acoustics, or techniques like light traps which actively attract the target species).
Our results also suggest that it may be more useful for invasive monitoring programs to structure their sampling approaches around targeting specific species, rather than spreading sampling efforts across a broad region to attempt to detect multiple species. For instance, consider two invasive species in a large region, each of which is at a density of 0.01 m⁻² across the region, but each of which occurs at densities of 0.1 m⁻² in separate smaller sub-areas of the region (or at different times of the year). In this case, a general-purpose sampling scheme taking 1 m² samples across the whole region would require more than (1 + 0.01)/0.01 = 101 samples on average to detect both species, but species-specific sampling programs (targeting areas or periods of high abundance for each species) would take on average only (1 + 0.1)/0.1 = 11 samples per species, or approximately 22 samples for both species.
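The arithmetic in this two-species example follows directly from the rule of thumb; here A = 1 m² is the area covered per sample, as in the text.

```python
def samples_needed(k, area=1.0):
    # S = (1 + kA)/(kA): expected samples to detect a species at density k
    n = k * area  # mean individuals per sample
    return (1.0 + n) / n

general = samples_needed(0.01)      # one survey at the region-wide density
targeted = 2 * samples_needed(0.1)  # one targeted survey per species
print(round(general), round(targeted))  # 101 22
```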
Understanding invasive species population biology and life history can greatly improve the likelihood of species detection. According to our model of Bythotrephes, targeting additional zooplankton net monitoring effort to fall months would improve detection in any of the lakes surveyed here (including the Gile Flowage, where Bythotrephes peaks earlier than in other lakes; Fig. C1), particularly with larger diameter nets. Alternatively, using seasonally insensitive methods for detection (e.g., sediment sampling of preserved Bythotrephes tail spines, as in Walsh et al. 2016b and Branstrator et al. 2017) could circumvent the challenge of detecting a highly seasonal nonnative species.
Risk-based sampling designs can improve detection rates of nonnative species that can persist below detection limits (Barrett et al. 2010, Whittle et al. 2013). For example, when planning monitoring programs across multiple locations, management resources may be better allocated to sampling designs that select for high-frequency sampling of vulnerable, high-risk locations rather than low-frequency sampling of many locations (Harvey et al. 2009). Such assessments have proven useful for understanding the spread of Bythotrephes (Potapov et al. 2011). For example, Bythotrephes dispersal is strongly driven by propagule pressure, namely boater movement (MacIsaac et al. 2000, Weisz and Yan 2010). Further, Bythotrephes is more likely to occur in large, deep, and nutrient-poor lakes (Branstrator et al. 2006, Wang and Jackson 2011) but is more abundant in productive lakes (Brown et al. 2012) as well as lakes with higher prey availability (Young et al. 2011, Walsh et al. 2017). Our general theoretical relationship between density and the effort needed to reliably detect a species at low densities also highlights the benefits of targeting search efforts to areas and times of higher expected densities within regions selected for monitoring and reinforces Royle and Nichols' (2003) point that abundance is a key factor in determining detectability. Adapting information from a subset of more intensively studied lakes (e.g., regional variation in population seasonality was used here to develop a seasonally targeted sampling strategy) to plan where and when to target sampling efforts within a region to coincide with maximum abundances could substantially improve detection of low-density invasives. The realities and challenges of detecting species at low densities point to the important role erupting low-density invasive populations will play under future ecological and environmental change (Rilov et al. 2004, Crooks 2005, Walsh et al. 2016b).
While we show that the probability of detecting low-density populations can be increased by an understanding of species life history, detection failures remain likely to occur for low-density Bythotrephes populations in North America (as in Fig. 5). As lakes change, undetected low-density populations may change with them, either gradually or suddenly. In the case of Lake Mendota, a single cool summer led to explosive population growth and large ecological and economic damages (Walsh et al. 2016a, b). Understanding early-stage nonnative population dynamics despite challenges associated with low population densities will be a critical endeavor for invasion ecologists and managers in the coming decades (Crooks 2005).
Our work provides the theory to support a simple and generalizable rule of thumb for designing surveillance programs with no information other than an estimate of expected species counts or densities. While this approach was tested using an invasive species as a case study, it could just as easily be applied to other cases where low-density populations and rarity in a landscape are a critical feature of a species' ecology and management (e.g., imperiled species; Taylor and Gerrodette 1993), and it could be a useful component of species surveillance programs in the future.