Detection probabilities for sessile organisms

Estimation of population sizes and species ranges is central to population and conservation biology. It is widely appreciated that imperfect detection of mobile animals must be accounted for when estimating population size from presence–absence data. Sessile organisms also are imperfectly detected, but correction for detection probability in estimating their population sizes is rare. We illustrate challenges of detection probability and population estimation of sessile organisms using censuses of red wood ant (Formica rufa-group) nests as a case study. These ants, widespread in the northern hemisphere, can make large (up to 2 m tall), highly visible nests. Using data from a mapping campaign in which eight observers with varying experience surveyed sixteen 3600-m² plots in the Black Forest region of southwest Germany, we compared three different statistical approaches (a nest-level data-augmentation patch-occupancy model with event-specific covariates; a plot-level Bayesian and maximum-likelihood model; nonparametric Chao-type estimators) for quantifying detection probability of sessile organisms. Detection probabilities by individual observers of red wood ant nests ranged from 0.31 to 0.64 for small nests, depending on observer experience and nest size (detection rates were approximately 0.17 higher for large nests), but not on habitat characteristics (forest type, local vegetation). Robust estimation of population density of sessile organisms, even highly apparent ones such as red wood ant nests, thus requires estimation of detection probability, just as it does when estimating population density of rare or cryptic species. Our models additionally provide approaches to calculate the number of observers needed for a required level of accuracy. Estimating detection probability is vital not only when censuses are conducted by experts, but also when citizen-scientists are engaged in mapping and monitoring of both common and rare species.


Introduction
Estimating population size is a central requirement of population and conservation biology. Similarly, estimating species ranges and predicting their changes (for example, in response to climatic change and habitat disturbance) depend on accurately documenting the presence and absence of individuals. In both cases, imperfect detection is a widely appreciated problem (e.g., Royle et al. 2005, MacKenzie et al. 2006, Kellner and Swihart 2014, Dénes et al. 2015): How can an observer be certain that individuals are detected when they are present? Consequently, estimates of detection probability now are used routinely in subsequent estimation of population sizes and ranges of common, rare, or cryptic mobile animals (e.g., Williams et al. 2011).
www.esajournals.org BERBERICH ET AL.
For sessile organisms such as plants, many marine invertebrates, and a wide range of colony-forming organisms including ants and termites, estimating their colony sizes or ranges would seem to be much easier than for animals that are constantly moving. However, detection probability of sessile organisms is surprisingly variable and strongly depends on the conspicuousness of the focal taxa; habitat characteristics; sampling design, time, and duration; and the experience of the observer (e.g., Alexander et al. 1997, Miller and Ambrose 2000, Fitzpatrick et al. 2009). Sessile organisms also are simple targets for monitoring by citizen-scientists.
Ants are ubiquitous in most terrestrial landscapes (e.g., Dunn et al. 2009). Red wood ants (henceforth RWA) form very large, often polydomous colonies (Ellis and Robinson 2014); individual mound nests may reach 2 m in height and contain >60,000 individual workers (Chen and Robinson 2013). RWA are of significant ecological importance (e.g., Klimetzek 1981, Way and Khoo 1992). Recently, RWA species have been introduced for biological control of undesirable insects (Seifert 2016) and developed as biological indicators for otherwise undetected tectonic activity (Berberich et al. 2016), and some are considered species of conservation concern (e.g., BfN 2012, IUCN 2015). There are few long-term studies of RWA populations. Some investigators have suggested that populations of RWA are declining (e.g., Wellenstein 1990, Crist 2009), whereas others have reported that their populations are increasing (e.g., Stoschek and Roch 2006, Wilson 2011). Because none of these (or other) researchers have estimated or accounted for detection probability, a potential explanation for differences among studies is that estimates of occurrences or population sizes of RWA nests are inaccurate. Although this general problem has been recognized for mobile animals (e.g., MacKenzie et al. 2006), it is discussed only rarely in reviews of population sizes of endangered sessile species such as plants or ants (e.g., Philippi et al. 2001, Underwood and Fisher 2006, Godefroid et al. 2011). Therefore, we used the large, persistent, and highly apparent nests of red wood ants (Formica rufa-group) as a case study (Fig. 1).
Estimating the size of a population is a statistical problem addressed in hundreds of publications (e.g., Manning and Goldberg 2010, Grimm et al. 2014, Royle et al. 2015). Our case is different, although not atypical, and several aspects render the application of established approaches either unnecessarily cumbersome or completely infeasible. First, as sessile organisms do not move, they do not have a capture or resighting history (as used, for example, in Huggins-style recapture models; Akanda and Alpizar-Jara 2014): Every time a plot is inspected, the nest will be found (with a certain detection probability) because the occupancy is constant (ψ = 1 for any object ever recorded). Second, detection probability is a function both of traits of the object (e.g., its size) and of environmental conditions. Again, this has been addressed infrequently in recapture studies (but see Royle et al. 2004 for sparse data lacking object traits). This study employed several different statistical models, each of which is relatively simple and all of which estimate variability in detection rates by individual observers. An additional goal of the analysis was to quantify how many observers would be required to achieve a given level of accuracy for an estimator of population size. To achieve this goal, we also needed to estimate observer-specific detection probabilities.
In this study, we addressed five inter-related questions: (1) Do multiple observers detect or overlook the same RWA nest? (2) Is there a "best" way to quantify detection probability of sessile organisms such as RWA nests? (3) Do colony size and density influence detection probability? (4) Does individual nest size influence detection probability? (5) How many observers are needed to converge on an estimate of the true number of nests? We asked these questions specifically with respect to individual RWA nests. In doing so, we improved estimates of RWA population sizes by including detection probability while simultaneously developing and using methods that will be applicable to a wide range of sessile organisms.

Sampling design
Fieldwork was carried out during April 2015 in 16 randomly chosen 60 × 60 m plots near Friedenweiler (N47.54, E8.16, EPSG: 5677, 850-920 m a.s.l.) in the Black Forest region of southwest Germany. Eight observers (two experienced ones [coauthors GMB and MBB] and six inexperienced ones) independently mapped RWA nests for 1 h in each of the 16 plots. The inexperienced observers were trained beforehand to recognize RWA nests in the field and to register the location of each nest using a GPS receiver (Garmin 60CSx/62S/64S; 10-m precision) held directly above the nest. Each observer also took a photograph of every mapped nest (Fig. 1) to facilitate its subsequent identification and to avoid double-counting when nearby nests were within the precision of the GPS. Each GPS receiver was preloaded with 1:50,000 topographic maps onto which the boundaries of all 16 study plots had been transferred so that plot boundaries could be observed and maintained during each census.
All cameras and GPS receivers were synchronized to local time and projection (WGS84 projection; Datum: Potsdam). To avoid two observers mapping the same plot at the same time, each observer mapped the plots in a specifically defined sequence. The track of each observer in each plot was recorded continuously to quantify speed, total distance covered, and individual search strategy (Fig. 2). Finally, to minimize errors in delimiting plot boundaries in the field, a buffer region of 10 m around each plot was included during field recording to account for GPS imprecision. All GPS data were downloaded immediately after collection and transferred into a GIS database. Forest stand types were classified in the field, and nest heights and diameters were classified from nest photographs.

Estimating and correcting for false positives
False positives for each observer $i$ sampling in plot $s$ were tabulated manually from the number of reported nests. The number of observed real nests, $N^{\mathrm{obs}}$, was determined by cross-matching all mapped entities identified as RWA nests with their GPS coordinates, photographs, recorded census tracks, and expert knowledge. We linked GPS coordinate positions for each actual RWA nest recorded by each observer and averaged them to obtain a unique GPS position for each nest, which was then assigned a unique identifier. In all analyses, only real RWA nests were analyzed.

Covariates of detection probability
For exploratory analysis, we used a quasibinomial generalized linear model to test whether the number of nests detected by each observer was affected by nest size, classified by height classes (1-10, 11-50, 51-100, and >100 cm) or diameter classes (1-50, 51-100, 101-150, and >150 cm); by the forest stand type (… or beech [Fagus]) in which a nest occurred (classified in the field); or by its location (within the forest, along forest roads, or along forest edges, as classified in the field and from GIS layers). Because the number of small nests greatly exceeded that of larger nests, we pooled the two largest size classes when regressing detection probability on nest size.

Statistical analyses
Our data set is unusual relative to others in the detection-probability literature because (1) our objects do not move (in contrast to spatial recapture analyses, which estimate the probability of an animal having been observed in different plots, that is, its occupancy); (2) we counted ant nests in several plots; (3) instead of plot revisits (typical for recapture data), our "visits" were different observers, making it possible to determine observer-specific detection probabilities; and (4) each nest was characterized by its size, which may also have affected detection rates. Of course, there may be some nests that none of the eight observers discovered. For those, we obviously also do not know the size or habitat characteristics. We used three fundamentally different ways to estimate the total number ($N$) of nests and the number of nests in each of our sampling plots, $N_s$.
Approach 1: Nest-level Bayesian data-augmentation. The most detailed analyses were performed at the scale of individual nests ("nest-level" model). This nest-level model used a Bayesian data-augmentation approach to include the (potentially) overlooked nests in the analysis. For this analysis, we used an approach similar to patch-occupancy models, which essentially included two elements. First, an indicator variable assigned each nest a value equal to 1 if it existed and to 0 otherwise. This indicator variable was drawn from a Bernoulli distribution with a parameter representing the overall probability that a nest in the data actually existed. Second, we used a logistic regression of the detection probability to account for observer-specific detection rates and effects of nest size and other covariates. The data (one row per nest) were augmented by 50 rows of missing data ($N_{\mathrm{augmented}}$ unobserved nests, that is, rows containing no information but contributing to the estimation of the overall probability that a nest existed; cf. Dorazio et al. 2011).
For the $N_{\mathrm{augmented}}$ unobserved nests, the model estimated how likely it was that they were actually there, but were not observed. This could be achieved because the unobserved nests (and their sizes) were drawn from the same data model as were the observed data. The main tuning parameter of this nest-level model was the number of nests assumed to be missing; the model was insensitive to this parameter and yielded the same results when using 20, 50, or 200 augmented rows. Uninformative priors were chosen for all model parameters. The model was implemented in JAGS (Plummer 2003).
Approach 2: Plot-level detection models. We also estimated $N_s$ using two different types of plot-level analyses: one Bayesian and one using maximum likelihood. The disadvantage of these plot-level models is that they cannot accommodate nest-level information (e.g., size). On the other hand, the advantage of plot-level models is that the maximum-likelihood version can be used to readily simulate different numbers of observers (requiring thousands of randomized analyses).
For each plot and for each observer, we modeled the number of nests observed, $N^{\mathrm{obs}}_{i,s}$, as a realization from a binomial distribution with parameters $N_s$ and $P_i$, representing the estimated number of nests in plot $s$ and observer $i$'s detection rate, respectively: $P(N^{\mathrm{obs}}_{i,s} \mid N_s, P_i)$. Note that this requires the estimation of 16 (plots) + 8 (observers) = 24 different parameters. These parameters could be estimated using Bayesian or maximum-likelihood approaches, differing, in our implementation, only in choosing (for the Bayesian version) priors for $N_s$ that have a lower bound at the observed number of nests at each plot. Then, for each plot × observer combination, we estimated the expected number of observed nests as the product $N_s P_i$. As in the nest-level model, we estimated a detection rate for each observer. Note that the Bayesian plot-level model serves as a link between the data-augmentation model and the maximum-likelihood model, illustrating that the main benefit of the data-augmentation approach is the incorporation of nest sizes.
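The two elements of the nest-level model (an existence indicator and a logistic detection regression) can be illustrated by simulating data from them. The following Python sketch is not the JAGS model used in the study; it only forward-simulates the data-generating process under stated assumptions. The function name, the observer intercepts, the inclusion probability `omega`, and the share of large nests `p_large` are hypothetical values chosen for illustration; only the size effect of 0.819 is taken from the results reported for this model.

```python
import math
import random

def inv_logit(x):
    """Map a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_augmented_data(n_rows, omega, observer_intercepts,
                            beta_size, p_large, seed=1):
    """Forward-simulate an existence indicator plus logistic detection.

    n_rows: observed plus augmented rows (hypothetical total)
    omega: probability that a row is a nest that really exists
    observer_intercepts: one logit-scale intercept per observer (assumed)
    beta_size: logit-scale effect of a nest being large
    p_large: probability that a nest is large (illustrative size model)
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        exists = 1 if rng.random() < omega else 0
        large = 1 if rng.random() < p_large else 0
        detections = []
        for alpha in observer_intercepts:
            # A nest that does not exist has detection probability 0.
            p = exists * inv_logit(alpha + beta_size * large)
            detections.append(1 if rng.random() < p else 0)
        rows.append({"exists": exists, "large": large,
                     "detections": detections})
    return rows

rows = simulate_augmented_data(
    n_rows=200, omega=0.75,
    observer_intercepts=[-0.5, -0.3, 0.0, 0.0, 0.1, 0.2, 0.5, 0.6],
    beta_size=0.819, p_large=0.3)
never_seen = sum(1 for r in rows
                 if r["exists"] and not any(r["detections"]))
print("existing nests missed by every observer:", never_seen)
```

In a fitted model the direction is reversed: the detection histories are data, and the existence indicators of the all-zero augmented rows are estimated.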
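A minimal Python sketch of a plot-level maximum-likelihood fit is given below. It assumes the binomial model described above and maximizes the joint likelihood by coordinate ascent (alternating a closed-form update of each observer's detection rate with an integer search for each plot's nest number). This optimizer, the function names, the search bound `n_max`, and the toy count matrix are assumptions for illustration; the study's own R implementation may differ, and coordinate ascent can stop at a local optimum.

```python
import math

def log_binom_pmf(n, N, p):
    """Log binomial probability of n detections out of N nests."""
    if n > N:
        return float("-inf")
    log_choose = (math.lgamma(N + 1) - math.lgamma(n + 1)
                  - math.lgamma(N - n + 1))
    return log_choose + n * math.log(p) + (N - n) * math.log(1.0 - p)

def log_likelihood(counts, N, P):
    """Joint log-likelihood; counts[i][s] is observer i's count in plot s."""
    return sum(log_binom_pmf(counts[i][s], N[s], P[i])
               for i in range(len(P)) for s in range(len(N)))

def update_p(counts, N):
    """Closed-form detection rates for fixed N, clipped away from 0 and 1."""
    total = sum(N)
    if total == 0:
        return [0.5] * len(counts)
    return [min(max(sum(row) / total, 1e-6), 1.0 - 1e-6) for row in counts]

def fit_plot_level(counts, n_max=200, n_iter=100):
    """Alternate updates of N_s (integer search) and P_i (closed form)."""
    n_obs, n_plots = len(counts), len(counts[0])
    # N_s can never be smaller than the largest count reported for plot s.
    floor = [max(counts[i][s] for i in range(n_obs)) for s in range(n_plots)]
    N = [2 * m + 1 for m in floor]  # start away from the P = 1 boundary
    P = update_p(counts, N)
    for _ in range(n_iter):
        new_N = [max(range(floor[s], n_max + 1),
                     key=lambda c: sum(log_binom_pmf(counts[i][s], c, P[i])
                                       for i in range(n_obs)))
                 for s in range(n_plots)]
        if new_N == N:
            break
        N = new_N
        P = update_p(counts, N)
    return N, P

# Toy data (hypothetical): 3 observers (rows) by 4 plots (columns).
counts = [[3, 5, 0, 2],
          [4, 6, 1, 2],
          [2, 5, 0, 3]]
N_hat, P_hat = fit_plot_level(counts)
print("estimated nests per plot:", N_hat)
print("estimated detection rates:", [round(p, 2) for p in P_hat])
```

The lower bound on each plot's estimate mirrors the lower-bounded priors described for the Bayesian version.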
Finally, we used the maximum-likelihood model to simulate estimates of nest counts that we would get with fewer observers. To do so, we randomly drew 2, 3, …, 7 observers and reran the estimation of nest numbers. Each simulation (number of observers) was repeated 1000 times.
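The resampling step can be sketched as a small harness that repeatedly draws k observers at random and re-applies an estimator to their counts. In this Python sketch the estimator is passed in as a function; `naive_total` is a deliberately simple placeholder (not the maximum-likelihood model of the study), and the count matrix is hypothetical.

```python
import random

def naive_total(rows):
    """Placeholder estimator: per plot, the largest count among the
    chosen observers, summed over plots."""
    return sum(max(col) for col in zip(*rows))

def subsample_estimates(counts, k, n_rep, estimator, seed=0):
    """Draw k observers without replacement n_rep times and re-estimate.

    counts[i][s] is the number of nests observer i reported in plot s.
    """
    rng = random.Random(seed)
    n_observers = len(counts)
    estimates = []
    for _ in range(n_rep):
        chosen = rng.sample(range(n_observers), k)
        estimates.append(estimator([counts[i] for i in chosen]))
    return estimates

# Toy data (hypothetical): 4 observers (rows) by 4 plots (columns).
counts = [[3, 5, 0, 2],
          [4, 6, 1, 2],
          [2, 5, 0, 3],
          [4, 4, 1, 1]]
for k in range(2, len(counts) + 1):
    est = subsample_estimates(counts, k, n_rep=1000, estimator=naive_total)
    print(k, min(est), round(sum(est) / len(est), 2), max(est))
```

Replacing `naive_total` with a model-based estimator gives the distribution of estimates expected from a smaller team.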
Approach 3: Nonparametric richness estimators. Last, we used nonparametric sample-based estimators, developed for estimating the number of species in samples of community data (Chao and Jost 2012, most recently reviewed by Chao et al. 2014). This approach does not account for observer-specific detection probability or plot-level covariates. We estimated the total number of nests in each plot, $N_s$, and the total number of nests among the 16 plots, $N$, using standard bias-corrected species richness estimators (Chao's S, jackknife 1 [Jack1] S, and Jack2 S; see Jost 2012, Oksanen et al. 2015) implemented in the specpool function of the vegan library in R, version 3.2 (R Core Team 2015). These estimators are based on the observed number of nests that were detected by only one ("singletons") or two ("doubletons") observers.
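For readers unfamiliar with these estimators, the Python sketch below computes incidence-based Chao, first-order jackknife, and second-order jackknife estimates from a nest × observer detection matrix, treating observers as sampling units and nests as "species". The formulas are the standard incidence-based forms with the (n − 1)/n small-sample factor; the exact variant implemented in vegan's specpool may differ in detail, and the example matrix is hypothetical.

```python
def richness_estimators(detections):
    """Chao, Jack1, and Jack2 estimates of the total number of nests.

    detections: one row per detected nest, one 0/1 entry per observer.
    Requires at least two observers.
    """
    n = len(detections[0])                   # number of observers
    freq = [sum(row) for row in detections]  # observers detecting each nest
    s_obs = sum(1 for f in freq if f > 0)    # nests detected at least once
    a1 = sum(1 for f in freq if f == 1)      # singletons
    a2 = sum(1 for f in freq if f == 2)      # doubletons
    corr = (n - 1) / n
    if a2 > 0:
        chao = s_obs + corr * a1 ** 2 / (2 * a2)
    else:
        # Bias-corrected form, used when there are no doubletons.
        chao = s_obs + corr * a1 * (a1 - 1) / 2
    jack1 = s_obs + a1 * corr
    jack2 = (s_obs + a1 * (2 * n - 3) / n
             - a2 * (n - 2) ** 2 / (n * (n - 1)))
    return {"chao": chao, "jack1": jack1, "jack2": jack2}

# Hypothetical example: 5 nests, 4 observers.
example = [[1, 1, 1, 1],
           [1, 0, 0, 0],
           [0, 1, 0, 0],
           [1, 1, 0, 0],
           [0, 0, 1, 0]]
print(richness_estimators(example))
```

With three singletons and one doubleton among five detected nests, all three estimators exceed the observed count, reflecting the nests presumed to have been missed by every observer.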

Determining the number of observers needed to accurately estimate the number of nests
The analyses described above assumed that detection probability was independent of each observer. However, our data showed that many nests were recorded by all observers, whereas others were found only by some (Fig. 3). In other words, we could not assume independence of observations: Adding more observers to the team led to records largely similar to what had already been reported. We computed the amount of effort required to accurately estimate numbers of nests assuming a constant detection probability among observers and serial correlation among observers (details of these calculations are given in Appendix S1).
Essentially, we estimated how more observers would affect our estimation by assuming that new observers would have detection rates similar to those of our eight real observers, $P_i$. In addition to the detection rate of each observer, we had to compute the probability of a second observer finding a new nest, $P_c$, which we computed from the observed data for each observer pair. The probability that $k$ observers would overlook a nest was computed as $(1 - P_c)^k (1 - P_i)$. We simulated data for 9 and 10 observers, bootstrapping values for $P_c$ and $P_i$ based on our eight observers.
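The overlook probability can be evaluated directly. The Python sketch below reads k in the expression above as an exponent, that is, (1 − P_c)^k (1 − P_i), and contrasts it with the value expected if observers were fully independent with a common detection rate. The numerical values of P_i and P_c are hypothetical and are not estimates from the study.

```python
def prob_overlooked(p_i, p_c, k):
    """Probability that a nest is overlooked, read as (1 - p_c)^k (1 - p_i).

    p_i: detection rate of an observer
    p_c: probability that a further observer finds a new nest
    k:   number of observers entering the expression as the exponent
    """
    return (1.0 - p_c) ** k * (1.0 - p_i)

def prob_overlooked_independent(p, k):
    """Reference case: k fully independent observers with common rate p."""
    return (1.0 - p) ** k

# Hypothetical inputs, for illustration only.
p_i, p_c = 0.5, 0.2
for k in range(1, 11):
    print(k,
          round(prob_overlooked(p_i, p_c, k), 4),
          round(prob_overlooked_independent(p_i, k), 4))
```

Because P_c is smaller than P_i whenever observers tend to find the same nests, the overlook probability declines more slowly with k than under independence.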

Availability of data and code
The commented R code for all our analyses and figures is provided as online supplementary material (Appendix S2). All data and raw R and JAGS code are available from the Harvard Forest Data Archive (http://harvardforest.fas.harvard.edu:8080/exist/apps/datasets/showData.html?id=hf286), data set HF286.

Sampling effort
Although the sampling protocol specified that each observer spend 60 min in a plot, GPS records revealed that the actual time spent by a single observer in a plot ranged from 30 to 120 min. On the other hand, the eight observers were highly consistent in their searching behavior, and all appeared to cover the majority of each plot in their searches while avoiding wetlands and very dense vegetation (Fig. 2). However, there was a surprising lack of consistency in the nests detected and overlooked by the different observers (Fig. 3).

Estimates of detection probability and the number of nests
Estimated detection probability ($P_i$) computed from the nest-level model ranged from 0.37 to 0.64 (mean = 0.50). The plot-level models yielded estimates ranging from 0.31 to 0.52 (mean = 0.42; Bayesian plot-level detection model) or from 0.35 to 0.58 (mean = 0.47; plot-level maximum-likelihood model; Table 1). Results of the Bayesian plot-level detection model suggested that we overlooked approximately 26% of nests (of an estimated total of 190 nests). The difference between the nest-level and plot-level estimates can be attributed to (1) fewer data points (the plot-level model aggregates all nests within a plot: 16 × 8 = 128 vs. 147 for the data-augmentation model); and (2) the joint estimation of detection rates and true number of nests, $P(N^{\mathrm{obs}}_{i,s} \mid P_i, N_s)$, rather than the conditional estimation, $P(N^{\mathrm{obs}}_{i,s} \mid N_{i,s})$, carried out for each nest in the data-augmentation model.
Estimated number of nests per plot ($N_s$) ranged from 0 to 24 (patch-occupancy model), 0 to 27 (maximum likelihood), or 0 to 29 (Bayesian) (Table 2). Estimated total number of nests ($N$) across all 16 plots was 147.7 (95% confidence interval = [147, 149], that is, one to three nests overlooked; patch-occupancy model), 168.2 (maximum likelihood), or 190.1 (26% of nests overlooked; Bayesian). Estimated detection probabilities for the observers were slightly higher in the patch-occupancy model, but the estimated number of nests varied by a smaller percentage among models. In other words, while nest size affected detection probability, it did not greatly bias estimates of the total number of nests.
All of these estimates of total number of nests exceeded the bias-corrected ones that did not explicitly incorporate detection probability (Fig. 3, Table 3).

Covariates of detection probability
Large nests had a higher chance of being detected (estimate for $\beta_{\mathrm{size}}$ = 0.819). Height was a better predictor than diameter, making it necessary to incorporate nest height in an ideal analysis of these data. But nest size did not greatly bias estimates of the total number of nests. Detection probability increased significantly with both nest height (both linear [estimate = 6.7] and quadratic [estimate = −3.8] terms were significantly different from 0 [P = 0.002 and P < 0.001, respectively]) and diameter (only the linear term [estimate = 1.3] was significantly different from 0 [P < 0.001]) (Fig. 4). Moreover, we found no relationship between the number of nests per plot and detection probability (Fig. 5). There also were no significant effects of forest type, position, or interactions between these plot characteristics and nest-height size-class on nest detection (Fig. 6, Table 4).

Effects of having more observers
We observed that some of the 147 observed nests were detected by all observers (black rows in Fig. 3), whereas others were detected only by a single observer (rows with only a single black square in Fig. 3). The average correlation among pairs of observers in detecting a nest was relatively high (0.65, SD = 0.071). Nonetheless, each new observer added some additional information. Assuming that still more observers would be similar to those we worked with, we found that there was an inverse relationship between the number of observers and $N$: Fewer observers led to higher estimates of the number of overlooked nests, and hence of the true number of nests (Fig. 7), because there were many nests but detection probability was relatively low. However, as the number of observers increased, fewer nests were overlooked (<1% with eight observers; see Fig. 7, inset), and consistency among observers refined (and shrank) the estimated number of nests (Fig. 7).
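One way to summarize agreement among observers is the mean pairwise correlation of their detection vectors. The Python sketch below computes Pearson correlations between the 0/1 detection columns of every observer pair (equivalent to the phi coefficient for binary data). Whether the study used exactly this correlation measure is an assumption, and the small detection matrix is hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Undefined (division by zero) if either sequence is constant, for
    example an observer who detected every nest.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_correlations(detections):
    """Correlation for every observer pair.

    detections: one row per nest, one 0/1 entry per observer.
    """
    columns = list(zip(*detections))
    return [pearson(columns[i], columns[j])
            for i, j in combinations(range(len(columns)), 2)]

# Hypothetical matrix: 6 nests (rows) by 3 observers (columns).
detections = [[1, 1, 0],
              [1, 1, 1],
              [0, 0, 1],
              [1, 0, 1],
              [0, 0, 0],
              [1, 1, 0]]
r = pairwise_correlations(detections)
print("mean pairwise correlation:", round(mean(r), 3),
      "SD:", round(pstdev(r), 3))
```

A high mean correlation indicates that observers tend to find, and to miss, the same nests, which limits the information gained from each additional observer.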

Discussion
Our work with red wood ants addressed five general questions: (1) Do multiple observers detect or overlook the same nest? (2) Is there a "best" way to quantify detection probability? (3) Do colony size and density influence detection probability? (4) Does individual nest size influence detection probability? (5) How many observers are needed to converge on an estimate of the true number of nests? For RWA, the short answers are as follows:
1. Multiple observers detect and overlook different individual nests.
2. Bayesian methods provide more precise estimates of detection probability.
3. Population size and density had little effect on detection probability.
4. Larger nests were more likely to be detected.
5. More observers are better, but the "return on investment" is a diminishing function.
Over the past several decades, a number of statistical models have been developed to correct for imperfect detection in population studies with respect to occupancy/species distribution modeling (reviewed in MacKenzie et al. 2006), mark-recapture (e.g., Lettink and Armstrong 2003, Chen and Robinson 2013), or distance sampling (Baccaro and Ferraz 2013). Many of these methods account for bias of observer, time of day, or season (Dénes et al. 2015). Survey-, plot-, and species-level factors differentially affecting detection of species or individuals are incorporated only partially in these models, resulting in a disproportionately high number of nondetections (Iknayan et al. 2014, Dénes et al. 2015). These issues are of particular concern for mobile organisms, but also can play a significant role for sessile ones. Additional difficulties also may arise when the objects under study vary in size or shape over time and are generally not easily noticed by unpracticed observers (e.g., Fitzpatrick et al. 2009).
The data-augmentation approach we used is fully in line with already published approaches (Royle et al. 2007, Kéry and Royle 2010, Dorazio et al. 2011). It models detection probability in exactly the same way, but the novelty is that it adds a characteristic for each individual nest, and it estimates the number of unobserved nests. There was only a small proportion of nests that were observed by only one observer, unlike, for example, the American redstart data in Royle (2004). One advantage of our patch-occupancy model with an event-specific covariate (nest size) was that it allowed us to model each nest separately and thereby include a covariate for the nest. The estimates of detection probability and nest abundance were similar between the Bayesian and maximum-likelihood models.
Table 1. Estimated detection probability $P_i$ and its SD for each of the eight observers (six "beginners" and two "experts"), using the patch-occupancy model per-observer observation, Bayesian site-level detection model, and site-level maximum-likelihood model.
Table 2. Estimated number of nests $N_s$ (maximum likelihood) in each plot and its SD, assuming a detection probability equal to the mean of the $P_i$ = 0.42 (maximum likelihood) or 0.39 (Bayes) from Table 1.
Our study of RWA nests highlights some underlying aspects of detection probability for sessile organisms. Red wood ants are ecologically important and have been listed as threatened or endangered because repeated censuses often suggest declines in abundance (e.g., Dekoninck et al. 2010). However, detection probability of RWA nests has been estimated only once previously using a "mark-release-recapture method" while disturbing the ant colony (Chen and Robinson 2013). Our results, applying a noninvasive method without disturbing the ant colony, revealed that even in a well-designed survey of a well-known population, RWA nests were detected imperfectly even by experienced observers. Imperfect detection can seriously bias conventional estimators of species distributions and population sizes. Given a detection probability of RWA nests by experts of ≈ 0.63, prior assertions of RWA decline (Dekoninck et al. 2010, IUCN 2015) should be revisited. Corrections for detection probability not only should be included in future inventories of RWA populations and other sessile organisms, but also should be accounted for in decisions to list these species as threatened or endangered.
Numerous covariates affect detection success (Dénes et al. 2015). Our results suggest that observer experience strongly influenced detection success of RWA, which also had been noted for other essentially sessile insects (Fitzpatrick et al. 2009). Whereas beginners and experienced observers both were highly consistent in their searching per plot, beginners identified fewer RWA nests. Compared with beginners, experienced observers consistently detected twice as many short RWA nests (1-10 cm in height), 33% more tall ones (>100 cm in height), and 66% more nests with small diameters (up to 50 cm).
Although experience nearly doubled detection probability, experts were still imperfect observers. Detection probability may have been reduced because the survey was carried out early in the season during bouts of heavy rain. Dense undergrowth and steep topography (especially in plots 6, 12, and 14) also could have contributed to a high level of omissions. Nevertheless, detection probabilities of RWA nests in our study (Table 1) were comparable to those estimated in other studies of ants (Dorazio et al. 2011, Ward and Stanley 2013). Although standard surveys of RWA are carried out during the summer, the dense undergrowth present then could lead to a higher percentage of nondetection. Instead, we suggest that sampling RWA nests would be better performed in early spring, when vegetation has not yet started to obscure the nests but temperatures are sufficiently high for ant activity. Finally, we found that with more surveys (or replicated ones: Dorazio et al. 2011), the combined detection probability increased relative to detection probability estimated from a single observer.
Fig. 5. Effect of nest abundance at each plot on the probability of detection. No trend was detectable in these data.
However, the gain in detection probability of RWA nests showed diminishing returns beyond six to eight observers (Fig. 7).
Even things as conspicuous as ant nests can be overlooked easily. Robust estimation of population density of sessile organisms, even highly apparent ones such as RWA nests, requires unbiased estimation of detection probability, just as it does when estimating population density of rare or cryptic species. Our Bayesian model for detection probability of sessile organisms included overlooked nests and other sources of heterogeneity in both occurrence and detection probabilities, and contributes to the further development of new methods for accurate assessments of population sizes.
As myrmecologists, we naturally are always surprised that not everyone is interested in mapping ant nests or estimating changes in ant population sizes through time and space (see also Underwood and Fisher 2006). However, the approach outlined here is relevant to any sessile organism for which robust population estimates are desired, but resources for exhaustive, repeated population counts or estimates are limited (e.g., Philippi et al. 2001, Godefroid et al. 2011). Our methods can be used to provide answers to questions such as "how many surveyors do I need to accurately estimate the size of this population?" or "can I use nonexpert surveyors, and how does that affect detection probability and their estimates of population size?" Answers to these questions are vital not only when censuses are conducted by experts, but also when citizen-scientists are engaged in mapping and monitoring of both common and rare species (e.g., Godet et al. 2009, Dickinson et al. 2010).
Notes: Forest type was coded as spruce or not spruce. Location was coded as forest interior, forest edge, or forest path.
Fig. 7. Maximum-likelihood estimates of the number of nests at each plot, based on 100 randomly drawn combinations of two to seven observers. Plots are sorted by number of estimated nests based on eight observers ( ). A indicates confirmed number of nests; solid orange and red circles are, respectively, the estimated number of nests according to Chao and Jackknife1 estimators (Table 3). Inset: Simulated probability of overlooking a nest as a function of the number of observers. This simulation uses the data from the 16 plots, each bootstrapped 1000 times to simulate a random sequence of observers. Horizontal dashed lines are at 10%, 5%, and 1% of overlooked nests.
Acknowledgments