Community distance sampling models allowing for imperfect detection and temporary emigration

Abstract. Recent developments of community abundance models (CAMs) enable us to analyze communities subject to imperfect detection. However, existing CAMs assume spatial closure, that is, that individuals are always present in the sampling plots, an assumption that is often violated in field surveys. Violation of this assumption, such as in the presence of spatial temporary emigration, can lead to underestimates of detection probability and overestimates of population densities and diversity metrics. Here, we propose a model that simultaneously accommodates both temporary emigration and imperfect detection by integrating CAMs and a form of hierarchical distance sampling for open populations. Expected values of species richness are obtained via the summation of occupancy (or incidence) probabilities, based on species-level densities, across all species of the community. Simulations were used to examine the effects of spatial temporary emigration on the estimation of biological communities. We also applied the proposed model to empirical data and constructed area-based rarefaction curves accounting for temporary emigration. Simulation experiments showed that temporary emigration can decrease the local species richness (α diversity) based on densities and increase the species turnover (β diversity). Raw species counts can overestimate or underestimate α diversity in the presence of temporary emigration, and the specific biases depend on the values of detection and emigration probabilities. Our newly proposed model yielded unbiased estimates of α, β, and γ diversity in the presence of temporary emigration. The application to empirical data suggested that accounting for temporary emigration lowered area-based rarefaction curves because availability probabilities of individual species were estimated to be <1.
Temporary emigration prevails in field surveys and has broad significance for understanding the ecology and function of biological communities, and separating imperfect detection from temporary emigration resolves long-standing issues in the use of count data. We therefore suggest that the consideration of temporary emigration would contribute to understanding the nature and role of biological communities.

INTRODUCTION
Species richness is defined as the number of species in a specified area, and it is the simplest and most basic measure of species diversity (Whittaker 1975). Species richness varies in space and time (Rosenzweig 1995), and its drivers are of primary interest in ecology (Huston 1994). As humans have transformed >75% of Earth's ice-free land surface (Ellis and Ramankutty 2008) and appropriated >23% of its terrestrial net primary productivity (Haberl et al. 2007), global species diversity is now greatly threatened (Barnosky et al. 2011). Anthropogenic drivers of species richness or human-induced changes in species richness are therefore of prime interest in modern ecology (Newbold et al. 2016).
In this context, species richness has often been compared among sampling plots (MacArthur and MacArthur 1961, Huston 2014). One difficulty in doing so is the non-linear relationship between species richness and area (Connor and McCoy 1979), and it is therefore challenging to compare species richness among studies having sampling plots of different sizes and even among sampling plots within the same study (Wiens 1989, Gotelli and Colwell 2001). For example, when comparing species richness among different land uses, how to control for differences in plot size is a major issue (cf. Newbold et al. 2015).
Recently, we have proposed a community modeling framework based on individual species-level abundance models (Yamaura et al. 2011, 2012). This community abundance modeling framework assumes that local abundance follows a Poisson distribution with expected abundance $\lambda$, and the probability that at least one individual of a species occurs is related directly to $\lambda$ (Royle and Dorazio 2008): $\Pr[N \geq 1] = 1 - \exp(-\lambda)$. Under this model, we can link this expected abundance $\lambda$ to the log-transformed size (area) of sampling plot $j$ ($A_j$) using the log link (Connor et al. 1997): $\log(\lambda_j) = \beta_0 + 1 \times \log(A_j)$. Here, $\beta_0$ is the logarithmically transformed expected abundance when plot size is 1 ($\lambda_j = \exp[\beta_0]$, i.e., population density). By fixing the coefficient of the logarithmically transformed size at 1, we assume that population densities are constant irrespective of the area. However, we can relax this assumption using a free parameter $\beta_1$ not constrained to be equal to 1 (i.e., $\beta_1 \times \log[A_j]$). We can also include other environmental covariates and associated coefficients into this linear predictor of $\log(\lambda_j)$ (e.g., $\beta_2 \times x_2$). By expanding these equations into $R$ species, we can therefore link the expected species richness of plot $j$, $E(\mathrm{SR}_j)$, and area (Yamaura et al. 2016a, b) according to:
$$E(\mathrm{SR}_j) = \sum_{i=1}^{R} \Pr[N_{ij} \geq 1] = \sum_{i=1}^{R} \{1 - \exp(-\lambda_{ij})\}, \quad \lambda_{ij} = \exp(\beta_{0i} + 1 \times \log[A_j]) \tag{1}$$
where $\lambda_{ij}$ is the expected abundance of species $i$ in sampling plot $j$ having size $A_j$. In the presence of imperfect detection of individuals and species, the relevant parameters dictating $\lambda_{ij}$ are estimated from repeated counts or capture-recapture data of detected species (Yamaura et al. 2016a, b). We again note that population densities of the species constituting communities are assumed to be constant irrespective of the area (by the notation $1 \times \log[A_j]$) for simplicity (James and Wamer 1982, but see Yamaura et al. 2016a).
This formulation allows us to construct area-based rarefaction curves (i.e., changes in species richness with area) given the scale-free parameters, $\lambda_{ij} \mid [A_j = 1] = \exp(\beta_{0i})$: the population densities of individual species.
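The link between species-level densities and expected richness can be sketched numerically. The following is a minimal illustration, not the authors' code: the function name `expected_richness` and the four density values are hypothetical, and the coefficient of log area is fixed at 1 unless `beta1` is changed.

```python
import math

def expected_richness(area, beta0, beta1=1.0):
    """Expected number of species in a plot of the given area.

    beta0: species-level log densities (log expected abundance at area = 1)
    beta1: coefficient of log(area); 1.0 keeps density constant across areas
    """
    total = 0.0
    for b0 in beta0:
        lam = math.exp(b0 + beta1 * math.log(area))  # expected abundance
        total += 1.0 - math.exp(-lam)                # Pr[N >= 1] under a Poisson model
    return total

# Hypothetical densities per unit area for four species (illustrative only)
beta0 = [math.log(d) for d in (2.0, 0.5, 0.1, 0.01)]
areas = [0.1, 1.0, 10.0]
curve = [expected_richness(a, beta0) for a in areas]
```

Evaluating the function over a grid of areas gives an area-based rarefaction curve; it rises with area and is bounded above by the number of species supplied.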
This development suggests the importance of accounting for variation in density among species comprising the community. Using the number of detected individuals (count data), our proposed community abundance models (CAMs) yield density estimates of individual species and species richness accounting for imperfect detection (Yamaura et al. 2011, 2012, 2016b), which is one of the major sources biasing density estimates (Kéry and Royle 2016). However, there is another major source of bias for mobile organisms: temporary emigration, which is due, in part, to movement of individuals off of the survey plot being sampled. Nichols et al. (2009) decomposed the detection process in a broad sense into three parts (Fig. 1). First, given that the home range of an individual only partially overlaps the sampling plot, the individual can be unavailable for sampling because it is out of the plot when the survey is conducted (Fig. 2). Following Nichols et al. (2009), we refer to the probability of individuals residing in the plots as the presence probability, $p_p$, and $1 - p_p$ is the probability of temporary emigration (Table 1). This temporary emigration process due to movement of individuals is called spatial temporary emigration (Kéry and Royle 2016), and we say that individuals are spatially available when they are present in the sampling plot. Second, given that an individual is spatially available to be sampled (i.e., the individual is on the sampling plot), it can still be unavailable for detection because it may not produce detection signals (cues). For example, birds may be sitting on the nest and be silent; marine mammals may be below the water surface. This process, which is unrelated to individuals moving about their home range, is random temporary emigration (Kéry and Royle 2016), and the associated probability can be denoted by $1 - p_a$ (Nichols et al. 2009), where $p_a$ is the probability of availability related to signaling (Table 1).
We call the combined process of spatial and random temporary emigration the availability process in a broad sense (Fig. 1), and the corresponding probability is $\phi$ ($= p_p \times p_a$). We also call its complement ($1 - \phi$) temporary emigration in a broad sense. Finally, an individual that is both present on the plot and available for detection through signals can be detected by the surveyor with probability $p_d$ (Nichols et al. 2009). The problem of temporary emigration can propagate to inference about not only density but also occupancy (e.g., Tyre et al. 2003); in distance sampling, random temporary emigration is known to violate the assumption that all birds on the census line (i.e., zero distance) are perfectly detected (denoted by $g[0] = 1$), and can cause substantial bias in density estimates (Buckland et al. 2004).
Here, we propose an approach to overcome this problem in a community context by integrating count-based CAMs and open hierarchical distance sampling (HDS) that allows for temporary emigration (Kéry and Royle 2016). Although our proposed model does not specify population dynamics, the open-population version of HDS allows the population size to change during the course of multiple surveys (Kéry and Royle 2016). Our model separately estimates detection probability ($p_d$) and availability probability in a broad sense (denoted by $\phi = p_p \times p_a$) and produces density estimates of individual species unbiased by temporary emigration. This is possible because detection probability $p_d$ can be estimated from a single visit with distance sampling given the assumption of perfect detection at the zero distance. Multiple visits to individual sites allow for inference about availability probability.
Fig. 1. Decomposition of processes associated with detecting individuals during field surveys. An individual whose home range or territory overlaps with the sampling plot is detected by the surveyor via three steps (Nichols et al. 2009). The individual is detected with probability $p_d$ given that it is in the plot and produces signals (song in an acoustic survey). These two processes required for the individual detection (spatial and random temporary emigration: Kéry and Royle 2016) are denoted by the probabilities $p_p$ and $p_a$, and we call their product "availability" in a broad sense ($p_p \times p_a = \phi$). Which of the detection probabilities ($p_d$, $p_a \times p_d$, or $p_p \times p_a \times p_d$) is associated with the field survey depends on the survey methods employed.
We used simulation to show how temporary emigration affects the estimation of biological communities, including α, β, and γ diversity, and that our model can yield unbiased estimates. We then applied our model to empirical data and constructed area-based rarefaction curves accounting for temporary emigration. In the next section, we begin with an overview of the temporary emigration process and then develop the model.

Temporary emigration as a component of the detection process
Among spatial and random temporary emigration processes, we are primarily interested in the spatial temporary emigration because this process is prevalent in studies on mobile organisms. For example, observed individuals whose territories partially overlap with the sampling plots are frequently excluded from the analysis (e.g., Loman and von Schantz 1991), and James and Wamer (1982) excluded 23% (11/48) of the observed species in order to construct rarefaction curves since only fractions of their territories were included in the studied plots. Hanski and Haila (1988) showed that male individuals of chaffinch Fringilla coelebs can spend a long time out of their singing territories for foraging. Haila (1988) also discussed the problems of density calculation in small habitat fragments when birds can forage in the surrounding areas. Stratford and Robinson (2005) suggested that tropical birds have substantially larger territories and home ranges compared to temperate birds and thus should be more susceptible to biases induced by spatial temporary emigration.
Ideally, the three processes involved with imperfect detection should be explicitly addressed when estimating population densities and community structure. However, our ability to resolve all three components of the detection process depends on the protocols being employed (see also Discussion). If we assume $p_a = 1$ and use distance sampling on repeated sampling bouts, we can separate spatial temporary emigration ($p_p$) from the probability of detection ($p_d$). Our study follows this line of model development in a community context. With this protocol, however, it is not possible to explicitly estimate all three components. Rather, we can only estimate directly $p_d$ and the product $p_p \times p_a$ ($= \phi$ in our model).
In the presence of spatial temporary emigration of mobile individuals, density can be defined as the instantaneous number of individuals residing in the area of interest (i.e., based on their snapshot location) divided by the respective area (Buckland et al. 2001, Royle and Dorazio 2008). If we do not separate $p_p$ from $p_d$, then nominal detection probability from repeated counts includes $p_p$, and the number of individuals that can be present in the plots ("super-population size") is estimated (Nichols et al. 2009). This leads to the overestimation of population densities because the appropriate denominator for calculating densities is then not the size of the sampling plots but the larger effective sampling area (Chandler et al. 2011). Hutto et al. (1986) therefore suggested that actual density cannot be calculated reliably due to the unknown effective sampling area in fixed-radius point counts.
Fig. 2. Schematic illustration of partial availability of bird individuals during the survey. A rectangular sampling plot (100 × 400 m) is assumed, and its size follows the field survey used by the model application. Territory size differs among the species; species like coal tit Periparus ater can have small territories (1 ha), some of which may be well incorporated by the plot (shown by the light-gray circles). But other species like woodpeckers can have larger territories only partially overlapping with the plot (dark gray), and they can be unavailable (out of the plot) when the survey is conducted.
The concept of spatial temporary emigration is manifested as a closure assumption in models of abundance and occupancy estimation in which population size or occupancy status is assumed to be constant over the survey period (Royle 2004, MacKenzie et al. 2006). Spatial temporary emigration violates this assumption. To safely ignore the problem of temporary emigration for density estimation, sampling plots must be sufficiently large relative to the home range size of individuals (Royle and Dorazio 2008). However, in the context of sampling communities, home range size differs among species, and large plots suited for species with large home range sizes cannot usually be attained due to logistics.

The model: open-population CAM
We integrate CAM and open HDS to simultaneously deal with both imperfect detection and temporary emigration. We note that sampling methods other than distance sampling can be used, such as removal sampling (Amundson et al. 2014), capture-recapture protocols (Yamaura et al. 2016a), and multiple counts in a single visit (Chandler et al. 2011); in any case, multiple visits to individual sampling plots are required to deal with temporary emigration. We develop our model using distance sampling because this method is well known and commonly practiced in field studies. We can apply distance sampling when we measure and record the distance from the transect line or point to the individuals; nevertheless, distance sampling has rarely been used in CAMs (Sollmann et al. 2016). The model is composed of three hierarchical levels. The first level is the process describing the super-population size of species $i$ for site $j$ ($M_{ij}$), and we assume that $M_{ij}$ follows a Poisson process (or negative binomial distribution; Sollmann et al. 2016):
$$M_{ij} \sim \text{Poisson}(\lambda_{ij}) \tag{2}$$
where $\lambda_{ij}$ is an expected value and will be represented by a log-linear predictor as a function of relevant covariates. Conceptually, this quantity $M_{ij}$ is the number of individuals of species $i$ that are ever exposed to sampling at site $j$, that is, which have at least some part of their home range within the sampled plot. We can include random effects in the predictor to consider variation not explained by the Poisson distribution (Yamaura et al. 2012).
The next level deals with the emigration process, and we assume that the number of individuals exposed to sampling (present in the sampling plots and signaling presence) during visit $k$ ($N_{ijk}$) follows a binomial process with $M_{ij}$ as the number of trials and success rate (availability probability in the broad sense) $\phi_i$ ($= p_p \times p_a$):
$$N_{ijk} \sim \text{Binomial}(M_{ij}, \phi_i) \tag{3}$$
Although we assume that every individual of the same species has constant $\phi_i$, this probability can vary among individuals depending on the location of home ranges relative to the sampling plots (Fig. 2). However, in the context of a single-species hierarchical model with removal sampling, Chandler et al. (2011) showed that density estimates can be unbiased when $\phi_i$ varies among individuals but is modeled as constant. Bias is expected to decrease with the number of sampling sites and visits, which relates to the concept of pooling robustness (Burnham et al. 1980).
The third level of the hierarchical model is the detection process, describing the number of individuals detected at visit $k$ in distance classes (up to $D$), $\mathbf{y}_{ijk} = (y_{ijk1}, y_{ijk2}, \ldots, y_{ijkD})$:
$$\mathbf{y}_{ijk} \sim \text{Multinomial}(N_{ijk}, \boldsymbol{\pi}_{ij}) \tag{4}$$
where $\boldsymbol{\pi}_{ij}$ (cell probabilities of the multinomial distribution) describes the distribution of observed detections among distance classes (discrete distance classes are commonly used; Buckland et al. 2001). These detection and availability parameters can depend on covariates (Kéry and Royle 2016, Sollmann et al. 2016). We note that availability and detection probabilities are separately estimated given this data structure and model formulation. Variations in the number of available individuals exposed to detection ($N_{ijk}$) among the visits provide information on the availability probability $\phi_i$ (Eq. 3), and variations in the number of detected individuals among the distance classes ($\mathbf{y}_{ijk}$) given the total number of detected individuals at the individual visits (expected to be $\lambda_{ij} \times \phi_i \times \sum_{d=1}^{D} \pi_{ijd}$) provide the information necessary to estimate detection probability (parameters describing $\boldsymbol{\pi}_{ij}$ [Eq. 4]). In other words, multiple visits to the sampling sites are required to estimate availability probability, while detection probability can be estimated from a single visit using distance sampling. Nevertheless, multiple visits are expected to increase the accuracy of parameter estimates of distance sampling models. We also note that detection probability of individual species is described by the scale parameter ($\sigma_i$) of the detection model, which dictates the rate of decay of detection probability as a function of distance (Buckland et al. 2001, Kéry and Royle 2016). We assume that detection probability is equal to 1 at the zero distance for individuals that are available for detection during any particular visit (see Discussion for the relaxation of this assumption).
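The three hierarchical levels can be made concrete with a small forward simulation for one species at one site. This is only a sketch under stated assumptions: the function names, the seed, and the cell probabilities in `pi` are hypothetical, and the Poisson draw uses Knuth's multiplication method rather than any routine from the software used in the study.

```python
import math
import random

def draw_poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_species_site(rng, lam, phi, pi, n_visits):
    """Simulate the three levels for one species at one site.

    lam: expected super-population size
    phi: availability probability in the broad sense
    pi:  detection probabilities per distance class (their sum may be < 1;
         the remainder is the probability of going undetected)
    """
    M = draw_poisson(rng, lam)                         # level 1: super-population
    visits = []
    for _ in range(n_visits):
        N = sum(rng.random() < phi for _ in range(M))  # level 2: available individuals
        y = [0] * len(pi)
        for _ in range(N):                             # level 3: multinomial detection
            u, cum = rng.random(), 0.0
            for d, p in enumerate(pi):
                cum += p
                if u < cum:
                    y[d] += 1
                    break
        visits.append((N, y))
    return M, visits

rng = random.Random(42)
pi = [0.30, 0.25, 0.15, 0.08, 0.02]  # hypothetical cell probabilities
M, visits = simulate_species_site(rng, lam=5.0, phi=0.5, pi=pi, n_visits=3)
```

In such a simulation the number available changes from visit to visit while the super-population stays fixed, which is the variation that informs the availability probability.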
For each parameter (e.g., intercepts in the linear predictor of k ij ), species-level parameters (suitably transformed) are governed by a single normal distribution with community-level hyper-parameters (mean and SD), and the existence of undetected species throughout the survey is considered by use of the method of data augmentation (Yamaura et al. 2016b).

Simulation experiments
We conducted simulation experiments to examine the effects of imperfect detection and temporary emigration on community analysis with 30 sampling sites. There was a single covariate ($x_j$) that can affect abundance of individual species, which ranged from −1 to 1, and its values were equally spaced among the sites (i.e., −1.00, −0.93, . . ., 0.93, 1.00). The regional species richness (the number of species comprising the regional meta-community, $R$) was fixed at 40 across the 30 sampling sites, and these species had 1.0 mean abundance (super-population size) and 1.0 standard deviation ($\exp[\mu_{\beta_0}] = \sigma_{\beta_0} = 1$), and a mean slope of 1.0 for $x_j$ in the linear predictor of $\lambda_{ij}$ ($\mu_{\beta_1} = \sigma_{\beta_1} = 1$). This situation indicates that species richness increases with $x_j$. We then addressed five cases with different mean values of detection probability, represented by the scale parameter ($\bar{\sigma}$) of the half-normal detection model (Buckland et al. 2001, Kéry and Royle 2016), and availability probability ($\bar{\phi}$: Table 2). These five cases comprised the combinations of high and low detection and availability probabilities, and an intermediate case. Species-specific parameters of detection and availability probabilities ($\sigma_i$ and $\phi_i$) are log- and logit-transformed normal random variables. We then conducted virtual surveys with three visits to each site since three visits have been shown to produce good performance of the single-species and community N-mixture (closed population) models (Yamaura 2013, Yamaura et al. 2016b). We collected distance sampling data with rectangular plots up to 50 m maximum distance and 10 m width distance classes (five distance classes). Although we used rectangular plots as in our empirical data, circular plots (point transects) can be adopted using the corresponding distributional function of distances (Buckland et al. 2001, Kéry and Royle 2016).
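As a rough illustration of this design, the covariate grid and species-level abundance parameters could be generated as follows. The variable names and the random seed are hypothetical, and the draws are not those used in the study.

```python
import math
import random

n_sites, n_species = 30, 40

# Covariate values equally spaced from -1 to 1 across the sites
x = [-1.0 + 2.0 * j / (n_sites - 1) for j in range(n_sites)]

rng = random.Random(1)                 # arbitrary seed, for reproducibility only
mu_b0, sd_b0 = math.log(1.0), 1.0      # intercepts: mean abundance 1.0 on the natural scale
mu_b1, sd_b1 = 1.0, 1.0                # slopes for the covariate
beta0 = [rng.gauss(mu_b0, sd_b0) for _ in range(n_species)]
beta1 = [rng.gauss(mu_b1, sd_b1) for _ in range(n_species)]

# Expected super-population size of species i at site j (log-linear predictor)
lam = [[math.exp(b0 + b1 * xj) for xj in x] for b0, b1 in zip(beta0, beta1)]
```

The second covariate value rounds to −0.93, matching the spacing described in the text.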
The variations of scale parameters represent the situations in which detection probability quickly declines ($\bar{\sigma} = 10$) and hardly declines ($\bar{\sigma} = 100$) from zero to the maximum 50 m distance. These two scale parameters represent plot-level detection probabilities of 0.25 ($\bar{\sigma} = 10$) and 0.96 ($\bar{\sigma} = 100$) (Table 2).
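The correspondence between the scale parameter and plot-level detection probability can be checked by integrating a half-normal detection function over the strip. The sketch below assumes perpendicular distances are uniformly distributed up to the 50 m maximum, as is usual for line transects; the function names are hypothetical.

```python
import math

def halfnormal_area(sigma, x):
    """Integral of exp(-t^2 / (2 sigma^2)) for t from 0 to x."""
    return sigma * math.sqrt(math.pi / 2.0) * math.erf(x / (sigma * math.sqrt(2.0)))

def cell_probabilities(sigma, breaks):
    """Detection probability per distance class, assuming distances are
    uniform between 0 and the outermost break."""
    width = breaks[-1]
    return [(halfnormal_area(sigma, b) - halfnormal_area(sigma, a)) / width
            for a, b in zip(breaks[:-1], breaks[1:])]

breaks = [0, 10, 20, 30, 40, 50]                    # five 10 m distance classes
p_low = sum(cell_probabilities(10.0, breaks))       # about 0.25
p_high = sum(cell_probabilities(100.0, breaks))     # about 0.96
```

Summing the cell probabilities gives the probability that an available individual is detected anywhere in the plot, and the two values agree with those stated in the text.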
We applied the open-population CAM to the distance sampling data to infer the responses of community-level densities and species richness to the covariate. We augmented the observed data sets with $m$ ($= 80 - S$) potential species with zero count histories, where $S$ was the number of detected species. We defined expected densities ($\text{dens}_{ij}$) of the plots as $\lambda_{ij} \times \phi_i \times w_i$, where $\lambda_{ij}$ is an expected value of super-population size, and $w_i$, the data augmentation variable, is a random variable that follows a Bernoulli distribution and indicates whether species $i$ is included in the community (Yamaura et al. 2016b). For the detected species, this variable takes the value of 1 (with posterior probability 1.0) but can have the value of 0 for undetected species. We estimated site-specific community densities by summing $\text{dens}_{ij}$ across the species. Although we did not divide these quantities by the corresponding plot size, we treated these as densities since they were abundances for a constant unit size. Density-based expected local species richness of site $j$, $E(\mathrm{SR}_j)$ (α diversity), was estimated by summing the occupancy probability ($\psi_{ij}$) derived from the local densities across the species:
$$E(\mathrm{SR}_j) = \sum_{i} \psi_{ij} = \sum_{i} \{1 - \exp(-\text{dens}_{ij})\} \tag{5}$$
We obtained the overall occupancy probability across 30 sites for individual species ($\psi_{i\cdot}$) as the complement of the product of the probabilities that a site is not occupied:
$$\psi_{i\cdot} = 1 - \prod_{j=1}^{30} (1 - \psi_{ij}) \tag{6}$$
The summation of $\psi_{i\cdot}$ across the species is the expected overall species richness across the sampling plots (γ diversity), and we partitioned γ diversity into β and α diversity by subtracting the mean expected α diversity from γ diversity (additive partitioning; Crist et al. 2003).
Notes to Table 2: Values of 100 and 10 for $\bar{\sigma}$ indicate high and low detection probability, respectively. Plot-level detection probabilities $p_d$ corresponding to $\bar{\sigma}$ are also shown. Values of 0.9 and 0.1 for $\bar{\phi}$ indicate high and low availability probability, respectively. The last case represents the intermediate scenario. Species variations of $\sigma$ and $\phi$ were produced by normal distributions via log and logit transformations, respectively. We used values of 0.5 for the SD of $\sigma$ and 1.0 for the SD of $\phi$ at log and logit scales, respectively.
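The derivation of the three diversity components from species-by-site densities can be sketched as below. The function name and the toy community (three species with identical densities at 30 sites) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def diversity_partition(dens):
    """Additive partitioning from expected densities.

    dens[i][j]: expected density of species i at site j.
    Returns (alpha, beta, gamma), with beta = gamma - alpha.
    """
    n_sites = len(dens[0])
    # Occupancy probability of each species at each site
    psi = [[1.0 - math.exp(-d) for d in row] for row in dens]
    # Mean expected local richness across sites
    alpha = sum(sum(row[j] for row in psi) for j in range(n_sites)) / n_sites
    # Probability that each species occupies at least one site, summed over species
    gamma = sum(1.0 - math.prod(1.0 - p for p in row) for row in psi)
    return alpha, gamma - alpha, gamma

# Hypothetical community: 3 species, 30 sites, super-population size 1.0 everywhere
closed = [[1.0] * 30 for _ in range(3)]
opened = [[1.0 * 0.1] * 30 for _ in range(3)]   # availability probability 0.1

alpha_c, beta_c, gamma_c = diversity_partition(closed)
alpha_o, beta_o, gamma_o = diversity_partition(opened)
```

In this toy case, low availability reduces α diversity strongly and γ diversity only slightly, so β diversity increases, which is the qualitative pattern reported for the simulations.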
To show the effects of making use of repeated distance sampling data and therefore considering temporary emigration, we reduced the distance sampling data to simple counts and applied the conventional N-mixture CAM that does not consider temporary emigration (Yamaura et al. 2012, 2016b). The N-mixture CAM thus serves as an explicit model for the super-population size of individuals ever available for sampling during the repeated surveys (Nichols et al. 2009, Chandler et al. 2011). This conventional CAM is based on simple repeated count data and is suggested to be a promising approach to dealing with community-level abundance (Iknayan et al. 2014). In the application of the N-mixture CAM to the temporary emigration data, the number of detected individuals for species $i$ on site $j$ at visit $k$ ($y_{ijk}$) is $y_{ijk} \sim \text{Binomial}(N_{ij}, p_{\text{cap},i})$, where $p_{\text{cap},i}$ is the probability of an individual being encountered, averaged over all distance classes. However, as described above, because the N-mixture model does not consider the emigration process (Eq. 3) and regards $N_{ij}$ as a realization of the underlying Poisson process ($N_{ij} \sim \text{Poisson}[\lambda_{ij}]$), the nominal detection probability that it estimates is $p_{\text{cap},i} \times \phi_i$. Estimates of $N_{ij}$ are therefore super-population sizes. We note again that the objective of the comparison is to provide an alternative analysis (N-mixture model), which is sensible when only total counts are obtained (i.e., absent distance data), for our proposed open-population community model. Another modeling option is to fit a basic distance sampling model based on the aggregate counts in each distance class, summed across the visits ($\mathbf{y}_{ij\cdot} = \sum_k \mathbf{y}_{ijk}$). This quantity follows the multinomial distribution: $\mathbf{y}_{ij\cdot} \sim \text{Multinomial}(K \times N_{ij}, \boldsymbol{\pi}_{ij})$. Although we did not fit this model to our data, as in the N-mixture model, the nominal detection probability would include the availability probability ($\phi_i$), and the model would also produce estimates of the super-population size.
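Why simple counts absorb availability into the nominal detection probability can be verified numerically: a binomial availability step followed by a binomial detection step gives the same count distribution as a single binomial step whose success rate is the product of the two probabilities. The sketch below uses hypothetical values and function names.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def open_count_pmf(y, M, phi, p_cap):
    """Pr(y | M) when the number available is Binomial(M, phi) and the count,
    given the number available N, is Binomial(N, p_cap)."""
    return sum(binom_pmf(N, M, phi) * binom_pmf(y, N, p_cap)
               for N in range(y, M + 1))

M, phi, p_cap = 6, 0.5, 0.4   # hypothetical super-population and probabilities
open_pmf = [open_count_pmf(y, M, phi, p_cap) for y in range(M + 1)]
closed_pmf = [binom_pmf(y, M, phi * p_cap) for y in range(M + 1)]
```

The two lists agree to numerical precision, so total counts alone cannot separate availability from detection; the distance data supply the extra information.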
We obtained the same estimates described above from this closed-population CAM except that densities were calculated as $\lambda_{ij} \times w_i$ (availability probability [$\phi_i$] was not considered). We compared the inferred responses of community densities and species richness to the covariate and α, β, and γ diversity from open- and closed-population CAMs (using posterior distributions) with those of naïve counts and known true (expected) values under closure and non-closure (open) assumptions. These true values were obtained from the known true values of $\lambda_{ij}$ used to generate detection data, with and without $\phi_i$ for the open and closure assumptions, respectively.
We fitted the models using Markov chain Monte Carlo (MCMC), using conventional uninformative priors, 10,000 burn-in, and 100,000 iterations, and with three chains. We truncated community- and species-level $\sigma$ in the range of 2.7-403 to enhance the convergence. Model fitting was conducted using JAGS ver. 3.2.0 (Plummer 2012) with jagsUI ver. 1.3.7 (Kellner 2015) and R ver. 3.2.3 (R Core Team 2015). We also used vegan ver. 2.3-0 (Oksanen et al. 2015) for the additive partitioning of count data. We assessed the chain convergence with the Gelman-Rubin statistic of all parameters (<1.1), and conducted an additional 100,000 iterations until convergence was attained using the function autojags in jagsUI. We replicated the simulations 100 times for each of five cases with different detection and availability probabilities. We obtained median values of the posterior distributions, and their mean values were used to represent the data across the 100 replicates.

Model application to the empirical data
We applied the open-population CAM to distance sampling data from a survey of birds which we previously used to develop and fit the N-mixture (closed population) CAM (Yamaura et al. 2012). The study area (15 × 15 km) was in the northern part of Kitakami highland, Iwate prefecture, northern Japan (39°50′ N, 141°19′ E). This area (200-1000 m asl) was covered by forests, which were dominated by deciduous natural forests and larch Larix leptolepis plantations. Original data were collected to examine the abundance and species richness of breeding bird communities in four open (pasture, meadow, young planted forest, and abandoned clear-cut) and two forested habitats (mature planted forest and natural old-growth). We established five sampling plots, each of which was 4 ha in size (100 × 400 m; Fig. 2), for each habitat type (total of 30 plots). Sampling plots were spaced at least 600 m apart except for meadow plots. Line transect surveys were conducted by one person (Y.Y.) during 19 May-28 June 2009, and each plot was visited in the morning (sunrise to 10:00 a.m.) five times on different days. We recorded the number of individuals of each species in the plots and their distances from the transect line.
We fitted the open-population CAM to the data and derived the abundance in each habitat of size $h$ (ha) accounting for temporary emigration as follows:
$$\text{abund}_{ijh} = (\lambda_{ij} \times \phi_i \times w_i / 4) \times \text{area}_h$$
where $\lambda_{ij}$ is an expected super-population size of species $i$ for habitat $j$ in the original plot size (4 ha), and the quantity in parentheses indicates the expected density (per ha) accounting for temporary emigration ($\phi_i$). We substituted this term for $\lambda_{ij}$ in Eq. 1, and obtained expected species richness for each of six habitats at 53 different plot sizes from 0.01 to 10 ha. Species richness of songbirds was expected to change across this range of plot size (Yamaura et al. 2016a). We then constructed the area-based rarefaction curves for six habitat types based on these estimates. Following the framework of additive partitioning, we also obtained mean values of species richness across six habitats as α diversity at respective plot sizes. Expected species richness across six habitats (γ diversity) was estimated using Eqs. 5, 6, in which we first estimated the probability that each species occupied at least one of the six habitats at respective sizes and then summed these probabilities across the species. We similarly constructed rarefaction curves from the closed-population CAM ignoring temporary emigration. The data were composed of 47 observed species, and we augmented them with 30 potential species. We fitted the model with the same settings as in the simulations (e.g., numbers of burn-in and iterations, software), and convergence was determined for the relevant community-level parameters.
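The rescaling from the 4 ha plot to an arbitrary plot size, and its effect on the rarefaction curve, can be sketched as follows. All parameter values are hypothetical rather than estimates from the bird data, and the five plot sizes stand in for the 53 sizes used in the study, whose spacing is not reproduced here.

```python
import math

def richness_at_size(lam, phi, w, area_ha, plot_ha=4.0):
    """Expected species richness in a plot of area_ha hectares.

    lam: expected super-population sizes per species for the original plot
    phi: availability probabilities per species
    w:   data augmentation indicators (1 = species belongs to the community)
    """
    total = 0.0
    for lam_i, phi_i, w_i in zip(lam, phi, w):
        abund = (lam_i * phi_i * w_i / plot_ha) * area_ha  # expected abundance at this size
        total += 1.0 - math.exp(-abund)                    # occupancy probability
    return total

# Hypothetical values for three species (not estimates from the study)
lam = [8.0, 2.0, 0.4]
phi = [0.8, 0.5, 0.3]
w = [1, 1, 1]
sizes = [0.01, 0.1, 1.0, 4.0, 10.0]

open_curve = [richness_at_size(lam, phi, w, a) for a in sizes]
closed_curve = [richness_at_size(lam, [1.0] * 3, w, a) for a in sizes]  # ignores emigration
```

Because every availability probability here is below 1, the curve that accounts for temporary emigration lies below the closed-population curve at each plot size.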

Analysis of simulated data
Under the high detectability and availability scenario ($\sigma = 100$, $\phi = 0.9$), which can be considered an ideal situation (nearly perfect detection and high availability), there were few differences among the true, estimated, and observed values of community-level densities and species richness along the covariate (x), or in their additive partitioning (Fig. 3a-c). Nevertheless, even with this high availability, relaxation of the closure assumption slightly decreased expected community densities and local species richness. Low availability ($\phi = 0.1$) greatly decreased true community densities and local species richness compared with estimates obtained under the closure assumption, but did not greatly decrease overall species richness (γ diversity); β diversity was therefore increased (Fig. 3d-f). The open-population CAM correctly recovered true values of community densities and species richness. The closed-population CAM estimated the appropriate super-population sizes (Fig. 3). Therefore, the presence of temporary emigration, when ignored, led to the overestimation of α diversity and the underestimation of β diversity. The most information-poor scenario ($\sigma = 10$, $\phi = 0.1$) was challenging for the closed-population CAM, yielding very diffuse posterior distributions and hence wide CIs (Fig. 3j-l). Bias of raw counts of individuals (index of densities) and local species richness depended on the situation; high detectability and low availability ($\sigma = 100$, $\phi = 0.1$) made raw counts higher than true values of community densities and species richness under the open community data-generating model (Fig. 3d-f). Low detectability and high availability ($\sigma = 10$, $\phi = 0.9$) instead made raw counts lower than true values of community densities and species richness (Fig. 3g-i).
In the most information-poor scenario with low detectability and availability ($\sigma = 10$, $\phi = 0.1$), and the intermediate scenario ($\sigma = 40$, $\phi = 0.5$), the biases of raw counts of local species richness were smaller than those of the above scenarios (Fig. 3j-o).
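The opposite directions of bias in raw counts can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the simulation design of this study: a single species, a per-visit detection probability `p` supplied directly instead of being derived from a distance-sampling scale parameter, and the raw count taken as the maximum count across visits. The function name and parameter values are hypothetical.

```python
import numpy as np

def simulate_counts(lam, phi, p, n_sites=20000, n_visits=5, seed=1):
    """Simulate repeated counts under temporary emigration.

    M ~ Poisson(lam)      super-population size per plot
    N ~ Binomial(M, phi)  individuals present in the plot on each visit
    y ~ Binomial(N, p)    individuals detected on each visit
    """
    rng = np.random.default_rng(seed)
    M = rng.poisson(lam, size=n_sites)
    M_rep = np.repeat(M[:, None], n_visits, axis=1)
    N = rng.binomial(M_rep, phi)
    y = rng.binomial(N, p)
    return {
        "expected_density": lam * phi,          # true mean number present per plot
        "mean_count": float(y.mean()),          # average count per visit
        "max_count": float(y.max(axis=1).mean()),  # raw index: maximum over visits
    }

# High detectability, low availability: the raw index exceeds the true density
high_p_low_phi = simulate_counts(lam=5.0, phi=0.1, p=1.0)
# Low detectability, high availability: the raw index falls below the true density
low_p_high_phi = simulate_counts(lam=5.0, phi=0.9, p=0.3)
```

With near-perfect detection, taking the maximum over visits accumulates individuals that were present on different days, which pushes the index above the per-visit density; with poor detection, missed individuals dominate instead.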

Analysis of empirical data
Estimates of community- and species-level scale parameters ($\sigma$) of the distance sampling CAM reached the upper limits (>300). This means that detection probabilities were estimated to be nearly 1 within the plots for all species, indicating that all the variation in counts among visits was attributed to temporary emigration. This resulted from detection frequency not declining with distance, together with the assumption of perfect detection at zero distance. The estimate of community-level mean availability was 0.13, and species-level estimates ranged from 0.02 (blue-and-white flycatcher Cyanoptila cyanomelana) to 0.86 (common skylark Alauda arvensis); these values were almost the same as the detection probability estimates obtained from the closed-population CAM.
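To see why a large scale parameter implies nearly perfect detection within the plots, the following minimal sketch computes the average detection probability over the transect half-width. It assumes a half-normal detection function with perfect detection at zero distance and a half-width of 50 m (the plots are 100 m wide); the function name is hypothetical and the half-normal form is an assumption of this sketch.

```python
import math

def mean_detection_halfnormal(sigma, half_width):
    """Average of g(x) = exp(-x^2 / (2 sigma^2)) over distances 0..half_width.

    The integral of g from 0 to w equals sigma * sqrt(pi/2) * erf(w / (sigma * sqrt(2))).
    """
    integral = sigma * math.sqrt(math.pi / 2.0) * math.erf(
        half_width / (sigma * math.sqrt(2.0))
    )
    return integral / half_width

# Scale parameters (m) spanning the values discussed in the text
for sigma in (10.0, 100.0, 300.0):
    print(sigma, round(mean_detection_halfnormal(sigma, 50.0), 3))
```

Under these assumptions the average detection probability is above 0.99 at a scale of 300 m, about 0.96 at 100 m, and about 0.25 at 10 m.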
Estimates of super-population size varied across six habitats, and this variation did not have clear associations with availability estimates (Fig. 4a), that is, every species had small super-population sizes in certain habitat types (~0) and high values in other habitats. However, since low availabilities made maximum density estimates decrease, species with higher availabilities had higher values of maximum estimated density and more variation in density estimates (Fig. 4b).

Fig. 4. Estimates of super-population size and density for individual species in relation to availability probability ($\phi$). Individual vertical lines indicate ranges (maximum and minimum values) of (a) super-population size and (b) density estimates ($\lambda \times \phi$) across six habitat types for individual species, and they were shown against individual species' availability estimates. Estimates of super-population size were obtained by dividing the density estimates (posterior medians) by availability estimates.
Area-based rarefaction curves produced from the open-population CAM were lower than those of the closed-population CAM (Fig. 5a, b). Habitat-specific super-population sizes at the community level (mean values across species) were almost the same in both CAMs, and the relative relationships of rarefaction curves among the six habitats remained unchanged (Fig. 5a, b). Estimated regional species richness was also the same in the two CAMs (~50). Because overall species richness quickly reached the estimated regional species richness in the closed-population CAM, the difference between overall and mean species richness (β diversity) showed a unimodal shape (Fig. 5c). However, in the open-population CAM, the three diversity metrics (α, β, and γ) increased monotonically with area over the range of areas considered (Fig. 5d).

Effects of temporary emigration on density and diversity metrics
Temporary emigration and imperfect detection are long-standing issues in the analysis of count data (Hutto et al. 1986, Hutto 2016, … et al. 2017). We demonstrated that temporary emigration, as well as imperfect detection, can seriously compromise the analysis of biological communities when we are primarily interested in density and density-based expected diversity metrics (Eqs. 5-6). When the closure assumption is violated, the N-mixture (closed population) model infers the number of individuals whose home ranges overlap with the sampling plots (i.e., the super-population size). When this quantity is linked to the size of the sampling plots, that is, treated as a density, the resultant density and diversity metrics can be greatly biased. Such biases become important when detection probability varies among sampling plots, because variation in diversity metrics is then confounded with variation in detection probability (Ruiz-Gutiérrez et al. 2010, McNew and Handel 2015). Availability can also be expected to depend on covariates; for example, territory size is likely to be small in resource-rich habitats (Marshall and Cooper 2004, Haché et al. 2012).
Variation in availability among sites could be accommodated by treating the availability probability $\phi_i$ as a function of covariates, as is done for the detection probability of the N-mixture model (Kéry 2008, Yamaura 2013). Biases associated with temporary emigration are likely to be large when sampling plots are small relative to the home ranges of organisms.
Fig. 5. Area-based rarefaction curves for bird species richness in six habitat types and their additive partitioning produced by two community abundance models. (a, b) Rarefaction curves are shown using the median values of posterior distributions of expected species richness. Both models yielded the same estimates of the regional species richness (upper horizontal lines). (c, d) Mean and overall species richness across the six habitat types (α and γ diversity, respectively) derived from the rarefaction curves. The difference between α and γ diversity is β diversity, which is shown by gray lines.
We applied the open-population CAM to empirical repeated distance sampling data, and the parameter estimates implied almost perfect detection in the plots. We consider this result reasonable because detection probability is known to be high (>90%) up to 50 m (Schieck 1997, Alldredge et al. 2007), which was the maximum distance in our distance sampling data. Therefore, our model application yielded density estimates on the basis of nearly perfect detection and imperfect availability. Imperfect availability led to a reduction in estimated density, and lowered the rarefaction curves based on density. We suggest that existing models dealing only with imperfect detection (Yamaura et al. 2011, 2012, 2016b) would confound detection probability and availability probability, and therefore, abundance estimates cannot be treated as density estimates in many cases (particularly when home range is larger than plot size) unless temporary emigration is explicitly modeled.
The emigration formulated by the open-population CAM includes not only spatial movement of individuals (spatial temporary emigration), but also other stochastic processes that lead to individuals being unavailable for detection (Kéry and Royle 2016). This random temporary emigration (Fig. 1) may somewhat contribute to the underestimation of densities. Specifically, we formulated the density as the product of super-population size and availability probability ($\lambda \times \phi$); following the notation of Fig. 1, our model dealt with the nominal detection probability as $p_d$ rather than $p_a \times p_d$ using distance sampling (given perfect detection at zero distance). Availability probability ($\phi$) and the associated density are therefore equivalent to $p_p \times p_a$ and $\lambda \times p_p \times p_a$, respectively. The use of other sampling methods, including removal sampling or multiple counts in a single visit (Chandler et al. 2011), would incorporate this unavailability ($p_a$) into the nominal detection probability ($= p_a \times p_d$) and yield less biased density estimates ($= \lambda \times p_p$). We also note that Amundson et al. (2014) combined removal sampling and distance sampling to deal with $p_a$ and $p_d$ simultaneously.
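A worked illustration of this bookkeeping may help. The values below are hypothetical (they are not estimates from this study), and the sketch reads $p_p$ as the probability of spatial presence in the plot and $p_a$ as the probability of being available for detection given presence, which is an interpretation of the notation of Fig. 1.

```latex
% Hypothetical values: \lambda = 10, p_p = 0.5, p_a = 0.8
\begin{align*}
\phi &= p_p \times p_a = 0.5 \times 0.8 = 0.4, \\
\lambda \times \phi &= 10 \times 0.4 = 4
  && \text{(density when $p_a$ is absorbed into availability)}, \\
\lambda \times p_p &= 10 \times 0.5 = 5
  && \text{(density when $p_a$ is absorbed into nominal detection)}.
\end{align*}
```

In this example, the difference between 4 and 5 individuals per plot is the underestimation attributable to random temporary emigration.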
For example, in point counts using distance sampling, we may conduct distance sampling multiple times within a single visit and record the time and distance of each detection simultaneously. In this case, the series of equations dictating the individual counts (Eqs. 2-4) would be changed as follows: $N_{ijk} \sim \mathrm{Binomial}(M_{ij}, \phi_i)$ and $L_{ijkl} \sim \mathrm{Binomial}(N_{ijk}, \theta_i)$, where $L_{ijkl}$ is the number of individuals spatially available and producing signals at sub-time l during visit k, and $\theta_i$ is the associated probability of random temporary availability. Separation of these two temporary emigration processes in the transect survey may be achieved by recording other ancillary information on the detection process, such as the time when individuals are detected (Borchers and Cox 2017).
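The nested availability structure can be sketched as a simulation. This is a minimal sketch under stated assumptions: the super-population size is Poisson, the number of signalling individuals at each sub-time is a binomial draw from the spatially available individuals (as the definitions of $L_{ijkl}$ and $\theta_i$ suggest), and the distance-based detection step is omitted. The function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_subtime_availability(lam, phi, theta, n_plots=5000,
                                  n_visits=5, n_sub=3, seed=7):
    """Simulate two levels of temporary emigration for one species.

    M ~ Poisson(lam)         super-population size per plot
    N ~ Binomial(M, phi)     spatially available individuals per visit
    L ~ Binomial(N, theta)   individuals also producing signals per sub-time
    """
    rng = np.random.default_rng(seed)
    M = rng.poisson(lam, size=n_plots)
    N = rng.binomial(np.repeat(M[:, None], n_visits, axis=1), phi)
    L = rng.binomial(np.repeat(N[:, :, None], n_sub, axis=2), theta)
    return M, N, L

M, N, L = simulate_subtime_availability(lam=6.0, phi=0.5, theta=0.7)
# The mean of L approximates lam * phi * theta, and the mean of N approximates lam * phi
print(L.mean(), N.mean())
```

Variation of `L` among sub-times within a visit carries information about `theta`, whereas variation of `N` among visits carries information about `phi`, which is why the additional within-visit replication helps to separate the two processes.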
When are we primarily interested in density rather than abundance (super-population size) in a community context?
When should we deal with spatial temporary emigration separately from imperfect detection by investing additional sampling effort? Individuals exclusively residing in the sampling plots are expected to play greater roles in biological communities and ecosystems than individuals infrequently visiting the sampling plots. Our proposed modeling framework is an option to properly account for these differences, because species richness is treated as a function of area and of the densities of the species comprising the community. Ordinary species richness (the number of detected species in the plot) does not account for temporary emigration but rather implicitly enumerates the species whose individuals have home ranges overlapping the sampling plots. Although species richness is suggested to be related to ecological functions performed by communities (e.g., Philpott et al. 2009), the derivative of the species richness (and occupancy probability)-area curves decreases as area increases, and differences in species richness among habitats can be obscured depending on the area (plot size; Appendix S1). In such cases, we suggest that ecosystem functions performed by organisms would be more directly linked to population densities or other related diversity metrics accounting for spatial availability.
In this study, we suggested that three parameters are related to species richness. The first is the regional species richness: the number of species in regional communities which can occur in the habitats being sampled. We note that the regional species richness is different from γ diversity, which can be substantially smaller than regional species richness when the total size of the sampling plots is small (Iknayan et al. 2014, Yamaura et al. 2016b). The second is the mean population density across species, and the third is the variation in density among species. These quantities are community-level parameters estimated in hierarchical community models and can be scale-free parameters. For example, we can assume that these parameters do not change greatly when the regional biota is shared by the sampling plots, and the underlying environments dictating habitat quality of constituent species are similar. In other words, if we have estimates of population densities for individual species, we can estimate and compare species richness across areas using area-based rarefaction curves. In the CAMs, species variation in population densities (represented by the standard deviation [SD] of $\beta_{0i}$ in Eq. 1) affects rarefaction curves differently depending on the area (or plot size). For small areas, larger SDs increase species richness, whereas for large areas, larger SDs decrease species richness (Appendix S1). Rarefaction curves computed for different habitat types can therefore cross at a certain plot size when different habitats have different levels of variability in population density among species (SD of $\beta_{0i}$; Appendix S1). Rarefaction curves can also cross when regional species richness (or species pool of a meta-community; Iknayan et al. 2014) differs (Appendix S1). This suggests the significance of comparing species richness across areas even when different habitats are surveyed with the same plot sizes.
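The crossing of rarefaction curves caused by among-species variation in density can be reproduced with a small calculation. This is a minimal sketch, not the analysis of Appendix S1: it assumes log-normally distributed species densities represented by evenly spaced normal quantiles, Poisson abundance (so that the occupancy probability is 1 − exp(−density × area)), and hypothetical values for the number of species, the mean log density, and the SDs.

```python
import numpy as np
from statistics import NormalDist

def expected_richness(area, mean_log_density, sd, n_species=50):
    """Expected species richness in a plot of the given area (ha).

    Species log densities are placed at evenly spaced normal quantiles, so the
    result is deterministic; sd plays the role of the SD of the species-level
    intercepts on the log scale.
    """
    probs = (np.arange(n_species) + 0.5) / n_species
    z = np.array([NormalDist().inv_cdf(p) for p in probs])
    density = np.exp(mean_log_density + sd * z)        # individuals per ha
    return float(np.sum(1.0 - np.exp(-density * area)))

# Compare low and high among-species variation at a small and a large area
for area in (0.01, 10.0):
    low_sd = expected_richness(area, mean_log_density=0.0, sd=0.5)
    high_sd = expected_richness(area, mean_log_density=0.0, sd=2.0)
    print(area, round(low_sd, 2), round(high_sd, 2))
```

At 0.01 ha the high-SD community has the higher expected richness because a few very abundant species are almost certain to be present, whereas at 10 ha it has the lower expected richness because its many rare species are still likely to be absent.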
We developed our modeling framework with a focus on community-level densities and species richness (α, β, and γ diversity); however, because imperfect detection can affect the analysis of other diversity metrics such as the Shannon index (Yamaura et al. 2016b), a variety of diversity metrics would also be affected by the prevalent ecological process of temporary emigration. Since temporary emigration prevails in field surveys and has broad significance for the ecology and function of populations and communities, we suggest the importance of accommodating not only imperfect detection but also temporary emigration in the analysis of biological communities.