Predicting kill sites of an apex predator from GPS data in different multiprey systems
Funding information: Charity foundation from Liechtenstein; Fundação para a Ciência e a Tecnologia, Grant/Award Number: SFRH/BD/144110/2019; Hunting inspectorate of the Canton of Bern, Stotzer-Kästli-Stiftung, Zigerli-Hegi-Stiftung, Haldimann-Stiftung, Zürcher Tierschutz, Temperatio-Stiftung, Karl Mayer Stiftung, Stiftung Ormella; Nature Protection Division of the County Governor's Office for Innlandet, Viken, Vestfold & Telemark, Trøndelag, Nordland, Troms & Finnmark County; Miljødirektoratet; Norges Forskningsråd, Grant/Award Numbers: 251112, 281092, 160022/F40; Slovenian Research Agency, Grant/Award Numbers: N1-0163, P4-0059
Abstract
Kill rates are a central parameter to assess the impact of predation on prey species. An accurate estimation of kill rates requires a correct identification of kill sites, often achieved by field-checking GPS location clusters (GLCs). However, there are potential sources of error included in kill-site identification, such as failing to detect GLCs that are kill sites, and misclassifying the generated GLCs (e.g., kill for nonkill) that were not field checked. Here, we address these two sources of error using a large GPS dataset of collared Eurasian lynx (Lynx lynx), an apex predator of conservation concern in Europe, in three multiprey systems, with different combinations of wild, semidomestic, and domestic prey. We first used a subsampling approach to investigate how different GPS-fix schedules affected the detection of GLC-indicated kill sites. Then, we evaluated the potential of the random forest algorithm to classify GLCs as nonkills, small prey kills, and ungulate kills. We show that the number of fixes can be reduced from seven to three fixes per night without missing more than 5% of the ungulate kills, in a system composed of wild prey. Reducing the number of fixes per 24 h decreased the probability of detecting GLCs connected with kill sites, particularly those of semidomestic or domestic prey, and small prey. Random forest successfully predicted between 73%–90% of ungulate kills, but failed to classify most small prey in all systems, with sensitivity (true positive rate) lower than 65%. Additionally, removing domestic prey improved the algorithm's overall accuracy. We provide a set of recommendations for studies focusing on kill-site detection that can be considered for other large carnivore species in addition to the Eurasian lynx. We recommend caution when working in systems including domestic prey, as the odds of underestimating kill rates are higher.
INTRODUCTION
Predation is a fundamental ecological process, with direct implications for both predator and prey species, consequently influencing community structure and ecosystem regulation (Ripple et al., 2014). Predation in large carnivores is usually quantified using two metrics, kill rates and predation rates, which represent the predator and prey population's perspective of predation, respectively (Vucetich et al., 2011). The estimation of both metrics, along with their functional responses, is crucial for understanding the impact of predators on prey populations and, therefore, relevant for the management and conservation of both predator and prey species (Hebblewhite et al., 2007; Sinclair et al., 1998). However, accurate estimates of these metrics remain a key limitation to answering many central ecological questions.
Early studies on predation and the estimation of kill rates by terrestrial carnivores relied on finding prey remains in the field (i.e., kill sites) using a wide variety of field methods, including field-checking with VHF technology, aerial surveys, snow-tracking, or direct observation of hunts (e.g., Boertje et al., 1988; Breitenmoser & Haller, 1993; Carbyn, 1983). Recently, advances in GPS tracking technology have improved the quality of predation studies, introducing field-checking of GPS location clusters (hereafter, GLCs) as a new method (Merrill et al., 2010). Furthermore, the development of GLC analysis has facilitated the identification of specific behavioral states (e.g., Mahoney & Young, 2017), including predation events, prior to field-checking. To date, many predation studies have used at least one of these two approaches, sometimes both (e.g., Knopff et al., 2009; Krofel et al., 2013; Mattisson et al., 2011; Vogt et al., 2018; Webb et al., 2008). These approaches are suitable for several large predators that frequently kill prey larger than or similar to their own body size, because they tend to exhibit fidelity to the kill site for extended time periods (i.e., returning often and/or staying close to the prey remains for several hours or days until they are mostly consumed). This is particularly true for solitary predators such as most large felids (e.g., mountain lion, Puma concolor), while other predators with a group-living social structure (e.g., wolves, Canis lupus, and lions, Panthera leo) tend to consume the prey remains faster (shorter handling time), thus potentially leading to lower detection and prediction rates of kill sites (Merrill et al., 2010).
While GLC field-checking is a reliable method for finding medium-sized or large prey items in many predator–prey systems, the detection of small prey items is more difficult (Palacios & Mech, 2011; Vogt et al., 2018), mainly because of the shorter handling times associated with this type of prey, which cause the predator to move on before a potential GLC is formed, especially with low GPS-fix resolution. Furthermore, the shorter handling times of small prey may hamper the predictive success of GLC analysis, as consumption of small prey can be confused with other types of behavior, such as resting.
Despite the recent and widespread use of GLC field-checking to detect kill sites and the development of GLC analysis, there are several methodological sources of error that can potentially introduce bias in predation studies if disregarded. A first source of error refers to an inappropriate GPS-fix schedule. Deciding on optimal GPS schedules requires a trade-off between obtaining quality data and saving the collar's battery life. However, choosing inappropriate GPS settings with respect to the predator's behavior might lead to missed kills, because a kill would not be detectable through GLCs (e.g., Webb et al., 2008), resulting in an underestimation of kill rates. Second, even if all potential predation events led to the formation of GLCs, not all of them might be field checked. If the generated GLCs are not validated in the field (e.g., due to operational costs or terrain/weather constraints) but are still included in estimates of kill rates as “virtual kills” (e.g., Nilsen et al., 2009), there is a risk of adding false positives, as the GLCs may be a result of other behavioral states such as resting or reproductive activity. Conversely, if they are excluded from the estimates (false negatives), kill rates may be underestimated. A third source of error can be introduced during GLC field-checking. When no prey remains are found at a GLC, it is not always possible to be certain whether there was actually no kill and the predator stayed stationary for other reasons, or whether a kill occurred but its remains were missed. Kills can be missed when, for example, scavengers have consumed or removed the prey completely (Krofel et al., 2013) or the remains are covered with snow or vegetation. The detection of prey remains may also be influenced by the elapsed time between the predation event and the field-checking.
While the third source of error is hard to control for (although it can be reduced by double-checking GLCs and/or using a trained dog, among others; see Blecha & Alldredge, 2015), the first two errors can be addressed by improving study designs and statistical approaches. Specifically, the problems associated with a suboptimal GPS-fix schedule can be minimized by adjusting it to the target predator–prey system. The second source of error can be addressed using cluster-specific attributes from known kill sites to train classification algorithms and avoid removal or false classifications of nonchecked kill sites. Despite the recent developments of several machine learning algorithms in numerous research fields, including animal ecology and species distribution (e.g., Tabak et al., 2019; Tatler et al., 2018), few of the available classification algorithms have been tested or applied in predator–prey studies (e.g., Studd et al., 2021). Furthermore, hidden Markov models have been applied to distinguish different behavioral states, commonly relying on GPS and/or accelerometer data (McClintock et al., 2020; van de Kerk et al., 2015). However, few studies have relied solely on GPS data to predict GLCs using this framework (Franke et al., 2006).
In this study, we address the first two sources of error. First, we test how different GPS-fix schedules influence the detection of GLCs at confirmed kill sites. Specifically, we subsampled data from collars with high-frequency schedules and tested different GLC spatiotemporal parameterizations. Then, we evaluated the potential of a random forest algorithm to correctly predict GLCs as nonkills, small prey kills, and ungulate kills. We focused on three intensively studied ecosystems in Europe where there is a large, solitary felid of conservation concern, the Eurasian lynx (Lynx lynx), which acts as an apex predator in different prey communities (multiprey systems): (1) semidomestic reindeer only (Rangifer tarandus; hereafter, reindeer) (northern Scandinavia), (2) reindeer–roe deer (Capreolus capreolus)–domestic sheep (Ovis aries; hereafter, sheep) (central–south Scandinavia), and (3) roe deer–Alpine chamois (Rupicapra rupicapra; hereafter chamois) (central Europe). Due to the particular combination of prey types within system 2, we further tested if the detection and prediction of domestic prey differed from semidomestic and wild prey. We employed a large GPS dataset of collared lynx, in combination with extensive data from field-checked GLCs. This study is the first addressing these two sources of error potentially affecting lynx kill-site identification across different multiprey systems, including domestic and semidomestic prey. We provide recommendations on how to set the schedules for GPS collars and GLC spatiotemporal parameters to optimize the correct detection of GLCs reflecting kill sites. Additionally, we show the first robust application of a random forest algorithm in predicting kill sites and compare its performance with previously used methods.
METHODS
Study systems and data available
We used GPS and kill-sites data from three multiprey systems (Figure 1). We generated GLCs using 92,315 GPS fixes obtained within a total of 10,139 days, from 66 tracked lynx individuals. We considered 1818 confirmed kill sites and 1822 nonkills (i.e., field-checked GLCs where no prey remains were found; Appendix S1: Table S1). In addition to the main ungulate species (reindeer, roe deer, sheep, chamois), lynx in all study sites also prey on a range of smaller, medium-sized mammals and birds (Appendix S1: Table S2). Data were collected through the EUROLYNX network, a collaborative bottom-up platform of lynx researchers across Europe for sharing data and expertise (Heurich et al., 2021). Further details on the data collection, including field-checking, for each multiprey system are provided in Appendix S1.
GPS-fix schedule requirements
We used a subsampling approach to identify the most suitable GPS schedules and the minimum number of GPS fixes needed to reliably detect GLCs reflecting lynx kill sites. For each system, we gradually subsampled the original datasets (Appendix S2: Table S1), until we reached two fixes per 24 h (Appendix S2: Tables S2 and S3). The original fix schedules varied among the three systems: systems 1 and 2 started at 24 fixes per 24 h and system 3 at seven fixes per 24 h (Figure 2). When the dataset had been subsampled down to six fixes per 24 h, we additionally created different schedules depending on the time of the day: night only, day only, and mixed (i.e., with fixes equally distributed during 24 h). We considered GPS fixes between 5 p.m. and 7 a.m. as night, and the remaining as day. Despite the daylight differences between the study systems, we subsampled all data similarly across the year, because the lynx exhibited a constant bimodal activity pattern in all systems and across seasons (Heurich et al., 2014).
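The subsampling step can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R): the helper `subsample`, the hourly toy schedule, and the specific hours retained in each schedule are assumptions chosen for illustration, and treating a fix at exactly 7 a.m. as a day fix is likewise an assumption about the boundary.

```python
from datetime import datetime, timedelta

NIGHT_START_HOUR = 17  # 5 p.m., night window as defined in the text
NIGHT_END_HOUR = 7     # 7 a.m. (assumption: a fix at 07:00 counts as day)

def is_night(ts):
    """True if the fix falls inside the 5 p.m. to 7 a.m. night window."""
    return ts.hour >= NIGHT_START_HOUR or ts.hour < NIGHT_END_HOUR

def subsample(fixes, keep_hours):
    """Hypothetical helper: keep only fixes taken at the given hours of day."""
    return [ts for ts in fixes if ts.hour in keep_hours]

# Toy schedule: one fix per hour (24 fixes per 24 h) over three days.
start = datetime(2020, 1, 1, 0, 0)
hourly = [start + timedelta(hours=h) for h in range(72)]

# Three illustrative schedules with six fixes per 24 h each.
mixed_6 = subsample(hourly, {0, 4, 8, 12, 16, 20})   # evenly spread
night_6 = subsample(hourly, {18, 20, 22, 0, 2, 4})   # night only
day_6 = subsample(hourly, {7, 9, 11, 13, 15, 16})    # day only

print(len(mixed_6), len(night_6), len(day_6))
```

Each thinned schedule would then be passed to the clustering step, so that any loss of detected kill sites can be attributed to the schedule rather than to the clustering parameters.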
For each original and subsampled dataset, we applied a clustering algorithm to generate GLCs (Clapp et al., 2021). First, we tested each dataset with different scenarios for generating GLCs in terms of: (1) spatial buffer (from 100 to 500 m, with 50 m intervals), (2) temporal window (0.5–5 days, with 12 h intervals), and (3) minimum number of fixes per cluster (from 2 to 10, increasing one fix at a time). All possible combinations of these parameters resulted in 900 different scenarios. Second, we obtained the proportion of field-checked kill sites successfully identified as GLCs by the algorithm (hereafter, success rate) for each scenario. We selected the parameters from the scenario with the highest success rate for further analyses.
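To make the three clustering parameters and the success rate concrete, the following Python sketch implements a deliberately simplified sequential space-time clustering. It is not the GPSeqClus algorithm used in the study: the functions `build_clusters` and `success_rate`, the projected coordinates in metres, and the toy fixes are all assumptions, and kill sites are matched to clusters by distance only, ignoring time.

```python
import math
from datetime import datetime, timedelta

def centroid(cluster):
    """Mean x and y of a cluster of (timestamp, x_m, y_m) fixes."""
    n = len(cluster)
    return (sum(f[1] for f in cluster) / n, sum(f[2] for f in cluster) / n)

def build_clusters(fixes, radius_m, window_days, min_fixes):
    """Assign each fix, in time order, to the first cluster whose centroid is
    within radius_m and whose last fix is within window_days; otherwise start
    a new candidate cluster. Keep clusters with at least min_fixes fixes."""
    clusters = []
    for fix in sorted(fixes):
        ts, x, y = fix
        for cluster in clusters:
            cx, cy = centroid(cluster)
            close = math.hypot(x - cx, y - cy) <= radius_m
            recent = ts - cluster[-1][0] <= timedelta(days=window_days)
            if close and recent:
                cluster.append(fix)
                break
        else:
            clusters.append([fix])
    return [c for c in clusters if len(c) >= min_fixes]

def success_rate(kill_sites, clusters, radius_m):
    """Proportion of kill sites lying within radius_m of a cluster centroid."""
    centroids = [centroid(c) for c in clusters]
    hits = sum(
        any(math.hypot(kx - cx, ky - cy) <= radius_m for cx, cy in centroids)
        for kx, ky in kill_sites
    )
    return hits / len(kill_sites)

# Toy track: the animal returns to (0, 0) on the second night, and passes
# a second kill site at (8000, 8000) only once.
t0 = datetime(2020, 2, 1, 20, 0)
fixes = [
    (t0, 0.0, 0.0),
    (t0 + timedelta(hours=14), 2000.0, 0.0),
    (t0 + timedelta(hours=24), 20.0, 10.0),
    (t0 + timedelta(hours=38), 5000.0, 5000.0),
    (t0 + timedelta(hours=48), 8000.0, 8000.0),
]
kill_sites = [(0.0, 0.0), (8000.0, 8000.0)]

glcs = build_clusters(fixes, radius_m=150, window_days=3, min_fixes=2)
rate = success_rate(kill_sites, glcs, radius_m=150)
print(len(glcs), rate)
```

In this toy case only the revisited site forms a GLC, so half of the kill sites are detected. Repeating the two calls over a grid of parameter values and keeping the combination with the highest rate mirrors the scenario selection described above.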
We considered prey size, because GLCs are more likely to be formed in association with larger prey due to the lynx behavior of returning to such kills for longer periods (Krofel et al., 2013; Mattisson et al., 2011). Therefore, for each dataset, we grouped field-checked kill sites considering three prey categories: (1) wild, domestic and semidomestic ungulates excluding neonates (i.e., ungulates with body mass >7 kg; hereafter, ungulates), (2) nonungulate prey and ungulate neonates <7 kg (hereafter, small prey), and (3) all prey (all ungulates, including those with unknown body mass, and small prey). We used a threshold of 7 kg to distinguish ungulates from small prey because this is approximately the weight of a carcass that a lynx would feed on for two nights (Jobin et al., 2000), which increases the probability of a kill site being associated with a GLC. The ungulate kill sites of the unassigned weight category were only included within the “all prey” category to avoid introducing bias. We assigned the weight category based on the date of the kill and the estimated dates of birth per species in each study system (system 1: Tallian et al. (Submitted); system 2: Zimmermann et al. (2015); and system 3: Garel et al. (2009), and approximations from the other study systems).
For each original and subsampled dataset, we verified that there was a minimum number of two successful fixes within two consecutive days for each field-checked kill site, starting from the estimated time of the kill from the original dataset. We considered this criterion to ensure that the lack of a GLC being formed was not due to a failure in fix success, but to the actual setting of fix schedules. For double kills, we considered only one kill (the largest prey item) for the purpose of estimating the success rate, because several kill sites within a short radius are likely to be associated with the same GLC. We defined double kills as any two or more prey killed and consumed in parallel by the same lynx (Duľa & Krofel, 2020). We generated GLCs using the package GPSeqClus v1.2.0 (Clapp et al., 2021) in R software v4.1.0 (R Core Team, 2022).
Prediction of kill sites
We pooled all GPS data available from all schedules (Appendix S2: Table S1) and generated GLCs using the parameters that provided the best results in the previous section for kill-site detection through GLCs of all prey. We then associated GLCs to confirmed kill sites (classified as either small prey or ungulates; see “GPS-fix schedule requirements”) or to field-checked nonkills.
We used random forest, a tree-based machine learning algorithm, for multiclass classification of GLCs. Random forest consists of an ensemble of decision trees, each grown on a bootstrap sample of the training data, and aggregates their predictions (Banerjee et al., 2012). Random forest is more stable and predicts more accurately than single tree methods, and is commonly used for classification tasks in ecology (Cutler et al., 2012). Prior to deciding on using random forest, we tested two other algorithms, logistic regression (GLM) and gradient boosting (xgboost), which provided overall lower accuracy for multiclass classification (0.52–0.70). We classified GLCs as nonkills, small prey, or ungulate prey based on cluster attributes. We considered cluster duration, number of visits and fidelity to the cluster, maximum foray from the centroid, cluster radius (maximum and mean), proportion of night fixes, period of the day when the cluster started, and number of 24-h periods as cluster attributes (Appendix S3: Table S1). We added a covariate on lynx sex to account for potential prey handling differences between males and females. Additionally, we included a binary covariate on season, separating the period from 1 May to 1 November from the rest of the year, to account for the period when smaller prey would be most available. We further quantified the strength of random forests using binary classification by comparing nonkills versus ungulates, and nonkills and small prey versus ungulates. We discarded correlated cluster attributes (Spearman rank, ρ > 0.7).
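As an illustration of how such cluster attributes can be derived from the fixes of a single GLC, the Python sketch below computes four of them. The exact definitions used in the study are given in Appendix S3 and may differ: the function `cluster_attributes`, its output names, the use of the centroid as the reference point for the radius, and the toy fixes are assumptions.

```python
import math
from datetime import datetime

def cluster_attributes(fixes):
    """Summarise one GLC given as a list of (timestamp, x_m, y_m) fixes.
    Attribute definitions are illustrative assumptions."""
    times = [f[0] for f in fixes]
    cx = sum(f[1] for f in fixes) / len(fixes)
    cy = sum(f[2] for f in fixes) / len(fixes)
    dists = [math.hypot(f[1] - cx, f[2] - cy) for f in fixes]
    # Night defined as 5 p.m. to 7 a.m., following the text.
    night = sum(1 for t in times if t.hour >= 17 or t.hour < 7)
    return {
        "duration_h": (max(times) - min(times)).total_seconds() / 3600,
        "radius_mean_m": sum(dists) / len(dists),
        "radius_max_m": max(dists),
        "night_proportion": night / len(fixes),
    }

# Toy GLC: four fixes on the corners of a 30 m x 40 m rectangle.
glc = [
    (datetime(2020, 1, 1, 20, 0), 0.0, 0.0),
    (datetime(2020, 1, 1, 23, 0), 30.0, 0.0),
    (datetime(2020, 1, 2, 12, 0), 0.0, 40.0),
    (datetime(2020, 1, 2, 21, 0), 30.0, 40.0),
]
attrs = cluster_attributes(glc)
print(attrs)
```

A table of such attributes, one row per GLC, is the kind of input a classifier would be trained on.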
We split each dataset 75%/25% into training and test datasets to build the model and to evaluate model performance, respectively. The proportions of the different types of clusters varied within each study system (Appendix S1: Tables S1 and S2). As class imbalance is known to cause problems in learning algorithms, we used the Synthetic Minority Oversampling Technique (SMOTE) to generate artificial data within the training dataset for the underrepresented classes (Chawla et al., 2002). SMOTE oversamples the minority class by creating new, plausible examples based on the existing data from the minority class. After exploring different parameter combinations for the random forest model, we used 500 trees for the array of decision trees, and one as the minimum size (number of splits) for each tree. We measured accuracy, sensitivity (true positive rate), and specificity (true negative rate) of each model. Additionally, we used confusion matrices to summarize the prediction results and generated multiway importance plots to understand the influence of each cluster attribute in GLC classification. These plots displayed three measures of importance based on the structure of the forest: mean depth of the first split on a given variable (x), the number of trees in which the root is split on the variable (y), and number of nodes that use the variable for splitting (z). High values of y and low values of x indicate a stronger association with the response variable. A higher number of nodes for a given attribute suggests a greater relevance for classification. We used the R packages suncalc v0.5.0 (Thieurmel & Elmarhraoui, 2019) to obtain the day period, UBL v0.0.7 (Branco et al., 2016) to apply the SMOTE method, randomForest v4.6-14 to build and evaluate the models, and randomForestExplainer v0.10.1 to obtain the importance of each variable.
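The core idea of SMOTE, interpolating between a minority-class example and one of its nearest minority-class neighbours, can be sketched in a few lines of Python. This is a simplified illustration and not the UBL implementation used in the study: the function `smote`, its parameters, and the two-attribute toy data (duration in hours, radius in metres) are assumptions.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each placed at a random position on the
    segment between a minority sample and one of its k nearest minority
    neighbours (simplified SMOTE for numeric attributes only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical small-prey clusters: (duration_h, radius_m).
small_prey = [(6.0, 40.0), (10.0, 55.0), (8.0, 35.0), (12.0, 60.0)]
new_points = smote(small_prey, n_new=6)
print(new_points)
```

Because every synthetic point lies between two observed minority examples, the new data stay within the range of the observed attributes. In practice the attributes would usually be scaled first, so that no single attribute dominates the distance calculation.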
RESULTS
GPS-fix schedule requirements
The success rate (i.e., the proportion of field-checked kill sites successfully detected by the clustering algorithm) decreased when using fewer fixes per day, but depended on prey size, prey system, and time of the day (Figure 2). In system 1 (reindeer only), the success rate fell below 50% when using two fixes during the day, or two fixes per 24 h. In system 2 (reindeer–roe deer–sheep), the use of three fixes per night was enough to detect >80% of the kill sites. When excluding sheep from the analyses (30% of adult ungulates in system 2), we observed an overall increase in success rate in all subsampled datasets, by up to 10%. System 3 (roe deer–chamois), with an original dataset of seven fixes per 24 h, obtained similar success rates across all subsampled datasets, with two fixes per night successfully identifying 88% of all prey, and 94% of ungulates. Using fixes during the night period resulted in the highest success rate across all subsampled datasets (increasing up to 20%), followed by mixed and then day period. When reducing the number of fixes to two fixes per 24 h, the success rate in detecting kill sites of small prey decreased to 35% in systems 1 and 2, and 65% in system 3.
We found variation in the cluster parameters considered to detect GLCs reflecting kill sites. The parameters that provided the best results for all prey, when considering the original schedules, were 150 m for spatial buffer, 3 days for temporal window, and a minimum of two fixes. In all systems, a reduction in the number of fixes led to an increase of up to 400–500 m in spatial buffer. The temporal window parameter varied less within and among systems, with an increase to 5 days only in system 1 (Appendix S3: Figure S1).
Prediction of kill sites
The accuracy of the multiclass random forest ranged between 0.66 and 0.75 (Table 1). Specificity (proportion of true negatives) was higher than 0.60 for all systems, and reached a maximum of 0.80 and 0.94 for nonkills and ungulates, respectively. Sensitivity (proportion of true positives) was the lowest for small prey in systems 1 and 3, and system 2 when excluding sheep (0.13–0.52). In these cases, most of the misclassified small prey were classified as nonkills (Table 2). In system 2, sensitivity for ungulates increased from 0.41 to 0.75 when excluding sheep from ungulate kills. For nonkills and ungulate classes, sensitivity reached 0.81 and 0.76, respectively. For binary random forest, we obtained overall higher accuracies, particularly when comparing nonkills with ungulates only, with values up to 0.75 in system 1, 0.92 in system 2, and 0.88 in system 3 (Appendix S3: Tables S2 and S3). For these models, the effects of removing sheep were less clear.
| Multiprey system | Accuracy [95% confidence interval] | Class | Sensitivity | Specificity |
|---|---|---|---|---|
| System 1 | 0.66 [0.59–0.72] | Nonkills | 0.76 | 0.69 |
| | | Small prey | 0.13 | 0.93 |
| | | Ungulates | 0.67 | 0.80 |
| System 2 | 0.71 [0.64–0.78] | Nonkills | 0.81 | 0.63 |
| | | Small prey | 0.65 | 0.85 |
| | | Ungulates | 0.41 | 0.94 |
| System 2 (without sheep) | 0.75 [0.68–0.82] | Nonkills | 0.80 | 0.73 |
| | | Small prey | 0.52 | 0.86 |
| | | Ungulates | 0.75 | 0.90 |
| System 3 | 0.70 [0.66–0.75] | Nonkills | 0.79 | 0.80 |
| | | Small prey | 0.33 | 0.87 |
| | | Ungulates | 0.76 | 0.87 |
| System | Prediction | Reference: Nonkills | Reference: Small prey | Reference: Ungulates |
|---|---|---|---|---|
| System 1 | Nonkills | 95 | 15 | 19 |
| | Small prey | 6 | 3 | 9 |
| | Ungulates | 24 | 6 | 58 |
| System 2 | Nonkills | 97 (97) | 8 (9) | 13^a (3) |
| | Small prey | 15 (13) | 17 (13) | 7^b (2) |
| | Ungulates | 8 (11) | 1 (3) | 14 (15) |
| System 3 | Nonkills | 155 | 28 | 13 |
| | Small prey | 24 | 22 | 19 |
| | Ungulates | 18 | 17 | 104 |

- Note: Parentheses in system 2 reflect the values when excluding sheep from the dataset.
- ^a Six sheep.
- ^b Four sheep.
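The per-class sensitivity and specificity reported for system 1 follow directly from its confusion matrix, as the short Python sketch below shows. The function `class_metrics` is a hypothetical helper written for illustration; the counts are those listed above for system 1 (rows are predictions, columns are reference classes).

```python
CLASSES = ["Nonkills", "Small prey", "Ungulates"]

# System 1 confusion matrix: rows = predicted class, columns = reference class.
SYSTEM_1 = [
    [95, 15, 19],
    [6, 3, 9],
    [24, 6, 58],
]

def class_metrics(matrix):
    """Return overall accuracy and per-class (sensitivity, specificity)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    per_class = {}
    for i in range(n):
        tp = matrix[i][i]
        fp = sum(matrix[i]) - tp                 # predicted i, reference differs
        fn = sum(row[i] for row in matrix) - tp  # reference i, predicted differs
        tn = total - tp - fp - fn
        per_class[CLASSES[i]] = (tp / (tp + fn), tn / (tn + fp))
    return accuracy, per_class

accuracy, metrics = class_metrics(SYSTEM_1)
print(f"accuracy = {accuracy:.3f}")
for name, (sens, spec) in metrics.items():
    print(f"{name}: sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

For small prey, only 3 of the 24 reference kills were predicted correctly (sensitivity 0.125), whereas few clusters of other classes were labelled as small prey, which explains the combination of very low sensitivity and high specificity for that class.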
We found differences across systems in the importance of cluster attributes that explained the GLC multiclass classification (Figure 3). Cluster duration was the most important of the cluster attributes across the three systems. Maximum foray and the number of re-visits were also important attributes, although their importance varied among systems. The average cluster radius and proportion of night fixes were of moderate importance in all systems, whereas season, time of day of the first cluster fix (day period), and sex were of overall low importance for classifying GLCs. For binary classification, cluster duration was the most important attribute as well, followed by maximum foray and night proportion (except for system 1). Fidelity was relevant for system 3, while day period, season, and sex showed no relevance within any system (Appendix S3: Figures S2 and S3).
DISCUSSION
GPS-fix schedule requirements
Although the number of GPS fixes needed to generate a GLC has been addressed in previous studies, mostly limited to wolves (Sand et al., 2005; Webb et al., 2008) and mountain lions (Knopff et al., 2009), it had never been evaluated across different multiprey systems with different combinations of domestic, semidomestic, and wild prey. We observed a steady decrease in the success rate within systems 1 and 2, where domestic and semidomestic prey are predominant. The results for system 2 showed that GLCs reflecting sheep were more difficult to detect when reducing the number of fixes. This supports previous findings that the domestic/semidomestic nature of many prey in these systems influences lynx handling behavior (Odden et al., 2002; Tallian et al., Submitted). In system 3, mostly composed of wild prey, more than 95% of the GLCs representing ungulate kill sites were correctly identified with three fixes per night. The highest resolution schedule available in system 3 was seven fixes per day, with four fixes taken around dusk. While it is impossible to know how many kills were missed in system 3 compared with a schedule with 24 locations per day, our results suggest that not only the number of fixes, but also their timing, is crucial for the optimization of GPS-fix schedules designed for predation studies. The typical lynx behavior is to leave the kill during the day to find a resting site, and then return at dusk for feeding (Krofel et al., 2013, 2019; Molinari-Jobin et al., 2007). Indeed, we observed that the success rate dropped when including only daytime fixes within the two systems where this could be evaluated (systems 1 and 2).
Nevertheless, we should stress that the success rate at detecting GLCs reflecting kill sites may not directly translate into the detection probability of finding a kill during GLC field-checking. Kill sites were originally found with high fix rate schedules, which facilitates finding the prey remains. Therefore, using a reduced number of fixes may negatively affect the detection of kills in the field. Additionally, the use of a dataset with few fixes per 24 h could result in fewer fixes in proximity to the kill site, leading to additional bias when detecting GLCs. Using such datasets may require increasing the cluster radius for detecting GLCs, which is supported by our results.
Our results revealed lower success rates when detecting small prey through GLCs, compared with ungulates, when reducing the number of daily fixes, mirroring previous findings for lynx and other carnivore species (e.g., Palacios & Mech, 2011; Svoboda et al., 2013; Vogt et al., 2018). This suggests the need to use a high number of fixes per day to identify prey smaller than 7 kg, but it nevertheless requires extensive fieldwork, as short-duration GLCs can also be confused with daybeds or other activities besides feeding (Vogt et al., 2018).
Besides differences in prey composition, the landscape varied significantly between systems, especially in terms of climate, topography, forest cover, human density, and the presence of other predators and/or scavengers (Krofel et al., 2019; Mattisson et al., 2011). These factors can affect lynx behavior while handling a kill; for example, in areas with less vegetation cover and topographic complexity, the lynx might need to move further away from the kill when not feeding (e.g., for resting, which usually happens nearby the kill in densely forested habitats; Krofel et al., 2013).
Prediction of kill sites
Our second main goal was to classify GLCs as nonkills, small prey kills and ungulate kills, with a particular focus on cluster attributes, which we accomplished using the random forest algorithm. We ran a multiclass classification algorithm for each system, which obtained an accuracy of 66%–75% (Table 1), and two different binary classification algorithms with higher accuracy levels (75%–90%), varying among systems. The main reason for this difference is the inclusion of small prey in the multiclass classification model, a class that was classified less successfully than nonkills and ungulates, thus reducing overall accuracy. Previous studies have mentioned the prediction of smaller prey as a caveat, due to the shorter handling times (e.g., Franke et al., 2006; Knopff et al., 2009; Mahoney & Young, 2017; Sand et al., 2005; Webb et al., 2008). We also observed that most of the small prey kills incorrectly identified were mistaken for nonkills, similar to Webb et al. (2008), suggesting that the movement behavior while handling a small prey kill is comparable with other behaviors (e.g., resting). Although random forest is an algorithm with high performance for classification tasks in noisy datasets, such as ours, we believe that an accurate prediction of small prey would most likely require other input parameters, such as fine-scale accelerometer data that would capture the feeding bouts within a GLC. Environmental layers, such as slope, forest cover, and prey density, may also improve accuracy and facilitate distinguishing between resting sites and kill sites, because lynx may use distinct microlocations for foraging and resting.
When considering the binary algorithms and cluster attributes only, we obtained similar or higher accuracy than previously reported for other carnivore species that exhibit high site fidelity to the kill sites (86%, Knopff et al., 2009; 82%, Blecha & Alldredge, 2015; 88%, Webb et al., 2008; 75%, Franke et al., 2006; 88%, Pitman et al., 2012). Our algorithm is based on variables that can easily be extracted from GLCs, increasing its usefulness in classifying GLCs prior to field-checking. Nevertheless, despite similar model accuracy between systems, we found that the importance of cluster attributes varied among areas, suggesting that extrapolation between different systems should be done with caution. Variation may depend on the lynx's behavioral differences around kill sites and nonkills, which can also be influenced by the different environmental settings among the three systems. Therefore, in order to test the transferability and broader application of these models, we would like to stress that an external validation should be conducted, as suggested by Knopff et al. (2009), considering similar prey systems.
Applications and recommendations
- For GPS collar settings, in order to acquire reliable estimates of kill rates and diet, the higher the frequency of daily fixes, the better. However, there is always a trade-off between battery life and the number of fixes, as well as the amount of fieldwork that can be conducted. In a prey system including only wild ungulates, our data suggest that the number of fixes can be reduced to as low as three fixes per night without missing more than 5% of the ungulate kills. However, to get the same accuracy in a system with domestic prey, at least 10 fixes/day may be needed. We recommend starting with more intensive schedules that can later be readjusted, and then determining the minimum number of fixes that gives the best trade-off between battery life and data quality.
- According to the number of fixes considered in the previous step, test and adjust the input parameters (spatial radius, temporal window, and minimum number of fixes per cluster) to generate GLCs. If a low-resolution schedule is used, we recommend increasing the cluster radius and testing different values rather than using the 100–200 m commonly considered.
- We propose the use of classification models as an ancillary tool to increase fieldwork efficiency and minimize costs, by prioritizing GLCs with the highest probabilities of being kill sites prior to field checks. This, as well as using these models to avoid discarding nonchecked GLCs, can be used to complement kill-site datasets, thus improving kill rates estimation. Additionally, this would allow generating comparable datasets across larger scales, where similar predator–prey systems exist. However, they should be mostly applied to detect kills >7 kg, as detecting smaller prey using classification models is not optimized yet.
Conclusion
Prediction of behavioral status from GPS data can be challenging, particularly when considering potential differences among different areas and individuals. However, the rapidly increasing number of technological and analytical tools allows for larger scale analyses, as well as the use of unbalanced datasets. Our approach for detecting and predicting GLCs can be applied to identify kill sites of other carnivores, particularly large, solitary species with longer feeding times, such as the mountain lion, tiger (Panthera tigris) or leopard (Panthera pardus). Furthermore, it can also be generalized to species that occasionally consume ephemeral but high-quality food sources (e.g., scavenging; Ebinger et al., 2016), as well as to other behaviors, such as hibernation, scent-marking, mating and maternal behavior (Krofel et al., 2013, 2017; Mahoney & Young, 2017; Melzheimer et al., 2020), provided that collar schedules are appropriate to generate enough data (i.e., number of fixes) for a GLC to form.
The current limitations of this approach are connected with detecting and predicting GLCs associated with shorter feeding times, mostly connected with the consumption of small prey, or larger prey that is removed by large scavengers early in the consumption process. Additionally, distinguishing between scavenging and predation events could be further complicated, as both can involve feeding behavior for extended periods. The outcome of these limitations can have consequences for management, particularly when estimating ungulate kill rates (Brockman et al., 2017; Jansen et al., 2019; Krofel & Jerina, 2016). Potential solutions could rely on adding information to GLCs, namely by incorporating accelerometer, audio-logger, or video data recorded within the GLC duration, as well as environmental characteristics (Brockman et al., 2017; Studd et al., 2021). Using techniques such as supervised/unsupervised machine learning, these alternative sources of data can further help to distinguish between different activities (e.g., resting from feeding on smaller prey; Studd et al., 2021), hence considerably assisting in the identification of target behaviors. Future research on feeding and behavioral ecology will likely include this information, as ancillary data collection becomes progressively more efficient and less energy demanding.
AUTHOR CONTRIBUTIONS
Teresa Oliveira, Miha Krofel, Kristina Vogt, Jenny Mattisson, John D. C. Linnell, John Odden, Andrea Corradini and Marco Heurich conceived the ideas; Teresa Oliveira, David Carricondo-Sanchez, Andrea Corradini, and Miha Krofel designed the methodology; Kristina Vogt, Jenny Mattisson, John D. C. Linnell and John Odden collected and compiled the data; Teresa Oliveira analyzed the data and led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.
ACKNOWLEDGMENTS
Teresa Oliveira was supported by Fundação para a Ciência e a Tecnologia (grant no. SFRH/BD/144110/2019). Miha Krofel was supported by the Slovenian Research Agency (grant nos. N1-0163 and P4-0059). The Norwegian part of the study was funded by the Research Council of Norway (Norges Forskningsråd; grant nos. 251112 and 281092, NINA basic funding project no. 160022/F40), the Norwegian Directorate for Nature Management (Miljødirektoratet), and the Nature Protection Division of the County Governor's Office for Innlandet, Viken, Vestfold and Telemark, Trøndelag, Nordland, Troms & Finnmark County. The Swiss part of the study was funded by the following institutions and foundations: charity foundation from Liechtenstein, hunting inspectorate of the Canton of Bern, Stotzer-Kästli-Stiftung, Zigerli-Hegi-Stiftung, Haldimann-Stiftung, Zürcher Tierschutz, Temperatio-Stiftung, Karl Mayer Stiftung, Stiftung Ormella.
CONFLICT OF INTEREST
We declare no conflicts of interest.
Open Research
DATA AVAILABILITY STATEMENT
Data used are stored in the shared repository of the EUROLYNX network (www.eurolynx.org), within the papers_dataset schema of the Eurolynx PostgreSQL database. Data used to evaluate the performance of different GPS schedules contain sensitive information and, therefore, are not publicly available, but can be requested from the EUROLYNX database data curator. Data used for the random forest models (Oliveira, 2022) are available in Dryad at https://doi.org/10.5061/dryad.866t1g1tn.