Pitfalls during in silico prediction of primer specificity for eDNA surveillance

While high efficiency and cost-effectiveness are two merits of environmental DNA (eDNA) techniques for detecting aquatic organisms, the difficulty of designing species-specific primers can result in significant expenditure of time and money. During the in silico stage of primer development, primer specificity is predicted with alignment techniques such as BLAST, which base the prediction on the number and position of the primer/nontarget template mismatches. However, we speculate that nonspecific amplification is influenced by additional parameters, which leads to inaccurate in silico predictions. We performed in vitro specificity tests for 38 species-specific primer pairs selected for seven fishes and six turtles, using single-plex conventional PCR (cPCR). A subset of 12 primer pairs was further tested with SYBR Green-based or TaqMan-based single-plex quantitative PCR (qPCR). We disentangled the relative importance of mismatch properties (types and positions), primer properties (length, GC content, and 3′ end stability), PCR conditions (template concentrations and annealing temperatures), and PCR technique (cPCR, TaqMan-based, or SYBR Green-based qPCR) in determining the occurrence of amplifications. We then compared the PCR outcomes with the specificity check under two stringency scenarios based on alignment (i.e., BLAST search). We conducted a total of 679 cPCR and 226 qPCR analyses, with 90% of the reactions tested with nontarget templates. Primer pairs predicted by Primer-BLAST to be specific rarely showed such specificity during the in vitro testing. BLAST searches correctly predicted the outcomes of around 67% of cPCR and qPCR, but had low sensitivity in detection of nontarget amplification (29–57%). Primer specificity increased significantly with total number of mismatches and annealing temperature, but decreased with higher GC content in the primer sequence.
Mismatches that consisted of A-A, G-A, and C-C pairings exerted 56% stronger reduction in nonspecific amplification effects than other mismatches. To conclude, we show that the prediction of primer specificity based only on the number and position of mismatches can be misleading. Our findings can be applied to increase the efficiency of the in silico primer selection process to maintain the relatively high efficiency and cost-effectiveness of eDNA techniques.


INTRODUCTION
Organisms in lakes and rivers shed DNA into the water column in the form of secretions, cells, gametes, or feces that are transported through drainage networks. Fragments of this environmental DNA (eDNA) can be differentiated from organic matter in water samples using PCR and species-specific primer assays (Elbrecht and Leese 2017). eDNA techniques are theoretically capable of detecting even a single copy of target DNA (Jerde et al. 2011), so the potential applications of eDNA techniques are substantial (Goldberg et al. 2016). Such methods have been used to detect rare and endangered freshwater species whose presence cannot be confirmed easily by more conventional means (Laramie et al. 2015, Sigsgaard et al. 2015, Eva et al. 2016, Renan et al. 2017), and to monitor the colonization of new habitat by invasive species (Jerde et al. 2013, Nathan et al. 2014, Rees et al. 2014). The high sensitivity of eDNA, relative to more traditional survey methods (Goldberg et al. 2013, Takahara et al. 2013), has contributed to its increasing prevalence as a research tool (Goldberg et al. 2016). Recently, however, Smart et al. (2016) showed that the development of species-specific primers for use in polymerase chain reactions (PCRs) can be costly, especially in cases where it is necessary to distinguish multiple closely related taxa. In such instances, primer design becomes critical since any nonspecific amplification produces a false positive that can result in inappropriate implementation of conservation or management measures (Wilcox et al. 2013).
Pairs of specific primers for eDNA methods are developed in three stages: (1) in silico selection of candidate primer sets based on their specificity predicted with reference to DNA sequences of target and nontarget species (see below); (2) in vitro testing of the potential primers using conventional PCR (cPCR) and/or quantitative PCR (qPCR) with DNA extracted from tissue of both target and nontarget species; and (3) in situ testing of the primer sets with cPCR and/or qPCR using environmental samples collected in the presence (positive testing) and absence (negative testing) of the target species (Goldberg et al. 2016).
Candidate species-specific primers can be selected manually or automatically during the in silico stage using a variety of software programs such as Primer3 (Rozen and Skaletsky 2000), Primer Express (Applied Biosystems, Foster City, CA, USA), and Geneious (Biomatters, Auckland, New Zealand). The specificity of the selected primers is checked using alignment techniques such as the Basic Local Alignment Search Tool (BLAST), which identifies the number and position of nucleotide mismatches between primer sequences and nontarget templates (e.g., Takahara et al. 2013, Robson et al. 2016). The prediction of primer specificity determines which primers are tested in subsequent in vitro and in situ stages. Thus, the accuracy of the initial in silico prediction has a direct influence on the time and money required for primer development, as well as on the choice of templates for subsequent in vitro specificity validation. If the prediction is inaccurate, competitive templates can be overlooked when templates are chosen, leading to a higher possibility of false positives. Optimizing and evaluating methods used during the in silico prediction are thus essential to ensure that eDNA-based approaches are efficient, accurate, and cost-effective.
The presence of a mismatch lowers the thermal stability of the primer-template duplex and reduces PCR efficiency (Cha and Thilly 1993, Bru et al. 2008). In general, the greater the number of mismatches between primer and nontarget templates, the more specific the primer is, and the importance of maximizing the number of mismatches has been widely recognized (Wilcox et al. 2013, Rees et al. 2014, Fukumoto et al. 2015). Nonetheless, most of the previous studies that have investigated mismatch effects were based on either single or double mismatches (Kwok et al. 1990, Simsek and Adnan 2000, Wu et al. 2009, Wright et al. 2014).
It has been well documented that both the type and position of mismatches have differential effects on primer specificity. Certain mismatch types cause significant reduction in amplification efficiency: purine-purine pairings (i.e., A-A, A-G, or G-A) and one of the pyrimidine-pyrimidine pairings (i.e., C-C), hereinafter referred to as critical mismatches (CM; Kwok et al. 1990, Huang et al. 1992, Simsek and Adnan 2000). Mismatches located near the 3′ end of the primer (i.e., in the last five bases) can greatly diminish amplification (Rychlik 1995, Bru et al. 2008, Wu et al. 2009, Stadhouders et al. 2010). In addition, the influence of a single mismatch depends on the nearest-neighboring nucleotides, and the destabilizing effect of a mismatch can be eliminated in some contexts (Xia et al. 1998). However, it is currently unclear how to apply these findings to predict primer specificity in eDNA studies because of complications arising from the presence of multiple mismatches and uncertainty about which mismatch properties have the greatest effect.
Additionally, primer specificity is governed by other parameters besides mismatches. For example, each additional nucleotide in the primer sequence can theoretically cause a fourfold increase in specificity since the probability of the nontarget template having exact complementarity to the primer sequence is vastly lowered, at least for taxa that are not closely related (Dieffenbach et al. 1993). Furthermore, the stability at the 3′ end of the primer plays a critical role in controlling mispriming (Kwok et al. 1990, Abd-Elsalam 2003). Apart from primer properties, PCR conditions (annealing temperature and template concentration) affect the occurrence of nontarget amplification. For instance, increasing the annealing temperature can suppress nonspecific amplification in a trade-off with PCR efficiency (Wu et al. 1991). In short, the prediction of primer specificity is a complex process that is determined by more than mismatch properties. Therefore, selecting primers solely based on the number and position of mismatches between the primer and nontarget templates (e.g., by relying on BLAST search) can be inaccurate and result in additional effort needed for primer development. The lack of a systematic and comprehensive investigation of the effects of multiple base-pair mismatches and other parameters that may affect primer specificity hampers the design of species-specific primers in eDNA studies and limits the predictability of primer specificity during the in silico stage.
In this study, we designed species-specific cPCR and qPCR primer assays for seven fish and six turtle species, most of which occur in freshwater bodies in Hong Kong. During the in silico stage, we evaluated primer specificity on a set of co-occurring and/or closely related nontarget species using Primer-BLAST (Ye et al. 2012). We then performed in vitro primer specificity tests using single-plex cPCR, and SYBR Green-based or TaqMan-based qPCR, with the genomic DNA of the same set of nontarget species as the template. These results allowed us to investigate the relative importance of several parameters on the development of species-specific eDNA primers: features of multiple primer mismatches (position and type), primer properties (length, GC content, and 3′ end stability), and PCR conditions (template concentration and PCR annealing temperature). By cross-checking the results of in silico predictions with in vitro PCR validations of primer specificity, we also evaluated the performance of in silico prediction using BLAST searches. Our findings can be applied to improve the in silico selection of species-specific primers prior to any PCR testing, and thus help to maximize the accuracy, efficiency, and cost-effectiveness of eDNA methods for species detection in the field.

Species collection and DNA extraction
A total of 13 target species were selected to develop eDNA assays to study components of the endangered and invasive freshwater vertebrate diversity in Hong Kong. Seven of the target species were fishes: Oryzias curvinotus, O. dancena, and O. latipes (Adrianichthyidae); Macropodus opercularis (Belontiidae); Misgurnus anguillicaudatus (Cobitidae); and Gambusia affinis and Xiphophorus helleri (Poeciliidae). Oryzias curvinotus (vulnerable) and Macropodus opercularis (endangered) are locally threatened fish species (Nip et al. 2014), while the two poecilids are invasive species globally and widespread in Hong Kong. Oryzias dancena and O. latipes do not occur in the wild in Hong Kong but were included to investigate the parameters that influence primer discrimination among congeneric species. The six remaining target species were turtles: Cuora trifasciata, Mauremys reevesii, and Sacalia bealei (Geoemydidae); Platysternon megacephalum (Platysternidae); Pelodiscus sinensis (Trionychidae); and Trachemys scripta elegans (Emydidae). Apart from the invasive T. scripta elegans, all turtles are native to Hong Kong and are globally endangered (IUCN 2019).
For tests of primers targeting the seven fish species, the other six species were used as nontarget species. For each of the six target turtle species, an additional nine species of turtles were selected as the nontarget species based on sympatry and relatedness (see additional supporting information for the templates used in each test): Cuora amboinensis, C. flavomarginata, Cyclemys dentata, Mauremys japonica, M. sinensis, and Sacalia quadriocellata (Geoemydidae); Graptemys ouachitensis and G. pseudogeographica (Emydidae); and Palea steindachneri (Trionychidae).
Individuals of both target and nontarget fish and turtle species were obtained from captivity or collected from local water bodies. Genomic DNA was extracted from muscle tissues using a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. DNA concentration was measured using a Qubit dsDNA high-sensitivity assay kit and a Qubit 3.0 fluorometer (Life Technologies, Waltham, MA, USA). All extracted DNA samples were stored at −20°C before use.

Primer selection
Mitochondrial genome sequences of each species were retrieved from GenBank and aligned using MEGA7 software (Kumar et al. 2016). A total of 38 pairs of species-specific primers (Table 1) were manually selected for the 13 target species (i.e., two to three pairs for each species) within five commonly sequenced loci: 12S ribosomal RNA (12S), 16S ribosomal RNA (16S), cytochrome c oxidase subunit 1 (COI), cytochrome b (CYTB), and NADH-ubiquinone oxidoreductase chain 4 (ND4). Each primer pair had a target amplicon length within the optimal size range (i.e., 50-200 bp) that ensures high PCR efficiency (Thornton and Basu 2011). For a subset of 12 primer pairs, we also designed qPCR assays, either SYBR Green-based (five pairs) or TaqMan-based (seven pairs; Tables 1, 2).
Properties of primers were obtained using the primer check function of the online tool Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi): length, GC content (%), and 3′ end stability of each of the forward and reverse primers. Primers were also checked with the IDT OligoAnalyzer (https://sg.idtdna.com/calc/analyzer) to ensure that stable hairpins and primer dimers were unlikely to form during the PCR annealing stage. For simplicity, investigation of the effects of mismatch properties focused on the intended priming region. The values of six primer mismatch variables were computed using R (R Development Core Team 2018; script provided in supplementary materials): the number of (1) noncritical mismatches (NCM; i.e., C-T, T-C, T-G, G-T, G-G, T-T, A-C, and C-A) within the whole primer (total NCM); (2) critical mismatches (CM) within the whole primer sequence (total CM); (3) NCM in the last five bases at the 3′ end (3′ end NCM); (4) CM in the last five bases at the 3′ end (3′ end CM); (5) NCM at the 3′ terminal of primers (Term NCM); and (6) CM at the 3′ terminal of primers (Term CM). Nontarget templates would not be amplified if either the forward or the reverse primer failed to form a duplex with the template. Hence, all of the primer and mismatch properties were recorded as the maximum or minimum value (between the forward and reverse primers) that favors the prevention of mispriming (i.e., the maximum value for mismatch properties and primer length, and the minimum value for GC content and 3′ end stability).
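The six mismatch variables can be illustrated with a short Python sketch. This is not the R script from the supplementary materials. The function names are hypothetical, and the sketch assumes that the nontarget priming site is supplied in the same orientation as the primer and aligned without gaps, so that a mismatching primer base faces the complement of the site base.

```python
# Hypothetical helper names; the pairing convention is an assumption.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Critical mismatches (CM) as listed in the text: A-A, A-G, G-A, and C-C.
CRITICAL = {("A", "A"), ("A", "G"), ("G", "A"), ("C", "C")}


def mismatch_counts(primer, site, end_window=5):
    """Count the six mismatch variables for one primer.

    primer: primer sequence, 5'->3'.
    site:   nontarget sequence aligned to the primer, same length and
            orientation (identical to the primer when nothing mismatches).
    """
    primer, site = primer.upper(), site.upper()
    if len(primer) != len(site):
        raise ValueError("primer and site must be aligned without gaps")
    counts = {"total_NCM": 0, "total_CM": 0, "end_NCM": 0,
              "end_CM": 0, "term_NCM": 0, "term_CM": 0}
    last = len(primer) - 1
    for i, (p, s) in enumerate(zip(primer, site)):
        if p == s:
            continue
        # The primer base faces the complement of the site base.
        kind = "CM" if (p, COMPLEMENT[s]) in CRITICAL else "NCM"
        counts["total_" + kind] += 1
        if i > last - end_window:      # within the last five bases
            counts["end_" + kind] += 1
        if i == last:                  # 3' terminal base
            counts["term_" + kind] += 1
    return counts


def pair_summary(forward, reverse):
    """Keep, for each mismatch variable, the larger count of the two primers."""
    return {key: max(forward[key], reverse[key]) for key in forward}
```

Taking the maximum across the two primers mirrors the rule stated above for mismatch properties. Primer length, GC content, and 3′ end stability would need their own maximum or minimum, as described in the text.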

In silico check of specificity using Primer-BLAST
We compared the sequences of each primer pair with the sequences of the selected nontarget species available in the nr/nt database using Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast/) with two specificity stringency scenarios. For scenario 1, we adopted the default specificity check settings such that ". . . at least one primer (for a given primer pair) must have two or more total mismatches to unintended targets, including at least two mismatches in the 3′ end, and that any targets with six mismatches or more to at least one primer (for a given primer pair) should be ignored" (Ye et al. 2012). For scenario 2, we tightened the criteria and specified the settings to the upper limit of the software, such that at least one primer must have at least six total mismatches, including at least four mismatches at the 3′ end, and that targets would not be ignored unless there were nine or more mismatches. A nonspecific amplification was considered to occur only if the size of the unintended product predicted was similar to the target amplicon (standardized as ±20 bp).
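The two stringency scenarios can be read as a simple decision rule, sketched below in Python. This is only one interpretation of the settings quoted above, not Primer-BLAST's actual algorithm. The function names, the tuple layout, and the way the "ignore" threshold is combined with the mismatch thresholds are all assumptions made for illustration.

```python
# Thresholds taken from the two scenarios described in the text.
SCENARIO_1 = {"min_total": 2, "min_end": 2, "ignore_at": 6}
SCENARIO_2 = {"min_total": 6, "min_end": 4, "ignore_at": 9}


def predicted_specific(fwd, rev, min_total, min_end, ignore_at):
    """Return True if the pair is predicted not to amplify a nontarget.

    fwd and rev are (total_mismatches, mismatches_in_3prime_end) tuples
    for the forward and reverse primer against one nontarget template.
    """
    for total, end in (fwd, rev):
        if total >= ignore_at:
            return True   # template treated as too divergent to amplify
        if total >= min_total and end >= min_end:
            return True   # one sufficiently mismatched primer is enough
    return False


def predicted_nonspecific_amplification(fwd, rev, product_bp, target_bp,
                                        scenario, tolerance_bp=20):
    """Flag a nontarget hit only if its product size is near the target's."""
    specific = predicted_specific(fwd, rev, **scenario)
    return (not specific) and abs(product_bp - target_bp) <= tolerance_bp
```

For example, a pair whose reverse primer has two mismatches, both in the 3′ end, passes scenario 1 but not scenario 2, which shows how the stricter settings flag more templates as potential nontarget hits.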

In vitro check of specificity using cPCR and qPCR
For the in vitro specificity validation, all 38 primer pairs were tested with nontarget templates using single-plex cPCR. A subset of 12 primer pairs (Table 2) were also tested with either SYBR Green-based (five pairs) or TaqMan-based qPCR (seven pairs). One to three independent replicates for each species template (i.e., genomic DNA extracted from different individuals of the same species) were used in the specificity validation of each primer pair, resulting in a total of 679 cPCR and 226 qPCR (70 SYBR Green-based and 156 TaqMan-based qPCR; see additional supporting information). Templates used in cPCR were 10-fold-diluted, while those used in the more sensitive qPCR were 100-fold-diluted. The diluted samples used in cPCR contained 9.6 × 10^6 to 1.85 × 10^8 copies of genomic DNA fragments per µL, while those used in qPCR contained 9.6 × 10^5 to 1.6 × 10^7 copies of genomic DNA fragments per µL, assuming DNA after extraction was dominated by fragments 50,000 bp in length (Frudakis 2010).
Notes: Primer length, GC content, and 3′ end stability were computed using the maximum or minimum value within the primer pair that favors the prevention of nonspecific amplification. Annealing temperature was determined from a PCR temperature gradient. Forward primer sequence appears above the reverse primer sequence for each pair. ‡ Indicates primer pairs that were subjected to further specificity test using SYBR Green-based qPCR. § Indicates primer pairs that were subjected to further specificity test using TaqMan-based qPCR. Probe sequences are shown in Table 2.
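The conversion from a measured DNA concentration to fragment copies per µL can be sketched as follows. The 50,000 bp fragment length comes from the text. The average mass of 650 g/mol per base pair is a commonly used approximation and is an assumption here, as is the function name.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # assumed average mass of one base pair


def copies_per_microliter(conc_ng_per_ul, fragment_bp=50_000):
    """Estimate double-stranded DNA fragment copies per microliter."""
    grams_per_ul = conc_ng_per_ul * 1e-9           # ng -> g
    grams_per_mol = fragment_bp * BP_MASS_G_PER_MOL
    return grams_per_ul / grams_per_mol * AVOGADRO
```

Under these assumptions, a template at 10 ng/µL corresponds to roughly 1.85 × 10^8 copies per µL, and the estimate scales linearly with concentration and inversely with the assumed fragment length.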
For cPCR, each reaction was performed in a 20 µL AccuPower PCR PreMix (Bioneer, Daejeon, Korea) with 0.4 µmol/L of each of the forward and reverse primers and 1 µL of DNA extract, using the following thermal cycler protocol: 3 min at 95°C for initial denaturation, followed by 35 cycles of 30 s at 95°C, 30 s at the corresponding optimal annealing temperature (determined previously by temperature gradient), and 20 s at 72°C, with a final extension of 5 min at 72°C. A negative control containing autoclaved water as the template was included in each run to check for the presence of contamination. The results from the whole batch of reactions were discarded if any band was detected in the negative control. PCR products were visualized by electrophoresis on a 2% (wt/vol) agarose gel stained with GelRed nucleic acid stain (Biotium, Fremont, CA, USA). Gel images were obtained under UV exposure of 0.08 s using a Gel Doc 2000 system (Bio-Rad, Hercules, CA, USA) and analyzed with GelAnalyzer software (Lazer 2010). To correct for gel distortion, an Rf calibration was performed according to the software's manual. The size of the amplicons was determined by molecular weight calibration with a DNA ladder (GD 50 bp DNA Ladder RTU; Bio-Helix, Keelung City, Taiwan). Any band seen near the size of the target amplicon that was poorly differentiated from the target (i.e., ±20 bp) was recorded and assumed to be caused by amplification at the intended priming region.
† For abbreviations of species names, see Table 1.
StepOnePlus real-time PCR system (Life Technologies) on a 96-well plate with three technical replicates for each sample and negative control. Conditions for the thermal cycling were as follows: 2 min at 50°C, then 10 min at 95°C, followed by 40 cycles of 15 s at 95°C, and 1 min at the corresponding annealing temperature. Melting curve analyses and gel electrophoresis were conducted for SYBR Green-based qPCR to identify nonspecific amplification occurring outside the intended priming regions.

Statistical analyses
We examined the relative importance of various parameters for primer specificity by constructing mixed-effect logistic regression models using cPCR only, qPCR only, and the combined dataset. The models consisted of PCR outcomes (i.e., amplified or not) as the response variable, with 11 explanatory variables (Total NCM, Total CM, 3′ end NCM, 3′ end CM, Term NCM, Term CM, primer length, GC content, 3′ end stability, template concentration, and annealing temperature) and two covariates: target taxon (fish or turtle) and amplicon locus. Template species identity and primer pair identity were included as crossed random effects. PCR technique (cPCR, TaqMan-based, or SYBR Green-based qPCR) was included as an additional explanatory variable for the model based on the combined dataset. Multicollinearity of parameters was checked a priori using the variance inflation factor (VIF) with a threshold value of 5 (Fox 2015). The models were built using the R package glmmTMB (Magnusson et al. 2017). The performance of Primer-BLAST was evaluated using a confusion matrix that indicated the false and true prediction rates of the program (Fielding and Bell 1997), using the R package caret (Kuhn 2012).
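For orientation, the general form of a logistic model with crossed random effects, together with the usual definitions of the odds ratio and the VIF, can be written as below. The notation is ours and is only a sketch of the model structure described above: y_{ijk} is the outcome (amplified or not) of reaction k using template species i and primer pair j, x_{m,ijk} is the m-th explanatory variable, and R_m^2 is the coefficient of determination obtained by regressing the m-th variable on all the others.

```latex
\operatorname{logit}\,\Pr(y_{ijk} = 1)
  = \beta_0 + \sum_{m=1}^{M} \beta_m x_{m,ijk} + u_i + v_j,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad v_j \sim \mathcal{N}(0, \sigma_v^2)

\mathrm{OR}_m = e^{\beta_m},
\qquad
\mathrm{VIF}_m = \frac{1}{1 - R_m^2}
```

Because the response is the occurrence of amplification, an odds ratio below 1 for a variable means that higher values of that variable are associated with less nonspecific amplification, that is, with higher primer specificity.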

PCR results and Primer-BLAST performance
A total of 679 cPCRs (93 reactions for target species and 586 reactions for nontarget species) were conducted for the in vitro specificity validation tests (see additional supporting information for the whole dataset). For the nontarget species reactions, 182 reactions (out of 586) exhibited nonspecific amplification (i.e., bands were observed at the site of intended amplicon for nontarget DNA templates). For qPCR, 226 qPCRs (SYBR Green, n = 70; TaqMan, n = 156) were carried out, with 24 of these reactions using target species as templates. Of the 202 reactions testing against nontarget templates, 42 reactions exhibited nonspecific amplification, 64% (27 out of 42) of which were SYBR Green-based.
Primer specificity (i.e., nontarget amplicons were not detected among all nontarget species reactions) was validated for 11 of the 38 primer pairs using cPCR, and four out of 12 primer pairs using qPCR. Only three primer pairs demonstrated specificity in both cPCR validation and qPCR validation (i.e., GA-CYTB-3, PM-ND4-2, and PeS-16S-1). However, all of these primer pairs were predicted to be nonspecific by Primer-BLAST under at least one of the stringency scenarios (Table 3). In contrast, while 13 primer pairs were predicted by BLAST to be specific under the default stringency (i.e., scenario 1), only six of them showed specificity in cPCR tests. Among these six pairs, three were further tested with qPCR and only one was validated to be specific. Under the stricter scenario 2, all three primer pairs predicted to be specific proved nonspecific in the validation tests. To further investigate the performance of Primer-BLAST, the predictions made by the software were cross-checked with each PCR (excluding those PCRs that tested primers with their corresponding target templates; n = 586 for cPCR and n = 202 for qPCR) using confusion matrices (Fig. 1). The confusion matrices revealed that Primer-BLAST was able to successfully predict around 9% of the nonspecific amplifications under scenario 1 for both cPCR and qPCR (Fig. 1A, C), and 13% for cPCR and 12% for qPCR under the more stringent scenario 2 (Fig. 1B, D). On average, the overall accuracy of Primer-BLAST in predicting the outcomes of both cPCR and qPCR was 67% for both stringency scenarios (Table 4). The sensitivity of Primer-BLAST predictions to nonspecific amplification was generally low (29-57%).
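The accuracy and sensitivity figures above follow from the standard confusion-matrix definitions, sketched here in plain Python rather than with the caret package used in the study. The function name and the example data are hypothetical. A nonspecific amplification is treated as the positive class.

```python
def confusion_metrics(observed, predicted):
    """Summarize predictions against observed PCR outcomes.

    observed, predicted: sequences of booleans, True meaning a
    nonspecific amplification (observed in vitro or predicted in silico).
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        # Share of observed nonspecific amplifications that were predicted.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Share of observed non-amplifications that were predicted.
        "specificity": tp and tn / (tn + fp) if False else (
            tn / (tn + fp) if tn + fp else float("nan")),
    }


# Hypothetical example with eight nontarget reactions.
observed = [True, True, True, False, False, False, False, False]
predicted = [True, False, False, False, False, False, True, False]
metrics = confusion_metrics(observed, predicted)
```

Note that "specificity" here is the classifier metric (true negative rate) and should not be confused with primer specificity.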

Relative importance of mismatch, primer properties, and PCR conditions
We modeled the PCR outcomes in relation to all parameters recorded in the PCR experimental results (cPCR, n = 679; qPCR, n = 226; combined, n = 905). The VIFs of the covariates target taxon and amplicon locus exceeded 5, so both were omitted from the models. Overall, primer specificity always increased in response to three factors (total NCM, total CM, and annealing temperature; P < 0.001; Table 5). The effects of 3′ end NCM and GC content were also significant, except in the qPCR model. The detection of nonspecific amplification also varied across PCR techniques, as suggested by the combined-dataset model (see Table 5).
Given that the results between models varied only slightly, we focus here on the combined dataset, which had a larger sample size and allowed for comparison between cPCR and qPCR (TaqMan and SYBR Green) techniques (Table 6). For the mismatch properties, total CM was 56% more effective than total NCM in increasing primer specificity. The effect of 3′ end NCM was similar to that of total NCM (odds ratios: 0.68 vs. 0.66), indicating the effects of NCM were not affected by mismatch position. One-degree Celsius difference in the optimal annealing temperature between primers caused a 49% difference in primer specificity (the higher the temperature, the higher the specificity). GC content was the only explanatory variable that reduced primer specificity significantly (odds ratio = 1.12, P < 0.05). More nonspecific amplification was detected using SYBR Green compared with cPCR assays, but there was no difference in detection between TaqMan and cPCR assays.
Notes: The U in cells represents target-specific primers, while the ✗ in cells represents nonspecific primers. Empty cells represent scenarios not tested. Primer pairs that were specific are underlined for better visualization.
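The link between the reported odds ratios and the percentage effects can be made explicit with a small Python sketch. The function name is hypothetical, and the multi-unit case assumes the effect is constant on the log-odds scale, as in any logistic model.

```python
def percent_change_in_odds(odds_ratio, units=1.0):
    """Percent change in the odds of amplification for a given change
    (in units) of an explanatory variable with the stated odds ratio."""
    return (odds_ratio ** units - 1.0) * 100.0


# Odds ratios reported in the text.
annealing_1c = percent_change_in_odds(0.51)      # about -49% per 1 degree C
annealing_2c = percent_change_in_odds(0.51, 2)   # about -74% per 2 degrees C
gc_content = percent_change_in_odds(1.12)        # about +12% per unit of GC content
```

An odds ratio of 0.51 for annealing temperature thus corresponds to the 49% difference per degree Celsius stated above, while the odds ratio of 1.12 for GC content corresponds to a 12% increase in the odds of nonspecific amplification per unit increase.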

DISCUSSION
Nonspecific PCR amplification is governed by a complex interplay of mismatch properties, primer properties, and PCR conditions, such that the effects of some parameters can be masked or influenced by other parameters (Dieffenbach et al. 1993, Rychlik 1995, Wu et al. 2009). Despite this, current in silico tests of primer specificity (e.g., Primer-BLAST) rely solely on the number and position of primer-template mismatches. In our study, we found that primer pairs predicted by Primer-BLAST to be specific rarely showed specificity during in vitro testing. In contrast, all primers validated as being specific during in vitro cPCR and qPCR (if any) were predicted by Primer-BLAST to be nonspecific with respect to the nontarget templates tested (Table 3). Thus, relying solely on the number and position of primer-template mismatches is insufficient and often misleading for in silico predictions of primer specificity. We investigated the effect of 11 parameters on primer specificity and found that five showed significant effects: total number of critical mismatches (Total CM), total number of noncritical mismatches (Total NCM), NCM at the 3′ end of the primer (3′ end NCM), annealing temperature, and GC content. Total CM had the strongest effect on reducing nonspecific amplification (Table 6), while GC content was the only significant parameter that was negatively correlated with primer specificity. In the remaining sections of this discussion, we detail the importance of the significant parameters.

Mismatch type and number
In our study, primer specificity increased significantly with both total NCM and total CM (Tables 6, 7). This is in agreement with the assumption that mismatch number tends to have a strong effect on primer specificity (Wilcox et al. 2013, Goldberg et al. 2016). Our finding of differences in the effects of total NCM and total CM (the latter was 56% more effective than the former) is also in general accordance with the results of previous work that focused on the effects of single primer-template mismatches located at the 3′ terminal of primers (Kwok et al. 1990), and at the center of short oligonucleotides (Peyret et al. 1999). This indicates that primer specificity could be better predicted if mismatches were categorized into critical and noncritical mismatches. Regardless, software developers and researchers need to be alert to the diverse behavior of particular mismatches in different contexts. For example, an A-C mismatch behaves differently depending on pH. At neutral pH, an A-C mismatch is comparable to a critical mismatch, while when pH is low (i.e., pH ≤ 5), the protonation of the A-C mismatch can contribute to the formation of an additional hydrogen bond that has a stabilizing effect equivalent to that of other noncritical mismatches (Allawi and SantaLucia 1998). In addition, the effect of any C-T and T-T mismatch (and its reciprocal) is highly dependent on its position: The pairing is very stable at the 3′ terminal, but can destabilize the primer-template duplex when located in the center of the primer (Vanhommerig et al. 1991, Peyret et al. 1999).
Table 5. Overview of the results of logistic regression models constructed based on cPCR data only, qPCR only, and the combined dataset, which investigated the relationships between the occurrence of nonspecific amplifications and various parameters in in vitro primer specificity tests. *P < 0.05; **P < 0.01; ***P < 0.001.
The effects of G-G mismatches at the terminal tend to vary among different studies: They could be as strong as A-A/A-G/C-C mismatches (Stadhouders et al. 2010) or similar to T-T/T-G/C-T mismatches (Kwok et al. 1990). A G-G mismatch is idiosyncratically stable with a destabilizing effect that is only slightly stronger when the mismatch is located at the 3′ terminal (Clanton-Arrowood et al. 2008). In Table 7, we have drawn upon published literature to rank the destabilizing effects of different single primer-template mismatch types and show how their position affects the DNA duplex. This information will assist researchers who select their primers manually, as well as software developers who can accommodate the dynamic effects of mismatches incrementally as ongoing advancements in technology lessen the computational cost.

Mismatch position
3′ end NCM was the only significant property that accounted for position in explaining primer specificity. Although significant, the effect of 3′ end NCM was similar to that of total NCM (odds ratios: 0.66 vs. 0.68), suggesting that a mismatch located at the 3′ end did not exert a stronger influence. This result contradicts previous studies that focused on the effects of single mismatches (e.g., Wu et al. 2009, Stadhouders et al. 2010). We believe that in the context of multiple mismatches, the destabilization effects of mismatches located at the 3′ end can be eliminated by other mismatches located in the central portion of the primer.

Annealing temperature
Annealing temperature had a substantial influence on primer specificity (odds ratio = 0.51, P < 0.001). This impact is less likely to be affected by primer and mismatch properties since increasing annealing temperature directly destabilizes the primer/nontarget template duplex and hence suppresses nontarget synthesis (Ugozzoli and Wallace 1991). However, it should be noted that increasing the annealing temperature beyond the optimal level can reduce PCR efficiency (Wu et al. 1991), implying reduced sensitivity (i.e., a poorer detection limit) of the primer assay for eDNA studies. Hence, the annealing temperature is better adjusted by lengthening the primer sequence with additional nucleotides, preferably at the 5′ end, to minimize the influence on primer-template hybridization.

GC content
While the assumption that primers should have a GC content between 40% and 60% was proven wrong decades ago (Rychlik 1995), many online primer design guidelines still advocate this parameter without detailed explanation. Our study shows that GC content could significantly affect PCR results (odds ratio = 1.12, P < 0.05). Given that a G-C pairing is much more stable than an A-T pairing due to the formation of an additional hydrogen bond, high GC content might imply more thermally stable primer/(nontarget) template duplexes and hence promote nontarget amplification (Bustin and Huggett 2017). However, it should be noted that for short primers (length < 20 bp), a low GC content can result in a low optimal annealing temperature that reduces primer specificity.
Note: Results integrated from the following literature, Kwok et al. (1990), Vanhommerig et al. (1991), SantaLucia (1997, 1998), Peyret et al. (1999), Tikhomirova et al. (2006), Stadhouders et al. (2010).
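The interplay between primer length, GC content, and duplex stability can be illustrated with a rough Python sketch. It uses the Wallace rule (2°C per A or T, 4°C per G or C), a simple textbook approximation of melting temperature for short oligonucleotides that was not used in this study. The function names and the example primer are hypothetical, and the optimal annealing temperature in practice should still be determined empirically (e.g., by temperature gradient).

```python
def gc_content(seq):
    """Return GC content of a primer as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature (deg C) with the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 18-nt primer and the same primer extended at the 5' end.
primer = "ACGTACGTACGTACGTAC"
extended = "AT" + primer

tm_short = wallace_tm(primer)     # 54
tm_long = wallace_tm(extended)    # 58
```

In this toy example, adding two A/T nucleotides at the 5′ end raises the approximate melting temperature while lowering GC content from 50% to 45%, which is in line with the suggestion above to lengthen primers rather than rely on high GC content.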

Method used in in vitro primer specificity tests
As a covariate, PCR technique impacted the detection of nonspecific amplification and hence the determination of primer specificity. Significantly more nonspecific amplification was detected with SYBR Green assays when compared to cPCR assays (odds ratio: 2.27, P < 0.05). This is consistent with the fact that qPCR is more sensitive than cPCR (Mary et al. 2004). However, no significant difference in detection of nonspecific amplification was found between TaqMan-based qPCR and cPCR. In contrast with SYBR Green assays, TaqMan assays are more specific since a nontarget amplicon is only detected when there is probe-amplicon hybridization (Tajadini et al. 2014). In our study, the specificity of three primer pairs was disproved with TaqMan (i.e., CT-COI-1, SB-COI-2, and TSE-12S-1; Table 3), while one primer pair gained specificity with the TaqMan-based assay (i.e., OC-CYTB-3). These results indicate that use of cPCR does not always overestimate the specificity of primers. For in vitro testing of primers, we suggest following the current practice of beginning with cPCR, despite its low sensitivity. Since cPCR is only around 1% of the cost of TaqMan, this approach may help minimize expenditure in primer/probe development. If the price of primer/probe development is high, traditional sampling methods could be more cost-effective than eDNA sampling (Smart et al. 2016).

Template concentration
Although template concentration had no significant influence on primer specificity, we recommend that eDNA studies include information on the concentration of nontarget templates used during in vitro validation of primer specificity, something that is often lacking in published literature (e.g., Atkinson et al. 2017, Klymus et al. 2017, Harper et al. 2018). Including the DNA concentration can alert other researchers to the specificity limit of the assays and help avoid false positives arising from insufficiently specific primers and high concentrations of nontarget templates (Bustin et al. 2009).

Implications
Designing specific primers has long been a challenging task for research employing PCR. Although there have been some published guidelines on primer selection (e.g., Dieffenbach et al. 1993, Rychlik 1995, Bustin and Huggett 2017), they may not be applicable in eDNA studies since the ratio of target to nontarget DNA in an environmental sample is often very low, leading to a much higher probability of false positives (nontarget amplification). False positives are especially misleading and costly in eDNA studies because management decisions are made based on these results (e.g., implementing measures to conserve a nonexistent endangered species or remove a nonexistent invasive species), and accurate prediction of primer specificity is the first line of defense against false positives.
The number of eDNA studies has expanded exponentially in recent years. Some researchers make claims about primer specificity based solely on in silico validation (e.g., Wei et al. 2018) or in combination with in vitro tests of one or a few nontarget species (e.g., Yusishen et al. 2020). These practices are not recommended since, as our results show, in silico testing of primer specificity using BLAST can be inaccurate. Validation tests, both in vitro and in situ, should be conducted to properly safeguard against drawing false conclusions from eDNA investigations. Researchers should also sequence at least some of the field positives to confirm the primers are amplifying the target gene from the study organism.
The results of in vitro tests are often wasted as they are rarely published and analyzed systematically. Our study serves as a novel example of the use of such data for evaluating primer design in the context of an eDNA study. While we have outlined the importance of various parameters (mismatch number, mismatch type, GC content, and annealing temperature) in governing primer specificity, we are not able to provide a universal optimal value for each parameter due to limited sample sizes. In the future, researchers should share data obtained during in vitro validation of primer specificity, such that more advanced analytical methods, such as machine learning, can be applied to further understand the impact of each parameter on primer development.