Volume 93, Issue 3
Article

Cross‐validation of species distribution models: removing spatial sorting bias and calibration with a null model

Robert J. Hijmans

Corresponding Author

E-mail address: rhijmans@ucdavis.edu

Department of Environmental Science and Policy, 1023 Wickson Hall, University of California, Davis, California 95616 USA

First published: 01 March 2012
Citations: 272

Corresponding Editor: M. Fortin.

Abstract

Species distribution models are usually evaluated with cross‐validation. In this procedure, evaluation statistics are computed from model predictions for sites of presence and absence that were not used to train (fit) the model. Using data for 226 species, from six regions, and two species distribution modeling algorithms (Bioclim and MaxEnt), I show that this procedure is highly sensitive to “spatial sorting bias”: the difference between the geographic distance from testing‐presence to training‐presence sites and the geographic distance from testing‐absence (or testing‐background) to training‐presence sites. I propose the use of pairwise distance sampling to remove this bias, and the use of a null model that only considers the geographic distance to training sites to calibrate cross‐validation results for remaining bias. Model evaluation results, expressed as the area under the receiver–operator curve (AUC), were strongly inflated: the null model performed better than MaxEnt for 45% and better than Bioclim for 67% of the species. Spatial sorting bias and AUC values increased when using partitioned presence data and random‐absence data instead of independently obtained presence–absence testing data from systematic surveys. Pairwise distance sampling removed spatial sorting bias, yielding null models with an AUC close to 0.5, such that AUC was the same as null‐model‐calibrated AUC (cAUC). This adjustment strongly decreased AUC values and changed the ranking among species. Cross‐validation results for different species are only comparable after removal of spatial sorting bias and/or calibration with an appropriate null model.
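
The ideas named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so several choices here are assumptions made only for illustration: the bias is summarized as a ratio of mean nearest-neighbor distances, the null model scores a site by its negated distance to the nearest training presence, pairwise distance sampling is approximated by a greedy one-to-one matching with a hypothetical relative tolerance `tol`, and the calibration is taken to be cAUC = AUC + 0.5 - max(0.5, null AUC). Coordinates are treated as planar, and all function names are hypothetical.

```python
import numpy as np

def nearest_distance(points, train):
    """Distance from each point to its nearest training-presence site (planar)."""
    diff = points[:, None, :] - train[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def auc(pres_scores, abs_scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    p = np.asarray(pres_scores, dtype=float)[:, None]
    a = np.asarray(abs_scores, dtype=float)[None, :]
    return float(((p > a) + 0.5 * (p == a)).mean())

def sorting_bias(d_pres, d_abs):
    """Assumed summary: ratio of mean distances; near 1 = no bias, near 0 = strong bias."""
    return float(np.mean(d_pres) / np.mean(d_abs))

def pairwise_distance_sample(d_pres, d_abs, tol=0.33):
    """Greedy sketch: pair each testing presence with one unused testing absence
    at the most similar distance to training; skip presences with no close match."""
    used = np.zeros(len(d_abs), dtype=bool)
    pairs = []
    for i, dp in enumerate(d_pres):
        gap = np.abs(d_abs - dp)
        gap[used] = np.inf                      # each absence is used at most once
        j = int(np.argmin(gap))
        if gap[j] <= tol * max(dp, 1e-12):      # relative tolerance (assumption)
            used[j] = True
            pairs.append((i, j))
    return pairs

def calibrated_auc(model_auc, null_auc):
    """Assumed calibration: subtract the part of the null AUC that exceeds 0.5."""
    return model_auc + 0.5 - max(0.5, null_auc)

# Toy data: testing presences sit close to training presences, most absences far away.
train = np.array([[0.0, 0.0], [1.0, 0.0]])
test_pres = np.array([[0.0, 1.0], [1.0, 1.0]])
test_abs = np.array([[0.0, -1.0], [5.0, 5.0], [9.0, 0.0], [1.0, -1.0]])

d_pres = nearest_distance(test_pres, train)
d_abs = nearest_distance(test_abs, train)

# The null model uses geographic distance only: closer to training = higher score.
print("bias before:", sorting_bias(d_pres, d_abs))
print("null AUC before:", auc(-d_pres, -d_abs))

pairs = pairwise_distance_sample(d_pres, d_abs)
keep_p = [i for i, _ in pairs]
keep_a = [j for _, j in pairs]
print("bias after:", sorting_bias(d_pres[keep_p], d_abs[keep_a]))
print("null AUC after:", auc(-d_pres[keep_p], -d_abs[keep_a]))
```

In this toy case the distance-only null model separates presences from absences before sampling, although it uses no environmental information. After matching, the retained absences lie as close to the training sites as the presences do, so the null model loses that advantage. A model's AUC on the unmatched data could then be adjusted with `calibrated_auc(model_auc, null_auc)`.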
