Volume 104, Issue 12 e4175
ARTICLE

Deep learning with citizen science data enables estimation of species diversity and composition at continental extents

Courtney L. Davis

Corresponding Author

Cornell Laboratory of Ornithology, Cornell University, Ithaca, New York, USA

Correspondence

Courtney L. Davis

Email: [email protected]

Yiwei Bai

Department of Computer Science, Cornell University, Ithaca, New York, USA

Di Chen

Department of Computer Science, Cornell University, Ithaca, New York, USA

Orin Robinson

Cornell Laboratory of Ornithology, Cornell University, Ithaca, New York, USA

Viviana Ruiz-Gutierrez

Cornell Laboratory of Ornithology, Cornell University, Ithaca, New York, USA

Carla P. Gomes

Department of Computer Science, Cornell University, Ithaca, New York, USA

Daniel Fink

Cornell Laboratory of Ornithology, Cornell University, Ithaca, New York, USA

First published: 02 October 2023

Handling Editor: James M. D. Speed

Courtney L. Davis and Yiwei Bai contributed equally to this work.

Abstract

Effective solutions to conserve biodiversity require accurate community- and species-level information at relevant, actionable scales and across entire species' distributions. However, data and methodological constraints have limited our ability to provide such information in robust ways. Herein we employ a Deep-Reasoning Network implementation of the Deep Multivariate Probit Model (DMVP-DRNets), an end-to-end deep neural network framework, to exploit large observational and environmental data sets together and estimate landscape-scale species diversity and composition at continental extents. We present results from a novel year-round analysis of North American avifauna using data from over nine million eBird checklists and 72 environmental covariates. We highlight the utility of our information by identifying critical areas of high species diversity for a single group of conservation concern, the North American wood warblers, while capturing spatiotemporal variation in species' environmental associations and interspecific interactions. In so doing, we demonstrate the type of accurate, high-resolution information on biodiversity that deep learning approaches such as DMVP-DRNets can provide and that is needed to inform ecological research and conservation decision-making at multiple scales.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT

The complete list of data analyzed in this study is provided in Appendix S3. The eBird data used to conduct this study are fully described in Appendix S3 and freely available on the eBird website at https://ebird.org/science/use-ebird-data. Data related to the Checklist Calibration Index are sensitive because they pertain to the behavior and location of individual eBird users and cannot be made publicly available; however, Checklist Calibration Index data supporting this research can be directly requested by contacting [email protected] and requesting access to the Checklist Calibration Index associated with the 2018 eBird Reference Dataset under a data sharing agreement. Model code and other nonsensitive data (Chen et al., 2023) are available on Zenodo at https://doi.org/10.5281/zenodo.8297796.