Impact of obscured data on species distribution models

Kyo Soung Koo, Ko Huan Lee, Dawon Lee, Yikweon Jang

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

The lack of knowledge about geographic distribution and environmental preference can hinder conservation efforts for rare and threatened species. Open-source databases provide an opportunity to address these knowledge gaps through the geographic information they hold on species worldwide. However, to protect rare and endangered species, open-source databases often assign locations that do not match the original locations, which introduces inaccuracies in occurrence records (e.g., the “obscured” function in iNaturalist replaces the original location with a random location in a 0.2 × 0.2° cell). We tested the efficacy of iNaturalist's obscured function in concealing geographic information and the function's impact on the species distribution modeling of 3 endangered species in South Korea: gold-spotted pond frogs (Pelophylax chosenicus), Reeves’ turtles (Mauremys reevesii), and Mongolia racerunners (Eremias argus). We collected occurrence data (original data) for these 3 species and uploaded the data to iNaturalist. We then compared location, elevation, and habitat area in the original data set with the same variables in the obscured data set. To investigate the differences in species distribution, we ran species distribution models with both data sets. We also assessed awareness of the obscured function in peer-reviewed articles that used occurrence records from iNaturalist. The locations assigned by the obscured function significantly altered the geographic information of the species, including elevational range, habitat type, and environmental variables relevant to species distribution. Potential distributions estimated using locations assigned under the obscured function were different from those estimated using the original data. Only 4 out of 170 peer-reviewed articles acknowledged the presence of obscured data in iNaturalist, suggesting that most researchers are unaware of this issue.
The locations assigned by the obscured function can cause serious problems in species distribution modeling and thus may negatively affect conservation of endangered species. We encourage researchers to thoroughly vet data obtained from open-source databases and urge database platforms to make it clear when data have been obscured.
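
To give a sense of scale for the obscuring described in the abstract, the following minimal Python sketch replaces a point with a uniformly random location inside a 0.2 × 0.2° cell and measures the resulting displacement. It is an illustration under stated assumptions, not the authors' method or iNaturalist's actual implementation: the cell is assumed to be aligned to a regular 0.2° grid and to contain the true point, and the function names (`obscure`, `haversine_km`) and the example coordinates are hypothetical.

```python
import math
import random

CELL_DEG = 0.2  # cell size stated in the abstract


def obscure(lat, lon, rng, cell=CELL_DEG):
    """Return a uniformly random point in the grid cell containing (lat, lon).

    Assumption: cells are aligned to multiples of `cell` degrees.
    """
    lat0 = math.floor(lat / cell) * cell  # south edge of the cell
    lon0 = math.floor(lon / cell) * cell  # west edge of the cell
    return lat0 + rng.random() * cell, lon0 + rng.random() * cell


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


rng = random.Random(0)
true_lat, true_lon = 37.53, 127.07  # hypothetical record location in South Korea
shifts = [
    haversine_km(true_lat, true_lon, *obscure(true_lat, true_lon, rng))
    for _ in range(1000)
]
print(f"mean shift: {sum(shifts) / len(shifts):.1f} km, max shift: {max(shifts):.1f} km")
```

At this latitude a 0.2° cell spans roughly 22 km north to south and about 18 km east to west, so under these assumptions an obscured record can sit well over 10 km from where the animal was observed, far enough to change the elevation and habitat values extracted for it.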

Original language: English
Article number: e70050
Journal: Conservation Biology
Volume: 39
Issue number: 5
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 The Author(s). Conservation Biology published by Wiley Periodicals LLC on behalf of Society for Conservation Biology.

Keywords

  • GBIF
  • biodiversity data
  • datos de biodiversidad
  • ecological model
  • estado de geoprivacidad
  • geoprivacy status
  • imprecisión posicional
  • modelo ecológico
  • positional inaccuracy
