AI and citizen science data help accelerate high quality fossil identifications by experts

Research output: Contribution to conference › Abstract › Academic

Abstract

Accurate identification of fossils is instrumental to palaeontological research but requires expert knowledge, is time consuming, and is subject to human biases. Through citizen science platforms and apps, AI-assisted identifications can mitigate those challenges, as shown, for example, by successful applications in biodiversity research. Fossil data is relatively scarce, but large and growing fossil datasets are being made available in open data repositories through collection digitisation efforts. Furthermore, data is collected and validated by fossil enthusiasts on citizen science platforms. These datasets can be used as training data for deep learning classification models to provide both experts and citizen scientists with accurate, quick and easy-to-use tools to collect, validate and analyse palaeontological data. However, AI model performance may be limited by the size and quality of the training dataset. We present and compare a set of convolutional neural networks (CNNs) that are trained and tested on standardised images from museum and private collections (>46,000 images) and images from the online citizen science platform Oervondstchecker.nl (>74,000 images). Both datasets consist of Quaternary vertebrate fossils and artefacts from the Netherlands and the southern North Sea Basin. Moreover, we compare model performance with identifications by 10 domain experts and active citizen scientists to gain a measure of data quality. The CNNs perform better when trained on standardised images (~85% top-1 accuracy) than when trained on citizen science data (~65% top-1 accuracy). Agreement among the fossil experts on their identifications is variable.
Based on these insights we make recommendations on how to account for variable validator input to optimise AI model training and performance. The synergy between AI model predictions and domain expert identifications can rapidly increase the amount of high-quality identifications of fossils and flag potential rare finds. Finally, to further increase data acquisition and public engagement, the models with the best overall performance have been made publicly available online (https://museum.identify.biodiversityanalysis.nl/model/beach_fossils_species) for use by professional experts, citizen scientists and the general public alike.
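
The two measures mentioned in the abstract, top-1 accuracy and agreement among validators, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy scores, and the choice of mean pairwise agreement as the agreement measure are all assumptions made for illustration.

```python
from itertools import combinations


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the class with the highest score for this image.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += predicted == label
    return correct / len(labels)


def mean_pairwise_agreement(identifications):
    """Mean share of specimens on which each pair of validators agrees.

    Hypothetical agreement measure; the abstract does not say which one was used.
    """
    rates = []
    for a, b in combinations(identifications, 2):
        rates.append(sum(x == y for x, y in zip(a, b)) / len(a))
    return sum(rates) / len(rates)


# Toy data (invented): per-class scores for four images and their true class indices.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
]
labels = [0, 1, 2, 1]
accuracy = top1_accuracy(scores, labels)

# Toy data (invented): three validators labelling the same four specimens.
experts = [
    ["mammoth", "horse", "bison", "deer"],
    ["mammoth", "horse", "deer", "deer"],
    ["mammoth", "bison", "bison", "deer"],
]
agreement = mean_pairwise_agreement(experts)
```

Raw pairwise agreement does not correct for agreement expected by chance; a chance-corrected statistic could be swapped in if validator labels are heavily skewed towards a few common taxa.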
Original language: English
Pages: 126-127
Number of pages: 2
Publication status: Published - 2025
Event: XXII Annual Meeting of the European Association of Vertebrate Palaeontologists - Kraków, Poland
Duration: 30 Jun 2025 → 5 Jul 2025
Conference number: 22
https://eavp2025.wixsite.com/eavp2025

Conference

Conference: XXII Annual Meeting of the European Association of Vertebrate Palaeontologists
Abbreviated title: EAVP 2025
Country/Territory: Poland
City: Kraków
Period: 30/06/25 → 5/07/25
Internet address: https://eavp2025.wixsite.com/eavp2025

Funding

Funding provided by NWO (Dutch Research Council) through an "Open Competition ENW-M" grant (dossier number: OCENW.M20.360)
