The Five Generations of Entity Resolution on Web Data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Entity Resolution is a core data integration task that has attracted a large body of work aimed at improving its effectiveness and time efficiency. This tutorial provides a comprehensive overview of the field, dividing the relevant methods into five main generations. The first generation targets Veracity in the context of structured data with a clean schema. The second generation extends the focus to Volume as well, leveraging multi-core or massive parallelization to process large-scale datasets. The third generation addresses the additional challenge of Variety, targeting voluminous, noisy, semi-structured, and highly heterogeneous data from the Semantic Web. The fourth generation also tackles Velocity, so as to process data collections of continuously increasing volume. The latest works belong to the fifth generation, which involves pre-trained (large) language models that rely heavily on external knowledge to address all four Vs with high effectiveness.

Original language: English
Title of host publication: Web Engineering - 24th International Conference, ICWE 2024, Proceedings
Editors: Kostas Stefanidis, Kari Systä, Maristella Matera, Sebastian Heil, Haridimos Kondylakis, Elisa Quintarelli
Publisher: Springer
Pages: 469-473
Number of pages: 5
ISBN (Print): 9783031623615
Publication status: Published - 2024
Externally published: Yes
Event: 24th International Conference on Web Engineering, ICWE 2024 - Tampere, Finland
Duration: 17 Jun 2024 to 20 Jun 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14629 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Conference on Web Engineering, ICWE 2024
Country/Territory: Finland
City: Tampere
Period: 17/06/24 to 20/06/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords

  • Data Integration
  • Entity Resolution
  • LLMs
