Abstract
Entity Resolution is a core data integration task that has attracted a large body of work on improving its effectiveness and time efficiency. This tutorial provides a comprehensive overview of the field, dividing the relevant methods into five main generations. The first generation targets Veracity in the context of structured data with a clean schema. The second generation extends the focus to Volume as well, leveraging multi-core or massive parallelization to process large-scale datasets. The third generation addresses the additional challenge of Variety, targeting voluminous, noisy, semi-structured, and highly heterogeneous data from the Semantic Web. The fourth generation also tackles Velocity, so as to process data collections of continuously increasing volume. The latest works belong to the fifth generation, which involves pre-trained (large) language models that rely heavily on external knowledge to address all four Vs with high effectiveness.
| Original language | English |
|---|---|
| Title of host publication | Web Engineering - 24th International Conference, ICWE 2024, Proceedings |
| Editors | Kostas Stefanidis, Kari Systä, Maristella Matera, Sebastian Heil, Haridimos Kondylakis, Elisa Quintarelli |
| Publisher | Springer |
| Pages | 469-473 |
| Number of pages | 5 |
| ISBN (Print) | 9783031623615 |
| Publication status | Published - 2024 |
| Externally published | Yes |
| Event | 24th International Conference on Web Engineering, ICWE 2024 - Tampere, Finland. Duration: 17 Jun 2024 → 20 Jun 2024 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 14629 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 24th International Conference on Web Engineering, ICWE 2024 |
|---|---|
| Country/Territory | Finland |
| City | Tampere |
| Period | 17/06/24 → 20/06/24 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
Keywords
- Data Integration
- Entity Resolution
- LLMs