Abstract
Link Discovery constitutes a crucial task for increasing the connections between data sources in the Linked Open Data Cloud. Part of this task is Entity Resolution (ER), which aims to identify owl:sameAs relations between different entity descriptions that pertain to the same real-world object. Due to its quadratic time complexity, ER is typically carried out in two steps: first, blocking restricts the computational cost to similar descriptions, and then, matching estimates the actual similarity between them. A plethora of techniques has been proposed for each step. To facilitate their use by researchers and practitioners, we present pyJedAI, an open-source library that leverages Python’s data science ecosystem to build powerful end-to-end ER workflows. The purpose of this work is to demonstrate how this can be accomplished by expert and novice users in an intuitive, yet efficient and effective way.
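The two-step workflow described above (blocking, then matching) can be illustrated with a minimal, library-free sketch. It does not use pyJedAI's API. Token blocking and Jaccard similarity are stand-ins chosen for illustration, and the function names, the sample data, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tokens(description):
    """Return the lower-cased whitespace tokens of all attribute values."""
    return {tok for value in description.values() for tok in str(value).lower().split()}


def token_blocking(entities):
    """Blocking step: pair up only the descriptions that share at least one token."""
    blocks = defaultdict(set)
    for eid, description in entities.items():
        for tok in tokens(description):
            blocks[tok].add(eid)
    candidates = set()
    for ids in blocks.values():
        # Singleton blocks yield no pairs, so unrelated descriptions are never compared.
        candidates.update(combinations(sorted(ids), 2))
    return candidates


def jaccard(a, b):
    """Matching step: token-set Jaccard similarity between two descriptions."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def resolve(entities, threshold=0.5):
    """Run blocking, then keep the candidate pairs whose similarity meets the threshold."""
    return sorted(
        pair
        for pair in token_blocking(entities)
        if jaccard(entities[pair[0]], entities[pair[1]]) >= threshold
    )


# Hypothetical entity descriptions with heterogeneous attribute names.
entities = {
    "a1": {"name": "Apple iPhone 13", "color": "black"},
    "b1": {"title": "iphone 13 apple", "colour": "black"},
    "b2": {"title": "Samsung Galaxy S21", "colour": "white"},
}
matches = resolve(entities)
```

Because blocking only emits pairs that share a token, the matching step here scores a single candidate pair instead of all three possible pairs, which is the cost reduction the abstract refers to.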
| Field | Value |
|---|---|
| Original language | English |
| Number of pages | 5 |
| Journal | CEUR Workshop Proceedings |
| Volume | 3254 |
| Publication status | Published - 29 Oct 2022 |
| Externally published | Yes |