TY - JOUR

T1 - Anytime bottom-up rule learning for large-scale knowledge graph completion

AU - Meilicke, Christian

AU - Chekol, Melisachew Wudage

AU - Betz, Patrick

AU - Fink, Manuel

AU - Stuckeschmidt, Heiner

N1 - Publisher Copyright:
© 2023, The Author(s).

PY - 2024

Y1 - 2024

N2 - Knowledge graph completion is the task of predicting correct facts that can be expressed by the vocabulary of a given knowledge graph, which are not explicitly stated in that graph. Broadly, there are two main approaches for solving the knowledge graph completion problem. Sub-symbolic approaches embed the nodes and/or edges of a given graph into a low-dimensional vector space and use a scoring function to determine the plausibility of a given fact. Symbolic approaches learn a model that remains within the primary representation of the given knowledge graph. Rule-based approaches are well-known examples. One such approach is AnyBURL. It works by sampling random paths, which are generalized into Horn rules. Previously published results show that the prediction quality of AnyBURL is close to current state of the art with the additional benefit of offering an explanation for a predicted fact. In this paper, we propose several improvements and extensions of AnyBURL. In particular, we focus on AnyBURL’s capability to be successfully applied to large and very large datasets. Overall, we propose four separate extensions: (i) We add to each rule a set of pairwise inequality constraints which enforces that different variables cannot be grounded by the same entities, which results into more appropriate confidence estimations. (ii) We introduce reinforcement learning to guide path sampling in order to use available computational resources more efficiently. (iii) We propose an efficient sampling strategy to approximate the confidence of a rule instead of computing its exact value. (iv) We develop a new multithreaded AnyBURL, which incorporates all previously mentioned modifications. In an experimental study, we show that our approach outperforms both symbolic and sub-symbolic approaches in large-scale knowledge graph completion. It has a higher prediction quality and requires significantly less time and computational resources.

AB - Knowledge graph completion is the task of predicting correct facts that can be expressed by the vocabulary of a given knowledge graph, which are not explicitly stated in that graph. Broadly, there are two main approaches for solving the knowledge graph completion problem. Sub-symbolic approaches embed the nodes and/or edges of a given graph into a low-dimensional vector space and use a scoring function to determine the plausibility of a given fact. Symbolic approaches learn a model that remains within the primary representation of the given knowledge graph. Rule-based approaches are well-known examples. One such approach is AnyBURL. It works by sampling random paths, which are generalized into Horn rules. Previously published results show that the prediction quality of AnyBURL is close to current state of the art with the additional benefit of offering an explanation for a predicted fact. In this paper, we propose several improvements and extensions of AnyBURL. In particular, we focus on AnyBURL’s capability to be successfully applied to large and very large datasets. Overall, we propose four separate extensions: (i) We add to each rule a set of pairwise inequality constraints which enforces that different variables cannot be grounded by the same entities, which results into more appropriate confidence estimations. (ii) We introduce reinforcement learning to guide path sampling in order to use available computational resources more efficiently. (iii) We propose an efficient sampling strategy to approximate the confidence of a rule instead of computing its exact value. (iv) We develop a new multithreaded AnyBURL, which incorporates all previously mentioned modifications. In an experimental study, we show that our approach outperforms both symbolic and sub-symbolic approaches in large-scale knowledge graph completion. It has a higher prediction quality and requires significantly less time and computational resources.

KW - Knowledge graph completion

KW - Link prediction

KW - Rule learning

UR - http://www.scopus.com/inward/record.url?scp=85162036488&partnerID=8YFLogxK

U2 - 10.1007/s00778-023-00800-5

DO - 10.1007/s00778-023-00800-5

M3 - Article

SN - 1066-8888

VL - 33

SP - 131

EP - 161

JO - VLDB Journal

JF - VLDB Journal

IS - 1

ER -