Are knowledge graph embedding models biased, or is it the data that they are trained on?

Wessel Radstok, Mel Chekol, Mirko Schaefer

Research output: Contribution to conference › Paper › Academic

Abstract

Recent studies on bias analysis of knowledge graph (KG) embedding
models focus primarily on altering the models so that sensitive
features are treated differently from other features. The underlying
implication is that the models cause the bias, or that it is their task to
resolve it. In this paper we argue that the problem is caused not by the
models but by the data, and that it is the responsibility of the expert to
ensure that the data is representative of the intended goal. To support
this claim, we experiment with two different knowledge graphs and show
that the bias is present not only in the models but also in the data.
Next, we show that by adding new samples to balance the distribution
of facts with regard to specific sensitive features, we can reduce the bias
in the models.
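
The balancing step described in the abstract can be pictured with a small sketch. The snippet below is a hypothetical illustration, not the authors' code: it oversamples existing facts so that each value of a sensitive feature (a made-up "gender" relation here) contributes equally to a target relation (a made-up "occupation" relation) before an embedding model is trained on the triples.

```python
# Minimal sketch (not the paper's implementation) of balancing the
# distribution of facts with regard to a sensitive feature. The triples,
# relation names ("gender", "occupation"), and toy data are hypothetical.
from collections import Counter, defaultdict
from itertools import cycle

triples = [
    ("alice", "gender", "female"), ("alice", "occupation", "engineer"),
    ("bob",   "gender", "male"),   ("bob",   "occupation", "engineer"),
    ("carol", "gender", "female"), ("carol", "occupation", "nurse"),
    ("dan",   "gender", "male"),   ("dan",   "occupation", "engineer"),
]

def balance(triples, sensitive_rel="gender", target_rel="occupation"):
    """Duplicate existing target facts until every sensitive group has the
    same number of facts for each target value (naive oversampling)."""
    group = {h: t for h, r, t in triples if r == sensitive_rel}
    buckets = defaultdict(list)  # (group value, target value) -> triples
    for triple in triples:
        h, r, t = triple
        if r == target_rel and h in group:
            buckets[(group[h], t)].append(triple)
    extra = []
    for value in {t for _, t in buckets}:
        # Only groups that already have facts for this value can be resampled.
        counts = [ts for (g, v), ts in buckets.items() if v == value]
        peak = max(len(ts) for ts in counts)
        for ts in counts:
            pool = cycle(ts)
            extra += [next(pool) for _ in range(peak - len(ts))]
    return triples + extra

balanced = balance(triples)
print(Counter(t[2] for t in balanced if t[1] == "occupation"))
# Before: "engineer" appears 3x (2 male, 1 female); after balancing it
# appears 4x (2 male, 2 female), so gender is no longer predictive of it.
```

A real pipeline would apply this over Wikidata-scale triples and many sensitive features, but the principle is the same: the intervention happens in the data, before any model sees it.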
Original language: English
Number of pages: 14
Publication status: Published - Oct 2021
Event: Wikidata Workshop 2021 co-located with the 20th International Semantic Web Conference (ISWC 2021) - online
Duration: 24 Oct 2021 - 24 Oct 2021
https://wikidataworkshop.github.io/2021/


