Abstract
Recent studies on bias analysis of knowledge graph (KG) embedding
models focus primarily on altering the models such that sensitive
features are dealt with differently from other features. The underlying
implication is that the models cause bias, or that it is their task to solve
it. In this paper we argue that the problem is not caused by the models
but by the data, and that it is the responsibility of the expert to
ensure that the data is representative of the intended goal. To support
this claim, we experiment with two different knowledge graphs and show
that the bias is not only present in the models, but also in the data.
Next, we show that by adding new samples to balance the distribution
of facts with regard to specific sensitive features, we can reduce the bias
in the models.
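The balancing step described in the abstract can be illustrated with a minimal sketch. The triple format, the relation name `gender`, and the resampling strategy are assumptions made for illustration only; the paper's actual method of adding new samples (for instance, drawing genuine facts from an external source such as Wikidata) is not shown here.

```python
from collections import defaultdict
import random

def balance_facts(triples, sensitive_relation):
    """Return extra (head, relation, tail) facts so that each value of the
    sensitive relation is backed by equally many facts. Illustrative sketch:
    here the extra facts are resampled from the existing ones, whereas in
    practice they would be genuine new facts."""
    # Map each entity to its sensitive value (e.g. person -> gender).
    value_of = {h: t for h, r, t in triples if r == sensitive_relation}

    # Group the remaining facts by the sensitive value of their head entity.
    facts_by_value = defaultdict(list)
    for h, r, t in triples:
        if r != sensitive_relation and h in value_of:
            facts_by_value[value_of[h]].append((h, r, t))

    # Top up under-represented groups until every group matches the largest.
    target = max(len(facts) for facts in facts_by_value.values())
    extra = []
    for facts in facts_by_value.values():
        extra.extend(random.choices(facts, k=target - len(facts)))
    return extra

if __name__ == "__main__":
    triples = [
        ("alice", "gender", "female"),
        ("bob", "gender", "male"),
        ("alice", "occupation", "engineer"),
        ("bob", "occupation", "engineer"),
        ("bob", "birthplace", "berlin"),
    ]
    # Prints one additional fact about alice, so both groups hold two facts.
    print(balance_facts(triples, "gender"))
```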
| Original language | English |
| --- | --- |
| Number of pages | 14 |
| Publication status | Published - Oct 2021 |
| Event | Wikidata Workshop 2021 co-located with the 20th International Semantic Web Conference (ISWC 2021), online, 24 Oct 2021 → 24 Oct 2021, https://wikidataworkshop.github.io/2021/ |
Conference

| Conference | Wikidata Workshop 2021 co-located with the 20th International Semantic Web Conference (ISWC 2021) |
| --- | --- |
| Period | 24/10/21 → 24/10/21 |
| Internet address | https://wikidataworkshop.github.io/2021/ |