Fine-grained controversy detection in Wikipedia

Siarhei Bykau, Flip Korn, Divesh Srivastava, Yannis Velegrakis

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review


The advent of Web 2.0 gave birth to a new kind of application where content is generated through the collaborative contribution of many different users. This form of content generation is believed to generate data of higher quality since the 'wisdom of the crowds' makes its way into the data. However, a number of specific data quality issues appear within such collaboratively generated data. Apart from normal updates, there are cases of intentional harmful changes known as vandalism as well as naturally occurring disagreements on topics which don't have an agreed upon viewpoint, known as controversies. While much work has focused on identifying vandalism, there has been little prior work on detecting controversies, especially at a fine granularity. Knowing about controversies when processing user-generated content is essential to understand the quality of the data and the trust that should be given to them. Controversy detection is a challenging task, since in the highly dynamic context of user updates, one needs to differentiate among normal updates, vandalisms and actual controversies. We describe a novel technique that finds these controversial issues by analyzing the edits that have been performed on the data over time. We apply the developed technique on Wikipedia, the world's largest known collaboratively generated database and we show that our approach has higher precision and recall than baseline approaches as well as is capable of finding previously unknown controversies.

Original languageEnglish
Title of host publication2015 IEEE 31st International Conference on Data Engineering, ICDE 2015
PublisherIEEE Computer Society Press
Number of pages12
ISBN (Electronic)9781479979639
Publication statusPublished - 1 Jan 2015
Event2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of
Duration: 13 Apr 201517 Apr 2015


Conference2015 31st IEEE International Conference on Data Engineering, ICDE 2015
Country/TerritoryKorea, Republic of


Dive into the research topics of 'Fine-grained controversy detection in Wikipedia'. Together they form a unique fingerprint.

Cite this