Fine-grained controversy detection in Wikipedia

Siarhei Bykau, Flip Korn, Divesh Srivastava, Yannis Velegrakis

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    The advent of Web 2.0 gave birth to a new kind of application where content is generated through the collaborative contribution of many different users. This form of content generation is believed to generate data of higher quality since the 'wisdom of the crowds' makes its way into the data. However, a number of specific data quality issues appear within such collaboratively generated data. Apart from normal updates, there are cases of intentional harmful changes known as vandalism as well as naturally occurring disagreements on topics which don't have an agreed-upon viewpoint, known as controversies. While much work has focused on identifying vandalism, there has been little prior work on detecting controversies, especially at a fine granularity. Knowing about controversies when processing user-generated content is essential to understand the quality of the data and the trust that should be given to it. Controversy detection is a challenging task, since in the highly dynamic context of user updates, one needs to differentiate among normal updates, vandalism and actual controversies. We describe a novel technique that finds these controversial issues by analyzing the edits that have been performed on the data over time. We apply the developed technique to Wikipedia, the world's largest known collaboratively generated database, and we show that our approach has higher precision and recall than baseline approaches and is capable of finding previously unknown controversies.

    Original language: English
    Title of host publication: 2015 IEEE 31st International Conference on Data Engineering, ICDE 2015
    Publisher: IEEE
    Pages: 1573-1584
    Number of pages: 12
    Volume: 2015-May
    ISBN (Electronic): 9781479979639
    Publication status: Published - 1 Jan 2015
    Event: 2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of
    Duration: 13 Apr 2015 - 17 Apr 2015

    Conference

    Conference: 2015 31st IEEE International Conference on Data Engineering, ICDE 2015
    Country/Territory: Korea, Republic of
    City: Seoul
    Period: 13/04/15 - 17/04/15
