Crowdsourcing high-quality structured data

Harry Halpin*, Ioanna Lykourentzou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

Abstract

One of the most difficult problems faced by consumers of semi-structured and structured data on the Web is how to discover or create the data they need. Producers of Web data, in turn, have no (semi-)automated way to align their data production with consumer needs. In this paper we formalize the problem of a data marketplace and hypothesize that one can quantify the value of semi-structured and structured data given a set of consumers, and that this quantification can be applied both to existing data-sets and to data-sets that have yet to be created. Furthermore, we provide an algorithm showing how the production of this data can be crowd-sourced while assuring the consumer a certain level of quality. Using real-world empirical data collected from data producers and consumers, we simulate a crowd-sourced data marketplace with quality guarantees.

Original language: English
Title of host publication: Information Management and Big Data - 5th International Conference, SIMBig 2018, Proceedings
Editors: Denisse Muñante, Hugo Alatrista-Salas, Juan Antonio Lossio-Ventura
Publisher: Springer
Pages: 304-319
Number of pages: 16
ISBN (Print): 9783030116798
DOIs
Publication status: Published - 2019
Event: 5th International Conference on Information Management and Big Data, SIMBig 2018 - Lima, Peru
Duration: 3 Sept 2018 to 5 Sept 2018

Publication series

Name: Communications in Computer and Information Science
Volume: 898
ISSN (Print): 1865-0929

Conference

Conference: 5th International Conference on Information Management and Big Data, SIMBig 2018
Country/Territory: Peru
City: Lima
Period: 3/09/18 to 5/09/18

Keywords

  • Crowdsourcing
  • Human computation
  • Resource allocation
  • Structured data
