D-LUCEA: Curation of the UCU Accent Project Data

R. Orr, H. Quené

Research output: Chapter in Book/Report/Conference proceeding | Chapter | Academic | peer-review

Abstract

The UCU Accent Project was set up in 2010 to collect a wide variety of non-native and native accents of English in an environment where English is the lingua franca, namely an international liberal arts and sciences college in Utrecht in the Netherlands. The recordings were made longitudinally over the three years of undergraduate study, and four cohorts of students were recorded in total. This yielded over 1,000 speech recordings over a six-year period in which the development of both native and non-native English accents in a non-native environment can be examined. In order to facilitate sharing the data with the wider research community, the D-LUCEA project undertook to curate the data. For each recording, the relevant concomitant metadata was produced, giving information to users of the database about the speaker, the technical specifications, the kinds of speech material recorded, and so forth. The project was funded by CLARIN, and specific CLARIN tools for curation were made available to us, including the Component Metadata Infrastructure (CMDI). To date, all of the speech data has been processed such that the metadata is available, and research is already running on this corpus, on topics as varied as prosodic convergence, L1 phonetic drift and phone convergence. Further plans include work with speaker recognition, accent recognition and models of language learning such as Flege’s Speech Learning Model, the Critical Period Hypothesis, and the Perceptual Assimilation Model.
Original language: English
Title of host publication: CLARIN in the Low Countries
Editors: Jan Odijk, Arjan van Hessen
Place of Publication: London
Publisher: Ubiquity Press
Pages: 181-193
Publication status: Published - 28 Dec 2017

Keywords

  • speech corpus
  • phonetic convergence
  • English as lingua franca
