Multi-level Fusion of Fisher Vector Encoded BERT and Wav2vec 2.0 Embeddings for Native Language Identification

Dani Krebbers, Heysem Kaya*, Alexey Karpov

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Native Language Identification is a prominent paralinguistic task with applications ranging from biometric analysis to speaker adaptation. Previous studies on this task have benefited from alternative acoustic feature representations and pre-trained neural networks. In this work, we explore the Native Language Identification performance of contextual acoustic (wav2vec 2.0) and linguistic (BERT) embeddings as state-of-the-art feature representations and combine them with acoustic features at different levels. We encode acoustic and linguistic features using Fisher Vectors, applying Fisher Vector encoding to BERT word embeddings and wav2vec 2.0 embeddings for the first time in a paralinguistic task. We compare this approach with conventional functional summarization. In line with our former study, which used only the acoustic modality, the results indicate the superiority of Fisher Vector encoding over the traditional techniques. Moreover, we show the efficacy of combining alternative representations, now in both the acoustic and linguistic modalities. Results indicate a notable contribution of the transformer-based contextual auditory and linguistic feature representations to bimodal Native Language Identification systems.
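
The contrast between Fisher Vector encoding and functional summarization can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the abstract does not state the GMM size, the feature dimensionality, or any preprocessing, so the function names (`functionals`, `fisher_vector`), the random GMM parameters, and the toy frame matrix are all assumptions made for illustration. The sketch uses the common diagonal-covariance formulation with power and L2 normalization.

```python
import numpy as np


def functionals(frames):
    """Conventional summarization: per-dimension mean and standard deviation."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])


def fisher_vector(frames, weights, means, variances):
    """Fisher Vector of a (T, D) sequence under a K-component diagonal GMM.

    weights: (K,), means: (K, D), variances: (K, D). Hypothetical inputs;
    in practice they would come from a GMM fitted on training frames.
    """
    T = frames.shape[0]
    sigma = np.sqrt(variances)
    # Standardized deviation of every frame from every component: (T, K, D)
    diff = (frames[:, None, :] - means[None, :, :]) / sigma[None, :, :]

    # Component posteriors per frame, computed in the log domain for stability.
    # The -D/2 * log(2*pi) term is shared by all components and cancels.
    log_prob = (
        -0.5 * (diff ** 2).sum(axis=2)
        - np.log(sigma).sum(axis=1)[None, :]
        + np.log(weights)[None, :]
    )
    log_prob -= log_prob.max(axis=1, keepdims=True)
    post = np.exp(log_prob)
    post /= post.sum(axis=1, keepdims=True)  # (T, K)

    # Gradients with respect to the means and standard deviations: (K, D) each
    g_mu = (post[:, :, None] * diff).sum(axis=0)
    g_mu /= T * np.sqrt(weights)[:, None]
    g_sigma = (post[:, :, None] * (diff ** 2 - 1.0)).sum(axis=0)
    g_sigma /= T * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))  # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv  # L2 normalization


# Toy data standing in for frame-level (or word-level) embeddings.
rng = np.random.default_rng(0)
T, D, K = 50, 4, 3
frames = rng.normal(size=(T, D))
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, D))
variances = np.ones((K, D))

stats = functionals(frames)                          # length 2 * D
fv = fisher_vector(frames, weights, means, variances)  # length 2 * K * D
```

Under these assumptions, functionals yield a fixed vector of length 2D per utterance, while the Fisher Vector has length 2KD and captures how the frames deviate from each mixture component. Either vector could then be concatenated with other feature sets for feature-level fusion.
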
Original language: English
Title of host publication: Speech and Computer
Subtitle of host publication: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings
Editors: S. R. Mahadeva Prasanna, Alexey Karpov, K. Samudravijaya, Shyam S. Agrawal
Publisher: Springer
Chapter: 34
Pages: 391-403
Number of pages: 13
Edition: 1
ISBN (Electronic): 978-3-031-20980-2
ISBN (Print): 978-3-031-20979-6
DOIs
Publication status: Published - 10 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13721 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Bibliographical note

Funding Information:
Acknowledgments. Work of A. Karpov is supported by the RSF (project No. 22-11-00321). We also thank the data donors and challenge organizers for making this research possible.

Publisher Copyright:
© 2022, Springer Nature Switzerland AG.

Keywords

  • BERT
  • Computational Paralinguistics
  • Fisher Vector
  • Native Language Identification
  • Wav2vec 2.0
