Abstract
Native Language Identification is a prominent paralinguistic task with applications ranging from biometric analysis to speaker adaptation. Previous studies on this task have benefited from alternative acoustic feature representations and pre-trained neural networks. In this work, we explore the Native Language Identification performance of contextual acoustic (wav2vec 2.0) and linguistic (BERT) embeddings as state-of-the-art feature representations and combine them with acoustic features at different levels. We encode acoustic and linguistic features using Fisher Vectors, applying Fisher Vector encoding to BERT word embeddings and wav2vec 2.0 embeddings for the first time for a paralinguistic task. We compare this approach with conventional functional summarization. In line with our previous study, which used only the acoustic modality, the results indicate the superiority of Fisher Vector encoding over the traditional techniques. Moreover, we show the efficacy of combining alternative representations, now in both the acoustic and linguistic modalities. The results indicate a notable contribution of the transformer-based contextual acoustic and linguistic feature representations to bimodal Native Language Identification systems.
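
To make the contrast between the two summarization strategies concrete, the following is a minimal sketch, not the authors' implementation. It encodes a variable-length sequence of frame-level (or word-level) embeddings as a Fisher Vector under a diagonal-covariance Gaussian mixture model and compares it with simple mean/standard-deviation functionals. The function names `fisher_vector` and `functionals` are hypothetical. The GMM parameters are assumed to be given (in practice they would be fitted on training data). The power and L2 normalization steps are a common convention assumed here, not something the abstract states.

```python
import numpy as np


def fisher_vector(frames, weights, means, sigmas):
    """Encode a (T, D) feature sequence as a Fisher Vector of length 2*K*D.

    weights: (K,) mixture weights; means, sigmas: (K, D) means and standard
    deviations of a diagonal-covariance GMM (assumed to be already fitted).
    """
    T, D = frames.shape
    # Standardised deviation of every frame from every component: (T, K, D)
    z = (frames[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log of (mixture weight * Gaussian density) per frame and component: (T, K)
    log_p = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(z ** 2, axis=2)
        - np.sum(np.log(sigmas), axis=1)[None, :]
        - 0.5 * D * np.log(2.0 * np.pi)
    )
    # Component posteriors via a numerically stable softmax
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Normalised gradients with respect to the means and standard deviations
    g_mu = np.einsum("tk,tkd->kd", gamma, z) / (T * np.sqrt(weights)[:, None])
    g_sigma = np.einsum("tk,tkd->kd", gamma, z ** 2 - 1.0) / (
        T * np.sqrt(2.0 * weights)[:, None]
    )
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    # Assumed post-processing: signed square root, then L2 normalisation
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv


def functionals(frames):
    """Baseline summary: per-dimension mean and standard deviation (length 2*D)."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])


# Toy data standing in for one utterance's embeddings (random, for illustration only)
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 4))   # T=50 frames, D=4 dimensions
weights = np.full(3, 1.0 / 3.0)     # K=3 equally weighted components
means = rng.normal(size=(3, 4))
sigmas = np.ones((3, 4))

print(fisher_vector(frames, weights, means, sigmas).shape)  # 2*K*D values
print(functionals(frames).shape)                            # 2*D values
```

Both functions map a sequence of any length to a fixed-size vector that a standard classifier can consume. The Fisher Vector grows with the number of mixture components, while the functionals depend only on the feature dimension.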
Original language | English |
---|---|
Title of host publication | Speech and Computer |
Subtitle of host publication | 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings |
Editors | S. R. Mahadeva Prasanna, Alexey Karpov, K. Samudravijaya, Shyam S. Agrawal |
Publisher | Springer |
Chapter | 34 |
Pages | 391-403 |
Number of pages | 13 |
Edition | 1 |
ISBN (Electronic) | 978-3-031-20980-2 |
ISBN (Print) | 978-3-031-20979-6 |
Publication status | Published - 10 Nov 2022 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13721 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Bibliographical note
Funding Information: Acknowledgments. The work of A. Karpov is supported by the RSF (project No. 22-11-00321). We also thank the data donors and challenge organizers for making this research possible.
Publisher Copyright:
© 2022, Springer Nature Switzerland AG.
Keywords
- BERT
- Computational Paralinguistics
- Fisher Vector
- Native Language Identification
- Wav2vec 2.0