Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

Talman, Aarne; Suni, Antti; Celikkanat, Hande; Kakouros, Sofoklis; Tiedemann, Jörg; Vainio, Martti

Conference article

Aarne Talman
Department of Digital Humanities, University of Helsinki, Finland / Basement AI, Finland

Antti Suni
Department of Digital Humanities, University of Helsinki, Finland

Hande Celikkanat
Department of Digital Humanities, University of Helsinki, Finland

Sofoklis Kakouros
Department of Digital Humanities, University of Helsinki, Finland

Jörg Tiedemann
Department of Digital Humanities, University of Helsinki, Finland

Martti Vainio
Department of Digital Humanities, University of Helsinki, Finland

Published in: Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, 2019, Turku, Finland

Linköping Electronic Conference Proceedings 167:29, p. 281--290

NEALT Proceedings Series 42:29, p. 281--290

Published: 2019-10-02

ISBN: 978-91-7929-995-8

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

In this paper, we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge, this will be the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark dataset in detail, and train a number of different models, ranging from feature-based classifiers to neural network systems, for the prediction of discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally, we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and the methods of predicting prosodic prominence from text. The dataset and the code for the models will be made publicly available.
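
As a rough illustration of what "discretized prosodic prominence" can mean as a sequence labeling target, the sketch below maps one continuous prominence score per word to a class index. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the example sentence, the scores, and the threshold values are all hypothetical, and the abstract does not specify how many classes or which cut points the dataset uses.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def discretize(scores: Sequence[float], thresholds: Sequence[float]) -> List[int]:
    """Map each continuous score to a class index.

    `thresholds` must be sorted in ascending order. A score below the first
    threshold gets class 0; each threshold the score reaches or exceeds
    raises the class by one.
    """
    return [bisect_right(thresholds, s) for s in scores]


def to_labeled_sequence(
    words: Sequence[str],
    scores: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[str, int]]:
    """Pair every word with its discretized label (one label per word)."""
    if len(words) != len(scores):
        raise ValueError("words and scores must have the same length")
    return list(zip(words, discretize(scores, thresholds)))


# Hypothetical sentence, scores, and cut points, for illustration only.
words = ["the", "cat", "sat", "down"]
scores = [0.05, 1.40, 0.60, 2.30]
thresholds = [0.5, 2.0]  # assumed cut points, giving three classes: 0, 1, 2

labeled = to_labeled_sequence(words, scores, thresholds)
print(labeled)
```

A word-level label sequence of this kind is what a sequence labeling model, whether a feature-based classifier or a fine-tuned contextual encoder, would be trained to predict from the text alone.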

Keywords

prosody prediction, prosodic prominence, sequence labeling, contextualized word representations
