Garnishing a phonetic dictionary for ASR intake

Iben Nyholm Debess
Grunnurin Føroysk Teldutala, Denmark

Sandra Saxov Lamhauge
Danish Language Council, Denmark

Peter Juel Henrichsen
Danish Language Council, Denmark

Published in: Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland

Linköping Electronic Conference Proceedings 167:47, pp. 395--399

NEALT Proceedings Series 42:47, pp. 395--399

Published: 2019-10-02

ISBN: 978-91-7929-995-8

ISSN: 1650-3686 (print), 1650-3740 (online)


We present a new method for preparing a lexical-phonetic database as a resource for acoustic model training. The research is an offshoot of the ongoing Project Ravnur (Speech Recognition for Faroese), but the method is language-independent. At NODALIDA 2019 we demonstrate the method (called SHARP) online, showing how a traditional lexical-phonetic dictionary with a very rich phone inventory is transformed into an ASR-friendly database with a reduced phone inventory, which prevents data sparseness. The mapping procedure is informed by a corpus of speech transcripts. We conclude with a discussion of the benefits of a well-thought-out BLARK (Basic Language Resource Kit) design, which makes tools like SHARP possible.

