Ryan Johnson
University of Tromsø, Norway
Lene Antonsen
University of Tromsø, Norway
Trond Trosterud
University of Tromsø, Norway
Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16
Linköping Electronic Conference Proceedings 85:10, p. 59-71
NEALT Proceedings Series 16:10, p. 59-71
Published: 2013-05-17
ISBN: 978-91-7519-589-6
ISSN: 1650-3686 (print), 1650-3740 (online)
This article presents a novel way of combining finite-state transducers (FSTs) with electronic dictionaries, thereby creating efficient reading comprehension dictionaries. We compare a North Saami - Norwegian and a South Saami - Norwegian dictionary, both enriched with an FST, with existing, available dictionaries containing pre-generated paradigms, and show the advantages of our approach. Being more flexible, the FSTs can also adapt the dictionary to different contexts. The finite-state transducer analyses the word to be looked up, and the dictionary itself conducts the actual lookup. The FST part is crucial for morphologically rich languages, where as little as 10% of the wordforms in running text are lemma forms. If a compound or derived word, or a word with an enclitic particle, is not found in the dictionary, the FST gives the stems and derivational affixes of the wordform, and each of the stems is given a separate translation. In this way, the coverage of the FST-dictionary is far larger than that of an ordinary dictionary of the same size.
Lexicography; Computational Morphology; Orthographic Variation; Finite-state Transducers; Electronic Dictionaries
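
The lookup flow described in the abstract (the FST analyses the wordform, the dictionary performs the lookup, and unlisted compounds fall back to per-stem translations) might be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: a plain Python table stands in for the FST, and the names `TOY_ANALYSES`, `TOY_DICTIONARY`, `analyse` and `lookup`, as well as all wordforms, tags and glosses, are invented for the example and are not real Saami data.

```python
# Toy stand-in for an FST analyser (assumption: a real system would call a
# compiled transducer here). Maps a surface wordform to possible analyses,
# each a (list of stems, tag string) pair. All entries are invented.
TOY_ANALYSES = {
    "mirat": [(["mira"], "N+Pl")],
    "mirasotat": [(["mira", "sota"], "N+Cmp+N+Pl")],
}

# Toy bilingual dictionary keyed on lemmas (invented entries).
TOY_DICTIONARY = {"mira": "house", "sota": "boat"}


def analyse(wordform):
    """Return the analyses of a wordform; unknown forms are passed through."""
    return TOY_ANALYSES.get(wordform, [([wordform], "?")])


def lookup(wordform):
    """Analyse the wordform, then look it up in the dictionary.

    If the full lemma is listed, its translation is returned. Otherwise each
    stem is translated separately (None marks a stem without an entry).
    """
    results = []
    for stems, tags in analyse(wordform):
        lemma = "".join(stems)
        if lemma in TOY_DICTIONARY:
            translations = [TOY_DICTIONARY[lemma]]
        else:
            translations = [TOY_DICTIONARY.get(stem) for stem in stems]
        results.append((lemma, tags, translations))
    return results


if __name__ == "__main__":
    for form in ("mirat", "mirasotat"):
        print(form, lookup(form))
```

In this sketch the inflected form is reduced to its lemma before the dictionary is consulted, and the compound, which has no entry of its own, still receives a translation for each of its stems. That is the sense in which analysis-based lookup can cover more wordforms than the dictionary has entries.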
