Conference article

Finite-state relations between two historically closely related languages

Kimmo Koskenniemi
University of Helsinki, Finland


Published in: Proceedings of the workshop on computational historical linguistics at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 18

Linköping Electronic Conference Proceedings 87:4, p. 53-53

NEALT Proceedings Series 18:4, p. 53-53


Published: 2013-05-17

ISBN: 978-91-7519-587-2

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

Regular correspondences between historically related languages can be modelled using finite-state transducers (FSTs). A new method is presented and demonstrated with a bidirectional experiment between Finnish and Estonian. An artificial representation (resembling a protolanguage) is established between the two related languages. This representation, AFE (Aligned Finnish-Estonian), is based on the letter-by-letter alignment of the two languages and uses mechanically constructed morphophonemes which represent the corresponding characters. By describing the constraints of AFE using two-level rules, one may construct useful mappings between the languages. In this way, the badly ambiguous FSTs from Finnish and Estonian to AFE can be composed into a practically unambiguous transducer from Finnish to Estonian. The inverse mapping from Estonian to Finnish is mildly ambiguous. The steps of the proposed method could be repeated as such with dialectal or older written texts. Choosing a set of model words, aligning them, recording the mechanical correspondences and designing rules for the constraints could be done with limited effort. For the purposes of indexing and searching, the mild ambiguity may be tolerable as such. The ambiguity can be further reduced by composing the resulting FST with a speller or morphological analyser of the standard language.
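
The idea of an aligned intermediate representation can be illustrated with a minimal Python sketch. It is not the method of the article: the two-level rules and FSTs are replaced here by a finite list of three hand-aligned model words, and the zero symbol `Ø`, the brace notation for morphophonemes, and all function names are assumptions made for illustration only. Each AFE word is a sequence of (Finnish, Estonian) character pairs; projecting it to either side gives the surface form, and relating the two projections through the shared AFE word plays the role of composing the two mappings.

```python
ZERO = "Ø"  # assumed notation for "no character on this side"
FIN, EST = 0, 1

# Hand-aligned model words (Finnish, Estonian), padded with ZERO to equal
# length. The alignment itself is an assumption of this sketch.
ALIGNED_PAIRS = [
    ("kala", "kala"),    # 'fish'
    ("silmä", "silmØ"),  # 'eye'
    ("jalka", "jalgØ"),  # 'foot, leg'
]

def to_afe(fin, est):
    """Build an AFE word: one (Finnish, Estonian) character pair per position."""
    if len(fin) != len(est):
        raise ValueError("aligned forms must have equal length")
    return tuple(zip(fin, est))

def show(afe):
    """Render an AFE word; differing pairs become morphophonemes like {kg}."""
    return "".join(f if f == e else "{" + f + e + "}" for f, e in afe)

def project(afe, side):
    """Recover the surface form of one language by dropping ZERO symbols."""
    return "".join(pair[side] for pair in afe if pair[side] != ZERO)

AFE_WORDS = [to_afe(fin, est) for fin, est in ALIGNED_PAIRS]

def translate(word, source, target):
    """Map a word from source to target via every AFE word that matches it."""
    return {project(afe, target) for afe in AFE_WORDS
            if project(afe, source) == word}

if __name__ == "__main__":
    for afe in AFE_WORDS:
        print(show(afe), project(afe, FIN), project(afe, EST))
    print(translate("jalka", FIN, EST))
    print(translate("silm", EST, FIN))
```

A real system would generalise beyond the listed model words by constraining the morphophonemes with rules, which is where the ambiguity discussed above arises; this lookup-only sketch has none.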

Keywords

Finite-State Transducers; Historical Linguistics; HFST; Two-Level Morphology; Foma
