Conference article

Automatic Lemmatisation of Lithuanian MWEs

Loïc Boizou
Centre of Computational Linguistics, Vytautas Magnus University, Kaunas, Lithuania

Jolanta Kovalevskaitė
Centre of Computational Linguistics, Vytautas Magnus University, Kaunas, Lithuania

Erika Rimkutė
Centre of Computational Linguistics, Vytautas Magnus University, Kaunas, Lithuania

Published in: Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 109:8, pp. 41-49

NEALT Proceedings Series 23:8, pp. 41-49

Published: 2015-05-06

ISBN: 978-91-7519-098-3

ISSN: 1650-3686 (print), 1650-3740 (online)


This article presents a study of the lemmatisation of flexible multiword expressions (MWEs) in Lithuanian. An approach based on syntactic analysis, designed for multiword term lemmatisation, was adapted to a broader range of MWEs taken from the Dictionary of Lithuanian Nominal Phrases. The analysis identifies the main lemmatisation errors and proposes some improvements. It shows that automatic lemmatisation can be improved by taking into account the whole set of grammatical forms for each MWE, which would make it possible to select the optimal grammatical form for lemmatisation and to identify some grammatical restrictions.


