
An evaluation of Czech word embeddings

Karolína Horenovská
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, Czech Republic

In: Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland

Linköping Electronic Conference Proceedings 167:7, p. 65--75

NEALT Proceedings Series 42:7, p. 65--75

Published: 2019-10-02

ISBN: 978-91-7929-995-8

ISSN: 1650-3686 (print), 1650-3740 (online)


We present an evaluation of Czech low-dimensional distributed word representations, also known as word embeddings. We describe five approaches to training the models and the three corpora used for training. We evaluate the resulting models on five datasets, report the results, and analyze them further.


Keywords: word embeddings, word similarity, word analogy, synonym retrieval


