Adapting word2vec to Named Entity Recognition

Scharolta Katharina Sienčnik
Department of Swedish / Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg, Sweden

In: Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 109:30, pp. 239-243

NEALT Proceedings Series 23:30, pp. 239-243

Published: 2015-05-06

ISBN: 978-91-7519-098-3

ISSN: 1650-3686 (print), 1650-3740 (online)


In this paper we explore how word vectors built using word2vec can be used to improve the performance of a classifier during Named Entity Recognition. We discuss how best to integrate word embeddings into the classification problem and consider the effect of the size of the unlabelled dataset on performance. We reach the unexpected result that, for this particular task, increasing the amount of unlabelled data does not necessarily improve the performance of the classifier.


