Analysing Inconsistencies and Errors in PoS Tagging in two Icelandic Gold Standards

Steinþór Steingrímsson
The Árni Magnússon Institute for Icelandic Studies, Reykjavík, Iceland

Sigrún Helgadóttir
The Árni Magnússon Institute for Icelandic Studies, Reykjavík, Iceland

Eiríkur Rögnvaldsson
University of Iceland, Reykjavík, Iceland

Published in: Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 109:38, pp. 287-291

NEALT Proceedings Series 23:38, pp. 287-291

Published: 2015-05-06

ISBN: 978-91-7519-098-3

ISSN: 1650-3686 (print), 1650-3740 (online)


This paper describes work in progress. We experiment with training a state-of-the-art tagger, Stagger, on MIM-GOLD, a new gold standard for PoS tagging of Icelandic, and compare the results with those obtained using an earlier gold standard, IFD. With MIM-GOLD, tagging accuracy is considerably lower: 92.76%, compared with 93.67% for IFD. To explain this gap, we analyze and classify the errors made by the tagger. We find that inconsistencies and incorrect tags in MIM-GOLD may account for it.
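
The comparison above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names and the toy tags are hypothetical, and it only assumes that gold and predicted tags are available as aligned lists. It shows how token-level accuracy is computed, how mismatches can be tallied by (gold tag, predicted tag) pair as a first step towards classifying errors, and how two accuracy figures translate into a relative change in error rate.

```python
from collections import Counter


def tagging_accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def confusion_counts(gold, predicted):
    """Count each (gold tag, predicted tag) pair where the two disagree."""
    return Counter((g, p) for g, p in zip(gold, predicted) if g != p)


def relative_error_increase(acc_a, acc_b):
    """Relative increase in error rate when moving from accuracy acc_a to acc_b."""
    err_a = 1.0 - acc_a
    err_b = 1.0 - acc_b
    return (err_b - err_a) / err_a


if __name__ == "__main__":
    # Toy tags for illustration only; these are not tags from IFD or MIM-GOLD.
    gold = ["N", "V", "N", "ADJ"]
    predicted = ["N", "N", "N", "ADJ"]
    print(tagging_accuracy(gold, predicted))
    print(confusion_counts(gold, predicted).most_common())

    # Accuracy figures reported in the abstract: IFD 93.67%, MIM-GOLD 92.76%.
    print(round(relative_error_increase(0.9367, 0.9276), 3))
```

Read as error rates, the reported accuracies correspond to 6.33% for IFD and 7.24% for MIM-GOLD, a gap of 0.91 percentage points, or roughly 14% more tagging errors relative to IFD.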


