Optimizing the Oslo-Bergen Tagger

Eckhard Bick
University of Southern Denmark, Odense, Denmark

Kristin Hagen
University of Oslo, Norway

Anders Nøklestad
University of Oslo, Norway

Published in: Proceedings of the Workshop on “Constraint Grammar - methods, tools and applications” at NODALIDA 2015, May 11-13, 2015, Institute of the Lithuanian Language, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 113:2, pp. 11-17

NEALT Proceedings Series 24:2, pp. 11-17

Published: 2015-06-17

ISBN: 978-91-7519-037-2

ISSN: 1650-3686 (print), 1650-3740 (online)


In this paper we discuss and evaluate machine learning-based optimization of a Constraint Grammar for Norwegian Bokmål, the Oslo-Bergen Tagger (OBT). The original linguist-written rules are iteratively re-ordered, re-sectioned and systematically modified based on their performance on a hand-annotated training corpus. We discuss the interplay of various parameters and propose a new method, continuous sectionizing. For the best-performing parameter constellation, the part-of-speech F-score improved by 0.31 percentage points for the first pass in a 5-fold cross-evaluation, and by over 1 percentage point in highly iterated runs with continuous resectioning.


