An Automatic Error Tagger for German

Inga Kempfert
Natural Language Systems Group, Department of Informatics, Universität Hamburg, Germany

Christine Köhn
Natural Language Systems Group, Department of Informatics, Universität Hamburg, Germany


Published in: Proceedings of the 7th Workshop on NLP for Computer Assisted Language Learning (NLP4CALL 2018) at SLTC, Stockholm, 7th November 2018

Linköping Electronic Conference Proceedings 152:4, pp. 32-40

NEALT Proceedings Series 36:4, pp. 32-40


Published: 2018-11-02

ISBN: 978-91-7685-173-9

ISSN: 1650-3686 (print), 1650-3740 (online)


Automatically classifying errors by language learners facilitates corpus analysis and tool development. We present a tag set and a rule-based classifier for automatically assigning error tags to edits in learner texts. In our manual evaluation, the classifier assigns the best or close to best fitting tag in 92% of the cases.


automatic error tagging, error classes, learner corpora, target hypotheses


