Conference article

Misspellings in responses to listening comprehension questions: Prospects for scoring based on phonetic normalization

Heike Da Silva Cardoso
Department of Linguistics, Eberhard Karls Universität Tübingen, Tübingen, Germany

Magdalena Wolska
LEAD Graduate School, Eberhard Karls Universität Tübingen, Tübingen, Germany


Published in: Proceedings of the 4th workshop on NLP for Computer Assisted Language Learning at NODALIDA 2015, Vilnius, 11th May, 2015

Linköping Electronic Conference Proceedings 114:2, p. 1-10

NEALT Proceedings Series 26:2, p. 1-10


Published: 2015-05-06

ISBN: 978-91-7519-036-5

ISSN: 1650-3686 (print), 1650-3740 (online)


Automated scoring systems that evaluate content require robust ways of dealing with form errors. The work presented in this paper is set in the context of scoring learners’ responses to listening comprehension items included in a placement test of German as a foreign language. Based on a corpus of over 3000 responses to 17 questions, written by test takers of different language proficiency levels, we perform a quantitative analysis of the diversity in misspellings. We evaluate the performance of an off-the-shelf open-source spell-checker on our data and show that around 45% of the reported non-word errors are not correctly accounted for: either a word is falsely flagged as misspelt, or the spell-checker is unable to identify the intended word. We propose to address misspellings in computer-based scoring of constructed response items by means of phonetic normalization. Learner responses transcribed into Soundex codes and into two encodings borrowed from historical linguistics (ASJP and Dolgopolsky’s sound classes) are compared to transcribed reference answers using string distance measures. We show that reliable correlation with teachers’ scores can be obtained; however, similarity thresholds are item-specific.
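The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes the classic four-character American Soundex and a length-normalized Levenshtein distance; the abstract does not specify which Soundex variant, distance measure, or thresholds the authors used, so the function names, the example words, and the threshold value below are hypothetical and serve illustration only. Non-ASCII letters such as German umlauts are simply skipped here, which is a simplification.

```python
# Illustrative sketch only: classic American Soundex plus normalized
# Levenshtein similarity. Not the authors' implementation.

SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word: str, length: int = 4) -> str:
    """Return the Soundex code of `word`, zero-padded to `length` characters."""
    # Assumption: only ASCII letters are kept; umlauts and ß are dropped.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # 'h' and 'w' do not separate consonants with the same code; vowels do.
        if ch not in "hw":
            prev = digit
    return (code + "0" * length)[:length]


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        row = [i]
        for j, cb in enumerate(b, start=1):
            row.append(min(prev_row[j] + 1,                # deletion
                           row[j - 1] + 1,                 # insertion
                           prev_row[j - 1] + (ca != cb)))  # substitution
        prev_row = row
    return prev_row[-1]


def similarity(response: str, reference: str) -> float:
    """Similarity in [0, 1] between the Soundex codes of two words."""
    a, b = soundex(response), soundex(reference)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical item-specific threshold; real values would be tuned per item.
THRESHOLD = 0.75

# "Banhof" is a plausible misspelling of "Bahnhof"; both map to the same code.
print(soundex("Bahnhof"), soundex("Banhof"))
print(similarity("Banhof", "Bahnhof") >= THRESHOLD)
```

Because the finding is that thresholds are item-specific, a scoring setup along these lines would need one threshold per question rather than a single global cut-off.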


misspellings in learner language; constructed responses to listening comprehension items; short answer scoring


Kevin Atkinson. 2006. GNU Aspell 0.60.7.

Adriane Boyd. 2010. EAGLE: an Error-Annotated Corpus of Beginning Learner German. In Proceedings of the 7th LREC.

Cecil H Brown, Eric W Holman, Søren Wichmann, and Viveka Velupillai. 2008. Automated classification of the world’s languages: a description of the method and preliminary results. STUF-Language Typology and Universals Sprachtypologie und Universalienforschung, 61(4):285–308. doi: 10.1524/stuf.2008.0026.

Fred J. Damerau. 1964. A technique for computer detection and correction of spelling errors. Communications of the ACM. doi: 10.1145/363958.363994.

Aharon B. Dolgopolsky. 1986. A probabilistic hypothesis concerning the oldest relationships among the language families of northern Eurasia. In Typology, Relationship and Time: A Collection of Papers on Language Change and Relationship by Soviet Linguists, pages 27–50. (Original: 1964 In: Voprosy Jazykoznanija 2).

Michael Flor and Yoko Futagi. 2012. On using context for automatic correction of non-word misspellings in student essays. In Proceedings of the 7th Workshop on Building Educational Applications Using NLP.

Michael Hahn, Niels Ott, Ramon Ziai, and Detmar Meurers. 2013. CoMeT: Integrating different levels of linguistic modeling for meaning assessment. https://aclweb.org/anthology/S/S13/S13-2102.pdf.

Trude Heift and Anne Rimrott. 2008. Learner responses to corrective feedback for spelling errors in CALL. System. doi: 10.1016/j.system.2007.09.007.

International Phonetic Association, editor. 1999. Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet. Cambridge University Press.

Vilius Juozulynas. 2013. Errors in the compositions of second-year German students: An empirical study for parser-based ICALI. CALICO Journal.

V.I. Levenshtein. 1966. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10(8):707–710.

Johann-Mattis List and Steven Moran. 2013. An open source toolkit for quantitative historical linguistics. In Proceedings of the ACL Conference (System Demonstrations).

Johann-Mattis List, Steven Moran, Peter Bouda, and Johannes Dellert. 2013. LingPy. Python Library for Automatic Tasks in Historical Linguistics.

Marc Reznicek, Anke Lüdeling, and Hagen Hirschmann. 2013. Competing target hypotheses in the Falko corpus. In Automatic Treatment and Analysis of Learner Corpus Data, volume 59 of Studies in Corpus Linguistics, pages 101–123.

Anne Rimrott and Trude Heift. 2008. Evaluating automatic detection of misspellings in German. Language Learning & Technology, 12(3):73–92.

Robert C. Russell. 1918, 1922. US Patents No.: 1261167 and 1435663. (Retrieved 04/15 via

Howida A. Shedeed. 2011. A new intelligent methodology for computer based assessment of short answer question based on a new enhanced Soundex phonetic algorithm for Arabic language. International Journal of Computer Applications.

Søren Wichmann, André Müller, Annkathrin Wett, Viveka Velupillai, Julia Bischoffberger, Cecil H. Brown, Eric W. Holman, Sebastian Sauppe, Zarina Molochieva, Pamela Brown, Harald Hammarström, Oleg Belyaev, Johann-Mattis List, Dik Bakker, Dmitry Egorov, Matthias Urban, Robert Mailhammer, Agustina Carrizo, Matthew S. Dryer, Evgenia Korovina, David Beck, Helen Geyer, Pattie Epps, Anthony Grant, and Pilar Valenzuela. 2013. The ASJP-Database (version 16). (Retrieved 04/15).

Magdalena Wolska, Andrea Horbach, and Alexis Palmer. 2014. Computer-assisted scoring of short responses: the efficiency of a clustering-based approach in a real-life task. In Advances in Natural Language Processing (Proceedings of the 9th International Conference on Natural Language Processing (PolTAL-14)). doi: 10.1007/978-3-319-10888-9_31.
