Conference article

Paraphrase Detection for Short Answer Scoring

Nikolina Koleva
Saarland University, Saarbrücken, Germany

Andrea Horbach
Saarland University, Saarbrücken, Germany

Alexis Palmer
Saarland University, Saarbrücken, Germany

Simon Ostermann
Saarland University, Saarbrücken, Germany

Manfred Pinkal
Saarland University, Saarbrücken, Germany

Published in: Proceedings of the third workshop on NLP for computer-assisted language learning at SLTC 2014, Uppsala University

Linköping Electronic Conference Proceedings 107:5, p. 59–73

NEALT Proceedings Series 22:5, p. 59–73

Published: 2014-11-11

ISBN: 978-91-7519-175-1

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

We describe a system that grades learner answers in reading comprehension tests in the context of foreign language learning. This task, also known as short answer scoring, essentially requires determining whether a semantic entailment relationship holds between an individual learner answer and a target answer; semantic information is therefore a necessary part of any automatic short answer scoring system. At the same time, the method must be robust to the particularities of learner language. We propose using paraphrase detection, a method that meets both requirements. The basis for our specific paraphrasing method is word alignment learned from parallel corpora, which we create from the available data in the CREG corpus (Corpus for Reading Comprehension Exercises for German). We show that this kind of information is useful for the task of short answer scoring. Combining our results with existing approaches, we observe a tendency toward improvement.
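
To illustrate the general idea of scoring by alignment, the following minimal Python sketch computes how much of a target answer is covered by words in a learner answer. It is not the system described in the article: the paraphrase table here is a hand-written, hypothetical stand-in for alignments that the article learns from parallel corpora built from CREG, and the function names `align` and `target_coverage` as well as the example sentences are assumptions made for illustration only.

```python
from typing import List, Set, Tuple


def align(learner: List[str], target: List[str],
          paraphrases: Set[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Link learner token i to target token j if the two words are
    identical or listed (in either order) in the paraphrase table."""
    links = []
    for i, lw in enumerate(learner):
        for j, tw in enumerate(target):
            if lw == tw or (lw, tw) in paraphrases or (tw, lw) in paraphrases:
                links.append((i, j))
    return links


def target_coverage(learner: List[str], target: List[str],
                    paraphrases: Set[Tuple[str, str]]) -> float:
    """Fraction of target tokens linked to at least one learner token."""
    if not target:
        return 0.0
    covered = {j for _, j in align(learner, target, paraphrases)}
    return len(covered) / len(target)


# Hypothetical toy data (not taken from CREG).
target_answer = ["der", "mann", "kauft", "ein", "auto"]
learner_answer = ["der", "mann", "erwirbt", "einen", "wagen"]
toy_paraphrases = {("erwirbt", "kauft"), ("wagen", "auto")}

coverage = target_coverage(learner_answer, target_answer, toy_paraphrases)
print(coverage)
```

A coverage value like this could serve as one feature among others for a classifier that labels a learner answer as correct or incorrect; how the article actually derives and combines its features is not specified on this page.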

Keywords

Paraphrase fragments; short answer scoring; reading comprehension
