Linguistically Motivated Question Classification

Alexandr Chernov
Saarland University, Spoken Language Systems, Saarbrücken, Germany

Volha Petukhova
Saarland University, Spoken Language Systems, Saarbrücken, Germany

Dietrich Klakow
Saarland University, Spoken Language Systems, Saarbrücken, Germany


Published in: Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 109:9, p. 51-59

NEALT Proceedings Series 23:9, p. 51-59


Published: 2015-05-06

ISBN: 978-91-7519-098-3

ISSN: 1650-3686 (print), 1650-3740 (online)


In this paper we describe a question interpretation module designed as part of a Question Answering Dialogue System (QADS) that is used for an interactive quiz application. Question interpretation is achieved by applying a sequence of classification, information extraction, query formalization and query expansion tasks. Question classification is based on a domain-specific taxonomy of semantic roles and relations. Our taxonomy was designed in accordance with real spoken dialogue data. An SVM-based classifier is trained to predict the Expected Answer Type (EAT) with a precision of 82%. To retrieve a correct answer, one or more focus words are extracted to augment the EAT identified by the system. Our hybrid algorithm for the extraction of focus words achieves an accuracy of 94.6%. The EAT and the focus words are formalized in a query, which is further expanded with synonyms from WordNet. The expanded query facilitates the search for and retrieval of the information needed to generate the system’s responses.
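
The last two steps of the pipeline, query formalization and synonym expansion, might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the Query class, the function names, the "LOCATION" label and the small synonym table are all hypothetical, and the table merely stands in for a WordNet lookup so that the example runs without external resources.

```python
from dataclasses import dataclass, field

# Toy synonym table standing in for WordNet (an assumption for illustration;
# the paper takes its synonyms from WordNet itself).
TOY_SYNONYMS = {
    "born": ["birth", "birthplace"],
    "wife": ["spouse"],
}


@dataclass
class Query:
    eat: str            # Expected Answer Type, e.g. as predicted by a classifier
    focus_words: list   # focus words extracted from the question
    expansions: dict = field(default_factory=dict)  # focus word -> synonyms


def formalize(eat, focus_words):
    """Combine the EAT and the focus words into a single query object."""
    return Query(eat=eat, focus_words=list(focus_words))


def expand(query, synonyms=TOY_SYNONYMS):
    """Return a copy of the query with synonyms attached to each focus word."""
    expansions = {w: list(synonyms.get(w.lower(), [])) for w in query.focus_words}
    return Query(query.eat, list(query.focus_words), expansions)


def search_terms(query):
    """Flatten focus words and their synonyms into an ordered, duplicate-free list."""
    terms = []
    for word in query.focus_words:
        for term in [word] + query.expansions.get(word, []):
            if term not in terms:
                terms.append(term)
    return terms


# Hypothetical input for a question such as "Where were you born?"
query = expand(formalize("LOCATION", ["born"]))
print(query.eat, search_terms(query))
```

Keeping the expansions separate from the original focus words lets a retrieval component weight exact matches above synonym matches, although the abstract does not say whether the described system does so.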




