Conference article

Crossing the Border Twice: Reimporting Prepositions to Alleviate L1-Specific Transfer Errors

Johannes Graën
Institute of Computational Linguistics, University of Zurich, Switzerland

Gerold Schneider
Institute of Computational Linguistics, University of Zurich, Switzerland

Part of: Proceedings of the Joint 6th Workshop on NLP for Computer Assisted Language Learning and 2nd Workshop on NLP for Research on Language Acquisition at NoDaLiDa, Gothenburg, 22nd May 2017

Linköping Electronic Conference Proceedings 134:3, pp. 18-26

NEALT Proceedings Series 30:3, pp. 18-26

Published: 2017-05-11

ISBN: 978-91-7685-502-7

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

We present a data-driven approach that exploits word alignment in a large parallel corpus to identify the verb-preposition and adjective-preposition combinations that are difficult for second-language (L2) learners. On the one hand, this lets us provide language-specific ranked lists that help learners focus on the combinations that are particularly challenging given their native language (L1). On the other hand, we provide extensive statistics on such combinations to facilitate automatic correction of preposition errors in learner texts. We evaluate the lists in two ways: manually, and automatically by applying our statistics to an error-correction task.
