Conference article

Enhanced UD Dependencies with Neutralized Diathesis Alternation

Marie Candito
Univ. Paris Diderot, CNRS, Laboratoire de Linguistique Formelle, France

Bruno Guillaume
Inria Nancy Grand-Est, Loria, France

Guy Perrier
Univ. de Lorraine, Loria, UMR 7503, France

Djamé Seddah
Univ. Paris-Sorbonne, Inria, France

Published in: Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), September 18-20, 2017, Università di Pisa, Italy

Linköping Electronic Conference Proceedings 139:7, pp. 42-53

Published: 2017-09-13

ISBN: 978-91-7685-467-9

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

The 2.0 release of the Universal Dependency treebanks demonstrates the effectiveness of the UD scheme in coping with very diverse languages. The next step would be to capture more of the syntactic analysis, and the “enhanced dependencies” sketched in the UD 2.0 guidelines are a promising attempt in that direction. In this work we propose to go further and enrich the enhanced dependency scheme along two axes: extending the cases of recovered arguments of non-finite verbs, and neutralizing syntactic alternations. Doing so leads to both richer and more uniform structures, while remaining at the syntactic level, and thus rather neutral with respect to the type of semantic representation that can be further obtained. We implemented this proposal in two UD treebanks of French, using deterministic graph-rewriting rules. Evaluation on a 200-sentence gold standard shows that deep syntactic graphs can be obtained from surface syntax annotations with high accuracy. Among all arguments of verbs in the gold standard, 13.91% are impacted by syntactic alternation normalization, and 18.93% are additional deep edges.
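
To make the idea of neutralizing a diathesis alternation with a deterministic rewriting rule concrete, here is a minimal sketch in Python. It is not the authors' rule set or tooling: the edge representation, the function name `neutralize_passive`, and the label mapping (`nsubj:pass` to `obj`, `obl:agent` to `nsubj`) are assumptions chosen for illustration, and the paper's actual scheme may encode canonical functions differently.

```python
from typing import List, Tuple

# A dependency edge as (head, label, dependent). This flat representation
# is an assumption made for the sketch, not the paper's data format.
Edge = Tuple[str, str, str]

# Hypothetical mapping from surface labels on a passive verb to the
# labels the same arguments would bear in the active voice.
PASSIVE_TO_CANONICAL = {
    "nsubj:pass": "obj",
    "obl:agent": "nsubj",
}


def neutralize_passive(edges: List[Edge]) -> List[Edge]:
    """Relabel the arguments of passive verbs with canonical functions."""
    # A head is treated as passive if it governs a passive subject.
    passive_heads = {head for head, label, _ in edges if label == "nsubj:pass"}
    rewritten = []
    for head, label, dep in edges:
        if head in passive_heads and label in PASSIVE_TO_CANONICAL:
            label = PASSIVE_TO_CANONICAL[label]
        rewritten.append((head, label, dep))
    return rewritten


# "La pomme est mangée par Paul" (The apple is eaten by Paul)
surface = [
    ("mangée", "nsubj:pass", "pomme"),
    ("mangée", "aux:pass", "est"),
    ("mangée", "obl:agent", "Paul"),
]
deep = neutralize_passive(surface)
```

Under these assumptions, `pomme` ends up as the canonical object and `Paul` as the canonical subject of `mangée`, which matches the analysis of the active counterpart "Paul mange la pomme". Edges that do not take part in the alternation, such as the auxiliary, are left untouched.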
