Published: 2017-09-13
ISBN: 978-91-7685-467-9
ISSN: 1650-3686 (print), 1650-3740 (online)
This paper describes Semgrex-Plus, an automatic procedure we developed to convert dependency treebanks into different formats. The tool supports the definition of formal rules for rewriting dependencies and token tags, and provides a treebank-rewriting algorithm that avoids rule interference during the conversion process. The tool is publicly available.
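To illustrate what rule interference means in practice, here is a minimal Python sketch. It is not the Semgrex-Plus implementation, and it does not use Semgrex syntax: the `Token` class, the `rewrite` function, the rule format, and the dependency labels are all hypothetical names chosen for the example. It shows one plausible way to avoid interference: every rule is matched against the unmodified tree first, and the planned rewrites are applied only afterwards, so the output of one rule can never trigger another.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Token:
    """One token of a dependency tree (hypothetical, simplified layout)."""
    idx: int      # 1-based position in the sentence
    form: str     # surface form
    pos: str      # token tag
    head: int     # idx of the governing token, 0 for the root
    deprel: str   # dependency label


# A rule is a (matcher, rewriter) pair; this format is an assumption
# made for the sketch, not the rule language described in the paper.
Rule = Tuple[Callable[[Token], bool], Callable[[Token], Token]]


def rewrite(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Apply rules in two phases so that they cannot interfere."""
    planned: Dict[int, Token] = {}
    # Phase 1: match every rule against the ORIGINAL tokens only.
    for matches, apply in rules:
        for tok in tokens:
            # The first matching rule wins; later rules skip this token.
            if tok.idx not in planned and matches(tok):
                planned[tok.idx] = apply(tok)
    # Phase 2: build the converted tree from the planned rewrites.
    return [planned.get(tok.idx, tok) for tok in tokens]


# Two rules that would interfere if applied one after the other:
# a token relabelled "dobj" by the first rule would then be
# relabelled again by the second.
RULES: List[Rule] = [
    (lambda t: t.deprel == "obj", lambda t: replace(t, deprel="dobj")),
    (lambda t: t.deprel == "dobj", lambda t: replace(t, deprel="iobj")),
]

# Toy sentence with made-up source-format labels.
SENTENCE = [
    Token(1, "She", "PRON", 2, "nsubj"),
    Token(2, "gave", "VERB", 0, "root"),
    Token(3, "him", "PRON", 2, "dobj"),
    Token(4, "books", "NOUN", 2, "obj"),
]

converted = rewrite(SENTENCE, RULES)
for tok in converted:
    print(tok.idx, tok.form, tok.deprel)
```

With sequential application, both "him" and "books" would end up labelled `iobj`; matching against the original tree keeps the two rewrites separate.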