Conference article

Comparing Two Methods for Adding Enhanced Dependencies to UD Treebanks

Gosse Bouma
Center for Language and Cognition, University of Groningen, The Netherlands / Center for Advanced Study, Norwegian Academy for Science and Letters, Norway

Published in: Proceedings of the 17th International Workshop on Treebanks and Linguistic Theories (TLT 2018), December 13–14, 2018, Oslo University, Norway

Linköping Electronic Conference Proceedings 155:4, pp. 17–30

Published: 2018-12-10

ISBN: 978-91-7685-137-1

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

When adding enhanced dependencies to an existing UD treebank, one can opt for heuristics that predict the enhanced dependencies on the basis of the UD annotation only. If the treebank is the result of conversion from an underlying treebank, an alternative is to produce the enhanced dependencies directly on the basis of this underlying annotation. Here we present a method for doing the latter for the Dutch UD treebanks. We compare our method with the UD-based approach of Schuster et al. (2018). While there are a number of systematic differences in the output of the two methods, these appear to result from insufficient detail in the annotation guidelines; it is not the case that one approach is superior to the other in principle.
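To make the notion of a UD-only heuristic concrete, the following is a minimal sketch of one such rule: propagating a shared subject to a coordinated predicate. It is an illustration only, not the method presented in the paper nor that of Schuster et al. (2018). The `Token` type, the `propagate_subjects` function, and the Dutch example sentence are assumptions introduced here.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One word of a basic UD tree (hypothetical minimal representation)."""
    id: int
    form: str
    head: int      # id of the governing token, 0 for the root
    deprel: str


# Subject relations considered by this sketch (an assumption, not a full list).
SUBJECT_RELS = {"nsubj", "nsubj:pass", "csubj"}


def propagate_subjects(tokens):
    """Return extra enhanced edges as (head_id, deprel, dependent_id).

    Heuristic: if a conjunct has no subject of its own, it inherits the
    subject(s) of the first conjunct, i.e. of its head in the basic tree.
    """
    # Group subject dependents by the id of their head.
    subjects = {}
    for tok in tokens:
        if tok.deprel in SUBJECT_RELS:
            subjects.setdefault(tok.head, []).append(tok)

    extra_edges = []
    for tok in tokens:
        # Only conjuncts that lack their own subject receive a propagated one.
        if tok.deprel == "conj" and tok.id not in subjects:
            for subj in subjects.get(tok.head, []):
                extra_edges.append((tok.id, subj.deprel, subj.id))
    return extra_edges


# "Jan zingt en danst" ("Jan sings and dances")
sentence = [
    Token(1, "Jan", 2, "nsubj"),
    Token(2, "zingt", 0, "root"),
    Token(3, "en", 4, "cc"),
    Token(4, "danst", 2, "conj"),
]

print(propagate_subjects(sentence))  # [(4, 'nsubj', 1)]
```

The sketch covers only subjects shared by coordinated predicates. Other enhancements named in the keywords (ellipsis, control) and coordinated dependents such as "Jan en Piet" are deliberately left out.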

Keywords

Universal Dependencies, Enhanced Universal Dependencies, Dutch, ellipsis, coordination, control

References

Bar-Haim, R., Dagan, I., Greental, I., Szpektor, I., and Friedman, M. (2007). Semantic inference at the lexical-syntactic level for textual entailment recognition. In Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 131–136. Association for Computational Linguistics.

Bouma, G. and van Noord, G. (2017). Increasing return on annotation investment: the automatic construction of a Universal Dependency treebank for Dutch. In Nivre, J. and de Marneffe, M.-C., editors, NoDaLiDa workshop on Universal Dependencies, Gothenburg.

Brants, S., Dipper, S., Hansen, S., Lezius, W., and Smith, G. (2002). The TIGER treebank. In Proceedings of the workshop on Treebanks and Linguistic Theories, volume 168.

Candito, M., Guillaume, B., Perrier, G., and Seddah, D. (2017). Enhanced UD dependencies with neutralized diathesis alternation. In Depling 2017: Fourth International Conference on Dependency Linguistics.

Futrell, R., Mahowald, K., and Gibson, E. (2015). Large-scale evidence of dependency length minimization in 37 languages. Proceedings of the National Academy of Sciences, 112(33):10336–10341.

Gotham, M. and Haug, D. (2018). Glue semantics for universal dependencies. In Proceedings of the 23rd International Lexical-Functional Grammar Conference, Vienna.

Hartmann, J., Konietzko, A., and Salzmann, M. (2016). On the limits of non-parallelism in ATB movement: Experimental evidence for strict syntactic identity. Quantitative Approaches to Grammar and Grammatical Change: Perspectives from Germanic, 290:51.

Lipenkova, J. and Soucek, M. (2014). Converting Russian dependency treebank to Stanford typed dependencies representation. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, pages 143–147.

Przepiórkowski, A. and Patejuk, A. (2018). From LFG to enhanced universal dependencies. In Proceedings of the 23rd International Lexical-Functional Grammar Conference, Vienna.

Pyysalo, S., Kanerva, J., Missilä, A., Laippala, V., and Ginter, F. (2015). Universal dependencies for Finnish. In Proceedings of the 20th Nordic Conference of Computational Linguistics (Nodalida 2015), pages 163–172.

Reddy, S., Täckström, O., Petrov, S., Steedman, M., and Lapata, M. (2017). Universal semantic parsing. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 89–101.

Schuster, S., Lamm, M., and Manning, C. D. (2017). Gapping constructions in universal dependencies v2. In Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017), pages 123–132.

Schuster, S. and Manning, C. D. (2016). Enhanced English universal dependencies: An improved representation for natural language understanding tasks. In Proceedings of LREC.

Schuster, S., Nivre, J., and Manning, C. D. (2018). Sentences with gapping: Parsing and reconstructing elided predicates. In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018).

Shwartz, V., Goldberg, Y., and Dagan, I. (2016). Improving hypernymy detection with an integrated path-based and distributional method. In Proceedings of the 54th Annual Meeting of the ACL 2016, pages 2389–2399, Berlin.

Skut, W., Brants, T., Krenn, B., and Uszkoreit, H. (1998). A linguistically interpreted corpus of German newspaper text. arXiv preprint cmp-lg/9807008.

van Noord, G., Bouma, G., van Eynde, F., de Kok, D., van der Linde, J., Schuurman, I., Sang, E. T. K., and Vandeghinste, V. (2013). Large scale syntactic annotation of written Dutch: Lassy. In Spyns, P. and Odijk, J., editors, Essential Speech and Language Technology for Dutch: the STEVIN Programme, pages 147–164. Springer.

Vulic, I. (2017). Cross-lingual syntactically informed distributed word representations. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 408–414.

Wang, Y. and Liu, H. (2017). The effects of genre on dependency distance and dependency direction. Language Sciences, 59:135–147.

Williams, E. (1978). Across-the-board rule application. Linguistic Inquiry, 9(1):31–43.

Zeman, D., Popel, M., Straka, M., Hajic, J., Nivre, J., Ginter, F., Luotolahti, J., Pyysalo, S., Petrov, S., Potthast, M., Tyers, F., Badmaeva, E., Gokirmak, M., Nedoluzhko, A., Cinkova, S., Hajic jr., J., Hlavacova, J., Kettnerová, V., Uresova, Z., Kanerva, J., Ojala, S., Missilä, A., Manning, C. D., Schuster, S., Reddy, S., Taji, D., Habash, N., Leung, H., de Marneffe, M.-C., Sanguinetti, M., Simi, M., Kanayama, H., de Paiva, V., Droganova, K., Martínez Alonso, H., Çöltekin, Ç., Sulubacak, U., Uszkoreit, H., Macketanz, V., Burchardt, A., Harris, K., Marheinecke, K., Rehm, G., Kayadelen, T., Attia, M., Elkahky, A., Yu, Z., Pitler, E., Lertpradit, S., Mandl, M., Kirchner, J., Alcalde, H. F., Strnadová, J., Banerjee, E., Manurung, R., Stella, A., Shimada, A., Kwak, S., Mendonca, G., Lando, T., Nitisaroj, R., and Li, J. (2017). CoNLL 2017 shared task: Multilingual parsing from raw text to universal dependencies. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 1–19, Vancouver, Canada. Association for Computational Linguistics.
