Article | Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland | Bootstrapping UD treebanks for Delexicalized Parsing Linköping University Electronic Press Conference Proceedings
Göm menyn

Title:
Bootstrapping UD treebanks for Delexicalized Parsing
Author:
Prasanth Kolachina: University of Gothenburg, Sweden Aarne Ranta: University of Gothenburg, Sweden
Download:
Full text (pdf)
Year:
2019
Conference:
Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland
Issue:
167
Article no.:
002
Pages:
15--24
No. of pages:
9
Publication type:
Abstract and Fulltext
Published:
2019-10-02
ISBN:
978-91-7929-995-8
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet


Export in BibTex, RIS or text

Standard approaches to treebanking traditionally employ a waterfall model (Sommerville, 2010), where annotation guidelines guide the annotation process and insights from the annotation process in turn lead to subsequent changes in the annotation guidelines. This process remains a very expensive step in creating linguistic resources for a target language, necessitates both linguistic expertise and manual effort to develop the annotations and is subject to inconsistencies in the annotation due to human errors. In this paper, we propose an alternative approach to treebanking---one that requires writing grammars. This approach is motivated specifically in the context of Universal Dependencies, an effort to develop uniform and cross-lingually consistent treebanks across multiple languages. We show here that a bootstrapping approach to treebanking via interlingual grammars is plausible and useful in a process where grammar engineering and treebanking are jointly pursued when creating resources for the target language. We demonstrate the usefulness of synthetic treebanks in the task of delexicalized parsing. Our experiments reveal that simple models for treebank generation are cheaper than human annotated treebanks, especially in the lower ends of the learning curves for delexicalized parsing, which is relevant in particular in the context of low-resource languages.

Keywords: Dependency parsing Universal Dependencies Abstract syntax trees

Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland

Author:
Prasanth Kolachina, Aarne Ranta
Title:
Bootstrapping UD treebanks for Delexicalized Parsing
References:
No references available

Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland

Author:
Prasanth Kolachina, Aarne Ranta
Title:
Bootstrapping UD treebanks for Delexicalized Parsing
Note: the following are taken directly from CrossRef
Citations:
No citations available at the moment


Responsible for this page: Peter Berkesand
Last updated: 2019-11-06