Conference article

The INESS Treebanking Infrastructure

Paul Meurer
Uni Computing, Bergen, Norway

Helge Dyvik
University of Bergen, Norway and Uni Computing, Bergen, Norway

Victoria Rosén
University of Bergen, Norway and Uni Computing, Bergen, Norway

Koenraad De Smedt
University of Bergen, Norway

Gunn Inger Lyse
University of Bergen, Norway

Gyri Smørdal Losnegaard
University of Bergen, Norway

Martha Thunes
University of Bergen, Norway

Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16

Linköping Electronic Conference Proceedings 85:43, p. 453-458

NEALT Proceedings Series 16:43, p. 453-458

Published: 2013-05-17

ISBN: 978-91-7519-589-6

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This paper briefly describes the current state of the evolving INESS infrastructure in Norway, which is developing treebanks as well as making treebanks more accessible to the R&D community. Recent work includes hosting more treebanks, including parallel treebanks, and increasing the number of parsed and disambiguated sentences in the Norwegian LFG treebank. Other recent improvements include the presentation of metadata and license handling for restricted treebanks. The infrastructure is fully operational and accessible, but will be further improved during the lifetime of the INESS project.

Keywords

Treebanks; research infrastructure; parsed corpora; metadata; IPR; INESS; METANORD; CLARIN; CLARINO

