The CLARINO Bergen Centre: Development and Deployment

Koenraad De Smedt
University of Bergen, Bergen, Norway

Gunn Inger Lyse
University of Bergen, Bergen, Norway

Rune Kyrkjebø
University of Bergen, Bergen, Norway

Hemed Al Ruwehy
University of Bergen, Bergen, Norway

Øyvind Liland Gjesdal
University of Bergen, Bergen, Norway

Victoria Rosén
University of Bergen, Bergen, Norway

Paul Meurer
Uni Research Computing, Bergen, Norway

In: Selected Papers from the CLARIN Annual Conference 2015, October 14–16, 2015, Wroclaw, Poland

Linköping Electronic Conference Proceedings 123:1, pp. 1–12

NEALT Proceedings Series 28:1, pp. 1–12

Published: 2016-04-11

ISBN: 978-91-7685-765-6

ISSN: 1650-3686 (print), 1650-3740 (online)


The CLARINO Bergen Centre (Norway) provides a language resource repository, corpus and treebank services, and metadata management services. We explain why the LINDAT repository software was chosen as a model and describe how that software was cloned and adapted for the CLARINO Bergen Repository. We also describe how the centre's other services addressing CLARIN goals have been integrated, focusing on the steps taken to adapt the INESS treebanking service to CLARIN standards.


