Lene Offersgaard
University of Copenhagen, Denmark
Bart Jongejan
University of Copenhagen, Denmark
Mitchell Seaton
University of Copenhagen, Denmark
Dorte Haltrup Hansen
University of Copenhagen, Denmark
Published in: Proceedings of the workshop on Nordic language research infrastructure at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 20
Linköping Electronic Conference Proceedings 89:3, p. 21-32
NEALT Proceedings Series 20:3, p. 21-32
Published: 2013-05-17
ISBN: 978-91-7519-585-8
ISSN: 1650-3686 (print), 1650-3740 (online)
The CLARIN-DK initiative (which started as the Danish preparatory project DK-CLARIN) is part of the Danish research infrastructure initiative DIGHUMLAB. This paper presents the aims, status, and current challenges of CLARIN-DK. CLARIN-DK focuses on written and spoken language resources, multimodal resources and tools, and involving users is a core issue. Users involved in the preparatory project gave input that led to the current user interface of the resource repository website, clarin.dk. Clarin.dk is now in transition from a repository to a research infrastructure where researchers and students can be supported in their research, education, and studies. Clarin.dk works with a Service-Oriented Architecture (SOA), uses eSciDoc and Fedora Commons, and is primarily based on open source solutions. A key issue in CLARIN-DK is the use of standards such as TEI P5, IMDI, OLAC, and CMDI for resource metadata. Optional metadata fields suggested by users have been included where they comply with the standards, allowing for the diversity needed when describing the research material. Current work includes normalising metadata naming in the search pages and making search more user-friendly by adding selectable pick-lists for query values. In addition, metadata quality is being consolidated by changing some metadata values to a more harmonised set of values. All deposited metadata are maintained. Clarin.dk will apply for assessment as a CLARIN ERIC B centre in 2013, reinforcing the sustainability and persistence of the infrastructure. As part of meeting the CLARIN centre requirements, clarin.dk has already joined the national identity federation WAYF, implemented SSL certificates, and offers harvesting of metadata via OAI-PMH.