Conference article

Variability of the Facet Values in the VLO – a Case for Metadata Curation

Margaret King
ACDH-OEAW, Vienna, Austria

Davor Ostojic
ACDH-OEAW, Vienna, Austria

Matej Ďurčo
ACDH-OEAW, Vienna, Austria

Go Sugimoto
ACDH-OEAW, Vienna, Austria

Published in: Selected Papers from the CLARIN Annual Conference 2015, October 14–16, 2015, Wroclaw, Poland

Linköping Electronic Conference Proceedings 123:3, p. 25-44

NEALT Proceedings Series 28:3, p. 25-44

Published: 2016-04-11

ISBN: 978-91-7685-765-6

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

In this paper we propose a strategy for metadata curation, especially with respect to the variability of the values encountered in the metadata records and hence in the facets of the main CLARIN metadata catalogue, the VLO. The approach concentrates on measures on the infrastructure side and on the interaction between human curators and automatic processes.

References

[Broeder et al.2010] D. Broeder, M. Kemps-Snijders, D. Van Uytvanck, M. Windhouwer, P. Withers, P. Wittenburg, and C. Zinn. 2010. A data category registry- and component-based metadata framework. In Proceedings of the Seventh Conference on International Language Resources and Evaluation [LREC2010]. Pp. 43-47.

[Broeder et al.2012] D. Broeder, M. Windhouwer, D. Van Uytvanck, T. Goosen, and T. Trippel. 2012. CMDI: a component metadata infrastructure. In Proceedings of the Eighth International Conference on Language Resources and Evaluation [LREC2012]. Pp. 1387-1390.

[Broeder et al.2014] D. Broeder, I. Schuurman, and M. Windhouwer. 2014. Experiences with the ISOcat Data Category Registry. In Proceedings of the Ninth International Conference on Language Resources and Evaluation [LREC 2014]. Pp. 4565-4568.

[Calarco et al.2014] P. Calarco, L. Conrad, R. Kessler, and M. Vandenburg. 2014. Metadata Challenges in Library Discovery Systems. In Proceedings of the Charleston Library Conference. Purdue University e-Pubs. Pp. 533-540.

[Durco and Moerth2014] M. Durco and K. Mörth. 2014. Towards a DH Knowledge Hub - Step 1: Vocabularies. Presented at CLARIN 2014 Conference [CAC2014].

[Durco and Windhouwer2014] M. Durco and M. Windhouwer. 2014. From CLARIN Component Metadata to Linked Open Data. In Proceedings of the Third Workshop on Linked Data in Linguistics [LDL 2014]. Pp. 13-17.

[Europeana2009] Europeana. 2009. Metadata Mapping & Normalisation Guidelines for the Europeana Prototype: Europeana Version 1.2. Europeana: Think Culture, Den Haag, Netherlands.

[Europeana2014] Europeana. 2014. EDM Mapping Guidelines: Europeana Version 2.2. Europeana: Think Culture, Den Haag, Netherlands.

[Goosen et al.2014] T. Goosen, M. Windhouwer, O. Ohren, A. Herold, T. Eckart, M. Durco, and O. Schonefeld. 2014. CMDI 1.2: Improvements in the CLARIN Component Metadata Infrastructure. In Selected Papers from the CLARIN 2014 Conference [CAC2014]. Pp. 36-53.

[Haaf et al.2014] S. Haaf, P. Fankhauser, T. Trippel, K. Eckart, T. Eckart, H. Hedeland, and D. Van Uytvanck. 2014. CLARIN’s Virtual Language Observatory (VLO) under scrutiny - The VLO taskforce of the CLARIN-D centres. Presented at CLARIN 2014 Conference [CAC2014].

[Huffman2015] N. Huffman. 2015. Adventures in metadata hygiene: using Open Refine, XSLT, and Excel to dedup and reconcile name and subject headings in EAD. In Bitstreams: Notes from the digital projects team. Duke University Libraries, N.C.

[Kemps-Snijders2014] M. Kemps-Snijders. 2014. Metadata quality assurance for CLARIN. Technical report.

[Lyse et al.2014] G. Lyse, P. Meurer, and K. De Smedt. 2014. COMEDI: A New Component Metadata Editor. In Papers from the CLARIN 2014 Conference [CAC2014]. Pp. 82-88.

[Odijk2014] J. Odijk. 2014. Discovering Resources in CLARIN: Problems and Suggestions for Solutions. Utrecht University Repository, Netherlands.

[Odijk2015] J. Odijk. 2015. Metadata curation strategy. Internal document, unpublished.

[Palmer2014] W. Palmer. 2014. Fits metadata normalisation API? Github Repository.

[Sofou and Tzouvaras2015] N. Sofou and V. Tzouvaras. 2015. MS28: Sounds thesaurus and metadata cleaning and normalization module complete. Europeana Sounds 620591, Den Haag, Netherlands.

[Trippel et al.2014] T. Trippel, D. Broeder, M. Durco, and O. Ohren. 2014. Towards automatic quality assessment of component metadata. In Proceedings of the Ninth International Conference on Language Resources and Evaluation [LREC 2014]. Pp. 3851-3856.

[Van Uytvanck2010] D. Van Uytvanck, C. Zinn, D. Broeder, P. Wittenburg, and M. Gardellini. 2010. Virtual Language Observatory: The portal to the language resources and technology universe. In Proceedings of the Seventh Conference on International Language Resources and Evaluation [LREC 2010]. Pp. 900-903.

[Van Uytvanck2012] D. Van Uytvanck, H. Stehouwer, and L. Lampen. 2012. Semantic metadata mapping in practice: The Virtual Language Observatory. In Proceedings of the Eighth International Conference on Language Resources and Evaluation [LREC2012]. Pp. 1029-1034.

[Windhouwer2012] M. Windhouwer. 2012. RELcat: a Relation Registry for ISOcat data categories. In Proceedings of the Eighth International Conference on Language Resources and Evaluation [LREC2012]. Pp. 3661-3664.

[Withers2012] P. Withers. 2012. Metadata management with Arbil. In Proceedings of the Eighth International Conference on Language Resources and Evaluation [LREC2012]. Pp. 72-75.
