Discovering software resources in CLARIN

Jan Odijk
UiL-OTS, Utrecht University, The Netherlands


Published in: Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018

Linköping Electronic Conference Proceedings 159:13, pp. 121-132


Published: 2019-05-28

ISBN: 978-91-7685-034-3

ISSN: 1650-3686 (print), 1650-3740 (online)


I present a CMDI profile for describing software. The profile supports discovery of the software and formal documentation of its properties. I also propose a faceted search over software metadata. The profile has been tested by creating metadata for over 80 pieces of software. It forms an excellent basis for formally describing properties of software and for a faceted search dedicated to software, which improves the discoverability of software in the CLARIN infrastructure. A faceted search application for this purpose has been implemented. I propose a curation procedure to ensure that software descriptions based on other profiles contain the relevant information in the right form and use the right vocabularies, and I have created an experimental faceted search that includes software descriptions based on the WebLichtWebService profile.
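The idea of faceted search over software descriptions can be illustrated with a minimal sketch. It is not the application described in the paper: the records, the facet names (`toolTask`, `language`), and the helper functions are all hypothetical and chosen only for illustration. The sketch assumes that each description exposes multi-valued facets, that selecting values narrows the result set, and that the remaining values are counted to guide further refinement.

```python
from collections import Counter

# Hypothetical software descriptions. The field names and values are
# illustrative assumptions, not taken from the CMDI profile itself.
RECORDS = [
    {"name": "ToolA", "toolTask": ["tagging", "parsing"], "language": ["nld"]},
    {"name": "ToolB", "toolTask": ["search"], "language": ["nld", "eng"]},
    {"name": "ToolC", "toolTask": ["tagging"], "language": ["deu"]},
]


def filter_records(records, selection):
    """Keep records that, for every selected facet, carry at least one selected value."""
    return [
        record
        for record in records
        if all(
            set(record.get(facet, [])) & set(values)
            for facet, values in selection.items()
        )
    ]


def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    return Counter(
        value for record in records for value in set(record.get(facet, []))
    )


# Select one facet value, then count the values of another facet among the hits.
hits = filter_records(RECORDS, {"toolTask": ["tagging"]})
print([record["name"] for record in hits])
print(facet_counts(hits, "language"))
```

Values within one facet are combined with OR and different facets with AND, a common convention for faceted interfaces; whether the implemented application follows it is not stated here.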


Virtual Language Observatory, CMDI Metadata, Software, Faceted search


