Conference article

Workflow Management in CLARIN-DK

Bart Jongejan
Copenhagen University, Denmark


Published in: Proceedings of the workshop on Nordic language research infrastructure at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 20

Linköping Electronic Conference Proceedings 89:2, p. 11-20

NEALT Proceedings Series 20:2, p. 11-20


Published: 2013-05-17

ISBN: 978-91-7519-585-8

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

Clarin.dk, the infrastructure maintained by the CLARIN-DK project, is not only a repository of resources, but also a place where users can analyse, annotate, reformat and potentially even translate resources, using tools that are integrated in the infrastructure as web services. In many cases a single tool does not produce the desired output, given the input resource at hand. Still, in such cases it may be possible to reach the set goal by chaining a number of tools. The approach presented here frees the user from having to deal with individual tools and the construction of workflows. Instead, the user only needs to supply the workflow manager with the features that describe her goal, because the workflow manager not only executes chains of tools in a workflow, but also autonomously devises workflows that serve the user’s intention, given the tools that are currently integrated in the infrastructure as web services. To do this, the workflow manager needs stringent and complete information about each integrated tool. We discuss how such information is structured in clarin.dk. Provided that many tools are made available to and through the clarin.dk infrastructure, the automatically created workflows, although simple linear programs without branching or looping constructs, can cover a large swath of users’ needs. It is rewarding for both users and tool developers that the infrastructure takes advantage of new tools from the moment they are registered, because there is no need to wait for human expert users to construct workflows that incorporate the new tools and save them for later use.
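
The abstract does not spell out how workflows are devised, so the following is only a minimal sketch of the general idea, assuming that each tool declares the features it requires of its input and the features it sets on its output, and that a linear chain is found by breadth-first search over feature states. The tool names, the feature names (`format`, `annotation`) and the function `devise_workflow` are hypothetical and are not taken from clarin.dk.

```python
from collections import deque

# Hypothetical tool registry (not the clarin.dk metadata format): each tool
# lists the features its input must have and the features its output gets.
TOOLS = [
    {"name": "tokenizer",
     "input": {"format": "text", "annotation": "none"},
     "output": {"annotation": "tokens"}},
    {"name": "pos-tagger",
     "input": {"format": "text", "annotation": "tokens"},
     "output": {"annotation": "pos"}},
    {"name": "tei-exporter",
     "input": {"format": "text", "annotation": "pos"},
     "output": {"format": "tei"}},
]


def matches(state, required):
    """True if every required feature has the required value in `state`."""
    return all(state.get(key) == value for key, value in required.items())


def devise_workflow(start, goal, tools=TOOLS):
    """Return a shortest linear chain of tool names that turns a resource
    with features `start` into one satisfying `goal`, or None if no chain
    exists with the registered tools."""
    queue = deque([(start, [])])
    seen = {frozenset(start.items())}
    while queue:
        state, chain = queue.popleft()
        if matches(state, goal):
            return chain
        for tool in tools:
            if matches(state, tool["input"]):
                # Applying a tool overrides the features it declares as output.
                new_state = {**state, **tool["output"]}
                key = frozenset(new_state.items())
                if key not in seen:
                    seen.add(key)
                    queue.append((new_state, chain + [tool["name"]]))
    return None


if __name__ == "__main__":
    resource = {"format": "text", "annotation": "none"}
    wanted = {"format": "tei", "annotation": "pos"}
    print(devise_workflow(resource, wanted))
```

In this sketch, registering a new tool only means appending one more entry to the registry; the search then considers it in the next request without anyone having to construct a workflow by hand, which is the property the abstract emphasises.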

Keywords

NoDaLiDa 2013; workflow; tools; automation

