Conference article

HPC-ready Language Analysis for Human Beings

Emanuele Lapponi
Language Technology Group, Department of Informatics, University of Oslo, Norway

Erik Velldal
Language Technology Group, Department of Informatics, University of Oslo, Norway

Nikolay A. Vazov
Research Support Services Group, University Center for Information Technology, University of Oslo, Norway

Stephan Oepen
Language Technology Group, Department of Informatics, University of Oslo, Norway


Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16

Linköping Electronic Conference Proceedings 85:42, p. 447-452

NEALT Proceedings Series 16:42, p. 447-452


Published: 2013-05-17

ISBN: 978-91-7519-589-6

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This demonstration presents a first operable pilot of the Language Analysis Portal (LAP), an ongoing project within the Norwegian CLARINO initiative that aims to provide easy access to Language Technology (LT) tools running on a powerful High-Performance Computing (HPC) cluster. The system is built on top of the Galaxy framework, giving users an online platform where they can design experiments using an array of processors. These processors can be combined into complex workflows using a visual editor. The current implementation functions as a testbed for further development, hosting a limited collection of tools that address common use cases in the LT realm. The long-term goal for LAP is to reach beyond the field and serve as an enabling platform for LT-powered research in the humanities and social sciences.

Keywords

Research infrastructure; High-Performance Computing; web portal; CLARINO

