Conference article

Korp and Karp - a bestiary of language resources: the research infrastructure of Språkbanken

Malin Ahlberg
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Lars Borin
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Markus Forsberg
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Martin Hammarstedt
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Leif-Jöran Olsson
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Olof Olsson
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Johan Roxendal
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden

Jonatan Uppström
Språkbanken, Dept. of Swedish, University of Gothenburg, Sweden


Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16

Linköping Electronic Conference Proceedings 85:39, p. 429-433

NEALT Proceedings Series 16:39, p. 429-433


Published: 2013-05-17

ISBN: 978-91-7519-589-6

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

A central activity in Språkbanken, an R&D unit at the University of Gothenburg, is the systematic construction of a research infrastructure based on interoperability and widely accepted standards for metadata and data. The two main components of this infrastructure deal with text corpora and with lexical resources. For modularity and flexibility, both components have a backend, or server-side part, accessed through an API made up of a set of well-defined web services. This means that there can be any number of different user interfaces to these components, corresponding, e.g., to different research needs. Here, we will demonstrate the standard corpus and lexicon search interfaces, designed primarily for linguistic searches: Korp and Karp.

Keywords

Swedish; corpora; lexical resources; research infrastructure

