Anders Nøklestad
The Text Laboratory, ILN, University of Oslo, Oslo, Norway
Kristin Hagen
The Text Laboratory, ILN, University of Oslo, Oslo, Norway
Janne Bondi Johannessen
The Text Laboratory, ILN, University of Oslo, Oslo, Norway / MultiLing, University of Oslo, Norway
Michal Kosek
The Text Laboratory, ILN, University of Oslo, Oslo, Norway
Joel Priestley
The Text Laboratory, ILN, University of Oslo, Oslo, Norway
In: Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden
Linköping Electronic Conference Proceedings 131:32, p. 251-254
NEALT Proceedings Series 29:32, p. 251-254
Published: 2017-05-08
ISBN: 978-91-7685-601-7
ISSN: 1650-3686 (print), 1650-3740 (online)
This paper presents a modernised version of Glossa, a corpus search and results visualisation system with a user-friendly interface. The system is open source and can easily be installed on servers or even laptops for use with suitably prepared corpora. It handles parallel corpora as well as monolingual written and spoken corpora. For spoken corpora, search results can be linked to audio/video, and spectrographic analysis and visualised geographical distributions can be provided. We will demonstrate the range of search options and result visualisations that Glossa provides.
Eckhard Bick. 2004. Corpuseye: Et Brugervenligt Webinterface for Grammatisk Opmærkede Korpora. Peter Widell and Mette Kunøe (eds). Møde om Udforskningen af Dansk Sprog, Proceedings. Denmark: Århus University. 46-57.
Lars Borin, Markus Forsberg and Johan Roxendal. 2012. Korp – the corpus infrastructure of Språkbanken. Proceedings of LREC 2012. Istanbul: ELRA, pages 474–478.
Sebastian Hoffmann and Stefan Evert. 2006. BNCweb (CQP-edition): The Marriage of two Corpus Tools. S. Braun, K. Kohn, and J. Mukherjee (eds). Corpus Technology and Language Pedagogy: New Resources, New Tools, New Methods, volume 3 of English Corpus Linguistics. Frankfurt am Main: Peter Lang. 177-195.
Janne Bondi Johannessen, Lars Nygaard, Joel Priestley and Anders Nøklestad. 2008. Glossa: a Multilingual, Multimodal, Configurable User Interface. Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08). Paris: European Language Resources Association (ELRA).
Paul Meurer. 2012. Corpuscle – a new corpus management platform for annotated corpora. In: Gisle Andersen (ed.). Exploring Newspaper Language: Using the Web to Create and Investigate a large corpus of modern Norwegian, Studies in Corpus Linguistics 49, John Benjamins, 2012.
Web sites
CANS (Corpus of Norwegian-American Speech): http://tekstlab.uio.no/norskiamerika/english/index.html
CLARIN federated content search: https://www.clarin.eu/content/federated-content-search-clarin-fcs
CLARINO: http://clarin.b.uib.no/
Clojure: https://clojure.org/
ELENOR: http://www.hf.uio.no/ilos/studier/ressurser/elenor/index.html
Glossa on GitHub: https://github.com/textlab/cglossa
IMS Open Corpus Workbench: http://cwb.sourceforge.net/
Leksikografisk bokmålskorpus: https://tekstlab.uio.no/glossa2/?corpus=bokmal
MySQL: https://www.mysql.com/
Nordic Dialect Corpus: http://www.tekstlab.uio.no/nota/scandiasyn/index.html
NORINT: http://www.hf.uio.no/iln/english/about/organization/text-laboratory/projects/norint/index.html
NoWaC (Norwegian Web as Corpus): http://www.hf.uio.no/iln/om/organisasjon/tekstlab/prosjekter/nowac/index.html