Conference article

WebAnno-MM: EXMARaLDA meets WebAnno

Steffen Remus
Language Technology Group, Department of Informatics, Universität Hamburg, Germany

Hanna Hedeland
Hamburg Centre for Language Corpora (HZSK), Universität Hamburg, Germany

Anne Ferger
Hamburg Centre for Language Corpora (HZSK), Universität Hamburg, Germany

Kristin Bührig
Hamburg Centre for Language Corpora (HZSK), Universität Hamburg, Germany

Chris Biemann
Language Technology Group, Department of Informatics, Universität Hamburg, Germany

Part of: Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018

Linköping Electronic Conference Proceedings 159:17, pp. 166-172

Published: 2019-05-28

ISBN: 978-91-7685-034-3

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

In this paper, we present WebAnno-MM, an extension of the popular web-based annotation tool WebAnno, designed for the linguistic annotation of transcribed spoken data with time-aligned media files. Several new features have been implemented for our current use case: a novel teaching method based on pair-wise manual annotation of transcribed video data and systematic comparison of agreement between students. To enable the annotation of transcribed spoken language data, WebAnno-MM addresses technical and data-model challenges and offers an additional view of the data: a (musical) score view for the inspection of parallel utterances, which is relevant for various methodological research questions regarding the analysis of interactions of spoken content.

Keywords

Multimodal, Transcription, Annotation
