Conference article

Human-human, human-machine communication: on the HuComTech multimodal corpus

Laszlo Hunyadi
Department of General and Applied Linguistics, University of Debrecen, Debrecen, Hungary

Tamás Váradi
MTA Institute of Linguistics, Research Group on Language Technology, Budapest, Hungary

György Kovács
MTA SzTE Research Group on Artificial Intelligence, Szeged, Hungary / Embedded Internet Systems Lab, Luleå University of Technology, Luleå, Sweden

István Szekrényes
Institute of Philosophy, University of Debrecen, Hungary

Hermina Kiss
Department of General and Applied Linguistics, University of Debrecen, Debrecen, Hungary

Karolina Takács
Department of Phonetics, Eötvös Loránd University, Budapest, Hungary

Published in: Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018

Linköping Electronic Conference Proceedings 159:6, p. 56-65

Published: 2019-05-28

ISBN: 978-91-7685-034-3

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

The present paper describes HuComTech, a multimodal corpus featuring over 50 hours of videotaped interviews with 112 informants. The interviews were carried out in a lab equipped with multiple cameras and microphones able to record posture, hand gestures, facial expressions, gaze, etc., as well as the acoustic and linguistic features of what was said. As a result of large-scale manual and semi-automatic annotation, the HuComTech corpus offers a rich dataset on 47 annotation levels. The paper presents the objectives, the workflow, and the annotation work, focusing on two aspects in particular: time alignment made with the Leipzig tool WEBMaus, and the automatic detection of intonation contours, developed by the HuComTech team. Early exploitation of the corpus included the analysis of hidden patterns using sophisticated multivariate analysis of temporal relations within the data points. The HuComTech corpus is one of the flagship language resources available through the HunCLARIN repository.

Keywords

Multimodality, Multimodal corpus, Hidden patterns of communication, Human-machine communication
