Laszlo Hunyadi
Department of General and Applied Linguistics, University of Debrecen, Debrecen, Hungary
Tamás Váradi
MTA Institute of Linguistics, Research Group on Language Technology, Budapest, Hungary
György Kovács
MTA SzTE Research Group on Artificial Intelligence, Szeged, Hungary / Embedded Internet Systems Lab, Luleå University of Technology, Luleå, Sweden
István Szekrényes
Institute of Philosophy, University of Debrecen, Hungary
Hermina Kiss
Department of General and Applied Linguistics, University of Debrecen, Debrecen, Hungary
Karolina Takács
Department of Phonetics, Eötvös Loránd University, Budapest, Hungary
Published in: Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018
Linköping Electronic Conference Proceedings 159:6, p. 56-65
Published: 2019-05-28
ISBN: 978-91-7685-034-3
ISSN: 1650-3686 (print), 1650-3740 (online)
The present paper describes HuComTech, a multimodal corpus featuring over 50 hours of videotaped interviews with 112 informants. The interviews were carried out in a lab equipped with multiple cameras and microphones able to record posture, hand gestures, facial expressions, gaze, etc., as well as the acoustic and linguistic features of what was said. As a result of large-scale manual and semi-automatic annotation, the HuComTech corpus offers a rich dataset on 47 annotation levels. The paper presents the objectives, the workflow, and the annotation work, focusing on two aspects in particular: time alignment made with the Leipzig tool WEBMaus and the automatic detection of intonation contours developed by the HuComTech team. Early exploitation of the corpus included the analysis of hidden patterns using sophisticated multivariate analysis of temporal relations within the data points. The HuComTech corpus is one of the flagship language resources available through the HunCLARIN repository.
Multimodality,
Multimodal corpus,
Hidden patterns of communication,
Human-machine communication