Recognition of Human Body Movements for Studying Engagement in Conversational Video Files

Martin Vels
Institute of Computer Science, University of Tartu, Estonia

Kristiina Jokinen
Institute of Computer Science, University of Tartu, Estonia


Published in: Proceedings of the 2nd European and the 5th Nordic Symposium on Multimodal Communication, August 6-8, 2014, Tartu, Estonia

Linköping Electronic Conference Proceedings 110:13, pp. 97-105


Published: 2015-05-26

ISBN: 978-91-7519-074-7

ISSN: 1650-3686 (print), 1650-3740 (online)


This paper investigates object recognition techniques for automatically detecting human behavior in video conversations. The ViBe background subtraction algorithm, together with standard image processing techniques, is applied to conversational videos in which two people meet for the first time, and the results show the usefulness of the technique in human communication analysis. By detecting the conversational participants and analyzing their conversational styles through the detected body movements, we can visualize the participants’ engagement in the communicative activity and draw conclusions about it. The paper discusses these novel observations, which show the synchrony and engagement in the participants’ behavior.
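
To make the pipeline named in the abstract more concrete, the following is a minimal sketch of a ViBe-style, sample-based background model, followed by a bounding box around the detected foreground. It is not the authors' implementation: it works on tiny grayscale frames stored as nested lists, the parameter values (20 samples, radius 20, 2 required matches, subsampling factor 16) are assumed here as the settings commonly quoted for ViBe, and the names `ViBeSketch`, `bounding_box` and `make_frame` are hypothetical helpers introduced for illustration. The bounding box is only a stand-in for the "standard image processing techniques" the abstract mentions.

```python
import random

# Assumed parameter values (commonly quoted for ViBe); not taken from this paper.
N_SAMPLES = 20     # background samples stored per pixel
RADIUS = 20        # max intensity distance for a sample to count as a match
MIN_MATCHES = 2    # matches needed to classify a pixel as background
SUBSAMPLING = 16   # a background pixel updates the model with probability 1/16


class ViBeSketch:
    """Simplified grayscale ViBe-style background model (illustrative only)."""

    def __init__(self, first_frame, seed=0):
        self.rng = random.Random(seed)
        self.h = len(first_frame)
        self.w = len(first_frame[0])
        # Fill each pixel's sample set from random 3x3 neighbours in the first frame.
        self.samples = [
            [[self._neighbour_value(first_frame, y, x) for _ in range(N_SAMPLES)]
             for x in range(self.w)]
            for y in range(self.h)
        ]

    def _neighbour(self, y, x):
        # Random position in the 3x3 neighbourhood, clamped to the image borders.
        ny = min(max(y + self.rng.randint(-1, 1), 0), self.h - 1)
        nx = min(max(x + self.rng.randint(-1, 1), 0), self.w - 1)
        return ny, nx

    def _neighbour_value(self, frame, y, x):
        ny, nx = self._neighbour(y, x)
        return frame[ny][nx]

    def apply(self, frame):
        """Return a binary mask (1 = foreground) and update the model in place."""
        mask = [[0] * self.w for _ in range(self.h)]
        for y in range(self.h):
            for x in range(self.w):
                value = frame[y][x]
                matches = sum(1 for s in self.samples[y][x] if abs(s - value) < RADIUS)
                if matches < MIN_MATCHES:
                    mask[y][x] = 1  # too few close samples: foreground
                    continue
                # Conservative update: only background pixels enter the model.
                if self.rng.randrange(SUBSAMPLING) == 0:
                    self.samples[y][x][self.rng.randrange(N_SAMPLES)] = value
                # Spatial propagation: occasionally refresh a random neighbour's model.
                if self.rng.randrange(SUBSAMPLING) == 0:
                    ny, nx = self._neighbour(y, x)
                    self.samples[ny][nx][self.rng.randrange(N_SAMPLES)] = value
        return mask


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around foreground pixels, or None."""
    points = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    return min(ys), min(xs), max(ys), max(xs)


def make_frame(height, width, top, left, size=3, background=30, foreground=200):
    """Synthetic frame: flat background with one bright square as the 'participant'."""
    frame = [[background] * width for _ in range(height)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = foreground
    return frame


# Usage: learn the empty scene, then track a square that moves one column per frame.
model = ViBeSketch(make_frame(8, 10, 0, 0, size=0))
for left in (3, 4, 5):
    print(bounding_box(model.apply(make_frame(8, 10, 2, left))))
```

In this toy setting the printed box shifts one column to the right on each frame, which is the kind of per-frame position signal from which movement and synchrony between two participants could be derived. A real video pipeline would add noise handling and region extraction on top of the mask.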




Allwood, Jens, Cerrato, Loredana, Jokinen, Kristiina, Navarretta, Costanza and Paggio, Patrizia. 2007. The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena. In Martin, J.C. et al. (eds.) Multimodal Corpora for Modelling Human Multimodal Behaviour. Special issue of the International Journal of Language Resources and Evaluation, 41(3–4), 273–287, Springer.

Argyle, Michael. 1975. Bodily Communication. London: Routledge.

Barnich, Olivier and Van Droogenbroeck, Marc. 2011. ViBe: A Universal Background Subtraction Algorithm for Video Sequences. IEEE Transactions on Image Processing, Vol. 20, pp. 1709-1724.

Baur, Tobias, Damian, Ionut, Lingenfelser, Florian, Wagner, Johannes and André, Elisabeth. 2013. NovA: Automated Analysis of Nonverbal Signals in Social Interactions. HBU’13 Proceedings of the Third International Conference on Human Behavior Understanding.

Caridakis, G., Raouzaiou, A., Karpouzis, K., Kollias, S. 2006. Synthesizing Gesture Expressivity Based on Real Sequences, Proceedings of the LREC 2006 Conference.

Chellappa, Rama, Chen, Tsuhan and Katsaggelos, Aggelos. 1997. Audio-visual interaction in multimodal communication. IEEE Signal Processing Magazine, pp. 37-38.

Gonzalez, Rafael C. and Woods, Richard E. 2010. Digital Image Processing (3rd edition). Pearson Education, Inc.

Goodwin, Charles. 1981. Conversational Organization: Interaction between Speakers and Hearers. Academic Press, New York.

Jokinen, Kristiina and Tenjes, Silvi. 2012. Investigating Engagement - intercultural and technological aspects of the collection, analysis, and use of the Estonian Multiparty Conversational video data. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC). Istanbul, Turkey.

Jokinen, Kristiina and Wilcock, Graham. 2012. Multimodal Signals and Holistic Interaction Structuring. Proceedings of the 24th International Conference on Computational Linguistics (COLING). Mumbai, India.

Kendon, Adam. 2004. Gesture: Visual Action as Utterance. Cambridge University Press.

Kipp, Michael. 2001. Anvil - A Generic Annotation Tool for Multimodal Dialogue. Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech), pp. 1367-1370.

Michelet, Stephane, Karp, Koby, Delaherche, Emilie, Achard, Catherine and Chetouani, Mohamed. 2012. Automatic Imitation Assessment in Interaction. HBU’12 Proceedings of the Third International Conference on Human Behavior Understanding.

Mitra, Sushmita and Acharya, Tinku. 2007. Gesture Recognition: A Survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 37(3), pp. 311-324.

Oikonomopoulos, Antonios, Patras, Ioannis and Pantic, Maja. 2006. Spatiotemporal Salient Points for Visual Recognition of Human Actions. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 36, No. 3, June 2006.

Santhanam, T., Sumathi, C. P. and Gomathi, S. 2012. A survey of techniques for human detection in static images. In Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology (CCSEIT ’12). ACM, New York, NY, USA, 328-336.

Suzuki, Satoshi and Abe, Keiichi. 1985. Topological Structural Analysis of Digitized Binary Images by Border Following. CVGIP 30(1), pp. 32-46.
