Experiments With Hand-tracking Algorithm in Video Conversations

Pihel Saatmann
Institute of Computer Science, University of Tartu, Estonia

Kristiina Jokinen
Institute of Computer Science, University of Tartu, Estonia


In: Proceedings of the 2nd European and the 5th Nordic Symposium on Multimodal Communication, August 6-8, 2014, Tartu, Estonia

Linköping Electronic Conference Proceedings 110:11, pp. 81-86


Published: 2015-05-26

ISBN: 978-91-7519-074-7

ISSN: 1650-3686 (print), 1650-3740 (online)


This paper describes a simple colour-based object tracking plugin for the video annotation tool ANVIL. The tracker can be used to automatically annotate hand gestures or the movements of any object that is distinguishable from its background. It records the velocity, duration and total travel distance of hand gestures and can be configured to display gesture direction. The tracker's output is compared with manually created hand-gesture annotations. The recorded data is not accurate enough to replace manual annotation, but it can serve as a basis for determining where hand gestures occur. Using the tracker in combination with a human annotator could therefore significantly speed up the annotation process.
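To make the recorded quantities concrete, the following is a minimal Python sketch of how duration, total travel distance, mean velocity and overall direction could be derived from a sequence of tracked hand positions. It is an illustration only, not the plugin's implementation: the function name `gesture_metrics`, the pixel-based units, the default frame rate of 25 fps and the definition of direction as the angle from the first to the last position are all assumptions.

```python
import math


def gesture_metrics(centroids, fps=25.0):
    """Summarise one gesture from per-frame (x, y) positions in pixels.

    Hypothetical helper: names, units and the default frame rate are
    assumptions made for illustration, not taken from the paper.
    """
    if len(centroids) < 2:
        raise ValueError("need at least two tracked positions")

    # Total travel distance: sum of frame-to-frame Euclidean displacements.
    distance = sum(math.dist(a, b) for a, b in zip(centroids, centroids[1:]))

    # Duration: number of frame intervals divided by the frame rate.
    duration = (len(centroids) - 1) / fps

    # Overall direction from first to last position. Image y grows
    # downwards, so it is flipped to make 90 degrees mean "up".
    (x0, y0), (x1, y1) = centroids[0], centroids[-1]
    direction = math.degrees(math.atan2(y0 - y1, x1 - x0)) % 360.0

    return {
        "duration_s": duration,
        "distance_px": distance,
        "mean_velocity_px_s": distance / duration,
        "direction_deg": direction,
    }


if __name__ == "__main__":
    # Three positions sampled at 2 frames per second.
    print(gesture_metrics([(0, 0), (3, 4), (6, 8)], fps=2.0))
```

Measuring distance along the whole path rather than from start to end means that back-and-forth movements still count towards the total, which matches the notion of "total travel distance" more closely than net displacement would.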




