Eye-trackers and Multimodal Communication Studies

Kristiina Jokinen
Institute of Behavioural Sciences, University of Helsinki, Finland


Published in: Proceedings from the 1st European Symposium on Multimodal Communication, University of Malta, Valletta, October 17-18, 2013

Linköping Electronic Conference Proceedings 101:4, pp. 29-39

NEALT Proceedings Series 21:4, pp. 29-39


Published: 2014-06-24

ISBN: 978-91-7519-266-6

ISSN: 1650-3686 (print), 1650-3740 (online)


This article provides an overview of eye-tracking technology in multimodal communication studies. It briefly reviews the human visual perception system and eye-tracking technology, and discusses two types of eye-gaze studies as examples of how eye-trackers can be used in interaction management: in the analysis of turn-taking and of involvement in conversation.




