Contributions of different modalities to the attribution of affective-epistemic states

Jens Allwood
SCCIIL Center, University of Gothenburg

Stefano Lanzini
SCCIIL Center, University of Gothenburg

Elisabeth Ahlsén
SCCIIL Center, University of Gothenburg


Published in: Proceedings from the 1st European Symposium on Multimodal Communication, University of Malta, Valletta, October 17-18, 2013

Linköping Electronic Conference Proceedings 101:1, pp. 1-6

NEALT Proceedings Series 21:1, pp. 1-6


Published: 2014-06-23

ISBN: 978-91-7519-266-6

ISSN: 1650-3686 (print), 1650-3740 (online)


The focus of this study is the relation between multimodal and unimodal perception of emotions and attitudes. A point of departure for the study is the claim that multimodal presentation increases redundancy and often thereby also the correctness of interpretation. A study was carried out in order to investigate this claim by examining the relative role of unimodal versus multimodal visual and auditory perception for interpreting affective-epistemic states (AES). The abbreviation AES will be used both for the singular form “affective-epistemic state” and the plural form “affective-epistemic states”. Clips from video-recorded dyadic interactions were presented to 12 subjects using three types of presentation: Audio only, Video only and Audio+Video. The task was to interpret the affective-epistemic states of one of the two persons in the clip. The results indicated differences concerning the role of different sensory modalities for different affective-epistemic states. In some cases there was a “filtering” effect, rendering fewer interpretations in a multimodal presentation than in a unimodal one for a specific AES. This occurred for happiness, disinterest and understanding, whereas “mutual reinforcement”, rendering more interpretations for multimodal presentation than for unimodal video or audio presentation, occurred for nervousness, interest and thoughtfulness. Finally, for one AES, confidence, audio and video seem to have mutually restrictive roles.




