Starting a Conversation with Strangers in Virtual Reykjavik: Explicit Announcement of Presence

Stefán Ólafsson
CADIA, Reykjavik University, Iceland

Branislav Bédi
University of Iceland, Iceland

Hafdís Erla Helgadóttir
CADIA, Reykjavik University, Iceland

Birna Arnbjörnsdóttir
University of Iceland, Iceland

Hannes Högni Vilhjálmsson
CADIA, Reykjavik University, Iceland


Published in: Proceedings from the 3rd European Symposium on Multimodal Communication, Dublin, September 17-18, 2015

Linköping Electronic Conference Proceedings 105:11, pp. 62-68


Published: 2016-09-16

ISBN: 978-91-7685-679-6

ISSN: 1650-3686 (print), 1650-3740 (online)


Virtual Reykjavik is an Icelandic language and culture training application for foreigners learning Icelandic. In this video game-like environment, the user is given tasks to solve, and completing them requires interacting with the characters, e.g. by conversing with them on context-specific topics. To make this possible, a model of how natural conversations start in a specific situation has been developed, based on data from that same situation in real life: a stranger asking another stranger for directions to a particular place in downtown Reykjavik. This involved defining a multimodal annotation scheme that outlines the communicative functions and the behaviors associated with them. However, existing annotation schemes lacked an appropriate function for this specific case, which led us to propose a new communicative function: the Explicit Announcement of Presence. A study was conducted to explore and better understand how conversation is initiated in first encounters between people who do not know each other. Human-to-human conversations were analyzed for the purpose of modelling a realistic conversation between human users and virtual agents. Results from the study have led to the inclusion of the communicative function in the human-to-agent conversation system. Because the interaction is based on real-life data rather than textbook examples, learners playing the game will be exposed to situations that they may encounter in real life. We believe that this application will help bridge the gap from the classroom to the real world, preparing learners to initiate conversations with real Icelandic speakers.


Explicit Announcement of Presence, communicative function, human-agent interaction, embodied conversational agent, multimodal communication, natural language, social behavior


