A Greek Corpus of Aphasic Discourse: Collection, Transcription, and Annotation Specifications

Spyridoula Varlokosta
National and Kapodistrian University of Athens, Greece

Spyridoula Stamouli
National and Kapodistrian University of Athens, Greece / Institute for Language and Speech Processing / “Athena” Research Center, Greece

Athanassios Karasimos
National and Kapodistrian University of Athens, Greece / Academy of Athens, Greece

Georgios Markopoulos
National and Kapodistrian University of Athens, Greece

Maria Kakavoulia
Panteion University of Social and Political Sciences, Greece

Michaela Nerantzini
National and Kapodistrian University of Athens, Greece / Northwestern University, USA

Aikaterini Pantoula
National and Kapodistrian University of Athens, Greece

Valantis Fyndanis
National and Kapodistrian University of Athens, Greece / University of Oslo, Norway

Alexandra Economou
National and Kapodistrian University of Athens, Greece

Athanassios Protopapas
National and Kapodistrian University of Athens, Greece


Published in: Proceedings of LREC 2016 Workshop. Resources and Processing of Linguistic and Extra-Linguistic Data from People with Various Forms of Cognitive/Psychiatric Impairments (RaPID-2016), Monday 23rd of May 2016

Linköping Electronic Conference Proceedings 128:3, pp. 14-21


Published: 2016-06-03

ISBN: 978-91-7685-730-4

ISSN: 1650-3686 (print), 1650-3740 (online)


This paper presents the design of an annotated Greek Corpus of Aphasic Discourse (GREECAD). Because resources of this kind are scarce, a major aim of GREECAD was to provide a set of specifications that could serve as a methodological basis for developing other such corpora and thereby contribute to future research in this area. GREECAD was developed to meet the following requirements: a) to include a rather homogeneous sample of Greek as spoken by individuals with aphasia; b) to document speech samples with rich metadata, including demographic information as well as detailed information on the patients’ medical records and neuropsychological evaluations; c) to provide annotated speech samples that encode information at the micro-linguistic level (words, POS, grammatical errors, clause types, etc.) and at the discourse level (narrative structure elements, main events, evaluation devices, etc.). As part of the design, the basic requirements for data collection, metadata, transcription, and annotation procedures were set. The discourse samples were transcribed and annotated with the ELAN tool. To ensure accurate and consistent annotation, a Transcription and Annotation Guide was compiled, with detailed guidelines on all aspects of the transcription and annotation procedure.
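
To make the multi-level annotation concrete, the following is a minimal Python sketch of how time-aligned tiers in an ELAN annotation file (EAF, an XML format) might be read. It is an illustration only, not the GREECAD tooling: the tier names (`utterance`, `clause_type`), the annotation values, and the helper `read_tiers` are hypothetical, and the sample document is a stripped-down fragment that omits the header and type definitions a real ELAN file would contain.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down EAF fragment. Tier names and values are
# invented for illustration and are NOT taken from the GREECAD guide.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="3200"/>
  </TIME_ORDER>
  <TIER TIER_ID="utterance" LINGUISTIC_TYPE_REF="default">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>first utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
          TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>second utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="clause_type" LINGUISTIC_TYPE_REF="default">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>main</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_tiers(eaf_text):
    """Return {tier_id: [(start_ms, end_ms, value), ...]} for time-aligned annotations."""
    root = ET.fromstring(eaf_text)

    # Map each time-slot ID to its offset in milliseconds.
    # Slots without a TIME_VALUE (unaligned) are stored as None.
    slots = {}
    for slot in root.iter("TIME_SLOT"):
        raw = slot.get("TIME_VALUE")
        slots[slot.get("TIME_SLOT_ID")] = int(raw) if raw is not None else None

    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, value))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


if __name__ == "__main__":
    for tier_id, rows in read_tiers(SAMPLE_EAF).items():
        for start, end, value in rows:
            print(f"{tier_id}\t{start}-{end} ms\t{value}")
```

Keeping each annotation level on its own tier, aligned to shared time slots, is what would let a micro-linguistic label and a discourse-level label be attached to the same stretch of speech. The sketch reads only time-aligned annotations; annotations that refer to a parent annotation instead of time slots are not handled.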


aphasia, aphasic discourse, annotated corpus


Armstrong, E. (2000). Aphasic discourse analysis: The story so far. Aphasiology, 14, pp. 875-892.

Armstrong, E. & Ulatowska, H. K. (2007). Making stories: Evaluative language and the aphasia experience. Aphasiology, 21, pp. 763-774.

Armstrong, E. & Ulatowska, H. K. (2006). Stroke stories: Conveying emotive experiences in aphasia. In M. J. Ball & J. S. Damico (Eds.), Clinical Aphasiology: Future Directions. Hove, UK: Psychology Press.

Armstrong, E. (2005). Expressing opinions and feelings in aphasia: Linguistic options. Aphasiology, 19, pp. 285-296.

Berko-Gleason, J., Goodglass, H., Obler, L., Green, E., Hyde, M. & Weintraub, S. (1980). Narrative strategies of aphasics and normal-speaking subjects. Journal of Speech and Hearing Research, 23, pp. 370-382.

Bird, S. & Liberman, M. (1999). A Formal Framework for Linguistic Annotation. Technical Report (MS-CIS-99-01). University of Pennsylvania.

Capilouto, G. J., Wright, H. H. & Wagovich, S. A. (2006). Reliability of main event measurement in the discourse of individuals with aphasia. Aphasiology, 20, pp. 205-216.

De Roo, E. (1999). Agrammatic Grammar: Functional Categories in Agrammatic Speech. The Hague: Theseus.

Doyle, P. J., McNeil, M. R., Spencer, K. A., Goda, A. J., Cottrell, K. & Lustig, A. P. (1998). The effects of concurrent picture presentations on retelling of orally presented stories by adults with aphasia. Aphasiology, 12, pp. 561-574.

Faroqi-Shah, Y. & Thompson, C. K. (2007). Verb inflections in agrammatic aphasia: Encoding of tense features. Journal of Memory and Language, 56, pp. 129-151.

Fyndanis, V., Varlokosta, S., & Tsapkini, K. (2012). Agrammatic production: Interpretable features and selective impairment in verb inflection. Lingua, 122, pp. 1134-1147.

Harley, T. (2001). The Psychology of Language: From Data to Theory (2nd edition). New York: Psychology Press.

Ide, N. & Suderman, K. (2007). GrAF: A graph-based format for linguistic annotations. In Proceedings of the Linguistic Annotation Workshop. Stroudsburg, PA: Association for Computational Linguistics, pp. 1-8.

Ide, N. & Suderman, K. (2014). The linguistic annotation framework: A standard for annotation interchange and merging. Language Resources and Evaluation, 48, pp. 395-418.

Kakavoulia, M., Stamouli, S., Foka-Kavalieraki, P., Economou, A., Protopapas, A. & Varlokosta, S. (2014). A battery for eliciting narrative discourse by Greek speakers with aphasia: Principles, methodological issues, and preliminary results [in Greek]. Glossologia, 22, pp. 41-60.

Labov, W. (1972). Language in the Inner City. Philadelphia: The University of Pennsylvania Press.

Labov, W. & Waletsky, J. (1967). Narrative analysis. In J. Helm (Ed.), Essays in the Verbal and Visual Arts. Seattle: University of Seattle Press, pp. 12-44.

MacWhinney, B., Fromm, D., Forbes, M. & Holland, A. (2011). AphasiaBank: Methods for studying discourse. Aphasiology, 25, pp. 1286-1307.

MacWhinney, B., Fromm, D., Holland, A. & Forbes, M. (2012). AphasiaBank: Data and methods. In N. Mueller & M. Ball (Eds.), Methods in Clinical Linguistics. New York: Wiley, pp. 31-48.

McNeil, M. R., Sung, J. E., Yang, D., Pratt, S. R., Fossett, T. R. D., Pavelko, S. & Doyle, P. J. (2007). Comparing connected language elicitation procedures in persons with aphasia: Concurrent validation of the Story Retell Procedure. Aphasiology, 21, pp. 775-790.

Menn, L., Ramsberger, G. & Helm-Estabrooks, N. (1994). A linguistic communication measure for aphasic narratives. Aphasiology, 8, pp. 315-342.

Mesulam, M. M. (2000). Principles of Behavioral and Cognitive Neurology (2nd edition). New York: Oxford University Press.

Nicholas, L. E. & Brookshire, R. H. (1995). Presence, completeness and accuracy of main concepts in the connected speech of non-brain-damaged adults and adults with aphasia. Journal of Speech and Hearing Research, 38, pp. 145-156.

Nicholas, L. E. & Brookshire, R. H. (1993). A system for quantifying the informativeness and efficiency of the connected speech of adults with aphasia. Journal of Speech and Hearing Research, 36, pp. 338-350.

Obler, L. K. & Gjerlow, K. (1999). Language and the Brain. Cambridge: Cambridge University Press.

Olness, G. S. & Ulatowska, H. K. (2011). Personal narratives in aphasia: Coherence in the context of use. Aphasiology, 25, pp. 1393-1413.

Papathanassiou, E., Papadimitriou, D., Gavrilou, V. & Michou, A. (2008). Normative data for the Boston Diagnostic Aphasia Battery in Greek: Gender and age effects [in Greek]. Psychology, 15, pp. 398-410.

Saffran, E. M., Sloan-Berndt, R. & Schwartz, M. (1989). The quantitative analysis of agrammatic production: Procedure and data. Brain and Language, 37, pp. 440-479.

Simos, P. G., Kasselimis, D. & Mouzaki, A. (2011). Age, gender, and education effects on vocabulary measures in Greek. Aphasiology, 25, pp. 492-504.

Stamouli, S. & Karasimos, A. (2015). The Greek Corpus of Aphasic Discourse. Oral presentation at the workshop on the “Interdisciplinary Study of Aphasia” Thales Project. University of Athens. Athens, 27 June 2015.

Thompson, C. K., Shapiro, L. P., Tait, M. E., Jacobs, B. J., Schneider, S. L. & Ballard, K. J. (1995). A system for the linguistic analysis of agrammatic language production. Brain and Language, 51, pp. 124-127.

Ulatowska, H. K., Freedman-Stern, R., Doyel, A. W., Macaluso-Haynes, S. & North, A. (1983). Production of narrative discourse in aphasia. Brain and Language, 19, pp. 317-334.

Ulatowska, H. K., North, A. J. & Macaluso-Haynes, S. (1981). Production of narrative and procedural discourse in aphasia. Brain and Language, 13, pp. 345-371.

Ulatowska, H. K., Olness, G. S., Keebler, M. & Tillery, J. (2006). Evaluation in stroke narratives: A study in aphasia. Brain and Language, Special Issue Academy of Aphasia 2006 Program, 99 (1-2), pp. 51-52.

Ulatowska, H. K., Reyes, B. A., Santos, T. O. & Worle, C. (2011). Stroke narratives in aphasia: The role of reported speech. Aphasiology, 25, pp. 93-105.

Varlokosta, S., Karasimos, A., Stamouli, S., Kakavoulia, M., Markopoulos, G., Goutsos, D., Fyndanis, V., Nerantzini, M. & Pantoula, A. (2013). Greek Corpus of Aphasic Discourse. Transcription and Annotation Guide [in Greek]. Athens: National and Kapodistrian University of Athens.

Vermeulen, J., Bastiaanse, R. & van Wageningen, B. (1989). Spontaneous speech in aphasia: A correlational study. Brain and Language, 36, pp. 252-274.

Wang, H., Yoshida, M. & Thompson, C. K. (2014). Parallel functional category deficits in clauses and nominal phrases: The case of English agrammatism. Journal of Neurolinguistics, 27, pp. 75-102.

Westerhout, E. & Monachesi, P. (2006). A pilot study for a Corpus of Dutch Aphasic Speech (CoDAS). In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006), pp. 1648-1653.

Williams, C., Thwaites, A., Buttery, P., Geertzen, J., Randall, B., Shafto, M., Devereux, B. & Tyler, L. (2010). The Cambridge Cookie-Theft Corpus: A corpus of directed and spontaneous speech of brain-damaged patients and healthy individuals. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), pp. 2824-2830.

Wittenburg, P., Brugman, H., Russel, A., Klassmann, A. & Sloetjes, H. (2006). ELAN: a Professional Framework for Multimodality Research. In Proceedings of LREC 2006, Fifth International Conference on Language Resources and Evaluation, pp. 1556-1559. (URL: http://tla.mpi.nl/tools/tla-tools/elan/, Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands).

Wright, H. H. (2011). Discourse in aphasia: An introduction to current research and future directions. Aphasiology, 25, pp. 1283-1285.

Wright, H. H., Capilouto, G. J., Wagovich, S. A., Cranfill, T. & Davis, J. (2005). Development and reliability of a quantitative measure of adults’ narratives. Aphasiology, 19, pp. 263-273.
