Conference article

Data Collection from Persons with Mild Forms of Cognitive Impairment and Healthy Controls - Infrastructure for Classification and Prediction of Dementia

Dimitrios Kokkinakis
Department of Swedish, University of Gothenburg, Sweden

Kristina Lundholm Fors
Department of Swedish, University of Gothenburg, Sweden

Eva Björkner
Department of Swedish, University of Gothenburg, Sweden

Arto Nordlund
Department of Psychiatry and Neurochemistry, Sahlgrenska Academy, University of Gothenburg, Sweden

Published in: Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden

Linköping Electronic Conference Proceedings 131:20, p. 172-182

NEALT Proceedings Series 29:20, p. 172-182

Published: 2017-05-08

ISBN: 978-91-7685-601-7

ISSN: 1650-3686 (print), 1650-3740 (online)


Cognitive and mental deterioration, such as difficulties with memory and language, is among the typical phenotypes of most neurodegenerative diseases, including Alzheimer’s disease and other forms of dementia. This paper describes the first phases of a project that collects various types of cognitive data from human subjects in order to study relationships among linguistic and extra-linguistic observations. The project aims to identify, extract, process, correlate, evaluate, and disseminate various linguistic phenotypes and measurements, and thereby to contribute complementary knowledge for early diagnosis, monitoring of disease progression, or identification of individuals at risk. In the near future, automatic analysis of these data will be used to extract features for training, testing, and evaluating automatic classifiers that could differentiate individuals with mild symptoms of cognitive impairment from healthy, age-matched controls and identify possible indicators for the early detection of mild forms of cognitive impairment. Features will be extracted from the audio recordings (speech signal), the transcriptions of the audio recordings (text), and the raw eye-tracking data.



