Performance of XML Databases for Epidemiological Queries in Archetype-Based EHRs

Sergio Miranda Freire
Department of Biomedical Engineering, Linköping University, Linköping, Sweden

Erik Sundvall
Department of Biomedical Engineering, Linköping University, Linköping, Sweden

Daniel Karlsson
Department of Biomedical Engineering, Linköping University, Linköping, Sweden

Patrick Lambrix
Department of Computer and Information Science, Linköping University, Linköping, Sweden

Published in: Scandinavian Conference on Health Informatics 2012; October 2-3; Linköping; Sweden

Linköping Electronic Conference Proceedings 70:9, pp. 51-57

Published: 2012-09-28

ISBN: 978-91-7519-758-6

ISSN: 1650-3686 (print), 1650-3740 (online)


Few published studies address the performance of persistence mechanisms for systems that use the openEHR multi-level modelling approach. This paper examines the performance and size of XML databases that store openEHR-compliant documents, reporting database size and response times for epidemiological queries. An anonymized relational epidemiology database and its associated epidemiological queries were used to generate openEHR XML documents, which were stored and queried in four open-source XML databases. The XML databases were considerably slower and required much more space than the relational database. For population-wide epidemiological queries, response times grew by the same order of magnitude as the number of records (total database size) but were orders of magnitude longer than in the original relational database. For individual-focused clinical queries in which the patient ID was specified, response times were acceptable. This study suggests that the tested XML database configurations, without further optimization, are not suitable as persistence mechanisms for openEHR-based systems in production if population-wide ad hoc querying is needed.


Medical Record Systems, Computerized; Database Management Systems; Archetypes; XML Databases; openEHR


