Performance of XML Databases for Epidemiological Queries in Archetype-Based EHRs

Sergio Miranda Freire
Department of Biomedical Engineering, Linköping University, Linköping, Sweden

Erik Sundvall
Department of Biomedical Engineering, Linköping University, Linköping, Sweden

Daniel Karlsson
Department of Biomedical Engineering, Linköping University, Linköping, Sweden

Patrick Lambrix
Department of Computer and Information Science, Linköping University, Linköping, Sweden

Published in: Scandinavian Conference on Health Informatics 2012; October 2-3; Linköping; Sweden

Linköping Electronic Conference Proceedings 70:9, pp. 51-57

Published: 2012-09-28

ISBN: 978-91-7519-758-6

ISSN: 1650-3686 (print), 1650-3740 (online)


Very few published studies address the performance of persistence mechanisms for systems that use the openEHR multi-level modelling approach. This paper examines the performance and size of XML databases that store openEHR-compliant documents, reporting database size and response times to epidemiological queries. An anonymized relational epidemiology database and its associated epidemiological queries served as the source: openEHR XML documents were generated from the data, then stored and queried in four open-source XML databases. The XML databases were considerably slower and required much more space than the relational database. For population-wide epidemiological queries, response times grew by about the same order of magnitude as the number of records (total database size), but were orders of magnitude slower than in the original relational database. For individual-focused clinical queries, where a patient ID was specified, response times were acceptable. This study suggests that the tested XML database configurations, without further optimization, are not suitable as persistence mechanisms for openEHR-based production systems if population-wide ad hoc querying is needed.
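The difference between the two query shapes mentioned above can be illustrated with a minimal Python sketch. It is not the authors' setup: the element names (`composition`, `patient_id`, `exam`, `result`) and the in-memory patient index are hypothetical simplifications, and real openEHR compositions are far more deeply nested. The sketch only shows why a population-wide query has to touch every document, whereas a patient-focused query can be answered from a small, pre-selected subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified documents (not the openEHR schema).
DOCS = [
    "<composition><patient_id>p1</patient_id><exam><result>positive</result></exam></composition>",
    "<composition><patient_id>p1</patient_id><exam><result>negative</result></exam></composition>",
    "<composition><patient_id>p2</patient_id><exam><result>positive</result></exam></composition>",
    "<composition><patient_id>p3</patient_id><exam><result>negative</result></exam></composition>",
]


def parse_all(docs):
    """Parse each XML string into an element tree root."""
    return [ET.fromstring(doc) for doc in docs]


def population_query(roots, result):
    """Count documents with the given exam result.

    Every document is inspected, so the work grows with the
    total number of stored documents.
    """
    return sum(1 for root in roots if root.findtext("exam/result") == result)


def build_patient_index(roots):
    """Group documents by patient ID (a stand-in for a database index)."""
    index = {}
    for root in roots:
        index.setdefault(root.findtext("patient_id"), []).append(root)
    return index


def patient_query(index, patient_id):
    """Return the exam results of one patient.

    Only that patient's documents are inspected, regardless of
    how many documents the collection holds in total.
    """
    return [root.findtext("exam/result") for root in index.get(patient_id, [])]


roots = parse_all(DOCS)
index = build_patient_index(roots)
print(population_query(roots, "positive"))
print(patient_query(index, "p1"))
```

In an actual XML database the same contrast would typically be expressed in XQuery, and whether an index helps depends on the product and its configuration.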


Medical Record Systems, Computerized; Database Management Systems; Archetypes; XML Databases; openEHR


