Pseudonymisation of Swedish Electronic Patient Records Using a Rule-Based Approach

Hercules Dalianis
Department of Computer and Systems Sciences, Stockholm University, Sweden

Published in: Proceedings of the Workshop on NLP and Pseudonymisation, September 30, 2019, Turku, Finland

Linköping Electronic Conference Proceedings 166:3, p. 16-23

NEALT Proceedings Series 41:3, p. 16-23

Published: 2019-09-30

ISBN: 978-91-7929-996-5

ISSN: 1650-3686 (print), 1650-3740 (online)


This study describes a rule-based pseudonymisation system for Swedish clinical text and its evaluation. The system replaces already tagged Protected Health Information (PHI) with realistic surrogates. There are eight types of manually annotated PHI in the electronic patient records: personal first and last names, phone numbers, locations, dates, ages and healthcare units. Two evaluators, both computer scientists, one junior and one senior, judged whether each of a set of 98 electronic patient records was pseudonymised or not. Only 3.5 percent of the records were correctly judged as pseudonymised and 1.5 percent of the real ones were wrongly judged as pseudonymised, meaning that on average 91 percent of the pseudonymised records were judged as real.
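The core idea, replacing already tagged PHI with surrogates by rule, can be illustrated with a minimal sketch. This is not the system evaluated in the paper: the inline tag format, the tag names (`First_Name`, `Last_Name`, `Location`, `Health_Care_Unit`, `Age`, `Phone_Number`), the tiny surrogate lists and the individual rules are all assumptions made for illustration.

```python
import random
import re

# Hypothetical surrogate lists; a real system would draw on large lexicons.
SURROGATES = {
    "First_Name": ["Karin", "Erik", "Maria", "Lars"],
    "Last_Name": ["Lindberg", "Nyström", "Bergman"],
    "Location": ["Uppsala", "Västerås", "Örebro"],
    "Health_Care_Unit": ["Medicinkliniken", "Kirurgavdelningen"],
}

# Assumed annotation format: <Tag>value</Tag>, with no nesting.
TAG_RE = re.compile(r"<(?P<tag>\w+)>(?P<value>.*?)</(?P=tag)>")


def surrogate_for(tag, value, rng, cache):
    """Return a surrogate for one tagged PHI instance.

    The cache makes the same (tag, value) pair map to the same surrogate
    throughout a record, so repeated mentions stay consistent.
    """
    key = (tag, value)
    if key in cache:
        return cache[key]
    if tag in SURROGATES:
        # Pick a list entry that differs from the original value.
        candidates = [c for c in SURROGATES[tag] if c != value]
        new = rng.choice(candidates)
    elif tag == "Age" and value.isdigit():
        # Shift the age by a small offset, never below zero.
        new = str(max(0, int(value) + rng.choice([-2, -1, 1, 2])))
    elif tag == "Phone_Number":
        # Replace every digit but keep separators, so the format looks real.
        new = "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)
    else:
        # No rule for this tag in the sketch: mask it rather than leak it.
        new = f"[{tag}]"
    cache[key] = new
    return new


def pseudonymise(tagged_text, seed=0):
    """Replace every tagged span in one record with a surrogate."""
    rng = random.Random(seed)
    cache = {}
    return TAG_RE.sub(
        lambda m: surrogate_for(m["tag"], m["value"], rng, cache), tagged_text
    )
```

Calling `pseudonymise("<First_Name>Anna</First_Name> ringde. <First_Name>Anna</First_Name> mår bra.")` would replace both mentions of the name with the same surrogate and drop the tags. Types without a rule here, such as dates, fall back to a bracketed placeholder in this sketch.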


pseudonymisation, Swedish electronic patient records, rule-based method

