Conference article

HB Deid - HB De-identification tool demonstrator

Hercules Dalianis
Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden

Hanna Berg
Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden

Published in: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021.

Linköping Electronic Conference Proceedings 178:54, p. 467-471

NEALT Proceedings Series 45:54, p. 467-471

Published: 2021-05-21

ISBN: 978-91-7929-614-8

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This paper describes a freely available web-based demonstrator called HB Deid. HB Deid identifies so-called protected health information (PHI) in text written in Swedish and removes, masks, or replaces it with surrogates or pseudonyms. PHI consists of named entities such as personal names, locations, ages, phone numbers, and dates. HB Deid finds PHI using a conditional random field (CRF) model trained on non-sensitive annotated Swedish text, followed by a rule-based post-processing step. The final step obscures each PHI instance in one of three ways: masking it, showing only its class name, or replacing it with a surrogate from a rule-based pseudonymisation system.
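
The final obscuring step described in the abstract can be illustrated with a minimal Python sketch. It is not the HB Deid implementation: the function name `obscure`, the span format `(start, end, class)`, the class names, and the tiny `SURROGATES` table are all assumptions made for illustration, and the PHI spans are assumed to have been found already by an upstream detector such as the CRF model.

```python
from typing import List, Tuple

# A detected PHI span: (start offset, end offset, PHI class name).
# This format is an assumption for illustration, not HB Deid's own.
Span = Tuple[int, int, str]

# Hypothetical surrogate table. A real rule-based pseudonymiser would
# choose replacements with far richer rules than one fixed value per class.
SURROGATES = {
    "First_Name": "Anna",
    "Age": "50",
    "Location": "Uppsala",
}


def obscure(text: str, spans: List[Span], mode: str = "mask") -> str:
    """Return text with every PHI span obscured.

    mode "mask"      replaces each character of the span with "X"
    mode "class"     replaces the span with its class name in brackets
    mode "pseudonym" replaces the span with a surrogate from SURROGATES,
                     falling back to the bracketed class name
    """
    if mode not in {"mask", "class", "pseudonym"}:
        raise ValueError(f"unknown mode: {mode}")

    parts = []
    pos = 0
    for start, end, label in sorted(spans):
        if start < pos:
            raise ValueError("overlapping PHI spans are not supported")
        parts.append(text[pos:start])  # keep the text before the span
        if mode == "mask":
            parts.append("X" * (end - start))
        elif mode == "class":
            parts.append(f"[{label}]")
        else:
            parts.append(SURROGATES.get(label, f"[{label}]"))
        pos = end
    parts.append(text[pos:])  # keep the text after the last span
    return "".join(parts)


if __name__ == "__main__":
    text = "Karin, 42 år, bor i Kista."
    spans = [(0, 5, "First_Name"), (7, 9, "Age"), (20, 25, "Location")]
    for mode in ("mask", "class", "pseudonym"):
        print(mode, "->", obscure(text, spans, mode))
```

Processing the spans in offset order and copying the untouched text between them keeps the character offsets valid even when a replacement is longer or shorter than the original span.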

Keywords

de-identification, pseudonymisation, clinical text, electronic patient records, CRF, Swedish
