Conference article

HB Deid - HB De-identification tool demonstrator

Hercules Dalianis
Department of Computer and Systems Sciences Stockholm University Kista, Sweden

Hanna Berg
Department of Computer and Systems Sciences Stockholm University Kista, Sweden


Part of: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021.

Linköping Electronic Conference Proceedings 178:54, pp. 467-471


Published: 2021-05-21

ISBN: 978-91-7929-614-8

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This paper describes HB Deid, a freely available web-based demonstrator. HB Deid identifies protected health information (PHI) in text written in Swedish and removes it, masks it, or replaces it with surrogates (pseudonyms). PHI consists of named entities such as personal names, locations, ages, phone numbers, and dates. To find PHI, HB Deid uses a conditional random field (CRF) model trained on non-sensitive annotated Swedish text, followed by a rule-based post-processing step. The final step obscures each PHI instance in one of three ways: masking it, showing only its class name, or replacing it with a surrogate produced by a rule-based pseudonymisation system.
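The three obscuring options can be illustrated with a minimal Python sketch. It is not HB Deid's code: the class names (`First_Name`, `Phone_Number`), the phone-number regex, the fixed surrogate table, and the functions `find_phone_numbers` and `obscure` are all hypothetical, and the span that a CRF tagger would produce is hard-coded here rather than predicted.

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # PHI class name (hypothetical label set)


# Hypothetical fixed surrogates; a real pseudonymiser would choose
# replacements by rule rather than use one constant per class.
SURROGATES = {"First_Name": "Maria", "Phone_Number": "08-000000"}

# Hypothetical post-processing rule: a simple Swedish-style phone pattern.
PHONE_RE = re.compile(r"\b0\d{1,3}-\d{5,8}\b")


def find_phone_numbers(text):
    """Rule-based step: return a Span for every regex match."""
    return [Span(m.start(), m.end(), "Phone_Number")
            for m in PHONE_RE.finditer(text)]


def obscure(text, spans, mode):
    """Rewrite text with every span masked, class-labelled, or replaced.

    Assumes the spans do not overlap.
    """
    pieces, pos = [], 0
    for span in sorted(spans, key=lambda s: s.start):
        pieces.append(text[pos:span.start])
        if mode == "mask":
            pieces.append("X" * (span.end - span.start))
        elif mode == "class":
            pieces.append(f"[{span.label}]")
        elif mode == "pseudonym":
            pieces.append(SURROGATES.get(span.label, f"[{span.label}]"))
        else:
            raise ValueError(f"unknown mode: {mode}")
        pos = span.end
    pieces.append(text[pos:])
    return "".join(pieces)


text = "Anna ringde 08-123456 i går."
# Stand-in for the CRF output: one name span, given by hand.
spans = [Span(0, 4, "First_Name")] + find_phone_numbers(text)

for mode in ("mask", "class", "pseudonym"):
    print(obscure(text, spans, mode))
```

Keeping detection (spans) separate from obscuring (mode) means the same detected PHI can be rendered in any of the three ways without rerunning the tagger.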

Keywords

de-identification, pseudonymisation, clinical text, electronic patient records, CRF, Swedish
