Article | Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland | Projecting named entity recognizers without annotated or parallel corpora Linköping University Electronic Press Conference Proceedings

Title:
Projecting named entity recognizers without annotated or parallel corpora
Author:
Jue Hou: Department of Computer Science, University of Helsinki, Finland; Maximilian W. Koppatz: Department of Computer Science, University of Helsinki, Finland; José María Hoya Quecedo: Department of Computer Science, University of Helsinki, Finland; Roman Yangarber: Department of Computer Science, University of Helsinki, Finland
Year:
2019
Conference:
Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland
Issue:
167
Article no.:
024
Pages:
232--241
No. of pages:
9
Publication type:
Abstract and Fulltext
Published:
2019-10-02
ISBN:
978-91-7929-995-8
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



Named entity recognition (NER) is a well-researched task in NLP, and training usable models typically requires large annotated corpora. This is a problem for languages that lack such corpora, such as Finnish. We propose an approach to create a named entity recognizer with no annotated or parallel documents, by leveraging strong NER models that exist for English. We automatically gather a large amount of chronologically matched data in two languages, then project named entity annotations from the English documents onto the Finnish ones, resolving the matches with limited linguistic rules. We use this "artificially" annotated data to train a BiLSTM-CRF model. Our results show that this method can produce annotated instances with high precision, and the resulting model achieves state-of-the-art performance.
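The projection step can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the linguistic rules, so the suffix list, the function names `matches_entity` and `project`, and the single-token BIO labelling below are all assumptions made for illustration. The sketch takes entities found by an English NER model in date-matched English documents and tags a Finnish token when it equals an entity name or is that name followed by a common Finnish case ending.

```python
from typing import Dict, List, Tuple

# Hypothetical, incomplete list of Finnish case endings (an assumption,
# not the rule set used in the paper).
FINNISH_SUFFIXES = (
    "ssa", "ssä", "sta", "stä", "lla", "llä",
    "lta", "ltä", "lle", "ksi", "n", "a", "ä",
)


def matches_entity(token: str, entity: str) -> bool:
    """Return True if token is the entity name, optionally plus a listed suffix."""
    tok, ent = token.lower(), entity.lower()
    if tok == ent:
        return True
    if tok.startswith(ent):
        # Whatever follows the name must be exactly one known case ending.
        return tok[len(ent):] in FINNISH_SUFFIXES
    return False


def project(
    english_entities: Dict[str, str], finnish_tokens: List[str]
) -> List[Tuple[str, str]]:
    """Tag each Finnish token with a projected entity type, or "O" if none matches.

    english_entities maps an entity name (single token only in this sketch)
    to the type assigned by an English NER model.
    """
    tagged = []
    for token in finnish_tokens:
        label = "O"
        for name, ent_type in english_entities.items():
            if matches_entity(token, name):
                label = "B-" + ent_type
                break
        tagged.append((token, label))
    return tagged


if __name__ == "__main__":
    entities = {"Nokia": "ORG", "Espoo": "LOC"}
    sentence = ["Nokian", "pääkonttori", "on", "Espoossa", "."]
    for token, label in project(entities, sentence):
        print(token, label)
```

A real system would also need to handle multi-token names and Finnish stem alternations (for example, Helsinki becoming Helsingissä), which plain suffix matching like this cannot capture.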

Keywords: Automatic data annotation, Named Entity Recognition, Neural Network


