Data collection for learner corpus of Latvian: copyright and personal data protection

Kaija, Inga; Auzina, Ilze

Conference article

Inga Kaija
Institute of Mathematics and Computer Science, University of Latvia; Rīga Stradiņš University, Latvia

Ilze Auzina
Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia

https://doi.org/10.3384/ecp2020172006

Published in: Selected Papers from the CLARIN Annual Conference 2019

Linköping Electronic Conference Proceedings 172:6, pp. 41-47

Published: 2020-07-03

ISBN: 978-91-7929-807-4

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

Copyright and personal data protection are two of the most important legal aspects of collecting data for a learner corpus. The paper explains the challenges of data collection for the learner corpus of Latvian “LaVA” and describes the procedure used to protect the rights of the texts’ authors. A combined agreement and metadata questionnaire form was created to inform the authors how their texts will be used and to obtain their permission for that use. The information, the permission, and the metadata questionnaire are printed on one side of an A4 sheet, and the author writes the text by hand on the other side, which eliminates the need to identify the author of the text separately. After the texts are scanned and added to the corpus, the originals are returned to the authors.
