Inga Kaija
Institute of Mathematics and Computer Science, University of Latvia; Riga Stradiņš University, Latvia
Ilze Auzina
Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
DOI: https://doi.org/10.3384/ecp2020172006
Published in: Selected Papers from the CLARIN Annual Conference 2019
Linköping Electronic Conference Proceedings 172:6, p. 41-47
Published: 2020-07-03
ISBN: 978-91-7929-807-4
ISSN: 1650-3686 (print), 1650-3740 (online)
Copyright and personal data protection are two of the most important legal aspects of collecting data for a learner corpus. The paper explains the challenges in data collection for the learner corpus of Latvian “LaVA” and describes the procedure adopted to protect the rights of the texts’ authors. A combined agreement and metadata questionnaire form was created to inform the authors of how their texts will be used and to obtain their permission for that use. The information, the permission, and the metadata questionnaire are printed on one side of an A4 sheet, and the author writes the text by hand on the other side, which eliminates the need to identify the author of the text separately. After the texts are scanned and added to the corpus, the originals are returned to the authors.