Article | Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland | Docria: Processing and Storing Linguistic Data with Wikipedia Linköping University Electronic Press Conference Proceedings

Title:
Docria: Processing and Storing Linguistic Data with Wikipedia
Author:
Marcus Klang: Department of Computer Science, Lund University, Lund, Sweden Pierre Nugues: Department of Computer Science, Lund University, Lund, Sweden
Year:
2019
Conference:
Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland
Issue:
167
Article no.:
048
Pages:
401--405
No. of pages:
5
Publication type:
Abstract and Fulltext
Published:
2019-10-02
ISBN:
978-91-7929-995-8
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet



The availability of user-generated content has increased significantly over time. Wikipedia is one example of such a corpus: it spans a huge range of topics and is freely available. Storing and processing such corpora requires flexible document models, as they may contain malicious or incorrect data. Docria is a library that attempts to address this issue with a document model based on typed property hypergraphs. Docria can be used with small to large corpora, from laptops using Python interactively in a Jupyter notebook to clusters running map-reduce frameworks with optimized compiled code. Docria is available as open-source code at https://github.com/marcusklang/docria.
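To make the idea of a typed property hypergraph more concrete, the following is a minimal sketch in plain Python. It is not Docria's actual API: the names Document, Layer, Node, add_layer and covered_text are hypothetical and chosen only for illustration. The sketch assumes that a document holds its text plus named layers, that each layer declares a typed schema for its nodes, and that a node property may reference several other nodes at once, which is what makes the graph a hypergraph.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Node:
    """A node belonging to one layer, carrying a dictionary of properties."""
    layer: str
    props: Dict[str, Any]


class Layer:
    """A named collection of nodes sharing one typed schema (hypothetical)."""

    def __init__(self, name: str, schema: Dict[str, type]) -> None:
        self.name = name
        self.schema = schema          # property name -> expected Python type
        self.nodes: List[Node] = []

    def add(self, **props: Any) -> Node:
        # Reject unknown properties and wrongly typed values before storing.
        for key, value in props.items():
            if key not in self.schema:
                raise KeyError(f"unknown property {key!r} in layer {self.name!r}")
            if not isinstance(value, self.schema[key]):
                raise TypeError(f"property {key!r} expects {self.schema[key].__name__}")
        node = Node(self.name, props)
        self.nodes.append(node)
        return node


class Document:
    """Raw text plus any number of annotation layers (hypothetical)."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.layers: Dict[str, Layer] = {}

    def add_layer(self, name: str, **schema: type) -> Layer:
        layer = Layer(name, schema)
        self.layers[name] = layer
        return layer


def covered_text(doc: Document, mention: Node) -> str:
    """Return the text spanned by the first to the last token of a mention."""
    toks = mention.props["tokens"]
    return doc.text[toks[0].props["start"]:toks[-1].props["end"]]


doc = Document("Lund University is in Sweden")
tokens = doc.add_layer("token", start=int, end=int)
mentions = doc.add_layer("mention", tokens=tuple, label=str)

# Character offsets of the five whitespace-separated tokens.
spans = [(0, 4), (5, 15), (16, 18), (19, 21), (22, 28)]
token_nodes = [tokens.add(start=s, end=e) for s, e in spans]

# One mention node references two token nodes: a hyperedge over several nodes.
org = mentions.add(tokens=(token_nodes[0], token_nodes[1]), label="ORG")
print(covered_text(doc, org))
```

Checking each property against the layer schema at insertion time is one simple way to keep incorrect data from a noisy corpus out of the model; a real library may validate differently or at a different stage.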



