Proceedings

Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden

Linköping Electronic Conference Proceedings 131 (2017)

NEALT Proceedings Series 29 (2017)

Editor(s): Jörg Tiedemann

Published: 2017-05-08

ISBN: 978-91-7685-601-7

ISSN: 1650-3686 (print), 1650-3740 (online)

Number of pages: 337

Content

1 Joint UD Parsing of Norwegian Bokmål and Nynorsk
Erik Velldal, Lilja Øvrelid and Petter Hohle

2 Replacing OOV Words For Dependency Parsing With Distributional Semantics
Prasanth Kolachina, Martin Riedl and Chris Biemann

3 Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing
Ali Basirat and Joakim Nivre

4 Tagging Named Entities in 19th Century and Modern Finnish Newspaper Material with a Finnish Semantic Tagger
Kimmo Kettunen and Laura Löfberg

5 Machine Learning for Rhetorical Figure Detection: More Chiasmus with Less Annotation
Marie Dubremetz and Joakim Nivre

6 Coreference Resolution for Swedish and German using Distant Supervision
Alexander Wallin and Pierre Nugues

7 Aligning phonemes using finite-state methods
Kimmo Koskenniemi

8 Acoustic Model Compression with MAP adaptation
Katri Leino and Mikko Kurimo

9 OCR and post-correction of historical Finnish texts
Senka Drobac, Pekka Kauppinen and Krister Lindén

10 Twitter Topic Modeling by Tweet Aggregation
Asbjørn Steinskog, Jonas Therkelsen and Björn Gambäck

11 A Multilingual Entity Linker Using PageRank and Semantic Graphs
Anton Södergren and Pierre Nugues

12 Linear Ensembles of Word Embedding Models
Avo Muromägi, Kairit Sirts and Sven Laur

13 Using Pseudowords for Algorithm Comparison: An Evaluation Framework for Graph-based Word Sense Induction
Flavio Massimiliano Cecchini, Chris Biemann and Martin Riedl

14 North-Sámi to Finnish rule-based machine translation system
Tommi Pirinen, Francis M. Tyers, Trond Trosterud, Ryan Johnson, Kevin Unhammer and Tiina Puolakainen

15 Machine translation with North Saami as a pivot language
Lene Antonsen, Ciprian Gerstenberger, Maja Kappfjell, Sandra Nystø Ráhka, Marja-Liisa Olthuis, Trond Trosterud and Francis Morton Tyers

16 SWEGRAM – A Web-Based Tool for Automatic Annotation and Analysis of Swedish Texts
Jesper Näsman, Beata Megyesi and Anne Palmér

17 Optimizing a PoS Tagset for Norwegian Dependency Parsing
Petter Hohle, Lilja Øvrelid and Erik Velldal

18 Creating register sub-corpora for the Finnish Internet Parsebank
Veronika Laippala, Juhani Luotolahti, Aki-Juhani Kyröläinen, Tapio Salakoski and Filip Ginter

19 KILLE: a Framework for Situated Agents for Learning Language Through Interaction
Simon Dobnik and Erik de Graaf

20 Data Collection from Persons with Mild Forms of Cognitive Impairment and Healthy Controls - Infrastructure for Classification and Prediction of Dementia
Dimitrios Kokkinakis, Kristina Lundholm Fors, Eva Björkner and Arto Nordlund

21 Evaluation of language identification methods using 285 languages
Tommi Jauhiainen, Krister Lindén and Heidi Jauhiainen

22 Can We Create a Tool for General Domain Event Analysis?
Siim Orasmaa and Heiki-Jaan Kaalep

23 From Treebank to Propbank: A Semantic-Role and VerbNet Corpus for Danish
Eckhard Bick

24 Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
Johannes Bjerva and Robert Östling

25 Will my auxiliary tagging task help? Estimating Auxiliary Tasks Effectivity in Multi-Task Learning
Johannes Bjerva

26 Iconic Locations in Swedish Sign Language: Mapping Form to Meaning with Lexical Databases
Carl Börstell and Robert Östling

27 Docforia: A Multilayer Document Model
Marcus Klang and Pierre Nugues

28 Finnish resources for evaluating language model semantics
Viljami Venekoski and Jouko Vankka

29 Málrómur: A Manually Verified Corpus of Recorded Icelandic Speech
Steinþór Steingrímsson, Jón Guðnason, Sigrún Helgadóttir and Eiríkur Rögnvaldsson

30 The Effect of Translationese on Tuning for Statistical Machine Translation
Sara Stymne

31 Multilingwis2 – Explore Your Parallel Corpus
Johannes Graën, Dominique Sandoz and Martin Volk

32 A modernised version of the Glossa corpus search system
Anders Nøklestad, Kristin Hagen, Janne Bondi Johannessen, Michal Kosek and Joel Priestley

33 Dep_search: Efficient Search Tool for Large Dependency Parsebanks
Juhani Luotolahti, Jenna Kanerva and Filip Ginter

34 Proto-Indo-European Lexicon: The Generative Etymological Dictionary of Indo-European Languages
Jouna Pyysalo

35 Tilde MODEL - Multilingual Open Data for EU Languages
Roberts Rozis and Raivis Skadiņš

36 Mainstreaming August Strindberg with Text Normalization
Adam Ek and Sofia Knuutinen

37 Word vectors, reuse, and replicability: Towards a community repository of large-text resources
Murhaf Fares, Andrey Kutuzov, Stephan Oepen and Erik Velldal

38 Improving Optical Character Recognition of Finnish Historical Newspapers with a Combination of Fraktur & Antiqua Models and Image Preprocessing
Mika Koistinen, Kimmo Kettunen and Tuula Pääkkönen

39 Redefining Context Windows for Word Embedding Models: An Experimental Study
Pierre Lison and Andrey Kutuzov

40 The Effect of Excluding Out of Domain Training Data from Supervised Named-Entity Recognition
Adam Persson

41 Quote Extraction and Attribution from Norwegian Newspapers
Andrew Salway, Paul Meurer, Knut Hofland and Øystein Reigem

42 Wordnet extension via word embeddings: Experiments on the Norwegian Wordnet
Heidi Sand, Erik Velldal and Lilja Øvrelid

43 Universal Dependencies for Swedish Sign Language
Robert Östling, Carl Börstell, Moa Gärdenfors and Mats Wirén

44 Services for text simplification and analysis
Johan Falkenjack, Evelina Rennes, Daniel Fahlborg, Vida Johansson and Arne Jönsson

45 Exploring Properties of Intralingual and Interlingual Association Measures Visually
Johannes Graën and Christof Bless

46 TALERUM - Learning Danish by Doing Danish
Peter Juel Henrichsen

47 Cross-Lingual Syntax: Relating Grammatical Framework with Universal Dependencies
Aarne Ranta, Prasanth Kolachina and Thomas Hallgren

48 Exploring Treebanks with INESS Search
Victoria Rosén, Helge Dyvik, Paul Meurer and Koenraad De Smedt

49 A System for Identifying and Exploring Text Repetition in Large Historical Document Corpora
Aleksi Vesanto, Asko Nivala, Tapio Salakoski, Hannu Salmi and Filip Ginter