Conference article

Helping Swedish words come to their senses: word-sense disambiguation based on sense associations from the SALDO lexicon

Ildikó Pilán
Språkbanken, Dept. of Swedish, University of Gothenburg, Gothenburg, Sweden

Published in: Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, May 11-13, 2015, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 109:36, pp. 275-279

NEALT Proceedings Series 23:36, pp. 275-279

Published: 2015-05-06

ISBN: 978-91-7519-098-3

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

We report the initial results of a word-sense disambiguation experiment that aims to identify the correct sense of Swedish nouns and verbs in a sentence using a lexical-semantic resource, SALDO. This resource, which contains associations between word senses, has not previously been used for this purpose. The proposed method is based on overlaps among lists of hierarchically organized related word senses. Overall, our approach proved more effective for nouns: not only was the accuracy score higher for nouns (56%) than for verbs (46%), but a sense overlap was also found in 22% more of the cases for nouns. Through an in-depth analysis of the predictions, we identified a number of ways in which the system could be modified or extended to improve performance.
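The overlap idea summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation and does not use SALDO data: the sense identifiers, the `RELATED` table, and the `disambiguate` function are hypothetical names invented for this example, and the scoring (a plain count of shared senses) is an assumption.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy data: each candidate sense maps to a set of related senses.
# The identifiers are invented for illustration and are not taken from SALDO.
RELATED: Dict[str, Set[str]] = {
    "bank..1": {"money..1", "institution..1", "loan..1"},
    "bank..2": {"river..1", "shore..1", "sand..1"},
}


def disambiguate(
    candidates: List[str],
    context_senses: Set[str],
    related: Dict[str, Set[str]],
) -> Optional[str]:
    """Return the candidate whose related senses overlap most with the context.

    Returns None when no candidate shares any sense with the context,
    which corresponds to the case where no sense overlap is found.
    """
    best: Optional[str] = None
    best_overlap = 0
    for sense in candidates:
        # Count the senses shared by this candidate and the sentence context.
        overlap = len(related.get(sense, set()) & context_senses)
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best


# Senses assumed to be associated with the other words of the sentence.
context = {"river..1", "water..1"}
chosen = disambiguate(["bank..1", "bank..2"], context, RELATED)
```

In this toy setting `chosen` is `"bank..2"`, because only that candidate shares a related sense with the context. Ties are resolved in favor of the first candidate listed, which is one of several possible design choices.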

