Determining the most frequent senses using Russian linguistic ontology RuThes

Natalia Loukachevitch
Lomonosov Moscow State University, Moscow, Russia

Ilia Chetviorkin
Lomonosov Moscow State University, Moscow, Russia


Published in: Proceedings of the Workshop on Semantic resources and Semantic Annotation for Natural Language Processing and the Digital Humanities at NODALIDA 2015, Vilnius, 11th May, 2015

Linköping Electronic Conference Proceedings 112:4, pp. 21–27

NEALT Proceedings Series 27:4, pp. 21–27


Published: 2015-05-06

ISBN: 978-91-7519-049-5

ISSN: 1650-3686 (print), 1650-3740 (online)


The paper describes a supervised approach to detecting the most frequent senses of words, based on the RuThes thesaurus, a large linguistic ontology for Russian. Because RuThes contains a large number of monosemous multiword expressions and a set of relations between its entries, it is possible to compute several context features for ambiguous words and to study how each feature contributes to a supervised model for detecting frequent senses.
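
The abstract does not list the individual features, so the following is only a rough sketch of the kind of feature that monosemous multiword expressions make possible, not the authors' implementation. It assumes a hypothetical mapping from each sense of an ambiguous word to unambiguous phrases related to that sense, and measures how often those phrases occur in a corpus. The function name `sense_support`, the English toy data, and the share-based normalisation are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy data: each sense of the ambiguous word "bank" is linked
# to monosemous multiword expressions (an assumption for illustration only).
SENSE_TO_PHRASES = {
    "bank (finance)": ["central bank", "bank account"],
    "bank (river)": ["river bank"],
}

DOCUMENTS = [
    "The central bank raised rates.",
    "She opened a bank account at the central bank.",
    "They walked along the river bank.",
]


def sense_support(sense_to_phrases, documents):
    """Return, for each sense, its share of all related-phrase occurrences.

    Occurrences are counted by case-insensitive substring matching, which is
    a simplification; a real system would match lemmatised tokens.
    """
    counts = Counter()
    for sense, phrases in sense_to_phrases.items():
        for doc in documents:
            text = doc.lower()
            counts[sense] += sum(text.count(p.lower()) for p in phrases)
    total = sum(counts.values())
    # With no matches at all, every sense gets a share of 0.0.
    return {s: (counts[s] / total if total else 0.0) for s in sense_to_phrases}


scores = sense_support(SENSE_TO_PHRASES, DOCUMENTS)
print(scores)
print(max(scores, key=scores.get))
```

In a supervised setting, a score like this would be one of several inputs to a classifier trained on words whose most frequent sense is already known, rather than a decision rule on its own.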


lexical sense; lexical disambiguation; linguistic ontology; multiword expressions


