Enriching a wordnet from a thesaurus

Sanni Nimb
Society for Danish Language and Literature, Denmark

Bolette S. Pedersen
University of Copenhagen, Denmark

Anna Braasch
University of Copenhagen, Denmark

Nicolai Sørensen
Society for Danish Language and Literature, Denmark

Thomas Troelsgård
Society for Danish Language and Literature, Denmark

In: Proceedings of the workshop on lexical semantic resources for NLP at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 19

Linköping Electronic Conference Proceedings 88:5, pp. 36-50

NEALT Proceedings Series 19:5, pp. 36-50

Published: 2013-05-17

ISBN: 978-91-7519-586-5

ISSN: 1650-3686 (print), 1650-3740 (online)


Wordnets are traditionally built around synonym sets, with the vertical hyponymy relation as the central structuring principle. The hyponymy relation, however, does not necessarily group concepts into synsets that are particularly close from a thematic or functional point of view, a phenomenon sometimes referred to as the “ISA overload” or, from a thematic viewpoint, the “tennis problem”. In this paper we present two experiments. The first concerns a method for remedying these problems by transferring thematic information from a thesaurus to a wordnet (from the Danish Thesaurus to DanNet). This allows us to automatically subdivide co-hyponyms thematically and to relate synsets thematically across parts of speech. Since the thesaurus is not yet complete, the paper describes work in progress; nevertheless, with an error rate below 5% for the most coarse-grained transferred themes, the experiment appears very promising. The second experiment concerns the extension of DanNet via the Danish Thesaurus: the thesaurus's thematic organisation into near-synonyms is further applied as a very precise method for automatically extending the lexical coverage of DanNet.


Wordnet; “tennis problem”; ISA overload; thesaurus; thematic information


