IPhraxtor - A linguistically informed system for extraction of term candidates

Magnus Merkel
Department of Computer and Information Science, Linköping University, Sweden

Jody Foo
Department of Computer and Information Science, Linköping University, Sweden

Lars Ahrenberg
Department of Computer and Information Science, Linköping University, Sweden


Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16

Linköping Electronic Conference Proceedings 85:14, pp. 121-132

NEALT Proceedings Series 16:14, pp. 121-132


Published: 2013-05-17

ISBN: 978-91-7519-589-6

ISSN: 1650-3686 (print), 1650-3740 (online)


In this paper a method and a flexible tool for performing monolingual term extraction are presented, based on syntactic analysis where information on parts of speech, syntactic functions and surface syntax tags can be utilised. The standard approaches to evaluating term extraction, namely manual evaluation of the top n term candidates or comparison to a gold standard consisting of a list of terms from a specific domain, have their advantages, but in this paper we try to realise a proposal by Bernier-Colborne (2012) where extracted terms are compared to a gold standard consisting of a test corpus where terms have been annotated in context. Apart from applying this evaluation to different configuratio


Computational terminology; term extraction; evaluation; terminological work


Ananiadou, S. 1994. A methodology for automatic term recognition. In Proceedings of the 15th conference on Computational linguistics, pages 1034–1038, Morristown, NJ, USA. Association for Computational Linguistics.

Bernier-Colborne, G. (2012). Defining a Gold Standard for the evaluation of Term Extractors. In: Proceedings of the colabTKR workshop Terminology and Knowledge Representation, Istanbul, Turkey.

Justeson, J.S. & Katz, S.M. 1995. Technical terminology: some linguistic properties and an algorithm for identification in text. Natural Language Engineering 1:9-27.

Bourigault, 1992; Merkel, Foo. 2007. Terminology extraction and term ranking for standardizing term banks. In Proceedings of NODALIDA’07, Tartu, Estonia.

Mustafa El Hadi, W., Timimi, I., Dabadie, M., Choukri, K., Hamon, O., Chiao, Y.-C. (2006). Terminological resources acquisition tools: Toward a user-oriented evaluation model. In Proceedings of LREC’06, pp. 945-958, Genova, Italy.

Nazarenko, A. & Zargayouna, H. (2009). Evaluating term extraction. In: Proceedings of the International Conference RANLP 2009, pp. 299-304, Borovets, Bulgaria.

Terminologicentrum (2012). Rikstermbanken. http://www.rikstermbanken.se. Terminologicentrum, Stockholm.

Tapanainen, P. & Järvinen, T. (1997). A non-projective dependency parser. In Proceedings of the fifth conference on Applied Natural Language Processing (pp. 64–71). Washington, DC, USA. Stroudsburg, PA, USA: Association for Computational Linguistics.

Zhang, Z., Iria, J., Brewster, C. & Ciravegna, F. (2008). A comparative evaluation of term recognition algorithms. In: Proceedings of LREC’08, pp. 2108-2113, Marrakech, Morocco.
