Conference article

Using weighted finite state morphology with VISL CG-3 - Some experiments with free open source Finnish resources

Tommi A Pirinen
Ollscoil Chathair Bhaile Átha Cliath, CNGL—School of Computing, Dublin City University, Dublin, Ireland


Published in: Proceedings of the Workshop on “Constraint Grammar - methods, tools and applications” at NODALIDA 2015, May 11-13, 2015, Institute of the Lithuanian Language, Vilnius, Lithuania

Linköping Electronic Conference Proceedings 113:5, pp. 29-33

NEALT Proceedings Series 24:5, pp. 29-33


Published: 2015-06-17

ISBN: 978-91-7519-037-2

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

Traditionally, the coupling of finite-state morphology and constraint grammar has been strictly rule-based, making binary distinctions between allowed and disallowed readings. In recent years, however, much of the research on finite-state morphologies has adopted the contemporary paradigm of statistically weighted analysis. This is reflected in the finite-state morphology component of current versions of omorfi, the free and open source morphology of Finnish. In this paper we examine two strategies for making use of the weights as part of a VISL CG-3 pipeline. We evaluate the results intrinsically, on a small sample of analyses that we have disambiguated by hand, and extrinsically, on the effect they have on the rule-based machine translation of that text using the freely available open source translator apertium-fin-eng.


