Conference article

Finite state applications with Javascript

Mans Hulden
University of Helsinki, Finland

Miikka Silfverberg
University of Helsinki, Finland

Jerid Francom
Wake Forest University, USA

Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16

Linköping Electronic Conference Proceedings 85:41, pp. 441-446

NEALT Proceedings Series 16:41, pp. 441-446

Published: 2013-05-17

ISBN: 978-91-7519-589-6

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

In this paper we present a simple and useful Javascript application programming interface for performing basic online operations with weighted and unweighted finite-state machines, such as word lookup, transductions, and least-cost-path finding. The library, jsfst, provides access to frequently used online functionality in finite-state machine-based language technology. The library is technology-agnostic in that it uses a neutral representation of finite-state machines into which most formats can be converted. We demonstrate the usefulness of the library through addressing a task that is useful in web and mobile environments: a multilingual spell checker application that also detects real-word errors.
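
The three operations named in the abstract (word lookup, transduction, and least-cost-path finding) can be illustrated with a minimal sketch over a plain JavaScript object. This is not the jsfst API or its file format: the object layout and the names `fst`, `transduce`, `accepts`, and `bestPath` are hypothetical, chosen only for illustration, and the sketch assumes a machine without epsilon transitions on the input side.

```javascript
// Hypothetical neutral representation (an assumption, not the jsfst format):
// transitions[state][inputSymbol] is a list of arcs { out, to, w }.
const fst = {
  start: 0,
  finals: { 5: 0.0 }, // final state -> final weight
  transitions: {
    0: { w: [{ out: "w", to: 1, w: 0.0 }] },
    1: { a: [{ out: "a", to: 2, w: 0.0 }] },
    2: { l: [{ out: "l", to: 3, w: 0.0 }] },
    3: { k: [{ out: "k", to: 4, w: 0.0 }] },
    // Two competing analyses of the final "s", with different costs.
    4: {
      s: [
        { out: "+N+Pl", to: 5, w: 2.0 },
        { out: "+V+3Sg", to: 5, w: 1.0 },
      ],
    },
  },
};

// Transduction: follow every matching arc for each input symbol, then
// keep the paths that end in a final state, cheapest first.
function transduce(machine, input) {
  let configs = [{ state: machine.start, output: "", weight: 0 }];
  for (const symbol of input) {
    const next = [];
    for (const cfg of configs) {
      const arcs = (machine.transitions[cfg.state] || {})[symbol] || [];
      for (const arc of arcs) {
        next.push({
          state: arc.to,
          output: cfg.output + arc.out,
          weight: cfg.weight + arc.w,
        });
      }
    }
    configs = next;
  }
  return configs
    .filter((cfg) => cfg.state in machine.finals)
    .map((cfg) => ({
      output: cfg.output,
      weight: cfg.weight + machine.finals[cfg.state],
    }))
    .sort((a, b) => a.weight - b.weight);
}

// Word lookup: the input is accepted if at least one path reaches a final state.
const accepts = (machine, input) => transduce(machine, input).length > 0;

// Least-cost path: the cheapest accepted analysis, or null if there is none.
const bestPath = (machine, input) => transduce(machine, input)[0] || null;
```

With this toy machine, `transduce(fst, "walks")` returns both analyses ordered by weight, and `bestPath(fst, "walks")` picks the cheaper one. Enumerating all paths is adequate for short words; a larger machine would more likely use a priority-queue search to find the least-cost path.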

Keywords

Finite-state technology; Javascript; spell checking; perceptrons

