Conference article

The Lemlat 3.0 Package for Morphological Analysis of Latin

Marco Passarotti
CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Italy

Marco Budassi
Università degli Studi di Pavia, Italy

Eleonora Litta
CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Italy

Paolo Ruffolo
CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Italy

In: Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language

Linköping Electronic Conference Proceedings 133:6, p. 24-31

NEALT Proceedings Series 32:6, p. 24-31

Published: 2017-05-10

ISBN: 978-91-7685-503-4

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

This paper introduces the main components of the downloadable package for version 3.0 of Lemlat, a morphological analyser for Latin. It details how the tool analyses word forms and handles spelling variation, describes the available output formats, and explains how the results are linked to a recently built resource for the derivational morphology of Latin. A light evaluation of the tool’s lexical coverage against a diachronic vocabulary of the entire Latin world is also provided.

Keywords

No keywords are available

References

Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri. 2007. OpenFst: A General and Efficient Weighted Finite-State Transducer Library. In Proceedings of the Twelfth International Conference on Implementation and Application of Automata, (CIAA 2007), volume 4783 of Lecture Notes in Computer Science, pages 11–23, Springer, Prague, Czech Republic.

David Bamman and Gregory Crane. 2006. The Design and Use of a Latin Dependency Treebank. In TLT 2006: Proceedings of the Fifth International Treebanks and Linguistic Theories Conference, pages 67–78.

Andrea Bozzi and Giuseppe Cappelli. 1990. A Project for Latin Lexicography: 2. A Latin Morphological Analyser. Computers and the Humanities, 24:421–426.

Marco Budassi and Marco Passarotti. 2016. Nomen Omen. Enhancing the Latin Morphological Analyser Lemlat with an Onomasticon. In Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2016), pages 90–94, The Association for Computational Linguistics, Berlin.

Chris Culy, Eleonora Litta, and Marco Passarotti. Forthcoming. Visual Exploration of Latin Derivational Morphology. In Proceedings of the 30th International Conference of the Florida Artificial Intelligence Research Society (FLAIRS-30).

Markus Dreyer, Jason R. Smith, and Jason Eisner. 2008. Latent-variable modeling of string transductions with finite-state methods. In Proceedings of the conference on empirical methods in natural language processing, pages 1080–1089, Association for Computational Linguistics.

Charles du Fresne Du Cange et al. 1883-1887. Glossarium Mediae et Infimae Latinitatis. L. Favre, Niort.

Steffen Eger, Tim vor der Brück, and Alexander Mehler. 2015. Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization methods. In Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2015), pages 105–113.

Egidio Forcellini. 1940. Lexicon Totius Latinitatis / ad Aeg. Forcellini lucubratum, dein a Jos. Furlanetto emendatum et auctum; nunc demum Fr. Corradini et Jos. Perin curantibus emendatius et auctius melioremque in formam redactum adjecto altera quasi parte Onomastico totius latinitatis opera et studio ejusdem Jos. Perin. Typis Seminarii, Padova.

Michèle Fruyt. 2011. Word Formation in Classical Latin. In James Clackson (ed.), A companion to the Latin language. Vol. 132, pages 157–175, John Wiley & Sons, Chichester.

Karl E. Georges and Heinrich Georges. 1913-1918. Ausführliches Lateinisch-Deutsches Handwörterbuch. Hahn, Hannover.

Peter G.W. Glare. 1982. Oxford Latin Dictionary. Oxford University Press, Oxford.

Otto Gradenwitz. 1904. Laterculi Vocum Latinarum. Hirzel, Leipzig.

Dag T.T. Haug and Marius L. Jøhndal. 2008. Creating a Parallel Treebank of the Old Indo-European Bible Translations. In Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008), pages 27–34.

Mans Hulden. 2009. Foma: a finite-state compiler and library. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pages 29–32.

Paul Jenks. 1911. A Manual of Latin Word Formation for Secondary Schools. DC Heath & Company, Boston, New York etc.

Lauri Karttunen and Kenneth R. Beesley. 2005. Twenty-five years of finite-state morphology. In Inquiries Into Words, a Festschrift for Kimmo Koskenniemi on his 60th Birthday, pages 71–83.

Heinrich Keil. 1855-1880. Grammatici Latini. Teubner, Leipzig.

Kimmo Koskenniemi. 1983. Two-level morphology: A general computational model for word-form recognition and production. Publication 11, University of Helsinki, Department of General Linguistics, Helsinki.

Krister Lindén, Miikka Silfverberg, and Tommi Pirinen. 2009. HFST Tools for Morphology – An Efficient Open-Source Package for Construction of Morphological Analyzers. In Cerstin Mahlow and Michael Piotrowski (eds.), Proceedings of the Workshop on Systems and Frameworks for Computational Morphology, volume 41 of Lecture Notes in Computer Science, pages 28–47, Springer, Zurich, Switzerland.

Eleonora Litta, Marco Passarotti, and Chris Culy. 2016. Formatio formosa est. Building a Word Formation Lexicon for Latin. In Anna Corazza, Simonetta Montemagni, and Giovanni Semeraro (eds.), Proceedings of the Third Italian Conference on Computational Linguistics (CLiC–it 2016). 5-6 December 2016, Napoli, Italy, pages 185–189, Accademia university press, Collana dell’Associazione Italiana di Linguistica Computazionale, vol. 2.

Valeria Lomanto. 1980. Lessici latini e lessicografia automatica. Memorie dell’Accademia delle Scienze di Torino, 5.4.2:111–269.

Nino Marinone. 1983. A project for a Latin lexical data base. Linguistica Computazionale, 3:175–187.

Nino Marinone. 1990. A Project for Latin Lexicography: 1. Automatic Lemmatization and Word-List. Computers and the Humanities, 24:417–420.

Dat Quoc Nguyen, Dai Quoc Nguyen, Dang Duc Pham, and Son Bao Pham. 2014. RDRPOSTagger: A Ripple Down Rules-based Part-Of-Speech Tagger. In Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 17–20.

Marco Passarotti. 2004. Development and perspectives of the Latin morphological analyser LEMLAT. Linguistica Computazionale, XX-XXI: 397–414.

Marco Passarotti. 2009. Theory and Practice of Corpus Annotation in the Index Thomisticus Treebank. Lexis, 27:5–23.

Tommi A. Pirinen. 2015. Using weighted finite state morphology with VISL CG-3 – Some experiments with free open source Finnish resources. In Proceedings of the Workshop on “Constraint Grammar - methods, tools and applications” at NODALIDA 2015, May 11-13, 2015, pages 29–33, Institute of the Lithuanian Language, Vilnius, Lithuania. No. 113. Linköping University Electronic Press.

Helmut Schmid. 1999. Improvements in part-of-speech tagging with an application to German. In Natural language processing using very large corpora, pages 13–25, Springer.

Helmut Schmid. 2005. A Programming Language for Finite State Transducers. In Proceedings of the 5th International Workshop on Finite State Methods in Natural Language Processing (FSMNLP 2005), Helsinki, Finland.

Uwe Springmann, Helmut Schmid, and Dietmar Najock. 2016. LatMor: A Latin Finite-State Morphology Encoding Vowel Quantity. In Giuseppe Celano and Gregory Crane (eds.), Treebanking and Ancient Languages: Current and Prospective Research (Topical Issue), Open Linguistics vol. 2, pages 386–392.

Jana Straková, Milan Straka, and Jan Hajič. 2014. Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 13–18.

Paul Tombeur (ed.). 1998. Thesaurus formarum totius latinitatis a Plauto usque ad saeculum XXum. Brepols, Turnhout.
