Conference article

Towards automatic tracking of lexical change: Linking historical lexical resources

Malin Ahlberg
Språkbanken / Department of Swedish, University of Gothenburg, Sweden

Peter Andersson
Språkbanken / Department of Swedish, University of Gothenburg, Sweden

Published in: Proceedings of the workshop on computational historical linguistics at NODALIDA 2013, May 22-24, 2013, Oslo, Norway. NEALT Proceedings Series 18

Linköping Electronic Conference Proceedings 87:1, p. 1-10

NEALT Proceedings Series 18:1, p. 1-10

Published: 2013-05-17

ISBN: 978-91-7519-587-2

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

In historical linguistics, large-scale corpus studies are a key component in identifying phenomena such as language variation and language change. Manual corpus studies are very time-consuming, and they risk missing interesting changes, because phenomena that are not specifically searched for are easily overlooked. In recent years, the potential of language technology tools for historical linguistic research has been put forward. This paper is based on an experiment in linking several Swedish lexical resources, which together reflect the vocabulary from Old Swedish to Contemporary Swedish. The link-up aims at identifying potential lexical change, such as cases of grammaticalization, and it may also be of use in other language technology applications. In our case study we link lemmas together with the part-of-speech information given in each entry, across all the lexical resources. This paper describes our first results, where we focus on the cases in which the word class information differs in one of the resources. In future studies it will be necessary and desirable to include more digitized lexical resources and to confirm the analyses with corpus research. Still, the current results already show some interesting cases of semantic change and grammaticalization. Changes in the content word system, such as generalization and specialization of meaning, are also exemplified in our data. Even though the links sometimes contain errors that at first sight lead towards a wrong conclusion, we believe that methods like the one used here can be very fruitful for future research, making historical linguistics research more efficient.
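
The linking step described in the abstract can be illustrated with a minimal sketch. It assumes that each resource is reduced to (lemma, part-of-speech) pairs and that lemmas are linked by exact string match. The abstract does not specify the actual linking procedure or data format, so the resource names, the placeholder lemmas, and the function names `link_lemmas` and `pos_mismatches` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicons: one list of (lemma, part-of-speech) pairs per
# resource. The lemmas are placeholders, not entries from the real resources.
LEXICONS = {
    "old": [("alpha", "verb"), ("beta", "noun")],
    "early_modern": [("alpha", "verb"), ("beta", "noun"), ("gamma", "noun")],
    "contemporary": [("alpha", "adverb"), ("beta", "noun")],
}


def link_lemmas(lexicons):
    """Group the part-of-speech tags of each lemma by resource.

    Assumption: lemmas are linked by exact string match, which ignores
    spelling variation between language stages.
    """
    links = defaultdict(dict)
    for resource, entries in lexicons.items():
        for lemma, pos in entries:
            links[lemma].setdefault(resource, set()).add(pos)
    return dict(links)


def pos_mismatches(links):
    """Return lemmas attested in two or more resources with differing tags."""
    flagged = {}
    for lemma, by_resource in links.items():
        if len(by_resource) < 2:
            continue  # a lemma in a single resource cannot show a difference
        tag_sets = list(by_resource.values())
        if any(tags != tag_sets[0] for tags in tag_sets[1:]):
            flagged[lemma] = by_resource
    return flagged


if __name__ == "__main__":
    for lemma, by_resource in pos_mismatches(link_lemmas(LEXICONS)).items():
        print(lemma, {name: sorted(tags) for name, tags in by_resource.items()})
```

A flagged lemma is only a candidate for lexical change. As the abstract notes, some links are erroneous, so each candidate would still need to be checked by hand or against corpus data.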

Keywords

Lexical linking; historical linguistics; language change
