In qualitative projects on ICALL (Intelligent Computer-Assisted Language Learning), research and development always go hand in hand: development both depends upon the research results and dictates the research agenda. The Swedish ICALL platform Lärka is no exception: the practical issues of its development have dictated its research agenda. With NLP approaches, the need for reliable training data sooner or later becomes unavoidable. At the moment, Lärka’s research agenda cannot be addressed without access to reliable training data, a so-called “gold standard”. This paper gives an overview of the current state of the Swedish ICALL platform development and the related research agenda, and describes the first attempts to collect a reference corpus (“gold standard”) from course books used in CEFR-based language teaching.