Sabrina Dittrich
Department of Linguistics, University of Tübingen, Germany
Zarah Weiss
Department of Linguistics, University of Tübingen, Germany
Hannes Schröter
German Institute for Adult Education – Leibniz Centre for Lifelong Learning, Germany
Detmar Meurers
Department of Linguistics, University of Tübingen, Germany / LEAD Graduate School and Research Network, University of Tübingen, Germany
Published in: Proceedings of the 8th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2019), September 30, Turku, Finland
Linköping Electronic Conference Proceedings 164:5, p. 41-56
NEALT Proceedings Series 39:5, p. 41-56
Published: 2019-09-30
ISBN: 978-91-7929-998-9
ISSN: 1650-3686 (print), 1650-3740 (online)
Reading material that is of interest and at
the right level for learners is an essential
component of effective language education. The web has long been identified as a
valuable source of reading material due to
the abundance and variability of materials
it offers and its broad range of attractive
and current topics. Yet the web as a source
of reading material can be problematic in
low-literacy contexts.
We present ongoing work on a hybrid
approach to text retrieval that combines
the strengths of web search with retrieval
from a high-quality, curated corpus resource. Our system, KANSAS Suche 2.0,
supports retrieval and reranking based on
criteria relevant for language learning in
three different search modes: unrestricted
web search, filtered web search, and corpus search. We demonstrate their
complementary strengths and weaknesses with regard to coverage, readability,
and suitability of the retrieved material for adult literacy and basic
education. We show that combining them yields a versatile text retrieval
approach well suited to education in the language arts.
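
To make the three search modes concrete, the following is a minimal sketch and not the KANSAS Suche 2.0 implementation. Everything in it is an assumption made for illustration: the names `SearchMode`, `Document`, and `retrieve`, the use of a trusted-host list for the filtered mode, substring matching for the corpus mode, and mean sentence length as a stand-in for the reranking criteria, which the abstract does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class SearchMode(Enum):
    WEB = "unrestricted web search"
    FILTERED_WEB = "filtered web search"
    CORPUS = "corpus search"


@dataclass
class Document:
    url: str
    text: str


def mean_sentence_length(text: str) -> float:
    """Crude readability proxy (an assumption): mean words per sentence."""
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def retrieve(mode, query, web_results, corpus, trusted_hosts):
    """Select candidates for the given mode, then rerank by the proxy."""
    if mode is SearchMode.WEB:
        # Hypothetical: web_results are already the hits for `query`.
        candidates = list(web_results)
    elif mode is SearchMode.FILTERED_WEB:
        # Hypothetical filter: keep only hits from a list of trusted hosts.
        candidates = [d for d in web_results
                      if urlparse(d.url).hostname in trusted_hosts]
    else:
        # Hypothetical corpus lookup: case-insensitive substring match.
        candidates = [d for d in corpus if query.lower() in d.text.lower()]
    # Shorter sentences first, as a rough "easier to read" ordering.
    return sorted(candidates, key=lambda d: mean_sentence_length(d.text))
```

In this sketch the filtered mode trades coverage for control over the sources, while the corpus mode returns only curated texts and may return nothing for a query the corpus does not cover.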