How Can Big Data Help Us Study Rhetorical History?

Jon Viklund
Department of Literature, Uppsala University, Sweden

Lars Borin
Språkbanken/Department of Swedish, University of Gothenburg, Sweden


Part of: Selected Papers from the CLARIN Annual Conference 2015, October 14–16, 2015, Wroclaw, Poland

Linköping Electronic Conference Proceedings 123:7, pp. 79–93

NEALT Proceedings Series 28:7, pp. 79–93


Published: 2016-04-11

ISBN: 978-91-7685-765-6

ISSN: 1650-3686 (print), 1650-3740 (online)


Rhetorical history is traditionally studied through rhetorical treatises or selected rhetorical practices, for example the speeches of major orators. Although valuable, these sources do not answer all our questions. Indeed, a focus on a few canonical works or major historical figures might even lead us to reproduce cultural self-identifications and false generalizations. However, thanks to the increasing availability of relevant digitized texts, it is now possible to formulate new research questions, to address old ones from a new angle, and to verify established results on the basis of exhaustive collections of data rather than small samples, although a methodology for doing so has not yet established itself. The aim of this paper is twofold: (1) We wish to demonstrate the usefulness of large-scale corpus studies (“text mining”) in the field of rhetorical history, and to point to some interesting research problems and how they can be analyzed using “big-data” methods. (2) In doing this, we also aim to contribute to method development in e-science for the humanities and social sciences, particularly within the framework of CLARIN.




