Extracting Scientists from Wikipedia

Gustaf Harari Ekenstierna
Lund University, Lund, Sweden

Victor Shu-Ming Lam
Lund University, Lund, Sweden


Published in: Digital Humanities 2016. From Digitization to Knowledge 2016: Resources and Methods for Semantic Processing of Digital Works/Texts, Proceedings of the Workshop, July 11, 2016, Krakow, Poland

Linköping Electronic Conference Proceedings 126:3, pp. 13-20


Published: 2016-07-08

ISBN: 978-91-7685-733-5

ISSN: 1650-3686 (print), 1650-3740 (online)


The Internet is, among other things, a very large and continuously growing source of information and knowledge. This knowledge takes the form of text, images, databases, tables, and more. In this article, we describe a system that gathers information from Wikipedia articles and existing data from Wikidata, combines the two, and stores the result in a searchable database. The system is intended to make finding scientists both quicker and easier.




