Wikipedia is a multilingual encyclopedia written collaboratively by volunteers online, and it is now the largest, most visited encyclopedia in existence. Wikipedia has arisen through the self-organized collaboration of contributors, and since its launch in January 2001 its potential as a research resource has become apparent to scientists. Its appeal lies in the fact that it strikes a middle ground between accurate, manually created, limited-coverage resources and noisy knowledge mined from the web. For this reason, Wikipedia's content has been exploited for a variety of applications: to build knowledge bases, to study interactions between users on the Internet, and to investigate social and cultural issues such as gender bias in history or the spreading of information. Similarly to what happened for the Web at large, a structure has emerged from the collaborative creation of Wikipedia: its articles contain hundreds of millions of links. In Wikipedia parlance, these internal links are called wikilinks. These connections explain the topics covered in articles and provide a way to navigate between different subjects, contextualizing the information and making additional information available. In this thesis, we argue that the information contained in the link structure of Wikipedia can be harnessed to gain useful insights by extracting it with dedicated algorithms. More prosaically, we explore the link structure of Wikipedia with new methods. In the first part, we discuss in depth the characteristics of Wikipedia, and we describe the process of extracting the network of links and the challenges we faced in doing so. Since Wikipedia is available in several language editions and its entire edit history is publicly available, we have extracted the wikilink network at various points in time, and we have performed data integration to improve its quality.
In the second part, we show that the wikilink network can be effectively used to find the pages most relevant to an article provided by the user. We introduce a novel algorithm, called CycleRank, that takes advantage of the link structure of Wikipedia by considering cycles of links, thus giving weight to both incoming and outgoing connections, to produce a ranking of articles with respect to an article chosen by the user. In the last part, we explore applications of CycleRank. First, we describe the Engineroom EU project, where we faced the challenge of finding the Wikipedia pages most relevant to the Wikipedia article about the Internet. Finally, we present another contribution that uses Wikipedia article accesses to estimate how information about diseases propagates. In conclusion, with this thesis we wanted to show that browsing Wikipedia's wikilinks is not only fascinating and serendipitous, but also an effective way to extract useful information that is latent in the user-generated encyclopedia.
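As a rough illustration of the cycle-based idea described above, the following Python sketch scores pages by the short cycles through a reference article that they lie on. It is not the thesis's implementation: the function name `cycle_scores`, the toy link graph, the maximum cycle length `max_len`, and the exponential weight `exp(-length)` are all assumptions made for this example.

```python
import math
from collections import defaultdict


def cycle_scores(graph, reference, max_len=3):
    """Rank nodes by the simple cycles through `reference` that contain them.

    `graph` maps each page title to the list of pages it links to.
    Each cycle of n links adds exp(-n) to every node on it (an assumed
    weighting, chosen here so that shorter cycles count more).
    """
    scores = defaultdict(float)

    def dfs(node, path, on_path):
        for nxt in graph.get(node, ()):
            if nxt == reference:
                # The path closes a cycle with len(path) links.
                # Cycles of length 1 (self-links) are ignored.
                if len(path) >= 2:
                    weight = math.exp(-len(path))
                    for page in path:
                        scores[page] += weight
            elif nxt not in on_path and len(path) < max_len:
                # Extend the simple path, then backtrack.
                on_path.add(nxt)
                path.append(nxt)
                dfs(nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    dfs(reference, [reference], {reference})
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy graph, not real Wikipedia data.
links = {
    "Internet": ["Web", "Email", "Router"],
    "Web": ["Internet", "Email"],
    "Email": ["Internet"],
    "Router": ["Packet"],
    "Packet": [],
}

ranking = cycle_scores(links, "Internet", max_len=3)
for page, score in ranking:
    print(f"{page}: {score:.4f}")
```

On this toy graph, `Router` receives no score: it is linked from `Internet`, but no path leads back. Requiring a cycle is what makes both outgoing and incoming connections count.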
The Dao of Wikipedia: Extracting Knowledge from the Structure of Wikilinks / Consonni, Cristian. - (2019 Oct 24), pp. 1-184.
Title: | The Dao of Wikipedia: Extracting Knowledge from the Structure of Wikilinks |
Publication date: | 2019-10-24 |
Cycle: | XXX |
Academic year: | 2016-2017 |
Department: | Dipartimento di Ingegneria e Scienza dell'Informazione |
Doctoral programme: | Information and Communication Technology |
UniTn authors: | |
Advisor: | Montresor, Alberto |
Supervisors and coordinators: | Velegrakis, Ioannis |
Jointly supervised thesis (cotutelle): | no |
Digital Object Identifier (DOI): | 10.15168/11572_243097 |
Place of publication: | Trento |
Publisher: | Università degli studi di Trento |
From page: | 1 |
To page: | 184 |
Number of pages: | 184 |
Language: | English |
CUN area: | 1 - Scientific area - Mathematical and Computer Sciences; 9 - Technological area - Industrial and Information Engineering |
Scientific disciplinary sector: | INF/01 - Computer Science; ING-INF/05 - Information Processing Systems |
Appears in types: | 08.1 Doctoral Thesis |
Files in this item:
File | Description | Type | License |
---|---|---|---|
main.pdf | Thesis | Doctoral Thesis | Open Access |