Title
Linguistic influence patterns within the global network of Wikipedia language editions.
Abstract
The Internet is highly multilingual, and its content is created, shared, debated and shaped within many different language-speaking communities. These communities do not exist in isolation, but communicate and influence each other's interests, just as in the offline world. Quantifying this influence is, however, a non-trivial task, as these communities are usually spread across multiple heterogeneous platforms. In this work, we set out to measure the influence of languages on each other by observing concept overlap between the 110 largest Wikipedia language editions. We describe experiments to test whether language overlap in concept coverage is a random process, and find that edition size is a strong predictor of higher concept overlap, with English–German being the most frequently co-occurring pair (45%). Both small and large editions co-occur more frequently than expected with editions of similar size, but co-occurrences across groups are below what is expected by chance. Additionally, by applying network analysis, we find that the hierarchy of language interconnections differs depending on the locality of topics: for interlingually popular topics, the dominance of English, German and French is pronounced, while for topics with a local reach, geographical and cultural proximity as well as common heritage better explain co-occurrence.
Year
2015
DOI
10.1145/2786451.2786497
Venue
WebSci
Field
World Wide Web, Locality, Social media, Global network, Computer science, Information warfare, Network analysis, Hierarchy, Linguistics, German, The Internet
DocType
Conference
Citations
1
PageRank
0.38
References
1
Authors
5
Name               Order  Citations  PageRank
Anna Samoilenko    1      16         2.75
Fariba Karimi      2      52         6.49
Jérôme Kunegis     3      874        51.20
Daniel Edler       4      3          1.15
Markus Strohmaier  5      1210       102.65