Title
Respondent-driven sampling bias induced by clustering and community structure in social networks.
Abstract
Sampling hidden populations with standard sampling methods is particularly challenging, mainly because of the lack of a sampling frame. Respondent-driven sampling (RDS) is an alternative methodology that exploits the social contacts between peers to reach and weight individuals in these hard-to-reach populations. It is a snowball sampling procedure in which the weight of each respondent is adjusted for the likelihood of being sampled, which differs with the number of contacts. In RDS, the structure of the social contacts thus defines the sampling process and affects its coverage, for instance by constraining the sampling within a sub-region of the network. In this paper we study the bias that network structures such as social triangles, community structure, and heterogeneities in the number of contacts induce in the recruitment trees and in the RDS estimator. We simulate different scenarios of network structures and response rates to study the potential biases one may expect in real settings. We find that the prevalence of the estimated variable is associated with the size of the network community to which the individual belongs. Furthermore, we observe that low-degree nodes may be under-sampled in certain situations if the sample and the network are of similar size. Finally, we also show that low response rates lead to reasonably accurate average estimates of the prevalence but generate relatively large biases.
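The degree-based weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not state which estimator the paper uses, so the code below assumes the common inverse-degree weighting (often called the RDS-II or Volz-Heckathorn estimator), in which each respondent is weighted by the reciprocal of the reported number of contacts. The function name `rds_prevalence` and the sample data are hypothetical and serve only as illustration.

```python
def rds_prevalence(degrees, has_trait):
    """Inverse-degree weighted prevalence estimate for an RDS sample.

    Assumption: respondents with more contacts are more likely to be
    recruited, so each one is weighted by 1 / (reported degree).
    """
    if len(degrees) == 0:
        raise ValueError("the sample is empty")
    if len(degrees) != len(has_trait):
        raise ValueError("degrees and has_trait must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("every respondent must report at least one contact")

    weights = [1.0 / d for d in degrees]
    # Sum the weights of respondents who have the trait of interest.
    weighted_trait = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_trait / sum(weights)


# Hypothetical sample: reported number of contacts and trait indicator.
sample_degrees = [1, 2, 4, 4]
sample_trait = [True, False, True, False]

estimate = rds_prevalence(sample_degrees, sample_trait)  # weighted estimate
naive = sum(sample_trait) / len(sample_trait)            # unweighted sample mean
```

In this toy sample the weighted estimate differs from the unweighted sample mean because the low-degree respondent with the trait receives a larger weight. Such a weighting corrects only for differences in the number of contacts; the structural effects studied in the paper (triangles, communities, response rates) are not accounted for in this sketch.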
Year: 2015
Venue: CoRR
Field: Econometrics, Community structure, Social network, Sampling bias, Sampling (statistics), Respondent, Statistics, Cluster analysis, Mathematics, Estimator, Network structure
DocType: Journal
Volume: abs/1503.05826
Citations: 0
PageRank: 0.34
References: 6
Authors: 4
Name | Order | Citations | PageRank
Luis Enrique Correa da Rocha | 1 | 0 | 0.34
Anna Ekeus Thorson | 2 | 0 | 0.34
Renaud Lambiotte | 3 | 920 | 64.98
Fredrik Liljeros | 4 | 0 | 0.68