Network Sampling with Memory: A Proposal for More Efficient Sampling from Social Networks

Title: Network Sampling with Memory: A Proposal for More Efficient Sampling from Social Networks
Publication Type: Journal Article
Year of Publication: 2012
Authors: Ted Mouw, Ashton M. Verdery
Journal: Sociological Methodology
Volume: 42
Issue: 1
Pagination: 206-256
ISSN: 00811750
Abstract

Techniques for sampling from networks have grown into an important area of research across several fields. For sociologists, the possibility of sampling from a network is appealing for two reasons: (1) A network sample can yield substantively interesting data about network structures and social interactions, and (2) it is useful in situations in which study populations are difficult or impossible to survey with traditional sampling approaches because of the lack of a sampling frame. Despite its appeal, methodological concerns about the precision and accuracy of network-based sampling methods remain. In particular, recent research has shown that sampling from a network using a random walk–based approach such as respondent-driven sampling (RDS) can result in a high design effect (DE): the ratio of the sampling variance to the sampling variance of simple random sampling (SRS). A high DE means that more cases must be collected to achieve the same level of precision as SRS. In this article, we propose an alternative strategy, network sampling with memory (NSM), which collects network data from respondents to reduce DEs and, correspondingly, the number of interviews needed to achieve a given level of statistical power. NSM combines a “list” mode, in which all individuals on the revealed network list are sampled with the same cumulative probability, with a “search” mode, which gives priority to bridge nodes connecting the current sample to unexplored parts of the network. We test the relative efficiency of NSM compared with RDS and SRS on 162 school and university networks from the National Longitudinal Study of Adolescent Health and Facebook that range in size from 110 to 16,278 nodes. The results show that the average DE for NSM on these 162 networks is 1.16, which is very close to the efficiency of a simple random sample (DE = 1) and 98.5 percent lower than the average DE we observed for RDS.
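The design effect defined in the abstract translates directly into interview counts. The following is a minimal sketch of that arithmetic, not code from the article: the function names are hypothetical, and it assumes the usual approximation that sampling variance scales inversely with sample size, so a design with design effect DE needs about DE times as many interviews as SRS for the same precision.

```python
def design_effect(design_variance: float, srs_variance: float) -> float:
    """DE = sampling variance under the design / sampling variance under SRS."""
    if srs_variance <= 0:
        raise ValueError("srs_variance must be positive")
    return design_variance / srs_variance


def interviews_needed(srs_sample_size: int, de: float) -> float:
    """Interviews a design with design effect `de` needs to match the
    precision of an SRS of size `srs_sample_size` (assumes variance ~ 1/n)."""
    if de <= 0:
        raise ValueError("de must be positive")
    return srs_sample_size * de


def effective_sample_size(n: int, de: float) -> float:
    """Size of the SRS that would be as precise as `n` interviews
    collected under a design with design effect `de`."""
    if de <= 0:
        raise ValueError("de must be positive")
    return n / de


if __name__ == "__main__":
    # Average DE reported in the abstract for NSM; the SRS size of 1,000
    # is an illustrative assumption, not a figure from the article.
    nsm_de = 1.16
    print(round(interviews_needed(1000, nsm_de)))
    print(round(effective_sample_size(1000, nsm_de)))
```

Under these assumptions, a design with the reported average DE of 1.16 would need roughly 16 percent more interviews than SRS to reach the same precision.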

DOI: 10.1177/0081175012461248
Short Title: Network Sampling with Memory