Comparing Covariation Among Vaccine Hesitancy and Broader Beliefs Within Twitter and Survey Data

Published in: PLoS ONE, Volume 15, No. 10, e0239826 (2020). DOI: 10.1371/journal.pone.0239826

Posted on rand.org Jun 14, 2024

by Sarah A. Nowak, Christine Chen, Andrew M. Parker, Courtney A. Gidengil, Luke J. Matthews

Over the past decade, the percentage of adults in the United States who use some form of social media has roughly doubled, increasing from 36 percent in early 2009 to 72 percent in 2019. There has been a corresponding increase in research aimed at understanding opinions and beliefs that are expressed online. However, the generalizability of findings from social media research is a subject of ongoing debate. Social media platforms are conduits of both information and misinformation about vaccines and vaccine hesitancy. Our research objective was to examine whether we can draw similar conclusions from Twitter and national survey data about the relationship between vaccine hesitancy and a broader set of beliefs. In 2018, we conducted a nationally representative survey of parents in the United States, informed by a literature review, to ask their views on a range of topics, including vaccine side effects, conspiracy theories, and understanding of science. We developed a set of keyword-based queries corresponding to each of the belief items from the survey and pulled matching tweets from 2017, the most recent full year of data available when we performed the data pull in 2018. Our primary measures of belief covariation were the loadings and scores of the first principal components obtained using principal component analysis (PCA) from the two sources. We found that, after using manually coded weblinks in tweets to infer stance, there was good qualitative agreement between the first principal component loadings and scores using survey and Twitter data. This held true after we took the additional processing step of resampling the Twitter data based on the number of topics that an individual tweeted about, as a means of correcting for differential representation for elicited (survey) vs. volunteered (Twitter) beliefs. Overall, the results show that analyses using Twitter data may be generalizable in certain contexts, such as assessing belief covariation.
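
For readers unfamiliar with the measures named above, the following is a minimal sketch of how first principal component loadings and scores can be computed for two data sources and then compared. It is not the authors' code. The data are synthetic, the five belief items are hypothetical, and the use of cosine similarity as an agreement measure is an assumption made for illustration; the study itself reports qualitative agreement.

```python
import numpy as np

def first_principal_component(X):
    """Return (loadings, scores) of the first principal component of X.

    X: 2-D array, rows = individuals (survey respondents or Twitter users),
       columns = belief items.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each belief item
    # SVD of the centered data: rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[0]
    # The sign of a principal component is arbitrary; fix it so the
    # largest-magnitude loading is positive, making sources comparable.
    if loadings[np.argmax(np.abs(loadings))] < 0:
        loadings = -loadings
    scores = Xc @ loadings  # projection of each individual onto the component
    return loadings, scores

def loading_agreement(a, b):
    """Cosine similarity between two loading vectors (illustrative measure)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-in data, NOT the study's data: one latent attitude drives
# five hypothetical belief items in both sources, plus independent noise.
rng = np.random.default_rng(0)
true_loadings = np.array([[0.9, 0.8, -0.7, 0.6, 0.1]])
survey = rng.normal(size=(500, 1)) @ true_loadings + 0.3 * rng.normal(size=(500, 5))
twitter = rng.normal(size=(800, 1)) @ true_loadings + 0.3 * rng.normal(size=(800, 5))

survey_loadings, survey_scores = first_principal_component(survey)
twitter_loadings, twitter_scores = first_principal_component(twitter)
agreement = loading_agreement(survey_loadings, twitter_loadings)

print("survey loadings: ", np.round(survey_loadings, 2))
print("twitter loadings:", np.round(twitter_loadings, 2))
print("cosine similarity:", round(agreement, 3))
```

Because both synthetic sources share the same underlying structure, the two loading vectors point in nearly the same direction. With real data, the stance-inference and resampling steps described in the abstract would come before this comparison.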

This report is part of the RAND external publication series. Many RAND studies are published in peer-reviewed scholarly journals, as chapters in commercial books, or as documents published by other organizations.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.