Even as the U.S. midterm elections loom, academic researchers continue to publish new datasets and results drawn from the 2020 election cycle. This work offers insight into the role tech platforms played in spreading the mis- and disinformation that has contributed to widespread doubts about election legitimacy among a majority of Republican voters.
Much of the published research on the subject to date is focused on social media platforms, including Facebook, Twitter and YouTube. Now, a new peer-reviewed study and accompanying dataset published in the Journal of Online Trust and Safety considers the role of Google Search in the spread of content that undermines confidence in election outcomes.
In Auditing Google’s Search Headlines as a Potential Gateway to Misleading Content: Evidence from the 2020 US Election, a team led by University of Washington researchers Himanshu Zade and Morgan Wack set out to “investigate whether and potentially how Google served as a gateway to content that may have undermined trust in election processes, institutions, and results.”
Working with a dataset of over 800,000 headlines drawn from Google search engine results for election-related keywords, the researchers “present results from qualitative coding of 5,600 headlines focused on the prevalence of delegitimizing information.” The searches, which were experimentally conducted from 20 cities in 16 states, also permitted an analysis of differences across geographies with different politics, from Los Angeles to Houston to Poughkeepsie.
The researchers identified four key questions:
Question One: How do the SERP (search engine results page) verticals, namely search results, stories, videos, and advertisements, differ in the amount of misleading content?
Question Two: How does one’s location in a specific city, split by population and party representation, change the kind of election content found in search results?
Question Three: Do different search terms lead to different search result quality?
Question Four: Which online news domains served as the most frequent gateways to content that may have undermined trust during the election period?
The experimental set included ten general search terms, such as “election results,” “ballots,” “mail-in voting,” and “where do I vote,” as well as ten “conspiratorial terms,” such as “rigged election,” “voter fraud,” and “ballot harvesting.” A crucial part of the methodology was the hand-coding of a random sample of the results, undertaken after consultation with journalists and researchers with expertise in the construction of headlines.
The researchers find that “for searches based on general election terms, Google did a relatively good job of surfacing relevant content without leading users to misleading arguments that negatively impacted civic trust in the election processes.” And, they are “pleased” to report “no evidence that the search engine created information bubbles catering to any regional bias.”
But perhaps unsurprisingly, Google “offers more access to controversial content when users actively search for it.” And, “the domains associated with content that promoted narratives with the potential to undermine electoral integrity with the highest frequency were affiliated with hyper partisan outlets when examined both by the total and frequency of concerning posts.”
One of the potentially more actionable findings is that “the headlines of the video content reported in our Google Search results pages contained a disproportionate amount of undermining-trust content when compared to alternative SERP verticals (search results, stories, and advertisements).” Indeed, the researchers say, “video headlines served to be a notable pathway to content with the potential to undermine trust.” One likely reason video results are more problematic is that they are harder to evaluate, including by fact-checkers, than text results. The researchers also point to tactics, such as creators using “innocuous headlines” to disguise video that contains problematic content, that should be studied further.
(One compelling footnote: citing 2021 research by Michal Pecanek, the researchers point out that Google automatically overwrites about a third of headlines, posing “a question of whether Google’s rewriting could play any role in altering the trust-undermining or trust-imparting potential of SERP headlines.”)
The researchers suggest that Google could take action to limit the visibility of headlines that are problematic, including “ensuring that they remain in the periphery without showing in the results page when users search for general election concepts, terms, and questions,” but also say that “nothing in our audit suggests that censorship should be promoted as a central strategy of search engines in managing political and politically adjacent content.”
The paper concludes by arguing that future analyses of the performance of search engines could be strengthened by greater transparency and data access for independent researchers, and that the results of such analyses during elections should help identify problematic phenomena and inform platform priorities.
Each election is a live experiment; now on to the next one.
– – –
Zade, H., Wack, M., Zhang, Y., Starbird, K., Calo, R., Young, J., & West, J. D. (2022). Auditing Google’s Search Headlines as a Potential Gateway to Misleading Content. In Journal of Online Trust and Safety (Vol. 1, Issue 4). Stanford Internet Observatory. https://doi.org/10.54501/jots.v1i4.72
Justin Hendrix is CEO and Editor of Tech Policy Press, a new nonprofit media venture concerned with the intersection of technology and democracy. Previously, he was Executive Director of NYC Media Lab. He spent over a decade at The Economist in roles including Vice President, Business Development & Innovation. He is an associate research scientist and adjunct professor at NYU Tandon School of Engineering. Opinions expressed here are his own.