Facebook’s Convoluted History Providing Data Access Proves Necessity of Regulation

Last week, Senators Chris Coons (D-DE), Rob Portman (R-OH) and Amy Klobuchar (D-MN) introduced the Platform Accountability and Transparency Act (PATA), proposed legislation that, according to an announcement, “would require social media companies to provide vetted, independent researchers and the public with access to certain platform data.” Such access is sorely needed to better understand the harms caused by companies such as Facebook (now Meta) and to calibrate the appropriate regulation of them. A trove of documents leaked by whistleblower Frances Haugen suggests the company knows far more than the public about how it affects everything from democracy to COVID-19 misinformation to teen mental health.

Facebook’s history of sharing data with researchers is perhaps the most convoluted of the social media platforms. In 2016, personal data belonging to 87 million people was collected through a Facebook application devised by Cambridge academics working with the British consulting firm and military contractor Cambridge Analytica. The breached data, which was used to target ads for the Ted Cruz and Donald Trump campaigns, resulted in a historic $5 billion fine against Facebook by the Federal Trade Commission (FTC). In 2018 testimony before Congress, Facebook founder and CEO Mark Zuckerberg described the scandal as a mistake, and later admitted his own personal data was acquired in the breach. 

After that scandal, in an effort to safely provide data to researchers, Facebook helped launch a program in 2018 called Social Science One, a partnership between industry and academia to allow researchers to study social media effects on elections and democracy. The dataset contained public URLs shared and clicked by Facebook users, as well as age, country, ideological affiliations, and so on. But earlier this year, the company admitted that the dataset it provided to the consortium contained “serious errors, affecting the findings in an unknown number of academic papers,” according to The Washington Post. A limited group of researchers was permitted special access to data to study the 2020 US elections; their results are not expected until next year.

More recently, Facebook decided to block researchers who used other methods to acquire data about how the platform really functions. In August 2021, the company shut down a research project launched by AlgorithmWatch that aimed to monitor Instagram’s News Feed. It also banned the accounts of NYU researchers studying Facebook ads in order to understand the dynamics of political advertising on the platform. The FTC issued an open letter regarding misleading claims by the company that sought to justify its actions. At the same time, Facebook tightened its grip on CrowdTangle, a tool it acquired in 2016 that provides user engagement analytics.

Other efforts to provide insights into how the platform works have proven unsatisfactory. Under pressure, the company recently started to share guidelines about how it moderates the content on the News Feed, but still provides no detailed data or explanation of how algorithmic decisions are made. Facebook also shares annual transparency reports, but these have been criticized as substantially flawed by many researchers and experts. A new API for researchers is apparently in beta, and is expected to be released in 2022. But it will limit what researchers can study, such as types of media and geographies, and at present does not include data from Instagram.

It is past time for legislation. The Senate proposal, if adopted, would introduce new requirements and new rigor on how the company must share data with academic researchers. Modeled in part on a proposal by Stanford Law Professor and Cyber Policy Center co-director Nathaniel Persily, which he called the “Platform Transparency and Accountability Act,” the legislation affords the platforms a useful immunity from civil and criminal liability when sharing data with vetted academics, and also immunizes qualified researchers who scrape publicly available data for research purposes. Other experts, such as Rebekah Tromble, Director of the Institute for Data, Democracy, and Politics (IDDP) at the George Washington University, are working on this in a European context to create a mechanism for “independent audits of data shared by the platforms with researchers, the public, or both.”

Whether Facebook will fight the Senate legislation is yet to be seen, and what it does in public may differ from what it does in private. For instance, despite Mark Zuckerberg publicly stating that Facebook would support “Congress passing legislation to make all advertising more transparent,” the company quietly lobbied to kill the Honest Ads Act, which would have done just that.

Scholars and advocates who seek to understand the effects of companies such as Facebook on society must band together lest such lobbying efforts succeed. There are some signs that researchers are coming together to demand transparency. Last week, more than 300 scientists studying social media’s effects on child and teen mental health demanded access to Facebook data. In August, hundreds of researchers signed on to an open letter in support of industry accountability research, calling on companies to “Join efforts to develop codes of conduct for providing responsible, ethical data access to independent researchers” and on regulators to “Compel access to data for independent researchers.” 

In the decade ahead, it is imperative that society understand how social media impacts the health of individuals and the health of democracy, and that it gain a better understanding of issues such as misinformation, polarization, hate speech, election integrity and fraudulent advertising. Companies such as Facebook, based on their behavior to date, cannot be trusted to provide access to the data necessary to make such assessments. The time for regulation is now.