Over the past few weeks, Facebook has been under fire following a trove of leaked internal documents. Regulators, politicians, and journalists have dug into them, revealing ways in which Facebook prioritizes engagement over safety. Making sense of this glimpse into Facebook’s inner workings has been challenging, in part because of the sheer volume, but also because much of the leaked material focuses on the complex effects of algorithms and platform design on human behavior at scale. Understanding how social media impacts society is about as difficult as it is important. We somehow have to make sense of what happens when billions of remarkably complex brains are networked via smartphone to giant server farms that leverage petabytes of data through opaque algorithms designed to enhance engagement. In comparison, Elon Musk’s quest for Mars may as well be playing with Lego. Yet the consequences are difficult to overstate.
Consider the fact that in many countries with ample access to vaccines, misinformation is arguably one of the leading causes of death. Even when the pandemic becomes a memory, social media will cast its shadow over geopolitical and ethnic tension, climate change, future pandemics, human rights, and a host of other global challenges. As my colleagues and I wrote in a paper published this summer, sorting this all out is going to require an urgent, massive, whole-of-science response on the scale of climate science.
So where have academics been over the last few weeks of Facebook leaks? If my experience is representative, we’ve been trying to get past paywalls at a dozen news organizations to squint at fuzzy screenshots of PowerPoint slides. If we’re especially lucky, a journalist shares a few slides with one of us and asks for a comment or a bit of insight. Of the hundreds of people who have been granted access to the Facebook files, academics have been largely shut out, so far.
There may be good reasons for that, such as legalities to do with whistleblower disclosures and privacy concerns. But academics will likely have access to these documents soon, one way or another. And when we do, we will need to work fast. Journalists have already reported on many of the findings, and regulators and lawmakers are stirring; academic researchers are behind the curve. Before governments act on any empirical evidence, that evidence should be evaluated by a community of experts with domain knowledge and relevant methodological training. Regulatory bodies desperately need to know what the findings tell us, what they don’t, and what questions to ask when executives from these companies are hauled in front of a committee to explain themselves.
This is insight we can’t reasonably expect to glean from even top-tier journalistic or congressional investigations. Empirical findings, such as the impact of algorithmic choices on amplifying extremism or Instagram’s harm to teens, can only be understood in the context of their methodology. Providing this context is a job for statisticians, social scientists, computer scientists, psychologists, and a host of other experts. Effective regulation is hopeless without this type of input. Imagine if BP controlled the world’s thermometers and our only glimpse of climate change was through a paywalled article containing a leaked hockey-stick graph. Would we expect regulators to make the right call? What if all we knew about the coronavirus pandemic was a World Health Organization PowerPoint slide containing a pie chart of case outcomes? We’d fail in either of these cases, and it’s foolish to think we’ll do better in sorting out the mess that Facebook has made. Making sense of complicated topics and solving difficult problems is one of the main reasons we fund science.
What can we do while we wait? Researchers across a range of disciplines should spend the days ahead preparing to receive these documents. There are practical considerations: we already know most of the files are not machine-readable, for instance, but rather consist of photographs of documents on a laptop screen. Even redacted versions will require cautious use for legal, ethical, and practical reasons. Beyond practicalities, making sense of these findings is going to require combining knowledge running the gamut from highly qualitative to heavily quantitative. We’ll have to sort this out together, bridging historical divides in the social sciences. Now is the time to start conversations, spin up interdisciplinary working groups, and brainstorm how to effectively bring an academic perspective to the table.
Facebook should also take this opportunity to rethink its relationship with academic researchers. The company has made it exceedingly clear that it would prefer not to have researchers poking around outside of the prescribed programs it has set up to provide access to its data, even going so far as to retaliate against scientists for creating a browser plugin to collect advertising data. Without informed experts to testify to the most troublesome aspects of its platform design, it can dismiss internal research as out of context and produce conflicting findings to sow doubt. But in the post-whistleblower period, this is at best a short-term strategy. The company can do a lot to set the record straight by coming to the table today to help formulate a mechanism for privacy-protected, transparent access for independent research.
A transparent approach is the only path forward. We don’t have a coherent framework for understanding dynamics online, much less an idea of what a healthy and profitable social network would look like. Even ignoring financial incentives, there is no reason to believe that Facebook is capable of solving all of its empirical problems internally. In fact, some features of social media may have externalities that we cannot resolve given our present state of knowledge. In these cases, we will need to have a conversation about whether the costs they impose outweigh their benefits. The Facebook files have demonstrated that harm caused by the platform cannot be indefinitely swept under the rug. It is in Facebook’s interest to open up and ask the global community of experts how best to move forward.
It’s easy to see the Facebook files as akin to the Panama or Paradise papers: a tale of corporate greed sitting under our noses for years. Yet this framing obscures the fact that the Facebook files provide an unprecedented glimpse into complex and poorly understood design decisions with global consequences. Lives are quite literally on the line, from teenage Instagram users to marginalized groups around the world and the billions of people who have yet to be vaccinated. The fantastic journalism we’ve been seeing is exposing these problems, but we’re going to need a much broader effort to solve them.
Dr. Joe Bak-Coleman is an associate research scientist at the Craig Newmark Center for Journalism Ethics and Security at Columbia University. His research focuses on how the actions and interactions of group members give rise to broader patterns of collective action. He is particularly interested in understanding how communication technology alters collective decision-making and the spread of information. To ask these questions, he uses a combination of online experiments, observational data and mathematical modeling. Bak-Coleman earned his Ph.D. in Ecology and Evolutionary Biology at Princeton University. Prior to working on human collective behavior, he studied the behavior of animal groups, from zebra herds to fish schools.