Community Notes and its Narrow Understanding of Disinformation

Nadia Jude, Ariadna Matamoros-Fernández / Feb 3, 2025

Meta's corporate headquarters in Menlo Park, California. Shutterstock

Last month, Meta announced sweeping changes to its content moderation approach: the paring back of hate speech restrictions, a more personalized approach to political content, and the replacement of third-party professional fact-checking with its own version of X’s Community Notes, a crowd-sourced initiative that allows users to add context to tweets. Meta’s changes mean less moderation, more hate, and more personalized political content on Facebook, Instagram, and Threads.

Our research shows how the ‘community notes’ model, which combines automation with the ‘wisdom’ of user crowds, is ill-equipped to address hate-fuelled, harmful narratives that are often deeply entwined with disinformation. Rather than being a more empowering and comprehensive system, as Meta claims, its Community Notes will likely mirror and extend the divisive ideological leanings of its CEO, erecting another wrestling ring for the battle over ‘facts’ that favors the political right in an online environment where marginalized communities have fewer protections than ever before.

What do we know about Community Notes?

X’s Community Notes is a crowd-sourced content moderation system designed in 2020 to emulate professional fact-checking and “reduce misinformation” by allowing volunteer users to add context to tweets they find misleading. In 2023, after Twitter became X under the new leadership of Elon Musk, the company described Community Notes as the “focus” of its “evolving approach” to countering disinformation. X then slashed its content moderation team and withdrew from the EU Code of Practice on Disinformation.

Community Notes is gamified: tweets can attract many notes, and volunteers who sign up and are accepted to the program rate notes based on their ‘helpfulness.’ A bridging algorithm then decides which notes will be shown publicly on the X interface, selecting notes that receive ‘consensus’ or ‘cross-ideological agreement,’ indicated by positive ‘helpfulness’ ratings from people across diverse perspectives. This means that if volunteers who usually disagree in how they rate notes agree that a particular note is ‘helpful,’ that note is more likely to be displayed on X.
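To make this mechanism concrete, below is a toy sketch in Python of how a bridging algorithm can separate cross-camp agreement from partisan agreement. It is loosely modeled on the matrix-factorization idea in X’s public documentation of Community Notes scoring, but the data, parameter values, and ranking step are our own illustrative assumptions, not the production system.

```python
# Toy sketch of "bridging-based" note selection (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: rows = raters, columns = notes,
# 1 = rated "helpful", 0 = rated "not helpful".
# Raters 0-2 and raters 3-5 behave like two opposing "camps".
# Note 0 is rated helpful by both camps; notes 1 and 2 by one camp each.
ratings = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 1],
    [1, 0, 1],
], dtype=float)
n_raters, n_notes = ratings.shape

# Model: rating ~ mu + note_intercept + rater_intercept + note_factor * rater_factor.
# The one-dimensional factor absorbs viewpoint-aligned agreement, so a note
# whose *intercept* stays high is one that "bridges" the two camps.
mu = 0.0
note_b, rater_b = np.zeros(n_notes), np.zeros(n_raters)
note_f = rng.normal(0, 0.1, n_notes)
rater_f = rng.normal(0, 0.1, n_raters)

lr, reg = 0.05, 0.1
for _ in range(2000):  # plain SGD over all observed ratings
    for i in range(n_raters):
        for j in range(n_notes):
            err = ratings[i, j] - (mu + note_b[j] + rater_b[i] + note_f[j] * rater_f[i])
            mu += lr * err
            note_b[j] += lr * (err - reg * note_b[j])
            rater_b[i] += lr * (err - reg * rater_b[i])
            nf, rf = note_f[j], rater_f[i]
            note_f[j] += lr * (err * rf - reg * nf)
            rater_f[i] += lr * (err * nf - reg * rf)

# Rank notes by intercept: the cross-camp note should come out on top, while
# the two partisan notes have their support explained away by the factor term.
for j in np.argsort(-note_b):
    print(f"note {j}: intercept={note_b[j]:+.2f}, viewpoint factor={note_f[j]:+.2f}")
```

In the real system, additional rules and thresholds apply before a note goes public; the sketch only illustrates why a note rated ‘helpful’ by otherwise-disagreeing raters is favored over notes backed by a single camp, however large.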

Studies on the ‘effectiveness’ of Community Notes are mixed. For example, there is debate about whether the tool reduces or heightens engagement with misleading posts. Notes are slow to appear, and the public sees less than 12.5 percent of all submitted notes. Partisanship often motivates volunteers to participate, and the system struggles to fact-check divisive claims from influential accounts as well as hard-to-verify claims that include, for example, sarcasm. These challenges call into question whether prioritizing ‘consensus’ in moderation systems designed to address mis- and disinformation is a desirable or worthy aim.

Auditing Community Notes as a socio-technical system with a known content moderation challenge in mind

Our research differs from existing studies on Community Notes, taking care to critically examine the tool as a socio-technical system embedded in X’s culture, business practices, and norms. This approach means we did not merely download Community Notes data and look for trends and technical (in)accuracies. Rather, we paid close attention to Community Notes’ design, the work of its volunteer contributors, and its algorithmic outcomes. By algorithmic outcomes, we mean the notes that reached ‘consensus’ and went public on the X interface. Recognizing that disinformation is a contested concept, understood and enacted differently by various companies, governments, regulators, researchers, and impacted communities, we asked: how does X’s tool advance particular understandings of disinformation, and how does it attempt to solve it? What are the social implications of this orientation towards solving the disinformation problem?

We also audited the tool by focusing on a well-known challenge for content moderators: the difficulty of moderating disinformation that uses humor to harm. While humor is a vital source of social commentary, playing with notions of truth to offer broader critique, the ambiguity inherent in humor can also be instrumentalized to harm, with nefarious actors able to excuse the sharing of harmful false narratives as ‘just a joke.’ Humor, when entwined with disinformation, becomes especially dangerous when it deploys ridicule and stereotyping to target marginalized groups. This complexity creates a conundrum for moderators seeking to address disinformation: how to distinguish between false narratives that intentionally mobilize the ambiguities within humor to harm and false narratives that use humor to contest reality in order to convey an important political viewpoint.

An operational logic that inscribes true-false, real-fake binaries

We find that the Community Notes system inscribes reductive true-false, real-fake binaries. By design, the tool requires that volunteer moderators first and foremost identify and assess individual tweets based on their potential to mislead (or not). Volunteers can then select non-mandatory options, such as whether the tweet contains a ‘factual error’ or is ‘a joke or satire that might be misinterpreted as fact.’ When we conducted this research, volunteers could also classify tweets as being ‘believable by many’ or ‘believable by few’ and as posing ‘considerable harm’ or ‘little harm.’ However, these believability and harm classifications were optional from the outset and were removed from the system in 2024. Beyond these optional classifications, X offers little guidance on what ‘misleading content’ is and how one might assess its “check-worthiness” or, in other words, consider the potential for harm that false content could create.
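As a rough illustration of this design, the note-writing flow described above can be sketched as a simple data structure: one mandatory misleading/not-misleading judgment, with everything else optional. The field names and values below are our own labels for the options discussed in this article, not X’s actual schema.

```python
# Hypothetical sketch of the note-submission form discussed above.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CommunityNoteSubmission:
    tweet_id: str
    classification: str                    # required: "misleading" or "not_misleading"
    note_text: str                         # the contextual note itself
    reasons: list[str] = field(default_factory=list)  # optional tags, e.g.
                                                      # "factual_error",
                                                      # "joke_or_satire_misinterpreted"
    believability: Optional[str] = None    # optional, removed in 2024: "many" or "few"
    harm: Optional[str] = None             # optional, removed in 2024: "considerable" or "little"
```

The point of the sketch is the asymmetry: the true/false judgment is the only field a volunteer must complete, while any reflection on harm sits in optional (and now removed) fields.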

Our close, qualitative analysis of public notes that classified tweets as ‘satirical’ reveals that volunteers rely on internal logics and biases when making determinations about content, reverting to a basic true-false, real-fake heuristic. Yet this heuristic is at odds with content like satire, which is misleading by nature and plays with notions of truth and reality to relay a deeper critical commentary. Sometimes, this humor sought to critique societal inequities, with the corrected tweets “punching up” at those in power to convey an important political point. At other times, the humor was harmful, relying on discriminatory stereotypes and contributing to gendered or racialized disinformation. In either case, volunteers did not acknowledge or address the intertextuality or broader meta-narratives of this content. Instead, they judged content based on their own reductive notion of ‘the truth,’ ignoring the benefits or harms of its deeper critical commentary.

A system that lacks critical reflection on the potential for content to harm

We also find that the tool’s orientation to correct falsity and fakeness, first and foremost, leads volunteers to fact-check harmless, inconsequential tweets that are fake only because they have been altered or edited for comedic purposes. For example, a tweet with an edited screenshot falsely claimed that Elon Musk ‘changed the like button’s color.’ ‘Twitter Changed The Color of the Like Button’ is a frequently repeated internet joke. The tweet received four community notes, with one note receiving enough ‘helpful’ ratings to go public. We hypothesize that these kinds of humorous tweets are likely to receive public notes because consensus on their ‘fakeness’ is straightforward: their humor cues are clear, and the content is not ideologically divisive or debatable.

Finally, when harmful, humorous tweets did receive a public community note, volunteer moderators did not consider or address the harms of the narratives. We found that most contributors do not actively or critically reflect on the potential harm of a humorous tweet before writing and submitting their notes. Even when contributors filled out the optional harm field, they did not expand on the tweet’s potential risk of harm when writing their notes. This oversight echoes a common criticism of fact-checking as a practice more broadly: its reliance on the ‘fact’/‘value’ distinction offers “a narrow account of the requirements of a healthy public sphere.”

Correcting falsity is insufficient for our current moment

Our research reveals a fundamental issue with the ‘community notes’ model as a preferred content moderation system: it is oriented toward addressing an outdated and extremely narrow understanding of disinformation, one that treats falsity and fakeness alone as the problem and ignores the historical, social, cultural, economic, and political nature of disinformation. By examining the tool through the lens of humor, we show how the system fails to assess the harms or ‘check-worthiness’ of false narratives. Making that assessment is a skill expert fact-checkers are trained in, which makes Community Notes a poor substitute for culturally informed content moderation performed by a combination of experts, AI, and crowd workers situated in place and time.

That said, components of the ‘community notes’ model, when employed outside the platform logic of X, have delivered promising results in other contexts. For example, crowd-sourced initiatives such as Wikipedia work well when they have robust, transparent, and inclusive mechanisms of community governance to ensure volunteers have ownership over and subscribe to community guidelines. Similarly, bridging algorithms, when deployed in government online engagement platforms, have demonstrated some success at facilitating consensus around contained policy issues. However, the ability of these systems to respect and elevate marginalized perspectives has proven problematic or has yet to be explored.

X, YouTube, and now Meta are capitalizing on these successes, borrowing elements of such systems while co-opting the language of ‘empowerment’ and ‘democracy’ to reduce moderation and market their techno-solutionist products as beneficial for the world. Research like ours is a useful reminder of the importance of critically interrogating platform content moderation systems, which involves paying attention to these systems’ design, the problems they are oriented to solve, their contexts of use, and their risk of directly supporting and entrenching online harms. Despite large tech companies’ efforts to convince us otherwise, there is an urgent need to think beyond technology to address the societal challenges often associated with the disinformation problem. Media system reform, market-shaping approaches, and “big tent” civil society coalitions led by the Global Majority would be a fruitful start.

Authors

Nadia Jude
Nadia Jude is a PhD researcher within the Digital Media Research Centre at the Queensland University of Technology. Her research centers on questions around platform governance, with a focus on problem representations of mis- and disinformation in Australian policy-making discourse.
Ariadna Matamoros-Fernández
Dr. Ariadna Matamoros-Fernández is an Associate Professor at University College Dublin (UCD). Her research focuses on social media cultures, platform governance, online harm, and algorithmic systems.
