Perspective

Data Work Is Too Secretive. Big Tech Should Be Held Accountable.

Tatiana Dias / Apr 15, 2026

Tatiana Dias is a fellow at Tech Policy Press.

A gig worker in São Paulo, Brazil, is seen taking a selfie in 2021. (Ettore Chiereguini / AGIF via AP)

Right now, somewhere in the world, someone is selling selfies. Or photos of random children. Or strapping on a mounted camera to record their daily routine, selling images of their home's interior and their own personal space. Some are even photographing their own sensitive identification documents to turn a profit.

Who's buying? Often, it's a mystery.

All of these paid tasks are real projects for those in the data work industry — which monetizes the collection, production, labeling and cleaning of data used to train artificial intelligence systems. These tasks power third-party data work platforms that act as intermediaries between Big Tech companies, which own AI technologies that require vast amounts of data, and the human beings who produce this data — often called "ghost" workers.

Usually, Big Tech companies don’t disclose publicly which data work companies they rely on to train their AI models, according to the Foundation Model Transparency Index. But now, a newly released investigation conducted by the Dutch nonprofit organization The Centre for Research on Multinational Corporations (SOMO) has mapped the network that supplies this cheap labor for Big Tech.

According to its findings, there are at least 30 data work platforms — from business process outsourcing (BPO) companies such as Sama to crowdwork platforms like Clickworkers — used by Amazon, Google, Meta, Microsoft and Nvidia.

The investigation sheds important light on the extent to which contractors are held accountable for workers' conditions, as well as the risks and labor rights violations workers potentially face. Even when Big Tech companies don't directly employ data workers, they can influence working conditions by cutting costs and demanding tight deadlines, the researchers said.

Between 2023 and 2024, I spent a year mapping the data work platforms — such as Appen, OneForma and Telus — that offer these tasks, also called "microtasks," for the production and classification of data for AI systems. It is typically precarious, repetitive and very poorly paid work, especially in the Global South. In Brazil, the average pay is less than $2 per hour.

This labor chain is extremely opaque, masking the fact that developing and maintaining AI systems actually requires many hours of human labor. Maybe it wouldn't be good for business if consumers knew that a worker in Kenya could see the videos they're recording with Ray-Ban Meta smart glasses, or read the secrets they're telling ChatGPT or Google Gemini.

The data work platforms recruit workers through listings on their websites and on LinkedIn — sometimes in mass recruitment drives of over 42,000 job posts, powered by AI hype, as Marché Arends and Kathryn Cleary documented on Tech Policy Press. Very often, there is no information about who the final client is; projects carry only brief descriptions or nicknames. Upon accepting a task, workers often must sign an NDA that prevents them from speaking about the work outside the platforms.

This is how gig workers in Africa unknowingly aided the United States military, as the Bureau of Investigative Journalism revealed in February. According to the report, Appen, one of the biggest data work platforms, was among the suppliers for a secret US military unit called Big Safari. "They were secretive about the ultimate goal," said one person who worked for Appen.

In my own research, I uncovered similarly opaque projects announced as job posts on data work platforms. One of them, called "Spoofy Doo" and available on the Telus platform, paid Egyptians $50 to record videos imitating security camera footage. For whom? It's a mystery. One of Telus' biggest clients, according to SOMO's investigation, is Google.

Another project, available in the US, Argentina, Brazil, India, the Philippines and Mexico, asked participants to capture 10 photos and 10 five-second videos of their ID documents, including driver's licenses, gun licenses, and passports. The images had to be recorded against different backgrounds. Payments varied by country: US citizens received $25; Brazilians, $9; Indians, Filipinos and Mexicans, $7.

Once again, there was no information about where the documents submitted by workers would end up.

In 2024, another project asked workers in Brazil, Colombia and Mexico to sell photos and videos of minors who were "0-17 years old." For 10 videos of a child doing some activity — blowing out candles, kicking a ball, playing in the water — a worker would receive between $2 and $20. The project is no longer available.

Appen did not answer questions about the data privacy practices, purpose or clients of these projects.

Big Tech needs to be held accountable even when work is outsourced

Now, SOMO’s investigation presents a broader view of the data work supply chain.

"A small number of large clients often dominate the vendor's business," the group said, citing examples where a single big tech company accounts for 48% of a data work firm's profits.

"This creates a clear imbalance. While vendors largely depend on Big Tech, Big Tech does not depend on any single vendor but maintains multiple contracts, which enables them to avoid vendor lock-in and shop around for the 'best deal,’" the investigators said.

The study found that Amazon has the largest number of data work suppliers: 18. These include platforms such as Scale AI, Appen, Oneforma and Toloka. Appen is used by all the Big Tech companies studied — except Google, which terminated its contract after US data workers demanded a wage increase from $10 to $14 per hour.

Amazon told SOMO that the company is "committed to continuous improvement" and that it "actively partner[s] with suppliers to embed respect for human rights and the environment in their operations and supply chain, strengthening protections for workers and fostering safe, responsible workplaces." Appen said it implemented fair pay rates in 2025 and provides contributors access to wellness programs and counseling. Scale AI said "data work platforms provide flexible, on-demand earning opportunities for contributors who choose when and how much they work."

In my research, which gave rise to the series The Platform Proletariat, I found that Brazilian data workers were working for Meta through Appen. The work involved fact-checking pieces of content and ads — many on complex subjects like politics and the devastating floods that ravaged Brazil — under precarious conditions and without adequate training.

Workers received about $0.10 for every classified post. They needed to evaluate 40 posts per hour, which left them just 90 seconds per publication. After successfully completing an hour, workers received $3.50. In many cases, they didn't receive payment at all.

Given the conditions, the fact-checking work ended up suffering. Often, under pressure and without adequate guidelines, workers turned to ChatGPT to help classify the information.

What was supposed to be a content-cleaning and moderation job to ensure the quality of an AI system ended up becoming an AI-powered production line — precisely because of the poor working conditions.

At the time, I sent Appen and Meta representatives in Brazil a series of questions, including how the training of outsourced content-moderators via Appen works, how their wage is calculated and what the objectives of the projects are. Both companies declined to answer those questions.

I was only able to carry out this investigation because I contacted workers directly and joined groups where they organize to seek support and instructions for completing tasks. The training materials are confidential, and NDAs prevent them from publicly disclosing information — imposing a layer of secrecy over the work.

“Do not participate in groups or chats outside of the communication methods provided by Appen to discuss confidential information,” says one of the company’s terms of use pages.

Milagros Miceli, sociologist, computer scientist and founder of a research group on algorithms and ethics at the Weizenbaum Institute in Berlin, previously told me that outsourcing is a strategy that disconnects workers from their final clients. “It’s important for the workers themselves to realize they’re contributing to a multibillion-dollar industry. I think all workers should know who they are working for and the profit generated by their work.”

Research from Fairwork found that many of the platforms used by Big Tech fail to ensure the most basic working conditions and rights, such as the right to safety, decent terms and freedom of association. None of the companies examined paid a living wage.

Generally, when allegations about this type of work emerge, Big Tech companies' rhetoric is that they are not responsible for the workers and that their contracts with outsourcing companies include counseling, training and other support.

Now, SOMO's research has connected these precarious conditions with the Big Tech companies ultimately footing the bill. "Tech companies dominating AI development have an impact far beyond their own operations," the report stated.

It continued, “Their decisions shape how vendors operate and, ultimately, how workers are treated. When prices are pushed down and contracts are abruptly terminated, workers directly bear the consequences.”

Transparency is not only a matter of labor rights; it is also a matter of social justice, accountability and improving AI systems. Whose interests do these secrets serve? The answer is clear.

