Unpacking OpenAI’s Amazonian Archaeology Initiative
Lori Regattieri / Jun 27, 2025
Anne Fehres and Luke Conroy & AI4Media / Better Images of AI / Hidden Labour of Internet Browsing / CC-BY 4.0
What if I told you that one of the most well-capitalized AI companies on the planet is asking volunteers to help it uncover “lost cities” in Amazonia by feeding machine learning models with open satellite data, lidar, “colonial” text and map records, and Indigenous oral histories? This is the premise of the OpenAI to Z Challenge, a Kaggle-hosted hackathon framed as a platform to “push the limits” of AI through global knowledge cooperation. In practice, this is a product development experiment cloaked as public participation. The contributions of users, the mapping of biocultural data, and the modeling of ancestral landscapes all feed into the refinement of OpenAI’s proprietary systems. The task itself may appear novel. The logic is not. This is the familiar playbook of Big Tech firms: capture public knowledge, reframe it as open input, and channel it into infrastructure that serves commercial rather than communal goals.
The “challenge” is marketed as a “digital archaeology” experiment. It invites participants from around the world to search for “hidden” archaeological sites in the Amazonia biome (Brazil, Bolivia, Colombia, Ecuador, Guyana, Peru, Suriname, Venezuela, and French Guiana) using a curated stack of open-source data. The competition requires participants to use OpenAI’s latest GPT-4.1 and o3/o4-mini models to parse multispectral satellite imagery, LiDAR-derived elevation maps, historical maps, and digitized ethnographic archives. (LiDAR, or Light Detection and Ranging, is a remote sensing technology that uses laser pulses to generate high-resolution 3D models of terrain, including areas covered by dense vegetation.) Coding teams or individuals must geolocate “potential” archaeological sites, argue their significance using verifiable public sources, and present reproducible methodologies. Prize incentives total $400,000 USD, with a first-place award of $250,000 split between cash and OpenAI API credits.
While framed as a novel invitation to “anyone” to do archaeological research, the competition focuses mainly on the Brazilian territory, transforming the Amazonia and its peoples into an open laboratory for model testing. What is presented as scientific crowdsourcing is in fact a carefully designed mechanism for refining geospatial AI at scale. Participants supply not just labor and insight, but novel training and evaluation strategies that extend far beyond heritage science and into the commercial logics of spatial computing.
There is already a reaction to this settler-techno-colonialism in Brazil. The Society of Brazilian Archaeology, or SAB, responded to the OpenAI-Kaggle competition with a formal statement of concern. Addressed to Brazil’s cultural heritage authorities and the Ministry of Indigenous Peoples, the letter outlines a series of procedural and ethical violations. In Brazil, archaeological research is not only a regulated scientific field but a constitutional matter of cultural protection. IPHAN, the National Institute of Historical and Artistic Heritage, holds jurisdiction over archaeological sites and their safeguarding. Under Brazilian law, any project that involves locating or interpreting archaeological material must receive IPHAN’s approval. SAB highlights that the OpenAI to Z Challenge circumvents these processes. It frames archaeological detection as a data science task, dissociating it from the fieldwork, consent, and legal obligations that legitimate archaeology demands. By replacing professional protocols with leaderboard metrics, the competition encourages technical enthusiasm without institutional accountability or respect for the International Labour Organization’s Convention 169, to which Brazil is a signatory.
OpenAI has strong incentives to structure initiatives like this. The company does not primarily seek answers about ancient civilizations in the forest. What it gains is an opportunity to simulate high-uncertainty, data-scarce environments, which are ideal for testing the robustness of its models. Tasks such as identifying human-made patterns beneath dense vegetation using sparse LiDAR points and ambiguous historical texts help refine OpenAI’s core systems. These conditions mirror the enterprise contexts that large AI models must solve for, from logistics to infrastructure forecasting. This kind of domain stress-testing helps validate OpenAI’s APIs in the field. Participants provide annotated examples, embedding quality evaluations, and prompt engineering strategies without payment. Amazônia becomes a terrain not only of computational challenge but of strategic model tuning. The output is not a scientific discovery, but simply product testing and scalability.
Competitions have long served the AI industry as low-cost and high-yield infrastructures for product development. While positioned as inclusive civic science, they often rely on unpaid labor, unprotected data flows, and communities excluded from governance. The OpenAI to Z Challenge presents itself as a research collaboration, yet its downstream utility extends well beyond archaeology. The data types involved—high-resolution satellite imagery, LiDAR point clouds, vegetation indices, elevation models, and digitized records of archives and artifacts—correspond directly to computational modalities used in mining exploration, industrial agriculture, and large-scale transport logistics. These sectors require AI models capable of operating in environments marked by ecological complexity, ambiguous terrain, and spatial fragmentation. By treating the Amazônia as a machine-learning sandbox, OpenAI stress-tests its models across precisely these variables. This process supports its broader product ecosystem and implicitly subsidizes industries whose operations depend on land-use remapping, extraction logistics, and risk modeling across contested geographies. The competition operates as an informal research and development subsidy, advancing technologies with high commercial and strategic value.
Geographic information and data analysis in Amazônia offer a real-world proxy for some of the most difficult problems in spatial artificial intelligence. Its challenges (sparse labels, dense canopy interference, heterogeneous inputs, and temporally layered data) create an ideal setting for training multimodal architectures. Forest occlusion replicates adversarial visual conditions common to drone-based surveying and autonomous navigation. Multispectral imagery combined with annotated historical documents and maps simulates the need for dynamic cross-modality fusion, refining contrastive learning pipelines. Site prediction based on ambiguous textual references and elevation masks requires neuro-symbolic inference, pushing models to integrate language with probabilistic spatial reasoning. These are not marginal or trivial computer science “challenge” exercises for graduates to play with. They expand OpenAI’s capacity to deploy general-purpose systems for automated spatial inference, products with immediate applicability in infrastructure forecasting, industrial agriculture, mineral prospecting, military operations, and even border surveillance. Through this competition, what appears as an open inquiry functions as a pipeline for enterprise product development, accelerating the transfer of biocultural landscapes into computationally exploitable platforms.
This same logic extends to the project’s legal architecture. OpenAI employs a CC0 license, a designation that allows contributors to irrevocably waive all rights to their submissions. This licensing choice is not incidental; it mirrors the model’s development rationale—to maximize design flexibility, reduce friction for commercial deployment, and disembed knowledge vitality from its origin. There is no structured mechanism through which indigenous or local communities can negotiate the terms of participation, assert rights over data extraction, or receive any form of reciprocal benefit.
Yet it is precisely their territories, material cultures, and historical continuities that ground the entire computational task. The challenge aligns with a possible geospatial research and product roadmap: enhancing symbolic-spatial inference, multimodal learning, and generalization across sparse, high-noise datasets. But in doing so, it reduces living biocultural systems to parameters of optimization. Participation is positioned as voluntary and open, but in practice, it is unidirectional and asymmetrical. This mode of extraction targets the very lifeways that constitute a territory, reaching beyond rational units of carbon or spectral data into more-than-human presences. It severs knowledge from the cosmologies that sustain it, encoding relational worlds into statistical representations that erase the conditions of their making.
The CARE Principles for Indigenous Data Governance offer a counter-model. Developed in 2019 by the Global Indigenous Data Alliance, CARE stands for Collective Benefit, Authority to Control, Responsibility, and Ethics. It complements the FAIR principles used in open science by centering Indigenous rights, data sovereignty, and self-determination. CARE is grounded in the international framework of Free, Prior, and Informed Consent (FPIC), as codified in ILO Convention 169. These protocols are not symbolic. They are legal safeguards to prevent unauthorized use of cultural knowledge. They demand that projects involving Indigenous lands or heritage be designed in collaboration, not abstraction. In this case, the OpenAI challenge was launched with no engagement, no ethical review, and no participation mechanisms for the communities most affected. What is framed as civic discovery undermines the very standards designed to ensure data justice.
Archaeologists working in Amazônia do not locate sites through satellite maps alone. They collaborate with Indigenous researchers and rely on oral knowledge, cultural signs, and long-term field presence. A project like Amazônia Revelada provides an example of this co-production. It combines cutting-edge LiDAR technology with community-led observation to identify archaeological legacies beneath the forest canopy. This work is carried out through formal partnerships and with legal recognition from IPHAN. Researchers walk alongside local communities to document biocultural markers and honor territorial memory. The project protects cultural heritage not only through discovery but through relational commitment. In contrast, the OpenAI challenge encourages the extraction of new knowledge in a mode of data colonialism: mapping, collecting, and processing data without consent. It replaces co-creation with crowdsourced inference. By abstracting archaeology into AI prompts and training data for models, the competition disconnects it from the living territories and relationships that give it meaning.
The policy brief by the Amazônia Technopolitics Coalition provides clear recommendations for avoiding such misalignments. It calls for enforceable standards of participation, binding data governance protocols, and the alignment of digital initiatives with environmental justice frameworks. It affirms that technological projects in Amazônia must respect both ecological complexity and social sovereignty. It proposes participatory models for innovation that are led by, not imposed upon, Amazonian local organizations, communities, and peoples. These recommendations are not anti-technology. They are pro-accountability. They position digital systems within a matrix of territorial, cultural, and legal responsibility. The challenge by OpenAI ignores these protocols and safeguard models entirely. It selects the forest as a backdrop and the heritage of its people as a playground, but refuses to engage with its political, historical, or legal realities.
Technological engagements in Amazônia must begin from the recognition that land, knowledge, and social structures are co-constitutive. Archaeological research in these territories is not a detached exercise. It is embedded in cosmologies that hold together people and place through intergenerational care, responsibility, and territorial memory. What external initiatives often reduce to data points or algorithmic features are, in fact, expressions of living presence, carried, tangibly and intangibly, across time. When design processes are disconnected from the communities whose worlds are being mapped, the result is not merely procedural failure, but an erosion of the relational grounds that sustain these worlds. The commitments at stake include frameworks grounded in Free, Prior, and Informed Consent, the CARE Principles for Indigenous Data Governance, and territorial rights codified in national and international law. Any technological system that aims to operate in these regions needs to be governed through these commitments, not as symbolic consultation, but as the starting condition for legitimate engagement.
Public interest technology must be accountable not only across the lifecycle up to deployment, but also to the power relations bound up with what is particular to otherness. In Indigenous and other-than-Western worlds, knowledge is generated through cosmologies where kinship, territory, and memory are mutually constituted. These are frameworks of livelihood and reality, sustained through practices of care, relationality, and the continuous making of worlds. In contexts like Amazônia, science and technology developments cannot be separated from the histories of territorial dispossession and epistemic marginalization that continue to shape who builds, who benefits, and who is made invisible by the new geopolitical industrial order.
Redistributive models of technological innovation demand that Indigenous and traditional communities be treated not as passive data subjects, but as rightful stewards, co-researchers, and owners of the infrastructures that touch their territories and knowledge systems. This includes not just participatory methodologies, but co-governance, equitable resource allocation, and long-term institutional partnerships. Knowledge production must return to those who have sustained it through relational, cosmological, and territorial praxis.
The traditionally occupied lands of Amazonia are worlds in motion, articulated through reciprocal responsibilities and innovative forms of worldmaking. In that sense, innovation cannot be open if its terms are pre-written. It must be restructured around sovereignty, accountability, and reparations, led by the peoples whose worlds, futures, and technologies are already here.
