Researchers Develop an AI Agent Index to Inform Governance of Agentic Systems

Prithvi Iyer / Mar 4, 2025

Hanna Barakat & Archival Images of AI + AIxDESIGN / Better Images of AI / Weaving Wires 1 / CC-BY 4.0

AI agents, systems that can execute tasks with minimal human intervention, are the current fascination of AI developers, investors, and enthusiasts. Despite an increase in commercially available AI agent systems, there is little public awareness about how these systems are developed and their impact on society. To address this gap, researchers from various universities developed the AI Agent Index, the “first public database to document information about currently deployed AI agentic systems.”

The researchers say the index is a first step in building public awareness and helping policymakers design appropriate governance mechanisms for AI agents. While similar efforts to document AI systems and their real-world impacts exist (see our coverage of the Foundation Model Transparency Index), Tech Policy Press is not aware of any equivalent frameworks for AI agents. Thus, by documenting “technical, safety, and policy-relevant” information for AI agents, this research will help users understand the capabilities and risks of AI agents they may interact with, while also helping auditors decide “the scope and focus of their evaluations of agentic systems.”

Methods

Since there is no universally accepted definition of an "AI agent," the researchers first established practical criteria to determine which systems to include. They evaluated four key characteristics:

  • Underspecification – The AI agent must be capable of accomplishing a given goal without requiring a highly detailed step-by-step directive from the user.
  • Directness of Impact – The agent should be able to execute tasks that affect the external world with minimal human intervention (e.g., navigating software, writing and running code).
  • Goal-Directedness – The system should demonstrate decision-making behavior aligned with achieving a specific goal.
  • Long-Term Planning – The AI agent must be able to reason through multi-step problems, construct execution plans, and carry them out sequentially.

Additionally, to keep the index focused on real-world applications, the researchers limited it to AI agents that were either commercially deployed or available as open-source projects. Based on these criteria, the research team identified 67 such AI agents. Data collection occurred between August 2024 and January 2025, with AI agents identified using “web searches, academic literature review, benchmark leaderboards and additional resources that compile lists of agentic systems.” For each of the 67 systems included, a preliminary "agent card" was created containing information about the system, the underlying AI model, notable safety evaluations, and whether it was interoperable with other systems. For entries with limited public data, the team contacted developers directly to request clarifications, corrections, and additional insights. This outreach resulted in a 36% response rate from developers.
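To make the "agent card" concept concrete, the categories described above could be modeled as a simple record. This is an illustrative sketch only: the field names below are assumptions for demonstration, not the index's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an "agent card" record, based on the categories
# the article describes (system details, underlying model, safety
# evaluations, interoperability). Field names are illustrative
# assumptions, not the AI Agent Index's actual schema.
@dataclass
class AgentCard:
    name: str
    developer: str
    deployment_date: str                      # e.g. "2024-11"
    underlying_model: str                     # base model the agent builds on
    code_released: bool                       # is the source code public?
    documentation_released: bool              # is documentation public?
    safety_policy_disclosed: bool             # did the developer publish one?
    safety_evaluations: list[str] = field(default_factory=list)
    interoperable_with: list[str] = field(default_factory=list)

def missing_safety_info(cards: list[AgentCard]) -> list[AgentCard]:
    """Return cards whose developers disclosed no safety evaluations."""
    return [c for c in cards if not c.safety_evaluations]
```

A structure like this would let researchers query the collection, for instance filtering for the systems that (as the findings note) lack any published safety evaluation.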

Findings

Along with documenting the evolving AI agent landscape, this index provides some interesting high-level findings.

  • AI agents are being deployed rapidly. Half of the 67 systems included in this index were “deployed in the second half of 2024.” This demonstrates AI companies' growing interest in deploying such agents and reiterates the importance of research efforts to track these developments.
  • 45 out of the 67 systems are the product of companies located in the US.
  • Most of the AI agents in this index (73.1%) were created by big AI companies, while a smaller yet substantial number of projects (26.9%) are affiliated with universities.
  • According to the paper’s findings, AI agents are primarily used for software engineering and computer automation (74.6%). This suggests that AI agents are currently optimized to handle structured, logic-driven tasks, such as coding or navigating digital interfaces, rather than more open-ended problem-solving.
  • On the topic of public disclosures, the research finds that developers are quite transparent about usage and capabilities: 33 of the 67 systems released code, and 47 released documentation. However, only 13 developers provided information about safety policies, and only five provided safety evaluations. A similar pattern appeared in the Foundation Model Transparency Index, where researchers found higher transparency for model capabilities than for downstream impacts and safety disclosures.

Limitations and Conclusion

The authors note that the AI Agent Index is not meant to be “a comprehensive or exhaustive database of all agentic systems or related resources, such as language models and development frameworks for building agentic systems.” That said, the researchers acknowledge that in its current form the index has a few key limitations:

  • The index includes only publicly available AI agents, overlooking the many systems deployed for private use.
  • The index only covers systems documented in English and may therefore exclude systems from non-Western developers.
  • The response rate from developers was only 36%, meaning some developers may hold important safety information that the researchers could not access. However, the researchers created a mechanism to correct the index by allowing developers to respond with clarifications.

Despite these limitations, the AI Agent Index appears to be a useful first step in documenting the AI-agent ecosystem. Although fully tracking these developments will require companies to be more transparent, the index already provides important insights to inform AI governance. Since US tech firms developed most of the indexed systems, the researchers suggest that “governance efforts focused on US contexts could have more leverage than efforts in other countries or regions.” Given the lack of transparency around the safety and impact of these systems, the researchers call for active collaboration among academic labs and government agencies to produce coordinated risk assessments. Furthermore, they say, incorporating such indices into existing regulatory frameworks to ensure “unified reporting of agentic systems, common safety benchmarks, and clearer accountability mechanisms” will be crucial.
