Perspective

We Need to Control Personal AI Data So Personal AI Cannot Control Us

Chris Riley / Aug 18, 2025

Chris Riley is the executive director of the Data Transfer Initiative.

Image: Emotion: Joy by Elise Racine / Better Images of AI / CC BY 4.0

To many, the ultimate question surrounding the explosion of computing capabilities that are broadly labeled “artificial intelligence” is whether, as computers get more powerful, we will control them or they will control us. In this vein, the most important technology policy issue facing the future of artificial intelligence is one that receives very little attention. It’s not about training or models: not the legal questions, not whether training data or model weights are open, not the availability of chips or compute cycles. It’s whether users have fundamental rights to control their personal data as it is used post-training to personalize AI systems for their individual needs and use cases.

Fortunately, we know how to solve this: by extending today’s paradigm of data portability into the generative AI era as the AI ecosystem continues to develop. Designing portability tools for AI will be a non-trivial exercise, but the principles DTI has established and tested in other contexts will guide our work. Rather than work out all the complexities up front, the urgency of this developing market demands that we act now, acknowledging that we will need to iterate and adapt along the way. Thus, in this article, I offer principles for personal AI data transfers and propose that we as a community begin immediately to develop AI data transfer tools, starting with a concrete and yet valuable data model: generative AI conversation histories.

AI systems, and the personal data in them, play a significant role in people’s lives today

Harvard Business Review published an article in April 2025 entitled “How People Are Really Using Gen AI in 2025.” The top 3 use cases? “therapy/companionship,” “organizing my life,” and “finding purpose.” That reflects quite a jump from the 2024 results, which headlined mostly technical and anodyne tasks, including “specific search” and “editing text.” In the fall of 2023, I wrote that the evolution of AI would proceed from new and niche to more familiar and widespread, then to intensely personal: “The market is shifting quickly from commodity generative technologies toward one where it’s not only the raw power of the large language model that matters, but also its ability to tailor itself over time in response to your input. In other words, like the rest of the internet, AI becomes more valuable through user data – and thus poses an increasing risk of lock-in effects.” We are witnessing that transition in real time this year as many embrace AI for more and more personal tasks, and in the process, contribute more and more personal data to AI systems.

This trend will continue. Personal data will keep being contributed and used to provide more and more personalized AI experiences. This isn’t a malicious development; rather, it’s an effort to provide specific and individualized value. But it raises privacy concerns, particularly as AI supplants search for everyday (and often highly sensitive) use cases, as Georgetown scholar Mark MacCarthy has written, and as the US has consistently failed to pass a federal privacy law.

Increasingly, our personal data is significant even after our death. A non-profit organization called Permanent.org is building tools to help individuals preserve their data legacies, and the OpenID Foundation’s Death and the Digital Estate (DADE) Community Group is building mechanisms to ensure continued access to an individual’s data. The upside of these activities is valuable, if a bit mind-boggling: allowing our descendants and future historians to understand this present moment in time, with our warts (physical and metaphorical) captured and preserved in high-resolution, permanent digital storage. AI systems connected to these data stores will massively expand the capabilities, and the risks, of this digital longevity.

Personal data risks constraining our control

More of our data and more of our lives (to the extent these are even separate) will be processed by AI and used to tune powerful AI systems to serve us, at least in theory. But what will we do if we aren’t happy? If we feel like the system we’ve given our data to, and allowed into our lives in intimate ways, is no longer meeting our needs?

Markets are supposed to be our answer here. If we’re not happy, we switch. In other contexts, if the price we’re paying is too high, the quality we’re getting isn’t good enough, or we’re not happy with other terms or qualities of service, we just switch to a competitor.

And there are options in the market today. Looking specifically at general-purpose conversational partners, OpenAI’s ChatGPT has a majority of the market by most standards (see one example from July 2025 here). But Google’s Gemini, Anthropic’s Claude, Perplexity, Microsoft’s Copilot, and Meta’s embedded AI tools all count large, and growing, user bases.

Can a user switch? There aren’t significant lock-in effects currently preventing it. But many see a future of lock-in as plausible, if not inevitable. Openness today is advantageous for everyone, encouraging a race of innovation and growth. At some point, though, these expensive services will need to deliver returns.

Even in the absence of lock-in effects, there aren’t tools or pathways to make it easy for a user to transfer their personal data between services. And as personalization becomes more and more of the value-add being offered, switching without also transferring personal data will become a less and less appealing option. Users risk being stuck and not having control.

The role of government to ensure human control isn’t clear

Historically, this would be the setup for a public advocacy play to develop a comprehensive AI governance regime, perhaps even an entirely new AI regulatory agency. In earlier platform regulatory contexts, both Harold Feld of Public Knowledge and Mark MacCarthy have written entire books on the potential design of a new digital regulator.

There’s room for some level of regulatory intervention, to be sure. The EU has its AI Act, and despite some concerns, its implementation continues; the voluntary General-Purpose AI Code of Practice was released on July 10, providing specific implementation suggestions. Plenty of advocates working in US states and other jurisdictions will push for ambitious rules around fairness and liability, to set proactive guardrails on businesses and software engineers in the development and deployment of AI systems.

But the regulatory climate is different today. A common attitude among many politicians and industry leaders is that top-down guidance from government on how to build systems poses too much risk of cramping innovation. Both AI businesses and prospective regulators perceive a global race, compared often to the Space Race of the mid-20th century, and believe that too heavy a hand will cede the field to China. Which doesn’t mean there won’t be more laws and regulations; just that, to this veteran policy wonk at least, things feel different now, and harder to predict.

I’m not taking a normative view on whether there should be more regulation of AI. I will note, though, that there is a deep – but not unresolvable – tension between a fear of over-regulation and a fear of loss of control. Historically, regulation is a path to asserting control; fearing both a loss of control and the consequences of regulation can be … perplexing.

Protecting data rights provides an opportunity for all stakeholders to build a better future

There is an answer to this dilemma, a path that seeks to align market forces with human interests. And the key to that alignment is protecting data rights. When that is accomplished, people will be more able to effectively choose services that align with their values.

In other words, the single most important thing we need to do, today, for the future of AI to serve us rather than the other way around, is to protect human control over the human data that is used to apply these systems to human contexts and circumstances.

Two specific rights within the historical data rights umbrella are paramount to provide that control: the right to transfer our data, and the right to delete it. The latter is, perhaps, more straightforward than the former – at least when we move beyond the intractable problem of extracting individual data from bulk training sets and focus specifically on personal data stored in a way that facilitates AI personalization: human data that allows AI to be applied to a specific human and to iterate and improve that personalization over time. There are surely complexities in data deletion beyond my expertise, but I will have to leave those to other authors to explore.

With regard to personal data transfer – which is my area of expertise! – there remain unresolved questions, such as those related to data scope, transfer mechanisms, and trust. I broke some of these out in a piece last year, and earlier this year I articulated the concept of “open intersections” to describe the specific nature of openness that will maximize our ability as humans to select AI services that best meet our needs. The AI market is competitive, and ensuring that end users – retail and enterprise – can move freely between service providers is the key to keeping it that way. It becomes a positive cycle: freedom of movement means that people will select services that meet their needs, which in turn creates incentives for businesses to design services in such a way that people both feel, and are, in control.

Now is the time to protect open intersections

Today we can see the moment, need, and opportunity for open intersections in AI:

  • This is the moment in time, because nothing on the business or technology side feels fully established. We’re still at a market development stage where openness is obviously valuable, because it facilitates downstream use and innovation that showcases the value of the platform and the entire ecosystem emerging around it. There’s ample progress in platforms like LangChain and shared protocols like MCP and A2A, efforts to build unified and interoperable communications amongst services.
  • The need is increasingly palpable now, because people – at least outside Silicon Valley – already don’t feel great about AI coming and taking over the world. There is a huge disparity between AI experts and the general public. For example, a Pew Research Center study in April found that “While 73% of AI experts surveyed say AI will have a very or somewhat positive impact on how people do their jobs over the next 20 years, that share drops to 23% among US adults.” Businesses will struggle to achieve returns on their investment if people do not feel more empowered by AI technology in the future.
  • And the opportunity for open intersections exists because current regulatory approaches, such as the data portability obligations present in data protection laws and, increasingly, in competition laws worldwide, provide frameworks for developing open intersections in (potentially at least) pro-market and sensible ways. Open intersections isn’t an inherently new paradigm; it’s an application of obligations that already exist for data services in general.

Open intersections will not address all of the policy problems in the AI landscape. Likeness recreations, like the recent Scarlett Johansson and Christopher Pelkey incidents, will be a thorny knot for some time. And intellectual property questions, hardware export controls, and other issues will persist. But compared to all of those, open intersections as a guiding principle for AI governance is a relatively straightforward agenda to pursue. There are implementation costs to be sure, but the resulting benefits for individuals and for the ecosystem as a whole will dwarf them.

DTI’s AI transfer principles

What would open intersections look like in practice? Here’s what we at DTI suggest as core principles:

  1. Users should be able, at their request, to download personal data from AI services, and to request the direct transfer of personal data between AI services. This data should be in a structured, machine-readable, well-documented format (one illustrative sketch follows this list).
  2. User-directed portability should focus on personal data, and should not extend to training data, model weights, or other elements of AI services not specifically related to the user initiating data transfer; however, personal data used to customize the actions of the AI service should be included as within scope.
  3. Data portability tools and interfaces should adhere to an open, interoperable technical specification to allow users to easily transfer personal data directly between AI services on a reciprocal basis.
  4. AI services, including generative services, AI agents, and tools, should communicate with other services through open protocols and should not impose unduly restrictive terms on data interfaces to ensure that users have their choice of products or services.
  5. Where data is transferred directly between service providers at a user’s request, all parties should employ reasonable, well-documented frameworks and practices for security vetting of the other party to the transfer, including organizational policies regarding data privacy, data security, end user transparency, and authentication of transfer parties and end users.
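As one illustration of the first principle’s “structured, machine-readable, well-documented format,” here is a minimal sketch of what a conversation-history export might look like. The interface and field names are hypothetical, not a published DTI or vendor specification:

```typescript
// Hypothetical shape for a conversation-history export file.
// Names are illustrative only; a real specification would be developed
// and iterated on through the tool-building process described in this piece.
interface ConversationExport {
  schemaVersion: string;      // e.g. "0.1", so importers can handle format changes
  exportedAt: string;         // ISO 8601 timestamp of the export
  sourceService: string;      // the AI service the data came from
  user: { id: string; displayName?: string };
  conversations: Conversation[];
}

interface Conversation {
  id: string;
  title?: string;
  createdAt: string;
  messages: Message[];
}

interface Message {
  role: "user" | "assistant"; // who authored this turn
  createdAt: string;
  content: string;            // plain text; attachments would need their own model
}
```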

Now, critical to this is the definition of “personal data.” Here’s our take:

Personal data includes all user data provided by a user to an AI service, as well as responses generated by an AI service and provided to a user, and applicable user activity data collected and stored by a service. Where personal data specific to an individual has been acquired from another source and is associated with that individual user by the AI service and used for customization or personalization of the AI service or communication between the AI service and the user, such data shall also be included.

Personal data does not include data that has been combined or processed together with data from other users, such as aggregated data used for example in enterprise applications.
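To make that scope boundary concrete, here is a rough sketch of how an export tool might apply the definition when deciding which stored records belong in a user-directed transfer. The record categories and function below are hypothetical illustrations of the definition, not any service’s actual data model:

```typescript
// Hypothetical categories of records an AI service might hold about a user.
type RecordKind =
  | "user_provided"     // prompts, uploads, and preferences the user supplied
  | "service_response"  // responses generated by the service for this user
  | "activity"          // applicable user activity data collected and stored
  | "acquired_personal" // data from another source, linked to this user and used
                        // for personalization or communication with the user
  | "aggregated";       // data combined or processed together with other users' data

interface StoredRecord {
  kind: RecordKind;
  userId: string;
  payload: unknown;
}

// In scope for a user-directed export under the definition above: everything
// tied to the requesting user, except multi-user aggregates.
function isInExportScope(record: StoredRecord, requestingUserId: string): boolean {
  if (record.userId !== requestingUserId) return false;
  return record.kind !== "aggregated";
}
```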

These are our AI transfer principles. To be clear, they’re not revolutionary when looking at the AI market today. They wouldn’t require radical changes; they’re more a continuation of the status quo. But protecting them going forward is essential to keep the landscape open for innovation. And some stakeholders predict a very near future where services begin to lock down users, partners, and data to create moats for greater revenue and investor returns. Thus, we must put in place the necessary policy infrastructure to preserve and protect open intersections in AI.

Drawing lines in implementing principles is always a difficult task. Data in AI is complex, as the Open Data Institute’s taxonomy work illustrates. With the speed of evolution involved, though, these subtleties aren’t best worked out through traditional consultative, multistakeholder processes and consensus; technology assumptions and designs would change in the meantime. Instead, it’s necessary to move quickly from principles to shipping tools, and then take in additional feedback and iterate.

That’s our approach to data portability at DTI. We don’t stop with principles; we work to translate them through stakeholder coordination all the way to shipping tools that get into the hands of users. We started on this journey with AI portability last year, when we partnered with our affiliate, the AI firm Inflection, to help it provide user downloads of conversation histories in a well-structured, easy-to-reuse format.

Our work on personal AI data transfers starts with conversation history, as it is the most concrete and immediate-term use case. But as providers continue to develop and ship more complex articulations of user experience, often under the anthropomorphized term “memory,” the need for collective work on data models and transfer tools will grow more complex, and more important.
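As a sketch of where this work heads in practice, the following shows how a transfer tool might map a downloaded conversation history into another service’s import format. Both the export shape (a minimal view of the hypothetical format sketched earlier) and the destination’s import shape are assumptions for illustration, as are the file names:

```typescript
import { readFileSync, writeFileSync } from "node:fs";

// Minimal view of the hypothetical export format sketched earlier;
// only the fields this conversion needs.
interface ExportFile {
  conversations: {
    title?: string;
    messages: { role: "user" | "assistant"; content: string; createdAt: string }[];
  }[];
}

// Hypothetical import shape for a destination service. A real service would
// publish its own documented format; the point is that a well-structured
// export makes this kind of mapping straightforward.
interface ImportPayload {
  threads: { name: string; turns: { speaker: string; text: string }[] }[];
}

const exported: ExportFile = JSON.parse(readFileSync("chat-export.json", "utf8"));

const payload: ImportPayload = {
  threads: exported.conversations.map((c) => ({
    name: c.title ?? "Untitled conversation",
    turns: c.messages.map((m) => ({ speaker: m.role, text: m.content })),
  })),
};

writeFileSync("import-payload.json", JSON.stringify(payload, null, 2));
```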

It’s going to be hard, and the pace of innovation isn’t likely to slow down, but at the same time, ensuring continued human control of our AI experiences is paramount. And data transfer is central to that control.

Authors

Chris Riley
Chris Riley is Executive Director of the Data Transfer Initiative and a Distinguished Research Fellow at the University of Pennsylvania’s Annenberg Public Policy Center. Previously, he was a senior fellow for internet governance at the R Street Institute. He has worked on tech policy in D.C. and San...
