Can Bots Read Your Encrypted Messages? Encryption, Privacy, and the Emerging AI Dilemma
Mallory Knodel, Andrés Fábrega / Feb 27, 2025

It may seem like AI chatbots are taking over every digital application, whether we like it or not. You might have noticed more AI note-taking bots in online conferencing platforms, some of which offer end-to-end encryption (E2EE). Then Apple announced its Apple Intelligence plans, promising application redesigns that bring AI features to its phone and laptop operating systems. The latest changes have come from Meta AI’s integration in WhatsApp, replete with “bots nobody wants.”
Any time new features are added to an E2EE messaging app, it raises concerns about privacy and security. So, what concerns are raised by the addition of AI bots? How can we evaluate those concerns? As AI becomes more embedded into encrypted services, is it possible to resolve the tension between the privacy users expect from E2EE and the data access needed for AI functionality? With our colleagues at Cornell and NYU, we set out to answer these questions.
We uncovered several facets of this question, from both a technical and a legal perspective, and published a paper laying out practical recommendations for E2EE messaging platforms and regulators. It is also important to outline practical solutions and recommendations for the public, which we do below. You can read the full preprint paper here.
Background
Online messaging systems such as iMessage, WhatsApp, and Signal act as intermediaries for every communication between their users. End-to-end encryption (E2EE) is a standard for secure communication designed to ensure that only the sender and the intended recipients can read the messages exchanged between them. Encrypted messages are called “ciphertexts,” and unencrypted content, even an image or video, is called “plaintext.” The core requirement of private and confidential messaging is that the service provider (and any other third party) cannot read these communications.
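To make the plaintext/ciphertext distinction concrete, here is a minimal sketch using the PyNaCl library (a wrapper around libsodium). This is a toy illustration, not the actual protocol used by iMessage, WhatsApp, or Signal, which layer on considerably more machinery; the point is simply that the relaying server only ever handles ciphertext.

```python
# Minimal illustration of end-to-end encryption with PyNaCl (libsodium).
# This is a toy sketch, not the protocol real messengers use.
from nacl.public import PrivateKey, Box

# Each endpoint generates its own key pair; private keys never leave the device.
alice_private = PrivateKey.generate()
bob_private = PrivateKey.generate()

# Alice encrypts for Bob using her private key and Bob's public key.
sending_box = Box(alice_private, bob_private.public_key)
ciphertext = sending_box.encrypt(b"meet at 6pm")  # this is all the server sees

# The server can store and forward `ciphertext` but cannot decrypt it.
# Only Bob, holding his private key, can recover the plaintext.
receiving_box = Box(bob_private, alice_private.public_key)
plaintext = receiving_box.decrypt(ciphertext)
assert plaintext == b"meet at 6pm"
```

Real messengers add forward secrecy, authentication, and group handling on top of this basic pattern, but the property at stake here is the same: only the endpoints hold the keys.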
AI assistants are programs designed to interpret everyday language and perform computational tasks. Today’s AI assistants are able to handle a wide range of tasks, including text analysis, content creation, code generation, language translation, and more. At the core of these technologies are programs trained on data to identify patterns, e.g., large language models (LLMs) such as OpenAI’s ChatGPT, Google’s Gemini, and Meta’s Llama, which process complex inputs (queries) and provide contextually relevant responses.
Putting this together, it seems impossible for an application with no access to message content to run AI processing on that same content. To introduce AI into E2EE, the strict boundaries of E2EE are stretched along two dimensions: what counts as an “end,” and where that end is located.
It would not be a strict violation of E2EE system design if you were to copy and paste your messages into a chatbot (though you might be violating the norms of confidentiality and privacy in the context of the conversation). If an application were to do that for you, however, it would need to guarantee that the processing happens in a way that preserves E2EE, such as processing on the end (your device) rather than on another computer, such as the application servers. Both Apple Intelligence and Meta AI have proposed “trusted execution environments” (TEEs) as the way to keep the data used for AI training and processing private.
However, a key finding in our analysis is that TEEs are insufficient to achieve the strict confidentiality and privacy guarantees of E2EE. But first, let’s discuss what this looks like in practice:
How AI Interacts with Your Encrypted Messages
AI features—such as message summarization, smart replies, and chatbots—are being seamlessly integrated into a wide range of applications with the goal of enhancing user experience. However, AI models critically require access to vast amounts of plaintext user data to power these tools. There are two main ways in which AI features interact with application data. First, they receive user data as part of queries during regular feature usage. For example, a message summarization tool receives as input a list of sent and received messages and outputs a summary of these. Second, user data is generally used to continuously train and refine the AI features. For example, data could be used to fine-tune models and improve their general performance, personalize models for the usage patterns of individual users, etc.
These considerations raise significant concerns for integrating AI features in E2EE applications. Processing of user content—during regular feature usage and model training—could expose sensitive user data to the parties who own the AI models. While some lightweight features can be implemented with smaller models that live on end-user devices (and thus, all data is processed locally), other features require offloading user data to more powerful models on the application servers. This is directly in tension with the strong security promises of E2EE applications, which require that no user data is visible to third parties.
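As a rough illustration of the two architectures described above, consider the following hypothetical sketch. The function names and the local_model object are invented for illustration and do not correspond to any real platform's API; what matters for E2EE is whether the decrypted messages ever leave the device.

```python
# Hypothetical sketch: where does the plaintext go?
# All names below are illustrative only, not a real messaging or AI API.
import json
import urllib.request

def summarize_on_device(messages: list[str], local_model) -> str:
    """Endpoint-local processing: plaintext never leaves the user's device."""
    prompt = "Summarize this conversation:\n" + "\n".join(messages)
    return local_model.generate(prompt)  # small model running locally

def summarize_via_server(messages: list[str], api_url: str) -> str:
    """Server-side processing: decrypted messages are sent to the provider,
    so the E2EE guarantee no longer covers them once they leave the device."""
    payload = json.dumps({"messages": messages}).encode("utf-8")
    request = urllib.request.Request(
        api_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["summary"]
```

Even if the connection to the server is encrypted in transit, the provider operating that endpoint sees the plaintext, which is precisely what E2EE is meant to prevent.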
There are a number of privacy-enhancing tools that attempt to address these issues by allowing cloud-based AI models to process user data while protecting it from the model owners. However, these tools offer varying degrees of security, and not all of them provide the same strong privacy guarantees as E2EE. It is critical that any adopted solution be compatible with E2EE, and that the models' processing of user data not undermine users' privacy expectations. Unfortunately, none of the existing technologies that are compatible with E2EE security, such as fully homomorphic encryption (FHE), are yet practical, since they currently cannot efficiently evaluate the large models used in AI applications. More practical approaches, such as hardware-based solutions (e.g., running models inside TEEs), do not meet the strong confidentiality guarantees of E2EE and raise additional security considerations. While they represent a substantial privacy improvement over plaintext processing and may be an appropriate solution in other contexts, they are not suitable for E2EE environments.
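To give a flavor of what encrypted processing means, here is a toy example using the python-paillier library (phe). Paillier encryption is only additively homomorphic, far short of the fully homomorphic encryption needed to evaluate a large AI model, but it shows the core idea: a server can compute on ciphertexts without ever seeing the underlying values.

```python
# Toy illustration of computing on encrypted data with python-paillier (phe).
# Paillier supports adding ciphertexts and multiplying them by plaintext
# constants, which is far short of evaluating a large AI model.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# The user encrypts their data before sending it to the server.
encrypted_a = public_key.encrypt(17)
encrypted_b = public_key.encrypt(25)

# The server can compute on the ciphertexts without decrypting them.
encrypted_sum = encrypted_a + encrypted_b
encrypted_scaled = encrypted_a * 3

# Only the user, who holds the private key, can read the results.
assert private_key.decrypt(encrypted_sum) == 42
assert private_key.decrypt(encrypted_scaled) == 51
```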
Furthermore, even if an AI feature could somehow process user content in encrypted form (e.g., if FHE became practical for this task), training AI models on E2EE data raises an additional security concern: AI models are well known to inadvertently “memorize” training data, which can lead them to reproduce that data in their responses, or even allow deliberate extraction by actors who can query the model (so-called “adversarial attacks”). So even if training is performed privately, E2EE data could be exposed to other application users who can query (but not observe) the model. While certain technologies mitigate memorization and adversarial attacks (e.g., differential privacy), they do not meet the strong security guarantees of E2EE.
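For readers unfamiliar with differential privacy, the basic idea is to add calibrated random noise so that any single person's data has only a bounded effect on what is released. Below is a minimal sketch of the Laplace mechanism applied to a simple count; real private training (e.g., DP-SGD) applies the same principle to model gradients. As noted above, this bounds leakage statistically but does not provide the confidentiality guarantee of E2EE.

```python
# Minimal sketch of the Laplace mechanism, a basic building block of
# differential privacy. Real DP training applies the same idea to gradients,
# with clipping and careful privacy accounting.
import numpy as np

def dp_count(records: list[bool], epsilon: float) -> float:
    """Release a noisy count of how many records are True.

    The count has sensitivity 1 (one person changes it by at most 1),
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(records)
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Smaller epsilon means more noise: stronger privacy, less accuracy.
records = [True, False, True, True, False]
print(dp_count(records, epsilon=0.5))
```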
"Whose Bot Is It?" The Tension Between AI, E2EE, and Ownership
We need to address the key tension: users often treat AI assistants as personal tools, but these bots belong to corporations that control data access. Especially in the context of E2EE messaging, we risk treating encryption as just one more feature that happens to conflict with AI, but it is more than that.
Users might expect their data to remain private when communicating with these bots, especially since platforms advertise E2EE features. However, driven by business incentives, we are seeing a trend away from privacy and toward the use of new sources of user data (some of it from E2EE messaging platforms) to train AI models, sometimes without explicit user consent. This practice not only undermines the privacy protections that E2EE is meant to provide, but also puts at risk the massive privacy gains made over the last decade.
At the same time, AI assistants aren’t very good (yet), and many people don’t want them. Like dark patterns, they add to a long list of repeated failures in corporate tech that have dulled our sense of how well applications actually perform when they work for us rather than for the companies that build them.
Practical Solutions and Recommendations to Regulators and Platforms
Our technical and legal analysis is meant to inform the design and implementation of AI in E2EE platforms so as to preserve user privacy and expectations of confidentiality. Verbatim, here are the recommendations based on our findings:
- Training. Using end-to-end encrypted content to train shared AI models is incompatible with E2EE.
- Processing. Processing E2EE content for AI features (such as inference or training) may be compatible with end-to-end encryption only if the following recommendations are upheld:
  - Prioritize endpoint-local processing whenever possible.
  - If processing E2EE content for non-endpoint-local models,
    - No third party can see or use any E2EE content without breaking encryption, and
    - A user’s E2EE content is exclusively used to fulfill that user’s requests.
- Disclosure. Messaging providers should not make unqualified representations that they provide E2EE if the default for any conversation is that E2EE content is used (e.g., for AI inference or training) by any third party.
- Opt-in consent. If offered in E2EE systems, AI assistant features should generally be off by default and only activated via opt-in consent. Obtaining meaningful consent is complex and requires careful consideration, including but not limited to the scope and granularity of opt-in/out, ease and clarity of opt-in/out, group consent, and management of consent over time.
Similarly, regulators and platforms can make design decisions to mitigate some of the privacy challenges posed by AI features to E2EE applications, such as:
- Opt-in Features: AI features should be off by default and only activated via explicit opt-in mechanisms. Each individual feature should have a separate opt-in mechanism, with unambiguous disclosure notices of what each feature entails. Once activated, users should be able to subsequently turn off AI features if they desire.
- Privacy Settings and Granularity: Messaging services should provide granular privacy settings that let users control what specific data is used for AI features and how much is stored or processed (a minimal sketch of such per-feature settings follows this list).
- Ease of Setting Adjustment: AI-related settings (such as turning off AI features or adjusting data usage policies) should be easy to find and navigate. There must be a low barrier to toggling off.
- Clear Disclosure: Messaging services should be transparent about when and how AI interacts with encrypted messages, including the precise level of security offered by their systems. This includes specifying whether AI features process user data in ways that provide weaker privacy protections than E2EE (e.g., using trusted hardware). These disclosures should be clearly and prominently displayed. Relatedly, companies should adopt a policy of over-disclosure, ensuring that users are fully informed about data usage even when interacting with AI.
- Data Ownership and Access: Services should clarify who owns and controls the data AI uses, ensuring users understand that they are interacting with corporate-owned models and not private personal assistants.
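As a minimal illustration of the opt-in and granularity recommendations above, here is a hypothetical per-feature settings structure a messaging client might expose. Every name in it is invented for illustration and is not drawn from any real platform.

```python
# Hypothetical per-feature AI settings for an E2EE messaging client.
# All field and class names are invented for illustration purposes.
from dataclasses import dataclass, field

@dataclass
class AIFeatureSetting:
    enabled: bool = False                 # off by default; requires explicit opt-in
    allow_cloud_processing: bool = False  # endpoint-local only unless the user opts in
    allow_training_use: bool = False      # off by default per the recommendations above
    disclosure_shown: bool = False        # user saw a clear notice before enabling

@dataclass
class AISettings:
    summarization: AIFeatureSetting = field(default_factory=AIFeatureSetting)
    smart_replies: AIFeatureSetting = field(default_factory=AIFeatureSetting)
    chatbot: AIFeatureSetting = field(default_factory=AIFeatureSetting)

    def disable_all(self) -> None:
        """One low-barrier switch to turn every AI feature off."""
        for setting in (self.summarization, self.smart_replies, self.chatbot):
            setting.enabled = False
```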
We recommend that regulators consider mandating these five practices and that platforms take proactive steps to implement them.
Practical Solutions and Recommendations for the Public
Aside from informing technologists, companies, and regulators, it’s important to provide actionable steps the public can take to protect their privacy in the era of AI and encryption. Based on what we know about how companies like Apple and Meta plan to integrate AI into applications that have promised privacy, here is what anyone can do to help maintain their privacy and confidentiality:
- Choose OS-level app permissions carefully: Device-wide AI capabilities like Apple Intelligence mean that you need to be aware of which of your applications interact with AI features.
- Review App Settings: Regularly check the privacy settings on your applications. If you're concerned about privacy, turn off AI-based features like message summarization or smart replies.
- Be Aware of What You’re Sharing: Be mindful of the data you share with AI services, especially personal or sensitive information that could be used for training or other purposes. When applications tell you they might use your data for training AI, believe them. Passwords, contact information, and a wide variety of sensitive information might end up in an AI model and out of your control.
- Beware of Opt-in Conditions: If you choose to invoke Meta AI or Apple Intelligence features in a private or confidential setting, be sure you understand the limitations: does opting in apply to all chats? Is it forever?
- Talk to Your Contacts: If you are having sensitive conversations over E2EE services, have a conversation with relevant contacts, and make sure they aren’t inviting bots to the conversation.
Conclusion
AI features are being developed at a rapid pace, raising significant security risks for users of E2EE applications. It is crucial that AI innovation does not come at the expense of user privacy, and that the strong protections expected from E2EE applications are maintained. Absent perfect technical solutions, service providers should inform and empower users to navigate the interplay between AI and privacy, with transparent disclosures of how data is processed, user-friendly consent mechanisms, and granular controls over how data is used.