Almost everyone has at least one account on some kind of private messaging app. Using end-to-end encryption, these services ensure that only the sender and the intended recipient of a message can decrypt and read it. While these apps protect user privacy and promote freedom of speech, encryption creates challenges for moderation, since moderators have no access to message content.
The encryption debate, as Sofia Lesmes and Kathryn Waldron pointed out in their recent Tech Policy Press article, is stuck in a frame of “security vs. privacy”. This has been reflected in the debate surrounding Apple’s recently announced child sexual abuse material (CSAM) scanning protocol. The protocol was proposed as a security measure, in response to calls from law enforcement to prevent CSAM on private messengers. It reveals a user’s iCloud content to moderators only when the number of CSAM artifacts found reaches a certain threshold. Yet this supposedly privacy-preserving system was criticized by scholars and advocates on privacy grounds, and Apple delayed its rollout as a result. An important question remains: if even such a privacy-preserving proposal breaks the promise of private messaging, is moderation in end-to-end encrypted environments fundamentally impossible?
Before diving into this question, let us take a closer look at Apple’s proposal. The proposed system embeds user images into a binary representation called a NeuralHash on the client side. If two images are perceptually similar, their NeuralHashes will most likely have the same value. The central server maintains a list of NeuralHashes of known CSAM images and runs a cryptographic protocol with the client such that, if the number of client-side NeuralHashes matching entries in the server’s list exceeds a certain threshold, the corresponding images are revealed to the server. The protocol ensures that: 1) if the number of matches does not meet the threshold, all the NeuralHashes and images, as well as the exact number of matches, remain secret from the server; and 2) if an image’s NeuralHash does not match any value in the CSAM list, the image and its NeuralHash remain secret.
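To make the threshold rule concrete, here is a toy sketch in Python. It is emphatically not Apple’s actual protocol: the real system uses cryptographic techniques (threshold secret sharing and private set intersection) so that the server learns nothing below the threshold, whereas this sketch omits all cryptography and simply shows the all-or-nothing disclosure logic. The hash function and all names are invented for illustration.

```python
# Toy illustration of threshold-based perceptual-hash matching.
# NOT the real NeuralHash or Apple's protocol; crypto is omitted.

def toy_perceptual_hash(pixels):
    """Hash a flat list of grayscale pixels to an integer: one bit per
    pixel, set when the pixel is brighter than the mean (average-hash style)."""
    mean = sum(pixels) / len(pixels)
    return sum(1 << i for i, p in enumerate(pixels) if p > mean)

def matches_above_threshold(client_hashes, known_hashes, threshold):
    """Return the matched hashes only if the match count reaches the
    threshold; otherwise reveal nothing, mimicking the rule that below
    the threshold the server learns neither the matches nor their count."""
    matched = [h for h in client_hashes if h in known_hashes]
    return matched if len(matched) >= threshold else None
```

The point of the sketch is the disclosure rule: with a threshold of 3, a client holding only two matching images reveals nothing, while a third match unlocks all matched content at once.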
Such a system seems to preserve user privacy and protect child safety perfectly, as only bad content is revealed to the server. But is that really the case? Privacy researchers and cryptographers have long warned that such a system is a slippery slope toward surveillance and censorship. It could be repurposed into a speech-censorship tool by silently replacing the content on Apple’s CSAM list with something else, say, certain names or images associated with a political group, without anyone noticing. Some countries could pass laws forcing companies to report users who send or store content the state considers sensitive. Authoritarian regimes could use such a tool to surveil journalists who risk their lives to shed light on human-rights violations.
So, in order to keep the promise of private messengers and protect freedom of speech, should we leave end-to-end encrypted communications unmoderated? While attention has focused on moderating CSAM in end-to-end encrypted environments, it is worth noting that CSAM is not the only problematic content that needs moderation. Misinformation on private messengers, for example, also causes significant harm to society, sometimes at a real cost in lives.
Or, can we reimagine abuse mitigation such that the privacy promises are kept intact and users are protected? Traditional moderation frameworks are set up in a top-down system, providing tools for the platform to regulate communication among users. To rethink abuse mitigation from a user-centric perspective, one can imagine providing technical solutions that empower users with information to combat various threats online.
Unfortunately, the CSAM moderation system Apple proposed cannot easily be applied in this scenario. Replacing the list of CSAM images with a list of known misinformation images would let the server identify users who frequently send and receive those images, while giving users themselves no extra information in return.
Moreover, users who receive and forward misinformation images often have no adversarial intent; they have simply been misinformed. Well-designed interventions could warn users that such images may be misleading and harmful, discouraging them from circulating the content further. One could imagine a moderation system that labels manipulated media or false information in WhatsApp conversations, similar to the labels Twitter and Facebook have employed. Such a system has to be privacy-preserving, scalable, and efficient: user content should stay private; the system must cope with the enormous volume of misleading content online; and warnings should arrive in real time, before users forward the content further.
At Cornell Tech, we took some first steps to address the challenge of realizing such a system. In a new paper that will be presented at USENIX Security 2022, we propose a prototype concept called “similarity-based bucketization,” or SBB. A client reveals a small amount of information to a database-holding server so that it can generate a “bucket” of potentially similar items. This bucket would be small enough for efficient computation, but big enough to provide ambiguity so the server doesn’t know exactly what the image is, protecting the privacy of the user. The key to SBB is to strike the correct balance of obtaining enough information to warn users against possible abuses while preserving user privacy.
After generating the bucket, the server could reveal the bucket to the client such that the client could check if the images they receive resemble any items in the bucket. A more secure approach for the server is to perform a multi-party computation protocol with the client using the bucket as an input, such that the bucket content will remain secret while the client learns if their image was included in the bucket. Upon learning this information, the client could choose to report this image to the server or a centralized moderator. Note that unless the client is willing to do so, no client content will be revealed to the server.
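The simpler of the two variants above, in which the server reveals the bucket to the client, can be sketched as follows. This is a hypothetical illustration, not the construction from the paper: the coarse “prefix” scheme, the Hamming-distance check, and all names are invented for exposition, and the multi-party-computation variant that keeps the bucket secret is omitted entirely.

```python
# Hypothetical sketch of similarity-based bucketization (SBB).
# The prefix-based bucketing here is illustrative only; the MPC
# variant and the paper's actual similarity machinery are omitted.

PREFIX_BITS = 8  # coarse info revealed: enough to bucket, too little to identify

def coarse_key(hash64):
    """The only information the client reveals to the server:
    the top PREFIX_BITS of its 64-bit perceptual hash."""
    return hash64 >> (64 - PREFIX_BITS)

def server_bucket(database, key):
    """Server collects all known misinformation hashes sharing the
    coarse key; many unrelated images share a key, giving ambiguity."""
    return [h for h in database if coarse_key(h) == key]

def client_check(image_hash, bucket, max_distance=4):
    """Client checks locally whether its image is close (in Hamming
    distance) to any bucket item; the full hash never leaves the client."""
    return any(bin(image_hash ^ h).count("1") <= max_distance for h in bucket)
```

The design choice to surface is the ambiguity: because the revealed key is deliberately coarse, the server can narrow the candidates to a bucket but cannot tell which image, if any, the client actually holds; the final match is computed on the client’s side.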
This prototype cannot fully resolve the tension between privacy and moderation, and it cannot be applied to report CSAM spreading in private messages. In fact, it was intentionally designed not to serve that purpose. As noted above, even privacy-preserving solutions that reveal nothing but policy-violating content to platforms or law enforcement can be turned to surveillance and censorship. But this does not mean technical tools cannot help users defend themselves against abuse. We consider this work a new direction out of the “security vs. privacy” frame. With this prototype, we ask scholars, practitioners, and policymakers to reimagine abuse mitigation in end-to-end encrypted environments, and to work on technical innovations that give users more information and more control over what they see on private messaging apps.