Juan David Gutiérrez is an associate professor at Universidad del Rosario (Colombia), where he researches and teaches public policy and artificial intelligence.
In the last week of March, a judge in Peru and a magistrate in Mexico claimed that they used OpenAI’s ChatGPT to motivate a second instance ruling and to illustrate their arguments in a court hearing, respectively. Peruvian and Mexican news reports framed the use of ChatGPT in judicial proceedings as a positive innovation and did not raise concerns about how the chatbot was used.
The Peruvian judge and the Mexican magistrate are not the first judicial officials in Latin America to turn to ChatGPT to draft and/or motivate part of their decisions. In February, I reported on a judge and a magistrate in Colombia who used ChatGPT to draft judicial decisions (Maia Levy Daniel also reported on the cases in Tech Policy Press).
Moreover, at the beginning of March, five of the ten Court of Appeals Judges that participated in the process to fill one of the vacancies of the Supreme Court of Chile discussed the use of artificial intelligence (AI) in the judiciary. One of them even referred to the case of the Colombian judge who transcribed four prompts and answers from ChatGPT to motivate his ruling.
The new cases in Peru and Mexico demonstrate why the judges should have taken greater care when using large language models (LLMs) in their work.
Peruvian judge uses ChatGPT to decide how to do mathematical calculations
In the Peruvian case, the use of ChatGPT appeared to be very limited (Process No. 00052-2022-18-3002-JP-FC-01). The Specialized Judge of the Transitory Civil Court of San Juan de Miraflores, of the Judicial District of South Lima, decided a second instance case related to the determination of the child support obligations that a mother and a father had with respect to their daughter, who was born in December 2021.
The judge “invoked” ChatGPT in the motivation section of the ruling, in which he had to calculate the value of the child support obligation that corresponded to the father and the mother in proportion to their “economic possibilities and personal conditions”. In the eleventh recital, the judge reported that:
“…through the assistance of the Artificial Intelligence platform of Open AI – Chat GTP (sic), it is appropriate to apply the technique of mathematical proportion, in order of establishing what is the contribution that corresponds to each parent, according to their income, to meet the living expenses of their daughter.”
In other words, the judge apparently asked ChatGPT which mathematical operation was required to calculate the proportion of child support that each parent should assume according to their respective economic capacities. It is worth adding that the judge was not entirely transparent about how he used the system: he did not reveal the prompt or prompts he submitted, nor the answer(s) he received. We also do not know whether he prompted ChatGPT to perform the calculation directly.
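The proportional split the ruling describes is elementary arithmetic that deterministic code handles reliably, unlike an LLM. A minimal sketch of the "technique of mathematical proportion" follows, with hypothetical income and expense figures, since the ruling does not disclose the actual amounts:

```python
# Minimal sketch of splitting a child support obligation between two
# parents in proportion to their incomes. All figures are hypothetical;
# the Peruvian ruling does not disclose the actual values.

def proportional_shares(total_support: float, incomes: dict[str, float]) -> dict[str, float]:
    """Split total_support among parents in proportion to their incomes."""
    total_income = sum(incomes.values())
    return {parent: total_support * income / total_income
            for parent, income in incomes.items()}

# Hypothetical example: the father earns 3,000 and the mother 2,000
# (any currency), and the child's monthly expenses amount to 1,000.
shares = proportional_shares(1000, {"father": 3000, "mother": 2000})
print(shares)  # father pays 600.0, mother pays 400.0
```

A few lines of verifiable arithmetic like this would have left a clear, auditable trail in the ruling, which is precisely what a link to a private chatbot session does not.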
The only trace the judge left about his query to the system was a footnote in which he transcribed the URL of his interaction with ChatGPT. Perhaps the judge was unaware that including that URL offered no clue about his ChatGPT query because other users cannot access his past interactions with the system.
This point is not minor, because it indicates that the judge was not familiar with how the tool works. Moreover, if he asked ChatGPT to perform the calculation itself, it suggests he was unaware that LLMs of this kind are not trained to perform precise mathematical calculations. ChatGPT's unreliability on such queries is well documented, and it is not difficult to reproduce the frequent errors it makes when faced with these types of questions.
In summary, there were two problematic aspects of the Peruvian judge’s use of ChatGPT to draft the judgment that was rendered on March 27, 2023: first, he was not transparent about how he used the system to support his decision; second, he was apparently unfamiliar with how the system works, which may imply that he was unaware of its limitations and risks.
Mexican magistrate uses ChatGPT to inquire about the expression “you know who”
At the hearing of the SUP-JE-21/2023 process, held on March 29, 2023, Magistrate Reyes Rodríguez Mondragón, president of the Superior Chamber of the Electoral Tribunal of the Mexican Judiciary, indicated that he had consulted ChatGPT on his phone. He read some of the outputs that he obtained from the chatbot to illustrate how a ruling could be easily motivated with the help of ChatGPT.
Before explaining how ChatGPT was used by the magistrate and the discussion that ensued in the Tribunal, it is pertinent to provide the context of the case. In this process, the Electoral Tribunal had to decide about an appeal filed against a first judgment that dealt with the use of the expression “you know who” (in Spanish, “ya sabes quién”) as part of the electoral pre-campaign advertisements of the Morena party in the State of Mexico. The legal question addressed by the Electoral Tribunal was whether the use of the expression could generate a situation of imbalance in the pre-campaign, since it could be interpreted as a signal of support by President Andrés Manuel López Obrador.
In the hearing in which the first instance ruling of the local electoral court was examined, Magistrate Rodríguez Mondragón argued that the judgment should be revoked, since it did not include a contextual analysis of the use of the expression "you know who" and, therefore, suffered from a legal defect due to lack of completeness.
To illustrate how a ruling could be motivated, Magistrate Rodríguez Mondragón stated the following:
“For example, here on the cell phone, I have been making consultations to PT Chat (sic) […] if you know who ‘you know who’ is and the answer is that in the Mexican political context it is the president […] they refer to President Andrés Manuel López Obrador and it gives us an explanation that this reference was popularized in 2018 in the campaign.”
The magistrate did not stop there. He made additional queries to ChatGPT which he referenced during the hearing:
“I also asked PT Chat (sic)… who was referred to as the ‘unnamable’ or ‘the one who is not named’, and it answers that to Voldemort in the Harry Potter series. […] I think that if I also use the same paragraphs, I could ask whether this is an act in anticipation of a campaign or pre-campaign and it provides an explanation, that is, if the artificial intelligence gives us an explanation of context with motivations, this is what is expected from a court that reasons about the expressions that are analyzed.”
The magistrate then continued his argument that courts of first instance should offer arguments and reasons to justify their rulings, and insisted on comparing that duty with ChatGPT's supposed ability to reason. He ended with an implicit invitation to use systems such as ChatGPT to draft judicial rulings:
“[…] the Electoral Tribunal of the State of Mexico reiteratively is not complying with the principle and duty of exhaustiveness… it is not providing sufficient reasons… when even technology now facilitates a series of information, obviously processing databases and all the knowledge that is available to the courts.”
Unlike the Peruvian judge, the Mexican magistrate did expressly explain the queries he put to ChatGPT and how he proposed that judges rely on such systems to draft judicial rulings. But the Mexican magistrate, like the Peruvian judge, seemed to be uninformed about the limitations of LLMs. Given that these systems provide incorrect and imprecise information, and even mix fictional elements into their answers, they are not an ideal tool for answering queries on factual matters.
Afterwards, Justice José Luis Vargas took the floor to criticize the way in which his colleague proposed that judges use tools such as ChatGPT:
“I would like to think that what you have just told us is simply an isolated example and is not a forecast of what will be the jurisprudence of this court…because I would be concerned that now our resolutions will be taken based on what ChatGPT says.”
Then Justice Vargas expressed his concern about what role the courts would have if in the future ChatGPT was to define cases from start to finish, arguing that this type of system “still has quite a few errors” and that “that is the reason why I think we human beings will be in charge of this kind of positions for a good while.”
Justice Vargas is right to be concerned about the risk of judicial officers relying on ChatGPT to resolve legal, technical, and factual questions. The tool may be useful for exploring topics, but its answers should not be used to motivate rulings without detailed verification of every sentence in the synthetic text the system generates.
Automated or semi-automated decision systems used by states to perform their public functions should always be subject to close human scrutiny and control. This is especially important in the context of judicial functions, as certain uses of some artificial intelligence tools may have negative consequences for the fundamental rights of the parties to the proceedings, and of third parties.
It seems likely that judges in other Latin American countries will use ChatGPT to draft rulings and to participate in court hearings. In other countries in the Global South, such as India and Pakistan, similar cases have emerged in recent weeks. Two common themes run through these cases: the idea that LLMs may save time in drafting judicial decisions, and the judges' lack of digital literacy, since each of them used ChatGPT as if it were a reliable source of information.
As it has been widely reported and documented, the outputs produced by these systems often contain inaccurate, incorrect, and even fictitious information. LLMs may be useful for exploring topics, summarizing texts, and even for inspiration, but they are not suitable tools for answering factual, technical, or even legal questions.
Hence, it is urgent that the bodies in charge of judiciary governance in Latin America open a dialogue on the use of LLMs as support tools for the drafting of judicial decisions. It is necessary to publicly discuss which types of AI and which uses of these technologies are appropriate, and which should be avoided.
Finally, judicial operators should have basic knowledge about how LLMs work, how they were developed, what tasks they can suitably support, and what limitations and risks they present. They should not turn to ChatGPT as if it were an oracle that can answer everything reliably.