The way some AI works makes it unsuitable for certain applications in legal work, according to one lawyer who is, as he put it, "at the courtroom coalface".
Alan Parfery is an Advocate Depute in Crown Counsel, the group of independent senior lawyers from Scotland's Crown Office who prosecute the most serious criminal cases in the High Court. But Parfery is no Luddite: he is also a Member of the Probable Futures Project, a UK-wide responsible AI initiative that seeks to track the impact of AI on the criminal justice system:
AI seems to me to be truly the transformational technology of our time, something that may go on even to dwarf the internet in terms of its disruption and impact. But my current view is there remains a significant gap between how lawyers are using AI out of court, and how it is being used inside the courtroom. My submission is that there is a good reason for that gap. I can tell you, in my capacity as prosecuting counsel, that my worst nightmare would be prosecuting a case that results in a miscarriage of justice. And because of that risk to justice, I submit that courts have always been - and are always likely to be - careful gatekeepers of what evidence is allowed into court.
Parfery's point is that the gap lies between how AI works in theory and the reality of hearing a case in a jury trial, which demands that humans remain in the loop. He explains:
There was recently a homicide case in Scotland where the accused was alleged to have struck the deceased with a knife in the chest, causing death. The Crown case was that this was an intentional murder, whereas the defence case was that it was death by accident and that the deceased had, essentially, caused their own death during a struggle. The Crown commissioned an expert report which used AI to model the statistical probability of the death being accidental. Its conclusion was that there was a 0.25 percent probability of it being an accident.
So, you might think that sounds like a strong piece of evidence for the prosecution: how could that have been an accident? But in fact, the appeal court held that the report was not admissible as it usurped the function of the jury by placing a probability on whether the accused, the defendant, was telling the truth or not.
In short, it strayed into the realm of automated justice via probability. Even 0.25 percent leaves a margin of error, however slim, and in a criminal trial that margin can amount to reasonable doubt: a one-in-400 chance is still a chance. A probability is just that: not a certainty, nor a statement of evidenced fact. Unlikely things do happen. Parfery explains:
I refer to this appeal case to make the point that, while AI may well work wonders in helping prepare cases for trial, in document review, and maybe even in legal research with the lawyer being responsible for the output, when evidence gets to court there is a gap between that promise and the practical reality. So long as there are humans in court cases - civil or criminal, whether they are witnesses, counsel, judge, or jury - there will always be elements that are unpredictable and there will always be problems that arise during litigation that can only be solved by human lawyers. AI-powered lawyers, yes, but human lawyers nonetheless. So, my submission is that the courtroom may well be the ultimate stress test for AI, because there every assumption can be challenged, and every output interrogated, and every flaw exposed.
Plus, any AI is itself prone to error, which might include bias against demographic groups inherited from historical training data. Witness the notorious case of the COMPAS risk-assessment algorithm in the US, which was found to wrongly flag black defendants as likely reoffenders far more often than white defendants, skewing the risk scores that courts consulted when sentencing.
In 2025, research from Stanford University looked at AI in the legal sector, in the wake of US attorneys presenting fake caselaw in court. It found that even AIs trained on specialist, trusted data sets were prone to hallucinating and inventing citations. As noted on diginomica:
Stanford's Human-Centered Artificial Intelligence institute (Stanford HAI) noted: 'The Lexis+ AI and Ask Practical Law AI systems produced incorrect information more than 17% of the time, while Westlaw's AI-Assisted Research hallucinated more than 34% of the time'. Granted, those were better figures than the Institute's 2024 research, which found that general-purpose chatbots, such as ChatGPT, hallucinated between 58% and 82% of the time on legal queries. Even so, the implication is that accurate source data is just one element of a deeper problem.
In the US, there have been at least 150 cases of experienced lawyers presenting ChatGPT hallucinations in court. Parfery is certainly aware of these problems. He says:
It will be necessary for all lawyers referring to cases to always check their caselaw. In the United States, the equivalent of contempt of court has been used against lawyers who have used AI-hallucinated caselaw. There have been fines imposed and we all know the reputational risk because, let's face it, none of us want to end up on the front page of the Guardian or any other paper for all the wrong reasons.
So, what might a good use of AI be in a courtroom? Parfery shares a moving story:
A real judge, Sheriff Alistair Carmichael, publicly - and I consider courageously - revealed that he has Motor Neurone Disease. He went on to make recordings of his own voice so that an AI could use it, and this would allow him to continue to do the work that was so valuable to him and to the communities he serves. Now it is not a robotic voice giving legal directions to juries, but his own.
So, I say to you that may very well be the best example of how AI can be used responsibly, with a dramatic and important effect for individuals and, indeed, for the cases that the Sheriff continues to hear.
He adds:
In Scotland, where a witness's evidence is pre-recorded, an AI-generated transcript is produced, and that is then given to the trial judge. Now, the AI isn't being used in evidence as such, it is not being given to juries, but it is part of the trial judge's preparation pack that is available to them. AI is also being used to transcribe the content of preliminary hearings. I understand it can be a significant time saver and, indeed, I've even seen it being used in trial courts to resolve issues as to what may have been said by a witness in a video.
So, what of the future, given that Parfery is a Member of the Probable Futures Project on AI? He says:
A transparent approach of acknowledging when AI is being used for legal matters will be critical for public confidence. It's much better to use and to acknowledge the use of AI in a responsible and limited way, rather than simply to hope that it won't enter the courtroom.
These will be important matters to wrestle with, because AI does appear to be entering the courtroom. I suggest that all judges, lawyers, academics, and others who contribute to policy work would be well advised to keep abreast of both the risks and opportunities of AI.
Then he adds:
For those that continue to say AI is all hype and it's a bubble, or that it is nowhere near transformative, I would question whether they will continue to be able to have the ear of the court.
Valuable insights from the cutting edge of legal work. Automated justice is still some years away, and we should all be grateful for that. As Parfery noted in his conclusion:
* AI is influencing how litigation is prepared and conducted. However, while useful, the technology presents risks that demand careful professional management. A lawyer's ethical judgement must remain paramount.
* Meanwhile in the courtroom, legitimacy and fairness must always take precedence over efficiency.
On the use of AI for transcription, however, I would urge greater caution: my own experience of AI summaries and transcripts, especially those powered by GPT, is that they are often inaccurate, with the summaries missing nuance, subtext, and context. You generally end up with, in effect, a PR reading of a real conversation.
Plus, as I noted in a September 2024 report on transcription service Otter, the increasingly AI-powered platform had taken to flying in data from unknown, unverifiable external sources and adding it to its summary of a real-life conversation, effectively rewriting history and putting words in an interviewee's mouth.
So I would advise lawyers to use such tools with extreme caution, and always to check both transcript and summary against the original audio. But doing so presents a problem, of course: in general, busy professionals use AI primarily to save time, so having to check its workings undermines its advantage as a time-saving tool.