AI passes the Turing Test as GPT-4.5 fools humans 73% of the time in landmark study

Reviewed by Nidhi Govil

2 Sources


A University of California San Diego study provides the first empirical evidence that modern AI can pass the Turing Test. When given persona prompts, GPT-4.5 was judged to be human 73% of the time, more often than the actual human participants. The findings force a rethink of what the test measures: not raw intelligence, but human-likeness.

Advanced Large Language Models Pass Historic Benchmark

A groundbreaking study from the University of California San Diego has delivered the first empirical evidence that AI passes the Turing Test, the iconic benchmark created by British mathematician Alan Turing in 1950 [1]. Published in the Proceedings of the National Academy of Sciences, the research tested whether large language models could imitate human conversation so well that people could not reliably distinguish them from real humans [2]. The results are striking: GPT-4.5, when equipped with persona prompts, was judged to be human 73% of the time, significantly more often than the actual human participants [1].

How the Experiments Revealed AI Can Seem More Human Than Real Humans

The study involved randomized, controlled experiments with two independent participant groups: UC San Diego undergraduates and online participants. In text-based conversations, interrogators chatted simultaneously with a human and an AI, then decided which was which [2]. Researchers tested four different systems: state-of-the-art models GPT-4.5 and LLaMa-3.1-405B, alongside baseline models GPT-4o and ELIZA, a classic 1960s rule-based chatbot. LLaMa-3.1-405B was judged human 56% of the time, statistically indistinguishable from actual humans, while baseline systems ELIZA and GPT-4o were selected as human only about 23% and 21% of the time, respectively [1].
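To make "statistically indistinguishable" concrete, here is a minimal Python sketch of how a win rate can be compared against the 50% expected if interrogators were guessing. It is an illustration only: the trial count of 100 is a hypothetical assumption (this article does not report sample sizes), the function names are invented for the example, and the exact binomial test shown is a standard technique that may differ from the analysis the study's authors actually used.

```python
from math import comb


def win_rate(judged_human: int, trials: int) -> float:
    """Fraction of trials in which the witness was picked as the human."""
    return judged_human / trials


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value against a chance rate p.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null hypothesis.
    """
    pmf = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small tolerance so floating-point ties are counted consistently.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    HYPOTHETICAL_TRIALS = 100  # assumption for illustration, not the study's n
    # Reported win rates from the article, applied to the hypothetical n.
    reported = {
        "GPT-4.5 (persona)": 73,
        "LLaMa-3.1-405B (persona)": 56,
        "ELIZA": 23,
        "GPT-4o": 21,
    }
    for system, wins in reported.items():
        rate = win_rate(wins, HYPOTHETICAL_TRIALS)
        p_value = binomial_two_sided_p(wins, HYPOTHETICAL_TRIALS)
        verdict = "differs from chance" if p_value < 0.05 else "consistent with chance"
        print(f"{system}: {rate:.0%} judged human, p = {p_value:.3g} ({verdict})")
```

Under these assumed counts, a 56% win rate cannot be told apart from guessing, while 73% sits well above chance and the baseline systems sit well below it. With a different number of trials the p-values would change, so the sketch shows the logic of the comparison rather than the study's actual statistics.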

Source: Neuroscience News


Persona Prompts Enable Mimicking Human Communication

The key to success lay in persona prompts that instructed each model to adopt a specific human character and communication style. "What we found is that if given the right prompts, advanced LLMs can exhibit the same tone, directness, humor and fallibility as humans," said corresponding author Cameron Jones, who conducted the research while earning his doctorate in Cognitive Science [2]. Without explicit instructions, performance dropped dramatically: GPT-4.5 fell to a 36% win rate and LLaMa-3.1 to 38% [1]. This suggests that while models possess the ability to appear human-like, they often need humans to tell them how.

Measuring Human-Likeness Rather Than Raw Intelligence

Study coauthor Ben Bergen, a professor of cognitive science at UC San Diego, explained that the Turing Test now measures something different from what Turing originally imagined. "The Turing test started as a way to ask whether machines could rival human intelligence," Bergen said. "But now we know AI can answer many questions faster and more accurately than people can, so the real issue isn't raw brainpower" [1]. The models did not win through displays of knowledge. They won because they made mistakes as humans would, exhibiting social and behavioral traits rather than superior math and logic problem-solving [2].

Source: Newswise


Implications for Online Trust and Future Interactions

The findings carry serious implications for online trust, especially since the models passed as human during extended five- or 15-minute conversations. "It's relatively easy to prompt these models to be indistinguishable from humans," said Jones, now an assistant professor of Psychology at Stony Brook University. "When you interact with strangers online people should be much less confident that they know they're talking to a human rather than an LLM" [2]. Jones also highlighted darker risks: "The Turing test is a game about lying for the models. One of the implications is that models seem to be really good at that" [2]. Bergen added that being unable to tell whether you are interacting with a human or a bot can have serious consequences, particularly as bad actors might deploy bots to persuade people or manipulate social dynamics online.

© 2026 TheOutpost.AI All rights reserved