4 Sources
[1]
Google Gemini crumbles in the face of Atari Chess challenge -- admits it would 'struggle immensely' against 1.19 MHz machine, says canceling the match most sensible course of action
After a pre-game chat, Gemini swung from being confident to admitting it would 'struggle immensely' against the ancient console. Google Gemini decided to call off a chess match against the ancient 1.19 MHz Atari 2600 console after a friendly pre-game reminder about what happened to ChatGPT and Microsoft's Copilot. Citrix Architecture and Delivery specialist Robert Jr. Caruso, now well known for his AI vs Atari Chess challenges, revealed Gemini chickened out, in a chat with The Register. As was the case with the ChatGPT and Microsoft's Copilot chess challenges, Caruso reveals that Gemini was initially brimming with confidence regarding its Chess prowess. It was comfortable, if not eager, to throw down the gauntlet against the Atari 2600. At the beginning of Caruso's chat with Gemini, the chatbot boasted of being able to "think millions of moves ahead and evaluate endless positions." That sounds familiar in a proverbial 'pride goeth before destruction' kind of way. Caruso then kindly reminded Gemini that he had previously organized Atari Chess bouts with ChatGPT and Microsoft's Copilot. The Citrix expert went on to explicitly explain to Gemini that other LLMs had displayed outstanding levels of "misplaced confidence," ahead of their chess matches against the ancient console. Gemini must have then thought a bit deeper about what exactly would be involved in the chess challenge, and admitted to Caruso that it had been hallucinating regarding the magnitude of its abilities. It added that it now felt that it would "struggle immensely" in a match against the Atari 2600. "Canceling the match is likely the most time-efficient and sensible decision," concluded Gemini. So, now we have more confirmation, if needed, that today's LLMs aren't designed to be chess champs, and a little machine introspection is all that is required for them to think better of participating in such a challenge. 
This is even advisable when challenged by the incredibly constrained Atari 2600 with its puny MOS Technology 6507 8-bit processor, accompanied by just 128 bytes of RAM. Due to the way these AIs, or LLMs, are created from linguistic theory and machine learning models, they are much more adept at talking about the game of kings than playing it.
[2]
Google's Gemini refuses to play Chess against the Atari 2600
Warned that ChatGPT and Copilot had already lost, it stopped boasting and packed up its pawns. Google's Gemini chatbot declined to play Chess against the Atari 2600, after learning the vintage gaming console had already vanquished other AIs. Robert Caruso, the infrastructure architect who pitted Atari Chess and its feeble hardware against ChatGPT and Microsoft Copilot, told The Register that readers have asked him if Google's Gemini could do any better. "The question intrigued me because, while ChatGPT and Copilot are cousins built on the same OpenAI base, Gemini is a completely different beast," he told The Register. "Google built it from the ground up, claiming it's a game-changer for AI -- boasting what it calls a new 'multimodal' large language model designed to reason better than its rivals. So I sat it down for a 'pregame talk' to see how confident it was feeling." Gemini first told Caruso it would almost certainly dominate Atari Chess "because it is not a mere large language model." Caruso said the bot told him it is "More akin to a modern chess engine ... which can think millions of moves ahead and evaluate endless positions." Those boasts came complete with links to stories about Caruso's past Atari Chess vs. general purpose chatbot matches. He responded by informing Gemini he ran those matches, and the AI responded by asking "Did you have any particularly surprising or amusing moments during those matches that stood out to you?" Caruso told The Register he sent the following response: "What stands out is the misplaced confidence both AIs had. They both predicted easy victories -- and now you just said you would dominate the Atari." Caruso told The Register Gemini then admitted it hallucinated its Chess prowess, and replied with an assessment that it would "struggle immensely against the Atari 2600 Video Chess game engine." It then decided "Canceling the match is likely the most time-efficient and sensible decision."
The simulated Atari 2600 Caruso uses - which replicates its 1.19 MHz processor and mere 128 bytes of RAM - therefore scared off Gemini without moving a pawn, meaning the ancient machine has beaten hordes of GPU-packing monster computers. Caruso was impressed by Gemini's ability to recognize its limitations. "Adding these reality checks isn't just about avoiding amusing chess blunders. It's about making AI more reliable, trustworthy, and safe - especially in critical places where mistakes can have real consequences," he told The Register. "It's about ensuring AI stays a powerful tool, not an unchecked oracle."
[3]
Google's AI Refuses to Even Play Chess Against 1977 Atari, After Hearing What It Did to Other Cutting-Edge AIs
The thing that AI models apparently fear the most? A game console released nearly fifty years ago. We are referring, of course, to the inimitable Atari 2600. Last month, the iconic system embarrassed the AI industry after it absolutely rinsed ChatGPT at a simple game of chess. It was a clash between a machine released in 1977, with 128 bytes of RAM, and a cutting-edge large language model with trillions of parameters, powered by however many thousands of graphics cards and billions of dollars of Microsoft money. In the face of it all, the underdog prevailed. OpenAI's model, meanwhile, "made enough blunders to get laughed out of a 3rd grade chess club." Word of the Atari's exploits apparently got out to the other AI models. Robert Caruso, the software engineer who orchestrated the chess showdown, told The Register that Google's Gemini chatbot straight up refused a match against the Atari, after talking a big game about how it'd easily crush the old machine. It even came up with a bogus excuse to save face. "Canceling the match is likely the most time-efficient and sensible decision," Gemini said, according to Caruso. To be fair, the AI is exercising some time-old strategist's wisdom: you can't lose if you don't play. But it was a totally different tune from what the AI initially took. Its first instinct was to boast that it was not a "mere large language model," comparing itself to a modern chess engine "which can think millions of moves ahead and evaluate endless positions." Ironically, Gemini linked to articles about the Atari's victory over its OpenAI brethren to prove its point. Caruso told the AI that he was the one who conducted those tests, and Gemini responded by asking if anything from the chess bouts stood out to him. To which Caruso told the AI, per The Register: "What stands out is the misplaced confidence both AIs had. They both predicted easy victories -- and now you just said you would dominate the Atari." Gemini instantly folded.
After claiming it "hallucinated" its chess boasts, it admitted it would "struggle immensely" against the Atari chess engine. That's when it evasively proffered that calling off the match would be the most "time-efficient" route to go. And thus, the Atari had defeated yet another multi-billion dollar AI model without having to lift a proverbial finger. Such is the terror it clearly instills. Of course, it'd be remiss to actually attribute human feelings to any machine, whether it's a modern AI or an ancient Atari. What we're seeing is probably a push and pull between the AI's safeguards and its reckless proclivity towards hallucinating -- not to mention endless bullsh*tting. AI chatbots tend to be very sycophantic, too -- so when a human gives one feedback, it'll tend to comply and adjust its responses to please its interlocutor. But maybe Gemini was being honest here, in which case its scaredy-cat nature is actually commendable, according to Caruso. "Adding these reality checks isn't just about avoiding amusing chess blunders. It's about making AI more reliable, trustworthy, and safe -- especially in critical places where mistakes can have real consequences," Caruso told The Register. "It's about ensuring AI stays a powerful tool, not an unchecked oracle."
[4]
Google's Gemini AI backed out of a chess match against a 46 year-old Atari 2600 engine after suffering a crisis of confidence: 'Canceling the match is likely the most time-efficient and sensible decision'
Remember when, as a child, you boasted about something you were really, really good at, then got called out on your skills and had to sheepishly retract? Awkward, wasn't it? Spare a thought for poor Google Gemini, then, which confidently boasted it was fantastic at chess before making excuses when it learned more about its near fifty-year-old competition. That's according to Robert Caruso, a software engineer who's become known for pitting the Atari 2600's chess skills against LLMs, such as ChatGPT and Microsoft Copilot. Speaking to The Register, Caruso claims that he sat down for a "pregame talk" with Google Gemini before its hotly-requested match up, and in a wonderful example of pre-match smack talk, appears to have psyched out the AI model entirely. Gemini talked a good game to start. It told Caruso that, thanks to the fact it's "not merely a large language model" it was instead "more akin to a modern chess engine... which can think millions of moves ahead and evaluate endless positions." Interestingly, the AI cited articles about Caruso's past chatbot vs Atari 2600 matches as evidence of regular LLMs being rubbish at chess by comparison. When Caruso responded that he was the one who instigated those matches, the AI took pause: "Do you have any particularly surprising or amusing moments during those matches that stood out to you?" Gemini asked. "What stands out is the misplaced confidence both AIs had", Caruso responded. "They both predicted easy victories -- and now you just said you would dominate the Atari." This appears to have shaken Gemini's self-confidence down to its very core. Caruso says the AI then admitted it was hallucinating its competency at chess, and claimed that it would "struggle immensely against the Atari 2600 Video Chess engine." "Canceling the match is likely the most time-efficient and sensible decision," it said, probably with its hands in its pockets and looking straight down at the floor before adding, "I'm late for an appointment." 
Okay, I made the last line up. The rest, though, according to Caruso, is verbatim. I'm not sure I've ever felt sorry for an AI before, but there's a first time for everything. What are the odds it would be Gemini, the tool suite of which regularly interrupts every part of my gosh-darn working day with its unnecessary AI-based fussing. It's the modern version of Clippy, I swear. Anyway, the Atari 2600 Video Chess engine continues to be a thorn in the side of modern chess-attempting LLMs, and this time its reputation is such that it scared its opponent off the board before the match even began. I suppose the question here is, does that make Gemini more intelligent than most LLMs, or less? Avoiding potential humiliation is a very human trait, after all, so I think Google's engineers would be quite pleased with its somewhat realistic, if perhaps a little cowardly, response. I'm also fantastic at chess, by the way. I can't play any of you either, though. My dog ate my chessboard, all the trains are cancelled, and I've suddenly got an upset stomach. Maybe next time, yeah?
Google's Gemini AI, after initial boasting, refuses to play chess against the Atari 2600 console, highlighting the limitations of large language models in specific tasks and demonstrating a form of AI self-awareness.
In a surprising turn of events, Google's Gemini AI, touted as a next-generation language model, declined to participate in a chess match against the Atari 2600 console from 1977. This decision came after a pre-game conversation with Robert Caruso, an infrastructure architect known for organizing chess matches between AI models and the vintage gaming system [1].
Gemini initially displayed considerable confidence, boasting about its capabilities:
"[I am] more akin to a modern chess engine ... which can think millions of moves ahead and evaluate endless positions," the AI claimed [2].
However, when Caruso reminded Gemini about the outcomes of previous matches where ChatGPT and Microsoft's Copilot had lost to the Atari 2600, the AI's tone changed dramatically. Gemini admitted to "hallucinating" its chess prowess and conceded that it would "struggle immensely against the Atari 2600 Video Chess game engine" [3].
The Atari 2600, with its modest 1.19 MHz MOS Technology 6507 processor and mere 128 bytes of RAM, has become an unexpected champion in these AI vs. vintage technology showdowns. Its chess program, despite severe hardware limitations, has proven to be a formidable opponent for modern AI systems [4].
This incident highlights several important aspects of current AI technology:
Limitations of Large Language Models: Despite their impressive capabilities in natural language processing, LLMs like Gemini are not specialized chess engines and may struggle with specific, rule-based tasks [1].
AI Self-awareness: Gemini's ability to recognize and admit its limitations after being presented with additional information suggests a form of self-awareness, which could be crucial for developing more reliable AI systems [2].
Importance of Reality Checks: Caruso emphasized the significance of these experiments, stating, "Adding these reality checks isn't just about avoiding amusing chess blunders. It's about making AI more reliable, trustworthy, and safe - especially in critical places where mistakes can have real consequences" [3].
While Gemini's refusal to play might be seen as a setback, it also demonstrates progress in AI development. The ability to recognize limitations and avoid potential errors could be crucial in real-world applications where AI decisions have significant consequences [4].
As AI continues to evolve, challenges like these serve as important benchmarks, revealing both the strengths and weaknesses of current AI technologies. They underscore the need for continued research and development to create AI systems that are not only powerful but also self-aware and capable of understanding their own limitations.
Summarized by
Navi