7 Sources
[1]
OpenAI beats Elon Musk's Grok in AI chess tournament
ChatGPT-maker OpenAI has beaten Elon Musk's Grok in the final of a tournament to crown the best artificial intelligence (AI) chess player. Historically, tech companies have often used chess to assess the progress and abilities of a computer, with modern chess machines virtually unbeatable against even the top human players. But this competition did not involve computers designed for chess - instead it was held between AI programs designed for everyday use. OpenAI's o3 model emerged unbeaten in the tournament and defeated xAI's model Grok 4 in the final, adding fuel to the fire of an ongoing rivalry between the two firms.
[2]
OpenAI o3 checkmates Grok in a chess showdown, and it wasn't even close
Magnus Carlsen delivered biting commentary on the quality of Grok's logic. The AI chess tournament between OpenAI's o3 model and xAI's Grok 4 invited plenty of speculation as a kind of proxy battle between the two companies and their respective CEOs. Any comparison to the days of Deep Blue and Bobby Fischer soon faded, though, as OpenAI o3 repeatedly wiped out Grok 4, winning four games in a row, accompanied by the derisive commentary of former world chess champion Magnus Carlsen and grandmaster David Howell. The showdown happened on Kaggle's Game Arena, a digital coliseum where AI models battle in chess and other games. The tournament featured eight of the most prominent LLMs in the business: OpenAI's o3 and o4-mini, Google's Gemini 2.5 Pro and Flash, Anthropic's Claude Opus, DeepSeek, Moonshot's Kimi, and xAI's Grok 4. The final came down to Grok and o3, but Grok's performance in the final round didn't seem like a battle of champions. Carlsen and Howell veered between serious commentary and a roast as Grok's play came off as erratic. In the first game, it quickly sacrificed its bishop, then began trading pieces like it was in a hurry to go home. Things didn't improve for Grok in the next game. "[Grok] is like that one guy in a club tournament who has learnt theory and literally knows nothing else," Carlsen said during the second game. "Makes the worst blunders after that." Grok's performance was so off-the-rails that Carlsen rated it around 800 ELO, slightly above a beginner. He gave o3 a modest but respectable 1200, in the middle of most hobby players. Though o3 didn't play brilliantly, it didn't have to. It played solid chess. It didn't blunder pieces. It converted its advantages and carried out classic chess maneuvers. "o3 is fairly ruthless in conversions; it looks like a chess player. Grok looks like it learnt a few opening moves and knows the rules, but not much more," Carlsen said. "Grok's moves are chess-related moves. They just came at the wrong time and in weird sequences."
The chess wasn't the main point of the tournament, despite its prominence. It was about how general-purpose AI models handle events with strict rules like chess games. Turns out, they're not great, but o3 is the best of the limited sample. As AI becomes embedded in everything, the ability to follow rules and spot patterns becomes essential. Chess is a uniquely transparent way to observe that. You either made the right move or you didn't. When a model plays well, you can see the logic; otherwise, queens fall like dominoes, and the game becomes as confused as that metaphor. Chess is a window into how well an AI can plan, evaluate options, avoid catastrophic mistakes, and stay logically consistent. If Grok throws away a queen because it doesn't grasp long-term consequences, what might it do in a legal document, or when booking travel? That the final was between OpenAI and xAI added some drama, with Sam Altman and Elon Musk at loggerheads in public. The chess final didn't resolve the battle between them, but it did give OpenAI a PR win in the realm of public perception, and a limited but very real compliment from Magnus Carlsen.
[3]
Sam Altman's OpenAI Crushes Elon Musk's Grok in AI Chess Championship - Decrypt
Sam Altman's OpenAI o3 model -- which was deprecated late last week with the release of GPT-5 -- demolished Elon Musk's Grok 4 in four straight games Thursday to win Google's Kaggle Game Arena AI Chess Exhibition. You may think it was a super-complex spectacle of high-tech behemoths putting their reasoning to the ultimate test, but as an appetizer, let's just say world champion Magnus Carlsen compared both bots to "a talented kid who doesn't know how the pieces move." The three-day tournament, which ran August 5-7, forced general-purpose chatbots -- yes, the same ones that help you write email and claim to be approaching human-level intelligence -- to play chess without any specialized training. No chess engines, no looking up moves, just whatever chess knowledge they'd randomly absorbed from the internet. The results were about as elegant as you'd expect from forcing a language model to play a board game. Carlsen, who co-commentated the final, estimated both AIs were playing at the level of casual players who had recently learned the rules -- around 800 ELO. For context, he's arguably the best chess player who ever lived, with an ELO of 2839. These AIs were playing like they'd learned chess from a corrupted PDF. "They oscillate between really, really good play and incomprehensible sequences," Carlsen said during the broadcast following the game. At one point, after watching Grok walk its king directly into danger, he joked it might think they were playing King of the Hill instead of chess. The actual games were a masterclass in how not to play chess, even for those who don't know the game. In the first match, Grok essentially gave away one of its important pieces for free, then made things worse by trading off more pieces while already behind. Game two got even weirder. Grok tried to execute what chess players call the "Poisoned Pawn" -- a risky but legitimate strategy where you grab an enemy pawn that looks free but isn't.
Except Grok grabbed the wrong pawn entirely, one that was obviously defended. Its queen (the most powerful piece on the board) got trapped and captured immediately. By game three, Grok had built what looked like a solid position -- good positional control, no obvious dangers, and basically a setup that could help it win the match. Then, in the midgame, it fumbled the ball directly to its opponent, losing piece after piece in rapid succession. This was actually weird, considering that before the match against o3, Grok had been a pretty strong contender, showing solid potential -- so much so that chess grandmaster Hikaru Nakamura praised it: "Grok is easily the best so far, just being objective, easily the best." The fourth (and last) game provided the only genuine suspense. OpenAI's o3 made a massive blunder early in the game, putting it in real danger. Nakamura, who was streaming the match, said there were still "a few tricks" left for o3 despite the disadvantage. He was right -- o3 clawed its way back, regained its queen, and slowly squeezed out a victory while Grok's endgame play fell apart like wet cardboard. "Grok made so many mistakes in these games, but OpenAI did not," Nakamura said during his livestream. This was quite the reversal from earlier in the week. The timing couldn't have been worse for Elon Musk. After Grok's strong early rounds, he'd posted on X that his AI's chess abilities were just a "side effect" and that xAI had "spent almost no effort on chess." That turned out to be an understatement. Before this "official" chess tournament, International Master Levy Rozman hosted his own tournament earlier this year with less advanced models. He honored every move the chatbots recommended, and the whole thing ended up a complete mess of illegal moves, piece summonings, and incorrect calculations. Stockfish, an engine built specifically for chess, ended up winning that tournament against ChatGPT.
In that earlier event, Altman's AI was matched against Musk's in the semifinals, and Grok lost; so it's 2-0 for Sam. However, this tournament was different. Each bot got four chances to make a legal move; if it failed all four times, it automatically lost. This wasn't hypothetical. In early rounds, AIs tried to teleport pieces across the board, bring dead pieces back to life, and move pawns sideways like they were playing some fever-dream version of chess they'd invented themselves. They got disqualified. Google's Gemini grabbed third place by beating another OpenAI model, salvaging some dignity for the tournament organizers. That bronze-medal match featured a particularly absurd drawn game where both AIs had completely winning positions at different points but couldn't figure out how to finish. Carlsen pointed out that the AIs were better at counting captured pieces than actually delivering checkmate; they understood material advantage but not how to win. It's like being great at collecting ingredients but unable to cook a meal. These are the same AI models that tech executives claim are approaching human intelligence, threatening white-collar jobs, and revolutionizing how we work. Yet they can't play a board game that has existed for 1,500 years without trying to cheat or forgetting the rules. So it's probably safe to say AI won't be taking control of humanity, for now.
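The four-strikes rule described above is simple to picture in code. Kaggle's actual harness is not public, so this is a minimal hypothetical sketch (all names are my own): the model is polled for a move up to four times, each illegal or unparseable answer burns one attempt, and exhausting all four means an automatic loss.

```python
# Hypothetical sketch of the tournament's legality rule: a model gets up to
# four tries to produce a legal move, otherwise it forfeits the game.
# The function names and harness design are assumptions, not Kaggle's code.

def move_or_forfeit(propose_move, is_legal, max_attempts=4):
    """Return the first legal move the model produces, or None (forfeit)."""
    for _ in range(max_attempts):
        move = propose_move()
        if is_legal(move):
            return move
        # Illegal output (teleported piece, resurrected piece, sideways pawn)
        # simply burns one of the four attempts.
    return None

# Usage: a stub "model" that hallucinates twice before finding a legal move.
attempts = iter(["Qz9", "Ke9", "e4"])
legal_moves = {"e4", "d4", "Nf3"}
result = move_or_forfeit(lambda: next(attempts), lambda m: m in legal_moves)
# result == "e4"; four illegal answers in a row would instead return None.
```

In a real harness, `is_legal` would delegate to a chess library's move generator rather than a fixed set.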
[4]
OpenAI's o3 Beats Grok 4 to Win AI Chess Tournament | AIM
Just a day before OpenAI's o3 reasoning model was officially retired from ChatGPT on August 8 in favour of GPT-5 with built-in reasoning capabilities, it won Kaggle's AI Chess Exhibition Tournament. The model beat xAI's Grok 4 to win the final 4-0, having reached it by beating the Kimi K2 Instruct model in the quarter-finals and the company's own o4-mini with the same scoreline. "Up until the semifinals, it seemed like nothing would be able to stop Grok 4 on its way to winning the event," said Chess.com, the world's leading platform for online chess. "But the illusion fell through on the last day of the tournament. The chatty o3 simply dismantled its mysterious opponent with four convincing wins. Grok's play was unrecognisable."
[5]
Checkmate! Sam Altman's ChatGPT Beats Elon Musk's Grok in AI Chess Final
The world watched with bated breath as ChatGPT, developed by Sam Altman's OpenAI, went head-to-head with Grok, Elon Musk's latest AI venture, in a high-stakes chess final. In what can only be described as a grandmaster duel of the tech era, ChatGPT emerged victorious, marking yet another significant milestone in the competitive evolution of artificial intelligence. The tournament, which was broadcast to millions of people, wasn't only about who could checkmate whom; it was a symbolic fight between two tech titans' ideas of how AI should work in the future. Grok came into the battle with aggressive new moves meant to throw ChatGPT off balance, but ChatGPT's precision, flexibility, and coolness under duress made all the difference. People who saw the games said they were more than just a series of planned moves: an example of AI chess at its best, combining deep computing with a kind of creative problem-solving that was thought to be possible only for human players. The intense ChatGPT performance stunned even seasoned grandmasters.
[6]
Magnus Carlsen beats ChatGPT in chess game without looking - VnExpress International
OpenAI's chatbot played black, with a human assistant entering Carlsen's moves and executing ChatGPT's on the physical board. Early on, ChatGPT called Carlsen's opening "strange," to which he smiled and agreed, but he soon described the AI's approach as "passive." Unlike top chess engines such as Stockfish, ChatGPT occasionally offered very human-like ideas, and sometimes non-existent piece locations or proposed illegal moves. In the spirit of the game, Carlsen let some stand. Midgame, Carlsen made a move that ChatGPT warned was "risky." When Carlsen asked if it saw the danger, the bot appeared confused, and he took advantage of this. A human assistant filtered out a few of ChatGPT's illegal move suggestions and replaced them with legal ones, with Carlsen's consent. In the endgame, Carlsen jokingly "moved" his king like a knight, an illegal move that ChatGPT nevertheless praised as strong. From there, the board turned into chaos as ChatGPT revived pawns, moved rooks into odd positions and captured pieces without a logical reason. Carlsen eventually delivered the checkmate by having his queen leap over his own king, which is also illegal, but the result was accepted given the exhibition setting and the AI's prior absurd moves. The match took place after what organizers billed as the first AI chess tournament, held Aug. 5-7, where OpenAI entered the competition with two representatives: the o3 large language model and the faster o4-mini. Powered by o3's reasoning, ChatGPT won the event 4-0 in the final against Elon Musk's Grok 4, with o3 making significantly fewer errors than the model used in Carlsen's exhibition. The o4-mini model is a speed-optimized variant available free to ChatGPT users, while access to the more capable o3 is limited for free accounts to a small number of uses per month.
[7]
ChatGPT defeats Grok 4 to win chess tournament for AI - VnExpress International
With a 4-0 win over Elon Musk's AI Grok 4 in the final, ChatGPT's o3 Large Language Model (LLM) took the crown at the first AI chess tournament. About an hour before the final, OpenAI announced the launch of its latest LLM generation, called GPT-5. However, the model ChatGPT used in the final was still o3, the company's strongest reasoning product. Facing xAI's Grok 4, which had performed well in the earlier rounds, o3 showed superior strength with an average move accuracy of 90.8%, compared to Grok 4's 80.2%, according to Chess.com. In the four games, ChatGPT checkmated its opponent after 35, 30, 28 and 54 moves. World number two Hikaru Nakamura said Grok 4 seemed nervous and anxious in this match, so it did not perform as well as in the previous two. It often lost pieces easily in the final, which had rarely happened in its victories over Google's Gemini 2.5 Flash and Gemini 2.5 Pro. The o3 model ended the tournament with three 4-0 match wins and an average accuracy above 91% across its 12 games. Although its strength is not comparable to that of a grandmaster, players with an online chess rating of 2,000 or lower may have difficulty facing o3, especially in blitz or super-blitz games. ChatGPT, a product of OpenAI, is a pioneering chatbot of the AI revolution, launched on Nov. 30, 2022. At that time, it used GPT-3.5, a general-purpose model. The GPT line represents versatility, while the o line leans towards reasoning. The o3 was released in January 2025, and the o4-mini appeared three months later. These two models were OpenAI's representatives in the first AI chess tournament in history. Grok 4 is a creation of xAI, owned by the world's richest man, Elon Musk. He said that Grok 4 had hardly learned chess before participating in this tournament. The tournament was organized by Google on the Kaggle platform over three days, Aug. 5-7, with eight AIs competing in a single-elimination format.
The two Chinese representatives, Kimi K2 and DeepSeek, were both eliminated with heavy losses in the quarterfinals. The other six representatives in the tournament were all from American companies; these are among the strongest AIs in the world. In the third-place match, Gemini 2.5 Pro beat o4-mini with a score of 3.5-0.5. The LLMs participating in the tournament were all non-chess specialists. At the same time, another chess tournament was taking place between eight chess engines, which received less attention. These eight engines had Elo ratings ranging from 3,576 (Integral) to 3,731 (Stockfish). Some of them used AI algorithms to improve their performance. They were all far above human level.
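The rating numbers quoted across these reports (Carlsen's 2,839, the bots' estimated 800-1,200, the engines' 3,576-3,731) map directly onto win expectancy via the standard Elo formula. A minimal sketch (the function name is my own):

```python
# Standard Elo expected score for player A against player B:
#   E_A = 1 / (1 + 10 ** ((R_B - R_A) / 400))
# A 400-point gap corresponds to roughly 10-to-1 odds per game.

def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5) for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# An o3-level 1200 vs. Grok's estimated 800: o3 expects ~0.91 points per game.
o3_vs_grok = elo_expected_score(1200, 800)     # ~0.909

# Carlsen (~2839) vs. an 800-rated bot: his expected score is essentially 1.0,
# and the 3,500+ engines stand in the same relation to any human player.
carlsen_vs_bot = elo_expected_score(2839, 800)
```

This is why Carlsen's 800-vs-1200 estimate, casual as it sounded, predicts the lopsided 4-0 final reasonably well.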
OpenAI's o3 model decisively defeated xAI's Grok 4 in a high-profile AI chess tournament, showcasing the capabilities and limitations of general-purpose AI in rule-based games.
In a groundbreaking event that captured the attention of both the tech and chess communities, OpenAI's o3 model emerged victorious in the Kaggle Game Arena AI Chess Exhibition. The tournament, which ran from August 5-7, pitted some of the most prominent large language models (LLMs) against each other in a chess competition designed to test their general-purpose capabilities [1].
Source: VnExpress International
The competition featured eight leading LLMs, including OpenAI's o3 and o4-mini, Google's Gemini 2.5 Pro and Flash, Anthropic's Claude Opus, DeepSeek, Moonshot's Kimi, and xAI's Grok 4 [2]. These AI models, primarily designed for everyday use and not specialized in chess, were challenged to play without access to chess engines or move databases, relying solely on their general knowledge [3].
The tournament culminated in a final match between OpenAI's o3 and xAI's Grok 4, with o3 securing a decisive 4-0 victory [4]. This outcome was particularly noteworthy given Grok's strong performance in earlier rounds, which had led to high expectations [3].
The matches were accompanied by commentary from chess luminaries, including former world champion Magnus Carlsen and grandmaster David Howell. Their insights provided a unique perspective on the AI models' performance:
Skill Level: Carlsen estimated both AIs were playing at around 800 ELO, comparable to casual players who recently learned the rules [3].
Gameplay Observations: The experts noted that while o3 demonstrated more consistent and logical play, Grok's performance was erratic, often making basic blunders [2].
Strategic Understanding: Carlsen remarked that o3 showed a better grasp of chess principles, while Grok seemed to have learned opening moves but struggled with overall strategy [2].
Source: Analytics Insight
This tournament served as a litmus test for the reasoning capabilities of general-purpose AI models when faced with rule-based challenges:
Pattern Recognition: The competition highlighted the AIs' ability to recognize patterns and follow complex rules, albeit with varying degrees of success [2].
Limitations Exposed: The models' struggles with chess strategy and long-term planning revealed current limitations in AI reasoning [3].
Real-world Applications: The tournament's results raise questions about the readiness of these AI models for complex decision-making tasks in other domains [2].
Source: BBC
The competition's outcome has sparked discussions about the current state and future direction of AI development:
OpenAI's Triumph: The victory of o3 over Grok 4 has been seen as a PR win for OpenAI in the ongoing rivalry with xAI [2].
Model Retirement: Interestingly, o3's victory came just before its retirement and replacement with GPT-5, which boasts built-in reasoning capabilities [4].
Future Competitions: The success of this tournament may pave the way for similar competitions testing AI capabilities in other rule-based domains [5].
Summarized by Navi