2 Sources
[1]
Spirit AI beats Nvidia on RoboArena robotics benchmark
Chinese startup Spirit AI tops the RoboArena leaderboard Nvidia co-built, scoring 1,924 to Nvidia's 1,881 as physical AI becomes the next tech battleground.
Two days. That is how long Nvidia's latest robotics model sat at the top of the RoboArena leaderboard before a startup from Hangzhou knocked it off. On Wednesday, Spirit AI announced that its foundation model for embodied intelligence, Spirit v1.6, had scored 1,924 on the benchmark, edging out Nvidia's Cosmos3-Nano-Policy at 1,881. A second Nvidia project, DreamZero, came third with 1,763. It is the first time a Chinese model has claimed the top spot on RoboArena, a benchmark Nvidia co-developed with Stanford University and the University of California, Berkeley.
The timing stings. Nvidia had launched its Cosmos 3 omnimodel at Computex in Taipei on 1 June, calling it the "open frontier foundation model for physical AI." Trained on 20 trillion tokens of multimodal data, Cosmos 3 was supposed to demonstrate the company's dominance in a category it practically invented. Spirit AI had other plans.

What physical AI actually measures
RoboArena does not test chatbot fluency or image generation quality. It evaluates how effectively a generalist robot policy translates into real-world actions: object manipulation, navigation, tool usage, perception, planning, and adaptability in unfamiliar environments. In short, it measures whether a machine can think and then do.
Physical AI relies on two core capabilities. Policy capabilities determine a model's ability to act on what it observes, which is exactly what RoboArena measures. World capabilities determine a model's ability to simulate and predict what happens next if a specific action is taken. The industry is moving towards integrating both. Last September, Chinese researchers introduced a unified "Policy World Model" architecture that merges world modelling and trajectory planning into a single system. That convergence is now accelerating across the sector.

China is leading on multiple fronts
Spirit AI's RoboArena result is not an isolated data point. Across the broader ecosystem of physical AI benchmarks, Chinese firms hold leading positions in nearly every category. On the WorldArena benchmark, which evaluates embodied world models, the top spot belongs to WorldScape-0.2 from Manifold AI. It outperforms Nvidia's Cosmos-Predict 2.5 in the policy evaluator track. The perception track is led by AgiBot, one of China's largest robotics firms, with its GenieEnvisioner-Sim2.0-2B model unveiled last week. The data engine track is topped by DexForce, another Chinese startup. On the WorldScore benchmark, which tests a model's ability to generate worlds from text prompts, Manifold AI's WorldScape-0.2 again leads, outperforming WonderJourney, a joint project from Stanford and Google.

The money pouring in
The technical results are underpinned by an extraordinary wave of capital. Spirit AI announced a 1.5 billion yuan ($222 million) financing round on Wednesday, its fourth in just three months. That pace is reportedly the most aggressive fundraising seen in the embodied AI sector. Earlier rounds had already pushed the company's valuation past 10 billion yuan ($1.4 billion). On the same day, XYZ Embodied AI, incubated by the Beijing Academy of Artificial Intelligence, said it had closed its pre-A round, having raised 1 billion yuan in just 10 months to develop "embodied brains" and world models. Manifold AI has completed five funding rounds in 10 months, with its latest in April securing reportedly hundreds of millions of yuan.
The broader Chinese robotics sector attracted $3.4 billion in venture funding in 2025 alone, 42 per cent more than the United States. That gap appears to be widening in 2026.

Nvidia's counter-strategy
Nvidia is not standing still.
At Computex, CEO Jensen Huang announced a partnership with Chinese robotics firm Unitree, which is preparing a $7 billion IPO, and Singaporean robotic hand maker Sharpa to build a humanoid robot reference design. The platform combines Unitree's H2 Plus humanoid body, Sharpa's Wave tactile hands, and Nvidia's Jetson AGX Thor T5000 processor. Huang also launched the Cosmos Coalition, enlisting AI labs including Agile Robots, Black Forest Labs, Runway, and Skild AI to advance open world models.
The message is clear: Nvidia wants to be the infrastructure layer for the entire physical AI ecosystem, even if individual models lose benchmark crowns. But Huang himself identified the sector's fundamental bottleneck. "For robotic systems and physical AI, data is the hardest problem," he said at Computex. That admission points to why China may hold a structural advantage.

The data question
Alexandr Wang, the Scale AI founder who joined Meta as its first chief AI officer in 2025, reportedly said last year that China was "fundamentally very well positioned on data" and that many US companies relied on Chinese data to train robotics foundation models. The claim has not been independently verified in full, but it aligns with observable policy. In tech hubs like Beijing and Shenzhen, Chinese authorities have established state-backed "data factories" to collect robotics training data at scale. China's manufacturing supply chain, with its dense network of factories and assembly lines, provides a natural environment for the kind of real-world robotic interaction data that remains expensive and difficult to gather in the West.
If data really is the hardest problem, the country with the most factories, the most robots, and the most willingness to collect may already have the answer.
[2]
Chinese robotics startup overtakes the world's most valuable firm to top global AI ranking
Chinese robotics startup Spirit AI has surpassed U.S. tech giant Nvidia to claim the top spot on a major global benchmark for embodied artificial intelligence. The Zhejiang-based company said on Wednesday that its Spirit v1.6 foundation model scored 1,924 points on the RoboArena leaderboard, ahead of Nvidia's Cosmos3-Nano-Policy at 1,881, according to the South China Morning Post.
RoboArena measures how effectively general-purpose robot AI systems convert perception into real-world actions. The benchmark was co-developed by Nvidia and researchers from Stanford University and the University of California, Berkeley, both in the U.S. DreamZero, another Nvidia project introduced in February, ranked third with a score of 1,763. Nvidia is currently the world's most valuable company, with a market capitalization exceeding $5 trillion, according to CompaniesMarketCap.
Spirit AI's announcement came two days after Nvidia unveiled Cosmos 3, an open foundation model for physical AI designed to help robots "think before they act." Physical AI refers to artificial intelligence that enables machines to perceive, understand and interact with the real world. These systems are generally assessed on two core capabilities: policy models, which determine what actions a robot should take based on what it observes, and world models, which predict the outcomes of those actions.
While Spirit AI currently leads the policy category, Chinese companies also occupy the top positions in several other embodied AI benchmarks. WorldScape-0.2, developed by Chinese startup Manifold AI, ranks first on the WorldArena benchmark for world models, ahead of Nvidia's Cosmos-Predict 2.5. Chinese robotics company AgiBot leads the perception category with its GenieEnvisioner-Sim2.0-2B, a video world simulator for robotic manipulation unveiled last week. Meanwhile, Chinese startup DexForce tops the data engine category with DSCFuncWorld, a platform designed to optimize robot training data pipelines.
Manifold AI's WorldScape-0.2 also holds the leading position on the WorldScore benchmark, which measures a model's ability to generate interactive virtual worlds from text prompts, outperforming WonderJourney, a joint project by Stanford University and Google.
Chinese robotics startup Spirit AI has claimed the top spot on the RoboArena leaderboard with its Spirit v1.6 model scoring 1,924, edging out Nvidia's Cosmos3-Nano-Policy at 1,881. The achievement marks the first time a Chinese model has led this benchmark, co-developed by Nvidia with Stanford and UC Berkeley, and signals intensifying competition in physical AI as Chinese firms dominate multiple robotics benchmarks.
Nvidia's dominance in physical AI lasted exactly two days. On Wednesday, Spirit AI, a Chinese robotics startup based in Hangzhou, announced that its Spirit v1.6 model had scored 1,924 on the RoboArena robotics benchmark, surpassing Nvidia's Cosmos3-Nano-Policy at 1,881 [1]. This marks the first time a Chinese model has claimed the leading position on RoboArena, a benchmark co-developed by Nvidia with Stanford University and the University of California, Berkeley [2].
The timing carries particular weight. Nvidia had just launched its Cosmos 3 omnimodel at Computex in Taipei on June 1, branding it the "open frontier foundation model for physical AI." The model was trained on 20 trillion tokens of multimodal data [1]. Nvidia's DreamZero, another project introduced in February, ranked third with 1,763 points [2]. The result represents a significant milestone in embodied artificial intelligence, where machines must translate perception into real-world actions.
Source: VnExpress
The RoboArena benchmark evaluates how effectively a generalist robot policy translates into tangible actions: object manipulation, navigation, tool usage, perception, planning, and adaptability in unfamiliar environments [1]. Unlike benchmarks that test chatbot fluency or image generation quality, RoboArena measures whether a machine can both think and execute.
Physical AI relies on two core capabilities. Policy capabilities determine a model's ability to act on what it observes, which is precisely what RoboArena measures. World capabilities, meanwhile, determine a model's ability to simulate and predict outcomes when specific actions are taken [2]. The industry is moving toward integrating both capabilities, with Chinese researchers introducing a unified "Policy World Model" architecture last September that merges world modelling and trajectory planning into a single system [1].
Spirit AI's result is just one data point in a broader pattern. Across the ecosystem of physical AI benchmarks, Chinese firms hold leading positions in nearly every category. On the WorldArena benchmark, which evaluates embodied world models, Manifold AI's WorldScape-0.2 claims the top spot, outperforming Nvidia's Cosmos-Predict 2.5 in the policy evaluator track [1].
AgiBot, one of China's largest robotics firms, leads the perception track with its GenieEnvisioner-Sim2.0-2B model unveiled last week [2]. DexForce, another Chinese startup, tops the data engine track with DSCFuncWorld, a platform designed to optimize robot training data pipelines [2]. On the WorldScore benchmark, which tests a model's ability to generate worlds from text prompts, Manifold AI's WorldScape-0.2 again leads, outperforming WonderJourney, a joint project from Stanford and Google [1].
These technical achievements are backed by extraordinary capital flows. Spirit AI announced a 1.5 billion yuan ($222 million) financing round on Wednesday, its fourth in just three months and reportedly the most aggressive fundraising pace seen in the embodied AI sector [1]. Earlier rounds had already pushed the company's valuation past 10 billion yuan ($1.4 billion). XYZ Embodied AI, incubated by the Beijing Academy of Artificial Intelligence, closed its pre-A round after raising 1 billion yuan in just 10 months to develop "embodied brains" and world models [1].
Manifold AI has completed five funding rounds in 10 months, with its latest in April securing reportedly hundreds of millions of yuan. The broader Chinese robotics sector attracted $3.4 billion in venture funding in 2025 alone, 42 per cent more than the United States, and that gap appears to be widening in 2026 [1].
Nvidia, currently the world's most valuable company with a market capitalization exceeding $5 trillion [2], is adapting its strategy. At Computex, CEO Jensen Huang announced a partnership with Chinese robotics firm Unitree, which is preparing a $7 billion IPO, and Singaporean robotic hand maker Sharpa to build a humanoid robot reference design [1]. The platform combines Unitree's H2 Plus humanoid body, Sharpa's Wave tactile hands, and Nvidia's Jetson AGX Thor T5000 processor.
Huang also launched the Cosmos Coalition, enlisting AI labs including Agile Robots, Black Forest Labs, Runway, and Skild AI to advance open world models [1]. The message is clear: Nvidia wants to position itself as the infrastructure layer for the entire physical AI ecosystem, even if individual models lose benchmark crowns. However, Huang himself identified the sector's fundamental bottleneck at Computex: "For robotic systems and physical AI, data is the hardest problem" [1]. That admission points to why China may hold a structural advantage. Scale AI founder Alexandr Wang reportedly said last year that China was "fundamentally very well positioned on data" and that many US companies relied on Chinese data to train robotics foundation models [1].
Summarized by Navi