2 Sources
[1]
Tencent's New Hy3 AI Model Is the Most Efficient Chinese LLM No One's Talking About - Decrypt
It's also available on Tencent Cloud's official website, under a paid plan. Hy3 packs 295 billion total parameters (a measure of a model's potential breadth of knowledge) but only 21 billion active at any given time. That's the beauty of a Mixture-of-Experts architecture: the model routes each query to a specialized subset of its "expert" sub-networks instead of running everything at once. Less compute, lower cost, roughly similar output quality. It also supports up to 256,000 tokens of context, enough to swallow a full-length novel in a single prompt.

The model was built to balance three things Tencent says it stopped sacrificing for each other: capability breadth, honest evaluation, and cost-efficiency. The previous flagship, Hy2, had over 400 billion parameters. Tencent explicitly walked that back, arguing that 295 billion is the sweet spot where reasoning fully matures but the cost of adding more parameters stops paying off. Smaller doesn't mean worse, either: well-trained models with fewer parameters quite frequently outperform bigger generalist ones.

On coding, the improvement is dramatic. SWE-bench Verified is a benchmark that tests whether a model can actually fix real bugs from GitHub repositories -- not toy problems, but production code. Hy2 scored 53.0%; Hy3 preview scores 74.4%. That's a 40% jump in one generation, landing it in range of Claude Opus 4.6 (80.8%) and above GLM-5 (77.8%) and Kimi-K2.5 (76.8%). Terminal-Bench 2.0, which measures autonomous task execution in a real command-line environment, went from 23.2% to 54.4% -- also a massive leap.

The model is also an interesting choice for people building with agents. Agents run on complex instruction sets involving memories, skills, and tool calls, and they usually miss something, which can ruin a workflow or produce poor results.
That's why agentic capabilities matter more and more to AI developers as agents become the most hyped area in the industry. It's also why the model was immediately made available on Openclaw.

Search and browsing agents -- where models must retrieve, filter, and synthesize information from the open web without human guidance -- also improved sharply. On BrowseComp, a benchmark tracking complex web research tasks, Hy3 preview reached 67.1% (up from Hy2's 28.7%). On WideSearch, it hit 70.2%, outperforming GLM-5 and Kimi-K2.5 but trailing Claude Opus 4.6's 77.2%.

In reasoning, the model topped every Chinese competitor on Tsinghua University's math PhD qualifying exam (Spring 2026), scoring 88.4 averaged over three runs (avg@3). That's a real-world exam, not a curated dataset -- the kind of evaluation Tencent says it's prioritizing to avoid benchmark gaming. The model also scored 87.8 on CHSBO 2025 (China's national high school biology olympiad), the highest among Chinese models in that category.

Hy3 preview started training in late January 2026 and launched Thursday -- under three months from cold start to open-source release, unusually fast for a frontier-class model. Tencent attributes the speed to a February infrastructure overhaul led by Yao Shunyu, its chief AI scientist, who pushed a full rebuild of the pretraining and reinforcement learning stack. It's a very different approach from what Chinese AI labs were doing a year ago, when DeepSeek's R1 shocked the industry with its cost-efficiency.

Hy3 still trails OpenAI's and Google DeepMind's flagships, but by size-to-performance ratio, Hy3 preview is hard to dismiss: the agent benchmark composite shows it in the "optimal zone" at ~295 billion parameters, ahead of DeepSeek-V3.2 (600 billion+) and matching Kimi-K2.5 (over 1 trillion parameters) at a fraction of the compute cost. Hunyuan models have already been deployed across Yuanbao, CodeBuddy, WorkBuddy, QQ, and Tencent Docs.
On CodeBuddy and WorkBuddy, first-token latency dropped 54%, end-to-end generation time fell 47%, and the model successfully ran agent workflows as long as 495 steps. Tencent Cloud is offering API access at approximately $0.18 per million input tokens and $0.59 per million output tokens, with personal Token Plan packages starting at around $4.10 per month.
[2]
Tencent uses product rollout, not just benchmarks, to define Hy3 preview
Tencent is launching Hy3 preview with the usual benchmark claims expected of a new large language model. But the more distinctive part of the rollout is where the company is putting its proof: inside products. According to Tencent's latest briefing materials, Hy3 preview has already been integrated into Yuanbao, CodeBuddy, WorkBuddy, ima, Tencent Docs and Peacekeeper Elite before its broader public rollout. That matters because the AI market is reaching a point where raw model claims increasingly look similar. In that environment, product performance may say more than another round of leaderboard results.

Hy3 preview itself is substantial. Tencent describes it as a fast-and-slow-thinking fused MoE language model with 295 billion total parameters, 21 billion activated parameters, and support for up to 256K context. The company says the model has improved inference efficiency by 40% and performs strongly across reasoning, instruction following, in-context learning, coding and agentic tasks. Tencent also cites more than 50 evaluation sets, including specialized tests such as SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, WideSearch, FrontierScience-Olympiad and IMOAnswerBench.

But Tencent is not relying on benchmarks alone to make the case. In CodeBuddy and WorkBuddy, the company says Hy3 preview reduced first-token latency by 54%, cut end-to-end duration by 47%, and improved task success rates to 99.99%+. Tencent also says the model has stably supported complex agent workflows of up to 495 steps in real user environments, spanning tasks such as document handling, data analysis, knowledge retrieval and tool orchestration. Those numbers give the launch a more concrete basis than the usual abstract benchmark language. Tencent says Hy3 is also being shaped through product co-design and open-source feedback, as the company works to improve the model's performance in real-world scenarios ahead of the official Hy3 release.
Yao Shunyu, Tencent's chief AI scientist, described the preview as the first step in rebuilding the Hunyuan model line, with further gains expected from continued pre-training and reinforcement-learning work. In Yuanbao, Tencent says the model has been co-developed against product-side requirements including intent understanding, search quality, writing style, emotional intelligence and professionalism. In Peacekeeper Elite, the company says Hy3 preview has shown strong performance in AI NPC scenarios, including both persona-driven dialogue outside matches and more time-sensitive, human-like responses during matches. Together, those examples suggest Tencent is trying to define model progress through visible user-facing behavior, not just through lab-style evaluations.

Tencent is also pairing that product argument with pricing and deployment signals. Through TokenHub, Hy3 preview starts at RMB 1.2 per million input tokens and RMB 4 per million output tokens, with the company stressing lower deployment barriers for enterprise use. That helps frame the model not only as capable, but as practical to run at scale.

That may be the more important message behind the launch. Hy3 preview is not being presented as a model that exists apart from products. Tencent is using live product rollout to argue that the model already matters inside them. Tencent said users will be able to try Hy3 preview through a two-week free token offer, extending the launch's emphasis on real-world testing beyond internal products. Hy3 preview is now available, with free access for a limited two-week period, at: https://openrouter.ai/tencent/hy3-preview:free
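For readers who want to try that endpoint programmatically, here is a minimal sketch. It assumes OpenRouter's standard OpenAI-compatible chat completions API; the model ID is taken from the access URL above, while the `OPENROUTER_API_KEY` environment variable name and the `build_request`/`ask` helpers are illustrative choices, not anything from the article.

```python
import json
import os
import urllib.request

# OpenRouter exposes an OpenAI-compatible chat completions endpoint;
# the model slug below comes from the article's access URL.
API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "tencent/hy3-preview:free"

def build_request(prompt, api_key):
    """Assemble the JSON body and headers for a single-turn chat call."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return body, headers

def ask(prompt):
    """Send one prompt and return the model's text reply."""
    body, headers = build_request(prompt, os.environ["OPENROUTER_API_KEY"])
    req = urllib.request.Request(
        API_URL, data=json.dumps(body).encode(), headers=headers
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

During the free window the `:free` variant needs only an OpenRouter account key; swapping the slug for a paid model ID is the only change required afterward.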
Tencent unveiled Hy3 preview, a 295-billion-parameter Chinese LLM that improves inference efficiency by 40% while excelling at coding and agent workflows. Unlike typical model launches focused on benchmarks, Tencent is proving Hy3's capabilities through live integration across Yuanbao, CodeBuddy, and other products, where it reduced latency by 54% and handles complex tasks up to 495 steps.
Tencent is taking a different approach with its latest Chinese LLM release. While the Hy3 AI model arrives with the expected benchmark results, the company is staking its credibility on something more tangible: real-world product rollout [2]. Before its broader public launch, Tencent Hy3 preview has already been integrated into Yuanbao, CodeBuddy, WorkBuddy, Tencent Docs, and even the gaming title Peacekeeper Elite [2]. This strategy signals a shift in how AI labs might need to prove their models matter as benchmark scores increasingly converge across competitors.
Source: Decrypt
The Hy3 preview leverages a Mixture-of-Experts (MoE) architecture with 295 billion total parameters but only 21 billion active at any given time [1]. This design routes each query to specialized expert sub-networks rather than activating the entire model, delivering lower computational costs without sacrificing output quality [1]. Tencent explicitly scaled back from its previous flagship Hy2, which had over 400 billion parameters, arguing that 295 billion represents the optimal point where reasoning fully matures but additional parameters stop delivering returns [1]. The model also supports up to 256,000 tokens of context, enough capacity to process a full-length novel in a single prompt [1].
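The routing idea behind that parameter gap can be sketched in a few lines: a small router scores each token against every expert, and only the top-scoring few experts actually run. The sizes below are toy values for illustration, not Hy3's real configuration, and top-2 routing is an assumption about the general MoE pattern, not a claim about Hy3's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, TOP_K, D = 8, 2, 16                       # toy sizes, not Hy3's
router_w = rng.normal(size=(D, N_EXPERTS))           # router projection
experts = [rng.normal(size=(D, D)) for _ in range(N_EXPERTS)]  # toy expert FFNs

def moe_layer(x):
    """Route a batch of token vectors (n, D) through their top-k experts."""
    logits = x @ router_w                            # (n, N_EXPERTS) scores
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]    # best experts per token
    sel = np.take_along_axis(logits, top, axis=-1)
    gates = np.exp(sel - sel.max(-1, keepdims=True)) # softmax over selected
    gates /= gates.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for i, (ids, ws) in enumerate(zip(top, gates)):
        for e, w in zip(ids, ws):
            out[i] += w * (x[i] @ experts[e])        # only TOP_K experts execute
    return out

tokens = rng.normal(size=(4, D))
print(moe_layer(tokens).shape)   # (4, 16): same shape out, 2 of 8 experts used
```

Every token still passes through a full-sized layer output, but only 2 of the 8 expert matrices are multiplied per token, which is exactly why total and active parameter counts diverge.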
On coding benchmarks, the progress is striking. SWE-Bench Verified tests whether models can fix actual bugs from GitHub repositories rather than artificial problems [1]. Hy2 scored 53.0%, while Hy3 preview reaches 74.4% -- a 40% jump in one generation that places it near Claude Opus 4.6 at 80.8% and above GLM-5 at 77.8% [1]. Terminal-Bench 2.0, which measures autonomous task execution in real command-line environments, saw an even larger leap from 23.2% to 54.4% [1].
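The "40% jump" and the "even larger leap" are both relative improvements over Hy2's scores, which is easy to verify from the numbers reported above:

```python
# Generation-over-generation gains, computed as (new - old) / old
# from the scores cited in the article.
scores = {
    "SWE-Bench Verified": (53.0, 74.4),   # Hy2 -> Hy3 preview
    "Terminal-Bench 2.0": (23.2, 54.4),
}
for name, (old, new) in scores.items():
    gain = (new - old) / old
    print(f"{name}: {old}% -> {new}%  (+{gain:.0%})")
# SWE-Bench Verified: +40%; Terminal-Bench 2.0: +134%
```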
These gains make the model particularly compelling for agent workflows, where complex instruction sets involving memories, skills, and tool calls often break down [1]. In CodeBuddy and WorkBuddy, Tencent reports that Hy3 preview reduced first-token latency by 54%, cut end-to-end generation time by 47%, and achieved task success rates exceeding 99.99% [2]. The model has stably supported complex agent workflows spanning up to 495 steps in live user environments, handling document processing, data analysis, knowledge retrieval, and tool orchestration [2].
Tencent claims inference efficiency improvements of 40% with Hy3 preview [2]. By size-to-performance ratio, the model occupies what Tencent calls an "optimal zone" at approximately 295 billion parameters, ahead of DeepSeek-V3.2 with over 600 billion parameters and matching Kimi-K2.5 with over 1 trillion parameters at a fraction of the compute cost [1]. Search and browsing agents also saw sharp improvements, with BrowseComp scores reaching 67.1% from Hy2's 28.7%, and WideSearch hitting 70.2%, outperforming GLM-5 and Kimi-K2.5 though trailing Claude Opus 4.6's 77.2% [1].
On reasoning tasks, the model scored 88.4 on Tsinghua University's math PhD qualifying exam for Spring 2026 and 87.8 on China's national high school biology olympiad (CHSBO 2025), topping every Chinese competitor in both categories [1]. Tencent emphasizes these are real-world exams rather than curated datasets, part of its effort to avoid benchmark gaming [1].
Hy3 preview started training in late January 2026 and launched in April -- under three months from cold start to open-source release, unusually fast for a frontier-class model [1]. Tencent attributes this speed to a February infrastructure overhaul led by Yao Shunyu, its chief AI scientist, who pushed a complete rebuild of the pretraining and reinforcement learning stack [1]. Yao described the preview as the first step in rebuilding the Hunyuan model line, with further gains expected from continued pre-training and reinforcement learning work [2].
Tencent Cloud offers API access at approximately $0.18 per million input tokens and $0.59 per million output tokens, with personal Token Plan packages starting around $4.10 per month [1]. Through TokenHub, pricing starts at RMB 1.2 per million input tokens and RMB 4 per million output tokens, emphasizing lower deployment barriers for enterprise use [2]. Users can access Hy3 preview through a two-week free token offer via OpenRouter, extending the emphasis on real-world testing beyond internal products [2].
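At those per-token rates, a monthly bill is simple arithmetic. The helper below uses the USD prices quoted above; the traffic volumes in the example call are illustrative assumptions, not figures from the article.

```python
# Rough monthly cost estimate at Tencent Cloud's listed API rates:
# ~$0.18 per 1M input tokens, ~$0.59 per 1M output tokens.
IN_RATE, OUT_RATE = 0.18, 0.59   # USD per million tokens

def monthly_cost(input_tokens_m, output_tokens_m):
    """Cost in USD for a month's traffic, volumes given in millions of tokens."""
    return input_tokens_m * IN_RATE + output_tokens_m * OUT_RATE

# e.g. a hypothetical agent pipeline pushing 500M input / 100M output tokens:
print(f"${monthly_cost(500, 100):,.2f}")   # 500*0.18 + 100*0.59 = $149.00
```

Input tokens dominate most agent workloads (long contexts, short actions), which is why the much cheaper input rate matters more than the headline output price.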
For developers building with agents, this matters. As agentic capabilities become the most hyped area in AI, models that can reliably execute complex multi-step workflows without breaking down offer practical value beyond raw parameter counts [1]. Tencent is betting that visible user-facing behavior in live products will define model progress more convincingly than lab-style evaluations alone [2]. Watch whether other Chinese LLM developers follow this product-first validation approach as the AI market matures.