Tencent launches Hy3 AI Model with 40% efficiency gain and real-world product integration


Tencent unveiled Hy3 preview, a 295-billion-parameter Chinese LLM that improves inference efficiency by 40% while excelling at coding and agent workflows. Unlike typical model launches focused on benchmarks, Tencent is proving Hy3's capabilities through live integration across Yuanbao, CodeBuddy, and other products, where it reduced latency by 54% and handles complex agent tasks of up to 495 steps.

Tencent Hy3 AI Model Prioritizes Product Performance Over Benchmark Claims

Tencent is taking a different approach with its latest Chinese LLM release. While the Hy3 AI Model arrives with the expected benchmark results, the company is staking its credibility on something more tangible: real-world product rollout [2]. Before its broader public launch, Tencent Hy3 preview has already been integrated into Yuanbao, CodeBuddy, WorkBuddy, Tencent Docs, and even the gaming title Peacekeeper Elite [2]. This strategy signals a shift in how AI labs may need to prove their models matter as benchmark scores increasingly converge across competitors.

Source: Decrypt

The Hy3 preview uses a Mixture-of-Experts (MoE) architecture with 295 billion total parameters, of which only 21 billion are active at any given time [1]. This design routes each query to specialized expert sub-networks rather than activating the entire model, delivering lower computational costs without sacrificing output quality [1]. Tencent deliberately scaled back from its previous flagship Hy2, which had over 400 billion parameters, arguing that 295 billion is the point where reasoning fully matures and additional parameters stop delivering returns [1]. The model also supports a context window of up to 256,000 tokens, enough to process a full-length novel in a single prompt [1].
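The routing idea behind sparse MoE can be shown in a few lines. This is an illustrative sketch, not Tencent's implementation: a gating network scores the experts for each input, only the top-k experts run, and their outputs are mixed, so compute scales with the active parameters (~21B) rather than the total (~295B).

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route input x to the k highest-scoring experts and mix their outputs."""
    scores = x @ gate_w                      # one gating score per expert
    top = np.argsort(scores)[-k:]            # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only the k selected experts are evaluated; the rest stay idle.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a tiny linear layer, standing in for a full FFN block.
expert_ws = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in expert_ws]

out = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(out.shape)  # (8,)
```

With k=2 of 4 experts active, each forward pass evaluates only half the expert weights, which is the same ratio logic that lets Hy3 keep 295B parameters while paying compute for roughly 21B.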

Dramatic Improvements in Coding Benchmarks and Agent Workflows

On coding benchmarks, the progress is striking. SWE-Bench Verified tests whether models can fix actual bugs from GitHub repositories rather than artificial problems [1]. Hy2 scored 53.0%, while Hy3 preview reaches 74.4%, a 40% relative jump in one generation that places it near Claude Opus 4.6 at 80.8% and above GLM-5 at 77.8% [1]. Terminal-Bench 2.0, which measures autonomous task execution in real command-line environments, saw an even larger leap, from 23.2% to 54.4% [1].
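The "40% jump" is a relative improvement over Hy2's score, not an absolute gain in percentage points. A quick check of both figures:

```python
# Relative improvement of Hy3 preview over Hy2 on the two coding benchmarks.
def relative_gain(old, new):
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100

swe_bench = relative_gain(53.0, 74.4)   # SWE-Bench Verified
terminal = relative_gain(23.2, 54.4)    # Terminal-Bench 2.0

print(f"SWE-Bench Verified: +{swe_bench:.1f}%")  # ≈ +40.4%
print(f"Terminal-Bench 2.0: +{terminal:.1f}%")   # ≈ +134.5%
```

By the same relative measure, the Terminal-Bench 2.0 gain is actually far larger than the headline coding number.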

These gains make the model particularly compelling for agent workflows, where complex instruction sets involving memories, skills, and tool calls often break down [1]. In CodeBuddy and WorkBuddy, Tencent reports that Hy3 preview reduced first-token latency by 54%, cut end-to-end generation time by 47%, and achieved task success rates exceeding 99.99% [2]. The model has stably supported complex agent workflows spanning up to 495 steps in live user environments, handling document processing, data analysis, knowledge retrieval, and tool orchestration [2].

Efficiency Gains and Competitive Positioning Among Chinese LLMs

Tencent claims inference efficiency improvements of 40% with Hy3 preview [2]. By size-to-performance ratio, the model occupies what Tencent calls an "optimal zone" at roughly 295 billion parameters, ahead of DeepSeek-V3.2 with over 600 billion parameters and matching Kimi-K2.5 with over 1 trillion parameters at a fraction of the compute cost [1]. Search and browsing agents also saw sharp improvements: BrowseComp scores reached 67.1%, up from Hy2's 28.7%, and WideSearch hit 70.2%, outperforming GLM-5 and Kimi-K2.5 though trailing Claude Opus 4.6's 77.2% [1].

On reasoning tasks, the model scored 88.4 on Tsinghua University's math PhD qualifying exam for Spring 2026 and 87.8 on China's national high school biology olympiad (CHSBO 2025), topping every Chinese competitor in both categories [1]. Tencent emphasizes these are real-world exams rather than curated datasets, part of its effort to avoid benchmark gaming [1].

Rapid Development Timeline and Infrastructure Overhaul

Hy3 preview started training in late January 2026 and launched in April, under three months from cold start to open-source release and unusually fast for a frontier-class model [1]. Tencent attributes this speed to a February infrastructure overhaul led by Yao Shunyu, its chief AI scientist, who pushed a complete rebuild of the pretraining and reinforcement learning stack [1]. Yao described the preview as the first step in rebuilding the Hunyuan model line, with further gains expected from continued pre-training and reinforcement learning work [2].

Accessible Pricing and Deployment Through Tencent Cloud

Tencent Cloud offers API access at approximately $0.18 per million input tokens and $0.59 per million output tokens, with personal Token Plan packages starting around $4.10 per month [1]. Through TokenHub, pricing starts at RMB 1.2 per million input tokens and RMB 4 per million output tokens, lowering deployment barriers for enterprise use [2]. Users can also access Hy3 preview through a two-week free token offer via OpenRouter, extending the emphasis on real-world testing beyond internal products [2].
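At the quoted USD rates, per-request costs are easy to estimate. The session size below is a hypothetical example, not a figure from the announcement:

```python
# Rough cost estimate at the quoted Tencent Cloud API rates
# (~$0.18 per 1M input tokens, ~$0.59 per 1M output tokens).
INPUT_RATE = 0.18 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.59 / 1_000_000  # USD per output token

def request_cost(input_tokens, output_tokens):
    """Estimated USD cost of a single API call at the quoted rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical agent turn: a 50k-token context and a 2k-token reply.
cost = request_cost(50_000, 2_000)
print(f"${cost:.4f}")  # ≈ $0.0102
```

Even a long-context agent turn lands around a penny at these rates, which is the economics Tencent is leaning on for multi-hundred-step workflows.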

For developers building with agents, this matters. As agentic capabilities become the most hyped area in AI, models that can reliably execute complex multi-step workflows without breaking down offer practical value beyond raw parameter counts [1]. Tencent is betting that visible user-facing behavior in live products will define model progress more convincingly than lab-style evaluations alone [2]. Watch whether other Chinese LLM developers follow this product-first validation approach as the AI market matures.

TheOutpost.ai

© 2026 Triveous Technologies Private Limited