Baidu Launches ERNIE 5.0 AI Model, Claims Performance Edge Over GPT-5 and Gemini 2.5 Pro

Reviewed by Nidhi Govil

Chinese tech giant Baidu unveils its proprietary ERNIE 5.0 foundation model at Baidu World 2025, featuring native multimodal capabilities and claiming superior performance over Western competitors in document understanding and visual tasks. The company also announces global expansion plans for its AI products.

Baidu Unveils ERNIE 5.0 at Annual Conference

Chinese technology giant Baidu introduced its latest artificial intelligence model, ERNIE 5.0, at the company's annual Baidu World 2025 conference, positioning it as a direct competitor to leading Western AI models including OpenAI's GPT-5 and Google's Gemini 2.5 Pro [1][2].

Source: InfoWorld

The new model represents a significant departure from Baidu's previous open-source approach. While its predecessor, ERNIE-4.5-VL-28B-A3B-Thinking, was released under an Apache license, ERNIE 5.0 is proprietary and built on the company's PaddlePaddle deep learning framework [1].

Native Multimodal Architecture

Baidu CTO and head of AI Group Haifeng Wang explained that ERNIE 5.0 adopts a "unified auto-regression architecture for native full multimodal modelling," integrating speech, images, video, and audio data from the beginning of training rather than through post-processing fusion [1].

Source: VentureBeat

This native multimodal approach distinguishes ERNIE 5.0 from competitors that rely on modality-specific encoders. The model jointly processes and generates content across text, images, audio, and video, so a single model handles both understanding and generation in every modality [3].

Performance Claims Against Western Competitors

Baidu presented benchmark results suggesting ERNIE 5.0 achieves parity or superiority compared to top Western foundation models across multiple task categories. According to company data shared at the conference, ERNIE 5.0 Preview outperformed or matched OpenAI's GPT-5-High and Google's Gemini 2.5 Pro in multimodal reasoning, document understanding, and image-based question answering [2].

The model demonstrated particularly strong performance on visual tasks, achieving leading scores on the OCRBench, DocVQA, and ChartQA benchmarks, which test document recognition, comprehension, and structured data reasoning. Baidu claims these results position ERNIE 5.0 ahead of both GPT-5-High and Gemini 2.5 Pro in the document and chart-based applications that are crucial for enterprise use cases [2].

Model Variants and Availability

Baidu introduced ERNIE 5.0 Preview 1022, a specialized variant optimized for text-intensive tasks, alongside the general preview model that balances performance across all modalities. In early developer access, the Preview 1022 variant showed stronger language-specific results, particularly in Chinese [2].

The ERNIE 5.0 preview is currently available to the public through ERNIE Bot and to enterprise users via Baidu AI Cloud's MaaS platform Qianfan. The model is positioned at the premium end of Baidu's pricing structure, aligning costs with other top-tier offerings in the market [2][3].

Global Expansion Strategy

Beyond the model launch, Baidu announced significant international expansion plans for its AI product suite. The company introduced upgrades to its digital human platform, no-code application builder Miaoda 2.0, and general AI agent GenFlow 3.0, all targeted at expanding its AI footprint beyond China [3].

Baidu's digital human technology has already debuted in Brazil, with the company exploring expansion opportunities into key markets including the United States and Southeast Asia. The international version of its no-code application builder, called MeDo, is now available globally via medo.dev [3].

TheOutpost.ai

© 2025 Triveous Technologies Private Limited