Perplexity AI unveils hybrid system that splits tasks between your PC and the cloud automatically

Perplexity AI introduced a hybrid local-cloud inference system at Computex that automatically routes AI tasks between personal computers and cloud servers in real time. CEO Aravind Srinivas demonstrated the technology alongside Intel, showing how the system keeps sensitive data on-device while sending complex work to frontier models in the cloud—addressing both privacy concerns and the cost crisis of centralized AI inference.

Perplexity AI introduces automatic task routing between devices and servers

Perplexity AI unveiled a hybrid local-cloud inference system at Computex 2026 that fundamentally changes how AI workloads are processed [4]. CEO Aravind Srinivas demonstrated the platform alongside Intel CEO Lip-Bu Tan during Intel's keynote address, describing it as an "air-traffic controller for AI tasks" that decides in real time which operations run locally on a user's device and which require cloud servers [2]. The system will be added to Personal Computer, Perplexity's AI agent that works across files, apps, and the web, with the hybrid AI feature launching in July [1].

Source: 9to5Mac

What sets this approach apart is that the system makes routing decisions autonomously, task by task, without requiring users to choose between on-device AI models and cloud-based AI models upfront [3]. A smaller model running locally handles the decision-making about which information should remain on the device and which can be sent to more powerful frontier models in the cloud [4]. "No product has done this before," a Perplexity spokesperson told VentureBeat [4].

Source: CNET

Addressing the cost crisis and sensitive data processing challenges

The announcement comes as companies grapple with massive AI infrastructure expenses. Srinivas referenced reports of organizations "spending half a billion dollars per month" on AI compute, emphasizing the need for "efficient value per watt per user" [5]. OpenAI's infrastructure costs have been widely reported at that scale, while Anthropic's projected $10.9 billion in Q2 revenue comes with substantial compute expenses that compress margins [2]. By offloading AI inference tasks to the billions of PCs already in circulation, Perplexity can serve more users while reducing the burden on data centers [2].

The hybrid system addresses privacy concerns by keeping sensitive data processing on local devices. Financial records, health information, and personal files can be handled by compact models running directly on user hardware without ever touching cloud computing infrastructure [3]. Meanwhile, complex tasks requiring frontier model capabilities, such as multi-step reasoning or retrieval-augmented generation across large datasets, get routed to servers [2]. The system reportedly asks for user permission before sending sensitive tasks to the cloud, addressing data governance anxieties that enterprises have about agentic AI [4].
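The routing behavior described above can be illustrated with a minimal sketch. This is not Perplexity's implementation: the names `Task`, `Route`, and `route` are hypothetical, and the two boolean flags stand in for the judgment that the article says a small local model makes. The sketch only encodes the three reported outcomes: sensitive work stays on the device, complex work goes to the cloud, and a task that is both triggers a permission prompt.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL = "local"        # handled by a compact on-device model
    CLOUD = "cloud"        # sent to a frontier model on a server
    ASK_USER = "ask_user"  # needs permission before leaving the device


@dataclass
class Task:
    description: str
    sensitive: bool         # e.g. financial records, health information
    needs_frontier: bool    # e.g. multi-step reasoning over large datasets


def route(task: Task) -> Route:
    """Pick a destination for one task (hypothetical policy)."""
    if task.sensitive and task.needs_frontier:
        # Sensitive data would have to leave the device, so ask first.
        return Route.ASK_USER
    if task.needs_frontier:
        return Route.CLOUD
    # Everything else, including sensitive but simple work, stays local.
    return Route.LOCAL


tasks = [
    Task("categorize last month's bank statement", True, False),
    Task("research report across many web sources", False, True),
    Task("multi-step analysis of medical records", True, True),
]
for t in tasks:
    print(f"{t.description} -> {route(t).value}")
```

In a real system the two flags would themselves be predictions from the on-device model rather than inputs, which is the hard part that this sketch leaves out.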

Chip-agnostic platform works across Intel, Nvidia, and other hardware

While Srinivas made the announcement alongside Intel's CEO, he emphasized that the platform remains chip-agnostic and works with Nvidia processors as well as other local silicon [5]. The timing aligns strategically with major hardware announcements at Computex, where Nvidia unveiled its RTX Spark platform for AI-powered laptops and desktops [1]. Intel showcased its Core Ultra Series 3 processors as the client silicon enabling hybrid inference on PCs [4].

Source: VentureBeat

The AI orchestrator creates direct economic incentives for users and enterprises to invest in more powerful local silicon. The more capable the on-device chip, the more inference can run locally, reducing cloud costs and improving latency for sensitive AI workloads [4]. This dynamic benefits chipmakers competing for AI PC market share while giving Perplexity a competitive edge in cost efficiency.

Revenue growth highlights efficiency gains from distributed computing

Perplexity's financial trajectory underscores why cost efficiency matters for AI companies. Srinivas posted on X in April that the company's revenue grew fivefold, from $100 million to $500 million, while headcount increased just 34% [2]. That ratio reflects the leverage of AI-native business models and Perplexity's position as an aggregator that routes queries across multiple AI providers. "Every time any of the AI gets better, our unified system also gets better because we route across all of them," Srinivas explained [5].
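To make the ratio concrete, the two reported figures can be combined into a rough revenue-per-employee multiple. The short calculation below is a back-of-the-envelope sketch: it assumes both figures cover the same period, and the function name `per_head_multiple` is illustrative rather than taken from any source.

```python
def per_head_multiple(revenue_multiple: float, headcount_growth: float) -> float:
    """Growth in revenue per employee, given total revenue and headcount growth."""
    return revenue_multiple / (1.0 + headcount_growth)


revenue_multiple = 500 / 100   # $100 million -> $500 million, as reported
headcount_growth = 0.34        # 34% increase, as reported

multiple = per_head_multiple(revenue_multiple, headcount_growth)
print(f"Revenue per employee grew roughly {multiple:.2f}x")
# prints "Revenue per employee grew roughly 3.73x"
```

On these numbers, revenue per employee rose by a little under four times, even though total revenue rose five times.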

The hybrid compute platform extends this architectural efficiency to hardware. If Perplexity can use the compute already sitting on users' desks to handle a meaningful share of inference work, it reduces marginal cost per query while improving response times for lightweight tasks [2]. As AI moves deeper into enterprise workflows, the economics of who pays for compute (the cloud provider, the AI company, or the user's own hardware) will become a critical competitive variable. Perplexity Computer is currently available through the company's Mac app, with Windows support coming soon [1].
