Apple AI strategy shifts as Google Gemini and Nvidia power new Siri in major cloud pivot

Reviewed byNidhi Govil

5 Sources

Share

Apple is working to distill Google's multi-trillion-parameter Gemini model to run locally on iPhones, but complex queries will route to Google Cloud using Nvidia chips. The move marks a significant shift from Apple's privacy-focused, on-device AI stance as the company prepares to unveil its enhanced Siri at WWDC, blending local processing with cloud capabilities.

Apple Google Partnership Brings Major Changes to Siri

Apple is preparing to merge its iconic Siri assistant with Google Gemini later this year, marking a fundamental shift in how the iPhone maker approaches artificial intelligence. According to reports from The Information, the Apple Google partnership will result in a hybrid system that processes AI queries both on-device and in the cloud, departing from Apple's longstanding emphasis on local processing for user privacy

1

. The company has been working to distill Google's massive Gemini models, which feature trillions of parameters, into smaller versions capable of running on Apple's custom silicon. This model distillation process allows a compact model to mimic the behavior of large language models while pruning less critical weights, potentially enabling Siri to handle some tasks with private local compute

1

.

Source: Ars Technica

Source: Ars Technica

On-Device AI Faces Hardware Limitations

Despite Apple's 15 years of custom silicon expertise and continuous Neural Engine upgrades, smartphones remain fundamentally constrained by memory and processing capabilities. On-device AI models typically feature only a few billion parameters and are quantized to run at lower precision, making them faster but affecting token generation accuracy

1

2

. Even Google doesn't attempt to run its full conversational Gemini experience locally on Android, instead routing those interactions to the cloud where more powerful hardware handles them

1

. Apple has reportedly struggled to get Google's undistilled Gemini models running on its Private Cloud Compute infrastructure, which operates on M-series Mac chips, highlighting the massive gap between what edge devices can handle and what cutting-edge AI models require

1

4

.

Nvidia Confidential Computing Addresses Privacy Concerns

To maintain its commitment to user privacy while leveraging cloud-based AI, Apple has recently approved the use of Nvidia's Confidential Computing platform within Google Cloud. This technology keeps data encrypted on Nvidia GPUs while being processed in the cloud, allowing Apple to claim sensitivity to privacy concerns even as more user data leaves the device

1

5

. The decision to use Nvidia's confidential computing system happened in recent weeks, according to people familiar with the matter

5

. While this security feature adds a modest performance cost and slightly slows down AI query processing, it offers stronger privacy protections than standard cloud processing

4

5

. Apple is expected to retain its Private Cloud Compute branding despite the infrastructure change, even though queries will no longer be handled exclusively by Apple's own servers

1

5

.

WWDC to Showcase Hybrid AI Approach

Source: 9to5Mac

Source: 9to5Mac

At the upcoming WWDC, Apple plans to highlight its on-device AI capabilities as a competitive advantage, positioning local inference as a privacy-preserving, cost-saving alternative to the massive data center buildouts its rivals have pursued

4

. The company will showcase how chips designed for iPhones, Apple Watches, and Macs give it an edge in processing AI queries directly on devices, while acknowledging that cloud-based processing remains necessary for complex queries

4

. The hybrid AI approach means users likely won't know which version of Gemini is handling individual Siri requests, as device makers building such systems aim to make the experience feel seamless

1

. However, users may notice latency differences, as Nvidia's fully encrypted confidential compute slows processing compared to other AI options, and cloud-based requests will naturally feel less immediate than local processing

1

2

.

Siri Enhancement Shows Promise Despite Delays

Apple Intelligence was first announced at WWDC 2024, but the rollout has been hampered by tepid response to initial features and protracted delays to the more personal version of Siri

4

. Real-world testing of Google's agentic AI system Gemini Spark, which will power the new Siri, suggests the technology can largely deliver on promises made during demonstrations. The Verge's Jay Peters tested features like drafting emails that compile budget data from spreadsheets, with results he described as "scarily good" and "actually nuts," though imperfect

3

. This real-life performance indicates the new Siri may finally live up to Apple's promises, even if those promises are fulfilled by Google

3

. Apple is also reportedly scouting acquisitions to advance its model-shrinking work, with one company under consideration being Liquid AI, a Massachusetts startup focused on running AI locally on devices

4

5

.

Source: 9to5Mac

Source: 9to5Mac

Today's Top Stories

© 2026 TheOutpost.AI All rights reserved