Microsoft Copilot AI Upgrades Use GPT and Claude

Microsoft Copilot Deploys Multi-Model Approach to Improve Research Accuracy

Microsoft announced significant AI upgrades to its Copilot research assistant on March 30, introducing a collaborative framework where multiple AI models work together within the same workflow 1

. The centerpiece is a new Critique feature that integrates rival AI models from OpenAI and Anthropic to deliver more reliable outputs. Under this system, GPT generates initial responses while Claude reviews the output for accuracy, completeness, and citation quality before presenting it to users 3

. This represents a strategic shift from single-model reliance to orchestrated collaboration between competing systems.

GPT and Claude Collaboration Tackles AI Hallucinations

The Critique feature addresses one of the most persistent challenges in enterprise AI: hallucinations where systems generate false information. Nicole Herskowitz, corporate vice president of Microsoft 365 and Copilot, explained that having various different models from different vendors working together takes the technology "to the next level, where customers actually get the benefits of the models working together" 1

. The multi-model approach creates what Microsoft describes as "a powerful feedback loop that delivers higher-quality results across factual accuracy, analytical breadth, and presentation" 3

. Microsoft expects this workflow to eventually become bi-directional, allowing GPT to review Claude's drafts as well 1

Researcher Agent Scores 13.8% Higher on DRACO Benchmark

The impact of these AI upgrades is measurable. The Researcher agent with Critique enabled now scores 13.8% higher on the DRACO benchmark—the industry standard for Deep Research Accuracy, Completeness, and Objectivity 2

. Achieving 57.4% on this benchmark, the system outperforms standalone deep-research tools from OpenAI, Google, Perplexity, and Anthropic 4

. This makes it more than twice as reliable as Deep Research with OpenAI's o4-mini model and superior to Gemini Deep Research and Claude Opus 4.6 5

Model Council Enables Side-by-Side Comparison

Beyond Critique, Microsoft introduced Model Council, a feature that allows users to compare responses from different AI models side-by-side 1

. This tool hands prompts to several models and displays what each one says, highlighting where they agree or disagree 2

. The approach gives users more autonomy in workflow management and allows them to assemble the best possible response by drawing from multiple sources of intelligence.

Copilot Cowork Expands to Frontier Program

These updates arrive as Microsoft makes Copilot Cowork more widely available to members in its Frontier program, which provides customers with early access to cutting-edge AI features 1

. Copilot Cowork, built on Anthropic's viral Claude Cowork product, represents a move toward agentic AI that goes beyond simple chatbot interactions to become a digital assistant capable of handling long-running, multi-step tasks 2

. Microsoft unveiled Copilot Cowork in testing mode earlier this month, capitalizing on growing demand for autonomous AI agents 1

Driving Adoption Amid Intense Enterprise AI Competition

The timing reflects Microsoft's urgency to accelerate adoption among its existing business customer base. The company reported 15 million paid Copilot seats in January—roughly 3.3% of its 450 million commercial Microsoft 365 users 4

. This low penetration rate comes amid intensifying competition in enterprise AI, with Google expanding Gemini across Workspace apps and Anthropic's Claude gaining traction among businesses 4

. Jared Spataro, Microsoft AI at Work Chief Marketing Officer, framed the shift: "When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done" 5

Microsoft Copilot integrates rival AI models to fact-check research and reduce hallucinations

Microsoft Copilot Deploys Multi-Model Approach to Improve Research Accuracy

GPT and Claude Collaboration Tackles AI Hallucinations

Researcher Agent Scores 13.8% Higher on DRACO Benchmark

Model Council Enables Side-by-Side Comparison

Copilot Cowork Expands to Frontier Program

Driving Adoption Amid Intense Enterprise AI Competition

References

Microsoft unveils AI upgrades, rolls out Copilot Cowork to early-access customers

Microsoft 365 Copilot's new agent uses Claude to fact-check GPT's work

Microsoft's research assistant can now use multiple AI models simultaneously

GPT drafts, Claude critiques: Microsoft blends rival AI models in new Copilot upgrade

'When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done': Microsoft and OpenAI are making AI research tools smarter to help answer even your trickiest questions

Related Stories

Microsoft Copilot Cowork brings Claude AI agents to handle multi-step workflows autonomously

Microsoft Copilot Unveils Major AI Expansion with Secure Sandboxing and No-Code Development Tools

Microsoft Expands AI Partnerships: Integrates Anthropic's Claude Models into Copilot

Recent Highlights

Apple Plans Major Siri AI Overhaul in iOS 27 With Third-Party Chatbot Integration

OpenAI shuts down Sora after six months, ending Disney's $1 billion licensing partnership

AI chatbots validate you too much, making you less kind to others, Stanford study reveals

Recent Highlights

Today's Top Stories

Starcloud Reaches $1.1 Billion Valuation Building AI Data Centers in Space

Bluesky launches Attie, an AI assistant app that lets you build custom feeds with natural language

Students rent AI smart glasses for $6 a day to cheat on exams, exposing detection challenges

OpenClaw AI agents promise task automation but bring significant security challenges