OpenAI Codex-Spark Runs on Cerebras, Not Nvidia

OpenAI Deploys First AI Coding Model on Non-Nvidia Hardware

OpenAI released GPT-5.3-Codex-Spark on Thursday, marking a significant shift in the company's infrastructure strategy. The new AI coding model represents OpenAI's first production deployment on non-Nvidia hardware, running instead on chips from Cerebras Systems 1

. The lightweight model delivers code at more than 1,000 tokens per second, which OpenAI reports is roughly 15 times faster than its predecessor 3

. To put this speed in context, OpenAI's fastest models on Nvidia hardware deliver significantly lower throughput: GPT-4o reaches roughly 147 tokens per second, o3-mini hits about 167, and GPT-4o mini clocks around 52 1

Source: ZDNet

Codex-Spark is available as a research preview to ChatGPT Pro subscribers at $200 per month through the Codex app, command-line interface, and VS Code extension 1

. OpenAI is also rolling out API access to select design partners. The model ships with a 128,000-token context window and handles text only at launch.

Cerebras Hardware Partnership Powers Faster Inference

The release represents the first milestone in OpenAI's hardware partnership with Cerebras, announced last month as a multi-year agreement worth over $10 billion 2

. Codex-Spark runs on Cerebras' Wafer Scale Engine 3, the company's third-generation waferscale megachip decked out with 4 trillion transistors 2

. The dinner-plate-sized AI accelerators feature some of the world's fastest on-chip memory, using SRAM that is roughly 1,000 times faster than the HBM4 found on Nvidia's upcoming Rubin GPUs 5

Source: The Register

"Cerebras has been a great engineering partner, and we're excited about adding fast inference as a new platform capability," said Sachin Katti, head of compute at OpenAI 1

. Sean Lie, CTO and Co-Founder of Cerebras, emphasized the potential for discovering "new interaction patterns, new use cases, and a fundamentally different model experience" through fast inference 2

Real-Time Coding Designed for Low-Latency Workflows

OpenAI built Codex-Spark specifically for real-time coding rather than the heavyweight agentic tasks handled by the full GPT-5.3-Codex model launched earlier this month 1

. The company tuned the model for speed over depth of knowledge, creating what it describes as a "daily productivity driver" for rapid prototyping 2

. OpenAI reduced overhead per client-server roundtrip by 80 percent, per-token overhead by 30 percent, and time-to-first-token by 50 percent through session initialization and streaming optimizations 3

Source: SiliconANGLE

The model supports interruption and redirection mid-task, enabling tight iteration loops for developers who need to adjust instructions quickly 3

. It defaults to lightweight, targeted edits and doesn't automatically run tests unless requested. OpenAI says Cerebras' chips excel at assisting "workflows that demand extremely low latency" 2

OpenAI Strategy to Diversify Hardware Suppliers Beyond Nvidia

The Cerebras deployment is part of OpenAI's broader push to diversify hardware suppliers and meet growing computing needs . OpenAI struck a blockbuster agreement in October with AMD to deploy 6 gigawatts' worth of GPU over multiple years, and agreed to buy custom chips and networking components from Broadcom . An OpenAI spokesperson emphasized that the company's partnership with Nvidia remains "foundational" and that OpenAI is "anchoring on Nvidia as the core of our training and inference stack, while deliberately expanding the ecosystem around it" .

OpenAI noted that "GPUs remain foundational across our training and inference pipelines and deliver the most cost effective tokens for broad usage" while Cerebras complements that foundation for low-latency workflows 5

. The company suggests that as Cerebras brings more compute online, it will bring larger models to the platform for users willing to pay a premium for high-speed inference 5

Performance Benchmarks and Market Competition

On SWE-Bench Pro and Terminal-Bench 2.0, two benchmarks for evaluating software engineering ability, Codex-Spark reportedly outperforms the older GPT-5.1-Codex-mini while completing tasks in a fraction of the time 1

. The model delivers greater accuracy than GPT-5.1-Codex-Mini in Terminal-Bench 2.0 while being much faster than the smarter GPT-5.3-Codex model 5

. The new model marks OpenAI's latest attempt to compete with AI rivals such as Alphabet's Google and Anthropic, which are vying for dominance in the rapidly growing market for AI coding assistants . Codex has more than 1 million weekly active users, OpenAI said .

Cerebras raised $1 billion in fresh capital at a valuation of $23 billion last week and has announced intentions to pursue an IPO 2

. Sam Altman hinted at the launch in a tweet, saying "We have a special thing launching to Codex users on the Pro plan later today. It sparks joy for me" 2

OpenAI launches Codex-Spark on Cerebras chips, marking first move away from Nvidia hardware

OpenAI Deploys First AI Coding Model on Non-Nvidia Hardware

Cerebras Hardware Partnership Powers Faster Inference

Real-Time Coding Designed for Low-Latency Workflows

OpenAI Strategy to Diversify Hardware Suppliers Beyond Nvidia

Performance Benchmarks and Market Competition

References

OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips

A new version of OpenAI's Codex is powered by a new dedicated chip | TechCrunch

OpenAI's new Spark model codes 15x faster than GPT-5.3-Codex - but there's a catch

OpenAI Debuts First Model Using Chips From Nvidia Rival Cerebras

OpenAI unveils first model running on Cerebras silicon

Related Stories

OpenAI launches Codex MacOS app with GPT-5.3 model to challenge Claude Code dominance

OpenAI Launches GPT-5.1-Codex-Max: Revolutionary AI Coding Model Capable of 24-Hour Development Sessions

OpenAI releases GPT-5.2-Codex with advanced cybersecurity and agentic coding capabilities

Recent Highlights

Trump bans Anthropic from government as AI companies clash with Pentagon over weapons and surveillance

Nvidia pulls back from OpenAI investment as Jensen Huang cites IPO plans and complex dynamics

Samsung unveils Galaxy S26 lineup with Privacy Display tech and expanded AI capabilities

Recent Highlights

Today's Top Stories

OpenAI launches GPT-5.4 with native computer use as AI agents battle with Claude intensifies

US Drafts Sweeping Export Controls Requiring Global Approval for Nvidia and AMD AI Chips

Netflix acquires Ben Affleck's AI filmmaking startup to enhance post-production capabilities

Trump fires Anthropic 'like dogs' as Pentagon negotiations quietly resume over military AI dispute