Sail Research Raises $80M for AI Agent Infrastructure

Ex-Apple Engineers Target AI's Most Expensive Problem

Sail Research has emerged from stealth with $80 million in funding at a $450 million valuation, aiming to rethink how AI infrastructure supports autonomous agents [1]. The San Francisco startup, founded by former Apple engineers Neil Movva and Samir Menon, secured combined seed and Series A rounds led by Sequoia and Kleiner Perkins respectively, with participation from Redpoint Ventures, Theory Ventures, Vine Ventures, CRV, A*, and Abstract Ventures [1]. The company's pitch addresses a pressing industry problem: enterprise AI bills have tripled even as per-token prices have fallen, because agentic workflows consume 50 to 500 times more tokens than simple chat interactions [2].
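The arithmetic behind that claim can be sketched in a few lines. Every figure below is an illustrative assumption rather than a number reported by Sail or Fortune: a hypothetical chat workload of one billion tokens per month, a per-token price that falls by 90%, and an agentic workload that uses 50 times as many tokens (the low end of the range cited above).

```python
def monthly_bill(tokens: int, usd_per_million_tokens: float) -> float:
    """Return the bill in USD for a token volume at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens

# All figures are hypothetical, chosen only to illustrate the effect.
chat_tokens = 1_000_000_000   # tokens per month for a chat-style workload
old_price = 10.0              # USD per million tokens before the price drop
new_price = 1.0               # USD per million tokens after a 90% drop
agent_multiplier = 50         # low end of the 50x-500x range

chat_bill = monthly_bill(chat_tokens, old_price)
agent_bill = monthly_bill(chat_tokens * agent_multiplier, new_price)

print(f"chat bill:  ${chat_bill:,.0f}")
print(f"agent bill: ${agent_bill:,.0f}")
print(f"ratio:      {agent_bill / chat_bill:.1f}x")
```

Under these assumptions, a tenfold price cut is outweighed by a fiftyfold rise in volume, so the bill grows fivefold.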

Source: Fortune

Why Running AI Agents Cheaper Matters Now

The cost challenges of AI agent deployments stem from a mismatch in design philosophy. Current AI infrastructure was built for humans waiting at a prompt, where speed matters most. AI agents operate differently: they work autonomously for hours or days, executing thousands of concurrent calls on a single task and burning through billions of tokens in the process [1]. Goldman Sachs forecasts a 24-fold increase in token consumption by 2030, making the economics of long-running agents a critical bottleneck for enterprise adoption [2]. This gap between what the infrastructure provides and what agents require is what stops many promising applications from leaving the lab.

Optimizing for Throughput, Not Speed

Sail Research's approach centers on a deliberate architectural trade-off: sacrificing real-time responsiveness to maximize computing efficiency. "Most inference infrastructure was designed to minimise latency on a single request, but that's the wrong optimisation for agents," said Samir Menon, co-founder and CTO [1]. The company rebuilt its inference engine from the chip level up, prioritizing throughput over latency so that it can sustain thousands of calls over extended periods without spiraling costs [1]. Movva, who previously worked at NVIDIA, Apple, and Together AI, claims customers often see cost improvements of 3x to 10x over comparable alternatives [2].
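The trade-off described above can be illustrated with a toy model of request batching, a common way to trade latency for throughput. The source does not say how Sail's engine works internally, so this is only a sketch under stated assumptions: the function names and the cost constants (a 20 ms fixed overhead per step and 1 ms per request) are hypothetical.

```python
def step_time_ms(batch_size: int, fixed_ms: float = 20.0,
                 per_request_ms: float = 1.0) -> float:
    """Time for one batched step: fixed overhead plus a per-request cost.

    Toy model with hypothetical constants; not a measurement of any real system.
    """
    return fixed_ms + per_request_ms * batch_size

def throughput_rps(batch_size: int) -> float:
    """Requests completed per second when batched steps run back to back."""
    return batch_size / step_time_ms(batch_size) * 1000.0

for batch_size in (1, 8, 64):
    latency = step_time_ms(batch_size)
    rate = throughput_rps(batch_size)
    print(f"batch={batch_size:>3}  latency={latency:6.1f} ms  "
          f"throughput={rate:7.1f} req/s")
```

In this model, larger batches make each individual request wait longer while the system as a whole completes more requests per second. That is acceptable when no human is waiting on any single response.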

The platform delivers two core products. The first is a specialized inference engine that customizes open-source tools to push GPU performance toward maximum efficiency while spreading workloads across providers for resilience [1]. The second is "Sailboxes": sandbox environments designed to run for hours or days that charge only for active working time, eliminating the dead-time costs that accumulate during long tasks. Sail's inference engine topped BrowseComp-Plus, a deep-research evaluation, reaching 90.72% accuracy at up to 10 times lower cost than leading alternatives [1].
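The difference between wall-clock billing and the active-time billing described for Sailboxes can be sketched as follows. The hourly rate and the task timeline are hypothetical, and the source does not describe how Sail actually meters usage.

```python
# (state, hours) segments of one hypothetical long-running agent task.
segments = [
    ("active", 0.5),
    ("idle", 3.0),    # e.g. waiting on an external job
    ("active", 1.5),
    ("idle", 4.5),
    ("active", 0.5),
]
usd_per_hour = 0.5    # hypothetical sandbox rate

total_hours = sum(hours for _, hours in segments)
active_hours = sum(hours for state, hours in segments if state == "active")

wall_clock_cost = total_hours * usd_per_hour    # pay for every hour the sandbox exists
active_time_cost = active_hours * usd_per_hour  # pay only while it is working

print(f"total hours:      {total_hours}")
print(f"active hours:     {active_hours}")
print(f"wall-clock cost:  ${wall_clock_cost:.2f}")
print(f"active-time cost: ${active_time_cost:.2f}")
```

In this made-up timeline the sandbox is idle for three quarters of its lifetime, so billing only for active time cuts the cost by the same proportion.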

Early Traction in the Agentic AI Era

Sail launched its inference service in March and has already scaled to processing trillions of tokens per week [2]. Early customer Detail.dev uses Sail to run code-review agents that spend three to four hours, sometimes longer, analyzing entire codebases for bugs that quick reviews miss. "The abundance of tokens that we provide lets them be maximally ambitious in how they scan through code bases," Movva told Fortune [2]. The platform's API integrates with existing OpenAI workflows and supports open models including DeepSeek, Gemma, GLM, Kimi, and Nemotron [1].

The company's thesis, which it calls "abundant intelligence," holds that the more compute and context an agent receives, the better its output. The challenge is making that compute cheap enough to distribute freely [1]. "Sail exists to make intelligence abundant," said Movva. "Every decision we make, from the chip level to the API, is about giving teams the tokens, the scale, and the runtime to build agents without limits." [1]

Betting on Infrastructure for a Different Future

The investor lineup signals confidence in Sail's direction. Kleiner Perkins partner Aditya Naganath, who led the Series A, had been developing an investment thesis that the next wave of AI would center on software working autonomously rather than on chatbots. "The infrastructure layer for the agent era is one of the most important bets in AI right now," Naganath said [1]. The company also attracted notable angels including John Hennessy, chairman of Alphabet; Lip-Bu Tan, CEO of Intel; and Tri Dao, chief scientist at Together AI, along with individuals from Anthropic, OpenAI, SpaceX, and Thinking Machines [1].

The timing aligns with broader market dynamics. Inference has become the most valuable layer in AI infrastructure, as evidenced by Nebius recently paying $643 million for the 20-person startup Eigen AI [1]. Token prices have stayed flat or risen for six months despite earlier predictions of continued decline, while demand for compute grows faster than supply [2]. Competitive risks loom, however. Frontier labs such as Anthropic, OpenAI, and Google are building their own inference infrastructure, potentially commoditizing the layer Sail targets. Movva's counter focuses on specialization: "We feel an emotional pain when we see a GPU be idle or wasted in any way," he said [2].


References

[1] Sail raises $80M to make AI agents cheaper to run

[2] Exclusive: A former Apple engineer thinks AI infrastructure is built for the wrong future. Investors just gave him $80 million to fix it | Fortune
