Anthropic explores UK startup's AI inference chips promising 100x speed boost at fraction of cost

Anthropic has entered early talks with London-based Fractile about purchasing the startup's novel AI inference chips. The UK startup's SRAM-based architecture promises to run large language models 100 times faster and 10 times cheaper than current solutions by eliminating memory bottlenecks. The move would add a fourth chip supplier alongside NVIDIA, Google, and Amazon.

Anthropic Explores Partnership with UK Startup Fractile

Anthropic has entered early discussions with UK startup Fractile about acquiring the company's AI inference chips, according to reports from The Information [1]. The Claude AI developer would add Fractile as a fourth source of AI server silicon, joining its existing relationships with NVIDIA, Google, and Amazon. Founded in 2022 by Oxford PhD Walter Goodwin, Fractile has attracted attention for its innovative approach to solving one of the most persistent challenges in AI computing: the memory bottleneck that limits how quickly models can process information.

Source: Wccftech

The timing aligns with Anthropic's explosive growth trajectory. The company's annualized revenue run rate surpassed $30 billion in March, up dramatically from around $9 billion at the end of 2025 [1]. This surge in demand has intensified pressure on existing infrastructure, with inference costs dragging on gross margins. Unlike competitors such as OpenAI and xAI, which are building massive proprietary data centers, Anthropic has opted to reduce its reliance on any single chipmaker through supply agreements with multiple providers.

DRAM-less AI Inference Chips Target Memory Bottleneck

Fractile's approach centers on what it calls Memory Compute Fusion Architecture, a design that fundamentally reimagines how AI chips handle data [2]. The architecture co-locates SRAM memory and compute on the same die rather than shuttling data to separate DRAM chips. This eliminates the data movement between GPUs and off-chip DRAM that represents one of the main bottlenecks in running large AI models at speed [1].
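To see why on-die memory matters, note that autoregressive decoding is typically memory-bandwidth-bound: generating each token requires streaming roughly the full set of model weights past the compute units, so throughput is capped by how fast weights can be read. A rough illustrative sketch follows; the model size and bandwidth figures are assumptions for the sake of example, not Fractile or NVIDIA specifications.

```python
# Back-of-envelope estimate: for a bandwidth-bound decode step, the upper
# bound on tokens/second is memory bandwidth divided by model size in bytes.
# All figures below are illustrative assumptions.

def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode throughput for a bandwidth-bound workload."""
    return bandwidth_bytes_per_s / model_bytes

# Assume a 70B-parameter model at 8-bit precision: ~70 GB of weights.
model_bytes = 70e9

# Off-chip HBM/DRAM path (assumed ~3 TB/s) vs. on-die SRAM
# (assumed ~150 TB/s, the SRAM bandwidth this article quotes per LPU).
print(round(tokens_per_second(model_bytes, 3e12)))    # ~43 tokens/s
print(round(tokens_per_second(model_bytes, 150e12)))  # ~2143 tokens/s
```

The ~50x gap between the two estimates is entirely an artifact of the assumed bandwidth ratio, which is the gap that memory-compute co-location architectures aim to exploit.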

Goodwin explained to Fortune in July 2024 that Fractile's design stores the data needed for computations directly next to the transistors that perform the arithmetic. Based on simulations at that time, the company projected it could run a large language model 100 times faster and 10 times cheaper than NVIDIA's GPUs, though it had not yet manufactured test chips [1]. These ambitious performance claims position Fractile among several inference-focused startups pursuing SRAM or near-memory architectures, including Groq and Cerebras.

Investment Momentum and Timeline Considerations

Fractile raised $15 million in seed funding co-led by Kindred Capital, the NATO Innovation Fund, and Oxford Science Enterprises [1]. The startup is now in talks to raise $200 million at a valuation exceeding $1 billion, with Founders Fund, 8VC, and Accel among potential investors. The Fractile team reportedly includes engineers from Graphcore, NVIDIA, and Imagination Technologies, and the company is building its own software stack alongside the hardware [1].

However, Fractile's chips aren't expected to reach commercial readiness until around 2027, placing any deployment well outside Anthropic's near-term procurement plans [1]. That timeline roughly aligns with Anthropic's Google-Broadcom TPU partnership, which in early April was expanded to 3.5 GW of compute capacity covering 2027 through 2031.

Competitive Landscape Shifts as NVIDIA Responds

The market for faster AI inference solutions has intensified following NVIDIA's $20 billion acquisition deal with Groq in December [1]. NVIDIA subsequently launched its own dedicated inference accelerator, the Groq 3 LPU, acknowledging the growing commercial pressure to optimize cost-per-token at scale. Each Groq 3 LPU packs 500 MB of SRAM, 150 TB/s of SRAM bandwidth, and 2.5 TB/s of scale-up bandwidth, and ships in racks of 256 LPUs holding a combined 128 GB of SRAM for low-latency processing [2].
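The rack-level capacity quoted above follows directly from the per-chip figures. A quick sanity check:

```python
# Verify that 256 LPUs at 500 MB of on-chip SRAM each aggregate to the
# stated 128 GB per rack (decimal units, as vendors typically quote them).
lpus_per_rack = 256
sram_per_lpu_mb = 500

total_gb = lpus_per_rack * sram_per_lpu_mb / 1000
print(total_gb)  # 128.0
```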

Anthropic's strategy to reduce inference expenses through diversified sourcing reflects broader industry trends. The company has deliberately avoided dependence on any single chip vendor, running Claude AI on NVIDIA GPUs, Amazon's Trainium processors through Project Rainier, and Google's TPUs under deals providing over 1 GW of compute capacity [1]. Reports suggest Anthropic will soon add AMD as another name in its compute portfolio [2]. Whether Fractile's technology can deliver on its performance promises remains to be seen, particularly as the company has yet to manufacture test chips to validate its simulation-based projections.

Source: Tom's Hardware

© 2026 TheOutpost.AI All rights reserved