Curated by THEOUTPOST
On Fri, 31 Jan, 8:10 AM UTC
2 Sources
[1]
Cerebras becomes the world's fastest host for DeepSeek R1, outpacing Nvidia GPUs by 57x
Cerebras Systems announced today it will host DeepSeek's breakthrough R1 artificial intelligence model on U.S. servers, promising speeds up to 57 times faster than GPU-based solutions while keeping sensitive data within American borders. The move comes amid growing concerns about China's rapid AI advancement and data privacy. The AI chip startup will deploy a 70-billion-parameter version of DeepSeek-R1 running on its proprietary wafer-scale hardware, delivering 1,600 tokens per second, a dramatic improvement over traditional GPU implementations that have struggled with newer "reasoning" AI models.

Why DeepSeek's reasoning models are reshaping enterprise AI

"These reasoning models affect the economy," said James Wang, a senior executive at Cerebras, in an exclusive interview with VentureBeat. "Any knowledge worker basically has to do some kind of multi-step cognitive tasks. And these reasoning models will be the tools that enter their workflow."

The announcement follows a tumultuous week in which DeepSeek's emergence triggered Nvidia's largest-ever market value loss, nearly $600 billion, raising questions about the chip giant's AI supremacy. Cerebras' solution directly addresses two key concerns that have emerged: the computational demands of advanced AI models, and data sovereignty.

"If you use DeepSeek's API, which is very popular right now, that data gets sent straight to China," Wang explained. "That is one severe caveat that [makes] many U.S. companies and enterprises...not willing to consider [it]."

How Cerebras' wafer-scale technology beats traditional GPUs at AI speed

Cerebras achieves its speed advantage through a novel chip architecture that keeps entire AI models on a single wafer-sized processor, eliminating the memory bottlenecks that plague GPU-based systems. The company claims its implementation of DeepSeek-R1 matches or exceeds the performance of OpenAI's proprietary models while running entirely on U.S. soil.

The development represents a significant shift in the AI landscape. DeepSeek, founded by former hedge fund executive Liang Wenfeng, shocked the industry by achieving sophisticated AI reasoning capabilities reportedly at just 1% of the cost of U.S. competitors. Cerebras' hosting solution now offers American companies a way to leverage these advances while maintaining data control.

"It's actually a nice story that the U.S. research labs gave this gift to the world. The Chinese took it and improved it, but it has limitations because it runs in China, has some censorship problems, and now we're taking it back and running it on U.S. data centers, without censorship, without data retention," Wang said.

U.S. tech leadership faces new questions as AI innovation goes global

The service will be available through a developer preview starting today. While it will initially be free, Cerebras plans to implement API access controls due to strong early demand. The move comes as U.S. lawmakers grapple with the implications of DeepSeek's rise, which has exposed potential limitations in American trade restrictions designed to maintain technological advantages over China. The ability of Chinese companies to achieve breakthrough AI capabilities despite chip export controls has prompted calls for new regulatory approaches. Industry analysts suggest this development could accelerate the shift away from GPU-dependent AI infrastructure.
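As a quick sanity check on those headline figures, a minimal Python sketch (an aside, using only numbers quoted in this coverage) derives the GPU baseline the 57x claim implies:

```python
# Sanity check: if Cerebras serves DeepSeek R1 70B at 1,600 tokens/s
# and claims a 57x advantage, the implied GPU-based baseline follows.
cerebras_tps = 1_600        # tokens/second, per Cerebras' claim
claimed_speedup = 57        # Cerebras' claimed edge over GPU hosts

gpu_tps = cerebras_tps / claimed_speedup
print(f"Implied GPU baseline: {gpu_tps:.0f} tokens/s")  # ~28 tokens/s
```

The result, roughly 28 tokens per second, matches the GPU-cloud figure cited in the second source below.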
"Nvidia is no longer the leader in inference performance," Wang noted, pointing to benchmarks showing superior performance from various specialized AI chips. "These other AI chip companies are really faster than GPUs for running these latest models." The impact extends beyond technical metrics. As AI models increasingly incorporate sophisticated reasoning capabilities, their computational demands have skyrocketed. Cerebras argues its architecture is better suited for these emerging workloads, potentially reshaping the competitive landscape in enterprise AI deployment.
[2]
DeepSeek on steroids: Cerebras embraces controversial Chinese ChatGPT rival and promises 57x faster inference speeds
DeepSeek R1 will run on Cerebras cloud and the data will remain in the USA

Cerebras has announced that it will support DeepSeek, more specifically the R1 70B reasoning model, in a not-so-surprising move. The announcement comes after Groq and Microsoft confirmed they would also bring the new kid on the AI block to their respective clouds. AWS and Google Cloud have yet to do so, but anybody can run the open-source model anywhere, even locally.

The AI inference chip specialist will run DeepSeek R1 70B at 1,600 tokens/second, which it claims is 57x faster than any R1 provider using GPUs; one can deduce that roughly 28 tokens/second is what GPU-in-the-cloud solutions (in this case, DeepInfra) apparently reach. Serendipitously, Cerebras' latest chip is 57x bigger than the H100. I have reached out to Cerebras to find out more about that claim. Research by Cerebras also demonstrated that DeepSeek is more accurate than OpenAI models on a number of tests.

The model will run on Cerebras hardware in US-based datacentres to assuage the privacy concerns that many experts have expressed. DeepSeek - the app - will send your data (and metadata) to China, where it will most likely be stored. Nothing surprising here, as almost all apps - especially free ones - capture user data for legitimate reasons.

Cerebras' wafer-scale solution positions it uniquely to benefit from the impending AI cloud inference boom. WSE-3, which is the fastest AI chip (or HPC accelerator) in the world, has almost one million cores and a staggering four trillion transistors. More importantly, though, it has 44GB of SRAM, which is the fastest memory available, even faster than the HBM found on Nvidia's GPUs. Since WSE-3 is just one huge die, the available memory bandwidth is huge, several orders of magnitude bigger than what the Nvidia H100 (and, for that matter, the H200) can muster.

No pricing has been disclosed yet, but Cerebras, which is usually coy about that particular detail, did divulge last year that Llama 3.1 405B on Cerebras Inference would cost $6 per million input tokens and $12 per million output tokens. Expect DeepSeek to be available for far less. WSE-4, the next iteration of WSE-3, is expected to launch in 2026 or 2027 (depending on market conditions) and will deliver a significant boost in the performance of DeepSeek and similar reasoning models.

The arrival of DeepSeek is also likely to shake the proverbial AI money tree, bringing more competition to established players like OpenAI or Anthropic and pushing prices down. A quick look at Docsbot.ai's LLM API calculator shows OpenAI is almost always the most expensive option in all configurations, sometimes by several orders of magnitude.
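For a sense of scale, here is a minimal cost sketch in Python using the only rates Cerebras has disclosed (the Llama 3.1 405B pricing above). DeepSeek R1's actual pricing is not yet public, and the token counts below are illustrative assumptions:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 usd_per_m_in: float, usd_per_m_out: float) -> float:
    """Cost in USD for one request, given $/million-token rates."""
    return (input_tokens / 1e6) * usd_per_m_in + (output_tokens / 1e6) * usd_per_m_out

# Cerebras' disclosed Llama 3.1 405B rates: $6/M input, $12/M output.
# Reasoning models emit long chains of thought, so output dominates.
cost = request_cost(input_tokens=2_000, output_tokens=10_000,
                    usd_per_m_in=6.0, usd_per_m_out=12.0)
print(f"${cost:.3f} per request")  # $0.132 under these assumptions
```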
Cerebras Systems announces hosting of DeepSeek's R1 AI model on US servers, promising 57x faster speeds than GPU solutions while addressing data privacy concerns. This move reshapes the AI landscape, challenging Nvidia's dominance and offering a US-based alternative to Chinese AI services.
Cerebras Systems has announced that it will host DeepSeek's R1 artificial intelligence model on U.S. servers. The offering promises inference speeds up to 57 times faster than traditional GPU-based solutions, while ensuring sensitive data remains within American borders 1.
Cerebras will deploy a 70-billion-parameter version of DeepSeek-R1 on its proprietary wafer-scale hardware. The company claims its implementation can process 1,600 tokens per second, a significant improvement over GPU implementations that have struggled with newer "reasoning" AI models 1.
The performance boost is attributed to Cerebras' novel chip architecture, which keeps entire AI models on a single wafer-sized processor. This design eliminates memory bottlenecks common in GPU-based systems 1.
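A rough back-of-envelope shows why memory bandwidth, rather than raw compute, gates this workload. The sketch below assumes FP16 weights, batch size 1, and that each generated token streams every weight once (KV-cache traffic and batching are ignored); these modelling assumptions are ours, not from the sources:

```python
# Why decoding speed is bandwidth-bound: each generated token must
# stream (roughly) all model weights through the processor once.
params = 70e9               # 70B-parameter DeepSeek-R1 variant
bytes_per_param = 2         # FP16 (assumed precision)
weight_bytes = params * bytes_per_param          # ~140 GB

tokens_per_s = 1_600        # Cerebras' claimed rate
required_bw = weight_bytes * tokens_per_s        # bytes/second

print(f"Weights: {weight_bytes / 1e9:.0f} GB")
print(f"Bandwidth at {tokens_per_s} tok/s: {required_bw / 1e12:.0f} TB/s")
# ~224 TB/s, far beyond a single GPU's HBM (a few TB/s), which is the
# bottleneck that on-wafer SRAM is meant to remove.
```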
A key aspect of this offering is the focus on data sovereignty. By hosting DeepSeek R1 on U.S. servers, Cerebras addresses concerns about data privacy and control, particularly for American companies wary of their data being processed in China 1.
James Wang, a senior executive at Cerebras, emphasized this point: "If you use DeepSeek's API, which is very popular right now, that data gets sent straight to China. That is one severe caveat that [makes] many U.S. companies and enterprises...not willing to consider [it]." 1
This development represents a significant shift in the AI industry. DeepSeek, founded by former hedge fund executive Liang Wenfeng, has achieved sophisticated AI reasoning capabilities reportedly at just 1% of the cost of U.S. competitors. Cerebras' hosting solution now offers American companies a way to leverage these advances while maintaining data control 1.
The announcement follows a week in which DeepSeek's emergence triggered Nvidia's largest-ever market value loss, nearly $600 billion, raising questions about the chip giant's AI supremacy 1.
Cerebras is offering the service through a developer preview starting immediately. While initially free, the company plans to implement API access controls due to strong early demand 1.
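For developers, access to the preview will presumably resemble any hosted-inference API. The sketch below is a hypothetical example assuming an OpenAI-compatible endpoint; the base URL, model name, and environment variable are illustrative assumptions, not confirmed details of the preview:

```python
import os
from openai import OpenAI  # pip install openai

# Hypothetical client setup; endpoint and credentials are assumptions.
client = OpenAI(
    base_url="https://api.cerebras.ai/v1",      # assumed endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],     # assumed env var
)

resp = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",      # hypothetical model id
    messages=[{"role": "user",
               "content": "Summarize the key idea of wafer-scale inference."}],
)
print(resp.choices[0].message.content)
```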
This move could accelerate the shift away from GPU-dependent AI infrastructure. Industry analysts suggest that specialized AI chips, like those developed by Cerebras, are outperforming GPUs for running the latest models 1 2.
Cerebras' hosting of DeepSeek R1 may also impact AI pricing. The arrival of DeepSeek is likely to increase competition among established players like OpenAI and Anthropic, potentially driving prices down 2.
As AI models increasingly incorporate sophisticated reasoning capabilities, their computational demands have skyrocketed. Cerebras argues its architecture is better suited for these emerging workloads, potentially reshaping the competitive landscape in enterprise AI deployment 1.