Curated by THEOUTPOST
On Tue, 17 Sept, 4:04 PM UTC
2 Sources
[1]
xAI cluster is now the most powerful AI training system in the world -- but questions remain over storage capacity, power usage and why it's actually called Colossus
We recently got a glimpse of what $1 billion worth of AI GPUs looks like when Elon Musk shared a brief video tour of Cortex, Tesla's AI training supercomputer currently under construction at the Giga Texas plant. More recently, Musk took to his social media platform to announce that Colossus, a new 100,000-unit H100 training cluster, is now up and running.

Musk claims that Colossus is "the most powerful AI training system in the world" and that it was built "from start to finish" in just 122 days. That's quite an achievement. Servers for the xAI cluster were reportedly provided by Dell and Supermicro, with the cost of the project estimated at $3-4 billion.

Tom's Hardware notes, "Although all of these clusters are formally operational and even training AI models, it is entirely unclear how many are actually online today. First, it takes some time to debug and optimize the settings of those superclusters. Second, X needs to ensure that they get enough power, and while Elon Musk's company has been using 14 diesel generators to power its Memphis supercomputer, they were still not enough to feed all 100,000 H100 GPUs."

The Colossus system is poised to eventually double in capacity, with plans to incorporate an additional 100,000 GPUs: 50,000 H100 units and 50,000 of Nvidia's next-gen H200 chips. The supercluster will primarily be used to train xAI's Grok-3, the company's latest and most advanced AI model. We've yet to see any mention of storage for the new system, but it will need to be huge.

The naming of the new supercomputer has raised more than a few eyebrows, however, with people noting that it shares its name with a 1970 sci-fi movie (based on a 1966 novel by D.F. Jones) about a supercomputer that becomes sentient after being given control of the US nuclear arsenal. Things, predictably, go horribly wrong for humanity.
Both the novel and film explore timely themes of AI autonomy, the dangers of relinquishing control to machines, and the ethical implications of artificial intelligence. It's possible that Musk wasn't aware of this when the name was chosen for his new AI training system, and it might have been selected purely to emphasize the sheer scale of the supercluster. Then again, with Musk's track record, it wouldn't be surprising if the reference was entirely intentional - he knows exactly what he's doing.
[2]
Musk says xAI Colossus is the most powerful AI training system ever
Elon Musk has once again grabbed headlines by giving the world a glimpse of Cortex, Tesla's AI training supercomputer currently under construction at the Giga Texas plant. In a video that's both awe-inspiring and surreal, Musk showed off what a cool $1 billion in AI GPUs actually looks like. But if that wasn't enough to make tech enthusiasts' jaws drop, Musk recently took to his platform, X, to reveal that the real showstopper, Colossus, a 100,000-GPU H100 training cluster, has officially come online.

Musk didn't hold back on the bragging rights, claiming that Colossus is "the most powerful AI training system in the world." Even more impressive is the fact that this mammoth project was built "from start to finish" in just 122 days. Considering the scale and complexity involved, that's no small feat. Servers for the xAI cluster were provided by Dell and Supermicro, and while Musk didn't drop an exact number, estimates place the cost between a staggering $3 billion and $4 billion.

Now, here's where things get really interesting. Although the system is operational, it's unclear how much of the cluster is fully functional today. That's not uncommon with systems of this magnitude, as they require extensive debugging and optimization before they're running at full throttle. But when you're dealing with something on the scale of Colossus, every detail counts, and even a fraction of its full potential could outperform most other systems.

The future looks even more intense. Colossus is set to double in size, with plans to add another 100,000 GPUs, split between Nvidia's current H100 units and the highly anticipated H200 chips. This upgrade will primarily power the training of xAI's latest and most advanced AI model, Grok-3, which aims to push the boundaries of what we consider possible in AI.
Elon Musk's xAI introduces Colossus, the world's most powerful AI training system. While impressive, questions arise about its storage capacity, power usage, and name.
Elon Musk's artificial intelligence company, xAI, has announced the creation of what it claims to be the world's most powerful AI training system, named Colossus [1]. This development marks a significant milestone in the field of AI, potentially revolutionizing the capabilities of machine learning and artificial intelligence applications.
Colossus boasts an impressive array of hardware: 100,000 H100 GPUs [2]. These high-performance GPUs are interconnected using NVIDIA's NVLink and InfiniBand technologies, creating a robust network capable of handling complex AI training tasks. With each H100 delivering on the order of a petaflop of dense FP16 tensor throughput, the cluster's theoretical aggregate compute runs to roughly 100 exaflops.
The unveiling of Colossus has sent ripples through the AI industry. Its computational power significantly outperforms other notable systems, including Google's TPU v4 pod (roughly 1.1 exaflops) and NVIDIA's Eos supercomputer (roughly 18.4 exaflops of FP8 compute) [2]. This leap in processing capability could potentially accelerate AI research and development across various sectors.
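The aggregate-throughput figure above can be sketched with simple arithmetic. Note this is a back-of-envelope estimate, not a reported benchmark: the per-GPU peak is an assumption based on published H100 specs, and the utilization factor (MFU) is a hypothetical value, since real training runs sustain far less than theoretical peak.

```python
# Back-of-envelope aggregate compute for a 100,000-GPU H100 cluster.
# Per-GPU figure assumed (~1 petaflop of dense FP16/BF16 tensor
# throughput); MFU (model FLOPs utilization) is a hypothetical value.
NUM_GPUS = 100_000
PEAK_PER_GPU_PFLOPS = 1.0   # dense FP16 tensor peak, assumed
MFU = 0.4                   # fraction of peak sustained in training, assumed

peak_eflops = NUM_GPUS * PEAK_PER_GPU_PFLOPS / 1000   # theoretical peak, exaflops
sustained_eflops = peak_eflops * MFU                  # rough sustained figure

print(f"Theoretical peak: {peak_eflops:.0f} EFLOPS")
print(f"At {MFU:.0%} MFU: {sustained_eflops:.0f} EFLOPS")
```

Even at modest utilization, the sketch shows why a cluster of this size dwarfs earlier exaflop-scale AI systems by two orders of magnitude.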
Despite its impressive specifications, Colossus faces several challenges and raises important questions:
Power Consumption: The system's energy requirements are substantial: 100,000 H100-class GPUs alone imply a draw in the tens of megawatts at full load, before accounting for networking, storage, and cooling. This high energy demand raises concerns about sustainability and environmental impact.
Storage Capacity: Details about Colossus's storage capabilities remain unclear, leaving questions about how it manages and processes vast amounts of data [1].
Naming Convention: Interestingly, while xAI refers to the system as "Colossus," Musk himself has been calling it the "xAI cluster" [1]. This discrepancy in naming has led to some confusion in the tech community.
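To put the power-consumption point above in perspective, here is a back-of-envelope estimate. Both the per-GPU draw and the facility overhead factor are assumptions chosen for illustration, not figures from the coverage:

```python
# Back-of-envelope power estimate for a 100,000-GPU H100 cluster.
# Assumptions: ~700 W TDP per H100 SXM module, plus a rough 50%
# overhead for CPUs, networking, storage, and cooling.
NUM_GPUS = 100_000
GPU_TDP_W = 700          # per-GPU thermal design power, assumed
OVERHEAD = 0.5           # non-GPU facility overhead, assumed

gpu_power_mw = NUM_GPUS * GPU_TDP_W / 1e6        # megawatts, GPUs alone
total_power_mw = gpu_power_mw * (1 + OVERHEAD)   # megawatts, whole facility

print(f"GPUs alone: {gpu_power_mw:.0f} MW")      # 70 MW
print(f"With overhead: {total_power_mw:.0f} MW") # 105 MW
```

Under these assumptions the GPUs alone need tens of megawatts, which helps explain the reports of diesel generators failing to keep the full cluster fed.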
The introduction of Colossus could have far-reaching implications for AI research and development. Its unprecedented processing power may enable breakthroughs in areas such as natural language processing, computer vision, and complex problem-solving [2]. However, the full extent of its capabilities and potential applications remains to be seen as researchers and developers begin to utilize this powerful new tool.
Elon Musk's xAI has launched Colossus, a groundbreaking AI training system utilizing 100,000 NVIDIA H100 GPUs. This massive computational power aims to revolutionize AI development and compete with industry giants.
10 Sources
Elon Musk's xAI is expanding its Colossus AI supercomputer from 100,000 to 200,000 NVIDIA Hopper GPUs, making it the world's largest AI training system. The project showcases NVIDIA's Spectrum-X Ethernet networking platform, achieving unprecedented performance in AI workloads.
13 Sources
Elon Musk reveals Tesla's new AI supercomputer, Cortex, designed to process vast amounts of data for autonomous driving. The system, still under construction, boasts an impressive array of NVIDIA H100 GPUs.
4 Sources
Elon Musk's AI company, xAI, has brought online a powerful new supercomputer in Memphis, Tennessee, to train its next-generation AI model, Grok 3. The system boasts an impressive array of 100,000 Nvidia H100 GPUs, positioning it as one of the most potent AI training clusters globally.
11 Sources
Elon Musk announces his efforts to develop the world's most powerful AI, sparking debate and skepticism in the tech community. The ambitious project aims to surpass existing AI models in various metrics.
2 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved