Curated by THEOUTPOST
On Tue, 27 Aug, 12:02 AM UTC
4 Sources
[1]
Elon Musk shows off Cortex AI supercluster -- first look at Tesla's 50,000 Nvidia H100s
Elon Musk's supercomputing exploits continue to press forward this week, as the technocrat shared a video of his newly renamed "Cortex" AI supercluster on X. The recent expansion to Tesla's "Giga Texas" plant will contain 70,000 AI servers and will require 130 megawatts (MW) of cooling and power at launch, scaling up to 500 MW by 2026.

Musk's video of the Cortex supercluster shows off the in-progress assembly of a staggering number of server racks. From the fuzzy video, the racks seem to be laid out in an array of 16 compute racks per row, with four or so non-GPU racks splitting the rows. Each compute rack holds 8 servers. Somewhere between 16 and 20 rows of server racks are visible in the 20-second clip, so rough napkin math suggests roughly 2,000 GPU servers can be seen, less than 3% of the estimated full-scale deployment.

Musk shared in Tesla's July earnings call that the Cortex supercluster will be Tesla's largest training cluster to date, containing "50,000 [Nvidia] H100s, plus 20,000 of our hardware." This is a smaller number than Musk previously shared, with tweets from June estimating Cortex would house 50,000 units of Tesla's Dojo AI hardware. Previous remarks from the Tesla CEO also suggest that Tesla's own hardware will come online at a later date, with Cortex expected to be solely Nvidia-powered at launch.

The Cortex training cluster is being built to "solve real-world AI," per Musk's posts on X. Per Tesla's Q2 2024 earnings call, that means training Tesla's Full Self-Driving (FSD) autopilot system -- which will power consumer Teslas and the promised "Cybertaxi" product -- and training AI for Optimus, an autonomous humanoid robot expected to begin limited production in 2025 for use in Tesla's manufacturing process.

Cortex first turned heads in the press thanks to the massive fans under construction to chill the entire supercluster, shown off by Musk in June.
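The napkin math above can be sketched as a quick back-of-envelope calculation. All of the counts below are eyeballed estimates taken from the video, not official Tesla figures:

```python
# Back-of-envelope server count from the Cortex clip, using the
# eyeballed figures described above (not official Tesla numbers).
racks_per_row = 16            # compute racks visible per row
servers_per_rack = 8          # servers per compute rack
rows_low, rows_high = 16, 20  # rows visible in the 20-second clip

low = racks_per_row * servers_per_rack * rows_low    # 2,048 servers
high = racks_per_row * servers_per_rack * rows_high  # 2,560 servers

full_scale = 70_000  # planned AI servers for the Giga Texas expansion
share = low / full_scale
print(f"~{low}-{high} servers visible, about {share:.1%} of full scale")
```

Even the high end of that range (2,560 servers) is well under 4% of the 70,000-server plan, consistent with the "less than 3%" estimate for the low end.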
The fan stack cools the Supermicro-provided liquid cooling solution, built to handle an eventual 500 MW of cooling and power at full capacity. For context, an average coal power plant may output around 600 MW of power.

Cortex joins Elon Musk's stable of supercomputers in development. So far, the first of Musk's data centers to become operational is the Memphis Supercluster, owned by xAI and powered by 100,000 Nvidia H100s. All 100,000 of Memphis' GPUs are connected with a single RDMA (remote direct memory access) fabric, and are likewise cooled with help from Supermicro. Musk has also announced plans for a $500 million Dojo supercomputer in Buffalo, New York, another Tesla operation. The Memphis Supercluster is also expected to upgrade its H100 base with 300,000 B200 GPUs, but delays in Blackwell production due to design flaws have pushed this massive order back by several months.

As one of the largest single customers of Nvidia AI GPUs, Musk seems to be following Jensen Huang's CEO math: "The more you buy, the more you save." Time will tell whether this rings true for Musk and his supercomputer collection.
[2]
Inside Tesla's Cortex: Elon Musk's giant AI supercomputer revealed
The 20-second clip offers a glimpse inside the massive facility, revealing rows of servers packed with Nvidia H100 GPUs. These powerful graphics cards are essential for training large language models (LLMs) like OpenAI's GPT-4 and Google's PaLM 2.

Tesla is developing its Texas Gigafactory to house a cutting-edge AI supercomputer cluster. The expansion project, which is nearing completion, will initially incorporate 50,000 Nvidia GPUs alongside Tesla's proprietary AI hardware and is expected to advance Tesla's Full Self-Driving capabilities. Tesla CEO Elon Musk estimates that the Gigafactory supercomputer will initially draw 130 megawatts upon deployment and potentially scale up to 500 megawatts.

Remarkably, this is not Musk's only foray into supercomputing. Parallel to this project is another multi-billion-dollar supercomputer cluster for xAI, the billionaire's artificial intelligence startup.
[3]
Elon Musk unveils Tesla's new 'Cortex' supercomputer, but it's not ready quite yet
Elon Musk has unveiled Tesla's new 'Cortex' supercomputer, which could become one of the biggest in the world, but it's not ready quite yet.

This project has been described as critical to Tesla by CEO Elon Musk, who has shifted the automaker's focus to artificial intelligence in the last few years. AI needs computing power. Earlier this year, we reported that Tesla was having issues building a new expansion at Gigafactory Texas to house a giant new supercomputer to train Tesla's AI. At the time, we heard that Tesla was aiming for a 100 MW cluster to be ready by August, in time for the unveiling of its robotaxi, which has since been delayed. Musk canceled other projects at Tesla to focus construction resources on the expansion.

Later, Musk revealed that Tesla plans to eventually grow the cluster to over 500 MW, using half NVIDIA processors and half of its own AI hardware. Tesla plans to use this computing power to train its neural nets to deliver its long-promised self-driving capability.

Now, at the end of August, Musk has released a video showing off the supercomputer cluster, now called 'Cortex' (sound warning: the video is loud). The video only shows a small section of the cluster, which has yet to be completed and brought fully online. Sources familiar with the matter told Electrek that it is currently running on a temporary cooling system and that it won't be fully operational until the chiller plant is completed. Furthermore, Tesla needs more network feeders. Some believe the new cluster won't be ready until October, which is when the unveiling of the Tesla Robotaxi has been pushed.

Delays or no, it's hard for me to get too excited about this new computing power, because Musk said last year that Tesla is no longer compute-constrained when it comes to training neural nets. We saw very few improvements despite that announcement, which is why I'm skeptical about the impact of Tesla adding over 100 MW of extra power to its training capacity.
Sure, it won't hurt, but how much closer is it going to get us to self-driving?

Delays aside, this supercomputer cluster is not without controversy. Tesla shareholders are suing Musk for multiple breaches of fiduciary duty, including diverting a shipment of NVIDIA processors meant for Tesla to his private company, xAI. Musk confirmed the shipment was diverted, but claimed that Tesla "didn't have anywhere to put the processors" -- meaning that this expansion at Gigafactory Texas and the server room for the cluster weren't ready. However, as part of the lawsuit, Musk will have to explain why Tesla, which ordered the processors that he himself acknowledged are expensive and hard to get, was not ready to receive them while his private startup was.
[4]
Tesla's 'Cortex' Supercomputer Is What Its Robotaxi Hopes Ride On
The facility appears similar to other data centers, with thick cables and loud cooling systems.

As its lead in the electric vehicle arms race starts to wane, Tesla's solution for staying ahead is to focus on autonomous vehicles. But how it supposedly aims to make cars drive themselves is different from other players out there. It involves gathering colossal amounts of video footage from millions of its electric cars worldwide, processing that information at AI-powered data centers, and then packaging the results into software updates sent over-the-air, with the goal of making Teslas drive like humans.

To make this a reality, Tesla has started building giant data centers at its Gigafactories. One of these facilities is being built on the south side of the company's headquarters in Austin, Texas. The system at this complex is called Cortex, a "supercomputer cluster" that will run on more than 100,000 Nvidia H100 and H200 graphics processing units (GPUs) for video training of the Optimus humanoid robots and the Full Self-Driving (FSD) system. CEO Elon Musk shared a video of Cortex this morning.

Even though the facility appears similar to other data centers, the whole environment gives off a high-tech, futuristic vibe. It's one of our first-ever looks at the supposed technology behind the technology. There appear to be countless aisles of black server racks with tightly stacked hardware placed behind glass doors and casings. Thick bundles of red and blue cables seem to snake between the racks, and there's a loud hum, possibly emanating from the overhead cooling systems.

Speaking of cooling, the facility has gargantuan fans on the roof, probably the size of an aircraft propeller or even bigger -- Tesla is using a patented cooling system for this purpose. Moreover, the Texas supercomputer will apparently require 500 megawatts (MW) of power in the future. A Duke University study said a sports stadium can consume 5 MW of electricity during a game.
So 500 MW could power about 100 such stadiums at once, which highlights the power-hungry nature of AI data centers.

This isn't Tesla's only supercomputer. The company is also building its "Dojo" supercomputer at its New York Gigafactory, also for video training of the AI-based FSD systems in its future EVs. The head honcho claimed early this year that Tesla is investing $500 million in New York to build out this facility, which also produces NACS adapters for owners of Ford, Rivian and other brands. "Five hundred million, while obviously a large sum of money, is only equivalent to a 10k H100 system from Nvidia," Musk said at the time. "Tesla will spend more than that on Nvidia hardware this year. The table stakes for being competitive in AI are at least several billion dollars per year at this point." (However, Musk's decision to prioritize sending Nvidia chips to social media platform X -- which is privately held and technically unrelated to Tesla -- raised some eyebrows earlier this year.) Another supercomputer in Memphis is already operational. It's being expanded at the moment, and when complete it would be "the most powerful AI-training cluster in the world," according to Musk.

Since announcing a pivot to AI and autonomy, the company's passenger car and public charging businesses have taken a back seat. Apart from the refreshed Model 3 and the Cybertruck, the rest of Tesla's line-up is aging fast, and that's somewhat reflected in its sales slip. Musk wants Tesla to be a tech company building robotaxis, humanoid robots and AI. For now, Tesla's FSD has shown countless flaws in real-world driving conditions, and it doesn't quite merit the FSD label yet. The automaker also hasn't dealt with who or what would be liable in the event of a robotaxi crash, and it faces numerous investigations -- including a U.S. Department of Justice probe -- into how it has sold this technology to investors and the public. We may get some answers at Tesla's robotaxi debut event on Oct. 10.
Elon Musk reveals Tesla's new AI supercomputer, Cortex, designed to process vast amounts of data for autonomous driving. The system, still under construction, boasts an impressive array of NVIDIA H100 GPUs.
Elon Musk, CEO of Tesla, has given the world a first look at the company's latest technological marvel: the Cortex AI supercomputer. This massive computational system is currently under construction at Tesla's Gigafactory Texas and is set to play a crucial role in advancing the company's autonomous driving capabilities [1].
At the heart of Cortex lies an astounding array of NVIDIA H100 Tensor Core GPUs. The system is slated to house 50,000 of these powerful GPUs alongside 20,000 units of Tesla's own AI hardware, though only a small fraction of the racks appear installed in the video so far [2]. This hardware configuration positions Cortex as a formidable contender in the realm of AI supercomputers.
The primary purpose of Cortex is to process the enormous amounts of data collected from Tesla's fleet of vehicles. This data is crucial for training and improving the company's autonomous driving algorithms [3]. By leveraging such immense computational power, Tesla aims to accelerate the development of its Full Self-Driving (FSD) technology and push the boundaries of AI in automotive applications.
While the unveiling of Cortex has generated significant excitement, it's important to note that the system is not yet fully operational. Musk's reveal showed a partially completed installation, with much of the cluster still being assembled [4]. The company has not provided a specific timeline for when Cortex will be fully functional, leaving room for speculation about its ultimate capabilities.
The introduction of Cortex represents a significant investment by Tesla in AI infrastructure. With an estimated cost of over $1 billion for the hardware alone, this move underscores the company's commitment to leading the race in autonomous vehicle technology [1]. The sheer scale of the project also highlights the increasing importance of AI in the automotive sector and the competitive advantage that advanced computational resources can provide.
As Tesla moves forward with Cortex, several challenges lie ahead. These include managing the immense power requirements of such a system, ensuring efficient cooling for the densely packed GPUs, and developing software capable of fully utilizing this computational behemoth [2]. Additionally, the ethical implications of advancing AI technology at this scale are likely to be a topic of ongoing discussion within the industry and beyond.