Curated by THEOUTPOST
On Tue, 23 Jul, 4:02 PM UTC
3 Sources
[1]
AMD & NVIDIA Fight It Out In The Server CPU Segment: Each Claiming Over 2x Performance Uplift With EPYC Genoa & Grace Chips
AMD & NVIDIA are each claiming over a 2x gain over the competitor with their EPYC Genoa and Grace Superchip CPU platforms for data centers. NVIDIA's entry into the data center CPU segment with its Grace Superchip and Grace CPU has been a major deal, especially for the x86 market, which has enjoyed dominance in the space for quite some time. While Arm chips have seen decent adoption, NVIDIA's Grace CPU, also based on the Arm architecture, poses a major threat. That's why the company is now facing a heated response from AMD, which has pushed back on recent performance claims made by NVIDIA.

NVIDIA is already massively successful in the AI space with its GPU-based accelerators, and Grace CPUs further challenge x86 chipmakers such as AMD and Intel. With Grace, NVIDIA has the potential to gain a good chunk of market share in the data center segment, given that its Superchip platform is being adopted in data center solutions based on its Hopper and next-gen Blackwell GPUs, which are selling like hot cakes at the moment.

In response, AMD published a blog post explaining the importance of data centers and how their high-performance, energy-efficient operations shape the tech world. What followed was a shocker, not just to NVIDIA but to the tech world in general. AMD shows that while EPYC Milan was already top-notch for the x86 data center segment, its latest 4th Gen EPYC "Genoa" and "Bergamo" CPUs take the performance to a whole new level, and the company claims a huge performance lead for its 4th Gen EPYC processors over NVIDIA's Grace CPU Superchip. However, do note that these are not third-party benchmarks, so the final results may vary; still, by AMD's numbers, the new EPYC processors significantly outperform NVIDIA's Grace Superchip, which is based on Arm processor IP.
The Grace Superchip is already a beastly 144-core processor (72 cores per chip), but in the benchmarks published by AMD, it appears to be no match for the EPYC offerings.

AMD EPYC processors are the best choice for datacenter performance and efficiency, as they outperform the NVIDIA Grace CPU Superchip across ten key workloads, based on extensive industry-standard benchmark publications and testing. AMD EPYC processors also offer the advantage of x86 processor architecture compatibility, which enables you to deploy a broad set of workloads with no compromises, and without expensive architectural transitions to a different ISA. AMD EPYC processors are the best option for datacenter operators who want to maximize performance while minimizing power and real estate footprint in a datacenter with an easy button. In the age of AI where you need capacity for your emerging AI workloads, AMD provides the best option with no compromises based on industry standards, transparency in data and benchmarks, and broad availability of platforms and solutions across the Ecosystem without expensive architectural transitions.

via AMD

In AMD's benchmark slides, both the EPYC 9654 and 9754 outperform the Grace CPU Superchip by over 2x (over 4x in HPC workloads such as Quantum ESPRESSO). In several data center-related tests, the EPYC CPUs deliver at least 2.5x better relative performance, reaching up to 4x in HPC workloads and over 3.5x in server-side Java. The EPYC 9754 is the flagship data center CPU, featuring a whopping 128-core, 256-thread configuration, while the EPYC 9654 brings a 96-core, 192-thread config.

Next comes the efficiency test, where AMD compares both EPYC CPUs against the NVIDIA Grace Superchip using single- and dual-socket configs of the EPYC 9754. The EPYC 9654 comes out 2.27x more power-efficient, while a single 9754 offers 2.50x better efficiency.
With a dual-socket solution, the efficiency advantage increases to 2.75x, signaling that EPYC is, by AMD's measure, the best overall choice for demanding workloads. AMD also reminds readers of the advantage its EPYC CPUs hold by being based on the x86-64 architecture, which lets you run a broad set of workloads without compatibility issues.

Meanwhile, NVIDIA has also shared a new performance update for the Grace CPU Superchip, sticking to its original claims while adding some additional information. According to the benchmarks published by NVIDIA, the server-side performance of the Grace CPU Superchip is up to 2.4x faster than a dual-socket EPYC 9654 platform and around 1.2-1.3x faster on average, while also being well ahead of Intel's Sapphire Rapids platforms. The same goes for data center throughput, which is up to 3x faster on Grace and around 1.5-2.0x faster on average across multiple tests.

NVIDIA states that one of the key elements in testing Grace's CPU performance is to use optimized code built with the latest compilers, such as GCC 12.x. The company also recommends using optimal math libraries such as BLAS, LAPACK, FFT, and ScaLAPACK to fully leverage the capabilities of the Superchip architecture.

via NVIDIA

One thing to note is that AMD and NVIDIA are using different sets of benchmarks and workloads for their EPYC and Grace comparisons, so differing results are expected; and given that the data center market is booming right now, each chipmaker will try to showcase how it sits ahead of the competition. There's no doubt that both AMD and NVIDIA have compelling options for data centers, but we expect these benchmarks to pave the way for fairer and more accurate representations of performance in the future, where best practices and optimizations are applied not only to each vendor's (AMD/Intel/NVIDIA) own chips but to the competition's too.
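Claims like "2.50x" or "2.75x more efficient" come down to simple performance-per-watt arithmetic on SPECpower-style results. A minimal sketch, using entirely hypothetical throughput and power figures (not AMD's or NVIDIA's published measurements), of how such ratios are derived:

```python
# Hypothetical SPECpower-style numbers, for illustration only --
# these are NOT AMD's or NVIDIA's published results.
systems = {
    "EPYC 9754 (1P)":  {"ssj_ops": 25_000_000, "watts": 720},
    "EPYC 9754 (2P)":  {"ssj_ops": 48_000_000, "watts": 1260},
    "Grace Superchip": {"ssj_ops": 14_000_000, "watts": 1000},
}

def perf_per_watt(system):
    # Efficiency metric: throughput divided by average power draw.
    return system["ssj_ops"] / system["watts"]

# Express each system's efficiency relative to the Grace baseline.
baseline = perf_per_watt(systems["Grace Superchip"])
for name, system in systems.items():
    ratio = perf_per_watt(system) / baseline
    print(f"{name}: {ratio:.2f}x relative efficiency")
```

With these made-up inputs the single- and dual-socket ratios land near 2.5x and 2.7x, which shows the shape of the calculation, not the vendors' actual data.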
Looking ahead, AMD is expected to introduce its 5th Gen EPYC CPU family, codenamed Turin, later this year, while NVIDIA is aiming to launch its next major Arm CPU, codenamed Vera, by 2026.
[2]
AMD says Nvidia's Grace no match for its Epyc Zen 4 cores
Comment AMD has claimed its current datacenter silicon is already more than twice as fast, and up to 2.75 times more efficient, than Nvidia's Grace CPU Superchip. The chip design firm's assertions came after its own testing, published last week, in which it considered Nvidia's 2022 Grace CPU Superchip. That product combines a pair of CPU dies packing 72 Arm Neoverse V2 cores apiece, connects them with a 900GB/sec NVLink chip-to-chip interconnect, and backs that with up to 960GB of speedy LPDDR5x memory. However, it appears AMD was testing the 480GB version. To be clear, this isn't Nvidia's Grace-Hopper Superchip (GH200), which combines a single Grace CPU with up to 480GB of LPDDR5x and a 144GB H100 GPU die.

Against Nvidia's Grace CPU, AMD pitted both single- and dual-socket systems running its 4th-gen Epyc Genoa (9654) and Bergamo (9754) parts, each with 768GB of DDR5-4800 memory. Across ten workloads - spanning general-purpose compute, server-side Java, power efficiency, transactional databases, decision support systems, web servers, in-memory databases, video encoding, and high-performance computing (HPC) - AMD boasted its kit delivered between 1.5x and 4x the performance of Nvidia's chip. As with any vendor-supplied benchmarks, take these with a grain of salt. You can find a more detailed breakdown of AMD's performance claims here, but here's a consolidated view.

Meanwhile, in the SPECpower_ssj2008 benchmark, AMD claimed a single 128-core Epyc 9754 offered roughly 2.5x better performance per watt than Nvidia's Arm Neoverse V2-based chip, while a pair of the Bergamo Epycs pushed that advantage to 2.75x. None of this should come as a surprise to anyone who's been following Grace's development - though the situation isn't as simple as AMD would have you believe.
As our sibling site The Next Platform reported back in February, researchers at Stony Brook University and the University at Buffalo compared performance data from Nvidia's Grace CPU Superchip and several x86 processors, gathered from multiple scientific research institutes and one cloud builder. Naturally, most of these tests were HPC-centric, including Linpack, High Performance Conjugate Gradient (HPCG), OpenFOAM, and Gromacs. While the Grace system's performance varied wildly between tests, at worst it fell somewhere between Intel's Skylake architecture (circa 2015) and its Ice Lake (2019) tech; at best it bested AMD's Milan (from 2021) and came within spitting distance of a Xeon Max launched in early 2023.

The findings suggest that AMD's most powerful Genoa and Bergamo Epyc processors might well beat out Nvidia's first datacenter CPU - on the right benchmark. But as we alluded to earlier, all of this is workload-dependent. In its Grace CPU Superchip datasheet, Nvidia shows the silicon achieving anywhere from 90 percent to 2.4x the performance of dual 96-core Epyc 9654s - the same Genoa Epyc used in AMD's tests - and up to three times the throughput in a variety of cloud and HPC services.

While a good old CPU shootout might make sense - Grace and Epyc are, at the end of the day, both datacenter CPU platforms - we haven't really seen Nvidia's Grace CPU Superchips deployed widely outside of HPC applications, and usually in preparation for larger-scale deployments of the next-generation GH200 silicon. The UK's Isambard-3 and Isambard-AI supercomputers are fine examples of that strategy in action. Nvidia itself bills the CPU Superchip as one designed to "process mountains of data to produce intelligence with maximum energy efficiency," and specifically cites AI, data analytics, hyperscale cloud applications, and HPC applications. What's more, in the GH200 configuration most of the computation is done by the GPU - Grace mostly keeps the accelerator fed with data.
And clearly Nvidia thinks Grace and its NVLink-C2C interconnect are up to the task, as it chose to reuse the CPU in its upcoming GB200 superchips, which we looked at back at Nvidia's GTC developer conference. That's arguably all Nvidia needs Grace to do for it to be successful, and it explains why the acceleration champ has already started work on its successor. We have to imagine the list of folks cross-shopping Grace-Grace against 4th-gen Epyc - outside of the HPC arena, of course - is a rather short one. In all honesty, we'd have been far more interested to see a head-to-head between the GH200 and AMD's MI300A APUs.

AMD closes its claims with a discussion of Arm compatibility - a topic worthy of many more benchmarks. We get the sense AMD's tests may just be an exercise in dispelling fears that x86 is running out of steam and that Arm is taking over. Arm isn't exactly new to the HPC community or the cloud - markets that are far from rejecting the architecture. In fact, every major US cloud provider now has an Arm CPU to call its own. But if this is really about how AMD's Zen 4 and Zen 4c cores stack up against Arm's Neoverse V2 architecture, a comparison with Amazon Web Services' Graviton4 would have been more useful. Announced late in 2023, Graviton4 is based on the same Neoverse V2 core as Grace, but boasts 96 cores and supports standard dual-socket configurations and 12 channels of DDR5, as opposed to Grace's soldered-down LPDDR5x modules. Instances running Graviton4 had been available in preview for months and became generally available last week. Perhaps more importantly, AWS offers both Epyc- and Graviton4-based instances, making the likelihood of someone comparing the two far higher. ®
[3]
AMD says its EPYC processors beat Nvidia's Grace CPU Superchip across multiple benchmarks
When it comes to data center CPUs, Nvidia is the new kid on the block that has to prove it can develop competitive processors for servers. AMD, an established player, apparently considers Nvidia's Grace platform a competitor, which is why it decided to publish a blog post comparing the performance of its EPYC processors to Nvidia's Grace platform in a series of benchmarks. While, of course, such a comparison has to be taken with a grain of salt, it gives us some idea of the performance of Nvidia's Grace CPU.

Based on tests conducted by AMD, its EPYC 9754 (128 cores) and 9654 (96 cores) processors deliver over twice the performance of Nvidia's Grace CPU Superchip (144 cores across two 72-core dies) in various workloads, including general-purpose computing, server-side Java, transactional databases, decision support systems, web servers, in-memory analytics, video encoding, and high-performance computing (HPC). For instance, in the SPECpower_ssj2008 benchmark, single- and dual-socket AMD EPYC systems outperform Nvidia Grace systems by approximately 2.50x and 2.75x, respectively, while a dual-socket AMD EPYC 9654 system outperforms a similar Nvidia system by about 2.27x on the same tests, highlighting their superior energy efficiency.

AMD says its EPYC processors hold over 300 world records for performance and efficiency across various benchmarks, including business applications, technical computing, data management, data analytics, digital services, media and entertainment, and infrastructure solutions. Still, considering that the main purpose of Nvidia's Grace Hopper project is to run AI training and inference workloads rather than general-purpose server workloads, the results aren't shocking; Hopper's renown lies in AI applications. AMD's EPYC Zen 4 processors, for their part, are built on the x86-64 architecture, ensuring compatibility with a wide array of software written for x86.
By contrast, Nvidia's Grace is based on the Arm architecture, and while Arm is catching up with x86 in terms of data center software support, x86 is still ahead in compatibility. AMD also notes that due to their high performance, energy efficiency, and compatibility with existing software and infrastructure, its EPYC processors offer a lower total cost of ownership (TCO) than Nvidia's Grace. AMD's own tests show that its EPYC processors outperform Nvidia's Grace CPU Superchip across various key workloads, provide extensive compatibility thanks to their x86-64 architecture, and offer a proven, future-proof solution for diverse data center needs, which is why the company claims they are the better choice for general data center workloads. However, when it comes to AI training and inference, Nvidia's Grace Hopper platform can offer significant advantages over AMD's Instinct lineup. After all, AI is what Grace Hopper was developed to address from the start.
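TCO arguments like AMD's usually reduce to "fewer servers for the same throughput target, so lower combined hardware and energy spend." A back-of-the-envelope sketch with entirely hypothetical prices, power draws, and throughput numbers (none of these are vendor figures):

```python
import math

def servers_needed(target_throughput, per_server_throughput):
    # Round up: you can't deploy a fraction of a server.
    return math.ceil(target_throughput / per_server_throughput)

def tco(n_servers, server_price_usd, watts_per_server,
        years=4, usd_per_kwh=0.10):
    # Simplified TCO: acquisition cost plus energy cost over the
    # service life. Real models add cooling, rack space, and staffing.
    hours = years * 365 * 24
    energy_cost = n_servers * (watts_per_server / 1000) * hours * usd_per_kwh
    return n_servers * server_price_usd + energy_cost

target = 100.0  # arbitrary throughput units
# Hypothetical: platform A delivers 2x the per-server throughput of B.
a = servers_needed(target, 10.0)  # platform A
b = servers_needed(target, 5.0)   # platform B
print(tco(a, 30_000, 800), tco(b, 25_000, 700))
```

Even with a higher per-server price and power draw, the faster platform wins here simply because it needs half as many boxes; flip the assumptions and the conclusion flips too, which is why TCO claims always need the inputs spelled out.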
AMD and NVIDIA are locked in a fierce competition for datacenter CPU supremacy. Both companies claim leadership with their respective EPYC and Grace chips, sparking debate in the tech industry.
In a high-stakes battle for dominance in the datacenter CPU market, AMD and NVIDIA have both claimed leadership positions with their latest chip offerings. This clash has ignited a debate within the tech industry, with both companies presenting benchmark results to support their claims.
AMD has boldly stated that its EPYC processors outperform NVIDIA's Grace CPU Superchip across multiple benchmarks [1]. The company's 4th Gen EPYC processors, particularly the EPYC 9654 with 96 cores, are at the center of this claim. AMD asserts that these chips deliver superior performance in various workloads, including integer, floating-point, and memory bandwidth-intensive applications [3].
NVIDIA, not to be outdone, has introduced its Grace CPU Superchip, a 144-core Arm-based processor designed for high-performance computing and AI workloads. The company has touted its chip's capabilities, particularly in energy efficiency and performance-per-watt metrics [2].
The competition has led to a flurry of benchmark comparisons. AMD claims that its EPYC 9654 outperforms NVIDIA's Grace CPU Superchip by up to 1.4x in SPECrate2017_int_base and up to 1.9x in SPECrate2017_fp_base benchmarks [1]. However, NVIDIA counters with its own set of benchmarks, highlighting areas where the Grace CPU Superchip excels.
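SPECrate-style scores are geometric means of per-benchmark ratios against a reference machine, so "1.4x faster" is a ratio of two geometric means. A minimal sketch of that calculation, using made-up per-benchmark ratios rather than any actual SPECrate2017 submission:

```python
from math import prod

def geomean(ratios):
    # SPEC composite scores are the geometric mean of the
    # per-benchmark performance ratios vs. the reference machine.
    return prod(ratios) ** (1 / len(ratios))

# Hypothetical per-benchmark ratios, for illustration only --
# not actual SPECrate2017_int_base results for either chip.
epyc_ratios  = [9.1, 8.7, 10.2, 9.8]
grace_ratios = [6.5, 6.1, 7.3, 6.9]

epyc_score  = geomean(epyc_ratios)
grace_score = geomean(grace_ratios)
print(f"relative uplift: {epyc_score / grace_score:.2f}x")
```

The geometric mean keeps one outlier sub-benchmark from dominating the composite, which is also why two vendors can quote different "up to" numbers from individual sub-tests while the composite gap stays smaller.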
This rivalry between AMD and NVIDIA is reshaping the datacenter CPU landscape. As both companies continue to innovate and compete, customers in the high-performance computing and AI sectors stand to benefit from improved performance and efficiency. The outcome of this competition could have far-reaching implications for the future of datacenter architecture and the broader tech industry.
Interestingly, NVIDIA's use of Arm-based architecture in its Grace CPU Superchip adds another dimension to this competition. It highlights the growing importance of Arm designs in the high-performance computing space, traditionally dominated by x86 architecture [2]. This shift could potentially influence future chip designs and market dynamics.
As the debate rages on, potential customers are left to navigate through conflicting claims and benchmark results. IT decision-makers will need to carefully evaluate these processors based on their specific workload requirements, considering factors such as performance, energy efficiency, and compatibility with existing systems.
Intel's new Xeon 6900P series, based on Granite Rapids architecture, brings 120 cores to the table, matching AMD's EPYC core counts for the first time since 2017. This launch marks a significant milestone in the CPU market, with implications for AI and data center performance.
5 Sources
AMD has unveiled its 5th generation EPYC 'Turin' server processors, featuring up to 192 cores, 5 GHz clock speeds, and significant performance improvements over previous generations and competitors.
4 Sources
AMD has revealed details about its upcoming Zen 5 architecture and Ryzen 9000 series processors, promising significant improvements in performance and efficiency. The new design lays the foundation for future CPU architectures and introduces advanced features like RDNA 3.5 iGPU and XDNA 2 NPU.
9 Sources
Intel cancels its Falcon Shores AI chip, delays Clearwater Forest Xeons, and struggles to maintain server CPU market share against AMD and Arm-based competitors.
5 Sources
AMD introduces a range of new Ryzen AI processors at CES 2025, including the high-performance Ryzen AI Max+ series and expanded Ryzen AI 300 and 200 series, targeting AI PCs, gaming, and professional workloads.
8 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved