Curated by THEOUTPOST
On Thu, 3 Apr, 12:05 AM UTC
3 Sources
[1]
Nvidia dominates in gen AI benchmarks, clobbering 2 rival AI chips
Nvidia's general-purpose GPU chips have once again made a nearly clean sweep of one of the most popular benchmarks for measuring chip performance in artificial intelligence, this time with a new focus on generative AI applications such as large language models (LLMs). There wasn't much competition.

Systems put together by Supermicro, Hewlett Packard Enterprise, Lenovo, and others -- packed with as many as eight Nvidia chips -- on Wednesday took most of the top honors in the MLPerf benchmark test organized by MLCommons, an industry consortium.

The test, which measures how fast machines can produce tokens, process queries, or output samples of data -- known as AI inference -- is the fifth installment of the prediction-making benchmark, which has been going on for years.

This time, MLCommons updated the speed tests with two tests representing common generative AI uses. One test measures how fast the chips perform on Meta's open-source LLM Llama 3.1 405b, one of the larger gen AI programs in common use. MLCommons also added an interactive version of Meta's smaller Llama 2 70b. That test is meant to simulate what happens with a chatbot, where response time is a factor: the machines are timed on how fast they generate the first token of output from the language model, to capture the need for a quick response once someone has typed a prompt.

A third new test measures the speed of processing graph neural networks, which handle problems composed of many entities and their relations, such as a social network. Graph neural nets have grown in importance as a component of programs that use gen AI. For example, Google's DeepMind unit used graph nets extensively to make stunning breakthroughs in protein-folding prediction with its AlphaFold 2 model in 2021.

A fourth new test measures how fast LiDAR sensing data can be assembled into an automobile's map of the road. MLCommons built its own version of a neural net for the test, combining existing open-source approaches.

The MLPerf competition comprises computers assembled by Lenovo, HPE, and others according to strict requirements for the accuracy of neural net output. Each computer system reported to MLCommons its best speed in producing output per second; in some tasks, the benchmark is instead the average latency, how long it takes for the response to come back from the server.

Nvidia's GPUs produced top results in almost every test in the closed division, where the rules for the software setup are the most strict. Competitor AMD, running its MI300X GPU, took the top score in two of the tests of Llama 2 70b, producing 103,182 tokens per second, significantly better than the second-best result from Nvidia's newer Blackwell GPU.

That winning AMD system was put together by a new entrant to the MLPerf benchmark, the startup MangoBoost, which makes plug-in cards that speed data transfer between GPU racks. The company also develops software, called LLMboost, to improve the serving of gen AI.

Google also submitted a system, showing off its Trillium chip, the sixth iteration of its in-house Tensor Processing Unit (TPU). That system trailed far behind Nvidia's Blackwell in the Stable Diffusion image-generation test, which measures how fast the computer can answer queries.
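The interactive test hinges on two distinct speeds: how long the first token takes to appear, and how many tokens per second flow after that. As a rough illustration only, here is a minimal Python sketch of how those two numbers might be measured; generate_stream is a hypothetical stand-in for a real streaming inference API, not part of any MLPerf harness.

    # Minimal sketch of two inference metrics: time to first token (TTFT)
    # and tokens per second. generate_stream is a hypothetical stand-in for
    # a streaming LLM call; time.sleep simulates per-token decode cost.
    import time
    from typing import Iterator

    def generate_stream(prompt: str) -> Iterator[str]:
        for token in ["The", " answer", " is", " 42", "."]:
            time.sleep(0.02)  # pretend each token takes 20 ms to decode
            yield token

    def benchmark(prompt: str) -> tuple[float, float]:
        start = time.perf_counter()
        ttft = None
        n_tokens = 0
        for _ in generate_stream(prompt):
            if ttft is None:
                ttft = time.perf_counter() - start  # latency to first token
            n_tokens += 1
        elapsed = time.perf_counter() - start
        return ttft, n_tokens / elapsed  # (seconds, tokens per second)

    ttft, tps = benchmark("What is the answer?")
    print(f"time to first token: {ttft * 1000:.1f} ms, throughput: {tps:.1f} tokens/sec")

The chatbot-style test described above constrains the first number, time to first token, since that is what a user waiting at a prompt actually perceives.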
The latest round of MLPerf benchmarks featured fewer competitors to Nvidia than some past installments. For example, microprocessor giant Intel's Habana unit, which has submitted in years past, did not enter any systems with its chips. Mobile chip giant Qualcomm did not have any submissions this time around, either.

The benchmarks offered some nice bragging rights for Intel, however. Every computer system needs not only the GPU to accelerate the AI math but also a host processor to run the ordinary work of scheduling tasks and managing memory and storage.

In the datacenter closed division, Intel's Xeon microprocessor was the host processor that powered seven of the top 11 systems, versus only three wins for AMD's EPYC server microprocessor. That represents an improved showing for Intel versus years prior.

The 11th top-performing system, on the benchmark of speed to process Meta's giant Llama 3.1 405b, was built by Nvidia itself without an Intel or AMD microprocessor onboard. Instead, Nvidia used its combined Grace-Blackwell 200 chip, in which the Blackwell GPU is connected in the same package with Nvidia's own Grace microprocessor.
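For readers unfamiliar with the graph neural networks mentioned in the third new test above, the core operation is easy to show in a few lines. The following is a toy sketch under illustrative assumptions, not the benchmark's actual model: each node of a graph updates its feature vector by averaging its neighbors' features and applying a learned transform.

    # Toy illustration of one graph-neural-network message-passing step.
    # All names and sizes are illustrative; the real benchmark model is far larger.
    import numpy as np

    edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # friendships in a 4-person social graph
    features = np.random.rand(4, 8)            # one 8-dim feature vector per person

    adj = np.eye(4)                            # self-loops keep a node's own features
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    adj /= adj.sum(axis=1, keepdims=True)      # row-normalize: mean over neighbors

    weights = np.random.rand(8, 8)             # learned in a real model; random here
    hidden = np.tanh(adj @ features @ weights) # one message-passing layer
    print(hidden.shape)                        # (4, 8): a new embedding per person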
[2]
New AI benchmarks test speed of running AI applications
SAN FRANCISCO, April 2 (Reuters) - Artificial intelligence group MLCommons unveiled two new benchmarks that it said can help determine how quickly top-of-the-line hardware and software can run AI applications.

Since the launch of OpenAI's ChatGPT over two years ago, chip companies have begun to shift their focus to making hardware that can efficiently run the code that allows millions of people to use AI tools. As the underlying models must respond to many more queries to power AI applications such as chatbots and search engines, MLCommons developed two new versions of its MLPerf benchmarks to gauge speed.

One of the new benchmarks is based on Meta's (META.O) Llama 3.1 405-billion-parameter AI model, and the test targets general question answering, math and code generation. The new format tests a system's ability to process large queries and synthesize data from multiple sources.

Nvidia (NVDA.O) submitted several of its chips for the benchmark, and so did system builders such as Dell Technologies (DELL.N). There were no Advanced Micro Devices (AMD.O) submissions for the large 405-billion-parameter benchmark, according to data provided by MLCommons.

For the new test, Nvidia's latest generation of artificial intelligence servers - called Grace Blackwell, which have 72 Nvidia graphics processing units (GPUs) inside - was 2.8 to 3.4 times faster than the previous generation, even when only using eight GPUs in the newer server to create a direct comparison to the older model, the company said at a briefing on Tuesday. Nvidia has been working to speed up the connections of chips inside its servers, which is important in AI work where a chatbot runs on multiple chips at once.

The second benchmark is also based on an open-source AI model built by Meta, and the test aims to more closely simulate the performance expectations set by consumer AI applications such as ChatGPT. The goal is to tighten the response time for the benchmark and make it close to an instant response.

Reporting by Max A. Cherney and Stephen Nellis in San Francisco; Editing by Muralikumar Anantharaman
[3]
New AI benchmarks test speed of running AI applications
SAN FRANCISCO (Reuters) - Artificial intelligence group MLCommons unveiled two new benchmarks that it said can help determine how quickly top-of-the-line hardware and software can run AI applications.

Since the launch of OpenAI's ChatGPT over two years ago, chip companies have begun to shift their focus to making hardware that can efficiently run the code that allows millions of people to use AI tools. As the underlying models must respond to many more queries to power AI applications such as chatbots and search engines, MLCommons developed two new versions of its MLPerf benchmarks to gauge speed.

One of the new benchmarks is based on Meta's Llama 3.1 405-billion-parameter AI model, and the test targets general question answering, math and code generation. The new format tests a system's ability to process large queries and synthesize data from multiple sources.

Nvidia submitted several of its chips for the benchmark, and so did system builders such as Dell Technologies. There were no Advanced Micro Devices submissions for the large 405-billion-parameter benchmark, according to data provided by MLCommons.

For the new test, Nvidia's latest generation of artificial intelligence servers - called Grace Blackwell, which have 72 Nvidia graphics processing units (GPUs) inside - was 2.8 to 3.4 times faster than the previous generation, even when only using eight GPUs in the newer server to create a direct comparison to the older model, the company said at a briefing on Tuesday. Nvidia has been working to speed up the connections of chips inside its servers, which is important in AI work where a chatbot runs on multiple chips at once.

The second benchmark is also based on an open-source AI model built by Meta, and the test aims to more closely simulate the performance expectations set by consumer AI applications such as ChatGPT. The goal is to tighten the response time for the benchmark and make it close to an instant response.

(Reporting by Max A. Cherney and Stephen Nellis in San Francisco; Editing by Muralikumar Anantharaman)
MLCommons introduces new benchmarks for generative AI, with Nvidia's GPUs leading in most tests. The benchmarks highlight the industry's focus on efficient hardware for AI applications.
MLCommons, an industry consortium, has unveiled two new benchmarks designed to measure the speed and efficiency of hardware and software in running AI applications. These benchmarks come in response to the growing demand for efficient AI infrastructure following the launch of ChatGPT over two years ago [2][3].
The new benchmarks specifically target generative AI applications, reflecting the industry's shift towards large language models (LLMs) and chatbots. One benchmark is based on Meta's Llama 3.1 405-billion-parameter AI model, testing systems on general question answering, math, and code generation. The other simulates consumer AI applications like ChatGPT, emphasizing quick response times [2][3].
Nvidia's GPUs have once again demonstrated their superiority in AI performance:
- Nvidia took top results in almost every test of the closed division, where the rules for software setup are strictest [1].
- Its Grace Blackwell servers ran the new Llama 3.1 405b test 2.8 to 3.4 times faster than the previous generation, even when only eight of their GPUs were used for a direct comparison [2][3].

While Nvidia dominated, other companies also made notable contributions:
- AMD's MI300X GPU, in a system assembled by newcomer MangoBoost, took the top score in two of the Llama 2 70b tests at 103,182 tokens per second [1].
- Google showed off Trillium, the sixth iteration of its in-house TPU, though it trailed Blackwell on the Stable Diffusion test [1].
- Intel's Xeon was the host processor in seven of the top 11 datacenter closed-division systems [1].

These benchmarks highlight several important trends:
- A shift toward generative AI workloads, with new tests built around Llama 3.1 405b and an interactive Llama 2 70b [1][2][3].
- A growing emphasis on responsiveness, measured by how quickly a system produces its first token of output [1][2][3].
- A broadening of inference testing beyond LLMs, to graph neural networks and LiDAR-based road mapping [1].

Notably, some major players were absent from this round of benchmarks:
- Intel's Habana unit, which has submitted in years past, entered no systems with its accelerator chips [1].
- Qualcomm had no submissions this time around [1].
- AMD made no submissions for the large 405-billion-parameter benchmark [2][3].
As the AI industry continues to evolve rapidly, these benchmarks provide valuable insights into the current state of AI hardware and software capabilities, guiding future development and investment in the field.
NVIDIA's new Blackwell GPUs set records in MLPerf Inference v5.0 benchmarks, while AMD's Instinct MI325X shows competitive performance against NVIDIA's H200 in specific tests.
3 Sources
NVIDIA's latest Blackwell B200 GPU demonstrates unprecedented AI performance in the MLPerf Inference 4.1 benchmarks, outperforming its predecessor and competitors. The results showcase significant advancements in generative AI and large language model processing.
4 Sources
NVIDIA quietly released a new open-source AI model, Llama-3.1-Nemotron-70B-Instruct, which has reportedly outperformed leading models from OpenAI and Anthropic in benchmark tests, signaling a shift in NVIDIA's AI strategy.
6 Sources
NVIDIA's new Blackwell AI GPUs have set new performance records in MLPerf v4.1 AI training benchmarks, showing up to 2.2x faster performance compared to their predecessor, the Hopper GPUs. This significant leap in AI training capabilities has implications for various AI applications, including large language models.
4 Sources
As Nvidia dominates the AI training chip market with GPUs, competitors are focusing on developing specialized AI inference chips to meet the growing demand for efficient AI deployment and reduce computing costs.
6 Sources