3 Sources
[1]
Nvidia dominates in gen AI benchmarks, clobbering 2 rival AI chips
Nvidia's general-purpose GPU chips have once again made a nearly clean sweep of one of the most popular benchmarks for measuring chip performance in artificial intelligence, this time with a new focus on generative AI applications such as large language models (LLMs). There wasn't much competition. Systems put together by Supermicro, Hewlett Packard Enterprise, Lenovo, and others -- packed with as many as eight Nvidia chips -- on Wednesday took most of the top honors in the MLPerf benchmark test organized by MLCommons, an industry consortium.

The test, which measures how fast machines can produce tokens, process queries, or output samples of data -- known as AI inference -- is the fifth installment of the prediction-making benchmark, which has been running for years. This time, MLCommons updated the speed tests with two tests representing common generative AI uses. One measures how fast chips run Meta's open-source LLM Llama 3.1 405b, one of the larger gen AI programs in common use. MLCommons also added an interactive version of Meta's smaller Llama 2 70b. That test is meant to simulate a chatbot, where response time is a factor: machines are timed on how quickly they generate the first token of output from the language model, simulating the need for a quick response once someone has typed a prompt.

A third new test measures the speed of processing graph neural networks, which handle problems composed of a set of entities and their relations, such as a social network. Graph neural nets have grown in importance as a component of programs that use gen AI. For example, Google's DeepMind unit used graph nets extensively to make stunning breakthroughs in protein-folding prediction with its AlphaFold 2 model in 2021. A fourth new test measures how fast LiDAR sensing data can be assembled into an automobile's map of the road.
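The interactive test's key metric, time to first token, is simple to reason about: start a timer when the prompt is submitted and stop it when the first output token arrives. A minimal sketch, in which `generate_stream` is a hypothetical stand-in for a model's streaming API rather than anything from MLPerf's actual harness:

```python
import time

def generate_stream(prompt):
    # Hypothetical stand-in for a model's streaming API:
    # yields one output token at a time.
    for token in ["The", " answer", " is", " 42", "."]:
        time.sleep(0.01)  # simulate per-token compute
        yield token

def time_to_first_token(prompt):
    """Return (time to first token in seconds, full response text)."""
    start = time.perf_counter()
    ttft = None
    tokens = []
    for token in generate_stream(prompt):
        if ttft is None:
            # Latency until the very first token -- what the
            # interactive Llama 2 70b test scores.
            ttft = time.perf_counter() - start
        tokens.append(token)
    return ttft, "".join(tokens)

ttft, text = time_to_first_token("What is 6 x 7?")
print(f"TTFT: {ttft * 1000:.1f} ms, response: {text!r}")
```

Total generation time keeps growing as the response gets longer, which is why an interactive benchmark scores the first token separately.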
MLCommons built its own version of a neural net for the test, combining existing open-source approaches. The MLPerf competition comprises computers assembled by Lenovo, HPE, and others according to strict requirements for the accuracy of neural-net output. Each submitter reports to MLCommons its best speed in producing output per second; in some tasks, the benchmark is instead average latency, how long it takes for a response to come back from the server.

Nvidia's GPUs produced top results in almost every test in the closed division, where the rules for the software setup are the most strict. Competitor AMD, running its MI300X GPU, took the top score in two of the Llama 2 70b tests. It produced 103,182 tokens per second, significantly better than the second-best result from Nvidia's newer Blackwell GPU. That winning AMD system was put together by a new entrant to the MLPerf benchmark, the startup MangoBoost, which makes plug-in cards that speed data transfer between GPU racks; the company also develops software, called LLMboost, to improve the serving of gen AI.

Google also submitted a system, showing off Trillium, the sixth iteration of its in-house Tensor Processing Unit (TPU). That system trailed far behind Nvidia's Blackwell in a test of how fast the computer could answer queries in the Stable Diffusion image-generation test.

The latest round of MLPerf featured fewer competitors to Nvidia than some past installments. Microprocessor giant Intel's Habana unit, for example, did not submit any systems with its chips, as it has in years past, and mobile chip giant Qualcomm did not submit this time around either. The benchmarks offered some bragging rights for Intel, however.
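The two headline metrics reduce to simple arithmetic over timestamps: throughput is total tokens divided by wall-clock time, and server-style latency is the mean round-trip time per request. A minimal sketch with made-up numbers (illustrative only, not actual MLPerf submissions):

```python
def throughput_tokens_per_sec(total_tokens, elapsed_sec):
    """Offline-style metric: tokens produced per second of wall-clock time."""
    return total_tokens / elapsed_sec

def average_latency(latencies_sec):
    """Server-style metric: mean time for a response to come back."""
    return sum(latencies_sec) / len(latencies_sec)

# Placeholder measurements, not reported results.
print(throughput_tokens_per_sec(1_031_820, 10.0))  # -> 103182.0 tokens/s
print(round(average_latency([0.18, 0.22, 0.20]), 3))
```

The same system can rank differently on the two metrics, which is why MLPerf reports throughput for some tasks and latency for others.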
Every computer system needs not only a GPU to accelerate the AI math but also a host processor to run the ordinary work of scheduling tasks and managing memory and storage. In the datacenter closed division, Intel's Xeon microprocessor was the host in seven of the top 11 systems, versus only three wins for AMD's EPYC server microprocessor -- an improved showing for Intel versus years prior. The 11th top-performing system, on the benchmark of speed to process Meta's giant Llama 3.1 405b, was built by Nvidia itself without an Intel or AMD microprocessor onboard. Instead, Nvidia used the combined Grace Blackwell 200 chip, in which the Blackwell GPU is packaged together with Nvidia's own Grace microprocessor.
[2]
New AI benchmarks test speed of running AI applications
SAN FRANCISCO, April 2 (Reuters) - Artificial intelligence group MLCommons unveiled two new benchmarks that it said can help determine how quickly top-of-the-line hardware and software can run AI applications.

Since the launch of OpenAI's ChatGPT over two years ago, chip companies have begun to shift their focus to making hardware that can efficiently run the code that allows millions of people to use AI tools. As the underlying models must respond to many more queries to power AI applications such as chatbots and search engines, MLCommons developed two new versions of its MLPerf benchmarks to gauge speed.

One of the new benchmarks is based on Meta's (META.O) so-called Llama 3.1 405-billion-parameter AI model, and the test targets general question answering, math and code generation. The new format tests a system's ability to process large queries and synthesize data from multiple sources. Nvidia (NVDA.O) submitted several of its chips for the benchmark, and so did system builders such as Dell Technologies (DELL.N). There were no Advanced Micro Devices (AMD.O) submissions for the large 405-billion-parameter benchmark, according to data provided by MLCommons.

For the new test, Nvidia's latest generation of artificial intelligence servers - called Grace Blackwell, which have 72 Nvidia graphics processing units (GPUs) inside - was 2.8 to 3.4 times faster than the previous generation, even when only using eight GPUs in the newer server to create a direct comparison to the older model, the company said at a briefing on Tuesday. Nvidia has been working to speed up the connections between chips inside its servers, which is important in AI work where a chatbot runs on multiple chips at once.

The second benchmark is also based on an open-source AI model built by Meta, and the test aims to more closely simulate the performance expectations set by consumer AI applications such as ChatGPT.
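Comparing a 72-GPU Grace Blackwell server to a smaller prior-generation system only makes sense after normalizing by GPU count, which is why Nvidia also quoted an eight-GPU slice for a direct comparison. The arithmetic is straightforward; the figures below are invented placeholders, not Nvidia's reported numbers:

```python
def per_gpu_speedup(new_throughput, new_gpus, old_throughput, old_gpus):
    """Compare systems of different sizes by throughput per GPU."""
    return (new_throughput / new_gpus) / (old_throughput / old_gpus)

# Placeholder numbers: an 8-GPU Blackwell slice vs. an 8-GPU
# previous-generation system, in tokens per second.
print(per_gpu_speedup(28_000, 8, 10_000, 8))  # -> 2.8
```

Normalizing this way separates gains from better chips and interconnects from gains that come simply from packing more GPUs into a box.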
The goal is to tighten the response time for the benchmark and make it close to an instant response.

Reporting by Max A. Cherney and Stephen Nellis in San Francisco; Editing by Muralikumar Anantharaman
[3]
New AI benchmarks test speed of running AI applications
SAN FRANCISCO (Reuters) - Artificial intelligence group MLCommons unveiled two new benchmarks that it said can help determine how quickly top-of-the-line hardware and software can run AI applications.

Since the launch of OpenAI's ChatGPT over two years ago, chip companies have begun to shift their focus to making hardware that can efficiently run the code that allows millions of people to use AI tools. As the underlying models must respond to many more queries to power AI applications such as chatbots and search engines, MLCommons developed two new versions of its MLPerf benchmarks to gauge speed.

One of the new benchmarks is based on Meta's so-called Llama 3.1 405-billion-parameter AI model, and the test targets general question answering, math and code generation. The new format tests a system's ability to process large queries and synthesize data from multiple sources. Nvidia submitted several of its chips for the benchmark, and so did system builders such as Dell Technologies. There were no Advanced Micro Devices submissions for the large 405-billion-parameter benchmark, according to data provided by MLCommons.

For the new test, Nvidia's latest generation of artificial intelligence servers - called Grace Blackwell, which have 72 Nvidia graphics processing units (GPUs) inside - was 2.8 to 3.4 times faster than the previous generation, even when only using eight GPUs in the newer server to create a direct comparison to the older model, the company said at a briefing on Tuesday. Nvidia has been working to speed up the connections between chips inside its servers, which is important in AI work where a chatbot runs on multiple chips at once.

The second benchmark is also based on an open-source AI model built by Meta, and the test aims to more closely simulate the performance expectations set by consumer AI applications such as ChatGPT. The goal is to tighten the response time for the benchmark and make it close to an instant response.

(Reporting by Max A. Cherney and Stephen Nellis in San Francisco; Editing by Muralikumar Anantharaman)
MLCommons introduces new benchmarks for generative AI, with Nvidia's GPUs leading in most tests. The benchmarks highlight the industry's focus on efficient hardware for AI applications.
MLCommons, an industry consortium, has unveiled two new benchmarks designed to measure the speed and efficiency of hardware and software in running AI applications. These benchmarks come in response to the growing demand for efficient AI infrastructure following the launch of ChatGPT over two years ago [2][3].
The new benchmarks specifically target generative AI applications, reflecting the industry's shift towards large language models (LLMs) and chatbots. One benchmark is based on Meta's Llama 3.1 405-billion-parameter AI model, testing systems on general question answering, math, and code generation. The other simulates consumer AI applications like ChatGPT, emphasizing quick response times [2][3].
Nvidia's GPUs have once again demonstrated their superiority in AI performance:
- Systems built around Nvidia chips took top results in almost every test of the closed division, where the software-setup rules are strictest.
- Nvidia's Grace Blackwell servers, with 72 GPUs inside, ran 2.8 to 3.4 times faster than the previous generation on the new Llama 3.1 405b test.
While Nvidia dominated, other companies also made notable contributions:
- AMD's MI300X GPU, in a system built by newcomer MangoBoost, took the top score in two of the Llama 2 70b tests at 103,182 tokens per second.
- Google showed off Trillium, the sixth generation of its in-house TPU, though it trailed Blackwell on the Stable Diffusion image-generation test.
- Intel's Xeon served as the host processor in seven of the top 11 datacenter closed-division systems.
These benchmarks highlight several important trends:
- The industry's focus has shifted toward inference, measuring how fast systems serve tokens, queries, and samples rather than how fast they train.
- Interactive tests now score time to first token, reflecting chatbot-style response-time expectations.
- Fast connections between chips inside a server matter increasingly, since a single chatbot runs on multiple chips at once.
Notably, some major players were absent from this round of benchmarks:
- Intel's Habana unit and Qualcomm made no submissions, and AMD had none for the large 405-billion-parameter benchmark.
As the AI industry continues to evolve rapidly, these benchmarks provide valuable insights into the current state of AI hardware and software capabilities, guiding future development and investment in the field.