Curated by THEOUTPOST
On Wed, 20 Nov, 4:01 PM UTC
7 Sources
[1]
Has AI Progress Really Slowed Down?
For over a decade, companies have bet on a tantalizing rule of thumb: that artificial intelligence systems would keep getting smarter if only they found ways to continue making them bigger. This wasn't merely wishful thinking. In 2017, researchers at Chinese technology firm Baidu demonstrated that pouring more data and computing power into machine learning algorithms yielded mathematically predictable improvements -- regardless of whether the system was designed to recognize images or speech, or to generate language. Noticing the same trend, in 2020 OpenAI coined the term "scaling laws," which has since become a touchstone of the industry.

This thesis prompted AI firms to bet hundreds of millions of dollars on ever-larger computing clusters and datasets. The gamble paid off handsomely, transforming crude text machines into today's articulate chatbots.

But now, that bigger-is-better gospel is being called into question. Last week, reports by Reuters and Bloomberg suggested that leading AI companies are experiencing diminishing returns on scaling their AI systems. Days earlier, The Information reported doubts at OpenAI about continued advancement after the unreleased Orion model failed to meet expectations in internal testing. The co-founders of Andreessen Horowitz, a prominent Silicon Valley venture capital firm, have echoed these sentiments, noting that increasing computing power is no longer yielding the same "intelligence improvements."

Still, many leading AI companies seem confident that progress is marching full steam ahead. In a statement, a spokesperson for Anthropic, developer of the popular chatbot Claude, said "we haven't seen any signs of deviations from scaling laws." OpenAI declined to comment. Google DeepMind did not respond to a request for comment. However, last week, after an experimental new version of Google's Gemini model took GPT-4o's top spot on a popular AI-performance leaderboard, the company's CEO, Sundar Pichai, posted to X saying "more to come."

Read more: The Researcher Trying to Glimpse the Future of AI

Recent releases paint a somewhat mixed picture. Anthropic has updated its medium-sized model, Sonnet, twice since its release in March, making it more capable than the company's largest model, Opus, which has not received such updates. In June, the company said Opus would be updated "later this year," but last week, speaking on the Lex Fridman podcast, co-founder and CEO Dario Amodei declined to give a specific timeline. Google updated its smaller Gemini Pro model in February, but the company's larger Gemini Ultra model has yet to receive an update. OpenAI's recently released o1-preview model outperforms GPT-4o in several benchmarks, but in others it falls short. o1-preview was reportedly called "GPT-4o with reasoning" internally, suggesting the underlying model is similar in scale to GPT-4.

Parsing the truth is complicated by competing interests on all sides. If Anthropic cannot produce more powerful models, "we've failed deeply as a company," Amodei said last week, offering a glimpse at the stakes for AI companies that have bet their futures on relentless progress. A slowdown could spook investors and trigger an economic reckoning. Meanwhile, Ilya Sutskever, OpenAI's former chief scientist and once an ardent proponent of scaling, now says performance gains from bigger models have plateaued. But his stance carries its own baggage: Sutskever's new AI startup, Safe Superintelligence Inc., launched in June with less funding and computational firepower than its rivals.
A breakdown in the scaling hypothesis would conveniently help level the playing field. "They had these things they thought were mathematical laws and they're making predictions relative to those mathematical laws and the systems are not meeting them," says Gary Marcus, a leading voice on AI and author of several books, including Taming Silicon Valley. He says the recent reports of diminishing returns suggest we have finally "hit a wall" -- something he has warned could happen since 2022. "I didn't know exactly when it would happen, and we did get some more progress. Now it seems like we are stuck," he says.

A slowdown could be a reflection of the limits of current deep learning techniques, or simply that "there's not enough fresh data anymore," Marcus says. It's a hypothesis that has gained ground among some following AI closely. Sasha Luccioni, AI and climate lead at Hugging Face, says there are limits to how much information can be learned from text and images. She points to how people are more likely to misinterpret your intentions over text messaging, as opposed to in person, as an example of text data's limitations. "I think it's like that with language models," she says.

The lack of data is particularly acute in certain domains like reasoning and mathematics, where we "just don't have that much high quality data," says Ege Erdil, senior researcher at Epoch AI, a nonprofit that studies trends in AI development. That doesn't mean scaling is likely to stop -- just that scaling alone might be insufficient. "At every order of magnitude scale up, different innovations have to be found," he says, noting that it does not mean AI progress will slow overall.

Read more: Is AI About to Run Out of Data? The History of Oil Says No

It's not the first time critics have pronounced scaling dead. "At every stage of scaling, there are always arguments," Amodei said last week. "The latest one we have today is, 'we're going to run out of data, or the data isn't high quality enough, or models can't reason.' ... I've seen the story happen enough times to really believe that probably the scaling is going to continue," he said. Reflecting on OpenAI's early days on Y Combinator's podcast, Sam Altman partly credited the company's success to a "religious level of belief" in scaling -- a concept he says was considered "heretical" at the time. In response to a recent post on X from Marcus saying his predictions of diminishing returns were right, Altman posted saying "there is no wall."

There could be another reason we may be hearing echoes of new models failing to meet internal expectations, though, says Jaime Sevilla, director of Epoch AI. Following conversations with people at OpenAI and Anthropic, he came away with a sense that people had extremely high expectations. "They expected AI was going to be able to already write a PhD thesis," he says. "Maybe it feels a bit... anti-climactic."

A temporary lull does not necessarily signal a wider slowdown, Sevilla says. History shows significant gaps between major advances: GPT-4, released just 19 months ago, itself arrived 33 months after GPT-3. "We tend to forget that GPT three from GPT four was like 100x scale in compute," Sevilla says. "If you want to do something like 100 times bigger than GPT-4, you're gonna need up to a million GPUs," he says.
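Sevilla's "up to a million GPUs" figure is easier to follow with some rough arithmetic. The sketch below, in Python, is purely illustrative: the GPT-4 compute estimate, per-chip throughput, utilization, and run length are assumptions supplied here for the sake of the exercise, not figures from the article, so treat the output as an order-of-magnitude check rather than a projection.

```python
# Rough back-of-envelope check of "100x GPT-4 compute -> up to a million GPUs."
# Every input below is an illustrative assumption, not a reported figure.

gpt4_training_flop = 2e25          # assumed total training compute for GPT-4, in FLOP
target_flop = 100 * gpt4_training_flop

gpu_peak_flops = 1e15              # assumed ~1 PFLOP/s per accelerator at low precision
utilization = 0.4                  # assumed fraction of peak throughput actually sustained
run_seconds = 90 * 24 * 3600       # assumed ~90-day training run

flop_per_gpu = gpu_peak_flops * utilization * run_seconds
gpus_needed = target_flop / flop_per_gpu

print(f"Training compute target: {target_flop:.1e} FLOP")
print(f"GPUs needed under these assumptions: {gpus_needed:,.0f}")
# With these (debatable) inputs the answer lands in the hundreds of thousands of GPUs;
# nudge utilization or run length down and it approaches a million.
```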
A cluster of that size would be bigger than any known cluster currently in existence, though he notes that there have been concerted efforts to build AI infrastructure this year, such as Elon Musk's 100,000-GPU supercomputer in Memphis -- the largest of its kind -- which was reportedly built from start to finish in three months.

In the interim, AI companies are likely exploring other methods to improve performance after a model has been trained. OpenAI's o1-preview has been heralded as one such example; it outperforms previous models on reasoning problems by being allowed more time to think. "This is something we already knew was possible," Sevilla says, gesturing to an Epoch AI report published in July 2023.

Read more: Elon Musk's New AI Data Center Raises Alarms Over Pollution

Prematurely diagnosing a slowdown could have repercussions beyond Silicon Valley and Wall Street. The perceived speed of technological advancement following GPT-4's release prompted an open letter calling for a six-month pause on the training of larger systems to give researchers and governments a chance to catch up. The letter garnered over 30,000 signatories, including Musk and Turing Award recipient Yoshua Bengio. It's an open question whether a perceived slowdown could have the opposite effect, causing AI safety to slip from the agenda.

Much of the U.S.'s AI policy has been built on the belief that AI systems would continue to balloon in size. A provision in Biden's sweeping executive order on AI, signed in October 2023 (and expected to be repealed by the Trump White House), required AI developers to share information with the government regarding models trained using computing power above a certain threshold. That threshold was set above the largest models available at the time, under the assumption that it would target future, larger models. The same assumption underpins export restrictions (restrictions on the sale of AI chips and technologies to certain countries) designed to limit China's access to the powerful semiconductors needed to build large AI models. However, if breakthroughs in AI development begin to rely less on computing power and more on factors like better algorithms or specialized techniques, these restrictions may have a smaller impact on slowing China's AI progress.

"The overarching thing that the U.S. needs to understand is that to some extent, export controls were built on a theory of timelines of the technology," says Scott Singer, a visiting scholar in the Technology and International Affairs Program at the Carnegie Endowment for International Peace. In a world where the U.S. "stalls at the frontier," he says, we could see a national push to drive breakthroughs in AI. He says a slip in the U.S.'s perceived lead in AI could spur a greater willingness to negotiate with China on safety principles.

Whether we're seeing a genuine slowdown or just another pause ahead of a leap remains to be seen. "It's unclear to me that a few months is a substantial enough reference point," Singer says. "You could hit a plateau and then hit extremely rapid gains."
[2]
Current AI scaling laws are showing diminishing returns, forcing AI labs to change course
AI labs traveling the road to super-intelligent systems are realizing they might have to take a detour. "AI scaling laws," the methods and expectations that labs have used to increase the capabilities of their models for the last five years, are now showing signs of diminishing returns, according to several AI investors, founders, and CEOs who spoke with TechCrunch. Their sentiments echo recent reports that indicate models inside leading AI labs are improving more slowly than they used to.

Everyone now seems to be admitting you can't just use more compute and more data while pretraining large language models and expect them to turn into some sort of all-knowing digital god. Maybe that sounds obvious, but these scaling laws were a key factor in developing ChatGPT, making it better, and likely influencing many CEOs to make bold predictions about AGI arriving in just a few years.

OpenAI and Safe Superintelligence co-founder Ilya Sutskever told Reuters last week that "everyone is looking for the next thing" to scale their AI models. Earlier this month, a16z co-founder Marc Andreessen said in a podcast that AI models currently seem to be converging at the same ceiling on capabilities.

But now, almost immediately after these concerning trends started to emerge, AI CEOs, researchers, and investors are already declaring we're in a new era of scaling laws. "Test-time compute," which gives AI models more time and compute to "think" before answering a question, is an especially promising contender to be the next big thing.

"We are seeing the emergence of a new scaling law," said Microsoft CEO Satya Nadella onstage at Microsoft Ignite on Tuesday, referring to the test-time compute research underpinning OpenAI's o1 model. He's not the only one now pointing to o1 as the future. "We're now in the second era of scaling laws, which is test-time scaling," said Andreessen Horowitz partner Anjney Midha, who also sits on the board of Mistral and was an angel investor in Anthropic, in a recent interview with TechCrunch.

If the unexpected success -- and now, the sudden slowing -- of the previous AI scaling laws tell us anything, it's that it is very hard to predict how and when AI models will improve. Regardless, there seems to be a paradigm shift underway: the ways AI labs try to advance their models for the next five years likely won't resemble the last five.

What are AI scaling laws?

The rapid AI model improvements that OpenAI, Google, Meta, and Anthropic have achieved since 2020 can largely be attributed to one key insight: use more compute and more data during an AI model's pretraining phase. When researchers give machine learning systems abundant resources during this phase -- in which AI identifies and stores patterns in large datasets -- models have tended to perform better at predicting the next word or phrase.

This first generation of AI scaling laws pushed the envelope of what computers could do, as engineers increased the number of GPUs used and the quantity of data they were fed. Even if this particular method has run its course, it has already redrawn the map. Every Big Tech company has basically gone all in on AI, while Nvidia, which supplies the GPUs all these companies train their models on, is now the most valuable publicly traded company in the world. But these investments were also made with the expectation that scaling would continue as expected.

It's important to note that scaling laws are not laws of nature, physics, math, or government.
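To make the shape of these "laws" concrete: the mathematically predictable improvements described above are usually expressed as a power law, with loss falling in proportion to training compute raised to a small negative exponent and approaching an irreducible floor. The toy Python sketch below uses invented constants (they are not any lab's published fit) simply to show why each additional 10x of compute buys a smaller absolute improvement -- the diminishing returns now being debated.

```python
# Toy illustration of a pretraining "scaling law": loss falls as a power law in training
# compute and approaches an irreducible floor. Constants are invented for illustration;
# they are not taken from any lab's published fits.
A, ALPHA, FLOOR = 25.0, 0.05, 1.70

def predicted_loss(compute_flop: float) -> float:
    """Hypothetical loss-vs-compute curve of the form A * C^(-alpha) + floor."""
    return A * compute_flop ** (-ALPHA) + FLOOR

previous = None
for exponent in range(21, 28):          # 1e21 ... 1e27 FLOP of training compute
    compute = 10.0 ** exponent
    loss = predicted_loss(compute)
    note = "" if previous is None else f"  (gain from the last 10x: {previous - loss:.3f})"
    print(f"{compute:.0e} FLOP -> loss {loss:.3f}{note}")
    previous = loss
```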
These laws are not guaranteed by anything, or anyone, to continue at the same pace. Even Moore's Law, another famous scaling law, eventually petered out -- though it certainly had a longer run.

"If you just put in more compute, you put in more data, you make the model bigger -- there are diminishing returns," said Anyscale co-founder and former CEO Robert Nishihara in an interview with TechCrunch. "In order to keep the scaling laws going, in order to keep the rate of progress increasing, we also need new ideas."

Nishihara is quite familiar with AI scaling laws. Anyscale reached a billion-dollar valuation by developing software that helps OpenAI and other AI model developers scale their AI training workloads to tens of thousands of GPUs. Anyscale has been one of the biggest beneficiaries of pretraining scaling laws around compute, but even its co-founder recognizes that the season is changing.

"When you've read a million reviews on Yelp, maybe the next reviews on Yelp don't give you that much," said Nishihara, referring to the limitations of scaling data. "But that's pretraining. The methodology around post-training, I would say, is quite immature and has a lot of room left to improve."

To be clear, AI model developers will likely continue chasing after larger compute clusters and bigger datasets for pretraining, and there's probably more improvement to eke out of those methods. Elon Musk recently finished building a supercomputer with 100,000 GPUs, dubbed Colossus, to train xAI's next models. There will be more, and larger, clusters to come. But trends suggest exponential growth is not possible by simply using more GPUs with existing strategies, so new methods are suddenly getting more attention.

Test-time compute: the AI industry's next big bet

When OpenAI released a preview of its o1 model, the startup announced it was part of a new series of models separate from GPT. OpenAI improved its GPT models largely through traditional scaling laws: more data, more power during pretraining. But now that method reportedly isn't gaining them much. The o1 framework of models relies on a new concept, test-time compute, so called because the computing resources are used after a prompt, not before. The technique hasn't been explored much yet in the context of neural networks, but is already showing promise.

Some are already pointing to test-time compute as the next method to scale AI systems. "A number of experiments are showing that even though pretraining scaling laws may be slowing, the test-time scaling laws -- where you give the model more compute at inference -- can give increasing gains in performance," said a16z's Midha.

"OpenAI's new 'o' series pushes [chain-of-thought] further, and requires far more computing resources, and therefore energy, to do so," said famed AI researcher Yoshua Bengio in an op-ed on Tuesday. "We thus see a new form of computational scaling appear. Not just more training data and larger models but more time spent 'thinking' about answers."

Over a period of 10 to 30 seconds, OpenAI's o1 model re-prompts itself several times, breaking down a large problem into a series of smaller ones. Despite ChatGPT saying it is "thinking," it isn't doing what humans do -- although our internal problem-solving methods, which benefit from clear restatement of a problem and stepwise solutions, were key inspirations for the method.

A decade or so back, Noam Brown, who now leads OpenAI's work on o1, was trying to build AI systems that could beat humans at poker.
During a recent talk, Brown said he noticed at the time how human poker players took time to consider different scenarios before playing a hand. In 2017, he introduced a method to let a model "think" for 30 seconds before playing. In that time, the AI was playing different subgames, figuring out how different scenarios would play out to determine the best move. Ultimately, the AI performed seven times better than his past attempts.

Granted, Brown's research in 2017 did not use neural networks, which weren't as popular at the time. However, MIT researchers released a paper last week showing that test-time compute significantly improves an AI model's performance on reasoning tasks.

It's not immediately clear how test-time compute would scale. It could mean that AI systems need a really long time to think about hard questions -- maybe hours or even days. Another approach could be letting an AI model "think" through a question on lots of chips simultaneously.

If test-time compute does take off as the next place to scale AI systems, Midha says the demand for AI chips that specialize in high-speed inference could go up dramatically. This could be good news for startups such as Groq and Cerebras, which specialize in fast AI inference chips. If finding the answer is just as compute-heavy as training the model, the "pick and shovel" providers in AI win again.

The AI world is not yet panicking

Most of the AI world doesn't seem to be losing its cool about these old scaling laws slowing down. Even if test-time compute does not prove to be the next wave of scaling, some feel we're only scratching the surface of applications for current AI models. New popular products could buy AI model developers some time to figure out new ways to improve the underlying models.

"I'm completely convinced we're going to see at least 10 to 20x gains in model performance just through pure application-level work, just allowing the models to shine through intelligent prompting, UX decisions, and passing context at the right time into the models," said Midha.

For example, ChatGPT's Advanced Voice Mode is one of the more impressive applications from current AI models. However, that was largely an innovation in user experience, not necessarily the underlying tech. You can see how further UX innovations, such as giving that feature access to the web or applications on your phone, would make the product that much better.

Kian Katanforoosh, the CEO of AI startup Workera and a Stanford adjunct lecturer on deep learning, tells TechCrunch that companies building AI applications, like his, don't necessarily need exponentially smarter models to build better products. He also says the products around current models have a lot of room to get better.

"Let's say you build AI applications and your AI hallucinates on a specific task," said Katanforoosh. "There are two ways that you can avoid that. Either the LLM has to get better and it will stop hallucinating, or the tooling around it has to get better and you'll have opportunities to fix the issue."

Whatever the case is for the frontier of AI research, users probably won't feel the effects of these shifts for some time. That said, AI labs will do whatever is necessary to continue shipping bigger, smarter, and faster models at the same rapid pace. That means several leading tech companies could now pivot how they're pushing the boundaries of AI.
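For readers who want a concrete picture of what spending more compute at inference can look like, here is a minimal, hypothetical sketch of one simple version of the idea: sample several independent step-by-step attempts at a question and keep the answer they converge on (a basic self-consistency vote). This is not OpenAI's o1 method, whose internals are unpublished, and `call_model` is a placeholder for whatever LLM API you happen to use.

```python
import collections

def call_model(prompt: str, temperature: float = 0.8) -> str:
    """Placeholder for a real LLM call (substitute your provider's chat/completions API)."""
    raise NotImplementedError

def answer_with_more_thinking(question: str, n_attempts: int = 8) -> str:
    """Spend extra inference-time compute on one question: sample several independent
    step-by-step attempts, pull out each final answer, and return the most common one."""
    votes = collections.Counter()
    prompt = (
        "Think through the problem step by step, then give your final answer "
        "on the last line, prefixed with 'ANSWER:'.\n\nProblem: " + question
    )
    for _ in range(n_attempts):
        attempt = call_model(prompt)
        final_answer = attempt.rsplit("ANSWER:", 1)[-1].strip()
        votes[final_answer] += 1
    best_answer, _count = votes.most_common(1)[0]
    return best_answer
```

The cost profile is the point: answering one question now takes several model calls instead of one, which is why a shift toward test-time scaling would favor fast, inference-optimized chips.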
[3]
The Rabbit R1's new Teach Mode is another example of AI's "move fast, break things" ethos
Hello and welcome to Eye on AI. In today's edition...Rabbit launches Teach mode for the R1; Microsoft and HarperCollins strike a deal for training data; Google gives Gemini memory; an AI pioneer cites OpenAI's upcoming model when urging for regulation; Stanford ranks countries on how vibrant their AI ecosystems are.

Rabbit, maker of the handheld orange AI-in-a-box device known as the R1, today released a new capability called Teach mode. The mode allows users to "teach" their devices how to perform specific actions by describing the process step-by-step on a computer using natural language and then syncing the lessons to their R1. The new capability is a step toward Rabbit's vision of an app store for model actions -- and ultimately, an entirely new way of interacting with devices sans the graphical user interfaces to which we've become accustomed.

It's also the latest release from a company that has largely been building its product in public (and Teach mode will be no exception). Critics blasted the R1 at launch as clunky, basically useless, and more of a prototype than a finished product. In conversation with founder and CEO Jesse Lyu ahead of the Teach mode announcement, he described how the company has been using the criticism and user feedback to improve the product, pushing over 20 updates, including a second generation of its LAM (large action model) system that makes it possible to interact with any site rather than just the four services available at launch. He framed this back-and-forth as essential, arguing that the only way to build in the AI era is to put a product out in the world and iterate from there.

If you watch any of the demo videos for Teach mode, you may think it looks like the most tedious way possible to do something. Take one example where an early tester teaches the device how to draft a tweet. On his computer, he instructs the program to "click the icon to open the box to compose a tweet," "click inside the text box to start typing," "type your text inside the box," and so on before syncing the lesson to the R1. No coding is required, but it's still far more involved than just drafting a post on a smartphone or laptop as we always have.

Rabbit's idea, however, is that eventually most users won't be doing any of this tedious work or interacting with Teach mode at all. Rather than teaching the R1 actions themselves, they will be able to go into an app store of sorts and buy actions created by others. (That's a bit ironic, seeing as Rabbit talks about doing away with apps.) The store isn't launching with Teach mode today, however, and Lyu said there's no timeline for it, as they still need to figure out how it will be monetized (both for the company and for users who create the actions). For now, users will have access to a limited number of lessons created by early Teach mode testers or be able to create their own.

"You can imagine, like you train an agent, you should be able to choose whether you want to keep it public or you keep it private. And if you choose to publish it to the public community, and then if a million others are using your license, you should get paid. You should be able to benefit from it, right?" Lyu said. "So I think this is, to us, the app store for iOS, but for agents."

The press release announcing Teach mode states that it's still "experimental" and that "output can be unpredictable." This is becoming a common disclaimer with the launch of generative AI products.
Anthropic, for example, said the same last month when it released Computer Use, which enables its model to use computers the way people do. Lyu believes this is intrinsic to the nature of how AI works (models aren't pre-programmed, so you'll never know exactly what they'll do) and a result of how fast the technology is developing. "You have to kind of encounter all the edge cases and tweak on a fly and continue. That's just the whole nature of developing with AI models," he said. In the case of Rabbit in particular, he also pointed to how the startup doesn't have a 10-year runway or the resources of a tech giant that would enable it to take its time. "We have to make sure that we take our shot and move fast. This is the only way that we can stay in competition."

This isn't to say Rabbit isn't testing or taking care of issues before launch. The company collaborated with 20 testers to create more than 400 lessons, had them work with the company's engineers in a dedicated Slack group, and implemented improvements and security safeguards prior to the Teach mode launch. Still, to many who have seen the "move fast and break things" mantra of the tech industry play out, Lyu's philosophy may sound concerning. In AI safety and responsibility circles in particular, there's growing conversation about how companies should be seeking out every edge case internally before launch. And as my colleague Jeremy Kahn noted in Tuesday's newsletter, some believe it'd be a good idea to slow the whole train.

As Rabbit continues to work out its kinks, who's using it? And for what? Teenagers, Lyu said, are currently the driving force of the product. On the flip side, he said they have elderly users who find just pressing the button easier than navigating apps on a smartphone. Additionally, it's finding uses in specific industries: doctors who use it for translation when speaking with patients, and truck drivers who can't use a smartphone while driving but can hit one button on the R1, similar to how they use the radio in their truck. While it's easy to still see the R1 as cumbersome and unnecessary, these examples of how it's being used show it may have some potential. I wouldn't count out the R1 (well, a future version of it) just yet.

Microsoft signs a deal with HarperCollins to train an upcoming model on the publisher's non-fiction books. Little is known about the unannounced model, but HarperCollins says the three-year deal includes "select nonfiction backlist titles" and that authors will have to opt into the training program. Author Daniel Kibblesmith posted screenshots of an email showing he was offered $2,500 per book to use his works. "I'd probably do it for a billion dollars. I'd do it for an amount of money that wouldn't require me to work anymore, since that's the end goal of this technology," he posted on BlueSky. You can read more from The Verge.

Google rolls out a memory feature for paying Gemini users. Just like the memory feature OpenAI launched for ChatGPT earlier this year, this allows users to ask Gemini Advanced to remember facts about their preferences and interests so the chatbot can give more tailored responses. Users can view when the information is used in responses and edit or delete any of the information Gemini saves about them. While the feature personalizes the experience and is critical to achieving the vision of a true personal assistant chatbot, researchers have shown how this type of capability can be exploited by hackers to steal a user's data, as TechCrunch notes.
AI pioneer Yoshua Bengio calls for urgent AI regulations, citing OpenAI's o1 model. In an opinion piece in the Financial Times, the AI researcher credited (along with Geoffrey Hinton and Yann LeCun) with developing the breakthroughs currently driving AI argues that OpenAI's upcoming model may put AI on the brink of bridging the gap to human-level reasoning. Noting how the principles behind reasoning have been at the heart of AI research in the 20th century, he says the advancement could enable important applications. At the same time, he says there are major risks to consider, including an increased ability to deceive humans demonstrated in o1 and how it could assist in developing biological weapons. "Advances in reasoning abilities make it all the more urgent to regulate AI models in order to protect the public," he writes.

From Nvidia to Anthropic, meet the Fortune 50 AI Innovators -- by Sheryl Estrada
Nvidia crushes expectations, more than doubling its profit amid a 94% surge in revenue as AI demand booms -- by Paolo Confino
AI enters a new phase, and the Fortune 50 AI Innovators list identifies the companies leading it -- by John Kell
Fortune's 50 AI Innovators list -- by the Editors
Bitcoin billionaire Barry Silbert talks about his next big bet -- on 'decentralized AI' -- by Jeff John Roberts

Dec. 8-12: Neural Information Processing Systems (NeurIPS) 2024, Vancouver, British Columbia

The U.S. ranks first on a new index that evaluates countries for how vibrant their AI ecosystems are, released today by the Stanford Institute for Human-Centered AI (HAI). Thirty-six countries were ranked on eight pillars: research and development; responsible AI; economy; education; diversity; policy and governance; public opinion; and infrastructure. Indicators that informed the rankings include the number of AI journal publications, total AI private investment, AI legislation passed, and foundation model datasets.

In the rankings, China follows the U.S. for the number two spot. The remainder of the top 10 -- in order -- consists of the U.K., India, the United Arab Emirates, France, South Korea, Germany, Japan, and Singapore. HAI notes that the U.S. leads the ranking by a significant margin, consistently producing the most notable models, attracting the most private investment, and excelling significantly in the R&D and AI economy pillars. China additionally scores well with R&D and economy, plus leadership in infrastructure. The U.K. stands out for its strength in education and policy/governance. You can read the paper or explore the interactive tool.
[4]
AI enters a new phase, and the Fortune 50 AI Innovators list identifies the companies leading it
It's been more than 700 days since OpenAI debuted its now-buzzy chatbot ChatGPT, unofficially ushering in the AI era and a race by businesses to capitalize on it. First came a period of experimentation by corporate customers. They tried out AI in a limited way -- infusing some of their internal tools with the nascent technology to see what worked and what didn't. Then, more recently, companies started getting more serious and confident. They invested more heavily in AI and made it more central to their operations.

"2023 was really a year of industry and businesses wrapping their heads around AI," says James Dyett, head of platform sales at OpenAI. "2024 is the year we're starting to see real-scale deployments of our technology."

On Wednesday, Fortune released its second annual Fortune 50 AI Innovators list, highlighting the AI companies that are leading in this new phase of the technology's adoption. They're the vendors that have gained the most traction, such as funding or customers, and whose AI is considered to be among the best in class. In addition to OpenAI, the Big Tech companies on the list include Microsoft, chipmaker Nvidia, business software firm Salesforce, and Chinese e-commerce juggernaut Alibaba. Among the startups are voice cloning tech company ElevenLabs, French AI chatbot upstart Mistral, China-based AI company ModelBest, and biotechnology company Xaira Therapeutics -- all of which are less than three years old.

Of course, it's still early days in AI, and the industry's pecking order is still very fluid. Companies on top today face many challenges, including the high cost of training models and the difficulty of landing enough paying customers to turn a profit. At the same time, the industry has a glut of startups doing the same thing, often with little difference between the products they sell. A winnowing, by all accounts, is inevitable.

So far, businesses adopting AI have focused on improving employee productivity, like helping software developers write code or tackling customer service questions, thereby reducing the number of calls a customer support desk must handle. Such tweaks can marginally improve business operations. But Eric Boyd, corporate vice president of Microsoft's Azure AI platform, says executives should dream bigger. "You are probably missing an opportunity," warns Boyd, saying corporate customers should consider entirely changing how they do business.

Microsoft is, of course, one of the biggest beneficiaries of the AI boom. Its AI-related revenue will soon reach a $10 billion annual run rate, making the business segment the fastest in the company's history to achieve that milestone. In general, partnerships between AI vendors and corporate customers have exploded in 2024, including OpenAI's pacts with biotechnology company Moderna and financial giant Morgan Stanley. Similar deals have helped lift OpenAI's valuation to $157 billion; the company most recently raised $6.6 billion in a huge funding round in October, nearly doubling its valuation from nine months earlier.

Over the past year, new large AI models have been unveiled at a steady clip, including the preview release of o1, which is touted by OpenAI for its stronger "reasoning" capabilities. Meanwhile, Meta has introduced three versions of the company's open-source language model in 2024, up from two the prior year. Uptake for many such foundational models is growing rapidly.
"We've hit 500 million downloads, that's 10-times the same time as last year," says Ahmad Al-Dahle, Meta's vice president and head of genAI. "We've seen incredible adoption in the Fortune 500 companies leveraging Llama for their internal use cases," citing the name of Meta's large language model. Health care is among the sectors especially poised to benefit from AI, especially considering its reputation for high costs, administrative complexity, staffing shortages and clinician burnout. AI could create $370 billion in value for health care by accelerating drug discovery and development and more accurately matching patients with potential treatments, according to consulting firm McKinsey. Xaira Therapeutics is among the AI-focused drug development startups that's made a splash. In April, it emerged with $1 billion in funding and has since started its first effort to come up with new drug treatments, set up an office in San Francisco, and made key appointments including hiring a chief scientific officer. The startup intends to apply generative AI to drug discovery, production, and clinical trials, aiming to improve the entire drug development process and get more effective treatments to market faster. "We believe that AI will enable us to transform how we do all three steps and be much more successful," says Marc Tessier-Lavigne, CEO of Xaira Therapeutics. Abridge, which helps doctors save time by automatically transcribing and organizing their discussions with patients, in February announced a $150 million Series C investment. It has also disclosed a steady pace of new contracts with the likes of Yale New Haven Health System and The University of Vermont Health Network. "Our opportunity is to really unburden clinicians and help them focus on the most important person in the room...the patient," says Shiv Rao, CEO and founder of Abridge, and also a practicing cardiologist. He cites surveys showing physicians are overwhelmed and that many are even considering exiting the field entirely. Rural hospitals, already at risk of closing due to financial constraints, would be particularly impacted by such an exodus. Rao says Abridge's value is that it's a sector-specific AI application -- trained specifically on medical and patient data -- and that it's meant to address multiple markets, ranging from large academic medical centers to private practices, and across different parts of the health care system including cardiology, primary care, and the emergency room. This year, Abridge alongside the Mayo Clinic and software provider Epic, unveiled an AI documentation product specifically targeted to nurses. To score deals, Rao says he must make a successful pitch to three executives: the chief medical information officer, chief information officer, and chief financial officer. "We need all three of them to believe that this creates ROI for their clinicians," says Rao. The creative industry also presents fresh opportunities with AI, but also thorny questions about copyright protection, the potential threat to artists' livelihoods, and big changes to the art, films, design, and advertising. The pitch from AI video company Runway, which in June released its newest model that can create 10 second video clips from text prompts, says AI can help speed up video projects and give creators more time. "Cinema is an art form because of technology," says Cristobal Valenzuela, Runway's CEO and cofounder. "These are just tools." 
Runway works with major film studios, including Lionsgate, following a partnership they signed in September. The deal involves the two creating and training a new AI model customized on the entertainment company's proprietary catalog, which includes The Hunger Games and Saw film series.

Meanwhile, Adobe's Firefly AI image-generating tool has been used by its customers to create over 13 billion images. Clients have used the technology, for example, to create images for Barbie packaging and for customizable bottles sold by Gatorade. "We're excited about actually seeing these enterprises not only using gen AI to make the lives of their creative departments easier, but in some cases, enable their customers to personalize and change the engagement with the brand," says Alexandru Costin, vice president of generative AI at Adobe. Adobe has recently unveiled a generative AI video tool, and a feature that lets artists draw a shape that Adobe Illustrator then fills with an image via a text prompt. Adobe stresses that its generative AI is trained only on data that the company has a right to use or that is in the public domain and therefore does not violate copyright laws.

Overall, Meta says use cases that felt impossible before the current AI boom are now becoming a reality as AI models get smarter. Some customers using Llama today include consulting firm Accenture, which built a custom LLM to more efficiently produce its annual ESG report, while telecommunications giant AT&T used fine-tuned versions of Llama models to help speed up response times for customer inquiries.

Microsoft's Boyd says that while more than 60,000 customers are using Azure AI today, many C-suite leaders are still thinking through the responsible and ethical use of AI, how to manage their data, and the hunt for ROI. "Where we've been focused is, how do we help our customers take advantage of this new technology that's really touching every facet of their business?" asks Boyd.

OpenAI's Dyett says it's unclear what the next big thing is in AI now that the basics are largely taken care of. But whoever does take the technology to the next level will have to take some risks. Said Dyett: "It'll be a company that goes out on a limb and experiments with something that hasn't been done before, starts to see some really positive results, and that gets the rest of the market going."
[5]
This Week in AI: Congressional commission warns of Chinese AGI | TechCrunch
Hiya, folks, welcome to TechCrunch's regular AI newsletter. If you want this in your inbox every Wednesday, sign up here.

America's AI war with China is intensifying -- or at least, the rhetoric around it is. On Tuesday, a U.S. congressional commission proposed a "Manhattan Project-style" effort to fund the development of AI systems with human-level -- or superhuman -- intelligence. In its annual report, the U.S.-China Economic and Security Review Commission (USCC) recommended that policymakers authorize funding for "leading AI, cloud, and data center companies," and direct the U.S. secretary of defense to ensure AI development receives "national priority."

"We've seen throughout history that countries that are first to exploit periods of rapid technological change can often cause shifts in the global balance of power," Reuters quoted USCC commissioner Jacob Helberg as saying. "China is racing towards [AI superintelligence]. ... It's critical that we take them extremely seriously."

The USCC, established by Congress to provide recommendations on U.S.-China relations, tends to be hawkish in its proposals. But the commission isn't alone in advocating for more aggressive actions to slow China's tech ambitions. Commerce Secretary Gina Raimondo, for example, has suggested the U.S. share AI technology with foreign allies to combat China's rise. Defense Department officials, meanwhile, have called for safeguards to prevent technology leakage to China through overseas data centers and chip suppliers.

The U.S. has already adopted a number of policies aimed at curbing China's AI progress, including export bans on hardware infrastructure and on investments in AI tech in the region. China has circumvented some of these. But the impacts have been palpable -- and far-reaching. To give one example, China's access to the most sophisticated chips required to train AI, including next-gen GPUs, has been completely cut off.

And in light of that, the USCC's pronouncements seem a bit overkill. It's not clear what superintelligent AI would even look like. But assuming for a moment it involves so-called reasoning models, as some people suggest, Chinese labs appear to be lagging, not leading. According to one analysis, top Chinese companies' models are about six to nine months behind their U.S. counterparts.

We must also consider the possibility that the USCC's recommendations are self-interested. Helberg is a senior adviser to the CEO of Palantir, a company with many AI defense contracts. And, naturally, government funding for AI would benefit U.S. AI companies. That's all to say, calls for a Manhattan Project-type program for superintelligent AI seem more alarmist than anything.

AI at Ignite: Microsoft announced a slew of AI products during Microsoft Ignite 2024 on Tuesday, including a voice cloner and an AI dev platform called Azure AI Foundry.

Advanced Voice Mode on the web: OpenAI has expanded ChatGPT's Advanced Voice Mode feature to the web, letting users talk to the AI chatbot right from their desktop browser.

Indian news agency sues OpenAI: On the subject of OpenAI, one of India's largest news agencies, Asian News International, has sued the startup in what could be a precedent-setting case over the use of copyrighted news content.

Gemini gets memory: Google's Gemini chatbot can now remember things like info about your life, work, and personal preferences during conversations.

U.K. green-lights Anthropic investment: The U.K.'s Competition and Markets Authority has okayed Alphabet's partnership with and investment in AI rival Anthropic, concluding that it doesn't qualify for investigation under current merger rules.

Perplexity launches shopping: AI-powered search engine Perplexity debuted a feature that offers e-commerce recommendations, as well as the ability to place an order without navigating to a retailer's website. It seems like Stripe is doing the heavy lifting here, though.

Altman joins team SF: San Francisco's mayor-elect, Daniel Lurie, has tapped OpenAI CEO Sam Altman to help run his transition team. Alongside nine other San Francisco leaders, Altman will provide guidance to Lurie's team on ways the city can innovate.

New Mistral models: French AI startup Mistral released major new products and tools this week, including a "canvas" feature in its chatbot platform that lets users transform and edit content, like web mock-ups.

The U.K. AI Safety Institute, a U.K. government body that studies risks in AI systems, has released its first academic paper, which proposes a way AI developers can demonstrate that their models don't pose "unacceptable cyber risks." In the paper, the AI Safety Institute co-authors note that "safety cases" -- structured, substantiated arguments for why risks associated with a model are acceptable -- are gaining traction. Yet there isn't a "readily available" safety case methodology for frontier AI. The co-authors propose a safety case template focusing on cyber capabilities, which they assert have well-established near-term risks. The template is designed to inform deployment decisions, they say, including whether to start or continue a model's training run. "This template serves as a proof of concept," the co-authors wrote. "It does not guarantee safety; some of the claims in our template could fail to hold true in reality, invalidating the conclusion. Still, we expect that even these imperfect safety cases serve to increase the level of rigor in reasoning about development or deployment decisions."

Suno, the controversial generative music startup, released its latest music-generating model today, Suno v4. Suno claims that v4, which is only available to the platform's paying users, delivers crisper audio, better lyrics, and "more dynamic" song structures than its predecessor, v3. Suno's v4 now powers the company's Covers feature, which "reimagines" uploaded audio, and Personas, which captures the vocals, style, and "vibe" of a track and carries it into other creations.

It's remarkable, in many ways, that Suno's charging ahead, given it's been sued by three major record labels alleging copyright infringement. Sony Music Entertainment, Universal Music Group, and Warner Music Group filed a lawsuit against Suno and rival firm Udio this summer, alleging that the pair trained their models on music without permission. In their responses to the lawsuits, Suno and Udio more or less admitted that their models might've ingested copyrighted music during training -- but they argued that fair use doctrine under U.S. copyright law shields them.

HarperCollins has inked a three-year data licensing deal with Microsoft to let the tech giant train its AI on the publisher's nonfiction works. HarperCollins, whose parent company, News Corp., has a similar agreement in place with OpenAI, says that authors will have to opt in and that the deal only covers "select nonfiction backlist titles."
Authors aren't pleased -- and it hasn't helped that the payouts HarperCollins is offering are measly. One author, Daniel Kibblesmith, says he was offered a flat $2,500 per book. "I'd probably do it for a billion dollars," Kibblesmith wrote in a post on Tuesday. "I'd do it for an amount of money that wouldn't require me to work anymore, since that's the end goal of this technology."
[6]
At Web Summit, no sign of an AI slowdown
Hello and welcome to Eye on AI. In this edition...no sign of an AI slowdown at Web Summit; work on Amazon's new Alexa plagued by further technical issues; a general purpose robot model; trying to bend Trump's ear on AI policy.

Last week, I was at Web Summit in Lisbon, where AI was everywhere. There was a strange disconnect, however, between the mood at the conference, where so many companies were touting AI-powered products and features, and the tenor of AI news last week -- much of which was focused on reports that the AI companies building foundation models were seeing diminishing returns from building ever larger AI models, and rampant speculation in some quarters that the AI hype cycle was about to end.

I moderated a center stage panel discussion on whether the AI bubble is about to burst, and I heard two very different, but not diametrically opposed, takes. (You can check it out on YouTube.)

Bhavin Shah, the CEO of Moveworks, which offers an AI-powered service to big companies that allows employees to get their IT questions automatically answered, argued -- as you might expect -- that not only is the bubble not about to burst, but that it isn't even clear there is a bubble. Sure, Shah said, the valuations for a few tech companies might be too high. But AI itself was very different from something like crypto or the metaverse or the tulip mania of the 17th century. Here was a technology that was having real impact on how the world's largest companies operate -- and it was only just getting going. He said it was only now, two years after the launch of ChatGPT, that many companies were finding AI use cases that would create real value.

Rather than being concerned that AI progress might be plateauing, Shah argued that companies were still exploring all the possible, transformative use cases for the AI that already exists today -- and that the transformative effects of the technology were not predicated on further progress in LLM capabilities. In fact, he said, there was far too much focus on what the underlying LLMs could do and not nearly enough on how to build systems and workflows around LLMs and other, different kinds of AI models that could, as a whole, deliver significant return on investment (ROI) for businesses. The idea that some people might have had that just throwing an LLM at a problem would magically result in ROI was always naïve, Shah argued. Instead, it was always going to involve systems architecting and engineering to create a process in which AI could deliver value.

Meanwhile, Sarah Myers West, the co-executive director of the AI Now Institute, argued not so much that the AI bubble is about to burst -- but rather that it might be better for all of us if it did. West argued that the world could not afford a technology with the energy footprint, appetite for data, and problems around unknown biases that today's generative AI systems have. In this context, though, a slowdown in AI progress at the frontier might not be a bad thing, as it might force companies to look for ways to make AI both more energy and data efficient. West was skeptical that smaller models, which are more efficient, would necessarily help. She said they might simply result in the Jevons paradox, the economic phenomenon where making the use of a resource more efficient only results in more overall consumption of that resource.
As I mentioned last week, I think that for many companies that are trying to build applied AI solutions for specific industry verticals, the slowdown at the frontier of AI model development matters very little. Those companies are mostly bets that those teams can use the current AI technology to build products that will find product-market fit. Or, at least, that's how they should be valued. (Sure, there's a bit of "AI pixie dust" in the valuation too, but those companies are valued mostly on what they can create using today's AI models.)

But for the companies whose whole business is creating foundation models -- OpenAI, Anthropic, Cohere, and Mistral -- their valuations are very much based around the idea of getting to artificial general intelligence (AGI), a single AI system that is at least as capable as humans at most cognitive tasks. For these companies, diminishing returns from scaling LLMs do matter. But even here, it's important to note a few things: while returns from pre-training larger and larger AI models seem to be slowing, AI companies are just starting to look at the returns from scaling up "test-time compute" (i.e., giving an AI model that runs some kind of search process over possible answers more time -- or more computing resources -- to conduct that search). That is what OpenAI's o1 model does, and it is likely what future models from other AI labs will do too.

Also, while OpenAI has always been most closely associated with LLMs and the "scale is all you need" hypothesis, most of these frontier labs have employed, and still employ, researchers with expertise in other flavors of deep learning. If progress from scale alone is slowing, that is likely to encourage them to push for a breakthrough using a slightly different method -- search, reinforcement learning, or perhaps even a completely different, non-Transformer architecture.

Google DeepMind and Meta are also in a slightly different camp here, because those companies have huge advertising businesses that support their AI efforts. Their valuations are less directly tied to frontier AI development -- especially if it seems like the whole field is slowing down. It would be a different story if one lab were achieving results that Meta or Google could not replicate -- which is what some people thought was happening when OpenAI leapt out ahead with the debut of ChatGPT. But since then, OpenAI has not managed to maintain a lead of more than three months for most new capabilities.

As for Nvidia, its GPUs are used for both training and inference (i.e., applying an AI model once it has been trained) -- but it has optimized its most advanced chips for training. If scale stops yielding returns during training, Nvidia could potentially be vulnerable to a competitor with chips better optimized for inference. (For more on Nvidia, check out my feature on company CEO Jensen Huang that accompanied Fortune's inaugural 100 Most Powerful People in Business list.)

With that, here's more AI news.

Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn

Correction, Nov. 15: Due to erroneous information provided by Robin AI, last Tuesday's edition of this newsletter incorrectly identified billionaire Michael Bloomberg's family office Willets as an investor in the company's "Series B+" round. Willets was not an investor.

Before we get to the news: If you want to learn more about what's next in AI and how your company can derive ROI from the technology, join me in San Francisco on Dec. 9-10 for Fortune Brainstorm AI.
We'll hear about the future of Amazon Alexa from Rohit Prasad, the company's senior vice president and head scientist, artificial general intelligence; we'll learn about the future of generative AI search at Google from Liz Reid, Google's vice president, search; and about the shape of AI to come from Christopher Young, Microsoft's executive vice president of business development, strategy, and ventures; and we'll hear from former San Francisco 49er Colin Kaepernick about his company Lumi and AI's impact on the creator economy. You can view the agenda and apply to attend here. (And remember, if you write the code KAHN20 in the "Additional comments" section of the registration page, you'll get 20% off the ticket price -- a nice reward for being a loyal Eye on AI reader!)

Amazon's launch of a new AI-powered Alexa plagued by further technical issues. My Fortune colleague Jason Del Rey has obtained internal Amazon emails that show staff working on the new version of Amazon Alexa have written managers to warn that the product is not yet ready to be launched. In particular, emails from earlier this month show that engineers worry that latency -- or how long it takes the new Alexa to generate responses -- makes the product potentially too frustrating for users to enjoy or pay an additional subscription fee to use. Other emails indicate the new Alexa may not be compatible with older Amazon Echo smart speakers and that staff worry the new Alexa won't offer enough "skills" -- or actions that a user can perform through the digital voice assistant -- to justify an increased price for the product. You can read Jason's story here.

Anthropic is working with the U.S. government to test if its AI chatbot will leak nuclear secrets. That's according to a story from Axios that quotes the AI company as saying it has been working with the Department of Energy's National Nuclear Security Administration since April to test its Claude 3 Sonnet and Claude 3.5 Sonnet models to see if they can be prompted to give responses that might help someone develop a nuclear weapon or perhaps figure out how to attack a nuclear facility. Neither Anthropic nor the government would reveal what the tests -- which are classified -- have found so far. But Axios points out that Anthropic's work with the DOE on secret projects may pave the way for it to work with other U.S. national security agencies, and that several of the top AI companies have recently been interested in obtaining government contracts.

Nvidia's struggling to overcome heating issues with Blackwell GPU racks. Unnamed Nvidia employees and customers told The Information that the company has faced problems in keeping large racks of its latest Blackwell GPUs from overheating. The company has asked suppliers to redesign the racks, which house 72 of the powerful chips, several times, and the issue may delay shipment of large numbers of GPU racks to some customers, although Michael Dell has said that his company has shipped some of the racks to Nvidia-backed cloud service provider CoreWeave. Blackwell has already been hit by a design flaw that delayed full production of the chip by a quarter. Nvidia declined to comment on the report.

OpenAI employees raise questions about gender diversity at the company. Several women at OpenAI have raised concerns about the company's culture following the departures of chief technology officer Mira Murati and another senior female executive, Lilian Weng, The Information reported.
A memo shared internally by a female research program manager and seen by the publication called for more visible promotion of women and nonbinary individuals already making significant contributions. The memo also highlights challenges in recruiting and retaining female and nonbinary technical talent, a problem exacerbated by Murati's departure and her subsequent recruitment of former OpenAI staff to her new startup. OpenAI has since filled some leadership gaps with male co-leads, and its overall workforce and leadership remain predominantly male.

A foundation model for household robots. Robotic software startup Physical Intelligence, which recently raised $400 million in funding from Jeff Bezos, OpenAI, and others, has released a new foundation model for robotics. Like LLMs for language tasks, the idea is to create AI models that will let any robot perform a host of basic motions and tasks in any environment. In the past, robots often had to be trained specifically for the particular setting in which they would operate -- either through actual experience in that setting, or by having their software brains learn in a simulated virtual environment that closely matched the real-world setting into which they would be deployed. The robot could usually only perform one task, or a limited range of tasks, in that specific environment, and the software controlling the robot only worked for one specific robot model. But the new model from Physical Intelligence -- which it calls π0 (Pi-Zero) -- allows different kinds of robots to perform a whole range of household tasks, from loading and unloading a dishwasher to folding laundry to taking out the trash to delicately handling eggs. What's more, the model works across multiple types of robots. Physical Intelligence trained π0 by building a huge dataset of eight different kinds of robots performing a multitude of tasks. The new model may help speed the adoption of robots not just in households, but also in warehouses, factories, restaurants, and other work settings. You can see Physical Intelligence's blog here.

How Mark Zuckerberg has fully rebuilt Meta around Llama -- by Sharon Goldman

Exclusive: Perplexity's CEO says his AI search engine is becoming a shopping assistant -- but he can't explain how products it recommends are chosen -- by Jason Del Rey

Tesla jumps as Elon Musk's 'bet for the ages' on Trump is seen paying off with federal self-driving rules -- by Jason Ma

Commentary: AI will help us understand the very fabric of reality -- by Demis Hassabis and James Manyka

Dec. 8-12: Neural Information Processing Systems (NeurIPS) 2024, Vancouver, British Columbia

What is Trump going to do about AI? A lobbying group called BSA | The Software Alliance, which represents OpenAI, Microsoft, and other tech companies, is calling on President-elect Donald Trump to preserve some Biden Administration initiatives on AI. These include a national AI research pilot Biden funded and a new framework developed by the U.S. Commerce Department to manage high-risk use cases of AI. The group also wants Trump's administration to continue international collaboration on AI safety standards, enact a national privacy law, negotiate data transfer agreements with more countries, and coordinate U.S. export controls with allies. And it wants Trump to consider lifting Biden-era controls on the export of some computer hardware and software to China. You can read more about the lobbying effort in this Semafor story.
The tech industry group is highly unlikely to get its entire wish list. Trump has signaled he plans to repeal Biden's Executive Order on AI, which resulted in the Commerce Department's framework, the creation of the U.S. AI Safety Institute, and several other measures. And Trump is likely to be even more hawkish on trade with China than Biden was. But trying to figure out exactly what Trump will do on AI is difficult -- as my colleague Sharon Goldman detailed in this excellent explainer. It may be that Trump winds up being more favorable to AI regulation and international cooperation on AI safety than many expect.
[7]
AI Scaling Laws Crumble Under Token Pressure
Have LLMs Hit a Wall? Microsoft chief Satya Nadella tackled this hot-button issue at Microsoft Ignite 2024, offering a refreshingly candid take on the discussion. "There's a lot of debate on whether we have hit the wall with scaling laws. Is it going to continue? The thing to remember, at the end of the day, is that these are not physical laws. They are just empirical observations that held true, much like how Moore's Law did for a long time," he said.

Nadella welcomed the scepticism and debates, calling them beneficial for pushing innovation in areas such as model architectures, data regimes, and systems architecture. He also discussed OpenAI's new scaling law, which focuses on test-time computing, and how it will be integrated into features like Copilot Think Deeper, powered by OpenAI's o1. In a recent earnings call, NVIDIA chief Jensen Huang said that OpenAI o1 had introduced a new scaling law called 'test-time scaling', which consumes a lot of computing resources. Microsoft is working closely with NVIDIA to scale test-time computing for OpenAI.

Nadella emphasised the importance of maximising value in the most efficient way. "Last month, we introduced new clusters with H200s that became available. We're very excited about it," said Nadella. He added that with their stack optimisation between H100 and H200, Azure can deliver performance for everything from inference to training.

"Tokens per watt plus dollar is the best way to think about the new currency of performance," said Nadella, adding that Microsoft will continue to build new data centre intelligence factories. With this, Nadella introduced a new metric that reflects the efficiency of generating tokens, considering both energy consumption (measured in watts) and cost (measured in dollars): for every unit of energy used and every dollar spent, a certain number of tokens are produced. (A rough sketch of this framing appears at the end of this article.)

Despite the progress, NVIDIA has yet to solve the inferencing challenge. Acknowledging the difficulties involved, Huang shared that their goal is to produce tokens at low latency. "Inference is super hard. And the reason...is that you need the accuracy to be high...You need the throughput to be high so that the cost can be as low as possible. But you also need the latency to be low. And computers that are high-throughput and have low latency are incredibly hard to build," he said. "Our hopes and dreams are that, someday, the world will do a ton of inference," said Huang, adding that there will be thousands of AI-native start-ups generating tokens.

Microsoft also announced the preview of NVIDIA Blackwell AI infrastructure on Azure. "Blackwell is pretty amazing. It has 72 GPUs on a single NVLink domain, and when combined with Infiniband on the backend, these racks are optimised for the most cutting-edge training and inference workloads. We are very excited about having Blackwell," said Nadella.

Besides NVIDIA, Microsoft is also working closely with AMD. "We were the first cloud to offer VMs powered by AMD's MI300X GPU, and we're using that infrastructure to power Azure OpenAI. Today, we're introducing Azure HBv5, which we co-engineered with AMD," he said. Nadella said that Azure HBv5 is up to eight times faster than any other cloud virtual machine, setting a new standard for high-performance computing, and that it will be generally available next year.

In a recent podcast with No Priors, Huang explained that NVIDIA now views data centres as a product rather than just GPUs. "We have an initiative in our company called data centre as a product. We don't sell it as a product, but we have to treat it like it's a product -- everything from planning for it to standing it up, optimising it, tuning it, and keeping it operational," he said, adding that their goal is for data centres to be as beautiful as iPhones. However, he acknowledged that energy, capital, and supply chains are major challenges when it comes to scaling compute. Interestingly, with the growing demand for data centres to power AI technology, tech giants such as Microsoft, Google, and Amazon have struck deals with nuclear power plant operators.

Huang highlighted that intelligence is built on top of computing power, marvelling at the evolution of data centres. Initially, data centres were simply used for storing data; now, they are generating new tokens. "We are creating single-tenant data centres that don't just store files; instead, they produce tokens. These tokens are then reconstituted into something that seems like intelligence," he elaborated. According to Huang, intelligence or tokens can take many forms: "It could be robotic motion, sequences of amino acids, or chemical chains -- the possibilities are countless."

Meanwhile, Groq chief Jonathan Ross shared similar views in a recent LinkedIn post, comparing generative AI to the internet and mobile phones. He explained that the internet was part of the Information Age, and its primary function was to take a piece of data, replicate it with high fidelity, and distribute it globally. Generative AI, on the other hand, is different. "It's not about copying," Ross said. "It's not about data or information. It's about compute."

In an interview with Lex Fridman earlier this year, OpenAI chief Sam Altman said, "Compute is going to be the currency of the future. It may become the most valuable commodity in the world and we should invest significantly in expanding compute resources." In a similar vein, Altman proposed a concept where everyone would have access to a portion of GPT-7's computing resources. "I wonder if the future looks something more like 'universal basic compute' than universal basic income, where everyone receives a slice of GPT-7 compute," Altman speculated. This explains why OpenAI plans to partner with TSMC and Broadcom to launch its first in-house AI chip by 2026.

Elon Musk's xAI, for its part, has built the world's largest and most powerful AI supercomputer, Colossus, a liquid-cooled cluster in Memphis comprising 100,000 NVIDIA H100 GPUs. xAI is now working to double its size to a combined total of 200,000 NVIDIA Hopper GPUs.

Moreover, OpenAI's new scaling method has prompted major inference chip makers like Groq, SambaNova, and Cerebras to improve their performance, enabling them to run Llama models at record-breaking speeds. Cerebras recently shared that its Llama 3.1 model with 405 billion parameters is now running at 969 tokens per second. Groq launched a new endpoint for Llama 3.1 70B, achieving 1,665 tokens per second by leveraging speculative decoding. And SambaNova's Llama 3.1 405B models are now running at up to 200 tokens per second.

Meanwhile, NVIDIA has reportedly asked its suppliers to redesign their racks multiple times to address the overheating issue. In response, Groq's Sunny Madra posted a picture of Groq's cluster with the caption, "Air cooled here." However, Huang has dismissed such reports, asserting that Blackwell production is running at full steam.
NVIDIA's CFO revealed that the company shipped 13,000 GPU samples to customers in the third quarter, including one of the first Blackwell DGX engineering samples to OpenAI. SambaNova's Rodrigo Liang, meanwhile, said, "SambaNova's DataScale rack weighs just 738 pounds, requires no special cooling or power, and outperforms an NVIDIA rack while using only one-tenth of the power."
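Nadella did not give a formula for "tokens per watt plus dollar," so the snippet below is only a minimal sketch of one plausible reading: track how many tokens a deployment produces per watt of power and per dollar of cost. The function name and every figure in the example are illustrative assumptions, not Microsoft or NVIDIA numbers.

```python
def token_efficiency(tokens: float, watts: float, dollars: float) -> dict:
    """One plausible reading of 'tokens per watt plus dollar':
    report output per unit of energy and per unit of cost separately.
    All inputs are hypothetical; this is not an official Microsoft metric."""
    return {
        "tokens_per_watt": tokens / watts,
        "tokens_per_dollar": tokens / dollars,
    }


# Illustrative only: a made-up cluster serving 1.2 billion tokens a day
# on a 700 kW draw (700,000 W) at a daily cost of $15,000.
print(token_efficiency(tokens=1.2e9, watts=700_000, dollars=15_000))
# -> {'tokens_per_watt': 1714.28..., 'tokens_per_dollar': 80000.0}
```

Comparing such ratios across hardware generations is one way to read the "new currency of performance" framing, though Nadella did not spell out how the two terms should be combined.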
Leading AI companies are experiencing diminishing returns on scaling their AI systems, prompting a shift in approach and raising questions about the future of AI development.
The AI industry is facing a significant challenge as the long-held belief in continuous improvement through larger models and more data is being called into question. Recent reports suggest that leading AI companies are experiencing diminishing returns on scaling their AI systems, forcing a reevaluation of development strategies [1][2].
For years, AI firms have relied on a simple principle: bigger models with more data and computing power would yield better results. This approach, formalized in the industry's "scaling laws," has been a cornerstone of AI development since the trend was first documented in 2017 [1]. However, recent developments indicate that this method may be reaching its limits.
Multiple sources, including reports from Reuters and Bloomberg, have highlighted the diminishing returns on AI scaling. OpenAI, a pioneer in the field, has reportedly faced challenges with its unreleased Orion model, which failed to meet internal expectations [1]. Even prominent venture capitalists like Marc Andreessen have noted that increasing computing power is no longer yielding the same "intelligence improvements" [2].
While some companies maintain optimism, others acknowledge the need for change. Anthropic, developer of the Claude chatbot, says it has not seen deviations from scaling laws [1]. However, OpenAI's former chief scientist, Ilya Sutskever, now argues that performance gains from bigger models have plateaued [1].
As the industry grapples with these challenges, AI labs are exploring alternative methods to advance their models:
Test-time compute: This emerging technique gives AI models more time and computational resources to "think" before answering questions [2] (see the minimal sketch after this list).
Innovative data usage: Researchers are looking at ways to use existing data more effectively, recognizing the limitations of simply increasing data volume [1].
Domain-specific improvements: There's a growing focus on enhancing AI capabilities in specific areas like reasoning and mathematics, where high-quality data is scarce [1].
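To make the test-time compute idea concrete, here is a minimal sketch of best-of-N sampling, one common way of spending extra compute at inference time: sample several candidate answers and keep the one a scoring function rates highest. The generate and score functions below are placeholders standing in for a real model and verifier; they are assumptions for illustration, not any lab's actual implementation.

```python
import random


def generate(prompt: str) -> str:
    """Stand-in for a language model call; returns one candidate answer."""
    # A real system would sample a completion from an LLM here.
    return f"candidate-{random.randint(0, 10_000)} for: {prompt}"


def score(prompt: str, answer: str) -> float:
    """Stand-in for a verifier or reward model that rates an answer."""
    # A real system might use a learned reward model or a rule-based check.
    return random.random()


def best_of_n(prompt: str, n: int = 16) -> str:
    """Spend more inference-time compute: sample n candidates, keep the best."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score(prompt, ans))


if __name__ == "__main__":
    # Raising n increases inference cost but can improve answer quality
    # without retraining or enlarging the underlying model.
    print(best_of_n("What is 17 * 24?", n=8))
```

The key point is that quality can improve by raising n, i.e. by paying for more compute per query, rather than by pre-training a larger model.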
The potential slowdown in AI progress has significant implications:
Economic impact: A deviation from expected progress could spook investors and trigger an economic reckoning in the AI sector [1].
Competitive landscape: Smaller companies and startups may find opportunities as the playing field levels [1].
Research priorities: The focus may shift from raw scaling to more nuanced approaches in AI development [2].
Despite the challenges, many in the industry remain optimistic about AI's future. Microsoft CEO Satya Nadella has pointed to test-time compute as a promising new direction [2]. Others, like Anyscale co-founder Robert Nishihara, emphasize the need for new ideas to keep the rate of progress increasing [2].
As the AI community navigates this transition, it's clear that the next phase of AI development will require more than just bigger models and more data. The industry stands at a crossroads, with the potential for new breakthroughs that could redefine the future of artificial intelligence.