Curated by THEOUTPOST
On Mon, 11 Nov, 4:01 PM UTC
14 Sources
[1]
Is AI's meteoric rise beginning to slow?
A quietly growing belief in Silicon Valley could have immense implications: the breakthroughs from large AI models -- the ones expected to bring human-level artificial intelligence in the near future -- may be slowing down.

Since the frenzied launch of ChatGPT two years ago, AI believers have maintained that improvements in generative AI would accelerate exponentially as tech giants kept adding fuel to the fire in the form of data for training and computing muscle. The reasoning was that delivering on the technology's promise was simply a matter of resources -- pour in enough computing power and data, and artificial general intelligence (AGI) would emerge, capable of matching or exceeding human-level performance.

Progress was advancing at such a rapid pace that leading industry figures, including Elon Musk, called for a moratorium on AI research. Yet the major tech companies, including Musk's own, pressed forward, spending tens of billions of dollars to avoid falling behind. OpenAI, ChatGPT's Microsoft-backed creator, recently raised $6.6 billion to fund further advances. xAI, Musk's AI company, is in the process of raising $6 billion, according to CNBC, to buy 100,000 Nvidia chips, the cutting-edge electronic components that power the big models.

However, there appear to be problems on the road to AGI. Industry insiders are beginning to acknowledge that large language models (LLMs) aren't scaling endlessly higher at breakneck speed when pumped with more power and data. Despite the massive investments, performance improvements are showing signs of plateauing.

"Sky-high valuations of companies like OpenAI and Microsoft are largely based on the notion that LLMs will, with continued scaling, become artificial general intelligence," said AI expert and frequent critic Gary Marcus. "As I have always warned, that's just a fantasy."

'No wall'

One fundamental challenge is the finite amount of language-based data available for AI training. According to Scott Stevenson, CEO of AI legal tasks firm Spellbook, who works with OpenAI and other providers, relying on language data alone for scaling is destined to hit a wall.

"Some of the labs out there were way too focused on just feeding in more language, thinking it's just going to keep getting smarter," Stevenson explained.

Sasha Luccioni, researcher and AI lead at startup Hugging Face, argues a stall in progress was predictable given companies' focus on size rather than purpose in model development. "The pursuit of AGI has always been unrealistic, and the 'bigger is better' approach to AI was bound to hit a limit eventually -- and I think this is what we're seeing here," she told AFP.

The AI industry contests these interpretations, maintaining that progress toward human-level AI is unpredictable. "There is no wall," OpenAI CEO Sam Altman posted Thursday on X, without elaboration. Anthropic's CEO Dario Amodei, whose company develops the Claude chatbot in partnership with Amazon, remains bullish: "If you just eyeball the rate at which these capabilities are increasing, it does make you think that we'll get there by 2026 or 2027."

'Time to think'

Nevertheless, OpenAI has delayed the release of the awaited successor to GPT-4, the model that powers ChatGPT, because its increase in capability is below expectations, according to sources quoted by The Information. Now, the company is focusing on using its existing capabilities more efficiently.
This shift in strategy is reflected in its recent o1 model, designed to provide more accurate answers through improved reasoning rather than increased training data. Stevenson said an OpenAI shift to teaching its model to "spend more time thinking rather than responding" has led to "radical improvements". He likened the advent of AI to the discovery of fire: rather than tossing on more fuel in the form of data and computing power, it is time to harness the breakthrough for specific tasks.

Stanford University professor Walter De Brouwer likens advanced LLMs to students transitioning from high school to university: "The AI baby was a chatbot which did a lot of improv" and was prone to mistakes, he noted. "The homo sapiens approach of thinking before leaping is coming," he added.
[2]
Is AI's meteoric rise beginning to slow?
[3]
Is AI's meteoric rise beginning to slow?
[4]
OpenAI Reportedly Hitting Law of Diminishing Returns as It Pours Computing Resources Into AI
Reports are emerging that OpenAI is hitting a wall as it continues to pour more computing power into its much-hyped large language models (LLMs) like ChatGPT in a bid for more intelligent outputs. AI models need loads of training data and computing power to operate at scale. But in an interview with Reuters, recently exited OpenAI cofounder Ilya Sutskever claimed that the firm's recent tests trying to scale up its models suggest that those efforts have plateaued.

"The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again," Sutskever, a staunch believer in the forthcoming arrival of so-called artificial general intelligence (AGI) or human-level AI, told Reuters. "Everyone is looking for the next thing."

While it's unclear what exactly that "next thing" may be, Sutskever's admission -- which comes, notably, just under a year after he moved to oust OpenAI CEO Sam Altman and was subsequently sidelined until his eventual departure -- seems to dovetail with other recent claims and conclusions: that AI companies, and OpenAI specifically, are butting up against the law of diminishing returns.

Over the weekend, The Information reported that with each new flagship model, OpenAI is seeing a slowdown in the sort of "leaps" users have come to expect in the wake of its game-changing ChatGPT release in December 2022. This slowdown seems to test the core belief at the center of the argument for AI scaling: that as long as there's ever more data and computing power to feed the models -- which is a big "if," given that firms have already run out of training data and are eating up electricity at unprecedented rates -- those models will continue to grow or "scale" at a consistent rate.

Responding to this latest news from The Information, data scientist Yam Peleg teased on X that another cutting-edge AI firm had "reached an unexpected HUGE wall of diminishing returns trying to brute-force better results by training longer & using more and more data." While Peleg's commentary could just be gossip, researchers have been warning for years now that LLMs would eventually hit this wall. Given the insatiably high demand for powerful AI chips -- and that firms are now training their models on AI-generated data -- it doesn't take a machine learning expert to wonder whether the low-hanging fruit is running out.

"I think it is safe to assume that all major players have reached the limits of training longer and collecting more data already," Peleg continued. "It is all about data quality now.. which takes time."
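To make the diminishing-returns point concrete: empirical scaling studies have generally described model loss as following a power law in compute, meaning each extra order of magnitude of resources buys a smaller absolute gain. The short Python sketch below is purely illustrative; the exponent and scale constants are invented for demonstration and are not any lab's actual measurements.

```python
# Toy illustration of diminishing returns under a power-law scaling curve.
# The constants (alpha, scale) are invented for demonstration purposes only.

def loss(compute: float, alpha: float = 0.05, scale: float = 10.0) -> float:
    """Hypothetical test loss as a power law in training compute."""
    return scale * compute ** -alpha

prev = None
for exponent in range(0, 9):  # compute budgets from 1x up to 100,000,000x
    c = 10.0 ** exponent
    current = loss(c)
    gain = "" if prev is None else f"  (improvement over previous: {prev - current:.3f})"
    print(f"compute 10^{exponent}: loss {current:.3f}{gain}")
    prev = current
```

Run it and the printed improvements shrink with every additional order of magnitude of compute, which is the pattern the reporting above calls diminishing returns.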
[5]
What if AI doesn't just keep getting better forever?
For years now, many AI industry watchers have looked at the quickly growing capabilities of new AI models and mused about exponential performance increases continuing well into the future. Recently, though, some of that AI "scaling law" optimism has been replaced by fears that we may already be hitting a plateau in the capabilities of LLMs trained with standard methods.

A weekend report from The Information effectively summarized how these fears are manifesting among a number of insiders at OpenAI. Unnamed OpenAI researchers told The Information that Orion, the company's codename for its next full-fledged model release, is showing a smaller performance jump than the one seen between GPT-3 and GPT-4 in recent years. On certain tasks, in fact, the upcoming model "isn't reliably better than its predecessor," according to unnamed OpenAI researchers cited in the piece.

On Monday, OpenAI co-founder Ilya Sutskever, who left the company earlier this year, added to the concerns that LLMs were hitting a plateau in what can be gained from traditional pre-training. Sutskever told Reuters that "the 2010s were the age of scaling," where throwing additional computing resources and training data at the same basic training methods could lead to impressive improvements in subsequent models. "Now we're back in the age of wonder and discovery once again," Sutskever told Reuters. "Everyone is looking for the next thing. Scaling the right thing matters more now than ever."

A large part of the training problem, according to experts and insiders cited in these and other pieces, is a lack of new, quality textual data for new LLMs to train on. At this point, model makers may have already picked the lowest-hanging fruit from the vast troves of text available on the public Internet and published books.
[6]
LLM Scaling Has Hit a Wall; What's Next For ChatGPT?
Similar to OpenAI's o1 models, Google and Anthropic are working on inference scaling techniques. While OpenAI chief Sam Altman is drumming up hype that AGI is just around the corner, new reports suggest that LLM scaling has hit a wall.

The predominant view in the AI field has been that training larger models on massive amounts of data and compute resources will lead to greater intelligence. In fact, Ilya Sutskever, former chief scientist at OpenAI and founder of Safe Superintelligence Inc., has been a strong advocate for scaling models as the path to unlocking intelligence. Speaking to Reuters, Sutskever now says that results from scaling up pre-training -- the phase of training an AI model that uses a vast amount of unlabeled data to understand language patterns and structures -- have plateaued. In a turnaround, Sutskever says scaling the right things now matters: "The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing. Scaling the right thing matters more now than ever."

That's the reason OpenAI released its new series of 'o1' reasoning models on ChatGPT that scale during inference. It has been observed that if AI models are given more time to "think" and re-evaluate their responses, they yield far better results. So companies are now focusing more on test-time compute, which means adding more resources during inference before generating a final response.

Recently, The Information reported that OpenAI has changed its strategy as its next big "Orion" model didn't deliver better results as anticipated. The jump from GPT-3.5 to GPT-4 was huge, but OpenAI employees who tested the upcoming model say that the improvement from GPT-4 to Orion is marginal. In tasks like coding, it doesn't outperform prior GPT models. OpenAI is now focused on inference scaling as a new way to improve model performance on ChatGPT.

Noam Brown, a researcher at OpenAI, says that inference scaling improves model performance significantly. Recently, he tweeted, "OpenAI's o1 thinks for seconds, but we aim for future versions to think for hours, days, even weeks. Inference costs will be higher, but what cost would you pay for a new cancer drug? For breakthrough batteries? For a proof of the Riemann Hypothesis? AI can be more than chatbots."

Google and Anthropic are also working on a similar technique to improve model performance through inference scaling. However, François Chollet, a researcher at Google, argues that scaling LLMs alone won't lead to generalized intelligence. Yann LeCun, chief AI scientist at Meta, similarly says that LLMs are not sufficient for achieving AGI. As companies run out of data to train larger models, they are looking for novel techniques to improve LLM performance. Whether AGI is genuinely around the corner or simply hype, only time will tell.
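As a rough sketch of what "giving a model more time to think and re-evaluate its response" can look like in code, here is a minimal draft-critique-revise loop in Python. It is an assumption-laden illustration, not OpenAI's actual method: `generate` is a hypothetical stand-in for any LLM call, and the prompts are invented.

```python
# Minimal sketch of spending extra inference-time compute on one query:
# draft an answer, critique it, revise, and repeat. `generate` is a
# hypothetical stand-in for an LLM call, not a real API.

from typing import Callable

def answer_with_reflection(
    generate: Callable[[str], str],  # prompt -> completion (hypothetical LLM hook)
    question: str,
    rounds: int = 3,                 # more rounds = more "thinking" time
) -> str:
    draft = generate(f"Answer the question: {question}")
    for _ in range(rounds):
        critique = generate(
            f"Question: {question}\nDraft answer: {draft}\n"
            "List any errors or gaps in the draft."
        )
        draft = generate(
            f"Question: {question}\nDraft answer: {draft}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
    return draft
```

With rounds=0 this collapses to an ordinary single completion; each additional round trades higher inference cost for a chance to catch and correct mistakes, which is the trade-off Brown's tweet alludes to.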
[7]
OpenAI and rivals seek new path to smarter AI as current methods hit limitations
(Reuters) - Artificial intelligence companies like OpenAI are seeking to overcome unexpected delays and challenges in the pursuit of ever-bigger large language models by developing training techniques that use more human-like ways for algorithms to "think". A dozen AI scientists, researchers and investors told Reuters they believe that these techniques, which are behind OpenAI's recently released o1 model, could reshape the AI arms race, and have implications for the types of resources that AI companies have an insatiable demand for, from energy to types of chips. OpenAI declined to comment for this story.

After the release of the viral ChatGPT chatbot two years ago, technology companies, whose valuations have benefited greatly from the AI boom, have publicly maintained that "scaling up" current models through adding more data and computing power will consistently lead to improved AI models. But now, some of the most prominent AI scientists are speaking out on the limitations of this "bigger is better" philosophy.

Ilya Sutskever, co-founder of AI labs Safe Superintelligence (SSI) and OpenAI, told Reuters recently that results from scaling up pre-training - the phase of training an AI model that uses a vast amount of unlabeled data to understand language patterns and structures - have plateaued. Sutskever is widely credited as an early advocate of achieving massive leaps in generative AI advancement through the use of more data and computing power in pre-training, which eventually created ChatGPT. Sutskever left OpenAI earlier this year to found SSI.

"The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing," Sutskever said. "Scaling the right thing matters more now than ever." Sutskever declined to share more details on how his team is addressing the issue, other than saying SSI is working on an alternative approach to scaling up pre-training.

Behind the scenes, researchers at major AI labs have been running into delays and disappointing outcomes in the race to release a large language model that outperforms OpenAI's GPT-4 model, which is nearly two years old, according to three sources familiar with private matters. The so-called 'training runs' for large models can cost tens of millions of dollars by simultaneously running hundreds of chips. They are more likely to have hardware-induced failure given how complicated the system is; researchers may not know the eventual performance of the models until the end of the run, which can take months. Another problem is large language models gobble up huge amounts of data, and AI models have exhausted all the easily accessible data in the world. Power shortages have also hindered the training runs, as the process requires vast amounts of energy.

To overcome these challenges, researchers are exploring "test-time compute," a technique that enhances existing AI models during the so-called "inference" phase, or when the model is being used. For example, instead of immediately choosing a single answer, a model could generate and evaluate multiple possibilities in real-time, ultimately choosing the best path forward. This method allows models to dedicate more processing power to challenging tasks like math or coding problems or complex operations that demand human-like reasoning and decision-making.
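The "generate and evaluate multiple possibilities" approach described above is often implemented as best-of-N sampling: draw several candidate answers, score each one, and return the highest-scoring candidate. The sketch below illustrates that general idea, not OpenAI's actual o1 pipeline; `generate` and `score` are hypothetical stand-ins for an LLM sampler and a verifier or reward model.

```python
# Minimal best-of-N sketch of test-time compute: sample several candidate
# answers and keep the one a scoring function likes best. `generate` and
# `score` are hypothetical stand-ins, not real APIs.

import random
from typing import Callable

def best_of_n(
    generate: Callable[[str], str],      # samples one candidate answer
    score: Callable[[str, str], float],  # (question, answer) -> quality score
    question: str,
    n: int = 8,                          # larger n = more inference compute
) -> str:
    candidates = [generate(question) for _ in range(n)]
    return max(candidates, key=lambda answer: score(question, answer))

# Toy usage with dummy functions, just to show the control flow:
if __name__ == "__main__":
    gen = lambda q: f"candidate-{random.randint(0, 99)}"
    scr = lambda q, a: float(a.split("-")[1])  # pretend a higher suffix is better
    print(best_of_n(gen, scr, "What is 2 + 2?"))
```

Raising n spends more inference-time compute per query, which is why the shift described here could favor "inference clouds" over ever-larger training clusters.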
"It turned out that having a bot think for just 20 seconds in a hand of poker got the same boosting performance as scaling up the model by 100,000x and training it for 100,000 times longer," said Noam Brown, a researcher at OpenAI who worked on o1, at TED AI conference in San Francisco last month. OpenAI has embraced this technique in their newly released model known as "o1," formerly known as Q* and Strawberry, which Reuters first reported in July. The O1 model can "think" through problems in a multi-step manner, similar to human reasoning. It also involves using data and feedback curated from PhDs and industry experts. The secret sauce of the o1 series is another set of training carried out on top of 'base' models like GPT-4, and the company says it plans to apply this technique with more and bigger base models. At the same time, researchers at other top AI labs, from Anthropic, xAI, and Google DeepMind, have also been working to develop their own versions of the technique, according to five people familiar with the efforts. "We see a lot of low-hanging fruit that we can go pluck to make these models better very quickly," said Kevin Weil, chief product officer at OpenAI at a tech conference in October. "By the time people do catch up, we're going to try and be three more steps ahead." Google and xAI did not respond to requests for comment and Anthropic had no immediate comment. The implications could alter the competitive landscape for AI hardware, thus far dominated by insatiable demand for Nvidia's AI chips. Prominent venture capital investors, from Sequoia to Andreessen Horowitz, who have poured billions to fund expensive development of AI models at multiple AI labs including OpenAI and xAI, are taking notice of the transition and weighing the impact on their expensive bets. "This shift will move us from a world of massive pre-training clusters toward inference clouds, which are distributed, cloud-based servers for inference," Sonya Huang, a partner at Sequoia Capital, told Reuters. Demand for Nvidia's AI chips, which are the most cutting edge, has fueled its rise to becoming the world's most valuable company, surpassing Apple in October. Unlike training chips, where Nvidia dominates, the chip giant could face more competition in the inference market. Asked about the possible impact on demand for its products, Nvidia pointed to recent company presentations on the importance of the technique behind the o1 model. Its CEO Jensen Huang has talked about increasing demand for using its chips for inference. "We've now discovered a second scaling law, and this is the scaling law at a time of inference...All of these factors have led to the demand for Blackwell being incredibly high," Huang said last month at a conference in India, referring to the company's latest AI chip. (Reporting by Krystal Hu in New York and Anna Tong in San Francisco; editing by Kenneth Li and Claudia Parsons)
[8]
OpenAI and rivals seek new path to smarter AI as current methods hit limitations
[9]
OpenAI co-founder reckons AI training has hit a wall, forcing AI labs to train their models smarter, not just bigger
Ilya Sutskever, co-founder of OpenAI, thinks existing approaches to scaling up large language models have plateaued. For significant future progress, AI labs will need to train smarter, not just bigger, and LLMs will need to think a little bit longer.

Speaking to Reuters, Sutskever explained that the pre-training phase of scaling up large language models, such as ChatGPT, is reaching its limits. Pre-training is the initial phase that processes huge quantities of uncategorized data to build language patterns and structures within the model. Until recently, adding scale, in other words increasing the amount of data available for training, was enough to produce a more powerful and capable model. But that's no longer the case; instead, exactly what you train the model on, and how, matters more.

"The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing," Sutskever reckons, "scaling the right thing matters more now than ever."

The backdrop here is the increasingly apparent problems AI labs are having making major advances on models in and around the power and performance of GPT-4. The short version of this narrative is that everyone now has access to the same, or at least similar, easily accessible training data through various online sources. It's no longer possible to get an edge simply by throwing more raw data at the problem. So, in very simple terms, training smarter, not just bigger, is what will now give AI outfits an edge.

Another enabler for LLM performance will be at the other end of the process, when the models are fully trained and accessed by users, the stage known as inferencing. Here, the idea is to use a multi-step approach to solving problems and queries in which the model can feed back into itself, leading to more human-like reasoning and decision-making.

"It turned out that having a bot think for just 20 seconds in a hand of poker got the same performance boost as scaling up the model by 100,000x and training it for 100,000 times longer," Noam Brown, an OpenAI researcher who worked on the latest o1 LLM, says. In other words, having bots think longer rather than just spew out the first thing that comes to mind can deliver better results.

If the latter proves a productive approach, the AI hardware industry could shift away from massive training clusters towards banks of GPUs focused on improved inferencing. Of course, either way, Nvidia is likely to be ready to take everyone's money. The increase in demand for AI GPUs for inferencing is indeed something Nvidia CEO Jensen Huang recently noted. "We've now discovered a second scaling law, and this is the scaling law at a time of inference. All of these factors have led to the demand for Blackwell [Nvidia's next-gen GPU architecture] being incredibly high," Huang said recently.

How long it will take for a generation of cleverer bots to appear thanks to these methods isn't clear. But the effort will probably show up in Nvidia's bank balance soon enough.
[10]
A funny thing happened on the way to AGI: Model 'supersizing' has hit a wall
The impressive intelligence gains of large models like OpenAI's GPT-4 and Anthropic's Claude came about after researchers figured out that they could reap progressively and predictably better intelligence by increasing the size of the models, and training them with more data and more computing power for longer periods of time. That realization yielded some impressive chatbots and coding assistants.

The big question now is how far the "supersizing" approach can carry researchers toward the holy grail of artificial general intelligence (i.e., AI that's generally smarter than human beings). In fact, newly departed OpenAI cofounder Ilya Sutskever recently told Reuters that the jig is up for massive scaling. Similarly, billionaire investor and accelerationist Marc Andreessen said on his podcast with Ben Horowitz that the AI researchers he talks to are hitting the limits of massive scale. "If you look at the improvement from GPT-2 to GPT-3 to 3.5, and then compare that from like 3.5 to 4, you know we really slowed down in terms of the amount of improvement," Horowitz says.
[11]
The AI Winter Begins? - AI Scaling Challenges and the Future of AI Development
From leaked documentation and circulating reports, it seems that the frontrunners in artificial intelligence research are encountering significant obstacles in scaling AI models. This development challenges the long-held belief that bigger models with more data and computing power inevitably lead to smarter AI. The situation suggests there may be a ceiling to enhancing intelligence through mere scaling, prompting a critical reevaluation of current AI development strategies.

Imagine eagerly anticipating the next big leap in artificial intelligence, only to find that the much-hyped advancements fall short of expectations. This is the current reality for OpenAI, a leader in AI research, as it grapples with the realization that simply making models larger and feeding them more data is not a guaranteed path to creating smarter AI. The Orion model, once anticipated as a groundbreaking innovation, has instead highlighted the potential limits of scaling. This unexpected twist is driving a shift in focus, urging researchers to explore more nuanced and efficient methods to enhance AI capabilities.

As we stand on the cusp of this pivotal moment in AI development, it is becoming clear that the path forward will not be as straightforward as once thought. OpenAI's challenges underscore a broader industry-wide reckoning with the "bigger is better" mindset. The good news? These obstacles could inspire more innovative and sustainable approaches, emphasizing reasoning, safety, and alignment with human values. By rethinking current strategies, researchers may unlock the potential for AI systems that are not only more powerful but also better attuned to the complexities of real-world applications. While the journey may be more complex, the destination holds the promise of truly transformative advancements.

The challenges faced by OpenAI highlight a crucial turning point in AI research. As the industry grapples with the limitations of the "bigger is better" approach, researchers are being forced to explore more nuanced and efficient methods of improving AI capabilities. This shift could have far-reaching implications for the future of AI development, potentially leading to more innovative and sustainable approaches.

The underwhelming performance of the Orion model underscores the need to pivot from a focus on sheer size to more sophisticated approaches in AI development. OpenAI is now actively exploring specialized AI tools and post-training techniques to enhance model capabilities. These techniques aim to refine AI performance through targeted improvements rather than relying solely on increased scale. By focusing on quality over quantity, researchers hope to create AI systems that are not just larger, but smarter and more adaptable to complex tasks.

Synthetic data presents both opportunities and challenges in the realm of AI development. While it offers the potential to significantly expand training datasets, there's growing concern that over-reliance on synthetic data might inadvertently stifle innovation.
The use of synthetic data introduces a complex dynamic.

Benefits:
- Expands available training data
- Allows for creation of diverse scenarios
- Reduces dependency on real-world data collection

Risks:
- Potential introduction of biases
- Possible reinforcement of existing model limitations
- Risk of creating AI systems detached from real-world complexities

Careful management of synthetic data usage is crucial to ensure AI models remain robust, reliable, and grounded in real-world applications. Striking the right balance between synthetic and real data will be key to driving meaningful AI advancements.

AI development is increasingly prioritizing reasoning capabilities, safety measures, and alignment with human values. This shift reflects a broader understanding that AI models must not only perform tasks effectively but also do so safely and ethically. Ensuring alignment with human values and intentions is becoming a critical component of AI strategy, influencing future technological directions. This evolving approach aims to create AI systems that are not just powerful, but also trustworthy and beneficial to society.

The changing priorities in AI development are likely to have a significant impact on AI infrastructure investments. As the focus shifts from merely scaling models to enhancing their reasoning and safety capabilities, investments may be redirected towards developing more sophisticated infrastructure to support these goals. This change could redefine the AI technology landscape, guiding future innovations and strategies. These evolving investment patterns reflect a maturing AI industry that is increasingly focused on creating sustainable, responsible, and truly intelligent systems.

OpenAI's experiences highlight the complexities and challenges inherent in pushing the boundaries of AI development. The limitations of scaling, coupled with the growing need for specialized tools and ethical considerations, are actively shaping the future trajectory of AI research and development. As the field continues to evolve, a balanced approach that integrates size, reasoning capabilities, and safety considerations will be crucial for achieving sustainable progress in artificial intelligence. This multifaceted approach promises to unlock new possibilities in AI, potentially leading to systems that are not just more powerful, but also more aligned with human needs and values.
[12]
How OpenAI and rivals are overcoming limitations of current AI models
[13]
Report: AI companies face scaling wall as results dwindle
Artificial intelligence (AI) companies are hitting a scaling wall, according to a Reuters report referring to experts and investors in the AI space. The report suggests that the results from scaling up pre-training -- making models bigger and feeding them more data -- are no longer providing proportional capability improvements. AI developers are reportedly struggling to build a model that is better than GPT-4.

To cope with this, companies are using a process called "test-time compute," in which the model does additional processing while it is producing inferences. So, for instance, when you ask a model for the answer to a specific question, it might show you two options instead of pre-emptively picking one. This helps the model allocate more of its processing power to challenging tasks like maths or coding.

While Reuters published a comprehensive report about the subject, concerns around AI scaling have been in discussion for a while now. As per a recent Bloomberg report, OpenAI's new model Orion did not live up to the company's expectations, given that it wasn't as big a step up as GPT-4 was from GPT-3.5. Similarly, Google and Anthropic are also struggling to make any major breakthroughs.

This comes as a concerning development, especially keeping in mind the large quantities of funds companies are allocating to AI. Meta said in its latest earnings call, for the quarter ending in September 2024, that its investment in AI continues to require "serious infrastructure" and that it expects to keep investing significantly. Similarly, as per Google's recent earnings call, the company spent $7.2 billion on sales and marketing; these expenses were a result of Google's investment in advertising and promotional efforts related to the Made by Google launches, as well as AI and Gemini.

Commenting on the Reuters report, Meta's chief AI scientist Yann LeCun says, "I told you so." He explains that auto-regressive large language models (LLMs) -- models that predict the next word from the words that came before -- are hitting a ceiling, and that he has been concerned about this since before most people had heard of LLMs. "I've always said that LLMs were useful, but were an off-ramp on the road towards human-level AI. I've said that reaching human-level AI will require new architectures and new paradigms," LeCun explains.

Fellow computer scientist Gary Marcus argues that he spoke about deep learning models hitting a wall in 2022. "We all know that GPT-3 was vastly better than GPT-2. And we all know that GPT-4 (released thirteen months ago) was vastly better than GPT-3. But what has happened since? I could be persuaded that on some measures there was a doubling of capabilities for some set of months in 2020-2023, but I don't see that case at all for the last 13 months. Instead, I see numerous signs that we have reached a period of diminishing returns," Marcus mentioned in his blog post in April this year. At the time, he flagged AI projects like Inflection AI and Stability AI that were struggling with financial difficulties. "If enthusiasm for GenAI dwindles and market valuations plummet, AI won't disappear, and LLMs won't disappear; they will still have their place as tools for statistical approximation," he explained.

Meanwhile, OpenAI CEO Sam Altman addressed the Reuters report with a single-line tweet saying there is no wall.
Earlier this year, during the AI for Good Global Summit, an interviewer asked Altman about data shortages and questioned whether OpenAI was now relying on synthetic data (computer-generated data) to train its models. Altman admitted that the company has generated a lot of synthetic data and experimented with training on it, while adding that it would be really strange if the best way for companies to train models was to generate synthetic data and feed it into their models. When asked about quality concerns with synthetic data, Altman said that companies need to focus on ensuring that the data is of good quality, and also on finding ways to "get better at data efficiency and learn more from smaller amounts of data."

Another avenue companies could explore to address the data shortage is non-English data. At MediaNama's annual PrivacyNama conference this year, lawyer Amlan Mohanty pointed out that localised data sets (like ones from the Global South) could make AI models more powerful and culturally richer. "These models are going to become more powerful, more capable when they have really small, localized, specialized data sets that are going to be licensed," he mentioned.
[14]
OpenAI, Competitors Look for Ways to Overcome Current Limitations
Recent reports suggest that the rapid advancements in AI, particularly in large language models, may be hitting a plateau. Industry insiders and experts are noting diminishing returns despite massive investments in computing power and data.
The artificial intelligence community is buzzing with a growing sentiment that the meteoric rise of AI technologies, particularly large language models (LLMs), may be slowing down. This development could have significant implications for the future of AI and the tech industry at large [1].
Since the launch of ChatGPT two years ago, there has been a prevailing belief that improvements in generative AI would accelerate exponentially. The theory was simple: pour in more computing power and data, and artificial general intelligence (AGI) would inevitably emerge [2].
Tech giants have been pouring billions into AI development. OpenAI recently raised $6.6 billion, while Elon Musk's xAI is reportedly seeking $6 billion to purchase 100,000 Nvidia chips [1]. These investments underscore the high stakes in the race for AI supremacy.
Despite these massive investments, industry insiders are beginning to acknowledge that LLMs aren't scaling endlessly higher when provided with more power and data. Performance improvements are showing signs of plateauing, challenging the notion that continued scaling will lead to AGI [2].
One fundamental challenge is the finite amount of high-quality, language-based data available for AI training. Scott Stevenson, CEO of AI legal tasks firm Spellbook, suggests that relying solely on language data for scaling is destined to hit a wall [1].
In response to these challenges, companies like OpenAI are shifting their focus. Instead of simply increasing model size, they are exploring ways to use existing capabilities more efficiently. OpenAI's recent o1 model, for instance, aims to provide more accurate answers through improved reasoning rather than increased training data [2].
While some in the AI industry contest these interpretations, others acknowledge the need for a new approach. Ilya Sutskever, a recently departed OpenAI co-founder, stated, "The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again" [4].
As the AI community grapples with these challenges, the focus is shifting from simply scaling up models to finding more innovative approaches. Stanford University professor Walter De Brouwer likens this transition to students moving from high school to university, suggesting a more thoughtful, "homo sapiens approach of thinking before leaping" [1].
These developments could have significant implications for the AI industry, potentially affecting the sky-high valuations of companies like OpenAI and Microsoft, and they raise questions about the feasibility of achieving AGI through current methods [5].
As the AI community navigates these challenges, the coming years may see a shift in focus from raw computing power to more nuanced and efficient approaches in the pursuit of advanced AI capabilities.
References
[1] Is AI's meteoric rise beginning to slow? (AFP)
[2] Is AI's meteoric rise beginning to slow? (AFP, syndicated)
[3] Is AI's meteoric rise beginning to slow? (AFP, excerpt)
[4] OpenAI Reportedly Hitting Law of Diminishing Returns as It Pours Computing Resources Into AI
[5] What if AI doesn't just keep getting better forever?