19 Sources
[1]
Can Google's AI Memory Compression Algorithm Help Solve the RAM Crisis?
Google has unveiled a new memory-optimization algorithm for AI inferencing that researchers claim could reduce the amount of "working memory" an AI model requires by at least 6x. As TechCrunch reports, this "TurboQuant" algorithm is still a lab breakthrough rather than a technology that has been trialed at scale or deployed in the real world, but if it does what it says it does, it could help reduce the enormous disparity between memory supply and demand, which is causing so many knock-on effects in a range of hardware and material industries. "We introduce a set of advanced, theoretically grounded quantization algorithms that enable massive compression for large language models and vector search engines," Google says in a research paper. The idea is that TurboQuant reduces memory requirements, improves response performance, and lowers latency while maintaining accuracy. In practice, it would allow AI models to access more contextual data while using less space and avoiding hallucinations. These are the holy grail achievements of any compression algorithm: Make everything smaller, easier, and cheaper to move, without losing anything in the process. (Remember HBO's Silicon Valley and Pied Piper?) Google is set to showcase the core components of TurboQuant at ICLR 2026 next month: PolarQuant, a quantization method, and QJL (Quantized Johnson-Lindenstrauss), a novel training and optimization method. Together, they could help alleviate the memory bottleneck. Although it wouldn't do much for training data centers, which also require monstrous amounts of memory, it could thin out the RAM needs of inferencing systems. It probably wouldn't do much to solve the current memory crisis, as deployment would take time, and memory orders are already locked in for many months. But perhaps it could help bring the RAM shortage to a close before 2030. Google seems confident it's ready for large-scale deployment. 
"These methods don't just work well in real-world applications; they are provably efficient and operate near theoretical lower bounds," it says. "This rigorous foundation is what makes them robust and trustworthy for critical, large-scale systems."
[2]
Memory-makers' shares are down. Don't blame Google
Chocolate Factory boffins have found a way to reduce AI's memory use, but don't assume that means less demand for DRAM
The high cost of memory has sideswiped the technology industry, causing server vendors to admit their quotes are guesstimates and depressing sales of PCs and smartphones. Nobody is immune: Microsoft used the RAM panic as cover for fixing Windows 11's memory gluttony, and Sony suspended orders for compact flash and SD cards because it can't buy the chips to build them. Demand for artificial intelligence infrastructure created the situations described above by giving memory-makers an incentive to prioritize production of the high-bandwidth, high-margin memory that GPUs require. The reduced supply of other memory types sent prices soaring. Yet over the last week, the price of consumer-grade memory has reportedly eased at some online vendors. Memory-makers' share prices slipped sharply. The value of Micron Technology's scrip has slumped in the dozen days since it announced enormous growth in revenue and profits. Western Digital's share price fell 8.5 percent on Monday alone and is down 20.5 percent since March 19th. SanDisk slid seven percent on Monday, and the company has lost a fifth of its value in a fortnight. Those falls are steeper than those recorded by major stock market indices, as investors try to understand the impact of war in the Middle East. Given the AI boom continues largely unabated, why are investors worried? Some have linked the change in market sentiment to a technology Google revealed last week called TurboQuant, which the company's researchers describe as "a compression algorithm that optimally addresses the challenge of memory overhead in vector quantization" - one of the most memory-hungry parts of AI workloads - and which, by doing so, speeds up applications and "lowers memory costs." Google's announcement also claims TurboQuant can reduce the amount of memory needed in the key-value cache - "by a factor of at least 6x". 
Some have concluded that TurboQuant means demand for memory will fall and linked Google's announcement of the tech to share market moves. Analyst firm TrendForce, which specializes in the memory market, disagrees. In a report published last week, TrendForce predicts TurboQuant will lower the cost of AI infrastructure, and by doing so "spark massive long-sequence application demand, comprehensively driving structural growth and specification upgrades for high-bandwidth, main, and flash memory across cloud and edge platforms." The firm thinks that TurboQuant can reduce the cost of running inferencing workloads, and suggests that this "is likely to drive substantial demand for long-context and multi-agent architectures, further accelerating the migration of AI workloads to the edge." Or in other words, more efficient AI will create demand for more AI, and more memory. The war in the Persian Gulf means it will be hard to meet that demand, because the conflict has damaged the supply chain for helium - a vital component in semiconductor production - which could mean chipmakers can't make all the RAM they assumed would inflate their revenue and profits. Share prices reflect investors' opinions of a business's future prospects. And helium shortages are an obvious indicator of reduced future production and sales. ®
[3]
Memory Stock Boom Seen Resilient to Threat From New Google Tech
Shares of computer memory and storage product makers slumped on concerns over demand after Google researchers touted a new compression technique. But it may be a hiccup rather than an existential threat. SK Hynix Inc., a key maker of memory chips for artificial intelligence applications, fell as much as 6% on the Korea Exchange. Flash memory manufacturer Kioxia Holdings Corp. dropped 4.4% in Tokyo. That followed similar losses by Micron Technology Inc. and Sandisk Corp. Wednesday in New York. The Alphabet Inc. unit said its new TurboQuant technology can reduce memory size for large language models and vector search engines. Bulls tracking the blistering rally in global memory shares, however, say that improved efficiency will increase rather than reduce demand -- an old theory known as the Jevons Paradox. The 19th century premise was cited in a note from the trading desk at JPMorgan Chase & Co. Its analysts said that investors may take profits on the news, but there's no near-term threat to memory consumption. Memory and storage product prices have climbed in recent months amid shortages due to ravenous demand tied to the AI boom. That's driven exponential moves in related stocks, such as Kioxia's 700% surge since the end of August. The Google news spurred some caution that demand could be reduced, but some analysts pushed back on this idea, saying the opposite is actually true. The Jevons Paradox is a 19th-century English economist's observation about coal: the more efficiently it is used, the more demand for it rises. The idea was brought up last year when China's low-cost DeepSeek AI model sparked fear of reduced demand for advanced technology. The Google development may make "little difference to demand given the extreme supply constraints," Ortus Advisors analyst Andrew Jackson wrote in a note on Smartkarma. For Kioxia, "after such massive gains it makes sense we see a bit of profit-taking creep in."
[4]
You Can't Escape the AI Tax
Electronics are getting more expensive and worse. Blame the AI boom. Recently, a Costco in Florida instituted a new store policy. An employee told me that he was asked to open up every desktop computer displayed in the electronics section and remove the memory chips. Otherwise, the RAM harvesters would get them. Elsewhere, criminal groups are misdirecting trucks carrying RAM in order to loot them. All of this is happening because of a generational shortage of a part used in practically every electronic gadget on Earth. RAM is your device's short-term memory -- storing the information it needs to handle any active tasks. (RAM stands for "random-access memory.") To put this in intimately familiar terms, it is what your computer runs out of when you have too many browser tabs open. And right now, the price of RAM is skyrocketing. From September to February, the price of a single 64GB stick of RAM went from roughly $250 to more than $1,000. Gamers who build their own juiced computers were among the first to notice that something was off. Starting in the fall, it became so difficult for them to acquire memory sticks that they have given a name to this crisis: RAMageddon. Now it's quickly becoming everyone's problem. In December, Dell jacked the prices of some of its computers by hundreds of dollars because of what its COO has referred to as "this memory crisis, shortage, whatever you want to call it." Earlier this month, for the same reason, Lenovo raised prices on some of its products, including the popular ThinkPad. This seems to be only the beginning. Matteo Rinaldi, the head of a global semiconductor-research institute run by Northeastern University, told me he recently asked a colleague what new laptop he should buy. "He told me right away, 'Well, you know, it almost doesn't matter which one,'" Rinaldi said. "'Just decide you want to buy now, because prices are going up.'" RAM is suddenly so expensive because memory is powering the AI boom. 
Data centers require huge amounts to run the models that underlie AI tools such as ChatGPT and Claude -- especially as they become capable of handling more complicated tasks. This year, a group of tech giants -- Amazon, Alphabet, Meta, Microsoft, and Oracle -- is set to collectively spend half a trillion dollars on the AI build-out. Roughly a third of that money is being spent on memory alone, according to Dylan Patel, the founder of SemiAnalysis, a popular semiconductor-research firm. The insatiable demand has "cannibalized our conventional consumer-electronics supply," Yang Wang, an analyst at Counterpoint Research, a market-research firm, told me. Every major RAM manufacturer has shifted production lines to service AI data centers. This year, 70 percent of memory-chip products made globally will be destined for them. In South Korea, where two of the biggest RAM manufacturers are based, Silicon Valley executives are reportedly booking hotels in the country's tech districts, frantically hoping to secure inventory. A Korean newspaper has given them a name: RAM beggars. Ideally, this problem would be solved by producing a whole lot more RAM. Micron, one of the biggest RAM manufacturers, is building a factory in New York that will cost more than any other private investment in the state's history. Elon Musk recently suggested that Tesla will build its own RAM factories, called "fabs," to ensure that he has enough memory to build robots and robotaxis. ("We've got two choices: Hit the chip wall, or make a fab," he said in January.) But because of the complexity of making RAM, it could take even the richest man in the world two to five years to bring a new factory online. In the meantime, the world simply won't have enough of a basic electronics part. During RAMageddon, your gadgets will essentially be subject to an AI tax. It's long been safe to assume that technology will get cheaper, faster, and better. 
But for the next few years, all signs suggest that devices will get more expensive, slower, and worse. So far, it might not feel like all that much has changed. Earlier this month, Apple released its cheapest computer ever, the $599 Mac Neo. (It runs on a chip previously used only in iPhones.) But elsewhere, the price hikes have started. Samsung's new Galaxy phones cost about $100 more than last year's models, which the company's COO has attributed in large part to the memory shortage. That's despite the fact that Samsung is one of three companies in the world producing a significant amount of memory. Android phones have debuted this year with worse cameras, less storage, and slower processors than models released years ago, Wang told me, yet they still cost more. Expect more changes like this. Gadget makers were able to initially swallow the cost of high RAM, but in the long run, they'll have little choice but to pass on the cost to consumers. Consider Sony, which just announced that it will raise the price of the PlayStation 5 by $100. Before the adjustment, the memory chips inside a PS5 were worth more than the console itself. Smaller video-game manufacturers have pushed back launches or canceled the release of new consoles altogether. To keep up with increasing RAM costs, things might get weird. Companies may jack up software prices to compensate for all the money they are sinking into memory chips. Sony's CFO said on a recent earnings call that the company will survive the RAM crisis by "monetizing the installed base," which seems to be a euphemism for finding ways to charge PlayStation owners more, or showing them more ads. (Sony did not respond to a request for comment.) At the same time, some companies may start to pare back products they've made "smart" to justify markups. Smart speakers, smart toilets, smart toasters, and smart deodorants (yes, really) all contain RAM. "Do we stop getting smart refrigerators? 
I don't think that's a net bad," Laine Nooney, a technology historian at NYU, told me. If that's a silver lining, it's not a particularly good one. TrendForce, a consumer-research firm, anticipates that laptop prices will rise by more than a third in the next few years. Computers under $500 will be extinct by 2028, according to a report from Gartner. Put differently, cheaper computers may fall off the map. "The $300 Chromebook and the $150 Android phone were products of a specific era -- one where memory was cheap because nobody else was competing for it at this scale," Nate Jones, an AI analyst, told me. "That era is ending." The consequences are global. All of this will be felt acutely in poor countries, where sub-$150 smartphones are especially popular. Some people may have no choice but to revert to flip phones, potentially cutting them off from essential apps and services. "You can't build a gaming PC? Cool story, bro," Wang, the smartphone analyst, said. "But then people in Africa can't get a device which is crucial for their lives." So much money is going into the AI build-out that it is already reshaping the physical world. The data centers that are sprouting up across the United States are at least partly to blame for rising utility bills. And now people who may never have heard of Claude or asked ChatGPT for homework help will feel the effects of RAMageddon. Hospitals have shelved plans to install touch screens that display medical charts and let patients order food, because the displays contain RAM, Rachael England, a manager at Vizient, a consulting firm that works with many U.S. hospitals, told me. Josh Bauman, the director of technology for a public-school district in Missouri, told me that if RAM prices keep increasing, his district may rethink buying a Chromebook for every student. For the foreseeable future, no one can escape the AI tax.
[5]
Google TurboQuant breakthrough rattles memory chip stocks
Shares of memory hardware producers took a hit this week following Alphabet's announcement of a technology designed to drastically lower the working memory requirements for artificial intelligence models. South Korean markets saw Samsung drop by nearly 5 percent, and SK Hynix lost 6 percent. Kioxia, a manufacturer of flash storage based in Japan, experienced a stock decline of almost 6 percent. Wednesday's trading session in the United States yielded downward movement for shares of both Sandisk and Micron. Google Research published the technology on March 24. The algorithm operates without degrading model precision, focusing its compression on the key-value cache -- the area responsible for retaining historical calculations to bypass redundant processing. According to the researchers, performance on tasks such as code generation, question answering, and text summarization remained fully intact despite the cache storage shrinking by a factor of at least six. Comparisons quickly emerged between this development and the industry-wide shockwaves caused last year by DeepSeek, a China-based AI firm. Posting on the social media platform X, the head of Cloudflare, Matthew Prince, likened the new algorithm to "Google's DeepSeek." He added that the industry still has vast potential to improve "speed, memory usage, power consumption, and multi-tenant utilization" when it comes to artificial intelligence inference. Analysts cautioned against reading too much into the sell-off. Addressing CNBC, SemiAnalysis researcher Ray Wang pointed out that alleviating technical constraints frequently paves the way for advanced models that ultimately demand increased hardware support. "When the model becomes more powerful, you require better hardware to support it," he said. 
The recent drop in share prices is likely the result of shareholders cashing out after a period of sustained growth in a cyclical market, Quilter Cheviot technology research lead Ben Barringer explained to CNBC. TurboQuant "added to the pressure, but this is evolutionary, not revolutionary," he said. "It does not alter the industry's long-term demand picture." The algorithm has limits. A TechCrunch analysis noted the technology offers no relief for the massive RAM needed for AI model training, as it strictly compresses data during the inference stage. Currently, the compression tool lacks widespread deployment and exists purely as a laboratory development. An analysis published by Forbes theorized that decreasing hardware barriers might actually accelerate localized artificial intelligence projects, a shift that could paradoxically drive up total long-term chip consumption. Details of the algorithm are slated for a formal presentation at the upcoming ICLR 2026 event in April.
[6]
Report claims OpenAI spending cuts have 'hit' memory prices but there's little evidence right now of cheaper PC components
The UK's Telegraph newspaper is claiming that "spending cuts at OpenAI have hit memory chip prices." That sounds like potentially good news on several levels. But does it stack up? As we reported, OpenAI shuttered its Sora video-generation tool last week. The AI outfit also cancelled a multi-billion-dollar deal with Oracle to extend the Stargate data center in Texas. Broadly, OpenAI is seen as cutting back on costs. Whether that's because the money is actually running out or, in fact, OpenAI wants to polish its books before a stock market flotation that's been mooted for later this year is an open question. But spending at OpenAI has undeniably been cut, and that likely means at least some reduction in demand for memory chips, which has been running at record levels. The Telegraph points out that market analyst TrendForce has tracked memory chip prices rising by 700% over the past year. That's obviously been reflected in ballooning costs for PC components like DDR5 RAM kits. But the Telegraph notes that DDR5 memory kits on Amazon have dropped by as much as $100 from their AI-fuelled peak. Scanning through some of the kit prices I tracked late last year, the picture is mixed. This Corsair 32 GB kit, for instance, is now $370, down from the $410 it hit in December. This Kingston 16 GB kit, meanwhile, peaked at $350 and can now be bought for $261, though it has often been unavailable. However, if you observe the price trend of the 32 GB version of that Kingston kit, you'll see that the price has been essentially oscillating between $657 and around $515 since February, with it mostly being listed at the lower price. Certainly, it's hard to look at the historic price graph for that kit and conclude that the price is now falling. To be honest, you wouldn't expect that even if OpenAI's recent cutbacks are having an impact on memory chip prices. It would take longer than that to feed into PC memory kit prices. 
Of course, equally it's hard to know how much of current price rises for computing hardware are down to real supply and demand constraints as opposed to panic buying and price gouging. If it's largely the latter, then it might not take much more than a change in market sentiment thanks to OpenAI's cutbacks to see memory pricing tumbling. However, if the price rises are more structural than sentiment-based, then it will likely take more than OpenAI becoming more circumspect with its spending to normalise the DDR5 market. Indeed, the narrative of late has been about an expansion of the supply crunch to include CPUs as the AI industry moves from its initial development and training phase into compute loads that lean more toward inference and agentic work. That will shift some of the details around component demand. But not memory. Whether you are training, inferencing, or inferencing at the behest of agentic models, you're going to want plenty of memory. Heck, another story doing the rounds is that automated cars and robots will soon be gobbling up 300 GB a pop, implying that the memory crunch is set to get worse, not better. On the other hand, Google reckons its new AI algorithm reduces memory demand by 6x, so there's that. All of which means that the Telegraph has probably jumped the gun on this one. There are just so many other demand-driving vectors in flight right now when it comes to the AI boom, it would certainly be premature to conclude that one single factor -- OpenAI reducing spending -- was going to have a tangible impact. For now, then, let's just say the guidance probably remains the same. This whole mess still looks likely to get worse before it gets better, especially with conflict in the Gulf adding another hugely volatile and potentially damaging variable into the mix.
[7]
Google TurboQuant AI Compression Triggers Market Concerns Over DRAM Demand
Google has unveiled a new AI memory compression technology called TurboQuant, and the announcement has already had a measurable impact on the semiconductor market. The technology is designed to reduce the memory footprint of AI models during inference, specifically targeting the Key-Value (KV) cache, a component that expands rapidly as models process longer input sequences. TurboQuant applies vector quantization techniques to compress KV cache data while maintaining computational accuracy. According to Google, the system can reduce memory usage by up to six times compared to standard 32-bit representations, while also improving inference performance by as much as eight times. The company states that this is achieved without requiring additional model training or fine-tuning, which is a key factor in its potential for broad adoption. At the core of the implementation are two proprietary methods: PolarQuant, a quantization algorithm optimized for preserving data fidelity at reduced precision, and QJL, a training optimization approach that ensures stability and accuracy during inference. Together, these technologies enable KV cache compression down to 3-bit precision, significantly lowering the memory requirements for large-scale AI workloads. The market response was immediate. Shares of Samsung Electronics fell 4.8%, SK Hynix declined 5.9%, and Micron Technology dropped 3.4%. Investors reacted to the possibility that improved memory efficiency at the software level could reduce the need for high-capacity DRAM in AI deployments, particularly in inference-heavy environments. However, the long-term implications remain uncertain. While TurboQuant introduces a method to reduce per-workload memory requirements, overall demand for memory continues to be driven by the rapid expansion of AI infrastructure, increasing model complexity, and broader deployment across industries. 
Analysts suggest that while compression technologies may improve efficiency, they are unlikely to fully offset the structural growth in memory demand. Performance gains are no longer driven solely by hardware scaling; software-level optimizations are playing an increasingly important role. Technologies like TurboQuant could influence how data centers allocate resources, potentially reducing memory pressure in some scenarios while enabling more efficient utilization of existing hardware.
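The mechanics described above -- trading 32-bit floats for a few bits per value in the KV cache -- can be illustrated with a toy quantizer. This is a generic symmetric-quantization sketch, not Google's actual PolarQuant or QJL method, and the function names are my own:

```python
import random

def quantize(row, bits=3):
    """Quantize a list of floats to `bits`-bit signed integers.

    A generic symmetric quantizer for illustration only (not Google's
    method): one float scale per row maps values onto a small integer
    grid, so each value needs `bits` bits instead of 32.
    """
    levels = 2 ** (bits - 1) - 1                    # 3 bits -> integers in [-3, 3]
    scale = max(abs(v) for v in row) / levels or 1.0  # guard the all-zero row
    return [round(v / scale) for v in row], scale

def dequantize(q, scale):
    return [v * scale for v in q]

# A fake "key" vector, standing in for one KV-cache entry.
key = [random.gauss(0, 1) for _ in range(8)]
q, scale = quantize(key)
restored = dequantize(q, scale)

# Rounding error is bounded by half of one quantization step.
worst = max(abs(a - b) for a, b in zip(key, restored))
assert worst <= scale / 2 + 1e-9
print(f"3-bit codes: {q}, worst-case error: {worst:.3f}")
```

At 3 bits per value plus one scale per row, raw storage falls by roughly an order of magnitude before metadata overhead, which is consistent with the "at least 6x" figure the articles quote; production systems quantize per attention head and handle outlier values far more carefully than this sketch does.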
[8]
Google's TurboQuant cuts AI working memory by 6x, but it won't fix the global RAM shortage
TL;DR: Google developed three AI compression algorithms (TurboQuant, PolarQuant, and Quantized Johnson-Lindenstrauss) that reduce large language models' KV cache memory by at least six times without losing accuracy, enabling efficient AI inference on consumer devices while potentially increasing overall memory demand due to wider AI deployment. Google has developed three AI compression algorithms designed to reduce the memory footprint of large language models without sacrificing performance and quality. Published on Google Research, the tech is described as a way to shrink AI's working memory, known as the "KV cache", by using a form of vector quantization. The company plans to present its findings at the ICLR 2026 conference next month, along with the three algorithms making this possible, namely TurboQuant, PolarQuant, and Quantized Johnson-Lindenstrauss. TurboQuant would allow AI to remember more information while taking up less space and maintaining accuracy. There is a lot more detail in the Google Research article on how the compression technology works, but the results are what's exciting. Google evaluated all three algorithms across a range of standard long-context benchmarks, including LongBench, Needle in a Haystack, ZeroSCROLLS, RULER, and L-Eval, using the open-source Gemma and Mistral LLMs. The results show that TurboQuant could make AI cheaper to run, reducing its runtime working memory by "at least 6x" while maintaining strong performance across the board. This is good news, but not for RAM prices. This working memory has nothing to do with AI data centers requiring fewer resources. Instead, the aim is to address memory overhead in the KV cache for LLMs. This cache stores conversational context as users interact with AI chatbots and grows the more you use the model. That translates to reduced memory requirements in AI inference workloads, making it easier for LLMs to run on consumer smartphones or mid-range laptops. 
It's similar to how DeepSeek R1 was so efficient that it could run on a single GPU. Since TurboQuant targets inference memory, and not training, where the real hardware crunch is happening, it won't ease the broader RAM shortage driven by AI development. At least not directly. There's also a less comfortable angle to consider. Agentic AI -- systems capable of performing tasks autonomously -- is already around the corner. With such compression tech letting those systems run on lower-spec hardware, it could accelerate the AI push significantly. More deployment means more demand for training new models, which loops back to more pressure on the memory supply, not less. This means that a more efficient inference method, like what we are seeing here, could actually drive overall memory demand higher in the long run. With that said, TurboQuant is still a lab result. It hasn't been deployed broadly. For now, the broader memory crisis shows no signs of slowing down, with AI data centers already straining CPU supply and forcing Intel and AMD to raise CPU prices by up to 15%.
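To see why the KV cache balloons with conversation length, it helps to put rough numbers on it. A back-of-the-envelope sketch; the model dimensions below are illustrative placeholders, not any particular LLM's:

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_value=4):
    """Rough KV-cache footprint for one conversation.

    Every layer stores one key and one value vector per token per
    attention head, so the cache grows linearly with context length.
    All dimensions here are illustrative, not a specific model's.
    """
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_value

# Compare full-precision cache sizes with the claimed "at least 6x" cut.
for tokens in (4_000, 32_000, 128_000):
    full_gib = kv_cache_bytes(tokens) / 2**30   # GiB at 32-bit precision
    print(f"{tokens:>7} tokens: {full_gib:6.1f} GiB -> "
          f"{full_gib / 6:5.1f} GiB at 6x compression")
```

Even with these made-up dimensions, a 128K-token context at full precision would dwarf the RAM of a phone or mid-range laptop, which is why a 6x cut in cache size matters so much for on-device inference.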
[9]
Google says its new algorithm reduces AI memory overhead by 6x which could be good news for the RAMpocalypse but bad news for Micron and co
Stock prices for the big three memory makers have already slid. Other than the AI bubble bursting or hype dying down, the other thing that could allow the RAMpocalypse to ease off is a technological change that leads to a dramatic reduction in how much memory AI needs. To that end, Google has cooked up TurboQuant, a new compression algorithm that promises to reduce memory demand by about 6x. And memory maker stock prices have already dropped, likely as a result of this. Although we should resist being reductive and assuming Google's new algorithm is responsible for these market changes -- lest we forget the effects on crucial material availability thanks to the war in Iran -- a 6x claimed reduction in memory demand must surely account for at least some of it. According to Google, TurboQuant "optimally addresses the challenge of memory overhead in vector quantization" and "achieves a high reduction in model size with zero accuracy loss." In other words, it makes vector compression -- which is critical for AI models understanding and processing information, as they do so using vectors -- require less memory than it has until now, and crucially without the normally associated loss of accuracy from compressing things down. The basic idea, to exclude lots of details and simplify greatly, seems to be a shift from calculating things in terms of standard vectors and instead using a more absolute reference system. Which, to my non-mathematical ears at least, sounds a bit like moving away from vectors: "Instead of looking at a memory vector using standard coordinates (i.e., X, Y, Z) that indicate the distance along each axis, PolarQuant converts the vector from a Cartesian coordinate system into polar coordinates. This is comparable to replacing 'Go 3 blocks East, 4 blocks North' with 'Go 5 blocks total at a 37-degree angle'." This, ultimately, means no need for data normalisation, which should "eliminate the memory overhead that traditional methods must carry." 
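The street-directions analogy in that quote is just the standard Cartesian-to-polar change of coordinates, which is easy to sketch. The compass-bearing convention below (degrees clockwise from North) matches the quoted example; the function name is my own, and this shows only the coordinate change, not PolarQuant's quantization itself:

```python
import math

def to_polar_bearing(east, north):
    """Convert 'blocks East / blocks North' into a total distance and a
    compass-style bearing (degrees clockwise from North)."""
    distance = math.hypot(east, north)
    bearing = math.degrees(math.atan2(east, north))
    return distance, bearing

# "Go 3 blocks East, 4 blocks North" becomes
# "Go 5 blocks total at a 37-degree angle".
distance, bearing = to_polar_bearing(3, 4)
print(f"{distance:.0f} blocks at {bearing:.0f} degrees")
```

Storing (distance, angle) instead of (x, y) factors the magnitude out of each vector, which is the property the Google quote credits with eliminating the need for separate normalisation data.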
Google has put the new algorithm through its paces in a bunch of benchmarks, and the results, according to the Big G, at least, show that "TurboQuant achieves perfect downstream results across all benchmarks while reducing the key value memory size by a factor of at least 6x." Again according to Google, the results also "demonstrate a transformative shift in high-dimensional search... [allowing] for building and querying large vector indices with minimal memory, near-zero preprocessing time, and state-of-the-art accuracy." Naturally, such a big improvement, if true, could drastically change the AI server market. Which means it could change the amount of memory that AI companies want to buy from Micron, SK Hynix, and Samsung. As identified by investor boffins, that is indeed what the market is predicting we will see, because stock prices for the big three memory makers have already dropped. Samsung's, for instance, has dropped by about 8% over the past couple of days, SK Hynix's by about 11%, and Micron's by about 10%. And that's all after factoring in a slight rebound today. If AI companies need to buy less memory, that will of course raise the amount of supply open to the general consumer market, including for gaming PCs, laptops, handhelds, and other goodies. Which should in theory mean memory gets cheaper. This highlights the stark difference between what's good for memory makers and what's good for us end users. It's something I've noticed a lot over the past few months: the less stock and the more demand there is, the happier memory investors and analysts seem to be, and the unhappier consumers are. Which is obvious, of course, but it's interesting to see it go the other way for a change. We shouldn't consider anything a done deal, though. After all, Micron has already said there is "demand significantly in excess of our available supply for the foreseeable future." 
So for all we know, much of the 'freed up' memory production could just go straight back into AI server racks rather than our PCs. There's also the fact that compute-hungry AI folk will likely just end up running larger models if the memory requirements drop. But there's a chance, at least, and I'll keep clutching at it.
[10]
Is The RAM AI-pocalypse Finally Over? Probably Not
RAM manufacturers' stock prices are falling across the board this week, but it's too early to break out the celebratory post-AI-bubble-popping champagne. It's been a big week for AI haters. OpenAI's video platform Sora shut down on March 24 (following a report that it was allegedly losing $1 million every day), some DDR5 RAM prices are currently down (a development which has been tied to the announcement of Google's "extreme compression" tech TurboQuant), and an AI Olaf robot just died in front of Disneyland Paris visitors -- and hey, it's only Tuesday! Yet despite all this, the story that's seemingly getting the most traction among the anti-AI crowd concerns stock prices. Exciting segue, I know. Over the past month, all of the top players in the RAM manufacturing industry have seen significant decreases in their stock valuations, and many are treating the news as the first real sign that the AI bubble may be set to finally burst. "Turns out Sam Altman 'buying up' 40% of DRAM wafers was actually him writing Letters of Intent," wrote YouTuber Hardware Canucks in a post on X. "Letters he supposedly had / has no intention of converting to actual purchases now. And memory manufacturers are just getting DUMPED on today." For the most part, this is true. As of this writing, Samsung Electronics' stock price is down 19.68 percent this month, Micron Technology's is down 20.72 percent, and SK Hynix is down 14.06 percent. But is this truly the beginning of the end for the AI sector, and will RAM prices continue to dip as a result? To properly answer that question, we need to break down every factor at play here. Let's start with Hardware Canucks' post. The "Letters of Intent" part is multifaceted, but it starts back in September of 2025.
Samsung Electronics and SK Hynix signed letters of intent with OpenAI, stating that the RAM manufacturers intended to supply the ChatGPT developer with roughly 40 percent of the entire global output of DRAM for use in its "Stargate Project" data centers. There's a lot to go over here, but the part that's seemingly gotten lost in the sauce is that a letter of intent is a precursor to a formal agreement. In short, the DRAM wasn't actually purchased; OpenAI signed a non-legally-binding document stating that it intended to purchase it. It's important to note here that this announcement had a huge effect on the stock valuations of RAM manufacturers. For example, from September 5, 2025 to March 5, 2026, SK Hynix's stock valuation shot up by a whopping 238 percent. Earlier this month, Bloomberg reported that OpenAI and Oracle had supposedly cancelled the expansion of the Stargate Project's datacenters in Abilene, Texas. According to Bloomberg's report, this was due to OpenAI's financing woes and its "often-changing demand forecasting." In a post on X on March 9, Oracle (which partnered with OpenAI for the Stargate Project) stated that "recent media activity about the Abilene site are false and incorrect." Turns out, Bloomberg was right on the money -- at least where OpenAI is concerned. Crusoe Energy Systems, the AI infrastructure developer behind the Stargate Project's construction, confirmed on March 27 that Microsoft is building a new AI factory in Abilene, Texas, which will be "located adjacent to Crusoe's existing Abilene AI factory infrastructure." According to AP News, Microsoft's new factory is "on the same tract of land," situated "right next to where Crusoe has been building an even larger computing campus for OpenAI and Oracle." Crusoe is skirting around outright confirming it, but the timing and location more than imply that Microsoft is leasing the space that OpenAI no longer wants (or, rather, the space it can no longer take).
With all this in mind, surely this means that OpenAI's Stargate Project is in some serious trouble, right? It's not quite that simple, unfortunately. Microsoft and OpenAI signed a multiyear partnership deal back in 2019, which, in 2025, Microsoft confirmed had been (partially) extended up until 2032. Microsoft is entitled to 20 percent of OpenAI's revenue until said agreement expires, so it's very much in the company's interest to make sure that OpenAI doesn't kick the bucket. Microsoft hasn't just moved in next door to its competitor; it's leased land next to its business partner, and likely eased some of OpenAI's logistical and financial issues in the process. Still, it's another example of OpenAI's longstanding habit of writing checks that can't be cashed. OpenAI reportedly lost roughly $5 billion in 2024, somewhere between $8 billion and $12 billion in 2025 and, according to The Information, it's projected to lose $14 billion in 2026. The funny thing is that these losses are just a drop in the bucket because, as of 2025, OpenAI owes almost $100 billion in loans. There's also the matter of Nvidia's "strategic partnership" with OpenAI. On September 22, 2025, Nvidia announced that it would be investing up to "$100 billion in OpenAI" in installments. Nvidia planned to progressively invest this money to help "OpenAI to build and deploy at least 10 gigawatts of AI data centers." It's an oddly back-to-front deal, but Nvidia was essentially giving OpenAI money to buy Nvidia's microchips. But, once again, the deal was simply a "letter of intent." Nvidia's deal was technically separate from Samsung and SK Hynix's deal with OpenAI, just made in the same timeframe. I say "was" because, after months of speculation, Nvidia's CEO, Jensen Huang, all but confirmed on March 4 that the deal had fallen through.
As reported by Reuters, Huang stated during the Morgan Stanley Technology, Media and Telecom conference that Nvidia's previous $30 billion investment in OpenAI was likely to be the last time it could "invest in a consequential company like this." This is because OpenAI is reportedly set to become a publicly traded company in 2026, which means Nvidia's "letter of intent" deal will no longer be fulfilled. That means that the $100 billion total investment, and the microchips, are no longer on the table. There's a good reason OpenAI is often referred to as a "black hole," as far as investments are concerned. OK, but now we're exactly back to where we were four paragraphs ago; why doesn't this mean that RAM prices are going to fall? By all accounts, OpenAI is set for a make-or-break 2026, and if it breaks, those unfulfilled letters of intent, which already aren't legally binding anyway, will release hundreds of thousands of RAM modules into the world... right? The problem is that OpenAI's letters of intent aren't entirely to blame for the dips in stock valuations. While the likes of Bloomberg and Fast Company note that investors' AI-bubble-related fears are fueling the downward turn, Google's TurboQuant announcement and the US-Iran war are also being factored into the dip. But here's the catch: according to Morgan Stanley analyst Joseph Moore (per Investor's Business Daily), all of these factors, from TurboQuant to the AI bubble popping, aren't indicative of a decrease in RAM demand. It's just a classic case of buy low, sell high, and many investors are likely selling pre-dip, so they can buy during the dip. "We see the recent sell-off as a healthy pricing in of durability concerns -- capex, demand destruction, productivity, etc. -- and yet we see the strength as more durable than the market thinks, with memory supply remaining a gating factor for AI," Moore revealed.
"But our view is that looking for sell signals from prior cycles misses the point." "[Memory] shortages are intensifying and customers are prepaying for large-volume deals given conviction that these shortages will be sustainable," Moore continued. "Our take, after talking to industry folks on this today, is that this is an evolutionary development, with basically no surprises for memory." The Microsoft/OpenAI datacenter lease in Abilene, Texas, is an example of this in microcosm. Let's say OpenAI goes bankrupt next month. What happens then? Do all of the microchips and RAM that have already been purchased, and are already housed within said datacenters, just get thrown into landfills, free for the taking? No. They get purchased by a bigger fish, probably for less than what OpenAI bought them for in the first place, and all these letters of intent will get passed on. As far as the shareholders are concerned, these RAM prices have to be maintained for as long as humanly possible. This doesn't end with OpenAI. Its demise doesn't mark the final chapter in the AI bubble. The vultures will swarm and pick its carcass clean, and then a new AI overlord will take its place. And then they'll own all the RAM. There are already half a dozen frontrunners set to cannibalize its corpse. Anthropic's Claude, Google's Gemini, Microsoft's Copilot -- the list, sadly, goes on and on and on. Sundar Pichai, the head of Google parent firm Alphabet, stated in an interview with the BBC that, should the AI bubble pop, "no company is going to be immune, including us." To these AI investors, there is too much money and too much risk involved in letting that happen. Instead, we'd probably be wiser to consider OpenAI's potential downfall as a first chapter of sorts. A stepping stone, or, rather, the first in a line of dominoes.
[11]
Micron Stock's Rally Looked Unstoppable -- Until Google's TurboQuant Hit - Amazon.com (NASDAQ:AMZN), Alphabet (NASDAQ:GOOGL)
For the past six sessions, that trade has been unraveling in a way the demand models did not anticipate. The proximate trigger was Alphabet Inc. (NASDAQ:GOOGL)'s announcement of TurboQuant on Tuesday, March 24 -- an AI memory compression algorithm that rattled the entire memory sector. Shares of Micron Technology have fallen in each of the past six sessions -- tumbling by over 20%. That is the stock's worst multi-session performance since April 2025's tariff shock selloff. The debate now is whether this is the dip to buy or a bullish thesis that just permanently broke. What TurboQuant Actually Does (And Doesn't Do) The algorithm is called TurboQuant. Developed by Amir Zandieh, a research scientist at Google, and Vahab Mirrokni, a VP and Google Fellow, it compresses the key-value cache -- the high-speed memory store that allows an AI model to retrieve past calculations without reprocessing them -- to just 3 bits per value, down from the standard 16. An open-source release is expected in the second quarter of 2026. In plain terms: every AI inference workload runs against a KV cache that grows with context length. TurboQuant compresses that cache, which means the same amount of high-bandwidth memory can serve more simultaneous users, handle longer contexts, or run a larger model than was previously possible. However, there are notable technical boundaries here. The algorithm is, as of today, a laboratory result. It will be formally presented at the International Conference on Learning Representations (ICLR) 2026 in late April. It has not been deployed at production scale across any major AI infrastructure stack. Analysts Say This Is Not Altering The Memory Trade Wells Fargo TMT analyst Andrew Rocha acknowledged the demand threat directly.
"TurboQuant is directly attacking the cost curve here," Rocha said in an investor note Wednesday, adding that lower memory specifications per AI workload quickly raise the question of how much total capacity the industry actually needs. Yet Rocha stopped short of a bearish conclusion -- noting that the demand destruction scenario requires broad adoption, which has not yet occurred. "It does not alter the industry's long-term demand picture," he said. Morgan Stanley pushed back on the selling theme. The bank's semiconductor analyst Shawn Kim called the stock reaction excessive and argued that TurboQuant could ultimately benefit memory makers over the longer term -- lower inference costs reduce the per-token cost of running AI services, which historically drives wider adoption rather than demand compression. The bank invoked the Jevons Paradox: efficiency gains in a resource-constrained market tend to increase total consumption, not reduce it. The historical precedent is stark. JPEG compression didn't reduce demand for camera storage. Video codecs didn't reduce hard drive demand -- they enabled 4K streaming that drove it higher. DeepSeek R1's efficiency breakthrough in January 2025 triggered a similar selloff in Nvidia and memory stocks; within two quarters, AI capex commitments from hyperscalers hit record highs. The selloff proved to be an entry point, not a cycle turning point. Vivek Arya, semiconductor analyst at BofA Securities, made the most direct case against the demand destruction thesis. In a note published Thursday, he said that similar compression techniques have been in circulation since 2024-25 -- Nvidia alone has published four distinct KV cache efficiency methods over the past twelve months -- without altering hardware procurement at scale. The more telling evidence, he said, sits in Google's own spending plans.
Despite publishing TurboQuant, Google raised its CY26 capital expenditure outlook to approximately $180 billion, up 100% year-over-year and well above the prior consensus of roughly $127 billion. "The 6x improvement in memory efficiency," Arya said in the note, is likely to produce a "6x increase in accuracy and/or context length, rather than 6x decrease in memory." He maintained a $500 price target on Micron, noting the stock is now trading at the low end of its historical 5-10x forward price-to-earnings range. Ben Barringer, technology research lead at Quilter Cheviot, offered the same framing. TurboQuant added to the pressure on memory stocks, Barringer said, but the technology is evolutionary rather than revolutionary and does not alter the sector's long-term demand outlook. Andrew Jackson, technology analyst at Ortus Advisors, noted in a research note that the TurboQuant development may make "little difference to demand given the extreme supply constraints" that currently characterize the AI memory market. What This Means For Investors The near-term beneficiaries of broader TurboQuant adoption are hyperscalers -- cheaper inference costs improve return on infrastructure investment -- and AI startups that can run larger models on smaller hardware budgets. Notably, Nvidia is not a loser under this scenario; GPUs do not become less necessary when memory efficiency improves, they become more cost-effective per dollar of inference output, potentially accelerating adoption in markets previously constrained by cost. For Micron and SanDisk, the calculus is more complicated. Both stocks entered 2026 pricing in the assumption that AI memory demand would scale linearly with model size and context length -- an assumption TurboQuant now complicates, even if the full adoption path remains years away. The market appears to be pricing in not the near-term adoption scenario, but the existence of a credible software pathway to lower memory intensity.
That is a different and harder assumption to dismiss. Micron's recent selloff is already notable. Whether it represents a structural repricing or an overreaction to a lab result not yet in production is a question the next earnings cycle -- and ICLR 2026 -- will begin to answer. Market News and Data brought to you by Benzinga APIs.
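The "serve more simultaneous users" claim above is easy to make concrete. The numbers below are loudly hypothetical (an 80 GB device and an assumed per-token cache cost, not measurements of any real system); the point is only the shape of the arithmetic.

```python
def sessions_per_device(hbm_bytes, context_tokens, kv_bytes_per_token):
    """How many users' KV caches fit in one accelerator's memory at once."""
    return hbm_bytes // (context_tokens * kv_bytes_per_token)

HBM = 80 * 10**9                  # 80 GB device (illustrative)
CTX = 32_000                      # context tokens held per user
FP16_TOKEN = 512_000              # hypothetical KV bytes per token at 16 bits
Q3_TOKEN = FP16_TOKEN * 3 // 16   # the same cache at 3 bits per value

print(sessions_per_device(HBM, CTX, FP16_TOKEN))  # 4 concurrent sessions
print(sessions_per_device(HBM, CTX, Q3_TOKEN))    # 26 concurrent sessions
```

Same silicon, over five times the concurrent load -- which is why analysts frame the question as throughput per chip rather than chips per workload.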
[12]
Here Is The Unvarnished Truth About Google's TurboQuant: Jevons Paradox Prevails, Memory Crunch To Continue
Google's new algorithm that dramatically compresses the KV cache in a lossless fashion, dubbed TurboQuant, is all the rage these days in the AI sphere, where doomsday predictions about an imminent collapse in the demand for memory abound. Never mind the fact that the underlying paper was released all the way back in April 2025! Even so, we postulate that the current doom-and-gloom in the market is eerily similar to the one that prevailed immediately after DeepSeek released its R1 model in early 2025, and that the Jevons paradox will prevail. Before going further, let's first discuss what TurboQuant actually does. Consider a scenario: you are writing a story but are hampered by terrible short-term memory. Whenever you write a new word, you are compelled to reread whatever you've written so far just to remember what has already been inked. Obviously, as the text length increases, so does this laborious process. A key-value (KV) cache is similar to taking notes on a separate sheet so that you remain abreast of what has been written so far. This speeds up the entire process by orders of magnitude. Google's TurboQuant compresses this KV cache for a given AI model by up to 6x, thereby speeding up the underlying model by up to 8x. What's more, TurboQuant is able to do so with zero accuracy loss. Now that we've discussed what TurboQuant actually does, let's go over all of the recent doom-and-gloom surrounding this breakthrough. Basically, investors in high-flying memory stocks now fear that this algorithm will dampen the oncoming demand for memory resources just as major players start to embark on capacity expansion. What many people have failed to grasp is the fact that TurboQuant does not actually compress model weights, which often dwarf the KV cache in large deployments. This means that the model size remains the same.
The algorithm dramatically improves inference-related economics for data centers by allowing for an increase in a given model's context window (number of tokens), or by enabling a smaller number of GPUs to handle the same number of users. Far from decreasing the demand for memory resources, this development actually invokes the Jevons paradox, which postulates that a technology's use increases as its operating cost decreases. Consequently, it would be facile to believe that the ongoing memory crunch will end anytime soon. Finally, the interplay with the Jevons paradox also means that we should not expect the ongoing upheaval in the consumer electronics sphere, especially the memory chipflation-driven price increases for smartphones, to moderate in the near future.
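For intuition about what compressing cached values down to a few bits looks like mechanically, here is plain uniform quantization -- a toy stand-in chosen here for illustration, not TurboQuant's actual algorithm (which Google describes as considerably more sophisticated, with near-zero accuracy loss):

```python
import numpy as np

def quantize(x, bits=3):
    """Map floats to integer codes in [0, 2**bits - 1] plus a scale and offset."""
    levels = 2**bits - 1
    lo = x.min()
    scale = (x.max() - lo) / levels
    codes = np.round((x - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return codes * scale + lo

rng = np.random.default_rng(0)
x = rng.standard_normal(1024).astype(np.float32)
codes, lo, scale = quantize(x)

# With only 8 levels, the worst-case error is half a quantization step.
# Naive quantization this coarse visibly degrades model quality, which is
# exactly why an error-correcting scheme like TurboQuant's is needed.
err = np.abs(dequantize(codes, lo, scale) - x).max()
print(f"max reconstruction error: {err:.3f} (step size {scale:.3f})")
```

The storage win is the same 16-to-3-bit ratio regardless of the scheme; the hard part, and TurboQuant's claimed contribution, is getting that win without the accuracy cost a naive quantizer pays.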
[13]
TurboQuant Panic: Why Market Is Wrong About Google's Newest AI Breakthrough - Alphabet (NASDAQ:GOOG), Alphabet (NASDAQ:GOOGL)
However, analysts pushed back on the bearish reaction, arguing the technology is more likely to expand AI use cases and ultimately drive higher long-term demand -- framing the pullback as a potential "buy the dip" opportunity for investors. TurboQuant Triggers Sharp Sell-Off in Memory Stocks Google introduced TurboQuant, an algorithm that reduces memory usage in key-value caches by 6x through extreme compression. Efficiency Gains Raise Demand Concerns Investors reacted to fears that improved AI efficiency could weaken demand for memory hardware. TurboQuant's ability to significantly cut memory requirements raised concerns that fewer chips would be needed to run large language models, pressuring sentiment across the semiconductor space. Analysts See Long-Term Upside Analysts pushed back on the bearish view, arguing the development could expand the AI market, SCMP reported on Friday. Morgan Stanley's head of Asia technology research, Shawn Kim, told SCMP that TurboQuant increases throughput per chip and lowers inference costs, which could expand AI adoption. He explained that efficiency gains may actually drive higher overall demand by making AI cheaper and more accessible. "TurboQuant is less about incremental optimisation and more about shifting the cost curve of AI deployment," Kim said. "Models that need cloud clusters can fit on local hardware, effectively lowering the barrier to deploying AI at scale. More applications become viable, more models remain active and utilisation of existing infrastructure improves." Semiconductor expert Lennart Heim also told SCMP that demand for memory and chips has continued to rise despite ongoing efficiency improvements, suggesting the market reaction may be short-lived. Technical Analysis Alphabet is trading 7.7% below its 20-day SMA and 10.1% below its 100-day SMA, which keeps the near-term trend pointed down even though the stock remains 6.4% above its 200-day SMA (a sign the longer-term uptrend hasn't fully broken). 
Shares are up 73.15% over the past 12 months, and they're currently positioned closer to their 52-week highs than lows. RSI is at 27.56, which puts the stock in oversold territory and often signals that selling pressure may be getting stretched. MACD is at -5.8217 versus a signal line of -4.0757, keeping momentum bearish as the indicator remains below its trigger line. The combination of oversold RSI (below 30) and bearish MACD suggests mixed momentum. Key Resistance: $312.50 Key Support: $270.50 Earnings & Analyst Outlook Looking further out, the next major catalyst for the stock arrives with the April 23, 2026 (estimated) earnings report. EPS Estimate: $2.67 (Down from $2.81 YoY) Revenue Estimate: $100.74 Billion (Up from $90.23 Billion YoY) Valuation: P/E of 26.0x (Indicates premium valuation relative to peers) Analyst Consensus & Recent Actions: The stock carries a Buy Rating with an average price target of $375.31. Recent analyst moves include: Needham: Buy (Maintains Target to $400.00) (March 13) Wells Fargo: Upgraded to Overweight (Raises Target to $387.00) (February 23) Tigress Financial: Strong Buy (Raises Target to $415.00) (February 19) Top ETF Exposure Significance: Because GOOGL carries such a heavy weight in these funds, any significant inflows or outflows will likely trigger automatic buying or selling of the stock. Price Action GOOGL Price Action: Alphabet shares were down 0.33% at $280.00 during premarket trading on Friday, according to Benzinga Pro data.
[14]
Google's TurboQuant: Opportunity or crisis for memory semiconductor market?
Google's brick-and-mortar store in Chelsea, New York. (Reuters/Yonhap) Google's announcement of artificial intelligence technology that utilizes memory more efficiently jolted share prices for memory chipmakers like Samsung Electronics and SK Hynix. Shaky share prices reflect market jitters about the current AI "super cycle" boom for memory chips potentially ending sooner than expected. However, recalling how Chinese AI startup DeepSeek's low-cost, high-efficiency model ended up intensifying investment competition rather than curbing it, some say that worries about Google's new technology are unwarranted. The current hubbub was triggered by a blog post published by Google Research on March 24. The post contained updates on Google's ongoing research regarding its TurboQuant algorithm, which was first unveiled in a research paper published in April 2025. The paper states that TurboQuant reduces the size of the key-value caches of models like ChatGPT and Gemini, thereby reducing the memory required to generate responses quickly. Key-value caching involves reusing previous calculations and responses, which are stored temporarily, to reduce redundancy. It's like having an exclusive notepad of past responses that one can refer to. AI models are growing progressively larger; this creates the problem of requiring more memory. AI proliferation has boosted demand for memory of all sorts, whether it's high-bandwidth memory (HBM) or NAND flash, thereby driving up memory prices. Google's research paper states that TurboQuant condenses standardized data while correcting the errors that arise between compressed data and actual data, reducing memory usage to a sixth of current levels. The paper sparked market concerns about reduced demand for AI-driven memory. The UK's Financial Times published an article on Saturday stating, "Shares in US memory maker Micron have shed more than US$70 billion in market capitalization since last Friday's close, down 15%, amid a broad sell-off on Wall Street."
"Sandisk, the maker of flash memory devices that was the best-performing stock in the S&P 500 last year, lost around US$15 billion in value during the week, while storage companies Western Digital and Seagate each lost billions," the article continued. During the same period, the value of shares for Samsung Electronics and SK Hynix dropped over 8 percent. Some analysts suggest that this was Google's way of insulating itself from the risk of a memory supply shortage by creating its own software, much like how it reduced its dependence on Nvidia chips by producing its own TPUs for AI applications. But some see it differently. "It's interesting, and we're keeping an eye on it, but one article probably isn't going to have any immediate major ramifications for the memory market as a whole," one Korean industry insider said. "Demand to build AI data centers for Big Tech companies continues to grow, and if memory use becomes more efficient and prices drop, demand for AI investments is bound to go up," they said. Financial analysts appear to be viewing the development in a similar light. Many expect a modern-day "Jevons paradox" -- originally used to describe the counterintuitive increase in demand for coal that accompanied improvements in the efficiency of steam engines in the 19th century. While many had expected coal consumption to dwindle as a result of increased fuel efficiency, the widespread adoption of steam engines actually drove demand for coal. "Technological demands are intensifying in unexpected ways like never before, meaning memory makers will need to transition their business model into an 'AI memory business' by integrating software tech into their existing hardware and having clients involved starting at the chip design stage," said Kwon Seok-joon, a professor of chemical engineering at Sungkyunkwan University. The suggestion is that chipmakers will need to go beyond producing and supplying general-use memory by developing AI memory tailored to clients. 
By Park Jong-o, staff reporter; Kang Jae-gu, staff reporter
[15]
Google's TurboQuant unlikely to weaken memory demand: analysts - The Korea Times
An introduction to Google's TurboQuant technology published on the Google Research website / Captured from Google Research Google's announcement of TurboQuant is weighing on the share prices of memory companies, as the technology is expected to cut artificial intelligence (AI) models' memory usage to about one-sixth of current levels. For analysts, however, concerns about memory chip demand may be overblown: they noted that even if memory demand per model declines due to TurboQuant, overall demand for AI continues to grow at a faster pace, keeping the broader memory market on a solid growth trajectory. Announced on Tuesday by Google Research, TurboQuant is a compression technology designed to maximize AI efficiency. The gist of the technology is compressing an AI model's key-value cache memory (KV cache) to just 3 bits, cutting its size more than sixfold. The KV cache is an AI model's short-term memory, where it stores keys and values already calculated so it can generate the next words faster. A sixfold reduction in KV cache size effectively lowers memory usage to about one-sixth of current levels, making similar performance possible even with only one-sixth of the required memory. As AI services become more advanced, the AI industry has been seeking ways to address the growing burden of the KV cache, while demand for AI memory chips such as high-bandwidth memory (HBM) has remained strong. This has even led to discussions on using advanced NAND storage to support HBM. As expectations grew that TurboQuant's commercialization could reduce memory demand, shares of memory companies came under pressure. Samsung Electronics shed 4.71 percent Thursday, while SK hynix fell 6.23 percent. Micron also declined 6.97 percent on the same day. Samsung Electronics and SK hynix extended their losses Friday, falling 0.22 and 1.18 percent, respectively. While TurboQuant could reduce memory use, analysts said concerns over a slowdown in overall memory demand are excessive.
"There have been efforts to improve AI models to optimize chip usage, but more efficient models tend to lower overall costs and, in turn, drive greater demand for AI computing," Samsung Securities analyst Lee Jong-wook said. "Rather than reducing semiconductor demand, such optimized models are being used to deliver higher-performance AI services with the same chip resources." Lee cited the Jevons paradox, in which increased efficiency increases the use of a resource. In AI computing, the paradox means improvements in AI efficiency reduce the cost of computing, ultimately driving much higher demand. "Factors that could lead to a decline in AI memory demand are likely to emerge when AI capabilities get into a stalemate, such as a slowdown in service improvements or weakening competition between AI model developers," Lee said. "As long as AI companies compete on performance rather than cost, optimization will not weigh on semiconductor demand." Hana Securities analyst Kim Rok-ho also said TurboQuant's commercialization will improve cost efficiency for data center operators, driving up the overall demand for AI memory chips. "Compression technologies are not new, and it remains uncertain whether they will be widely adopted across the industry," Kim said. "Even if such technologies become more widely used over the mid to long term, it will lower memory cost barriers, expanding overall AI use. There are limited chances of decline in demand for DRAM and storage."
[16]
TurboQuant And Why The Stock Market Reaction is Irrational
When Google's researchers quietly published details of their new TurboQuant compression technique this week, the reaction in semiconductor markets was swift and punishing. SK Hynix fell as much as 6.4% on the Korea Exchange. Kioxia dropped by the same margin in Tokyo. Micron and Sandisk slid in New York. The logic seemed obvious to traders: if AI models need less memory to run, the companies that make that memory stand to suffer. It is a tidy theory. It is also, almost certainly, wrong. Ask Jevons. The sell-off reflects a misunderstanding baked deeply into how markets react to efficiency gains, a misunderstanding that economists have been correcting since 1865. The relevant framework has a name: the Jevons Paradox. And right now, it is the most important concept in AI investing that most traders appear to have forgotten. What TurboQuant Actually Does TurboQuant is Alphabet's new quantization and compression technology for large language models. According to Morgan Stanley analyst Shawn Kim, as quoted by Yahoo Finance, the algorithm makes AI inference eight times faster while simultaneously requiring six times less memory. For context, memory -- specifically the high-bandwidth memory stacked inside Nvidia's accelerators -- has been one of the primary bottlenecks and cost centers in deploying AI at scale. Reducing that requirement by a factor of six is not a minor tweak. It is a structural shift in the cost curve of AI deployment. The Alphabet unit's breakthrough can limit the amount of memory required to run large language models by at least a factor of six, reducing the overall cost of training and running artificial intelligence systems. That reduction in cost, combined with the eightfold speed improvement, is precisely what sets the Jevons mechanism in motion. The Paradox That Refuses to Die William Stanley Jevons was a 19th-century English economist who observed something counterintuitive about coal.
When the steam engine became dramatically more efficient under James Watt, the reasonable expectation was that Britain's coal consumption would fall. Instead, it soared. More efficient engines made coal-powered production economically viable across a vast range of new applications. The lower cost per unit of output expanded the addressable market so dramatically that total consumption surged. The paradox has recurred throughout industrial history. Think automobiles, PCs, software, tyres, consumer electronics -- pretty much everything that has gone mass-market. In each case, the efficiency gain reduced the marginal cost of the activity, which in turn unlocked latent demand that had been suppressed by that cost. First articulated in 1865 by economist William Stanley Jevons in The Coal Question, the paradox observes that technological improvements in resource efficiency tend to increase, rather than decrease, total resource consumption, since lower cost per unit of output expands the viable use-case universe, stimulating demand that far outpaces the savings achieved per individual deployment. The theory resurfaced in January 2025 when DeepSeek's low-cost AI model triggered fears of reduced demand for Nvidia chips. Those fears proved similarly misplaced as cheaper AI inference expanded the number of companies and applications deploying AI models at scale. Why Memory Demand Will Rise The argument that TurboQuant reduces memory demand rests on an implicit assumption: that the quantity of AI workloads is fixed and the use cases for AI have already been enumerated. But the quantity of AI workloads is not fixed. In fact, it is highly elastic with respect to cost. Today, countless organizations, developers, and researchers are held back from deploying AI at scale not by lack of ambition but by the expense of doing so. Memory is expensive, constrained, and in chronic shortage -- and it is a very significant component of that expense.
Slash it by a factor of six and the economics of previously marginal AI deployments flip from negative to positive. JPMorgan's trading desk cited the Jevons Paradox directly, noting that while investors may take profits on the news, there is no near-term threat to memory consumption. Their analysts understand what the sell-off reflects: a market reacting to the per-unit efficiency gain while ignoring the volume effect that historically dominates such transitions. The dynamic is already visible in the supply picture. Memory and storage product prices have climbed sharply in recent months amid shortages driven by AI demand. Those shortages do not exist because the world has too much AI ambition; they exist because the world has too little production capacity to serve the AI ambitions that already exist at current prices. Lower the cost of each deployment and the queue of would-be deployers grows substantially longer.

The DeepSeek Precedent

This is not the first time an AI efficiency breakthrough has triggered a reflexive memory-demand panic. In early 2025, the emergence of DeepSeek, a Chinese AI model that achieved competitive performance at dramatically lower computational cost, sparked widespread fear that advanced chip demand would crater. The Jevons Paradox was invoked then too, and subsequently validated: cheaper AI inference expanded the deployment universe rather than contracting it, and demand for high-end compute continued to accelerate. TurboQuant follows the same structural logic. Given the severe supply-side constraints in the chip industry, TurboQuant may ease the situation significantly. But it addresses only inference, by making it faster and cheaper; the training side of AI is as yet unaffected. And the bottleneck on AI deployment is not the existence of efficient compression algorithms but the physical availability of the memory and compute infrastructure to run models at the scale the market demands.
TurboQuant loosens one constraint. It does not eliminate the others.

What the Stock Market Sell-Off Actually Reveals

It is easy to dismiss the immediate stock market reaction as knee-jerk. But things fall into place once we read the chip sell-off not as a rational reappraisal of long-run demand but as a short-term profit-taking event dressed in fundamental clothing. Kioxia had surged roughly 700% since August 2024. SK Hynix and Micron had made similarly dramatic moves. The overall sector has been extremely bullish for more than a year. So when a narrative appears that gives investors sitting on enormous gains cover to reduce exposure, profit-taking will follow, regardless of whether the narrative is correct. TurboQuant provided that excuse. It does not, however, provide a credible long-run bear case for memory demand. The more considered view is that TurboQuant is additive to the AI ecosystem: it makes inference faster and cheaper, which expands the population of viable AI applications, which in turn expands demand for the infrastructure, including memory, required to run them.

Google's TurboQuant is a genuine technical achievement. An eightfold inference speed improvement combined with a sixfold reduction in memory requirements is the kind of step-change that reshapes the cost economics of entire industries. But the market's initial reaction, which treated this as categorically bad news for memory manufacturers, confuses efficiency with reduced consumption. The AI memory market is not different in kind from every other market where this dynamic has played out. It is simply the latest arena in which the Jevons Paradox will repeat itself.
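The Jevons mechanism described above is, at bottom, simple arithmetic: a per-unit efficiency gain is swamped if the number of units grows faster. The figures below are purely hypothetical, chosen only to make the volume effect concrete; neither the deployment counts nor the elasticity come from any cited source.

```python
# Toy Jevons-paradox arithmetic with hypothetical numbers (illustration only).
# A 6x per-deployment efficiency gain is outweighed if cheaper unit economics
# make enough previously marginal deployments viable.

mem_per_deploy_gb = 600      # hypothetical memory footprint per AI deployment
deployments = 1_000          # hypothetical deployments viable at today's cost

before = mem_per_deploy_gb * deployments

# Assume (hypothetically) that a 6x cost drop makes 10x as many deployments viable.
after = (mem_per_deploy_gb / 6) * (deployments * 10)

print(f"total memory before: {before:,.0f} GB")
print(f"total memory after:  {after:,.0f} GB")  # higher, despite 6x efficiency
```

Whether total demand rises or falls hinges entirely on that elasticity assumption; the article's argument is that AI deployment has historically been on the elastic side of the ledger.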
[17]
Memory Stocks Slide As Google's New AI Efficiency Breakthrough May Slash Data Storage Needs - SanDisk (NASDAQ:SNDK)
Google Unveils TurboQuant Algorithm

On Tuesday, Google researchers introduced "TurboQuant," a set of advanced quantization algorithms that enables massive compression for large language models (LLMs). According to the Google blog, the technology "optimally addresses the challenge of memory overhead in vector quantization." The breakthrough focuses on the key-value (KV) cache, which Google describes as a "digital cheat sheet" that stores frequently used information. High-dimensional vectors capture complex AI data but consume vast amounts of memory. Google reported that TurboQuant reduces key-value memory size "by a factor of at least 6x" without sacrificing model accuracy.

Enhanced Efficiency For LLMs

Google stated these techniques allow "building and querying large vector indices with minimal memory." Price Action: SanDisk shares were down 3.17% at $649.20, Western Digital shares were down 1.29% at $291.80, and Micron Technology shares were down 3.46% at $381.86 at the time of publication on Wednesday, according to Benzinga Pro data. This content was partially produced with the help of AI tools and was reviewed and published by Benzinga editors.
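To see why a 6x KV-cache reduction matters, it helps to size the cache. The standard back-of-the-envelope formula is 2 (keys and values) times layers times KV heads times head dimension times sequence length times bytes per element; the model dimensions below are hypothetical, picked to resemble a mid-size open model, not taken from Google's paper.

```python
# Back-of-the-envelope KV-cache sizing. Model dimensions are hypothetical
# (roughly an 8B-class model with grouped-query attention), not from the paper.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits_per_elem):
    # 2 accounts for storing both the key and the value tensor per token.
    return 2 * layers * kv_heads * head_dim * seq_len * bits_per_elem / 8

SEQ = 128_000  # a long-context request

fp16 = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=SEQ, bits_per_elem=16)
compressed = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=SEQ,
                            bits_per_elem=16 / 6)  # the claimed "at least 6x" reduction

print(f"fp16 KV cache:       {fp16 / 1e9:.1f} GB")        # ~16.8 GB
print(f"compressed KV cache: {compressed / 1e9:.1f} GB")  # ~2.8 GB, 6x smaller
```

At these (assumed) dimensions the cache drops from roughly 17 GB to under 3 GB per long-context request, which is the difference between fitting one request per accelerator and fitting several.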
[18]
Samsung, SK Hynix slide as Google touts AI memory compression tech 'TurboQuant' By Investing.com
Investing.com -- Samsung Electronics and SK Hynix shares fell sharply on Thursday after Google researchers unveiled a new compression algorithm that could lower artificial intelligence demand for memory. Samsung (KS:005930) fell 4.8%, while SK Hynix Inc (KS:000660) slid 5.9%, with both stocks among the biggest weights on the KOSPI index, which shed as much as 3%. Losses in Asian memory stocks tracked overnight declines in their U.S. peers, with Micron Technology Inc (NASDAQ:MU), SanDisk Corporation (NASDAQ:SNDK), Western Digital (NASDAQ:WDC) and Seagate Technology PLC (NASDAQ:STX) down between 3% and 6%. Google researchers earlier this week unveiled TurboQuant, an algorithm they said could shrink AI's working memory requirements without affecting performance. The technology can also optimize the vector search capabilities that power major search engines, Google researchers said. A reduction in AI's memory requirements, especially if the technology proves viable and is adopted at scale, could in turn point to slower demand from the industry for advanced memory chips. AI-fueled demand, which was seen causing a memory chip supply shortage in recent quarters, had fuelled a major rally in Samsung and SK Hynix shares. The two are among the largest and most advanced memory chip makers in the world, and are key memory suppliers for the AI industry. Google said it will present TurboQuant at the ICLR 2026 conference in April.
[19]
MU, WDC, SNDK fall: Why Google's TurboQuant is rattling memory stocks By Investing.com
Investing.com -- Memory stocks fell Wednesday despite broader technology sector strength, with shares dropping after Google unveiled TurboQuant, a new compression algorithm that could reduce memory requirements for AI systems. SanDisk Corporation (NASDAQ:SNDK) fell 5.7%, Micron Technology (NASDAQ:MU) dropped 3%, Western Digital (NASDAQ:WDC) declined 4.7%, and Seagate Technology (NASDAQ:STX) slid 4%. The declines came as the Nasdaq 100 advanced. Google introduced TurboQuant, a compression technology designed to reduce memory consumption in large language models and vector search engines. The algorithm addresses bottlenecks in the key-value cache, which stores frequently accessed information in AI systems. According to Google's announcement, TurboQuant can compress the key-value cache to 3 bits without requiring training or fine-tuning while maintaining model accuracy. Testing on open-source models including Gemma and Mistral showed the technology achieved a 6x reduction in key-value memory size. The algorithm also demonstrated an up to 8x performance increase over unquantized keys on H100 GPU accelerators. The technology works in two steps: applying the PolarQuant method for high-quality compression by rotating data vectors, then using the Quantized Johnson-Lindenstrauss algorithm to eliminate residual errors. Google said traditional vector quantization methods add 1 to 2 extra bits per number in memory overhead, partially negating compression benefits. TurboQuant will be presented at ICLR 2026, while PolarQuant is scheduled for presentation at AISTATS 2026. Google tested the algorithms across benchmarks including LongBench, Needle In A Haystack, ZeroSCROLLS, RULER, and L-Eval. The technology has applications beyond AI models, including the vector search capabilities that power large-scale search engines. Memory stocks have rallied significantly year to date, making them vulnerable to developments that could reduce demand.
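The two-step shape described above (rotate the vectors, then quantize each coordinate to a few bits) can be sketched generically. This is emphatically not Google's PolarQuant or QJL, whose details are in the papers; it is a minimal stand-in using a random orthogonal rotation and uniform 3-bit scalar quantization, just to make the rotate-then-quantize pipeline and its error behavior tangible.

```python
import numpy as np

# Toy rotate-then-quantize sketch. NOT the actual PolarQuant/QJL algorithms;
# a generic illustration of the pipeline the article describes.

rng = np.random.default_rng(0)

def random_rotation(d):
    # QR decomposition of a Gaussian matrix gives a random orthogonal matrix.
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return q

def quantize(x, bits=3):
    # Uniform scalar quantization to 2**bits levels over the data range.
    lo, hi = x.min(), x.max()
    levels = 2 ** bits - 1
    codes = np.round((x - lo) / (hi - lo) * levels)  # integer codes in [0, levels]
    return codes, lo, hi

def dequantize(codes, lo, hi, bits=3):
    levels = 2 ** bits - 1
    return codes / levels * (hi - lo) + lo

d = 64
keys = rng.normal(size=(100, d))  # stand-in for cached key vectors

R = random_rotation(d)
rotated = keys @ R                            # step 1: rotate into a new basis
codes, lo, hi = quantize(rotated, bits=3)     # step 2: store only 3-bit codes
recovered = dequantize(codes, lo, hi) @ R.T   # decode and rotate back

rel_err = np.linalg.norm(keys - recovered) / np.linalg.norm(keys)
print(f"relative reconstruction error at 3 bits: {rel_err:.3f}")
```

Storing 3-bit codes instead of 16-bit floats is where the claimed roughly 6x memory saving would come from; the point of the real algorithms is to drive the residual error of that second step far lower than this naive version does, without the 1 to 2 bits of per-number overhead Google attributes to traditional methods.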
Google unveiled TurboQuant, a memory compression algorithm that can reduce AI working memory requirements by at least 6x. The announcement triggered sharp declines in memory chip stocks, with SK Hynix falling 6% and Samsung dropping nearly 5%. But analysts warn the efficiency gains may paradoxically drive higher long-term demand for memory.
Google has introduced TurboQuant, a memory compression algorithm designed to dramatically reduce the working memory requirements for AI models during inferencing [1]. According to researchers, the technology can reduce memory requirements by at least 6x while maintaining accuracy, potentially offering relief amid an industry-wide RAM crisis [1]. The algorithm focuses its compression on the key-value cache, the area responsible for retaining historical calculations to bypass redundant processing, and maintains full performance on tasks including code generation, question answering, and text summarization [5].
Google describes TurboQuant as "a set of advanced, theoretically grounded quantization algorithms that enable massive compression for large language models and vector search engines" [1]. The company plans to showcase the core components, PolarQuant and QJL, a novel method for training and optimization, at ICLR 2026 next month [1]. Google expresses confidence the technology is ready for large-scale deployment, stating these methods "operate near theoretical lower bounds" and are "robust and trustworthy for critical, large-scale systems" [1].

The TurboQuant announcement triggered immediate reactions in financial markets, with memory chip stocks experiencing sharp declines [3]. SK Hynix fell as much as 6% on the Korea Exchange, while Samsung dropped nearly 5% [5]. Kioxia Holdings Corp. declined 4.4% to nearly 6% in Tokyo, and Micron Technology and SanDisk experienced similar losses in New York trading [3][5].
Western Digital's share price fell 8.5% on Monday alone and is down 20.5% since March 19th, while SanDisk slid 7% on Monday and lost a fifth of its value in a fortnight [2]. Micron Technology's stock has slumped in the dozen days since announcing enormous growth in revenue and profits [2]. Cloudflare's head Matthew Prince compared the development to "Google's DeepSeek," referencing last year's industry-wide shockwaves from China's low-cost AI model [5].

Despite initial market panic, analysts argue TurboQuant may actually increase demand for memory through a phenomenon known as the Jevons Paradox [3]. This 19th-century economic theory states that improved efficiency leads to increased consumption rather than decreased demand [3]. JPMorgan Chase analysts noted that while investors may take profits on the news, there's no near-term threat to memory consumption [3].

TrendForce, which specializes in the memory market, predicts TurboQuant will lower AI infrastructure costs and "spark massive long-sequence application demand, comprehensively driving structural growth and specification upgrades for high-bandwidth, main, and flash memory across cloud and edge platforms" [2]. SemiAnalysis researcher Ray Wang told CNBC that alleviating technical constraints frequently paves the way for advanced models that ultimately demand increased hardware support, noting "when the model becomes more powerful, you require better hardware to support it" [5].
The technology arrives amid what industry insiders call "RAMageddon," a generational shortage affecting practically every electronic gadget [4]. From September to February, the price of a single 64GB RAM stick jumped from roughly $250 to more than $1,000 [4]. The AI boom has created this situation by giving memory-makers incentive to prioritize production of the high-bandwidth, high-margin memory that GPUs require, reducing supply for other memory and sending prices soaring [2].

This year, tech giants including Amazon, Alphabet, Meta, Microsoft, and Oracle are set to collectively spend half a trillion dollars on the AI build-out, with roughly a third spent on memory alone [4]. Every major RAM manufacturer has shifted production lines to service AI data centers, with 70% of memory-chip products made globally destined for them [4]. The demand has "cannibalized our conventional consumer-electronics supply," according to Yang Wang, an analyst at Counterpoint Research [4].

While TurboQuant represents a significant advancement, it won't immediately solve the memory crisis [1]. The technology is still a lab breakthrough rather than something trialed at scale or deployed in the real world, and deployment would take time while memory orders are already locked in for many months [1]. The algorithm offers no relief for the massive RAM needed for AI model training, as it strictly compresses data during the inferencing stage [5].

Additionally, helium shortages caused by war in the Persian Gulf have damaged the supply chain for semiconductor production, potentially preventing chipmakers from producing all the RAM they anticipated [2]. Quilter Cheviot technology research lead Ben Barringer explained to CNBC that the recent stock drop likely results from shareholders cashing out after sustained growth, with TurboQuant having "added to the pressure, but this is evolutionary, not revolutionary," and said it doesn't alter the industry's long-term demand picture [5]. Analysts suggest that decreasing hardware barriers might actually accelerate localized AI projects, paradoxically driving up total long-term chip consumption and sustaining demand for memory despite compression advances [5].