



5 Sources
[1]

Ultra-efficient AI won't solve data centers' climate problem. This might.
Despite DeepSeek's AI efficiency gains, data centers are still expected to gobble up huge amounts of U.S. electricity.
When Chinese AI start-up DeepSeek announced a chatbot that matched the performance of cutting-edge models such as ChatGPT with a fraction of the computing power, it sparked a glimmer of hope that AI might be less of an energy hog than people had feared. But making AI cheaper and more efficient could just prompt people to use more AI, meaning data centers will still wind up using a lot more electricity, according to computer scientists, energy experts and tech investors. "People ask, 'Can we just forget about this? Is AI still an energy problem?'" said Vijay Gadepally, a senior scientist at the MIT Lincoln Laboratory who studies ways to make AI more sustainable. "The answer is a resounding, 'Yes, it is.'" One solution would be for tech companies to use cleaner energy, Gadepally said. In the United States, most electricity is still generated by fossil fuels, but data centers could monitor local power grids and slow down when power plants are burning the dirtiest fuels. Tech companies, including Google and Microsoft, have started experimenting with this idea.
Why efficiency won't solve AI's energy problem
The data centers that train and run AI algorithms can use as much electricity as small cities -- and tech companies are racing to build thousands more in the next few years, prompting power companies to burn more planet-warming fossil fuels to keep up. Even though AI has become more efficient over time, its energy use has only gone up. This phenomenon has played out before, as well. In the 19th century, economist William Stanley Jevons documented how England's coal consumption shot up after the invention of a steam engine that used less coal. The new technology made it so cheap to run a coal-powered engine that companies all over England started doing it, creating even more demand for the fuel.
Since DeepSeek unveiled its extra-efficient chatbot this month, tech and energy experts -- including Microsoft CEO Satya Nadella -- have been citing the Jevons paradox as the reason AI's energy use won't change. Microsoft is the biggest investor in ChatGPT-maker OpenAI. Regardless of whether that prediction comes true, the tech CEOs behind the data center boom show no signs of slowing planned construction -- including Project Stargate, an OpenAI-led plan to invest $500 billion in up to 20 new data centers over the next four years. Plans for the first Stargate data center campus include a 360-megawatt natural gas power plant, which could produce enough electricity for up to 170,000 average U.S. homes and as much planet-warming pollution as about 75,000 cars, based on industry averages from the Energy Information Administration and the U.S. Environmental Protection Agency. Data centers already gobble up more than 4 percent of all U.S. electricity, according to the Lawrence Berkeley National Laboratory. Even if more efficient AI leads to the number of data centers growing on the slower side of experts' predictions, they would still devour more power over the next few years. "Whatever we do, energy usage is likely going to go up," Gadepally said. "That train has left the station."
Cutting AI's carbon emissions
The biggest tech companies -- including Amazon, Google and Microsoft -- pay for clean-energy credits to offset their energy use and invest in green-power projects, such as restarting the Three Mile Island nuclear plant, developing a new kind of geothermal energy or building fusion reactors. But, for the most part, they still use the same electricity as everyone else, which often comes from fossil fuels. Tech companies could work around that by slowing down their data centers in moments when there's a lot of fossil fuel energy on the local grid, or shifting the work to data centers in parts of the world that have more renewable energy in that moment.
They could also use more powerful, energy-hungry versions of their AI models when the grid is clean and less powerful models when it's dirty. In a 2023 study, Gadepally tested the idea on an image recognition AI model. When the sun was shining and the wind was blowing, Gadepally used a state-of-the-art version of the AI. But when the grid used more fossil fuels, he switched to an older version that performs slightly worse but hogs less energy. He compared the performance of this green approach with what would happen if he used just the energy-hogging state-of-the-art system all the time. Over the course of two days, the green AI cut carbon emissions 80 percent compared with the standard version -- and its scores on image recognition tests were only 3 percent worse. He ran a similar test using a language model last year and found that the green version cut carbon emissions 40 percent with no difference in performance. Companies could put a dent in AI's carbon emissions using these strategies, said Benjamin Lee, an electrical and systems engineering professor at the University of Pennsylvania who was not involved in the research -- but only if they were willing to accept that their AI might perform slightly worse or take more time to train. "The challenge is not implementing these techniques but rather convincing AI companies and users that some accuracy loss ... is worth the carbon savings," he said.
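The model-switching idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the study: the `ModelVariant` fields, the threshold, the hourly grid intensities, and every number below are hypothetical values chosen only to show the mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVariant:
    name: str
    energy_kwh_per_hour: float  # hypothetical energy draw while serving
    accuracy: float             # hypothetical benchmark score in [0, 1]


def choose_model(intensity, threshold, large, small):
    """Use the large model on a clean grid, the small one on a dirty grid."""
    return large if intensity <= threshold else small


def simulate(intensities, threshold, large, small):
    """Return (total grams of CO2, mean accuracy) over hourly grid readings.

    `intensities` holds grid carbon intensity in gCO2 per kWh, one per hour.
    """
    total_g = 0.0
    total_acc = 0.0
    for g in intensities:
        model = choose_model(g, threshold, large, small)
        total_g += model.energy_kwh_per_hour * g
        total_acc += model.accuracy
    return total_g, total_acc / len(intensities)


# Hypothetical variants and a toy four-hour grid trace.
LARGE = ModelVariant("state-of-the-art", 10.0, 0.90)
SMALL = ModelVariant("older-smaller", 2.0, 0.87)
TRACE = [100.0, 100.0, 500.0, 500.0]

aware_g, aware_acc = simulate(TRACE, 300.0, LARGE, SMALL)
always_large_g, always_large_acc = simulate(TRACE, float("inf"), LARGE, SMALL)
saving = 1.0 - aware_g / always_large_g
print(f"carbon saved: {saving:.0%}, accuracy change: {aware_acc - always_large_acc:+.3f}")
```

Setting the threshold to infinity reproduces the "always use the big model" baseline, which makes the comparison a single function call. A real deployment would read intensity from a grid-data provider and would probably use more than two variants.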
[2]

DeepSeek claims to have cured AI's environmental headache. The Jevons paradox suggests it might make things worse
AI burns through a lot of resources. And thanks to a paradox first identified way back in the 1860s, even a more energy-efficient AI is likely to simply mean more energy is used in the long run. For most users, "large language models" such as OpenAI's ChatGPT work like intuitive search engines. But unlike regular web searches that find and retrieve data from anywhere along a global network of servers, AI models return data they've generated from scratch. Like powering up a nuclear reactor to use a calculator, this tailored process is very inefficient. One study suggests the AI industry will be consuming somewhere between 85 and 134 terawatt-hours (TWh) of electricity a year by 2027. That's about as much energy as the Netherlands consumes each year. One prominent researcher predicts that by 2030, over 20% of all electricity produced in the US will be feeding AI data centres (huge warehouses filled with computers). Big tech firms have always claimed to be heavy investors in wind and solar energy. But AI's appetite for 24/7 power means most are developing their own nuclear options. Microsoft even plans to revive the infamous Three Mile Island power plant, scene of America's worst ever civil nuclear accident. Despite Google's ambitious target of being carbon neutral by 2030, the company's AI developments mean its emissions have climbed 48% in the past few years. And the computing power needed to train these models increases tenfold each year. However, Chinese start-up DeepSeek claims to have created a fix: a model that matches the performance of established US rivals like OpenAI, but at a fraction of the cost and carbon footprint.
An environmental game changer?
DeepSeek has created a powerful open-source, relatively energy-lite model. The company claims it spent just US$6 million renting the hardware needed to train its new R1 model, compared with over $60 million for Meta's Llama, which used 11 times the computing resources.
DeepSeek uses a "mixture-of-experts" architecture, a machine-learning method that allows the model to scale up and down depending on the complexity of prompts. The company claims its model can also store more data and be trained without the need for huge amounts of expensive processor chips. In reaction, US chip manufacturing and energy stocks plummeted following investor concerns that AI companies would rethink their energy-intensive data centre developments. As the world's largest supplier of specialist AI processors, Nvidia saw its market value fall by US$589 billion, the biggest one-day loss in Wall Street history. Paradoxically, as well as upsetting the performance of US tech stocks, improving the energy efficiency of AI platforms could actually worsen the industry's environmental performance as a whole. With tech stocks crashing, Microsoft CEO Satya Nadella tried to bring a longer-term perspective: "Jevons paradox strikes again!" he posted on X. "As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of."
The Jevons paradox
The idea that energy efficiency isn't always a good thing for Earth's resources has been around for well over a century. In 1865, a young Englishman named William Stanley Jevons wrote "The Coal Question", a book in which he suggested that Britain's place as an industrial superpower might soon come to an end, due to its rapidly depleting coal reserves. But to Jevons, frugality was not the solution. He argued: "It is wholly a confusion of ideas to suppose that the economical use of fuel is equivalent to a diminished consumption. The very contrary is the truth." According to Jevons, any increase in resource efficiency generates an increase in long-term resource consumption, rather than a decrease. Because greater energy efficiency has the effect of reducing energy's implicit price, it increases the rate of return - and demand.
Jevons offered the example of the British iron industry. If technological advancements helped a blast furnace produce iron with less coal, profits would rise and new investment in iron production would be attracted. At the same time, falling prices would stimulate additional demand. He concluded: "The greater number of furnaces will more than make up for the diminished [coal] consumption of each." More recently, the economist William Nordhaus applied this idea to the efficiency of lighting since the dawn of human civilisation. In a paper published in 1998, he concluded that in ancient Babylon, the average labourer might need to work more than 40 hours to purchase enough fuel to produce the equivalent amount of light emitted by a modern lightbulb for one hour. But by 1992, an average American would need to work for less than half a second to produce the same. Throughout time, efficiency gains haven't reduced the energy we expend on lighting or shrunk our energy consumption. On the contrary, we now generate so much electric light that areas without it have become tourist attractions. Warming and lighting our homes efficiently, driving our cars, mining Bitcoin and, indeed, building AI models are all subject to the same so-called rebound effects identified in the Jevons paradox. And this is why it will be impossible to ensure a more efficient AI industry actually leads to an overall reduction in energy use.
A Sputnik moment
In the 1950s, the US was horrified when the Soviets launched Sputnik, the first space satellite. The emergence of a more efficient rival caused America to allocate more resources to the space race, not less. DeepSeek is Silicon Valley's Sputnik moment. More efficient AI will probably mean more distributed and powerful models, in an arms race that is no longer made up only of US tech giants. AI offers superpower status, and the floodgates may now be fully open for the UK and other global competitors, as well as China.
What's for certain is that in the long term, the AI industry's appetite for energy and other resources is only going to increase.
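The rebound mechanism behind the Jevons paradox can be made concrete with a toy demand model. The sketch below assumes constant-elasticity demand for the service (for example, AI queries); that functional form, the parameter names, and the numbers are illustrative assumptions, not something taken from the article. Under this assumption, efficiency gains raise total energy use only when demand is elastic enough (elasticity above 1).

```python
def energy_use(efficiency, elasticity, energy_price=1.0, scale=1.0):
    """Total energy consumed under constant-elasticity demand (an assumption).

    efficiency   -- units of service delivered per unit of energy
    elasticity   -- price elasticity of demand for the service (positive)
    energy_price -- price of one unit of energy
    scale        -- demand at a service price of 1
    """
    service_price = energy_price / efficiency           # cheaper as efficiency rises
    service_demand = scale * service_price ** (-elasticity)
    return service_demand / efficiency                  # energy needed to meet demand


baseline = energy_use(efficiency=1.0, elasticity=1.5)
elastic_case = energy_use(efficiency=10.0, elasticity=1.5)    # demand reacts strongly
inelastic_case = energy_use(efficiency=10.0, elasticity=0.5)  # demand reacts weakly

print(f"elastic demand:   {elastic_case / baseline:.2f}x baseline energy")
print(f"inelastic demand: {inelastic_case / baseline:.2f}x baseline energy")
```

In this toy model energy use scales as efficiency raised to the power (elasticity - 1), so whether a tenfold efficiency gain increases or decreases total consumption depends entirely on how strongly usage responds to lower cost.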
[3]

AI is 'an energy hog,' but DeepSeek could change that
Justine Calma is a senior science reporter covering energy and the environment with more than a decade of experience. She is also the host of Hell or High Water: When Disaster Hits Home, a podcast from Vox Media and Audible Originals. DeepSeek startled everyone last month with the claim that its AI model uses roughly one-tenth the amount of computing power as Meta's Llama 3.1 model, upending an entire worldview of how much energy and resources it'll take to develop artificial intelligence. Taken at face value, that claim could have tremendous implications for the environmental impact of AI. Tech giants are rushing to build out massive AI data centers, with plans for some to use as much electricity as small cities. Generating that much electricity creates pollution, raising fears about how the physical infrastructure undergirding new generative AI tools could exacerbate climate change and worsen air quality. Reducing how much energy it takes to train and run generative AI models could alleviate much of that stress. But it's still too early to gauge whether DeepSeek will be a game-changer when it comes to AI's environmental footprint. Much will depend on how other major players respond to the Chinese startup's breakthroughs, especially considering plans to build new data centers. "It just shows that AI doesn't have to be an energy hog," says Madalsa Singh, a postdoctoral research fellow at the University of California, Santa Barbara who studies energy systems. "There's a choice in the matter." The fuss around DeepSeek began with the release of its V3 model in December, which only cost $5.6 million for its final training run and 2.78 million GPU hours to train on Nvidia's older H800 chips, according to a technical report from the company. For comparison, Meta's Llama 3.1 405B model -- despite using newer, more efficient H100 chips -- took about 30.8 million GPU hours to train. 
(We don't know exact costs, but estimates for Llama 3.1 405B have been around $60 million and between $100 million and $1 billion for comparable models.) Then DeepSeek released its R1 model last week, which venture capitalist Marc Andreessen called "a profound gift to the world." The company's AI assistant quickly shot to the top of Apple's and Google's app stores. And on Monday, it sent competitors' stock prices into a nosedive on the assumption DeepSeek was able to create an alternative to Llama, Gemini, and ChatGPT for a fraction of the budget. Nvidia, whose chips enable all these technologies, saw its stock price plummet on news that DeepSeek's V3 only needed 2,000 chips to train, compared to the 16,000 chips or more needed by its competitors. DeepSeek says it was able to cut down on how much electricity it consumes by using more efficient training methods. In technical terms, it uses an auxiliary-loss-free strategy. Singh says it boils down to being more selective with which parts of the model are trained; you don't have to train the entire model at the same time. If you think of the AI model as a big customer service firm with many experts, Singh says, it's more selective in choosing which experts to tap. The model also saves energy when it comes to inference, which is when the model is actually tasked to do something, through what's called key value caching and compression. If you're writing a story that requires research, you can think of this method as similar to being able to reference index cards with high-level summaries as you're writing rather than having to read the entire report that's been summarized, Singh explains. What Singh is especially optimistic about is that DeepSeek's models are mostly open source, minus the training data. With this approach, researchers can learn from each other faster, and it opens the door for smaller players to enter the industry. 
It also sets a precedent for more transparency and accountability so that investors and consumers can be more critical of what resources go into developing a model. "If we've demonstrated that these advanced AI capabilities don't require such massive resource consumption, it will open up a little bit more breathing room for more sustainable infrastructure planning," Singh says. "This can also incentivize these established AI labs today, like Open AI, Anthropic, Google Gemini, towards developing more efficient algorithms and techniques and move beyond sort of a brute force approach of simply adding more data and computing power onto these models." To be sure, there's still skepticism around DeepSeek. "We've done some digging on DeepSeek, but it's hard to find any concrete facts about the program's energy consumption," Carlos Torres Diaz, head of power research at Rystad Energy, said in an email. If what the company claims about its energy use is true, that could slash a data center's total energy consumption, Torres Diaz writes. And while big tech companies have signed a flurry of deals to procure renewable energy, soaring electricity demand from data centers still risks siphoning limited solar and wind resources from power grids. Reducing AI's electricity consumption "would in turn make more renewable energy available for other sectors, helping displace faster the use of fossil fuels," according to Torres Diaz. "Overall, less power demand from any sector is beneficial for the global energy transition as less fossil-fueled power generation would be needed in the long-term." There is a double-edged sword to consider with more energy-efficient AI models. Microsoft CEO Satya Nadella wrote on X about Jevons paradox, in which the more efficient a technology becomes, the more likely it is to be used. The environmental damage grows as a result of efficiency gains. 
"The question is, gee, if we could drop the energy use of AI by a factor of 100 does that mean that there'd be 1,000 data providers coming in and saying, 'Wow, this is great. We're going to build, build, build 1,000 times as much even as we planned'?" says Philip Krein, research professor of electrical and computer engineering at the University of Illinois Urbana-Champaign. "It'll be a really interesting thing over the next 10 years to watch." Torres Diaz also said that this issue makes it too early to revise power consumption forecasts "significantly down." No matter how much electricity a data center uses, it's important to look at where that electricity is coming from to understand how much pollution it creates. China still gets more than 60 percent of its electricity from coal, and another 3 percent comes from gas. The US also gets about 60 percent of its electricity from fossil fuels, but a majority of that comes from gas -- which creates less carbon dioxide pollution when burned than coal. To make things worse, energy companies are delaying the retirement of fossil fuel power plants in the US in part to meet skyrocketing demand from data centers. Some are even planning to build out new gas plants. Burning more fossil fuels inevitably leads to more of the pollution that causes climate change, as well as local air pollutants that raise health risks to nearby communities. Data centers also guzzle up a lot of water to keep hardware from overheating, which can lead to more stress in drought-prone regions. Those are all problems that AI developers can minimize by limiting energy use overall. Traditional data centers have been able to do so in the past. Despite workloads almost tripling between 2015 and 2019, power demand managed to stay relatively flat during that time period, according to Goldman Sachs Research. Data centers then grew much more power-hungry around 2020 with advances in AI. 
They consumed more than 4 percent of electricity in the US in 2023, and that could nearly triple to around 12 percent by 2028, according to a December report from the Lawrence Berkeley National Laboratory. There's more uncertainty about those kinds of projections now, but calling any shots based on DeepSeek at this point is still a shot in the dark.
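As a quick check on the "roughly one-tenth" claim, the GPU-hour figures quoted in this excerpt can be compared directly. The sketch below only does arithmetic on the company-reported numbers; the function names are made up for illustration. GPU hours on different chips (H800 versus H100) are not strictly comparable, so the ratio is a rough proxy for compute, not a measurement of energy.

```python
# Figures as quoted in the excerpt (company-reported, not independently verified).
DEEPSEEK_V3_GPU_HOURS = 2.78e6
LLAMA_31_405B_GPU_HOURS = 30.8e6
DEEPSEEK_V3_FINAL_RUN_COST_USD = 5.6e6


def gpu_hour_ratio(baseline_hours, candidate_hours):
    """How many times more GPU hours the baseline used than the candidate."""
    return baseline_hours / candidate_hours


def implied_cost_per_gpu_hour(total_cost_usd, gpu_hours):
    """Average rental cost per GPU hour implied by a total cost."""
    return total_cost_usd / gpu_hours


ratio = gpu_hour_ratio(LLAMA_31_405B_GPU_HOURS, DEEPSEEK_V3_GPU_HOURS)
rate = implied_cost_per_gpu_hour(DEEPSEEK_V3_FINAL_RUN_COST_USD, DEEPSEEK_V3_GPU_HOURS)

print(f"Llama 3.1 405B used about {ratio:.1f}x the GPU hours of DeepSeek V3")
print(f"Implied rental rate: about ${rate:.2f} per GPU hour")
```

The ratio comes out at about 11, consistent with the "one-tenth" framing, and the quoted cost works out to roughly two dollars per GPU hour.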
[4]

ChatGPT vs. DeepSeek: Which AI model is more sustainable?
How do the environmental credentials of ChatGPT and DeepSeek compare?
By now, even casual observers of the tech world are well aware of ChatGPT, OpenAI's dazzling contribution to artificial intelligence. Its ability to generate coherent, on-point responses has upended online research and sparked endless speculation about AI's growing role in our everyday lives. A recent rising challenger, China's open-source AI-powered chatbot, DeepSeek, has drawn its own intrigue, promising to run more efficiently and be better suited to non-English users than its American competitor. Yet in the rush to assess its functionality, adoption, and potential geopolitical sway, one pressing question seems to have been sidelined: how do the environmental credentials of ChatGPT and DeepSeek compare?
ChatGPT
ChatGPT's meteoric rise began in late 2022, with OpenAI and Microsoft forming a high-profile alliance to scale it via Azure's cloud services. Every iteration of the GPT architecture, however, comes at a steep environmental price. Training such a colossal model requires immense computing power, and the subsequent energy use has raised uncomfortable questions about its carbon footprint.
DeepSeek
While DeepSeek hasn't yet become a household name to the extent ChatGPT has, it's earning a reputation as a leaner, more multilingual competitor. It uses techniques like pruning (removing unnecessary parts of the model to reduce size and improve efficiency), model distillation (training a smaller "student" model to imitate a larger "teacher" model), and algorithmic streamlining (optimizing each step of the computation process to minimize wasted resources and improve overall performance) - all intended to cut down on resources and associated costs. The theory goes that an AI needing fewer GPUs should, in principle, consume less energy overall.
Yet details on its total environmental impact remain conspicuously thin, leaving observers to wonder if DeepSeek's operational gains could truly deliver on the sustainability front. The most glaring environmental toll for both models lies in the power needed to train them. Early estimates suggest that rolling out ChatGPT's latest language model, GPT4, demanded colossal GPU capacity for weeks on end. DeepSeek, meanwhile, claims to require fewer high-end chips, potentially reducing its total electricity draw. Powering ChatGPT on Microsoft's Azure platform has its upsides and downsides. Microsoft is working to become carbon-negative by 2030, underpinned by investments in green energy and carbon capture. Yet many of its data centers remain tethered to non-renewable energy grids, and the manufacture of sophisticated AI chips is itself resource-intensive. DeepSeek appears to rely on Alibaba Cloud, China's most prominent cloud provider, which has set similar targets for carbon neutrality. But China's national grid continues to rely heavily on coal, meaning the actual environmental impact might be more significant unless DeepSeek is sited in locations rich in renewable infrastructure. That said, DeepSeek's focus on efficiency might still make it less carbon-intensive overall. Running giant clusters of GPUs produces heat - lots of it. Data centres typically use vast amounts of water for cooling, especially in regions with high temperatures. Microsoft has come under fire for consuming billions of liters of water, some of which goes towards cooling the hardware behind AI operations. Information on DeepSeek's water footprint is scant. If Alibaba Cloud's newer facilities use advanced cooling methods - such as immersion cooling (submerging servers in a thermally conductive liquid to dissipate heat more efficiently) - DeepSeek might fare better in terms of water usage. 
But with so little public data on its processes, it's difficult to measure how it stacks up against ChatGPT on this front. The relentless pace of AI hardware development means GPUs and other accelerators can quickly become obsolete. ChatGPT's operations, involving cutting-edge equipment, likely generate a rising tide of e-waste, though precise figures are elusive. In principle, DeepSeek's more frugal approach implies fewer chips, which could mean slower turnover and less waste. Still, this remains an educated guess until there's more visibility into how DeepSeek's hardware ecosystem is managed. At first glance, OpenAI's partnership with Microsoft suggests ChatGPT might stand to benefit from a more environmentally conscious framework - provided that Microsoft's grand sustainability promises translate into meaningful progress on the ground. DeepSeek, meanwhile, must grapple with a coal-reliant grid in China, yet its drive for efficiency could place it in a better position to curb overall energy consumption per operation. That said, the U.S. is hardly a clean-energy haven either. While Microsoft has pledged to go carbon-negative by 2030, America remains one of the world's largest consumers of fossil fuels, with coal still powering parts of its grid. Moreover, political shifts could slow progress: the resurgence of a "drill, baby, drill" mentality in Republican energy rhetoric suggests a renewed push for oil and gas, potentially undermining AI's green ambitions. Ultimately, AI is hurtling forward at breakneck speed, but the environmental ramifications lag far behind in public scrutiny. As these systems weave themselves ever deeper into our politics, economy, and daily interactions, the debate on their energy sources, water usage, and hardware footprints must become more transparent. If the world's appetite for AI is unstoppable, then so too must be our commitment to holding its creators accountable for the planet's long-term well-being. 
That responsibility extends not just to China and the U.S. but to every nation where AI is trained, deployed, and powered.
[5]

DeepSeek might not be such good news for energy after all
The latter notion is misleading, and new numbers shared with MIT Technology Review help show why. These early figures -- based on the performance of one of DeepSeek's smaller models on a small number of prompts -- suggest it could be more energy intensive when generating responses than the equivalent-size model from Meta. The issue might be that the energy it saves in training is offset by its more intensive techniques for answering questions, and by the long answers they produce. Add the fact that other tech firms, inspired by DeepSeek's approach, may now start building their own similar low-cost reasoning models, and the outlook for energy consumption is already looking a lot less rosy. The life cycle of any AI model has two phases: training and inference. Training is the often months-long process in which the model learns from data. The model is then ready for inference, which happens each time anyone in the world asks it something. Both usually take place in data centers, where they require lots of energy to run chips and cool servers. On the training side for its R1 model, DeepSeek's team improved what's called a "mixture of experts" technique, in which only a portion of a model's billions of parameters -- the "knobs" a model uses to form better answers -- are turned on at a given time during training. More notably, they improved reinforcement learning, where a model's outputs are scored and then used to make it better. This is often done by human annotators, but the DeepSeek team got good at automating it. The introduction of a way to make training more efficient might suggest that AI companies will use less energy to bring their AI models to a certain standard. That's not really how it works, though. "Because the value of having a more intelligent system is so high," wrote Anthropic cofounder Dario Amodei on his blog, it "causes companies to spend more, not less, on training models." 
If companies get more for their money, they will find it worthwhile to spend more, and therefore use more energy. "The gains in cost efficiency end up entirely devoted to training smarter models, limited only by the company's financial resources," he wrote. It's an example of what's known as the Jevons paradox. But that's been true on the training side as long as the AI race has been going. The energy required for inference is where things get more interesting. DeepSeek is designed as a reasoning model, which means it's meant to perform well on things like logic, pattern-finding, math, and other tasks that typical generative AI models struggle with. Reasoning models do this using something called "chain of thought." It allows the AI model to break its task into parts and work through them in a logical order before coming to its conclusion. You can see this with DeepSeek. Ask whether it's okay to lie to protect someone's feelings, and the model first tackles the question with utilitarianism, weighing the immediate good against the potential future harm. It then considers Kantian ethics, which propose that you should act according to maxims that could be universal laws. It considers these and other nuances before sharing its conclusion. (It finds that lying is "generally acceptable in situations where kindness and prevention of harm are paramount, yet nuanced with no universal solution," if you're curious.)
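The "mixture of experts" idea mentioned earlier in this excerpt, in which only a portion of a model's parameters is switched on at a time, can be illustrated with a toy top-k router. This is a generic sketch, not DeepSeek's implementation: the experts here are plain Python functions and the gate scores are hard-coded, whereas in a real model both are learned neural-network components.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts."""
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    routed = route_top_k(gate_logits, k)
    output = sum(weight * experts[i](x) for i, weight in routed)
    active = [i for i, _ in routed]
    return output, active


# Four toy "experts"; only two of them run for any given input.
EXPERTS = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
GATE_LOGITS = [0.1, 2.0, 2.0, -1.0]  # hypothetical router scores for one input

output, active = moe_forward(3.0, EXPERTS, GATE_LOGITS, k=2)
print(f"active experts: {active}, output: {output}")
```

The saving comes from the experts that are skipped: with k=2 of 4 experts, half of the expert computation never runs for this input, even though all four sets of parameters exist.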
Chinese startup DeepSeek claims to have created an AI model that matches the performance of established rivals at a fraction of the cost and carbon footprint. However, experts warn that increased efficiency might lead to higher overall energy consumption due to the Jevons paradox.

Chinese startup DeepSeek has made waves in the AI industry with its claim of creating an AI model that matches the performance of established rivals like OpenAI's ChatGPT and Meta's Llama, but at a fraction of the cost and carbon footprint [1]. The company's V3 model reportedly cost just $5.6 million for its final training run and used 2.78 million GPU hours, compared with Meta's Llama 3.1 405B model, which took about 30.8 million GPU hours to train [3].

DeepSeek attributes its efficiency gains to several innovative techniques [2][3].

The announcement of DeepSeek's efficient model has had significant repercussions [2][3].

While DeepSeek's efficiency gains seem promising for reducing AI's environmental impact, experts warn of potential unintended consequences [1][2][3].

Several factors complicate the evaluation of DeepSeek's true environmental impact [3][4].

As AI continues to advance rapidly, the debate on its environmental ramifications must keep pace, a point experts emphasize [3][4][5].

As the AI race intensifies, it's clear that efficiency alone won't solve the industry's energy challenges. A holistic approach considering both technological advancements and responsible usage will be crucial in mitigating AI's environmental impact.
Summarized by

Navi