3 Sources
[1]
Open-Sourced AI Models May Be More Costly in the Long Run, Study Finds
Open-source AI models tend to use more computing power than closed-source models for the same tasks. As more businesses adopt AI, picking which model to go with is a major decision. While open-source models may seem cheaper initially, a new study warns that those savings can evaporate fast because of the extra computing power they require. In fact, open-source AI models burn through significantly more computing resources than their closed-source rivals when performing the same tasks, according to a study published Thursday by Nous Research.

The researchers tested dozens of AI models, including closed systems from Google and OpenAI as well as open-source models from DeepSeek and Magistral. They measured how much computing effort each required to complete identical tasks across three categories: simple knowledge questions, math problems, and logic puzzles. To do this, they used the number of tokens each model consumed to solve and answer questions as a proxy for the computing resources used.

"Open-weight models use 1.5-4× more tokens than closed ones, and up to 10× for simple knowledge questions, making them sometimes more expensive per query despite lower per-token costs," the study authors wrote.

In AI, a token is a piece of text or data -- it could be a word, part of a word, or even punctuation -- that models use to understand language. Models process and generate text one token at a time, so the more tokens they use, the more computing power and time a task requires. Since most closed-source models don't reveal their raw reasoning process or chain of thought (CoT), the researchers measured their computing efficiency by counting the tokens they used instead. Because models are billed for the total output tokens used in their reasoning process and in producing the final answer, completion tokens serve as a proxy for the effort needed to produce a response.

This is an important consideration for companies using AI, for several reasons. "First, while hosting open weight models may be cheaper, this cost advantage could be easily offset if they require more tokens to reason about a given problem," the researchers wrote. "Second, an increased number of tokens will lead to longer generation times and increased latency."

The study found that open models consistently use more tokens than closed models for the same tasks, sometimes three times as many for simple knowledge questions. The gap narrowed to less than twice as many for math and logic problems. "Closed models (OpenAI, Grok-4) optimize for fewer tokens to cut costs, while open models (DeepSeek, Qwen) use more tokens, possibly for better reasoning," the study authors wrote.

Among open models, llama-3.3-nemotron-super-49b-v1 was the most efficient, while Magistral models were the most inefficient. OpenAI's models were standouts as well: both its o4-mini and the new open-weight gpt-oss models showed impressive token efficiency, especially on math problems. The researchers noted that OpenAI's gpt-oss models, with their concise chains of thought, could serve as a benchmark for improving token efficiency in other open models.
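To see why a lower per-token price does not automatically mean a lower bill, here is a minimal back-of-the-envelope sketch in Python. The prices and token counts are illustrative placeholders, not figures from the study.

```python
# Back-of-the-envelope comparison: a lower per-token price does not
# guarantee a cheaper query if the model emits more completion tokens.
# All prices and token counts below are illustrative placeholders,
# not figures from the Nous Research study.

def query_cost(completion_tokens: int, price_per_million_tokens: float) -> float:
    """Cost of a single response, billed on completion (output) tokens."""
    return completion_tokens / 1_000_000 * price_per_million_tokens

# Hypothetical closed model: pricier per token, but terse.
closed_cost = query_cost(completion_tokens=300, price_per_million_tokens=8.00)

# Hypothetical open-weight model: cheaper per token, but spends ~10x
# the tokens on a simple knowledge question, as the study observed.
open_cost = query_cost(completion_tokens=3_000, price_per_million_tokens=1.50)

print(f"closed model:      ${closed_cost:.6f} per query")   # $0.002400
print(f"open-weight model: ${open_cost:.6f} per query")     # $0.004500
```

Even with a severalfold cheaper per-token rate, the roughly tenfold token overhead the study reports for simple knowledge questions can flip the comparison.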
[2]
That 'cheap' open-source AI model is actually burning through your compute budget
A comprehensive new study has revealed that open-source artificial intelligence models consume significantly more computing resources than their closed-source competitors when performing identical tasks, potentially undermining their cost advantages and reshaping how enterprises evaluate AI deployment strategies.

The research, conducted by AI firm Nous Research, found that open-weight models use between 1.5 and 4 times more tokens -- the basic units of AI computation -- than closed models like those from OpenAI and Anthropic. For simple knowledge questions, the gap widened dramatically, with some open models using up to 10 times more tokens.

"Open weight models use 1.5-4× more tokens than closed ones (up to 10× for simple knowledge questions), making them sometimes more expensive per query despite lower per-token costs," the researchers wrote in their report published Wednesday.

The findings challenge a prevailing assumption in the AI industry that open-source models offer clear economic advantages over proprietary alternatives. While open-source models typically cost less per token to run, the study suggests this advantage can be "easily offset if they require more tokens to reason about a given problem."

The real cost of AI: Why 'cheaper' models may break your budget

The research examined 19 different AI models across three categories of tasks: basic knowledge questions, mathematical problems, and logic puzzles. The team measured "token efficiency" -- how many computational units models use relative to the complexity of their solutions -- a metric that has received little systematic study despite its significant cost implications.

"Token efficiency is a critical metric for several practical reasons," the researchers noted. "While hosting open weight models may be cheaper, this cost advantage could be easily offset if they require more tokens to reason about a given problem."

The inefficiency is particularly pronounced for Large Reasoning Models (LRMs), which use extended "chains of thought" to solve complex problems. These models, designed to think through problems step by step, can consume thousands of tokens pondering simple questions that should require minimal computation.

For basic knowledge questions like "What is the capital of Australia?" the study found that reasoning models spend "hundreds of tokens pondering simple knowledge questions" that could be answered in a single word.

Which AI models actually deliver bang for your buck

The research revealed stark differences between model providers. OpenAI's models, particularly its o4-mini and newly released open-source gpt-oss variants, demonstrated exceptional token efficiency, especially for mathematical problems. The study found OpenAI models "stand out for extreme token efficiency in math problems," using up to three times fewer tokens than other commercial models.

Among open-source options, Nvidia's llama-3.3-nemotron-super-49b-v1 emerged as "the most token efficient open weight model across all domains," while newer models from companies like Magistral showed "exceptionally high token usage" as outliers.

The efficiency gap varied significantly by task type. While open models used roughly twice as many tokens for mathematical and logic problems, the difference ballooned for simple knowledge questions where efficient reasoning should be unnecessary.
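As a rough illustration of the token-efficiency comparison described above, the sketch below averages billed completion tokens over a set of prompts and takes the ratio between an open-weight and a closed model. The response structure (a dict exposing a usage field with completion_tokens, as many chat-completion-style APIs report) and all numbers are assumptions for illustration, not the study's evaluation code.

```python
# A minimal sketch of the proxy metric: since closed models hide their raw
# chain of thought, the billed completion tokens for each query stand in for
# reasoning effort. The response shape below is an assumed convention, not
# the study's actual tooling.
from statistics import mean

def completion_tokens(response: dict) -> int:
    """Billed output tokens, covering hidden reasoning plus the final answer."""
    return response["usage"]["completion_tokens"]

def avg_tokens(responses: list[dict]) -> float:
    return mean(completion_tokens(r) for r in responses)

def efficiency_ratio(open_responses: list[dict], closed_responses: list[dict]) -> float:
    """How many times more tokens the open-weight model spent on the same prompts."""
    return avg_tokens(open_responses) / avg_tokens(closed_responses)

# Illustrative numbers only: an open model averaging 2,400 completion tokens
# against a closed model averaging 800 yields a ratio of 3.0.
open_runs = [{"usage": {"completion_tokens": t}} for t in (2200, 2500, 2500)]
closed_runs = [{"usage": {"completion_tokens": t}} for t in (750, 800, 850)]
print(f"token-efficiency ratio: {efficiency_ratio(open_runs, closed_runs):.1f}x")
```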
What enterprise leaders need to know about AI computing costs

The findings have immediate implications for enterprise AI adoption, where computing costs can scale rapidly with usage. Companies evaluating AI models often focus on accuracy benchmarks and per-token pricing, but may overlook the total computational requirements for real-world tasks.

"The better token efficiency of closed weight models often compensates for the higher API pricing of those models," the researchers found when analyzing total inference costs.

The study also revealed that closed-source model providers appear to be actively optimizing for efficiency. "Closed weight models have been iteratively optimized to use fewer tokens to reduce inference cost," while open-source models have "increased their token usage for newer versions, possibly reflecting a priority toward better reasoning performance."

How researchers cracked the code on AI efficiency measurement

The research team faced unique challenges in measuring efficiency across different model architectures. Many closed-source models don't reveal their raw reasoning processes, instead providing compressed summaries of their internal computations to prevent competitors from copying their techniques.

To address this, researchers used completion tokens -- the total computational units billed for each query -- as a proxy for reasoning effort. They discovered that "most recent closed source models will not share their raw reasoning traces" and instead "use smaller language models to transcribe the chain of thought into summaries or compressed representations."

The study's methodology included testing with modified versions of well-known problems to minimize the influence of memorized solutions, such as altering variables in mathematical competition problems from the American Invitational Mathematics Examination (AIME).

The future of AI efficiency: What's coming next

The researchers suggest that token efficiency should become a primary optimization target alongside accuracy for future model development. "A more densified CoT will also allow for more efficient context usage and may counter context degradation during challenging reasoning tasks," they wrote.

The release of OpenAI's open-source gpt-oss models, which demonstrate state-of-the-art efficiency with "freely accessible CoT," could serve as a reference point for optimizing other open-source models. The complete research dataset and evaluation code are available on GitHub, allowing other researchers to validate and extend the findings.

As the AI industry races toward more powerful reasoning capabilities, this study suggests that the real competition may not be about who can build the smartest AI -- but who can build the most efficient one. After all, in a world where every token counts, the most wasteful models may find themselves priced out of the market, regardless of how well they can think.
[3]
This Open-Source AI Model is Breaking Your Budget
The global AI race highlights that cost transparency and efficiency matter as much as accessibility in artificial intelligence.

The world of artificial intelligence is changing quickly, with open-source models becoming more popular than ever. Companies, researchers, and governments often see open-source models as cheaper and more flexible than closed systems owned by private firms. However, new research reveals that these models, which were expected to save money, may be causing budgets to spiral out of control. The problem lies in hidden costs, especially in how many tokens these models use to complete a single task. On the surface, an open-source AI model may look cheaper, but in practice it is often far less efficient than expected. This has turned the idea of "affordable AI for all" into a difficult challenge for businesses and institutions.
A new study finds that open-source AI models consume significantly more computing resources than closed-source alternatives, potentially offsetting their initial cost advantages.
A groundbreaking study by Nous Research has uncovered a significant efficiency gap between open-source and closed-source AI models, challenging the prevailing notion that open-source options are more cost-effective [1]. The research, which examined 19 different AI models across various task categories, found that open-source models consistently consume more computational resources than their closed-source counterparts [2].
The study introduced "token efficiency" as a critical metric for evaluating AI model performance. Tokens, which represent units of text or data processed by AI models, serve as a proxy for computing power consumption. The findings revealed that open-weight models use 1.5 to 4 times more tokens than closed models for identical tasks, with the disparity widening to up to 10 times for simple knowledge questions [1][2].

This inefficiency is particularly pronounced in Large Reasoning Models (LRMs), which employ extended "chains of thought" to solve complex problems. These models can consume thousands of tokens pondering simple questions that should require minimal computation [2].

The research has significant implications for businesses adopting AI technologies. While open-source models may appear cheaper initially due to lower per-token costs, their higher token consumption can quickly erode this advantage [3]. Companies evaluating AI models often focus on accuracy benchmarks and per-token pricing but may overlook the total computational requirements for real-world tasks [2].
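To make the notion of a token concrete, the short sketch below splits a simple knowledge question into the units that models are billed on, using the tiktoken library. The choice of encoding is an assumption for illustration and not necessarily the tokenizer used by any of the models in the study.

```python
# Tokenizing a short question to show what a "token" is: a word, a word
# fragment, or punctuation. Requires the tiktoken package (pip install tiktoken);
# the cl100k_base encoding is chosen only for illustration.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "What is the capital of Australia?"
token_ids = enc.encode(text)

print(f"{len(token_ids)} tokens")
for tid in token_ids:
    # Each ID maps back to a word, part of a word, or punctuation.
    print(tid, repr(enc.decode([tid])))
```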
Among the models tested, OpenAI's offerings demonstrated exceptional token efficiency, particularly in mathematical problem-solving. The o4-mini and newly released open-source gpt-oss variants used up to three times fewer tokens than other commercial models for math problems [1][2].

In the open-source category, Nvidia's llama-3.3-nemotron-super-49b-v1 emerged as the most token-efficient model across all domains. Conversely, newer models from companies like Magistral showed exceptionally high token usage [1][2].

The study revealed that the efficiency gap varied significantly depending on the type of task: open models used roughly twice as many tokens as closed models for mathematical and logic problems, but up to ten times more for simple knowledge questions [1][2].
The researchers suggest that token efficiency should become a primary optimization target alongside accuracy for future model development. They noted that closed-source model providers appear to be actively optimizing for efficiency, while open-source models have increased their token usage in newer versions, possibly prioritizing reasoning performance over efficiency [2].

The research team faced unique challenges in measuring efficiency across different model architectures. To address the lack of transparency in closed-source models' reasoning processes, they used completion tokens as a proxy for reasoning effort. The study also employed modified versions of well-known problems to minimize the influence of memorized solutions [2].
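As a simplified illustration of that anti-memorization step, the sketch below regenerates a competition-style problem with fresh constants so a model cannot simply recall a memorized answer. The problem template and number ranges are hypothetical and do not reproduce the study's actual AIME modifications.

```python
# Simplified illustration of perturbing a well-known problem: swap in new
# numeric values so the wording stays familiar but the memorized answer no
# longer applies. The template and ranges here are hypothetical placeholders.
import random

TEMPLATE = (
    "Find the number of ordered pairs of positive integers (a, b) "
    "such that a + b = {n} and gcd(a, b) = {d}."
)

def perturbed_problem(seed: int) -> str:
    """Generate a variant of the template with fresh constants."""
    rng = random.Random(seed)
    d = rng.choice([2, 3, 5, 7])
    n = d * rng.randint(50, 400)   # keep n divisible by d so the problem stays solvable
    return TEMPLATE.format(n=n, d=d)

for seed in range(3):
    print(perturbed_problem(seed))
```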
As the global AI race intensifies, this study highlights that cost transparency and efficiency are as crucial as accessibility in artificial intelligence. The findings underscore the need for a more nuanced approach to AI model selection and deployment, particularly for enterprises looking to optimize their AI investments [3].