4 Sources
[1]
Tokens may soon drive the AI economy
A new economic reality is starting to take hold in AI. It already underpins the industry's giant data centres and it will one day become an iron rule for all companies that use machine-generated intelligence. That, at least, is according to Jensen Huang, chief executive of Nvidia, who promoted the idea heavily at his company's main annual tech event this week. His theory helps to make a case for Nvidia's continued dominance in chips. But it also reveals how far the industry has to go to make a wider case for the technology.

Huang's take on AI economics is based around the production, consumption and monetisation of tokens. These are the most basic units of output from large language models: it takes about 1,300 tokens to generate 1,000 words of text. The key metric, he argues, is the cost per token of output. And as the main input into AI-powered services, he adds, tokens translate directly into revenue.

It is not hard to see why the Nvidia boss wants a nervous Wall Street to focus on token economics. Forget the gargantuan capital spending or the fact that so many competitors are lining up to eat into Nvidia's fat profit margins, he seems to be saying: as long as his company's chips keep pumping out tokens at the lowest cost and as long as demand for tokens continues to far outstrip supply, then all is well with the AI boom.

As a theory of Nvidia's continued pre-eminence, it sounds compelling. But if token economics is ever to rule the AI world in the way that Huang predicts, some important gaps need to be filled in. One is the lack of a clear link between the production of tokens and the creation of value for customers. Just because the cost of tokens is falling doesn't mean the services created with AI suddenly become valuable or that this will automatically generate revenue across the industry, as Huang suggests.

Complicating this picture is the fact that newer AI models consume far larger numbers of tokens. The "reasoning" models that emerged late in 2024, starting with OpenAI's o1, perform far more work to arrive at an answer. These are now being supplemented by agents, which promise to automate white-collar work and bring an explosion in token use -- and, by extension, hefty bills for companies that give workers unlimited use of AI.

Nvidia and the rest of the AI industry have barely scratched the surface when it comes to showing how this will translate into revenue for their customers. In software engineering, which has seen the first widespread use of AI agents, there have been efforts to measure how token use is linked to output and to use this to apportion tokens to workers. Eventually, tech companies dream of AI becoming a core part of employment, with the cost of all white-collar workers coming to be seen as a salary plus a certain number of tokens per month. For now, that is still only a pipe dream.

The second significant piece missing from Huang's narrative about an emerging token economy is how the companies that produce tokens, the raw commodity on which all of this depends, will make profits. If these "AI factories" all use Nvidia's latest chips, then it may be hard for any of them to gain a cost-per-token advantage or retain any pricing power. The big price declines that have accompanied the plunging cost of producing tokens seem to bear this out. When OpenAI launched GPT-4 two years ago, for instance, it charged $33 for 1mn tokens. Today, it charges only 9 cents for 1mn tokens produced by its cheapest model.
That may be great for customers, but it has fed worries about commoditisation. Such worries are hardly new. It is the same argument that was heard in the early days of cloud computing, when Amazon Web Services charged for access to basic data storage and computing power. How could cloud companies ever make a decent profit if computing services were stripped back and sold as commodities like this? The answer was that these were only the first components of what became higher-value services -- full-scale computing platforms on which customers could run their businesses. Whether OpenAI and Anthropic will be able to work a similar trick is unclear, but the opportunity before them is clear.

There may be other explanations for the healthy profit margins in cloud computing. The business is ruled by a small oligopoly. Cloud companies have also faced pressure from regulators to reduce switching costs that may help to pad their profits. For now, there is no shortage of competition among frontier AI companies. How that shakes out in future will go a long way to shaping the industry's profits.
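A quick back-of-envelope calculation makes the scale of that decline concrete. The sketch below uses only the figures quoted above (roughly 1,300 tokens per 1,000 words, and $33 versus 9 cents per million tokens); the Python is illustrative and not drawn from any of the sources.

# Rough cost of generating 1,000 words of text, then and now, using the
# article's figures. Prices are per million tokens; all numbers approximate.
TOKENS_PER_1000_WORDS = 1_300

price_then = 33.00 / 1_000_000   # dollars per token at GPT-4's launch
price_now = 0.09 / 1_000_000     # dollars per token, cheapest model today

cost_then = TOKENS_PER_1000_WORDS * price_then   # ~ $0.043
cost_now = TOKENS_PER_1000_WORDS * price_now     # ~ $0.00012

print(f"1,000 words then: ${cost_then:.5f}")
print(f"1,000 words now:  ${cost_now:.6f}")
print(f"Price per token fell roughly {price_then / price_now:.0f}x")  # ~367x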
[2]
Tech Employees Are Reportedly Being Evaluated by How Fast They Burn Through LLM Tokens
According to a column by the New York Times' Kevin Roose, employees at companies including Meta and OpenAI compete on "internal leaderboards that show how many tokens[...]each worker consumes." At Meta in particular (and also Shopify), Roose says the volume of A.I. used has become a metric that goes into people's evaluations, with managers "rewarding workers who make heavy use of A.I. tools and chastening those who don't."

Analogies are tricky here. One is tempted to say it's like making painters compete to use the most paint, but even if the paint is just being splattered as quickly as possible, it's at least going to be visible when the project is done. It's a bit more like telling soldiers to gauge their battlefield success by the number of bullets fired, but suppressive fire that doesn't hit anything has its place in war strategy. The best analogy I can come up with is this: it's like NBA mascots being evaluated by how many t-shirts they fire out of their t-shirt cannons, but the t-shirts are made by Hermès.

The resulting numbers, in terms of both tokens and money, are absolutely staggering. One OpenAI engineer, according to Roose, burned through 210 billion tokens, which Roose equates to 33 Wikipedias. A Swedish software engineer claims to Roose that his company spends more than his salary on his Claude Code tokens alone.

This "tokenmaxxing" trend clearly stems in part from the use of "claws," agentic AI platforms like OpenClaw, which are this year's biggest supposed innovation in AI. OpenClaw's virality was part of the big shift away from OpenAI's GPT models and toward Claude this year by AI fanatics, and OpenAI subsequently hired OpenClaw's creator, seemingly in a bid to maintain its position as the industry leader. But even when used without an external claw platform, Claude Code is becoming more and more like OpenClaw lately, with a feature rolling out last week that enables more on-the-go vibe coding by letting users communicate with Claude Code more easily on their phones. The promo even features a little 4-bit sprite of a lobster, or possibly a crab (a red crustacean, at any rate), the new symbol for LLM token profligacy.

But tokenmaxxing speaks to a wider issue in which these companies tout the sheer number of tokens processed as a marker of success. OpenAI president Greg Brockman bragged about a week ago that the coding-oriented GPT-5.4 processes 5 trillion tokens per day, which, in fairness, makes sense as a way to please investors, because tokens cost money. But 5 trillion is a huge number, you have to admit. Did you know that Ronald McDonald's Big Red Shoe car in the Macy's Thanksgiving Day Parade would fit a men's size 266 foot if it were to have an actual foot in it? Aren't big numbers cool?
[3]
AI Adoption Is Being Measured in Tokens, but the Metric Falls Short, Experts Say | PYMNTS.com
Tokens are the foundational unit through which AI models process all information: tiny units of data that result from breaking larger chunks of information into smaller pieces. AI models process tokens to learn relationships between them and unlock capabilities such as prediction, generation and reasoning. For large language models, short words may be represented by a single token, while longer words may be split into two or more tokens. The word "darkness," for example, would be split into two tokens, "dark" and "ness," with each token bearing a numerical representation, as explained by Nvidia.

Every prompt a worker sends to an AI system and every response the system returns are measured, and often billed, in tokens. That direct relationship between usage and cost is what makes tokens attractive as a management tool. Unlike the seat-based pricing that defined earlier generations of enterprise software, token consumption is granular, real-time and tied directly to behavior.

The shift from seat counts to token consumption mirrors how enterprise AI spending itself has changed. While the unit price of AI tokens is falling, overall enterprise spending on and scaling of AI systems is rising. The number of users, the complexity of models and the intensity of workloads will likely drive greater token consumption and, consequently, higher costs.

OpenAI's own data on its enterprise customer base illustrates how dramatically usage patterns have shifted. Average reasoning token consumption per organization has increased by approximately 320 times in the past 12 months, suggesting that more intelligent models are being systematically integrated into expanding products and services. That figure has become a headline metric in the company's internal reporting on adoption progress.

Nvidia CEO Jensen Huang went further at the company's GTC conference this week, framing tokens as a new form of corporate currency. "I could totally imagine in the future every single engineer in our company will need an annual token budget," Huang said, estimating that employee token allocations could reach half of base salary in value.

The appeal of token metrics runs into a fundamental problem: tokens measure volume, not outcome. Consuming AI through packaged software abstracts tokens away almost entirely, while consuming it through APIs makes them explicit; that brings transparency, but also volatility, as costs rise with workload design, prompt length and the hidden choices of infrastructure providers. A poorly structured prompt that forces the model to iterate, rephrase or regenerate a response will consume more tokens than a concise, well-targeted query, yet either one may or may not produce useful output. If an AI agent saves a customer service representative 15 minutes of work but costs $4 in inference tokens to run, the ROI is negative, as explained by AnalyticsWeek. This kind of unit-economics mismatch becomes more visible as companies move from pilots to production deployments: as experimental chatbots give way to thousands of autonomous "agentic" workflows running around the clock, the sheer volume of tokens consumed has created a massive budgetary leak.
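To make the mechanics concrete, here is a minimal sketch of that token-level accounting in Python, assuming the openly available tiktoken tokenizer is installed. The per-million-token prices, the token counts for the agent run, and the $15/hour loaded labour cost are illustrative assumptions, not figures from the article, and exact token splits vary by model.

import tiktoken

# 1. How metering works: an API bills by counting the BPE tokens in each request.
enc = tiktoken.get_encoding("cl100k_base")   # splits vary by model and tokenizer
print(enc.encode("darkness"))                # a word becomes one or more integer ids

# 2. Turning a token count into a bill (illustrative prices, dollars per 1M tokens).
PRICE_IN, PRICE_OUT = 3.00, 15.00
def cost(in_tokens: int, out_tokens: int) -> float:
    return in_tokens / 1e6 * PRICE_IN + out_tokens / 1e6 * PRICE_OUT

# 3. The unit-economics check described above: an agent run that bills roughly $4
#    but saves 15 minutes of a rep's time, at an assumed $15/hour loaded cost.
run_cost = cost(300_000, 200_000)            # ~ $3.90 for a long agentic run
value_saved = 15 / 60 * 15.00                # ~ $3.75 of time saved
print(f"run cost ${run_cost:.2f}, value saved ${value_saved:.2f}, "
      f"ROI positive: {value_saved > run_cost}")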
The dynamic draws comparisons to earlier enterprise metrics that proved easier to game than to interpret. Click-through rates once served as a proxy for advertising effectiveness; hours logged once functioned as a proxy for productivity. Both created incentives that diverged from the outcomes they were meant to track. If token consumption becomes a performance indicator tied to employee evaluations, workers may optimize for AI interaction frequency rather than task quality. Knowing that "AI spend is up 40%" is not enough. Organizations need a single pane of glass that links every workload, tenant and token to their owners or business outcomes.
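As a sketch of what that linkage could look like in practice, the snippet below aggregates hypothetical usage records by owner so token spend can be read per team and per outcome rather than as one undifferentiated bill; the record fields and numbers are invented for illustration and do not come from any of the sources.

from collections import defaultdict

# Hypothetical usage records: every call is tagged with a workload and an owner,
# plus whatever business outcome that workload is supposed to move.
usage = [
    {"workload": "support-agent",   "owner": "cs-team",  "tokens": 1_200_000, "tickets_closed": 340},
    {"workload": "support-agent",   "owner": "cs-team",  "tokens":   900_000, "tickets_closed": 210},
    {"workload": "code-review-bot", "owner": "platform", "tokens": 4_800_000, "tickets_closed": 0},
]

totals = defaultdict(lambda: {"tokens": 0, "outcomes": 0})
for rec in usage:
    totals[rec["owner"]]["tokens"] += rec["tokens"]
    totals[rec["owner"]]["outcomes"] += rec["tickets_closed"]

# Report tokens per outcome, not just tokens: a large token count with zero
# outcomes is exactly the "volume without value" problem described above.
for owner, t in totals.items():
    per_outcome = t["tokens"] / t["outcomes"] if t["outcomes"] else float("inf")
    print(owner, t["tokens"], "tokens,", t["outcomes"], "outcomes,",
          f"{per_outcome:,.0f} tokens per outcome")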
[4]
What is tokenmaxxing: A game employees are playing to show how much AI they use
Remember when the most competitive thing at work was who could reply to emails fastest? Those were simpler times. Meet tokenmaxxing - the new workplace sport where employees compete to burn through as many AI tokens as possible, because nothing says "I'm indispensable" quite like a six-figure compute bill on your employer's tab.

When you type a prompt into an AI tool, the model breaks your words into small chunks called tokens, roughly three-quarters of a word each. Every question you ask, every document you generate, every time you make Claude rewrite your passive-aggressive email into something professional: that's tokens, and tokens cost money. Tokenmaxxing is what happens when companies start tracking how many of those tokens each employee uses and turn it into a performance metric.

At Meta and Shopify, AI usage has reportedly made its way into performance reviews. Use a lot of AI? Gold star. Barely touch it? You might want to update your LinkedIn.

Some companies have gone further, setting up internal leaderboards that rank employees by consumption. The result is that workers deliberately pump up their AI usage, not necessarily to get better work done, but to be seen using AI. Nvidia CEO Jensen Huang may have poured fuel on the fire when he floated the idea of giving engineers a token budget as compensation on top of their salary. The pitch was that top engineers could rack up $250,000 a year in AI compute spend. Tokens, in other words, as the new signing bonus.

That's where it gets murky. Critics are quick to point out that measuring token usage to gauge productivity is a bit like counting how many keystrokes a writer makes. Volume is not the same as value. You can consume an enormous number of tokens asking AI to generate haikus about your commute. There's also a surveillance angle: tools now exist that let managers track AI usage per employee and assess whether it's actually translating into output. Tokenmaxxing is what you get when hustle culture discovers AI and creates a race to perform productivity rather than achieve it. The irony is that the workers gaming the leaderboard are probably asking AI to help them do that too.
Companies including Meta and Shopify are evaluating employees based on how many AI tokens they consume, with internal leaderboards tracking usage. Nvidia CEO Jensen Huang predicts token budgets could reach half an engineer's salary. But critics warn that measuring AI adoption through token consumption conflates volume with actual business outcomes.
A fundamental shift is underway in how companies measure AI adoption, and it centers on a surprisingly granular unit: AI tokens. These tiny data fragments, the basic units of output from large language models, are rapidly becoming the metric by which employees at major tech companies are evaluated [1]. At companies including Meta and OpenAI, workers now compete on internal leaderboards showing token consumption, with managers at Meta and Shopify reportedly rewarding heavy AI tool usage and questioning those who don't [2]. The phenomenon has spawned a new term: tokenmaxxing, where employees deliberately maximize their AI usage not necessarily to improve work quality, but to demonstrate they're embracing the technology [4].
The push toward token-based measurement gained significant momentum when Jensen Huang, Nvidia's CEO, promoted token economics heavily at the company's annual GTC conference this week [1]. Huang argued that cost per token should become the key metric for the AI industry, suggesting that every engineer could eventually receive an annual token budget potentially worth half their base salary, up to $250,000 for top engineers [1][4]. His theory positions tokens as directly translating into revenue, making a case for Nvidia's continued dominance as long as its chips keep producing tokens at the lowest cost [1]. The numbers involved are staggering: one OpenAI engineer burned through 210 billion tokens, equivalent to 33 Wikipedias, while OpenAI president Greg Brockman recently boasted that GPT-5.4 processes 5 trillion tokens per day [2].
Yet experts increasingly warn that token consumption as a performance metric has a fundamental flaw: it measures volume, not outcome [3]. A poorly structured prompt that forces a model to iterate or regenerate will consume more tokens than a concise query, yet both may produce equally useful (or useless) output [3]. The ROI problem becomes stark in practical scenarios: if an AI agent saves a customer service representative 15 minutes but costs $4 in inference tokens, the economics are negative [3]. One Swedish software engineer claims his company spends more on his Claude Code tokens than his entire salary [2]. This disconnect raises serious questions about whether the AI industry has established a clear link between token production and actual value creation for customers [1].
Source: PYMNTS
The economics become even more complex when examining how the companies producing tokens (what Huang calls "AI factories") can maintain profitability [1]. Price declines have been dramatic: when OpenAI launched GPT-4 two years ago, it charged $33 for 1 million tokens; today, its cheapest model costs just 9 cents for the same amount [1]. This commoditisation mirrors concerns from the early days of cloud computing, when observers questioned how Amazon Web Services could profit from selling basic storage and computing power [1]. While large language models process prompts and responses through tokens, the direct relationship between usage and cost makes tokens attractive as a management tool, but only if they correlate with productivity [3].

The trend toward evaluating employees by token usage creates incentives that may diverge from actual business outcomes [3]. When token consumption becomes tied to performance reviews, workers optimize for AI interaction frequency rather than task quality [3]. Critics compare it to earlier flawed metrics: measuring productivity by hours logged, or advertising effectiveness by click-through rates [3]. Tokenmaxxing represents what happens when hustle culture discovers AI, creating a race to perform productivity rather than achieve it [4]. OpenAI's own data shows average reasoning token consumption per organization has increased approximately 320 times in the past 12 months, suggesting more intelligent models are being integrated into expanding products and services [3]. But knowing "AI spend is up 40%" isn't enough; organizations need systems that link every workload and token to actual business outcomes [3].
Summarized by Navi