- AI token costs have surged so sharply that Microsoft and Uber have each reportedly burned through months of allocated AI budget ahead of schedule.
- Wall Street analysts are rapidly reversing bullish AI positions, with some explicitly warning that AI token costs signal a speculative bubble.
- The crisis is forcing enterprises to confront an uncomfortable question: are the productivity gains actually worth the infrastructure bills?
- AI token costs are now a boardroom-level concern, reshaping how companies plan, budget, and justify their generative AI deployments.
AI Token Costs Are Eating Corporate Budgets Alive
The AI spending boom has produced a very awkward side effect: AI token costs are now growing so fast that even the world’s best-capitalised tech companies can’t keep up with their own projections. Microsoft and Uber have both reportedly exhausted AI budget allocations that were supposed to last months — blown through in a fraction of the expected time. It’s the kind of story that tends to stay quiet until it can’t anymore, and right now, it can’t.
Tokens are the basic unit of work for large language models. Models split text into tokens, chunks that are often a short word or a fragment of a longer one, and everything you feed into a model, along with every response it generates, is measured and billed in tokens. At the scale these companies operate, with millions of interactions per day, costs don’t creep up. They detonate. A single enterprise deployment touching thousands of employees across dozens of workflows can rack up token bills that dwarf even generous budget forecasts.
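To make the arithmetic concrete, here is a minimal Python sketch of how a token bill scales with request volume. Everything in it is an assumption for illustration: the `Workload` fields, the request counts, and especially the prices, which are placeholders rather than any vendor's actual rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A hypothetical AI feature described by its daily traffic."""
    requests_per_day: int
    input_tokens_per_request: int
    output_tokens_per_request: int


def monthly_cost_usd(
    workload: Workload,
    input_price_per_million: float,
    output_price_per_million: float,
    days: int = 30,
) -> float:
    """Estimate monthly spend as tokens processed times price per token."""
    input_tokens = workload.requests_per_day * workload.input_tokens_per_request * days
    output_tokens = workload.requests_per_day * workload.output_tokens_per_request * days
    return (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )


# Placeholder prices in USD per million tokens (assumed, not real rates).
INPUT_PRICE = 5.0
OUTPUT_PRICE = 15.0

# Same feature, same prompt sizes; only the request volume differs.
small = Workload(requests_per_day=500, input_tokens_per_request=1_000, output_tokens_per_request=500)
large = Workload(requests_per_day=2_000_000, input_tokens_per_request=1_000, output_tokens_per_request=500)

small_cost = monthly_cost_usd(small, INPUT_PRICE, OUTPUT_PRICE)
large_cost = monthly_cost_usd(large, INPUT_PRICE, OUTPUT_PRICE)
print(f"500 requests/day:       ${small_cost:,.2f} per month")
print(f"2,000,000 requests/day: ${large_cost:,.2f} per month")
```

Under these made-up inputs the cost is strictly linear in volume, so a feature that looks negligible at a few hundred requests a day costs 4,000 times as much at two million. Nothing about the per-token price changes; only the multiplier does.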
The problem isn’t new, exactly. Anyone who’s spent time with the OpenAI API pricing page knows the per-token numbers look manageable in isolation. It’s at scale, compounded across an entire product surface, that the economics get uncomfortable. What’s changed is that enterprises are now deep enough into their AI rollouts to feel the full weight of that reality — and for some, the numbers aren’t working out.
Microsoft and Uber: Two Very Different Companies, Same Problem
Microsoft’s exposure here is almost structurally inevitable. The company has made generative AI the centrepiece of its product strategy, embedding OpenAI models into everything from Word and Excel via Copilot to Azure cloud services and GitHub. That’s an enormous surface area for AI token costs to accumulate. When you’re the company that bet the house on AI integration across every major product line, there’s simply no natural ceiling on token consumption — every new user, every new feature, every new workflow adds to the bill.
Uber’s situation is arguably more telling. Uber isn’t an AI-first company in the way Microsoft has tried to become. It’s a logistics and marketplace platform that has, like most large enterprises, been rolling out AI tools to improve operations, customer service, and internal productivity. If a company like Uber — where AI is a tool rather than the product itself — is blowing through AI budgets, it suggests the cost problem isn’t limited to the obvious big spenders. It’s an enterprise-wide phenomenon.
The uncomfortable truth is that a lot of AI deployments were greenlit based on pilot-phase economics. A proof of concept running a few hundred queries a day looks very different from the same feature serving an entire organisation. Finance teams approved budgets built on those early numbers, and now the real-world figures are arriving. They’re not pretty.
Wall Street’s Mood Has Shifted, and Quickly
For much of the past two years, Wall Street’s dominant posture on AI was straightforward enthusiasm. Analysts upgraded tech stocks on AI exposure. Earnings calls were won or lost on the strength of AI narratives. Companies that could credibly attach the word ‘AI’ to their growth story were rewarded handsomely.
That consensus is cracking. A growing cohort of analysts is now openly flagging the divergence between AI capital expenditure and demonstrable revenue returns. The math is getting harder to ignore: companies are spending tens of billions of dollars on AI infrastructure — Nvidia’s data centre revenue tells that story clearly enough — but the corresponding productivity gains and new revenue streams haven’t materialised at the pace the bull case required.
The word ‘bubble’ is getting used more frequently, and not just by the usual contrarian voices. When mainstream sell-side analysts start circulating notes questioning AI ROI, something has shifted in the room. It doesn’t mean AI is valueless — far from it — but it does mean the market is starting to demand evidence rather than promises, which is a very different environment from where we were eighteen months ago.
The Real Question: Is the Value There?
Here’s what makes this moment genuinely interesting rather than just another tech spending overhang story. The question isn’t really whether AI is useful. It clearly is, in specific contexts, for specific tasks. The question is whether the value being generated is proportional to what’s being spent — and right now, for a lot of companies, the honest answer is ‘we’re not sure.’
That uncertainty is the core of the bubble concern. Speculative overshoots in tech aren’t usually about technology that doesn’t work. They’re about technology whose deployment costs outrun the value it delivers, at least in the medium term. Fibre-optic cables in the late nineties worked fine. There was just too much of the stuff relative to immediate demand. AI token costs are starting to feel structurally similar — the technology functions, but the billing model and the scale of deployment are getting ahead of the demonstrable returns.
Enterprises are already starting to respond. There’s growing interest in smaller, more efficient models for high-volume routine tasks: the kind of work that doesn’t need GPT-4-class intelligence but has been routed to expensive frontier models anyway because it was the path of least resistance. There’s also renewed focus on caching responses to repeated queries, trimming prompts, and tiering access to the most expensive models, all essentially cost-containment strategies dressed up in technical language.
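Two of those strategies, routing routine work to a cheaper model and caching repeated queries, can be sketched in a few lines of Python. This is an illustrative sketch only: the model names, the `ROUTINE_TASKS` set, and the `fake_model` stand-in are all hypothetical, and a real deployment would call an actual provider API and would need cache expiry and invalidation.

```python
import hashlib

# Hypothetical task types considered simple enough for a cheaper model.
ROUTINE_TASKS = {"classify", "extract", "summarise_short"}


def choose_model(task_type: str) -> str:
    """Send routine, high-volume task types to the cheaper tier."""
    return "small-model" if task_type in ROUTINE_TASKS else "frontier-model"


class CachedClient:
    """Wraps a model-calling function and reuses answers for repeated prompts."""

    def __init__(self, call_model):
        self._call_model = call_model  # callable(model, prompt) -> str
        self._cache = {}
        self.model_calls = 0  # number of billable calls actually made

    def ask(self, task_type: str, prompt: str) -> str:
        model = choose_model(task_type)
        # Key on model and prompt so different tiers never share an answer.
        key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._call_model(model, prompt)
            self.model_calls += 1
        return self._cache[key]


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real API call so the sketch runs offline."""
    return f"[{model}] answer to: {prompt}"


client = CachedClient(fake_model)
first = client.ask("classify", "Is this ticket about billing?")
second = client.ask("classify", "Is this ticket about billing?")  # served from cache
third = client.ask("draft_contract", "Draft a supplier agreement.")
print(client.model_calls)  # three requests, two billable calls
```

An exact-match cache like this only helps when identical prompts recur, which is more common in templated internal workflows than in free-form chat.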
What Comes Next for AI Spending
The companies building AI infrastructure, including Nvidia, Microsoft’s Azure, Google Cloud, and Amazon Web Services, aren’t going to feel this pain directly in the near term. The budgets are already committed, the hardware is ordered, the contracts are signed. The pressure shows up in the next budget cycle, in Q3 and Q4 earnings calls where CFOs start facing sharper questions about AI ROI, and in the internal politics of enterprise IT departments where AI projects compete with everything else for finite dollars.
What’s likely coming is a rationalisation phase. Not a retreat from AI — that’s not happening — but a much more disciplined approach to which AI deployments get funded, at what scale, and on what timeline. The era of ‘move fast and figure out the costs later’ is running into a wall of actual invoices, and boards are paying attention.
The companies that navigate this well will be the ones that treat AI like any other infrastructure investment: with rigorous cost-benefit analysis, clear use-case prioritisation, and honest accounting of what’s working and what’s burning money. That’s less exciting than the hype cycle suggested, but it’s probably healthier — and it’s where the industry is clearly heading, whether the most enthusiastic AI boosters want to admit it or not.
Source: 富途牛牛
Frequently Asked Questions
Why are AI token costs rising so fast for large enterprises?
Every query sent to a large language model consumes tokens — units of text processed by the model. At enterprise scale, millions of daily queries add up with brutal speed. As companies expand AI features to more users and workflows, consumption compounds, often far outpacing the budgets originally set by finance teams.
How are AI token costs affecting Microsoft specifically?
Microsoft, which has deeply integrated OpenAI models across its Copilot product suite, has reportedly seen AI inference costs exhaust budget allocations that were meant to last months. The sheer breadth of its deployment — across Office, Azure, and developer tools — makes it one of the most exposed companies to runaway token spend.
Is the AI spending boom heading toward a bubble?
A growing number of Wall Street analysts think so. The concern isn’t that AI lacks utility — it’s that the gap between what companies are spending on AI infrastructure and what they’re recovering in measurable revenue is widening. That imbalance is the classic anatomy of a speculative overshoot.
What can companies do to control AI token costs?
Options include switching to smaller, cheaper models for routine tasks, implementing token budgets per user or team, caching frequent queries to avoid repeat processing, and auditing which AI features actually drive business value versus which are simply expensive novelties.
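As one illustration of the token-budget idea, the Python sketch below tracks spend per team against a monthly cap and refuses requests that would exceed it. The `TokenBudget` class, the team names, and the limits are hypothetical; a production version would also need persistence, monthly resets, and a policy for what happens when a request is refused.

```python
from typing import Dict


class TokenBudget:
    """Tracks token usage per team against a fixed monthly limit."""

    def __init__(self, monthly_limits: Dict[str, int]):
        self._limits = dict(monthly_limits)
        self._used = {team: 0 for team in monthly_limits}

    def remaining(self, team: str) -> int:
        """Tokens the team can still spend this month."""
        return self._limits[team] - self._used[team]

    def try_spend(self, team: str, tokens: int) -> bool:
        """Record the spend and return True, or return False if it would exceed the cap."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if tokens > self.remaining(team):
            return False
        self._used[team] += tokens
        return True


# Illustrative limits only.
budget = TokenBudget({"support": 1_000_000, "marketing": 250_000})

allowed = budget.try_spend("support", 600_000)   # fits within the cap
refused = budget.try_spend("support", 500_000)   # would exceed the cap, so nothing is recorded
print(allowed, refused, budget.remaining("support"))
```

Whether a refused request should fail outright, fall back to a cheaper model, or trigger an approval step is a policy decision this sketch leaves open.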

