Gartner: AI Coding Costs to Exceed Developer Salaries by 2028

Gartner published a number last week that should recalibrate how every engineering team thinks about AI coding tools: by 2028, the per-developer cost of running AI coding agents will exceed the average developer’s salary. That is not a distant-future warning. Six percent of organizations are already there, and the first full billing cycle on GitHub Copilot’s new token-based pricing closes today.

The Gartner Forecast

The June 24 report puts the global average developer salary at roughly $2,000 per month and projects AI coding tool costs crossing that line within two years. The mechanism is straightforward: AI coding tool vendors are moving from flat per-seat pricing to consumption-based billing, so every token in, every token out, and every cached context block now has a price attached. Most engineering teams budgeted for subscriptions. They are now running an open tab.

Gartner senior analyst Nitish Tyagi put the governance problem bluntly: “Token discipline will not emerge through developer choice alone, as developers tend to optimize for speed and convenience over cost efficiency.” Translation: developers will not voluntarily choose the cheaper model when the expensive one is right there and the bill lands on someone else’s desk.

Uber Already Lived It

The cautionary tale is Uber. The company gave roughly 5,000 engineers access to Claude Code in late 2025. By April 2026, it had burned through its entire 2026 AI coding budget in four months. What accelerated it: Uber ranked engineers on internal leaderboards by Claude Code usage. An incentive to use more was an incentive to spend more, and the bill compounded silently until finance caught it.

Walmart capped tokens on its internal coding platform after adoption exceeded budget. Amazon warned engineers to stop using AI agents to climb leaderboards. Cisco’s chief product officer acknowledged that agent workloads require “meaningfully higher” infrastructure than chatbots. The pattern is consistent: teams discover the cost problem through a bill, not through a policy.

Today’s Reckoning

GitHub Copilot switched all plans to usage-based billing on June 1. Today, June 30, is the close of the first full 30-day billing cycle. Reports landing across Reddit, X, and GitHub’s own discussion threads are not encouraging. One developer on the $39 Pro+ plan burned eight percent of their monthly credit allowance in two hours. Others are projecting monthly totals in the hundreds of dollars for plans that nominally cost $39. The billing change was well-documented; the reality of what agentic mode actually consumes was not.

That gap between expectation and invoice is the Gartner problem in miniature. The shock is not a bug in the billing system. It is what happens when developers who built habits around flat-rate tools encounter consumption-based math for the first time.
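The reported burn rate is easy to extrapolate. The sketch below is a minimal linear projection, assuming usage continues at the same pace; the function names and the five-hours-per-day figure are illustrative assumptions, not anything GitHub publishes.

```python
def hours_until_exhausted(fraction_used: float, hours_elapsed: float) -> float:
    """Linear extrapolation: total hours of similar usage that consume 100% of an allowance."""
    if not 0.0 < fraction_used <= 1.0:
        raise ValueError("fraction_used must be in (0, 1]")
    if hours_elapsed <= 0.0:
        raise ValueError("hours_elapsed must be positive")
    return hours_elapsed / fraction_used


def working_days_until_exhausted(
    fraction_used: float, hours_elapsed: float, agent_hours_per_day: float
) -> float:
    """Same projection, expressed in working days at an assumed daily usage."""
    if agent_hours_per_day <= 0.0:
        raise ValueError("agent_hours_per_day must be positive")
    return hours_until_exhausted(fraction_used, hours_elapsed) / agent_hours_per_day


if __name__ == "__main__":
    # Figures from the report above: eight percent of the allowance in two hours.
    total_hours = hours_until_exhausted(0.08, 2.0)
    # Hypothetical assumption: five hours of agentic work per working day.
    days = working_days_until_exhausted(0.08, 2.0, 5.0)
    print(f"Allowance lasts about {total_hours:.0f} hours, or {days:.0f} working days")
```

At that pace the allowance covers about 25 hours of similar work, which is well short of a month for anyone who uses agentic mode daily.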

Why Agentic Mode Costs So Much More

A chat exchange with an AI assistant burns a few thousand tokens. Running an agent through a multi-file refactor or a debug session with tool calls burns 50,000 to 200,000 tokens, and the full context is resent on every tool call. At current frontier model rates, a single complex agentic session costs $2 to $8. Twenty sessions in a working day on a coding agent using Claude Fable 5 brings the daily bill to $40 to $160 before any subscription offset.

The context window arms race amplified this. Models now accept 200,000 to one million tokens. Agents use that capacity. Developers routinely run agents against entire repositories rather than scoping tasks tightly. Each extra token is cheap; each extra million tokens is not.
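Why resending context matters can be shown with a small model. This is a sketch under stated assumptions: the context starts at a fixed size and grows by a fixed amount per tool call, and every call resends everything so far. The class, the field names, and the sample numbers are hypothetical, and the price is a parameter rather than any vendor's actual rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionModel:
    base_context_tokens: int    # prompt size before the first tool call (assumed)
    tokens_added_per_call: int  # growth per call: tool output plus model reply (assumed)
    tool_calls: int             # number of tool calls in the session


def total_input_tokens(session: SessionModel) -> int:
    """Sum the input tokens billed across the session.

    Call k (0-indexed) resends the base context plus everything added by
    the k calls before it.
    """
    return sum(
        session.base_context_tokens + k * session.tokens_added_per_call
        for k in range(session.tool_calls)
    )


def input_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count to dollars at a caller-supplied price."""
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    short = SessionModel(base_context_tokens=8_000, tokens_added_per_call=1_000, tool_calls=12)
    long = SessionModel(base_context_tokens=8_000, tokens_added_per_call=1_000, tool_calls=24)
    print(total_input_tokens(short))  # 162000
    print(total_input_tokens(long))   # 468000
```

In this model, doubling the number of tool calls nearly triples the billed input tokens, because the growth term is quadratic in the number of calls. That is the reason long agent sessions cost disproportionately more than short ones.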

Three Things to Do This Week

Enable prompt caching. Cached reads cost roughly ten percent of standard input token rates across Anthropic, OpenAI, and Bedrock. ProjectDiscovery documented a 59 percent reduction in total LLM cost after implementing caching across their pipeline. System prompts, repeated context blocks, and shared instructions are prime candidates. This is the single highest-leverage change available without touching any workflow logic.
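To get a rough feel for the caching effect on input spend, here is a minimal sketch that uses the roughly ten percent cached-read rate quoted above. It deliberately ignores any premium vendors charge for writing to the cache, and it ignores output tokens, so treat it as an upper bound on input savings. The function names and the $5 price are illustrative assumptions.

```python
def input_cost_with_cache(
    total_input_tokens: int,
    cached_fraction: float,
    usd_per_million_tokens: float,
    cached_read_multiplier: float = 0.10,  # "roughly ten percent" per the text above
) -> float:
    """Input cost in dollars when part of the input is served from cache."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be in [0, 1]")
    cached = total_input_tokens * cached_fraction
    uncached = total_input_tokens - cached
    effective_tokens = uncached + cached * cached_read_multiplier
    return effective_tokens / 1_000_000 * usd_per_million_tokens


def input_savings_fraction(
    cached_fraction: float, cached_read_multiplier: float = 0.10
) -> float:
    """Fraction of input spend removed by caching, independent of price."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be in [0, 1]")
    return cached_fraction * (1.0 - cached_read_multiplier)


if __name__ == "__main__":
    # Hypothetical workload: one million input tokens, 70% of them a repeated prefix,
    # at an assumed price of $5 per million input tokens.
    print(f"${input_cost_with_cache(1_000_000, 0.70, 5.0):.2f}")
    print(f"{input_savings_fraction(0.70):.0%} saved on input")
```

The savings scale directly with how much of each request is a stable, repeated prefix, which is why system prompts and shared instructions are the first things to cache.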

Tier your models. Frontier models such as Claude Fable 5 and GPT-5.6 Sol are priced for the hardest reasoning work. Most coding tasks are not that hard. Open-weight models like Qwen3.6, GLM 5.2, and Kimi K2.7-Code cost 80 to 90 percent less per token and handle code completion, linting suggestions, and unit test generation cleanly. Route to frontier only when you need frontier.
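A routing rule can start out very simple. The sketch below assumes each request arrives with a task-type label; the labels, the `Tier` enum, and the default-to-frontier policy are hypothetical choices for illustration, not a description of any vendor's router.

```python
from enum import Enum


class Tier(Enum):
    OPEN_WEIGHT = "open_weight"
    FRONTIER = "frontier"


# Task types the text above describes as handled cleanly by cheaper models.
# The label strings themselves are assumptions.
ROUTINE_TASKS = frozenset({"code_completion", "lint_suggestion", "unit_test_generation"})


def choose_tier(task_type: str) -> Tier:
    """Send routine work to the cheap tier; everything else goes to frontier.

    Unknown task types default to frontier, trading cost for quality
    until someone classifies them explicitly.
    """
    if task_type in ROUTINE_TASKS:
        return Tier.OPEN_WEIGHT
    return Tier.FRONTIER


if __name__ == "__main__":
    for task in ("unit_test_generation", "multi_file_refactor"):
        print(task, "->", choose_tier(task).value)
```

Defaulting unknown tasks to the expensive tier is the conservative choice for quality. A team under budget pressure might invert it and require an explicit opt-in for frontier instead.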

Scope your context. Start a fresh session when you switch tasks. Load only the files relevant to the current change. Do not feed an agent a whole repository to fix a single function. The context window exists to handle complexity; using it by default rather than by design is how costs compound. Analysts describe this as context engineering, and it is increasingly the skill that separates cost-controlled teams from the ones surprised by their invoice.
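Scoping can also be enforced in tooling rather than left to habit. The following is a minimal sketch, assuming the caller already knows which files are relevant and lists them in priority order. The four-characters-per-token estimate is a crude assumption (real tokenizers differ), and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


def select_context(
    files: dict[str, str], relevant_paths: list[str], token_budget: int
) -> list[str]:
    """Pick files in priority order, skipping any that would exceed the budget.

    files maps path -> file contents; relevant_paths is ordered most
    relevant first. Paths missing from files are ignored.
    """
    selected: list[str] = []
    used = 0
    for path in relevant_paths:
        if path not in files:
            continue
        cost = estimate_tokens(files[path])
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try smaller files
        selected.append(path)
        used += cost
    return selected


if __name__ == "__main__":
    repo = {
        "billing/invoice.py": "x" * 400,     # about 100 tokens
        "billing/legacy_dump.py": "y" * 4000,  # about 1,000 tokens
        "billing/rates.py": "z" * 200,       # about 50 tokens
    }
    wanted = ["billing/invoice.py", "billing/legacy_dump.py", "billing/rates.py"]
    print(select_context(repo, wanted, token_budget=200))
```

With a 200-token budget the large file is skipped and the two small ones are loaded. The point is that the budget is decided up front, by design, instead of being whatever the context window happens to allow.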

Teams that consistently apply all three report cutting blended AI coding costs 60 to 90 percent with no measurable quality regression. The math: at those rates, $15 per developer per active day becomes $1.50 to $6.

The Governance Question

Gartner’s 2028 projection assumes no meaningful change in how organizations govern AI tool usage. That is not a prediction; it is a warning. The companies that control costs right now treat token spending as infrastructure cost, not as a software subscription line item. Coinbase is a notable example: it kept AI costs flat while usage grew exponentially by routing prompts intelligently.

Someone at your company owns the AWS bill. Someone owns the Datadog invoice. Token costs are now in the same category. If nobody owns them yet, that is what the first billing cycle shock is for.
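Ownership starts with being able to see the spend before the invoice arrives. Below is a minimal sketch of a budget tracker, assuming per-request costs are already available from billing exports or API usage data. The class name, the 80 percent alert threshold, and the status labels are illustrative assumptions.

```python
from collections import defaultdict


class TokenBudget:
    """Tracks AI coding spend against a monthly limit owned by one team."""

    def __init__(self, monthly_limit_usd: float, alert_fraction: float = 0.8) -> None:
        if monthly_limit_usd <= 0:
            raise ValueError("monthly_limit_usd must be positive")
        if not 0 < alert_fraction <= 1:
            raise ValueError("alert_fraction must be in (0, 1]")
        self.monthly_limit_usd = monthly_limit_usd
        self.alert_fraction = alert_fraction
        self._spend_by_developer: dict[str, float] = defaultdict(float)

    def record(self, developer: str, cost_usd: float) -> None:
        """Add one charge to a developer's running total."""
        if cost_usd < 0:
            raise ValueError("cost_usd must be non-negative")
        self._spend_by_developer[developer] += cost_usd

    def total_usd(self) -> float:
        return sum(self._spend_by_developer.values())

    def status(self) -> str:
        """Return "ok", "alert" (past the alert threshold), or "over" (past the limit)."""
        total = self.total_usd()
        if total > self.monthly_limit_usd:
            return "over"
        if total >= self.alert_fraction * self.monthly_limit_usd:
            return "alert"
        return "ok"

    def top_spenders(self, n: int = 3) -> list[tuple[str, float]]:
        """Largest totals first, for the budget owner's review."""
        ranked = sorted(
            self._spend_by_developer.items(), key=lambda item: item[1], reverse=True
        )
        return ranked[:n]


if __name__ == "__main__":
    budget = TokenBudget(monthly_limit_usd=100.0)
    budget.record("dev_a", 50.0)
    budget.record("dev_b", 35.0)
    print(budget.status(), budget.top_spenders(1))
```

The alert fires before the limit is reached, so the conversation happens mid-cycle instead of at invoice time.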

ByteBot
