AI coding agents are now writing more than half of all professional code, at least according to a survey published last week. It put the figure at 54%, nearly double the 28% recorded twelve months earlier. If you work in software, that figure probably doesn’t shock you. What should give you pause is the question almost nobody’s asking: which 54%?
- AI coding agents now generate 54% of all code, up from 28% just a year ago, according to recent survey data.
- AI coding agents excel at repetitive, structural work — but silently skip edge cases that senior engineers catch instinctively.
- A fintech CTO found that agents built flawless happy paths but missed illegal payment state transitions with real financial consequences.
- Replacing senior engineers with AI coding agents trades long-term reliability for short-term speed — a dangerous bargain in regulated industries.
Not All Code Is Created Equal
Here’s the uncomfortable truth that gets buried in the productivity hype. A CRUD endpoint that fetches a list of merchant names carries almost zero risk. A webhook handler that flips a payment from pending to complete is someone’s payroll. Get the first wrong and your users see a mildly stale UI. Get the second wrong and money moves to places it shouldn’t, or stops moving entirely.
That distinction is at the core of a candid breakdown published by the CTO of an FCA-authorised payment platform, who’s been running AI coding agents aggressively across a production NestJS microservices stack for well over a year. The architecture is recognisable to most backend teams: NestJS, Docker, Traefik, multiple microservices communicating over webhooks. Real merchants. Real money. Real regulatory consequences.
His verdict? AI coding agents are extraordinary — for the right jobs. The problem is that teams are increasingly throwing them at the wrong ones.
Where AI Coding Agents Actually Shine
Credit where it’s due. The CTO describes genuine productivity gains that would’ve sounded implausible two years ago. Spinning up a new microservice — module structure, base config, Docker setup, Traefik labels — used to eat half a day of copy-paste grunt work. Now it’s a conversation with an agent, measured in minutes.
When the team overhauled environment variable management across all their repositories, AI coding agents mapped every .env file, surfaced naming conflicts, identified shared variables, and generated a unified Zod validation schema. Work that might have taken a team several days of grep commands and spreadsheets was done in hours.
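To make the mapping step concrete, here is a minimal, dependency-free sketch of comparing parsed .env contents across repositories. It is not the team’s tooling: the repository names, variable names, and the `compareEnvFiles` helper are all hypothetical, and "conflict" is read here as "the same key with different values", which is only one possible interpretation.

```typescript
// Parsed .env contents for one repository (key -> value).
type EnvFile = Record<string, string>;

interface EnvReport {
  shared: string[];    // keys defined in more than one repository
  conflicts: string[]; // shared keys whose values differ between repositories
}

// Hypothetical helper: compares already-parsed .env files, keyed by repo name.
function compareEnvFiles(repos: Record<string, EnvFile>): EnvReport {
  const valuesByKey = new Map<string, Set<string>>();
  const countByKey = new Map<string, number>();

  for (const env of Object.values(repos)) {
    for (const [key, value] of Object.entries(env)) {
      countByKey.set(key, (countByKey.get(key) ?? 0) + 1);
      if (!valuesByKey.has(key)) valuesByKey.set(key, new Set<string>());
      valuesByKey.get(key)!.add(value);
    }
  }

  // A key is "shared" when at least two repositories define it.
  const shared = Array.from(countByKey.entries())
    .filter(([, count]) => count > 1)
    .map(([key]) => key)
    .sort();

  // A shared key "conflicts" when the repositories disagree on its value.
  const conflicts = shared.filter((key) => valuesByKey.get(key)!.size > 1);

  return { shared, conflicts };
}
```

A report like this would only be the input to a unified schema; deciding which value is correct for each conflicting key still takes someone who knows the services.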
API scaffolding. Service boilerplate. Test stubs. Import refactoring. Pattern migrations across repos. For this class of work (predictable, structurally repetitive, pattern-following) the CTO’s framing is hard to argue with: AI coding agents are the best junior developers money can buy. Tireless, cheap, no ego, and nearly error-free on the tasks they’re suited for. An army of juniors at your terminal, twenty-four hours a day.
The global picture backs this up. GitHub’s own research has reported that developers using AI assistance complete tasks significantly faster, and that they self-report higher job satisfaction alongside the speed gains. The productivity case is real.
The Silent Failure No Test Suite Caught
Then comes the part that should make every engineering leader sit up. The team let an AI coding agent build out a webhook handler — the component that processes incoming payment status updates from banks and card networks. Webhooks are the nervous system of any payments stack. When one fires, the system has to respond correctly, every time, with no ambiguity.
The agent’s output looked clean. Tests passed. Code review didn’t flag anything obvious. But the handler silently ignored illegal state transitions. In payments, there are hard rules about how a transaction can move through its lifecycle. A payment can go from pending to complete. It categorically cannot travel the other direction. A senior engineer who’s spent years in fintech doesn’t need a comment in the code to remind them of this — it’s embedded in their instincts, born from debugging incidents at 2am and explaining settlement delays to angry merchants.
The agent never thought about it. It built the happy path with polish and left the guardrails off entirely.
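The missing guardrail is small once someone asks for it. The sketch below is an illustration, not the platform’s actual handler: the status names beyond pending and complete, the `LEGAL_TRANSITIONS` table, and the `applyTransition` function are assumptions made for the example.

```typescript
// Hypothetical status set; a real payment lifecycle will have more states.
type PaymentStatus = "pending" | "complete" | "failed";

// Explicit allow-list of legal moves. Anything not listed here is illegal.
const LEGAL_TRANSITIONS: Record<PaymentStatus, readonly PaymentStatus[]> = {
  pending: ["complete", "failed"],
  complete: [], // terminal in this sketch
  failed: [],   // terminal in this sketch
};

// Returns the new status, or throws instead of silently accepting a bad move.
function applyTransition(current: PaymentStatus, next: PaymentStatus): PaymentStatus {
  if (!LEGAL_TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal payment transition: ${current} -> ${next}`);
  }
  return next;
}
```

With this in place, `applyTransition("pending", "complete")` succeeds, while `applyTransition("complete", "pending")` throws rather than quietly rewriting a settled payment.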
This, the CTO argues, isn’t a one-off bug. It’s a systemic behavioural pattern. AI coding agents optimise for completion, not correctness. They’re trained to get to the green checkmark, and they’ll take the most efficient path there — which means skipping the negative cases unless you explicitly, specifically demand them. What happens when a webhook fires twice? What happens when a refund is requested on a transaction that’s already been refunded? What happens when a bank returns a status code that isn’t in the documented spec? The agent doesn’t model any of that unless you spell it out word for word.
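Two of those questions, the duplicate webhook and the undocumented status, can be sketched in a few lines. Everything here is hypothetical: the event shape, the `eventId` field (which assumes the sender supplies a unique ID), the status list, and the in-memory set, which a production system would replace with a durable store.

```typescript
// Hypothetical event shape; field names are assumptions for illustration.
interface WebhookEvent {
  eventId: string;   // unique ID assumed to be supplied by the sender
  paymentId: string;
  status: string;    // raw status string, not yet trusted
}

type Outcome = "applied" | "duplicate" | "unknown_status";

// Hypothetical list of statuses the documented spec allows.
const KNOWN_STATUSES = new Set(["pending", "complete", "failed", "refunded"]);

class WebhookProcessor {
  // In-memory only for the sketch; duplicates must survive restarts in production.
  private seen = new Set<string>();
  readonly applied: WebhookEvent[] = [];

  handle(event: WebhookEvent): Outcome {
    // Negative case 1: a status outside the spec is surfaced, never guessed at.
    if (!KNOWN_STATUSES.has(event.status)) {
      return "unknown_status";
    }
    // Negative case 2: the same event delivered twice is applied only once.
    if (this.seen.has(event.eventId)) {
      return "duplicate";
    }
    this.seen.add(event.eventId);
    this.applied.push(event);
    return "applied";
  }
}
```

Returning an explicit outcome for each negative case, rather than falling through, is the design choice that matters: it gives reviewers and tests something concrete to assert on.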
AI Coding Agents and the Architecture Problem
There’s a secondary failure mode that’s subtler and arguably more insidious over time: agents don’t care about your existing codebase. They don’t know that your team standardised a particular retry utility three years ago, battle-tested it through a production incident, and now uses it everywhere. They’ll write a new implementation from scratch because it’s faster than locating and importing yours. The feature works. But now you have two implementations of the same logic — one with years of production hardening, one freshly generated and completely untested in the wild.
Multiply that across dozens of features shipped by AI coding agents over a year, and you’re quietly accumulating architectural debt that no individual PR makes obvious.
There’s also what you might call the token economy problem. Agents appear to optimise for fewer conversational turns — shorter, simpler responses that close the loop quickly. Complex validation logic? The basic case works, so skip the edge handling. Rare error states? Not worth the overhead. The result is code that passes every test you wrote, and fails on scenarios you didn’t think to write — which are, precisely, the scenarios the agent also didn’t think to handle.
The Real Cost of Replacing Senior Engineers
This is where the analysis gets pointed. The CTO’s argument isn’t that AI coding agents are dangerous. It’s that the belief that they can substitute for senior engineers is dangerous — and it’s a belief that’s spreading fast through the industry as AI productivity numbers become ammunition in headcount debates.
The difference between code and a product is judgment. Knowing that idempotency isn’t optional in webhook handlers because banks routinely send duplicate notifications. Knowing that a backoff curve on retry logic has a specific shape because you’ve lived through what happens when it doesn’t. Knowing that the refund flow needs an explicit state machine because a support ticket three years ago taught you exactly what breaks without one. That knowledge doesn’t exist in any training dataset. It accumulates through operating real systems under real pressure.
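The article doesn’t say which backoff shape the team uses. As one common choice, here is a sketch of capped exponential backoff with full jitter; the `backoffDelayMs` function, its option names, and the example values are assumptions for illustration only.

```typescript
// Hypothetical options; sensible values depend on the upstream being retried.
interface BackoffOptions {
  baseMs: number;          // ceiling for the first retry's delay
  maxMs: number;           // hard cap on the ceiling, however many retries occur
  random?: () => number;   // injectable source in [0, 1), for deterministic tests
}

// Delay before retry number `attempt` (0-based).
// The ceiling doubles per attempt up to maxMs; the actual delay is a random
// fraction of that ceiling so that many clients do not retry in lockstep.
function backoffDelayMs(attempt: number, opts: BackoffOptions): number {
  const rand = opts.random ?? Math.random;
  const ceiling = Math.min(opts.maxMs, opts.baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}
```

The cap stops delays from growing without bound, and the jitter spreads retries out after an outage; a naive fixed-interval retry has neither property.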
Strip out your senior engineers and replace them with AI coding agents, and you get speed. You also get silent disasters — the kind that don’t announce themselves in the test suite, that only surface when a merchant calls asking where their settlement is.
Enable your senior engineers with AI coding agents, and the equation inverts. You get an architect with an army. The senior’s judgment shapes the architecture, defines the constraints, identifies the edge cases, and reviews what the agents produce. The agents handle the volume. That’s a genuinely powerful combination — and it’s a very different thing from what a lot of companies seem to be building toward.
What a Working System Actually Looks Like
The CTO’s team didn’t arrive at this understanding painlessly. They built it through failure, iteration, and a deliberate effort to make their architecture legible to the agents working within it. The practical starting point: extract your design patterns and architecture rules into formats that agents can actually consume. If the agent can read your conventions, it has a fighting chance of respecting them.
That means documented state machines for anything with legal and illegal transitions. It means explicit lists of shared utilities that agents are expected to use rather than reimplement. It means writing prompts that force negative case analysis before the agent writes a single line of production code.
It’s more work upfront. It’s also the only version of AI-assisted development that holds up in a regulated environment — and increasingly, it’s the only version that holds up anywhere the stakes are real.
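One way to make a documented state machine do double duty is to write it as data and derive the negative cases from it. The refund lifecycle below is invented for illustration; the state names and the `illegalTransitions` helper are assumptions, not the team’s actual rules.

```typescript
// Hypothetical refund lifecycle, written as data that people and agents can both read.
const REFUND_STATES = ["none", "requested", "refunded", "rejected"] as const;
type RefundState = (typeof REFUND_STATES)[number];

// Allow-list of legal moves. Any pair not listed is treated as illegal.
const LEGAL: Record<RefundState, readonly RefundState[]> = {
  none: ["requested"],
  requested: ["refunded", "rejected"],
  refunded: [],
  rejected: [],
};

// Enumerates every (from, to) pair that is NOT legal. Each pair is a negative
// case that should have a test, including self-transitions.
function illegalTransitions(): Array<[RefundState, RefundState]> {
  const pairs: Array<[RefundState, RefundState]> = [];
  for (const from of REFUND_STATES) {
    for (const to of REFUND_STATES) {
      if (!LEGAL[from].includes(to)) {
        pairs.push([from, to]);
      }
    }
  }
  return pairs;
}
```

Feeding a list like this into a prompt or a parameterised test turns "remember the edge cases" into a checklist the agent cannot quietly skip.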
As AI coding agents become more capable and more deeply embedded in professional workflows, the industry is going to keep pushing on the headcount question. The smarter question isn’t how many engineers you can remove. It’s how well you’ve structured things so that the engineers who remain can actually supervise what the agents are building. The teams that figure that out will ship fast and ship reliably. The ones that don’t will find out the hard way, probably at 2am, probably on a payment webhook.


