The 90% Problem: Why AI Coding Agents Still Need Software Architects

AI coding agents are technically gifted. They also overbuild, overengineer, and miss architectural decisions that only a human who has shipped to production can catch.

Praveen Ghanta, CEO, Hire Fraction · June 3, 2026 · 5 min read

AI coding agents · software architecture · code review · adversarial review · software factory

What you’ll learn

Why AI agents still fall short on the architectural decisions that determine whether software holds up in production
How adversarial code review using a different model catches mistakes that same-model review misses
The critical path approach that keeps human architects involved without making them a bottleneck
Which open-source and commercial models work well as adversarial reviewers
Why the role of software architect is becoming more valuable, not less

AI agents are technically gifted. They are also architecturally reckless.

In our software factory, the second major building block is software architecture. And this is where AI agents hit their ceiling.

The agents can get roughly 90% of the job done. Technically, they are gifted. They write clean functions, handle edge cases, and produce code that passes tests. But they still make obvious mistakes. They lean too far into the wrong pattern. They overbuild. They overengineer. They solve the problem in front of them without understanding the system around it.

This is the 90% problem with AI coding agents. Google’s 2025 DORA report found that while AI adoption among developers has surged to 90%, that acceleration came with a 9% climb in bug rates, a 91% increase in code review time, and pull requests that grew 154% in size. More code, faster, with more things to catch. The speed is real. The quality gap is also real.

MD files, rules, and hooks are not enough

The natural response to agent mistakes is to add guardrails. Write better CLAUDE.md files. Add rules. Build pre-commit and post-commit hooks. Define skills that constrain the agent’s behavior.

All of that helps. None of it is sufficient.

Even with detailed rules and process hooks in place, you still find mistakes. The agent follows the letter of the rule but misses the intent. It satisfies the constraint on one file while introducing a structural problem across three others. Rules are local. Architecture is global.
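
To make the local-versus-global distinction concrete, here is a minimal Python sketch. The module names, the import map, and the "at most two imports per file" rule are all hypothetical, chosen only for illustration. Every file passes the per-file rule, yet the files together form an import cycle that only a whole-graph check can see.

```python
from typing import Optional

# Hypothetical import map: module name -> modules it imports.
IMPORTS = {
    "billing": ["orders"],
    "orders": ["inventory"],
    "inventory": ["billing"],  # closes a cycle across three files
    "reports": ["billing", "orders"],
}

MAX_IMPORTS_PER_FILE = 2  # assumed local rule, for illustration only


def passes_local_rule(module: str) -> bool:
    """Per-file check: looks at one module in isolation."""
    return len(IMPORTS[module]) <= MAX_IMPORTS_PER_FILE


def find_cycle(graph: dict) -> Optional[list]:
    """Whole-graph check: depth-first search for an import cycle."""
    visiting = []  # modules on the current search path
    done = set()   # modules already fully explored

    def visit(node: str) -> Optional[list]:
        if node in done:
            return None
        if node in visiting:
            # Found a module that is already on the path: report the loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(all(passes_local_rule(m) for m in IMPORTS))  # every file obeys the rule
    print(find_cycle(IMPORTS))                         # the graph still has a cycle
```

A rule or hook that inspects one file at a time behaves like passes_local_rule. Catching the cycle requires something that looks at the whole graph, which is the kind of view an architect brings.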

And models continue to improve. We have Opus 4.7 now. On the OpenAI side, GPT-5.5. The capability curve is steep. But even the latest models produce code that can conflict with the system as a whole, working perfectly in isolation while violating the constraints of the broader architecture.

Adversarial code review: use a different model than the one that wrote the code

The first approach to closing that 10% gap is adversarial code review.

The principle is simple: whatever model writes the code, use a different model to review it. When the same model writes and reviews, both passes share the same blind spots, so their failures are correlated. Research on AI-assisted code review describes the same pattern (see Sources): when a generating agent and a reviewing agent reason from the same training distribution, the review checks the code against itself, not against intent.

A different model breaks that correlation.
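
A back-of-the-envelope sketch of why correlation matters, under stated assumptions. Treat "the writer is blind to a class of bug" and "the reviewer is blind to the same class" as two yes/no events with probabilities p_writer and p_reviewer and a correlation rho between them. The 10% blind-spot rate below is a made-up number for illustration, not a measurement.

```python
import math


def both_blind(p_writer: float, p_reviewer: float, rho: float) -> float:
    """Probability that writer and reviewer share a blind spot.

    Uses the standard identity for two yes/no events with correlation rho:
    P(A and B) = pA * pB + rho * sqrt(pA * (1 - pA) * pB * (1 - pB)).
    """
    spread = math.sqrt(p_writer * (1 - p_writer) * p_reviewer * (1 - p_reviewer))
    return p_writer * p_reviewer + rho * spread


if __name__ == "__main__":
    p = 0.10  # hypothetical blind-spot rate, assumed equal for both models
    for rho in (1.0, 0.5, 0.0):
        print(f"rho={rho:.1f}: {both_blind(p, p, rho):.3f}")
```

With these illustrative numbers, a fully correlated reviewer (rho = 1) shares the blind spot 10% of the time, which is no better than skipping review, while a fully independent reviewer (rho = 0) brings that down to 1%. Real model pairs presumably sit somewhere in between, so a different model should reduce the overlap without eliminating it.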

Since so many teams are using Claude Code, OpenAI has made this easy. Codex is available as a plugin that you can install directly into Claude Code and use for adversarial reviews. Write with Claude, review with Codex. One command.

But you are not limited to commercial models. To save on costs, an open-source model can serve as the reviewer. We have had success plugging Kimi and GLM into the review workflow. A hacker tip: you can flip Claude Code over to use an open-source model as the underlying model within the harness. There are fairly straightforward ways to do this. And open-source harnesses like OpenCode can use any model natively.

The rule is simple. Whatever model writes the code, use a different one to review it.
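
The rule can be sketched as a small guard in whatever script dispatches reviews. This is an illustration only: the Model type, the pick_reviewer helper, and the name and family strings are hypothetical, and the sketch does not call any real model API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str    # label used in your own harness (hypothetical values below)
    family: str  # vendor or lineage assumed to share training data


def pick_reviewer(writer: Model, candidates: list) -> Model:
    """Return the first candidate from a different family than the writer."""
    for candidate in candidates:
        if candidate.family != writer.family:
            return candidate
    raise ValueError(f"no reviewer outside family {writer.family!r}")


if __name__ == "__main__":
    writer = Model("claude", "anthropic")
    pool = [
        Model("claude", "anthropic"),  # same family as the writer: skipped
        Model("kimi", "moonshot"),
        Model("codex", "openai"),
    ]
    print(pick_reviewer(writer, pool).name)  # prints "kimi"
```

Comparing families rather than model names is a deliberate choice in this sketch: two models from the same vendor are likely to share more of their blind spots than two models from different vendors.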

Senior software architects still create real value

Adversarial review catches a class of bugs that same-model review misses. But it does not replace architectural judgment.

We are still seeing tremendous value in seasoned software architects who have shipped software to production over the course of their careers. People who know what breaks at scale, what patterns decay over time, and what looks correct in a pull request but fails in production under load.

The question is how to involve them without making them the bottleneck. A human who has to review every file that an AI agent produces is not a quality gate. They are a traffic jam.

The critical path approach: review one path, not every file

The answer is the critical path approach.

Instead of reviewing hundreds of files that the agent may have produced, you decide on a critical path through the system from a code review perspective. Then the human walks that path interactively.

They look at the code. They ask the AI questions about it. They have a conversation where they trace the architectural decisions along that critical path. This is targeted, not exhaustive.

The analogy is assembly language. We no longer look at assembler. We no longer look at lower-level constructs. We work at a higher level of abstraction and trust the tooling below it. The same principle applies here: focus on the critical path, verify that the right approach was taken, and trust the agent for the rest.

And here is why this works: if there are problems on the critical path, there are almost certainly problems elsewhere. That signal tells you where to direct the AI for further review and correction. One focused human review generates a roadmap for fixing the rest.
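
One way to picture the workflow is as a triage step. The sketch below is a hypothetical illustration: the file names, the dependency map, and the idea of ranking files by how many flagged critical-path files they depend on are assumptions made here, not a description of a specific tool.

```python
def split_review(changed: list, critical_path: list) -> tuple:
    """Files on the critical path go to the human; the rest go to AI review."""
    changed_set = set(changed)
    on_path = set(critical_path)
    human = [f for f in critical_path if f in changed_set]  # keep path order
    ai = [f for f in changed if f not in on_path]
    return human, ai


def prioritize_ai_review(ai_files: list, depends_on: dict, flagged: set) -> list:
    """Order AI-reviewed files by how many flagged critical-path files they use."""
    def score(path: str) -> int:
        return len(set(depends_on.get(path, [])) & flagged)

    # sorted() is stable, so files with equal scores keep their original order.
    return sorted(ai_files, key=score, reverse=True)


if __name__ == "__main__":
    changed = [
        "api/checkout.py", "core/payment.py", "db/ledger.py",
        "ui/banner.py", "jobs/refund.py", "utils/format.py",
    ]
    critical_path = ["api/checkout.py", "core/payment.py", "db/ledger.py"]
    human, ai = split_review(changed, critical_path)

    # Suppose the architect's walkthrough flagged problems in two files.
    flagged = {"core/payment.py", "db/ledger.py"}
    depends_on = {
        "jobs/refund.py": ["core/payment.py", "db/ledger.py"],
        "utils/format.py": ["db/ledger.py"],
        "ui/banner.py": [],
    }
    print(human)
    print(prioritize_ai_review(ai, depends_on, flagged))
```

In this example the architect reads three files instead of six, and the findings push jobs/refund.py to the front of the AI review queue because it depends on both flagged files.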

Two architecture approaches worth adopting today

To summarize: software architecture remains a core building block of our software factory, and one where humans still need to be involved. The models are not ready to handle it autonomously. But the involvement does not have to be a bottleneck.

First, adversarial code review. Use a different model to review than the one that wrote the code. Codex inside Claude Code, or any open-source model plugged into your harness. Break the correlation between writer and reviewer.

Second, critical path review. Do not try to read every file. Identify the most important execution path through the system, verify it interactively with the architect, and use the findings to direct AI review of the rest.

These two approaches keep the human in the loop where they add the most value, without slowing the factory down.

The role of the software architect is not shrinking. It is concentrating. Less time writing code, more time making the decisions that determine whether the code holds up in production. That 10% is where the real engineering happens.

Frequently Asked Questions

Why can't AI coding agents handle software architecture on their own?

AI agents excel at generating syntactically correct code and solving well-defined implementation tasks. But architecture requires understanding system-wide constraints, business context, and the downstream consequences of structural decisions. Agents optimize locally. Architects think globally. Until models can reliably reason across an entire system and its operational history, humans remain essential for architectural judgment.

What is adversarial code review and how does it improve AI-generated code?

Adversarial code review means using a different AI model to review code than the one that wrote it. When the same model writes and reviews, it shares the same blind spots and training biases, so it tends to approve its own mistakes. A different model catches errors the first one systematically misses. This is similar to why a second pair of eyes catches typos the author cannot see, but applied at the model level.

Which AI models work well for adversarial code review?

The principle matters more than the specific model. If you write code with Claude, review with Codex or an open-source model like Kimi or GLM. If you write with GPT, review with Claude. The key is breaking the correlation between the generating model and the reviewing model. Open-source models can reduce costs while still providing genuine adversarial perspective.

How do you prevent the human architect from becoming a bottleneck?

Focus human review on the critical path through the system rather than every file. Identify the core architectural decisions and verify those interactively. Then let AI handle the broader review of peripheral code. This is the same principle that moved developers from reading assembler to reading higher-level abstractions. You verify the important path and trust the tooling for the rest.

Will AI eventually replace the need for human software architects?

Models are improving rapidly, but architectural judgment requires understanding business context, operational history, and the second-order consequences of design decisions. As of mid-2026, even the strongest models still overbuild, overengineer, and lean into wrong patterns when left unsupervised. The role is shifting from writing code to verifying critical decisions, but the human checkpoint remains essential for production-grade software.

What does the critical path approach look like in practice?

Choose the most important execution path through your system, such as the primary data flow or the core transaction pipeline. Walk through that path interactively with AI, examining the architectural choices at each step. If you find problems on the critical path, there are likely problems elsewhere. Use that signal to direct AI to review and correct the rest. This keeps the human checkpoint targeted and fast rather than exhaustive and slow.

Sources

Google. “State of AI-Assisted Software Development 2025.” DORA. https://dora.dev/dora-report-2025/
Sobonix. “AI-Assisted Software Engineering in 2026: The Rise of Human-Guided Coding Agents.” https://www.sobonix.com/blog/ai-assisted-software-engineering-in-2026-the-rise-of-human-guided-coding-agents/
OpenAI Developer Community. “Introducing Codex Plugin for Claude Code.” https://community.openai.com/t/introducing-codex-plugin-for-claude-code/1378186
arXiv. “The Specification as Quality Gate: Three Hypotheses on AI-Assisted Code Review.” https://arxiv.org/pdf/2603.25773
Mike Mason. “AI Coding Agents in 2026: Coherence Through Orchestration, Not Autonomy.” https://mikemason.ca/writing/ai-coding-agents-jan-2026/

Praveen Ghanta

CEO, Hire Fraction

Praveen Ghanta is a five-time founder and serial entrepreneur. He is the founder of DevHawk.ai, an AI-powered engineering management platform, and Fraction.work, which connects fast-growing companies with top fractional tech and growth marketing talent. Previously, he founded HiddenLevers, a risk analytics platform for wealth management that he bootstrapped from inception to acquisition by Orion Advisor Solutions in 2021, serving thousands of advisors and $600B in assets. He earlier founded SmartWorkGroups, acquired by Intralinks in 2000.
