June 9, 2026 · 5 min read

Plans Are Useless, Planning Is Indispensable: The Architecture of the Agentic SDLC

Eisenhower's wartime line, applied to a software development lifecycle where 75% of new code is written by machines. A note on why static plans collapse under agent velocity, what replaces them, and what it costs.

Dwight Eisenhower's line, usually quoted as "plans are useless, but planning is indispensable," was not originally a management aphorism. He said it in 1957, talking specifically about war: the moment of enemy contact, when the plan meets reality and shatters, and the commander has to think on their feet, on terrain they prepared for, with a plan they can no longer use. The value of planning, he argued, lay in the cognitive readiness it produced, not in the plan itself.

That is, almost line for line, the situation modern software engineering finds itself in. Sundar Pichai's April 2026 post claimed that 75% of new code at Google is now AI-generated and approved by engineers. The naive read of that number was that the SDLC was about to become autonomous — hand an agent a 50-page PRD, wait, collect a working product. The more sophisticated read, which most serious practitioners hold, is that agents can autonomously execute well-scoped tasks end-to-end inside human-managed pipelines. Both reads still miss the deeper point.

What is actually happening is that the lifespan of any static plan, including a well-scoped one, has collapsed to minutes. Reality intervenes faster than the plan can be updated. An agent given a multi-file feature spec to execute without an active feedback loop drifts — hallucinating interfaces that match the prompt but not the codebase, writing against schemas that changed last commit, compounding small errors across files into a structurally broken whole. Modern tooling has reduced this from "always fails" to "fails in characteristic, predictable ways," but the failure mode is still there, and it is not solvable by writing better plans. It is solvable only by replacing the plan with a loop.

I'll call this failure mode Automated Waterfall. The Waterfall SDLC of the 1970s and 80s collapsed when humans tried to execute static plans against changing reality. Automated Waterfall is what you get when you keep the static plan and replace the humans with agents. It fails harder and faster than Waterfall did, because agents do not notice the plan has broken until something explicit forces them to.

The metric question

If plans do not survive contact with execution, what should we optimise for? Most engineering metrics in current use measure properties of the plan or the artefact: lines of code, sprint velocity, story points. The DORA metrics get closer. Deployment frequency and lead time measure the cycle from intent to production, and change failure rate measures how often that cycle ships something broken. None of them measures how fast the system can notice it was wrong and change direction.

The metric I want is something like Mean Time to Adapt (MTTA): how long it takes, from the moment a deviation is detectable, for the system to register it and pivot. I have not seen this defined as a formal DORA-style metric, so I am using it as a working lens rather than a citation. (In incident management the same acronym usually means mean time to acknowledge; that is not the sense here.) It complements MTTR (recovery from failure) by addressing the upstream question: how fast does the system recognise it is on the wrong path before that path becomes a failure?

Optimising for MTTA changes what you build. You stop investing in plans that survive longer. You start investing in feedback loops that close faster.
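
As a rough illustration of how the number could be computed, here is a minimal sketch. It assumes every deviation is logged with two timestamps: when it first became detectable and when the system actually pivoted. The record shape and the names `Deviation`, `detectable_at`, `pivoted_at`, and `mean_time_to_adapt` are hypothetical, invented for this example rather than taken from any existing tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Deviation:
    """One observed deviation (hypothetical record shape)."""
    detectable_at: datetime  # first moment a signal (failing test, alert) existed
    pivoted_at: datetime     # moment the system changed course in response


def mean_time_to_adapt(deviations: list[Deviation]) -> timedelta:
    """Average gap between a deviation being detectable and the pivot."""
    if not deviations:
        raise ValueError("need at least one deviation")
    gaps = [(d.pivoted_at - d.detectable_at).total_seconds() for d in deviations]
    if any(g < 0 for g in gaps):
        raise ValueError("pivot recorded before the deviation was detectable")
    return timedelta(seconds=mean(gaps))


log = [
    Deviation(datetime(2026, 6, 1, 10, 0), datetime(2026, 6, 1, 10, 4)),
    Deviation(datetime(2026, 6, 1, 11, 0), datetime(2026, 6, 1, 11, 10)),
]
print(mean_time_to_adapt(log))  # 0:07:00
```

The hard part in practice is not the arithmetic but deciding what counts as "detectable"; the sketch simply takes that timestamp as given.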

Anatomy of the agentic SDLC

Four shifts follow from optimising for adaptation rather than execution.

Tests are the contract, not the document. A feature request expressed as natural language is ambiguous: an agent has to interpret it. A feature request expressed as a failing integration test is a contract. The agent's output is evaluated automatically against runtime assertions, and a stack trace re-enters the loop as context. This is the part of TDD that fits agent workflows cleanly. The part that does not fit: TDD works only for the class of work where verification can precede code. Discovery, UI exploration, ambiguous product specs, and anything else where you have to build something to know whether it is right all need different scaffolding. Naming this matters because "everything is TDD now" is the kind of overreach the industry has committed before and regretted.
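
A minimal sketch of that contract loop, under heavy assumptions: `stub_agent` is a hypothetical stand-in for a real model call, and the "feature" is a toy `slugify` function. What it is meant to show is only the shape of the loop: the assertions define done, and the failure trace is what gets fed back.

```python
import traceback
from typing import Callable, Optional

Candidate = Callable[[str], str]


def contract(slugify: Candidate) -> None:
    """The feature request, written as assertions instead of prose."""
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Trim me  ") == "trim-me"


def stub_agent(feedback: Optional[str]) -> Candidate:
    """Hypothetical stand-in for a model call: the first attempt is naive,
    and any failure trace triggers a corrected attempt."""
    if feedback is None:
        return lambda s: s.lower().replace(" ", "-")
    return lambda s: "-".join(s.lower().split())


def run_loop(max_iterations: int = 3) -> tuple[Candidate, int]:
    feedback = None
    for attempt in range(1, max_iterations + 1):
        candidate = stub_agent(feedback)
        try:
            contract(candidate)
            return candidate, attempt
        except AssertionError:
            # The stack trace, not a human summary, re-enters the loop as context.
            feedback = traceback.format_exc()
    raise RuntimeError("contract still failing; escalate to a human")


solution, attempts = run_loop()
print(attempts)  # 2
```

The iteration cap matters as much as the loop itself: a contract that never goes green is a signal for a human, not a reason to keep spending tokens.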

Short task horizons over long-horizon speculation. Tools like Claude Code and Aider work the way they do because they assume the loop will fail, and design around fast recovery. Atomic commits. Git as the rollback layer. Easy pivots. The trade-off — and the part most discussions skip — is that some architectural decisions cannot be made inside a short loop because they only become visible across many loops. Humans still have to do the long-horizon work. The loop handles execution; the human handles architectural intent.
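
The commit-or-rollback rhythm can be sketched without any real tooling. The example below uses an in-memory `Workspace` class, which is a hypothetical stand-in for a git working tree, not how Claude Code or Aider are implemented. Each small edit is kept only if verification passes; otherwise the workspace returns to the last good snapshot.

```python
import copy


class Workspace:
    """In-memory stand-in for a git working tree (hypothetical)."""

    def __init__(self, files):
        self.files = dict(files)
        self._commits = [copy.deepcopy(self.files)]

    def commit(self):
        # Record the current state as the new last-known-good snapshot.
        self._commits.append(copy.deepcopy(self.files))

    def rollback(self):
        # Discard uncommitted changes, like resetting to HEAD.
        self.files = copy.deepcopy(self._commits[-1])


def apply_step(ws, edit, verify):
    """Apply one small edit; keep it only if verification passes."""
    edit(ws.files)
    if verify(ws.files):
        ws.commit()
        return True
    ws.rollback()
    return False


def not_empty(files):
    return files["app.py"] != ""


ws = Workspace({"app.py": "v1"})
ok1 = apply_step(ws, lambda f: f.update({"app.py": "v2"}), not_empty)
ok2 = apply_step(ws, lambda f: f.update({"app.py": ""}), not_empty)
print(ok1, ok2, ws.files)  # True False {'app.py': 'v2'}
```

The smaller each step, the less there is to throw away when verification fails, which is the whole argument for short horizons.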

State machines, not pipelines. Pipelines assume forward motion. State machines model the actual situation: an agent in some state, transitioning based on what it observes. Google's Agent Development Kit, Anthropic's agent SDK, and LangGraph all converge on this pattern because the pattern is right. Facilitron's architecture — what I now call GUIDE+OAI+LEARN — was built on Google ADK as a state machine precisely because pipelines could not model what a facilitation agent actually has to do: observe a conversation, decide whether to intervene, intervene, observe the result, adjust. Adaptation has to be a first-class concept in the architecture, not a recovery path bolted on.
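
To make the contrast with a pipeline concrete, here is a small sketch of the observe, decide, intervene, adjust cycle as a transition table. It is plain Python, not the API of Google ADK, Anthropic's SDK, or LangGraph, and it is not Facilitron's actual implementation. The state and signal names are assumptions chosen for illustration.

```python
from enum import Enum, auto


class State(Enum):
    OBSERVE = auto()
    DECIDE = auto()
    INTERVENE = auto()
    ADJUST = auto()
    DONE = auto()


# (current state, observed signal) -> next state
TRANSITIONS = {
    (State.OBSERVE, "quiet"): State.OBSERVE,       # nothing to act on yet
    (State.OBSERVE, "stalled"): State.DECIDE,
    (State.DECIDE, "worth_it"): State.INTERVENE,
    (State.DECIDE, "not_worth_it"): State.OBSERVE,
    (State.INTERVENE, "helped"): State.DONE,
    (State.INTERVENE, "ignored"): State.ADJUST,    # adaptation is a normal path
    (State.ADJUST, "retry"): State.DECIDE,
}


def step(state: State, signal: str) -> State:
    """Transition on what was observed, not on position in a sequence."""
    try:
        return TRANSITIONS[(state, signal)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {signal!r}") from None


state = State.OBSERVE
for signal in ["quiet", "stalled", "worth_it", "ignored", "retry", "worth_it", "helped"]:
    state = step(state, signal)
print(state.name)  # DONE
```

Note that the `ADJUST` state sits in the same table as everything else. In a pipeline, the equivalent would be an exception handler outside the main flow.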

Humans as commanders, not coders. Eisenhower's commanders were not on the firing line. They were the ones who had done the planning so thoroughly that adaptation was possible: they knew the terrain, the supply lines, the rules of engagement. The role of the engineer in an agent-saturated SDLC is similar. The value is not in typing code, and it is not in reviewing every diff either (the throughput will outrun that). It is in defining the constraints inside which agents loop, the verification standards that catch bad output, and the architectural intent the loop cannot see on its own. One good constraint shapes a thousand loops.

The matrix

| SDLC phase | What stops working | What replaces it |
| --- | --- | --- |
| Discovery | Frozen 200-page PRDs as the deliverable | Continuous story mapping, hypothesis-driven research, lightweight executable specs the team can debate independently of any single conversation |
| Architecture | Speculative system diagrams drawn in isolation, then ignored | Diagrams augmented by automated threat modelling, schema validators, and drift detectors; the diagram still does the human-alignment work, and the tooling catches what the diagram cannot enforce |
| Implementation | Long-lived feature branches with rigid task assignment | Atomic git-level agent loops with continuous TDD verification on the work that admits it |
| Deployment | Calendar-driven release dates set independently of telemetry | Telemetry-driven canary rollouts with automated rollback at the first signal |
| Maintenance | Defensive coding against unknown user input | Behavioural models of how the system and its agents fail, drawn from the disciplines (psychology, ethnography, design research) that have studied human failure for a century |
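
The Deployment row is the easiest one to sketch in code. The example below assumes a single telemetry signal (error rate), a fixed tolerance, and a hypothetical `observe` callback that returns the canary's error rate at a given traffic percentage. Real rollout systems watch many more signals; this is only the shape of the decision.

```python
from typing import Callable, Sequence


def canary_decision(baseline_error_rate: float,
                    canary_error_rate: float,
                    tolerance: float = 0.01) -> str:
    """Return 'rollback' as soon as the canary is worse than baseline + tolerance."""
    if canary_error_rate > baseline_error_rate + tolerance:
        return "rollback"
    return "promote"


def rollout(stages: Sequence[int],
            baseline_error_rate: float,
            observe: Callable[[int], float]) -> tuple[str, int]:
    """Walk the traffic stages, checking telemetry before each promotion."""
    for percent in stages:
        if canary_decision(baseline_error_rate, observe(percent)) == "rollback":
            return ("rolled_back", percent)
    return ("released", stages[-1])


# Made-up telemetry for illustration: the canary degrades at 50% traffic.
fake_telemetry = {1: 0.010, 10: 0.012, 50: 0.050, 100: 0.050}
result = rollout([1, 10, 50, 100], baseline_error_rate=0.010,
                 observe=fake_telemetry.__getitem__)
print(result)  # ('rolled_back', 50)
```

Nothing in it mentions a date. The release finishes when the telemetry allows it, which is the point of the row.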

What this costs

Looping is not free. Each iteration costs tokens, latency, and human attention. Optimising for MTTA means accepting more iterations, each of them shallower. For most software work this is the right trade. For some domains (safety-critical systems, regulated environments, anything where the cost of a wrong commit is high) the trade reverses. Loop-driven SDLC is not universal. It is the right shape for the dominant case, which is web and SaaS software, where a bad iteration is cheap and a missed adaptation is expensive.

The other cost is concentration. In a Waterfall SDLC, dozens of people contributed plans, specs, and reviews along the way, and the work distributed itself across the team. In a loop-driven SDLC, the leverage points — constraint definition, verification standards, architectural intent — concentrate in fewer hands. This is good for individual productivity and bad for organisational resilience. A team that has correctly invested in loop infrastructure can move very fast. A team that loses the one person who set up the loops is in worse shape than a team that loses any one Waterfall contributor.

Conclusion

The transition to agentic software engineering does not lower the bar for engineering rigour. It moves the bar — from rigour-in-execution to rigour-in-setup. The plan you wrote was always going to decay. What endures is the loop you built around it, the constraints you set for the loop, and the discipline of returning to those constraints when reality intervenes.

Eisenhower's commanders did not believe in the plan. They believed in the planning — the cognitive readiness, the rehearsed adaptation, the ability to think coherently when contact with reality forced them to abandon the original. The engineering version of this is not different. The plan will fail. The planning loop is what keeps us moving.

If loops are the right shape for the lifecycle, the next questions are downstream: what kind of teams build best inside loops, and what kind of products are designed for them. That is a separate conversation, which I have written about in After the Tipping Point.

Related Reading

The App-Less Future: How Outside AI and Ambient Computing Will Replace the Smartphone

For decades, the dream of ambient computing felt out of reach. With the launch of Project Solara and Antigravity 2.0, we are finally building the "Outside AI" architecture—breaking free from app containers and moving toward intelligence we invoke.

After the Tipping Point: Where the SDLC is Heading, and How to Build Teams and Products for It

Three workshops at Google I/O 2026 made the shape of the next software development lifecycle visible. Here is what it changes — for how we hire, how we build, and what we measure.