A methodology where the spec is the contract — and AI implements against it, end to end.
Most teams treat prompts as the artifact and hope the diff lines up. Spec Driven AI inverts the loop: an executable spec — schema, fixture, acceptance criteria — is authored first; an AI agent implements against it under TDD; tests verify the spec on every build. This is one of the pillars of the Beyond the AI Plateau curriculum, and the live showcase is the amysoft.tech repository itself — built by exactly this loop.
Vibe coding plateaus. Prompt-and-pray ships drift you cannot review.
Plausible diffs, no contract behind them
A prompt produces something that compiles. You squint at the diff, ship it, and find out at runtime which assumption the model invented. Tomorrow the same prompt — same intent — produces a different shape. There is no artifact between intent and code that the team agreed on, so every iteration is a fresh negotiation with the model, and every regression is a surprise.
Intent leaks at every seam
The product brief lives in a doc. The acceptance criteria live in a ticket. The prompt lives in chat history. The diff lives in the PR. Nothing checks that the four agree. By the time a reviewer asks "is this what we asked for?", the only way to answer is to read the implementation and reconstruct the intent backwards — which is exactly the work the spec was supposed to do.
There is nothing to compare the code against
Code review degrades into taste arbitration. Without an executable spec the reviewer is asking "does this look right?" instead of "does this satisfy the contract?" Bugs that violate intent slip through; stylistic preferences get litigated. Velocity drops, trust drops, and the team blames the AI for output the process never gave it a target for.
The spec is the contract between intent and execution. Schemas, fixtures, and acceptance criteria are authored first as code. An AI agent implements against them under TDD — RED, then GREEN, then REFACTOR. Tests verify the spec on every build; spec drift fails CI before it reaches the reviewer. The reviewer reads the spec; the agent satisfies the spec; the test proves the agent did. There is no prompt-and-pray in the loop.
A three-stage loop: author the spec, implement against it, verify the spec.
Spec Driven AI is a methodology, not a tool. The same three stages run for every feature, every fixture, every bug fix — small enough to repeat on a single field, structured enough to scale across a hundred-task work breakdown.
Author the spec
The spec lands first as code: a Zod schema, a fixture, and a written acceptance criterion the change has to satisfy. In this repository, a new product-concept page begins as an entry in src/content/product-concepts-schema.ts and a fixture under planning/pages/resources/product-concepts/ — both reviewable before a line of implementation is written. The spec is the artifact the team negotiates over; everything downstream is constrained by it.
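As a rough illustration of a spec authored as code, the sketch below pairs a schema, a fixture, and an acceptance criterion. It is not the contents of src/content/product-concepts-schema.ts: the field names (slug, title, tiers), the validateProductConcept helper, and the fixture values are all hypothetical, and a hand-written validator stands in for the Zod schema so the example runs without dependencies.

```typescript
// Hypothetical spec for a product-concept page. Field names are illustrative
// assumptions, not the repository's real productConceptsSchema.
interface ProductConcept {
  slug: string;
  title: string;
  tiers: string[];
}

// Result shape loosely modelled on the success/error result a schema
// library returns from a non-throwing parse.
type ParseResult =
  | { success: true; data: ProductConcept }
  | { success: false; issues: string[] };

// Hand-written stand-in for a Zod schema: checks an unknown value
// against the contract and reports every violation it finds.
function validateProductConcept(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { success: false, issues: ["expected an object"] };
  }
  const { slug, title, tiers } = input as Record<string, unknown>;
  const issues: string[] = [];

  if (typeof slug !== "string" || !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug)) {
    issues.push("slug must be kebab-case");
  }
  if (typeof title !== "string" || title.trim().length === 0) {
    issues.push("title must be a non-empty string");
  }
  if (
    !Array.isArray(tiers) ||
    tiers.length === 0 ||
    !tiers.every((tier) => typeof tier === "string")
  ) {
    issues.push("tiers must be a non-empty array of strings");
  }
  if (issues.length > 0) {
    return { success: false, issues };
  }
  return {
    success: true,
    data: { slug: slug as string, title: title as string, tiers: tiers as string[] },
  };
}

// Hypothetical fixture, authored alongside the schema before any implementation.
const specDrivenAiFixture: unknown = {
  slug: "spec-driven-ai",
  title: "Spec Driven AI",
  tiers: ["Foundation", "Practitioner", "Team"],
};

// Acceptance criterion as a checkable statement:
// "the fixture parses cleanly against the schema".
const fixtureResult = validateProductConcept(specDrivenAiFixture);
```

Because the validator collects every violation instead of stopping at the first, a reviewer sees the full distance between a draft fixture and the contract in one pass.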
AI implements against the spec (RED → GREEN → REFACTOR)
An agent — the tdd-implementation-engineer, dispatched by /implement-task — writes a failing test first (RED), then the minimum code to pass (GREEN), then refactors against the same test (REFACTOR). The agent never edits the spec to make the code pass; the spec drives the implementation, not the other way around. Each loop produces a reviewable commit on a feature branch off develop.
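The RED, GREEN, REFACTOR sequence can be sketched in a few lines. This is an assumption-laden illustration, not the agent's actual output: the slugify unit, the runTest helper, and the sample title are hypothetical, and a tiny hand-rolled harness replaces Vitest so the example stays self-contained.

```typescript
// Minimal harness so the sketch runs without a test framework:
// a test is a function that throws when its expectation is not met.
type Status = "RED" | "GREEN";

function runTest(test: () => void): Status {
  try {
    test();
    return "GREEN";
  } catch {
    return "RED";
  }
}

// Hypothetical unit under test: turns a page title into a URL slug.
type Slugify = (title: string) => string;

// The test is written first and encodes the acceptance criterion.
function slugifyTest(slugify: Slugify): () => void {
  return () => {
    const actual = slugify("  Spec Driven AI ");
    if (actual !== "spec-driven-ai") {
      throw new Error(`expected "spec-driven-ai", got "${actual}"`);
    }
  };
}

// RED: before any real implementation exists, a stub makes the test fail.
const stub: Slugify = () => "";
const red = runTest(slugifyTest(stub));

// GREEN: the minimum code that satisfies the test.
const firstPass: Slugify = (title) =>
  title.trim().toLowerCase().split(" ").join("-");
const green = runTest(slugifyTest(firstPass));

// REFACTOR: improve the implementation (here, tolerate repeated whitespace)
// and re-run the same, unchanged test.
const refactored: Slugify = (title) =>
  title.trim().toLowerCase().split(/\s+/).join("-");
const afterRefactor = runTest(slugifyTest(refactored));
```

The test function is identical in all three steps; only the implementation changes, which is the sense in which the spec drives the code and not the reverse.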
Tests verify the spec (closing the loop)
Vitest runs unit and fixture-guard tests; Playwright runs E2E tests against the production-built container. Any spec drift (frontmatter that no longer satisfies the schema, a fixture that no longer satisfies the guard, behaviour that no longer satisfies the E2E suite) fails CI before the PR merges. The reviewer reads the spec and the diff side by side; the test proves the agent landed on the contract. The loop closes on every build.
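A fixture guard of the kind described above might look roughly like the following. It is a dependency-free sketch under stated assumptions: the required fields, the fixture contents, and the findDrift and guardFixtures helpers are hypothetical, and in the repository the equivalent check would sit inside a Vitest test and validate against the real schema.

```typescript
// Hypothetical required fields, standing in for the real schema.
const requiredFields = ["slug", "title", "summary"] as const;

type Fixture = Record<string, unknown>;

// Returns the required fields that are missing or are not non-empty strings.
function findDrift(fixture: Fixture): string[] {
  return requiredFields.filter((field) => {
    const value = fixture[field];
    return typeof value !== "string" || value.length === 0;
  });
}

// Checks every named fixture and records the ones that drifted from the contract.
function guardFixtures(fixtures: Record<string, Fixture>): Record<string, string[]> {
  const failures: Record<string, string[]> = {};
  for (const fixtureName of Object.keys(fixtures)) {
    const drift = findDrift(fixtures[fixtureName]);
    if (drift.length > 0) {
      failures[fixtureName] = drift;
    }
  }
  return failures;
}

// Hypothetical fixtures: the second one has drifted (its summary was removed).
const fixtures: Record<string, Fixture> = {
  "mcp": { slug: "mcp", title: "MCP", summary: "Protocol overview" },
  "saas-platform": { slug: "saas-platform", title: "SaaS Platform" },
};

const failures = guardFixtures(fixtures);

// In CI, a non-empty failure set would fail the build before the PR can merge.
const buildPasses = Object.keys(failures).length === 0;
```

Reporting drift per fixture, not as a single pass or fail flag, tells the author which file to fix and which field broke the contract.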
productConceptsSchema in src/content/product-concepts-schema.ts is the spec[1]; each new fixture (MCP, Reader Publications, SaaS Platform, Spec Driven AI) is authored against that schema and gated by a *.fixture.test.ts in tests/unit/[2]; the tdd-implementation-engineer agent lands each change on a feature/product-concept-* branch off develop through Git Flow[3]; CI runs Vitest and Playwright before the merge button unlocks[4]. The page you are reading was produced by the same loop.
- *.fixture.test.ts per fixture, run by Vitest
- tdd-implementation-engineer + /implement-task drive RED → GREEN → REFACTOR
- Git Flow + CI gates close the loop before merge to develop
Spec discipline is structural, not a layer you bolt on.
A competitor can paste a prompt into a chat window and ship the output. Producing the same artifact reproducibly, reviewing it against a contract the team agreed on, and shipping faster because the spec catches drift is a different exercise — and it is the entire bet.
Reproducibility — same spec, same output, every time
When the spec is a Zod schema + a fixture + an acceptance criterion, two runs of the same loop produce the same shape. A new contributor, a new agent, or a re-run on a different day lands at the same contract. The output stops being a function of which prompt happened to win that session.
Reviewability — the spec is the diff worth reading
Reviewers compare the implementation against the spec, not against their taste. "Does this satisfy the schema?" "Does the fixture-guard still pass?" "Does the acceptance criterion read as met?" The questions are answerable from the artifacts, not from re-interpreting intent. PRs land faster because there is something to land them against.
Velocity without regret — the spec catches drift
Speed is the dividend, not the goal. Because every change passes through schema validation, fixture-guards, and TDD against the spec, regressions are caught at the build, not at the customer. Teams ship faster because the loop refuses to ship broken — not because they cut corners.
Team scalability — the spec onboards the next contributor
The spec is the onboarding doc. A new engineer (or a new agent) reads the schema, reads the fixture, reads the acceptance criterion, and is productive without a week of tribal-knowledge transfer. The methodology compounds: every new fixture authored under it becomes a worked example for the next.
Three tiers — from the methodology fundamentals to a private team cohort.
Spec Driven AI is taught as part of Beyond the AI Plateau. Every tier is fixed-price against a written outcome. We start small on purpose: the Foundation tier covers the methodology end-to-end before any team commits to the deeper curriculum.
- Methodology overview — the three-stage loop in detail
- Worked example walkthrough — a Zod schema + fixture + AI implementation
- Reading list — the curriculum chapters that frame the pillar
- Access to the public reference implementation (this repository)
- Full curriculum access across every pillar — Spec Driven AI plus the rest
- Reference implementations for each pillar with annotated commit history
- Agent and skill recipes — tdd-implementation-engineer, /implement-task, Git Flow patterns
- Spec-authoring templates — Zod schema scaffolds, fixture patterns, acceptance-criterion checklists
- Six months of curriculum updates as the methodology evolves
- Private cohort — 4 to 12 engineers, fixed scope, written success criteria
- Two-day on-site workshop — author a spec, implement it under TDD, verify it
- Codebase audit — where Spec Driven AI fits in your current pipeline
- Agent and CI gate setup — TDD agent wired to your stack, fixture-guards added to CI
- 30-day follow-on retainer for adoption support and spec-review office hours
The canonical methodology stack — proven inside this repository.
tdd-implementation-engineer, dispatched by /implement-task; domain skills extend its scope without diluting the loop.
Fixture-guard tests (*.fixture.test.ts) that fail when a content fixture drifts from its schema. Spec drift fails the build, not the reader.
An orchestration command (/implement-task) with two modes: strict (human-in-the-loop, PR to develop) and autonomous (auto-merge after stricter gates). Multi-subtask plans execute under a configurable strategy (PARALLEL / SEQUENTIAL / GATED / HYBRID / ITERATIVE). The orchestration is the methodology made operational.
feature/* branches off develop, PR-gated merges, and CI checks that block on Vitest + Playwright before the merge button unlocks.
Container-first dev and build via docker-compose.dev.yml: the image that runs locally is the image that ships.
Spec Driven AI is one of the pillars of the Beyond the AI Plateau curriculum, the move that elevates teams above the vibe-coding plateau. The methodology is stack-agnostic, but this repository is its working reference: Zod schemas as the spec, tdd-implementation-engineer as the implementer, Vitest and Playwright as the verifier, Git Flow as the delivery gate. Adopt the loop; the stack underneath is yours to choose.
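The difference between the strict and autonomous modes mentioned above can be pictured as a merge-gate decision. The sketch below is only one possible reading: the ChangeStatus fields and the mayMerge function are hypothetical, and the text does not say what the "stricter gates" for autonomous mode consist of, so they appear here as a single placeholder flag.

```typescript
type Mode = "strict" | "autonomous";

// Hypothetical summary of a change's state at merge time.
interface ChangeStatus {
  unitTestsPass: boolean;  // e.g. the Vitest run
  e2eTestsPass: boolean;   // e.g. the Playwright run
  humanApproved: boolean;  // a reviewer approved the PR
  extraGatesPass: boolean; // placeholder for the unspecified "stricter gates"
}

// Decides whether a change may merge to develop under the given mode.
function mayMerge(mode: Mode, status: ChangeStatus): boolean {
  // Both modes require CI to be green before anything else is considered.
  const ciGreen = status.unitTestsPass && status.e2eTestsPass;
  if (!ciGreen) {
    return false;
  }
  // Strict keeps a human in the loop; autonomous swaps the human
  // for additional automated gates (assumed, not documented).
  return mode === "strict" ? status.humanApproved : status.extraGatesPass;
}

// Hypothetical change: CI is green and the extra gates pass,
// but no reviewer has approved yet.
const pendingReview: ChangeStatus = {
  unitTestsPass: true,
  e2eTestsPass: true,
  humanApproved: false,
  extraGatesPass: true,
};

const strictDecision = mayMerge("strict", pendingReview);
const autonomousDecision = mayMerge("autonomous", pendingReview);
```

Keeping the CI check outside the mode switch reflects the claim that both modes sit behind the same test gates; the mode only changes what is required on top of them.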
Three released phases, one active focus, and the playbooks beyond it.
Every Beyond the AI Plateau pillar runs through the same release cadence. Earlier modules are de-risked by the worked example in this repository; the active focus is what the curriculum is producing now; future expansions cover advanced and vertical-specific patterns.
The three-stage loop — author the spec, AI implements, tests verify — taught end-to-end with the amysoft.tech repository as the worked example. Released as the Foundation tier; consumed standalone or as the on-ramp to the full curriculum.
Zod-schema authoring patterns, fixture-guard test scaffolds, TDD agent recipes, and Git Flow orchestration walkthroughs — packaged with annotated commit history from this repository so practitioners can read the loop instead of recreating it.
Multi-subtask plan strategies (PARALLEL / SEQUENTIAL / GATED / HYBRID / ITERATIVE), autonomous-mode operation with stricter CI gates, agent composition across domain skills, and the codebase-audit playbook for teams adopting Spec Driven AI inside an existing pipeline.
Spec Driven AI applied beyond software — editorial workflows, research pipelines, agent-eval design, compliance specs. Each playbook treats the vertical's contract (style guide, research protocol, eval rubric, control framework) as the spec the AI implements against.
Anywhere intent has to survive the hand-off to an AI implementer.
The amysoft.tech repository is one instance of a generalisable methodology. The same loop — author the spec, AI implements, tests verify — applies wherever the cost of unreviewable AI output is high and the team needs a contract between intent and execution.
Ready to leave vibe coding behind?
Spec Driven AI is one of the pillars of Beyond the AI Plateau — the move that lets teams ship faster because the spec catches drift. Start with the Foundation tier to learn the loop, commit to the Practitioner curriculum for the full set of pillars, or bring us in for a private team cohort. The reference implementation is this repository; the methodology is yours to adopt.