
From Spec to Code: Building Software with Spec Kit

Hidde de Smet, Emanuele Bartolesi

June 17, 2026

32 minutes

This article is part of the XPRT. Magazine #21

Spec-Driven Development starts with a simple observation: specifications usually get thrown away. You write a spec to get alignment, development starts, and the spec becomes outdated within a week. The code is what matters. The spec was just scaffolding.

Spec-Driven Development flips that assumption. The spec does not get discarded. It becomes the input that drives code generation directly. You describe what you want to build and why, and an AI agent turns that into a working implementation through a structured, repeatable process.

Spec Kit is GitHub's open source toolkit for doing exactly this. A CLI scaffolds your project and registers a set of slash commands in your AI agent of choice. From there, you work through six steps: establish project principles, write the spec, clarify edge cases, define the tech stack, generate a task breakdown, and implement. Each step has a dedicated command. The clarify step (/speckit.clarify) is conditional: you run it only when the spec has ambiguities left to resolve. Two optional commands sit outside the main sequence and slot in when needed: /speckit.analyze for cross-artifact consistency checks, and /speckit.checklist, which generates quality checklists that validate requirements for completeness and consistency. The workflow keeps you focused on outcomes rather than prompt engineering.

The project has over 82,000 stars on GitHub and supports Claude Code, GitHub Copilot, Cursor, Gemini CLI, and over 25 other agents out of the box.

This article walks through the full workflow, from installation to a working implementation, covering both greenfield projects and extending an existing codebase.

Getting started

You need Python 3.11+, Git, and a supported AI agent installed. The package manager uv handles the rest.

Install the specify CLI once and use it everywhere:

uv tool install specify-cli --from git+https://github.com/github/spec-kit.git

You can also use it without installing by running it directly with uvx:

uvx --from git+https://github.com/github/spec-kit.git specify init my-project

If you have installed specify globally, you can initialize a new project for your AI agent of choice. The --ai flag determines which slash commands get registered:

specify init my-project --ai copilot

To work inside an existing directory instead of creating a new one:

specify init --here --ai copilot

Once initialized, run specify check to verify that your agent and all required tools are detected correctly.

If everything looks good, open your AI agent in the project folder. You will see the /speckit.* commands available. That is the confirmation that the setup worked.

The six commands form a fixed sequence: /speckit.constitution, /speckit.specify, /speckit.clarify (only when the spec has open questions), /speckit.plan, /speckit.tasks, and /speckit.implement. Each step produces artifacts the next step reads.

Establishing principles with /speckit.constitution

This is the first command you run after setup. No arguments are required, but you can pass guidance:

/speckit.constitution Create principles focused on testability, simplicity,
and consistency across all generated code

The command reads .specify/memory/constitution.md, which was placed there during specify init as a template. It fills in the placeholder values using whatever guidance you provided, plus any context it can infer from your repository: the README, existing documentation, and prior constitution versions. The result is a structured document with numbered principles, a governance section, and a semantic version number.

A minimal output looks like this:

# My Project Constitution

## Core Principles

### I. Test-First (NON-NEGOTIABLE)
TDD mandatory: tests written and approved before any implementation code.
Red-Green-Refactor cycle strictly enforced.

### II. Simplicity
YAGNI principles. Maximum 3 projects for initial implementation.
Additional complexity requires documented justification.

### III. Observability
Structured logging required across all modules.
Text I/O ensures debuggability.

## Governance
Constitution supersedes all other practices. Amendments require documentation,
approval, and a migration plan.

**Version**: 1.0.0 | **Ratified**: 2026-03-25

Why it matters for later steps

Running the constitution first is not ceremonial. When you call /speckit.plan, the agent reads it and runs a set of compliance gates before generating anything:

### Phase -1: Pre-Implementation Gates

#### Simplicity Gate (Article II)
- [ ] Using ≤3 projects?
- [ ] No future-proofing?

#### Test-First Gate (Article I)
- [ ] Tests written before implementation?
- [ ] Contract tests defined?

Without a constitution, those gates have nothing to check against. With one, they become binding constraints. Any plan that fails a gate either needs to be revised or must document a justified exception in a dedicated complexity tracking section.
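As a rough illustration of what "failing a gate" means in practice, the sketch below lists the unchecked items under each gate heading. It assumes the gate layout shown above (a heading containing the word "Gate", followed by checkbox items). The helper name unchecked_gates is hypothetical and not part of Spec Kit.

```python
def unchecked_gates(plan_text: str) -> dict[str, list[str]]:
    """Map each gate heading to its unchecked items (hypothetical helper)."""
    result: dict[str, list[str]] = {}
    current = None
    for line in plan_text.splitlines():
        if line.startswith("#"):
            # Track the heading only when it names a gate; any other
            # heading ends the current gate block.
            title = line.lstrip("#").strip()
            current = title if "Gate" in title else None
            continue
        if current is not None and line.startswith("- [ ] "):
            result.setdefault(current, []).append(line[len("- [ ] "):])
    return result


SAMPLE = """\
#### Simplicity Gate (Article II)
- [x] Using ≤3 projects?
- [ ] No future-proofing?

#### Test-First Gate (Article I)
- [x] Tests written before implementation?
- [x] Contract tests defined?
"""

print(unchecked_gates(SAMPLE))
```

Any gate that shows up in the result needs either a revised plan or a documented exception.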

The constitution also propagates. After you update it, the command validates that the plan template, spec template, and task template all stay in sync. If a new principle requires a new task category (a mandatory security review step, for example), the command flags that as a pending update and lists it in a Sync Impact Report.

What to write

The command accepts specific guidance or nothing at all. You can be precise:

/speckit.constitution Enforce 80% test coverage minimum, prefer composition over
inheritance, never wrap framework APIs in custom abstractions

Or general:

/speckit.constitution Prioritize user experience consistency and performance

If you give it nothing, it infers principles from repository context. For a new project with no existing documentation, the defaults are reasonable starting points: test-first development, simplicity, CLI interfaces, and integration testing over mocks.

You can re-run the command at any point. Each update increments the version number using semantic versioning: PATCH for wording changes, MINOR for new principles, MAJOR for removals. This lets you trace when and why the governing rules changed.
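The versioning rule is simple enough to express in a few lines. The sketch below is only an illustration of the PATCH/MINOR/MAJOR mapping described above. The function name and the change labels ("wording", "new_principle", "removal") are assumptions made for this example, not identifiers used by Spec Kit.

```python
def bump_constitution_version(version: str, change: str) -> str:
    """Return the next version for a given kind of constitution change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "removal":          # a principle was removed: MAJOR
        return f"{major + 1}.0.0"
    if change == "new_principle":    # a principle was added: MINOR
        return f"{major}.{minor + 1}.0"
    if change == "wording":          # wording only: PATCH
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump_constitution_version("1.0.0", "new_principle"))  # 1.1.0
```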

Keep the constitution short. The agent reads it in full on every subsequent command. Four to six principles covering your actual non-negotiables is enough. The goal is not documentation. It is a reference small enough to stay alive across the full workflow.

Writing the spec with /speckit.specify

This is where the actual feature description goes. The argument is a plain-language description of what you want to build. No tech stack, no implementation details, just what the feature is and why it exists:

/speckit.specify Build a photo organizer that groups photos into albums by date.
Albums can be reordered by drag and drop. Within each album, photos are shown in
a tile grid. Albums are never nested inside other albums.

The command creates a git branch, names it from the feature description, and writes a spec.md into specs/<branch-name>/. No scaffolding to write by hand.

What the spec contains

The output follows a fixed template with three required sections:

User Scenarios & Testing: the feature broken into independently testable user stories, each with a priority and acceptance scenarios in Given/When/Then format:

### User Story 1 - View Photo Albums (Priority: P1)
A user opens the app and sees their photos grouped into date-based albums.

**Acceptance Scenarios**:
1. Given the user has photos, When they open the app, Then albums are displayed
   grouped by date in reverse chronological order.
2. Given an album is empty, When the user navigates to it, Then a placeholder
   is shown rather than a blank screen.

Functional Requirements: numbered, testable statements of what the system must do:

- **FR-001**: System MUST group photos into albums by calendar date.
- **FR-002**: Users MUST be able to reorder albums via drag and drop.
- **FR-003**: System MUST display a tile grid within each album.
- **FR-004**: Albums MUST NOT be nested inside other albums.

Success Criteria: measurable, technology-agnostic outcomes:

- **SC-001**: Users can create and reorder three albums in under 60 seconds.
- **SC-002**: Photo tile grid loads within 2 seconds for albums of 100 photos.

What the spec does not contain

The spec template is deliberately restrictive. It explicitly forbids implementation details: no framework names, no database choices, no API structure. A checklist runs after the spec is written and fails it if implementation details have leaked in. If your description mentions React or PostgreSQL, expect those to be stripped or flagged.
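To get a feel for what "leaked implementation details" look like, here is a deliberately naive keyword scan. It is not how Spec Kit performs its check. The term list and the function name find_leaks are assumptions for illustration only, and a real list would need to be much longer.

```python
import re

# Illustrative term list only; extend it with your own stack.
TECH_TERMS = ["React", "PostgreSQL", "localStorage", "IndexedDB"]


def find_leaks(spec_text: str, terms=TECH_TERMS) -> list[tuple[int, str]]:
    """Return (line number, term) pairs for technology names found in a spec."""
    leaks = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        for term in terms:
            # Whole-word, case-insensitive match. Ordinary words that happen
            # to match a term (the verb "react") are false positives.
            if re.search(rf"\b{re.escape(term)}\b", line, flags=re.IGNORECASE):
                leaks.append((number, term))
    return leaks


spec = (
    "- **FR-001**: System MUST group photos into albums by calendar date.\n"
    "- **FR-005**: Album order MUST be stored in PostgreSQL.\n"
)
print(find_leaks(spec))
```

The second requirement is the kind of statement that belongs in the /speckit.plan prompt instead.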

This separation is the point. The spec defines what users need. The tech stack decision comes later, as input to /speckit.plan. Keeping them apart means you can run the same spec through the plan step with different technology choices and compare the outputs.

Clarification markers

If your description is ambiguous in a way that would significantly affect scope or behaviour, the command inserts [NEEDS CLARIFICATION] markers and presents up to three questions with suggested answers:

## Question 1: Album reorder persistence

**Context**: "Users can reorder albums via drag and drop."

**What we need to know**: Is reordering persisted across sessions, or only for
the current session?

| Option | Answer | Implications |
|--------|--------|--------------|
| A      | Persisted | Requires a storage model for order state |
| B      | Session only | Simpler implementation, resets on reload |
| Custom | Provide your own answer | |

The cap is three questions. Anything below the cut gets a reasonable default, documented in the Assumptions section of the spec. The model picks the most impactful ambiguities. Scope and security get priority over layout preferences or default sort order.
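The triage described above can be pictured as a sort followed by a cut. In the sketch below, the numeric ranking, the "data" category, and all names (Ambiguity, triage) are assumptions made for the example. The only part taken from the workflow is that scope and security outrank layout and sort-order questions, and that at most three questions are asked.

```python
from dataclasses import dataclass

# Assumed ranking: lower number means higher impact.
CATEGORY_RANK = {"scope": 0, "security": 0, "data": 1, "layout": 2, "sort_order": 2}


@dataclass
class Ambiguity:
    topic: str
    category: str
    default: str  # the reasonable default used if the question is not asked


def triage(ambiguities: list[Ambiguity], cap: int = 3):
    """Split ambiguities into questions to ask and assumptions to document."""
    ranked = sorted(ambiguities, key=lambda a: CATEGORY_RANK.get(a.category, 1))
    questions = ranked[:cap]
    assumptions = [f"{a.topic}: {a.default}" for a in ranked[cap:]]
    return questions, assumptions


found = [
    Ambiguity("Tile size", "layout", "medium tiles"),
    Ambiguity("Reorder persistence", "scope", "persisted"),
    Ambiguity("Photo access", "security", "local only"),
    Ambiguity("Date source", "data", "capture date"),
]
questions, assumptions = triage(found)
print([q.topic for q in questions])
print(assumptions)
```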

If there are no ambiguities above the threshold, the spec is written without questions and you move directly to /speckit.clarify or /speckit.plan.

Clarifying requirements with /speckit.clarify

This step is conditional. If /speckit.specify found no significant ambiguities, you skip straight to /speckit.plan. If it inserted one or more [NEEDS CLARIFICATION] markers, you run clarify first to resolve them.

The command reads the current spec.md and does two things: it presents each unresolved marker as a structured question and waits for your answer, and it scans the spec for gaps that did not meet the marker threshold but could still cause problems during planning. That second pass is where most of the value is. A spec that looks complete often has unstated assumptions, and this pass can catch them before they turn into technical decisions.

Run it with no arguments:

/speckit.clarify

Or to focus on a specific area:

/speckit.clarify Focus on the data persistence and offline behaviour questions

Answering a clarification question

Each open question is presented in the same format used during spec generation. You answer by selecting an option or providing your own:

## Question 1: Album reorder persistence

**Context**: "Users MUST be able to reorder albums via drag and drop." (FR-002)

**What we need to know**: Is the new order persisted across sessions, or
only for the current session?

| Option | Answer | Implications |
|--------|--------|--------------|
| A      | Persisted across sessions | Requires a storage model for order state |
| B      | Current session only | Simpler implementation, resets on reload |
| Custom | Provide your own answer | |

Your answer: A

After you answer, the command updates spec.md in place. The [NEEDS CLARIFICATION] marker is replaced with the resolved requirement and any relevant constraints are appended to the Assumptions section. The spec version number increments.

If you selected option A for the example above, the functional requirement would update to:

- **FR-002**: Users MUST be able to reorder albums via drag and drop.
  Album order MUST be persisted across sessions.
  **Resolved**: 2026-03-25, option A (persisted across sessions).

When clarify surfaces new questions

Running clarify against a spec can produce questions that were not flagged during /speckit.specify. This happens when the command identifies a logical gap between two requirements rather than an ambiguity within a single one.

Take the photo organizer example. FR-001 says photos are grouped by calendar date, and FR-003 says a tile grid is shown within each album. These two requirements say nothing about what happens when the same photo appears in multiple date ranges. That gap has no marker, but it directly affects data modeling. The question surfaces from the interaction between requirements, not from either one on its own.

There is no cap on follow-up questions, but each round only surfaces what is still unresolved in the current spec. Once you work through a round with no new questions, the spec is ready and you can move to /speckit.plan.

What not to answer here

The clarify step should stay at the functional level. If a question drifts into implementation ("should we use IndexedDB or localStorage for persistence?"), that is not a clarification question. It is a tech stack question that belongs in /speckit.plan. The answer to give in clarify is the requirement: "order state must persist across browser sessions." How it persists is the plan's problem.

The spec template enforces this by running the same implementation-detail check after each clarify round. If your answers introduce a specific technology name or architectural choice, expect it to be flagged and moved to an assumptions block rather than a requirement. Keeping the spec technology-agnostic here means you can compare different tech stacks during planning without rewriting your requirements.

When to skip clarify

If /speckit.specify completed without markers and the feature is small, straightforward, and well-understood, there is no need to run clarify. The command is not mandatory in the workflow. The gate in front of it exists for a reason: adding a clarification round to a spec that needs none slows down the workflow without improving the plan.

The one exception is when multiple developers will review the spec before planning begins. A pass through /speckit.clarify produces a documented record of which options were considered and which were chosen. That record is useful when onboarding someone to the feature later, even when there were no open markers to resolve.

Generating a plan with /speckit.plan

This is the step where the spec meets the real world. The argument is a plain-language description of your tech stack and any architectural constraints. No ceremony required, just the decisions you have already made:

/speckit.plan The application uses React. Images are not uploaded anywhere and
album order is stored in localStorage.

The command reads the spec.md from your current branch, validates it against the constitution, and produces a set of planning artifacts in the same specs/<branch-name>/ directory. These artifacts are what the next two commands consume. Nothing gets generated from thin air during implementation. It all comes from here.

What the command produces

Running /speckit.plan generates up to five artifacts (four files and one directory) depending on the scope of the feature:

plan.md is the main artifact. It opens with the pre-implementation compliance gates from your constitution, then moves into a phased implementation outline:

# Implementation Plan: Photo Organizer

## Phase -1: Pre-Implementation Gates

### Simplicity Gate (Article II)
- [x] Using ≤3 projects?
- [x] No future-proofing?

### Test-First Gate (Article I)
- [x] Tests written before implementation?
- [x] Contract tests defined?

## Architecture Overview

**Chosen Stack**: React, localStorage
**Rationale**: Images are not uploaded anywhere, so no server component is needed.
localStorage is sufficient for persisting album order without additional dependencies.

## Phase 0: Foundation

**Objective**: Set up project structure and confirm localStorage read/write.
**Dependencies**: None
**Deliverables**: React app scaffolded, localStorage utility wired up, a passing smoke test.

Each phase names its dependencies explicitly. If Phase 2 requires something from Phase 1, that dependency is stated. This is what /speckit.tasks uses to determine ordering and which tasks can be parallelized.
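Explicit dependencies are what make ordering and parallelism computable. The sketch below uses Python's standard graphlib module to turn a dependency map into batches, where everything in one batch could run at the same time. The phase names other than phase-0 and the function name execution_batches are made up for the example. This illustrates the idea and is not Spec Kit's implementation.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies: dict[str, list[str]]) -> list[list[str]]:
    """Group nodes into batches whose dependencies are all satisfied."""
    sorter = TopologicalSorter(dependencies)  # maps node -> its prerequisites
    sorter.prepare()                          # raises CycleError on a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())    # everything unblocked right now
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical phase graph for the photo organizer.
phases = {
    "phase-0": [],
    "phase-1": ["phase-0"],
    "phase-2": ["phase-1"],
    "phase-3": ["phase-0"],
}
print(execution_batches(phases))
```

Here phase-1 and phase-3 both depend only on phase-0, so they land in the same batch.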

research.md is generated when the tech stack includes choices that benefit from comparison. Its purpose is to document the trade-offs the agent considered before committing to a library or approach:

# Research: Photo Organizer

## Drag-and-Drop

**Options evaluated**:
- Native HTML5 drag-and-drop API: Available without libraries, works for
  basic reordering.
- @dnd-kit/react: Accessible, touch-friendly, built for React.

**Decision**: @dnd-kit/react. FR-002 requires drag-and-drop and the native
HTML5 API has known issues with touch events on mobile.

The depth of research.md scales with the number of decision points in the tech stack. A simple, well-defined stack produces a short research file or none at all. A stack with multiple contested library choices produces a longer one. You do not control this directly. It follows from the complexity of your prompt.

data-model.md translates the domain concepts from the spec into concrete entity definitions:

# Data Model: Photo Organizer

## Entities

### Album
| Field      | Type    | Constraints                | Notes                        |
|------------|---------|----------------------------|------------------------------|
| id         | INTEGER | PRIMARY KEY, AUTOINCREMENT |                              |
| name       | TEXT    | NOT NULL                   | Derived from date by default |
| date       | TEXT    | NOT NULL                   | ISO 8601 format              |
| sort_order | INTEGER | NOT NULL, DEFAULT 0        | Persisted per FR-002         |

### Photo
| Field      | Type    | Constraints                | Notes                        |
|------------|---------|----------------------------|------------------------------|
| id         | INTEGER | PRIMARY KEY, AUTOINCREMENT |                              |
| album_id   | INTEGER | NOT NULL, FK → Album.id    |                              |
| filename   | TEXT    | NOT NULL                   | Local path, not a URL        |
| created_at | TEXT    | NOT NULL                   | Used for in-album sort order |

## Constraints
- Albums are never nested (FR-004). No parent_id column.
- sort_order is the persisted user-defined order (resolved 2026-03-25, clarify step).

The schema reflects exactly what the spec says, nothing more. If a column would serve a feature not covered by a functional requirement, it does not appear here.

contracts/ is a directory of API specification files, one per surface area. For a REST API, each file covers a single resource. For a frontend-only application, the contracts describe the interface between components and the data layer: function signatures and their expected inputs and outputs.

quickstart.md contains the key validation scenarios you should be able to verify manually once implementation is complete. These map directly to the acceptance scenarios from the spec, reformatted as a numbered checklist:

# Quickstart Validation

1. Open the app in Firefox and Chrome. Confirm albums are visible.
2. Drag an album to a new position. Reload the page. Confirm the new order persisted.
3. Click into an album. Confirm photos are shown in a tile grid.
4. Attempt to nest one album inside another. This must not be possible.

How to review the plan

Before running /speckit.tasks, read through plan.md with the spec open alongside it. Two things to check:

Gate status: Every compliance gate from the constitution should be marked with [x]. An unchecked gate means the plan found a conflict with your stated principles and could not resolve it. In that case, the plan should include a Complexity Tracking section below the gates that documents the exception and why it is justified. If that section is missing, rerun /speckit.plan with additional guidance:

/speckit.plan Same stack as before. Flag as a justified exception: the feature
requires four separate React apps because each album type ships as an
independent micro-frontend.

Phase dependencies: Each phase in plan.md states what it depends on. Follow those dependencies through from Phase 0 to the final phase. If a deliverable is listed in Phase 2 but its dependency is not covered by any prior phase, that gap will surface as a broken task during /speckit.implement. Better to catch it here.

Reviewing research.md is optional but useful when you disagree with a library choice or when a decision was made on assumptions you want to override. If you want to change a decision, update your tech stack description and rerun the command. The research file regenerates from your revised input.

data-model.md is worth a quick scan to confirm that constrained fields match your spec exactly. The sort_order column for albums, for example, should only exist if your clarify step resolved album reordering as persisted. If the model contains a column that maps to a feature you explicitly excluded, the plan has drifted from the spec and a rerun is the right fix.

What the tech stack prompt does and does not control

The tech stack argument controls library and framework choices. It does not control project structure, test strategy, or implementation phases. Those come from the constitution and the spec. If you want to influence those, update the constitution or the spec before running the plan.

One common mistake is putting constraints in the tech stack prompt that belong in the spec. "No authentication" is not a tech stack decision. It is a scope decision that belongs in the spec as a functional requirement or explicit exclusion. Putting it in the plan prompt works around the process rather than encoding it in the right place. The next time someone updates the spec and reruns the plan, the constraint disappears.

If you initialized the project for an existing codebase, make sure your constitution includes principles that reference the existing architecture. Without that context, the plan will treat your codebase as a blank slate and may generate data models and contracts that duplicate what is already there. After running /speckit.tasks, use /speckit.analyze to verify that the generated artifacts are consistent with each other and with the existing code before moving to implementation.

Breaking it down with /speckit.tasks

No arguments needed. The command reads plan.md, spec.md, and whatever optional artifacts exist (data-model.md, contracts/, research.md, quickstart.md) from your feature directory:

/speckit.tasks

The output is a single file: tasks.md. Every task in it is a checkbox the agent can work through one by one during /speckit.implement.

How tasks.md is structured

Tasks are organized into phases. The first two phases are always the same:

Phase 1: Setup covers project initialization, dependency installation, and configuration files.
Phase 2: Foundational covers blocking prerequisites that multiple user stories depend on. Shared data models, base components, or utility modules go here.

After that, each user story from the spec gets its own phase, ordered by priority. A P1 story becomes Phase 3, a P2 story becomes Phase 4, and so on. The final phase handles polish and cross-cutting concerns like error states, accessibility, or performance.
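The phase numbering follows mechanically from the story priorities. The sketch below reproduces the headings of the photo organizer example under two assumptions: that the second and third stories carry priorities P2 and P3 (the example spec only shows P1), and that stories are given as (id, title, priority) tuples. The function name assign_phases is hypothetical.

```python
def assign_phases(stories: list[tuple[str, str, str]]) -> list[str]:
    """Build phase headings from (story id, title, priority) tuples."""
    # Lower priority number first: P1 before P2 before P3.
    ordered = sorted(stories, key=lambda story: int(story[2].lstrip("P")))
    phases = ["Phase 1: Setup", "Phase 2: Foundational"]
    for offset, (story_id, title, _priority) in enumerate(ordered):
        phases.append(f"Phase {3 + offset}: {title} [{story_id}]")
    phases.append(f"Phase {3 + len(ordered)}: Polish")
    return phases


stories = [
    ("US2", "Reorder Albums", "P2"),
    ("US1", "View Photo Albums", "P1"),
    ("US3", "Photo Tile Grid", "P3"),
]
for heading in assign_phases(stories):
    print(heading)
```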

Within each user story phase, the ordering follows a fixed pattern: tests (if requested), then models, then services, then endpoints or UI, then integration. That ordering is deliberate. The agent writes tests before implementation when TDD is specified in the constitution. If TDD is not part of your principles, test tasks only appear when the spec explicitly requests them.

A stripped-down example for the photo organizer:

## Phase 1: Setup

- [ ] T001 Initialize React project with Create React App in /photo-organizer
- [ ] T002 Install @dnd-kit/react in /photo-organizer

## Phase 2: Foundational

- [ ] T003 Create Album type definition in src/types/album.ts
- [ ] T004 Create Photo type definition in src/types/photo.ts
- [ ] T005 [P] Create localStorage utility for reading/writing albums in src/utils/storage.ts
- [ ] T006 [P] Create date grouping utility in src/utils/groupByDate.ts

## Phase 3: View Photo Albums [US1]

- [ ] T007 [US1] Create AlbumList component in src/components/AlbumList.tsx
- [ ] T008 [US1] Create AlbumCard component in src/components/AlbumCard.tsx
- [ ] T009 [US1] Render AlbumList in App.tsx with sample data
- [ ] T010 [US1] Add empty album placeholder in AlbumCard.tsx

## Phase 4: Reorder Albums [US2]

- [ ] T011 [US2] Wrap AlbumList with DragDropProvider in src/components/AlbumList.tsx
- [ ] T012 [US2] Persist new order to localStorage on drag end in src/utils/storage.ts
- [ ] T013 [US2] Load persisted order on app start in App.tsx

## Phase 5: Photo Tile Grid [US3]

- [ ] T014 [P] [US3] Create PhotoGrid component in src/components/PhotoGrid.tsx
- [ ] T015 [P] [US3] Create PhotoTile component in src/components/PhotoTile.tsx
- [ ] T016 [US3] Wire album click to navigate to PhotoGrid in App.tsx

## Phase 6: Polish

- [ ] T017 Add loading state for album list in src/components/AlbumList.tsx
- [ ] T018 Verify all quickstart validation scenarios pass

Task format

Every task follows a strict format:

- [ ] T001 [P] [US1] Description with file path

The checkbox (- [ ]) is required. T001 is a sequential ID. [P] marks the task as parallelizable, meaning an agent that supports concurrent execution can run it alongside other [P] tasks in the same phase. [US1] maps the task back to a user story from the spec. Setup and foundational tasks have no story label. The description names the file or directory the task touches, so the agent knows exactly where to write code. Pure verification tasks such as T018 are the exception.
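Because the format is strict, a task line can be parsed with a single regular expression. The sketch below is one way to do that if you want to script against tasks.md yourself. The Task class, the regex, and parse_task are assumptions for this example and not something Spec Kit ships.

```python
import re
from dataclasses import dataclass
from typing import Optional

TASK_RE = re.compile(
    r"^- \[(?P<done>[ xX])\] "      # checkbox, checked or not
    r"(?P<id>T\d+)"                 # sequential ID such as T001
    r"(?P<parallel> \[P\])?"        # optional parallel marker
    r"(?: \[(?P<story>US\d+)\])?"   # optional user story label
    r" (?P<description>.+)$"
)


@dataclass
class Task:
    id: str
    done: bool
    parallel: bool
    story: Optional[str]
    description: str


def parse_task(line: str) -> Optional[Task]:
    """Parse one tasks.md line; return None for headings and other text."""
    match = TASK_RE.match(line.strip())
    if match is None:
        return None
    return Task(
        id=match["id"],
        done=match["done"].lower() == "x",
        parallel=match["parallel"] is not None,
        story=match["story"],
        description=match["description"],
    )


print(parse_task("- [ ] T014 [P] [US3] Create PhotoGrid component in src/components/PhotoGrid.tsx"))
```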

Parallel markers

Tasks marked [P] can run at the same time because they touch different files and have no dependencies on incomplete tasks within the same phase. In the example above, T005 and T006 are both [P] because storage.ts and groupByDate.ts have no coupling. T003 and T004 are not marked [P] because type definitions are often referenced by later tasks in the same phase and the cost of sequencing them is negligible.

Agents that do not support parallel execution simply ignore the [P] marker and run tasks in order. The sequential IDs (T001, T002, ...) are the canonical execution order regardless of parallel markers.
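One simple way to picture how an agent might use the marker is to merge consecutive [P] tasks into a batch and run everything else on its own. That grouping rule is an assumption made for this sketch, as is the function name parallel_batches. Actual scheduling is up to the agent.

```python
import re


def parallel_batches(phase_lines: list[str]) -> list[list[str]]:
    """Group task IDs of one phase: consecutive [P] tasks share a batch."""
    groups: list[tuple[bool, list[str]]] = []
    for line in phase_lines:
        match = re.match(r"^- \[[ xX]\] (T\d+)( \[P\])?", line)
        if not match:
            continue  # skip headings and blank lines
        task_id, parallel = match.group(1), match.group(2) is not None
        if parallel and groups and groups[-1][0]:
            groups[-1][1].append(task_id)        # extend the open [P] batch
        else:
            groups.append((parallel, [task_id])) # start a new batch
    return [ids for _, ids in groups]


phase_2 = [
    "- [ ] T003 Create Album type definition in src/types/album.ts",
    "- [ ] T004 Create Photo type definition in src/types/photo.ts",
    "- [ ] T005 [P] Create localStorage utility for reading/writing albums in src/utils/storage.ts",
    "- [ ] T006 [P] Create date grouping utility in src/utils/groupByDate.ts",
]
print(parallel_batches(phase_2))
```

An agent without parallel support would flatten the batches and arrive back at the sequential ID order.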

What to check before moving on

After the task list is generated, the command prints a summary: total task count, tasks per user story, parallel opportunities, and a suggested MVP scope (typically just User Story 1). Scan this before running /speckit.implement.

Three things to look for:

Missing coverage. Every functional requirement from the spec should trace to at least one task. If FR-004 ("albums must not be nested") has no corresponding task, the agent will not enforce it.
Wrong phases. A task that belongs in Phase 2 (foundational) but landed in a user story phase means the agent might try to build on something that does not exist yet. Move it up.
Overly broad tasks. A task like "implement the album feature" is too vague for the agent to execute reliably. Each task should target a single file and a single concern. If a task description covers two files, it should be two tasks.
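The third check lends itself to a quick script. The sketch below flags tasks whose description names no file path or more than one. The path heuristic (a token that starts with "/" or ends in a file extension) is an assumption and will flag verification-only tasks such as T018 as well. T019 and T020 are invented examples of tasks that are too broad.

```python
import re


def path_tokens(description: str) -> list[str]:
    """Return tokens that look like a file or directory path (heuristic)."""
    tokens = [token.strip(".,;:()") for token in description.split()]
    return [
        token for token in tokens
        if token.startswith("/") or re.search(r"\.[A-Za-z0-9]+$", token)
    ]


def lint_tasks(tasks_md: str) -> list[tuple[str, str]]:
    """Flag tasks that name no file path or more than one."""
    findings = []
    for line in tasks_md.splitlines():
        match = re.match(r"^- \[[ xX]\] (T\d+)(?: \[[^\]]+\])* (.+)$", line)
        if not match:
            continue
        task_id, description = match.group(1), match.group(2)
        count = len(path_tokens(description))
        if count == 0:
            findings.append((task_id, "no file path"))
        elif count > 1:
            findings.append((task_id, "more than one file path"))
    return findings


SAMPLE_TASKS = """\
- [ ] T007 [US1] Create AlbumList component in src/components/AlbumList.tsx
- [ ] T019 [US1] Implement the album feature
- [ ] T020 [US2] Update src/utils/storage.ts and src/App.tsx for reordering
"""
print(lint_tasks(SAMPLE_TASKS))
```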

If your team tracks work in GitHub Issues, /speckit.taskstoissues can push the generated tasks there directly.

Implementation with /speckit.implement

Run it with no arguments. The command reads everything it needs from the feature directory:

/speckit.implement

What happens when you run it

The first thing the command does is check for checklists. If your feature directory contains a checklists/ folder (created by /speckit.checklist), it reads every file in it and counts incomplete items. Any checklist with unchecked items triggers a prompt asking whether you want to proceed or stop. If no checklists exist, this step is skipped.
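The counting step amounts to tallying checked and unchecked boxes per file. In the sketch below, the checklist file names and contents are invented, and checklist_report and should_prompt are hypothetical helpers that only mirror the behaviour described above.

```python
import re


def checklist_report(checklists: dict[str, str]) -> dict[str, tuple[int, int]]:
    """Map each checklist name to (completed items, incomplete items)."""
    report = {}
    for name, text in checklists.items():
        done = len(re.findall(r"^\s*- \[[xX]\]", text, flags=re.MULTILINE))
        pending = len(re.findall(r"^\s*- \[ \]", text, flags=re.MULTILINE))
        report[name] = (done, pending)
    return report


def should_prompt(report: dict[str, tuple[int, int]]) -> bool:
    """True when any checklist still has unchecked items."""
    return any(pending > 0 for _, pending in report.values())


# Invented checklist contents, keyed by file name.
checklists = {
    "ux.md": "- [x] Empty state defined\n- [ ] Drag handle is keyboard accessible\n",
    "requirements.md": "- [x] Every FR is testable\n- [X] No implementation details\n",
}
report = checklist_report(checklists)
print(report, should_prompt(report))
```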

Next, the command loads the full implementation context: tasks.md for the execution plan, plan.md for the tech stack and file structure, and any optional artifacts like data-model.md, contracts/, or research.md. All of these were generated in earlier steps.

Then it verifies project setup. For a React project, that means checking for .gitignore and ensuring it includes node_modules/, build/, and .env*. If a file is missing, the command creates it. If one exists, it appends any critical patterns that are absent.
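The create-or-append behaviour is easy to reason about as a pure function over the file contents. The sketch below works on strings rather than touching the file system. The function name ensure_patterns is hypothetical, and the pattern list is the one named above for a React project.

```python
from typing import Optional

REQUIRED_PATTERNS = ["node_modules/", "build/", ".env*"]


def ensure_patterns(existing: Optional[str], required=REQUIRED_PATTERNS) -> str:
    """Return .gitignore content containing every required pattern.

    Pass None when the file does not exist yet. Existing lines are kept
    in order; only missing patterns are appended.
    """
    lines = existing.splitlines() if existing else []
    present = {line.strip() for line in lines}
    missing = [pattern for pattern in required if pattern not in present]
    return "\n".join(lines + missing) + "\n"


print(ensure_patterns("node_modules/\ndist/\n"))
```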

Phase-by-phase execution

Tasks run in the order defined by tasks.md. Phase 1 first, then Phase 2, and so on. Within each phase, the agent processes tasks sequentially unless they carry the [P] marker. Parallel tasks can run at the same time because they touch different files.

For the photo organizer, that looks like this:

Phase 1 initializes the React project and installs @dnd-kit/react.
Phase 2 creates the type definitions and utility modules. T005 (storage.ts) and T006 (groupByDate.ts) are [P], so they run concurrently.
Phase 3 builds the album list UI, wiring components to sample data.
Phase 4 adds drag-and-drop reordering with localStorage persistence.
Phase 5 creates the photo grid view.
Phase 6 handles polish: loading states and quickstart validation.

Each completed task gets its checkbox flipped from - [ ] to - [X] in tasks.md. You can open the file at any point to see how far the agent has progressed.

Error handling

When a sequential task fails, the command stops. It prints the error with enough context for you to diagnose the problem: which file was being written, what went wrong, and a suggested next step. Fix the issue and rerun /speckit.implement. The command picks up where it left off by reading the checkboxes in tasks.md and skipping any task already marked [X].
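Resuming from the checkboxes means the task file itself is the progress log. A minimal sketch of that lookup follows. The function name next_pending is an assumption, and the real command may track more state than this.

```python
import re
from typing import Optional


def next_pending(tasks_md: str) -> Optional[str]:
    """Return the ID of the first unchecked task, or None if all are done."""
    for line in tasks_md.splitlines():
        match = re.match(r"^- \[([ xX])\] (T\d+)", line)
        if match and match.group(1) == " ":
            return match.group(2)
    return None


progress = """\
## Phase 1: Setup
- [X] T001 Initialize React project with Create React App in /photo-organizer
- [X] T002 Install @dnd-kit/react in /photo-organizer

## Phase 2: Foundational
- [ ] T003 Create Album type definition in src/types/album.ts
- [ ] T004 Create Photo type definition in src/types/photo.ts
"""
print(next_pending(progress))  # T003
```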

Parallel tasks behave differently. If one [P] task fails while others in the same group succeed, the command continues with the successful tasks and reports the failure at the end of the phase. This matters for Phase 2 in the photo organizer: if T005 (storage.ts) fails but T006 (groupByDate.ts) succeeds, you only need to fix the storage utility before proceeding.

Completion validation

After the last task finishes, the command runs a final check. It verifies that every task in tasks.md is marked [X], that the implemented features match the original spec, and that any tests defined in the task list pass. The output is a summary: total tasks completed, any tasks that were skipped or failed, and a status for each user story.

At this point, open the app and test it against your spec. For the photo organizer, that means loading the page, verifying albums render from localStorage, dragging an album to reorder it, refreshing to confirm persistence, and clicking into a photo grid. The quickstart scenarios from quickstart.md (if you generated one during /speckit.plan) map directly to these manual checks.

Using Spec Kit in an existing project

Everything up to this point assumed a greenfield project. Most real work happens in an existing codebase. The workflow is the same six steps, but four parts of it need extra care: initialization, the constitution, the spec, and planning.

Initialization

Instead of creating a new directory, initialize Spec Kit inside your existing repository:

specify init --here --ai copilot

This adds the .specify/ directory and registers slash commands without touching your existing code. If your project is large, consider scoping the initialization to a specific subdirectory. A monorepo with ten services does not need the agent reasoning about all ten when you are adding a feature to one. Navigate to the service directory first and run specify init --here from there. The agent's context stays smaller and more accurate.

Writing a constitution that respects existing code

The constitution is the single most important step for brownfield projects. In a greenfield project, the constitution describes how you want to work. In an existing project, it must also describe what already exists and what the agent must not touch.

Start by passing explicit guidance that references your codebase:

/speckit.constitution This project is an existing ASP.NET Core Web API with
a layered architecture: Controllers, Services, and Repositories. Do not
regenerate or restructure any existing class. New code must follow the
existing patterns in the Services/ and Repositories/ directories. All new
endpoints must be added to existing controllers where appropriate rather
than creating new ones. Entity Framework Core is the ORM. Do not introduce
a second data access pattern.

Scoping the spec to a change, not the system

When you run /speckit.specify, describe only the feature you are adding, not the system it lives in. The spec should read as a change request, not as a system overview:

/speckit.specify Add a notification preferences page where users can toggle
email and push notifications per category. Categories are managed by admins
and already exist in the system as NotificationCategory entities.

Notice the last sentence. It tells the agent that NotificationCategory already exists. This matters because it prevents the agent from writing a functional requirement to create that entity. During the clarify step, expect a question about how the existing entity is accessed. Answer at the functional level: "notification categories are read-only for this feature and accessed through the existing ICategoryRepository."

Planning around existing architecture

The /speckit.plan step is where brownfield projects diverge most from greenfield. Your tech stack prompt should describe what is already in place, not what you are choosing:

/speckit.plan The project uses ASP.NET Core 8, Entity Framework Core with
SQL Server, and a React frontend served from wwwroot. Authentication is
handled by ASP.NET Identity and is already configured. The new feature
adds a NotificationPreferencesController, a NotificationPreferenceService,
and a React page at /settings/notifications. Use the existing repository
pattern in src/Repositories for data access.

This level of specificity gives the plan step enough context to generate a data-model.md that extends the existing schema rather than replacing it, and a plan.md whose phases reference existing files. Check the generated plan.md carefully: every phase should say "extend" or "add to" rather than "create" when referencing something that already exists.
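That review of plan.md can be partly mechanized. The following Python sketch flags lines that pair a creation verb with the name of something you know already exists. The verb list, the sample plan text, and the substring matching are all assumptions made for illustration, so treat hits as prompts for a closer read rather than as verdicts.

```python
import re

# Verbs that suggest new code is being generated (an assumed, incomplete list).
CREATE_VERBS = re.compile(r"\b(create|generate|scaffold)\b", re.IGNORECASE)


def suspicious_plan_lines(
    plan_md: str, existing_names: list[str]
) -> list[tuple[int, str]]:
    """Return (line number, text) for lines that 'create' an existing name."""
    hits = []
    for number, line in enumerate(plan_md.splitlines(), start=1):
        if not CREATE_VERBS.search(line):
            continue
        if any(name in line for name in existing_names):
            hits.append((number, line.strip()))
    return hits


# Hypothetical plan excerpt for the notification preferences feature.
plan = """\
Phase 1: Extend NotificationCategory queries in ICategoryRepository
Phase 2: Create NotificationCategory entity and migration
Phase 3: Create NotificationPreferenceService
"""
existing = ["NotificationCategory", "ICategoryRepository"]

for number, line in suspicious_plan_lines(plan, existing):
    print(f"plan.md:{number}: {line}")
# plan.md:2: Phase 2: Create NotificationCategory entity and migration
```

Phase 3 is not flagged because NotificationPreferenceService is genuinely new. Plain substring matching will also flag longer names that happen to contain an existing one, which is acceptable for a review aid.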

Running analyze before implementation

For existing projects, /speckit.analyze is not optional. Run it after /speckit.tasks and before /speckit.implement:

/speckit.analyze

The command cross-checks the generated artifacts against each other and against the files in your repository. It catches problems like:

A task that creates a file that already exists.
A data model that defines an entity with a name that conflicts with an existing class.
A contract that specifies an API route already claimed by another controller.

Fix any issues it reports before running /speckit.implement. Catching a naming conflict in the analyze step costs seconds. Catching it after implementation means untangling generated code from existing code.
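As a complement to /speckit.analyze, you can run a cheap pre-flight check of your own for the first problem in that list. The Python sketch below is not what the command does internally. It assumes that task lines mention file paths relative to the repository root, and the CategoryRepository.cs file is a hypothetical example.

```python
import re
import tempfile
from pathlib import Path

# Tokens that look like file paths with a few common extensions (assumed set).
PATH_TOKEN = re.compile(r"[\w./-]+\.(?:cs|ts|tsx|js|json|md)\b")
CREATE_WORD = re.compile(r"\bcreate\b", re.IGNORECASE)


def create_conflicts(tasks_md: str, repo_root: Path) -> list[str]:
    """Return paths that a 'create' task mentions but that already exist."""
    conflicts = []
    for line in tasks_md.splitlines():
        if not CREATE_WORD.search(line):
            continue
        for token in PATH_TOKEN.findall(line):
            if (repo_root / token).is_file():
                conflicts.append(token)
    return conflicts


# Demo against a throwaway directory standing in for the repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    existing = root / "src" / "Repositories" / "CategoryRepository.cs"
    existing.parent.mkdir(parents=True)
    existing.write_text("// existing code\n")

    tasks = """\
- [ ] T003 Create src/Repositories/CategoryRepository.cs
- [ ] T004 Create src/Services/NotificationPreferenceService.cs
"""
    conflicts = create_conflicts(tasks, root)

print(conflicts)  # ['src/Repositories/CategoryRepository.cs']
```

In a real project you would pass the repository root and the contents of the generated tasks.md instead of the temporary directory and the inline sample.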

A practical workflow summary

For an existing project, the adjusted sequence looks like this:

1. Run specify init --here --ai copilot from the relevant subdirectory.
2. Run /speckit.constitution with explicit preservation rules, naming existing patterns and boundaries.
3. Run /speckit.specify describing only the new feature, referencing existing entities by name.
4. Run /speckit.clarify if markers exist. Answer questions at the functional level, pointing to existing interfaces rather than proposing new ones.
5. Run /speckit.plan with a tech stack prompt that describes the existing architecture, not a new one.
6. Run /speckit.tasks to generate the task breakdown.
7. Run /speckit.analyze to validate artifacts against existing code. Fix any conflicts.
8. Run /speckit.implement to execute the task list.

The extra steps (scoping initialization, writing a defensive constitution, running analyze) add roughly ten minutes to the workflow. That is far less time than debugging an agent that regenerated your service layer from scratch.

What works well and what does not

The structured sequence is the main thing Spec Kit gets right. When you are forced to clarify requirements before the plan is written, and to review the plan before the agent writes code, you catch misunderstandings early. That sounds obvious. In practice, most AI-assisted development skips straight to implementation, and the feedback loop for discovering a bad assumption is a broken feature rather than a clarifying question.

The task breakdown from /speckit.tasks is more reliable than asking the agent to plan and implement in one shot. The parallel markers help agents that support concurrent task execution, and the explicit ordering avoids the common failure mode where an agent tries to implement a feature before it has written the test scaffolding.

Greenfield projects get the most out of the workflow. Starting from nothing, the full six-step sequence feels proportionate: there is no existing code for the agent to misread, and the constitution you write in step one shapes every decision that follows.

Brownfield is harder

Adding Spec Kit to an existing codebase is a different story. A well-documented failure mode is this: the agent is asked to add a feature that builds on top of existing code. During the planning step, it researches the existing classes and produces accurate notes. But during implementation it treats those descriptions as new requirements and regenerates the classes from scratch, creating duplicates instead of extending what was already there.

The underlying issue is that, during implementation, the agent must distinguish "this describes something that already exists" from "this is a new requirement." That distinction can break down as the context grows longer and the codebase more complex. Running /speckit.analyze after task generation helps catch inconsistencies between the plan and existing code before implementation begins, but it only validates consistency across artifacts. It does not rewrite the constitution or prevent the agent from regenerating existing classes.

A practical workaround is to write explicit constitution articles that say "do not regenerate code that already exists in the following modules." Keeping the scope of what the agent sees small and precise also helps. Initialize Spec Kit for a subset of the codebase rather than the full repo. Neither approach is a guarantee, but both reduce the failure mode.

When your spec changes

Spec Kit creates a new git branch for every spec. That design treats a spec as the source of truth for a single change request, not for a feature over its entire lifetime. GitHub's blog describes specs as "living artifacts that evolve with the project," but the per-branch model suggests Spec Kit currently operates at the spec-first level, not the spec-anchored level.

What this means in practice: if the spec for a feature changes midway through implementation, you update the spec files in the branch, rerun /speckit.clarify and /speckit.plan as needed, and regenerate the task list before continuing. The spec-driven methodology documents this as expected: the spec is the source of truth and code is regenerated from it. But this only works reliably when the change is caught before implementation begins. A spec change discovered after /speckit.implement has run requires manual reconciliation between the updated plan and the already-generated code.
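One way to notice that downstream artifacts need regenerating is to compare when each file was last changed. The Python sketch below works on a plain mapping of file names to timestamps. The upstream-to-downstream ordering of spec.md, plan.md, and tasks.md follows the workflow described in this article, and the staleness rule itself is an assumption, not a Spec Kit feature.

```python
# Artifacts in upstream-to-downstream order.
ORDER = ["spec.md", "plan.md", "tasks.md"]


def stale_artifacts(mtimes: dict[str, float]) -> list[str]:
    """Return artifacts that are older than something upstream of them."""
    stale = []
    newest_upstream = float("-inf")
    for name in ORDER:
        if mtimes[name] < newest_upstream:
            stale.append(name)
        newest_upstream = max(newest_upstream, mtimes[name])
    return stale


# spec.md was edited (t=300) after plan.md (t=200) and tasks.md (t=250).
print(stale_artifacts({"spec.md": 300.0, "plan.md": 200.0, "tasks.md": 250.0}))
# ['plan.md', 'tasks.md']
```

In practice the timestamps could come from Path.stat().st_mtime or from the commit history. File modification times are reset by a fresh checkout, so commit timestamps are the more reliable source.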

Branching to explore alternatives

One use of the git branching model that does work well is running the workflow twice on the same spec with different inputs to /speckit.plan. Because you provide the tech stack as a prompt, you can branch at plan time and generate two separate implementations: one with WebSockets and one with server-sent events, for example. The task breakdowns stay separate, and you can compare the implementations before committing to one. The spec-driven.md documentation explicitly describes this as a design goal: "generate multiple implementation approaches from the same specification to explore different optimization targets."

The markdown overhead

Spec Kit generates a substantial number of files during a single workflow: spec.md, plan.md, research.md, data-model.md, contracts/, tasks.md, quickstart.md. For a medium-sized feature on an existing codebase, reviewing all of these can take longer than reviewing the eventual code. For a feature in the 3 to 5 story point range, the time spent reviewing Spec Kit artifacts can match the time it would have taken to implement the feature directly with standard AI-assisted coding.

This is not a flaw in the tool. It is a cost you need to account for. The artifacts are useful for large features with multiple contributors, where traceability between requirement and implementation matters. For a solo developer working on a well-understood codebase, the return on that review time is lower.

Where spec quality is the bottleneck

A vague spec produces a vague plan, and /speckit.implement cannot recover from a task breakdown that lacks precision. The tool surfaces the problem earlier than traditional development, but it does not eliminate it. A common point of confusion is when to stay at the functional level and when to add technical detail. The workflow documentation is not entirely consistent here, and developers who have spent years writing implementation-heavy tickets will instinctively reach for technical decisions earlier than the process expects.

The [NEEDS CLARIFICATION] markers in the spec help, but they only cover what the model identifies as ambiguous. A requirement that reads as precise but contains an unstated assumption will pass through without a marker and surface as a bug later.
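Before moving on to planning, it is easy to confirm that no markers are left in the spec. The Python sketch below assumes a marker is written either as a bare [NEEDS CLARIFICATION] or with a question after a colon. The second form and the sample requirements are assumptions for illustration.

```python
import re

# Matches "[NEEDS CLARIFICATION]" and "[NEEDS CLARIFICATION: some question]".
MARKER = re.compile(r"\[NEEDS CLARIFICATION(?::\s*(?P<question>[^\]]*))?\]")


def open_markers(spec_md: str) -> list[tuple[int, str]]:
    """Return (line number, question text) for every unresolved marker."""
    found = []
    for number, line in enumerate(spec_md.splitlines(), start=1):
        for match in MARKER.finditer(line):
            found.append((number, (match.group("question") or "").strip()))
    return found


# Hypothetical spec excerpt.
spec = """\
- FR-001: Users can toggle email notifications per category.
- FR-002: Users can toggle push notifications [NEEDS CLARIFICATION: per device or per account?]
- FR-003: Preferences persist across sessions [NEEDS CLARIFICATION]
"""

print(open_markers(spec))
# [(2, 'per device or per account?'), (3, '')]
```

A clean result only tells you the model has no open questions. It cannot detect the unstated assumptions described above, so a human read of the requirements is still needed.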

Agent capability matters more here than in open-ended prompting. The clarify and plan steps rely on the model reasoning about gaps and trade-offs. A weaker model produces shallower questions in /speckit.clarify and a less useful plan. On models with smaller context windows, the implementation phase also runs into trouble when tasks reference earlier files that have scrolled out of the active context.

Conclusion

Spec-Driven Development is not a new idea. Writing specs before code has been standard practice for decades. What Spec Kit changes is the cost of doing it properly. When a well-written spec directly generates a task breakdown and implementation, the incentive to skip it disappears.

The six-step workflow forces the right conversations early. You clarify what you are building before you pick a tech stack. You validate the plan before the agent writes a single line of code. That sequence reduces the kind of rework that comes from discovering a misunderstanding mid-implementation.

Spec Kit works best when you treat the spec as something you will update, not a deliverable you hand off. Write it precisely and the agent has less room to guess wrong.

References

GitHub. (2025). Spec Kit: Toolkit to help you get started with Spec-Driven Development. https://github.com/github/spec-kit

GitHub. (2025). Spec-Driven Development methodology. https://github.com/github/spec-kit/blob/main/spec-driven.md

