Spec-Flow is a workflow toolkit for Claude Code that transforms how you build software with AI. Instead of ad-hoc prompting, you get a structured pipeline that takes ideas from specification to production.
/feature "add user authentication"
That's it. Spec-Flow handles the rest: writing specs, planning architecture, breaking down tasks, implementing with TDD, running quality gates, and deploying.
| Without Spec-Flow | With Spec-Flow |
|---|---|
| "What were we building again?" | Every decision tracked in NOTES.md |
| Features shipped without tests | TDD enforced, quality gates block bad code |
| Context bloat slows Claude down | Auto-compaction keeps context efficient |
| Each feature starts from scratch | Reusable patterns, proven workflows |
| "Did we test this? Who approved?" | Auditable artifacts for every phase |
npx spec-flow init
This copies workflow files directly into your project (`.claude/`, `.spec-flow/`, `CLAUDE.md`). No dependency is added to your `package.json`; Spec-Flow becomes part of your codebase.
/feature "add dark mode toggle"
Spec-Flow runs you through:
spec → plan → tasks → implement → optimize → ship
Each phase produces artifacts, runs quality checks, and hands off cleanly to the next.
Your feature is deployed. All decisions documented. Tests passing. Ready for the next one.
Check for updates anytime:
npx spec-flow status
Update to the latest version:
npx spec-flow update
For CI/CD pipelines, use `--check` to fail if outdated:
npx spec-flow status --check
For focused work on a single subsystem:
/feature "user profile editing" # Start the workflow
/feature continue # Resume after a break
For complex work spanning multiple subsystems:
/epic "OAuth 2.1 authentication" # Multi-sprint orchestration
/epic continue # Resume epic work
Epics break down into parallel sprints with locked API contracts, giving you 3-5x velocity through parallelization.
For small changes that don't need the full workflow:
/quick "fix login button alignment"

| Command | What it does |
|---|---|
| `/feature "name"` | Start a feature workflow |
| `/epic "goal"` | Start a multi-sprint epic |
| `/quick "fix"` | Fast path for small changes |
| `/help` | Context-aware guidance |
| Command | Phase |
|---|---|
| `/spec` | Generate specification |
| `/plan` | Create implementation plan |
| `/tasks` | Break down into TDD tasks |
| `/implement` | Execute tasks |
| `/optimize` | Run quality gates |
| `/ship` | Deploy to staging/production |
| Command | What it does |
|---|---|
| `/init-project` | Generate project documentation |
| `/init-preferences` | Configure workflow defaults |
| `/roadmap` | Manage features via GitHub Issues |
See all 46 commands in the full reference.
/spec "user authentication"
Generates `spec.md` with:
- User scenarios in Gherkin format
- Functional and non-functional requirements
- Acceptance criteria
- Success metrics
/plan
Creates `plan.md` with:
- Architecture decisions
- Component breakdown
- Code reuse opportunities
- Risk assessment
/tasks
Produces `tasks.md` with:
- 20-30 concrete implementation tasks
- TDD sequencing (Red → Green → Refactor)
- Dependency ordering
- Acceptance criteria per task
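Dependency ordering of this kind can be pictured as a topological sort. The following is a minimal sketch, not Spec-Flow's actual implementation: the task IDs and the dependency map are hypothetical, and the real `tasks.md` format may record dependencies differently.

```python
from graphlib import TopologicalSorter

# Hypothetical task IDs; each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "T001-red-write-failing-test": set(),
    "T002-green-make-test-pass": {"T001-red-write-failing-test"},
    "T003-refactor-clean-up": {"T002-green-make-test-pass"},
    "T004-docs-update-readme": {"T001-red-write-failing-test"},
}


def order_tasks(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the tasks in an order where every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


ordered = order_tasks(DEPENDENCIES)
```

Here the Red task always precedes Green, and Green precedes Refactor. `TopologicalSorter` raises `CycleError` if two tasks depend on each other, which is a cheap way to catch an impossible plan early.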
/implement
Executes tasks with:
- Test-first development
- Specialist agents (backend, frontend, database)
- Parallel batch execution
- Automatic error recovery
/optimize
Runs parallel checks:
- Performance benchmarks
- Security scanning
- Accessibility audits
- Code review
- Test coverage validation
/ship
Handles:
- Staging deployment
- Validation checks
- Production promotion
- Rollback capability
your-project/
├── .claude/
│   ├── commands/      # Slash commands
│   ├── agents/        # Specialist agent briefs
│   ├── skills/        # Progressive disclosure content
│   └── hooks/         # Event handlers
├── .spec-flow/
│   ├── scripts/       # Automation scripts
│   ├── config/        # User preferences
│   └── templates/     # Artifact templates
├── specs/
│   └── NNN-feature/   # Feature workspaces
├── epics/
│   └── NNN-epic/      # Epic workspaces
└── docs/
    └── project/       # Project documentation
Spec-Flow can be installed as a Gemini CLI extension to bring the same structured workflow to Gemini.
- Install: From within your project root, run `npx spec-flow install-gemini-extension`. Alternatively, run `gemini extensions install .` inside your project.
- Use: The extension provides the same slash commands (`/feature`, `/plan`, etc.) directly within your Gemini CLI session, for example `/feature "add user authentication"`.
- Skills & Agents: The Gemini extension automatically adapts Spec-Flow's agents and skills to work within the Gemini environment, allowing you to leverage specialized personas like `backend-dev` or `git-workflow-enforcer`.
- Claude Code with slash command support
- Git 2.39+
- Python 3.10+
- yq 4.0+ for YAML processing
Windows users: Install Git for Windows for full compatibility.
| Guide | Description |
|---|---|
| Getting Started | Step-by-step tutorial |
| Developer Guide | Complete reference |
| Commands Reference | All slash commands |
| Architecture | System design |
| Troubleshooting | Common issues |
See a complete feature workflow in specs/001-example-feature/:
- Full specification with requirements
- 28 tasks with acceptance criteria
- Performance benchmarks
- Release notes
shadcn/ui Integration with Token Bridge Pattern - Generate OKLCH tokens + shadcn-compatible CSS variables
- 8 customization options: Style Preset, Base Color, Theme Mode, Icon Library, Font Family, Border Radius, Menu Color, Menu Accent
- Token Bridge Pattern: OKLCH tokens remain source of truth, shadcn CSS variable aliases generated
- Brownfield scanning: Auto-detect and consolidate existing color tokens
- Menu theming: New menu-specific tokens for background, hover, active, and accent styles
Ultrathink Philosophy Checkpoints - Deep thinking embedded across all workflow phases
- Phase checkpoints: Think Different (spec), Obsess+Simplify (plan), Simplify Ruthlessly (tasks), Craft Don't Code (implement)
- Progressive depth: Trivial → Standard → Complex → Epic with increasing thinking requirements
- Assumption inventory: Question everything before designing
- Complexity budgets: Justify each new component
Auto-Mode for End-to-End Workflow Execution - Run entire workflows without stopping
- `--auto` flag: Added to `/feature` and `/epic` commands
  - Continue automatically through optimize → ship → finalize
  - Skip manual approval prompts
  - Auto-merge PR when CI passes (controlled by preference)
- `/ship --auto`: Full autopilot for deployment phase
- New preferences: `deployment.auto_ship`, `deployment.auto_merge`, `deployment.auto_finalize`
- Full autopilot: `/feature "add auth" --auto` runs entire workflow unattended
CLI Version Awareness - Stay current with automatic update detection
- Version checking: `npx spec-flow status` shows installed vs latest version
  - Fetches from npm registry with graceful offline handling
  - Clear "update available" indicator when behind
- CI/CD integration: `--check` flag exits with code 1 if an update is available
- Bug fix: Fixed healthCheck variable shadowing bug
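For pipelines that script their checks in Python, the exit-code contract above can be wrapped in a small helper. This is a sketch under two assumptions: `npx` is on the PATH, and any exit code other than 1 means no update is pending. The `run` parameter is a hypothetical injection point so the logic can be exercised without touching the network.

```python
import subprocess
from typing import Callable, Sequence

CHECK_COMMAND = ["npx", "spec-flow", "status", "--check"]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(argv), check=False).returncode


def update_available(run: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Return True when the status check exits with code 1."""
    return run(CHECK_COMMAND) == 1
```

A CI step could call `update_available()` and fail the job, or merely print a warning, depending on how strict the team wants to be.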
Git Worktree Integration - Parallel development with isolated git state per feature/epic
- worktree-context.sh: Root orchestration utilities for worktree-based development
- Automatic worktree creation for features/epics
- Merge and cleanup functions for ship workflow
- Task() agent context generation with cd-first pattern
- Worktree-aware agents: Worker agent supports isolated worktree execution
- Command integration: `/feature`, `/implement-epic`, `/ship` now create and manage worktrees
- Enabled by default: `worktrees.auto_create: true` (reduces merge conflicts ~90%)
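Under the hood, worktree isolation rests on the standard `git worktree` subcommands. The sketch below only builds the command lines. The `worktrees/<slug>` directory and `feature/<slug>` branch naming are hypothetical and are not taken from `worktree-context.sh`.

```python
from pathlib import PurePosixPath


def worktree_add_command(slug: str, root: str = "worktrees") -> list[str]:
    """Build `git worktree add` for a new branch checked out in its own directory."""
    path = PurePosixPath(root) / slug          # hypothetical directory layout
    branch = f"feature/{slug}"                 # hypothetical branch naming
    return ["git", "worktree", "add", "-b", branch, str(path)]


def worktree_remove_command(slug: str, root: str = "worktrees") -> list[str]:
    """Build `git worktree remove` for cleanup after the branch is merged."""
    return ["git", "worktree", "remove", str(PurePosixPath(root) / slug)]
```

Because each worktree has its own working directory and index, two features can be in progress at once without stashing or switching branches in a shared checkout.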
Automatic Regression Test Generation - Bugs captured as tests to prevent recurrence
- regression-test-generator skill: Auto-generates framework-specific regression tests
- Framework auto-detection (Jest, Vitest, pytest, Playwright)
- Arrange-Act-Assert test structure with error ID references
- Links tests back to error-log.md entries
- Auto-invoke /debug on test failures: When tests fail during `/implement`, `/debug` is automatically invoked
  - Generates regression test for the failure (Step 3.5)
  - Updates error-log.md with test reference
- Continuous checks integration: Check 7/7 triggers auto-debug on test failures
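To make the Arrange-Act-Assert structure concrete, here is what a generated pytest regression test might look like. Everything specific in it is hypothetical: the `parse_quantity` function, the bug it once had, and the `ERR-0042` identifier are illustrations, not output from the regression-test-generator skill.

```python
def parse_quantity(raw: str) -> int:
    """Hypothetical function under test: parse a cart quantity field."""
    text = raw.strip()
    if not text:
        return 0  # the fix: blank input previously raised ValueError
    return int(text)


def test_regression_err_0042_blank_quantity_returns_zero():
    """Regression for ERR-0042 (hypothetical ID referencing error-log.md)."""
    # Arrange: the input that triggered the original failure
    raw = "   "

    # Act: exercise the fixed code path
    result = parse_quantity(raw)

    # Assert: the bug must not come back
    assert result == 0
```

Putting the error ID in both the test name and the docstring keeps the link to the error log searchable from either direction.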
/quick Command - Task() Orchestrator Pattern - Consistent architecture across all workflow commands
- quick-worker agent: Isolated agent for atomic quick change execution
- Domain detection (backend/frontend/test/docs)
- Test framework detection and execution
- Style guide validation (UI changes)
- Automatic commit with conventional message
- Delimiter-based returns: `---COMPLETED---`, `---NEEDS_INPUT---`, `---FAILED---`
- Full Q&A support: Test failure decisions batched to main context
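A delimiter-based return is easy to consume on the orchestrator side. The sketch below assumes each delimiter sits on its own line and that everything after it is the payload. That layout is an assumption, and `AgentResult` is a hypothetical container rather than a Spec-Flow type.

```python
from dataclasses import dataclass

DELIMITERS = ("---COMPLETED---", "---NEEDS_INPUT---", "---FAILED---")


@dataclass
class AgentResult:
    status: str  # "COMPLETED", "NEEDS_INPUT", "FAILED", or "UNKNOWN"
    body: str    # text following the delimiter line


def parse_agent_output(text: str) -> AgentResult:
    """Find the first delimiter line and split the status from the payload."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        marker = line.strip()
        if marker in DELIMITERS:
            body = "\n".join(lines[index + 1:]).strip()
            return AgentResult(status=marker.strip("-"), body=body)
    # No delimiter found: treat the whole output as an unclassified result.
    return AgentResult(status="UNKNOWN", body=text.strip())
```

An explicit `UNKNOWN` status gives the caller a way to detect an agent that exited without following the protocol, instead of silently treating it as success.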
Imperative Task() Architecture - Commands now properly spawn isolated agents
- Breaking Change: `/feature` and `/epic` commands rewritten to use imperative Task() spawning
  - Phase agents now return structured delimiters (`---COMPLETED---`, `---NEEDS_INPUT---`, `---FAILED---`)
  - Questions batch to main context: agents return questions, main asks user, re-spawns with answers
  - Parallel sprint execution for epics via `run_in_background: true`
- Ultra-lightweight Orchestrator Pattern
- Read state from disk, spawn isolated agents, handle Q&A, update state
- Never carry implementation details in context
- Unlimited feature/epic complexity (no context overflow)
- Updated Agents:
  - `spec-agent.md`, `plan-agent.md`: Delimiter-based return format
  - `worker.md`: WORKER_COMPLETED, WORKER_FAILED, ALL_DONE, BLOCKED delimiters
  - `initializer.md`: INITIALIZED, INIT_FAILED delimiters
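The question-batching loop described above (agent returns questions, main context asks the user, agent is re-spawned with answers) can be sketched in a few lines. This is an illustration only: agents are modeled as plain callables, `run_phase` and `spec_agent` are hypothetical names, and the real orchestrator also persists state to disk between spawns.

```python
from typing import Callable

# An agent receives the answers gathered so far and returns (status, questions).
Agent = Callable[[dict[str, str]], tuple[str, list[str]]]


def run_phase(agent: Agent, ask_user: Callable[[str], str], max_rounds: int = 5) -> str:
    """Re-spawn the agent until it stops asking questions or rounds run out."""
    answers: dict[str, str] = {}
    for _ in range(max_rounds):
        status, questions = agent(answers)       # fresh spawn, only answers carried over
        if status != "NEEDS_INPUT":
            return status                        # COMPLETED or FAILED
        for question in questions:               # batch: ask everything before re-spawning
            answers[question] = ask_user(question)
    return "FAILED"                              # gave up after max_rounds


def spec_agent(answers: dict[str, str]) -> tuple[str, list[str]]:
    """Hypothetical agent that needs one decision before it can finish."""
    question = "Which OAuth provider?"
    if question not in answers:
        return "NEEDS_INPUT", [question]
    return "COMPLETED", []
```

The orchestrator never sees implementation details, only a status and a list of questions, which is what keeps its own context small.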
Domain Memory v2 - Full Phase Isolation - Revolutionary architecture preventing context overflow
- Domain Memory System - Persistent disk-based state for unlimited iterations
- Workers pick ONE task, implement, test, update disk, exit
- Zero shared context between workers (prevents overflow)
- 13-command CLI for state management
- Phase Isolation Pattern - All phases spawn isolated agents via Task()
- Question batching: agents return questions, main asks user, agents resume
- Resumable at any point via interaction-state.yaml
- Project Setup Agents - Hybrid pattern for /init-project, /prototype, /roadmap
- Questionnaire inline, heavy generation isolated
- mgrep Semantic Search - Find code by meaning, not exact text
- Integrated as PRIMARY search in anti-duplication skill
- Added to agent boot-up rituals
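The "pick ONE task, update disk, exit" discipline from the Domain Memory bullets can be illustrated with a tiny worker. This sketch uses a JSON file for simplicity. The file name, schema, and `run_one_task` helper are hypothetical, and Spec-Flow's own state files may use a different format such as YAML.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_one_task(state_file: Path, execute: Callable[[str], bool]) -> Optional[str]:
    """Pick the first pending task, run it, persist the result, return its ID."""
    state = json.loads(state_file.read_text())
    task = next((t for t in state["tasks"] if t["status"] == "pending"), None)
    if task is None:
        return None  # nothing left to do
    task["status"] = "done" if execute(task["id"]) else "failed"
    state_file.write_text(json.dumps(state, indent=2))  # disk is the only shared memory
    return task["id"]


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "domain-memory.json"
    state_file.write_text(json.dumps({"tasks": [
        {"id": "T001", "status": "pending"},
        {"id": "T002", "status": "pending"},
    ]}))
    # Each call stands in for a fresh worker with no memory of earlier calls.
    handled = [run_one_task(state_file, lambda task_id: True) for _ in range(3)]
    final_state = json.loads(state_file.read_text())
```

Since every worker starts from the file rather than from a conversation history, the number of tasks is bounded by disk space, not by context size.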
Quality Feedback Loop System - Multi-agent voting, continuous checks, and perpetual learning
- Multi-Agent Voting - Error decorrelation through diverse sampling (MAKER algorithm)
- 3-agent voting with k=2 strategy for code reviews, security audits, breaking changes
- Temperature variation (0.5, 0.7, 0.9) decorrelates errors across agents
- Automatic in `/optimize` phase, manual via `/review --voting`
- Continuous Quality Checks - Lightweight validation during `/implement` phase
- Runs after each task batch (3-4 tasks), < 30s performance target
- 6 checks: linting (auto-fix), type checking, unit tests, coverage delta, dead code, gap detection
- Non-blocking warnings with user choice: fix now, continue, or abort
- Progressive Quality Gates - Three escalating levels throughout workflow
- Level 1: Continuous (after each batch, < 30s, warn & continue)
- Level 2: Full quality gates (`/optimize` phase, 10-15m, block deployment)
- Level 3: Critical pre-flight (`/ship`, < 2m, block production)
- On-Demand Review - New `/review` command for anytime code review
- Quick review (single agent, ~2-3 min) or comprehensive voting review (3 agents, ~5-8 min)
- Auto-fix linting, extract file:line references, generate coverage gaps
- Perpetual Learning - Auto-apply proven patterns at workflow start
- Performance optimizations (≥0.90 confidence) auto-applied
- Anti-patterns (≥0.85 confidence) generate warnings
- Custom abbreviations (≥0.95 confidence) auto-expanded
- Early Gap Detection - Find missing implementations before staging validation
- Scans for TODO/FIXME/HACK comments, placeholders, edge cases
- High-confidence gaps (≥0.8) flag likely issues before deployment
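Two of the mechanisms above reduce to simple decision rules. The sketch below reads "k=2" as "at least two of the three agents agree" and applies the confidence thresholds listed under Perpetual Learning. Both readings are assumptions: the toolkit's actual voting rule may differ (for example, a first-to-ahead-by-k scheme), and the function and key names are hypothetical.

```python
from collections import Counter
from typing import Optional

# Sampling temperatures listed above, one per voting agent.
TEMPERATURES = (0.5, 0.7, 0.9)

# Minimum confidence before a learned pattern is acted on (values from the list above).
THRESHOLDS = {
    "performance_optimization": 0.90,
    "anti_pattern": 0.85,
    "custom_abbreviation": 0.95,
}


def vote(verdicts: list[str], k: int = 2) -> Optional[str]:
    """Return the verdict that at least k agents agree on, or None if no quorum."""
    if not verdicts:
        return None
    winner, count = Counter(verdicts).most_common(1)[0]
    return winner if count >= k else None


def should_apply(kind: str, confidence: float) -> bool:
    """Return True when a pattern of this kind clears its confidence threshold."""
    return confidence >= THRESHOLDS[kind]
```

Returning `None` on a split vote leaves the decision to a human or a further round, rather than letting a single agent's opinion pass as consensus.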
Command Architecture Optimization - Cleaner package structure with 27% size reduction
- Consolidated Commands: Merged 11 archived commands into 4 active commands
- `/gate` now handles both CI and security gates
- `/create` consolidated 6 creation commands
- `/context` merged session management commands
- `/init` updated routing to new active paths
- Optimized Distribution: Excluded 48 archived commands from npm package
- Package size: 8.5 MB → 6.27 MB (27% reduction)
- Archived commands accessible via GitHub source only
- All essential functionality in 30 active commands
- Moved Essential Commands: Project, deployment, and meta commands organized in active directories
See CONTRIBUTING.md for guidelines.
See CHANGELOG.md for version history and release notes.
MIT License - see LICENSE for details.
Built by @marcusgoll