Tasker

Overview

Tasker is a multi-agent architecture built with Claude Code that facilitates planning and implementation through a simple UI: /plan and /execute. It features a TUI dashboard for monitoring progress (tmux required).

The project began as a deep dive into task decomposition: developing a protocol for it, then using that protocol as the basis for agentic planning. The task decomposition protocol is at the heart of Tasker's planning stage.

Planning is the process of breaking down a large body of work, presented as a specification document, into logical, atomic units of work, and then figuring out how to build them in an optimal fashion—parallelizing to the extent possible.

The protocol looks at work through three lenses: Logical (what needs to happen), Physical (where code lives), and Strategic (what order reduces risk). It builds the thinnest end-to-end slice first—the "steel thread"—to validate your architecture early, before you've invested in building out the full system. Every task has a clear, observable, testable "Done."

Why Use Tasker?

You can go a really long way with Claude Code using off-the-shelf capabilities for planning and implementation. I often do, using prompts like these:

Planning:

review {spec} and @README.md. then, review the source code. we're just planning,
converting specs to well-documented, self-contained tasks. These tasks will be
worked on by sub-agents, who need sufficient context to do a good job. Figure out
a concurrency strategy that you can apply across your task dependencies, where you
may safely spawn asynchronous sub-agents to work on tasks. Save a Task DAG
dependency graph along with the plan to plan.md

Execution:

I want you to read {specs} and {plan}. Use a concurrency strategy where you may
safely spawn asynchronous sub-agents to work on tasks in waves. Then, for each
task in the plan, spawn a sub-agent, concurrently when safe to do so, giving the
sub-agent the task and instruction to implement a solution for the task, and also
give it a full copy of {specs} for context about the project. Tell the sub-agent
to review the specs and the source code for the project before starting.

This simple pattern works well for small changes, whereas Tasker excels at larger projects.

With the ad-hoc approach, you're trusting Claude's interpretation of your spec and managing concurrency manually, and if your session dies, you start over. There's no verification beyond "hope it works," no observability into what's happening, and no traceability from code back to requirements.

Tasker adds structure where it matters: protocol-driven decomposition with schema validation, persistent state you can resume, LLM-as-judge verification after each task, a TUI dashboard showing progress and costs, execution bundles that give each subagent exactly the context it needs, and the ability to retry failed tasks or skip blocked ones without starting over.

You could use Tasker for small projects too—there's no good reason not to. The overhead is minimal (/plan then /execute), and you get observability and state tracking even for simple work.

Prerequisites

Technical Requirements

You need Claude Code CLI with subagent support (current versions have this), Python 3.11+ with uv for package management, and optionally tmux if you want the TUI dashboard in a split pane.

Your Preferences Matter

Tasker intentionally ships without built-in coding preferences and leaves them to your CLAUDE.md, so it can serve as a general-purpose framework. You bring your preferences and Tasker applies them in architecture, design, and implementation.

Setting up a comprehensive source of preferences is a heavy lift, but the quality of Tasker's output is proportional to the effort you put into them. Without preferences, Tasker still works, but Claude makes its own choices about style, patterns, and conventions. With good preferences, you get code that looks like you wrote it.

Your ~/.claude/CLAUDE.md (or project-level .claude/CLAUDE.md) should cover things like language and framework preferences, code style and naming conventions, testing requirements, architecture patterns you prefer (composition over inheritance, Protocols vs ABCs), error handling standards, and documentation expectations.

This is a one-time investment that pays off across all your projects. If you already have a CLAUDE.md you're happy with, you're ready to go.

Getting Started

Have a specification document for your project ready. Tasker's workspace is the project-planning sub-directory. If you save your spec file there, Tasker will find it during planning onboarding, but you could also just share a path to the file if you choose.

With the spec file ready, run claude from within the Tasker project root and invoke the /plan command. If you're using tmux, Tasker will split your screen and start a TUI dashboard.

During planning, Tasker asks about the spec document and the target project, including whether it is new or existing. Share the path to the project you want Tasker to build in. For new projects, Tasker establishes the directory structure. For existing projects, Tasker analyzes your codebase first to understand patterns, conventions, and integration points.

Once planning completes, run /execute to begin implementation. Tasker works through the task DAG, spawning isolated subagents for each task, verifying their work, and updating state as it goes. You can watch progress in the TUI, pause with a STOP file, and resume later.

Modes

Tasker operates in two distinct modes:

Planning Mode (/plan) - Transforms a specification into a directed acyclic graph (DAG) of implementable tasks
Execution Mode (/execute) - Implements tasks via context-isolated subagents with verification

┌─────────────────────────────────────────────────────────────────────────────┐
│                           PLANNING MODE (/plan)                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   spec.md ──▶ Logic Architect ──▶ Physical Architect ──▶ Task Author        │
│                     │                    │                   │               │
│                     ▼                    ▼                   ▼               │
│             capability-map.json   physical-map.json    tasks/T001.json      │
│                                                        tasks/T002.json      │
│                                                              │               │
│                                          ┌───────────────────┘               │
│                                          ▼                                   │
│                              Task-Plan-Verifier ──▶ Plan Auditor            │
│                                                          │                   │
│                                                          ▼                   │
│                                                   state.json (ready)        │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                          EXECUTION MODE (/execute)                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   For each ready task:                                                       │
│                                                                              │
│   state.json ──▶ Bundle Generator ──▶ Task Executor (isolated subagent)     │
│                                              │                               │
│                                              ▼                               │
│                                       Implementation                         │
│                                              │                               │
│                                              ▼                               │
│                                      Task Verifier ──▶ state.json (updated) │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Quick Start

1. Install Dependencies

uv sync

2. Run Planning

/plan

The planner will ask you for:

Specification - Paste directly or provide a file path. Any format works (PRDs, bullet lists, meeting notes, etc.)
Target Directory - Where the code will be written
Tech Stack (optional) - Any constraints like "Python with FastAPI"

This triggers the planning pipeline:

Logic Architect extracts capabilities from spec
Physical Architect maps capabilities to files
Task Author creates task definitions
Task-Plan-Verifier validates task quality
Plan Auditor assigns execution phases and validates DAG

3. Execute

/execute             # Begin implementation
/execute T005        # Execute a specific task
/execute --batch     # Execute all ready tasks without prompts

Note: Task validation runs automatically during /plan (Phase 4). You don't need to run /verify-plan manually unless you've edited task files after planning.

4. Monitor Progress

/tui                 # Launch interactive TUI dashboard
/status              # CLI status summary

Command Reference

Command	Type	Purpose
`/plan`	Production	Decompose spec → task DAG (runs phases 0-5 automatically)
`/execute`	Production	Implement tasks via isolated subagents
`/status`	Production	View current workflow state
`/tui`	Production	Interactive TUI dashboard
`/verify-plan`	Manual	Re-run task validation after editing task files
`/evaluate`	Manual	Generate performance report after execution

Production commands are the main workflow. Manual commands are for debugging, re-running steps, or post-execution analysis.

Architecture

Directory Structure

tasker/
├── .claude/
│   ├── agents/                    # Subagent definitions
│   │   ├── logic-architect.md     # Phase 1: Spec → capabilities
│   │   ├── physical-architect.md  # Phase 2: Capabilities → files
│   │   ├── task-author.md         # Phase 3: Files → tasks
│   │   ├── task-plan-verifier.md  # Phase 4: Pre-execution validation
│   │   ├── plan-auditor.md        # Phase 4: DAG validation & sequencing
│   │   ├── task-executor.md       # Execution: Implement tasks
│   │   └── task-verifier.md       # Execution: Verify implementations
│   ├── commands/                  # Slash commands
│   │   ├── plan.md                # /plan - Enter planning mode (production)
│   │   ├── execute.md             # /execute - Enter execution mode (production)
│   │   ├── status.md              # /status - Show dashboard (production)
│   │   ├── tui.md                 # /tui - Launch TUI (production)
│   │   ├── verify-plan.md         # /verify-plan - Re-run validation (manual/debug)
│   │   └── evaluate.md            # /evaluate - Generate report (manual/debug)
│   ├── hooks/                     # Event hooks
│   │   ├── launch-tui.sh          # Auto-launch TUI in tmux
│   │   ├── close-tui.sh           # Close TUI on completion
│   │   ├── detect-workflow.sh     # Detect /plan or /execute
│   │   └── subagent_stop.py       # Log token usage
│   └── skills/
│       └── orchestrator/
│           └── SKILL.md           # Main orchestrator skill
├── schemas/                       # JSON validation schemas
│   ├── capability-map.schema.json
│   ├── physical-map.schema.json
│   ├── task.schema.json
│   ├── execution-bundle.schema.json
│   └── state.schema.json
├── scripts/                       # Python utilities
│   ├── state.py                   # State management (single source of truth)
│   ├── bundle.py                  # Execution bundle generation
│   ├── validate.py                # DAG, steel thread, verification validation
│   ├── status.py                  # TUI launcher
│   ├── dashboard.py               # CLI dashboard
│   └── tui/                       # Textual TUI components
│       ├── app.py
│       ├── providers.py
│       ├── state_provider.py
│       └── views/
│           ├── dashboard.py
│           ├── task_detail.py
│           └── widgets.py
├── templates/                     # Example files (for reference only)
│   ├── example-spec.md            # Example specification format
│   ├── constraints.md.example     # Example constraints
│   ├── task.json.example          # Example task structure
│   └── README.md
└── project-planning/              # Generated during workflow (gitignored)
    ├── inputs/
    │   └── spec.md                # Your specification (stored verbatim)
    ├── artifacts/
    │   ├── capability-map.json    # Phase 1 output
    │   └── physical-map.json      # Phase 2 output
    ├── tasks/                     # Individual task definitions
    │   ├── T001.json
    │   ├── T002.json
    │   └── ...
    ├── bundles/                   # Execution bundles
    └── state.json                 # Workflow state

State Machine

The workflow progresses through these phases:

ingestion → logical → physical → definition → validation → sequencing → ready → executing → complete

Phase Progression:

Phase	Name	Output	Agent
0	Ingestion	spec.md saved	Orchestrator
1	Logical	capability-map.json	logic-architect
2	Physical	physical-map.json	physical-architect
3	Definition	tasks/T*.json	task-author
4	Validation	Verification results	task-plan-verifier
5	Sequencing	Phase assignments, DAG	plan-auditor
6	Ready	Planning complete	Orchestrator
7	Executing	Implementation	task-executor + task-verifier
8	Complete	All tasks done	Orchestrator

Task Status Transitions

pending ──▶ ready ──▶ running ──▶ complete
                         │
                         ├──▶ failed
                         │
                         └──▶ blocked

skipped (manual override)
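
The diagram above can be read as a small transition table. Here is a minimal sketch of that reading; `ALLOWED` and `can_transition` are hypothetical names rather than anything in Tasker's `state.py`, and allowing `skipped` from any status is an assumption based on the "manual override" note.

```python
# Transitions copied from the diagram; the names are illustrative only.
ALLOWED = {
    "pending": {"ready"},
    "ready": {"running"},
    "running": {"complete", "failed", "blocked"},
}


def can_transition(current: str, new: str) -> bool:
    """Return True if the diagram permits moving from `current` to `new`."""
    if new == "skipped":
        # Assumption: a manual override may be applied from any status.
        return True
    return new in ALLOWED.get(current, set())


print(can_transition("running", "failed"))    # True
print(can_transition("pending", "complete"))  # False
```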

Task Ordering and Scheduling

Tasker uses dynamic scheduling rather than a precomputed topological sort. Order is determined through two mechanisms:

1. Phase Assignment (Planning Time)

The plan-auditor agent assigns phases using heuristics:

Phase 1: Tasks with no dependencies (foundations)
Phase 2: Steel thread tasks (critical path)
Phase 3+: Remaining tasks grouped by domain affinity

This is a human/agent judgment call, not an algorithm. The only validation is that all dependencies must be in earlier phases.
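
The one hard rule here, that every dependency sits in an earlier phase, is easy to check mechanically. The sketch below is not Tasker's validator; `phase_violations` is a hypothetical helper, and the `phase` and `depends_on` field names are borrowed from the state.json example later in this document.

```python
def phase_violations(tasks: dict[str, dict]) -> list[tuple[str, str]]:
    """Return (task_id, dep_id) pairs where the dependency is not in an earlier phase."""
    violations = []
    for task_id, task in tasks.items():
        for dep_id in task["depends_on"]:
            if tasks[dep_id]["phase"] >= task["phase"]:
                violations.append((task_id, dep_id))
    return violations


tasks = {
    "T001": {"phase": 1, "depends_on": []},
    "T002": {"phase": 2, "depends_on": ["T001"]},
    "T003": {"phase": 2, "depends_on": ["T002"]},  # same phase as its dependency
}
print(phase_violations(tasks))  # [('T003', 'T002')]
```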

2. Ready-Task Computation (Execution Time)

At runtime, get_ready_tasks() dynamically determines what can execute:

def get_ready_tasks(state):
    """Return IDs of pending tasks whose dependencies are all complete or skipped."""
    tasks = state["tasks"]
    ready = []
    for task in tasks.values():
        if task["status"] != "pending":
            continue
        # A dependency counts as satisfied once it is complete or skipped
        if all(tasks[dep]["status"] in ("complete", "skipped")
               for dep in task["depends_on"]):
            ready.append(task["id"])
    return ready

This is a pull-based scheduler—rather than computing the full order upfront, it evaluates "what can run now?" each iteration.
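
To see how the "what can run now?" question produces waves of parallel work, here is a small self-contained simulation. It is a sketch only: `ready_ids` and `run_in_waves` are hypothetical names, and marking a task complete stands in for actually running a subagent.

```python
def ready_ids(tasks: dict[str, dict]) -> list[str]:
    """IDs of pending tasks whose dependencies are all complete or skipped."""
    done = {"complete", "skipped"}
    return [
        task_id
        for task_id, task in tasks.items()
        if task["status"] == "pending"
        and all(tasks[dep]["status"] in done for dep in task["depends_on"])
    ]


def run_in_waves(tasks: dict[str, dict]) -> list[list[str]]:
    """Simulate the pull loop: run whatever is ready, then ask again."""
    waves = []
    while wave := ready_ids(tasks):
        for task_id in wave:
            tasks[task_id]["status"] = "complete"  # stand-in for real execution
        waves.append(wave)
    return waves


tasks = {
    "T001": {"status": "pending", "depends_on": []},
    "T002": {"status": "pending", "depends_on": ["T001"]},
    "T003": {"status": "pending", "depends_on": ["T001"]},
    "T004": {"status": "pending", "depends_on": ["T002", "T003"]},
}
print(run_in_waves(tasks))  # [['T001'], ['T002', 'T003'], ['T004']]
```

Because each wave is derived from the current statuses, restarting the loop after an interruption picks up wherever the saved state left off.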

How "A before B" is Determined

Condition	Order	Mechanism
B lists A in `dependencies.tasks`	A → B	Explicit dependency
A has phase 1, B has phase 2	A → B (usually)	Phase grouping
Neither depends on other, same phase	A ∥ B	Parallel execution

The only hard constraint is explicit dependencies declared in task files. Phases are advisory groupings that align with dependency depth.

Why No Traditional Topological Sort?

Tasks are written to individual files with dependencies declared upfront
Cycle detection uses DFS (in validate.py) but doesn't produce an ordering
Execution is driven by "what's ready?" not "what's the precomputed order?"

This approach enables resumability—if execution stops mid-way, ready tasks are recomputed from current state rather than requiring the full sort to be rerun.

Data Flow Detail

Phase 1: Logical Architecture (logic-architect)

Input: project-planning/inputs/spec.md
Output: project-planning/artifacts/capability-map.json

The Logic Architect extracts the logical structure from your specification:

Domains - Major functional areas (e.g., Authentication, Storage)
Capabilities - Features within domains (e.g., User Login, Token Refresh)
Behaviors - Atomic operations with types (Input/Process/State/Output)
Flows - End-to-end user journeys that traverse behaviors

{
  "version": "1.0",
  "spec_checksum": "abc123...",
  "domains": [{
    "id": "D1",
    "name": "Authentication",
    "description": "User identity and access management",
    "capabilities": [{
      "id": "C1",
      "name": "User Login",
      "spec_ref": {
        "quote": "Users must be able to log in with email and password",
        "location": "paragraph 3"
      },
      "behaviors": [
        {"id": "B1", "name": "validate_credentials", "type": "process", "description": "Verify email and password"},
        {"id": "B2", "name": "generate_token", "type": "output", "description": "Create JWT access token"}
      ]
    }]
  }],
  "flows": [{
    "id": "F1",
    "name": "Login Flow",
    "is_steel_thread": true,
    "steps": [
      {"order": 1, "behavior_id": "B1", "description": "Validate user credentials"},
      {"order": 2, "behavior_id": "B2", "description": "Generate and return token"}
    ]
  }],
  "coverage": {
    "total_requirements": 15,
    "covered_requirements": 15,
    "gaps": []
  }
}
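
Since flows refer to behaviors by ID, one consistency check you might run on a capability map is that every flow step points at a behavior that exists. This is a sketch under that assumption; `dangling_flow_steps` is a hypothetical helper and not part of Tasker's schema validation as far as this document states.

```python
def dangling_flow_steps(capability_map: dict) -> list[tuple[str, str]]:
    """Return (flow_id, behavior_id) pairs whose behavior is not defined in any domain."""
    known = {
        behavior["id"]
        for domain in capability_map["domains"]
        for capability in domain["capabilities"]
        for behavior in capability["behaviors"]
    }
    return [
        (flow["id"], step["behavior_id"])
        for flow in capability_map["flows"]
        for step in flow["steps"]
        if step["behavior_id"] not in known
    ]


example = {
    "domains": [{"capabilities": [{"behaviors": [{"id": "B1"}, {"id": "B2"}]}]}],
    "flows": [{"id": "F1", "steps": [{"behavior_id": "B1"}, {"behavior_id": "B9"}]}],
}
print(dangling_flow_steps(example))  # [('F1', 'B9')]
```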

Behavior Types (I.P.S.O. Taxonomy):

Type	Description	Example
Input	Data entering the system	`ReceiveLoginRequest`
Process	Computation/transformation	`ValidateCredentials`, `HashPassword`
State	Data persistence/mutation	`StoreSession`, `UpdateLastLogin`
Output	Data leaving the system	`ReturnAuthToken`, `SendWelcomeEmail`

Phase 2: Physical Architecture (physical-architect)

Input: capability-map.json, constraints.md
Output: project-planning/artifacts/physical-map.json

Maps behaviors to concrete file paths based on:

Target directory structure
Language/framework conventions (from constraints)
Architectural layers (api, domain, data, infra, test)

{
  "version": "1.0",
  "target_dir": "/path/to/project",
  "capability_map_checksum": "abc123...",
  "file_mapping": [{
    "behavior_id": "B1",
    "behavior_name": "validate_credentials",
    "files": [
      {"path": "src/auth/validator.py", "action": "create", "layer": "domain", "purpose": "Credential validation logic"}
    ],
    "tests": [
      {"path": "tests/auth/test_validator.py", "action": "create"}
    ]
  }],
  "cross_cutting": [{
    "concern": "logging",
    "files": [{"path": "src/utils/logging.py", "action": "create", "purpose": "Structured logging setup"}]
  }],
  "infrastructure": [
    {"path": "pyproject.toml", "action": "create", "purpose": "Project configuration"}
  ],
  "summary": {
    "total_behaviors": 12,
    "total_files": 24,
    "files_to_create": 22,
    "files_to_modify": 2
  }
}

Architectural Layers:

Layer	Purpose	Example Files
api	HTTP/API handlers	`src/api/routes.py`
domain	Business logic	`src/auth/validator.py`
data	Data access/models	`src/models/user.py`
infra	Infrastructure	`src/config/settings.py`
test	Tests	`tests/auth/test_validator.py`

Phase 3: Task Definition (task-author)

Input: capability-map.json, physical-map.json
Output: project-planning/tasks/T001.json, T002.json, ...

Creates individual task files. Each task groups 2-5 related behaviors into a cohesive unit of work:

{
  "id": "T001",
  "name": "Implement credential validation",
  "phase": 1,
  "context": {
    "domain": "Authentication",
    "capability": "User Login",
    "spec_ref": {
      "quote": "Users must be able to log in with email and password",
      "location": "paragraph 3"
    },
    "steel_thread": true
  },
  "behaviors": ["B1", "B2"],
  "files": [
    {"path": "src/auth/validator.py", "action": "create", "purpose": "Credential validation logic"},
    {"path": "tests/auth/test_validator.py", "action": "create", "purpose": "Unit tests"}
  ],
  "dependencies": {
    "tasks": [],
    "external": ["pydantic>=2.0"]
  },
  "acceptance_criteria": [
    {"criterion": "Valid credentials return True", "verification": "pytest tests/auth/test_validator.py::test_valid -v"},
    {"criterion": "Invalid email raises ValidationError", "verification": "pytest tests/auth/test_validator.py::test_invalid_email -v"},
    {"criterion": "Code passes linting", "verification": "ruff check src/auth/validator.py"},
    {"criterion": "Code passes type checking", "verification": "ty check src/auth"}
  ],
  "estimate_hours": 3
}

Phase 4: Validation (task-plan-verifier, plan-auditor)

Input: All artifacts + task files
Output: Updated state.json with validation results

Task-Plan-Verifier (LLM-as-judge) evaluates each task:

Dimension	What's Checked
Spec Alignment	Does task trace to spec requirements?
Strategy Alignment	Does task fit the decomposition strategy?
Preference Compliance	Does task follow user's `~/.claude/CLAUDE.md` standards?
Viability	Is task properly scoped with clear acceptance criteria?

Verdicts: READY, READY_WITH_NOTES, BLOCKED

Plan-Auditor then:

Validates DAG has no cycles
Checks steel thread forms contiguous early path
Validates all verification commands are syntactically valid
Assigns execution phases based on dependencies:
- Phase 1: Tasks with no dependencies
- Phase 2: Tasks depending only on Phase 1
- And so on...
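
The dependency-based rule above amounts to "phase = 1 + the deepest dependency's phase." Here is a minimal sketch of that rule, assuming an acyclic graph in which every dependency has its own entry; `assign_phases` is a hypothetical name, and the plan-auditor itself is an agent rather than this function.

```python
from functools import cache


def assign_phases(deps: dict[str, list[str]]) -> dict[str, int]:
    """Phase 1 for tasks with no dependencies, else 1 + the highest dependency phase."""

    @cache
    def phase(task_id: str) -> int:
        # max(..., default=0) makes dependency-free tasks land in phase 1.
        return 1 + max((phase(dep) for dep in deps[task_id]), default=0)

    return {task_id: phase(task_id) for task_id in deps}


deps = {"T001": [], "T002": ["T001"], "T003": ["T001", "T002"]}
print(assign_phases(deps))  # {'T001': 1, 'T002': 2, 'T003': 3}
```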

State Management

state.json is the single source of truth for all workflow state:

{
  "version": "2.0",
  "phase": {
    "current": "executing",
    "completed": ["ingestion", "logical", "physical", "definition", "validation", "sequencing", "ready"]
  },
  "target_dir": "/path/to/project",
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T14:30:00Z",
  "artifacts": {
    "capability_map": {"path": "artifacts/capability-map.json", "checksum": "abc123...", "valid": true},
    "physical_map": {"path": "artifacts/physical-map.json", "checksum": "def456...", "valid": true},
    "task_validation": {"verdict": "READY", "valid": true, "summary": "All tasks aligned"}
  },
  "tasks": {
    "T001": {
      "id": "T001",
      "name": "Implement credential validation",
      "status": "complete",
      "phase": 1,
      "depends_on": [],
      "blocks": ["T002", "T003"],
      "started_at": "2024-01-15T11:00:00Z",
      "completed_at": "2024-01-15T11:15:00Z",
      "duration_seconds": 900,
      "attempts": 1,
      "files_created": ["src/auth/validator.py", "tests/auth/test_validator.py"],
      "files_modified": [],
      "verification": {
        "verdict": "PASS",
        "recommendation": "PROCEED",
        "criteria": [
          {"name": "Valid credentials return True", "score": "PASS", "evidence": "Test passed"}
        ],
        "quality": {"types": "PASS", "docs": "PASS", "patterns": "PASS", "errors": "PASS"},
        "tests": {"coverage": "PASS", "assertions": "PASS", "edge_cases": "PARTIAL"}
      }
    }
  },
  "execution": {
    "current_phase": 2,
    "active_tasks": ["T003"],
    "completed_count": 2,
    "failed_count": 0,
    "total_tokens": 45000,
    "total_cost_usd": 0.45
  },
  "events": [
    {"timestamp": "2024-01-15T11:00:00Z", "type": "task_started", "task_id": "T001"},
    {"timestamp": "2024-01-15T11:15:00Z", "type": "task_completed", "task_id": "T001"}
  ]
}

Execution Bundles

Before task execution, bundle.py generates a self-contained bundle with everything the executor needs:

{
  "version": "1.2",
  "bundle_created_at": "2024-01-15T11:00:00Z",
  "task_id": "T001",
  "name": "Implement credential validation",
  "phase": 1,
  "target_dir": "/path/to/project",
  "context": {
    "domain": "Authentication",
    "capability": "User Login",
    "capability_id": "C1",
    "spec_ref": {
      "quote": "Users must be able to log in with email and password",
      "location": "paragraph 3"
    },
    "steel_thread": true
  },
  "behaviors": [
    {"id": "B1", "name": "validate_credentials", "type": "process", "description": "Verify email and password"}
  ],
  "files": [
    {"path": "src/auth/validator.py", "action": "create", "layer": "domain", "purpose": "Credential validation", "behaviors": ["B1"]}
  ],
  "dependencies": {
    "tasks": [],
    "files": [],
    "external": ["pydantic>=2.0"]
  },
  "acceptance_criteria": [
    {"criterion": "Valid credentials return True", "verification": "pytest tests/auth/test_validator.py -v"}
  ],
  "constraints": {
    "language": "Python",
    "framework": "FastAPI",
    "testing": "pytest",
    "patterns": ["Use Protocol for interfaces", "Use dataclass for data structures"],
    "raw": "Full constraints.md content..."
  },
  "checksums": {
    "artifacts": {
      "capability_map": "abc123...",
      "physical_map": "def456...",
      "constraints": "ghi789...",
      "task_definition": "jkl012..."
    },
    "dependency_files": {}
  }
}

Bundle Benefits:

Context isolation - Executor sees only what it needs
Integrity validation - Checksums detect artifact drift
Reproducibility - All inputs captured
Self-documentation - Bundle explains what to build and why
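
Drift detection with checksums can be pictured as follows. This is a sketch, not bundle.py: the hash algorithm (SHA-256 here) is an assumption, as are the helper names `sha256_of` and `drifted_artifacts`.

```python
import hashlib


def sha256_of(text: str) -> str:
    """Hex digest of the UTF-8 encoded text (algorithm choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def drifted_artifacts(recorded: dict[str, str], current: dict[str, str]) -> list[str]:
    """Names of artifacts whose current content no longer matches the recorded checksum."""
    return sorted(
        name
        for name, checksum in recorded.items()
        if sha256_of(current.get(name, "")) != checksum
    )


recorded = {
    "capability_map": sha256_of('{"version": "1.0"}'),
    "physical_map": sha256_of('{"version": "1.0"}'),
}
current = {
    "capability_map": '{"version": "1.0"}',
    "physical_map": '{"version": "1.1"}',  # edited after the bundle was generated
}
print(drifted_artifacts(recorded, current))  # ['physical_map']
```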

Task Verification

Execution-Time Verification (task-verifier)

After each task completes, an LLM-as-judge verifier evaluates the implementation:

Evidence Gathering:

Read implementation files
Run verification commands
Capture all output

Multi-Dimensional Judgment:

Dimension	What's Judged
Functional Correctness	Does it meet each acceptance criterion?
Code Quality	Types, docs, patterns, error handling
Test Quality	Coverage, assertions, edge cases

Verdicts & Recommendations:

Verdict	Recommendation	Meaning
PASS	PROCEED	All criteria met, quality acceptable
CONDITIONAL	PROCEED	Works, minor issues, proceed with notes
FAIL	BLOCK	Criteria not met or critical issues

When BLOCK is recommended:

Dependent tasks are automatically blocked
Task cannot be marked complete until issues resolved
Verifier provides specific feedback on what failed and how to fix
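
Automatic blocking of dependents can be sketched as a walk over the `blocks` lists shown in the state.json example. Whether Tasker blocks only direct dependents or the whole downstream chain is not stated here; this sketch assumes the latter, and `block_dependents` is a hypothetical name.

```python
def block_dependents(tasks: dict[str, dict], failed_id: str) -> list[str]:
    """Mark every task downstream of `failed_id` as blocked and return their IDs."""
    blocked = []
    stack = list(tasks[failed_id].get("blocks", []))
    while stack:
        task_id = stack.pop()
        if tasks[task_id]["status"] == "blocked":
            continue  # already handled via another path
        tasks[task_id]["status"] = "blocked"
        blocked.append(task_id)
        stack.extend(tasks[task_id].get("blocks", []))
    return sorted(blocked)


tasks = {
    "T001": {"status": "failed", "blocks": ["T002", "T003"]},
    "T002": {"status": "pending", "blocks": ["T004"]},
    "T003": {"status": "pending", "blocks": []},
    "T004": {"status": "pending", "blocks": []},
}
print(block_dependents(tasks, "T001"))  # ['T002', 'T003', 'T004']
```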

Verifier Calibration

Tasker tracks verifier accuracy over time:

False positives - PROCEED verdict but task later failed
False negatives - BLOCK verdict but task would have worked

Calibration Score = (correct verdicts) / (total verdicts)

The dashboard displays calibration metrics to help tune verification thresholds.

Key Concepts

Steel Thread

A "steel thread" is the minimal end-to-end path through the system. Tasks marked steel_thread: true are prioritized in early phases to validate architecture before building out the full system.

The plan-auditor validates that:

Steel thread tasks exist
They form a contiguous path
They're assigned to early phases

Behaviors vs Tasks

Aspect	Behavior	Task
Abstraction	Logical (what to do)	Physical (where/how)
Granularity	Single atomic operation	Group of 2-5 related behaviors
Typical count	20-50 per project	5-15 per project
Created in	Phase 1 (capability-map.json)	Phase 3 (tasks/*.json)
Contains	Name, type, description	Behaviors, files, criteria, dependencies

Behaviors/Task Metric:

< 2 behaviors/task: Tasks too granular, excessive overhead
2-5 behaviors/task: Sweet spot—cohesive, verifiable units
> 5 behaviors/task: Tasks too large, higher failure risk
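
The metric is just the average number of behaviors per task, bucketed by the thresholds above. A minimal sketch, with hypothetical helper names and task dicts shaped like the task file example (a `behaviors` list):

```python
def behaviors_per_task(tasks: list[dict]) -> float:
    """Average number of behaviors across all tasks."""
    return sum(len(task["behaviors"]) for task in tasks) / len(tasks)


def classify(ratio: float) -> str:
    """Bucket the ratio using the thresholds listed above."""
    if ratio < 2:
        return "too granular"
    if ratio <= 5:
        return "sweet spot"
    return "too large"


tasks = [
    {"id": "T001", "behaviors": ["B1", "B2"]},
    {"id": "T002", "behaviors": ["B3", "B4", "B5", "B6"]},
]
ratio = behaviors_per_task(tasks)
print(ratio, classify(ratio))  # 3.0 sweet spot
```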

Monitoring

TUI Dashboard

The Textual-based TUI (/tui or python3 scripts/status.py) provides real-time monitoring:

Panels:

Phase Indicator - Current workflow phase with icon
Health Checks - DAG validation, steel thread, verification commands
Progress - Task completion by phase with progress bars
Calibration - Verifier accuracy metrics
Cost - Token usage and estimated cost (total + per-task average)
Current Task - Currently running task with elapsed time
Task List - All tasks sorted by phase with status icons
Recent Activity - Latest completions/failures

Keybindings:

q - Quit
r - Refresh
a - Toggle auto-refresh (5s interval)
d - Toggle dark/light mode
escape / b - Back (from detail view)

CLI Dashboard

python3 scripts/dashboard.py                # Full dashboard with boxes
python3 scripts/dashboard.py --compact      # Single-line summary
python3 scripts/dashboard.py --json         # JSON output
python3 scripts/dashboard.py --no-color     # Without ANSI colors

Scripts Reference

state.py - State Management

# Initialization
python3 scripts/state.py init /path/to/project      # Initialize workflow

# Phase progression
python3 scripts/state.py advance-phase logical      # Move to next phase
python3 scripts/state.py show                       # Display current state

# Task management
python3 scripts/state.py task-start T001            # Mark task running
python3 scripts/state.py task-complete T001         # Mark task complete
python3 scripts/state.py task-fail T001 "error"     # Mark task failed
python3 scripts/state.py retry-task T001            # Reset failed task
python3 scripts/state.py skip-task T001 "reason"    # Skip blocked task
python3 scripts/state.py ready                      # List ready tasks

# Validation
python3 scripts/state.py validate capability_map    # Validate artifact
python3 scripts/state.py validate-tasks READY "OK"  # Record task validation

# Token tracking
python3 scripts/state.py log-tokens T001 1000 500 0.02  # Log usage

# Metrics
python3 scripts/state.py metrics                    # Execution metrics
python3 scripts/state.py planning-metrics           # Planning quality
python3 scripts/state.py calibration-score          # Verifier accuracy

bundle.py - Execution Bundles

python3 scripts/bundle.py generate T001        # Generate single bundle
python3 scripts/bundle.py generate-ready       # Generate for all ready tasks
python3 scripts/bundle.py validate T001        # Validate against schema
python3 scripts/bundle.py validate-integrity T001  # Check checksums + deps
python3 scripts/bundle.py list                 # List existing bundles
python3 scripts/bundle.py clean                # Remove all bundles

validate.py - Comprehensive Validation

python3 scripts/validate.py dag                    # Check for cycles
python3 scripts/validate.py steel-thread           # Validate steel thread
python3 scripts/validate.py verification-commands  # Check command syntax
python3 scripts/validate.py calibration            # Show verifier metrics
python3 scripts/validate.py all                    # Run all validations

Hooks

Tasker integrates with Claude Code hooks for automation:

Hook	Trigger	Action
`detect-workflow.sh`	User prompt	Detect `/plan` or `/execute`, launch TUI
`launch-tui.sh`	Planning/execution start	Open tmux split (30% height) with TUI
`close-tui.sh`	Workflow complete	Close TUI pane
`subagent_stop.py`	Subagent completes	Parse transcript, log token usage to state

Note: Tmux hooks only work when running Claude Code inside a tmux session.

Agent Responsibilities

Planning Agents

Agent	Input	Output	Purpose
logic-architect	spec.md	capability-map.json	Extract logical structure (domains, capabilities, behaviors)
physical-architect	capability-map.json, constraints.md	physical-map.json	Map behaviors to file paths
task-author	Both maps	tasks/T*.json	Create individual task definitions
task-plan-verifier	Tasks + spec	Validation report	LLM-as-judge pre-execution check
plan-auditor	Tasks	Updated state.json	Assign phases, validate DAG

Execution Agents

Agent	Input	Output	Purpose
task-executor	Execution bundle	Implementation code	Write code in isolated context
task-verifier	Implementation	Verification result	LLM-as-judge post-execution check

Templates

The templates/ directory contains example files for reference. You don't need to copy or follow these templates - the planner accepts any specification format.

Template	Purpose
`example-spec.md`	Shows one possible spec format (not required)
`constraints.md.example`	Example tech stack constraints
`task.json.example`	Shows task structure (generated automatically)

Specification Input

Your spec can be in any format:

Freeform requirements or PRDs
Bullet lists or numbered lists
Design docs or meeting notes
Existing README files

The planner stores your spec verbatim and extracts requirements from whatever format you provide. Each extracted capability includes a spec_ref that quotes the original text for traceability.

Tech Stack Constraints (Optional)

You can provide constraints conversationally when running /plan, or create a constraints.md file with:

Language & runtime preferences
Framework choices
Testing requirements
Architecture patterns to follow or avoid

Development Phases in Specs

Tasker supports phased development by allowing you to mark sections of your spec for later phases. This is useful when your spec contains features you want to defer.

How it works:

Phase 1 (implicit) - Any content NOT under a phase heading is treated as Phase 1 and is planned
Phase 2+ (explicit) - Content under a "Phase 2", "Phase 3", etc. heading is excluded from planning

Example spec with phases:

# My Application Spec

## User Authentication
- Users can log in with email/password
- Users can reset their password
- Sessions expire after 24 hours

## Data Management
- Users can create, read, update, delete items
- Items have a title, description, and status

## Phase 2: Advanced Features
- OAuth integration with Google/GitHub
- Single Sign-On (SSO) for enterprise
- Admin dashboard with analytics

## Phase 3: Scale & Performance
- Redis caching layer
- Read replicas for database
- CDN integration

When you run /plan, Tasker will:

Extract only Phase 1 content (Authentication + Data Management)
Document excluded phases in capability-map.json
Verify no Phase 2+ content leaks into tasks

Phase markers recognized:

## Phase 2
## Phase 2: Title
### Phase 3 - Description
# Phase 2 Requirements
**Phase 2:**
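The marker forms listed above can be matched with a single regular expression. The following is an illustrative sketch, not Tasker's own detection logic: the pattern, the function name `find_phase_markers`, and the decision to report only phases 2 and higher are assumptions. The returned dicts mirror the `phase`, `heading`, and `location` fields of the excluded-phase entries in capability-map.json.

```python
import re

# A markdown heading (# through ######) or a bold opener (**),
# followed by "Phase <number>".
PHASE_MARKER = re.compile(r"^\s*(?:#{1,6}\s+|\*\*)Phase\s+(\d+)\b")


def find_phase_markers(spec_text):
    """Return one entry per Phase 2+ marker found in the spec, in order."""
    markers = []
    for line_number, line in enumerate(spec_text.splitlines(), start=1):
        match = PHASE_MARKER.match(line)
        if match is None:
            continue
        phase = int(match.group(1))
        if phase >= 2:  # Phase 1 is implicit and never excluded
            markers.append({
                "phase": phase,
                "heading": line.strip(),
                "location": f"line {line_number}",
            })
    return markers
```

The sketch only locates markers. Where an excluded section ends (for example, at the next heading of the same level) is a separate decision that this sketch deliberately does not make.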

Verification: The task-plan-verifier checks that no tasks reference Phase 2+ content. If leakage is detected, planning is BLOCKED until the offending tasks are removed.

Output: The capability-map.json includes a phase_filtering section:

{
  "phase_filtering": {
    "active_phase": 1,
    "excluded_phases": [
      {
        "phase": 2,
        "heading": "## Phase 2: Advanced Features",
        "location": "line 15",
        "summary": "OAuth, SSO, admin dashboard"
      }
    ],
    "total_excluded_requirements": 5
  }
}

Development

make install    # Setup project with uv
make lint       # Run ruff check
make test       # Run pytest
make clean      # Remove artifacts

Design Principles

Single Source of Truth - state.json owns all workflow state
Context Isolation - Each executor sees only its bundle
Fail Fast - Validation happens before execution
Observability - TUI + hooks provide real-time visibility
Reproducibility - Checksums detect artifact drift
Steel Thread First - Validate architecture early
LLM-as-Judge - Structured verification with calibration tracking
Schema Validation - All artifacts validated against JSON schemas
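The Reproducibility principle relies on checksums to notice when an artifact changes after it was recorded. A minimal sketch of that idea follows, assuming SHA-256 digests and in-memory artifact contents; the function names and the mapping shapes are hypothetical, and Tasker's actual hashing scheme is not specified here.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def detect_drift(recorded, current_artifacts):
    """Return sorted names of artifacts that are missing or have changed.

    `recorded` maps artifact name -> digest stored at planning time.
    `current_artifacts` maps artifact name -> current content as bytes.
    Both shapes are assumptions for illustration.
    """
    drifted = []
    for name, digest in recorded.items():
        data = current_artifacts.get(name)
        if data is None or checksum(data) != digest:
            drifted.append(name)
    return sorted(drifted)
```

Recording digests when artifacts are written and comparing them before execution turns silent edits to planning artifacts into an explicit, reportable mismatch.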

Limitations

Requires Claude Code with subagent support
TUI requires the textual package (uv add textual)
tmux hooks only work inside a tmux session
Execution is currently single-threaded (no parallel tasks within a phase)
Token tracking requires the subagent_stop.py hook

License

MIT
