chore: add PRDs and surgical test generator agent
.taskmaster/docs/prd-autonomous-tdd-rails.md
@@ -0,0 +1,220 @@
# PRD: Autonomous TDD + Git Workflow (On Rails)

## Summary
- Put the existing git and test workflows on rails: a repeatable, automated process that can run autonomously, with guardrails and a compact TUI for visibility.
- Flow: for a selected task, create a branch named with the tag + task id → generate tests for the first subtask (red) using the Surgical Test Generator → implement code (green) → verify tests → commit → repeat per subtask → final verify → push → open PR against the default branch.
- Build on existing rules: `.cursor/rules/git_workflow.mdc`, `.cursor/rules/test_workflow.mdc`, `.claude/agents/surgical-test-generator.md`, and existing CLI/core services.

## Goals
- Deterministic, resumable automation to execute the TDD loop per subtask with minimal human intervention.
- Strong guardrails: never commit to the default branch; only commit when tests pass; enforce status transitions; persist logs/state for debuggability.
- Visibility: a compact terminal UI (like lazygit) to pick a tag, view tasks, and start work; a right-side pane opens an executor terminal (via tmux) for agent coding.
- Extensible: framework-agnostic test generation via the Surgical Test Generator; detect and use the repo’s test command for execution with coverage thresholds.

## Non‑Goals (initial)
- Full multi-language runner parity beyond detecting and executing the project’s test command.
- Complex GUI; start with CLI/TUI + tmux pane. IDE/extension can hook into the same state later.
- Rich executor selection UX (codex/gemini/claude); we’ll prompt per run, and defaults can come later.

## Success Criteria
- One command can autonomously complete a task’s subtasks via TDD and open a PR when done.
- All commits made on a branch that includes the tag and task id (see Branch Naming); no commits to the default branch directly.
- Every subtask iteration: failing tests added first (red), then code added to pass them (green), commit only after green.
- End-to-end logs + artifacts stored in `.taskmaster/reports/runs/<timestamp-or-id>/`.

## User Stories
- As a developer, I can run `tm autopilot <taskId>` and watch a structured, safe workflow execute.
- As a reviewer, I can inspect commits per subtask, and a PR summarizing the work when the task completes.
- As an operator, I can see current step, active subtask, tests status, and logs in a compact CLI view and read a final run report.

## High‑Level Workflow
1) Pre‑flight
- Verify clean working tree or confirm staging/commit policy (configurable).
- Detect repo type and the project’s test command (e.g., `npm test`, `pnpm test`, `pytest`, `go test`).
- Validate tools: `git`, `gh` (optional for PR), `node/npm`, and (if used) `claude` CLI.
- Load TaskMaster state and selected task; if no subtasks exist, automatically run “expand” before working.

2) Branch & Tag Setup
- Checkout default branch and update (optional), then create a branch using Branch Naming (below).
- Map branch ↔ tag via existing tag management; explicitly set active tag to the branch’s tag.

3) Subtask Loop (for each pending/in-progress subtask in dependency order)
- Select next eligible subtask using `tm-core` TaskService `getNextTask()` and subtask eligibility logic.
- Red: generate or update failing tests for the subtask
  - Use the Surgical Test Generator system prompt (`.claude/agents/surgical-test-generator.md`) to produce high-signal tests following project conventions.
  - Run tests to confirm red; record results. If not red (already passing), skip to next subtask or escalate.
- Green: implement code to pass tests
  - Use executor to implement changes (initial: `claude` CLI prompt with focused context).
  - Re-run tests until green or until the timeout/backoff policy triggers.
- Commit: when green
  - Commit tests + code with a conventional commit message. Optionally update subtask status to `done`.
  - Persist run step metadata/logs.

4) Finalization
- Run full test suite and coverage (if configured); optionally lint/format.
- Commit any final adjustments.
- Push branch (ask user to confirm); create PR (via `gh pr create`) targeting the default branch. Title format: `Task #<id> [<tag>]: <title>`.

5) Post‑Run
- Update task status if desired (e.g., `review`).
- Persist run report (JSON + markdown summary) to `.taskmaster/reports/runs/<run-id>/`.
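
The subtask loop in step 3 can be sketched as a small driver. This is only an illustration under stated assumptions: `SubtaskOps`, `runSubtask`, and the outcome names are hypothetical and are not existing `tm-core` APIs, and the operations are synchronous here to keep the sketch short.

```typescript
// Hypothetical sketch of one red/green/commit iteration; all names are assumptions.
type SubtaskOutcome = "committed" | "skipped-already-green" | "paused";

interface SubtaskOps {
  generateTests(): void; // red: write failing tests for the subtask
  runTests(): boolean;   // true when the detected test command passes
  implement(): void;     // green: ask the executor for a minimal change
  commit(): void;        // commit tests + code together
}

function runSubtask(ops: SubtaskOps, maxAttempts: number): SubtaskOutcome {
  ops.generateTests();
  if (ops.runTests()) {
    // Tests already pass, so there is no red state to drive an implementation.
    return "skipped-already-green";
  }
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    ops.implement();
    if (ops.runTests()) {
      ops.commit(); // commit only after green
      return "committed";
    }
  }
  // Not green within the attempt budget: the caller persists state for a later resume.
  return "paused";
}
```

Returning an outcome instead of throwing keeps the "skip or escalate" and "pause for resume" decisions with the caller, which is where confirmation prompts and run-state persistence would live.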

## Guardrails
- Never commit to the default branch.
- Commit only if all tests (targeted and suite) pass; allow override flags.
- Enforce 80% coverage thresholds (lines/branches/functions/statements) by default; configurable.
- Timebox model operations and retries; if not green within N attempts, pause with actionable state for resume.
- Always log actions, commands, and outcomes; include dry-run mode.
- Ask before branch creation, pushing, and opening a PR unless `--no-confirm` is set.
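
The commit gate implied by these guardrails could look roughly like the sketch below. The function and type names are hypothetical; only the four metrics and the 80% defaults come from this PRD, and coverage values are assumed to be percentages from 0 to 100.

```typescript
// Hypothetical coverage gate; names are assumptions, defaults mirror the PRD.
type Metric = "lines" | "branches" | "functions" | "statements";
type CoverageSummary = Record<Metric, number>; // percentages, 0-100

const DEFAULT_THRESHOLDS: CoverageSummary = {
  lines: 80,
  branches: 80,
  functions: 80,
  statements: 80,
};

// Returns one human-readable message per metric that falls below its threshold.
function coverageShortfalls(
  actual: CoverageSummary,
  thresholds: CoverageSummary = DEFAULT_THRESHOLDS,
): string[] {
  const metrics: Metric[] = ["lines", "branches", "functions", "statements"];
  return metrics
    .filter((m) => actual[m] < thresholds[m])
    .map((m) => `${m}: ${actual[m]}% < ${thresholds[m]}%`);
}

// A commit is allowed only when tests pass and no metric is short.
function mayCommit(
  testsPassed: boolean,
  actual: CoverageSummary,
  thresholds?: CoverageSummary,
): boolean {
  return testsPassed && coverageShortfalls(actual, thresholds).length === 0;
}
```

Reporting the shortfalls as messages, rather than a bare boolean, gives the paused run something actionable to log.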

## Integration Points (Current Repo)
- CLI: `apps/cli` provides command structure and UI components.
  - New command: `tm autopilot` (alias: `task-master autopilot`).
  - Reuse UI components under `apps/cli/src/ui/components/` for headers/task details/next-task.
- Core services: `packages/tm-core`
  - `TaskService` for selection, status, tags.
  - `TaskExecutionService` for prompt formatting and executor prep.
  - Executors: `claude` executor and `ExecutorFactory` to run external tools.
  - Proposed new: `WorkflowOrchestrator` to drive the autonomous loop and emit progress events.
- Tag/Git utilities: `scripts/modules/utils/git-utils.js` and `scripts/modules/task-manager/tag-management.js` for branch→tag mapping and explicit tag switching.
- Rules: `.cursor/rules/git_workflow.mdc` and `.cursor/rules/test_workflow.mdc` to steer behavior and ensure consistency.
- Test generation prompt: `.claude/agents/surgical-test-generator.md`.

## Proposed Components
- Orchestrator (tm-core): `WorkflowOrchestrator` (new)
  - State machine driving phases: Preflight → Branch/Tag → SubtaskIter (Red/Green/Commit) → Finalize → PR.
  - Exposes an evented API (progress events) that the CLI can render.
  - Stores run state artifacts.

- Test Runner Adapter
  - Detects and runs tests via the project’s test command (e.g., `npm test`), with targeted runs where feasible.
  - API: `runTargeted(files/pattern)`, `runAll()`, report summary (failures, duration, coverage), enforce 80% threshold by default.

- Git/PR Adapter
  - Encapsulates `git` ops: branch create/checkout, add/commit, push.
  - Optional `gh` integration to open PR; fallback to instructions if `gh` unavailable.
  - Confirmation gates for branch creation and pushes.
  - Adds commit footers and a unified trailer (`Refs: TM-<tag>-<id>[.<sub>]`) for robust mapping to tasks/subtasks.

- Prompt/Exec Adapter
  - Uses existing executor service to call the selected coding assistant (initially `claude`) with tight prompts: task/subtask context, surgical tests first, then minimal code to green.

- Run State + Reporting
  - JSONL log of steps, timestamps, commands, test results.
  - Markdown summary for PR description and post-run artifact.
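
As a rough illustration of the orchestrator's state machine, the phase order listed above can be encoded as a pure transition function plus a walker that reports each phase to a progress callback. The phase identifiers, `nextPhase`, and `planPhases` are hypothetical names chosen for this sketch; the real `WorkflowOrchestrator` does not exist yet.

```typescript
// Hypothetical phase model for the proposed orchestrator; names are assumptions.
type Phase =
  | "preflight" | "branch-tag"
  | "red" | "green" | "commit"
  | "finalize" | "pr" | "done";

// Pure transition: decides the next phase from the current one and the
// number of subtasks still waiting to be worked.
function nextPhase(current: Phase, subtasksRemaining: number): Phase {
  switch (current) {
    case "preflight": return "branch-tag";
    case "branch-tag": return subtasksRemaining > 0 ? "red" : "finalize";
    case "red": return "green";
    case "green": return "commit";
    case "commit": return subtasksRemaining > 0 ? "red" : "finalize";
    case "finalize": return "pr";
    case "pr": return "done";
    case "done": return "done";
  }
}

// Walks the machine for a given subtask count, emitting a progress event per
// phase. Useful for a dry run that prints the planned steps without executing.
function planPhases(
  subtaskCount: number,
  onProgress: (phase: Phase) => void = () => {},
): Phase[] {
  const visited: Phase[] = [];
  let remaining = subtaskCount;
  let phase: Phase = "preflight";
  while (true) {
    visited.push(phase);
    onProgress(phase);
    if (phase === "done") return visited;
    if (phase === "commit") remaining -= 1; // one subtask finished
    phase = nextPhase(phase, remaining);
  }
}
```

Keeping the transition function free of side effects makes checkpoints simple: persisting the current phase and the remaining subtask count is enough to resume.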

## CLI UX (MVP)
- Command: `tm autopilot [taskId]`
- Flags: `--dry-run`, `--no-push`, `--no-pr`, `--no-confirm`, `--force`, `--max-attempts <n>`, `--runner <auto|custom>`, `--commit-scope <scope>`
- Output: compact header (project, tag, branch), current phase, subtask line, last test summary, next actions.
- Resume: If interrupted, `tm autopilot --resume` picks up from last checkpoint in run state.

### TUI with tmux (Linear Execution)
- Left pane: Tag selector, task list (status/priority), start/expand shortcuts; “Start” triggers the next task or a selected task.
- Right pane: Executor terminal (tmux split) that runs the coding agent (claude-code/codex). Autopilot can hand over to the right pane during green.
- MCP integration: use MCP tools for task queries/updates and for shell/test invocations where available.

## Prompts (Initial Direction)
- Red phase prompt skeleton (tests):
  - Use `.claude/agents/surgical-test-generator.md` as the system prompt to generate high-signal failing tests tailored to the project’s language and conventions. Keep scope minimal and deterministic; no code changes yet.
- Green phase prompt skeleton (code):
  - “Make the tests pass by changing the smallest amount of code, following project patterns. Only modify necessary files. Keep commits focused to this subtask.”

## Configuration
- `.taskmaster/config.json` additions
  - `autopilot`: `{ enabled: true, requireCleanWorkingTree: true, commitTemplate: "{type}({scope}): {msg}", defaultCommitType: "feat" }`
  - `test`: `{ runner: "auto", coverageThresholds: { lines: 80, branches: 80, functions: 80, statements: 80 } }`
  - `git`: `{ branchPattern: "{tag}/task-{id}-{slug}", pr: { enabled: true, base: "default" }, commitFooters: { task: "Task-Id", subtask: "Subtask-Id", tag: "Tag" }, commitTrailer: "Refs: TM-{tag}-{id}{.sub?}" }`

## Decisions
- Use conventional commits plus footers and a unified trailer `Refs: TM-<tag>-<id>[.<sub>]` for all commits; Git/PR adapter is responsible for injecting these.
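
A minimal sketch of how the Git/PR adapter might assemble such a message is shown below. `CommitInput` and `buildCommitMessage` are hypothetical names, and the exact footer value formats (for example `Subtask-Id: <id>.<sub>`) are assumptions layered on the footer keys and trailer shape given in this PRD.

```typescript
// Hypothetical commit-message builder; names and value formats are assumptions.
interface CommitInput {
  type: string;      // conventional commit type, e.g. "feat"
  scope: string;     // e.g. the value of --commit-scope
  msg: string;       // short subject text
  tag: string;       // TaskMaster tag
  taskId: number;
  subtask?: number;  // omitted for task-level commits
}

function buildCommitMessage(c: CommitInput): string {
  const subject = `${c.type}(${c.scope}): ${c.msg}`;
  const ref = c.subtask === undefined ? `${c.taskId}` : `${c.taskId}.${c.subtask}`;
  const footers = [`Task-Id: ${c.taskId}`];
  if (c.subtask !== undefined) footers.push(`Subtask-Id: ${ref}`);
  footers.push(`Tag: ${c.tag}`, `Refs: TM-${c.tag}-${ref}`);
  // A blank line separates the subject from the footer block.
  return `${subject}\n\n${footers.join("\n")}`;
}
```

For a subtask commit this yields a subject such as `feat(auth): add login`, followed by the `Task-Id`, `Subtask-Id`, `Tag`, and `Refs` lines in one footer block.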

## Risks and Mitigations
- Model hallucination/large diffs: restrict prompt scope; enforce minimal changes; show diff previews (optional) before commit.
- Flaky tests: allow retries, isolate targeted runs for speed, then full suite before commit.
- Environment variability: detect runners/tools; provide fallbacks and actionable errors.
- PR creation fails: still push and print manual commands; persist PR body to reuse.
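
For the environment-variability risk, runner detection could start from marker files in the repository root. The sketch below is only a heuristic under stated assumptions: `detectTestCommand` is a hypothetical helper, and the mapping from marker files to commands is a guess at a reasonable default rather than an agreed rule. Returning `null` is what lets the caller print an actionable error or fall back to a configured command.

```typescript
// Hypothetical runner detection; the marker-file mapping is an assumption.
function detectTestCommand(files: string[], packageJsonText?: string): string | null {
  const has = (name: string): boolean => files.includes(name);

  if (has("package.json")) {
    let hasTestScript = false;
    try {
      const pkg = JSON.parse(packageJsonText ?? "{}");
      hasTestScript = typeof pkg?.scripts?.test === "string";
    } catch {
      hasTestScript = false; // unreadable package.json: fall through to other checks
    }
    if (hasTestScript) {
      // A pnpm lockfile suggests the project is managed with pnpm.
      return has("pnpm-lock.yaml") ? "pnpm test" : "npm test";
    }
  }
  if (has("go.mod")) return "go test ./...";
  if (has("pytest.ini") || has("pyproject.toml")) return "pytest";
  return null; // nothing recognized: caller reports an actionable error
}
```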

## Open Questions
1) Slugging rules for branch names: are there length limits or normalization requirements beyond sanitizing the `{slug}` token?
2) PR body standard sections beyond run report (e.g., checklist, coverage table)?
3) Default executor prompt fine-tuning once codex/gemini integration is available.
4) Where to store persistent TUI state (pane layout, last selection) in `.taskmaster/state.json`?

## Branch Naming
- Include both the tag and the task id in the branch name to make lineage explicit.
- Default pattern: `<tag>/task-<id>[-slug]` (e.g., `master/task-12`, `tag-analytics/task-4-user-auth`).
- Configurable via `.taskmaster/config.json`: `git.branchPattern` supports tokens `{tag}`, `{id}`, `{slug}`.
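
Token expansion for `git.branchPattern` might be sketched as follows. Both helpers are hypothetical, and the slug rules (lowercase, hyphen-separated, 40-character cap) are placeholders, since slugging is still listed as an open question.

```typescript
// Hypothetical branch-name rendering; the slug rules are placeholder assumptions.
function slugify(title: string, maxLength = 40): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, "")     // trim leading/trailing hyphens
    .slice(0, maxLength)
    .replace(/-+$/g, "");        // slicing can leave a trailing hyphen
}

function renderBranchName(
  pattern: string,
  vars: { tag: string; id: number | string; title?: string },
): string {
  const slug = vars.title ? slugify(vars.title) : "";
  return pattern
    .replace("{tag}", () => vars.tag)
    .replace("{id}", () => String(vars.id))
    .replace("{slug}", () => slug)
    .replace(/-+$/g, ""); // drop the dangling separator when there is no slug
}
```

With the default pattern `{tag}/task-{id}-{slug}`, a task without a title renders as `master/task-12`, and a titled task renders as `tag-analytics/task-4-user-auth`, matching the examples above.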

## PR Base Branch
- Use the repository’s default branch (detected via git) unless overridden.
- Title format: `Task #<id> [<tag>]: <title>`.

## RPG Mapping (Repository Planning Graph)

Functional nodes (capabilities):
- Autopilot Orchestration → drives TDD loop and lifecycle
- Test Generation (Surgical) → produces failing tests from subtask context
- Test Execution + Coverage → runs suite, enforces thresholds
- Git/Branch/PR Management → safe operations and PR creation
- TUI/Terminal Integration → interactive control and visibility via tmux
- MCP Integration → structured task/status/context operations

Structural nodes (code organization):
- `packages/tm-core`:
  - `services/workflow-orchestrator.ts` (new)
  - `services/test-runner-adapter.ts` (new)
  - `services/git-adapter.ts` (new)
  - existing: `task-service.ts`, `task-execution-service.ts`, `executors/*`
- `apps/cli`:
  - `src/commands/autopilot.command.ts` (new)
  - `src/ui/tui/` (new tmux/TUI helpers)
- `scripts/modules`:
  - reuse `utils/git-utils.js`, `task-manager/tag-management.js`
- `.claude/agents/`:
  - `surgical-test-generator.md`

Edges (data/control flow):
- Autopilot → Test Generation → Test Execution → Git Commit → loop
- Autopilot → Git Adapter (branch, tag, PR)
- Autopilot → TUI (event stream) → tmux pane control
- Autopilot → MCP tools for task/status updates
- Test Execution → Coverage gate → Autopilot decision

Topological traversal (implementation order):
1) Git/Test adapters (foundations)
2) Orchestrator skeleton + events
3) CLI `autopilot` command and dry-run
4) Surgical test-gen integration and execution gate
5) PR creation, run reports, resumability

## Phased Roadmap
- Phase 0: Spike
  - Implement CLI skeleton `tm autopilot` with dry-run showing planned steps from a real task + subtasks.
  - Detect test runner (package.json) and git state; render a preflight report.

- Phase 1: Core Rails
  - Implement `WorkflowOrchestrator` in `tm-core` with event stream; add Git/Test adapters.
  - Support subtask loop (red/green/commit) with framework-agnostic test generation and detected test command; commit gating on passing tests and coverage.
  - Branch/tag mapping via existing tag-management APIs.
  - Run report persisted under `.taskmaster/reports/runs/`.

- Phase 2: PR + Resumability
  - Add `gh` PR creation with well-formed body using the run report.
  - Introduce resumable checkpoints and `--resume` flag.
  - Add coverage enforcement and optional lint/format step.

- Phase 3: Extensibility + Guardrails
  - Add support for basic pytest/go test adapters.
  - Add safeguards: diff preview mode, manual confirm gates, aggressive minimal-change prompts.
  - Optional: small TUI panel and extension panel leveraging the same run state file.

## References (Repo)
- Test Workflow: `.cursor/rules/test_workflow.mdc`
- Git Workflow: `.cursor/rules/git_workflow.mdc`
- CLI: `apps/cli/src/commands/start.command.ts`, `apps/cli/src/ui/components/*.ts`
- Core Services: `packages/tm-core/src/services/task-service.ts`, `task-execution-service.ts`
- Executors: `packages/tm-core/src/executors/*`
- Git Utilities: `scripts/modules/utils/git-utils.js`
- Tag Management: `scripts/modules/task-manager/tag-management.js`
- Surgical Test Generator: `.claude/agents/surgical-test-generator.md`
.taskmaster/docs/prd-rpg-user-stories.md
@@ -0,0 +1,221 @@
# PRD: RPG‑Based User Story Mode + Validation‑First Delivery

## Summary
- Introduce a “User Story Mode” where each Task is a user story and each Subtask is a concrete implementation step. Enable via config flag; when enabled, Task generation and PRD parsing produce user‑story titles/details with acceptance criteria, while Subtasks capture implementation details.
- Build a validation‑first delivery pipeline: derive tests from acceptance criteria (Surgical Test Generator), wire TDD rails and Git/PR mapping so reviews focus on verification rather than code spelunking.
- Keep everything on rails: branch naming with tag+task id, commit/PR linkage to tasks/subtasks, coverage + test gates, and lightweight TUI for fast execution.

## North‑Star Outcomes
- Humans stay in briefs/frontends; implementation runs quickly, often without opening the IDE.
- “Definition of Done” is expressed and enforced as tests; business logic is encoded in test criteria/acceptance criteria.
- End‑to‑end linkage from brief → user story → subtasks → commits/PRs → delivery, with reproducible automation and minimal ceremony.

## Problem
- The bottleneck is validation and PR review, not code generation. Plans are helpful but the chokepoint is proving correctness, business conformance, and integration.
- Current test workflow is too Jest‑specific; we need framework‑agnostic generation and execution.
- We need consistent Git/TDD wiring so GitHub integrations can map work artifacts to tasks/subtasks without ambiguity.

## Solution Overview
- Add a configuration flag to switch to user story mode and adapt prompts/parsers.
- Expand tasks with explicit Acceptance Criteria and Test Criteria; drive Surgical Test Generator to create failing tests first; wire autonomous TDD loops per subtask until green, then commit.
- Enforce coverage (80% default) and generate PRs that summarize user story, acceptance criteria coverage, and test results; commits/PRs contain metadata to link back to tasks/subtasks.
- Provide a compact TUI (tmux) to pick tag/tasks and launch an executor terminal, while the orchestrator runs rails in the background.

---

## Configuration
- `.taskmaster/config.json` additions
  - `stories`: `{ enabled: true, storyLabel: "User Story", acceptanceKey: "Acceptance Criteria" }`
  - `autopilot`: `{ enabled: true, requireCleanWorkingTree: true }`
  - `test`: `{ runner: "auto", coverageThresholds: { lines: 80, branches: 80, functions: 80, statements: 80 } }`
  - `git`: `{ branchPattern: "{tag}/task-{id}-{slug}", pr: { enabled: true, base: "default" }, commitFooters: { task: "Task-Id", subtask: "Subtask-Id", tag: "Tag" } }`

Behavior when `stories.enabled=true`:
- Task generation prompts and PRD parsers produce user‑story formatted titles and descriptions, include acceptance criteria blocks, and set `task.type = 'user_story'`.
- Subtasks remain implementation steps with concise technical goals.
- Expand ensures that any missing acceptance criteria are synthesized (from brief/PRD content) before work starts.

---

## Data Model Changes
- Task fields (add):
  - `type: 'user_story' | 'technical'` (default `technical`)
  - `acceptanceCriteria: string[] | string` (structured or markdown)
  - `testCriteria: string[] | string` (optional, derived from acceptance criteria; what to validate)
- Subtask fields remain focused on implementation detail and dependency graph.

Storage and UI remain backward compatible; fields are optional when `stories.enabled=false`.

### JSON Gherkin Representation (for stories)
Add an optional `gherkin` block to Tasks in story mode. Keep Hybrid acceptanceCriteria as the human/authoring surface; maintain a normalized JSON Gherkin for deterministic mapping.

```
interface GherkinFeature {
  id: string;            // FEAT-<taskId>
  name: string;          // mirrors user story title
  description?: string;
  background?: { steps: Step[] };
  scenarios: Scenario[];
}

interface Scenario {
  id: string;            // SC-<taskId>-<n> or derived from AC id
  name: string;
  tags?: string[];
  steps: Step[];         // Given/When/Then/And/But
  examples?: Record<string, any>[];
}

interface Step { keyword: 'Given' | 'When' | 'Then' | 'And' | 'But'; text: string }
```

Notes
- Derive `gherkin.scenarios` from acceptanceCriteria when obvious; preserve both raw markdown and normalized items.
- Allow cross‑references between scenarios and AC items (e.g., `refs: ['AC-12-1']`).
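
Deriving a scenario from a structured acceptance criterion could look like the sketch below. It assumes the criterion carries optional `given`/`when`/`then` strings, as in the schema options later in this PRD; `CriterionInput`, `ScenarioDraft`, and `toScenario` are hypothetical names. When only free text is available it is stored under `Then`, to be refined later.

```typescript
// Hypothetical AC-to-scenario mapping; type and function names are assumptions.
type Keyword = "Given" | "When" | "Then" | "And" | "But";
interface StepDraft { keyword: Keyword; text: string }
interface ScenarioDraft { id: string; name: string; steps: StepDraft[]; refs: string[] }
interface CriterionInput {
  id: string;       // e.g. "AC-12-1"
  summary: string;
  given?: string;
  when?: string;
  then?: string;
}

function toScenario(item: CriterionInput, taskId: number, index: number): ScenarioDraft {
  const steps: StepDraft[] = [];
  if (item.given) steps.push({ keyword: "Given", text: item.given });
  if (item.when) steps.push({ keyword: "When", text: item.when });
  // Fall back to the summary so every scenario has at least one Then step.
  steps.push({ keyword: "Then", text: item.then ?? item.summary });
  return {
    id: `SC-${taskId}-${index}`,
    name: item.summary,
    steps,
    refs: [item.id], // cross-reference back to the originating criterion
  };
}
```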

---

## RPG Plan (Repository Planning Graph)

Functional Nodes (Capabilities)
- Brief Intake → parse briefs/PRDs and extract user stories (when enabled)
- User Story Generation → create task title/details as user stories + acceptance criteria
- JSON Gherkin Synthesis → produce Feature/Scenario structure from acceptance criteria
- Acceptance/Test Criteria Synthesis → convert acceptance criteria into concrete test criteria
- Surgical Test Generation → generate failing tests per subtask using `.claude/agents/surgical-test-generator.md`
- Implementation Planning → expand subtasks as atomic implementation steps with dependencies
- Autonomous Execution (Rails) → branch, red/green loop per subtask, commit when green
- Validation & Review Automation → coverage gates, PR body with user story + results, checklist
- GitHub Integration Mapping → branch naming, commit footers, PR linkage to tasks/subtasks
- TUI/Terminal Integration → tag/task selection left pane; executor terminal right pane via tmux

Structural Nodes (Code Organization)
- `packages/tm-core`
  - `services/workflow-orchestrator.ts` (new): drives rails, emits progress events
  - `services/story-mode.service.ts` (new): toggles prompts/parsers for user stories, acceptance criteria
  - `services/test-runner-adapter.ts` (new): detects/executes project test command, collects coverage
  - `services/git-adapter.ts` (new): branch/commit/push, PR creation; applies commit footers
  - existing: `task-service.ts`, `task-execution-service.ts`, `executors/*`
- `apps/cli`
  - `src/commands/autopilot.command.ts` (new): orchestrates a full run; supports `--stories`
  - `src/ui/tui/` (new): tmux helpers and compact panes for selection and logs
- `scripts/modules`
  - reuse `utils/git-utils.js`, `task-manager/tag-management.js`, PR template utilities
- `.cursor/rules`
  - update generation/parsing rules to emit user‑story format when enabled
- `.claude/agents`
  - existing: `surgical-test-generator.md` for red phase

Edges (Dependencies / Data Flow)
- Brief Intake → User Story Generation → Acceptance/Test Criteria Synthesis → Implementation Planning → Autonomous Execution → Validation/PR
- Execution ↔ Test Runner (runAll/runTargeted, coverage) → back to Execution for decisions
- Git Adapter ← Execution (commits/branch) → PR creation (target default branch)
- TUI ↔ Orchestrator (event stream) → user confirmations for branch/push/PR
- MCP Tools ↔ Orchestrator for task/status/context updates

Topological Traversal (Build Order)
1) Config + Data Model changes (stories flag, acceptance fields, optional `gherkin`)
2) Rules/Prompts updates for parsing/generation in story mode (emit AC Hybrid + JSON Gherkin)
3) Test Runner Adapter (framework‑agnostic execute + coverage)
4) Git Adapter (branch pattern `{tag}/task-{id}-{slug}`, commit footers/trailer, PR create)
5) Workflow Orchestrator wiring red/green/commit loop with coverage gate and scenario iteration
6) Surgical Test Gen integration (red) from JSON Gherkin + AC; minimal‑change impl prompts (green)
7) CLI autopilot (dry‑run → full run) and TUI (tmux panes)
8) PR template and review automation (user story, AC table with test links, scenarios, coverage)

---

## Git/TDD Wiring (Validation‑First)
- Branch naming: include tag + task id (e.g., `master/task-12-user-auth`) to disambiguate context.
- Commit footers (configurable):
  - `Task-Id: <id>`
  - `Subtask-Id: <id>.<sub>` when relevant
  - `Tag: <tag>`
- Trailer: `Refs: TM-<tag>-<id>[.<sub>] SC:<scenarioId> AC:<acId>`
- Red/Green/Commit loop per subtask:
  - Red: synthesize failing tests from acceptance criteria (Surgical agent)
  - Green: minimal code to pass; re‑run full suite
  - Commit when all tests pass and coverage ≥ 80%
- PR base: repository default branch. Title `Task #<id> [<tag>]: <title>`.
- PR body sections: User Story, Acceptance Criteria, Subtask Summary, Test Results, Coverage Table, Linked Work Items (ids), Next Steps.
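
On the consuming side, a GitHub integration would need to read the trailer back into ids. The parser below is a hypothetical sketch: it assumes the `SC:` and `AC:` parts are optional, space-separated, and appear in that order, and that tags contain no whitespace. Tags that themselves end in `-<digits>` are ambiguous with the task id, so the pattern prefers the longest possible tag.

```typescript
// Hypothetical trailer parser; the optional SC/AC ordering is an assumption.
interface TrailerRefs {
  tag: string;
  taskId: number;
  subtask?: number;
  scenarioId?: string;
  acId?: string;
}

function parseRefsTrailer(line: string): TrailerRefs | null {
  const pattern = /^Refs: TM-(\S+)-(\d+)(?:\.(\d+))?(?: SC:(\S+))?(?: AC:(\S+))?$/;
  const match = pattern.exec(line.trim());
  if (match === null) return null; // not a Refs trailer in the expected shape

  const [, tag, id, sub, scenarioId, acId] = match;
  const refs: TrailerRefs = { tag, taskId: Number(id) };
  if (sub !== undefined) refs.subtask = Number(sub);
  if (scenarioId !== undefined) refs.scenarioId = scenarioId;
  if (acId !== undefined) refs.acId = acId;
  return refs;
}
```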

---

## Prompts & Parsers (Story Mode)
- PRD/Brief Parser updates:
  - Extract user stories with “As a … I want … so that …” format when present.
  - Extract Acceptance Criteria as bullet list; fill gaps with LLM synthesis from brief context.
  - Emit JSON Gherkin Feature/Scenarios; auto‑split Given/When/Then when feasible; otherwise store text under `then` and refine later.
- Task Generation Prompt (story mode):
  - “Generate a task as a User Story with clear Acceptance Criteria. Do not include implementation details in the story; produce implementation subtasks separately.”
- Subtask Generation Prompt:
  - “Produce technical implementation steps to satisfy the acceptance criteria. Each subtask should be atomic and testable.”
- Test Generation (Red):
  - Use `.claude/agents/surgical-test-generator.md`; seed with JSON Gherkin + Acceptance/Test Criteria; determinism favored over maximum coverage.
  - Record produced test paths back into AC items and optionally scenario annotations.
- Implementation (Green):
  - Minimal diffs, follow patterns, keep commits scoped to the subtask.

---

## TUI (Linear, tmux‑based)
- Left: Tag selector and task list (status/priority). Actions: Expand, Start (Next or Selected), Review.
- Right: Executor terminal (claude‑code/codex) under tmux split; orchestrator logs under another pane.
- Confirmations inline (branch create, push, PR) unless `--no-confirm`.

---

## Migration & Backward Compatibility
- Optional `gherkin` block; existing tasks remain valid.
- When `stories.enabled=true`, new tasks include AC Hybrid + `gherkin`; upgrade path via a utility to synthesize both from description/testStrategy/acceptanceCriteria.

---

## Risks & Mitigations
- Hallucinated acceptance criteria → Always show criteria in PR; allow quick amend and re‑run.
- Framework variance → Test Runner Adapter detects and normalizes execution/coverage; fallback to `test` script.
- Large diffs → Prompt for minimal changes; allow diff preview before commit.
- Flaky tests → Retry policy; isolate targeted runs; enforce passing full suite before commit.

---

## Acceptance Criteria Schema Options (for decision)
- Option A: Markdown only
  - Pros: simple to write/edit, good for humans
  - Cons: hard to map deterministically to tests; weak traceability; brittle diffs
- Option B: Structured array
  - Example: `{ id, summary, given, when, then, severity, tags }`
  - Pros: machine‑readable; strong linking to tests/coverage; easy to diff
  - Cons: heavier authoring; requires schema discipline
- Option C: Hybrid (recommended)
  - Store both: a normalized array of criteria objects and a preserved `raw` markdown block
  - Each criterion gets a stable `id` (e.g., `AC-<taskId>-<n>`) used in tests, commit trailers, and PR tables
  - Enables clean PR tables and deterministic coverage mapping while keeping human‑friendly text

Proposed default schema (hybrid):
```
acceptanceCriteria: {
  raw: """
  - AC1: Guest can checkout with credit card
  - AC2: Declined cards show error inline
  """,
  items: [
    {
      id: "AC-12-1",
      summary: "Guest can checkout with credit card",
      given: "a guest with items in cart",
      when: "submits valid credit card",
      then: "order is created and receipt emailed",
      severity: "must",
      tags: ["checkout", "payments"],
      tests: [] // filled by orchestrator (file paths/test IDs)
    }
  ]
}
```

Decision: adopt Hybrid default; allow Markdown‑only input and auto‑normalize.
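
Auto-normalizing Markdown-only input might start with something like the sketch below. `normalizeAcceptanceCriteria` and its types are hypothetical; the sketch only assigns stable ids and summaries from bullet lines, and leaves `given`/`when`/`then` to a later synthesis step. Stripping an `ACn:` label from each bullet is an assumption based on the example above.

```typescript
// Hypothetical Markdown-to-hybrid normalization; names are assumptions.
interface NormalizedCriterion {
  id: string;      // stable id in the form AC-<taskId>-<n>
  summary: string;
  tests: string[]; // filled later by the orchestrator
}

interface HybridCriteria {
  raw: string;                  // preserved author text
  items: NormalizedCriterion[]; // machine-readable view
}

function normalizeAcceptanceCriteria(raw: string, taskId: number): HybridCriteria {
  const items: NormalizedCriterion[] = [];
  for (const line of raw.split("\n")) {
    // Match "- text" or "* text", dropping an optional "AC<n>:" label.
    const match = /^\s*[-*]\s+(?:AC\d+:\s*)?(.+?)\s*$/.exec(line);
    if (match === null) continue; // skip blank lines and non-bullet text
    items.push({
      id: `AC-${taskId}-${items.length + 1}`,
      summary: match[1],
      tests: [],
    });
  }
  return { raw, items };
}
```

Numbering by position keeps ids stable only while the bullet order is unchanged; a real implementation would likely need to preserve ids across edits.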

## Decisions
- Adopt Hybrid acceptance criteria schema by default; normalize Markdown to structured items with stable IDs `AC-<taskId>-<n>`.
- Use conventional commits plus footers and a unified trailer `Refs: TM-<tag>-<id>[.<sub>]` across PRDs for robust mapping.