diff --git a/templates/commands/checklist.md b/templates/commands/checklist.md
index 7a746019..acce2e06 100644
--- a/templates/commands/checklist.md
+++ b/templates/commands/checklist.md
@@ -5,6 +5,24 @@ scripts:
   ps: scripts/powershell/check-prerequisites.ps1 -Json
 ---
 
+## Checklist Purpose
+
+**CRITICAL CLARIFICATION**: Checklists generated by this command are for **requirements validation**, NOT:
+- ❌ Verifying code execution or functionality
+- ❌ Testing whether code matches the specification
+- ❌ Checking implementation correctness
+- ❌ Code review or quality assurance
+
+**What checklists ARE for**:
+- ✅ Ensuring requirements are clearly captured and complete
+- ✅ Identifying ambiguities in specifications or plans
+- ✅ Verifying proper scenario coverage across the spec and plan
+- ✅ Confirming acceptance criteria are well-defined and measurable
+- ✅ Detecting gaps, conflicts, or missing edge cases in requirements
+- ✅ Validating that the problem domain is properly understood before implementation
+
+Think of checklists as a **pre-implementation review** to ensure the spec and plan are solid, not a post-implementation verification tool.
+
 ## User Input
 
 ```text
@@ -88,23 +106,25 @@ You **MUST** consider the user input before proceeding (if not empty).
      - Do NOT invent ad-hoc categories; merge sparse categories (<2 items) into the closest higher-signal category.
 
-   **Scenario Classification & Coverage**:
+   **Scenario Classification & Coverage** (Requirements Validation Focus):
    - Classify scenarios into: Primary, Alternate, Exception/Error, Recovery/Resilience, Non-Functional
    - At least one item per present scenario class; if intentionally absent add: `Confirm intentional absence of scenarios`
    - Include resilience/rollback coverage when state mutation or migrations occur (partial write, degraded mode, backward compatibility, rollback preconditions)
    - If a major scenario lacks acceptance criteria, add an item to define measurable criteria
+   - **Focus on requirements validation**: Are scenarios clearly defined? Are acceptance criteria measurable? Are edge cases identified in the spec?
 
    **Traceability Requirements**:
    - MINIMUM: ≥80% of items MUST include at least one traceability reference
    - Each item should include ≥1 of: scenario class tag, spec ref `[Spec §X.Y]`, acceptance criterion `[AC-##]`, or marker `(Assumption)/(Dependency)/(Ambiguity)/(Conflict)`
    - If no ID system exists, create an item: `Establish requirement & acceptance criteria ID scheme before proceeding`
 
-   **Surface & Resolve Issues**:
+   **Surface & Resolve Issues** (Pre-Implementation Validation):
    - Cluster and create one resolution item per cluster for:
-     - Ambiguities (vague terms: "fast", "robust", "secure")
-     - Conflicts (contradictory statements)
-     - Assumptions (unvalidated premises)
-     - Dependencies (external systems, feature flags, migrations, upstream APIs)
+     - Ambiguities (vague terms in spec: "fast", "robust", "secure" - these need quantification)
+     - Conflicts (contradictory statements in requirements)
+     - Assumptions (unvalidated premises in the spec or plan)
+     - Dependencies (external systems, feature flags, migrations, upstream APIs - are they documented?)
+   - Items should focus on "Is this requirement clear enough to implement?" not "Does the code work?"
 
    **Content Consolidation**:
    - Soft cap: If raw candidate items > 40, prioritize by risk/impact and add: `Consolidate remaining low-impact scenarios (see source docs) after priority review`
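To ground the traceability and phrasing rules above, here is a minimal sketch of items such a run might emit; the CHK numbering follows the template convention referenced later in this patch, while the feature, spec sections, and AC IDs are hypothetical:

```markdown
## Requirement Completeness

- [ ] CHK001 Confirm acceptance criteria for checkout are measurable [Spec §2.1] [AC-03]
- [ ] CHK002 Clarify the vague term "fast response" with a quantified target [Spec §2.4] (Ambiguity)
- [ ] CHK003 Validate the assumed single-currency scope with stakeholders (Assumption)
- [ ] CHK004 Confirm the upstream payment API dependency is documented in the plan (Dependency)
```

Each item probes whether a requirement is clear, measurable, or documented; none of them exercises running code.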
@@ -119,6 +139,16 @@ You **MUST** consider the user input before proceeding (if not empty).
    - Rephrase any such user input into requirement clarity or coverage validation
    - Optional brief rationale ONLY if it clarifies requirement intent or risk
 
+   **✅ HOW TO PHRASE CHECKLIST ITEMS** (Requirements Validation):
+   - Good: "Verify error handling scenarios are defined for network failures"
+   - Bad: "Test error handling for network failures"
+   - Good: "Confirm acceptance criteria are measurable for performance requirements"
+   - Bad: "Run performance tests to verify requirements"
+   - Good: "Identify edge cases for concurrent user access in spec"
+   - Bad: "Implement thread-safe concurrent access"
+   - Good: "Clarify ambiguous term 'fast response' with specific timing requirements"
+   - Bad: "Verify response time is under 100ms"
+
 6. **Structure Reference**: Generate the checklist following the canonical template in `templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### ` lines with globally incrementing IDs starting at CHK001.
 
 7. **Report**: Output full path to created checklist, item count, and remind user that each run creates a new file. Summarize:
diff --git a/templates/commands/specify.md b/templates/commands/specify.md
index 37886316..8fa35230 100644
--- a/templates/commands/specify.md
+++ b/templates/commands/specify.md
@@ -29,15 +29,25 @@ Given that feature description, do this:
       If empty: ERROR "No feature description provided"
    2. Extract key concepts from description
       Identify: actors, actions, data, constraints
-   3. For each unclear aspect:
-      Mark with [NEEDS CLARIFICATION: specific question]
+   3. For unclear aspects:
+      - Make informed guesses based on context and industry standards
+      - Only mark with [NEEDS CLARIFICATION: specific question] if:
+        - The choice significantly impacts feature scope or user experience
+        - Multiple reasonable interpretations exist with different implications
+        - No reasonable default exists
+      - **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
+      - Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
    4. Fill User Scenarios & Testing section
       If no clear user flow: ERROR "Cannot determine user scenarios"
    5. Generate Functional Requirements
       Each requirement must be testable
-      Mark ambiguous requirements
-   6. Identify Key Entities (if data involved)
-   7. Return: SUCCESS (spec ready for planning)
+      Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
+   6. Define Success Criteria
+      Create measurable, technology-agnostic outcomes
+      Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
+      Each criterion must be verifiable without implementation details
+   7. Identify Key Entities (if data involved)
+   8. Return: SUCCESS (spec ready for planning)
 
 4. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.
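As a sketch of how step 3 plays out in a generated spec (the feature, the default, and the requirement IDs below are invented for illustration), a reasonable default is recorded as an assumption, while only a scope-altering question earns one of the at most three markers:

```markdown
## Assumptions

- Sessions expire after 30 days of inactivity (industry-standard default; the prompt did not specify).

## Functional Requirements

- FR-001: Users MUST be able to reset their password via an emailed link.
- FR-002: The system MUST isolate account data per [NEEDS CLARIFICATION: single-tenant or multi-tenant deployment? The choice changes data-isolation scope.]
```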
@@ -64,9 +74,19 @@ Given that feature description, do this:
    - [ ] No [NEEDS CLARIFICATION] markers remain
    - [ ] Requirements are testable and unambiguous
    - [ ] Success criteria are measurable
+   - [ ] Success criteria are technology-agnostic (no implementation details)
+   - [ ] All acceptance scenarios are defined
+   - [ ] Edge cases are identified
    - [ ] Scope is clearly bounded
    - [ ] Dependencies and assumptions identified
+
+   ## Feature Readiness
+
+   - [ ] All functional requirements have clear acceptance criteria
+   - [ ] User scenarios cover primary flows
+   - [ ] Feature meets measurable outcomes defined in Success Criteria
+   - [ ] No implementation details leak into specification
 
    ## Notes
 
    - Items marked incomplete require spec updates before `/clarify` or `/plan`
@@ -88,7 +108,8 @@ Given that feature description, do this:
    - **If [NEEDS CLARIFICATION] markers remain**:
      1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
-     2. For each clarification needed, present options to user in this format:
+     2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
+     3. For each clarification needed (max 3), present options to user in this format:
 
         ```markdown
         ## Question [N]: [Topic]
@@ -109,16 +130,16 @@ Given that feature description, do this:
         **Your choice**: _[Wait for user response]_
         ```
 
-     3. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
+     4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
        - Use consistent spacing with pipes aligned
        - Each cell should have spaces around content: `| Content |` not `|Content|`
        - Header separator must have at least 3 dashes: `|--------|`
        - Test that the table renders correctly in markdown preview
-     4. Number questions sequentially (Q1, Q2, Q3, etc.)
-     5. Present all questions together before waiting for responses
-     6. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
-     7. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
-     8. Re-run validation after all clarifications are resolved
+     5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
+     6. Present all questions together before waiting for responses
+     7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
+     8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
+     9. Re-run validation after all clarifications are resolved
 
    d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
 
@@ -145,13 +166,46 @@ Given that feature description, do this:
 When creating this spec from a user prompt:
 
-1. **Mark all ambiguities**: Use [NEEDS CLARIFICATION: specific question] for any assumption you'd need to make
-2. **Don't guess**: If the prompt doesn't specify something (e.g., "login system" without auth method), mark it
-3. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
-4. **Common underspecified areas**:
-   - User types and permissions
-   - Data retention/deletion policies
-   - Performance targets and scale
-   - Error handling behaviors
-   - Integration requirements
-   - Security/compliance needs
+1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
+2. **Document assumptions**: Record reasonable defaults in the Assumptions section
+3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
+   - Significantly impact feature scope or user experience
+   - Have multiple reasonable interpretations with different implications
+   - Lack any reasonable default
+4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
+5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
+6. **Common areas needing clarification** (only if no reasonable default exists):
+   - Feature scope and boundaries (include/exclude specific use cases)
+   - User types and permissions (if multiple conflicting interpretations possible)
+   - Security/compliance requirements (when legally/financially significant)
+
+**Examples of reasonable defaults** (don't ask about these):
+
+- Data retention: Industry-standard practices for the domain
+- Performance targets: Standard web/mobile app expectations unless specified
+- Error handling: User-friendly messages with appropriate fallbacks
+- Authentication method: Standard session-based or OAuth2 for web apps
+- Integration patterns: RESTful APIs unless specified otherwise
+
+### Success Criteria Guidelines
+
+Success criteria must be:
+
+1. **Measurable**: Include specific metrics (time, percentage, count, rate)
+2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
+3. **User-focused**: Describe outcomes from user/business perspective, not system internals
+4. **Verifiable**: Can be tested/validated without knowing implementation details
+
+**Good examples**:
+
+- "Users can complete checkout in under 3 minutes"
+- "System supports 10,000 concurrent users"
+- "95% of searches return results in under 1 second"
+- "Task completion rate improves by 40%"
+
+**Bad examples** (implementation-focused):
+
+- "API response time is under 200ms" (too technical, use "Users see results instantly")
+- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
+- "React components render efficiently" (framework-specific)
+- "Redis cache hit rate above 80%" (technology-specific)
diff --git a/templates/spec-template.md b/templates/spec-template.md
index fd5dbfb9..db1d1878 100644
--- a/templates/spec-template.md
+++ b/templates/spec-template.md
@@ -55,3 +55,17 @@
 
 - **[Entity 1]**: [What it represents, key attributes without implementation]
 - **[Entity 2]**: [What it represents, relationships to other entities]
+
+## Success Criteria *(mandatory)*
+
+
+
+### Measurable Outcomes
+
+- **SC-001**: [Measurable metric, e.g., "Users can complete account creation in under 2 minutes"]
+- **SC-002**: [Measurable metric, e.g., "System handles 1000 concurrent users without degradation"]
+- **SC-003**: [User satisfaction metric, e.g., "90% of users successfully complete primary task on first attempt"]
+- **SC-004**: [Business metric, e.g., "Reduce support tickets related to [X] by 50%"]
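For contrast with the bracketed placeholders above, a filled-in section might read as follows; the feature and every number are invented for illustration, and each criterion stays measurable and technology-agnostic:

```markdown
## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users can complete account creation in under 2 minutes.
- **SC-002**: 95% of password resets succeed on the user's first attempt.
- **SC-003**: Support tickets related to sign-in drop by 50% within one quarter of launch.
```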