Update checklist.md

author den (work)
date 2025-10-05 22:55:34 -07:00
parent 78638a9a37
commit a6be9bea31


@@ -5,23 +5,24 @@ scripts:
ps: scripts/powershell/check-prerequisites.ps1 -Json
---
## Checklist Purpose: "Unit Tests for English"
**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
**NOT for verification/testing**:
- ❌ NOT "Verify the button clicks correctly"
- ❌ NOT "Test error handling works"
- ❌ NOT "Confirm the API returns 200"
- ❌ NOT checking if code/implementation matches the spec
**FOR requirements quality validation**:
- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
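To make the metaphor concrete (the section number and wording below are hypothetical): for a spec line such as "Spec §FR-3: Episode pages load quickly and show related episodes", the checklist items that "unit test" it might be:
```markdown
- [ ] CHK001 - Is "quickly" quantified with a specific load-time threshold? [Clarity, Spec §FR-3]
- [ ] CHK002 - Are selection criteria for "related episodes" explicitly defined? [Completeness, Spec §FR-3]
- [ ] CHK003 - Is behavior specified when no related episodes exist? [Edge Case, Gap]
```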
## User Input
@@ -85,69 +86,122 @@ You **MUST** consider the user input before proceeding (if not empty).
- Use progressive disclosure: add follow-on retrieval only if gaps detected
- If source docs are large, generate interim summary items instead of embedding raw text
5. **Generate checklist** - Create "Unit Tests for Requirements":
- Create `FEATURE_DIR/checklists/` directory if it doesn't exist
- Generate unique checklist filename:
- Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
- Format: `[domain].md`
- If a checklist for that domain already exists, append new items to it rather than overwriting; otherwise each `/checklist` run creates a NEW file (never overwrite an existing checklist)
- Number items sequentially starting from CHK001 (see the sketch below)
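For illustration only - the feature path and items are hypothetical - appending a second UX run to an existing `ux.md` would presumably continue the sequential numbering rather than restart at CHK001:
```markdown
<!-- hypothetical: specs/004-podcast-site/checklists/ux.md, appended by a second run -->
- [ ] CHK014 - Are focus-state requirements defined for all interactive elements? [Coverage, Gap]
- [ ] CHK015 - Is "consistent branding" defined with measurable visual criteria? [Clarity]
```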
**CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
- **Completeness**: Are all necessary requirements present?
- **Clarity**: Are requirements unambiguous and specific?
- **Consistency**: Do requirements align with each other?
- **Measurability**: Can requirements be objectively verified?
- **Coverage**: Are all scenarios/edge cases addressed?
**Category Structure** - Group items by requirement quality dimensions:
- **Requirement Completeness** (Are all necessary requirements documented?)
- **Requirement Clarity** (Are requirements specific and unambiguous?)
- **Requirement Consistency** (Do requirements align without conflicts?)
- **Acceptance Criteria Quality** (Are success criteria measurable?)
- **Scenario Coverage** (Are all flows/cases addressed?)
- **Edge Case Coverage** (Are boundary conditions defined?)
- **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
- **Dependencies & Assumptions** (Are they documented and validated?)
- **Ambiguities & Conflicts** (What needs clarification?)
**HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
**WRONG** (Testing implementation):
- "Verify landing page displays 3 episode cards"
- "Test hover states work on desktop"
- "Confirm logo click navigates home"
**CORRECT** (Testing requirements quality):
- "Are the exact number and layout of featured episodes specified?" [Completeness]
- "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
- "Are hover state requirements consistent across all interactive elements?" [Consistency]
- "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
- "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
- "Are loading states defined for asynchronous episode data?" [Completeness]
- "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
**ITEM STRUCTURE**:
Each item should follow this pattern:
- Question format asking about requirement quality
- Focus on what's WRITTEN (or not written) in the spec/plan
- Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
- Reference spec section `[Spec §X.Y]` when checking existing requirements
- Use `[Gap]` marker when checking for missing requirements
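Assembled in the checklist file format, items following this pattern might read (IDs and section references are hypothetical):
```markdown
- [ ] CHK007 - Are retry requirements defined for failed episode uploads? [Completeness, Spec §FR-7]
- [ ] CHK008 - Is fallback behavior specified when episode artwork is missing? [Edge Case, Gap]
```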
**EXAMPLES BY QUALITY DIMENSION**:
Completeness:
- "Are error handling requirements defined for all API failure modes? [Gap]"
- "Are accessibility requirements specified for all interactive elements? [Completeness]"
- "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
Clarity:
- "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
- "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
- "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
Consistency:
- "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
- "Are card component requirements consistent between landing and detail pages? [Consistency]"
Coverage:
- "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
- "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
- "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
Measurability:
- "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
- "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
**Scenario Classification & Coverage** (Requirements Quality Focus):
- Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
- For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
- If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
- Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
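As a sketch, one item per scenario class for a hypothetical data-migration feature could read:
```markdown
- [ ] CHK020 - Are primary flow requirements complete for the import wizard? [Completeness]
- [ ] CHK021 - Are error requirements defined for malformed input files? [Coverage, Exception Flow]
- [ ] CHK022 - Are rollback requirements defined for migration failures? [Gap, Recovery]
```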
**Traceability Requirements**:
- MINIMUM: ≥80% of items MUST include at least one traceability reference
- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
- If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
**Surface & Resolve Issues** (Requirements Quality Problems):
Ask questions about the requirements themselves:
- Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
- Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
- Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
- Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
- Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
**Content Consolidation**:
- Soft cap: If raw candidate items > 40, prioritize by risk/impact
- Merge near-duplicates checking the same requirement aspect
- If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]" (see the example below)
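For example (hypothetical items), five separate low-impact edge-case candidates could collapse into one aggregated item:
```markdown
- [ ] CHK033 - Are edge cases addressed in requirements for: empty search results, expired sessions, duplicate titles, offline playback, and zero-length episodes? [Coverage, Edge Case]
```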
**🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
- ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
- ❌ References to code execution, user actions, system behavior
- ❌ "Displays correctly", "works properly", "functions as expected"
- ❌ "Click", "navigate", "render", "load", "execute"
- ❌ Test cases, test plans, QA procedures
- ❌ Implementation details (frameworks, APIs, algorithms)
**REQUIRED PATTERNS** - These test requirements quality:
- ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
- ✅ "Is [vague term] quantified/clarified with specific criteria?"
- ✅ "Are requirements consistent between [section A] and [section B]?"
- ✅ "Can [requirement] be objectively measured/verified?"
- ✅ "Are [edge cases/scenarios] addressed in requirements?"
- ✅ "Does the spec define [missing aspect]?"
6. **Structure Reference**: Generate the checklist following the canonical template in `templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
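If the template is unavailable, the fallback structure described above would look roughly like this (title, purpose, and items are placeholders):
```markdown
# [Domain] Requirements Checklist
Purpose: Validate [domain] requirements quality for [feature]
Created: [date]

## Requirement Completeness
- [ ] CHK001 - [item]
- [ ] CHK002 - [item]

## Requirement Clarity
- [ ] CHK003 - [item]
```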
@@ -165,39 +219,71 @@ You **MUST** consider the user input before proceeding (if not empty).
To avoid clutter, use descriptive types and clean up obsolete checklists when done.
## Example Checklist Types & Sample Items
**UX Requirements Quality:** `ux.md`
Sample items (testing the requirements, NOT the implementation):
- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
**API Requirements Quality:** `api.md`
Sample items:
- "Are error response formats specified for all failure scenarios? [Completeness]"
- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
- "Are authentication requirements consistent across all endpoints? [Consistency]"
- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
- "Is versioning strategy documented in requirements? [Gap]"
**Performance Requirements Quality:** `performance.md`
Sample items:
- "Are performance requirements quantified with specific metrics? [Clarity]"
- "Are performance targets defined for all critical user journeys? [Coverage]"
- "Are performance requirements under different load conditions specified? [Completeness]"
- "Can performance requirements be objectively measured? [Measurability]"
- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
**Security Requirements Quality:** `security.md`
Sample items:
- "Are authentication requirements specified for all protected resources? [Coverage]"
- "Are data protection requirements defined for sensitive information? [Completeness]"
- "Is the threat model documented and requirements aligned to it? [Traceability]"
- "Are security requirements consistent with compliance obligations? [Consistency]"
- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
## Anti-Examples: What NOT To Do
**❌ WRONG - These test implementation, not requirements:**
```markdown
- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
```
**✅ CORRECT - These test requirements quality:**
```markdown
- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
```
**Key Differences:**
- Wrong: Tests if the system works correctly
- Correct: Tests if the requirements are written correctly
- Wrong: Verification of behavior
- Correct: Validation of requirement quality
- Wrong: "Does it do X?"
- Correct: "Is X clearly specified?"