# test-design

Create comprehensive test scenarios with appropriate test level recommendations for story implementation.

## Inputs

required:
  - story_id: "{epic}.{story}" # e.g., "1.3"
  - story_path: "docs/stories/{epic}.{story}.*.md"
  - story_title: "{title}" # If missing, derive from story file H1
  - story_slug: "{slug}" # If missing, derive from title (lowercase, hyphenated)

## Purpose

Design a complete test strategy that identifies what to test, at which level (unit/integration/e2e), and why. This ensures efficient test coverage without redundancy while maintaining appropriate test boundaries.
## Test Level Decision Framework

### Unit Tests

**When to use:**

- Testing pure functions and business logic
- Algorithm correctness
- Input validation and data transformation
- Error handling in isolated components
- Complex calculations or state machines

**Characteristics:**

- Fast execution (immediate feedback)
- No external dependencies (DB, API, file system)
- Highly maintainable and stable
- Easy to debug failures

**Example scenarios:**

unit_test:
  component: "PriceCalculator"
  scenario: "Calculate discount with multiple rules"
  justification: "Complex business logic with multiple branches"
  mock_requirements: "None - pure function"
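As a minimal sketch of what a test at this level might look like (the `PriceCalculator` class, its `applyDiscounts` method, and the rule shape are illustrative assumptions, not part of this task):

```typescript
// price-calculator.spec.ts — hypothetical unit test for the scenario above.
import { PriceCalculator, DiscountRule } from './price-calculator';

describe('PriceCalculator', () => {
  it('applies multiple discount rules in order and never goes below zero', () => {
    const rules: DiscountRule[] = [
      { type: 'percentage', value: 10 },
      { type: 'fixed', value: 5 },
    ];
    const calculator = new PriceCalculator(rules);

    // Pure function: no mocks or external dependencies required.
    expect(calculator.applyDiscounts(100)).toBe(85); // 100 - 10% - 5
    expect(calculator.applyDiscounts(3)).toBe(0);    // clamped, never negative
  });
});
```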
### Integration Tests

**When to use:**

- Testing component interactions
- Database operations and queries
- API endpoint behavior
- Service layer orchestration
- External service integration (with test doubles)

**Characteristics:**

- Moderate execution time
- May use test databases or containers
- Tests multiple components together
- Validates contracts between components

**Example scenarios:**

integration_test:
  components: ["UserService", "UserRepository", "Database"]
  scenario: "Create user with duplicate email check"
  justification: "Tests transaction boundaries and constraint handling"
  test_doubles: "Mock email service, real test database"
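A sketch of how this scenario might be exercised (UserService, UserRepository, and createTestDatabase are hypothetical names used for illustration; the pattern is the point):

```typescript
// user-service.integration.spec.ts — hypothetical integration test for the scenario above.
import { createTestDatabase } from '../test-support/test-database';
import { UserRepository } from './user-repository';
import { UserService } from './user-service';

describe('UserService + UserRepository + Database', () => {
  it('rejects a second user with the same email', async () => {
    const db = await createTestDatabase();            // isolated, per-test database
    const emailService = { sendWelcome: jest.fn() };  // test double for the external service
    const service = new UserService(new UserRepository(db), emailService);

    await service.createUser({ email: 'dana@example.com', name: 'Dana' });

    // The real test database enforces the unique-email constraint inside the transaction.
    await expect(
      service.createUser({ email: 'dana@example.com', name: 'Duplicate' })
    ).rejects.toThrow(/already exists/i);

    await db.close(); // self-cleaning: tear down the data this test created
  });
});
```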
End-to-End Tests
When to use:
- Critical user journeys
- Cross-system workflows
- UI interaction flows
- Full stack validation
- Production-like scenario testing
Characteristics:
- Keep under 90 seconds per test
- Tests complete user scenarios
- Uses real or production-like environment
- Higher maintenance cost
- More prone to flakiness
Example scenarios:
e2e_test:
flow: "Complete purchase flow"
scenario: "User browses, adds to cart, and completes checkout"
justification: "Critical business flow requiring full stack validation"
environment: "Staging with test payment gateway"
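The task does not prescribe an E2E tool; as one illustration, the checkout flow above could be sketched with Playwright (URLs, selectors, and the test card number are assumptions):

```typescript
// checkout.e2e.spec.ts — minimal sketch of the purchase flow using Playwright.
import { test, expect } from '@playwright/test';

test('user browses, adds to cart, and completes checkout', async ({ page }) => {
  test.setTimeout(90_000); // keep within the 90-second budget per test

  await page.goto('https://staging.example.com/catalog');
  await page.getByRole('link', { name: 'Blue T-Shirt' }).click();
  await page.getByRole('button', { name: 'Add to cart' }).click();
  await page.getByRole('link', { name: 'Checkout' }).click();

  // Staging is assumed to be wired to a test payment gateway.
  await page.getByLabel('Card number').fill('4242 4242 4242 4242');
  await page.getByRole('button', { name: 'Pay now' }).click();

  // Dynamic wait: expect() polls until the confirmation appears, no hard sleeps.
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```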
## Test Design Process

### 1. Analyze Story Requirements

Break down each acceptance criterion into testable scenarios:

acceptance_criterion: "User can reset password via email"
test_scenarios:
  - level: unit
    what: "Password validation rules"
    why: "Complex regex and business rules"
  - level: integration
    what: "Password reset token generation and storage"
    why: "Database interaction with expiry logic"
  - level: integration
    what: "Email service integration"
    why: "External service with retry logic"
  - level: e2e
    what: "Complete password reset flow"
    why: "Critical security flow needing full validation"
### 2. Apply Test Level Heuristics

Use these rules to determine appropriate test levels:
## Test Level Selection Rules
### Favor Unit Tests When:
- Logic can be isolated
- No side effects involved
- Fast feedback needed
- High cyclomatic complexity
### Favor Integration Tests When:
- Testing persistence layer
- Validating service contracts
- Testing middleware/interceptors
- Component boundaries critical
### Favor E2E Tests When:
- User-facing critical paths
- Multi-system interactions
- Regulatory compliance scenarios
- Visual regression important
### Anti-patterns to Avoid:
- E2E testing for business logic validation
- Unit testing framework behavior
- Integration testing third-party libraries
- Duplicate coverage across levels
### Duplicate Coverage Guard
**Before adding any test, check:**
1. Is this already tested at a lower level?
2. Can a unit test cover this instead of integration?
3. Can an integration test cover this instead of E2E?
**Coverage overlap is only acceptable when:**
- Testing different aspects (unit: logic, integration: interaction, e2e: user experience)
- Critical paths requiring defense in depth
- Regression prevention for previously broken functionality
### 3. Design Test Scenarios

**Test ID Format:** {EPIC}.{STORY}-{LEVEL}-{SEQ}

- Example: 1.3-UNIT-001, 1.3-INT-002, 1.3-E2E-001
- Ensures traceability across all artifacts

**Naming Convention:**

- Unit: test_{component}_{scenario}
- Integration: test_{flow}_{interaction}
- E2E: test_{journey}_{outcome}

**Risk Linkage:**

- Tag tests with risk IDs they mitigate
- Prioritize tests for high-risk areas (P0)
- Link to risk profile when available
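As a small illustration of how the ID and naming conventions combine (the helper functions below are hypothetical, not something this task provides):

```typescript
// test-id.ts — hypothetical helpers illustrating the conventions above.
type Level = 'UNIT' | 'INT' | 'E2E';

// "1.3-INT-002": epic.story, level, zero-padded sequence — stable across all artifacts.
function testId(epic: number, story: number, level: Level, seq: number): string {
  return `${epic}.${story}-${level}-${String(seq).padStart(3, '0')}`;
}

// "test_login_rate_limit": naming convention for an integration test.
function integrationTestName(flow: string, interaction: string): string {
  return `test_${flow}_${interaction}`;
}

console.log(testId(1, 3, 'INT', 2));                     // -> "1.3-INT-002"
console.log(integrationTestName('login', 'rate_limit')); // -> "test_login_rate_limit"
```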
For each identified test need:

test_scenario:
  id: "1.3-INT-002"
  requirement: "AC2: Rate limiting on login attempts"
  mitigates_risks: ["SEC-001", "PERF-003"] # Links to risk profile
  priority: P0 # Based on risk score
  unit_tests:
    - name: "RateLimiter calculates window correctly"
      input: "Timestamp array"
      expected: "Correct window calculation"
  integration_tests:
    - name: "Login endpoint enforces rate limit"
      setup: "5 failed attempts"
      action: "6th attempt"
      expected: "429 response with retry-after header"
  e2e_tests:
    - name: "User sees rate limit message"
      setup: "Trigger rate limit"
      validation: "Error message displayed, retry timer shown"
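A sketch of how scenario 1.3-INT-002 could be realized, here using supertest against a hypothetical Express-style app (the endpoint, payloads, and app import are assumptions):

```typescript
// login-rate-limit.integration.spec.ts — sketch of scenario 1.3-INT-002.
import request from 'supertest';
import { app } from '../src/app';

describe('Login endpoint rate limiting', () => {
  it('returns 429 with a retry-after header on the 6th failed attempt', async () => {
    const credentials = { email: 'rate.limit.user@example.com', password: 'wrong-password' };

    // Setup: 5 failed attempts for a user created just for this test.
    for (let i = 0; i < 5; i++) {
      await request(app).post('/login').send(credentials).expect(401);
    }

    // Action: the 6th attempt should trip the limiter.
    const response = await request(app).post('/login').send(credentials);

    expect(response.status).toBe(429);
    expect(response.headers['retry-after']).toBeDefined();
  });
});
```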
## Deterministic Test Level Minimums

**Per Acceptance Criterion:**

- At least 1 unit test for business logic
- At least 1 integration test if multiple components interact
- At least 1 E2E test if it's a user-facing feature

**Exceptions:**

- Pure UI changes: May skip unit tests
- Pure logic changes: May skip E2E tests
- Infrastructure changes: May focus on integration tests

When in doubt: Start with unit tests, add integration for interactions, E2E for critical paths only.
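One way to express these minimums deterministically (a sketch only; the flags and helper name are illustrative, not part of the task):

```typescript
// minimum-levels.ts — hypothetical helper expressing the per-AC minimums above.
interface AcceptanceCriterion {
  hasBusinessLogic: boolean;          // pure logic worth a unit test
  multipleComponentsInteract: boolean;
  userFacing: boolean;
}

function minimumTestLevels(ac: AcceptanceCriterion): Array<'unit' | 'integration' | 'e2e'> {
  const levels: Array<'unit' | 'integration' | 'e2e'> = [];
  if (ac.hasBusinessLogic) levels.push('unit');
  if (ac.multipleComponentsInteract) levels.push('integration');
  if (ac.userFacing) levels.push('e2e');
  // "When in doubt": default to the lowest level rather than none at all.
  return levels.length > 0 ? levels : ['unit'];
}

// A pure UI change: no business logic, no component interaction, user-facing.
console.log(minimumTestLevels({ hasBusinessLogic: false, multipleComponentsInteract: false, userFacing: true }));
// -> ["e2e"]
```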
## Test Quality Standards

### Core Testing Principles

- **No Flaky Tests:** Ensure reliability through proper async handling, explicit waits, and atomic test design.
- **No Hard Waits/Sleeps:** Use dynamic waiting strategies (e.g., polling, event-based triggers); see the sketch after this list.
- **Stateless & Parallel-Safe:** Tests run independently; use cron jobs or semaphores only if unavoidable.
- **No Order Dependency:** Every it/describe/context block works in isolation (supports .only execution).
- **Self-Cleaning Tests:** Each test sets up its own data and automatically deletes/deactivates entities created during testing.
- **Tests Live Near Source Code:** Co-locate test files with the code they validate (e.g., *.spec.js alongside components).
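A small sketch of a dynamic wait, i.e. polling a condition instead of a hard sleep (the waitFor helper and the commented usage are illustrative assumptions):

```typescript
// wait-for.ts — illustrative polling helper: wait for a condition instead of sleeping.
async function waitFor(
  condition: () => Promise<boolean>,
  { timeoutMs = 5_000, intervalMs = 100 } = {}
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await condition()) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Condition not met within ${timeoutMs}ms`);
}

// Usage in a test: poll until a background job marks the order as processed,
// rather than sleeping for a fixed amount of time and hoping it is done.
// await waitFor(async () => (await getOrder(orderId)).status === 'processed');
```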
### Execution Strategy

- **Shifted Left:**
  - Start with local environments or ephemeral stacks
  - Validate functionality across all deployment stages (local → dev → stage)
- **Low Maintenance:** Minimize manual upkeep (avoid brittle selectors, do not repeat UI actions, leverage APIs).
- **CI Execution Evidence:** Integrate into pipelines with clear logs/artifacts.
- **Visibility:** Generate test reports (e.g., JUnit XML, HTML) for failures and trends.
### Coverage Requirements

**Release Confidence:**

- Happy Path: Core user journeys are prioritized
- Edge Cases: Critical error/validation scenarios are covered
- Feature Flags: Test both enabled and disabled states where applicable
### Test Design Rules

- **Assertions:** Keep them explicit in tests; avoid abstraction into helpers. Use parametrized tests for soft assertions (see the sketch after this list).
- **Naming:** Follow conventions (e.g., describe('Component'), it('should do X when Y')).
- **Size:** Aim for files ≤200 lines; split/chunk large tests logically.
- **Speed:** Target individual tests ≤90 seconds; optimize slow setups (e.g., shared fixtures).
- **Careful Abstractions:** Favor readability over DRY when balancing helper reuse (page objects are okay, assertion logic is not).
- **Test Cleanup:** Ensure tests clean up resources they create (e.g., closing browser, deleting test data).
- **Deterministic Flow:** Tests should avoid conditionals (if/else) to control flow, and try/catch blocks, where possible.
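A small sketch of a parametrized test in Jest-style syntax; the isValidPassword function under test is hypothetical, and each case is reported as its own explicit assertion:

```typescript
// password-rules.spec.ts — parametrized cases keep assertions explicit and individually reported.
import { isValidPassword } from './password-rules';

const cases: Array<[string, boolean]> = [
  ['Str0ng!Pass', true],    // meets length, digit, and symbol rules
  ['short1!', false],       // too short
  ['nodigitshere!', false], // missing a digit
  ['NoSymbols123', false],  // missing a symbol
];

describe('isValidPassword', () => {
  it.each(cases)('validates %p as %p', (password, expected) => {
    expect(isValidPassword(password)).toBe(expected);
  });
});
```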
### API Testing Standards

- Tests must not depend on hardcoded data → use factories and per-test setup (see the sketch after this list)
- Always test both happy path and negative/error cases
- API tests should run parallel safely (no global state shared)
- Test idempotency where applicable (e.g., duplicate requests)
- Tests should clean up their data
- Response logs should only be printed in case of failure
- Auth tests must validate token expiration and renewal
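A sketch of per-test data setup via a factory (the factory shape and field names are illustrative; the point is that nothing is hardcoded or shared between parallel tests):

```typescript
// user-factory.ts — hypothetical factory: every test gets unique, disposable data.
import { randomUUID } from 'node:crypto';

interface NewUser {
  email: string;
  name: string;
  password: string;
}

export function buildUser(overrides: Partial<NewUser> = {}): NewUser {
  const id = randomUUID();
  return {
    email: `qa.user.${id}@example.com`, // unique per test, safe for parallel runs
    name: `QA User ${id.slice(0, 8)}`,
    password: 'Test-Password-1!',
    ...overrides,
  };
}

// In a test: create the user through the API, assert the behavior, then delete it
// during cleanup so the test stays stateless and leaves nothing behind.
```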
## Outputs

### Output 1: Test Design Document

**Save to:** docs/qa/assessments/{epic}.{story}-test-design-{YYYYMMDD}.md

Generate a comprehensive test design document:
# Test Design: Story {epic}.{story}
Date: {date}
Reviewer: Quinn (Test Architect)
## Test Strategy Overview
- Total test scenarios: X
- Unit tests: Y (A%)
- Integration tests: Z (B%)
- E2E tests: W (C%)
## Test Level Rationale
[Explain why this distribution was chosen]
## Detailed Test Scenarios
### Requirement: AC1 - {description}
#### Unit Tests (3 scenarios)
1. **ID**: 1.3-UNIT-001
**Test**: Validate input format
- **Why Unit**: Pure validation logic
- **Coverage**: Input edge cases
- **Mocks**: None needed
- **Mitigates**: DATA-001 (if applicable)
#### Integration Tests (2 scenarios)
1. **ID**: 1.3-INT-001
**Test**: Service processes valid request
- **Why Integration**: Multiple components involved
- **Coverage**: Happy path + error handling
- **Test Doubles**: Mock external API
- **Mitigates**: TECH-002
#### E2E Tests (1 scenario)
1. **ID**: 1.3-E2E-001
**Test**: Complete user workflow
- **Why E2E**: Critical user journey
- **Coverage**: Full stack validation
- **Environment**: Staging
- **Max Duration**: 90 seconds
- **Mitigates**: BUS-001
[Continue for all requirements...]
## Test Data Requirements
### Unit Test Data
- Static fixtures for calculations
- Edge case values arrays
### Integration Test Data
- Test database seeds
- API response fixtures
### E2E Test Data
- Test user accounts
- Sandbox environment data
## Mock/Stub Strategy
### What to Mock
- External services (payment, email)
- Time-dependent functions
- Random number generators
### What NOT to Mock
- Core business logic
- Database in integration tests
- Critical security functions
## Test Execution Implementation
### Parallel Execution
- All unit tests: Fully parallel (stateless requirement)
- Integration tests: Parallel with isolated databases
- E2E tests: Sequential or limited parallelism
### Execution Order
1. Unit tests first (fail fast)
2. Integration tests second
3. E2E tests last (expensive, max 90 seconds each)
## Risk-Based Test Priority
### P0 - Must Have (Linked to Critical/High Risks)
- Security-related tests (SEC-\* risks)
- Data integrity tests (DATA-\* risks)
- Critical business flow tests (BUS-\* risks)
- Tests for risks scored ≥6 in risk profile
### P1 - Should Have (Medium Risks)
- Edge case coverage
- Performance tests (PERF-\* risks)
- Error recovery tests
- Tests for risks scored 4-5
### P2 - Nice to Have (Low Risks)
- UI polish tests
- Minor validation tests
- Tests for risks scored ≤3
## Test Maintenance Considerations
### High Maintenance Tests
[List tests that may need frequent updates]
### Stability Measures
- No retry strategies (tests must be deterministic)
- Dynamic waits only (no hard sleeps)
- Environment isolation
- Self-cleaning test data
## Coverage Goals
### Unit Test Coverage
- Target: 80% line coverage
- Focus: Business logic, calculations
### Integration Coverage
- Target: All API endpoints
- Focus: Contract validation
### E2E Coverage
- Target: Critical paths only
- Focus: User value delivery
## Test Level Smells to Flag

### Over-testing Smells

- Same logic tested at multiple levels
- E2E tests for calculations
- Integration tests for framework features

### Under-testing Smells

- No unit tests for complex logic
- Missing integration tests for data operations
- No E2E tests for critical user paths

### Wrong Level Smells

- Unit tests with real database
- E2E tests checking calculation results
- Integration tests mocking everything

## Quality Indicators

Good test design shows:

- Clear level separation
- No redundant coverage
- Fast feedback from unit tests
- Reliable integration tests
- Focused e2e tests
## Key Principles
- Test at the lowest appropriate level
- One clear owner per test
- Fast tests run first
- Mock at boundaries, not internals
- E2E for user value, not implementation
- Maintain test/production parity where critical
- Tests must be atomic and self-contained
- No shared state between tests
- Explicit assertions in test files (not helpers)
### Output 2: Story Hook Line

Print this line for the review task to quote:

Test design: docs/qa/assessments/{epic}.{story}-test-design-{YYYYMMDD}.md

For traceability: this planning document will be referenced by the trace-requirements task.
### Output 3: Test Count Summary

Print a summary for quick reference:

test_summary:
  total: {total_count}
  by_level:
    unit: {unit_count}
    integration: {int_count}
    e2e: {e2e_count}
  by_priority:
    P0: {p0_count}
    P1: {p1_count}
    P2: {p2_count}
  coverage_gaps: [] # List any ACs without tests