# test-design
Create comprehensive test scenarios with appropriate test level recommendations for story implementation.
## Inputs
```yaml
required:
  - story_id: "{epic}.{story}" # e.g., "1.3"
  - story_path: "docs/stories/{epic}.{story}.*.md"
  - story_title: "{title}" # If missing, derive from story file H1
  - story_slug: "{slug}" # If missing, derive from title (lowercase, hyphenated)
```
## Purpose
Design a complete test strategy that identifies what to test, at which level (unit/integration/e2e), and why. This ensures efficient test coverage without redundancy while maintaining appropriate test boundaries.
## Test Level Decision Framework
### Unit Tests
**When to use:**
- Testing pure functions and business logic
- Algorithm correctness
- Input validation and data transformation
- Error handling in isolated components
- Complex calculations or state machines
**Characteristics:**
- Fast execution (immediate feedback)
- No external dependencies (DB, API, file system)
- Highly maintainable and stable
- Easy to debug failures
**Example scenarios:**
```yaml
unit_test:
  component: "PriceCalculator"
  scenario: "Calculate discount with multiple rules"
  justification: "Complex business logic with multiple branches"
  mock_requirements: "None - pure function"
```
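For illustration, a minimal Vitest-style sketch of this scenario; `PriceCalculator` and `applyDiscounts` are hypothetical names, not part of any prescribed API:
```typescript
import { describe, expect, it } from 'vitest';
// Hypothetical pure class under test; names are illustrative only.
import { PriceCalculator } from './price-calculator';

describe('PriceCalculator', () => {
  // Pure logic: no mocks, no I/O, immediate feedback on failure.
  it('should stack percentage and fixed discounts in rule order', () => {
    const calculator = new PriceCalculator([
      { type: 'percentage', value: 10 }, // -10%
      { type: 'fixed', value: 5 }, // -$5
    ]);
    expect(calculator.applyDiscounts(100)).toBe(85); // 100 * 0.9 - 5
  });

  it('should never discount below zero', () => {
    const calculator = new PriceCalculator([{ type: 'fixed', value: 50 }]);
    expect(calculator.applyDiscounts(20)).toBe(0);
  });
});
```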
### Integration Tests
**When to use:**
- Testing component interactions
- Database operations and queries
- API endpoint behavior
- Service layer orchestration
- External service integration (with test doubles)
**Characteristics:**
- Moderate execution time
- May use test databases or containers
- Tests multiple components together
- Validates contracts between components
**Example scenarios:**
```yaml
integration_test:
  components: ["UserService", "UserRepository", "Database"]
  scenario: "Create user with duplicate email check"
  justification: "Tests transaction boundaries and constraint handling"
  test_doubles: "Mock email service, real test database"
```
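A sketch of what this might look like with Vitest; `UserService`, `createTestDatabase`, and the schema are hypothetical stand-ins for a project's own service layer and test helpers:
```typescript
import { describe, expect, it, vi } from 'vitest';
// Hypothetical modules; a real project would wire these to its own code.
import { UserService } from './user-service';
import { createTestDatabase } from '../test/helpers';

describe('UserService + UserRepository', () => {
  it('should reject a second user with a duplicate email', async () => {
    const db = await createTestDatabase(); // isolated per-test database
    const emailService = { sendWelcome: vi.fn() }; // test double at the boundary
    const service = new UserService(db, emailService);

    await service.createUser({ email: 'ada@example.com', name: 'Ada' });

    // The unique constraint and transaction handling are what we exercise here,
    // not the email side effect, which stays mocked.
    await expect(
      service.createUser({ email: 'ada@example.com', name: 'Imposter' })
    ).rejects.toThrow(/duplicate/i);
    expect(emailService.sendWelcome).toHaveBeenCalledTimes(1);

    await db.destroy(); // self-cleaning: tear down what the test created
  });
});
```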
### End-to-End Tests
**When to use:**
- Critical user journeys
- Cross-system workflows
- UI interaction flows
- Full stack validation
- Production-like scenario testing
**Characteristics:**
- Slow execution (keep under 90 seconds per test)
- Tests complete user scenarios
- Uses real or production-like environment
- Higher maintenance cost
- More prone to flakiness
**Example scenarios:**
```yaml
e2e_test:
  flow: "Complete purchase flow"
  scenario: "User browses, adds to cart, and completes checkout"
  justification: "Critical business flow requiring full stack validation"
  environment: "Staging with test payment gateway"
```
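As a sketch, a Playwright-style version of this flow; the routes, labels, and test IDs are assumptions for illustration:
```typescript
import { expect, test } from '@playwright/test';

// Hypothetical routes and test IDs; a real suite would use its own fixtures
// and seeded test users rather than inline data.
test('user browses, adds to cart, and completes checkout', async ({ page }) => {
  await page.goto('/products');
  await page.getByTestId('product-card').first().click();
  await page.getByRole('button', { name: 'Add to cart' }).click();
  await page.getByRole('link', { name: 'Checkout' }).click();

  // Test payment gateway sandbox card, never real payment details.
  await page.getByLabel('Card number').fill('4242 4242 4242 4242');
  await page.getByRole('button', { name: 'Place order' }).click();

  // Dynamic wait on an observable outcome, no hard sleep.
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```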
## Test Design Process
### 1. Analyze Story Requirements
Break down each acceptance criterion into testable scenarios:
```yaml
acceptance_criterion: "User can reset password via email"
test_scenarios:
  - level: unit
    what: "Password validation rules"
    why: "Complex regex and business rules"
  - level: integration
    what: "Password reset token generation and storage"
    why: "Database interaction with expiry logic"
  - level: integration
    what: "Email service integration"
    why: "External service with retry logic"
  - level: e2e
    what: "Complete password reset flow"
    why: "Critical security flow needing full validation"
```
### 2. Apply Test Level Heuristics
Use these rules to determine appropriate test levels:
```markdown
## Test Level Selection Rules
### Favor Unit Tests When:
- Logic can be isolated
- No side effects involved
- Fast feedback needed
- High cyclomatic complexity
### Favor Integration Tests When:
- Testing persistence layer
- Validating service contracts
- Testing middleware/interceptors
- Component boundaries critical
### Favor E2E Tests When:
- User-facing critical paths
- Multi-system interactions
- Regulatory compliance scenarios
- Visual regression important
### Anti-patterns to Avoid:
- E2E testing for business logic validation
- Unit testing framework behavior
- Integration testing third-party libraries
- Duplicate coverage across levels
### Duplicate Coverage Guard
**Before adding any test, check:**
1. Is this already tested at a lower level?
2. Can a unit test cover this instead of integration?
3. Can an integration test cover this instead of E2E?
**Coverage overlap is only acceptable when:**
- Testing different aspects (unit: logic, integration: interaction, e2e: user experience)
- Critical paths requiring defense in depth
- Regression prevention for previously broken functionality
```
### 3. Design Test Scenarios
**Test ID Format:** `{EPIC}.{STORY}-{LEVEL}-{SEQ}`
- Example: `1.3-UNIT-001`, `1.3-INT-002`, `1.3-E2E-001`
- Ensures traceability across all artifacts
**Naming Convention:**
- Unit: `test_{component}_{scenario}`
- Integration: `test_{flow}_{interaction}`
- E2E: `test_{journey}_{outcome}`
**Risk Linkage:**
- Tag tests with risk IDs they mitigate
- Prioritize tests for high-risk areas (P0)
- Link to risk profile when available
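One lightweight way to carry the ID and risk tags into code is to embed them in test titles; this bracketed-title convention is an illustrative assumption, not a prescribed format:
```typescript
import { describe, it } from 'vitest';

// Embedding the test ID and mitigated risk IDs in the title keeps CI reports,
// this design doc, and the risk profile traceable to one another.
describe('login rate limiting', () => {
  it('[1.3-INT-002] [SEC-001] should return 429 after the 6th failed attempt', async () => {
    // ... test body ...
  });
});
```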
For each identified test need:
```yaml
test_scenario:
  id: "1.3-INT-002"
  requirement: "AC2: Rate limiting on login attempts"
  mitigates_risks: ["SEC-001", "PERF-003"] # Links to risk profile
  priority: P0 # Based on risk score
  unit_tests:
    - name: "RateLimiter calculates window correctly"
      input: "Timestamp array"
      expected: "Correct window calculation"
  integration_tests:
    - name: "Login endpoint enforces rate limit"
      setup: "5 failed attempts"
      action: "6th attempt"
      expected: "429 response with retry-after header"
  e2e_tests:
    - name: "User sees rate limit message"
      setup: "Trigger rate limit"
      validation: "Error message displayed, retry timer shown"
```
## Deterministic Test Level Minimums
**Per Acceptance Criterion:**
- At least 1 unit test for business logic
- At least 1 integration test if multiple components interact
- At least 1 E2E test if it's a user-facing feature
**Exceptions:**
- Pure UI changes: May skip unit tests
- Pure logic changes: May skip E2E tests
- Infrastructure changes: May focus on integration tests
**When in doubt:** Start with unit tests, add integration for interactions, E2E for critical paths only.
## Test Quality Standards
### Core Testing Principles
**No Flaky Tests:** Ensure reliability through proper async handling, explicit waits, and atomic test design.
**No Hard Waits/Sleeps:** Use dynamic waiting strategies (e.g., polling, event-based triggers).
**Stateless & Parallel-Safe:** Tests run independently with no shared state; rely on cron jobs or semaphores only if unavoidable.
**No Order Dependency:** Every `it`/`describe`/`context` block works in isolation (supports `.only` execution).
**Self-Cleaning Tests:** Each test sets up its own data and deletes or deactivates any entities it creates (see the sketch after these principles).
**Tests Live Near Source Code:** Co-locate test files with the code they validate (e.g., `*.spec.js` alongside components).
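A minimal sketch combining the dynamic-wait and self-cleaning principles above, assuming Vitest-style APIs; `createOrder`, `deleteOrder`, and `getOrderStatus` are hypothetical helpers:
```typescript
import { afterEach, expect, it } from 'vitest';
import { createOrder, deleteOrder, getOrderStatus } from './test-api'; // hypothetical helpers

// Generic polling helper: retries a probe until it passes or times out,
// replacing hard sleeps with a dynamic wait.
async function waitFor<T>(
  probe: () => Promise<T>,
  predicate: (value: T) => boolean,
  timeoutMs = 5000
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const value = await probe();
    if (predicate(value)) return value;
    await new Promise((resolve) => setTimeout(resolve, 100));
  }
  throw new Error(`Condition not met within ${timeoutMs}ms`);
}

const createdOrderIds: string[] = [];

afterEach(async () => {
  // Self-cleaning: remove everything this test created, even on failure.
  await Promise.all(createdOrderIds.splice(0).map((id) => deleteOrder(id)));
});

it('should settle a new order asynchronously', async () => {
  const order = await createOrder({ sku: 'TEST-SKU', qty: 1 });
  createdOrderIds.push(order.id);
  const status = await waitFor(() => getOrderStatus(order.id), (s) => s === 'settled');
  expect(status).toBe('settled');
});
```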
### Execution Strategy
**Shifted Left:**
- Start with local environments or ephemeral stacks
- Validate functionality across all deployment stages (local → dev → stage)
**Low Maintenance:** Minimize manual upkeep (avoid brittle selectors; set up state via APIs rather than repeating UI actions).
**CI Execution Evidence:** Integrate into pipelines with clear logs/artifacts.
**Visibility:** Generate test reports (e.g., JUnit XML, HTML) for failures and trends.
### Coverage Requirements
**Release Confidence:**
- Happy Path: Core user journeys are prioritized
- Edge Cases: Critical error/validation scenarios are covered
- Feature Flags: Test both enabled and disabled states where applicable
### Test Design Rules
**Assertions:** Keep them explicit in test files; avoid abstracting them into helpers. Prefer parametrized tests over soft assertions (see the sketch after these rules).
**Naming:** Follow conventions (e.g., `describe('Component')`, `it('should do X when Y')`).
**Size:** Aim for files ≤200 lines; split/chunk large tests logically.
**Speed:** Target individual tests ≤90 seconds; optimize slow setups (e.g., shared fixtures).
**Careful Abstractions:** Favor readability over DRY when balancing helper reuse (page objects are okay, assertion logic is not).
**Test Cleanup:** Ensure tests clean up resources they create (e.g., closing browser, deleting test data).
**Deterministic Flow:** Avoid conditionals (e.g., if/else) for flow control and try/catch blocks in tests wherever possible.
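A sketch of the parametrized style referred to above, using `it.each` from Vitest/Jest; `validatePassword` and its rules are illustrative assumptions:
```typescript
import { expect, it } from 'vitest';
import { validatePassword } from './validators'; // hypothetical pure validator

// One explicit assertion per case instead of several soft assertions
// inside a single test; each row fails (and reports) independently.
it.each([
  ['short1!', false], // below minimum length
  ['nouppercase1!', false], // missing uppercase
  ['NoSpecialChar1', false], // missing symbol
  ['Valid-Password-1!', true],
])('validatePassword(%s) should return %s', (input, expected) => {
  expect(validatePassword(input)).toBe(expected);
});
```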
### API Testing Standards
- Tests must not depend on hardcoded data → use factories and per-test setup (see the sketch after this list)
- Always test both happy path and negative/error cases
- API tests should run parallel safely (no global state shared)
- Test idempotency where applicable (e.g., duplicate requests)
- Tests should clean up their data
- Response logs should only be printed in case of failure
- Auth tests must validate token expiration and renewal
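A sketch of the factory rule above; the `api` client and `/users` endpoint are hypothetical, but the shape generalizes: unique data per test, both paths covered, data cleaned up:
```typescript
import { afterEach, expect, it } from 'vitest';
import { api } from './test-client'; // hypothetical authenticated API client
import { randomUUID } from 'node:crypto';

// Factory: unique data per invocation, so parallel workers never collide.
const makeUser = (overrides: Record<string, unknown> = {}) => ({
  email: `user-${randomUUID()}@example.com`,
  name: 'Test User',
  ...overrides,
});

const createdIds: string[] = [];
afterEach(async () => {
  // Clean up only what this test created.
  await Promise.all(createdIds.splice(0).map((id) => api.delete(`/users/${id}`)));
});

it('should create a user (happy path)', async () => {
  const res = await api.post('/users', makeUser());
  expect(res.status).toBe(201);
  createdIds.push(res.body.id);
});

it('should reject an invalid email (negative path)', async () => {
  const res = await api.post('/users', makeUser({ email: 'not-an-email' }));
  expect(res.status).toBe(400);
});
```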
## Outputs
### Output 1: Test Design Document
**Save to:** `docs/qa/assessments/{epic}.{story}-test-design-{YYYYMMDD}.md`
Generate a comprehensive test design document:
```markdown
# Test Design: Story {epic}.{story}
Date: {date}
Reviewer: Quinn (Test Architect)
## Test Strategy Overview
- Total test scenarios: X
- Unit tests: Y (A%)
- Integration tests: Z (B%)
- E2E tests: W (C%)
## Test Level Rationale
[Explain why this distribution was chosen]
## Detailed Test Scenarios
### Requirement: AC1 - {description}
#### Unit Tests (3 scenarios)
1. **ID**: 1.3-UNIT-001
**Test**: Validate input format
- **Why Unit**: Pure validation logic
- **Coverage**: Input edge cases
- **Mocks**: None needed
- **Mitigates**: DATA-001 (if applicable)
#### Integration Tests (2 scenarios)
1. **ID**: 1.3-INT-001
**Test**: Service processes valid request
- **Why Integration**: Multiple components involved
- **Coverage**: Happy path + error handling
- **Test Doubles**: Mock external API
- **Mitigates**: TECH-002
#### E2E Tests (1 scenario)
1. **ID**: 1.3-E2E-001
**Test**: Complete user workflow
- **Why E2E**: Critical user journey
- **Coverage**: Full stack validation
- **Environment**: Staging
- **Max Duration**: 90 seconds
- **Mitigates**: BUS-001
[Continue for all requirements...]
## Test Data Requirements
### Unit Test Data
- Static fixtures for calculations
- Edge case values arrays
### Integration Test Data
- Test database seeds
- API response fixtures
### E2E Test Data
- Test user accounts
- Sandbox environment data
## Mock/Stub Strategy
### What to Mock
- External services (payment, email)
- Time-dependent functions
- Random number generators
### What NOT to Mock
- Core business logic
- Database in integration tests
- Critical security functions
## Test Execution Implementation
### Parallel Execution
- All unit tests: Fully parallel (stateless requirement)
- Integration tests: Parallel with isolated databases
- E2E tests: Sequential or limited parallelism
### Execution Order
1. Unit tests first (fail fast)
2. Integration tests second
3. E2E tests last (expensive, max 90 seconds each)
## Risk-Based Test Priority
### P0 - Must Have (Linked to Critical/High Risks)
- Security-related tests (SEC-* risks)
- Data integrity tests (DATA-* risks)
- Critical business flow tests (BUS-* risks)
- Tests for risks scored ≥6 in risk profile
### P1 - Should Have (Medium Risks)
- Edge case coverage
- Performance tests (PERF-* risks)
- Error recovery tests
- Tests for risks scored 4-5
### P2 - Nice to Have (Low Risks)
- UI polish tests
- Minor validation tests
- Tests for risks scored ≤3
## Test Maintenance Considerations
### High Maintenance Tests
[List tests that may need frequent updates]
### Stability Measures
- No retry strategies (tests must be deterministic)
- Dynamic waits only (no hard sleeps)
- Environment isolation
- Self-cleaning test data
## Coverage Goals
### Unit Test Coverage
- Target: 80% line coverage
- Focus: Business logic, calculations
### Integration Coverage
- Target: All API endpoints
- Focus: Contract validation
### E2E Coverage
- Target: Critical paths only
- Focus: User value delivery
```
## Test Level Smells to Flag
### Over-testing Smells
- Same logic tested at multiple levels
- E2E tests for calculations
- Integration tests for framework features
### Under-testing Smells
- No unit tests for complex logic
- Missing integration tests for data operations
- No E2E tests for critical user paths
### Wrong Level Smells
- Unit tests with real database
- E2E tests checking calculation results
- Integration tests mocking everything
## Quality Indicators
Good test design shows:
- Clear level separation
- No redundant coverage
- Fast feedback from unit tests
- Reliable integration tests
- Focused e2e tests
## Key Principles
- Test at the lowest appropriate level
- One clear owner per test
- Fast tests run first
- Mock at boundaries, not internals
- E2E for user value, not implementation
- Maintain test/production parity where critical
- Tests must be atomic and self-contained
- No shared state between tests
- Explicit assertions in test files (not helpers)
### Output 2: Story Hook Line
**Print this line for review task to quote:**
```text
Test design: docs/qa/assessments/{epic}.{story}-test-design-{YYYYMMDD}.md
```
**For traceability:** This planning document will be referenced by trace-requirements task.
### Output 3: Test Count Summary
**Print summary for quick reference:**
```yaml
test_summary:
  total: { total_count }
  by_level:
    unit: { unit_count }
    integration: { int_count }
    e2e: { e2e_count }
  by_priority:
    P0: { p0_count }
    P1: { p1_count }
    P2: { p2_count }
  coverage_gaps: [] # List any ACs without tests
```