doc: test design refinements (#1382)

This commit is contained in:
Murat K Ozcan
2026-01-23 13:00:48 -06:00
committed by GitHub
parent efbe839a0a
commit 48881f86a6
4 changed files with 687 additions and 367 deletions

View File

@@ -80,23 +80,29 @@
- [ ] Owners assigned where applicable
- [ ] No duplicate coverage (same behavior at multiple levels)

### Execution Strategy

**CRITICAL: Keep execution strategy simple, avoid redundancy**

- [ ] **Simple structure**: PR / Nightly / Weekly (NOT complex smoke/P0/P1/P2 tiers)
- [ ] **PR execution**: All functional tests unless significant infrastructure overhead
- [ ] **Nightly/Weekly**: Only performance, chaos, long-running, manual tests
- [ ] **No redundancy**: Don't re-list all tests (already in coverage plan)
- [ ] **Philosophy stated**: "Run everything in PRs if <15 min, defer only if expensive/long"
- [ ] **Playwright parallelization noted**: 100s of tests in 10-15 min
### Resource Estimates

**CRITICAL: Use intervals/ranges, NOT exact numbers**

- [ ] P0 effort provided as interval range (e.g., "~25-40 hours" NOT "36 hours")
- [ ] P1 effort provided as interval range (e.g., "~20-35 hours" NOT "27 hours")
- [ ] P2 effort provided as interval range (e.g., "~10-30 hours" NOT "15.5 hours")
- [ ] P3 effort provided as interval range (e.g., "~2-5 hours" NOT "2.5 hours")
- [ ] Total effort provided as interval range (e.g., "~55-110 hours" NOT "81 hours")
- [ ] Timeline provided as week range (e.g., "~1.5-3 weeks" NOT "11 days")
- [ ] Estimates include setup time and account for complexity variations
- [ ] **No false precision**: Avoid exact calculations like "18 tests × 2 hours = 36 hours"
### Quality Gate Criteria
@@ -126,11 +132,16 @@
### Priority Assignment Accuracy

**CRITICAL: Priority classification is separate from execution timing**

- [ ] **Priority sections (P0/P1/P2/P3) do NOT include execution context** (e.g., no "Run on every commit" in headers)
- [ ] **Priority sections have only "Criteria" and "Purpose"** (no "Execution:" field)
- [ ] **Execution Strategy section** is separate and handles timing based on infrastructure overhead
- [ ] P0: Truly blocks core functionality + High-risk (≥6) + No workaround
- [ ] P1: Important features + Medium-risk (3-4) + Common workflows
- [ ] P2: Secondary features + Low-risk (1-2) + Edge cases
- [ ] P3: Nice-to-have + Exploratory + Benchmarks
- [ ] **Note at top of Test Coverage Plan**: Clarifies P0/P1/P2/P3 = priority/risk, NOT execution timing
### Test Level Selection
@@ -176,58 +187,90 @@
- [ ] 🚨 BLOCKERS - Team Must Decide (Sprint 0 critical path items)
- [ ] HIGH PRIORITY - Team Should Validate (recommendations for approval)
- [ ] 📋 INFO ONLY - Solutions Provided (no decisions needed)
- [ ] **Risk Assessment** section - **ACTIONABLE**
- [ ] Total risks identified count
- [ ] High-priority risks table (score ≥6) with all columns: Risk ID, Category, Description, Probability, Impact, Score, Mitigation, Owner, Timeline
- [ ] Medium and low-priority risks tables
- [ ] Risk category legend included
- [ ] **Testability Concerns and Architectural Gaps** section - **ACTIONABLE**
- [ ] **Sub-section: 🚨 ACTIONABLE CONCERNS** at TOP
- [ ] Blockers to Fast Feedback table (WHAT architecture must provide)
- [ ] Architectural Improvements Needed (WHAT must be changed)
- [ ] Each concern has: Owner, Timeline, Impact
- [ ] **Sub-section: Testability Assessment Summary** at BOTTOM (FYI)
- [ ] What Works Well (passing items)
- [ ] Accepted Trade-offs (no action required)
- [ ] This section only included if worth mentioning; otherwise omitted
- [ ] **Risk Mitigation Plans** for all high-priority risks (≥6)
- [ ] Each plan has: Strategy (numbered steps), Owner, Timeline, Status, Verification
- [ ] **Only Backend/DevOps/Arch/Security mitigations** (production code changes)
- [ ] QA-owned mitigations belong in QA doc instead
- [ ] **Assumptions and Dependencies** section
- [ ] **Architectural assumptions only** (SLO targets, replication lag, system design)
- [ ] Assumptions list (numbered)
- [ ] Dependencies list with required dates
- [ ] Risks to plan with impact and contingency
- [ ] QA execution assumptions belong in QA doc instead
- [ ] **NO test implementation code** (long examples belong in QA doc)
- [ ] **NO test scripts** (no Playwright test(...) blocks, no assertions, no test setup code)
- [ ] **NO NFR test examples** (NFR sections describe WHAT to test, not HOW to test)
- [ ] **NO test scenario checklists** (belong in QA doc)
- [ ] **NO bloat or repetition** (consolidate repeated notes, avoid over-explanation)
- [ ] **Cross-references to QA doc** where appropriate (instead of duplication)
- [ ] **RECIPE SECTIONS NOT IN ARCHITECTURE DOC:**
- [ ] NO "Test Levels Strategy" section (unit/integration/E2E split belongs in QA doc only)
- [ ] NO "NFR Testing Approach" section with detailed test procedures (belongs in QA doc only)
- [ ] NO "Test Environment Requirements" section (belongs in QA doc only)
- [ ] NO "Recommendations for Sprint 0" section with test framework setup (belongs in QA doc only)
- [ ] NO "Quality Gate Criteria" section (pass rates, coverage targets belong in QA doc only)
- [ ] NO "Tool Selection" section (Playwright, k6, etc. belongs in QA doc only)
### test-design-qa.md

**NEW STRUCTURE (streamlined from 375 to ~287 lines):**

- [ ] **Purpose statement** at top (test execution recipe)
- [ ] **Executive Summary** with risk summary and coverage summary
- [ ] **Dependencies & Test Blockers** section in POSITION 2 (right after Executive Summary)
- [ ] Backend/Architecture dependencies listed (what QA needs from other teams)
- [ ] QA infrastructure setup listed (factories, fixtures, environments)
- [ ] Code example with playwright-utils if config.tea_use_playwright_utils is true
- [ ] Test from '@seontechnologies/playwright-utils/api-request/fixtures'
- [ ] Expect from '@playwright/test' (playwright-utils does not re-export expect)
- [ ] Code examples include assertions (no unused imports)
- [ ] **Risk Assessment** section (brief, references Architecture doc)
- [ ] High-priority risks table
- [ ] Medium/low-priority risks table
- [ ] Each risk shows "QA Test Coverage" column (how QA validates)
- [ ] **Test Coverage Plan** with P0/P1/P2/P3 sections
- [ ] Priority sections have ONLY "Criteria" (no execution context)
- [ ] Note at top: "P0/P1/P2/P3 = priority, NOT execution timing"
- [ ] Test tables with columns: Test ID | Requirement | Test Level | Risk Link | Notes
- [ ] **Execution Strategy** section (organized by TOOL TYPE)
- [ ] Every PR: Playwright tests (~10-15 min)
- [ ] Nightly: k6 performance tests (~30-60 min)
- [ ] Weekly: Chaos & long-running (~hours)
- [ ] Philosophy: "Run everything in PRs unless expensive/long-running"
- [ ] **QA Effort Estimate** section (QA effort ONLY)
- [ ] Interval-based estimates (e.g., "~1-2 weeks" NOT "36 hours")
- [ ] NO DevOps, Backend, Data Eng, Finance effort
- [ ] NO Sprint breakdowns (too prescriptive)
- [ ] **Appendix A: Code Examples & Tagging**
- [ ] **Appendix B: Knowledge Base References**
**REMOVED SECTIONS (bloat):**
- [ ] NO Quick Reference section (bloat)
- [ ] NO System Architecture Summary (bloat)
- [ ] NO Test Environment Requirements as separate section (integrated into Dependencies)
- [ ] NO Testability Assessment section (bloat - covered in Dependencies)
- [ ] NO Test Levels Strategy section (bloat - obvious from test scenarios)
- [ ] NO NFR Readiness Summary (bloat)
- [ ] NO Quality Gate Criteria section (teams decide for themselves)
- [ ] NO Follow-on Workflows section (bloat - BMAD commands self-explanatory)
- [ ] NO Approval section (unnecessary formality)
- [ ] NO Infrastructure/DevOps/Finance effort tables (out of scope)
- [ ] NO Sprint 0/1/2/3 breakdown tables (too prescriptive)
- [ ] NO Next Steps section (bloat)
### Cross-Document Consistency
@@ -238,6 +281,40 @@
- [ ] Dates and authors match across documents
- [ ] ADR and PRD references consistent
### Document Quality (Anti-Bloat Check)
**CRITICAL: Check for bloat and repetition across BOTH documents**
- [ ] **No repeated notes 10+ times** (e.g., "Timing is pessimistic until R-005 fixed" on every section)
- [ ] **Repeated information consolidated** (write once at top, reference briefly if needed)
- [ ] **No excessive detail** that doesn't add value (obvious concepts, redundant examples)
- [ ] **Focus on unique/critical info** (only document what's different from standard practice)
- [ ] **Architecture doc**: Concerns-focused, NOT implementation-focused
- [ ] **QA doc**: Implementation-focused, NOT theory-focused
- [ ] **Clear separation**: Architecture = WHAT and WHY, QA = HOW
- [ ] **Professional tone**: No AI slop markers
- [ ] Avoid excessive ✅/❌ emojis (use sparingly, only when adding clarity)
- [ ] Avoid "absolutely", "excellent", "fantastic", overly enthusiastic language
- [ ] Write professionally and directly
- [ ] **Architecture doc length**: Target ~150-200 lines max (focus on actionable concerns only)
- [ ] **QA doc length**: Keep concise, remove bloat sections
### Architecture Doc Structure (Actionable-First Principle)
**CRITICAL: Validate structure follows actionable-first, FYI-last principle**
- [ ] **Actionable sections at TOP:**
- [ ] Quick Guide (🚨 BLOCKERS first, then HIGH PRIORITY, then 📋 INFO ONLY last)
- [ ] Risk Assessment (high-priority risks ≥6 at top)
- [ ] Testability Concerns (concerns/blockers at top, passing items at bottom)
- [ ] Risk Mitigation Plans (for high-priority risks ≥6)
- [ ] **FYI sections at BOTTOM:**
- [ ] Testability Assessment Summary (what works well - only if worth mentioning)
- [ ] Assumptions and Dependencies
- [ ] **ASRs categorized correctly:**
- [ ] Actionable ASRs included in 🚨 or ⚠️ sections
- [ ] FYI ASRs included in 📋 section or omitted if obvious
## Completion Criteria

**All must be true:**
@@ -295,9 +372,20 @@ If workflow fails:
- **Solution**: Use test pyramid - E2E for critical paths only

**Issue**: Resource estimates too high or too precise
- **Solution**:
  - Invest in fixtures/factories to reduce per-test setup time
  - Use interval ranges (e.g., "~55-110 hours") instead of exact numbers (e.g., "81 hours")
  - Widen intervals if high uncertainty exists

**Issue**: Execution order section too complex or redundant
- **Solution**:
  - Default: Run everything in PRs (<15 min with Playwright parallelization)
  - Only defer to nightly/weekly if expensive (k6, chaos, 4+ hour tests)
  - Don't create smoke/P0/P1/P2/P3 tier structure
  - Don't re-list all tests (already in coverage plan)
### Best Practices
@@ -305,7 +393,9 @@ If workflow fails:
- High-priority risks (≥6) require immediate mitigation
- P0 tests should cover <10% of total scenarios
- Avoid testing same behavior at multiple levels
- **Use interval-based estimates** (e.g., "~25-40 hours") instead of exact numbers to avoid false precision and provide flexibility
- **Keep execution strategy simple**: Default to "run everything in PRs" (<15 min with Playwright), only defer if expensive/long-running
- **Avoid execution order redundancy**: Don't create complex tier structures or re-list tests
---

View File

@@ -157,7 +157,13 @@ TEA test-design workflow supports TWO modes, detected automatically:
1. **Review Architecture for Testability**

**STRUCTURE PRINCIPLE: CONCERNS FIRST, PASSING ITEMS LAST**
Evaluate architecture against these criteria and structure output as:
1. **Testability Concerns** (ACTIONABLE - what's broken/missing)
2. **Testability Assessment Summary** (FYI - what works well)
**Testability Criteria:**
**Controllability:**
- Can we control system state for testing? (API seeding, factories, database reset)
@@ -174,8 +180,18 @@ TEA test-design workflow supports TWO modes, detected automatically:
- Can we reproduce failures? (deterministic waits, HAR capture, seed data)
- Are components loosely coupled? (mockable, testable boundaries)
**In Architecture Doc Output:**
- **Section A: Testability Concerns** (TOP) - List what's BROKEN or MISSING
- Example: "No API for test data seeding → Cannot parallelize tests"
- Example: "Hardcoded DB connection → Cannot test in CI"
- **Section B: Testability Assessment Summary** (BOTTOM) - List what PASSES
- Example: "✅ API-first design supports test isolation"
- Only include if worth mentioning; otherwise omit this section entirely
2. **Identify Architecturally Significant Requirements (ASRs)**
**CRITICAL: ASRs must indicate if ACTIONABLE or FYI**
From PRD NFRs and architecture decisions, identify quality requirements that:
- Drive architecture decisions (e.g., "Must handle 10K concurrent users" → caching architecture)
- Pose testability challenges (e.g., "Sub-second response time" → performance test infrastructure)
@@ -183,21 +199,60 @@ TEA test-design workflow supports TWO modes, detected automatically:
Score each ASR using risk matrix (probability × impact).
**In Architecture Doc, categorize ASRs:**
- **ACTIONABLE ASRs** (require architecture changes): Include in "Quick Guide" 🚨 or ⚠️ sections
- **FYI ASRs** (already satisfied by architecture): Include in "Quick Guide" 📋 section OR omit if obvious
**Example:**
- ASR-001 (Score 9): "Multi-region deployment requires region-specific test infrastructure" → **ACTIONABLE** (goes in 🚨 BLOCKERS)
- ASR-002 (Score 4): "OAuth 2.1 authentication already implemented in ADR-5" → **FYI** (goes in 📋 INFO ONLY or omit)
**Structure Principle:** Actionable ASRs at TOP, FYI ASRs at BOTTOM (or omit)
3. **Define Test Levels Strategy**
**IMPORTANT: This section goes in QA doc ONLY, NOT in Architecture doc**
Based on architecture (mobile, web, API, microservices, monolith):
- Recommend unit/integration/E2E split (e.g., 70/20/10 for API-heavy, 40/30/30 for UI-heavy)
- Identify test environment needs (local, staging, ephemeral, production-like)
- Define testing approach per technology (Playwright for web, Maestro for mobile, k6 for performance)

**In Architecture doc:** Only mention test level split if it's an ACTIONABLE concern
- Example: "API response time <100ms requires load testing infrastructure" (concern)
- DO NOT include full test level strategy table in Architecture doc
4. **Assess NFR Requirements (MINIMAL in Architecture Doc)**

**CRITICAL: NFR testing approach is a RECIPE - belongs in QA doc ONLY**
**In Architecture Doc:**
- Only mention NFRs if they create testability CONCERNS
- Focus on WHAT architecture must provide, not HOW to test
- Keep it brief - 1-2 sentences per NFR category at most
**Example - Security NFR in Architecture doc (if there's a concern):**
CORRECT (concern-focused, brief, WHAT/WHY only):
- "System must prevent cross-customer data access (GDPR requirement). Requires test infrastructure for multi-tenant isolation in Sprint 0."
- "OAuth tokens must expire after 1 hour (ADR-5). Requires test harness for token expiration validation."
INCORRECT (too detailed, belongs in QA doc):
- Full table of security test scenarios
- Test scripts with code examples
- Detailed test procedures
- Tool selection (e.g., "use Playwright E2E + OWASP ZAP")
- Specific test approaches (e.g., "Test approach: Playwright E2E for auth/authz")
**In QA Doc (full NFR testing approach):**
- **Security**: Full test scenarios, tooling (Playwright + OWASP ZAP), test procedures
- **Performance**: Load/stress/spike test scenarios, k6 scripts, SLO thresholds
- **Reliability**: Error handling tests, retry logic validation, circuit breaker tests
- **Maintainability**: Coverage targets, code quality gates, observability validation
**Rule of Thumb:**
- Architecture doc: "What NFRs exist and what concerns they create" (1-2 sentences)
- QA doc: "How to test those NFRs" (full sections with tables, code, procedures)
5. **Flag Testability Concerns**

Identify architecture decisions that harm testability:
@@ -228,22 +283,54 @@ TEA test-design workflow supports TWO modes, detected automatically:
**Standard Structures (REQUIRED):**

**test-design-architecture.md sections (in this order):**

**STRUCTURE PRINCIPLE: Actionable items FIRST, FYI items LAST**

1. Executive Summary (scope, business context, architecture, risk summary)
2. Quick Guide (🚨 BLOCKERS / HIGH PRIORITY / 📋 INFO ONLY)
3. Risk Assessment (high/medium/low-priority risks with scoring) - **ACTIONABLE**
4. Testability Concerns and Architectural Gaps - **ACTIONABLE** (what arch team must do)
   - Sub-section: Blockers to Fast Feedback (ACTIONABLE - concerns FIRST)
   - Sub-section: Architectural Improvements Needed (ACTIONABLE)
   - Sub-section: Testability Assessment Summary (FYI - passing items LAST, only if worth mentioning)
5. Risk Mitigation Plans (detailed for high-priority risks ≥6) - **ACTIONABLE**
6. Assumptions and Dependencies - **FYI**
**SECTIONS THAT DO NOT BELONG IN ARCHITECTURE DOC:**
- Test Levels Strategy (unit/integration/E2E split) - This is a RECIPE, belongs in QA doc ONLY
- NFR Testing Approach with test examples - This is a RECIPE, belongs in QA doc ONLY
- Test Environment Requirements - This is a RECIPE, belongs in QA doc ONLY
- Recommendations for Sprint 0 (test framework setup, factories) - This is a RECIPE, belongs in QA doc ONLY
- Quality Gate Criteria (pass rates, coverage targets) - This is a RECIPE, belongs in QA doc ONLY
- Tool Selection (Playwright, k6, etc.) - This is a RECIPE, belongs in QA doc ONLY
**WHAT BELONGS IN ARCHITECTURE DOC:**
- Testability CONCERNS (what makes it hard to test)
- Architecture GAPS (what's missing for testability)
- What architecture team must DO (blockers, improvements)
- Risks and mitigation plans
- ASRs (Architecturally Significant Requirements) - but clarify if FYI or actionable
**test-design-qa.md sections (in this order):**

1. Executive Summary (risk summary, coverage summary)
2. **Dependencies & Test Blockers** (CRITICAL: RIGHT AFTER SUMMARY - what QA needs from other teams)
3. Risk Assessment (scored risks with categories - reference Arch doc, don't duplicate)
4. Test Coverage Plan (P0/P1/P2/P3 with detailed scenarios + checkboxes)
5. **Execution Strategy** (SIMPLE: Organized by TOOL TYPE: PR (Playwright) / Nightly (k6) / Weekly (chaos/manual))
6. QA Effort Estimate (QA effort ONLY - no DevOps, Data Eng, Finance, Backend)
7. Appendices (code examples with playwright-utils, tagging strategy, knowledge base refs)
**SECTIONS TO EXCLUDE FROM QA DOC:**
- Quality Gate Criteria (pass/fail thresholds - teams decide for themselves)
- Follow-on Workflows (bloat - BMAD commands are self-explanatory)
- Approval section (unnecessary formality)
- Test Environment Requirements (remove as separate section - integrate into Dependencies if needed)
- NFR Readiness Summary (bloat - covered in Risk Assessment)
- Testability Assessment (bloat - covered in Dependencies)
- Test Levels Strategy (bloat - obvious from test scenarios)
- Sprint breakdowns (too prescriptive)
- Infrastructure/DevOps/Data Eng effort tables (out of scope)
- Mitigation plans for non-QA work (belongs in Arch doc)
**Content Guidelines:**
@@ -252,26 +339,46 @@ TEA test-design workflow supports TWO modes, detected automatically:
- Clear ownership (each blocker/ASR has owner + timeline)
- Testability requirements (what architecture must support)
- Mitigation plans (for each high-risk item ≥6)
- Brief conceptual examples ONLY if needed to clarify architecture concerns (5-10 lines max)
- **Target length**: ~150-200 lines max (focus on actionable concerns only)
- **Professional tone**: Avoid AI slop (excessive ✅/❌ emojis, "absolutely", "excellent", overly enthusiastic language)
**Architecture doc (DON'T) - CRITICAL:**
- ❌ NO test scripts or test implementation code AT ALL - this is a communication doc for architects, not a testing guide
- ❌ NO Playwright test examples (e.g., test('...', async ({ request }) => ...))
- ❌ NO assertion logic (e.g., expect(...).toBe(...))
- ❌ NO test scenario checklists with checkboxes (belongs in QA doc)
- ❌ NO implementation details about HOW QA will test
- ❌ Focus on CONCERNS, not IMPLEMENTATION
**QA doc (DO):**
- ✅ Test scenario recipes (clear P0/P1/P2/P3 with checkboxes)
- ✅ Full test implementation code samples when helpful (see the sketch below)
- ✅ **IMPORTANT: If config.tea_use_playwright_utils is true, ALL code samples MUST use @seontechnologies/playwright-utils fixtures and utilities**
- ✅ Import test fixtures from '@seontechnologies/playwright-utils/api-request/fixtures'
- ✅ Import expect from '@playwright/test' (playwright-utils does not re-export expect)
- ✅ Use apiRequest fixture with schema validation, retry logic, and structured responses
- ✅ Dependencies & Test Blockers section RIGHT AFTER Executive Summary (what QA needs from other teams)
- ✅ **QA effort estimates ONLY** (no DevOps, Data Eng, Finance, Backend effort - out of scope)
- ✅ Cross-references to Architecture doc (not duplication)
- ✅ **Professional tone**: Avoid AI slop (excessive ✅/❌ emojis, "absolutely", "excellent", overly enthusiastic language)
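A minimal illustrative sketch of such a sample, assuming config.tea_use_playwright_utils is true (the endpoint, payload fields, and tags are placeholders, not prescribed by the library):

```typescript
import { test } from '@seontechnologies/playwright-utils/api-request/fixtures';
import { expect } from '@playwright/test'; // playwright-utils does not re-export expect
import { faker } from '@faker-js/faker';

// Hypothetical endpoint and payload, shown only to illustrate the required pattern.
test('@P1 @API creates a resource with randomized data', async ({ apiRequest }) => {
  const payload = { id: `test-${faker.string.uuid()}`, name: faker.company.name() };

  const { status, body } = await apiRequest({
    method: 'POST',
    path: '/api/resource',
    body: payload,
  });

  // Assertions are included so no import is left unused.
  expect(status).toBe(201);
  expect(body).toHaveProperty('id', payload.id);
});
```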
**QA doc (DON'T):**
- ❌ NO architectural theory (just reference Architecture doc)
- ❌ NO ASR explanations (link to Architecture doc instead)
- ❌ NO duplicate risk assessments (reference Architecture doc)
- ❌ NO Quality Gate Criteria section (teams decide pass/fail thresholds for themselves)
- ❌ NO Follow-on Workflows section (bloat - BMAD commands are self-explanatory)
- ❌ NO Approval section (unnecessary formality)
- ❌ NO effort estimates for other teams (DevOps, Backend, Data Eng, Finance - out of scope, QA effort only)
- ❌ NO Sprint breakdowns (too prescriptive - e.g., "Sprint 0: 40 hours, Sprint 1: 48 hours")
- ❌ NO mitigation plans for Backend/Arch/DevOps work (those belong in Architecture doc)
- ❌ NO architectural assumptions or debates (those belong in Architecture doc)
**Anti-Patterns to Avoid (Cross-Document Redundancy):**
**CRITICAL: NO BLOAT, NO REPETITION, NO OVERINFO**
**DON'T duplicate OAuth requirements:**
- Architecture doc: Explain OAuth 2.1 flow in detail
- QA doc: Re-explain why OAuth 2.1 is required
@@ -280,6 +387,24 @@ TEA test-design workflow supports TWO modes, detected automatically:
- Architecture doc: "ASR-1: OAuth 2.1 required (see QA doc for 12 test scenarios)" - Architecture doc: "ASR-1: OAuth 2.1 required (see QA doc for 12 test scenarios)"
- QA doc: "OAuth tests: 12 P0 scenarios (see Architecture doc R-001 for risk details)" - QA doc: "OAuth tests: 12 P0 scenarios (see Architecture doc R-001 for risk details)"
**DON'T repeat the same note 10+ times:**
- Example: "Timing is pessimistic until R-005 is fixed" repeated on every P0, P1, P2 section
- This creates bloat and makes docs hard to read
**DO consolidate repeated information:**
- Write once at the top: "**Note**: All timing estimates are pessimistic pending R-005 resolution"
- Reference briefly if needed: "(pessimistic timing)"
**DON'T include excessive detail that doesn't add value:**
- Long explanations of obvious concepts
- Redundant examples showing the same pattern
- Over-documentation of standard practices
**DO focus on what's unique or critical:**
- Document only what's different from standard practice
- Highlight critical decisions and risks
- Keep explanations concise and actionable
**Markdown Cross-Reference Syntax Examples:**

```markdown
@@ -330,6 +455,24 @@ TEA test-design workflow supports TWO modes, detected automatically:
- Cross-reference between docs (no duplication)
- Validate against checklist.md (System-Level Mode section)
**Common Over-Engineering to Avoid:**
**In QA Doc:**
1. ❌ Quality gate thresholds ("P0 must be 100%, P1 ≥95%") - Let teams decide for themselves
2. ❌ Effort estimates for other teams - QA doc should only estimate QA effort
3. ❌ Sprint breakdowns ("Sprint 0: 40 hours, Sprint 1: 48 hours") - Too prescriptive
4. ❌ Approval sections - Unnecessary formality
5. ❌ Assumptions about architecture (SLO targets, replication lag) - These are architectural concerns, belong in Arch doc
6. ❌ Mitigation plans for Backend/Arch/DevOps - Those belong in Arch doc
7. ❌ Follow-on workflows section - Bloat, BMAD commands are self-explanatory
8. ❌ NFR Readiness Summary - Bloat, covered in Risk Assessment
**Test Coverage Numbers Reality Check:**
- With Playwright parallelization, running ALL Playwright tests is as fast as running just P0
- Don't split Playwright tests by priority into different CI gates - it adds no value
- Tool type matters, not priority labels
- Defer based on infrastructure cost, not importance
**After System-Level Mode:** Workflow COMPLETE. System-level outputs (test-design-architecture.md + test-design-qa.md) are written in this step. Steps 2-4 are epic-level only - do NOT execute them in system-level mode.

---
@@ -540,12 +683,51 @@ TEA test-design workflow supports TWO modes, detected automatically:
8. **Plan Mitigations**
**CRITICAL: Mitigation placement depends on WHO does the work**
For each high-priority risk:
- Define mitigation strategy
- Assign owner (dev, QA, ops)
- Set timeline
- Update residual risk expectation
**Mitigation Plan Placement:**
**Architecture Doc:**
- Mitigations owned by Backend, DevOps, Architecture, Security, Data Eng
- Example: "Add authorization layer for customer-scoped access" (Backend work)
- Example: "Configure AWS Fault Injection Simulator" (DevOps work)
- Example: "Define CloudWatch log schema for backfill events" (Architecture work)
**QA Doc:**
- Mitigations owned by QA (test development work)
- Example: "Create factories for test data with randomization" (QA work)
- Example: "Implement polling with retry for async validation" (QA test code)
- Brief reference to Architecture doc mitigations (don't duplicate)
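A minimal sketch of what the polling-with-retry mitigation above could look like as QA test code (endpoint, record ID, and expected state are placeholders):

```typescript
import { test } from '@seontechnologies/playwright-utils/api-request/fixtures';
import { expect } from '@playwright/test';

// Hypothetical async workflow: poll until the backend reports the record as processed.
test('@P1 async validation with polling and retry', async ({ apiRequest }) => {
  await expect
    .poll(
      async () => {
        const { body } = await apiRequest({ method: 'GET', path: '/api/resource/123' });
        return body.status;
      },
      { timeout: 30_000, intervals: [1_000, 2_000, 5_000] },
    )
    .toBe('processed');
});
```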
**Rule of Thumb:**
- If mitigation requires production code changes → Architecture doc
- If mitigation is test infrastructure/code → QA doc
- If mitigation involves multiple teams → Architecture doc with QA validation approach
**Assumptions Placement:**
**Architecture Doc:**
- Architectural assumptions (SLO targets, replication lag, system design assumptions)
- Example: "P95 <500ms inferred from <2s timeout (requires Product approval)"
- Example: "Multi-region replication lag <1s assumed (ADR doesn't specify SLA)"
- Example: "Recent Cache hit ratio >80% assumed (not in PRD/ADR)"
**QA Doc:**
- Test execution assumptions (test infrastructure readiness, test data availability)
- Example: "Assumes test factories already created"
- Example: "Assumes CI/CD pipeline configured"
- Brief reference to Architecture doc for architectural assumptions
**Rule of Thumb:**
- If assumption is about system architecture/design → Architecture doc
- If assumption is about test infrastructure/execution → QA doc
---

## Step 3: Design Test Coverage
@@ -594,6 +776,8 @@ TEA test-design workflow supports TWO modes, detected automatically:
3. **Assign Priority Levels**
**CRITICAL: P0/P1/P2/P3 indicates priority and risk level, NOT execution timing**
**Knowledge Base Reference**: `test-priorities-matrix.md`

**P0 (Critical)**:
@@ -601,25 +785,28 @@ TEA test-design workflow supports TWO modes, detected automatically:
- High-risk areas (score ≥6)
- Revenue-impacting
- Security-critical
- No workaround exists
- Affects majority of users

**P1 (High)**:
- Important user features
- Medium-risk areas (score 3-4)
- Common workflows
- Workaround exists but difficult

**P2 (Medium)**:
- Secondary features
- Low-risk areas (score 1-2)
- Edge cases
- Regression prevention

**P3 (Low)**:
- Nice-to-have
- Exploratory
- Performance benchmarks
- Documentation validation
**NOTE:** Priority classification is separate from execution timing. A P1 test might run in PRs if it's fast, or nightly if it requires expensive infrastructure (e.g., k6 performance test). See "Execution Strategy" section for timing guidance.
4. **Outline Data and Tooling Prerequisites**
@@ -629,13 +816,55 @@ TEA test-design workflow supports TWO modes, detected automatically:
- Environment setup
- Tools and dependencies
5. **Define Execution Strategy** (Keep It Simple)

**IMPORTANT: Avoid over-engineering execution order**

**Default Philosophy:**
- Run **everything** in PRs if total duration <15 minutes
- Playwright is fast with parallelization (100s of tests in ~10-15 min)
- Only defer to nightly/weekly if there's significant overhead:
  - Performance tests (k6, load testing) - expensive infrastructure
  - Chaos engineering - requires special setup (AWS FIS)
  - Long-running tests - endurance (4+ hours), disaster recovery
  - Manual tests - require human intervention
**Simple Execution Strategy (Organized by TOOL TYPE):**
```markdown
## Execution Strategy
**Philosophy**: Run everything in PRs unless significant infrastructure overhead.
Playwright with parallelization is extremely fast (100s of tests in ~10-15 min).
**Organized by TOOL TYPE:**
### Every PR: Playwright Tests (~10-15 min)
All functional tests (from any priority level):
- All E2E, API, integration, unit tests using Playwright
- Parallelized across {N} shards
- Total: ~{N} tests (includes P0, P1, P2, P3)
### Nightly: k6 Performance Tests (~30-60 min)
All performance tests (from any priority level):
- Load, stress, spike, endurance
- Reason: Expensive infrastructure, long-running (10-40 min per test)
### Weekly: Chaos & Long-Running (~hours)
Special infrastructure tests (from any priority level):
- Multi-region failover, disaster recovery, endurance
- Reason: Very expensive, very long (4+ hours)
```
**KEY INSIGHT: Organize by TOOL TYPE, not priority**
- Playwright (fast, cheap) → PR
- k6 (expensive, long) → Nightly
- Chaos/Manual (very expensive, very long) → Weekly
**Avoid:**
- ❌ Don't organize by priority (smoke → P0 → P1 → P2 → P3)
- ❌ Don't say "P1 runs on PR to main" (some P1 are Playwright/PR, some are k6/Nightly)
- ❌ Don't create artificial tiers - organize by tool type and infrastructure overhead
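To make the "run everything in PRs" default concrete, a minimal Playwright configuration sketch (worker and shard counts are illustrative assumptions, not recommendations):

```typescript
// playwright.config.ts - illustrative sketch only; tune workers/shards to your CI runners.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,                       // run tests within files in parallel
  workers: process.env.CI ? 4 : undefined,   // per-machine parallelism
  retries: process.env.CI ? 1 : 0,
  reporter: [['list'], ['html', { open: 'never' }]],
});

// In CI, split the suite across machines, e.g.:
//   npx playwright test --shard=1/4
//   npx playwright test --shard=2/4  (etc.)
```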
---
@@ -661,34 +890,66 @@ TEA test-design workflow supports TWO modes, detected automatically:
| Login flow | E2E | P0 | R-001 | 3 | QA |
```

3. **Document Execution Strategy** (Simple, Not Redundant)

**IMPORTANT: Keep execution strategy simple and avoid redundancy**

```markdown
## Execution Strategy

**Default: Run all functional tests in PRs (~10-15 min)**
- All Playwright tests (parallelized across 4 shards)
- Includes E2E, API, integration, unit tests
- Total: ~{N} tests

**Nightly: Performance & Infrastructure tests**
- k6 load/stress/spike tests (~30-60 min)
- Reason: Expensive infrastructure, long-running

**Weekly: Chaos & Disaster Recovery**
- Endurance tests (4+ hours)
- Multi-region failover (requires AWS FIS)
- Backup restore validation
- Reason: Special infrastructure, very long-running
```
**DO NOT:**
- ❌ Create redundant smoke/P0/P1/P2/P3 tier structure
- ❌ List all tests again in execution order (already in coverage plan)
- ❌ Split tests by priority unless there's infrastructure overhead
4. **Include Resource Estimates**
**IMPORTANT: Use intervals/ranges, not exact numbers**
Provide rough estimates with intervals to avoid false precision:
```markdown
### Test Effort Estimates
- P0 scenarios: 15 tests (~1.5-2.5 hours each) = **~25-40 hours**
- P1 scenarios: 25 tests (~0.75-1.5 hours each) = **~20-35 hours**
- P2 scenarios: 40 tests (~0.25-0.75 hours each) = **~10-30 hours**
- **Total:** **~55-105 hours** (~1.5-3 weeks with 1 QA engineer)
```
**Why intervals:**
- Avoids false precision (estimates are never exact)
- Provides flexibility for complexity variations
- Accounts for unknowns and dependencies
- More realistic and less prescriptive
**Guidelines:**
- P0 tests: 1.5-2.5 hours each (complex setup, security, performance)
- P1 tests: 0.75-1.5 hours each (standard integration, API tests)
- P2 tests: 0.25-0.75 hours each (edge cases, simple validation)
- P3 tests: 0.1-0.5 hours each (exploratory, documentation)
**Express totals as:**
- Hour ranges: "~55-105 hours"
- Week ranges: "~1.5-3 weeks"
- Avoid: Exact numbers like "75 hours" or "11 days"
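As a sanity check, interval totals can be derived mechanically from the per-test ranges above; a small illustrative sketch (numbers mirror the example, and rounding or widening remains a judgment call):

```typescript
// Illustrative only: derive an interval-based total from per-test effort ranges.
type Range = { min: number; max: number }; // hours per test

const perTest: Record<string, Range> = {
  P0: { min: 1.5, max: 2.5 },
  P1: { min: 0.75, max: 1.5 },
  P2: { min: 0.25, max: 0.75 },
};
const counts: Record<string, number> = { P0: 15, P1: 25, P2: 40 };

const total = Object.keys(counts).reduce(
  (acc, p) => ({
    min: acc.min + counts[p] * perTest[p].min,
    max: acc.max + counts[p] * perTest[p].max,
  }),
  { min: 0, max: 0 },
);

// Prints "~51-105 hours"; round and widen further in the doc (e.g., "~55-105 hours").
console.log(`~${Math.round(total.min)}-${Math.round(total.max)} hours`);
```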
5. **Add Gate Criteria**

```markdown

View File

@@ -108,54 +108,51 @@
### Testability Concerns and Architectural Gaps

**🚨 ACTIONABLE CONCERNS - Architecture Team Must Address**

{If system has critical testability concerns, list them here. If architecture supports testing well, state "No critical testability concerns identified" and skip to Testability Assessment Summary}

#### 1. Blockers to Fast Feedback (WHAT WE NEED FROM ARCHITECTURE)

| Concern | Impact | What Architecture Must Provide | Owner | Timeline |
|---------|--------|--------------------------------|-------|----------|
| **{Concern name}** | {Impact on testing} | {Specific architectural change needed} | {Team} | {Sprint} |

**Example:**
- **No API for test data seeding** → Cannot parallelize tests → Provide POST /test/seed endpoint (Backend, Sprint 0)

#### 2. Architectural Improvements Needed (WHAT SHOULD BE CHANGED)

{List specific improvements that would make the system more testable}

1. **{Improvement name}**
   - **Current problem**: {What's wrong}
   - **Required change**: {What architecture must do}
   - **Impact if not fixed**: {Consequences}
   - **Owner**: {Team}
   - **Timeline**: {Sprint}

---

### Testability Assessment Summary

**📊 CURRENT STATE - FYI**
{Only include this section if there are passing items worth mentioning. Otherwise omit.}
#### What Works Well
- ✅ {Passing item 1} (e.g., "API-first design supports parallel test execution")
- ✅ {Passing item 2} (e.g., "Feature flags enable test isolation")
- ✅ {Passing item 3}
#### Accepted Trade-offs (No Action Required)
For {Feature} Phase 1, the following trade-offs are acceptable:
- **{Trade-off 1}** - {Why acceptable for now}
- **{Trade-off 2}** - {Why acceptable for now}
{This is technical debt OR acceptable for Phase 1} that {should be revisited post-GA OR maintained as-is}
---

View File

@@ -1,314 +1,286 @@
# Test Design for QA: {Feature Name}

**Purpose:** Test execution recipe for QA team. Defines what to test, how to test it, and what QA needs from other teams.

**Date:** {date}
**Author:** {author}
**Status:** Draft
**Project:** {project_name}

**Related:** See Architecture doc (test-design-architecture.md) for testability concerns and architectural blockers.

---

## Executive Summary

**Scope:** {Brief description of testing scope}

**Risk Summary:**
- Total Risks: {N} ({X} high-priority score ≥6, {Y} medium, {Z} low)
- Critical Categories: {Categories with most high-priority risks}

**Coverage Summary:**
- P0 tests: ~{N} (critical paths, security)
- P1 tests: ~{N} (important features, integration)
- P2 tests: ~{N} (edge cases, regression)
- P3 tests: ~{N} (exploratory, benchmarks)
- **Total**: ~{N} tests (~{X}-{Y} weeks with 1 QA)
---

## Dependencies & Test Blockers

**CRITICAL:** QA cannot proceed without these items from other teams.

### Backend/Architecture Dependencies (Sprint 0)

**Source:** See Architecture doc "Quick Guide" for detailed mitigation plans

1. **{Dependency 1}** - {Team} - {Timeline}
   - {What QA needs}
   - {Why it blocks testing}
2. **{Dependency 2}** - {Team} - {Timeline}
   - {What QA needs}
   - {Why it blocks testing}

### QA Infrastructure Setup (Sprint 0)

1. **Test Data Factories** - QA
   - {Entity} factory with faker-based randomization
   - Auto-cleanup fixtures for parallel safety (see the sketches below)
2. **Test Environments** - QA
   - Local: {Setup details}
   - CI/CD: {Setup details}
   - Staging: {Setup details}

**Example factory pattern:**

```typescript
import { test } from '@seontechnologies/playwright-utils/api-request/fixtures';
import { expect } from '@playwright/test';
import { faker } from '@faker-js/faker';

test('example test @p0', async ({ apiRequest }) => {
  const testData = {
    id: `test-${faker.string.uuid()}`,
    email: faker.internet.email(),
  };

  const { status } = await apiRequest({
    method: 'POST',
    path: '/api/resource',
    body: testData,
  });

  expect(status).toBe(201);
});
```
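**Example auto-cleanup fixture** (illustrative sketch only; assumes the exported `test` object supports Playwright's standard `extend`, and the endpoint paths are placeholders):

```typescript
import { test as base } from '@seontechnologies/playwright-utils/api-request/fixtures';
import { faker } from '@faker-js/faker';

// Illustrative sketch: each test gets a unique record and deletes it afterwards,
// so parallel runs against a shared database never collide.
export const test = base.extend<{ seededResource: { id: string } }>({
  seededResource: async ({ apiRequest }, use) => {
    const resource = { id: `test-${faker.string.uuid()}` };
    await apiRequest({ method: 'POST', path: '/api/resource', body: resource });

    await use(resource);

    // Cleanup runs after the test, even on failure.
    await apiRequest({ method: 'DELETE', path: `/api/resource/${resource.id}` });
  },
});
```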
---

## Risk Assessment

**Note:** Full risk details in Architecture doc. This section summarizes risks relevant to QA test planning.

### High-Priority Risks (Score ≥6)

| Risk ID | Category | Description | Score | QA Test Coverage |
|---------|----------|-------------|-------|------------------|
| **{R-ID}** | {CAT} | {Brief description} | **{Score}** | {How QA validates this risk} |

### Medium/Low-Priority Risks

| Risk ID | Category | Description | Score | QA Test Coverage |
|---------|----------|-------------|-------|------------------|
| {R-ID} | {CAT} | {Brief description} | {Score} | {How QA validates this risk} |
---

## Test Coverage Plan

**IMPORTANT:** P0/P1/P2/P3 = **priority and risk level** (what to focus on if time-constrained), NOT execution timing. See "Execution Strategy" for when tests run.

### P0 (Critical)

**Criteria:** Blocks core functionality + High risk (≥6) + No workaround + Affects majority of users

| Test ID | Requirement | Test Level | Risk Link | Notes |
|---------|-------------|------------|-----------|-------|
| **P0-001** | {Requirement} | {Level} | {R-ID} | {Notes} |
| **P0-002** | {Requirement} | {Level} | {R-ID} | {Notes} |

**Total P0:** ~{N} tests

---

### P1 (High)

**Criteria:** Important features + Medium risk (3-4) + Common workflows + Workaround exists but difficult

| Test ID | Requirement | Test Level | Risk Link | Notes |
|---------|-------------|------------|-----------|-------|
| **P1-001** | {Requirement} | {Level} | {R-ID} | {Notes} |
| **P1-002** | {Requirement} | {Level} | {R-ID} | {Notes} |

**Total P1:** ~{N} tests

---

### P2 (Medium)

**Criteria:** Secondary features + Low risk (1-2) + Edge cases + Regression prevention

| Test ID | Requirement | Test Level | Risk Link | Notes |
|---------|-------------|------------|-----------|-------|
| **P2-001** | {Requirement} | {Level} | {R-ID} | {Notes} |

**Total P2:** ~{N} tests

---

### P3 (Low)

**Criteria:** Nice-to-have + Exploratory + Performance benchmarks + Documentation validation

| Test ID | Requirement | Test Level | Notes |
|---------|-------------|------------|-------|
| **P3-001** | {Requirement} | {Level} | {Notes} |

**Total P3:** ~{N} tests

---

## Execution Strategy

**Philosophy:** Run everything in PRs unless there's significant infrastructure overhead. Playwright with parallelization is extremely fast (100s of tests in ~10-15 min).

**Organized by TOOL TYPE:**
### Every PR: Playwright Tests (~10-15 min)
**All functional tests** (from any priority level):
- All E2E, API, integration, unit tests using Playwright
- Parallelized across {N} shards
- Total: ~{N} Playwright tests (includes P0, P1, P2, P3)
**Why run in PRs:** Fast feedback, no expensive infrastructure
### Nightly: k6 Performance Tests (~30-60 min)
**All performance tests** (from any priority level):
- Load, stress, spike, endurance tests
- Total: ~{N} k6 tests (may include P0, P1, P2)
**Why defer to nightly:** Expensive infrastructure (k6 Cloud), long-running (10-40 min per test)
### Weekly: Chaos & Long-Running (~hours)
**Special infrastructure tests** (from any priority level):
- Multi-region failover (requires AWS Fault Injection Simulator)
- Disaster recovery (backup restore, 4+ hours)
- Endurance tests (4+ hours runtime)
**Why defer to weekly:** Very expensive infrastructure, very long-running, infrequent validation sufficient
**Manual tests** (excluded from automation):
- DevOps validation (deployment, monitoring)
- Finance validation (cost alerts)
- Documentation validation
---

## QA Effort Estimate

**QA test development effort only** (excludes DevOps, Backend, Data Eng, Finance work):

| Priority | Count | Effort Range | Notes |
|----------|-------|--------------|-------|
| P0 | ~{N} | ~{X}-{Y} weeks | Complex setup (security, performance, multi-step) |
| P1 | ~{N} | ~{X}-{Y} weeks | Standard coverage (integration, API tests) |
| P2 | ~{N} | ~{X}-{Y} days | Edge cases, simple validation |
| P3 | ~{N} | ~{X}-{Y} days | Exploratory, benchmarks |
| **Total** | ~{N} | **~{X}-{Y} weeks** | **1 QA engineer, full-time** |

**Assumptions:**
- Includes test design, implementation, debugging, CI integration
- Excludes ongoing maintenance (~10% effort)
- Assumes test infrastructure (factories, fixtures) ready

**Dependencies from other teams:**
- See "Dependencies & Test Blockers" section for what QA needs from Backend, DevOps, Data Eng

---
## Appendix A: Code Examples & Tagging

**Playwright Tags for Selective Execution:**

```typescript
import { test } from '@seontechnologies/playwright-utils/api-request/fixtures';
import { expect } from '@playwright/test';

// P0 critical test
test('@P0 @API @Security unauthenticated request returns 401', async ({ apiRequest }) => {
  const { status, body } = await apiRequest({
    method: 'POST',
    path: '/api/endpoint',
    body: { data: 'test' },
    skipAuth: true,
  });

  expect(status).toBe(401);
  expect(body.error).toContain('unauthorized');
});

// P1 integration test
test('@P1 @Integration data syncs correctly', async ({ apiRequest }) => {
  // Seed data
  await apiRequest({
    method: 'POST',
    path: '/api/seed',
    body: { /* test data */ },
  });

  // Validate
  const { status, body } = await apiRequest({
    method: 'GET',
    path: '/api/resource',
  });

  expect(status).toBe(200);
  expect(body).toHaveProperty('data');
});
```
**Run specific tags:**
```bash
# Run only P0 tests
npx playwright test --grep @P0
# Run P0 + P1 tests
npx playwright test --grep "@P0|@P1"
# Run only security tests
npx playwright test --grep @Security
# Run all Playwright tests in PR (default)
npx playwright test
```
---
## Appendix B: Knowledge Base References

- **Risk Governance**: `risk-governance.md` - Risk scoring methodology
- **Test Priorities Matrix**: `test-priorities-matrix.md` - P0-P3 criteria
- **Test Levels Framework**: `test-levels-framework.md` - E2E vs API vs Unit selection
- **Test Quality**: `test-quality.md` - Definition of Done (no hard waits, <300 lines, <1.5 min)

---

**Generated by:** BMad TEA Agent
**Workflow:** `_bmad/bmm/testarch/test-design`
**Version:** 4.0 (BMad v6)