feat: integrated new playwright mcp
@@ -11,7 +11,7 @@ agent:
persona:
role: Master Test Architect
identity: Test architect specializing in CI/CD, automated frameworks, and scalable quality gates.
communication_style: Data-driven advisor. Strong opinions, weakly held. Pragmatic. Makes random bird noises.
communication_style: Data-driven advisor. Strong opinions, weakly held. Pragmatic.
principles:
- Risk-based testing: depth scales with impact. Quality gates backed by data. Tests mirror usage. Cost = creation + execution + maintenance.
- Testing is feature work. Prioritize unit/integration over E2E. Flakiness is critical debt. ATDD: tests first, AI implements, suite validates.
@@ -44,7 +44,7 @@ agent:

- trigger: trace
workflow: "{project-root}/bmad/bmm/workflows/testarch/trace/workflow.yaml"
description: Map requirements to tests Given-When-Then BDD format
description: Map requirements to tests (Phase 1) and make quality gate decision (Phase 2)

- trigger: nfr-assess
workflow: "{project-root}/bmad/bmm/workflows/testarch/nfr-assess/workflow.yaml"
@@ -54,10 +54,6 @@ agent:
workflow: "{project-root}/bmad/bmm/workflows/testarch/ci/workflow.yaml"
description: Scaffold CI/CD quality pipeline

- trigger: gate
workflow: "{project-root}/bmad/bmm/workflows/testarch/gate/workflow.yaml"
description: Write/update quality gate decision assessment

- trigger: test-review
workflow: "{project-root}/bmad/bmm/workflows/testarch/test-review/workflow.yaml"
description: Review test quality using comprehensive knowledge base and best practices
src/modules/bmm/config.yaml (new file, +7)
@@ -0,0 +1,7 @@
# Powered by BMAD™ Core
name: bmm
short-title: BMad Method Module
author: Brian (BMad) Madison

# TEA Agent Configuration
tea_use_mcp_enhancements: true # Enable Playwright MCP capabilities (healing, exploratory, verification)
@@ -49,7 +49,7 @@ TEA integrates across the entire BMad development lifecycle, providing quality a
│ ↓ │
│ TEA: *test-review (final audit, optional) │
│ ↓ │
│ TEA: *gate ──→ PASS | CONCERNS | FAIL | WAIVED │
│ TEA: *trace (Phase 2: Gate) ──→ PASS | CONCERNS | FAIL | WAIVED │
│ │
└──────────────────────────────────────────────────────────┘
```

@@ -81,19 +81,21 @@ Phase 3 (Solutioning) → [TEA validates architecture testability]
↓
Phase 4 (Implementation) → TEA: *atdd, *automate, *test-review, *trace (per story)
↓
Epic/Release Gate → TEA: *nfr-assess, *gate (release decision)
Epic/Release Gate → TEA: *nfr-assess, *trace Phase 2 (release decision)
```

### Why TEA Needs 9 Workflows
### Why TEA Needs 8 Workflows

**Standard agents**: 1-3 workflows per phase
**TEA**: 9 workflows across 3+ phases
**TEA**: 8 workflows across 3+ phases

| Phase | TEA Workflows | Frequency | Purpose |
| ----------- | -------------------------------------- | ---------------- | -------------------------------- |
| **Phase 2** | *framework, *ci, *test-design | Once per project | Establish quality infrastructure |
| **Phase 4** | *atdd, *automate, *test-review, *trace | Per story/sprint | Continuous quality validation |
| **Release** | *nfr-assess, *gate | Per epic/release | Go/no-go decision |
| **Release** | *nfr-assess, *trace (Phase 2: gate) | Per epic/release | Go/no-go decision |

**Note**: `*trace` is a two-phase workflow: Phase 1 (traceability) + Phase 2 (gate decision). This reduces cognitive load while maintaining a natural workflow.

This complexity **requires specialized documentation** (this guide), **extensive knowledge base** (19+ fragments), and **unique architecture** (`testarch/` directory).

@@ -121,7 +123,7 @@ This complexity **requires specialized documentation** (this guide), **extensive
| Story Prep | - | Scrum Master `*create-story`, `*story-context` | Story markdown + context XML |
| Implementation | (Optional) Trigger `*atdd` before dev to supply failing tests + checklist | Implement story guided by ATDD checklist | Failing acceptance tests + implementation checklist |
| Post-Dev | Execute `*automate`, (Optional) `*test-review`, re-run `*trace` | Address recommendations, update code/tests | Regression specs, quality report, refreshed coverage matrix |
| Release | (Optional) `*test-review` for final audit, Run `*gate` | Confirm Definition of Done, share release notes | Quality audit, Gate YAML + release summary (owners, waivers) |
| Release | (Optional) `*test-review` for final audit, Run `*trace` (Phase 2) | Confirm Definition of Done, share release notes | Quality audit, Gate YAML + release summary (owners, waivers) |

<details>
<summary>Execution Notes</summary>
@@ -129,8 +131,8 @@ This complexity **requires specialized documentation** (this guide), **extensive
- Run `*framework` only once per repo or when modern harness support is missing.
- `*framework` followed by `*ci` establishes install + pipeline; `*test-design` then handles risk scoring, mitigations, and scenario planning in one pass.
- Use `*atdd` before coding when the team can adopt ATDD; share its checklist with the dev agent.
- Post-implementation, keep `*trace` current, expand coverage with `*automate`, optionally review test quality with `*test-review`, and finish with `*gate`.
- Use `*test-review` after `*atdd` to validate generated tests, after `*automate` to ensure regression quality, or before `*gate` for final audit.
- Post-implementation, keep `*trace` current, expand coverage with `*automate`, and optionally review test quality with `*test-review`. For the release gate, run `*trace` with Phase 2 enabled to get the deployment decision.
- Use `*test-review` after `*atdd` to validate generated tests, after `*automate` to ensure regression quality, or before the gate for a final audit.

</details>

@@ -141,7 +143,7 @@ This complexity **requires specialized documentation** (this guide), **extensive
2. **Setup:** TEA checks harness via `*framework`, configures `*ci`, and runs `*test-design` to capture risk/coverage plans.
3. **Story Prep:** Scrum Master generates the story via `*create-story`; PO validates using `*assess-project-ready`.
4. **Implementation:** TEA optionally runs `*atdd`; Dev implements with guidance from failing tests and the plan.
5. **Post-Dev and Release:** TEA runs `*automate`, optionally `*test-review` to audit test quality, re-runs `*trace`, and finishes with `*gate` to document the decision.
5. **Post-Dev and Release:** TEA runs `*automate`, optionally `*test-review` to audit test quality, and re-runs `*trace` with Phase 2 enabled to generate both the traceability matrix and the gate decision.

</details>

@@ -155,7 +157,7 @@ This complexity **requires specialized documentation** (this guide), **extensive
| Story Prep | - | Scrum Master `*create-story` | Updated story markdown |
| Implementation | (Optional) Run `*atdd` before dev | Implement story, referencing checklist/tests | Failing acceptance tests + implementation checklist |
| Post-Dev | Apply `*automate`, (Optional) `*test-review`, re-run `*trace`, `*nfr-assess` if needed | Resolve gaps, update docs/tests | Regression specs, quality report, refreshed coverage matrix, NFR report |
| Release | (Optional) `*test-review` for final audit, Run `*gate` | Product Owner `*assess-project-ready`, share release notes | Quality audit, Gate YAML + release summary |
| Release | (Optional) `*test-review` for final audit, Run `*trace` (Phase 2) | Product Owner `*assess-project-ready`, share release notes | Quality audit, Gate YAML + release summary |

<details>
<summary>Execution Notes</summary>
@@ -163,7 +165,7 @@ This complexity **requires specialized documentation** (this guide), **extensive
- Lead with `*trace` so remediation plans target true coverage gaps. Ensure `*framework` and `*ci` are in place early in the engagement; if the brownfield lacks them, run those setup steps immediately after refreshing context.
- `*test-design` should highlight regression hotspots, mitigations, and P0 scenarios.
- Use `*atdd` when stories benefit from ATDD; otherwise proceed to implementation and rely on post-dev automation.
- After development, expand coverage with `*automate`, optionally review test quality with `*test-review`, re-run `*trace`, and close with `*gate`. Run `*nfr-assess` now if non-functional risks weren't addressed earlier.
- After development, expand coverage with `*automate`, optionally review test quality with `*test-review`, and re-run `*trace` (Phase 2 for the gate decision). Run `*nfr-assess` now if non-functional risks weren't addressed earlier.
- Use `*test-review` to validate existing brownfield tests or audit new tests before the gate.
- Product Owner `*assess-project-ready` confirms the team has artifacts before handoff or release.
@@ -178,19 +180,19 @@ This complexity **requires specialized documentation** (this guide), **extensive
4. **Story Prep:** Scrum Master generates `stories/story-1.1.md` via `*create-story`, automatically pulling updated context.
5. **ATDD First:** TEA runs `*atdd`, producing failing Playwright specs under `tests/e2e/payments/` plus an implementation checklist.
6. **Implementation:** Dev pairs with the checklist/tests to deliver the story.
7. **Post-Implementation:** TEA applies `*automate`, optionally `*test-review` to audit test quality, re-runs `*trace`, performs `*nfr-assess` to validate SLAs, and closes with `*gate` marking PASS with follow-ups.
7. **Post-Implementation:** TEA applies `*automate`, optionally `*test-review` to audit test quality, re-runs `*trace` with Phase 2 enabled, and performs `*nfr-assess` to validate SLAs. The `*trace` Phase 2 output marks PASS with follow-ups.

</details>

### Enterprise / Compliance Program (Level 4)

| Phase | Test Architect | Dev / Team | Outputs |
| ------------------- | ---------------------------------------------------------------- | ---------------------------------------------- | ---------------------------------------------------------- |
| Strategic Planning | - | Analyst/PM/Architect standard workflows | Enterprise-grade PRD, epics, architecture |
| Quality Planning | Run `*framework`, `*test-design`, `*nfr-assess` | Review guidance, align compliance requirements | Harness scaffold, risk + coverage plan, NFR documentation |
| Pipeline Enablement | Configure `*ci` | Coordinate secrets, pipeline approvals | `.github/workflows/test.yml`, helper scripts |
| Execution | Enforce `*atdd`, `*automate`, `*test-review`, `*trace` per story | Implement stories, resolve TEA findings | Tests, fixtures, quality reports, coverage matrices |
| Release | (Optional) `*test-review` for final audit, Run `*gate` | Capture sign-offs, archive artifacts | Quality audit, updated assessments, gate YAML, audit trail |

| Phase | Test Architect | Dev / Team | Outputs |
| ------------------- | ----------------------------------------------------------------- | ---------------------------------------------- | ---------------------------------------------------------- |
| Strategic Planning | - | Analyst/PM/Architect standard workflows | Enterprise-grade PRD, epics, architecture |
| Quality Planning | Run `*framework`, `*test-design`, `*nfr-assess` | Review guidance, align compliance requirements | Harness scaffold, risk + coverage plan, NFR documentation |
| Pipeline Enablement | Configure `*ci` | Coordinate secrets, pipeline approvals | `.github/workflows/test.yml`, helper scripts |
| Execution | Enforce `*atdd`, `*automate`, `*test-review`, `*trace` per story | Implement stories, resolve TEA findings | Tests, fixtures, quality reports, coverage matrices |
| Release | (Optional) `*test-review` for final audit, Run `*trace` (Phase 2) | Capture sign-offs, archive artifacts | Quality audit, updated assessments, gate YAML, audit trail |

<details>
<summary>Execution Notes</summary>
@@ -198,7 +200,7 @@ This complexity **requires specialized documentation** (this guide), **extensive
- Use `*atdd` for every story when feasible so acceptance tests lead implementation in regulated environments.
- `*ci` scaffolds selective testing scripts, burn-in jobs, caching, and notifications for long-running suites.
- Enforce `*test-review` per story or sprint to maintain quality standards and ensure compliance with testing best practices.
- Prior to release, rerun coverage (`*trace`, `*automate`), perform final quality audit with `*test-review`, and formalize the decision in `*gate`; store everything for audits. Call `*nfr-assess` here if compliance/performance requirements weren't captured during planning.
- Prior to release, rerun coverage (`*trace`, `*automate`), perform a final quality audit with `*test-review`, and formalize the decision with `*trace` Phase 2 (gate decision); store everything for audits. Call `*nfr-assess` here if compliance/performance requirements weren't captured during planning.

</details>

@@ -209,36 +211,78 @@ This complexity **requires specialized documentation** (this guide), **extensive
2. **Quality Planning:** TEA runs `*framework`, `*test-design`, and `*nfr-assess` to establish mitigations, coverage, and NFR targets.
3. **Pipeline Setup:** TEA configures CI via `*ci` with selective execution scripts.
4. **Execution:** For each story, TEA enforces `*atdd`, `*automate`, `*test-review`, and `*trace`; Dev teams iterate on the findings.
5. **Release:** TEA re-checks coverage, performs final quality audit with `*test-review`, and logs the final gate decision via `*gate`, archiving artifacts for compliance.
5. **Release:** TEA re-checks coverage, performs a final quality audit with `*test-review`, and logs the final gate decision via `*trace` Phase 2, archiving artifacts for compliance.

</details>
## Command Catalog

| Command | Workflow README | Primary Outputs | Notes |
| -------------- | ------------------------------------------------- | ------------------------------------------------------------------- | ------------------------------------------------ |
| `*framework` | [📖](../workflows/testarch/framework/README.md) | Playwright/Cypress scaffold, `.env.example`, `.nvmrc`, sample specs | Use when no production-ready harness exists |
| `*ci` | [📖](../workflows/testarch/ci/README.md) | CI workflow, selective test scripts, secrets checklist | Platform-aware (GitHub Actions default) |
| `*test-design` | [📖](../workflows/testarch/test-design/README.md) | Combined risk assessment, mitigation plan, and coverage strategy | Handles risk scoring and test design in one pass |
| `*atdd` | [📖](../workflows/testarch/atdd/README.md) | Failing acceptance tests + implementation checklist | Requires approved story + harness |
| `*automate` | [📖](../workflows/testarch/automate/README.md) | Prioritized specs, fixtures, README/script updates, DoD summary | Avoid duplicate coverage (see priority matrix) |
| `*trace` | [📖](../workflows/testarch/trace/README.md) | Coverage matrix, recommendations, gate snippet | Requires access to story/tests repositories |
| `*nfr-assess` | [📖](../workflows/testarch/nfr-assess/README.md) | NFR assessment report with actions | Focus on security/performance/reliability |
| `*gate` | [📖](../workflows/testarch/gate/README.md) | Gate YAML + summary (PASS/CONCERNS/FAIL/WAIVED) | Deterministic decision rules + rationale |
| `*test-review` | [📖](../workflows/testarch/test-review/README.md) | Test quality review report with 0-100 score, violations, fixes | Reviews tests against knowledge base patterns |

**📖** = Click to view detailed workflow documentation

<details>
<summary>Command Guidance and Context Loading</summary>
<summary><strong>Optional Playwright MCP Enhancements</strong></summary>

- Each task now carries its own preflight/flow/deliverable guidance inline.
- `tea-index.csv` maps workflow needs to knowledge fragments; keep tags accurate as you add guidance.
- Consider future modularization into orchestrated workflows if additional automation is needed.
- Update the fragment markdown files alongside workflow edits so guidance and outputs stay in sync.

**Two Playwright MCP servers** (actively maintained, continuously updated):

- `playwright` - Browser automation (`npx @playwright/mcp@latest`)
- `playwright-test` - Test runner with failure analysis (`npx playwright run-test-mcp-server`)

**How MCP Enhances TEA Workflows**:

MCP provides additional capabilities on top of TEA's default AI-based approach:

1. `*test-design`:
   - Default: Analysis + documentation
   - **+ MCP**: Interactive UI discovery with `browser_navigate`, `browser_click`, `browser_snapshot`, behavior observation

   Benefit: Discover actual functionality, edge cases, undocumented features

2. `*atdd`, `*automate`:
   - Default: Infers selectors and interactions from requirements and knowledge fragments
   - **+ MCP**: Generates tests **then** verifies with `generator_setup_page`, `browser_*` tools, validates against live app

   Benefit: Accurate selectors from real DOM, verified behavior, refined test code

3. `*automate`:
   - Default: Pattern-based fixes from error messages + knowledge fragments
   - **+ MCP**: Pattern fixes **enhanced with** `browser_snapshot`, `browser_console_messages`, `browser_network_requests`, `browser_generate_locator`

   Benefit: Visual failure context, live DOM inspection, root cause discovery

**Config example**:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    },
    "playwright-test": {
      "command": "npx",
      "args": ["playwright", "run-test-mcp-server"]
    }
  }
}
```

**To disable**: Set `tea_use_mcp_enhancements: false` in `bmad/bmm/config.yaml` OR remove MCPs from IDE config.
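If you want the toggle spelled out, here is a minimal sketch of the override in `bmad/bmm/config.yaml` (only the relevant key is shown; the rest of the file stays as scaffolded above):

```yaml
# bmad/bmm/config.yaml — opt out of MCP-backed healing/exploratory/verification modes
tea_use_mcp_enhancements: false
```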

</details>

<br />

| Command | Workflow README | Primary Outputs | Notes | With Playwright MCP Enhancements |
| -------------- | ------------------------------------------------- | ---------------------------------------------------------------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
| `*framework` | [📖](../workflows/testarch/framework/README.md) | Playwright/Cypress scaffold, `.env.example`, `.nvmrc`, sample specs | Use when no production-ready harness exists | - |
| `*ci` | [📖](../workflows/testarch/ci/README.md) | CI workflow, selective test scripts, secrets checklist | Platform-aware (GitHub Actions default) | - |
| `*test-design` | [📖](../workflows/testarch/test-design/README.md) | Combined risk assessment, mitigation plan, and coverage strategy | Risk scoring + optional exploratory mode | **+ Exploratory**: Interactive UI discovery with browser automation (uncover actual functionality) |
| `*atdd` | [📖](../workflows/testarch/atdd/README.md) | Failing acceptance tests + implementation checklist | TDD red phase + optional recording mode | **+ Recording**: AI generation verified with live browser (accurate selectors from real DOM) |
| `*automate` | [📖](../workflows/testarch/automate/README.md) | Prioritized specs, fixtures, README/script updates, DoD summary | Optional healing/recording, avoid duplicate coverage | **+ Healing**: Pattern fixes enhanced with visual debugging + **+ Recording**: AI verified with live browser |
| `*test-review` | [📖](../workflows/testarch/test-review/README.md) | Test quality review report with 0-100 score, violations, fixes | Reviews tests against knowledge base patterns | - |
| `*nfr-assess` | [📖](../workflows/testarch/nfr-assess/README.md) | NFR assessment report with actions | Focus on security/performance/reliability | - |
| `*trace` | [📖](../workflows/testarch/trace/README.md) | Phase 1: Coverage matrix, recommendations. Phase 2: Gate decision (PASS/CONCERNS/FAIL/WAIVED) | Two-phase workflow: traceability + gate decision | - |

**📖** = Click to view detailed workflow documentation

## Why TEA is Architecturally Different

TEA is the only BMM agent with its own top-level module directory (`bmm/testarch/`). This intentional design pattern reflects TEA's unique requirements:

@@ -255,13 +299,13 @@ src/modules/bmm/
├── workflows/
│   └── testarch/          # TEA workflows (standard location)
└── testarch/              # Knowledge base (UNIQUE!)
    ├── knowledge/         # 19+ reusable test pattern fragments
    ├── tea-index.csv      # Centralized knowledge lookup
    ├── knowledge/         # 21 production-ready test pattern fragments
    ├── tea-index.csv      # Centralized knowledge lookup (21 fragments indexed)
    └── README.md          # This guide
```

### Why TEA Gets Special Treatment

TEA uniquely requires **extensive domain knowledge** (19+ fragments: test patterns, CI/CD, fixtures, quality practices), a **centralized reference system** (`tea-index.csv` for on-demand fragment loading), and **cross-cutting concerns** (domain-specific patterns vs project-specific artifacts like PRDs/stories). Other BMM agents don't require this architecture.
TEA uniquely requires **extensive domain knowledge** (21 fragments, 12,821 lines: test patterns, CI/CD, fixtures, quality practices, healing strategies), a **centralized reference system** (`tea-index.csv` for on-demand fragment loading), **cross-cutting concerns** (domain-specific patterns vs project-specific artifacts like PRDs/stories), and **optional MCP integration** (healing, exploratory, verification modes). Other BMM agents don't require this architecture.

</details>
@@ -1,9 +1,675 @@
# CI Pipeline and Burn-In Strategy

- Stage jobs: install/caching once, run `test-changed` for quick feedback, then shard full suites with `fail-fast: false` so evidence isn't lost.
- Re-run changed specs 5–10x (burn-in) before merging to flush flakes; fail the pipeline on the first inconsistent run.
- Upload artifacts on failure (videos, traces, HAR) and keep retry counts explicit—hidden retries hide instability.
- Use `wait-on` for app startup, enforce time budgets (<10 min per job), and document required secrets alongside workflows.
- Mirror CI scripts locally (`npm run test:ci`, `scripts/burn-in-changed.sh`) so devs reproduce pipeline behaviour exactly.

## Principle

_Source: Murat CI/CD strategy blog, Playwright/Cypress workflow examples._

CI pipelines must execute tests reliably and quickly, and provide clear feedback. Burn-in testing (running changed tests multiple times) flushes out flakiness before merge. Stage jobs strategically: install/cache once, run changed specs first for fast feedback, then shard full suites with fail-fast disabled to preserve evidence.

## Rationale

CI is the quality gate for production. A poorly configured pipeline either wastes developer time (slow feedback, false positives) or ships broken code (false negatives, insufficient coverage). Burn-in testing ensures reliability by stress-testing changed code, while parallel execution and intelligent test selection optimize speed without sacrificing thoroughness.

## Pattern Examples

### Example 1: GitHub Actions Workflow with Parallel Execution

**Context**: Production-ready CI/CD pipeline for E2E tests with caching, parallelization, and burn-in testing.

**Implementation**:
```yaml
# .github/workflows/e2e-tests.yml
name: E2E Tests
on:
  pull_request:
  push:
    branches: [main, develop]

env:
  NODE_VERSION_FILE: '.nvmrc'
  CACHE_KEY: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}

jobs:
  install-dependencies:
    name: Install & Cache Dependencies
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version-file: ${{ env.NODE_VERSION_FILE }}
          cache: 'npm'

      - name: Cache node modules
        uses: actions/cache@v4
        id: npm-cache
        with:
          path: |
            ~/.npm
            node_modules
            ~/.cache/Cypress
            ~/.cache/ms-playwright
          key: ${{ env.CACHE_KEY }}
          restore-keys: |
            ${{ runner.os }}-node-

      - name: Install dependencies
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm ci --prefer-offline --no-audit

      - name: Install Playwright browsers
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps chromium

  test-changed-specs:
    name: Test Changed Specs First (Burn-In)
    needs: install-dependencies
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history for accurate diff

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version-file: ${{ env.NODE_VERSION_FILE }}
          cache: 'npm'

      - name: Restore dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.npm
            node_modules
            ~/.cache/ms-playwright
          key: ${{ env.CACHE_KEY }}

      - name: Detect changed test files
        id: changed-tests
        run: |
          # Collapse to a single space-separated line so the value is safe for $GITHUB_OUTPUT
          CHANGED_SPECS=$(git diff --name-only origin/main...HEAD | grep -E '\.(spec|test)\.(ts|js|tsx|jsx)$' | tr '\n' ' ' || true)
          echo "changed_specs=${CHANGED_SPECS}" >> $GITHUB_OUTPUT
          echo "Changed specs: ${CHANGED_SPECS}"

      - name: Run burn-in on changed specs (10 iterations)
        if: steps.changed-tests.outputs.changed_specs != ''
        run: |
          SPECS="${{ steps.changed-tests.outputs.changed_specs }}"
          echo "Running burn-in: 10 iterations on changed specs"
          for i in {1..10}; do
            echo "Burn-in iteration $i/10"
            npm run test -- $SPECS || {
              echo "❌ Burn-in failed on iteration $i"
              exit 1
            }
          done
          echo "✅ Burn-in passed - 10/10 successful runs"

      - name: Upload artifacts on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: burn-in-failure-artifacts
          path: |
            test-results/
            playwright-report/
            screenshots/
          retention-days: 7

  test-e2e-sharded:
    name: E2E Tests (Shard ${{ matrix.shard }}/${{ strategy.job-total }})
    needs: [install-dependencies, test-changed-specs]
    runs-on: ubuntu-latest
    timeout-minutes: 30
    strategy:
      fail-fast: false # Run all shards even if one fails
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version-file: ${{ env.NODE_VERSION_FILE }}
          cache: 'npm'

      - name: Restore dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.npm
            node_modules
            ~/.cache/ms-playwright
          key: ${{ env.CACHE_KEY }}

      - name: Run E2E tests (shard ${{ matrix.shard }})
        run: npm run test:e2e -- --shard=${{ matrix.shard }}/4
        env:
          TEST_ENV: staging
          CI: true

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-shard-${{ matrix.shard }}
          path: |
            test-results/
            playwright-report/
          retention-days: 30

      - name: Upload JUnit report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: junit-results-shard-${{ matrix.shard }}
          path: test-results/junit.xml
          retention-days: 30

  merge-test-results:
    name: Merge Test Results & Generate Report
    needs: test-e2e-sharded
    runs-on: ubuntu-latest
    if: always()
    steps:
      - name: Download all shard results
        uses: actions/download-artifact@v4
        with:
          pattern: test-results-shard-*
          path: all-results/

      - name: Merge HTML reports
        run: |
          npx playwright merge-reports --reporter=html all-results/
          echo "Merged report available in playwright-report/"

      - name: Upload merged report
        uses: actions/upload-artifact@v4
        with:
          name: merged-playwright-report
          path: playwright-report/
          retention-days: 30

      - name: Comment PR with results
        if: github.event_name == 'pull_request'
        uses: daun/playwright-report-comment@v3
        with:
          report-path: playwright-report/
```
**Key Points**:

- **Install once, reuse everywhere**: Dependencies cached across all jobs
- **Burn-in first**: Changed specs run 10x before full suite
- **Fail-fast disabled**: All shards run to completion for full evidence
- **Parallel execution**: 4 shards cut execution time by ~75%
- **Artifact retention**: 30 days for reports, 7 days for failure debugging
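One caveat worth noting: `npx playwright merge-reports` consumes blob reports produced with `--reporter=blob`, so as written the merge job may not find mergeable input in the downloaded HTML/JSON artifacts. A minimal sketch of the adjustment (step names and artifact paths here are illustrative, not part of the workflow above):

```yaml
# In the sharded job: produce and upload a blob report per shard
- name: Run E2E tests (shard ${{ matrix.shard }})
  run: npx playwright test --shard=${{ matrix.shard }}/4 --reporter=blob

- name: Upload blob report
  uses: actions/upload-artifact@v4
  with:
    name: blob-report-${{ matrix.shard }}
    path: blob-report/

# In the merge job: combine the downloaded blob reports into one HTML report
- name: Merge reports
  run: npx playwright merge-reports --reporter=html ./all-blob-reports
```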
---

### Example 2: Burn-In Loop Pattern (Standalone Script)

**Context**: Reusable bash script for burn-in testing changed specs locally or in CI.

**Implementation**:
```bash
#!/bin/bash
# scripts/burn-in-changed.sh
# Usage: ./scripts/burn-in-changed.sh [iterations] [base-branch]

set -eo pipefail # Exit on error; pipefail so a failing `npm run test | tee` pipeline is detected

# Configuration
ITERATIONS=${1:-10}
BASE_BRANCH=${2:-main}
SPEC_PATTERN='\.(spec|test)\.(ts|js|tsx|jsx)$'

echo "🔥 Burn-In Test Runner"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Iterations: $ITERATIONS"
echo "Base branch: $BASE_BRANCH"
echo ""

# Detect changed test files
echo "📋 Detecting changed test files..."
CHANGED_SPECS=$(git diff --name-only $BASE_BRANCH...HEAD | grep -E "$SPEC_PATTERN" || echo "")

if [ -z "$CHANGED_SPECS" ]; then
  echo "✅ No test files changed. Skipping burn-in."
  exit 0
fi

echo "Changed test files:"
echo "$CHANGED_SPECS" | sed 's/^/ - /'
echo ""

# Count specs
SPEC_COUNT=$(echo "$CHANGED_SPECS" | wc -l | xargs)
echo "Running burn-in on $SPEC_COUNT test file(s)..."
echo ""

# Burn-in loop
FAILURES=()
for i in $(seq 1 $ITERATIONS); do
  echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
  echo "🔄 Iteration $i/$ITERATIONS"
  echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

  # Run tests with explicit file list
  if npm run test -- $CHANGED_SPECS 2>&1 | tee "burn-in-log-$i.txt"; then
    echo "✅ Iteration $i passed"
  else
    echo "❌ Iteration $i failed"
    FAILURES+=($i)

    # Save failure artifacts
    mkdir -p burn-in-failures/iteration-$i
    cp -r test-results/ burn-in-failures/iteration-$i/ 2>/dev/null || true
    cp -r screenshots/ burn-in-failures/iteration-$i/ 2>/dev/null || true

    echo ""
    echo "🛑 BURN-IN FAILED on iteration $i"
    echo "Failure artifacts saved to: burn-in-failures/iteration-$i/"
    echo "Logs saved to: burn-in-log-$i.txt"
    echo ""
    exit 1
  fi

  echo ""
done

# Success summary
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🎉 BURN-IN PASSED"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "All $ITERATIONS iterations passed for $SPEC_COUNT test file(s)"
echo "Changed specs are stable and ready to merge."
echo ""

# Cleanup logs
rm -f burn-in-log-*.txt

exit 0
```
**Usage**:

```bash
# Run locally with default settings (10 iterations, compare to main)
./scripts/burn-in-changed.sh

# Custom iterations and base branch
./scripts/burn-in-changed.sh 20 develop

# Add to package.json
{
  "scripts": {
    "test:burn-in": "bash scripts/burn-in-changed.sh",
    "test:burn-in:strict": "bash scripts/burn-in-changed.sh 20"
  }
}
```

**Key Points**:

- **Exit on first failure**: Flaky tests caught immediately
- **Failure artifacts**: Saved per-iteration for debugging
- **Flexible configuration**: Iterations and base branch customizable
- **CI/local parity**: Same script runs in both environments
- **Clear output**: Visual feedback on progress and results

---

### Example 3: Shard Orchestration with Result Aggregation

**Context**: Advanced sharding strategy for large test suites with intelligent result merging.

**Implementation**:
```javascript
// scripts/run-sharded-tests.js
const { spawn } = require('child_process');
const fs = require('fs');
const path = require('path');

/**
 * Run tests across multiple shards and aggregate results
 * Usage: SHARD_COUNT=4 TEST_ENV=staging node scripts/run-sharded-tests.js
 */

const SHARD_COUNT = parseInt(process.env.SHARD_COUNT || '4');
const TEST_ENV = process.env.TEST_ENV || 'local';
const RESULTS_DIR = path.join(__dirname, '../test-results');

console.log(`🚀 Running tests across ${SHARD_COUNT} shards`);
console.log(`Environment: ${TEST_ENV}`);
console.log('━'.repeat(50));

// Ensure results directory exists
if (!fs.existsSync(RESULTS_DIR)) {
  fs.mkdirSync(RESULTS_DIR, { recursive: true });
}

/**
 * Run a single shard
 */
function runShard(shardIndex) {
  return new Promise((resolve, reject) => {
    const shardId = `${shardIndex}/${SHARD_COUNT}`;
    console.log(`\n📦 Starting shard ${shardId}...`);

    const child = spawn('npx', ['playwright', 'test', `--shard=${shardId}`, '--reporter=json'], {
      env: { ...process.env, TEST_ENV, SHARD_INDEX: shardIndex },
      stdio: 'pipe',
    });

    let stdout = '';
    let stderr = '';

    child.stdout.on('data', (data) => {
      stdout += data.toString();
      process.stdout.write(data);
    });

    child.stderr.on('data', (data) => {
      stderr += data.toString();
      process.stderr.write(data);
    });

    child.on('close', (code) => {
      // Save shard results
      const resultFile = path.join(RESULTS_DIR, `shard-${shardIndex}.json`);
      try {
        const result = JSON.parse(stdout);
        fs.writeFileSync(resultFile, JSON.stringify(result, null, 2));
        console.log(`✅ Shard ${shardId} completed (exit code: ${code})`);
        resolve({ shardIndex, code, result });
      } catch (error) {
        console.error(`❌ Shard ${shardId} failed to parse results:`, error.message);
        reject({ shardIndex, code, error });
      }
    });

    child.on('error', (error) => {
      console.error(`❌ Shard ${shardId} process error:`, error.message);
      reject({ shardIndex, error });
    });
  });
}

/**
 * Aggregate results from all shards
 */
function aggregateResults() {
  console.log('\n📊 Aggregating results from all shards...');

  const shardResults = [];
  let totalTests = 0;
  let totalPassed = 0;
  let totalFailed = 0;
  let totalSkipped = 0;
  let totalFlaky = 0;

  for (let i = 1; i <= SHARD_COUNT; i++) {
    const resultFile = path.join(RESULTS_DIR, `shard-${i}.json`);
    if (fs.existsSync(resultFile)) {
      const result = JSON.parse(fs.readFileSync(resultFile, 'utf8'));
      shardResults.push(result);

      // Aggregate stats (in Playwright's JSON stats, "expected" = passed, "unexpected" = failed)
      const stats = result.stats || {};
      totalPassed += stats.expected || 0;
      totalFailed += stats.unexpected || 0;
      totalSkipped += stats.skipped || 0;
      totalFlaky += stats.flaky || 0;
      totalTests += (stats.expected || 0) + (stats.unexpected || 0) + (stats.skipped || 0) + (stats.flaky || 0);
    }
  }

  const summary = {
    totalShards: SHARD_COUNT,
    environment: TEST_ENV,
    totalTests,
    passed: totalPassed,
    failed: totalFailed,
    skipped: totalSkipped,
    flaky: totalFlaky,
    duration: shardResults.reduce((acc, r) => acc + (r.stats?.duration || 0), 0),
    timestamp: new Date().toISOString(),
  };

  // Save aggregated summary
  fs.writeFileSync(path.join(RESULTS_DIR, 'summary.json'), JSON.stringify(summary, null, 2));

  console.log('\n' + '━'.repeat(50));
  console.log('📈 Test Results Summary');
  console.log('━'.repeat(50));
  console.log(`Total tests: ${totalTests}`);
  console.log(`✅ Passed: ${totalPassed}`);
  console.log(`❌ Failed: ${totalFailed}`);
  console.log(`⏭️ Skipped: ${totalSkipped}`);
  console.log(`⚠️ Flaky: ${totalFlaky}`);
  console.log(`⏱️ Duration: ${(summary.duration / 1000).toFixed(2)}s`);
  console.log('━'.repeat(50));

  return summary;
}

/**
 * Main execution
 */
async function main() {
  const startTime = Date.now();
  const shardPromises = [];

  // Run all shards in parallel
  for (let i = 1; i <= SHARD_COUNT; i++) {
    shardPromises.push(runShard(i));
  }

  try {
    await Promise.allSettled(shardPromises);
  } catch (error) {
    console.error('❌ One or more shards failed:', error);
  }

  // Aggregate results
  const summary = aggregateResults();

  const totalTime = ((Date.now() - startTime) / 1000).toFixed(2);
  console.log(`\n⏱️ Total execution time: ${totalTime}s`);

  // Exit with failure if any tests failed
  if (summary.failed > 0) {
    console.error('\n❌ Test suite failed');
    process.exit(1);
  }

  console.log('\n✅ All tests passed');
  process.exit(0);
}

main().catch((error) => {
  console.error('Fatal error:', error);
  process.exit(1);
});
```
**package.json integration**:

```json
{
  "scripts": {
    "test:sharded": "node scripts/run-sharded-tests.js",
    "test:sharded:ci": "SHARD_COUNT=8 TEST_ENV=staging node scripts/run-sharded-tests.js"
  }
}
```

**Key Points**:

- **Parallel shard execution**: All shards run simultaneously
- **Result aggregation**: Unified summary across shards
- **Failure detection**: Exit code reflects overall test status
- **Artifact preservation**: Individual shard results saved for debugging
- **CI/local compatibility**: Same script works in both environments

---

### Example 4: Selective Test Execution (Changed Files + Tags)

**Context**: Optimize CI by running only relevant tests based on file changes and tags.

**Implementation**:
```bash
#!/bin/bash
# scripts/selective-test-runner.sh
# Intelligent test selection based on changed files and test tags

set -e

BASE_BRANCH=${BASE_BRANCH:-main}
TEST_ENV=${TEST_ENV:-local}

echo "🎯 Selective Test Runner"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Base branch: $BASE_BRANCH"
echo "Environment: $TEST_ENV"
echo ""

# Detect changed files (all types, not just tests)
CHANGED_FILES=$(git diff --name-only $BASE_BRANCH...HEAD)

if [ -z "$CHANGED_FILES" ]; then
  echo "✅ No files changed. Skipping tests."
  exit 0
fi

echo "Changed files:"
echo "$CHANGED_FILES" | sed 's/^/ - /'
echo ""

# Determine test strategy based on changes
run_smoke_only=false
run_all_tests=false
affected_specs=""

# Critical files = run all tests
if echo "$CHANGED_FILES" | grep -qE '(package\.json|package-lock\.json|playwright\.config|cypress\.config|\.github/workflows)'; then
  echo "⚠️ Critical configuration files changed. Running ALL tests."
  run_all_tests=true

# Auth/security changes = run all auth + smoke tests
elif echo "$CHANGED_FILES" | grep -qE '(auth|login|signup|security)'; then
  echo "🔒 Auth/security files changed. Running auth + smoke tests."
  npm run test -- --grep "@auth|@smoke"
  exit $?

# API changes = run integration + smoke tests
elif echo "$CHANGED_FILES" | grep -qE '(api|service|controller)'; then
  echo "🔌 API files changed. Running integration + smoke tests."
  npm run test -- --grep "@integration|@smoke"
  exit $?

# UI component changes = run related component tests
elif echo "$CHANGED_FILES" | grep -qE '\.(tsx|jsx|vue)$'; then
  echo "🎨 UI components changed. Running component + smoke tests."

  # Extract component names and find related tests
  components=$(echo "$CHANGED_FILES" | grep -E '\.(tsx|jsx|vue)$' | xargs -I {} basename {} | sed 's/\.[^.]*$//')
  for component in $components; do
    # Find tests matching component name (newline-safe, space-separated accumulation)
    affected_specs+=$(find tests -name "*${component}*" -type f | tr '\n' ' ') || true
  done

  if [ -n "$affected_specs" ]; then
    echo "Running tests for: $affected_specs"
    npm run test -- $affected_specs --grep "@smoke"
  else
    echo "No specific tests found. Running smoke tests only."
    npm run test -- --grep "@smoke"
  fi
  exit $?

# Documentation/config only = run smoke tests
elif echo "$CHANGED_FILES" | grep -qE '\.(md|txt|json|yml|yaml)$'; then
  echo "📝 Documentation/config files changed. Running smoke tests only."
  run_smoke_only=true
else
  echo "⚙️ Other files changed. Running smoke tests."
  run_smoke_only=true
fi

# Execute selected strategy
if [ "$run_all_tests" = true ]; then
  echo ""
  echo "Running full test suite..."
  npm run test
elif [ "$run_smoke_only" = true ]; then
  echo ""
  echo "Running smoke tests..."
  npm run test -- --grep "@smoke"
fi
```
**Usage in GitHub Actions**:

```yaml
# .github/workflows/selective-tests.yml
name: Selective Tests
on: pull_request

jobs:
  selective-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run selective tests
        run: bash scripts/selective-test-runner.sh
        env:
          BASE_BRANCH: ${{ github.base_ref }}
          TEST_ENV: staging
```

**Key Points**:

- **Intelligent routing**: Tests selected based on changed file types
- **Tag-based filtering**: Use @smoke, @auth, @integration tags
- **Fast feedback**: Only relevant tests run on most PRs
- **Safety net**: Critical changes trigger full suite
- **Component mapping**: UI changes run related component tests

---

## CI Configuration Checklist

Before deploying your CI pipeline, verify:

- [ ] **Caching strategy**: node_modules, npm cache, browser binaries cached
- [ ] **Timeout budgets**: Each job has a reasonable timeout (10-30 min)
- [ ] **Artifact retention**: 30 days for reports, 7 days for failure artifacts
- [ ] **Parallelization**: Matrix strategy uses `fail-fast: false`
- [ ] **Burn-in enabled**: Changed specs run 5-10x before merge
- [ ] **wait-on app startup**: CI waits for the app (`wait-on: 'http://localhost:3000'`; see the sketch below)
- [ ] **Secrets documented**: README lists required secrets (API keys, tokens)
- [ ] **Local parity**: CI scripts runnable locally (`npm run test:ci`)
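For the wait-on item, a minimal sketch of the idea as plain workflow steps (step names, start command, and port are illustrative and assume a web app served on `http://localhost:3000`):

```yaml
# Start the app in the background, then block until it responds before running tests
- name: Start app
  run: npm run start &

- name: Wait for app to be ready
  run: npx wait-on http://localhost:3000 --timeout 120000
```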
## Integration Points

- Used in workflows: `*ci` (CI/CD pipeline setup)
- Related fragments: `selective-testing.md`, `playwright-config.md`, `test-quality.md`
- CI tools: GitHub Actions, GitLab CI, CircleCI, Jenkins

_Source: Murat CI/CD strategy blog, Playwright/Cypress workflow examples, SEON production pipelines_
@@ -1,9 +1,486 @@
# Component Test-Driven Development Loop

- Start every UI change with a failing component spec (`cy.mount` or RTL `render`); ship only after red → green → refactor passes.
- Recreate providers/stores per spec to prevent state bleed and keep parallel runs deterministic.
- Use factories to exercise prop/state permutations (see the sketch after Example 1); cover accessibility by asserting against roles, labels, and keyboard flows.
- Keep component specs under ~100 lines: split by intent (rendering, state transitions, error messaging) to preserve clarity.
- Pair component tests with visual debugging (Cypress runner, Storybook, Playwright trace viewer) to accelerate diagnosis.

## Principle

_Source: CCTDD repository, Murat component testing talks._

Start every UI change with a failing component test (`cy.mount`, Playwright component test, or RTL `render`). Follow the Red-Green-Refactor cycle: write a failing test (red), make it pass with minimal code (green), then improve the implementation (refactor). Ship only after the cycle completes. Keep component tests under 100 lines, isolated with fresh providers per test, and validate accessibility alongside functionality.

## Rationale

Component TDD provides immediate feedback during development. Failing tests (red) clarify requirements before writing code. Minimal implementations (green) prevent over-engineering. Refactoring with passing tests ensures changes don't break functionality. Isolated tests with fresh providers prevent state bleed in parallel runs. Accessibility assertions catch usability issues early. Visual debugging (Cypress runner, Storybook, Playwright trace viewer) accelerates diagnosis when tests fail.

## Pattern Examples

### Example 1: Red-Green-Refactor Loop

**Context**: When building a new component, start with a failing test that describes the desired behavior. Implement just enough to pass, then refactor for quality.

**Implementation**:
```typescript
// Step 1: RED - Write failing test
// Button.cy.tsx (Cypress Component Test)
import { Button } from './Button';

describe('Button Component', () => {
  it('should render with label', () => {
    cy.mount(<Button label="Click Me" />);
    cy.contains('Click Me').should('be.visible');
  });

  it('should call onClick when clicked', () => {
    const onClickSpy = cy.stub().as('onClick');
    cy.mount(<Button label="Submit" onClick={onClickSpy} />);

    cy.get('button').click();
    cy.get('@onClick').should('have.been.calledOnce');
  });
});

// Run test: FAILS - Button component doesn't exist yet
// Error: "Cannot find module './Button'"

// Step 2: GREEN - Minimal implementation
// Button.tsx
type ButtonProps = {
  label: string;
  onClick?: () => void;
};

export const Button = ({ label, onClick }: ButtonProps) => {
  return <button onClick={onClick}>{label}</button>;
};

// Run test: PASSES - Component renders and handles clicks

// Step 3: REFACTOR - Improve implementation
// Add disabled state, loading state, variants
type ButtonProps = {
  label: string;
  onClick?: () => void;
  disabled?: boolean;
  loading?: boolean;
  variant?: 'primary' | 'secondary' | 'danger';
};

// Spinner is assumed to be an existing shared component that renders data-testid="spinner"
export const Button = ({
  label,
  onClick,
  disabled = false,
  loading = false,
  variant = 'primary'
}: ButtonProps) => {
  return (
    <button
      onClick={onClick}
      disabled={disabled || loading}
      className={`btn btn-${variant}`}
      data-testid="button"
    >
      {loading ? <Spinner /> : label}
    </button>
  );
};

// Step 4: Expand tests for new features
describe('Button Component', () => {
  it('should render with label', () => {
    cy.mount(<Button label="Click Me" />);
    cy.contains('Click Me').should('be.visible');
  });

  it('should call onClick when clicked', () => {
    const onClickSpy = cy.stub().as('onClick');
    cy.mount(<Button label="Submit" onClick={onClickSpy} />);

    cy.get('button').click();
    cy.get('@onClick').should('have.been.calledOnce');
  });

  it('should be disabled when disabled prop is true', () => {
    cy.mount(<Button label="Submit" disabled={true} />);
    cy.get('button').should('be.disabled');
  });

  it('should show spinner when loading', () => {
    cy.mount(<Button label="Submit" loading={true} />);
    cy.get('[data-testid="spinner"]').should('be.visible');
    cy.get('button').should('be.disabled');
  });

  it('should apply variant styles', () => {
    cy.mount(<Button label="Delete" variant="danger" />);
    cy.get('button').should('have.class', 'btn-danger');
  });
});

// Run tests: ALL PASS - Refactored component still works

// Playwright Component Test equivalent
import { test, expect } from '@playwright/experimental-ct-react';
import { Button } from './Button';

test.describe('Button Component', () => {
  test('should call onClick when clicked', async ({ mount }) => {
    let clicked = false;
    const component = await mount(
      <Button label="Submit" onClick={() => { clicked = true; }} />
    );

    await component.getByRole('button').click();
    expect(clicked).toBe(true);
  });

  test('should be disabled when loading', async ({ mount }) => {
    const component = await mount(<Button label="Submit" loading={true} />);
    await expect(component.getByRole('button')).toBeDisabled();
    await expect(component.getByTestId('spinner')).toBeVisible();
  });
});
```
**Key Points**:

- Red: Write failing test first - clarifies requirements before coding
- Green: Implement minimal code to pass - prevents over-engineering
- Refactor: Improve code quality while keeping tests green
- Expand: Add tests for new features after refactoring
- Cycle repeats: Each new feature starts with a failing test
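The strategy bullets at the top of this fragment recommend factories for exercising prop/state permutations. A minimal sketch of that idea applied to the `Button` from Example 1 (the `makeButtonProps` helper, file name, and permutation list are illustrative, not part of the original fragment):

```typescript
// Button.permutations.cy.tsx — hypothetical spec: factory + permutation loop in one place
import type { ComponentProps } from 'react';
import { Button } from './Button';

type ButtonProps = ComponentProps<typeof Button>;

// Factory with sensible defaults; each test overrides only what it cares about
const makeButtonProps = (overrides: Partial<ButtonProps> = {}): ButtonProps => ({
  label: 'Submit',
  onClick: () => {},
  disabled: false,
  loading: false,
  variant: 'primary',
  ...overrides,
});

// Exercise prop permutations without copy-pasting full prop objects
const permutations: Partial<ButtonProps>[] = [
  { disabled: true },
  { loading: true },
  { variant: 'danger', label: 'Delete' },
];

describe('Button prop permutations', () => {
  permutations.forEach((overrides) => {
    it(`renders with ${JSON.stringify(overrides)}`, () => {
      cy.mount(<Button {...makeButtonProps(overrides)} />);
      cy.get('button').should('be.visible');
    });
  });
});
```

The factory keeps each spec focused on the one prop under test, which also helps keep component specs under the ~100-line budget the bullets recommend.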
### Example 2: Provider Isolation Pattern

**Context**: When testing components that depend on context providers (React Query, Auth, Router), wrap them with required providers in each test to prevent state bleed between tests.

**Implementation**:
```typescript
// test-utils/AllTheProviders.tsx
import { FC, ReactNode } from 'react';
import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
import { BrowserRouter } from 'react-router-dom';
import { AuthProvider } from '../contexts/AuthContext';

type Props = {
  children: ReactNode;
  initialAuth?: { user: User | null; token: string | null }; // User type comes from the app's own models
};

export const AllTheProviders: FC<Props> = ({ children, initialAuth }) => {
  // Create NEW QueryClient per test (prevent state bleed)
  const queryClient = new QueryClient({
    defaultOptions: {
      queries: { retry: false },
      mutations: { retry: false }
    }
  });

  return (
    <QueryClientProvider client={queryClient}>
      <BrowserRouter>
        <AuthProvider initialAuth={initialAuth}>
          {children}
        </AuthProvider>
      </BrowserRouter>
    </QueryClientProvider>
  );
};

// Cypress custom mount command
// cypress/support/component.tsx
import { mount } from 'cypress/react18';
import { AllTheProviders } from '../../test-utils/AllTheProviders';

Cypress.Commands.add('wrappedMount', (component, options = {}) => {
  const { initialAuth, ...mountOptions } = options;

  return mount(
    <AllTheProviders initialAuth={initialAuth}>
      {component}
    </AllTheProviders>,
    mountOptions
  );
});

// Usage in tests
// UserProfile.cy.tsx
import { UserProfile } from './UserProfile';

describe('UserProfile Component', () => {
  it('should display user when authenticated', () => {
    const user = { id: 1, name: 'John Doe', email: 'john@example.com' };

    cy.wrappedMount(<UserProfile />, {
      initialAuth: { user, token: 'fake-token' }
    });

    cy.contains('John Doe').should('be.visible');
    cy.contains('john@example.com').should('be.visible');
  });

  it('should show login prompt when not authenticated', () => {
    cy.wrappedMount(<UserProfile />, {
      initialAuth: { user: null, token: null }
    });

    cy.contains('Please log in').should('be.visible');
  });
});

// Playwright Component Test with providers
import { test, expect } from '@playwright/experimental-ct-react';
import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
import { UserProfile } from './UserProfile';
import { AuthProvider } from '../contexts/AuthContext';

test.describe('UserProfile Component', () => {
  test('should display user when authenticated', async ({ mount }) => {
    const user = { id: 1, name: 'John Doe', email: 'john@example.com' };
    const queryClient = new QueryClient();

    const component = await mount(
      <QueryClientProvider client={queryClient}>
        <AuthProvider initialAuth={{ user, token: 'fake-token' }}>
          <UserProfile />
        </AuthProvider>
      </QueryClientProvider>
    );

    await expect(component.getByText('John Doe')).toBeVisible();
    await expect(component.getByText('john@example.com')).toBeVisible();
  });
});
```
**Key Points**:

- Create NEW providers per test (QueryClient, Router, Auth)
- Prevents state pollution between tests
- `initialAuth` prop allows testing different auth states
- Custom mount command (`wrappedMount`) reduces boilerplate
- Providers wrap component, not the entire test suite

### Example 3: Accessibility Assertions

**Context**: When testing components, validate accessibility alongside functionality using axe-core, ARIA roles, labels, and keyboard navigation.

**Implementation**:
```typescript
|
||||
// Cypress with axe-core
|
||||
// cypress/support/component.tsx
|
||||
import 'cypress-axe';
|
||||
|
||||
// Form.cy.tsx
|
||||
import { Form } from './Form';
|
||||
|
||||
describe('Form Component Accessibility', () => {
|
||||
beforeEach(() => {
|
||||
cy.wrappedMount(<Form />);
|
||||
cy.injectAxe(); // Inject axe-core
|
||||
});
|
||||
|
||||
it('should have no accessibility violations', () => {
|
||||
cy.checkA11y(); // Run axe scan
|
||||
});
|
||||
|
||||
it('should have proper ARIA labels', () => {
|
||||
cy.get('input[name="email"]').should('have.attr', 'aria-label', 'Email address');
|
||||
cy.get('input[name="password"]').should('have.attr', 'aria-label', 'Password');
|
||||
cy.get('button[type="submit"]').should('have.attr', 'aria-label', 'Submit form');
|
||||
});
|
||||
|
||||
it('should support keyboard navigation', () => {
|
||||
// Tab through form fields
|
||||
cy.get('input[name="email"]').focus().type('test@example.com');
|
||||
cy.realPress('Tab'); // cypress-real-events plugin
|
||||
cy.focused().should('have.attr', 'name', 'password');
|
||||
|
||||
cy.focused().type('password123');
|
||||
cy.realPress('Tab');
|
||||
cy.focused().should('have.attr', 'type', 'submit');
|
||||
|
||||
cy.realPress('Enter'); // Submit via keyboard
|
||||
cy.contains('Form submitted').should('be.visible');
|
||||
});
|
||||
|
||||
it('should announce errors to screen readers', () => {
|
||||
cy.get('button[type="submit"]').click(); // Submit without data
|
||||
|
||||
// Error has role="alert" and aria-live="polite"
|
||||
cy.get('[role="alert"]')
|
||||
.should('be.visible')
|
||||
.and('have.attr', 'aria-live', 'polite')
|
||||
.and('contain', 'Email is required');
|
||||
});
|
||||
|
||||
it('should have sufficient color contrast', () => {
|
||||
cy.checkA11y(null, {
|
||||
rules: {
|
||||
'color-contrast': { enabled: true }
|
||||
}
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
// Playwright with axe-playwright
|
||||
import { test, expect } from '@playwright/experimental-ct-react';
|
||||
import AxeBuilder from '@axe-core/playwright';
|
||||
import { Form } from './Form';
|
||||
|
||||
test.describe('Form Component Accessibility', () => {
|
||||
test('should have no accessibility violations', async ({ mount, page }) => {
|
||||
await mount(<Form />);
|
||||
|
||||
const accessibilityScanResults = await new AxeBuilder({ page })
|
||||
.analyze();
|
||||
|
||||
expect(accessibilityScanResults.violations).toEqual([]);
|
||||
});
|
||||
|
||||
test('should support keyboard navigation', async ({ mount, page }) => {
|
||||
const component = await mount(<Form />);
|
||||
|
||||
await component.getByLabel('Email address').fill('test@example.com');
|
||||
await page.keyboard.press('Tab');
|
||||
|
||||
await expect(component.getByLabel('Password')).toBeFocused();
|
||||
|
||||
await component.getByLabel('Password').fill('password123');
|
||||
await page.keyboard.press('Tab');
|
||||
|
||||
await expect(component.getByRole('button', { name: 'Submit form' })).toBeFocused();
|
||||
|
||||
await page.keyboard.press('Enter');
|
||||
await expect(component.getByText('Form submitted')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Use `cy.checkA11y()` (Cypress) or `AxeBuilder` (Playwright) for automated accessibility scanning
|
||||
- Validate ARIA roles, labels, and live regions
|
||||
- Test keyboard navigation (Tab, Enter, Escape)
|
||||
- Ensure errors are announced to screen readers (`role="alert"`, `aria-live`)
|
||||
- Check color contrast meets WCAG standards
|
||||
|
||||
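The `cy.realPress()` and `cy.realHover()` calls in these Cypress examples come from the `cypress-real-events` plugin, which has to be registered alongside `cypress-axe` in the component support file. A minimal sketch of that registration (support file path taken from the examples above):

```typescript
// cypress/support/component.tsx
import 'cypress-axe'; // provides cy.injectAxe() and cy.checkA11y()
import 'cypress-real-events/support'; // provides cy.realPress() and cy.realHover()
```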
### Example 4: Visual Regression Test
|
||||
|
||||
**Context**: When testing components, capture screenshots to detect unintended visual changes. Use Playwright visual comparison or Cypress snapshot plugins.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// Playwright visual regression
|
||||
import { test, expect } from '@playwright/experimental-ct-react';
|
||||
import { Button } from './Button';
|
||||
|
||||
test.describe('Button Visual Regression', () => {
|
||||
test('should match primary button snapshot', async ({ mount }) => {
|
||||
const component = await mount(<Button label="Primary" variant="primary" />);
|
||||
|
||||
// Capture and compare screenshot
|
||||
await expect(component).toHaveScreenshot('button-primary.png');
|
||||
});
|
||||
|
||||
test('should match secondary button snapshot', async ({ mount }) => {
|
||||
const component = await mount(<Button label="Secondary" variant="secondary" />);
|
||||
await expect(component).toHaveScreenshot('button-secondary.png');
|
||||
});
|
||||
|
||||
test('should match disabled button snapshot', async ({ mount }) => {
|
||||
const component = await mount(<Button label="Disabled" disabled={true} />);
|
||||
await expect(component).toHaveScreenshot('button-disabled.png');
|
||||
});
|
||||
|
||||
test('should match loading button snapshot', async ({ mount }) => {
|
||||
const component = await mount(<Button label="Loading" loading={true} />);
|
||||
await expect(component).toHaveScreenshot('button-loading.png');
|
||||
});
|
||||
});
|
||||
|
||||
// Cypress visual regression with percy or snapshot plugins
|
||||
import { Button } from './Button';
|
||||
|
||||
describe('Button Visual Regression', () => {
|
||||
it('should match primary button snapshot', () => {
|
||||
cy.wrappedMount(<Button label="Primary" variant="primary" />);
|
||||
|
||||
// Option 1: Percy (cloud-based visual testing)
|
||||
cy.percySnapshot('Button - Primary');
|
||||
|
||||
// Option 2: cypress-plugin-snapshots (local snapshots)
|
||||
cy.get('button').toMatchImageSnapshot({
|
||||
name: 'button-primary',
|
||||
threshold: 0.01 // 1% threshold for pixel differences
|
||||
});
|
||||
});
|
||||
|
||||
it('should match hover state', () => {
|
||||
cy.wrappedMount(<Button label="Hover Me" />);
|
||||
cy.get('button').realHover(); // cypress-real-events
|
||||
cy.percySnapshot('Button - Hover State');
|
||||
});
|
||||
|
||||
it('should match focus state', () => {
|
||||
cy.wrappedMount(<Button label="Focus Me" />);
|
||||
cy.get('button').focus();
|
||||
cy.percySnapshot('Button - Focus State');
|
||||
});
|
||||
});
|
||||
|
||||
// Playwright configuration for visual regression
|
||||
// playwright.config.ts
|
||||
import { defineConfig } from '@playwright/experimental-ct-react';

export default defineConfig({
|
||||
expect: {
|
||||
toHaveScreenshot: {
|
||||
maxDiffPixels: 100, // Allow 100 pixels difference
|
||||
threshold: 0.2 // 20% threshold
|
||||
}
|
||||
},
|
||||
use: {
|
||||
screenshot: 'only-on-failure'
|
||||
}
|
||||
});
|
||||
|
||||
// Update snapshots when intentional changes are made
|
||||
// npx playwright test --update-snapshots
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Playwright: Use `toHaveScreenshot()` for built-in visual comparison
|
||||
- Cypress: Use Percy (cloud) or snapshot plugins (local) for visual testing
|
||||
- Capture different states: default, hover, focus, disabled, loading
|
||||
- Set threshold for acceptable pixel differences (avoid false positives)
|
||||
- Update snapshots when visual changes are intentional
|
||||
- Visual tests catch unintended CSS/layout regressions
|
||||
|
||||
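The `cy.percySnapshot()` calls above assume Percy is wired into the Cypress component setup. A minimal sketch (standard Percy packages; the project token is supplied by you via the environment):

```typescript
// cypress/support/component.tsx
import '@percy/cypress'; // registers cy.percySnapshot()

// Snapshots only upload when the run is wrapped by the Percy agent:
// npx percy exec -- cypress run --component   (requires PERCY_TOKEN in the environment)
```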
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*atdd` (component test generation), `*automate` (component test expansion), `*framework` (component testing setup)
|
||||
- **Related fragments**:
|
||||
- `test-quality.md` - Keep component tests <100 lines, isolated, focused
|
||||
- `fixture-architecture.md` - Provider wrapping patterns, custom mount commands
|
||||
- `data-factories.md` - Factory functions for component props
|
||||
- `test-levels-framework.md` - When to use component tests vs E2E tests
|
||||
|
||||
## TDD Workflow Summary
|
||||
|
||||
**Red-Green-Refactor Cycle** (a minimal sketch follows this list):
|
||||
|
||||
1. **Red**: Write failing test describing desired behavior
|
||||
2. **Green**: Implement minimal code to make test pass
|
||||
3. **Refactor**: Improve code quality, tests stay green
|
||||
4. **Repeat**: Each new feature starts with failing test
|
||||
|
||||
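A minimal illustration of one pass through the cycle, in the Playwright component-test style used above (the `Counter` component is hypothetical):

```typescript
// 1. Red: write the failing test first (Counter.tsx does not exist yet)
import { test, expect } from '@playwright/experimental-ct-react';
import { Counter } from './Counter';

test('increments the count when clicked', async ({ mount }) => {
  const component = await mount(<Counter />);
  await component.getByRole('button', { name: 'Increment' }).click();
  await expect(component.getByText('Count: 1')).toBeVisible();
});

// 2. Green: implement the smallest Counter that renders the button and the count
// 3. Refactor: extract hooks/styles while the test stays green, then repeat
```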
**Component Test Checklist**:
|
||||
|
||||
- [ ] Test renders with required props
|
||||
- [ ] Test user interactions (click, type, submit)
|
||||
- [ ] Test different states (loading, error, disabled)
|
||||
- [ ] Test accessibility (ARIA, keyboard navigation)
|
||||
- [ ] Test visual regression (snapshots)
|
||||
- [ ] Isolate with fresh providers (no state bleed)
|
||||
- [ ] Keep tests <100 lines (split by intent)
|
||||
|
||||
_Source: CCTDD repository, Murat component testing talks, Playwright/Cypress component testing docs._
|
||||
|
||||
@@ -1,9 +1,957 @@
|
||||
# Contract Testing Essentials (Pact)
|
||||
|
||||
- Store consumer contracts beside the integration specs that generate them; version contracts semantically and publish on every CI run.
|
||||
- Require provider verification before merge; failed verification blocks release and surfaces breaking changes immediately.
|
||||
- Capture fallback behaviour inside interactions (timeouts, retries, error payloads) so resilience guarantees remain explicit.
|
||||
- Automate broker housekeeping: tag releases, archive superseded contracts, and expire unused pacts to reduce noise.
|
||||
- Pair contract suites with API smoke or component tests to validate data mapping and UI rendering in tandem.
|
||||
## Principle
|
||||
|
||||
Contract testing validates API contracts between consumer and provider services without requiring integrated end-to-end tests. Store consumer contracts alongside integration specs, version contracts semantically, and publish on every CI run. Provider verification before merge surfaces breaking changes immediately, while explicit fallback behavior (timeouts, retries, error payloads) captures resilience guarantees in contracts.
|
||||
|
||||
## Rationale
|
||||
|
||||
Traditional integration testing requires running both consumer and provider simultaneously, creating slow, flaky tests with complex setup. Contract testing decouples services: consumers define expectations (pact files), providers verify against those expectations independently. This enables parallel development, catches breaking changes early, and documents API behavior as executable specifications. Pair contract tests with API smoke tests to validate data mapping and UI rendering in tandem.
|
||||
|
||||
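For orientation, this is roughly what a generated pact file looks like for the first interaction below (trimmed; matching rules elided):

```json
{
  "consumer": { "name": "user-management-web" },
  "provider": { "name": "user-api-service" },
  "interactions": [
    {
      "description": "a request for user 1",
      "providerStates": [{ "name": "user with id 1 exists" }],
      "request": { "method": "GET", "path": "/users/1" },
      "response": {
        "status": 200,
        "body": { "id": 1, "name": "John Doe", "email": "john@example.com" }
      }
    }
  ],
  "metadata": { "pactSpecification": { "version": "3.0.0" } }
}
```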
## Pattern Examples
|
||||
|
||||
### Example 1: Pact Consumer Test (Frontend → Backend API)
|
||||
|
||||
**Context**: React application consuming a user management API, defining expected interactions.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/contract/user-api.pact.spec.ts
|
||||
import { PactV3, MatchersV3 } from '@pact-foundation/pact';
|
||||
import { getUserById, createUser, User } from '@/api/user-service';
|
||||
|
||||
const { like, eachLike, string, integer } = MatchersV3;
|
||||
|
||||
/**
|
||||
* Consumer-Driven Contract Test
|
||||
* - Consumer (React app) defines expected API behavior
|
||||
* - Generates pact file for provider to verify
|
||||
* - Runs in isolation (no real backend required)
|
||||
*/
|
||||
|
||||
const provider = new PactV3({
|
||||
consumer: 'user-management-web',
|
||||
provider: 'user-api-service',
|
||||
dir: './pacts', // Output directory for pact files
|
||||
logLevel: 'warn',
|
||||
});
|
||||
|
||||
describe('User API Contract', () => {
|
||||
describe('GET /users/:id', () => {
|
||||
it('should return user when user exists', async () => {
|
||||
// Arrange: Define expected interaction
|
||||
await provider
|
||||
.given('user with id 1 exists') // Provider state
|
||||
.uponReceiving('a request for user 1')
|
||||
.withRequest({
|
||||
method: 'GET',
|
||||
path: '/users/1',
|
||||
headers: {
|
||||
Accept: 'application/json',
|
||||
Authorization: like('Bearer token123'), // Matcher: any string
|
||||
},
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 200,
|
||||
headers: {
|
||||
'Content-Type': 'application/json',
|
||||
},
|
||||
body: like({
|
||||
id: integer(1),
|
||||
name: string('John Doe'),
|
||||
email: string('john@example.com'),
|
||||
role: string('user'),
|
||||
createdAt: string('2025-01-15T10:00:00Z'),
|
||||
}),
|
||||
})
|
||||
.executeTest(async (mockServer) => {
|
||||
// Act: Call consumer code against mock server
|
||||
const user = await getUserById(1, {
|
||||
baseURL: mockServer.url,
|
||||
headers: { Authorization: 'Bearer token123' },
|
||||
});
|
||||
|
||||
// Assert: Validate consumer behavior
|
||||
expect(user).toEqual(
|
||||
expect.objectContaining({
|
||||
id: 1,
|
||||
name: 'John Doe',
|
||||
email: 'john@example.com',
|
||||
role: 'user',
|
||||
}),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
it('should handle 404 when user does not exist', async () => {
|
||||
await provider
|
||||
.given('user with id 999 does not exist')
|
||||
.uponReceiving('a request for non-existent user')
|
||||
.withRequest({
|
||||
method: 'GET',
|
||||
path: '/users/999',
|
||||
headers: { Accept: 'application/json' },
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 404,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: {
|
||||
error: 'User not found',
|
||||
code: 'USER_NOT_FOUND',
|
||||
},
|
||||
})
|
||||
.executeTest(async (mockServer) => {
|
||||
// Act & Assert: Consumer handles 404 gracefully
|
||||
await expect(getUserById(999, { baseURL: mockServer.url })).rejects.toThrow('User not found');
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('POST /users', () => {
|
||||
it('should create user and return 201', async () => {
|
||||
const newUser: Omit<User, 'id' | 'createdAt'> = {
|
||||
name: 'Jane Smith',
|
||||
email: 'jane@example.com',
|
||||
role: 'admin',
|
||||
};
|
||||
|
||||
await provider
|
||||
.given('no users exist')
|
||||
.uponReceiving('a request to create a user')
|
||||
.withRequest({
|
||||
method: 'POST',
|
||||
path: '/users',
|
||||
headers: {
|
||||
'Content-Type': 'application/json',
|
||||
Accept: 'application/json',
|
||||
},
|
||||
body: like(newUser),
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 201,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: like({
|
||||
id: integer(2),
|
||||
name: string('Jane Smith'),
|
||||
email: string('jane@example.com'),
|
||||
role: string('admin'),
|
||||
createdAt: string('2025-01-15T11:00:00Z'),
|
||||
}),
|
||||
})
|
||||
.executeTest(async (mockServer) => {
|
||||
const createdUser = await createUser(newUser, {
|
||||
baseURL: mockServer.url,
|
||||
});
|
||||
|
||||
expect(createdUser).toEqual(
|
||||
expect.objectContaining({
|
||||
id: expect.any(Number),
|
||||
name: 'Jane Smith',
|
||||
email: 'jane@example.com',
|
||||
role: 'admin',
|
||||
}),
|
||||
);
|
||||
});
|
||||
});
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**package.json scripts**:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"test:contract": "jest tests/contract --testTimeout=30000",
|
||||
"pact:publish": "pact-broker publish ./pacts --consumer-app-version=$GIT_SHA --broker-base-url=$PACT_BROKER_URL --broker-token=$PACT_BROKER_TOKEN"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Consumer-driven**: Frontend defines expectations, not backend
|
||||
- **Matchers**: `like`, `string`, `integer` for flexible matching
|
||||
- **Provider states**: given() sets up test preconditions
|
||||
- **Isolation**: No real backend needed, runs fast
|
||||
- **Pact generation**: Automatically creates JSON pact files
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Pact Provider Verification (Backend validates contracts)
|
||||
|
||||
**Context**: Node.js/Express API verifying pacts published by consumers.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/contract/user-api.provider.spec.ts
|
||||
import { Verifier, VerifierOptions } from '@pact-foundation/pact';
|
||||
import { server } from '../../src/server'; // Your Express/Fastify app
|
||||
import { seedDatabase, resetDatabase } from '../support/db-helpers';
|
||||
|
||||
/**
|
||||
* Provider Verification Test
|
||||
* - Provider (backend API) verifies against published pacts
|
||||
 * - State handlers set up test data for each interaction
|
||||
* - Runs before merge to catch breaking changes
|
||||
*/
|
||||
|
||||
describe('Pact Provider Verification', () => {
|
||||
let serverInstance;
|
||||
const PORT = 3001;
|
||||
|
||||
beforeAll(async () => {
|
||||
// Start provider server
|
||||
serverInstance = server.listen(PORT);
|
||||
console.log(`Provider server running on port ${PORT}`);
|
||||
});
|
||||
|
||||
afterAll(async () => {
|
||||
// Cleanup
|
||||
await serverInstance.close();
|
||||
});
|
||||
|
||||
it('should verify pacts from all consumers', async () => {
|
||||
const opts: VerifierOptions = {
|
||||
// Provider details
|
||||
provider: 'user-api-service',
|
||||
providerBaseUrl: `http://localhost:${PORT}`,
|
||||
|
||||
// Pact Broker configuration
|
||||
pactBrokerUrl: process.env.PACT_BROKER_URL,
|
||||
pactBrokerToken: process.env.PACT_BROKER_TOKEN,
|
||||
publishVerificationResult: process.env.CI === 'true',
|
||||
providerVersion: process.env.GIT_SHA || 'dev',
|
||||
|
||||
// State handlers: Setup provider state for each interaction
|
||||
stateHandlers: {
|
||||
'user with id 1 exists': async () => {
|
||||
await seedDatabase({
|
||||
users: [
|
||||
{
|
||||
id: 1,
|
||||
name: 'John Doe',
|
||||
email: 'john@example.com',
|
||||
role: 'user',
|
||||
createdAt: '2025-01-15T10:00:00Z',
|
||||
},
|
||||
],
|
||||
});
|
||||
return 'User seeded successfully';
|
||||
},
|
||||
|
||||
'user with id 999 does not exist': async () => {
|
||||
// Ensure user doesn't exist
|
||||
await resetDatabase();
|
||||
return 'Database reset';
|
||||
},
|
||||
|
||||
'no users exist': async () => {
|
||||
await resetDatabase();
|
||||
return 'Database empty';
|
||||
},
|
||||
},
|
||||
|
||||
// Request filters: Add auth headers to all requests
|
||||
requestFilter: (req, res, next) => {
|
||||
// Mock authentication for verification
|
||||
req.headers['x-user-id'] = 'test-user';
|
||||
req.headers['authorization'] = 'Bearer valid-test-token';
|
||||
next();
|
||||
},
|
||||
|
||||
// Timeout for verification
|
||||
timeout: 30000,
|
||||
};
|
||||
|
||||
// Run verification
|
||||
await new Verifier(opts).verifyProvider();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**CI integration**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/pact-provider.yml
|
||||
name: Pact Provider Verification
|
||||
on:
|
||||
pull_request:
|
||||
push:
|
||||
branches: [main]
|
||||
|
||||
jobs:
|
||||
verify-contracts:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version-file: '.nvmrc'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Start database
|
||||
run: docker-compose up -d postgres
|
||||
|
||||
- name: Run migrations
|
||||
run: npm run db:migrate
|
||||
|
||||
- name: Verify pacts
|
||||
run: npm run test:contract:provider
|
||||
env:
|
||||
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
|
||||
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
GIT_SHA: ${{ github.sha }}
|
||||
CI: true
|
||||
|
||||
- name: Can I Deploy?
|
||||
run: |
|
||||
npx pact-broker can-i-deploy \
|
||||
--pacticipant user-api-service \
|
||||
--version ${{ github.sha }} \
|
||||
--to-environment production
|
||||
env:
|
||||
PACT_BROKER_BASE_URL: ${{ secrets.PACT_BROKER_URL }}
|
||||
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **State handlers**: Set up provider data for each given() state
|
||||
- **Request filters**: Add auth/headers for verification requests
|
||||
- **CI publishing**: Verification results sent to broker
|
||||
- **can-i-deploy**: Safety check before production deployment
|
||||
- **Database isolation**: Reset between state handlers
|
||||
|
||||
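The CI job above calls `npm run test:contract:provider`, which is not defined in this fragment; one possible definition (file path taken from the example, flags are assumptions) is:

```json
{
  "scripts": {
    "test:contract:provider": "jest tests/contract/user-api.provider.spec.ts --testTimeout=60000 --runInBand"
  }
}
```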
---
|
||||
|
||||
### Example 3: Contract CI Integration (Consumer & Provider Workflow)
|
||||
|
||||
**Context**: Complete CI/CD workflow coordinating consumer pact publishing and provider verification.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/pact-consumer.yml (Consumer side)
|
||||
name: Pact Consumer Tests
|
||||
on:
|
||||
pull_request:
|
||||
push:
|
||||
branches: [main]
|
||||
|
||||
jobs:
|
||||
consumer-tests:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version-file: '.nvmrc'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Run consumer contract tests
|
||||
run: npm run test:contract
|
||||
|
||||
- name: Publish pacts to broker
|
||||
if: github.ref == 'refs/heads/main' || github.event_name == 'pull_request'
|
||||
run: |
|
||||
npx pact-broker publish ./pacts \
|
||||
--consumer-app-version ${{ github.sha }} \
|
||||
--branch ${{ github.head_ref || github.ref_name }} \
|
||||
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
|
||||
--broker-token ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
|
||||
- name: Tag pact with environment (main branch only)
|
||||
if: github.ref == 'refs/heads/main'
|
||||
run: |
|
||||
npx pact-broker create-version-tag \
|
||||
--pacticipant user-management-web \
|
||||
--version ${{ github.sha }} \
|
||||
--tag production \
|
||||
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
|
||||
--broker-token ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
```
|
||||
|
||||
```yaml
|
||||
# .github/workflows/pact-provider.yml (Provider side)
|
||||
name: Pact Provider Verification
|
||||
on:
|
||||
pull_request:
|
||||
push:
|
||||
branches: [main]
|
||||
repository_dispatch:
|
||||
types: [pact_changed] # Webhook from Pact Broker
|
||||
|
||||
jobs:
|
||||
verify-contracts:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version-file: '.nvmrc'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Start dependencies
|
||||
run: docker-compose up -d
|
||||
|
||||
- name: Run provider verification
|
||||
run: npm run test:contract:provider
|
||||
env:
|
||||
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
|
||||
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
GIT_SHA: ${{ github.sha }}
|
||||
CI: true
|
||||
|
||||
- name: Publish verification results
|
||||
if: always()
|
||||
run: echo "Verification results published to broker"
|
||||
|
||||
- name: Can I Deploy to Production?
|
||||
if: github.ref == 'refs/heads/main'
|
||||
run: |
|
||||
npx pact-broker can-i-deploy \
|
||||
--pacticipant user-api-service \
|
||||
--version ${{ github.sha }} \
|
||||
--to-environment production \
|
||||
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
|
||||
--broker-token ${{ secrets.PACT_BROKER_TOKEN }} \
|
||||
--retry-while-unknown 6 \
|
||||
--retry-interval 10
|
||||
|
||||
- name: Record deployment (if can-i-deploy passed)
|
||||
if: success() && github.ref == 'refs/heads/main'
|
||||
run: |
|
||||
npx pact-broker record-deployment \
|
||||
--pacticipant user-api-service \
|
||||
--version ${{ github.sha }} \
|
||||
--environment production \
|
||||
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
|
||||
--broker-token ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
```
|
||||
|
||||
**Pact Broker Webhook Configuration**:
|
||||
|
||||
```json
|
||||
{
|
||||
"events": [
|
||||
{
|
||||
"name": "contract_content_changed"
|
||||
}
|
||||
],
|
||||
"request": {
|
||||
"method": "POST",
|
||||
"url": "https://api.github.com/repos/your-org/user-api/dispatches",
|
||||
"headers": {
|
||||
"Authorization": "Bearer ${user.githubToken}",
|
||||
"Content-Type": "application/json",
|
||||
"Accept": "application/vnd.github.v3+json"
|
||||
},
|
||||
"body": {
|
||||
"event_type": "pact_changed",
|
||||
"client_payload": {
|
||||
"pact_url": "${pactbroker.pactUrl}",
|
||||
"consumer": "${pactbroker.consumerName}",
|
||||
"provider": "${pactbroker.providerName}"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Automatic trigger**: Consumer pact changes trigger provider verification via webhook
|
||||
- **Branch tracking**: Pacts published per branch for feature testing
|
||||
- **can-i-deploy**: Safety gate before production deployment
|
||||
- **Record deployment**: Track which version is in each environment
|
||||
- **Parallel dev**: Consumer and provider teams work independently
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Resilience Coverage (Testing Fallback Behavior)
|
||||
|
||||
**Context**: Capture timeout, retry, and error handling behavior explicitly in contracts.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/contract/user-api-resilience.pact.spec.ts
|
||||
import { PactV3, MatchersV3 } from '@pact-foundation/pact';
|
||||
import { getUserById, ApiError } from '@/api/user-service';
|
||||
|
||||
const { like, string, integer } = MatchersV3;
|
||||
|
||||
const provider = new PactV3({
|
||||
consumer: 'user-management-web',
|
||||
provider: 'user-api-service',
|
||||
dir: './pacts',
|
||||
});
|
||||
|
||||
describe('User API Resilience Contract', () => {
|
||||
/**
|
||||
* Test 500 error handling
|
||||
* Verifies consumer handles server errors gracefully
|
||||
*/
|
||||
it('should handle 500 errors with retry logic', async () => {
|
||||
await provider
|
||||
.given('server is experiencing errors')
|
||||
.uponReceiving('a request that returns 500')
|
||||
.withRequest({
|
||||
method: 'GET',
|
||||
path: '/users/1',
|
||||
headers: { Accept: 'application/json' },
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 500,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: {
|
||||
error: 'Internal server error',
|
||||
code: 'INTERNAL_ERROR',
|
||||
retryable: true,
|
||||
},
|
||||
})
|
||||
.executeTest(async (mockServer) => {
|
||||
// Consumer should retry on 500
|
||||
try {
|
||||
await getUserById(1, {
|
||||
baseURL: mockServer.url,
|
||||
retries: 3,
|
||||
retryDelay: 100,
|
||||
});
|
||||
fail('Should have thrown error after retries');
|
||||
} catch (error) {
|
||||
expect(error).toBeInstanceOf(ApiError);
|
||||
expect((error as ApiError).code).toBe('INTERNAL_ERROR');
|
||||
expect((error as ApiError).retryable).toBe(true);
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
/**
|
||||
* Test 429 rate limiting
|
||||
* Verifies consumer respects rate limits
|
||||
*/
|
||||
it('should handle 429 rate limit with backoff', async () => {
|
||||
await provider
|
||||
.given('rate limit exceeded for user')
|
||||
.uponReceiving('a request that is rate limited')
|
||||
.withRequest({
|
||||
method: 'GET',
|
||||
path: '/users/1',
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 429,
|
||||
headers: {
|
||||
'Content-Type': 'application/json',
|
||||
'Retry-After': '60', // Retry after 60 seconds
|
||||
},
|
||||
body: {
|
||||
error: 'Too many requests',
|
||||
code: 'RATE_LIMIT_EXCEEDED',
|
||||
},
|
||||
})
|
||||
.executeTest(async (mockServer) => {
|
||||
try {
|
||||
await getUserById(1, {
|
||||
baseURL: mockServer.url,
|
||||
respectRateLimit: true,
|
||||
});
|
||||
fail('Should have thrown rate limit error');
|
||||
} catch (error) {
|
||||
expect(error).toBeInstanceOf(ApiError);
|
||||
expect((error as ApiError).code).toBe('RATE_LIMIT_EXCEEDED');
|
||||
expect((error as ApiError).retryAfter).toBe(60);
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
/**
|
||||
* Test timeout handling
|
||||
* Verifies consumer has appropriate timeout configuration
|
||||
*/
|
||||
it('should timeout after 10 seconds', async () => {
|
||||
await provider
|
||||
.given('server is slow to respond')
|
||||
.uponReceiving('a request that times out')
|
||||
.withRequest({
|
||||
method: 'GET',
|
||||
path: '/users/1',
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 200,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: like({ id: 1, name: 'John' }),
|
||||
})
|
||||
.withDelay(15000) // Simulate 15 second delay
|
||||
.executeTest(async (mockServer) => {
|
||||
try {
|
||||
await getUserById(1, {
|
||||
baseURL: mockServer.url,
|
||||
timeout: 10000, // 10 second timeout
|
||||
});
|
||||
fail('Should have timed out');
|
||||
} catch (error) {
|
||||
expect(error).toBeInstanceOf(ApiError);
|
||||
expect((error as ApiError).code).toBe('TIMEOUT');
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
/**
|
||||
* Test partial response (optional fields)
|
||||
* Verifies consumer handles missing optional data
|
||||
*/
|
||||
it('should handle response with missing optional fields', async () => {
|
||||
await provider
|
||||
.given('user exists with minimal data')
|
||||
.uponReceiving('a request for user with partial data')
|
||||
.withRequest({
|
||||
method: 'GET',
|
||||
path: '/users/1',
|
||||
})
|
||||
.willRespondWith({
|
||||
status: 200,
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: {
|
||||
id: integer(1),
|
||||
name: string('John Doe'),
|
||||
email: string('john@example.com'),
|
||||
// role, createdAt, etc. omitted (optional fields)
|
||||
},
|
||||
})
|
||||
.executeTest(async (mockServer) => {
|
||||
const user = await getUserById(1, { baseURL: mockServer.url });
|
||||
|
||||
// Consumer handles missing optional fields gracefully
|
||||
expect(user.id).toBe(1);
|
||||
expect(user.name).toBe('John Doe');
|
||||
expect(user.role).toBeUndefined(); // Optional field
|
||||
expect(user.createdAt).toBeUndefined(); // Optional field
|
||||
});
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**API client with retry logic**:
|
||||
|
||||
```typescript
|
||||
// src/api/user-service.ts
|
||||
import axios, { AxiosInstance, AxiosRequestConfig } from 'axios';
|
||||
|
||||
export class ApiError extends Error {
|
||||
constructor(
|
||||
message: string,
|
||||
public code: string,
|
||||
public retryable: boolean = false,
|
||||
public retryAfter?: number,
|
||||
) {
|
||||
super(message);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* User API client with retry and error handling
|
||||
*/
|
||||
export async function getUserById(
|
||||
id: number,
|
||||
config?: AxiosRequestConfig & { retries?: number; retryDelay?: number; respectRateLimit?: boolean },
|
||||
): Promise<User> {
|
||||
const { retries = 3, retryDelay = 1000, respectRateLimit = true, ...axiosConfig } = config || {};
|
||||
|
||||
let lastError: Error;
|
||||
|
||||
for (let attempt = 1; attempt <= retries; attempt++) {
|
||||
try {
|
||||
const response = await axios.get(`/users/${id}`, axiosConfig);
|
||||
return response.data;
|
||||
} catch (error: any) {
|
||||
lastError = error;
|
||||
|
||||
// Handle rate limiting
|
||||
if (error.response?.status === 429) {
|
||||
const retryAfter = parseInt(error.response.headers['retry-after'] || '60');
|
||||
throw new ApiError('Too many requests', 'RATE_LIMIT_EXCEEDED', false, retryAfter);
|
||||
}
|
||||
|
||||
// Retry on 500 errors
|
||||
if (error.response?.status === 500 && attempt < retries) {
|
||||
await new Promise((resolve) => setTimeout(resolve, retryDelay * attempt));
|
||||
continue;
|
||||
}
|
||||
|
||||
// Handle 404
|
||||
if (error.response?.status === 404) {
|
||||
throw new ApiError('User not found', 'USER_NOT_FOUND', false);
|
||||
}
|
||||
|
||||
// Handle timeout
|
||||
if (error.code === 'ECONNABORTED') {
|
||||
throw new ApiError('Request timeout', 'TIMEOUT', true);
|
||||
}
|
||||
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
throw new ApiError('Request failed after retries', 'INTERNAL_ERROR', true);
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Resilience contracts**: Timeouts, retries, errors explicitly tested
|
||||
- **State handlers**: Provider sets up each test scenario
|
||||
- **Error handling**: Consumer validates graceful degradation
|
||||
- **Retry logic**: Retries with increasing (linear) backoff tested
|
||||
- **Optional fields**: Consumer handles partial responses
|
||||
|
||||
---
|
||||
|
||||
### Example 5: Pact Broker Housekeeping & Lifecycle Management
|
||||
|
||||
**Context**: Automated broker maintenance to prevent contract sprawl and noise.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// scripts/pact-broker-housekeeping.ts
|
||||
/**
|
||||
* Pact Broker Housekeeping Script
|
||||
* - Archive superseded contracts
|
||||
* - Expire unused pacts
|
||||
* - Tag releases for environment tracking
|
||||
*/
|
||||
|
||||
import { execSync } from 'child_process';
|
||||
|
||||
const PACT_BROKER_URL = process.env.PACT_BROKER_URL!;
|
||||
const PACT_BROKER_TOKEN = process.env.PACT_BROKER_TOKEN!;
|
||||
const PACTICIPANT = 'user-api-service';
|
||||
|
||||
/**
|
||||
* Tag release with environment
|
||||
*/
|
||||
function tagRelease(version: string, environment: 'staging' | 'production') {
|
||||
console.log(`🏷️ Tagging ${PACTICIPANT} v${version} as ${environment}`);
|
||||
|
||||
execSync(
|
||||
`npx pact-broker create-version-tag \
|
||||
--pacticipant ${PACTICIPANT} \
|
||||
--version ${version} \
|
||||
--tag ${environment} \
|
||||
--broker-base-url ${PACT_BROKER_URL} \
|
||||
--broker-token ${PACT_BROKER_TOKEN}`,
|
||||
{ stdio: 'inherit' },
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Record deployment to environment
|
||||
*/
|
||||
function recordDeployment(version: string, environment: 'staging' | 'production') {
|
||||
console.log(`📝 Recording deployment of ${PACTICIPANT} v${version} to ${environment}`);
|
||||
|
||||
execSync(
|
||||
`npx pact-broker record-deployment \
|
||||
--pacticipant ${PACTICIPANT} \
|
||||
--version ${version} \
|
||||
--environment ${environment} \
|
||||
--broker-base-url ${PACT_BROKER_URL} \
|
||||
--broker-token ${PACT_BROKER_TOKEN}`,
|
||||
{ stdio: 'inherit' },
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Clean up old pact versions (retention policy)
|
||||
* Keep: last 30 days, all production tags, latest from each branch
|
||||
*/
|
||||
function cleanupOldPacts() {
|
||||
console.log(`🧹 Cleaning up old pacts for ${PACTICIPANT}`);
|
||||
|
||||
execSync(
|
||||
`npx pact-broker clean \
|
||||
--pacticipant ${PACTICIPANT} \
|
||||
--broker-base-url ${PACT_BROKER_URL} \
|
||||
--broker-token ${PACT_BROKER_TOKEN} \
|
||||
--keep-latest-for-branch 1 \
|
||||
--keep-min-age 30`,
|
||||
{ stdio: 'inherit' },
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Check deployment compatibility
|
||||
*/
|
||||
function canIDeploy(version: string, toEnvironment: string): boolean {
|
||||
console.log(`🔍 Checking if ${PACTICIPANT} v${version} can deploy to ${toEnvironment}`);
|
||||
|
||||
try {
|
||||
execSync(
|
||||
`npx pact-broker can-i-deploy \
|
||||
--pacticipant ${PACTICIPANT} \
|
||||
--version ${version} \
|
||||
--to-environment ${toEnvironment} \
|
||||
--broker-base-url ${PACT_BROKER_URL} \
|
||||
--broker-token ${PACT_BROKER_TOKEN} \
|
||||
--retry-while-unknown 6 \
|
||||
--retry-interval 10`,
|
||||
{ stdio: 'inherit' },
|
||||
);
|
||||
return true;
|
||||
} catch (error) {
|
||||
console.error(`❌ Cannot deploy to ${toEnvironment}`);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Main housekeeping workflow
|
||||
*/
|
||||
async function main() {
|
||||
const command = process.argv[2];
|
||||
const version = process.argv[3];
|
||||
const environment = process.argv[4] as 'staging' | 'production';
|
||||
|
||||
switch (command) {
|
||||
case 'tag-release':
|
||||
tagRelease(version, environment);
|
||||
break;
|
||||
|
||||
case 'record-deployment':
|
||||
recordDeployment(version, environment);
|
||||
break;
|
||||
|
||||
case 'can-i-deploy':
|
||||
const canDeploy = canIDeploy(version, environment);
|
||||
process.exit(canDeploy ? 0 : 1);
|
||||
|
||||
case 'cleanup':
|
||||
cleanupOldPacts();
|
||||
break;
|
||||
|
||||
default:
|
||||
console.error('Unknown command. Use: tag-release | record-deployment | can-i-deploy | cleanup');
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
main();
|
||||
```
|
||||
|
||||
**package.json scripts**:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"pact:tag": "ts-node scripts/pact-broker-housekeeping.ts tag-release",
|
||||
"pact:record": "ts-node scripts/pact-broker-housekeeping.ts record-deployment",
|
||||
"pact:can-deploy": "ts-node scripts/pact-broker-housekeeping.ts can-i-deploy",
|
||||
"pact:cleanup": "ts-node scripts/pact-broker-housekeeping.ts cleanup"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Deployment workflow integration**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/deploy-production.yml
|
||||
name: Deploy to Production
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- 'v*'
|
||||
|
||||
jobs:
|
||||
verify-contracts:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Check pact compatibility
|
||||
run: npm run pact:can-deploy ${{ github.ref_name }} production
|
||||
env:
|
||||
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
|
||||
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
|
||||
deploy:
|
||||
needs: verify-contracts
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Deploy to production
|
||||
run: ./scripts/deploy.sh production
|
||||
|
||||
- name: Record deployment in Pact Broker
|
||||
run: npm run pact:record ${{ github.ref_name }} production
|
||||
env:
|
||||
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
|
||||
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
```
|
||||
|
||||
**Scheduled cleanup**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/pact-housekeeping.yml
|
||||
name: Pact Broker Housekeeping
|
||||
on:
|
||||
schedule:
|
||||
- cron: '0 2 * * 0' # Weekly on Sunday at 2 AM
|
||||
|
||||
jobs:
|
||||
cleanup:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Cleanup old pacts
|
||||
run: npm run pact:cleanup
|
||||
env:
|
||||
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
|
||||
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Automated tagging**: Releases tagged with environment
|
||||
- **Deployment tracking**: Broker knows which version is where
|
||||
- **Safety gate**: can-i-deploy blocks incompatible deployments
|
||||
- **Retention policy**: Keep recent, production, and branch-latest pacts
|
||||
- **Webhook triggers**: Provider verification runs on consumer changes
|
||||
|
||||
---
|
||||
|
||||
## Contract Testing Checklist
|
||||
|
||||
Before implementing contract testing, verify:
|
||||
|
||||
- [ ] **Pact Broker setup**: Hosted (Pactflow) or self-hosted broker configured
|
||||
- [ ] **Consumer tests**: Generate pacts in CI, publish to broker on merge
|
||||
- [ ] **Provider verification**: Runs on PR, verifies all consumer pacts
|
||||
- [ ] **State handlers**: Provider implements all given() states
|
||||
- [ ] **can-i-deploy**: Blocks deployment if contracts incompatible
|
||||
- [ ] **Webhooks configured**: Consumer changes trigger provider verification
|
||||
- [ ] **Retention policy**: Old pacts archived (keep 30 days, all production tags)
|
||||
- [ ] **Resilience tested**: Timeouts, retries, error codes in contracts
|
||||
|
||||
## Integration Points
|
||||
|
||||
- Used in workflows: `*automate` (integration test generation), `*ci` (contract CI setup)
|
||||
- Related fragments: `test-levels-framework.md`, `ci-burn-in.md`
|
||||
- Tools: Pact.js, Pact Broker (Pactflow or self-hosted), Pact CLI
|
||||
|
||||
_Source: Pact consumer/provider sample repos, Murat contract testing blog, Pact official documentation_
|
||||
|
||||
@@ -1,9 +1,500 @@
|
||||
# Data Factories and API-First Setup
|
||||
|
||||
- Prefer factory functions that accept overrides and return complete objects (`createUser(overrides)`)—never rely on static fixtures.
|
||||
- Seed state through APIs, tasks, or direct DB helpers before visiting the UI; UI-based setup is for validation only.
|
||||
- Ensure factories generate parallel-safe identifiers (UUIDs, timestamps) and perform cleanup after each test.
|
||||
- Centralize factory exports to avoid duplication; version them alongside schema changes to catch drift in reviews.
|
||||
- When working with shared environments, layer feature toggles or targeted cleanup so factories do not clobber concurrent runs.
|
||||
## Principle
|
||||
|
||||
Prefer factory functions that accept overrides and return complete objects (`createUser(overrides)`). Seed test state through APIs, tasks, or direct DB helpers before visiting the UI—never via slow UI interactions. UI is for validation only, not setup.
|
||||
|
||||
## Rationale
|
||||
|
||||
Static fixtures (JSON files, hardcoded objects) create brittle tests that:
|
||||
|
||||
- Fail when schemas evolve (missing new required fields)
|
||||
- Cause collisions in parallel execution (same user IDs)
|
||||
- Hide test intent (what matters for _this_ test?)
|
||||
|
||||
Dynamic factories with overrides provide:
|
||||
|
||||
- **Parallel safety**: UUIDs and timestamps prevent collisions
|
||||
- **Schema evolution**: Defaults adapt to schema changes automatically
|
||||
- **Explicit intent**: Overrides show what matters for each test
|
||||
- **Speed**: API setup is 10-50x faster than UI
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Factory Function with Overrides
|
||||
|
||||
**Context**: When creating test data, build factory functions with sensible defaults and explicit overrides. Use `faker` for dynamic values that prevent collisions.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// test-utils/factories/user-factory.ts
|
||||
import { faker } from '@faker-js/faker';
|
||||
|
||||
export type User = {
|
||||
id: string;
|
||||
email: string;
|
||||
name: string;
|
||||
role: 'user' | 'admin' | 'moderator';
|
||||
createdAt: Date;
|
||||
isActive: boolean;
|
||||
};
|
||||
|
||||
export const createUser = (overrides: Partial<User> = {}): User => ({
|
||||
id: faker.string.uuid(),
|
||||
email: faker.internet.email(),
|
||||
name: faker.person.fullName(),
|
||||
role: 'user',
|
||||
createdAt: new Date(),
|
||||
isActive: true,
|
||||
...overrides,
|
||||
});
|
||||
|
||||
// test-utils/factories/product-factory.ts
|
||||
export type Product = {
|
||||
id: string;
|
||||
name: string;
|
||||
price: number;
|
||||
stock: number;
|
||||
category: string;
|
||||
};
|
||||
|
||||
export const createProduct = (overrides: Partial<Product> = {}): Product => ({
|
||||
id: faker.string.uuid(),
|
||||
name: faker.commerce.productName(),
|
||||
price: parseFloat(faker.commerce.price()),
|
||||
stock: faker.number.int({ min: 0, max: 100 }),
|
||||
category: faker.commerce.department(),
|
||||
...overrides,
|
||||
});
|
||||
|
||||
// Usage in tests:
|
||||
test('admin can delete users', async ({ page, apiRequest }) => {
|
||||
// Default user
|
||||
const user = createUser();
|
||||
|
||||
// Admin user (explicit override shows intent)
|
||||
const admin = createUser({ role: 'admin' });
|
||||
|
||||
// Seed via API (fast!)
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: user });
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: admin });
|
||||
|
||||
// Now test UI behavior
|
||||
await page.goto('/admin/users');
|
||||
await page.click(`[data-testid="delete-user-${user.id}"]`);
|
||||
await expect(page.getByText(`User ${user.name} deleted`)).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- `Partial<User>` allows overriding any field without breaking type safety
|
||||
- Faker generates unique values—no collisions in parallel tests
|
||||
- Override shows test intent: `createUser({ role: 'admin' })` is explicit
|
||||
- Factory lives in `test-utils/factories/` for easy reuse
|
||||
|
||||
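One way to centralize factory exports, as recommended in the bullets at the top of this fragment, is a barrel file so tests import from a single path (file name assumed):

```typescript
// test-utils/factories/index.ts
export * from './user-factory';
export * from './product-factory';
// add new factories here so tests can import them from a single 'test-utils/factories' path
```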
### Example 2: Nested Factory Pattern
|
||||
|
||||
**Context**: When testing relationships (orders with users and products), nest factories to create complete object graphs. Control relationship data explicitly.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// test-utils/factories/order-factory.ts
|
||||
import { createUser } from './user-factory';
|
||||
import { createProduct } from './product-factory';
import { faker } from '@faker-js/faker';
import type { User } from './user-factory';
import type { Product } from './product-factory';
|
||||
|
||||
type OrderItem = {
|
||||
product: Product;
|
||||
quantity: number;
|
||||
price: number;
|
||||
};
|
||||
|
||||
type Order = {
|
||||
id: string;
|
||||
user: User;
|
||||
items: OrderItem[];
|
||||
total: number;
|
||||
status: 'pending' | 'paid' | 'shipped' | 'delivered';
|
||||
createdAt: Date;
|
||||
};
|
||||
|
||||
export const createOrderItem = (overrides: Partial<OrderItem> = {}): OrderItem => {
|
||||
const product = overrides.product || createProduct();
|
||||
const quantity = overrides.quantity || faker.number.int({ min: 1, max: 5 });
|
||||
|
||||
return {
|
||||
product,
|
||||
quantity,
|
||||
price: product.price * quantity,
|
||||
...overrides,
|
||||
};
|
||||
};
|
||||
|
||||
export const createOrder = (overrides: Partial<Order> = {}): Order => {
|
||||
const items = overrides.items || [createOrderItem(), createOrderItem()];
|
||||
const total = items.reduce((sum, item) => sum + item.price, 0);
|
||||
|
||||
return {
|
||||
id: faker.string.uuid(),
|
||||
user: overrides.user || createUser(),
|
||||
items,
|
||||
total,
|
||||
status: 'pending',
|
||||
createdAt: new Date(),
|
||||
...overrides,
|
||||
};
|
||||
};
|
||||
|
||||
// Usage in tests:
|
||||
test('user can view order details', async ({ page, apiRequest }) => {
|
||||
const user = createUser({ email: 'test@example.com' });
|
||||
const product1 = createProduct({ name: 'Widget A', price: 10.0 });
|
||||
const product2 = createProduct({ name: 'Widget B', price: 15.0 });
|
||||
|
||||
// Explicit relationships
|
||||
const order = createOrder({
|
||||
user,
|
||||
items: [
|
||||
createOrderItem({ product: product1, quantity: 2 }), // $20
|
||||
createOrderItem({ product: product2, quantity: 1 }), // $15
|
||||
],
|
||||
});
|
||||
|
||||
// Seed via API
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: user });
|
||||
await apiRequest({ method: 'POST', url: '/api/products', data: product1 });
|
||||
await apiRequest({ method: 'POST', url: '/api/products', data: product2 });
|
||||
await apiRequest({ method: 'POST', url: '/api/orders', data: order });
|
||||
|
||||
// Test UI
|
||||
await page.goto(`/orders/${order.id}`);
|
||||
await expect(page.getByText('Widget A x 2')).toBeVisible();
|
||||
await expect(page.getByText('Widget B x 1')).toBeVisible();
|
||||
await expect(page.getByText('Total: $35.00')).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Nested factories handle relationships (order → user, order → products)
|
||||
- Overrides cascade: provide custom user/products or use defaults
|
||||
- Calculated fields (total) derived automatically from nested data
|
||||
- Explicit relationships make test data clear and maintainable
|
||||
|
||||
### Example 3: Factory with API Seeding
|
||||
|
||||
**Context**: When tests need data setup, always use API calls or database tasks—never UI navigation. Wrap factory usage with seeding utilities for clean test setup.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright/support/helpers/seed-helpers.ts
|
||||
import { APIRequestContext } from '@playwright/test';
|
||||
import { User, createUser } from '../../test-utils/factories/user-factory';
|
||||
import { Product, createProduct } from '../../test-utils/factories/product-factory';
|
||||
|
||||
export async function seedUser(request: APIRequestContext, overrides: Partial<User> = {}): Promise<User> {
|
||||
const user = createUser(overrides);
|
||||
|
||||
const response = await request.post('/api/users', {
|
||||
data: user,
|
||||
});
|
||||
|
||||
if (!response.ok()) {
|
||||
throw new Error(`Failed to seed user: ${response.status()}`);
|
||||
}
|
||||
|
||||
return user;
|
||||
}
|
||||
|
||||
export async function seedProduct(request: APIRequestContext, overrides: Partial<Product> = {}): Promise<Product> {
|
||||
const product = createProduct(overrides);
|
||||
|
||||
const response = await request.post('/api/products', {
|
||||
data: product,
|
||||
});
|
||||
|
||||
if (!response.ok()) {
|
||||
throw new Error(`Failed to seed product: ${response.status()}`);
|
||||
}
|
||||
|
||||
return product;
|
||||
}
|
||||
|
||||
// Playwright globalSetup for shared data
|
||||
// playwright/support/global-setup.ts
|
||||
import { chromium, FullConfig } from '@playwright/test';
|
||||
import { seedUser } from './helpers/seed-helpers';
|
||||
|
||||
async function globalSetup(config: FullConfig) {
|
||||
const browser = await chromium.launch();
|
||||
const page = await browser.newPage();
|
||||
const context = page.context();
|
||||
|
||||
// Seed admin user for all tests
|
||||
const admin = await seedUser(context.request, {
|
||||
email: 'admin@example.com',
|
||||
role: 'admin',
|
||||
});
|
||||
|
||||
// Save auth state for reuse
|
||||
await context.storageState({ path: 'playwright/.auth/admin.json' });
|
||||
|
||||
await browser.close();
|
||||
}
|
||||
|
||||
export default globalSetup;
|
||||
|
||||
// Cypress equivalent with cy.task
|
||||
// cypress/support/tasks.ts
|
||||
export const seedDatabase = async (entity: string, data: unknown) => {
|
||||
// Direct database insert or API call
|
||||
if (entity === 'users') {
|
||||
await db.users.create(data);
|
||||
}
|
||||
return null;
|
||||
};
|
||||
|
||||
// Usage in Cypress tests:
|
||||
beforeEach(() => {
|
||||
const user = createUser({ email: 'test@example.com' });
|
||||
cy.task('db:seed', { entity: 'users', data: user });
|
||||
});
|
||||
```
|
||||
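The `cy.task('db:seed', ...)` call above only works once the task is registered in the Cypress config; a minimal sketch of that wiring (React + Vite dev server is an assumption, as is the `db` handle used by `seedDatabase`):

```typescript
// cypress.config.ts
import { defineConfig } from 'cypress';
import { seedDatabase } from './cypress/support/tasks';

export default defineConfig({
  component: {
    devServer: { framework: 'react', bundler: 'vite' },
    setupNodeEvents(on) {
      // Map the task name used in tests to the Node-side helper
      on('task', {
        'db:seed': ({ entity, data }: { entity: string; data: unknown }) => seedDatabase(entity, data),
      });
    },
  },
});
```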
|
||||
**Key Points**:
|
||||
|
||||
- API seeding is 10-50x faster than UI-based setup
|
||||
- `globalSetup` seeds shared data once (e.g., admin user)
|
||||
- Per-test seeding uses `seedUser()` helpers for isolation
|
||||
- Cypress `cy.task` allows direct database access for speed
|
||||
|
||||
### Example 4: Anti-Pattern - Hardcoded Test Data
|
||||
|
||||
**Problem**:
|
||||
|
||||
```typescript
|
||||
// ❌ BAD: Hardcoded test data
|
||||
test('user can login', async ({ page }) => {
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email"]', 'test@test.com'); // Hardcoded
|
||||
await page.fill('[data-testid="password"]', 'password123'); // Hardcoded
|
||||
await page.click('[data-testid="submit"]');
|
||||
|
||||
// What if this user already exists? Test fails in parallel runs.
|
||||
// What if schema adds required fields? Test breaks.
|
||||
});
|
||||
|
||||
// ❌ BAD: Static JSON fixtures
|
||||
// fixtures/users.json
|
||||
{
|
||||
"users": [
|
||||
{ "id": 1, "email": "user1@test.com", "name": "User 1" },
|
||||
{ "id": 2, "email": "user2@test.com", "name": "User 2" }
|
||||
]
|
||||
}
|
||||
|
||||
test('admin can delete user', async ({ page }) => {
|
||||
const users = require('../fixtures/users.json');
|
||||
// Brittle: IDs collide in parallel, schema drift breaks tests
|
||||
});
|
||||
```
|
||||
|
||||
**Why It Fails**:
|
||||
|
||||
- **Parallel collisions**: Hardcoded IDs (`id: 1`, `email: 'test@test.com'`) cause failures when tests run concurrently
|
||||
- **Schema drift**: Adding required fields (`phoneNumber`, `address`) breaks all tests using fixtures
|
||||
- **Hidden intent**: Does this test need `email: 'test@test.com'` specifically, or any email?
|
||||
- **Slow setup**: UI-based data creation is 10-50x slower than API
|
||||
|
||||
**Better Approach**: Use factories
|
||||
|
||||
```typescript
|
||||
// ✅ GOOD: Factory-based data
|
||||
test('user can login', async ({ page, apiRequest }) => {
|
||||
const user = createUser({ email: 'unique@example.com', password: 'secure123' });
|
||||
|
||||
// Seed via API (fast, parallel-safe)
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: user });
|
||||
|
||||
// Test UI
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email"]', user.email);
|
||||
await page.fill('[data-testid="password"]', user.password);
|
||||
await page.click('[data-testid="submit"]');
|
||||
|
||||
await expect(page).toHaveURL('/dashboard');
|
||||
});
|
||||
|
||||
// ✅ GOOD: Factories adapt to schema changes automatically
|
||||
// When `phoneNumber` becomes required, update factory once:
|
||||
export const createUser = (overrides: Partial<User> = {}): User => ({
|
||||
id: faker.string.uuid(),
|
||||
email: faker.internet.email(),
|
||||
name: faker.person.fullName(),
|
||||
phoneNumber: faker.phone.number(), // NEW field, all tests get it automatically
|
||||
role: 'user',
|
||||
...overrides,
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Factories generate unique, parallel-safe data
|
||||
- Schema evolution handled in one place (factory), not every test
|
||||
- Test intent explicit via overrides
|
||||
- API seeding is fast and reliable
|
||||
|
||||
### Example 5: Factory Composition
|
||||
|
||||
**Context**: When building specialized factories, compose simpler factories instead of duplicating logic. Layer overrides for specific test scenarios.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// test-utils/factories/user-factory.ts (base)
|
||||
export const createUser = (overrides: Partial<User> = {}): User => ({
|
||||
id: faker.string.uuid(),
|
||||
email: faker.internet.email(),
|
||||
name: faker.person.fullName(),
|
||||
role: 'user',
|
||||
createdAt: new Date(),
|
||||
isActive: true,
|
||||
...overrides,
|
||||
});
|
||||
|
||||
// Compose specialized factories
|
||||
export const createAdminUser = (overrides: Partial<User> = {}): User => createUser({ role: 'admin', ...overrides });
|
||||
|
||||
export const createModeratorUser = (overrides: Partial<User> = {}): User => createUser({ role: 'moderator', ...overrides });
|
||||
|
||||
export const createInactiveUser = (overrides: Partial<User> = {}): User => createUser({ isActive: false, ...overrides });
|
||||
|
||||
// Account-level factories with feature flags
|
||||
type Account = {
|
||||
id: string;
|
||||
owner: User;
|
||||
plan: 'free' | 'pro' | 'enterprise';
|
||||
features: string[];
|
||||
maxUsers: number;
|
||||
};
|
||||
|
||||
export const createAccount = (overrides: Partial<Account> = {}): Account => ({
|
||||
id: faker.string.uuid(),
|
||||
owner: overrides.owner || createUser(),
|
||||
plan: 'free',
|
||||
features: [],
|
||||
maxUsers: 1,
|
||||
...overrides,
|
||||
});
|
||||
|
||||
export const createProAccount = (overrides: Partial<Account> = {}): Account =>
|
||||
createAccount({
|
||||
plan: 'pro',
|
||||
features: ['advanced-analytics', 'priority-support'],
|
||||
maxUsers: 10,
|
||||
...overrides,
|
||||
});
|
||||
|
||||
export const createEnterpriseAccount = (overrides: Partial<Account> = {}): Account =>
|
||||
createAccount({
|
||||
plan: 'enterprise',
|
||||
features: ['advanced-analytics', 'priority-support', 'sso', 'audit-logs'],
|
||||
maxUsers: 100,
|
||||
...overrides,
|
||||
});
|
||||
|
||||
// Usage in tests:
|
||||
test('pro accounts can access analytics', async ({ page, apiRequest }) => {
|
||||
const admin = createAdminUser({ email: 'admin@company.com' });
|
||||
const account = createProAccount({ owner: admin });
|
||||
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: admin });
|
||||
await apiRequest({ method: 'POST', url: '/api/accounts', data: account });
|
||||
|
||||
await page.goto('/analytics');
|
||||
await expect(page.getByText('Advanced Analytics')).toBeVisible();
|
||||
});
|
||||
|
||||
test('free accounts cannot access analytics', async ({ page, apiRequest }) => {
|
||||
const user = createUser({ email: 'user@company.com' });
|
||||
const account = createAccount({ owner: user }); // Defaults to free plan
|
||||
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: user });
|
||||
await apiRequest({ method: 'POST', url: '/api/accounts', data: account });
|
||||
|
||||
await page.goto('/analytics');
|
||||
await expect(page.getByText('Upgrade to Pro')).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Compose specialized factories from base factories (`createAdminUser` → `createUser`)
|
||||
- Defaults cascade: `createProAccount` sets plan + features automatically
|
||||
- Still allow overrides: `createProAccount({ maxUsers: 50 })` works
|
||||
- Test intent clear: `createProAccount()` vs `createAccount({ plan: 'pro', features: [...] })`
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*atdd` (test generation), `*automate` (test expansion), `*framework` (factory setup)
|
||||
- **Related fragments**:
|
||||
- `fixture-architecture.md` - Pure functions and fixtures for factory integration
|
||||
- `network-first.md` - API-first setup patterns
|
||||
- `test-quality.md` - Parallel-safe, deterministic test design
|
||||
|
||||
## Cleanup Strategy
|
||||
|
||||
Ensure factories work with cleanup patterns:
|
||||
|
||||
```typescript
|
||||
// Assumes a project fixture file exposing the custom `apiRequest` fixture
import { test } from '../support/fixtures';

// Track created IDs for cleanup
const createdUsers: string[] = [];

test.afterEach(async ({ apiRequest }) => {
|
||||
// Clean up all users created during test
|
||||
for (const userId of createdUsers) {
|
||||
await apiRequest({ method: 'DELETE', url: `/api/users/${userId}` });
|
||||
}
|
||||
createdUsers.length = 0;
|
||||
});
|
||||
|
||||
test('user registration flow', async ({ page, apiRequest }) => {
|
||||
const user = createUser();
|
||||
createdUsers.push(user.id);
|
||||
|
||||
await apiRequest({ method: 'POST', url: '/api/users', data: user });
|
||||
// ... test logic
|
||||
});
|
||||
```
|
||||
|
||||
## Feature Flag Integration
|
||||
|
||||
When working with feature flags, layer them into factories:
|
||||
|
||||
```typescript
|
||||
export const createUserWithFlags = (
|
||||
overrides: Partial<User> = {},
|
||||
flags: Record<string, boolean> = {},
|
||||
): User & { flags: Record<string, boolean> } => ({
|
||||
...createUser(overrides),
|
||||
flags: {
|
||||
'new-dashboard': false,
|
||||
'beta-features': false,
|
||||
...flags,
|
||||
},
|
||||
});
|
||||
|
||||
// Usage:
|
||||
const user = createUserWithFlags(
|
||||
{ email: 'test@example.com' },
|
||||
{
|
||||
'new-dashboard': true,
|
||||
'beta-features': true,
|
||||
},
|
||||
);
|
||||
```
|
||||
|
||||
_Source: Murat Testing Philosophy (lines 94-120), API-first testing patterns, faker.js documentation._
|
||||
|
||||
@@ -1,9 +1,721 @@
|
||||
# Email-Based Authentication Testing
|
||||
|
||||
- Use services like Mailosaur or in-house SMTP capture; extract magic links via regex or HTML parsing helpers.
|
||||
- Preserve browser storage (local/session) when processing links—restore state before visiting the authenticated page.
|
||||
- Cache email payloads with `cypress-data-session` or equivalent so retries don’t exhaust inbox quotas.
|
||||
- Cover negative cases: expired links, reused links, and multiple requests in rapid succession.
|
||||
- Ensure the workflow logs the email ID and link for troubleshooting, but scrub PII before committing artifacts.
|
||||
## Principle
|
||||
|
||||
_Source: Email authentication blog, Murat testing toolkit._
|
||||
Email-based authentication (magic links, one-time codes, passwordless login) requires specialized testing with email capture services like Mailosaur or Ethereal. Extract magic links via HTML parsing or use built-in link extraction, preserve browser storage (local/session/cookies) when processing links, cache email payloads to avoid exhausting inbox quotas, and cover negative cases (expired links, reused links, multiple rapid requests). Log email IDs and links for troubleshooting, but scrub PII before committing artifacts.
|
||||
|
||||
## Rationale
|
||||
|
||||
Email authentication introduces unique challenges: asynchronous email delivery, quota limits (AWS Cognito: 50/day), cost per email, and complex state management (session preservation across link clicks). Without proper patterns, tests become slow (wait for email each time), expensive (quota exhaustion), and brittle (timing issues, missing state). Using email capture services + session caching + state preservation patterns makes email auth tests fast, reliable, and cost-effective.
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Magic Link Extraction with Mailosaur
|
||||
|
||||
**Context**: Passwordless login flow where user receives magic link via email, clicks it, and is authenticated.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/magic-link-auth.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Magic Link Authentication Flow
|
||||
* 1. User enters email
|
||||
* 2. Backend sends magic link
|
||||
* 3. Test retrieves email via Mailosaur
|
||||
* 4. Extract and visit magic link
|
||||
* 5. Verify user is authenticated
|
||||
*/
|
||||
|
||||
// Mailosaur configuration
|
||||
const MAILOSAUR_API_KEY = process.env.MAILOSAUR_API_KEY!;
|
||||
const MAILOSAUR_SERVER_ID = process.env.MAILOSAUR_SERVER_ID!;
|
||||
|
||||
/**
|
||||
* Extract href from HTML email body
|
||||
 * JSDOM provides DOM parsing in Node.js (DOMParser is not available natively)
|
||||
*/
|
||||
function extractMagicLink(htmlString: string): string | null {
|
||||
const { JSDOM } = require('jsdom');
|
||||
const dom = new JSDOM(htmlString);
|
||||
const link = dom.window.document.querySelector('#magic-link-button');
|
||||
return link ? (link as HTMLAnchorElement).href : null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Alternative: Use Mailosaur's built-in link extraction
|
||||
* Mailosaur automatically parses links - no regex needed!
|
||||
*/
|
||||
async function getMagicLinkFromEmail(email: string): Promise<string> {
|
||||
const MailosaurClient = require('mailosaur');
|
||||
const mailosaur = new MailosaurClient(MAILOSAUR_API_KEY);
|
||||
|
||||
// Wait for email (timeout: 30 seconds)
|
||||
const message = await mailosaur.messages.get(
|
||||
MAILOSAUR_SERVER_ID,
|
||||
{
|
||||
sentTo: email,
|
||||
},
|
||||
{
|
||||
timeout: 30000, // 30 seconds
|
||||
},
|
||||
);
|
||||
|
||||
// Mailosaur extracts links automatically - no parsing needed!
|
||||
const magicLink = message.html?.links?.[0]?.href;
|
||||
|
||||
if (!magicLink) {
|
||||
throw new Error(`Magic link not found in email to ${email}`);
|
||||
}
|
||||
|
||||
console.log(`📧 Email received. Magic link extracted: ${magicLink}`);
|
||||
return magicLink;
|
||||
}
|
||||
|
||||
test.describe('Magic Link Authentication', () => {
|
||||
test('should authenticate user via magic link', async ({ page, context }) => {
|
||||
// Arrange: Generate unique test email
|
||||
const randomId = Math.floor(Math.random() * 1000000);
|
||||
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
|
||||
|
||||
// Act: Request magic link
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
|
||||
// Assert: Success message
|
||||
await expect(page.getByTestId('check-email-message')).toBeVisible();
|
||||
await expect(page.getByTestId('check-email-message')).toContainText('Check your email');
|
||||
|
||||
// Retrieve magic link from email
|
||||
const magicLink = await getMagicLinkFromEmail(testEmail);
|
||||
|
||||
// Visit magic link
|
||||
await page.goto(magicLink);
|
||||
|
||||
// Assert: User is authenticated
|
||||
await expect(page.getByTestId('user-menu')).toBeVisible();
|
||||
await expect(page.getByTestId('user-email')).toContainText(testEmail);
|
||||
|
||||
// Verify session storage preserved
|
||||
const localStorage = await page.evaluate(() => JSON.stringify(window.localStorage));
|
||||
expect(localStorage).toContain('authToken');
|
||||
});
|
||||
|
||||
test('should handle expired magic link', async ({ page }) => {
|
||||
// Use pre-expired link (older than 15 minutes)
|
||||
const expiredLink = 'http://localhost:3000/auth/verify?token=expired-token-123';
|
||||
|
||||
await page.goto(expiredLink);
|
||||
|
||||
// Assert: Error message displayed
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText('link has expired');
|
||||
|
||||
// Assert: User NOT authenticated
|
||||
await expect(page.getByTestId('user-menu')).not.toBeVisible();
|
||||
});
|
||||
|
||||
test('should prevent reusing magic link', async ({ page }) => {
|
||||
const randomId = Math.floor(Math.random() * 1000000);
|
||||
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
|
||||
|
||||
// Request magic link
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
|
||||
const magicLink = await getMagicLinkFromEmail(testEmail);
|
||||
|
||||
// Visit link first time (success)
|
||||
await page.goto(magicLink);
|
||||
await expect(page.getByTestId('user-menu')).toBeVisible();
|
||||
|
||||
// Sign out
|
||||
await page.getByTestId('sign-out').click();
|
||||
|
||||
// Try to reuse same link (should fail)
|
||||
await page.goto(magicLink);
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText('link has already been used');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Cypress equivalent with Mailosaur plugin**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/magic-link-auth.cy.ts
|
||||
describe('Magic Link Authentication', () => {
|
||||
it('should authenticate user via magic link', () => {
|
||||
const serverId = Cypress.env('MAILOSAUR_SERVERID');
|
||||
const randomId = Cypress._.random(1e6);
|
||||
const testEmail = `user-${randomId}@${serverId}.mailosaur.net`;
|
||||
|
||||
// Request magic link
|
||||
cy.visit('/login');
|
||||
cy.get('[data-cy="email-input"]').type(testEmail);
|
||||
cy.get('[data-cy="send-magic-link"]').click();
|
||||
cy.get('[data-cy="check-email-message"]').should('be.visible');
|
||||
|
||||
// Retrieve and visit magic link
|
||||
cy.mailosaurGetMessage(serverId, { sentTo: testEmail })
|
||||
.its('html.links.0.href') // Mailosaur extracts links automatically!
|
||||
.should('exist')
|
||||
.then((magicLink) => {
|
||||
cy.log(`Magic link: ${magicLink}`);
|
||||
cy.visit(magicLink);
|
||||
});
|
||||
|
||||
// Verify authenticated
|
||||
cy.get('[data-cy="user-menu"]').should('be.visible');
|
||||
cy.get('[data-cy="user-email"]').should('contain', testEmail);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Mailosaur auto-extraction**: `html.links[0].href` or `html.codes[0].value` (code-extraction sketch after this list)
|
||||
- **Unique emails**: Random ID prevents collisions
|
||||
- **Negative testing**: Expired and reused links tested
|
||||
- **State verification**: localStorage/session checked
|
||||
- **Fast email retrieval**: 30 second timeout typical
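
The key points mention code extraction, which the examples above don't show. A minimal sketch for one-time-code emails, reusing the Mailosaur client setup from Example 1 (the `otp-input`/`verify-code` test ids are illustrative assumptions):

```typescript
// Retrieve a one-time code instead of a magic link (same Mailosaur client as above)
const message = await mailosaur.messages.get(MAILOSAUR_SERVER_ID, { sentTo: testEmail }, { timeout: 30000 });
const otpCode = message.html?.codes?.[0]?.value; // Mailosaur auto-extracts numeric codes

if (!otpCode) {
  throw new Error(`One-time code not found in email to ${testEmail}`);
}

await page.getByTestId('otp-input').fill(otpCode); // 'otp-input' is a hypothetical test id
await page.getByTestId('verify-code').click();
```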
|
||||
|
||||
---
|
||||
|
||||
### Example 2: State Preservation Pattern with cy.session / Playwright storageState
|
||||
|
||||
**Context**: Cache authenticated session to avoid requesting magic link on every test.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright/fixtures/email-auth-fixture.ts
|
||||
import { test as base, expect } from '@playwright/test';
import fs from 'node:fs';
import { getMagicLinkFromEmail } from '../support/mailosaur-helpers';
|
||||
|
||||
type EmailAuthFixture = {
|
||||
authenticatedUser: { email: string; token: string };
|
||||
};
|
||||
|
||||
export const test = base.extend<EmailAuthFixture>({
|
||||
authenticatedUser: async ({ page, context }, use) => {
|
||||
const randomId = Math.floor(Math.random() * 1000000);
|
||||
const testEmail = `user-${randomId}@${process.env.MAILOSAUR_SERVER_ID}.mailosaur.net`;
|
||||
|
||||
    // Check if we have cached auth state for this email
    const storageStatePath = `./test-results/auth-state-${testEmail}.json`;

    if (fs.existsSync(storageStatePath)) {
      // Restore cookies and localStorage from the previously saved storage state
      const saved = JSON.parse(fs.readFileSync(storageStatePath, 'utf-8'));
      await context.addCookies(saved.cookies ?? []);
      const origin = saved.origins?.[0];
      if (origin) {
        await page.addInitScript((entries: { name: string; value: string }[]) => {
          for (const { name, value } of entries) localStorage.setItem(name, value);
        }, origin.localStorage ?? []);
      }
      await page.goto('/dashboard');

      // Validate the restored session is still valid
      const isAuthenticated = await page
        .getByTestId('user-menu')
        .waitFor({ state: 'visible', timeout: 2000 })
        .then(() => true)
        .catch(() => false);

      if (isAuthenticated) {
        console.log(`✅ Reusing cached session for ${testEmail}`);
        await use({ email: testEmail, token: 'cached' });
        return;
      }
    } else {
      console.log(`📧 No cached session, requesting magic link for ${testEmail}`);
    }
|
||||
|
||||
// Request new magic link
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
|
||||
// Get magic link from email
|
||||
const magicLink = await getMagicLinkFromEmail(testEmail);
|
||||
|
||||
// Visit link and authenticate
|
||||
await page.goto(magicLink);
|
||||
await expect(page.getByTestId('user-menu')).toBeVisible();
|
||||
|
||||
// Extract auth token from localStorage
|
||||
const authToken = await page.evaluate(() => localStorage.getItem('authToken'));
|
||||
|
||||
// Save session state for reuse
|
||||
await context.storageState({ path: storageStatePath });
|
||||
|
||||
console.log(`💾 Cached session for ${testEmail}`);
|
||||
|
||||
await use({ email: testEmail, token: authToken || '' });
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
**Cypress equivalent with cy.session + data-session**:
|
||||
|
||||
```javascript
|
||||
// cypress/support/commands/email-auth.js
|
||||
import { dataSession } from 'cypress-data-session';
|
||||
|
||||
/**
|
||||
* Authenticate via magic link with session caching
|
||||
* - First run: Requests email, extracts link, authenticates
|
||||
* - Subsequent runs: Reuses cached session (no email)
|
||||
*/
|
||||
Cypress.Commands.add('authViaMagicLink', (email) => {
|
||||
return dataSession({
|
||||
name: `magic-link-${email}`,
|
||||
|
||||
// First-time setup: Request and process magic link
|
||||
setup: () => {
|
||||
cy.visit('/login');
|
||||
cy.get('[data-cy="email-input"]').type(email);
|
||||
cy.get('[data-cy="send-magic-link"]').click();
|
||||
|
||||
// Get magic link from Mailosaur
|
||||
cy.mailosaurGetMessage(Cypress.env('MAILOSAUR_SERVERID'), {
|
||||
sentTo: email,
|
||||
})
|
||||
.its('html.links.0.href')
|
||||
.should('exist')
|
||||
.then((magicLink) => {
|
||||
cy.visit(magicLink);
|
||||
});
|
||||
|
||||
// Wait for authentication
|
||||
cy.get('[data-cy="user-menu"]', { timeout: 10000 }).should('be.visible');
|
||||
|
||||
// Preserve authentication state
|
||||
return cy.getAllLocalStorage().then((storage) => {
|
||||
return { storage, email };
|
||||
});
|
||||
},
|
||||
|
||||
// Validate cached session is still valid
|
||||
validate: (cached) => {
|
||||
return cy.wrap(Boolean(cached?.storage));
|
||||
},
|
||||
|
||||
// Recreate session from cache (no email needed)
|
||||
recreate: (cached) => {
|
||||
      // Restore localStorage (assumes a project helper/plugin that re-applies the captured storage map)
      cy.setLocalStorage(cached.storage);
|
||||
cy.visit('/dashboard');
|
||||
cy.get('[data-cy="user-menu"]', { timeout: 5000 }).should('be.visible');
|
||||
},
|
||||
|
||||
shareAcrossSpecs: true, // Share session across all tests
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Usage in tests**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/dashboard.cy.ts
|
||||
describe('Dashboard', () => {
|
||||
const serverId = Cypress.env('MAILOSAUR_SERVERID');
|
||||
const testEmail = `test-user@${serverId}.mailosaur.net`;
|
||||
|
||||
beforeEach(() => {
|
||||
// First test: Requests magic link
|
||||
// Subsequent tests: Reuses cached session (no email!)
|
||||
cy.authViaMagicLink(testEmail);
|
||||
});
|
||||
|
||||
it('should display user dashboard', () => {
|
||||
cy.get('[data-cy="dashboard-content"]').should('be.visible');
|
||||
});
|
||||
|
||||
it('should show user profile', () => {
|
||||
cy.get('[data-cy="user-email"]').should('contain', testEmail);
|
||||
});
|
||||
|
||||
// Both tests share same session - only 1 email consumed!
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Session caching**: First test requests email, rest reuse session
|
||||
- **State preservation**: localStorage/cookies saved and restored
|
||||
- **Validation**: Check cached session is still valid
|
||||
- **Quota optimization**: Massive reduction in email consumption
|
||||
- **Fast tests**: Cached auth takes seconds vs. minutes
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Negative Flow Tests (Expired, Invalid, Reused Links)
|
||||
|
||||
**Context**: Comprehensive negative testing for email authentication edge cases.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/email-auth-negative.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
import { getMagicLinkFromEmail } from '../support/mailosaur-helpers';
|
||||
|
||||
const MAILOSAUR_SERVER_ID = process.env.MAILOSAUR_SERVER_ID!;
|
||||
|
||||
test.describe('Email Auth Negative Flows', () => {
|
||||
test('should reject expired magic link', async ({ page }) => {
|
||||
// Generate expired link (simulate 24 hours ago)
|
||||
const expiredToken = Buffer.from(
|
||||
JSON.stringify({
|
||||
email: 'test@example.com',
|
||||
exp: Date.now() - 24 * 60 * 60 * 1000, // 24 hours ago
|
||||
}),
|
||||
).toString('base64');
|
||||
|
||||
const expiredLink = `http://localhost:3000/auth/verify?token=${expiredToken}`;
|
||||
|
||||
// Visit expired link
|
||||
await page.goto(expiredLink);
|
||||
|
||||
// Assert: Error displayed
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText(/link.*expired|expired.*link/i);
|
||||
|
||||
// Assert: Link to request new one
|
||||
await expect(page.getByTestId('request-new-link')).toBeVisible();
|
||||
|
||||
// Assert: User NOT authenticated
|
||||
await expect(page.getByTestId('user-menu')).not.toBeVisible();
|
||||
});
|
||||
|
||||
test('should reject invalid magic link token', async ({ page }) => {
|
||||
const invalidLink = 'http://localhost:3000/auth/verify?token=invalid-garbage';
|
||||
|
||||
await page.goto(invalidLink);
|
||||
|
||||
// Assert: Error displayed
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText(/invalid.*link|link.*invalid/i);
|
||||
|
||||
// Assert: User not authenticated
|
||||
await expect(page.getByTestId('user-menu')).not.toBeVisible();
|
||||
});
|
||||
|
||||
test('should reject already-used magic link', async ({ page, context }) => {
|
||||
const randomId = Math.floor(Math.random() * 1000000);
|
||||
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
|
||||
|
||||
// Request magic link
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
|
||||
const magicLink = await getMagicLinkFromEmail(testEmail);
|
||||
|
||||
// Visit link FIRST time (success)
|
||||
await page.goto(magicLink);
|
||||
await expect(page.getByTestId('user-menu')).toBeVisible();
|
||||
|
||||
// Sign out
|
||||
await page.getByTestId('user-menu').click();
|
||||
await page.getByTestId('sign-out').click();
|
||||
await expect(page.getByTestId('user-menu')).not.toBeVisible();
|
||||
|
||||
// Try to reuse SAME link (should fail)
|
||||
await page.goto(magicLink);
|
||||
|
||||
// Assert: Link already used error
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText(/already.*used|link.*used/i);
|
||||
|
||||
// Assert: User not authenticated
|
||||
await expect(page.getByTestId('user-menu')).not.toBeVisible();
|
||||
});
|
||||
|
||||
test('should handle rapid successive link requests', async ({ page }) => {
|
||||
const randomId = Math.floor(Math.random() * 1000000);
|
||||
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
|
||||
|
||||
// Request magic link 3 times rapidly
|
||||
for (let i = 0; i < 3; i++) {
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
await expect(page.getByTestId('check-email-message')).toBeVisible();
|
||||
}
|
||||
|
||||
// Only the LATEST link should work
|
||||
const MailosaurClient = require('mailosaur');
|
||||
const mailosaur = new MailosaurClient(process.env.MAILOSAUR_API_KEY);
|
||||
|
||||
    const messages = await mailosaur.messages.search(MAILOSAUR_SERVER_ID, {
      sentTo: testEmail,
    });
|
||||
|
||||
// Should receive 3 emails
|
||||
expect(messages.items.length).toBeGreaterThanOrEqual(3);
|
||||
|
||||
    // Get the LATEST magic link (search returns summaries, so fetch the full message for its links)
    const latestMessage = await mailosaur.messages.getById(messages.items[0].id); // Most recent first
    const latestLink = latestMessage.html.links[0].href;
|
||||
|
||||
// Latest link works
|
||||
await page.goto(latestLink);
|
||||
await expect(page.getByTestId('user-menu')).toBeVisible();
|
||||
|
||||
// Older links should NOT work (if backend invalidates previous)
|
||||
await page.getByTestId('sign-out').click();
|
||||
    const olderMessage = await mailosaur.messages.getById(messages.items[1].id);
    const olderLink = olderMessage.html.links[0].href;
|
||||
|
||||
await page.goto(olderLink);
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
});
|
||||
|
||||
test('should rate-limit excessive magic link requests', async ({ page }) => {
|
||||
const randomId = Math.floor(Math.random() * 1000000);
|
||||
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
|
||||
|
||||
// Request magic link 10 times rapidly (should hit rate limit)
|
||||
for (let i = 0; i < 10; i++) {
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
|
||||
// After N requests, should show rate limit error
|
||||
const errorVisible = await page
|
||||
.getByTestId('rate-limit-error')
|
||||
.isVisible({ timeout: 1000 })
|
||||
.catch(() => false);
|
||||
|
||||
if (errorVisible) {
|
||||
console.log(`Rate limit hit after ${i + 1} requests`);
|
||||
await expect(page.getByTestId('rate-limit-error')).toContainText(/too many.*requests|rate.*limit/i);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
// If no rate limit after 10 requests, log warning
|
||||
console.warn('⚠️ No rate limit detected after 10 requests');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Expired links**: Test 24+ hour old tokens
|
||||
- **Invalid tokens**: Malformed or garbage tokens rejected
|
||||
- **Reuse prevention**: Same link can't be used twice
|
||||
- **Rapid requests**: Multiple requests handled gracefully
|
||||
- **Rate limiting**: Excessive requests blocked
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Caching Strategy with cypress-data-session / Playwright Projects
|
||||
|
||||
**Context**: Minimize email consumption by sharing authentication state across tests and specs.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```javascript
|
||||
// cypress/support/commands/register-and-sign-in.js
|
||||
import { dataSession } from 'cypress-data-session';
|
||||
|
||||
/**
|
||||
* Email Authentication Caching Strategy
|
||||
* - One email per test run (not per spec, not per test)
|
||||
* - First spec: Full registration flow (form → email → code → sign in)
|
||||
* - Subsequent specs: Only sign in (reuse user)
|
||||
* - Subsequent tests in same spec: Session already active (no sign in)
|
||||
*/
|
||||
|
||||
// Helper: Fill registration form
|
||||
function fillRegistrationForm({ fullName, userName, email, password }) {
|
||||
cy.intercept('POST', 'https://cognito-idp*').as('cognito');
|
||||
cy.contains('Register').click();
|
||||
cy.get('#reg-dialog-form').should('be.visible');
|
||||
  const [firstName, lastName = firstName] = fullName.split(' ');
  cy.get('#first-name').type(firstName, { delay: 0 });
  cy.get('#last-name').type(lastName, { delay: 0 });
|
||||
cy.get('#email').type(email, { delay: 0 });
|
||||
cy.get('#username').type(userName, { delay: 0 });
|
||||
cy.get('#password').type(password, { delay: 0 });
|
||||
cy.contains('button', 'Create an account').click();
|
||||
cy.wait('@cognito').its('response.statusCode').should('equal', 200);
|
||||
}
|
||||
|
||||
// Helper: Confirm registration with email code
|
||||
function confirmRegistration(email) {
|
||||
return cy
|
||||
.mailosaurGetMessage(Cypress.env('MAILOSAUR_SERVERID'), { sentTo: email })
|
||||
.its('html.codes.0.value') // Mailosaur auto-extracts codes!
|
||||
.then((code) => {
|
||||
cy.intercept('POST', 'https://cognito-idp*').as('cognito');
|
||||
cy.get('#verification-code').type(code, { delay: 0 });
|
||||
cy.contains('button', 'Confirm registration').click();
|
||||
cy.wait('@cognito');
|
||||
cy.contains('You are now registered!').should('be.visible');
|
||||
cy.contains('button', /ok/i).click();
|
||||
return cy.wrap(code); // Return code for reference
|
||||
});
|
||||
}
|
||||
|
||||
// Helper: Full registration (form + email)
|
||||
function register({ fullName, userName, email, password }) {
|
||||
fillRegistrationForm({ fullName, userName, email, password });
|
||||
return confirmRegistration(email);
|
||||
}
|
||||
|
||||
// Helper: Sign in
|
||||
function signIn({ userName, password }) {
|
||||
cy.intercept('POST', 'https://cognito-idp*').as('cognito');
|
||||
cy.contains('Sign in').click();
|
||||
cy.get('#sign-in-username').type(userName, { delay: 0 });
|
||||
cy.get('#sign-in-password').type(password, { delay: 0 });
|
||||
cy.contains('button', 'Sign in').click();
|
||||
cy.wait('@cognito');
|
||||
cy.contains('Sign out').should('be.visible');
|
||||
}
|
||||
|
||||
/**
|
||||
* Register and sign in with email caching
|
||||
* ONE EMAIL PER MACHINE (cypress run or cypress open)
|
||||
*/
|
||||
Cypress.Commands.add('registerAndSignIn', ({ fullName, userName, email, password }) => {
|
||||
return dataSession({
|
||||
name: email, // Unique session per email
|
||||
|
||||
// First time: Full registration (form → email → code)
|
||||
init: () => register({ fullName, userName, email, password }),
|
||||
|
||||
// Subsequent specs: Just check email exists (code already used)
|
||||
setup: () => confirmRegistration(email),
|
||||
|
||||
// Always runs after init/setup: Sign in
|
||||
recreate: () => signIn({ userName, password }),
|
||||
|
||||
// Share across ALL specs (one email for entire test run)
|
||||
shareAcrossSpecs: true,
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Usage across multiple specs**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/place-order.cy.ts
|
||||
describe('Place Order', () => {
|
||||
beforeEach(() => {
|
||||
cy.visit('/');
|
||||
cy.registerAndSignIn({
|
||||
fullName: Cypress.env('fullName'), // From cypress.config
|
||||
userName: Cypress.env('userName'),
|
||||
email: Cypress.env('email'), // SAME email across all specs
|
||||
password: Cypress.env('password'),
|
||||
});
|
||||
});
|
||||
|
||||
it('should place order', () => {
|
||||
/* ... */
|
||||
});
|
||||
it('should view order history', () => {
|
||||
/* ... */
|
||||
});
|
||||
});
|
||||
|
||||
// cypress/e2e/profile.cy.ts
|
||||
describe('User Profile', () => {
|
||||
beforeEach(() => {
|
||||
cy.visit('/');
|
||||
cy.registerAndSignIn({
|
||||
fullName: Cypress.env('fullName'),
|
||||
userName: Cypress.env('userName'),
|
||||
email: Cypress.env('email'), // SAME email - no new email sent!
|
||||
password: Cypress.env('password'),
|
||||
});
|
||||
});
|
||||
|
||||
it('should update profile', () => {
|
||||
/* ... */
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Playwright equivalent with storageState**:
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts
|
||||
import { defineConfig } from '@playwright/test';
|
||||
|
||||
export default defineConfig({
|
||||
projects: [
|
||||
{
|
||||
name: 'setup',
|
||||
testMatch: /global-setup\.ts/,
|
||||
},
|
||||
{
|
||||
name: 'authenticated',
|
||||
testMatch: /.*\.spec\.ts/,
|
||||
dependencies: ['setup'],
|
||||
use: {
|
||||
storageState: '.auth/user-session.json', // Reuse auth state
|
||||
},
|
||||
},
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// tests/global-setup.ts (runs once)
|
||||
import { test as setup, expect } from '@playwright/test';
|
||||
import { getMagicLinkFromEmail } from './support/mailosaur-helpers';
|
||||
|
||||
const authFile = '.auth/user-session.json';
|
||||
|
||||
setup('authenticate via magic link', async ({ page }) => {
|
||||
const testEmail = process.env.TEST_USER_EMAIL!;
|
||||
|
||||
// Request magic link
|
||||
await page.goto('/login');
|
||||
await page.getByTestId('email-input').fill(testEmail);
|
||||
await page.getByTestId('send-magic-link').click();
|
||||
|
||||
// Get and visit magic link
|
||||
const magicLink = await getMagicLinkFromEmail(testEmail);
|
||||
await page.goto(magicLink);
|
||||
|
||||
// Verify authenticated
|
||||
await expect(page.getByTestId('user-menu')).toBeVisible();
|
||||
|
||||
// Save authenticated state (ONE TIME for all tests)
|
||||
await page.context().storageState({ path: authFile });
|
||||
|
||||
console.log('✅ Authentication state saved to', authFile);
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **One email per run**: Global setup authenticates once
|
||||
- **State reuse**: All tests use cached storageState
|
||||
- **cypress-data-session**: Intelligently manages cache lifecycle
|
||||
- **shareAcrossSpecs**: Session shared across all spec files
|
||||
- **Massive savings**: 500 tests = 1 email (not 500!)
|
||||
|
||||
---
|
||||
|
||||
## Email Authentication Testing Checklist
|
||||
|
||||
Before implementing email auth tests, verify:
|
||||
|
||||
- [ ] **Email service**: Mailosaur/Ethereal/MailHog configured with API keys
|
||||
- [ ] **Link extraction**: Use built-in parsing (html.links[0].href) over regex
|
||||
- [ ] **State preservation**: localStorage/session/cookies saved and restored
|
||||
- [ ] **Session caching**: cypress-data-session or storageState prevents redundant emails
|
||||
- [ ] **Negative flows**: Expired, invalid, reused, rapid requests tested
|
||||
- [ ] **Quota awareness**: One email per run (not per test)
|
||||
- [ ] **PII scrubbing**: Email IDs logged for debug, but scrubbed from artifacts (see the helper sketch after this checklist)
|
||||
- [ ] **Timeout handling**: 30 second email retrieval timeout configured
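
For the PII-scrubbing item, a minimal sketch of a log-scrubbing helper (function name and regex are illustrative, not part of the toolkit):

```typescript
// Hypothetical helper: mask mailbox addresses before logs/artifacts are persisted
export function scrubEmails(text: string): string {
  // Keep the Mailosaur server id visible for debugging, hide the mailbox part
  return text.replace(/[\w.+-]+@([\w-]+\.mailosaur\.net)/gi, '<scrubbed>@$1');
}

// Usage: console.log(scrubEmails(`Magic link sent to ${testEmail}`));
```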
|
||||
|
||||
## Integration Points
|
||||
|
||||
- Used in workflows: `*framework` (email auth setup), `*automate` (email auth test generation)
|
||||
- Related fragments: `fixture-architecture.md`, `test-quality.md`
|
||||
- Email services: Mailosaur (recommended), Ethereal (free), MailHog (self-hosted)
|
||||
- Plugins: cypress-mailosaur, cypress-data-session
|
||||
|
||||
_Source: Email authentication blog, Murat testing toolkit, Mailosaur documentation_
|
||||
|
||||
@@ -1,9 +1,725 @@
|
||||
# Error Handling and Resilience Checks
|
||||
|
||||
- Treat expected failures explicitly: intercept network errors and assert UI fallbacks (`error-message` visible, retries triggered).
|
||||
- In Cypress, use scoped `Cypress.on('uncaught:exception')` to ignore known errors; rethrow anything else so regressions fail.
|
||||
- In Playwright, hook `page.on('pageerror')` and only swallow the specific, documented error messages.
|
||||
- Test retry/backoff logic by forcing sequential failures (e.g., 500, timeout, success) and asserting telemetry gets recorded.
|
||||
- Log captured errors with context (request payload, user/session) but redact secrets to keep artifacts safe for sharing.
|
||||
## Principle
|
||||
|
||||
_Source: Murat error-handling patterns, Pact resilience guidance._
|
||||
Treat expected failures explicitly: intercept network errors, assert UI fallbacks (error messages visible, retries triggered), and use scoped exception handling to ignore known errors while catching regressions. Test retry/backoff logic by forcing sequential failures (500 → timeout → success) and validate telemetry logging. Log captured errors with context (request payload, user/session) but redact secrets to keep artifacts safe for sharing.
|
||||
|
||||
## Rationale
|
||||
|
||||
Tests fail for two reasons: genuine bugs or poor error handling in the test itself. Without explicit error handling patterns, tests become noisy (uncaught exceptions cause false failures) or silent (swallowing all errors hides real bugs). Scoped exception handling (Cypress.on('uncaught:exception'), page.on('pageerror')) allows tests to ignore documented, expected errors while surfacing unexpected ones. Resilience testing (retry logic, graceful degradation) ensures applications handle failures gracefully in production.
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Scoped Exception Handling (Expected Errors Only)
|
||||
|
||||
**Context**: Handle known errors (`NetworkError` failures, expected 500s) without masking unexpected bugs.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/error-handling.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Scoped Error Handling Pattern
|
||||
* - Only ignore specific, documented errors
|
||||
* - Rethrow everything else to catch regressions
|
||||
* - Validate error UI and user experience
|
||||
*/
|
||||
|
||||
test.describe('API Error Handling', () => {
|
||||
test('should display error message when API returns 500', async ({ page }) => {
|
||||
// Scope error handling to THIS test only
|
||||
const consoleErrors: string[] = [];
|
||||
page.on('pageerror', (error) => {
|
||||
// Only swallow documented NetworkError
|
||||
if (error.message.includes('NetworkError: Failed to fetch')) {
|
||||
consoleErrors.push(error.message);
|
||||
return; // Swallow this specific error
|
||||
}
|
||||
// Rethrow all other errors (catch regressions!)
|
||||
throw error;
|
||||
});
|
||||
|
||||
// Arrange: Mock 500 error response
|
||||
await page.route('**/api/users', (route) =>
|
||||
route.fulfill({
|
||||
status: 500,
|
||||
contentType: 'application/json',
|
||||
body: JSON.stringify({
|
||||
error: 'Internal server error',
|
||||
code: 'INTERNAL_ERROR',
|
||||
}),
|
||||
}),
|
||||
);
|
||||
|
||||
// Act: Navigate to page that fetches users
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Assert: Error UI displayed
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText(/error.*loading|failed.*load/i);
|
||||
|
||||
// Assert: Retry button visible
|
||||
await expect(page.getByTestId('retry-button')).toBeVisible();
|
||||
|
||||
// Assert: NetworkError was thrown and caught
|
||||
expect(consoleErrors).toContainEqual(expect.stringContaining('NetworkError'));
|
||||
});
|
||||
|
||||
test('should NOT swallow unexpected errors', async ({ page }) => {
|
||||
let unexpectedError: Error | null = null;
|
||||
|
||||
page.on('pageerror', (error) => {
|
||||
// Capture but don't swallow - test should fail
|
||||
unexpectedError = error;
|
||||
throw error;
|
||||
});
|
||||
|
||||
// Arrange: App has JavaScript error (bug)
|
||||
await page.addInitScript(() => {
|
||||
// Simulate bug in app code
|
||||
(window as any).buggyFunction = () => {
|
||||
throw new Error('UNEXPECTED BUG: undefined is not a function');
|
||||
};
|
||||
});
|
||||
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Trigger buggy function
|
||||
await page.evaluate(() => (window as any).buggyFunction());
|
||||
|
||||
// Assert: Test fails because unexpected error was NOT swallowed
|
||||
expect(unexpectedError).not.toBeNull();
|
||||
expect(unexpectedError?.message).toContain('UNEXPECTED BUG');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Cypress equivalent**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/error-handling.cy.ts
|
||||
describe('API Error Handling', () => {
|
||||
it('should display error message when API returns 500', () => {
|
||||
// Scoped to this test only
|
||||
cy.on('uncaught:exception', (err) => {
|
||||
// Only swallow documented NetworkError
|
||||
if (err.message.includes('NetworkError')) {
|
||||
return false; // Prevent test failure
|
||||
}
|
||||
// All other errors fail the test
|
||||
return true;
|
||||
});
|
||||
|
||||
// Arrange: Mock 500 error
|
||||
cy.intercept('GET', '**/api/users', {
|
||||
statusCode: 500,
|
||||
body: {
|
||||
error: 'Internal server error',
|
||||
code: 'INTERNAL_ERROR',
|
||||
},
|
||||
}).as('getUsers');
|
||||
|
||||
// Act
|
||||
cy.visit('/dashboard');
|
||||
cy.wait('@getUsers');
|
||||
|
||||
// Assert: Error UI
|
||||
cy.get('[data-cy="error-message"]').should('be.visible');
|
||||
cy.get('[data-cy="error-message"]').should('contain', 'error loading');
|
||||
cy.get('[data-cy="retry-button"]').should('be.visible');
|
||||
});
|
||||
|
||||
it('should NOT swallow unexpected errors', () => {
|
||||
// No exception handler - test should fail on unexpected errors
|
||||
|
||||
cy.visit('/dashboard');
|
||||
|
||||
// Trigger unexpected error
|
||||
cy.window().then((win) => {
|
||||
// This should fail the test
|
||||
win.eval('throw new Error("UNEXPECTED BUG")');
|
||||
});
|
||||
|
||||
// Test fails (as expected) - validates error detection works
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Scoped handling**: page.on() / cy.on() scoped to specific tests
|
||||
- **Explicit allow-list**: Only ignore documented errors (reusable helper sketched after this list)
|
||||
- **Rethrow unexpected**: Catch regressions by failing on unknown errors
|
||||
- **Error UI validation**: Assert user sees error message
|
||||
- **Logging**: Capture errors for debugging, don't swallow silently
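
When several specs share the same allow-list, the scoped handler can be factored into a small helper. A hedged sketch, assuming a project-local support file (name and location are not from the source):

```typescript
// tests/support/allow-page-errors.ts (hypothetical location)
import type { Page } from '@playwright/test';

/**
 * Register a per-test allow-list of expected page errors.
 * Matching errors are collected into `sink`; anything else is rethrown so regressions still fail.
 */
export function allowPageErrors(page: Page, allowed: RegExp[], sink: string[] = []): string[] {
  page.on('pageerror', (error) => {
    if (allowed.some((pattern) => pattern.test(error.message))) {
      sink.push(error.message);
      return;
    }
    throw error;
  });
  return sink;
}

// Usage inside a test:
// const seen = allowPageErrors(page, [/NetworkError: Failed to fetch/]);
// ...
// expect(seen).toContainEqual(expect.stringContaining('NetworkError'));
```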
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Retry Validation Pattern (Network Resilience)
|
||||
|
||||
**Context**: Test that retry/backoff logic works correctly for transient failures.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/retry-resilience.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Retry Validation Pattern
|
||||
* - Force sequential failures (500 → 500 → 200)
|
||||
* - Validate retry attempts and backoff timing
|
||||
* - Assert telemetry captures retry events
|
||||
*/
|
||||
|
||||
test.describe('Network Retry Logic', () => {
|
||||
test('should retry on 500 error and succeed', async ({ page }) => {
|
||||
let attemptCount = 0;
|
||||
const attemptTimestamps: number[] = [];
|
||||
|
||||
// Mock API: Fail twice, succeed on third attempt
|
||||
await page.route('**/api/products', (route) => {
|
||||
attemptCount++;
|
||||
attemptTimestamps.push(Date.now());
|
||||
|
||||
if (attemptCount <= 2) {
|
||||
// First 2 attempts: 500 error
|
||||
route.fulfill({
|
||||
status: 500,
|
||||
body: JSON.stringify({ error: 'Server error' }),
|
||||
});
|
||||
} else {
|
||||
// 3rd attempt: Success
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
contentType: 'application/json',
|
||||
body: JSON.stringify({ products: [{ id: 1, name: 'Product 1' }] }),
|
||||
});
|
||||
}
|
||||
});
|
||||
|
||||
// Act: Navigate (should retry automatically)
|
||||
await page.goto('/products');
|
||||
|
||||
// Assert: Data eventually loads after retries
|
||||
await expect(page.getByTestId('product-list')).toBeVisible();
|
||||
await expect(page.getByTestId('product-item')).toHaveCount(1);
|
||||
|
||||
// Assert: Exactly 3 attempts made
|
||||
expect(attemptCount).toBe(3);
|
||||
|
||||
// Assert: Exponential backoff timing (1s → 2s between attempts)
|
||||
if (attemptTimestamps.length === 3) {
|
||||
const delay1 = attemptTimestamps[1] - attemptTimestamps[0];
|
||||
const delay2 = attemptTimestamps[2] - attemptTimestamps[1];
|
||||
|
||||
expect(delay1).toBeGreaterThanOrEqual(900); // ~1 second
|
||||
expect(delay1).toBeLessThan(1200);
|
||||
expect(delay2).toBeGreaterThanOrEqual(1900); // ~2 seconds
|
||||
expect(delay2).toBeLessThan(2200);
|
||||
}
|
||||
|
||||
// Assert: Telemetry logged retry events
|
||||
const telemetryEvents = await page.evaluate(() => (window as any).__TELEMETRY_EVENTS__ || []);
|
||||
expect(telemetryEvents).toContainEqual(
|
||||
expect.objectContaining({
|
||||
event: 'api_retry',
|
||||
attempt: 1,
|
||||
endpoint: '/api/products',
|
||||
}),
|
||||
);
|
||||
expect(telemetryEvents).toContainEqual(
|
||||
expect.objectContaining({
|
||||
event: 'api_retry',
|
||||
attempt: 2,
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
test('should give up after max retries and show error', async ({ page }) => {
|
||||
let attemptCount = 0;
|
||||
|
||||
// Mock API: Always fail (test retry limit)
|
||||
await page.route('**/api/products', (route) => {
|
||||
attemptCount++;
|
||||
route.fulfill({
|
||||
status: 500,
|
||||
body: JSON.stringify({ error: 'Persistent server error' }),
|
||||
});
|
||||
});
|
||||
|
||||
// Act
|
||||
await page.goto('/products');
|
||||
|
||||
// Assert: Max retries reached (3 attempts typical)
|
||||
expect(attemptCount).toBe(3);
|
||||
|
||||
// Assert: Error UI displayed after exhausting retries
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
await expect(page.getByTestId('error-message')).toContainText(/unable.*load|failed.*after.*retries/i);
|
||||
|
||||
// Assert: Data not displayed
|
||||
await expect(page.getByTestId('product-list')).not.toBeVisible();
|
||||
});
|
||||
|
||||
test('should NOT retry on 404 (non-retryable error)', async ({ page }) => {
|
||||
let attemptCount = 0;
|
||||
|
||||
// Mock API: 404 error (should NOT retry)
|
||||
await page.route('**/api/products/999', (route) => {
|
||||
attemptCount++;
|
||||
route.fulfill({
|
||||
status: 404,
|
||||
body: JSON.stringify({ error: 'Product not found' }),
|
||||
});
|
||||
});
|
||||
|
||||
await page.goto('/products/999');
|
||||
|
||||
// Assert: Only 1 attempt (no retries on 404)
|
||||
expect(attemptCount).toBe(1);
|
||||
|
||||
// Assert: 404 error displayed immediately
|
||||
await expect(page.getByTestId('not-found-message')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Cypress with retry interception**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/retry-resilience.cy.ts
|
||||
describe('Network Retry Logic', () => {
|
||||
it('should retry on 500 and succeed on 3rd attempt', () => {
|
||||
let attemptCount = 0;
|
||||
|
||||
cy.intercept('GET', '**/api/products', (req) => {
|
||||
attemptCount++;
|
||||
|
||||
if (attemptCount <= 2) {
|
||||
req.reply({ statusCode: 500, body: { error: 'Server error' } });
|
||||
} else {
|
||||
req.reply({ statusCode: 200, body: { products: [{ id: 1, name: 'Product 1' }] } });
|
||||
}
|
||||
}).as('getProducts');
|
||||
|
||||
cy.visit('/products');
|
||||
|
||||
    // Each cy.wait yields one intercepted request: two failures, then the successful attempt
    cy.wait('@getProducts').its('response.statusCode').should('eq', 500);
    cy.wait('@getProducts').its('response.statusCode').should('eq', 500);
    cy.wait('@getProducts').its('response.statusCode').should('eq', 200);
|
||||
|
||||
// Assert: Data loaded
|
||||
cy.get('[data-cy="product-list"]').should('be.visible');
|
||||
cy.get('[data-cy="product-item"]').should('have.length', 1);
|
||||
|
||||
    // Validate retry count (read the counter lazily, after the queued commands have run)
    cy.then(() => expect(attemptCount).to.eq(3));
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Sequential failures**: Test retry logic with 500 → 500 → 200
|
||||
- **Backoff timing**: Validate exponential backoff delays (client-side wrapper sketched after this list)
|
||||
- **Retry limits**: Max attempts enforced (typically 3)
|
||||
- **Non-retryable errors**: 404s don't trigger retries
|
||||
- **Telemetry**: Log retry attempts for monitoring
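
These tests assume the application retries transient failures with exponential backoff. A minimal sketch of the kind of client-side wrapper being exercised (function name, delays, and the telemetry hook are assumptions, not the app's actual implementation):

```typescript
// Hypothetical client-side fetch wrapper with exponential backoff
async function fetchWithRetry(url: string, init?: RequestInit, maxAttempts = 3): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetch(url, init);
      // Retry only on 5xx; 4xx (e.g. 404) is returned immediately as non-retryable
      if (response.status < 500) return response;
      lastError = new Error(`Server error ${response.status}`);
    } catch (error) {
      lastError = error; // network failure / timeout
    }
    if (attempt < maxAttempts) {
      // Assumed telemetry hook that the tests read back via window.__TELEMETRY_EVENTS__
      (window as any).__TELEMETRY_EVENTS__?.push({ event: 'api_retry', attempt, endpoint: url });
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** (attempt - 1))); // 1s, 2s, ...
    }
  }
  throw lastError;
}
```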
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Telemetry Logging with Context (Sentry Integration)
|
||||
|
||||
**Context**: Capture errors with full context for production debugging without exposing secrets.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/telemetry-logging.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Telemetry Logging Pattern
|
||||
* - Log errors with request context
|
||||
* - Redact sensitive data (tokens, passwords, PII)
|
||||
* - Integrate with monitoring (Sentry, Datadog)
|
||||
* - Validate error logging without exposing secrets
|
||||
*/
|
||||
|
||||
type ErrorLog = {
|
||||
level: 'error' | 'warn' | 'info';
|
||||
message: string;
|
||||
context?: {
|
||||
endpoint?: string;
|
||||
method?: string;
|
||||
statusCode?: number;
|
||||
userId?: string;
|
||||
sessionId?: string;
|
||||
};
|
||||
timestamp: string;
|
||||
};
|
||||
|
||||
test.describe('Error Telemetry', () => {
|
||||
test('should log API errors with context', async ({ page }) => {
|
||||
const errorLogs: ErrorLog[] = [];
|
||||
|
||||
// Capture console errors
|
||||
page.on('console', (msg) => {
|
||||
if (msg.type() === 'error') {
|
||||
try {
|
||||
const log = JSON.parse(msg.text());
|
||||
errorLogs.push(log);
|
||||
} catch {
|
||||
// Not a structured log, ignore
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
// Mock failing API
|
||||
await page.route('**/api/orders', (route) =>
|
||||
route.fulfill({
|
||||
status: 500,
|
||||
body: JSON.stringify({ error: 'Payment processor unavailable' }),
|
||||
}),
|
||||
);
|
||||
|
||||
// Act: Trigger error
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('place-order').click();
|
||||
|
||||
// Wait for error UI
|
||||
await expect(page.getByTestId('error-message')).toBeVisible();
|
||||
|
||||
// Assert: Error logged with context
|
||||
expect(errorLogs).toContainEqual(
|
||||
expect.objectContaining({
|
||||
level: 'error',
|
||||
message: expect.stringContaining('API request failed'),
|
||||
context: expect.objectContaining({
|
||||
endpoint: '/api/orders',
|
||||
method: 'POST',
|
||||
statusCode: 500,
|
||||
userId: expect.any(String),
|
||||
}),
|
||||
}),
|
||||
);
|
||||
|
||||
// Assert: Sensitive data NOT logged
|
||||
const logString = JSON.stringify(errorLogs);
|
||||
expect(logString).not.toContain('password');
|
||||
expect(logString).not.toContain('token');
|
||||
expect(logString).not.toContain('creditCard');
|
||||
});
|
||||
|
||||
test('should send errors to Sentry with breadcrumbs', async ({ page }) => {
|
||||
const sentryEvents: any[] = [];
|
||||
|
||||
// Mock Sentry SDK
|
||||
await page.addInitScript(() => {
|
||||
(window as any).Sentry = {
|
||||
captureException: (error: Error, context?: any) => {
|
||||
(window as any).__SENTRY_EVENTS__ = (window as any).__SENTRY_EVENTS__ || [];
|
||||
(window as any).__SENTRY_EVENTS__.push({
|
||||
error: error.message,
|
||||
context,
|
||||
timestamp: Date.now(),
|
||||
});
|
||||
},
|
||||
addBreadcrumb: (breadcrumb: any) => {
|
||||
(window as any).__SENTRY_BREADCRUMBS__ = (window as any).__SENTRY_BREADCRUMBS__ || [];
|
||||
(window as any).__SENTRY_BREADCRUMBS__.push(breadcrumb);
|
||||
},
|
||||
};
|
||||
});
|
||||
|
||||
// Mock failing API
|
||||
await page.route('**/api/users', (route) => route.fulfill({ status: 403, body: { error: 'Forbidden' } }));
|
||||
|
||||
// Act
|
||||
await page.goto('/users');
|
||||
|
||||
// Assert: Sentry captured error
|
||||
const events = await page.evaluate(() => (window as any).__SENTRY_EVENTS__);
|
||||
expect(events).toHaveLength(1);
|
||||
expect(events[0]).toMatchObject({
|
||||
error: expect.stringContaining('403'),
|
||||
context: expect.objectContaining({
|
||||
endpoint: '/api/users',
|
||||
statusCode: 403,
|
||||
}),
|
||||
});
|
||||
|
||||
// Assert: Breadcrumbs include user actions
|
||||
const breadcrumbs = await page.evaluate(() => (window as any).__SENTRY_BREADCRUMBS__);
|
||||
expect(breadcrumbs).toContainEqual(
|
||||
expect.objectContaining({
|
||||
category: 'navigation',
|
||||
message: '/users',
|
||||
}),
|
||||
);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Cypress with Sentry**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/telemetry-logging.cy.ts
|
||||
describe('Error Telemetry', () => {
|
||||
it('should log API errors with redacted sensitive data', () => {
|
||||
const errorLogs = [];
|
||||
|
||||
// Capture console errors
|
||||
cy.on('window:before:load', (win) => {
|
||||
cy.stub(win.console, 'error').callsFake((msg) => {
|
||||
errorLogs.push(msg);
|
||||
});
|
||||
});
|
||||
|
||||
// Mock failing API
|
||||
cy.intercept('POST', '**/api/orders', {
|
||||
statusCode: 500,
|
||||
body: { error: 'Payment failed' },
|
||||
});
|
||||
|
||||
// Act
|
||||
cy.visit('/checkout');
|
||||
cy.get('[data-cy="place-order"]').click();
|
||||
|
||||
// Assert: Error logged
|
||||
cy.wrap(errorLogs).should('have.length.greaterThan', 0);
|
||||
|
||||
// Assert: Context included
|
||||
    cy.wrap(errorLogs).its('0').should('include', '/api/orders'); // .its('0') retries until the first log exists
|
||||
|
||||
// Assert: Secrets redacted
|
||||
    cy.then(() => {
      const serialized = JSON.stringify(errorLogs); // stringify lazily, after logs have been captured
      expect(serialized).to.not.contain('password');
      expect(serialized).to.not.contain('creditCard');
    });
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Error logger utility with redaction**:
|
||||
|
||||
```typescript
|
||||
// src/utils/error-logger.ts
|
||||
type ErrorContext = {
|
||||
endpoint?: string;
|
||||
method?: string;
|
||||
statusCode?: number;
|
||||
userId?: string;
|
||||
sessionId?: string;
|
||||
requestPayload?: any;
|
||||
};
|
||||
|
||||
const SENSITIVE_KEYS = ['password', 'token', 'creditCard', 'ssn', 'apiKey'];
|
||||
|
||||
/**
|
||||
* Redact sensitive data from objects
|
||||
*/
|
||||
function redactSensitiveData(obj: any): any {
|
||||
if (typeof obj !== 'object' || obj === null) return obj;
|
||||
|
||||
const redacted = { ...obj };
|
||||
|
||||
for (const key of Object.keys(redacted)) {
|
||||
if (SENSITIVE_KEYS.some((sensitive) => key.toLowerCase().includes(sensitive))) {
|
||||
redacted[key] = '[REDACTED]';
|
||||
} else if (typeof redacted[key] === 'object') {
|
||||
redacted[key] = redactSensitiveData(redacted[key]);
|
||||
}
|
||||
}
|
||||
|
||||
return redacted;
|
||||
}
|
||||
|
||||
/**
|
||||
* Log error with context (Sentry integration)
|
||||
*/
|
||||
export function logError(error: Error, context?: ErrorContext) {
|
||||
const safeContext = context ? redactSensitiveData(context) : {};
|
||||
|
||||
const errorLog = {
|
||||
level: 'error' as const,
|
||||
message: error.message,
|
||||
stack: error.stack,
|
||||
context: safeContext,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
// Console (development)
|
||||
console.error(JSON.stringify(errorLog));
|
||||
|
||||
// Sentry (production)
|
||||
if (typeof window !== 'undefined' && (window as any).Sentry) {
|
||||
(window as any).Sentry.captureException(error, {
|
||||
contexts: { custom: safeContext },
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Context-rich logging**: Endpoint, method, status, user ID
|
||||
- **Secret redaction**: Passwords, tokens, PII removed before logging
|
||||
- **Sentry integration**: Production monitoring with breadcrumbs
|
||||
- **Structured logs**: JSON format for easy parsing
|
||||
- **Test validation**: Assert logs contain context but not secrets
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Graceful Degradation Tests (Fallback Behavior)
|
||||
|
||||
**Context**: Validate application continues functioning when services are unavailable.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/graceful-degradation.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Graceful Degradation Pattern
|
||||
* - Simulate service unavailability
|
||||
* - Validate fallback behavior
|
||||
* - Ensure user experience degrades gracefully
|
||||
* - Verify telemetry captures degradation events
|
||||
*/
|
||||
|
||||
test.describe('Service Unavailability', () => {
|
||||
test('should display cached data when API is down', async ({ page }) => {
|
||||
// Arrange: Seed localStorage with cached data
|
||||
await page.addInitScript(() => {
|
||||
localStorage.setItem(
|
||||
'products_cache',
|
||||
JSON.stringify({
|
||||
data: [
|
||||
{ id: 1, name: 'Cached Product 1' },
|
||||
{ id: 2, name: 'Cached Product 2' },
|
||||
],
|
||||
timestamp: Date.now(),
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
// Mock API unavailable
|
||||
await page.route(
|
||||
'**/api/products',
|
||||
(route) => route.abort('connectionrefused'), // Simulate server down
|
||||
);
|
||||
|
||||
// Act
|
||||
await page.goto('/products');
|
||||
|
||||
// Assert: Cached data displayed
|
||||
await expect(page.getByTestId('product-list')).toBeVisible();
|
||||
await expect(page.getByText('Cached Product 1')).toBeVisible();
|
||||
|
||||
// Assert: Stale data warning shown
|
||||
await expect(page.getByTestId('cache-warning')).toBeVisible();
|
||||
await expect(page.getByTestId('cache-warning')).toContainText(/showing.*cached|offline.*mode/i);
|
||||
|
||||
// Assert: Retry button available
|
||||
await expect(page.getByTestId('refresh-button')).toBeVisible();
|
||||
});
|
||||
|
||||
test('should show fallback UI when analytics service fails', async ({ page }) => {
|
||||
// Mock analytics service down (non-critical)
|
||||
await page.route('**/analytics/track', (route) => route.fulfill({ status: 503, body: 'Service unavailable' }));
|
||||
|
||||
// Act: Navigate normally
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Assert: Page loads successfully (analytics failure doesn't block)
|
||||
await expect(page.getByTestId('dashboard-content')).toBeVisible();
|
||||
|
||||
// Assert: Analytics error logged but not shown to user
|
||||
    const consoleErrors: string[] = [];
|
||||
page.on('console', (msg) => {
|
||||
if (msg.type() === 'error') consoleErrors.push(msg.text());
|
||||
});
|
||||
|
||||
// Trigger analytics event
|
||||
await page.getByTestId('track-action-button').click();
|
||||
|
||||
    // Analytics error logged (poll, because console events arrive asynchronously)
    await expect.poll(() => consoleErrors.some((msg) => msg.includes('Analytics service unavailable'))).toBe(true);
|
||||
|
||||
// But user doesn't see error
|
||||
await expect(page.getByTestId('error-message')).not.toBeVisible();
|
||||
});
|
||||
|
||||
test('should fallback to local validation when API is slow', async ({ page }) => {
|
||||
// Mock slow API (> 5 seconds)
|
||||
await page.route('**/api/validate-email', async (route) => {
|
||||
await new Promise((resolve) => setTimeout(resolve, 6000)); // 6 second delay
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
body: JSON.stringify({ valid: true }),
|
||||
});
|
||||
});
|
||||
|
||||
// Act: Fill form
|
||||
await page.goto('/signup');
|
||||
await page.getByTestId('email-input').fill('test@example.com');
|
||||
await page.getByTestId('email-input').blur();
|
||||
|
||||
// Assert: Client-side validation triggers immediately (doesn't wait for API)
|
||||
await expect(page.getByTestId('email-valid-icon')).toBeVisible({ timeout: 1000 });
|
||||
|
||||
// Assert: Eventually API validates too (but doesn't block UX)
|
||||
await expect(page.getByTestId('email-validated-badge')).toBeVisible({ timeout: 7000 });
|
||||
});
|
||||
|
||||
test('should maintain functionality with third-party script failure', async ({ page }) => {
|
||||
// Block third-party scripts (Google Analytics, Intercom, etc.)
|
||||
await page.route('**/*.google-analytics.com/**', (route) => route.abort());
|
||||
await page.route('**/*.intercom.io/**', (route) => route.abort());
|
||||
|
||||
// Act
|
||||
await page.goto('/');
|
||||
|
||||
// Assert: App works without third-party scripts
|
||||
await expect(page.getByTestId('main-content')).toBeVisible();
|
||||
await expect(page.getByTestId('nav-menu')).toBeVisible();
|
||||
|
||||
// Assert: Core functionality intact
|
||||
await page.getByTestId('nav-products').click();
|
||||
await expect(page).toHaveURL(/.*\/products/);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Cached fallbacks**: Display stale data when API unavailable (app-side loader sketched after this list)
|
||||
- **Non-critical degradation**: Analytics failures don't block app
|
||||
- **Client-side fallbacks**: Local validation when API slow
|
||||
- **Third-party resilience**: App works without external scripts
|
||||
- **User transparency**: Stale data warnings displayed
|
||||
|
||||
---
|
||||
|
||||
## Error Handling Testing Checklist
|
||||
|
||||
Before shipping error handling code, verify:
|
||||
|
||||
- [ ] **Scoped exception handling**: Only ignore documented errors (NetworkError, specific codes)
|
||||
- [ ] **Rethrow unexpected**: Unknown errors fail tests (catch regressions)
|
||||
- [ ] **Error UI tested**: User sees error messages for all error states
|
||||
- [ ] **Retry logic validated**: Sequential failures test backoff and max attempts
|
||||
- [ ] **Telemetry verified**: Errors logged with context (endpoint, status, user)
|
||||
- [ ] **Secret redaction**: Logs don't contain passwords, tokens, PII
|
||||
- [ ] **Graceful degradation**: Critical services down, app shows fallback UI
|
||||
- [ ] **Non-critical failures**: Analytics/tracking failures don't block app
|
||||
|
||||
## Integration Points

- Used in workflows: `*automate` (error handling test generation), `*test-review` (error pattern detection)
- Related fragments: `network-first.md`, `test-quality.md`, `contract-testing.md`
- Monitoring tools: Sentry, Datadog, LogRocket

_Source: Murat error-handling patterns, Pact resilience guidance, SEON production error handling_

@@ -1,9 +1,750 @@
# Feature Flag Governance

- Centralize flag definitions in a frozen enum; expose helpers to set, clear, and target specific audiences.
- Test both enabled and disabled states in CI; clean up targeting after each spec to keep shared environments stable.
- For LaunchDarkly-style systems, script API helpers to seed variations instead of mutating via UI.
- Maintain a checklist for new flags: default state, owners, expiry date, telemetry, rollback plan.
- Document flag dependencies in story/PR templates so QA and release reviews know which toggles must flip before launch.

_Source: LaunchDarkly strategy blog, Murat test architecture notes._

## Principle

Feature flags enable controlled rollouts and A/B testing, but require disciplined testing governance. Centralize flag definitions in a frozen enum, test both enabled and disabled states, clean up targeting after each spec, and maintain a comprehensive flag lifecycle checklist. For LaunchDarkly-style systems, script API helpers to seed variations programmatically rather than manual UI mutations.

## Rationale

Poorly managed feature flags become technical debt: untested variations ship broken code, forgotten flags clutter the codebase, and shared environments become unstable from leftover targeting rules. Structured governance ensures flags are testable, traceable, temporary, and safe. Testing both states prevents surprises when flags flip in production.

## Pattern Examples

### Example 1: Feature Flag Enum Pattern with Type Safety

**Context**: Centralized flag management with TypeScript type safety and runtime validation.

**Implementation**:

```typescript
// src/utils/feature-flags.ts
/**
 * Centralized feature flag definitions
 * - Object.freeze prevents runtime modifications
 * - TypeScript ensures compile-time type safety
 * - Single source of truth for all flag keys
 */
export const FLAGS = Object.freeze({
  // User-facing features
  NEW_CHECKOUT_FLOW: 'new-checkout-flow',
  DARK_MODE: 'dark-mode',
  ENHANCED_SEARCH: 'enhanced-search',

  // Experiments
  PRICING_EXPERIMENT_A: 'pricing-experiment-a',
  HOMEPAGE_VARIANT_B: 'homepage-variant-b',

  // Infrastructure
  USE_NEW_API_ENDPOINT: 'use-new-api-endpoint',
  ENABLE_ANALYTICS_V2: 'enable-analytics-v2',

  // Killswitches (emergency disables)
  DISABLE_PAYMENT_PROCESSING: 'disable-payment-processing',
  DISABLE_EMAIL_NOTIFICATIONS: 'disable-email-notifications',
} as const);

/**
 * Type-safe flag keys
 * Prevents typos and ensures autocomplete in IDEs
 */
export type FlagKey = (typeof FLAGS)[keyof typeof FLAGS];

/**
 * Flag metadata for governance
 */
type FlagMetadata = {
  key: FlagKey;
  name: string;
  owner: string;
  createdDate: string;
  expiryDate?: string;
  defaultState: boolean;
  requiresCleanup: boolean;
  dependencies?: FlagKey[];
  telemetryEvents?: string[];
};

/**
 * Flag registry with governance metadata
 * Used for flag lifecycle tracking and cleanup alerts
 */
export const FLAG_REGISTRY: Record<FlagKey, FlagMetadata> = {
  [FLAGS.NEW_CHECKOUT_FLOW]: {
    key: FLAGS.NEW_CHECKOUT_FLOW,
    name: 'New Checkout Flow',
    owner: 'payments-team',
    createdDate: '2025-01-15',
    expiryDate: '2025-03-15',
    defaultState: false,
    requiresCleanup: true,
    dependencies: [FLAGS.USE_NEW_API_ENDPOINT],
    telemetryEvents: ['checkout_started', 'checkout_completed'],
  },
  [FLAGS.DARK_MODE]: {
    key: FLAGS.DARK_MODE,
    name: 'Dark Mode UI',
    owner: 'frontend-team',
    createdDate: '2025-01-10',
    defaultState: false,
    requiresCleanup: false, // Permanent feature toggle
  },
  // ... rest of registry
};

/**
 * Validate flag exists in registry
 * Throws at runtime if flag is unregistered
 */
export function validateFlag(flag: string): asserts flag is FlagKey {
  if (!Object.values(FLAGS).includes(flag as FlagKey)) {
    throw new Error(`Unregistered feature flag: ${flag}`);
  }
}

/**
 * Check if flag is expired (needs removal)
 */
export function isFlagExpired(flag: FlagKey): boolean {
  const metadata = FLAG_REGISTRY[flag];
  if (!metadata.expiryDate) return false;

  const expiry = new Date(metadata.expiryDate);
  return Date.now() > expiry.getTime();
}

/**
 * Get all expired flags requiring cleanup
 */
export function getExpiredFlags(): FlagMetadata[] {
  return Object.values(FLAG_REGISTRY).filter((meta) => isFlagExpired(meta.key));
}
```

**Usage in application code**:

```typescript
// components/Checkout.tsx
import { FLAGS } from '@/utils/feature-flags';
import { useFeatureFlag } from '@/hooks/useFeatureFlag';

export function Checkout() {
  const isNewFlow = useFeatureFlag(FLAGS.NEW_CHECKOUT_FLOW);

  return isNewFlow ? <NewCheckoutFlow /> : <LegacyCheckoutFlow />;
}
```

**Key Points**:

- **Type safety**: TypeScript catches typos at compile time
- **Runtime validation**: validateFlag ensures only registered flags are used
- **Metadata tracking**: Owner, dates, dependencies documented
- **Expiry alerts**: Automated detection of stale flags
- **Single source of truth**: All flags defined in one place

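The `useFeatureFlag` hook referenced above is not shown in this fragment. A minimal sketch of what it could look like, assuming a React app, a flag client exposed on a context, and the `window.__STUBBED_FLAGS__` escape hatch from Example 3; the provider and `isEnabled` call are assumptions, not a canonical implementation:

```typescript
// src/hooks/useFeatureFlag.ts (illustrative sketch)
import { useContext } from 'react';
import { FlagKey, validateFlag } from '@/utils/feature-flags';
import { FlagClientContext } from '@/providers/flag-client'; // assumed provider

export function useFeatureFlag(flag: FlagKey): boolean {
  validateFlag(flag); // fail fast on unregistered flags
  const flagClient = useContext(FlagClientContext);

  // Test escape hatch: stubbed flags injected by the test harness win
  const stubbed = (window as any).__STUBBED_FLAGS__ as Record<string, boolean> | undefined;
  if (stubbed && flag in stubbed) return stubbed[flag];

  // Otherwise ask the real flag client (LaunchDarkly, Unleash, ...)
  return flagClient?.isEnabled(flag) ?? false;
}
```
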
---

### Example 2: Feature Flag Testing Pattern (Both States)

**Context**: Comprehensive testing of feature flag variations with proper cleanup.

**Implementation**:

```typescript
// tests/e2e/checkout-feature-flag.spec.ts
import { test, expect } from '@playwright/test';
import { FLAGS } from '@/utils/feature-flags';

/**
 * Feature Flag Testing Strategy:
 * 1. Test BOTH enabled and disabled states
 * 2. Clean up targeting after each test
 * 3. Use dedicated test users (not production data)
 * 4. Verify telemetry events fire correctly
 */

test.describe('Checkout Flow - Feature Flag Variations', () => {
  let testUserId: string;

  test.beforeEach(async () => {
    // Generate unique test user ID
    testUserId = `test-user-${Date.now()}`;
  });

  test.afterEach(async ({ request }) => {
    // CRITICAL: Clean up flag targeting to prevent shared env pollution
    await request.post('/api/feature-flags/cleanup', {
      data: {
        flagKey: FLAGS.NEW_CHECKOUT_FLOW,
        userId: testUserId,
      },
    });
  });

  test('should use NEW checkout flow when flag is ENABLED', async ({ page, request }) => {
    // Arrange: Enable flag for test user
    await request.post('/api/feature-flags/target', {
      data: {
        flagKey: FLAGS.NEW_CHECKOUT_FLOW,
        userId: testUserId,
        variation: true, // ENABLED
      },
    });

    // Act: Navigate as targeted user (header must be registered before navigation)
    await page.setExtraHTTPHeaders({ 'X-Test-User-ID': testUserId });
    await page.goto('/checkout');

    // Assert: New flow UI elements visible
    await expect(page.getByTestId('checkout-v2-container')).toBeVisible();
    await expect(page.getByTestId('express-payment-options')).toBeVisible();
    await expect(page.getByTestId('saved-addresses-dropdown')).toBeVisible();

    // Assert: Legacy flow NOT visible
    await expect(page.getByTestId('checkout-v1-container')).not.toBeVisible();

    // Assert: Telemetry event fired
    const analyticsEvents = await page.evaluate(() => (window as any).__ANALYTICS_EVENTS__ || []);
    expect(analyticsEvents).toContainEqual(
      expect.objectContaining({
        event: 'checkout_started',
        properties: expect.objectContaining({
          variant: 'new_flow',
        }),
      }),
    );
  });

  test('should use LEGACY checkout flow when flag is DISABLED', async ({ page, request }) => {
    // Arrange: Disable flag for test user (or don't target at all)
    await request.post('/api/feature-flags/target', {
      data: {
        flagKey: FLAGS.NEW_CHECKOUT_FLOW,
        userId: testUserId,
        variation: false, // DISABLED
      },
    });

    // Act: Navigate as targeted user
    await page.setExtraHTTPHeaders({ 'X-Test-User-ID': testUserId });
    await page.goto('/checkout');

    // Assert: Legacy flow UI elements visible
    await expect(page.getByTestId('checkout-v1-container')).toBeVisible();
    await expect(page.getByTestId('legacy-payment-form')).toBeVisible();

    // Assert: New flow NOT visible
    await expect(page.getByTestId('checkout-v2-container')).not.toBeVisible();
    await expect(page.getByTestId('express-payment-options')).not.toBeVisible();

    // Assert: Telemetry event fired with correct variant
    const analyticsEvents = await page.evaluate(() => (window as any).__ANALYTICS_EVENTS__ || []);
    expect(analyticsEvents).toContainEqual(
      expect.objectContaining({
        event: 'checkout_started',
        properties: expect.objectContaining({
          variant: 'legacy_flow',
        }),
      }),
    );
  });

  test('should handle flag evaluation errors gracefully', async ({ page }) => {
    // Collect console errors from the start so load-time failures are captured
    const consoleErrors: string[] = [];
    page.on('console', (msg) => {
      if (msg.type() === 'error') consoleErrors.push(msg.text());
    });

    // Arrange: Simulate flag service unavailable
    await page.route('**/api/feature-flags/evaluate', (route) => route.fulfill({ status: 500, body: 'Service Unavailable' }));

    // Act: Navigate (should fallback to default state)
    await page.setExtraHTTPHeaders({ 'X-Test-User-ID': testUserId });
    await page.goto('/checkout');

    // Assert: Fallback to safe default (legacy flow)
    await expect(page.getByTestId('checkout-v1-container')).toBeVisible();

    // Assert: Error logged but no user-facing error
    expect(consoleErrors).toContainEqual(expect.stringContaining('Feature flag evaluation failed'));
  });
});
```

**Cypress equivalent**:

```javascript
// cypress/e2e/checkout-feature-flag.cy.ts
import { FLAGS } from '@/utils/feature-flags';

describe('Checkout Flow - Feature Flag Variations', () => {
  let testUserId;

  beforeEach(() => {
    testUserId = `test-user-${Date.now()}`;
  });

  afterEach(() => {
    // Clean up targeting
    cy.task('removeFeatureFlagTarget', {
      flagKey: FLAGS.NEW_CHECKOUT_FLOW,
      userId: testUserId,
    });
  });

  it('should use NEW checkout flow when flag is ENABLED', () => {
    // Arrange: Enable flag via Cypress task
    cy.task('setFeatureFlagVariation', {
      flagKey: FLAGS.NEW_CHECKOUT_FLOW,
      userId: testUserId,
      variation: true,
    });

    // Act
    cy.visit('/checkout', {
      headers: { 'X-Test-User-ID': testUserId },
    });

    // Assert
    cy.get('[data-testid="checkout-v2-container"]').should('be.visible');
    cy.get('[data-testid="checkout-v1-container"]').should('not.exist');
  });

  it('should use LEGACY checkout flow when flag is DISABLED', () => {
    // Arrange: Disable flag
    cy.task('setFeatureFlagVariation', {
      flagKey: FLAGS.NEW_CHECKOUT_FLOW,
      userId: testUserId,
      variation: false,
    });

    // Act
    cy.visit('/checkout', {
      headers: { 'X-Test-User-ID': testUserId },
    });

    // Assert
    cy.get('[data-testid="checkout-v1-container"]').should('be.visible');
    cy.get('[data-testid="checkout-v2-container"]').should('not.exist');
  });
});
```

**Key Points**:

- **Test both states**: Enabled AND disabled variations
- **Automatic cleanup**: afterEach removes targeting (prevent pollution)
- **Unique test users**: Avoid conflicts with real user data
- **Telemetry validation**: Verify analytics events fire correctly
- **Graceful degradation**: Test fallback behavior on errors

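The `cy.task` calls above assume matching task registrations in the Cypress config. A minimal sketch of that wiring, reusing the API helpers from Example 3; the helper import path and task shapes are assumptions:

```typescript
// cypress.config.ts (illustrative sketch)
import { defineConfig } from 'cypress';
import { setFlagForUser, removeFlagTarget } from './tests/support/feature-flag-helpers';

export default defineConfig({
  e2e: {
    setupNodeEvents(on) {
      on('task', {
        // Tasks must return a value (or null) so Cypress knows they finished
        async setFeatureFlagVariation({ flagKey, userId, variation }) {
          await setFlagForUser(flagKey, userId, variation);
          return null;
        },
        async removeFeatureFlagTarget({ flagKey, userId }) {
          await removeFlagTarget(flagKey, userId);
          return null;
        },
      });
    },
  },
});
```
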
---

### Example 3: Feature Flag Targeting Helper Pattern

**Context**: Reusable helpers for programmatic flag control via LaunchDarkly/Split.io API.

**Implementation**:

```typescript
// tests/support/feature-flag-helpers.ts
import { request as playwrightRequest } from '@playwright/test';
import { FLAGS, FlagKey } from '@/utils/feature-flags';

/**
 * LaunchDarkly API client configuration
 * Use test project SDK key (NOT production)
 */
const LD_SDK_KEY = process.env.LD_SDK_KEY_TEST;
const LD_API_BASE = 'https://app.launchdarkly.com/api/v2';

type FlagVariation = boolean | string | number | object;

/**
 * Set flag variation for specific user
 * Uses LaunchDarkly API to create user target
 */
export async function setFlagForUser(flagKey: FlagKey, userId: string, variation: FlagVariation): Promise<void> {
  const response = await playwrightRequest.newContext().then((ctx) =>
    ctx.post(`${LD_API_BASE}/flags/${flagKey}/targeting`, {
      headers: {
        Authorization: LD_SDK_KEY!,
        'Content-Type': 'application/json',
      },
      data: {
        targets: [
          {
            values: [userId],
            variation: variation ? 1 : 0, // 0 = off, 1 = on
          },
        ],
      },
    }),
  );

  if (!response.ok()) {
    throw new Error(`Failed to set flag ${flagKey} for user ${userId}: ${response.status()}`);
  }
}

/**
 * Remove user from flag targeting
 * CRITICAL for test cleanup
 */
export async function removeFlagTarget(flagKey: FlagKey, userId: string): Promise<void> {
  const response = await playwrightRequest.newContext().then((ctx) =>
    ctx.delete(`${LD_API_BASE}/flags/${flagKey}/targeting/users/${userId}`, {
      headers: {
        Authorization: LD_SDK_KEY!,
      },
    }),
  );

  if (!response.ok() && response.status() !== 404) {
    // 404 is acceptable (user wasn't targeted)
    throw new Error(`Failed to remove flag ${flagKey} target for user ${userId}: ${response.status()}`);
  }
}

/**
 * Percentage rollout helper
 * Enable flag for N% of users
 */
export async function setFlagRolloutPercentage(flagKey: FlagKey, percentage: number): Promise<void> {
  if (percentage < 0 || percentage > 100) {
    throw new Error('Percentage must be between 0 and 100');
  }

  const response = await playwrightRequest.newContext().then((ctx) =>
    ctx.patch(`${LD_API_BASE}/flags/${flagKey}`, {
      headers: {
        Authorization: LD_SDK_KEY!,
        'Content-Type': 'application/json',
      },
      data: {
        rollout: {
          variations: [
            { variation: 0, weight: 100 - percentage }, // off
            { variation: 1, weight: percentage }, // on
          ],
        },
      },
    }),
  );

  if (!response.ok()) {
    throw new Error(`Failed to set rollout for flag ${flagKey}: ${response.status()}`);
  }
}

/**
 * Enable flag globally (100% rollout)
 */
export async function enableFlagGlobally(flagKey: FlagKey): Promise<void> {
  await setFlagRolloutPercentage(flagKey, 100);
}

/**
 * Disable flag globally (0% rollout)
 */
export async function disableFlagGlobally(flagKey: FlagKey): Promise<void> {
  await setFlagRolloutPercentage(flagKey, 0);
}

/**
 * Stub feature flags in local/test environments
 * Bypasses LaunchDarkly entirely
 */
export function stubFeatureFlags(flags: Record<FlagKey, FlagVariation>): void {
  // Set flags in localStorage or inject into window
  if (typeof window !== 'undefined') {
    (window as any).__STUBBED_FLAGS__ = flags;
  }
}
```

**Usage in Playwright fixture**:

```typescript
// playwright/fixtures/feature-flag-fixture.ts
import { test as base } from '@playwright/test';
import { setFlagForUser, removeFlagTarget } from '../support/feature-flag-helpers';
import { FlagKey } from '@/utils/feature-flags';

type FeatureFlagFixture = {
  featureFlags: {
    enable: (flag: FlagKey, userId: string) => Promise<void>;
    disable: (flag: FlagKey, userId: string) => Promise<void>;
    cleanup: (flag: FlagKey, userId: string) => Promise<void>;
  };
};

export const test = base.extend<FeatureFlagFixture>({
  featureFlags: async ({}, use) => {
    const cleanupQueue: Array<{ flag: FlagKey; userId: string }> = [];

    await use({
      enable: async (flag, userId) => {
        await setFlagForUser(flag, userId, true);
        cleanupQueue.push({ flag, userId });
      },
      disable: async (flag, userId) => {
        await setFlagForUser(flag, userId, false);
        cleanupQueue.push({ flag, userId });
      },
      cleanup: async (flag, userId) => {
        await removeFlagTarget(flag, userId);
      },
    });

    // Auto-cleanup after test
    for (const { flag, userId } of cleanupQueue) {
      await removeFlagTarget(flag, userId);
    }
  },
});
```

**Key Points**:

- **API-driven control**: No manual UI clicks required
- **Auto-cleanup**: Fixture tracks and removes targeting
- **Percentage rollouts**: Test gradual feature releases
- **Stubbing option**: Local development without LaunchDarkly
- **Type-safe**: FlagKey prevents typos

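`stubFeatureFlags` runs in the browser, so a test has to get the stub there before the app boots. One hedged way to do that in Playwright, assuming the app checks `window.__STUBBED_FLAGS__` before asking the flag client; the `dark-mode-toggle` test id is an assumption:

```typescript
// tests/e2e/stubbed-flags.spec.ts (illustrative sketch)
import { test, expect } from '@playwright/test';
import { FLAGS } from '@/utils/feature-flags';

test('dark mode renders when stubbed on', async ({ page }) => {
  // Seed the stub before any app code runs
  await page.addInitScript(
    (flags) => {
      (window as any).__STUBBED_FLAGS__ = flags;
    },
    { [FLAGS.DARK_MODE]: true },
  );

  await page.goto('/');
  await expect(page.getByTestId('dark-mode-toggle')).toBeVisible(); // assumed test id
});
```
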
---

### Example 4: Feature Flag Lifecycle Checklist & Cleanup Strategy

**Context**: Governance checklist and automated cleanup detection for stale flags.

**Implementation**:

```typescript
// scripts/feature-flag-audit.ts
/**
 * Feature Flag Lifecycle Audit Script
 * Run weekly to detect stale flags requiring cleanup
 */

import { FLAG_REGISTRY, FLAGS, getExpiredFlags, FlagKey } from '../src/utils/feature-flags';
import * as fs from 'fs';
import * as path from 'path';

type AuditResult = {
  totalFlags: number;
  expiredFlags: FlagKey[];
  missingOwners: FlagKey[];
  missingDates: FlagKey[];
  permanentFlags: FlagKey[];
  flagsNearingExpiry: FlagKey[];
};

/**
 * Audit all feature flags for governance compliance
 */
function auditFeatureFlags(): AuditResult {
  const allFlags = Object.keys(FLAG_REGISTRY) as FlagKey[];
  const expiredFlags = getExpiredFlags().map((meta) => meta.key);

  // Flags expiring in next 30 days
  const thirtyDaysFromNow = Date.now() + 30 * 24 * 60 * 60 * 1000;
  const flagsNearingExpiry = allFlags.filter((flag) => {
    const meta = FLAG_REGISTRY[flag];
    if (!meta.expiryDate) return false;
    const expiry = new Date(meta.expiryDate).getTime();
    return expiry > Date.now() && expiry < thirtyDaysFromNow;
  });

  // Missing metadata
  const missingOwners = allFlags.filter((flag) => !FLAG_REGISTRY[flag].owner);
  const missingDates = allFlags.filter((flag) => !FLAG_REGISTRY[flag].createdDate);

  // Permanent flags (no expiry, requiresCleanup = false)
  const permanentFlags = allFlags.filter((flag) => {
    const meta = FLAG_REGISTRY[flag];
    return !meta.expiryDate && !meta.requiresCleanup;
  });

  return {
    totalFlags: allFlags.length,
    expiredFlags,
    missingOwners,
    missingDates,
    permanentFlags,
    flagsNearingExpiry,
  };
}

/**
 * Generate markdown report
 */
function generateReport(audit: AuditResult): string {
  let report = `# Feature Flag Audit Report\n\n`;
  report += `**Date**: ${new Date().toISOString()}\n`;
  report += `**Total Flags**: ${audit.totalFlags}\n\n`;

  if (audit.expiredFlags.length > 0) {
    report += `## ⚠️ EXPIRED FLAGS - IMMEDIATE CLEANUP REQUIRED\n\n`;
    audit.expiredFlags.forEach((flag) => {
      const meta = FLAG_REGISTRY[flag];
      report += `- **${meta.name}** (\`${flag}\`)\n`;
      report += `  - Owner: ${meta.owner}\n`;
      report += `  - Expired: ${meta.expiryDate}\n`;
      report += `  - Action: Remove flag code, update tests, deploy\n\n`;
    });
  }

  if (audit.flagsNearingExpiry.length > 0) {
    report += `## ⏰ FLAGS EXPIRING SOON (Next 30 Days)\n\n`;
    audit.flagsNearingExpiry.forEach((flag) => {
      const meta = FLAG_REGISTRY[flag];
      report += `- **${meta.name}** (\`${flag}\`)\n`;
      report += `  - Owner: ${meta.owner}\n`;
      report += `  - Expires: ${meta.expiryDate}\n`;
      report += `  - Action: Plan cleanup or extend expiry\n\n`;
    });
  }

  if (audit.permanentFlags.length > 0) {
    report += `## 🔄 PERMANENT FLAGS (No Expiry)\n\n`;
    audit.permanentFlags.forEach((flag) => {
      const meta = FLAG_REGISTRY[flag];
      report += `- **${meta.name}** (\`${flag}\`) - Owner: ${meta.owner}\n`;
    });
    report += `\n`;
  }

  if (audit.missingOwners.length > 0 || audit.missingDates.length > 0) {
    report += `## ❌ GOVERNANCE ISSUES\n\n`;
    if (audit.missingOwners.length > 0) {
      report += `**Missing Owners**: ${audit.missingOwners.join(', ')}\n`;
    }
    if (audit.missingDates.length > 0) {
      report += `**Missing Created Dates**: ${audit.missingDates.join(', ')}\n`;
    }
    report += `\n`;
  }

  return report;
}

/**
 * Feature Flag Lifecycle Checklist
 */
const FLAG_LIFECYCLE_CHECKLIST = `
# Feature Flag Lifecycle Checklist

## Before Creating a New Flag

- [ ] **Name**: Follow naming convention (kebab-case, descriptive)
- [ ] **Owner**: Assign team/individual responsible
- [ ] **Default State**: Determine safe default (usually false)
- [ ] **Expiry Date**: Set removal date (30-90 days typical)
- [ ] **Dependencies**: Document related flags
- [ ] **Telemetry**: Plan analytics events to track
- [ ] **Rollback Plan**: Define how to disable quickly

## During Development

- [ ] **Code Paths**: Both enabled/disabled states implemented
- [ ] **Tests**: Both variations tested in CI
- [ ] **Documentation**: Flag purpose documented in code/PR
- [ ] **Telemetry**: Analytics events instrumented
- [ ] **Error Handling**: Graceful degradation on flag service failure

## Before Launch

- [ ] **QA**: Both states tested in staging
- [ ] **Rollout Plan**: Gradual rollout percentage defined
- [ ] **Monitoring**: Dashboards/alerts for flag-related metrics
- [ ] **Stakeholder Communication**: Product/design aligned

## After Launch (Monitoring)

- [ ] **Metrics**: Success criteria tracked
- [ ] **Error Rates**: No increase in errors
- [ ] **Performance**: No degradation
- [ ] **User Feedback**: Qualitative data collected

## Cleanup (Post-Launch)

- [ ] **Remove Flag Code**: Delete if/else branches
- [ ] **Update Tests**: Remove flag-specific tests
- [ ] **Remove Targeting**: Clear all user targets
- [ ] **Delete Flag Config**: Remove from LaunchDarkly/registry
- [ ] **Update Documentation**: Remove references
- [ ] **Deploy**: Ship cleanup changes
`;

// Run audit
const audit = auditFeatureFlags();
const report = generateReport(audit);

// Save report
const outputPath = path.join(__dirname, '../feature-flag-audit-report.md');
fs.writeFileSync(outputPath, report);
fs.writeFileSync(path.join(__dirname, '../FEATURE-FLAG-CHECKLIST.md'), FLAG_LIFECYCLE_CHECKLIST);

console.log(`✅ Audit complete. Report saved to: ${outputPath}`);
console.log(`Total flags: ${audit.totalFlags}`);
console.log(`Expired flags: ${audit.expiredFlags.length}`);
console.log(`Flags expiring soon: ${audit.flagsNearingExpiry.length}`);

// Exit with error if expired flags exist
if (audit.expiredFlags.length > 0) {
  console.error(`\n❌ EXPIRED FLAGS DETECTED - CLEANUP REQUIRED`);
  process.exit(1);
}
```

**package.json scripts**:

```json
{
  "scripts": {
    "feature-flags:audit": "ts-node scripts/feature-flag-audit.ts",
    "feature-flags:audit:ci": "npm run feature-flags:audit || true"
  }
}
```

**Key Points**:

- **Automated detection**: Weekly audit catches stale flags
- **Lifecycle checklist**: Comprehensive governance guide
- **Expiry tracking**: Flags auto-expire after defined date
- **CI integration**: Audit runs in pipeline, warns on expiry
- **Ownership clarity**: Every flag has assigned owner

---

## Feature Flag Testing Checklist

Before merging flag-related code, verify:

- [ ] **Both states tested**: Enabled AND disabled variations covered
- [ ] **Cleanup automated**: afterEach removes targeting (no manual cleanup)
- [ ] **Unique test data**: Test users don't collide with production
- [ ] **Telemetry validated**: Analytics events fire for both variations
- [ ] **Error handling**: Graceful fallback when flag service unavailable
- [ ] **Flag metadata**: Owner, dates, dependencies documented in registry
- [ ] **Rollback plan**: Clear steps to disable flag in production
- [ ] **Expiry date set**: Removal date defined (or marked permanent)

## Integration Points

- Used in workflows: `*automate` (test generation), `*framework` (flag setup)
- Related fragments: `test-quality.md`, `selective-testing.md`
- Flag services: LaunchDarkly, Split.io, Unleash, custom implementations

_Source: LaunchDarkly strategy blog, Murat test architecture notes, SEON feature flag governance_

@@ -1,9 +1,401 @@
# Fixture Architecture Playbook

- Build helpers as pure functions first, then expose them via Playwright `extend` or Cypress commands so logic stays testable in isolation.
- Compose capabilities with `mergeTests` (Playwright) or layered Cypress commands instead of inheritance; each fixture should solve one concern (auth, api, logs, network).
- Keep HTTP helpers framework agnostic—accept all required params explicitly and return results so unit tests and runtime fixtures can share them.
- Export fixtures through package subpaths (`"./api-request"`, `"./api-request/fixtures"`) to make reuse trivial across suites and projects.
- Treat fixture files as infrastructure: document dependencies, enforce deterministic timeouts, and ban hidden retries that mask flakiness.

_Source: Murat Testing Philosophy, cy-vs-pw comparison, SEON production patterns._

## Principle

Build test helpers as pure functions first, then wrap them in framework-specific fixtures. Compose capabilities using `mergeTests` (Playwright) or layered commands (Cypress) instead of inheritance. Each fixture should solve one isolated concern (auth, API, logs, network).

## Rationale

Traditional Page Object Models create tight coupling through inheritance chains (`BasePage → LoginPage → AdminPage`). When base classes change, all descendants break. Pure functions with fixture wrappers provide:

- **Testability**: Pure functions run in unit tests without framework overhead
- **Composability**: Mix capabilities freely via `mergeTests`, no inheritance constraints
- **Reusability**: Export fixtures via package subpaths for cross-project sharing
- **Maintainability**: One concern per fixture = clear responsibility boundaries

## Pattern Examples

### Example 1: Pure Function → Fixture Pattern

**Context**: When building any test helper, always start with a pure function that accepts all dependencies explicitly. Then wrap it in a Playwright fixture or Cypress command.

**Implementation**:

```typescript
// playwright/support/helpers/api-request.ts
import type { APIRequestContext } from '@playwright/test';

// Step 1: Pure function (ALWAYS FIRST!)
type ApiRequestParams = {
  request: APIRequestContext;
  method: 'GET' | 'POST' | 'PUT' | 'DELETE';
  url: string;
  data?: unknown;
  headers?: Record<string, string>;
};

export async function apiRequest({
  request,
  method,
  url,
  data,
  headers = {}
}: ApiRequestParams) {
  const response = await request.fetch(url, {
    method,
    data,
    headers: {
      'Content-Type': 'application/json',
      ...headers
    }
  });

  if (!response.ok()) {
    throw new Error(`API request failed: ${response.status()} ${await response.text()}`);
  }

  return response.json();
}

// Step 2: Fixture wrapper
// playwright/support/fixtures/api-request-fixture.ts
import { test as base } from '@playwright/test';
import { apiRequest } from '../helpers/api-request';

export const test = base.extend<{ apiRequest: typeof apiRequest }>({
  apiRequest: async ({ request }, use) => {
    // Inject framework dependency, expose pure function
    await use((params) => apiRequest({ request, ...params }));
  }
});

// Step 3: Package exports for reusability
// package.json
{
  "exports": {
    "./api-request": "./playwright/support/helpers/api-request.ts",
    "./api-request/fixtures": "./playwright/support/fixtures/api-request-fixture.ts"
  }
}
```

**Key Points**:

- Pure function is unit-testable without Playwright running
- Framework dependency (`request`) injected at fixture boundary
- Fixture exposes the pure function to test context
- Package subpath exports enable `import { apiRequest } from 'my-fixtures/api-request'`

### Example 2: Composable Fixture System with mergeTests

**Context**: When building comprehensive test capabilities, compose multiple focused fixtures instead of creating monolithic helper classes. Each fixture provides one capability.

**Implementation**:

```typescript
// playwright/support/fixtures/merged-fixtures.ts
import { test as base, mergeTests } from '@playwright/test';
import { test as apiRequestFixture } from './api-request-fixture';
import { test as networkFixture } from './network-fixture';
import { test as authFixture } from './auth-fixture';
import { test as logFixture } from './log-fixture';

// Compose all fixtures for comprehensive capabilities
export const test = mergeTests(base, apiRequestFixture, networkFixture, authFixture, logFixture);

export { expect } from '@playwright/test';

// Example usage in tests:
// import { test, expect } from './support/fixtures/merged-fixtures';
//
// test('user can create order', async ({ page, apiRequest, auth, network }) => {
//   await auth.loginAs('customer@example.com');
//   await network.interceptRoute('POST', '**/api/orders', { id: 123 });
//   await page.goto('/checkout');
//   await page.click('[data-testid="submit-order"]');
//   await expect(page.getByText('Order #123')).toBeVisible();
// });
```

**Individual Fixture Examples**:

```typescript
// network-fixture.ts
import { test as base } from '@playwright/test';

export const test = base.extend({
  network: async ({ page }, use) => {
    const interceptedRoutes = new Map();

    const interceptRoute = async (method: string, url: string, response: unknown) => {
      await page.route(url, (route) => {
        if (route.request().method() === method) {
          route.fulfill({ body: JSON.stringify(response) });
        } else {
          route.continue(); // let non-matching methods hit the network
        }
      });
      interceptedRoutes.set(`${method}:${url}`, response);
    };

    await use({ interceptRoute });

    // Cleanup
    interceptedRoutes.clear();
  },
});

// auth-fixture.ts
import { test as base } from '@playwright/test';

export const test = base.extend({
  auth: async ({ page, context }, use) => {
    const loginAs = async (email: string) => {
      // Use API to setup auth (fast!)
      // getAuthToken is a project helper that exchanges an email for a session token
      const token = await getAuthToken(email);
      await context.addCookies([
        {
          name: 'auth_token',
          value: token,
          domain: 'localhost',
          path: '/',
        },
      ]);
    };

    await use({ loginAs });
  },
});
```

**Key Points**:

- `mergeTests` combines fixtures without inheritance
- Each fixture has single responsibility (network, auth, logs)
- Tests import merged fixture and access all capabilities
- No coupling between fixtures—add/remove freely

### Example 3: Framework-Agnostic HTTP Helper

**Context**: When building HTTP helpers, keep them framework-agnostic. Accept all params explicitly so they work in unit tests, Playwright, Cypress, or any context.

**Implementation**:

```typescript
// shared/helpers/http-helper.ts
// Pure, framework-agnostic function
type HttpHelperParams = {
  baseUrl: string;
  endpoint: string;
  method: 'GET' | 'POST' | 'PUT' | 'DELETE';
  body?: unknown;
  headers?: Record<string, string>;
  token?: string;
};

export async function makeHttpRequest({ baseUrl, endpoint, method, body, headers = {}, token }: HttpHelperParams): Promise<unknown> {
  const url = `${baseUrl}${endpoint}`;
  const requestHeaders = {
    'Content-Type': 'application/json',
    ...(token && { Authorization: `Bearer ${token}` }),
    ...headers,
  };

  const response = await fetch(url, {
    method,
    headers: requestHeaders,
    body: body ? JSON.stringify(body) : undefined,
  });

  if (!response.ok) {
    const errorText = await response.text();
    throw new Error(`HTTP ${method} ${url} failed: ${response.status} ${errorText}`);
  }

  return response.json();
}

// Playwright fixture wrapper
// playwright/support/fixtures/http-fixture.ts
import { test as base } from '@playwright/test';
import { makeHttpRequest } from '../../shared/helpers/http-helper';

export const test = base.extend({
  httpHelper: async ({}, use) => {
    const baseUrl = process.env.API_BASE_URL || 'http://localhost:3000';

    await use((params) => makeHttpRequest({ baseUrl, ...params }));
  },
});

// Cypress command wrapper
// cypress/support/commands.ts
import { makeHttpRequest } from '../../shared/helpers/http-helper';

Cypress.Commands.add('apiRequest', (params) => {
  const baseUrl = Cypress.env('API_BASE_URL') || 'http://localhost:3000';
  return cy.wrap(makeHttpRequest({ baseUrl, ...params }));
});
```

**Key Points**:

- Pure function uses only standard `fetch`, no framework dependencies
- Unit tests call `makeHttpRequest` directly with all params
- Playwright and Cypress wrappers inject framework-specific config
- Same logic runs everywhere—zero duplication

### Example 4: Fixture Cleanup Pattern

**Context**: When fixtures create resources (data, files, connections), ensure automatic cleanup in fixture teardown. Tests must not leak state.

**Implementation**:

```typescript
// playwright/support/fixtures/database-fixture.ts
import { test as base } from '@playwright/test';
import { seedDatabase, deleteRecord } from '../helpers/db-helpers';

type DatabaseFixture = {
  seedUser: (userData: Partial<User>) => Promise<User>;
  seedOrder: (orderData: Partial<Order>) => Promise<Order>;
};

export const test = base.extend<DatabaseFixture>({
  seedUser: async ({}, use) => {
    const createdUsers: string[] = [];

    const seedUser = async (userData: Partial<User>) => {
      const user = await seedDatabase('users', userData);
      createdUsers.push(user.id);
      return user;
    };

    await use(seedUser);

    // Auto-cleanup: Delete all users created during test
    for (const userId of createdUsers) {
      await deleteRecord('users', userId);
    }
    createdUsers.length = 0;
  },

  seedOrder: async ({}, use) => {
    const createdOrders: string[] = [];

    const seedOrder = async (orderData: Partial<Order>) => {
      const order = await seedDatabase('orders', orderData);
      createdOrders.push(order.id);
      return order;
    };

    await use(seedOrder);

    // Auto-cleanup: Delete all orders
    for (const orderId of createdOrders) {
      await deleteRecord('orders', orderId);
    }
    createdOrders.length = 0;
  },
});

// Example usage:
// test('user can place order', async ({ seedUser, seedOrder, page }) => {
//   const user = await seedUser({ email: 'test@example.com' });
//   const order = await seedOrder({ userId: user.id, total: 100 });
//
//   await page.goto(`/orders/${order.id}`);
//   await expect(page.getByText('Order Total: $100')).toBeVisible();
//
//   // No manual cleanup needed—fixture handles it automatically
// });
```

**Key Points**:

- Track all created resources in array during test execution
- Teardown (after `use()`) deletes all tracked resources
- Tests don't manually clean up—happens automatically
- Prevents test pollution and flakiness from shared state

### Anti-Pattern: Inheritance-Based Page Objects

**Problem**:

```typescript
// ❌ BAD: Page Object Model with inheritance
class BasePage {
  constructor(public page: Page) {}

  async navigate(url: string) {
    await this.page.goto(url);
  }

  async clickButton(selector: string) {
    await this.page.click(selector);
  }
}

class LoginPage extends BasePage {
  async login(email: string, password: string) {
    await this.navigate('/login');
    await this.page.fill('#email', email);
    await this.page.fill('#password', password);
    await this.clickButton('#submit');
  }
}

class AdminPage extends LoginPage {
  async accessAdminPanel() {
    await this.login('admin@example.com', 'admin123');
    await this.navigate('/admin');
  }
}
```

**Why It Fails**:

- Changes to `BasePage` break all descendants (`LoginPage`, `AdminPage`)
- `AdminPage` inherits unnecessary `login` details—tight coupling
- Cannot compose capabilities (e.g., admin + reporting features require multiple inheritance)
- Hard to test `BasePage` methods in isolation
- Hidden state in class instances leads to unpredictable behavior

**Better Approach**: Use pure functions + fixtures

```typescript
// ✅ GOOD: Pure functions with fixture composition
// helpers/navigation.ts
export async function navigate(page: Page, url: string) {
  await page.goto(url);
}

// helpers/auth.ts
export async function login(page: Page, email: string, password: string) {
  await page.fill('[data-testid="email"]', email);
  await page.fill('[data-testid="password"]', password);
  await page.click('[data-testid="submit"]');
}

// fixtures/admin-fixture.ts
export const test = base.extend({
  adminPage: async ({ page }, use) => {
    await login(page, 'admin@example.com', 'admin123');
    await navigate(page, '/admin');
    await use(page);
  },
});

// Tests import exactly what they need—no inheritance
```

## Integration Points

- **Used in workflows**: `*atdd` (test generation), `*automate` (test expansion), `*framework` (initial setup)
- **Related fragments**:
  - `data-factories.md` - Factory functions for test data
  - `network-first.md` - Network interception patterns
  - `test-quality.md` - Deterministic test design principles

## Helper Function Reuse Guidelines

When deciding whether to create a fixture, follow these rules:

- **3+ uses** → Create fixture with subpath export (shared across tests/projects)
- **2-3 uses** → Create utility module (shared within project)
- **1 use** → Keep inline (avoid premature abstraction)
- **Complex logic** → Factory function pattern (dynamic data generation; see the sketch below)

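A minimal sketch of the factory function pattern from the last bullet, assuming `@faker-js/faker` is available in the project; the `User` shape here is illustrative:

```typescript
// shared/factories/user-factory.ts (illustrative sketch)
import { faker } from '@faker-js/faker';

type User = {
  id: string;
  email: string;
  role: 'admin' | 'customer';
};

// Dynamic defaults with per-test overrides, e.g. createUser({ role: 'admin' })
export function createUser(overrides: Partial<User> = {}): User {
  return {
    id: faker.string.uuid(),
    email: faker.internet.email(),
    role: 'customer',
    ...overrides,
  };
}
```
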
_Source: Murat Testing Philosophy (lines 74-122), SEON production patterns, Playwright fixture docs._
@@ -1,9 +1,486 @@
# Network-First Safeguards

- Register interceptions before any navigation or user action; store the promise and await it immediately after the triggering step.
- Assert on structured responses (status, body schema, headers) instead of generic waits so failures surface with actionable context.
- Capture HAR files or Playwright traces on successful runs—reuse them for deterministic CI playback when upstream services flake.
- Prefer edge mocking: stub at service boundaries, never deep within the stack unless risk analysis demands it.
- Replace implicit waits with deterministic signals like `waitForResponse`, disappearance of spinners, or event hooks.

_Source: Murat Testing Philosophy, Playwright patterns book, blog on network interception._

## Principle

Register network interceptions **before** any navigation or user action. Store the interception promise and await it immediately after the triggering step. Replace implicit waits with deterministic signals based on network responses, spinner disappearance, or event hooks.

## Rationale

The most common source of flaky E2E tests is **race conditions** between navigation and network interception:

- Navigate then intercept = missed requests (too late)
- No explicit wait = assertion runs before response arrives
- Hard waits (`waitForTimeout(3000)`) = slow, unreliable, brittle

Network-first patterns provide:

- **Zero race conditions**: Intercept is active before triggering action
- **Deterministic waits**: Wait for actual response, not arbitrary timeouts
- **Actionable failures**: Assert on response status/body, not generic "element not found"
- **Speed**: No padding with extra wait time

## Pattern Examples

### Example 1: Intercept Before Navigate Pattern

**Context**: The foundational pattern for all E2E tests. Always register route interception **before** the action that triggers the request (navigation, click, form submit).

**Implementation**:

```typescript
// ✅ CORRECT: Intercept BEFORE navigate
test('user can view dashboard data', async ({ page }) => {
  // Step 1: Register interception FIRST
  const usersPromise = page.waitForResponse((resp) => resp.url().includes('/api/users') && resp.status() === 200);

  // Step 2: THEN trigger the request
  await page.goto('/dashboard');

  // Step 3: THEN await the response
  const usersResponse = await usersPromise;
  const users = await usersResponse.json();

  // Step 4: Assert on structured data
  expect(users).toHaveLength(10);
  await expect(page.getByText(users[0].name)).toBeVisible();
});

// Cypress equivalent
describe('Dashboard', () => {
  it('should display users', () => {
    // Step 1: Register interception FIRST
    cy.intercept('GET', '**/api/users').as('getUsers');

    // Step 2: THEN trigger
    cy.visit('/dashboard');

    // Step 3: THEN await
    cy.wait('@getUsers').then((interception) => {
      // Step 4: Assert on structured data
      expect(interception.response.statusCode).to.equal(200);
      expect(interception.response.body).to.have.length(10);
      cy.contains(interception.response.body[0].name).should('be.visible');
    });
  });
});

// ❌ WRONG: Navigate BEFORE intercept (race condition!)
test('flaky test example', async ({ page }) => {
  await page.goto('/dashboard'); // Request fires immediately

  const usersPromise = page.waitForResponse('/api/users'); // TOO LATE - might miss it
  const response = await usersPromise; // May timeout randomly
});
```

**Key Points**:

- Playwright: Use `page.waitForResponse()` with URL pattern or predicate **before** `page.goto()` or `page.click()`
- Cypress: Use `cy.intercept().as()` **before** `cy.visit()` or `cy.click()`
- Store promise/alias, trigger action, **then** await response
- This prevents 95% of race-condition flakiness in E2E tests

### Example 2: HAR Capture for Debugging

**Context**: When debugging flaky tests or building deterministic mocks, capture real network traffic with HAR files. Replay them in tests for consistent, offline-capable test runs.

**Implementation**:

```typescript
// playwright.config.ts - Enable HAR recording
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Record HAR on first run
    recordHar: { path: './hars/', mode: 'minimal' },
    // Or replay HAR in tests
    // serviceWorkers: 'block',
  },
});

// Capture HAR for specific test
test('capture network for order flow', async ({ page, context }) => {
  // Start recording
  await context.routeFromHAR('./hars/order-flow.har', {
    url: '**/api/**',
    update: true, // Update HAR with new requests
  });

  await page.goto('/checkout');
  await page.fill('[data-testid="credit-card"]', '4111111111111111');
  await page.click('[data-testid="submit-order"]');
  await expect(page.getByText('Order Confirmed')).toBeVisible();

  // HAR saved to ./hars/order-flow.har
});

// Replay HAR for deterministic tests (no real API needed)
test('replay order flow from HAR', async ({ page, context }) => {
  // Replay captured HAR
  await context.routeFromHAR('./hars/order-flow.har', {
    url: '**/api/**',
    update: false, // Read-only mode
  });

  // Test runs with exact recorded responses - fully deterministic
  await page.goto('/checkout');
  await page.fill('[data-testid="credit-card"]', '4111111111111111');
  await page.click('[data-testid="submit-order"]');
  await expect(page.getByText('Order Confirmed')).toBeVisible();
});

// Custom mock based on HAR insights
test('mock order response based on HAR', async ({ page }) => {
  // After analyzing HAR, create focused mock
  await page.route('**/api/orders', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({
        orderId: '12345',
        status: 'confirmed',
        total: 99.99,
      }),
    }),
  );

  await page.goto('/checkout');
  await page.click('[data-testid="submit-order"]');
  await expect(page.getByText('Order #12345')).toBeVisible();
});
```

**Key Points**:

- HAR files capture real request/response pairs for analysis
- `update: true` records new traffic; `update: false` replays existing
- Replay mode makes tests fully deterministic (no upstream API needed)
- Use HAR to understand API contracts, then create focused mocks

### Example 3: Network Stub with Edge Cases

**Context**: When testing error handling, timeouts, and edge cases, stub network responses to simulate failures. Test both happy path and error scenarios.

**Implementation**:

```typescript
// Test happy path
test('order succeeds with valid data', async ({ page }) => {
  await page.route('**/api/orders', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ orderId: '123', status: 'confirmed' }),
    }),
  );

  await page.goto('/checkout');
  await page.click('[data-testid="submit-order"]');
  await expect(page.getByText('Order Confirmed')).toBeVisible();
});

// Test 500 error
test('order fails with server error', async ({ page }) => {
  // Listen for console errors (app should log gracefully)
  const consoleErrors: string[] = [];
  page.on('console', (msg) => {
    if (msg.type() === 'error') consoleErrors.push(msg.text());
  });

  // Stub 500 error
  await page.route('**/api/orders', (route) =>
    route.fulfill({
      status: 500,
      contentType: 'application/json',
      body: JSON.stringify({ error: 'Internal Server Error' }),
    }),
  );

  await page.goto('/checkout');
  await page.click('[data-testid="submit-order"]');

  // Assert UI shows error gracefully
  await expect(page.getByText('Something went wrong')).toBeVisible();
  await expect(page.getByText('Please try again')).toBeVisible();

  // Verify error logged (not thrown)
  expect(consoleErrors.some((e) => e.includes('Order failed'))).toBeTruthy();
});

// Test network timeout
test('order times out after 10 seconds', async ({ page }) => {
  // Stub delayed response (never resolves within timeout)
  await page.route(
    '**/api/orders',
    (route) => new Promise(() => {}), // Never resolves - simulates timeout
  );

  await page.goto('/checkout');
  await page.click('[data-testid="submit-order"]');

  // App should show timeout message after configured timeout
  await expect(page.getByText('Request timed out')).toBeVisible({ timeout: 15000 });
});

// Test partial data response
test('order handles missing optional fields', async ({ page }) => {
  await page.route('**/api/orders', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      // Missing optional fields like 'trackingNumber', 'estimatedDelivery'
      body: JSON.stringify({ orderId: '123', status: 'confirmed' }),
    }),
  );

  await page.goto('/checkout');
  await page.click('[data-testid="submit-order"]');

  // App should handle gracefully - no crash, shows what's available
  await expect(page.getByText('Order Confirmed')).toBeVisible();
  await expect(page.getByText('Tracking information pending')).toBeVisible();
});

// Cypress equivalents
describe('Order Edge Cases', () => {
  it('should handle 500 error', () => {
    cy.intercept('POST', '**/api/orders', {
      statusCode: 500,
      body: { error: 'Internal Server Error' },
    }).as('orderFailed');

    cy.visit('/checkout');
    cy.get('[data-testid="submit-order"]').click();
    cy.wait('@orderFailed');
    cy.contains('Something went wrong').should('be.visible');
  });

  it('should handle timeout', () => {
    cy.intercept('POST', '**/api/orders', (req) => {
      req.reply({ delay: 20000 }); // Delay beyond app timeout
    }).as('orderTimeout');

    cy.visit('/checkout');
    cy.get('[data-testid="submit-order"]').click();
    cy.contains('Request timed out', { timeout: 15000 }).should('be.visible');
  });
});
```

**Key Points**:

- Stub different HTTP status codes (200, 400, 500, 503)
- Simulate timeouts with `delay` or non-resolving promises
- Test partial/incomplete data responses
- Verify app handles errors gracefully (no crashes, user-friendly messages)

### Example 4: Deterministic Waiting

**Context**: Never use hard waits (`waitForTimeout(3000)`). Always wait for explicit signals: network responses, element state changes, or custom events.

**Implementation**:

```typescript
// ✅ GOOD: Wait for response with predicate
test('wait for specific response', async ({ page }) => {
  const responsePromise = page.waitForResponse((resp) => resp.url().includes('/api/users') && resp.status() === 200);

  await page.goto('/dashboard');
  const response = await responsePromise;

  expect(response.status()).toBe(200);
  await expect(page.getByText('Dashboard')).toBeVisible();
});

// ✅ GOOD: Wait for multiple responses
test('wait for all required data', async ({ page }) => {
  const usersPromise = page.waitForResponse('**/api/users');
  const productsPromise = page.waitForResponse('**/api/products');
  const ordersPromise = page.waitForResponse('**/api/orders');

  await page.goto('/dashboard');

  // Wait for all in parallel
  const [users, products, orders] = await Promise.all([usersPromise, productsPromise, ordersPromise]);

  expect(users.status()).toBe(200);
  expect(products.status()).toBe(200);
  expect(orders.status()).toBe(200);
});

// ✅ GOOD: Wait for spinner to disappear
test('wait for loading indicator', async ({ page }) => {
  await page.goto('/dashboard');

  // Wait for spinner to disappear (signals data loaded)
  await expect(page.getByTestId('loading-spinner')).not.toBeVisible();
  await expect(page.getByText('Dashboard')).toBeVisible();
});

// ✅ GOOD: Wait for custom event (advanced)
test('wait for custom ready event', async ({ page }) => {
  // Register the listener before navigating; resolves when the app logs "App ready"
  const readyPromise = page.waitForEvent('console', (msg) => msg.text() === 'App ready');

  await page.goto('/dashboard');

  // Deterministic signal emitted by the app itself
  await readyPromise;

  await expect(page.getByText('Dashboard')).toBeVisible();
});

// ❌ BAD: Hard wait (arbitrary timeout)
test('flaky hard wait example', async ({ page }) => {
  await page.goto('/dashboard');
  await page.waitForTimeout(3000); // WHY 3 seconds? What if slower? What if faster?
  await expect(page.getByText('Dashboard')).toBeVisible(); // May fail if >3s
});

// Cypress equivalents
describe('Deterministic Waiting', () => {
  it('should wait for response', () => {
    cy.intercept('GET', '**/api/users').as('getUsers');
    cy.visit('/dashboard');
    cy.wait('@getUsers').its('response.statusCode').should('eq', 200);
    cy.contains('Dashboard').should('be.visible');
  });

  it('should wait for spinner to disappear', () => {
    cy.visit('/dashboard');
    cy.get('[data-testid="loading-spinner"]').should('not.exist');
    cy.contains('Dashboard').should('be.visible');
  });

  // ❌ BAD: Hard wait
  it('flaky hard wait', () => {
    cy.visit('/dashboard');
    cy.wait(3000); // NEVER DO THIS
    cy.contains('Dashboard').should('be.visible');
  });
});
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- `waitForResponse()` with URL pattern or predicate = deterministic
|
||||
- `waitForLoadState('networkidle')` = wait for all network activity to finish
|
||||
- Wait for element state changes (spinner disappears, button enabled)
|
||||
- **NEVER** use `waitForTimeout()` or `cy.wait(ms)`; hard waits are always non-deterministic
|
||||
|
||||
### Example 5: Anti-Pattern - Navigate Then Mock
|
||||
|
||||
**Problem**:
|
||||
|
||||
```typescript
|
||||
// ❌ BAD: Race condition - mock registered AFTER navigation starts
|
||||
test('flaky test - navigate then mock', async ({ page }) => {
|
||||
// Navigation starts immediately
|
||||
await page.goto('/dashboard'); // Request to /api/users fires NOW
|
||||
|
||||
// Mock registered too late - request already sent
|
||||
await page.route('**/api/users', (route) =>
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
body: JSON.stringify([{ id: 1, name: 'Test User' }]),
|
||||
}),
|
||||
);
|
||||
|
||||
// Test randomly passes/fails depending on timing
|
||||
await expect(page.getByText('Test User')).toBeVisible(); // Flaky!
|
||||
});
|
||||
|
||||
// ❌ BAD: No wait for response
|
||||
test('flaky test - no explicit wait', async ({ page }) => {
|
||||
await page.route('**/api/users', (route) => route.fulfill({ status: 200, body: JSON.stringify([]) }));
|
||||
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Assertion runs immediately - may fail if response slow
|
||||
await expect(page.getByText('No users found')).toBeVisible(); // Flaky!
|
||||
});
|
||||
|
||||
// ❌ BAD: Generic timeout
|
||||
test('flaky test - hard wait', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
await page.waitForTimeout(2000); // Arbitrary wait - brittle
|
||||
|
||||
await expect(page.getByText('Dashboard')).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Why It Fails**:
|
||||
|
||||
- **Mock after navigate**: Request fires during navigation, mock isn't active yet (race condition)
|
||||
- **No explicit wait**: Assertion runs before response arrives (timing-dependent)
|
||||
- **Hard waits**: Slow tests, brittle (fails if < timeout, wastes time if > timeout)
|
||||
- **Non-deterministic**: Passes locally, fails in CI (different speeds)
|
||||
|
||||
**Better Approach**: Always intercept → trigger → await
|
||||
|
||||
```typescript
|
||||
// ✅ GOOD: Intercept BEFORE navigate
|
||||
test('deterministic test', async ({ page }) => {
|
||||
// Step 1: Register mock FIRST
|
||||
await page.route('**/api/users', (route) =>
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
contentType: 'application/json',
|
||||
body: JSON.stringify([{ id: 1, name: 'Test User' }]),
|
||||
}),
|
||||
);
|
||||
|
||||
// Step 2: Store response promise BEFORE trigger
|
||||
const responsePromise = page.waitForResponse('**/api/users');
|
||||
|
||||
// Step 3: THEN trigger
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Step 4: THEN await response
|
||||
await responsePromise;
|
||||
|
||||
// Step 5: THEN assert (data is guaranteed loaded)
|
||||
await expect(page.getByText('Test User')).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Order matters: Mock → Promise → Trigger → Await → Assert
|
||||
- No race conditions: Mock is active before request fires
|
||||
- Explicit wait: Response promise ensures data loaded
|
||||
- Deterministic: Always passes if app works correctly
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*atdd` (test generation), `*automate` (test expansion), `*framework` (network setup)
|
||||
- **Related fragments**:
|
||||
- `fixture-architecture.md` - Network fixture patterns
|
||||
- `data-factories.md` - API-first setup with network
|
||||
- `test-quality.md` - Deterministic test principles
|
||||
|
||||
## Debugging Network Issues
|
||||
|
||||
When network tests fail, check:
|
||||
|
||||
1. **Timing**: Is interception registered **before** action?
|
||||
2. **URL pattern**: Does pattern match actual request URL?
|
||||
3. **Response format**: Is mocked response valid JSON/format?
|
||||
4. **Status code**: Is app checking for 200 vs 201 vs 204?
|
||||
5. **HAR file**: Capture real traffic to understand actual API contract
|
||||
|
||||
```typescript
|
||||
// Debug network issues with logging
|
||||
test('debug network', async ({ page }) => {
|
||||
// Log all requests
|
||||
page.on('request', (req) => console.log('→', req.method(), req.url()));
|
||||
|
||||
// Log all responses
|
||||
page.on('response', (resp) => console.log('←', resp.status(), resp.url()));
|
||||
|
||||
await page.goto('/dashboard');
|
||||
});
|
||||
```
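
For checklist item 5, Playwright can capture and replay HAR files directly via `routeFromHAR`. A small sketch, assuming the `hars/api.har` path and `**/api/**` filter (both placeholders):

```typescript
import { test, expect } from '@playwright/test';

test('replay captured API traffic', async ({ page }) => {
  // Replays responses recorded in the HAR; set update: true on a recording run to refresh it
  await page.routeFromHAR('hars/api.har', {
    url: '**/api/**',
    update: false,
  });

  await page.goto('/dashboard');
  await expect(page.getByText('Dashboard')).toBeVisible();
});
```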
|
||||
|
||||
_Source: Murat Testing Philosophy (lines 94-137), Playwright network patterns, Cypress intercept best practices._
|
||||
|
||||
@@ -1,21 +1,670 @@
|
||||
# Non-Functional Review Criteria
|
||||
# Non-Functional Requirements (NFR) Criteria
|
||||
|
||||
- **Security**
|
||||
- PASS: auth/authz, secret handling, and threat mitigations in place.
|
||||
- CONCERNS: minor gaps with clear owners.
|
||||
- FAIL: critical exposure or missing controls.
|
||||
- **Performance**
|
||||
- PASS: metrics meet targets with profiling evidence.
|
||||
- CONCERNS: trending toward limits or missing baselines.
|
||||
- FAIL: breaches SLO/SLA or introduces resource leaks.
|
||||
- **Reliability**
|
||||
- PASS: error handling, retries, health checks verified.
|
||||
- CONCERNS: partial coverage or missing telemetry.
|
||||
- FAIL: no recovery path or crash scenarios unresolved.
|
||||
- **Maintainability**
|
||||
- PASS: clean code, tests, and documentation shipped together.
|
||||
- CONCERNS: duplication, low coverage, or unclear ownership.
|
||||
- FAIL: absent tests, tangled implementations, or no observability.
|
||||
- Default to CONCERNS when targets or evidence are undefined—force the team to clarify before sign-off.
|
||||
## Principle
|
||||
|
||||
_Source: Murat NFR assessment guidance._
|
||||
Non-functional requirements (security, performance, reliability, maintainability) are **validated through automated tests**, not checklists. NFR assessment uses objective pass/fail criteria tied to measurable thresholds. Ambiguous requirements default to CONCERNS until clarified.
|
||||
|
||||
## Rationale
|
||||
|
||||
**The Problem**: Teams ship features that "work" functionally but fail under load, expose security vulnerabilities, or lack error recovery. NFRs are treated as optional "nice-to-haves" instead of release blockers.
|
||||
|
||||
**The Solution**: Define explicit NFR criteria with automated validation. Security tests verify auth/authz and secret handling. Performance tests enforce SLO/SLA thresholds with profiling evidence. Reliability tests validate error handling, retries, and health checks. Maintainability is measured by test coverage, code duplication, and observability.
|
||||
|
||||
**Why This Matters**:
|
||||
|
||||
- Prevents production incidents (security breaches, performance degradation, cascading failures)
|
||||
- Provides objective release criteria (no subjective "feels fast enough")
|
||||
- Automates compliance validation (audit trail for regulated environments)
|
||||
- Forces clarity on ambiguous requirements (default to CONCERNS)
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Security NFR Validation (Auth, Secrets, OWASP)
|
||||
|
||||
**Context**: Automated security tests enforcing authentication, authorization, and secret handling
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/nfr/security.spec.ts
|
||||
import { test, expect, type APIRequestContext } from '@playwright/test';
|
||||
|
||||
test.describe('Security NFR: Authentication & Authorization', () => {
|
||||
test('unauthenticated users cannot access protected routes', async ({ page }) => {
|
||||
// Attempt to access dashboard without auth
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Should redirect to login (not expose data)
|
||||
await expect(page).toHaveURL(/\/login/);
|
||||
await expect(page.getByText('Please sign in')).toBeVisible();
|
||||
|
||||
// Verify no sensitive data leaked in response
|
||||
const pageContent = await page.content();
|
||||
expect(pageContent).not.toContain('user_id');
|
||||
expect(pageContent).not.toContain('api_key');
|
||||
});
|
||||
|
||||
test('JWT tokens expire after 15 minutes', async ({ page, request }) => {
|
||||
// Login and capture token
|
||||
await page.goto('/login');
|
||||
await page.getByLabel('Email').fill('test@example.com');
|
||||
await page.getByLabel('Password').fill('ValidPass123!');
|
||||
await page.getByRole('button', { name: 'Sign In' }).click();
|
||||
|
||||
const token = await page.evaluate(() => localStorage.getItem('auth_token'));
|
||||
expect(token).toBeTruthy();
|
||||
|
||||
// Wait 16 minutes (use mock clock in real tests)
|
||||
await page.clock.fastForward('00:16:00');
|
||||
|
||||
// Token should be expired, API call should fail
|
||||
const response = await request.get('/api/user/profile', {
|
||||
headers: { Authorization: `Bearer ${token}` },
|
||||
});
|
||||
|
||||
expect(response.status()).toBe(401);
|
||||
const body = await response.json();
|
||||
expect(body.error).toContain('expired');
|
||||
});
|
||||
|
||||
test('passwords are never logged or exposed in errors', async ({ page }) => {
|
||||
// Trigger login error
|
||||
await page.goto('/login');
|
||||
await page.getByLabel('Email').fill('test@example.com');
|
||||
await page.getByLabel('Password').fill('WrongPassword123!');
|
||||
|
||||
// Monitor console for password leaks
|
||||
const consoleLogs: string[] = [];
|
||||
page.on('console', (msg) => consoleLogs.push(msg.text()));
|
||||
|
||||
await page.getByRole('button', { name: 'Sign In' }).click();
|
||||
|
||||
// Error shown to user (generic message)
|
||||
await expect(page.getByText('Invalid credentials')).toBeVisible();
|
||||
|
||||
// Verify password NEVER appears in console, DOM, or network
|
||||
const pageContent = await page.content();
|
||||
expect(pageContent).not.toContain('WrongPassword123!');
|
||||
expect(consoleLogs.join('\n')).not.toContain('WrongPassword123!');
|
||||
});
|
||||
|
||||
test('RBAC: users can only access resources they own', async ({ page, request }) => {
|
||||
// Login as User A
|
||||
const userAToken = await login(request, 'userA@example.com', 'password');
|
||||
|
||||
// Try to access User B's order
|
||||
const response = await request.get('/api/orders/user-b-order-id', {
|
||||
headers: { Authorization: `Bearer ${userAToken}` },
|
||||
});
|
||||
|
||||
expect(response.status()).toBe(403); // Forbidden
|
||||
const body = await response.json();
|
||||
expect(body.error).toContain('insufficient permissions');
|
||||
});
|
||||
|
||||
test('SQL injection attempts are blocked', async ({ page }) => {
|
||||
await page.goto('/search');
|
||||
|
||||
// Attempt SQL injection
|
||||
await page.getByPlaceholder('Search products').fill("'; DROP TABLE users; --");
|
||||
await page.getByRole('button', { name: 'Search' }).click();
|
||||
|
||||
// Should return empty results, NOT crash or expose error
|
||||
await expect(page.getByText('No results found')).toBeVisible();
|
||||
|
||||
// Verify app still works (table not dropped)
|
||||
await page.goto('/dashboard');
|
||||
await expect(page.getByText('Welcome')).toBeVisible();
|
||||
});
|
||||
|
||||
test('XSS attempts are sanitized', async ({ page }) => {
|
||||
await page.goto('/profile/edit');
|
||||
|
||||
// Attempt XSS injection
|
||||
const xssPayload = '<script>alert("XSS")</script>';
|
||||
await page.getByLabel('Bio').fill(xssPayload);
|
||||
await page.getByRole('button', { name: 'Save' }).click();
|
||||
|
||||
// Reload and verify XSS is escaped (not executed)
|
||||
await page.reload();
|
||||
const bio = await page.getByTestId('user-bio').textContent();
|
||||
|
||||
    // Payload must be rendered as inert text, not injected as markup
    expect(bio).toContain('alert("XSS")'); // escaped payload is visible as plain text
    const bioHtml = await page.getByTestId('user-bio').innerHTML();
    expect(bioHtml).not.toContain('<script>'); // no live <script> element in the DOM
|
||||
});
|
||||
});
|
||||
|
||||
// Helper
|
||||
async function login(request: APIRequestContext, email: string, password: string): Promise<string> {
|
||||
const response = await request.post('/api/auth/login', {
|
||||
data: { email, password },
|
||||
});
|
||||
const body = await response.json();
|
||||
return body.token;
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Authentication: Unauthenticated access redirected (not exposed)
|
||||
- Authorization: RBAC enforced (403 for insufficient permissions)
|
||||
- Token expiry: JWT expires after 15 minutes (automated validation)
|
||||
- Secret handling: Passwords never logged or exposed in errors
|
||||
- OWASP Top 10: SQL injection and XSS blocked (input sanitization)
|
||||
|
||||
**Security NFR Criteria**:
|
||||
|
||||
- ✅ PASS: All 6 tests green (auth, authz, token expiry, secret handling, SQL injection, XSS)
|
||||
- ⚠️ CONCERNS: 1-2 tests failing with mitigation plan and owner assigned
|
||||
- ❌ FAIL: Critical exposure (unauthenticated access, password leak, SQL injection succeeds)
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Performance NFR Validation (k6 Load Testing for SLO/SLA)
|
||||
|
||||
**Context**: Use k6 for load testing, stress testing, and SLO/SLA enforcement (NOT Playwright)
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```javascript
|
||||
// tests/nfr/performance.k6.js
|
||||
import http from 'k6/http';
|
||||
import { check, sleep } from 'k6';
|
||||
import { Rate, Trend } from 'k6/metrics';
|
||||
|
||||
// Custom metrics
|
||||
const errorRate = new Rate('errors');
|
||||
const apiDuration = new Trend('api_duration');
|
||||
|
||||
// Performance thresholds (SLO/SLA)
|
||||
export const options = {
|
||||
stages: [
|
||||
{ duration: '1m', target: 50 }, // Ramp up to 50 users
|
||||
{ duration: '3m', target: 50 }, // Stay at 50 users for 3 minutes
|
||||
{ duration: '1m', target: 100 }, // Spike to 100 users
|
||||
{ duration: '3m', target: 100 }, // Stay at 100 users
|
||||
{ duration: '1m', target: 0 }, // Ramp down
|
||||
],
|
||||
thresholds: {
|
||||
// SLO: 95% of requests must complete in <500ms
|
||||
http_req_duration: ['p(95)<500'],
|
||||
// SLO: Error rate must be <1%
|
||||
errors: ['rate<0.01'],
|
||||
// SLA: API endpoints must respond in <1s (99th percentile)
|
||||
api_duration: ['p(99)<1000'],
|
||||
},
|
||||
};
|
||||
|
||||
export default function () {
|
||||
// Test 1: Homepage load performance
|
||||
const homepageResponse = http.get(`${__ENV.BASE_URL}/`);
|
||||
check(homepageResponse, {
|
||||
'homepage status is 200': (r) => r.status === 200,
|
||||
'homepage loads in <2s': (r) => r.timings.duration < 2000,
|
||||
});
|
||||
errorRate.add(homepageResponse.status !== 200);
|
||||
|
||||
// Test 2: API endpoint performance
|
||||
const apiResponse = http.get(`${__ENV.BASE_URL}/api/products?limit=10`, {
|
||||
headers: { Authorization: `Bearer ${__ENV.API_TOKEN}` },
|
||||
});
|
||||
check(apiResponse, {
|
||||
'API status is 200': (r) => r.status === 200,
|
||||
'API responds in <500ms': (r) => r.timings.duration < 500,
|
||||
});
|
||||
apiDuration.add(apiResponse.timings.duration);
|
||||
errorRate.add(apiResponse.status !== 200);
|
||||
|
||||
// Test 3: Search endpoint under load
|
||||
const searchResponse = http.get(`${__ENV.BASE_URL}/api/search?q=laptop&limit=100`);
|
||||
check(searchResponse, {
|
||||
'search status is 200': (r) => r.status === 200,
|
||||
'search responds in <1s': (r) => r.timings.duration < 1000,
|
||||
'search returns results': (r) => JSON.parse(r.body).results.length > 0,
|
||||
});
|
||||
errorRate.add(searchResponse.status !== 200);
|
||||
|
||||
sleep(1); // Realistic user think time
|
||||
}
|
||||
|
||||
// Threshold validation (run after test)
|
||||
export function handleSummary(data) {
|
||||
const p95Duration = data.metrics.http_req_duration.values['p(95)'];
|
||||
const p99ApiDuration = data.metrics.api_duration.values['p(99)'];
|
||||
const errorRateValue = data.metrics.errors.values.rate;
|
||||
|
||||
console.log(`P95 request duration: ${p95Duration.toFixed(2)}ms`);
|
||||
console.log(`P99 API duration: ${p99ApiDuration.toFixed(2)}ms`);
|
||||
console.log(`Error rate: ${(errorRateValue * 100).toFixed(2)}%`);
|
||||
|
||||
return {
|
||||
'summary.json': JSON.stringify(data),
|
||||
stdout: `
|
||||
Performance NFR Results:
|
||||
- P95 request duration: ${p95Duration < 500 ? '✅ PASS' : '❌ FAIL'} (${p95Duration.toFixed(2)}ms / 500ms threshold)
|
||||
- P99 API duration: ${p99ApiDuration < 1000 ? '✅ PASS' : '❌ FAIL'} (${p99ApiDuration.toFixed(2)}ms / 1000ms threshold)
|
||||
- Error rate: ${errorRateValue < 0.01 ? '✅ PASS' : '❌ FAIL'} (${(errorRateValue * 100).toFixed(2)}% / 1% threshold)
|
||||
`,
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
**Run k6 tests:**
|
||||
|
||||
```bash
|
||||
# Local smoke test (10 VUs, 30s)
|
||||
k6 run --vus 10 --duration 30s tests/nfr/performance.k6.js
|
||||
|
||||
# Full load test (stages defined in script)
|
||||
k6 run tests/nfr/performance.k6.js
|
||||
|
||||
# CI integration with thresholds
|
||||
k6 run --out json=performance-results.json tests/nfr/performance.k6.js
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **k6 is the right tool** for load testing (NOT Playwright)
|
||||
- SLO/SLA thresholds enforced automatically (`p(95)<500`, `rate<0.01`)
|
||||
- Realistic load simulation (ramp up, sustained load, spike testing)
|
||||
- Comprehensive metrics (p50, p95, p99, error rate, throughput)
|
||||
- CI-friendly (JSON output, exit codes based on thresholds)
|
||||
|
||||
**Performance NFR Criteria**:
|
||||
|
||||
- ✅ PASS: All SLO/SLA targets met with k6 profiling evidence (p95 < 500ms, error rate < 1%)
|
||||
- ⚠️ CONCERNS: Trending toward limits (e.g., p95 = 480ms approaching 500ms) or missing baselines
|
||||
- ❌ FAIL: SLO/SLA breached (e.g., p95 > 500ms) or error rate > 1%
|
||||
|
||||
**Performance Testing Levels (from Test Architect course):**
|
||||
|
||||
- **Load testing**: System behavior under expected load
|
||||
- **Stress testing**: System behavior under extreme load (breaking point)
|
||||
- **Spike testing**: Sudden load increases (traffic spikes; see the stage sketch after this list)
|
||||
- **Endurance/Soak testing**: System behavior under sustained load (memory leaks, resource exhaustion)
|
||||
- **Benchmarking**: Baseline measurements for comparison
|
||||
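The levels above differ mainly in the shape of the k6 `stages` array; pick one profile and export it as `options`. Durations and VU counts below are illustrative assumptions:

```javascript
// Spike profile: sudden surge, short hold, quick release
export const spikeStages = [
  { duration: '30s', target: 20 }, // normal traffic
  { duration: '30s', target: 300 }, // sudden spike
  { duration: '1m', target: 300 }, // hold the spike
  { duration: '30s', target: 20 }, // recover
];

// Soak/endurance profile: moderate load held long enough to surface leaks
export const soakStages = [
  { duration: '5m', target: 50 }, // ramp up
  { duration: '2h', target: 50 }, // sustained load
  { duration: '5m', target: 0 }, // ramp down
];

// e.g. export const options = { stages: spikeStages, thresholds: { http_req_duration: ['p(95)<500'] } };
```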
|
||||
**Note**: Playwright can validate **perceived performance** (Core Web Vitals via Lighthouse), but k6 validates **system performance** (throughput, latency, resource limits under load).
|
||||
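A minimal sketch of checking one Core Web Vital (LCP) directly from the browser's Performance API, without Lighthouse; the page URL and the 2500 ms budget (the commonly cited "good" LCP threshold) are assumptions:

```typescript
import { test, expect } from '@playwright/test';

test('LCP stays within budget on the dashboard', async ({ page }) => {
  await page.goto('/dashboard');

  // Read the last LCP entry reported by the browser
  const lcp = await page.evaluate(
    () =>
      new Promise<number>((resolve) => {
        new PerformanceObserver((entryList) => {
          const entries = entryList.getEntries();
          resolve(entries[entries.length - 1].startTime);
        }).observe({ type: 'largest-contentful-paint', buffered: true });
      }),
  );

  expect(lcp).toBeLessThan(2500); // assumed budget: "good" LCP is commonly cited as <2.5s
});
```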
|
||||
---
|
||||
|
||||
### Example 3: Reliability NFR Validation (Playwright for UI Resilience)
|
||||
|
||||
**Context**: Automated reliability tests validating graceful degradation and recovery paths
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/nfr/reliability.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Reliability NFR: Error Handling & Recovery', () => {
|
||||
test('app remains functional when API returns 500 error', async ({ page, context }) => {
|
||||
// Mock API failure
|
||||
await context.route('**/api/products', (route) => {
|
||||
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Internal Server Error' }) });
|
||||
});
|
||||
|
||||
await page.goto('/products');
|
||||
|
||||
// User sees error message (not blank page or crash)
|
||||
await expect(page.getByText('Unable to load products. Please try again.')).toBeVisible();
|
||||
await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
|
||||
|
||||
// App navigation still works (graceful degradation)
|
||||
await page.getByRole('link', { name: 'Home' }).click();
|
||||
await expect(page).toHaveURL('/');
|
||||
});
|
||||
|
||||
test('API client retries on transient failures (3 attempts)', async ({ page, context }) => {
|
||||
let attemptCount = 0;
|
||||
|
||||
await context.route('**/api/checkout', (route) => {
|
||||
attemptCount++;
|
||||
|
||||
// Fail first 2 attempts, succeed on 3rd
|
||||
if (attemptCount < 3) {
|
||||
route.fulfill({ status: 503, body: JSON.stringify({ error: 'Service Unavailable' }) });
|
||||
} else {
|
||||
route.fulfill({ status: 200, body: JSON.stringify({ orderId: '12345' }) });
|
||||
}
|
||||
});
|
||||
|
||||
await page.goto('/checkout');
|
||||
await page.getByRole('button', { name: 'Place Order' }).click();
|
||||
|
||||
// Should succeed after 3 attempts
|
||||
await expect(page.getByText('Order placed successfully')).toBeVisible();
|
||||
expect(attemptCount).toBe(3);
|
||||
});
|
||||
|
||||
test('app handles network disconnection gracefully', async ({ page, context }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Simulate offline mode
|
||||
await context.setOffline(true);
|
||||
|
||||
// Trigger action requiring network
|
||||
await page.getByRole('button', { name: 'Refresh Data' }).click();
|
||||
|
||||
// User sees offline indicator (not crash)
|
||||
await expect(page.getByText('You are offline. Changes will sync when reconnected.')).toBeVisible();
|
||||
|
||||
// Reconnect
|
||||
await context.setOffline(false);
|
||||
await page.getByRole('button', { name: 'Refresh Data' }).click();
|
||||
|
||||
// Data loads successfully
|
||||
await expect(page.getByText('Data updated')).toBeVisible();
|
||||
});
|
||||
|
||||
test('health check endpoint returns service status', async ({ request }) => {
|
||||
const response = await request.get('/api/health');
|
||||
|
||||
expect(response.status()).toBe(200);
|
||||
|
||||
const health = await response.json();
|
||||
expect(health).toHaveProperty('status', 'healthy');
|
||||
expect(health).toHaveProperty('timestamp');
|
||||
expect(health).toHaveProperty('services');
|
||||
|
||||
// Verify critical services are monitored
|
||||
expect(health.services).toHaveProperty('database');
|
||||
expect(health.services).toHaveProperty('cache');
|
||||
expect(health.services).toHaveProperty('queue');
|
||||
|
||||
// All services should be UP
|
||||
expect(health.services.database.status).toBe('UP');
|
||||
expect(health.services.cache.status).toBe('UP');
|
||||
expect(health.services.queue.status).toBe('UP');
|
||||
});
|
||||
|
||||
test('circuit breaker opens after 5 consecutive failures', async ({ page, context }) => {
|
||||
let failureCount = 0;
|
||||
|
||||
await context.route('**/api/recommendations', (route) => {
|
||||
failureCount++;
|
||||
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Service Error' }) });
|
||||
});
|
||||
|
||||
await page.goto('/product/123');
|
||||
|
||||
// Wait for circuit breaker to open (fallback UI appears)
|
||||
await expect(page.getByText('Recommendations temporarily unavailable')).toBeVisible({ timeout: 10000 });
|
||||
|
||||
// Verify circuit breaker stopped making requests after threshold (should be ≤5)
|
||||
expect(failureCount).toBeLessThanOrEqual(5);
|
||||
});
|
||||
|
||||
test('rate limiting gracefully handles 429 responses', async ({ page, context }) => {
|
||||
let requestCount = 0;
|
||||
|
||||
await context.route('**/api/search', (route) => {
|
||||
requestCount++;
|
||||
|
||||
if (requestCount > 10) {
|
||||
// Rate limit exceeded
|
||||
route.fulfill({
|
||||
status: 429,
|
||||
headers: { 'Retry-After': '5' },
|
||||
body: JSON.stringify({ error: 'Rate limit exceeded' }),
|
||||
});
|
||||
} else {
|
||||
route.fulfill({ status: 200, body: JSON.stringify({ results: [] }) });
|
||||
}
|
||||
});
|
||||
|
||||
await page.goto('/search');
|
||||
|
||||
// Make 15 search requests rapidly
|
||||
for (let i = 0; i < 15; i++) {
|
||||
await page.getByPlaceholder('Search').fill(`query-${i}`);
|
||||
await page.getByRole('button', { name: 'Search' }).click();
|
||||
}
|
||||
|
||||
// User sees rate limit message (not crash)
|
||||
await expect(page.getByText('Too many requests. Please wait a moment.')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Error handling: Graceful degradation (500 error → user-friendly message + retry button)
|
||||
- Retries: 3 attempts on transient failures (503 → eventual success)
|
||||
- Offline handling: Network disconnection detected (sync when reconnected)
|
||||
- Health checks: `/api/health` monitors database, cache, queue
|
||||
- Circuit breaker: Opens after 5 failures (fallback UI, stop retries)
|
||||
- Rate limiting: 429 response handled (Retry-After header respected)
|
||||
|
||||
**Reliability NFR Criteria**:
|
||||
|
||||
- ✅ PASS: Error handling, retries, health checks verified (all 6 tests green)
|
||||
- ⚠️ CONCERNS: Partial coverage (e.g., missing circuit breaker) or no telemetry
|
||||
- ❌ FAIL: No recovery path (500 error crashes app) or unresolved crash scenarios
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Maintainability NFR Validation (CI Tools, Not Playwright)
|
||||
|
||||
**Context**: Use proper CI tools for code quality validation (coverage, duplication, vulnerabilities)
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/nfr-maintainability.yml
|
||||
name: NFR - Maintainability
|
||||
|
||||
on: [push, pull_request]
|
||||
|
||||
jobs:
|
||||
test-coverage:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-node@v4
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Run tests with coverage
|
||||
run: npm run test:coverage
|
||||
|
||||
- name: Check coverage threshold (80% minimum)
|
||||
run: |
|
||||
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
|
||||
echo "Coverage: $COVERAGE%"
|
||||
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
|
||||
echo "❌ FAIL: Coverage $COVERAGE% below 80% threshold"
|
||||
exit 1
|
||||
else
|
||||
echo "✅ PASS: Coverage $COVERAGE% meets 80% threshold"
|
||||
fi
|
||||
|
||||
code-duplication:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-node@v4
|
||||
|
||||
- name: Check code duplication (<5% allowed)
|
||||
run: |
|
||||
npx jscpd src/ --threshold 5 --format json --output duplication.json
|
||||
DUPLICATION=$(jq '.statistics.total.percentage' duplication.json)
|
||||
echo "Duplication: $DUPLICATION%"
|
||||
if (( $(echo "$DUPLICATION >= 5" | bc -l) )); then
|
||||
echo "❌ FAIL: Duplication $DUPLICATION% exceeds 5% threshold"
|
||||
exit 1
|
||||
else
|
||||
echo "✅ PASS: Duplication $DUPLICATION% below 5% threshold"
|
||||
fi
|
||||
|
||||
vulnerability-scan:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-node@v4
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Run npm audit (no critical/high vulnerabilities)
|
||||
run: |
|
||||
npm audit --json > audit.json || true
|
||||
CRITICAL=$(jq '.metadata.vulnerabilities.critical' audit.json)
|
||||
HIGH=$(jq '.metadata.vulnerabilities.high' audit.json)
|
||||
echo "Critical: $CRITICAL, High: $HIGH"
|
||||
if [ "$CRITICAL" -gt 0 ] || [ "$HIGH" -gt 0 ]; then
|
||||
echo "❌ FAIL: Found $CRITICAL critical and $HIGH high vulnerabilities"
|
||||
npm audit
|
||||
exit 1
|
||||
else
|
||||
echo "✅ PASS: No critical/high vulnerabilities"
|
||||
fi
|
||||
```
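
If the unit suite runs on Jest, the same gate can also live in the runner config so local runs fail exactly like CI. A sketch, assuming Jest and the thresholds above:

```typescript
// jest.config.ts (sketch) - thresholds mirror the CI gate above
import type { Config } from 'jest';

const config: Config = {
  collectCoverage: true,
  coverageReporters: ['json-summary', 'text'], // json-summary feeds the jq check in CI
  coverageThreshold: {
    global: { lines: 80, statements: 80, functions: 80, branches: 70 },
  },
};

export default config;
```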
|
||||
|
||||
**Playwright Tests for Observability (E2E Validation):**
|
||||
|
||||
```typescript
|
||||
// tests/nfr/observability.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Maintainability NFR: Observability Validation', () => {
|
||||
test('critical errors are reported to monitoring service', async ({ page, context }) => {
|
||||
const sentryEvents: any[] = [];
|
||||
|
||||
// Mock Sentry SDK to verify error tracking
|
||||
await context.addInitScript(() => {
|
||||
(window as any).Sentry = {
|
||||
captureException: (error: Error) => {
|
||||
console.log('SENTRY_CAPTURE:', JSON.stringify({ message: error.message, stack: error.stack }));
|
||||
},
|
||||
};
|
||||
});
|
||||
|
||||
page.on('console', (msg) => {
|
||||
if (msg.text().includes('SENTRY_CAPTURE:')) {
|
||||
sentryEvents.push(JSON.parse(msg.text().replace('SENTRY_CAPTURE:', '')));
|
||||
}
|
||||
});
|
||||
|
||||
// Trigger error by mocking API failure
|
||||
await context.route('**/api/products', (route) => {
|
||||
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Database Error' }) });
|
||||
});
|
||||
|
||||
await page.goto('/products');
|
||||
|
||||
// Wait for error UI and Sentry capture
|
||||
await expect(page.getByText('Unable to load products')).toBeVisible();
|
||||
|
||||
// Verify error was captured by monitoring
|
||||
expect(sentryEvents.length).toBeGreaterThan(0);
|
||||
expect(sentryEvents[0]).toHaveProperty('message');
|
||||
expect(sentryEvents[0]).toHaveProperty('stack');
|
||||
});
|
||||
|
||||
test('API response times are tracked in telemetry', async ({ request }) => {
|
||||
const response = await request.get('/api/products?limit=10');
|
||||
|
||||
expect(response.ok()).toBeTruthy();
|
||||
|
||||
// Verify Server-Timing header for APM (Application Performance Monitoring)
|
||||
const serverTiming = response.headers()['server-timing'];
|
||||
|
||||
expect(serverTiming).toBeTruthy();
|
||||
expect(serverTiming).toContain('db'); // Database query time
|
||||
expect(serverTiming).toContain('total'); // Total processing time
|
||||
});
|
||||
|
||||
test('structured logging present in application', async ({ request }) => {
|
||||
// Make API call that generates logs
|
||||
const response = await request.post('/api/orders', {
|
||||
data: { productId: '123', quantity: 2 },
|
||||
});
|
||||
|
||||
expect(response.ok()).toBeTruthy();
|
||||
|
||||
// Note: In real scenarios, validate logs in monitoring system (Datadog, CloudWatch)
|
||||
// This test validates the logging contract exists (Server-Timing, trace IDs in headers)
|
||||
const traceId = response.headers()['x-trace-id'];
|
||||
expect(traceId).toBeTruthy(); // Confirms structured logging with correlation IDs
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Coverage/duplication**: CI jobs (GitHub Actions), not Playwright tests
|
||||
- **Vulnerability scanning**: npm audit in CI, not Playwright tests
|
||||
- **Observability**: Playwright validates error tracking (Sentry) and telemetry headers
|
||||
- **Structured logging**: Validate logging contract (trace IDs, Server-Timing headers)
|
||||
- **Separation of concerns**: Build-time checks (coverage, audit) vs runtime checks (error tracking, telemetry)
|
||||
|
||||
**Maintainability NFR Criteria**:
|
||||
|
||||
- ✅ PASS: Clean code (80%+ coverage from CI, <5% duplication from CI), observability validated in E2E, no critical vulnerabilities from npm audit
|
||||
- ⚠️ CONCERNS: Duplication >5%, coverage 60-79%, or unclear ownership
|
||||
- ❌ FAIL: Absent tests (<60%), tangled implementations (>10% duplication), or no observability
|
||||
|
||||
---
|
||||
|
||||
## NFR Assessment Checklist
|
||||
|
||||
Before release gate:
|
||||
|
||||
- [ ] **Security** (Playwright E2E + Security Tools):
|
||||
- [ ] Auth/authz tests green (unauthenticated redirect, RBAC enforced)
|
||||
- [ ] Secrets never logged or exposed in errors
|
||||
- [ ] OWASP Top 10 validated (SQL injection blocked, XSS sanitized)
|
||||
- [ ] Security audit completed (vulnerability scan, penetration test if applicable)
|
||||
|
||||
- [ ] **Performance** (k6 Load Testing):
|
||||
- [ ] SLO/SLA targets met with k6 evidence (p95 <500ms, error rate <1%)
|
||||
- [ ] Load testing completed (expected load)
|
||||
- [ ] Stress testing completed (breaking point identified)
|
||||
- [ ] Spike testing completed (handles traffic spikes)
|
||||
- [ ] Endurance testing completed (no memory leaks under sustained load)
|
||||
|
||||
- [ ] **Reliability** (Playwright E2E + API Tests):
|
||||
- [ ] Error handling graceful (500 → user-friendly message + retry)
|
||||
- [ ] Retries implemented (3 attempts on transient failures)
|
||||
- [ ] Health checks monitored (/api/health endpoint)
|
||||
- [ ] Circuit breaker tested (opens after failure threshold)
|
||||
- [ ] Offline handling validated (network disconnection graceful)
|
||||
|
||||
- [ ] **Maintainability** (CI Tools):
|
||||
- [ ] Test coverage ≥80% (from CI coverage report)
|
||||
- [ ] Code duplication <5% (from jscpd CI job)
|
||||
- [ ] No critical/high vulnerabilities (from npm audit CI job)
|
||||
- [ ] Structured logging validated (Playwright validates telemetry headers)
|
||||
- [ ] Error tracking configured (Sentry/monitoring integration validated)
|
||||
|
||||
- [ ] **Ambiguous requirements**: Default to CONCERNS (force team to clarify thresholds and evidence)
|
||||
- [ ] **NFR criteria documented**: Measurable thresholds defined (not subjective "fast enough")
|
||||
- [ ] **Automated validation**: NFR tests run in CI pipeline (not manual checklists)
|
||||
- [ ] **Tool selection**: Right tool for each NFR (k6 for performance, Playwright for security/reliability E2E, CI tools for maintainability)
|
||||
|
||||
## NFR Gate Decision Matrix
|
||||
|
||||
| Category | PASS Criteria | CONCERNS Criteria | FAIL Criteria |
|
||||
| ------------------- | -------------------------------------------- | -------------------------------------------- | ---------------------------------------------- |
|
||||
| **Security** | Auth/authz, secret handling, OWASP verified | Minor gaps with clear owners | Critical exposure or missing controls |
|
||||
| **Performance** | Metrics meet SLO/SLA with profiling evidence | Trending toward limits or missing baselines | SLO/SLA breached or resource leaks detected |
|
||||
| **Reliability** | Error handling, retries, health checks OK | Partial coverage or missing telemetry | No recovery path or unresolved crash scenarios |
|
||||
| **Maintainability** | Clean code, tests, docs shipped together | Duplication, low coverage, unclear ownership | Absent tests, tangled code, no observability |
|
||||
|
||||
**Default**: If targets or evidence are undefined → **CONCERNS** (force team to clarify before sign-off)
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*nfr-assess` (automated NFR validation), `*trace` (gate decision Phase 2), `*test-design` (NFR risk assessment via Utility Tree)
|
||||
- **Related fragments**: `risk-governance.md` (NFR risk scoring), `probability-impact.md` (NFR impact assessment), `test-quality.md` (maintainability standards), `test-levels-framework.md` (system-level testing for NFRs)
|
||||
- **Tools by NFR Category**:
|
||||
- **Security**: Playwright (E2E auth/authz), OWASP ZAP, Burp Suite, npm audit, Snyk
|
||||
- **Performance**: k6 (load/stress/spike/endurance), Lighthouse (Core Web Vitals), Artillery
|
||||
- **Reliability**: Playwright (E2E error handling), API tests (retries, health checks), Chaos Engineering tools
|
||||
- **Maintainability**: GitHub Actions (coverage, duplication, audit), jscpd, Playwright (observability validation)
|
||||
|
||||
_Source: Test Architect course (NFR testing approaches, Utility Tree, Quality Scenarios), ISO/IEC 25010 Software Quality Characteristics, OWASP Top 10, k6 documentation, SRE practices_
|
||||
|
||||
@@ -1,9 +1,730 @@
|
||||
# Playwright Configuration Guardrails
|
||||
|
||||
- Load environment configs via a central map (`envConfigMap`) and fail fast when `TEST_ENV` is missing or unsupported.
|
||||
- Standardize timeouts: action 15s, navigation 30s, expect 10s, test 60s; expose overrides through fixtures rather than inline literals.
|
||||
- Emit HTML + JUnit reporters, disable auto-open, and store artifacts under `test-results/` for CI upload.
|
||||
- Keep `.env.example`, `.nvmrc`, and browser dependencies versioned so local and CI runs stay aligned.
|
||||
- Use global setup for shared auth tokens or seeding, but prefer per-test fixtures for anything mutable to avoid cross-test leakage.
|
||||
## Principle
|
||||
|
||||
_Source: Playwright book repo, SEON configuration example._
|
||||
Load environment configs via a central map (`envConfigMap`), standardize timeouts (action 15s, navigation 30s, expect 10s, test 60s), emit HTML + JUnit reporters, and store artifacts under `test-results/` for CI upload. Keep `.env.example`, `.nvmrc`, and browser dependencies versioned so local and CI runs stay aligned.
|
||||
|
||||
## Rationale
|
||||
|
||||
Environment-specific configuration prevents hardcoded URLs, timeouts, and credentials from leaking into tests. A central config map with fail-fast validation catches missing environments early. Standardized timeouts reduce flakiness while remaining long enough for real-world network conditions. Consistent artifact storage (`test-results/`, `playwright-report/`) enables CI pipelines to upload failure evidence automatically. Versioned dependencies (`.nvmrc`, `package.json` browser versions) eliminate "works on my machine" issues between local and CI environments.
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Environment-Based Configuration
|
||||
|
||||
**Context**: When testing against multiple environments (local, staging, production), use a central config map that loads environment-specific settings and fails fast if `TEST_ENV` is invalid.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts - Central config loader
|
||||
import { config as dotenvConfig } from 'dotenv';
|
||||
import path from 'path';
|
||||
|
||||
// Load .env from project root
|
||||
dotenvConfig({
|
||||
path: path.resolve(__dirname, '../../.env'),
|
||||
});
|
||||
|
||||
// Central environment config map
|
||||
const envConfigMap = {
|
||||
local: require('./playwright/config/local.config').default,
|
||||
staging: require('./playwright/config/staging.config').default,
|
||||
production: require('./playwright/config/production.config').default,
|
||||
};
|
||||
|
||||
const environment = process.env.TEST_ENV || 'local';
|
||||
|
||||
// Fail fast if environment not supported
|
||||
if (!Object.keys(envConfigMap).includes(environment)) {
|
||||
console.error(`❌ No configuration found for environment: ${environment}`);
|
||||
console.error(` Available environments: ${Object.keys(envConfigMap).join(', ')}`);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
console.log(`✅ Running tests against: ${environment.toUpperCase()}`);
|
||||
|
||||
export default envConfigMap[environment as keyof typeof envConfigMap];
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright/config/base.config.ts - Shared base configuration
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import path from 'path';
|
||||
|
||||
export const baseConfig = defineConfig({
|
||||
testDir: path.resolve(__dirname, '../tests'),
|
||||
outputDir: path.resolve(__dirname, '../../test-results'),
|
||||
fullyParallel: true,
|
||||
forbidOnly: !!process.env.CI,
|
||||
retries: process.env.CI ? 2 : 0,
|
||||
workers: process.env.CI ? 1 : undefined,
|
||||
reporter: [
|
||||
['html', { outputFolder: 'playwright-report', open: 'never' }],
|
||||
['junit', { outputFile: 'test-results/results.xml' }],
|
||||
['list'],
|
||||
],
|
||||
use: {
|
||||
actionTimeout: 15000,
|
||||
navigationTimeout: 30000,
|
||||
trace: 'on-first-retry',
|
||||
screenshot: 'only-on-failure',
|
||||
video: 'retain-on-failure',
|
||||
},
|
||||
globalSetup: path.resolve(__dirname, '../support/global-setup.ts'),
|
||||
timeout: 60000,
|
||||
expect: { timeout: 10000 },
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright/config/local.config.ts - Local environment
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import { baseConfig } from './base.config';
|
||||
|
||||
export default defineConfig({
|
||||
...baseConfig,
|
||||
use: {
|
||||
...baseConfig.use,
|
||||
baseURL: 'http://localhost:3000',
|
||||
video: 'off', // No video locally for speed
|
||||
},
|
||||
webServer: {
|
||||
command: 'npm run dev',
|
||||
url: 'http://localhost:3000',
|
||||
reuseExistingServer: !process.env.CI,
|
||||
timeout: 120000,
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright/config/staging.config.ts - Staging environment
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import { baseConfig } from './base.config';
|
||||
|
||||
export default defineConfig({
|
||||
...baseConfig,
|
||||
use: {
|
||||
...baseConfig.use,
|
||||
baseURL: 'https://staging.example.com',
|
||||
ignoreHTTPSErrors: true, // Allow self-signed certs in staging
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright/config/production.config.ts - Production environment
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import { baseConfig } from './base.config';
|
||||
|
||||
export default defineConfig({
|
||||
...baseConfig,
|
||||
retries: 3, // More retries in production
|
||||
use: {
|
||||
...baseConfig.use,
|
||||
baseURL: 'https://example.com',
|
||||
video: 'on', // Always record production failures
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
```bash
|
||||
# .env.example - Template for developers
|
||||
TEST_ENV=local
|
||||
API_KEY=your_api_key_here
|
||||
DATABASE_URL=postgresql://localhost:5432/test_db
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Central `envConfigMap` prevents environment misconfiguration
|
||||
- Fail-fast validation with clear error message (available envs listed)
|
||||
- Base config defines shared settings, environment configs override
|
||||
- `.env.example` provides template for required secrets
|
||||
- `TEST_ENV=local` as default for local development
|
||||
- Production config increases retries and enables video recording
|
||||
|
||||
### Example 2: Timeout Standards
|
||||
|
||||
**Context**: When tests fail due to inconsistent timeout settings, standardize timeouts across all tests: action 15s, navigation 30s, expect 10s, test 60s. Expose overrides through fixtures rather than inline literals.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright/config/base.config.ts - Standardized timeouts
|
||||
import { defineConfig } from '@playwright/test';
|
||||
|
||||
export default defineConfig({
|
||||
// Global test timeout: 60 seconds
|
||||
timeout: 60000,
|
||||
|
||||
use: {
|
||||
// Action timeout: 15 seconds (click, fill, etc.)
|
||||
actionTimeout: 15000,
|
||||
|
||||
// Navigation timeout: 30 seconds (page.goto, page.reload)
|
||||
navigationTimeout: 30000,
|
||||
},
|
||||
|
||||
// Expect timeout: 10 seconds (all assertions)
|
||||
expect: {
|
||||
timeout: 10000,
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright/support/fixtures/timeout-fixture.ts - Timeout override fixture
|
||||
import { test as base } from '@playwright/test';
|
||||
|
||||
type TimeoutOptions = {
|
||||
extendedTimeout: (timeoutMs: number) => Promise<void>;
|
||||
};
|
||||
|
||||
export const test = base.extend<TimeoutOptions>({
|
||||
extendedTimeout: async ({}, use, testInfo) => {
|
||||
const originalTimeout = testInfo.timeout;
|
||||
|
||||
await use(async (timeoutMs: number) => {
|
||||
testInfo.setTimeout(timeoutMs);
|
||||
});
|
||||
|
||||
// Restore original timeout after test
|
||||
testInfo.setTimeout(originalTimeout);
|
||||
},
|
||||
});
|
||||
|
||||
export { expect } from '@playwright/test';
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Usage in tests - Standard timeouts (implicit)
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test('user can log in', async ({ page }) => {
|
||||
await page.goto('/login'); // Uses 30s navigation timeout
|
||||
await page.fill('[data-testid="email"]', 'test@example.com'); // Uses 15s action timeout
|
||||
await page.click('[data-testid="login-button"]'); // Uses 15s action timeout
|
||||
|
||||
await expect(page.getByText('Welcome')).toBeVisible(); // Uses 10s expect timeout
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Usage in tests - Per-test timeout override
|
||||
import { test, expect } from '../support/fixtures/timeout-fixture';
|
||||
|
||||
test('slow data processing operation', async ({ page, extendedTimeout }) => {
|
||||
// Override default 60s timeout for this slow test
|
||||
await extendedTimeout(180000); // 3 minutes
|
||||
|
||||
await page.goto('/data-processing');
|
||||
await page.click('[data-testid="process-large-file"]');
|
||||
|
||||
// Wait for long-running operation
|
||||
await expect(page.getByText('Processing complete')).toBeVisible({
|
||||
timeout: 120000, // 2 minutes for assertion
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Per-assertion timeout override (inline)
|
||||
test('API returns quickly', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Override expect timeout for fast API (reduce flakiness detection)
|
||||
await expect(page.getByTestId('user-name')).toBeVisible({ timeout: 5000 }); // 5s instead of 10s
|
||||
|
||||
// Override expect timeout for slow external API
|
||||
await expect(page.getByTestId('weather-widget')).toBeVisible({ timeout: 20000 }); // 20s instead of 10s
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Standardized timeouts**: action 15s, navigation 30s, expect 10s, test 60s (global defaults)
|
||||
- Fixture-based override (`extendedTimeout`) for slow tests (preferred over inline)
|
||||
- Per-assertion timeout override via `{ timeout: X }` option (use sparingly)
|
||||
- Avoid hard waits (`page.waitForTimeout(3000)`) - use event-based waits instead
|
||||
- CI environments may need longer timeouts; handle this in environment-specific config (see the sketch below)
|
||||
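A minimal sketch of that CI-only relaxation; the multipliers are assumptions, not recommendations:

```typescript
// playwright/config/base.config.ts (sketch) - loosen timeouts only when CI is set
import { defineConfig } from '@playwright/test';

const isCI = !!process.env.CI;

export default defineConfig({
  timeout: isCI ? 90_000 : 60_000,
  expect: { timeout: isCI ? 15_000 : 10_000 },
  use: {
    actionTimeout: isCI ? 20_000 : 15_000,
    navigationTimeout: isCI ? 45_000 : 30_000,
  },
});
```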
|
||||
### Example 3: Artifact Output Configuration
|
||||
|
||||
**Context**: When debugging failures in CI, configure artifacts (screenshots, videos, traces, HTML reports) to be captured on failure and stored in consistent locations for upload.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts - Artifact configuration
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import path from 'path';
|
||||
|
||||
export default defineConfig({
|
||||
// Output directory for test artifacts
|
||||
outputDir: path.resolve(__dirname, './test-results'),
|
||||
|
||||
use: {
|
||||
// Screenshot on failure only (saves space)
|
||||
screenshot: 'only-on-failure',
|
||||
|
||||
// Video recording on failure + retry
|
||||
video: 'retain-on-failure',
|
||||
|
||||
// Trace recording on first retry (best debugging data)
|
||||
trace: 'on-first-retry',
|
||||
},
|
||||
|
||||
reporter: [
|
||||
// HTML report (visual, interactive)
|
||||
[
|
||||
'html',
|
||||
{
|
||||
outputFolder: 'playwright-report',
|
||||
open: 'never', // Don't auto-open in CI
|
||||
},
|
||||
],
|
||||
|
||||
// JUnit XML (CI integration)
|
||||
[
|
||||
'junit',
|
||||
{
|
||||
outputFile: 'test-results/results.xml',
|
||||
},
|
||||
],
|
||||
|
||||
// List reporter (console output)
|
||||
['list'],
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright/support/fixtures/artifact-fixture.ts - Custom artifact capture
|
||||
import { test as base } from '@playwright/test';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
|
||||
export const test = base.extend({
|
||||
// Auto-capture console logs on failure
|
||||
page: async ({ page }, use, testInfo) => {
|
||||
const logs: string[] = [];
|
||||
|
||||
page.on('console', (msg) => {
|
||||
logs.push(`[${msg.type()}] ${msg.text()}`);
|
||||
});
|
||||
|
||||
await use(page);
|
||||
|
||||
// Save logs on failure
|
||||
if (testInfo.status !== testInfo.expectedStatus) {
|
||||
const logsPath = path.join(testInfo.outputDir, 'console-logs.txt');
|
||||
fs.writeFileSync(logsPath, logs.join('\n'));
|
||||
testInfo.attachments.push({
|
||||
name: 'console-logs',
|
||||
contentType: 'text/plain',
|
||||
path: logsPath,
|
||||
});
|
||||
}
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
```yaml
|
||||
# .github/workflows/e2e.yml - CI artifact upload
|
||||
name: E2E Tests
|
||||
on: [push, pull_request]
|
||||
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version-file: '.nvmrc'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Install Playwright browsers
|
||||
run: npx playwright install --with-deps
|
||||
|
||||
- name: Run tests
|
||||
run: npm run test
|
||||
env:
|
||||
TEST_ENV: staging
|
||||
|
||||
# Upload test artifacts on failure
|
||||
- name: Upload test results
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: test-results
|
||||
path: test-results/
|
||||
retention-days: 30
|
||||
|
||||
- name: Upload Playwright report
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: playwright-report
|
||||
path: playwright-report/
|
||||
retention-days: 30
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Example: Custom screenshot on specific condition
|
||||
test('capture screenshot on specific error', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
|
||||
try {
|
||||
await page.click('[data-testid="submit-payment"]');
|
||||
await expect(page.getByText('Order Confirmed')).toBeVisible();
|
||||
} catch (error) {
|
||||
// Capture custom screenshot with timestamp
|
||||
await page.screenshot({
|
||||
path: `test-results/payment-error-${Date.now()}.png`,
|
||||
fullPage: true,
|
||||
});
|
||||
throw error;
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- `screenshot: 'only-on-failure'` saves space (not every test)
|
||||
- `video: 'retain-on-failure'` captures full flow on failures
|
||||
- `trace: 'on-first-retry'` provides deep debugging data (network, DOM, console)
|
||||
- HTML report at `playwright-report/` (visual debugging)
|
||||
- JUnit XML at `test-results/results.xml` (CI integration)
|
||||
- CI uploads artifacts on failure with 30-day retention
|
||||
- Custom fixture can capture console logs, network logs, etc.
|
||||
|
||||
### Example 4: Parallelization Configuration
|
||||
|
||||
**Context**: When tests run slowly in CI, configure parallelization with worker count, sharding, and fully parallel execution to maximize speed while maintaining stability.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts - Parallelization settings
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import os from 'os';
|
||||
|
||||
export default defineConfig({
|
||||
// Run tests in parallel within single file
|
||||
fullyParallel: true,
|
||||
|
||||
// Worker configuration
|
||||
workers: process.env.CI
|
||||
? 1 // Serial in CI for stability (or 2 for faster CI)
|
||||
: os.cpus().length - 1, // Parallel locally (leave 1 CPU for OS)
|
||||
|
||||
// Prevent accidentally committed .only() from blocking CI
|
||||
forbidOnly: !!process.env.CI,
|
||||
|
||||
// Retry failed tests in CI
|
||||
retries: process.env.CI ? 2 : 0,
|
||||
|
||||
// Shard configuration (split tests across multiple machines)
|
||||
shard:
|
||||
process.env.SHARD_INDEX && process.env.SHARD_TOTAL
|
||||
? {
|
||||
current: parseInt(process.env.SHARD_INDEX, 10),
|
||||
total: parseInt(process.env.SHARD_TOTAL, 10),
|
||||
}
|
||||
: undefined,
|
||||
});
|
||||
```
|
||||
|
||||
```yaml
|
||||
# .github/workflows/e2e-parallel.yml - Sharded CI execution
|
||||
name: E2E Tests (Parallel)
|
||||
on: [push, pull_request]
|
||||
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
shard: [1, 2, 3, 4] # Split tests across 4 machines
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version-file: '.nvmrc'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Install Playwright browsers
|
||||
run: npx playwright install --with-deps
|
||||
|
||||
- name: Run tests (shard ${{ matrix.shard }})
|
||||
run: npm run test
|
||||
env:
|
||||
SHARD_INDEX: ${{ matrix.shard }}
|
||||
SHARD_TOTAL: 4
|
||||
TEST_ENV: staging
|
||||
|
||||
- name: Upload test results
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: test-results-shard-${{ matrix.shard }}
|
||||
path: test-results/
|
||||
```
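
Playwright also exposes sharding as a CLI flag, so the same split works without a `shard` block in the config; which form to use is a team preference:

```bash
# Shard 1 of 4 from the command line
npx playwright test --shard=1/4

# In the workflow above, the matrix values could feed the flag instead of env vars
npx playwright test --shard="${SHARD_INDEX}/${SHARD_TOTAL}"
```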
|
||||
|
||||
```typescript
|
||||
// playwright/config/serial.config.ts - Serial execution for flaky tests
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import { baseConfig } from './base.config';
|
||||
|
||||
export default defineConfig({
|
||||
...baseConfig,
|
||||
|
||||
// Disable parallel execution
|
||||
fullyParallel: false,
|
||||
workers: 1,
|
||||
|
||||
// Used for: authentication flows, database-dependent tests, feature flag tests
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Usage: Force serial execution for specific tests
|
||||
import { test } from '@playwright/test';
|
||||
|
||||
// Serial execution for auth tests (shared session state)
|
||||
test.describe.configure({ mode: 'serial' });
|
||||
|
||||
test.describe('Authentication Flow', () => {
|
||||
test('user can log in', async ({ page }) => {
|
||||
// First test in serial block
|
||||
});
|
||||
|
||||
test('user can access dashboard', async ({ page }) => {
|
||||
// Depends on previous test (serial)
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Usage: Parallel execution for independent tests (default)
|
||||
import { test } from '@playwright/test';
|
||||
|
||||
test.describe('Product Catalog', () => {
|
||||
test('can view product 1', async ({ page }) => {
|
||||
// Runs in parallel with other tests
|
||||
});
|
||||
|
||||
test('can view product 2', async ({ page }) => {
|
||||
// Runs in parallel with other tests
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- `fullyParallel: true` enables parallel execution within single test file
|
||||
- Workers: 1 in CI (stability), N-1 CPUs locally (speed)
|
||||
- Sharding splits tests across multiple CI machines (4x faster with 4 shards)
|
||||
- `test.describe.configure({ mode: 'serial' })` for dependent tests
|
||||
- `forbidOnly: true` in CI prevents `.only()` from blocking pipeline
|
||||
- Matrix strategy in CI runs shards concurrently
|
||||
|
||||
### Example 5: Project Configuration
|
||||
|
||||
**Context**: When testing across multiple browsers, devices, or configurations, use Playwright projects to run the same tests against different environments (chromium, firefox, webkit, mobile).
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts - Multiple browser projects
|
||||
import { defineConfig, devices } from '@playwright/test';
|
||||
|
||||
export default defineConfig({
|
||||
projects: [
|
||||
// Desktop browsers
|
||||
{
|
||||
name: 'chromium',
|
||||
use: { ...devices['Desktop Chrome'] },
|
||||
},
|
||||
{
|
||||
name: 'firefox',
|
||||
use: { ...devices['Desktop Firefox'] },
|
||||
},
|
||||
{
|
||||
name: 'webkit',
|
||||
use: { ...devices['Desktop Safari'] },
|
||||
},
|
||||
|
||||
// Mobile browsers
|
||||
{
|
||||
name: 'mobile-chrome',
|
||||
use: { ...devices['Pixel 5'] },
|
||||
},
|
||||
{
|
||||
name: 'mobile-safari',
|
||||
use: { ...devices['iPhone 13'] },
|
||||
},
|
||||
|
||||
// Tablet
|
||||
{
|
||||
name: 'tablet',
|
||||
use: { ...devices['iPad Pro'] },
|
||||
},
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts - Authenticated vs. unauthenticated projects
|
||||
import { defineConfig } from '@playwright/test';
|
||||
import path from 'path';
|
||||
|
||||
export default defineConfig({
|
||||
projects: [
|
||||
// Setup project (runs first, creates auth state)
|
||||
{
|
||||
name: 'setup',
|
||||
testMatch: /global-setup\.ts/,
|
||||
},
|
||||
|
||||
// Authenticated tests (reuse auth state)
|
||||
{
|
||||
name: 'authenticated',
|
||||
dependencies: ['setup'],
|
||||
use: {
|
||||
storageState: path.resolve(__dirname, './playwright/.auth/user.json'),
|
||||
},
|
||||
testMatch: /.*authenticated\.spec\.ts/,
|
||||
},
|
||||
|
||||
// Unauthenticated tests (public pages)
|
||||
{
|
||||
name: 'unauthenticated',
|
||||
testMatch: /.*unauthenticated\.spec\.ts/,
|
||||
},
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
```typescript
// playwright/support/global-setup.ts - Setup project for auth
// Picked up by the `setup` project above (testMatch: /global-setup\.ts/),
// so it runs before the `authenticated` project via `dependencies`.
import { test as setup } from '@playwright/test';
import path from 'path';

setup('authenticate', async ({ page }) => {
  // Perform authentication
  await page.goto('http://localhost:3000/login');
  await page.fill('[data-testid="email"]', 'test@example.com');
  await page.fill('[data-testid="password"]', 'password123');
  await page.click('[data-testid="login-button"]');

  // Wait for authentication to complete
  await page.waitForURL('**/dashboard');

  // Save authentication state for reuse via storageState
  await page.context().storageState({
    path: path.resolve(__dirname, '../.auth/user.json'),
  });
});
```
|
||||
|
||||
```bash
|
||||
# Run specific project
|
||||
npx playwright test --project=chromium
|
||||
npx playwright test --project=mobile-chrome
|
||||
npx playwright test --project=authenticated
|
||||
|
||||
# Run multiple projects
|
||||
npx playwright test --project=chromium --project=firefox
|
||||
|
||||
# Run all projects (default)
|
||||
npx playwright test
|
||||
```
|
||||
|
||||
```typescript
|
||||
// Usage: Project-specific test
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test('mobile navigation works', async ({ page, isMobile }) => {
|
||||
await page.goto('/');
|
||||
|
||||
if (isMobile) {
|
||||
// Open mobile menu
|
||||
await page.click('[data-testid="hamburger-menu"]');
|
||||
}
|
||||
|
||||
await page.click('[data-testid="products-link"]');
|
||||
await expect(page).toHaveURL(/.*products/);
|
||||
});
|
||||
```
|
||||
|
||||
```yaml
|
||||
# .github/workflows/e2e-cross-browser.yml - CI cross-browser testing
|
||||
name: E2E Tests (Cross-Browser)
|
||||
on: [push, pull_request]
|
||||
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
project: [chromium, firefox, webkit, mobile-chrome]
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-node@v4
|
||||
- run: npm ci
|
||||
- run: npx playwright install --with-deps
|
||||
|
||||
- name: Run tests (${{ matrix.project }})
|
||||
run: npx playwright test --project=${{ matrix.project }}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Projects enable testing across browsers, devices, and configurations
|
||||
- `devices` from `@playwright/test` provide preset configurations (Pixel 5, iPhone 13, etc.)
|
||||
- `dependencies` ensures setup project runs first (auth, data seeding)
|
||||
- `storageState` shares authentication across tests (no per-test login cost)
|
||||
- `testMatch` filters which tests run in which project
|
||||
- CI matrix strategy runs projects in parallel (4x faster with 4 projects)
|
||||
- `isMobile` context property for conditional logic in tests
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*framework` (config setup), `*ci` (parallelization, artifact upload)
|
||||
- **Related fragments**:
|
||||
- `fixture-architecture.md` - Fixture-based timeout overrides
|
||||
- `ci-burn-in.md` - CI pipeline artifact upload
|
||||
- `test-quality.md` - Timeout standards (no hard waits)
|
||||
- `data-factories.md` - Per-test isolation (no shared global state)
|
||||
|
||||
## Configuration Checklist
|
||||
|
||||
**Before deploying tests, verify**:
|
||||
|
||||
- [ ] Environment config map with fail-fast validation (see the sketch after this checklist)
|
||||
- [ ] Standardized timeouts (action 15s, navigation 30s, expect 10s, test 60s)
|
||||
- [ ] Artifact storage at `test-results/` and `playwright-report/`
|
||||
- [ ] HTML + JUnit reporters configured
|
||||
- [ ] `.env.example`, `.nvmrc`, browser versions committed
|
||||
- [ ] Parallelization configured (workers, sharding)
|
||||
- [ ] Projects defined for cross-browser/device testing (if needed)
|
||||
- [ ] CI uploads artifacts on failure with 30-day retention
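The "fail-fast validation" item can be as small as the sketch below; environment names and URLs are placeholders, not the project's actual values:

```typescript
// playwright/config/environments.ts - fail-fast environment map (sketch)
const ENVIRONMENTS = {
  local: { baseURL: 'http://localhost:3000' },
  staging: { baseURL: 'https://staging.example.com' },
} as const;

type EnvName = keyof typeof ENVIRONMENTS;

const envName = (process.env.TEST_ENV ?? 'local') as EnvName;

// Abort immediately on an unknown environment instead of running against the wrong target
if (!(envName in ENVIRONMENTS)) {
  throw new Error(`Unknown TEST_ENV "${envName}". Expected: ${Object.keys(ENVIRONMENTS).join(', ')}`);
}

export const env = ENVIRONMENTS[envName];
```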
|
||||
|
||||
_Source: Playwright book repo, SEON configuration example, Murat testing philosophy (lines 216-271)._
|
||||
|
||||
@@ -1,17 +1,601 @@
|
||||
# Probability and Impact Scale
|
||||
|
||||
- **Probability**
|
||||
- 1 – Unlikely: standard implementation, low uncertainty.
|
||||
- 2 – Possible: edge cases or partial unknowns worth investigation.
|
||||
- 3 – Likely: known issues, new integrations, or high ambiguity.
|
||||
- **Impact**
|
||||
- 1 – Minor: cosmetic issues or easy workarounds.
|
||||
- 2 – Degraded: partial feature loss or manual workaround required.
|
||||
- 3 – Critical: blockers, data/security/regulatory exposure.
|
||||
- Multiply probability × impact to derive the risk score.
|
||||
- 1–3: document for awareness.
|
||||
- 4–5: monitor closely, plan mitigations.
|
||||
- 6–8: CONCERNS at the gate until mitigations are implemented.
|
||||
- 9: automatic gate FAIL until resolved or formally waived.
|
||||
## Principle
|
||||
|
||||
_Source: Murat risk model summary._
|
||||
Risk scoring uses a **probability × impact** matrix (1-9 scale) to prioritize testing efforts. Higher scores (6-9) demand immediate action; lower scores (1-3) require documentation only. This systematic approach ensures testing resources focus on the highest-value risks.
|
||||
|
||||
## Rationale
|
||||
|
||||
**The Problem**: Without quantifiable risk assessment, teams over-test low-value scenarios while missing critical risks. Gut feeling leads to inconsistent prioritization and missed edge cases.
|
||||
|
||||
**The Solution**: Standardize risk evaluation with a 3×3 matrix (probability: 1-3, impact: 1-3). Multiply to derive risk score (1-9). Automate classification (DOCUMENT, MONITOR, MITIGATE, BLOCK) based on thresholds. This approach surfaces hidden risks early and justifies testing decisions to stakeholders.
|
||||
|
||||
**Why This Matters**:
|
||||
|
||||
- Consistent risk language across product, engineering, and QA
|
||||
- Objective prioritization of test scenarios (not politics)
|
||||
- Automatic gate decisions (score=9 → FAIL until resolved)
|
||||
- Audit trail for compliance and retrospectives
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Probability-Impact Matrix Implementation (Automated Classification)
|
||||
|
||||
**Context**: Implement a reusable risk scoring system with automatic threshold classification
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// src/testing/risk-matrix.ts
|
||||
|
||||
/**
|
||||
* Probability levels:
|
||||
* 1 = Unlikely (standard implementation, low uncertainty)
|
||||
* 2 = Possible (edge cases or partial unknowns)
|
||||
* 3 = Likely (known issues, new integrations, high ambiguity)
|
||||
*/
|
||||
export type Probability = 1 | 2 | 3;
|
||||
|
||||
/**
|
||||
* Impact levels:
|
||||
* 1 = Minor (cosmetic issues or easy workarounds)
|
||||
* 2 = Degraded (partial feature loss or manual workaround)
|
||||
* 3 = Critical (blockers, data/security/regulatory exposure)
|
||||
*/
|
||||
export type Impact = 1 | 2 | 3;
|
||||
|
||||
/**
|
||||
* Risk score (probability × impact): 1-9
|
||||
*/
|
||||
export type RiskScore = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9;
|
||||
|
||||
/**
|
||||
* Action categories based on risk score thresholds
|
||||
*/
|
||||
export type RiskAction = 'DOCUMENT' | 'MONITOR' | 'MITIGATE' | 'BLOCK';
|
||||
|
||||
export type RiskAssessment = {
|
||||
probability: Probability;
|
||||
impact: Impact;
|
||||
score: RiskScore;
|
||||
action: RiskAction;
|
||||
reasoning: string;
|
||||
};
|
||||
|
||||
/**
|
||||
* Calculate risk score: probability × impact
|
||||
*/
|
||||
export function calculateRiskScore(probability: Probability, impact: Impact): RiskScore {
|
||||
return (probability * impact) as RiskScore;
|
||||
}
|
||||
|
||||
/**
|
||||
* Classify risk action based on score thresholds:
|
||||
* - 1-3: DOCUMENT (awareness only)
|
||||
* - 4-5: MONITOR (watch closely, plan mitigations)
|
||||
* - 6-8: MITIGATE (CONCERNS at gate until mitigated)
|
||||
* - 9: BLOCK (automatic FAIL until resolved or waived)
|
||||
*/
|
||||
export function classifyRiskAction(score: RiskScore): RiskAction {
|
||||
if (score >= 9) return 'BLOCK';
|
||||
if (score >= 6) return 'MITIGATE';
|
||||
if (score >= 4) return 'MONITOR';
|
||||
return 'DOCUMENT';
|
||||
}
|
||||
|
||||
/**
|
||||
* Full risk assessment with automatic classification
|
||||
*/
|
||||
export function assessRisk(params: { probability: Probability; impact: Impact; reasoning: string }): RiskAssessment {
|
||||
const { probability, impact, reasoning } = params;
|
||||
|
||||
const score = calculateRiskScore(probability, impact);
|
||||
const action = classifyRiskAction(score);
|
||||
|
||||
return { probability, impact, score, action, reasoning };
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate risk matrix visualization (3x3 grid)
|
||||
* Returns markdown table with color-coded scores
|
||||
*/
|
||||
export function generateRiskMatrix(): string {
|
||||
const matrix: string[][] = [];
|
||||
const header = ['Impact \\ Probability', 'Unlikely (1)', 'Possible (2)', 'Likely (3)'];
|
||||
matrix.push(header);
|
||||
|
||||
const impactLabels = ['Critical (3)', 'Degraded (2)', 'Minor (1)'];
|
||||
for (let impact = 3; impact >= 1; impact--) {
|
||||
const row = [impactLabels[3 - impact]];
|
||||
for (let probability = 1; probability <= 3; probability++) {
|
||||
const score = calculateRiskScore(probability as Probability, impact as Impact);
|
||||
const action = classifyRiskAction(score);
|
||||
const emoji = action === 'BLOCK' ? '🔴' : action === 'MITIGATE' ? '🟠' : action === 'MONITOR' ? '🟡' : '🟢';
|
||||
row.push(`${emoji} ${score}`);
|
||||
}
|
||||
matrix.push(row);
|
||||
}
|
||||
|
||||
return matrix.map((row) => `| ${row.join(' | ')} |`).join('\n');
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Type-safe probability/impact (1-3 enforced at compile time)
|
||||
- Automatic action classification (DOCUMENT, MONITOR, MITIGATE, BLOCK)
|
||||
- Visual matrix generation for documentation
|
||||
- Risk score formula: `probability * impact` (max = 9)
|
||||
- Threshold-based decision rules (6-8 = MITIGATE, 9 = BLOCK)
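A short usage sketch, assuming the exports above; the reasoning text is illustrative:

```typescript
// Usage sketch: assess one scenario and render the matrix
import { assessRisk, generateRiskMatrix } from './risk-matrix';

const risk = assessRisk({
  probability: 2,
  impact: 3,
  reasoning: 'New third-party integration touching checkout totals',
});

console.log(risk.score);  // 6
console.log(risk.action); // 'MITIGATE' (surfaces as CONCERNS at the gate)

console.log(generateRiskMatrix()); // 3x3 markdown table with emoji-coded scores
```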
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Risk Assessment Workflow (Test Planning Integration)
|
||||
|
||||
**Context**: Apply risk matrix during test design to prioritize scenarios
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/test-planning/risk-assessment.ts
|
||||
import { assessRisk, generateRiskMatrix, type RiskAssessment } from '../../../src/testing/risk-matrix';
|
||||
|
||||
export type TestScenario = {
|
||||
id: string;
|
||||
title: string;
|
||||
feature: string;
|
||||
risk: RiskAssessment;
|
||||
testLevel: 'E2E' | 'API' | 'Unit';
|
||||
priority: 'P0' | 'P1' | 'P2' | 'P3';
|
||||
owner: string;
|
||||
};
|
||||
|
||||
/**
|
||||
* Assess test scenarios and auto-assign priority based on risk score
|
||||
*/
|
||||
export function assessTestScenarios(scenarios: Omit<TestScenario, 'priority'>[]): TestScenario[] {
|
||||
return scenarios.map((scenario) => {
|
||||
// Auto-assign priority based on risk score
|
||||
const priority = mapRiskToPriority(scenario.risk.score);
|
||||
return { ...scenario, priority };
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Map risk score to test priority (P0-P3)
|
||||
* P0: Critical (score 9) - blocks release
|
||||
* P1: High (score 6-8) - must fix before release
|
||||
* P2: Medium (score 4-5) - fix if time permits
|
||||
* P3: Low (score 1-3) - document and defer
|
||||
*/
|
||||
function mapRiskToPriority(score: number): 'P0' | 'P1' | 'P2' | 'P3' {
|
||||
if (score === 9) return 'P0';
|
||||
if (score >= 6) return 'P1';
|
||||
if (score >= 4) return 'P2';
|
||||
return 'P3';
|
||||
}
|
||||
|
||||
/**
|
||||
* Example: Payment flow risk assessment
|
||||
*/
|
||||
export const paymentScenarios: Array<Omit<TestScenario, 'priority'>> = [
|
||||
{
|
||||
id: 'PAY-001',
|
||||
title: 'Valid credit card payment completes successfully',
|
||||
feature: 'Checkout',
|
||||
risk: assessRisk({
|
||||
probability: 2, // Possible (standard Stripe integration)
|
||||
impact: 3, // Critical (revenue loss if broken)
|
||||
reasoning: 'Core revenue flow, but Stripe is well-tested',
|
||||
}),
|
||||
testLevel: 'E2E',
|
||||
owner: 'qa-team',
|
||||
},
|
||||
{
|
||||
id: 'PAY-002',
|
||||
title: 'Expired credit card shows user-friendly error',
|
||||
feature: 'Checkout',
|
||||
risk: assessRisk({
|
||||
probability: 3, // Likely (edge case handling often buggy)
|
||||
impact: 2, // Degraded (users see error, but can retry)
|
||||
reasoning: 'Error handling logic is custom and complex',
|
||||
}),
|
||||
testLevel: 'E2E',
|
||||
owner: 'qa-team',
|
||||
},
|
||||
{
|
||||
id: 'PAY-003',
|
||||
title: 'Payment confirmation email formatting is correct',
|
||||
feature: 'Email',
|
||||
risk: assessRisk({
|
||||
probability: 2, // Possible (template changes occasionally break)
|
||||
impact: 1, // Minor (cosmetic issue, email still sent)
|
||||
reasoning: 'Non-blocking, users get email regardless',
|
||||
}),
|
||||
testLevel: 'Unit',
|
||||
owner: 'dev-team',
|
||||
},
|
||||
{
|
||||
id: 'PAY-004',
|
||||
title: 'Payment fails gracefully when Stripe is down',
|
||||
feature: 'Checkout',
|
||||
risk: assessRisk({
|
||||
probability: 1, // Unlikely (Stripe has 99.99% uptime)
|
||||
impact: 3, // Critical (complete checkout failure)
|
||||
reasoning: 'Rare but catastrophic, requires retry mechanism',
|
||||
}),
|
||||
testLevel: 'API',
|
||||
owner: 'qa-team',
|
||||
},
|
||||
];
|
||||
|
||||
/**
|
||||
* Generate risk assessment report with priority distribution
|
||||
*/
|
||||
export function generateRiskReport(scenarios: TestScenario[]): string {
|
||||
const priorityCounts = scenarios.reduce(
|
||||
(acc, s) => {
|
||||
acc[s.priority] = (acc[s.priority] || 0) + 1;
|
||||
return acc;
|
||||
},
|
||||
{} as Record<string, number>,
|
||||
);
|
||||
|
||||
const actionCounts = scenarios.reduce(
|
||||
(acc, s) => {
|
||||
acc[s.risk.action] = (acc[s.risk.action] || 0) + 1;
|
||||
return acc;
|
||||
},
|
||||
{} as Record<string, number>,
|
||||
);
|
||||
|
||||
return `
|
||||
# Risk Assessment Report
|
||||
|
||||
## Risk Matrix
|
||||
${generateRiskMatrix()}
|
||||
|
||||
## Priority Distribution
|
||||
- **P0 (Blocker)**: ${priorityCounts.P0 || 0} scenarios
|
||||
- **P1 (High)**: ${priorityCounts.P1 || 0} scenarios
|
||||
- **P2 (Medium)**: ${priorityCounts.P2 || 0} scenarios
|
||||
- **P3 (Low)**: ${priorityCounts.P3 || 0} scenarios
|
||||
|
||||
## Action Required
|
||||
- **BLOCK**: ${actionCounts.BLOCK || 0} scenarios (auto-fail gate)
|
||||
- **MITIGATE**: ${actionCounts.MITIGATE || 0} scenarios (concerns at gate)
|
||||
- **MONITOR**: ${actionCounts.MONITOR || 0} scenarios (watch closely)
|
||||
- **DOCUMENT**: ${actionCounts.DOCUMENT || 0} scenarios (awareness only)
|
||||
|
||||
## Scenarios by Risk Score (Highest First)
|
||||
${scenarios
|
||||
.sort((a, b) => b.risk.score - a.risk.score)
|
||||
.map((s) => `- **[${s.priority}]** ${s.id}: ${s.title} (Score: ${s.risk.score} - ${s.risk.action})`)
|
||||
.join('\n')}
|
||||
`.trim();
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Risk score → Priority mapping (P0-P3 automated)
|
||||
- Report generation with priority/action distribution
|
||||
- Scenarios sorted by risk score (highest first)
|
||||
- Visual matrix included in reports
|
||||
- Reusable across projects (extract to shared library)
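A brief usage sketch (import path assumed to match the file above) showing how the payment scenarios pick up priorities and feed the report:

```typescript
// Usage sketch: prioritize the payment scenarios and print the risk report
import { assessTestScenarios, paymentScenarios, generateRiskReport } from './risk-assessment';

const prioritized = assessTestScenarios(paymentScenarios);
console.log(generateRiskReport(prioritized));
// PAY-001 and PAY-002 (score 6) land at P1/MITIGATE;
// PAY-003 (score 2) and PAY-004 (score 3) land at P3/DOCUMENT
```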
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Dynamic Risk Re-Assessment (Continuous Evaluation)
|
||||
|
||||
**Context**: Recalculate risk scores as project evolves (requirements change, mitigations implemented)
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// src/testing/risk-tracking.ts
|
||||
import { type RiskAssessment, assessRisk, type Probability, type Impact } from './risk-matrix';
|
||||
|
||||
export type RiskHistory = {
|
||||
timestamp: Date;
|
||||
assessment: RiskAssessment;
|
||||
changedBy: string;
|
||||
reason: string;
|
||||
};
|
||||
|
||||
export type TrackedRisk = {
|
||||
id: string;
|
||||
title: string;
|
||||
feature: string;
|
||||
currentRisk: RiskAssessment;
|
||||
history: RiskHistory[];
|
||||
mitigations: string[];
|
||||
status: 'OPEN' | 'MITIGATED' | 'WAIVED' | 'RESOLVED';
|
||||
};
|
||||
|
||||
export class RiskTracker {
|
||||
private risks: Map<string, TrackedRisk> = new Map();
|
||||
|
||||
/**
|
||||
* Add new risk to tracker
|
||||
*/
|
||||
addRisk(params: {
|
||||
id: string;
|
||||
title: string;
|
||||
feature: string;
|
||||
probability: Probability;
|
||||
impact: Impact;
|
||||
reasoning: string;
|
||||
changedBy: string;
|
||||
}): TrackedRisk {
|
||||
const { id, title, feature, probability, impact, reasoning, changedBy } = params;
|
||||
|
||||
const assessment = assessRisk({ probability, impact, reasoning });
|
||||
|
||||
const risk: TrackedRisk = {
|
||||
id,
|
||||
title,
|
||||
feature,
|
||||
currentRisk: assessment,
|
||||
history: [
|
||||
{
|
||||
timestamp: new Date(),
|
||||
assessment,
|
||||
changedBy,
|
||||
reason: 'Initial assessment',
|
||||
},
|
||||
],
|
||||
mitigations: [],
|
||||
status: 'OPEN',
|
||||
};
|
||||
|
||||
this.risks.set(id, risk);
|
||||
return risk;
|
||||
}
|
||||
|
||||
/**
|
||||
* Reassess risk (probability or impact changed)
|
||||
*/
|
||||
reassessRisk(params: {
|
||||
id: string;
|
||||
probability?: Probability;
|
||||
impact?: Impact;
|
||||
reasoning: string;
|
||||
changedBy: string;
|
||||
}): TrackedRisk | null {
|
||||
const { id, probability, impact, reasoning, changedBy } = params;
|
||||
const risk = this.risks.get(id);
|
||||
if (!risk) return null;
|
||||
|
||||
// Use existing values if not provided
|
||||
const newProbability = probability ?? risk.currentRisk.probability;
|
||||
const newImpact = impact ?? risk.currentRisk.impact;
|
||||
|
||||
const newAssessment = assessRisk({
|
||||
probability: newProbability,
|
||||
impact: newImpact,
|
||||
reasoning,
|
||||
});
|
||||
|
||||
risk.currentRisk = newAssessment;
|
||||
risk.history.push({
|
||||
timestamp: new Date(),
|
||||
assessment: newAssessment,
|
||||
changedBy,
|
||||
reason: reasoning,
|
||||
});
|
||||
|
||||
this.risks.set(id, risk);
|
||||
return risk;
|
||||
}
|
||||
|
||||
/**
|
||||
* Mark risk as mitigated (probability reduced)
|
||||
*/
|
||||
mitigateRisk(params: { id: string; newProbability: Probability; mitigation: string; changedBy: string }): TrackedRisk | null {
|
||||
const { id, newProbability, mitigation, changedBy } = params;
|
||||
const risk = this.reassessRisk({
|
||||
id,
|
||||
probability: newProbability,
|
||||
reasoning: `Mitigation implemented: ${mitigation}`,
|
||||
changedBy,
|
||||
});
|
||||
|
||||
if (risk) {
|
||||
risk.mitigations.push(mitigation);
|
||||
if (risk.currentRisk.action === 'DOCUMENT' || risk.currentRisk.action === 'MONITOR') {
|
||||
risk.status = 'MITIGATED';
|
||||
}
|
||||
}
|
||||
|
||||
return risk;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get risks requiring action (MITIGATE or BLOCK)
|
||||
*/
|
||||
getRisksRequiringAction(): TrackedRisk[] {
|
||||
return Array.from(this.risks.values()).filter(
|
||||
(r) => r.status === 'OPEN' && (r.currentRisk.action === 'MITIGATE' || r.currentRisk.action === 'BLOCK'),
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate risk trend report (show changes over time)
|
||||
*/
|
||||
generateTrendReport(riskId: string): string | null {
|
||||
const risk = this.risks.get(riskId);
|
||||
if (!risk) return null;
|
||||
|
||||
return `
|
||||
# Risk Trend Report: ${risk.id}
|
||||
|
||||
**Title**: ${risk.title}
|
||||
**Feature**: ${risk.feature}
|
||||
**Status**: ${risk.status}
|
||||
|
||||
## Current Assessment
|
||||
- **Probability**: ${risk.currentRisk.probability}
|
||||
- **Impact**: ${risk.currentRisk.impact}
|
||||
- **Score**: ${risk.currentRisk.score}
|
||||
- **Action**: ${risk.currentRisk.action}
|
||||
- **Reasoning**: ${risk.currentRisk.reasoning}
|
||||
|
||||
## Mitigations Applied
|
||||
${risk.mitigations.length > 0 ? risk.mitigations.map((m) => `- ${m}`).join('\n') : '- None'}
|
||||
|
||||
## History (${risk.history.length} changes)
|
||||
${risk.history
|
||||
  .slice() // copy before reversing so the stored history is not mutated
  .reverse()
|
||||
.map((h) => `- **${h.timestamp.toISOString()}** by ${h.changedBy}: Score ${h.assessment.score} (${h.assessment.action}) - ${h.reason}`)
|
||||
.join('\n')}
|
||||
`.trim();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Historical tracking (audit trail for risk changes)
|
||||
- Mitigation impact tracking (probability reduction)
|
||||
- Status lifecycle (OPEN → MITIGATED → RESOLVED)
|
||||
- Trend reports (show risk evolution over time)
|
||||
- Re-assessment triggers (requirements change, new info)
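A short usage sketch (IDs, names, and reasoning are placeholders) showing the OPEN to MITIGATED lifecycle:

```typescript
// Usage sketch: register, mitigate, and report on a tracked risk
import { RiskTracker } from './risk-tracking';

const tracker = new RiskTracker();

tracker.addRisk({
  id: 'RISK-101',
  title: 'Checkout retries can double-charge customers',
  feature: 'Checkout',
  probability: 3,
  impact: 3, // score 9 -> BLOCK
  reasoning: 'No idempotency key on the payment endpoint',
  changedBy: 'qa-lead',
});

console.log(tracker.getRisksRequiringAction().length); // 1

tracker.mitigateRisk({
  id: 'RISK-101',
  newProbability: 1, // score drops to 3 -> DOCUMENT, status becomes MITIGATED
  mitigation: 'Idempotency keys plus a duplicate-charge integration test',
  changedBy: 'payments-team',
});

console.log(tracker.generateTrendReport('RISK-101'));
```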
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Risk Matrix in Gate Decision (Integration with Trace Workflow)
|
||||
|
||||
**Context**: Use probability-impact scores to drive gate decisions (PASS/CONCERNS/FAIL/WAIVED)
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// src/testing/gate-decision.ts
|
||||
import { type RiskScore, classifyRiskAction, type RiskAction } from './risk-matrix';
|
||||
import { type TrackedRisk } from './risk-tracking';
|
||||
|
||||
export type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';
|
||||
|
||||
export type GateResult = {
|
||||
decision: GateDecision;
|
||||
blockers: TrackedRisk[]; // Score=9, action=BLOCK
|
||||
concerns: TrackedRisk[]; // Score 6-8, action=MITIGATE
|
||||
monitored: TrackedRisk[]; // Score 4-5, action=MONITOR
|
||||
documented: TrackedRisk[]; // Score 1-3, action=DOCUMENT
|
||||
summary: string;
|
||||
};
|
||||
|
||||
/**
|
||||
* Evaluate gate based on risk assessments
|
||||
*/
|
||||
export function evaluateGateFromRisks(risks: TrackedRisk[]): GateResult {
|
||||
const blockers = risks.filter((r) => r.currentRisk.action === 'BLOCK' && r.status === 'OPEN');
|
||||
const concerns = risks.filter((r) => r.currentRisk.action === 'MITIGATE' && r.status === 'OPEN');
|
||||
const monitored = risks.filter((r) => r.currentRisk.action === 'MONITOR');
|
||||
const documented = risks.filter((r) => r.currentRisk.action === 'DOCUMENT');
|
||||
|
||||
let decision: GateDecision;
|
||||
|
||||
if (blockers.length > 0) {
|
||||
decision = 'FAIL';
|
||||
} else if (concerns.length > 0) {
|
||||
decision = 'CONCERNS';
|
||||
} else {
|
||||
decision = 'PASS';
|
||||
}
|
||||
|
||||
const summary = generateGateSummary({ decision, blockers, concerns, monitored, documented });
|
||||
|
||||
return { decision, blockers, concerns, monitored, documented, summary };
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate gate decision summary
|
||||
*/
|
||||
function generateGateSummary(result: Omit<GateResult, 'summary'>): string {
|
||||
const { decision, blockers, concerns, monitored, documented } = result;
|
||||
|
||||
const lines: string[] = [`## Gate Decision: ${decision}`];
|
||||
|
||||
if (decision === 'FAIL') {
|
||||
lines.push(`\n**Blockers** (${blockers.length}): Automatic FAIL until resolved or waived`);
|
||||
blockers.forEach((r) => {
|
||||
lines.push(`- **${r.id}**: ${r.title} (Score: ${r.currentRisk.score})`);
|
||||
lines.push(` - Probability: ${r.currentRisk.probability}, Impact: ${r.currentRisk.impact}`);
|
||||
lines.push(` - Reasoning: ${r.currentRisk.reasoning}`);
|
||||
});
|
||||
}
|
||||
|
||||
if (concerns.length > 0) {
|
||||
lines.push(`\n**Concerns** (${concerns.length}): Address before release`);
|
||||
concerns.forEach((r) => {
|
||||
lines.push(`- **${r.id}**: ${r.title} (Score: ${r.currentRisk.score})`);
|
||||
lines.push(` - Mitigations: ${r.mitigations.join(', ') || 'None'}`);
|
||||
});
|
||||
}
|
||||
|
||||
if (monitored.length > 0) {
|
||||
lines.push(`\n**Monitored** (${monitored.length}): Watch closely`);
|
||||
monitored.forEach((r) => lines.push(`- **${r.id}**: ${r.title} (Score: ${r.currentRisk.score})`));
|
||||
}
|
||||
|
||||
if (documented.length > 0) {
|
||||
lines.push(`\n**Documented** (${documented.length}): Awareness only`);
|
||||
}
|
||||
|
||||
lines.push(`\n---\n`);
|
||||
lines.push(`**Next Steps**:`);
|
||||
if (decision === 'FAIL') {
|
||||
lines.push(`- Resolve blockers or request formal waiver`);
|
||||
} else if (decision === 'CONCERNS') {
|
||||
lines.push(`- Implement mitigations for high-risk scenarios (score 6-8)`);
|
||||
lines.push(`- Re-run gate after mitigations`);
|
||||
} else {
|
||||
lines.push(`- Proceed with release`);
|
||||
}
|
||||
|
||||
return lines.join('\n');
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Gate decision driven by risk scores (not gut feeling)
|
||||
- Automatic FAIL for score=9 (blockers)
|
||||
- CONCERNS for score 6-8 (requires mitigation)
|
||||
- PASS only when no blockers/concerns
|
||||
- Actionable summary with next steps
|
||||
- Integration with trace workflow (Phase 2)
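A minimal sketch of wiring tracked risks into the gate, reusing the `RiskTracker` from Example 3; the values are illustrative:

```typescript
// Usage sketch: evaluate the gate from currently tracked risks
import { RiskTracker } from './risk-tracking';
import { evaluateGateFromRisks } from './gate-decision';

const tracker = new RiskTracker();
tracker.addRisk({
  id: 'RISK-201',
  title: 'PII written to logs on checkout errors',
  feature: 'Checkout',
  probability: 3,
  impact: 3, // score 9 -> BLOCK
  reasoning: 'Error handler serializes the full request body',
  changedBy: 'security-review',
});

const gate = evaluateGateFromRisks(tracker.getRisksRequiringAction());
console.log(gate.decision); // 'FAIL' while a score-9 blocker is OPEN
console.log(gate.summary);  // markdown summary with blockers and next steps
```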
|
||||
|
||||
---
|
||||
|
||||
## Probability-Impact Threshold Summary
|
||||
|
||||
| Score | Action | Gate Impact | Typical Use Case |
|
||||
| ----- | -------- | -------------------- | -------------------------------------- |
|
||||
| 1-3 | DOCUMENT | None | Cosmetic issues, low-priority bugs |
|
||||
| 4-5 | MONITOR | None (watch closely) | Edge cases, partial unknowns |
|
||||
| 6-8 | MITIGATE | CONCERNS at gate | High-impact scenarios needing coverage |
|
||||
| 9 | BLOCK | Automatic FAIL | Critical blockers, must resolve |
|
||||
|
||||
## Risk Assessment Checklist
|
||||
|
||||
Before deploying risk matrix:
|
||||
|
||||
- [ ] **Probability scale defined**: 1 (unlikely), 2 (possible), 3 (likely) with clear examples
|
||||
- [ ] **Impact scale defined**: 1 (minor), 2 (degraded), 3 (critical) with concrete criteria
|
||||
- [ ] **Threshold rules documented**: Score → Action mapping (1-3 = DOCUMENT, 4-5 = MONITOR, 6-8 = MITIGATE, 9 = BLOCK)
|
||||
- [ ] **Gate integration**: Risk scores drive gate decisions (PASS/CONCERNS/FAIL/WAIVED)
|
||||
- [ ] **Re-assessment process**: Risks re-evaluated as project evolves (requirements change, mitigations applied)
|
||||
- [ ] **Audit trail**: Historical tracking for risk changes (who, when, why)
|
||||
- [ ] **Mitigation tracking**: Link mitigations to probability reduction (quantify impact)
|
||||
- [ ] **Reporting**: Risk matrix visualization, trend reports, gate summaries
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*test-design` (initial risk assessment), `*trace` (gate decision Phase 2), `*nfr-assess` (security/performance risks)
|
||||
- **Related fragments**: `risk-governance.md` (risk scoring matrix, gate decision engine), `test-priorities-matrix.md` (P0-P3 mapping), `nfr-criteria.md` (impact assessment for NFRs)
|
||||
- **Tools**: TypeScript for type safety, markdown for reports, version control for audit trail
|
||||
|
||||
_Source: Murat risk model summary, gate decision patterns from production systems, probability-impact matrix from risk governance practices_
|
||||
|
||||
@@ -1,14 +1,615 @@
|
||||
# Risk Governance and Gatekeeping
|
||||
|
||||
- Score risk as probability (1–3) × impact (1–3); totals ≥6 demand mitigation before approval, 9 mandates a gate failure.
|
||||
- Classify risks across TECH, SEC, PERF, DATA, BUS, OPS. Document owners, mitigation plans, and deadlines for any score above 4.
|
||||
- Trace every acceptance criterion to implemented tests; missing coverage must be resolved or explicitly waived before release.
|
||||
- Gate decisions:
|
||||
- **PASS** – no critical issues remain and evidence is current.
|
||||
- **CONCERNS** – residual risk exists but has owners, actions, and timelines.
|
||||
- **FAIL** – critical issues unresolved or evidence missing.
|
||||
- **WAIVED** – risk accepted with documented approver, rationale, and expiry.
|
||||
- Maintain a gate history log capturing updates so auditors can follow the decision trail.
|
||||
- Use the probability/impact scale fragment for shared definitions when scoring teams run the matrix.
|
||||
## Principle
|
||||
|
||||
_Source: Murat risk governance notes, gate schema guidance._
|
||||
Risk governance transforms subjective "should we ship?" debates into objective, data-driven decisions. By scoring risk (probability × impact), classifying by category (TECH, SEC, PERF, etc.), and tracking mitigation ownership, teams create transparent quality gates that balance speed with safety.
|
||||
|
||||
## Rationale
|
||||
|
||||
**The Problem**: Without formal risk governance, releases become political—loud voices win, quiet risks hide, and teams discover critical issues in production. "We thought it was fine" isn't a release strategy.
|
||||
|
||||
**The Solution**: Risk scoring (1-3 scale for probability and impact, total 1-9) creates a shared language. Scores ≥6 demand documented mitigation; a score of 9 mandates gate failure. Every acceptance criterion maps to a test, and gaps require explicit waivers with owners and expiry dates.
|
||||
|
||||
**Why This Matters**:
|
||||
|
||||
- Removes ambiguity from release decisions (objective scores vs subjective opinions)
|
||||
- Creates audit trail for compliance (FDA, SOC2, ISO require documented risk management)
|
||||
- Identifies true blockers early (prevents last-minute production fires)
|
||||
- Distributes responsibility (owners, mitigation plans, deadlines for every risk >4)
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Risk Scoring Matrix with Automated Classification (TypeScript)
|
||||
|
||||
**Context**: Calculate risk scores automatically from test results and categorize by risk type
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// risk-scoring.ts - Risk classification and scoring system
|
||||
export const RISK_CATEGORIES = {
|
||||
TECH: 'TECH', // Technical debt, architecture fragility
|
||||
SEC: 'SEC', // Security vulnerabilities
|
||||
PERF: 'PERF', // Performance degradation
|
||||
DATA: 'DATA', // Data integrity, corruption
|
||||
BUS: 'BUS', // Business logic errors
|
||||
OPS: 'OPS', // Operational issues (deployment, monitoring)
|
||||
} as const;
|
||||
|
||||
export type RiskCategory = keyof typeof RISK_CATEGORIES;
|
||||
|
||||
export type RiskScore = {
|
||||
id: string;
|
||||
category: RiskCategory;
|
||||
title: string;
|
||||
description: string;
|
||||
probability: 1 | 2 | 3; // 1=Low, 2=Medium, 3=High
|
||||
impact: 1 | 2 | 3; // 1=Low, 2=Medium, 3=High
|
||||
score: number; // probability × impact (1-9)
|
||||
owner: string;
|
||||
mitigationPlan?: string;
|
||||
deadline?: Date;
|
||||
status: 'OPEN' | 'MITIGATED' | 'WAIVED' | 'ACCEPTED';
|
||||
waiverReason?: string;
|
||||
waiverApprover?: string;
|
||||
waiverExpiry?: Date;
|
||||
};
|
||||
|
||||
// Risk scoring rules
|
||||
export function calculateRiskScore(probability: 1 | 2 | 3, impact: 1 | 2 | 3): number {
|
||||
return probability * impact;
|
||||
}
|
||||
|
||||
export function requiresMitigation(score: number): boolean {
|
||||
return score >= 6; // Scores 6-9 demand action
|
||||
}
|
||||
|
||||
export function isCriticalBlocker(score: number): boolean {
|
||||
return score === 9; // Probability=3 AND Impact=3 → FAIL gate
|
||||
}
|
||||
|
||||
export function classifyRiskLevel(score: number): 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL' {
|
||||
if (score === 9) return 'CRITICAL';
|
||||
if (score >= 6) return 'HIGH';
|
||||
if (score >= 4) return 'MEDIUM';
|
||||
return 'LOW';
|
||||
}
|
||||
|
||||
// Example: Risk assessment from test failures
|
||||
export function assessTestFailureRisk(failure: {
|
||||
test: string;
|
||||
category: RiskCategory;
|
||||
affectedUsers: number;
|
||||
revenueImpact: number;
|
||||
securityVulnerability: boolean;
|
||||
}): RiskScore {
|
||||
// Probability based on test failure frequency (simplified)
|
||||
const probability: 1 | 2 | 3 = 3; // Test failed = High probability
|
||||
|
||||
// Impact based on business context
|
||||
let impact: 1 | 2 | 3 = 1;
|
||||
if (failure.securityVulnerability) impact = 3;
|
||||
else if (failure.revenueImpact > 10000) impact = 3;
|
||||
else if (failure.affectedUsers > 1000) impact = 2;
|
||||
else impact = 1;
|
||||
|
||||
const score = calculateRiskScore(probability, impact);
|
||||
|
||||
return {
|
||||
id: `risk-${Date.now()}`,
|
||||
category: failure.category,
|
||||
title: `Test failure: ${failure.test}`,
|
||||
description: `Affects ${failure.affectedUsers} users, $${failure.revenueImpact} revenue`,
|
||||
probability,
|
||||
impact,
|
||||
score,
|
||||
owner: 'unassigned',
|
||||
    status: 'OPEN', // every new risk starts OPEN; a score of 9 fails the gate until resolved
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Objective scoring**: Probability (1-3) × Impact (1-3) = Score (1-9)
|
||||
- **Clear thresholds**: Score ≥6 requires mitigation, score = 9 blocks release
|
||||
- **Business context**: Revenue, users, security drive impact calculation
|
||||
- **Status tracking**: OPEN → MITIGATED → WAIVED → ACCEPTED lifecycle
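A quick usage sketch, assuming the exports above; the failure details are illustrative:

```typescript
// Usage sketch: score a failing business-logic test
import { assessTestFailureRisk, classifyRiskLevel, requiresMitigation } from './risk-scoring';

const risk = assessTestFailureRisk({
  test: 'Checkout total recalculates after coupon removal',
  category: 'BUS',
  affectedUsers: 800,
  revenueImpact: 2000,
  securityVulnerability: false,
});

console.log(risk.score);                     // 3 (probability 3 x impact 1)
console.log(classifyRiskLevel(risk.score));  // 'LOW'
console.log(requiresMitigation(risk.score)); // false (document and move on)
```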
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Gate Decision Engine with Traceability Validation
|
||||
|
||||
**Context**: Automated gate decision based on risk scores and test coverage
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// gate-decision-engine.ts
import { type RiskScore } from './risk-scoring';
|
||||
export type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';
|
||||
|
||||
export type CoverageGap = {
|
||||
acceptanceCriteria: string;
|
||||
testMissing: string;
|
||||
reason: string;
|
||||
};
|
||||
|
||||
export type GateResult = {
|
||||
decision: GateDecision;
|
||||
timestamp: Date;
|
||||
criticalRisks: RiskScore[];
|
||||
highRisks: RiskScore[];
|
||||
coverageGaps: CoverageGap[];
|
||||
summary: string;
|
||||
recommendations: string[];
|
||||
};
|
||||
|
||||
export function evaluateGate(params: { risks: RiskScore[]; coverageGaps: CoverageGap[]; waiverApprover?: string }): GateResult {
|
||||
const { risks, coverageGaps, waiverApprover } = params;
|
||||
|
||||
// Categorize risks
|
||||
const criticalRisks = risks.filter((r) => r.score === 9 && r.status === 'OPEN');
|
||||
const highRisks = risks.filter((r) => r.score >= 6 && r.score < 9 && r.status === 'OPEN');
|
||||
const unresolvedGaps = coverageGaps.filter((g) => !g.reason);
|
||||
|
||||
// Decision logic
|
||||
let decision: GateDecision;
|
||||
|
||||
// FAIL: Critical blockers (score=9) or missing coverage
|
||||
if (criticalRisks.length > 0 || unresolvedGaps.length > 0) {
|
||||
decision = 'FAIL';
|
||||
}
|
||||
// WAIVED: All risks waived by authorized approver
|
||||
else if (risks.every((r) => r.status === 'WAIVED') && waiverApprover) {
|
||||
decision = 'WAIVED';
|
||||
}
|
||||
// CONCERNS: High risks (score 6-8) with mitigation plans
|
||||
else if (highRisks.length > 0 && highRisks.every((r) => r.mitigationPlan && r.owner !== 'unassigned')) {
|
||||
decision = 'CONCERNS';
|
||||
}
|
||||
// PASS: No critical issues, all risks mitigated or low
|
||||
else {
|
||||
decision = 'PASS';
|
||||
}
|
||||
|
||||
// Generate recommendations
|
||||
const recommendations: string[] = [];
|
||||
if (criticalRisks.length > 0) {
|
||||
recommendations.push(`🚨 ${criticalRisks.length} CRITICAL risk(s) must be mitigated before release`);
|
||||
}
|
||||
if (unresolvedGaps.length > 0) {
|
||||
recommendations.push(`📋 ${unresolvedGaps.length} acceptance criteria lack test coverage`);
|
||||
}
|
||||
if (highRisks.some((r) => !r.mitigationPlan)) {
|
||||
recommendations.push(`⚠️ High risks without mitigation plans: assign owners and deadlines`);
|
||||
}
|
||||
if (decision === 'PASS') {
|
||||
recommendations.push(`✅ All risks mitigated or acceptable. Ready for release.`);
|
||||
}
|
||||
|
||||
return {
|
||||
decision,
|
||||
timestamp: new Date(),
|
||||
criticalRisks,
|
||||
highRisks,
|
||||
coverageGaps: unresolvedGaps,
|
||||
summary: generateSummary(decision, risks, unresolvedGaps),
|
||||
recommendations,
|
||||
};
|
||||
}
|
||||
|
||||
function generateSummary(decision: GateDecision, risks: RiskScore[], gaps: CoverageGap[]): string {
|
||||
const total = risks.length;
|
||||
const critical = risks.filter((r) => r.score === 9).length;
|
||||
const high = risks.filter((r) => r.score >= 6 && r.score < 9).length;
|
||||
|
||||
return `Gate Decision: ${decision}. Total Risks: ${total} (${critical} critical, ${high} high). Coverage Gaps: ${gaps.length}.`;
|
||||
}
|
||||
```
|
||||
|
||||
**Usage Example**:
|
||||
|
||||
```typescript
|
||||
// Example: Running gate check before deployment
|
||||
import { assessTestFailureRisk, type RiskScore } from './risk-scoring';
import { evaluateGate, type CoverageGap } from './gate-decision-engine';
|
||||
|
||||
// Collect risks from test results
|
||||
const risks: RiskScore[] = [
|
||||
assessTestFailureRisk({
|
||||
test: 'Payment processing with expired card',
|
||||
category: 'BUS',
|
||||
affectedUsers: 5000,
|
||||
revenueImpact: 50000,
|
||||
securityVulnerability: false,
|
||||
}),
|
||||
assessTestFailureRisk({
|
||||
test: 'SQL injection in search endpoint',
|
||||
category: 'SEC',
|
||||
affectedUsers: 10000,
|
||||
revenueImpact: 0,
|
||||
securityVulnerability: true,
|
||||
}),
|
||||
];
|
||||
|
||||
// Identify coverage gaps
|
||||
const coverageGaps: CoverageGap[] = [
|
||||
{
|
||||
acceptanceCriteria: 'User can reset password via email',
|
||||
testMissing: 'e2e/auth/password-reset.spec.ts',
|
||||
reason: '', // Empty = unresolved
|
||||
},
|
||||
];
|
||||
|
||||
// Evaluate gate
|
||||
const gateResult = evaluateGate({ risks, coverageGaps });
|
||||
|
||||
console.log(gateResult.decision); // 'FAIL'
|
||||
console.log(gateResult.summary);
|
||||
// "Gate Decision: FAIL. Total Risks: 2 (1 critical, 1 high). Coverage Gaps: 1."
|
||||
|
||||
console.log(gateResult.recommendations);
|
||||
// [
|
||||
// "🚨 1 CRITICAL risk(s) must be mitigated before release",
|
||||
// "📋 1 acceptance criteria lack test coverage"
|
||||
// ]
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Automated decision**: No human interpretation required
|
||||
- **Clear criteria**: FAIL = critical risks or gaps, CONCERNS = high risks with plans, PASS = low risks
|
||||
- **Actionable output**: Recommendations drive next steps
|
||||
- **Audit trail**: Timestamp, decision, and context for compliance
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Risk Mitigation Workflow with Owner Tracking
|
||||
|
||||
**Context**: Track risk mitigation from identification to resolution
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// risk-mitigation.ts
import { type RiskScore, requiresMitigation } from './risk-scoring';
|
||||
export type MitigationAction = {
|
||||
riskId: string;
|
||||
action: string;
|
||||
owner: string;
|
||||
deadline: Date;
|
||||
status: 'PENDING' | 'IN_PROGRESS' | 'COMPLETED' | 'BLOCKED';
|
||||
completedAt?: Date;
|
||||
blockedReason?: string;
|
||||
};
|
||||
|
||||
export class RiskMitigationTracker {
|
||||
private risks: Map<string, RiskScore> = new Map();
|
||||
private actions: Map<string, MitigationAction[]> = new Map();
|
||||
private history: Array<{ riskId: string; event: string; timestamp: Date }> = [];
|
||||
|
||||
// Register a new risk
|
||||
addRisk(risk: RiskScore): void {
|
||||
this.risks.set(risk.id, risk);
|
||||
this.logHistory(risk.id, `Risk registered: ${risk.title} (Score: ${risk.score})`);
|
||||
|
||||
// Auto-assign mitigation requirements for score ≥6
|
||||
if (requiresMitigation(risk.score) && !risk.mitigationPlan) {
|
||||
this.logHistory(risk.id, `⚠️ Mitigation required (score ${risk.score}). Assign owner and plan.`);
|
||||
}
|
||||
}
|
||||
|
||||
// Add mitigation action
|
||||
addMitigationAction(action: MitigationAction): void {
|
||||
const risk = this.risks.get(action.riskId);
|
||||
if (!risk) throw new Error(`Risk ${action.riskId} not found`);
|
||||
|
||||
const existingActions = this.actions.get(action.riskId) || [];
|
||||
existingActions.push(action);
|
||||
this.actions.set(action.riskId, existingActions);
|
||||
|
||||
this.logHistory(action.riskId, `Mitigation action added: ${action.action} (Owner: ${action.owner})`);
|
||||
}
|
||||
|
||||
// Complete mitigation action
|
||||
completeMitigation(riskId: string, actionIndex: number): void {
|
||||
const actions = this.actions.get(riskId);
|
||||
if (!actions || !actions[actionIndex]) throw new Error('Action not found');
|
||||
|
||||
actions[actionIndex].status = 'COMPLETED';
|
||||
actions[actionIndex].completedAt = new Date();
|
||||
|
||||
this.logHistory(riskId, `Mitigation completed: ${actions[actionIndex].action}`);
|
||||
|
||||
// If all actions completed, mark risk as MITIGATED
|
||||
if (actions.every((a) => a.status === 'COMPLETED')) {
|
||||
const risk = this.risks.get(riskId)!;
|
||||
risk.status = 'MITIGATED';
|
||||
this.logHistory(riskId, `✅ Risk mitigated. All actions complete.`);
|
||||
}
|
||||
}
|
||||
|
||||
// Request waiver for a risk
|
||||
requestWaiver(riskId: string, reason: string, approver: string, expiryDays: number): void {
|
||||
const risk = this.risks.get(riskId);
|
||||
if (!risk) throw new Error(`Risk ${riskId} not found`);
|
||||
|
||||
risk.status = 'WAIVED';
|
||||
risk.waiverReason = reason;
|
||||
risk.waiverApprover = approver;
|
||||
risk.waiverExpiry = new Date(Date.now() + expiryDays * 24 * 60 * 60 * 1000);
|
||||
|
||||
this.logHistory(riskId, `⚠️ Waiver granted by ${approver}. Expires: ${risk.waiverExpiry}`);
|
||||
}
|
||||
|
||||
// Generate risk report
|
||||
generateReport(): string {
|
||||
const allRisks = Array.from(this.risks.values());
|
||||
const critical = allRisks.filter((r) => r.score === 9 && r.status === 'OPEN');
|
||||
const high = allRisks.filter((r) => r.score >= 6 && r.score < 9 && r.status === 'OPEN');
|
||||
const mitigated = allRisks.filter((r) => r.status === 'MITIGATED');
|
||||
const waived = allRisks.filter((r) => r.status === 'WAIVED');
|
||||
|
||||
let report = `# Risk Mitigation Report\n\n`;
|
||||
report += `**Generated**: ${new Date().toISOString()}\n\n`;
|
||||
report += `## Summary\n`;
|
||||
report += `- Total Risks: ${allRisks.length}\n`;
|
||||
report += `- Critical (Score=9, OPEN): ${critical.length}\n`;
|
||||
report += `- High (Score 6-8, OPEN): ${high.length}\n`;
|
||||
report += `- Mitigated: ${mitigated.length}\n`;
|
||||
report += `- Waived: ${waived.length}\n\n`;
|
||||
|
||||
if (critical.length > 0) {
|
||||
report += `## 🚨 Critical Risks (BLOCKERS)\n\n`;
|
||||
critical.forEach((r) => {
|
||||
report += `- **${r.title}** (${r.category})\n`;
|
||||
report += ` - Score: ${r.score} (Probability: ${r.probability}, Impact: ${r.impact})\n`;
|
||||
report += ` - Owner: ${r.owner}\n`;
|
||||
report += ` - Mitigation: ${r.mitigationPlan || 'NOT ASSIGNED'}\n\n`;
|
||||
});
|
||||
}
|
||||
|
||||
if (high.length > 0) {
|
||||
report += `## ⚠️ High Risks\n\n`;
|
||||
high.forEach((r) => {
|
||||
report += `- **${r.title}** (${r.category})\n`;
|
||||
report += ` - Score: ${r.score}\n`;
|
||||
report += ` - Owner: ${r.owner}\n`;
|
||||
report += ` - Deadline: ${r.deadline?.toISOString().split('T')[0] || 'NOT SET'}\n\n`;
|
||||
});
|
||||
}
|
||||
|
||||
return report;
|
||||
}
|
||||
|
||||
private logHistory(riskId: string, event: string): void {
|
||||
this.history.push({ riskId, event, timestamp: new Date() });
|
||||
}
|
||||
|
||||
getHistory(riskId: string): Array<{ event: string; timestamp: Date }> {
|
||||
return this.history.filter((h) => h.riskId === riskId).map((h) => ({ event: h.event, timestamp: h.timestamp }));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Usage Example**:
|
||||
|
||||
```typescript
|
||||
const tracker = new RiskMitigationTracker();
|
||||
|
||||
// Register critical security risk
|
||||
tracker.addRisk({
|
||||
id: 'risk-001',
|
||||
category: 'SEC',
|
||||
title: 'SQL injection vulnerability in user search',
|
||||
description: 'Unsanitized input allows arbitrary SQL execution',
|
||||
probability: 3,
|
||||
impact: 3,
|
||||
score: 9,
|
||||
owner: 'security-team',
|
||||
status: 'OPEN',
|
||||
});
|
||||
|
||||
// Add mitigation actions
|
||||
tracker.addMitigationAction({
|
||||
riskId: 'risk-001',
|
||||
action: 'Add parameterized queries to user-search endpoint',
|
||||
owner: 'alice@example.com',
|
||||
deadline: new Date('2025-10-20'),
|
||||
status: 'IN_PROGRESS',
|
||||
});
|
||||
|
||||
tracker.addMitigationAction({
|
||||
riskId: 'risk-001',
|
||||
action: 'Add WAF rule to block SQL injection patterns',
|
||||
owner: 'bob@example.com',
|
||||
deadline: new Date('2025-10-22'),
|
||||
status: 'PENDING',
|
||||
});
|
||||
|
||||
// Complete first action
|
||||
tracker.completeMitigation('risk-001', 0);
|
||||
|
||||
// Generate report
|
||||
console.log(tracker.generateReport());
|
||||
// Markdown report with critical risks, owners, deadlines
|
||||
|
||||
// View history
|
||||
console.log(tracker.getHistory('risk-001'));
|
||||
// [
|
||||
// { event: 'Risk registered: SQL injection...', timestamp: ... },
|
||||
// { event: 'Mitigation action added: Add parameterized queries...', timestamp: ... },
|
||||
// { event: 'Mitigation completed: Add parameterized queries...', timestamp: ... }
|
||||
// ]
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Ownership enforcement**: Every risk >4 requires owner assignment
|
||||
- **Deadline tracking**: Mitigation actions have explicit deadlines
|
||||
- **Audit trail**: Complete history of risk lifecycle (registered → mitigated)
|
||||
- **Automated reports**: Markdown output for Confluence/GitHub wikis
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Coverage Traceability Matrix (Test-to-Requirement Mapping)
|
||||
|
||||
**Context**: Validate that every acceptance criterion maps to at least one test
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// coverage-traceability.ts
|
||||
export type AcceptanceCriterion = {
|
||||
id: string;
|
||||
story: string;
|
||||
criterion: string;
|
||||
priority: 'P0' | 'P1' | 'P2' | 'P3';
|
||||
};
|
||||
|
||||
export type TestCase = {
|
||||
file: string;
|
||||
name: string;
|
||||
criteriaIds: string[]; // Links to acceptance criteria
|
||||
};
|
||||
|
||||
export type CoverageMatrix = {
|
||||
criterion: AcceptanceCriterion;
|
||||
tests: TestCase[];
|
||||
covered: boolean;
|
||||
waiverReason?: string;
|
||||
};
|
||||
|
||||
export function buildCoverageMatrix(criteria: AcceptanceCriterion[], tests: TestCase[]): CoverageMatrix[] {
|
||||
return criteria.map((criterion) => {
|
||||
const matchingTests = tests.filter((t) => t.criteriaIds.includes(criterion.id));
|
||||
|
||||
return {
|
||||
criterion,
|
||||
tests: matchingTests,
|
||||
covered: matchingTests.length > 0,
|
||||
};
|
||||
});
|
||||
}
|
||||
|
||||
export function validateCoverage(matrix: CoverageMatrix[]): {
|
||||
gaps: CoverageMatrix[];
|
||||
passRate: number;
|
||||
} {
|
||||
const gaps = matrix.filter((m) => !m.covered && !m.waiverReason);
|
||||
const passRate = ((matrix.length - gaps.length) / matrix.length) * 100;
|
||||
|
||||
return { gaps, passRate };
|
||||
}
|
||||
|
||||
// Example: Extract criteria IDs from test names
|
||||
export function extractCriteriaFromTests(testFiles: string[]): TestCase[] {
|
||||
// Simplified: In real implementation, parse test files with AST
|
||||
// Here we simulate extraction from test names
|
||||
return [
|
||||
{
|
||||
file: 'tests/e2e/auth/login.spec.ts',
|
||||
name: 'should allow user to login with valid credentials',
|
||||
criteriaIds: ['AC-001', 'AC-002'], // Linked to acceptance criteria
|
||||
},
|
||||
{
|
||||
file: 'tests/e2e/auth/password-reset.spec.ts',
|
||||
name: 'should send password reset email',
|
||||
criteriaIds: ['AC-003'],
|
||||
},
|
||||
];
|
||||
}
|
||||
|
||||
// Generate Markdown traceability report
|
||||
export function generateTraceabilityReport(matrix: CoverageMatrix[]): string {
|
||||
let report = `# Requirements-to-Tests Traceability Matrix\n\n`;
|
||||
report += `**Generated**: ${new Date().toISOString()}\n\n`;
|
||||
|
||||
const { gaps, passRate } = validateCoverage(matrix);
|
||||
|
||||
report += `## Summary\n`;
|
||||
report += `- Total Criteria: ${matrix.length}\n`;
|
||||
report += `- Covered: ${matrix.filter((m) => m.covered).length}\n`;
|
||||
report += `- Gaps: ${gaps.length}\n`;
|
||||
report += `- Waived: ${matrix.filter((m) => m.waiverReason).length}\n`;
|
||||
report += `- Coverage Rate: ${passRate.toFixed(1)}%\n\n`;
|
||||
|
||||
if (gaps.length > 0) {
|
||||
report += `## ❌ Coverage Gaps (MUST RESOLVE)\n\n`;
|
||||
report += `| Story | Criterion | Priority | Tests |\n`;
|
||||
report += `|-------|-----------|----------|-------|\n`;
|
||||
gaps.forEach((m) => {
|
||||
report += `| ${m.criterion.story} | ${m.criterion.criterion} | ${m.criterion.priority} | None |\n`;
|
||||
});
|
||||
report += `\n`;
|
||||
}
|
||||
|
||||
report += `## ✅ Covered Criteria\n\n`;
|
||||
report += `| Story | Criterion | Tests |\n`;
|
||||
report += `|-------|-----------|-------|\n`;
|
||||
matrix
|
||||
.filter((m) => m.covered)
|
||||
.forEach((m) => {
|
||||
const testList = m.tests.map((t) => `\`${t.file}\``).join(', ');
|
||||
report += `| ${m.criterion.story} | ${m.criterion.criterion} | ${testList} |\n`;
|
||||
});
|
||||
|
||||
return report;
|
||||
}
|
||||
```
|
||||
|
||||
**Usage Example**:
|
||||
|
||||
```typescript
|
||||
// Define acceptance criteria
|
||||
const criteria: AcceptanceCriterion[] = [
|
||||
{ id: 'AC-001', story: 'US-123', criterion: 'User can login with email', priority: 'P0' },
|
||||
{ id: 'AC-002', story: 'US-123', criterion: 'User sees error on invalid password', priority: 'P0' },
|
||||
{ id: 'AC-003', story: 'US-124', criterion: 'User receives password reset email', priority: 'P1' },
|
||||
{ id: 'AC-004', story: 'US-125', criterion: 'User can update profile', priority: 'P2' }, // NO TEST
|
||||
];
|
||||
|
||||
// Extract tests
|
||||
const tests: TestCase[] = extractCriteriaFromTests(['tests/e2e/auth/login.spec.ts', 'tests/e2e/auth/password-reset.spec.ts']);
|
||||
|
||||
// Build matrix
|
||||
const matrix = buildCoverageMatrix(criteria, tests);
|
||||
|
||||
// Validate
|
||||
const { gaps, passRate } = validateCoverage(matrix);
|
||||
console.log(`Coverage: ${passRate.toFixed(1)}%`); // "Coverage: 75.0%"
|
||||
console.log(`Gaps: ${gaps.length}`); // "Gaps: 1" (AC-004 has no test)
|
||||
|
||||
// Generate report
|
||||
const report = generateTraceabilityReport(matrix);
|
||||
console.log(report);
|
||||
// Markdown table showing coverage gaps
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Bidirectional traceability**: Criteria → Tests and Tests → Criteria
|
||||
- **Gap detection**: Automatically identifies missing coverage
|
||||
- **Priority awareness**: P0 gaps are critical blockers
|
||||
- **Waiver support**: Allow explicit waivers for low-priority gaps
|
||||
|
||||
---
|
||||
|
||||
## Risk Governance Checklist
|
||||
|
||||
Before deploying to production, ensure:
|
||||
|
||||
- [ ] **Risk scoring complete**: All identified risks scored (Probability × Impact)
|
||||
- [ ] **Ownership assigned**: Every risk >4 has owner, mitigation plan, deadline
|
||||
- [ ] **Coverage validated**: Every acceptance criterion maps to at least one test
|
||||
- [ ] **Gate decision documented**: PASS/CONCERNS/FAIL/WAIVED with rationale
|
||||
- [ ] **Waivers approved**: All waivers have approver, reason, expiry date
|
||||
- [ ] **Audit trail captured**: Risk history log available for compliance review
|
||||
- [ ] **Traceability matrix**: Requirements-to-tests mapping up to date
|
||||
- [ ] **Critical risks resolved**: No score=9 risks in OPEN status
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*trace` (Phase 2: gate decision), `*nfr-assess` (risk scoring), `*test-design` (risk identification)
|
||||
- **Related fragments**: `probability-impact.md` (scoring definitions), `test-priorities-matrix.md` (P0-P3 classification), `nfr-criteria.md` (non-functional risks)
|
||||
- **Tools**: Risk tracking dashboards (Jira, Linear), gate automation (CI/CD), traceability reports (Markdown, Confluence)
|
||||
|
||||
_Source: Murat risk governance notes, gate schema guidance, SEON production gate workflows, ISO 31000 risk management standards_
|
||||
|
||||
@@ -1,9 +1,732 @@
|
||||
# Selective and Targeted Test Execution
|
||||
|
||||
- Use tags/grep (`--grep "@smoke"`, `--grep "@critical"`) to slice suites by risk, not directory.
|
||||
- Filter by spec patterns (`--spec "**/*checkout*"`) or git diff (`npm run test:changed`) to focus on impacted areas.
|
||||
- Combine priority metadata (P0–P3) with change detection to decide which levels to run pre-commit vs. in CI.
|
||||
- Record burn-in history for newly added specs; promote to main suite only after consistent green runs.
|
||||
- Document the selection strategy in README/CI so the team understands when full regression is mandatory.
|
||||
## Principle
|
||||
|
||||
_Source: 32+ selective testing strategies blog, Murat testing philosophy._
|
||||
Run only the tests you need, when you need them. Use tags/grep to slice suites by risk priority (not directory structure), filter by spec patterns or git diff to focus on impacted areas, and combine priority metadata (P0-P3) with change detection to optimize pre-commit vs. CI execution. Document the selection strategy clearly so teams understand when full regression is mandatory.
|
||||
|
||||
## Rationale
|
||||
|
||||
Running the entire test suite on every commit wastes time and resources. Smart test selection provides fast feedback (smoke tests in minutes, full regression in hours) while maintaining confidence. The "32+ ways of selective testing" philosophy balances speed with coverage: quick loops for developers, comprehensive validation before deployment. Poorly documented selection leads to confusion about when tests run and why.
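The `npm run test:changed` slice mentioned in the bullets above can be a thin wrapper over `git diff`; a minimal sketch, assuming it is wired to an npm script such as `"test:changed": "bash scripts/test-changed.sh"` and that `origin/main` is the comparison branch:

```bash
#!/bin/bash
# scripts/test-changed.sh - run only specs touched since main (sketch)
set -e

# Spec files modified on this branch
CHANGED_SPECS=$(git diff --name-only origin/main...HEAD | grep -E '\.spec\.ts$' || true)

if [ -z "$CHANGED_SPECS" ]; then
  echo "No changed spec files - falling back to the @smoke suite"
  npx playwright test --grep "@smoke"
else
  echo "Running changed specs:"
  echo "$CHANGED_SPECS"
  npx playwright test $CHANGED_SPECS
fi
```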
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Tag-Based Execution with Priority Levels
|
||||
|
||||
**Context**: Organize tests by risk priority and execution stage using grep/tag patterns.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/e2e/checkout.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Tag-based test organization
|
||||
* - @smoke: Critical path tests (run on every commit, < 5 min)
|
||||
* - @regression: Full test suite (run pre-merge, < 30 min)
|
||||
* - @p0: Critical business functions (payment, auth, data integrity)
|
||||
* - @p1: Core features (primary user journeys)
|
||||
* - @p2: Secondary features (supporting functionality)
|
||||
* - @p3: Nice-to-have (cosmetic, non-critical)
|
||||
*/
|
||||
|
||||
test.describe('Checkout Flow', () => {
|
||||
// P0 + Smoke: Must run on every commit
|
||||
test('@smoke @p0 should complete purchase with valid payment', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('card-number').fill('4242424242424242');
|
||||
await page.getByTestId('submit-payment').click();
|
||||
|
||||
await expect(page.getByTestId('order-confirmation')).toBeVisible();
|
||||
});
|
||||
|
||||
// P0 but not smoke: Run pre-merge
|
||||
test('@regression @p0 should handle payment decline gracefully', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('card-number').fill('4000000000000002'); // Decline card
|
||||
await page.getByTestId('submit-payment').click();
|
||||
|
||||
await expect(page.getByTestId('payment-error')).toBeVisible();
|
||||
await expect(page.getByTestId('payment-error')).toContainText('declined');
|
||||
});
|
||||
|
||||
// P1 + Smoke: Important but not critical
|
||||
test('@smoke @p1 should apply discount code', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('promo-code').fill('SAVE10');
|
||||
await page.getByTestId('apply-promo').click();
|
||||
|
||||
await expect(page.getByTestId('discount-applied')).toBeVisible();
|
||||
});
|
||||
|
||||
// P2: Run in full regression only
|
||||
test('@regression @p2 should remember saved payment methods', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
await expect(page.getByTestId('saved-cards')).toBeVisible();
|
||||
});
|
||||
|
||||
// P3: Low priority, run nightly or weekly
|
||||
test('@nightly @p3 should display checkout page analytics', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
const analyticsEvents = await page.evaluate(() => (window as any).__ANALYTICS__);
|
||||
expect(analyticsEvents).toBeDefined();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**package.json scripts**:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"test": "playwright test",
|
||||
"test:smoke": "playwright test --grep '@smoke'",
|
||||
"test:p0": "playwright test --grep '@p0'",
|
||||
"test:p0-p1": "playwright test --grep '@p0|@p1'",
|
||||
"test:regression": "playwright test --grep '@regression'",
|
||||
"test:nightly": "playwright test --grep '@nightly'",
|
||||
"test:not-slow": "playwright test --grep-invert '@slow'",
|
||||
"test:critical-smoke": "playwright test --grep '@smoke.*@p0'"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Cypress equivalent**:
|
||||
|
||||
```javascript
|
||||
// cypress/e2e/checkout.cy.ts
|
||||
describe('Checkout Flow', { tags: ['@checkout'] }, () => {
|
||||
it('should complete purchase', { tags: ['@smoke', '@p0'] }, () => {
|
||||
cy.visit('/checkout');
|
||||
cy.get('[data-cy="card-number"]').type('4242424242424242');
|
||||
cy.get('[data-cy="submit-payment"]').click();
|
||||
cy.get('[data-cy="order-confirmation"]').should('be.visible');
|
||||
});
|
||||
|
||||
it('should handle decline', { tags: ['@regression', '@p0'] }, () => {
|
||||
cy.visit('/checkout');
|
||||
cy.get('[data-cy="card-number"]').type('4000000000000002');
|
||||
cy.get('[data-cy="submit-payment"]').click();
|
||||
cy.get('[data-cy="payment-error"]').should('be.visible');
|
||||
});
|
||||
});
|
||||
|
||||
// cypress.config.ts
|
||||
export default defineConfig({
|
||||
e2e: {
|
||||
env: {
|
||||
grepTags: process.env.GREP_TAGS || '',
|
||||
grepFilterSpecs: true,
|
||||
},
|
||||
setupNodeEvents(on, config) {
|
||||
require('@cypress/grep/src/plugin')(config);
|
||||
return config;
|
||||
},
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
**Usage**:
|
||||
|
||||
```bash
|
||||
# Playwright
|
||||
npm run test:smoke # Run all @smoke tests
|
||||
npm run test:p0 # Run all P0 tests
|
||||
npm run test -- --grep "@smoke.*@p0" # Run tests with BOTH tags
|
||||
|
||||
# Cypress (with @cypress/grep plugin)
|
||||
npx cypress run --env grepTags="@smoke"
|
||||
npx cypress run --env grepTags="@p0+@smoke" # AND logic
|
||||
npx cypress run --env grepTags="@p0 @p1" # OR logic
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Multiple tags per test**: Combine priority (@p0) with stage (@smoke)
|
||||
- **AND/OR logic**: Grep supports complex filtering
|
||||
- **Clear naming**: Tags document test importance
|
||||
- **Fast feedback**: @smoke runs < 5 min, full suite < 30 min
|
||||
- **CI integration**: Different jobs run different tag combinations
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Spec Filter Pattern (File-Based Selection)
|
||||
|
||||
**Context**: Run tests by file path pattern or directory for targeted execution.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# scripts/selective-spec-runner.sh
|
||||
# Run tests based on spec file patterns
|
||||
|
||||
set -e
|
||||
|
||||
PATTERN=${1:-"**/*.spec.ts"}
|
||||
TEST_ENV=${TEST_ENV:-local}
|
||||
|
||||
echo "🎯 Selective Spec Runner"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "Pattern: $PATTERN"
|
||||
echo "Environment: $TEST_ENV"
|
||||
echo ""
|
||||
|
||||
# Pattern examples and their use cases
|
||||
case "$PATTERN" in
|
||||
"**/checkout*")
|
||||
echo "📦 Running checkout-related tests"
|
||||
    npx playwright test checkout # positional args are matched against spec file paths
|
||||
;;
|
||||
"**/auth*"|"**/login*"|"**/signup*")
|
||||
echo "🔐 Running authentication tests"
|
||||
    npx playwright test auth login signup # multiple filters match any of the named specs
|
||||
;;
|
||||
"tests/e2e/**")
|
||||
echo "🌐 Running all E2E tests"
|
||||
npx playwright test tests/e2e/
|
||||
;;
|
||||
"tests/integration/**")
|
||||
echo "🔌 Running all integration tests"
|
||||
npx playwright test tests/integration/
|
||||
;;
|
||||
"tests/component/**")
|
||||
echo "🧩 Running all component tests"
|
||||
npx playwright test tests/component/
|
||||
;;
|
||||
*)
|
||||
echo "🔍 Running tests matching pattern: $PATTERN"
|
||||
npx playwright test "$PATTERN"
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
**Playwright config for file filtering**:
|
||||
|
||||
```typescript
|
||||
// playwright.config.ts
|
||||
import { defineConfig, devices } from '@playwright/test';
|
||||
|
||||
export default defineConfig({
|
||||
// ... other config
|
||||
|
||||
// Project-based organization
|
||||
projects: [
|
||||
{
|
||||
name: 'smoke',
|
||||
testMatch: /.*smoke.*\.spec\.ts/,
|
||||
retries: 0,
|
||||
},
|
||||
{
|
||||
name: 'e2e',
|
||||
testMatch: /tests\/e2e\/.*\.spec\.ts/,
|
||||
retries: 2,
|
||||
},
|
||||
{
|
||||
name: 'integration',
|
||||
testMatch: /tests\/integration\/.*\.spec\.ts/,
|
||||
retries: 1,
|
||||
},
|
||||
{
|
||||
name: 'component',
|
||||
testMatch: /tests\/component\/.*\.spec\.ts/,
|
||||
use: { ...devices['Desktop Chrome'] },
|
||||
},
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
**Advanced pattern matching**:
|
||||
|
||||
```typescript
|
||||
// scripts/run-by-component.ts
|
||||
/**
|
||||
* Run tests related to specific component(s)
|
||||
* Usage: npm run test:component UserProfile,Settings
|
||||
*/
|
||||
|
||||
import { execSync } from 'child_process';
|
||||
|
||||
const components = process.argv[2]?.split(',') || [];
|
||||
|
||||
if (components.length === 0) {
|
||||
console.error('❌ No components specified');
|
||||
console.log('Usage: npm run test:component UserProfile,Settings');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Playwright matches CLI args against spec file paths, so component names can be passed directly
const patterns = components.join(' ');
|
||||
|
||||
console.log(`🧩 Running tests for components: ${components.join(', ')}`);
|
||||
console.log(`Patterns: ${patterns}`);
|
||||
|
||||
try {
|
||||
execSync(`npx playwright test ${patterns}`, {
|
||||
stdio: 'inherit',
|
||||
env: { ...process.env, CI: 'false' },
|
||||
});
|
||||
} catch (error) {
|
||||
process.exit(1);
|
||||
}
|
||||
```
|
||||
|
||||
**package.json scripts**:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"test:checkout": "playwright test **/checkout*.spec.ts",
|
||||
"test:auth": "playwright test **/auth*.spec.ts **/login*.spec.ts",
|
||||
"test:e2e": "playwright test tests/e2e/",
|
||||
"test:integration": "playwright test tests/integration/",
|
||||
"test:component": "ts-node scripts/run-by-component.ts",
|
||||
"test:project": "playwright test --project",
|
||||
"test:smoke-project": "playwright test --project smoke"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Glob patterns**: Wildcards match file paths flexibly
|
||||
- **Project isolation**: Separate projects have different configs
|
||||
- **Component targeting**: Run tests for specific features
|
||||
- **Directory-based**: Organize tests by type (e2e, integration, component)
|
||||
- **CI optimization**: Run subsets in parallel CI jobs
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Diff-Based Test Selection (Changed Files Only)
|
||||
|
||||
**Context**: Run only tests affected by code changes for maximum speed.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# scripts/test-changed-files.sh
|
||||
# Intelligent test selection based on git diff
|
||||
|
||||
set -e
|
||||
|
||||
BASE_BRANCH=${BASE_BRANCH:-main}
|
||||
TEST_ENV=${TEST_ENV:-local}
|
||||
|
||||
echo "🔍 Changed File Test Selector"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "Base branch: $BASE_BRANCH"
|
||||
echo "Environment: $TEST_ENV"
|
||||
echo ""
|
||||
|
||||
# Get changed files
|
||||
CHANGED_FILES=$(git diff --name-only $BASE_BRANCH...HEAD)
|
||||
|
||||
if [ -z "$CHANGED_FILES" ]; then
|
||||
echo "✅ No files changed. Skipping tests."
|
||||
exit 0
|
||||
fi
|
||||
|
||||
echo "Changed files:"
|
||||
echo "$CHANGED_FILES" | sed 's/^/ - /'
|
||||
echo ""
|
||||
|
||||
# Arrays to collect test specs
|
||||
DIRECT_TEST_FILES=()
|
||||
RELATED_TEST_FILES=()
|
||||
RUN_ALL_TESTS=false
|
||||
|
||||
# Process each changed file
|
||||
while IFS= read -r file; do
|
||||
case "$file" in
|
||||
# Changed test files: run them directly
|
||||
*.spec.ts|*.spec.js|*.test.ts|*.test.js|*.cy.ts|*.cy.js)
|
||||
DIRECT_TEST_FILES+=("$file")
|
||||
;;
|
||||
|
||||
# Critical config changes: run ALL tests
|
||||
package.json|package-lock.json|playwright.config.ts|cypress.config.ts|tsconfig.json|.github/workflows/*)
|
||||
echo "⚠️ Critical file changed: $file"
|
||||
RUN_ALL_TESTS=true
|
||||
break
|
||||
;;
|
||||
|
||||
# Component changes: find related tests
|
||||
src/components/*.tsx|src/components/*.jsx)
|
||||
COMPONENT_NAME=$(basename "$file" | sed 's/\.[^.]*$//')
|
||||
echo "🧩 Component changed: $COMPONENT_NAME"
|
||||
|
||||
# Find tests matching component name
|
||||
FOUND_TESTS=$(find tests -name "*${COMPONENT_NAME}*.spec.ts" -o -name "*${COMPONENT_NAME}*.cy.ts" 2>/dev/null || true)
|
||||
if [ -n "$FOUND_TESTS" ]; then
|
||||
while IFS= read -r test_file; do
|
||||
RELATED_TEST_FILES+=("$test_file")
|
||||
done <<< "$FOUND_TESTS"
|
||||
fi
|
||||
;;
|
||||
|
||||
# Utility/lib changes: run integration + unit tests
|
||||
src/utils/*|src/lib/*|src/helpers/*)
|
||||
echo "⚙️ Utility file changed: $file"
|
||||
RELATED_TEST_FILES+=($(find tests/unit tests/integration -name "*.spec.ts" 2>/dev/null || true))
|
||||
;;
|
||||
|
||||
# API changes: run integration + e2e tests
|
||||
src/api/*|src/services/*|src/controllers/*)
|
||||
echo "🔌 API file changed: $file"
|
||||
RELATED_TEST_FILES+=($(find tests/integration tests/e2e -name "*.spec.ts" 2>/dev/null || true))
|
||||
;;
|
||||
|
||||
# Type changes: run all TypeScript tests
|
||||
*.d.ts|src/types/*)
|
||||
echo "📝 Type definition changed: $file"
|
||||
RUN_ALL_TESTS=true
|
||||
break
|
||||
;;
|
||||
|
||||
# Documentation only: skip tests
|
||||
*.md|docs/*|README*)
|
||||
echo "📄 Documentation changed: $file (no tests needed)"
|
||||
;;
|
||||
|
||||
*)
|
||||
echo "❓ Unclassified change: $file (running smoke tests)"
|
||||
RELATED_TEST_FILES+=($(find tests -name "*smoke*.spec.ts" 2>/dev/null || true))
|
||||
;;
|
||||
esac
|
||||
done <<< "$CHANGED_FILES"
|
||||
|
||||
# Execute tests based on analysis
|
||||
if [ "$RUN_ALL_TESTS" = true ]; then
|
||||
echo ""
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "🚨 Running FULL test suite (critical changes detected)"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
npm run test
|
||||
exit $?
|
||||
fi
|
||||
|
||||
# Combine and deduplicate test files
|
||||
ALL_TEST_FILES=(${DIRECT_TEST_FILES[@]} ${RELATED_TEST_FILES[@]})
|
||||
UNIQUE_TEST_FILES=($(echo "${ALL_TEST_FILES[@]}" | tr ' ' '\n' | sort -u))
|
||||
|
||||
if [ ${#UNIQUE_TEST_FILES[@]} -eq 0 ]; then
|
||||
echo ""
|
||||
echo "✅ No tests found for changed files. Running smoke tests."
|
||||
npm run test:smoke
|
||||
exit $?
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "🎯 Running ${#UNIQUE_TEST_FILES[@]} test file(s)"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
|
||||
for test_file in "${UNIQUE_TEST_FILES[@]}"; do
|
||||
echo " - $test_file"
|
||||
done
|
||||
|
||||
echo ""
|
||||
npm run test -- "${UNIQUE_TEST_FILES[@]}"
|
||||
```
|
||||
|
||||
**GitHub Actions integration**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/test-changed.yml
|
||||
name: Test Changed Files
|
||||
on:
|
||||
pull_request:
|
||||
types: [opened, synchronize, reopened]
|
||||
|
||||
jobs:
|
||||
detect-and-test:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
fetch-depth: 0 # Full history for accurate diff
|
||||
|
||||
- name: Get changed files
|
||||
id: changed-files
|
||||
uses: tj-actions/changed-files@v40
|
||||
with:
|
||||
files: |
|
||||
src/**
|
||||
tests/**
|
||||
*.config.ts
|
||||
files_ignore: |
|
||||
**/*.md
|
||||
docs/**
|
||||
|
||||
- name: Run tests for changed files
|
||||
if: steps.changed-files.outputs.any_changed == 'true'
|
||||
run: |
|
||||
echo "Changed files: ${{ steps.changed-files.outputs.all_changed_files }}"
|
||||
bash scripts/test-changed-files.sh
|
||||
env:
|
||||
BASE_BRANCH: ${{ github.base_ref }}
|
||||
TEST_ENV: staging
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- **Intelligent mapping**: Code changes → related tests
|
||||
- **Critical file detection**: Config changes = full suite
|
||||
- **Component mapping**: UI changes → component + E2E tests
|
||||
- **Fast feedback**: Run only what's needed (< 2 min typical)
|
||||
- **Safety net**: Unrecognized changes run smoke tests
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Promotion Rules (Pre-Commit → CI → Staging → Production)
|
||||
|
||||
**Context**: Progressive test execution strategy across deployment stages.
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// scripts/test-promotion-strategy.ts
|
||||
/**
|
||||
* Test Promotion Strategy
|
||||
* Defines which tests run at each stage of the development lifecycle
|
||||
*/
|
||||
|
||||
export type TestStage = 'pre-commit' | 'ci-pr' | 'ci-merge' | 'staging' | 'production';
|
||||
|
||||
export type TestPromotion = {
|
||||
stage: TestStage;
|
||||
description: string;
|
||||
testCommand: string;
|
||||
timebudget: string; // minutes
|
||||
required: boolean;
|
||||
failureAction: 'block' | 'warn' | 'alert';
|
||||
};
|
||||
|
||||
export const TEST_PROMOTION_RULES: Record<TestStage, TestPromotion> = {
|
||||
'pre-commit': {
|
||||
stage: 'pre-commit',
|
||||
description: 'Local developer checks before git commit',
|
||||
testCommand: 'npm run test:smoke',
|
||||
timebudget: '2',
|
||||
required: true,
|
||||
failureAction: 'block',
|
||||
},
|
||||
'ci-pr': {
|
||||
stage: 'ci-pr',
|
||||
description: 'CI checks on pull request creation/update',
|
||||
testCommand: 'npm run test:changed && npm run test:p0-p1',
|
||||
timebudget: '10',
|
||||
required: true,
|
||||
failureAction: 'block',
|
||||
},
|
||||
'ci-merge': {
|
||||
stage: 'ci-merge',
|
||||
description: 'Full regression before merge to main',
|
||||
testCommand: 'npm run test:regression',
|
||||
timebudget: '30',
|
||||
required: true,
|
||||
failureAction: 'block',
|
||||
},
|
||||
staging: {
|
||||
stage: 'staging',
|
||||
description: 'Post-deployment validation in staging environment',
|
||||
testCommand: 'npm run test:e2e -- --grep "@smoke"',
|
||||
timebudget: '15',
|
||||
required: true,
|
||||
failureAction: 'block',
|
||||
},
|
||||
production: {
|
||||
stage: 'production',
|
||||
description: 'Production smoke tests post-deployment',
|
||||
testCommand: 'npm run test:e2e:prod -- --grep "@smoke.*@p0"',
|
||||
timebudget: '5',
|
||||
required: false,
|
||||
failureAction: 'alert',
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Get tests to run for a specific stage
|
||||
*/
|
||||
export function getTestsForStage(stage: TestStage): TestPromotion {
|
||||
return TEST_PROMOTION_RULES[stage];
|
||||
}
|
||||
|
||||
/**
|
||||
* Validate if tests can be promoted to next stage
|
||||
*/
|
||||
export function canPromote(currentStage: TestStage, testsPassed: boolean): boolean {
|
||||
const promotion = TEST_PROMOTION_RULES[currentStage];
|
||||
|
||||
if (!promotion.required) {
|
||||
return true; // Non-required tests don't block promotion
|
||||
}
|
||||
|
||||
return testsPassed;
|
||||
}
|
||||
```
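The module above defines the promotion rules but is not shown being consumed anywhere; a minimal runner sketch is below (the file name, the `TEST_STAGE` environment variable, and the wiring are illustrative assumptions, not part of the strategy module):

```typescript
// scripts/run-stage.ts (illustrative sketch)
import { execSync } from 'child_process';
import { getTestsForStage, type TestStage } from './test-promotion-strategy';

// Pick the stage from the environment, e.g. TEST_STAGE=ci-pr npx ts-node scripts/run-stage.ts
const stage = (process.env.TEST_STAGE || 'pre-commit') as TestStage;
const promotion = getTestsForStage(stage);

console.log(`Stage: ${promotion.stage} - ${promotion.description}`);
console.log(`Time budget: ${promotion.timebudget} min | required: ${promotion.required}`);

try {
  execSync(promotion.testCommand, { stdio: 'inherit' });
} catch {
  // Only required stages block promotion; optional stages warn or alert instead
  if (promotion.failureAction === 'block') process.exit(1);
  console.warn(`Tests failed, but stage "${stage}" does not block (action: ${promotion.failureAction})`);
}
```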
|
||||
|
||||
**Husky pre-commit hook**:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# .husky/pre-commit
|
||||
# Run smoke tests before allowing commit
|
||||
|
||||
echo "🔍 Running pre-commit tests..."
|
||||
|
||||
npm run test:smoke
|
||||
|
||||
if [ $? -ne 0 ]; then
|
||||
echo ""
|
||||
echo "❌ Pre-commit tests failed!"
|
||||
echo "Please fix failures before committing."
|
||||
echo ""
|
||||
echo "To skip (NOT recommended): git commit --no-verify"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "✅ Pre-commit tests passed"
|
||||
```
|
||||
|
||||
**GitHub Actions workflow**:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/test-promotion.yml
|
||||
name: Test Promotion Strategy
|
||||
on:
|
||||
pull_request:
|
||||
push:
|
||||
branches: [main]
|
||||
workflow_dispatch:
|
||||
|
||||
jobs:
|
||||
# Stage 1: PR tests (changed + P0-P1)
|
||||
pr-tests:
|
||||
if: github.event_name == 'pull_request'
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 10
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Run PR-level tests
|
||||
run: |
|
||||
npm run test:changed
|
||||
npm run test:p0-p1
|
||||
|
||||
# Stage 2: Full regression (merge to main)
|
||||
regression-tests:
|
||||
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 30
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Run full regression
|
||||
run: npm run test:regression
|
||||
|
||||
# Stage 3: Staging validation (post-deploy)
|
||||
staging-smoke:
|
||||
if: github.event_name == 'workflow_dispatch'
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 15
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Run staging smoke tests
|
||||
run: npm run test:e2e -- --grep "@smoke"
|
||||
env:
|
||||
TEST_ENV: staging
|
||||
|
||||
# Stage 4: Production smoke (post-deploy, non-blocking)
|
||||
production-smoke:
|
||||
if: github.event_name == 'workflow_dispatch'
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 5
|
||||
continue-on-error: true # Don't fail deployment if smoke tests fail
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Run production smoke tests
|
||||
run: npm run test:e2e:prod -- --grep "@smoke.*@p0"
|
||||
env:
|
||||
TEST_ENV: production
|
||||
|
||||
- name: Alert on failure
|
||||
if: failure()
|
||||
uses: 8398a7/action-slack@v3
|
||||
with:
|
||||
status: ${{ job.status }}
|
||||
text: '🚨 Production smoke tests failed!'
|
||||
webhook_url: ${{ secrets.SLACK_WEBHOOK }}
|
||||
```
|
||||
|
||||
**Selection strategy documentation**:
|
||||
|
||||
````markdown
|
||||
# Test Selection Strategy
|
||||
|
||||
## Test Promotion Stages
|
||||
|
||||
| Stage | Tests Run | Time Budget | Blocks Deploy | Failure Action |
|
||||
| ---------- | ------------------- | ----------- | ------------- | -------------- |
|
||||
| Pre-Commit | Smoke (@smoke) | 2 min | ✅ Yes | Block commit |
|
||||
| CI PR | Changed + P0-P1 | 10 min | ✅ Yes | Block merge |
|
||||
| CI Merge | Full regression | 30 min | ✅ Yes | Block deploy |
|
||||
| Staging | E2E smoke | 15 min | ✅ Yes | Rollback |
|
||||
| Production | Critical smoke only | 5 min | ❌ No | Alert team |
|
||||
|
||||
## When Full Regression Runs
|
||||
|
||||
Full regression suite (`npm run test:regression`) runs in these scenarios:
|
||||
|
||||
- ✅ Before merging to `main` (CI Merge stage)
|
||||
- ✅ Nightly builds (scheduled workflow)
|
||||
- ✅ Manual trigger (workflow_dispatch)
|
||||
- ✅ Release candidate testing
|
||||
|
||||
Full regression does NOT run on:
|
||||
|
||||
- ❌ Every PR commit (too slow)
|
||||
- ❌ Pre-commit hooks (too slow)
|
||||
- ❌ Production deployments (deploy-blocking)
|
||||
|
||||
## Override Scenarios
|
||||
|
||||
Skip tests (emergency only):
|
||||
|
||||
```bash
|
||||
git commit --no-verify # Skip pre-commit hook
|
||||
gh pr merge --admin # Force merge (requires admin)
|
||||
```
|
||||
````
|
||||
|
||||
|
||||
|
||||
**Key Points**:
|
||||
- **Progressive validation**: More tests at each stage
|
||||
- **Time budgets**: Clear expectations per stage
|
||||
- **Blocking vs. alerting**: Production tests don't block deploy
|
||||
- **Documentation**: Team knows when full regression runs
|
||||
- **Emergency overrides**: Documented but discouraged
|
||||
|
||||
---
|
||||
|
||||
## Test Selection Strategy Checklist
|
||||
|
||||
Before implementing selective testing, verify:
|
||||
|
||||
- [ ] **Tag strategy defined**: @smoke, @p0-p3, @regression documented
|
||||
- [ ] **Time budgets set**: Each stage has clear timeout (smoke < 5 min, full < 30 min)
|
||||
- [ ] **Changed file mapping**: Code changes → test selection logic implemented
|
||||
- [ ] **Promotion rules documented**: README explains when full regression runs
|
||||
- [ ] **CI integration**: GitHub Actions uses selective strategy
|
||||
- [ ] **Local parity**: Developers can run same selections locally
|
||||
- [ ] **Emergency overrides**: Skip mechanisms documented (--no-verify, admin merge)
|
||||
- [ ] **Metrics tracked**: Monitor test execution time and selection accuracy
|
||||
|
||||
## Integration Points
|
||||
|
||||
- Used in workflows: `*ci` (CI/CD setup), `*automate` (test generation with tags)
|
||||
- Related fragments: `ci-burn-in.md`, `test-priorities-matrix.md`, `test-quality.md`
|
||||
- Selection tools: Playwright --grep, Cypress @cypress/grep, git diff
|
||||
|
||||
_Source: 32+ selective testing strategies blog, Murat testing philosophy, SEON CI optimization_
|
||||
|
||||
|
||||
527
src/modules/bmm/testarch/knowledge/selector-resilience.md
Normal file
@@ -0,0 +1,527 @@
|
||||
# Selector Resilience
|
||||
|
||||
## Principle
|
||||
|
||||
Robust selectors follow a strict hierarchy: **data-testid > ARIA roles > text content > CSS/IDs** (last resort). Selectors must be resilient to UI changes (styling, layout, content updates) and remain human-readable for maintenance.
|
||||
|
||||
## Rationale
|
||||
|
||||
**The Problem**: Brittle selectors (CSS classes, nth-child, complex XPath) break when UI styling changes, elements are reordered, or design updates occur. This causes test maintenance burden and false negatives.
|
||||
|
||||
**The Solution**: Prioritize semantic selectors that reflect user intent (ARIA roles, accessible names, test IDs). Use dynamic filtering for lists instead of nth() indexes. Validate selectors during code review and refactor proactively.
|
||||
|
||||
**Why This Matters**:
|
||||
|
||||
- Prevents false test failures (UI refactoring doesn't break tests)
|
||||
- Improves accessibility (ARIA roles benefit both tests and screen readers)
|
||||
- Enhances readability (semantic selectors document user intent)
|
||||
- Reduces maintenance burden (robust selectors survive design changes)
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Selector Hierarchy (Priority Order with Examples)
|
||||
|
||||
**Context**: Choose the most resilient selector for each element type
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/selectors/hierarchy-examples.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Selector Hierarchy Best Practices', () => {
|
||||
test('Level 1: data-testid (BEST - most resilient)', async ({ page }) => {
|
||||
await page.goto('/login');
|
||||
|
||||
// ✅ Best: Dedicated test attribute (survives all UI changes)
|
||||
await page.getByTestId('email-input').fill('user@example.com');
|
||||
await page.getByTestId('password-input').fill('password123');
|
||||
await page.getByTestId('login-button').click();
|
||||
|
||||
await expect(page.getByTestId('welcome-message')).toBeVisible();
|
||||
|
||||
// Why it's best:
|
||||
// - Survives CSS refactoring (class name changes)
|
||||
// - Survives layout changes (element reordering)
|
||||
// - Survives content changes (button text updates)
|
||||
// - Explicit test contract (developer knows it's for testing)
|
||||
});
|
||||
|
||||
test('Level 2: ARIA roles and accessible names (GOOD - future-proof)', async ({ page }) => {
|
||||
await page.goto('/login');
|
||||
|
||||
// ✅ Good: Semantic HTML roles (benefits accessibility + tests)
|
||||
await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com');
|
||||
await page.getByRole('textbox', { name: 'Password' }).fill('password123');
|
||||
await page.getByRole('button', { name: 'Sign In' }).click();
|
||||
|
||||
await expect(page.getByRole('heading', { name: 'Welcome' })).toBeVisible();
|
||||
|
||||
// Why it's good:
|
||||
// - Survives CSS refactoring
|
||||
// - Survives layout changes
|
||||
// - Enforces accessibility (screen reader compatible)
|
||||
// - Self-documenting (role + name = clear intent)
|
||||
});
|
||||
|
||||
test('Level 3: Text content (ACCEPTABLE - user-centric)', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ✅ Acceptable: Text content (matches user perception)
|
||||
await page.getByText('Create New Order').click();
|
||||
await expect(page.getByText('Order Details')).toBeVisible();
|
||||
|
||||
// Why it's acceptable:
|
||||
// - User-centric (what user sees)
|
||||
// - Survives CSS/layout changes
|
||||
// - Breaks when copy changes (forces test update with content)
|
||||
|
||||
// ⚠️ Use with caution for dynamic/localized content:
|
||||
// - Avoid for content with variables: "User 123" (use regex instead)
|
||||
// - Avoid for i18n content (use data-testid or ARIA)
|
||||
});
|
||||
|
||||
test('Level 4: CSS classes/IDs (LAST RESORT - brittle)', async ({ page }) => {
|
||||
await page.goto('/login');
|
||||
|
||||
// ❌ Last resort: CSS class (breaks with styling updates)
|
||||
// await page.locator('.btn-primary').click()
|
||||
|
||||
// ❌ Last resort: ID (breaks if ID changes)
|
||||
// await page.locator('#login-form').fill(...)
|
||||
|
||||
// ✅ Better: Use data-testid or ARIA instead
|
||||
await page.getByTestId('login-button').click();
|
||||
|
||||
// Why CSS/ID is last resort:
|
||||
// - Breaks with CSS refactoring (class name changes)
|
||||
// - Breaks with HTML restructuring (ID changes)
|
||||
// - Not semantic (unclear what element does)
|
||||
// - Tight coupling between tests and styling
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Hierarchy: data-testid (best) > ARIA (good) > text (acceptable) > CSS/ID (last resort)
|
||||
- data-testid survives ALL UI changes (explicit test contract)
|
||||
- ARIA roles enforce accessibility (screen reader compatible)
|
||||
- Text content is user-centric (but breaks with copy changes)
|
||||
- CSS/ID are brittle (break with styling refactoring)
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Dynamic Selector Patterns (Lists, Filters, Regex)
|
||||
|
||||
**Context**: Handle dynamic content, lists, and variable data with resilient selectors
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/selectors/dynamic-selectors.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Dynamic Selector Patterns', () => {
|
||||
test('regex for variable content (user IDs, timestamps)', async ({ page }) => {
|
||||
await page.goto('/users');
|
||||
|
||||
// ✅ Good: Regex pattern for dynamic user IDs
|
||||
await expect(page.getByText(/User \d+/)).toBeVisible();
|
||||
|
||||
// ✅ Good: Regex for timestamps
|
||||
await expect(page.getByText(/Last login: \d{4}-\d{2}-\d{2}/)).toBeVisible();
|
||||
|
||||
// ✅ Good: Regex for dynamic counts
|
||||
await expect(page.getByText(/\d+ items in cart/)).toBeVisible();
|
||||
});
|
||||
|
||||
test('partial text matching (case-insensitive, substring)', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// ✅ Good: Partial match (survives minor text changes)
|
||||
await page.getByText('Product', { exact: false }).first().click();
|
||||
|
||||
// ✅ Good: Case-insensitive (survives capitalization changes)
|
||||
await expect(page.getByText(/sign in/i)).toBeVisible();
|
||||
});
|
||||
|
||||
test('filter locators for lists (avoid brittle nth)', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// ❌ Bad: Index-based (breaks when order changes)
|
||||
// await page.locator('.product-card').nth(2).click()
|
||||
|
||||
// ✅ Good: Filter by content (resilient to reordering)
|
||||
await page.locator('[data-testid="product-card"]').filter({ hasText: 'Premium Plan' }).click();
|
||||
|
||||
// ✅ Good: Filter by attribute
|
||||
await page
|
||||
.locator('[data-testid="product-card"]')
|
||||
.filter({ has: page.locator('[data-status="active"]') })
|
||||
.first()
|
||||
.click();
|
||||
});
|
||||
|
||||
test('nth() only when absolutely necessary', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ⚠️ Acceptable: nth(0) for first item (common pattern)
|
||||
const firstNotification = page.getByTestId('notification').nth(0);
|
||||
await expect(firstNotification).toContainText('Welcome');
|
||||
|
||||
// ❌ Bad: nth(5) for arbitrary index (fragile)
|
||||
// await page.getByTestId('notification').nth(5).click()
|
||||
|
||||
// ✅ Better: Use filter() with specific criteria
|
||||
await page.getByTestId('notification').filter({ hasText: 'Critical Alert' }).click();
|
||||
});
|
||||
|
||||
test('combine multiple locators for specificity', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
|
||||
// ✅ Good: Narrow scope with combined locators
|
||||
const shippingSection = page.getByTestId('shipping-section');
|
||||
await shippingSection.getByLabel('Address Line 1').fill('123 Main St');
|
||||
await shippingSection.getByLabel('City').fill('New York');
|
||||
|
||||
// Scoping prevents ambiguity (multiple "City" fields on page)
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Regex patterns handle variable content (IDs, timestamps, counts)
|
||||
- Partial matching survives minor text changes (`exact: false`)
|
||||
- `filter()` is more resilient than `nth()` (content-based vs index-based)
|
||||
- `nth(0)` acceptable for "first item", avoid arbitrary indexes
|
||||
- Combine locators to narrow scope (prevent ambiguity)
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Selector Anti-Patterns (What NOT to Do)
|
||||
|
||||
**Context**: Common selector mistakes that cause brittle tests
|
||||
|
||||
**Problem Examples**:
|
||||
|
||||
```typescript
|
||||
// tests/selectors/anti-patterns.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Selector Anti-Patterns to Avoid', () => {
|
||||
test('❌ Anti-Pattern 1: CSS classes (brittle)', async ({ page }) => {
|
||||
await page.goto('/login');
|
||||
|
||||
// ❌ Bad: CSS class (breaks with design system updates)
|
||||
// await page.locator('.btn-primary').click()
|
||||
// await page.locator('.form-input-lg').fill('test@example.com')
|
||||
|
||||
// ✅ Good: Use data-testid or ARIA role
|
||||
await page.getByTestId('login-button').click();
|
||||
await page.getByRole('textbox', { name: 'Email' }).fill('test@example.com');
|
||||
});
|
||||
|
||||
test('❌ Anti-Pattern 2: Index-based nth() (fragile)', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// ❌ Bad: Index-based (breaks when product order changes)
|
||||
// await page.locator('.product-card').nth(3).click()
|
||||
|
||||
// ✅ Good: Content-based filter
|
||||
await page.locator('[data-testid="product-card"]').filter({ hasText: 'Laptop' }).click();
|
||||
});
|
||||
|
||||
test('❌ Anti-Pattern 3: Complex XPath (hard to maintain)', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ❌ Bad: Complex XPath (unreadable, breaks with structure changes)
|
||||
// await page.locator('xpath=//div[@class="container"]//section[2]//button[contains(@class, "primary")]').click()
|
||||
|
||||
// ✅ Good: Semantic selector
|
||||
await page.getByRole('button', { name: 'Create Order' }).click();
|
||||
});
|
||||
|
||||
test('❌ Anti-Pattern 4: ID selectors (coupled to implementation)', async ({ page }) => {
|
||||
await page.goto('/settings');
|
||||
|
||||
// ❌ Bad: HTML ID (breaks if ID changes for accessibility/SEO)
|
||||
// await page.locator('#user-settings-form').fill(...)
|
||||
|
||||
// ✅ Good: data-testid or ARIA landmark
|
||||
await page.getByTestId('user-settings-form').getByLabel('Display Name').fill('John Doe');
|
||||
});
|
||||
|
||||
test('✅ Refactoring: Bad → Good Selector', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
|
||||
// Before (brittle):
|
||||
// await page.locator('.checkout-form > .payment-section > .btn-submit').click()
|
||||
|
||||
// After (resilient):
|
||||
await page.getByTestId('checkout-form').getByRole('button', { name: 'Complete Payment' }).click();
|
||||
|
||||
await expect(page.getByText('Payment successful')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Why These Fail**:
|
||||
|
||||
- **CSS classes**: Change frequently with design updates (Tailwind, CSS modules)
|
||||
- **nth() indexes**: Fragile to element reordering (new features, A/B tests)
|
||||
- **Complex XPath**: Unreadable, breaks with HTML structure changes
|
||||
- **HTML IDs**: Not stable (accessibility improvements change IDs)
|
||||
|
||||
**Better Approach**: Use selector hierarchy (testid > ARIA > text)
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Selector Debugging Techniques (Inspector, DevTools, MCP)
|
||||
|
||||
**Context**: Debug selector failures interactively to find better alternatives
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/selectors/debugging-techniques.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Selector Debugging Techniques', () => {
|
||||
test('use Playwright Inspector to test selectors', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Pause test to open Inspector
|
||||
await page.pause();
|
||||
|
||||
// In Inspector console, test selectors:
|
||||
// page.getByTestId('user-menu') ✅ Works
|
||||
// page.getByRole('button', { name: 'Profile' }) ✅ Works
|
||||
// page.locator('.btn-primary') ❌ Brittle
|
||||
|
||||
// Use "Pick Locator" feature to generate selectors
|
||||
// Use "Record" mode to capture user interactions
|
||||
|
||||
await page.getByTestId('user-menu').click();
|
||||
await expect(page.getByRole('menu')).toBeVisible();
|
||||
});
|
||||
|
||||
test('use locator.all() to debug lists', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// Debug: How many products are visible?
|
||||
const products = await page.getByTestId('product-card').all();
|
||||
console.log(`Found ${products.length} products`);
|
||||
|
||||
// Debug: What text is in each product?
|
||||
for (const product of products) {
|
||||
const text = await product.textContent();
|
||||
console.log(`Product text: ${text}`);
|
||||
}
|
||||
|
||||
// Use findings to build better selector
|
||||
await page.getByTestId('product-card').filter({ hasText: 'Laptop' }).click();
|
||||
});
|
||||
|
||||
test('use DevTools console to test selectors', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
|
||||
// Open DevTools (manually or via page.pause())
|
||||
// Test selectors in console:
|
||||
// document.querySelectorAll('[data-testid="payment-method"]')
|
||||
// document.querySelector('#credit-card-input')
|
||||
|
||||
// Find robust selector through trial and error
|
||||
await page.getByTestId('payment-method').selectOption('credit-card');
|
||||
});
|
||||
|
||||
test('MCP browser_generate_locator (if available)', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// If Playwright MCP available, use browser_generate_locator:
|
||||
// 1. Click element in browser
|
||||
// 2. MCP generates optimal selector
|
||||
// 3. Copy into test
|
||||
|
||||
// Example output from MCP:
|
||||
// page.getByRole('link', { name: 'Product A' })
|
||||
|
||||
// Use generated selector
|
||||
await page.getByRole('link', { name: 'Product A' }).click();
|
||||
await expect(page).toHaveURL(/\/products\/\d+/);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Playwright Inspector: Interactive selector testing with "Pick Locator" feature
|
||||
- `locator.all()`: Debug lists to understand structure and content
|
||||
- DevTools console: Test CSS selectors before adding to tests
|
||||
- MCP browser_generate_locator: Auto-generate optimal selectors (if MCP available)
|
||||
- Always validate selectors work before committing
|
||||
|
||||
---
|
||||
|
||||
### Example 5: Selector Refactoring Guide (Before/After Patterns)
|
||||
|
||||
**Context**: Systematically improve brittle selectors to resilient alternatives
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/selectors/refactoring-guide.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('Selector Refactoring Patterns', () => {
|
||||
test('refactor: CSS class → data-testid', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// ❌ Before: CSS class (breaks with Tailwind updates)
|
||||
// await page.locator('.bg-blue-500.px-4.py-2.rounded').click()
|
||||
|
||||
// ✅ After: data-testid
|
||||
await page.getByTestId('add-to-cart-button').click();
|
||||
|
||||
// Implementation: Add data-testid to button component
|
||||
// <button className="bg-blue-500 px-4 py-2 rounded" data-testid="add-to-cart-button">
|
||||
});
|
||||
|
||||
test('refactor: nth() index → filter()', async ({ page }) => {
|
||||
await page.goto('/users');
|
||||
|
||||
// ❌ Before: Index-based (breaks when users reorder)
|
||||
// await page.locator('.user-row').nth(2).click()
|
||||
|
||||
// ✅ After: Content-based filter
|
||||
await page.locator('[data-testid="user-row"]').filter({ hasText: 'john@example.com' }).click();
|
||||
});
|
||||
|
||||
test('refactor: Complex XPath → ARIA role', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
|
||||
// ❌ Before: Complex XPath (unreadable, brittle)
|
||||
// await page.locator('xpath=//div[@id="payment"]//form//button[contains(@class, "submit")]').click()
|
||||
|
||||
// ✅ After: ARIA role
|
||||
await page.getByRole('button', { name: 'Complete Payment' }).click();
|
||||
});
|
||||
|
||||
test('refactor: ID selector → data-testid', async ({ page }) => {
|
||||
await page.goto('/settings');
|
||||
|
||||
// ❌ Before: HTML ID (changes with accessibility improvements)
|
||||
// await page.locator('#user-profile-section').getByLabel('Name').fill('John')
|
||||
|
||||
// ✅ After: data-testid + semantic label
|
||||
await page.getByTestId('user-profile-section').getByLabel('Display Name').fill('John Doe');
|
||||
});
|
||||
|
||||
test('refactor: Deeply nested CSS → scoped data-testid', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ❌ Before: Deep nesting (breaks with structure changes)
|
||||
// await page.locator('.container .sidebar .menu .item:nth-child(3) a').click()
|
||||
|
||||
// ✅ After: Scoped data-testid
|
||||
const sidebar = page.getByTestId('sidebar');
|
||||
await sidebar.getByRole('link', { name: 'Settings' }).click();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- CSS class → data-testid (survives design system updates)
|
||||
- nth() → filter() (content-based vs index-based)
|
||||
- Complex XPath → ARIA role (readable, semantic)
|
||||
- ID → data-testid (decouples from HTML structure)
|
||||
- Deep nesting → scoped locators (modular, maintainable)
|
||||
|
||||
---
|
||||
|
||||
### Example 6: Selector Best Practices Checklist
|
||||
|
||||
```typescript
|
||||
// tests/selectors/validation-checklist.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
/**
|
||||
* Selector Validation Checklist
|
||||
*
|
||||
* Before committing test, verify selectors meet these criteria:
|
||||
*/
|
||||
test.describe('Selector Best Practices Validation', () => {
|
||||
test('✅ 1. Prefer data-testid for interactive elements', async ({ page }) => {
|
||||
await page.goto('/login');
|
||||
|
||||
// Interactive elements (buttons, inputs, links) should use data-testid
|
||||
await page.getByTestId('email-input').fill('test@example.com');
|
||||
await page.getByTestId('login-button').click();
|
||||
});
|
||||
|
||||
test('✅ 2. Use ARIA roles for semantic elements', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Semantic elements (headings, navigation, forms) use ARIA
|
||||
await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
|
||||
await page.getByRole('navigation').getByRole('link', { name: 'Settings' }).click();
|
||||
});
|
||||
|
||||
test('✅ 3. Avoid CSS classes (except when testing styles)', async ({ page }) => {
|
||||
await page.goto('/products');
|
||||
|
||||
// ❌ Never for interaction: page.locator('.btn-primary')
|
||||
// ✅ Only for visual regression: await expect(page.locator('.error-banner')).toHaveCSS('color', 'rgb(255, 0, 0)')
|
||||
});
|
||||
|
||||
test('✅ 4. Use filter() instead of nth() for lists', async ({ page }) => {
|
||||
await page.goto('/orders');
|
||||
|
||||
// List selection should be content-based
|
||||
await page.getByTestId('order-row').filter({ hasText: 'Order #12345' }).click();
|
||||
});
|
||||
|
||||
test('✅ 5. Selectors are human-readable', async ({ page }) => {
|
||||
await page.goto('/checkout');
|
||||
|
||||
// ✅ Good: Clear intent
|
||||
await page.getByTestId('shipping-address-form').getByLabel('Street Address').fill('123 Main St');
|
||||
|
||||
// ❌ Bad: Cryptic
|
||||
// await page.locator('div > div:nth-child(2) > input[type="text"]').fill('123 Main St')
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Validation Rules**:
|
||||
|
||||
1. **Interactive elements** (buttons, inputs) → data-testid
|
||||
2. **Semantic elements** (headings, nav, forms) → ARIA roles
|
||||
3. **CSS classes** → Avoid (except visual regression tests)
|
||||
4. **Lists** → filter() over nth() (content-based selection)
|
||||
5. **Readability** → Selectors document user intent (clear, semantic)
|
||||
|
||||
---
|
||||
|
||||
## Selector Resilience Checklist
|
||||
|
||||
Before deploying selectors:
|
||||
|
||||
- [ ] **Hierarchy followed**: data-testid (1st choice) > ARIA (2nd) > text (3rd) > CSS/ID (last resort)
|
||||
- [ ] **Interactive elements use data-testid**: Buttons, inputs, links have dedicated test attributes
|
||||
- [ ] **Semantic elements use ARIA**: Headings, navigation, forms use roles and accessible names
|
||||
- [ ] **No brittle patterns**: No CSS classes (except visual tests), no arbitrary nth(), no complex XPath
|
||||
- [ ] **Dynamic content handled**: Regex for IDs/timestamps, filter() for lists, partial matching for text
|
||||
- [ ] **Selectors are scoped**: Use container locators to narrow scope (prevent ambiguity)
|
||||
- [ ] **Human-readable**: Selectors document user intent (clear, semantic, maintainable)
|
||||
- [ ] **Validated in Inspector**: Test selectors interactively before committing (page.pause())
|
||||
|
||||
## Integration Points
|
||||
|
||||
- **Used in workflows**: `*atdd` (generate tests with robust selectors), `*automate` (healing selector failures), `*test-review` (validate selector quality)
|
||||
- **Related fragments**: `test-healing-patterns.md` (selector failure diagnosis), `fixture-architecture.md` (page object alternatives), `test-quality.md` (maintainability standards)
|
||||
- **Tools**: Playwright Inspector (Pick Locator), DevTools console, Playwright MCP browser_generate_locator (optional)
|
||||
|
||||
_Source: Playwright selector best practices, accessibility guidelines (ARIA), production test maintenance patterns_
|
||||
644
src/modules/bmm/testarch/knowledge/test-healing-patterns.md
Normal file
@@ -0,0 +1,644 @@
|
||||
# Test Healing Patterns
|
||||
|
||||
## Principle
|
||||
|
||||
Common test failures follow predictable patterns (stale selectors, race conditions, dynamic data assertions, network errors, hard waits). **Automated healing** identifies failure signatures and applies pattern-based fixes. Manual healing captures these patterns for future automation.
|
||||
|
||||
## Rationale
|
||||
|
||||
**The Problem**: Test failures waste developer time on repetitive debugging. Teams manually fix the same selector issues, timing bugs, and data mismatches repeatedly across test suites.
|
||||
|
||||
**The Solution**: Catalog common failure patterns with diagnostic signatures and automated fixes. When a test fails, match the error message/stack trace against known patterns and apply the corresponding fix. This transforms test maintenance from reactive debugging to proactive pattern application.
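A minimal sketch of such a catalog-driven matcher is below (the `HealingPattern` shape and the two entries are illustrative assumptions; the individual detectors appear in the examples that follow):

```typescript
// src/testing/healing/pattern-catalog.ts (illustrative sketch)

export type HealingPattern = {
  name: string;
  // Signature check: does this failure match the pattern?
  matches: (error: Error) => boolean;
  // Suggested fix derived from the failure
  suggestFix: (error: Error) => string;
};

// The catalog grows as new failure signatures are identified
export const HEALING_CATALOG: HealingPattern[] = [
  {
    name: 'stale-selector',
    matches: (error) => /resolved to 0 elements|element not found/i.test(error.message),
    suggestFix: () => 'Replace brittle selector with data-testid or ARIA role (see Example 1)',
  },
  {
    name: 'race-condition',
    matches: (error) => /timeout.*waiting for|not visible/i.test(error.message),
    suggestFix: () => 'Replace hard waits with waitForResponse() or element state waits (see Example 2)',
  },
];

/** Match a failure against the catalog and return the first applicable suggestion. */
export function diagnose(error: Error): { pattern: string; fix: string } | null {
  const match = HEALING_CATALOG.find((p) => p.matches(error));
  return match ? { pattern: match.name, fix: match.suggestFix(error) } : null;
}
```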
|
||||
|
||||
**Why This Matters**:
|
||||
|
||||
- Reduces test maintenance time by 60-80% (pattern-based fixes vs manual debugging)
|
||||
- Prevents flakiness regression (same bug fixed once, applied everywhere)
|
||||
- Builds institutional knowledge (failure catalog grows over time)
|
||||
- Enables self-healing test suites (automate workflow validates and heals)
|
||||
|
||||
## Pattern Examples
|
||||
|
||||
### Example 1: Common Failure Pattern - Stale Selectors (Element Not Found)
|
||||
|
||||
**Context**: Test fails with "Element not found" or "Locator resolved to 0 elements" errors
|
||||
|
||||
**Diagnostic Signature**:
|
||||
|
||||
```typescript
|
||||
// src/testing/healing/selector-healing.ts
|
||||
|
||||
export type SelectorFailure = {
|
||||
errorMessage: string;
|
||||
stackTrace: string;
|
||||
selector: string;
|
||||
testFile: string;
|
||||
lineNumber: number;
|
||||
};
|
||||
|
||||
/**
|
||||
* Detect stale selector failures
|
||||
*/
|
||||
export function isSelectorFailure(error: Error): boolean {
|
||||
const patterns = [
|
||||
/locator.*resolved to 0 elements/i,
|
||||
/element not found/i,
|
||||
/waiting for locator.*to be visible/i,
|
||||
/selector.*did not match any elements/i,
|
||||
/unable to find element/i,
|
||||
];
|
||||
|
||||
return patterns.some((pattern) => pattern.test(error.message));
|
||||
}
|
||||
|
||||
/**
|
||||
* Extract selector from error message
|
||||
*/
|
||||
export function extractSelector(errorMessage: string): string | null {
|
||||
// Playwright: "locator('button[type=\"submit\"]') resolved to 0 elements"
|
||||
const playwrightMatch = errorMessage.match(/locator\('([^']+)'\)/);
|
||||
if (playwrightMatch) return playwrightMatch[1];
|
||||
|
||||
// Cypress: "Timed out retrying: Expected to find element: '.submit-button'"
|
||||
const cypressMatch = errorMessage.match(/Expected to find element: ['"]([^'"]+)['"]/i);
|
||||
if (cypressMatch) return cypressMatch[1];
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Suggest better selector based on hierarchy
|
||||
*/
|
||||
export function suggestBetterSelector(badSelector: string): string {
|
||||
// If using CSS class → suggest data-testid
|
||||
if (badSelector.startsWith('.') || badSelector.includes('class=')) {
|
||||
const elementName = badSelector.match(/class=["']([^"']+)["']/)?.[1] || badSelector.slice(1);
|
||||
return `page.getByTestId('${elementName}') // Prefer data-testid over CSS class`;
|
||||
}
|
||||
|
||||
// If using ID → suggest data-testid
|
||||
if (badSelector.startsWith('#')) {
|
||||
return `page.getByTestId('${badSelector.slice(1)}') // Prefer data-testid over ID`;
|
||||
}
|
||||
|
||||
// If using nth() → suggest filter() or more specific selector
|
||||
if (badSelector.includes('.nth(')) {
|
||||
return `page.locator('${badSelector.split('.nth(')[0]}').filter({ hasText: 'specific text' }) // Avoid brittle nth(), use filter()`;
|
||||
}
|
||||
|
||||
// If using complex CSS → suggest ARIA role
|
||||
if (badSelector.includes('>') || badSelector.includes('+')) {
|
||||
return `page.getByRole('button', { name: 'Submit' }) // Prefer ARIA roles over complex CSS`;
|
||||
}
|
||||
|
||||
return `page.getByTestId('...') // Add data-testid attribute to element`;
|
||||
}
|
||||
```
|
||||
|
||||
**Healing Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/healing/selector-healing.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
import { isSelectorFailure, extractSelector, suggestBetterSelector } from '../../src/testing/healing/selector-healing';
|
||||
|
||||
test('heal stale selector failures automatically', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
try {
|
||||
// Original test with brittle CSS selector
|
||||
await page.locator('.btn-primary').click();
|
||||
} catch (error: any) {
|
||||
if (isSelectorFailure(error)) {
|
||||
const badSelector = extractSelector(error.message);
|
||||
const suggestion = badSelector ? suggestBetterSelector(badSelector) : null;
|
||||
|
||||
console.log('HEALING SUGGESTION:', suggestion);
|
||||
|
||||
// Apply healed selector
|
||||
await page.getByTestId('submit-button').click(); // Fixed!
|
||||
} else {
|
||||
throw error; // Not a selector issue, rethrow
|
||||
}
|
||||
}
|
||||
|
||||
await expect(page.getByText('Success')).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Diagnosis: Error message contains "locator resolved to 0 elements" or "element not found"
|
||||
- Fix: Replace brittle selector (CSS class, ID, nth) with robust alternative (data-testid, ARIA role)
|
||||
- Prevention: Follow selector hierarchy (data-testid > ARIA > text > CSS)
|
||||
- Automation: Pattern matching on error message + stack trace
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Common Failure Pattern - Race Conditions (Timing Errors)
|
||||
|
||||
**Context**: Test fails with "timeout waiting for element" or "element not visible" errors
|
||||
|
||||
**Diagnostic Signature**:
|
||||
|
||||
```typescript
|
||||
// src/testing/healing/timing-healing.ts
|
||||
|
||||
export type TimingFailure = {
|
||||
errorMessage: string;
|
||||
testFile: string;
|
||||
lineNumber: number;
|
||||
actionType: 'click' | 'fill' | 'waitFor' | 'expect';
|
||||
};
|
||||
|
||||
/**
|
||||
* Detect race condition failures
|
||||
*/
|
||||
export function isTimingFailure(error: Error): boolean {
|
||||
const patterns = [
|
||||
/timeout.*waiting for/i,
|
||||
/element is not visible/i,
|
||||
/element is not attached to the dom/i,
|
||||
/waiting for element to be visible.*exceeded/i,
|
||||
/timed out retrying/i,
|
||||
/waitForLoadState.*timeout/i,
|
||||
];
|
||||
|
||||
return patterns.some((pattern) => pattern.test(error.message));
|
||||
}
|
||||
|
||||
/**
|
||||
* Detect hard wait anti-pattern
|
||||
*/
|
||||
export function hasHardWait(testCode: string): boolean {
|
||||
const hardWaitPatterns = [/page\.waitForTimeout\(/, /cy\.wait\(\d+\)/, /await.*sleep\(/, /setTimeout\(/];
|
||||
|
||||
return hardWaitPatterns.some((pattern) => pattern.test(testCode));
|
||||
}
|
||||
|
||||
/**
|
||||
* Suggest deterministic wait replacement
|
||||
*/
|
||||
export function suggestDeterministicWait(testCode: string): string {
|
||||
if (testCode.includes('page.waitForTimeout')) {
|
||||
return `
|
||||
// ❌ Bad: Hard wait (flaky)
|
||||
// await page.waitForTimeout(3000)
|
||||
|
||||
// ✅ Good: Wait for network response
|
||||
await page.waitForResponse(resp => resp.url().includes('/api/data') && resp.status() === 200)
|
||||
|
||||
// OR wait for element state
|
||||
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' })
|
||||
`.trim();
|
||||
}
|
||||
|
||||
if (testCode.includes('cy.wait(') && /cy\.wait\(\d+\)/.test(testCode)) {
|
||||
return `
|
||||
// ❌ Bad: Hard wait (flaky)
|
||||
// cy.wait(3000)
|
||||
|
||||
// ✅ Good: Wait for aliased network request
|
||||
cy.intercept('GET', '/api/data').as('getData')
|
||||
cy.visit('/page')
|
||||
cy.wait('@getData')
|
||||
`.trim();
|
||||
}
|
||||
|
||||
return `
|
||||
// Add network-first interception BEFORE navigation:
|
||||
await page.route('**/api/**', route => route.continue())
|
||||
const responsePromise = page.waitForResponse('**/api/data')
|
||||
await page.goto('/page')
|
||||
await responsePromise
|
||||
`.trim();
|
||||
}
|
||||
```
|
||||
|
||||
**Healing Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/healing/timing-healing.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
import { isTimingFailure, hasHardWait, suggestDeterministicWait } from '../../src/testing/healing/timing-healing';
|
||||
|
||||
test('heal race condition with network-first pattern', async ({ page, context }) => {
|
||||
// Setup interception BEFORE navigation (prevent race)
|
||||
await context.route('**/api/products', (route) => {
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
body: JSON.stringify({ products: [{ id: 1, name: 'Product A' }] }),
|
||||
});
|
||||
});
|
||||
|
||||
const responsePromise = page.waitForResponse('**/api/products');
|
||||
|
||||
await page.goto('/products');
|
||||
await responsePromise; // Deterministic wait
|
||||
|
||||
// Element now reliably visible (no race condition)
|
||||
await expect(page.getByText('Product A')).toBeVisible();
|
||||
});
|
||||
|
||||
test('heal hard wait with event-based wait', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ❌ Original (flaky): await page.waitForTimeout(3000)
|
||||
|
||||
// ✅ Healed: Wait for spinner to disappear
|
||||
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });
|
||||
|
||||
// Element now reliably visible
|
||||
await expect(page.getByText('Dashboard loaded')).toBeVisible();
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Diagnosis: Error contains "timeout" or "not visible", often after navigation
|
||||
- Fix: Replace hard waits with network-first pattern or element state waits
|
||||
- Prevention: ALWAYS intercept before navigate, use waitForResponse()
|
||||
- Automation: Detect `page.waitForTimeout()` or `cy.wait(number)` in test code (see the scanner sketch below)
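A small scanner sketch that applies `hasHardWait` across spec files could look like this (the script path, the `glob` dependency, and the directory layout are assumptions):

```typescript
// scripts/scan-hard-waits.ts (illustrative sketch)
import { readFileSync } from 'fs';
import { globSync } from 'glob'; // assumes the `glob` package is installed
import { hasHardWait, suggestDeterministicWait } from '../src/testing/healing/timing-healing';

// Report every spec file that still relies on hard waits, with a suggested fix
for (const file of globSync('tests/**/*.spec.ts')) {
  const code = readFileSync(file, 'utf-8');
  if (hasHardWait(code)) {
    console.log(`Hard wait detected in ${file}`);
    console.log(suggestDeterministicWait(code));
  }
}
```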
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Common Failure Pattern - Dynamic Data Assertions (Non-Deterministic IDs)
|
||||
|
||||
**Context**: Test fails with "Expected 'User 123' but received 'User 456'" or timestamp mismatches
|
||||
|
||||
**Diagnostic Signature**:
|
||||
|
||||
```typescript
|
||||
// src/testing/healing/data-healing.ts
|
||||
|
||||
export type DataFailure = {
|
||||
errorMessage: string;
|
||||
expectedValue: string;
|
||||
actualValue: string;
|
||||
testFile: string;
|
||||
lineNumber: number;
|
||||
};
|
||||
|
||||
/**
|
||||
* Detect dynamic data assertion failures
|
||||
*/
|
||||
export function isDynamicDataFailure(error: Error): boolean {
|
||||
const patterns = [
|
||||
/expected.*\d+.*received.*\d+/i, // ID mismatches
|
||||
/expected.*\d{4}-\d{2}-\d{2}.*received/i, // Date mismatches
|
||||
/expected.*user.*\d+/i, // Dynamic user IDs
|
||||
/expected.*order.*\d+/i, // Dynamic order IDs
|
||||
/expected.*to.*contain.*\d+/i, // Numeric assertions
|
||||
];
|
||||
|
||||
return patterns.some((pattern) => pattern.test(error.message));
|
||||
}
|
||||
|
||||
/**
|
||||
* Suggest flexible assertion pattern
|
||||
*/
|
||||
export function suggestFlexibleAssertion(errorMessage: string): string {
|
||||
if (/expected.*user.*\d+/i.test(errorMessage)) {
|
||||
return `
|
||||
// ❌ Bad: Hardcoded ID
|
||||
// await expect(page.getByText('User 123')).toBeVisible()
|
||||
|
||||
// ✅ Good: Regex pattern for any user ID
|
||||
await expect(page.getByText(/User \\d+/)).toBeVisible()
|
||||
|
||||
// OR use partial match
|
||||
await expect(page.locator('[data-testid="user-name"]')).toContainText('User')
|
||||
`.trim();
|
||||
}
|
||||
|
||||
if (/expected.*\d{4}-\d{2}-\d{2}/i.test(errorMessage)) {
|
||||
return `
|
||||
// ❌ Bad: Hardcoded date
|
||||
// await expect(page.getByText('2024-01-15')).toBeVisible()
|
||||
|
||||
// ✅ Good: Dynamic date validation
|
||||
const today = new Date().toISOString().split('T')[0]
|
||||
await expect(page.getByTestId('created-date')).toHaveText(today)
|
||||
|
||||
// OR use date format regex
|
||||
await expect(page.getByTestId('created-date')).toHaveText(/\\d{4}-\\d{2}-\\d{2}/)
|
||||
`.trim();
|
||||
}
|
||||
|
||||
if (/expected.*order.*\d+/i.test(errorMessage)) {
|
||||
return `
|
||||
// ❌ Bad: Hardcoded order ID
|
||||
// const orderId = '12345'
|
||||
|
||||
// ✅ Good: Capture dynamic order ID
|
||||
const orderText = await page.getByTestId('order-id').textContent()
|
||||
const orderId = orderText?.match(/Order #(\\d+)/)?.[1]
|
||||
expect(orderId).toBeTruthy()
|
||||
|
||||
// Use captured ID in later assertions
|
||||
await expect(page.getByText(\`Order #\${orderId} confirmed\`)).toBeVisible()
|
||||
`.trim();
|
||||
}
|
||||
|
||||
return `Use regex patterns, partial matching, or capture dynamic values instead of hardcoding`;
|
||||
}
|
||||
```
|
||||
|
||||
**Healing Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/healing/data-healing.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test('heal dynamic ID assertion with regex', async ({ page }) => {
|
||||
await page.goto('/users');
|
||||
|
||||
// ❌ Original (fails with random IDs): await expect(page.getByText('User 123')).toBeVisible()
|
||||
|
||||
// ✅ Healed: Regex pattern matches any user ID
|
||||
await expect(page.getByText(/User \d+/)).toBeVisible();
|
||||
});
|
||||
|
||||
test('heal timestamp assertion with dynamic generation', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ❌ Original (fails daily): await expect(page.getByText('2024-01-15')).toBeVisible()
|
||||
|
||||
// ✅ Healed: Generate expected date dynamically
|
||||
const today = new Date().toISOString().split('T')[0];
|
||||
await expect(page.getByTestId('last-updated')).toContainText(today);
|
||||
});
|
||||
|
||||
test('heal order ID assertion with capture', async ({ page, request }) => {
|
||||
// Create order via API (dynamic ID)
|
||||
const response = await request.post('/api/orders', {
|
||||
data: { productId: '123', quantity: 1 },
|
||||
});
|
||||
const { orderId } = await response.json();
|
||||
|
||||
// ✅ Healed: Use captured dynamic ID
|
||||
await page.goto(`/orders/${orderId}`);
|
||||
await expect(page.getByText(`Order #${orderId}`)).toBeVisible();
|
||||
});
|
||||
```

**Key Points**:

- Diagnosis: Error message shows expected vs actual value mismatch with IDs/timestamps
- Fix: Use regex patterns (`/User \d+/`), partial matching, or capture dynamic values
- Prevention: Never hardcode IDs, timestamps, or random data in assertions
- Automation: Parse error message for expected/actual values, suggest regex patterns

---

### Example 4: Common Failure Pattern - Network Errors (Missing Route Interception)

**Context**: Test fails with "API call failed" or "500 error" during test execution

**Diagnostic Signature**:

```typescript
|
||||
// src/testing/healing/network-healing.ts
|
||||
|
||||
export type NetworkFailure = {
|
||||
errorMessage: string;
|
||||
url: string;
|
||||
statusCode: number;
|
||||
method: string;
|
||||
};
|
||||
|
||||
/**
|
||||
* Detect network failure
|
||||
*/
|
||||
export function isNetworkFailure(error: Error): boolean {
|
||||
const patterns = [
|
||||
/api.*call.*failed/i,
|
||||
/request.*failed/i,
|
||||
/network.*error/i,
|
||||
/500.*internal server error/i,
|
||||
/503.*service unavailable/i,
|
||||
/fetch.*failed/i,
|
||||
];
|
||||
|
||||
return patterns.some((pattern) => pattern.test(error.message));
|
||||
}
|
||||
|
||||
/**
|
||||
* Suggest route interception
|
||||
*/
|
||||
export function suggestRouteInterception(url: string, method: string): string {
|
||||
return `
|
||||
// ❌ Bad: Real API call (unreliable, slow, external dependency)
|
||||
|
||||
// ✅ Good: Mock API response with route interception
|
||||
await page.route('${url}', route => {
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
contentType: 'application/json',
|
||||
body: JSON.stringify({
|
||||
// Mock response data
|
||||
id: 1,
|
||||
name: 'Test User',
|
||||
email: 'test@example.com'
|
||||
})
|
||||
})
|
||||
})
|
||||
|
||||
// Then perform action
|
||||
await page.goto('/page')
|
||||
`.trim();
|
||||
}
|
||||
```
|
||||
|
||||
**Healing Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/healing/network-healing.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test('heal network failure with route mocking', async ({ page, context }) => {
|
||||
// ✅ Healed: Mock API to prevent real network calls
|
||||
await context.route('**/api/products', (route) => {
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
contentType: 'application/json',
|
||||
body: JSON.stringify({
|
||||
products: [
|
||||
{ id: 1, name: 'Product A', price: 29.99 },
|
||||
{ id: 2, name: 'Product B', price: 49.99 },
|
||||
],
|
||||
}),
|
||||
});
|
||||
});
|
||||
|
||||
await page.goto('/products');
|
||||
|
||||
// Test now reliable (no external API dependency)
|
||||
await expect(page.getByText('Product A')).toBeVisible();
|
||||
await expect(page.getByText('$29.99')).toBeVisible();
|
||||
});
|
||||
|
||||
test('heal 500 error with error state mocking', async ({ page, context }) => {
|
||||
// Mock API failure scenario
|
||||
await context.route('**/api/products', (route) => {
|
||||
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Internal Server Error' }) });
|
||||
});
|
||||
|
||||
await page.goto('/products');
|
||||
|
||||
// Verify error handling (not crash)
|
||||
await expect(page.getByText('Unable to load products')).toBeVisible();
|
||||
await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
|
||||
});
|
||||
```

**Key Points**:

- Diagnosis: Error message contains "API call failed", "500 error", or network-related failures
- Fix: Add `page.route()` or `cy.intercept()` to mock API responses
- Prevention: Mock ALL external dependencies (APIs, third-party services)
- Automation: Extract URL from error message, generate route interception code

---

### Example 5: Common Failure Pattern - Hard Waits (Unreliable Timing)

**Context**: Test fails intermittently with "timeout exceeded" or passes/fails randomly

**Diagnostic Signature**:

```typescript
|
||||
// src/testing/healing/hard-wait-healing.ts
|
||||
|
||||
/**
|
||||
* Detect hard wait anti-pattern in test code
|
||||
*/
|
||||
export function detectHardWaits(testCode: string): Array<{ line: number; code: string }> {
|
||||
const lines = testCode.split('\n');
|
||||
const violations: Array<{ line: number; code: string }> = [];
|
||||
|
||||
lines.forEach((line, index) => {
|
||||
if (line.includes('page.waitForTimeout(') || /cy\.wait\(\d+\)/.test(line) || line.includes('sleep(') || line.includes('setTimeout(')) {
|
||||
violations.push({ line: index + 1, code: line.trim() });
|
||||
}
|
||||
});
|
||||
|
||||
return violations;
|
||||
}
|
||||
|
||||
/**
|
||||
* Suggest event-based wait replacement
|
||||
*/
|
||||
export function suggestEventBasedWait(hardWaitLine: string): string {
|
||||
if (hardWaitLine.includes('page.waitForTimeout')) {
|
||||
return `
|
||||
// ❌ Bad: Hard wait (flaky)
|
||||
${hardWaitLine}
|
||||
|
||||
// ✅ Good: Wait for network response
|
||||
await page.waitForResponse(resp => resp.url().includes('/api/') && resp.ok())
|
||||
|
||||
// OR wait for element state change
|
||||
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' })
|
||||
await page.getByTestId('content').waitFor({ state: 'visible' })
|
||||
`.trim();
|
||||
}
|
||||
|
||||
if (/cy\.wait\(\d+\)/.test(hardWaitLine)) {
|
||||
return `
|
||||
// ❌ Bad: Hard wait (flaky)
|
||||
${hardWaitLine}
|
||||
|
||||
// ✅ Good: Wait for aliased request
|
||||
cy.intercept('GET', '/api/data').as('getData')
|
||||
cy.visit('/page')
|
||||
cy.wait('@getData') // Deterministic
|
||||
`.trim();
|
||||
}
|
||||
|
||||
return 'Replace hard waits with event-based waits (waitForResponse, waitFor state changes)';
|
||||
}
|
||||
```
|
||||
|
||||
**Healing Implementation**:
|
||||
|
||||
```typescript
|
||||
// tests/healing/hard-wait-healing.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test('heal hard wait with deterministic wait', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// ❌ Original (flaky): await page.waitForTimeout(3000)
|
||||
|
||||
// ✅ Healed: Wait for loading spinner to disappear
|
||||
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });
|
||||
|
||||
// OR wait for specific network response
|
||||
await page.waitForResponse((resp) => resp.url().includes('/api/dashboard') && resp.ok());
|
||||
|
||||
await expect(page.getByText('Dashboard ready')).toBeVisible();
|
||||
});
|
||||
|
||||
test('heal implicit wait with explicit network wait', async ({ page }) => {
|
||||
const responsePromise = page.waitForResponse('**/api/products');
|
||||
|
||||
await page.goto('/products');
|
||||
|
||||
// ❌ Original (race condition): await page.getByText('Product A').click()
|
||||
|
||||
// ✅ Healed: Wait for network first
|
||||
await responsePromise;
|
||||
await page.getByText('Product A').click();
|
||||
|
||||
await expect(page).toHaveURL(/\/products\/\d+/);
|
||||
});
|
||||
```

**Key Points**:

- Diagnosis: Test code contains `page.waitForTimeout()` or `cy.wait(number)`
- Fix: Replace with `waitForResponse()`, `waitFor({ state })`, or aliased intercepts
- Prevention: NEVER use hard waits, always use event-based/response-based waits
- Automation: Scan test code for hard wait patterns, suggest deterministic replacements

---

## Healing Pattern Catalog

| Failure Type   | Diagnostic Signature                          | Healing Strategy                      | Prevention Pattern                        |
| -------------- | --------------------------------------------- | ------------------------------------- | ----------------------------------------- |
| Stale Selector | "locator resolved to 0 elements"              | Replace with data-testid or ARIA role | Selector hierarchy (testid > ARIA > text) |
| Race Condition | "timeout waiting for element"                 | Add network-first interception        | Intercept before navigate                 |
| Dynamic Data   | "Expected 'User 123' but got 'User 456'"      | Use regex or capture dynamic values   | Never hardcode IDs/timestamps             |
| Network Error  | "API call failed", "500 error"                | Add route mocking                     | Mock all external dependencies            |
| Hard Wait      | Test contains `waitForTimeout()` or `wait(n)` | Replace with event-based waits        | Always use deterministic waits            |

## Healing Workflow

1. **Run test** → Capture failure
2. **Identify pattern** → Match error against diagnostic signatures
3. **Apply fix** → Use pattern-based healing strategy
4. **Re-run test** → Validate fix (max 3 iterations)
5. **Mark unfixable** → Use `test.fixme()` if healing fails after 3 attempts

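The diagnostic helpers from the examples above can drive step 2 of this workflow. Below is a minimal dispatcher sketch, assuming the module layout used in those examples (`data-healing.ts`, `network-healing.ts`, `hard-wait-healing.ts`); the URL and method passed to `suggestRouteInterception` are placeholders that a real implementation would extract from the captured failure.

```typescript
// src/testing/healing/dispatcher.ts (illustrative sketch)
import { isDynamicDataFailure, suggestFlexibleAssertion } from './data-healing';
import { isNetworkFailure, suggestRouteInterception } from './network-healing';
import { detectHardWaits, suggestEventBasedWait } from './hard-wait-healing';

export type HealingSuggestion = { failureType: string; suggestion: string };

/**
 * Step 2: match a captured failure against the diagnostic signatures
 * and return a pattern-based healing suggestion (null if unrecognized).
 */
export function suggestHealing(error: Error, testCode: string): HealingSuggestion | null {
  if (isDynamicDataFailure(error)) {
    return { failureType: 'dynamic-data', suggestion: suggestFlexibleAssertion(error.message) };
  }

  if (isNetworkFailure(error)) {
    // Placeholder URL/method: extract these from the failure details in a real implementation
    return { failureType: 'network', suggestion: suggestRouteInterception('**/api/**', 'GET') };
  }

  const hardWaits = detectHardWaits(testCode);
  if (hardWaits.length > 0) {
    return { failureType: 'hard-wait', suggestion: suggestEventBasedWait(hardWaits[0].code) };
  }

  return null; // Unrecognized pattern → after 3 failed attempts, mark test.fixme()
}
```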

## Healing Checklist

Before enabling auto-healing in workflows:

- [ ] **Failure catalog documented**: Common patterns identified (selectors, timing, data, network, hard waits)
- [ ] **Diagnostic signatures defined**: Error message patterns for each failure type
- [ ] **Healing strategies documented**: Fix patterns for each failure type
- [ ] **Prevention patterns documented**: Best practices to avoid recurrence
- [ ] **Healing iteration limit set**: Max 3 attempts before marking `test.fixme()`
- [ ] **MCP integration optional**: Graceful degradation without Playwright MCP
- [ ] **Pattern-based fallback**: Use knowledge base patterns when MCP unavailable
- [ ] **Healing report generated**: Document what was healed and how

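The healing report itself is not given a concrete format in this fragment. As a loose illustration only, one possible shape for a report entry might be:

```typescript
// Illustrative shape for a healing report entry (an assumption, not a schema defined here)
export type HealingReportEntry = {
  testFile: string;
  failureType: 'stale-selector' | 'race-condition' | 'dynamic-data' | 'network' | 'hard-wait';
  attempts: number;            // capped at 3 per the workflow above
  outcome: 'healed' | 'fixme'; // test.fixme() applied when healing fails
  fixApplied?: string;         // e.g. the suggested replacement snippet
};
```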

## Integration Points

- **Used in workflows**: `*automate` (auto-healing after test generation), `*atdd` (optional healing for acceptance tests)
- **Related fragments**: `selector-resilience.md` (selector debugging), `timing-debugging.md` (race condition fixes), `network-first.md` (interception patterns), `data-factories.md` (dynamic data handling)
- **Tools**: Error message parsing, AST analysis for code patterns, Playwright MCP (optional), pattern matching

_Source: Playwright test-healer patterns, production test failure analysis, common anti-patterns from test-resources-for-ai_

@@ -146,3 +146,328 @@ Examples:

- `1.3-UNIT-001`
- `1.3-INT-002`
- `1.3-E2E-001`

## Real Code Examples

### Example 1: E2E Test (Full User Journey)

**Scenario**: User logs in, navigates to dashboard, and places an order.

```typescript
|
||||
// tests/e2e/checkout-flow.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
import { createUser, createProduct } from '../test-utils/factories';
|
||||
|
||||
test.describe('Checkout Flow', () => {
|
||||
test('user can complete purchase with saved payment method', async ({ page, apiRequest }) => {
|
||||
// Setup: Seed data via API (fast!)
|
||||
const user = createUser({ email: 'buyer@example.com', hasSavedCard: true });
|
||||
const product = createProduct({ name: 'Widget', price: 29.99, stock: 10 });
|
||||
|
||||
await apiRequest.post('/api/users', { data: user });
|
||||
await apiRequest.post('/api/products', { data: product });
|
||||
|
||||
// Network-first: Intercept BEFORE action
|
||||
const loginPromise = page.waitForResponse('**/api/auth/login');
|
||||
const cartPromise = page.waitForResponse('**/api/cart');
|
||||
const orderPromise = page.waitForResponse('**/api/orders');
|
||||
|
||||
// Step 1: Login
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email"]', user.email);
|
||||
await page.fill('[data-testid="password"]', 'password123');
|
||||
await page.click('[data-testid="login-button"]');
|
||||
await loginPromise;
|
||||
|
||||
// Assert: Dashboard visible
|
||||
await expect(page).toHaveURL('/dashboard');
|
||||
await expect(page.getByText(`Welcome, ${user.name}`)).toBeVisible();
|
||||
|
||||
// Step 2: Add product to cart
|
||||
await page.goto(`/products/${product.id}`);
|
||||
await page.click('[data-testid="add-to-cart"]');
|
||||
await cartPromise;
|
||||
await expect(page.getByText('Added to cart')).toBeVisible();
|
||||
|
||||
// Step 3: Checkout with saved payment
|
||||
await page.goto('/checkout');
|
||||
await expect(page.getByText('Visa ending in 1234')).toBeVisible(); // Saved card
|
||||
await page.click('[data-testid="use-saved-card"]');
|
||||
await page.click('[data-testid="place-order"]');
|
||||
await orderPromise;
|
||||
|
||||
// Assert: Order confirmation
|
||||
await expect(page.getByText('Order Confirmed')).toBeVisible();
|
||||
await expect(page.getByText(/Order #\d+/)).toBeVisible();
|
||||
await expect(page.getByText('$29.99')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```

**Key Points (E2E)**:

- Tests complete user journey across multiple pages
- API setup for data (fast), UI for assertions (user-centric)
- Network-first interception to prevent flakiness
- Validates critical revenue path end-to-end

### Example 2: Integration Test (API/Service Layer)

**Scenario**: UserService creates user and assigns role via AuthRepository.

```typescript
|
||||
// tests/integration/user-service.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
import { createUser } from '../test-utils/factories';
|
||||
|
||||
test.describe('UserService Integration', () => {
|
||||
test('should create user with admin role via API', async ({ request }) => {
|
||||
const userData = createUser({ role: 'admin' });
|
||||
|
||||
// Direct API call (no UI)
|
||||
const response = await request.post('/api/users', {
|
||||
data: userData,
|
||||
});
|
||||
|
||||
expect(response.status()).toBe(201);
|
||||
|
||||
const createdUser = await response.json();
|
||||
expect(createdUser.id).toBeTruthy();
|
||||
expect(createdUser.email).toBe(userData.email);
|
||||
expect(createdUser.role).toBe('admin');
|
||||
|
||||
// Verify database state
|
||||
const getResponse = await request.get(`/api/users/${createdUser.id}`);
|
||||
expect(getResponse.status()).toBe(200);
|
||||
|
||||
const fetchedUser = await getResponse.json();
|
||||
expect(fetchedUser.role).toBe('admin');
|
||||
expect(fetchedUser.permissions).toContain('user:delete');
|
||||
expect(fetchedUser.permissions).toContain('user:update');
|
||||
|
||||
// Cleanup
|
||||
await request.delete(`/api/users/${createdUser.id}`);
|
||||
});
|
||||
|
||||
test('should validate email uniqueness constraint', async ({ request }) => {
|
||||
const userData = createUser({ email: 'duplicate@example.com' });
|
||||
|
||||
// Create first user
|
||||
const response1 = await request.post('/api/users', { data: userData });
|
||||
expect(response1.status()).toBe(201);
|
||||
|
||||
const user1 = await response1.json();
|
||||
|
||||
// Attempt duplicate email
|
||||
const response2 = await request.post('/api/users', { data: userData });
|
||||
expect(response2.status()).toBe(409); // Conflict
|
||||
const error = await response2.json();
|
||||
expect(error.message).toContain('Email already exists');
|
||||
|
||||
// Cleanup
|
||||
await request.delete(`/api/users/${user1.id}`);
|
||||
});
|
||||
});
|
||||
```

**Key Points (Integration)**:

- Tests service layer + database interaction
- No UI involved—pure API validation
- Business logic focus (role assignment, constraints)
- Faster than E2E, more realistic than unit tests

### Example 3: Component Test (Isolated UI Component)

**Scenario**: Test button component in isolation with props and user interactions.

```typescript
|
||||
// src/components/Button.cy.tsx (Cypress Component Test)
|
||||
import { Button } from './Button';
|
||||
|
||||
describe('Button Component', () => {
|
||||
it('should render with correct label', () => {
|
||||
cy.mount(<Button label="Click Me" />);
|
||||
cy.contains('Click Me').should('be.visible');
|
||||
});
|
||||
|
||||
it('should call onClick handler when clicked', () => {
|
||||
const onClickSpy = cy.stub().as('onClick');
|
||||
cy.mount(<Button label="Submit" onClick={onClickSpy} />);
|
||||
|
||||
cy.get('button').click();
|
||||
cy.get('@onClick').should('have.been.calledOnce');
|
||||
});
|
||||
|
||||
it('should be disabled when disabled prop is true', () => {
|
||||
cy.mount(<Button label="Disabled" disabled={true} />);
|
||||
cy.get('button').should('be.disabled');
|
||||
cy.get('button').should('have.attr', 'aria-disabled', 'true');
|
||||
});
|
||||
|
||||
it('should show loading spinner when loading', () => {
|
||||
cy.mount(<Button label="Loading" loading={true} />);
|
||||
cy.get('[data-testid="spinner"]').should('be.visible');
|
||||
cy.get('button').should('be.disabled');
|
||||
});
|
||||
|
||||
it('should apply variant styles correctly', () => {
|
||||
cy.mount(<Button label="Primary" variant="primary" />);
|
||||
cy.get('button').should('have.class', 'btn-primary');
|
||||
|
||||
cy.mount(<Button label="Secondary" variant="secondary" />);
|
||||
cy.get('button').should('have.class', 'btn-secondary');
|
||||
});
|
||||
});
|
||||
|
||||
// Playwright Component Test equivalent
|
||||
import { test, expect } from '@playwright/experimental-ct-react';
|
||||
import { Button } from './Button';
|
||||
|
||||
test.describe('Button Component', () => {
|
||||
test('should call onClick handler when clicked', async ({ mount }) => {
|
||||
let clicked = false;
|
||||
const component = await mount(
|
||||
<Button label="Submit" onClick={() => { clicked = true; }} />
|
||||
);
|
||||
|
||||
await component.getByRole('button').click();
|
||||
expect(clicked).toBe(true);
|
||||
});
|
||||
|
||||
test('should be disabled when loading', async ({ mount }) => {
|
||||
const component = await mount(<Button label="Loading" loading={true} />);
|
||||
await expect(component.getByRole('button')).toBeDisabled();
|
||||
await expect(component.getByTestId('spinner')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```

**Key Points (Component)**:

- Tests UI component in isolation (no full app)
- Props + user interactions + visual states
- Faster than E2E, more realistic than unit tests for UI
- Great for design system components

### Example 4: Unit Test (Pure Function)

**Scenario**: Test pure business logic function without framework dependencies.

```typescript
|
||||
// src/utils/price-calculator.test.ts (Jest/Vitest)
|
||||
import { calculateDiscount, applyTaxes, calculateTotal } from './price-calculator';
|
||||
|
||||
describe('PriceCalculator', () => {
|
||||
describe('calculateDiscount', () => {
|
||||
it('should apply percentage discount correctly', () => {
|
||||
const result = calculateDiscount(100, { type: 'percentage', value: 20 });
|
||||
expect(result).toBe(80);
|
||||
});
|
||||
|
||||
it('should apply fixed amount discount correctly', () => {
|
||||
const result = calculateDiscount(100, { type: 'fixed', value: 15 });
|
||||
expect(result).toBe(85);
|
||||
});
|
||||
|
||||
it('should not apply discount below zero', () => {
|
||||
const result = calculateDiscount(10, { type: 'fixed', value: 20 });
|
||||
expect(result).toBe(0);
|
||||
});
|
||||
|
||||
it('should handle no discount', () => {
|
||||
const result = calculateDiscount(100, { type: 'none', value: 0 });
|
||||
expect(result).toBe(100);
|
||||
});
|
||||
});
|
||||
|
||||
describe('applyTaxes', () => {
|
||||
it('should calculate tax correctly for US', () => {
|
||||
const result = applyTaxes(100, { country: 'US', rate: 0.08 });
|
||||
expect(result).toBe(108);
|
||||
});
|
||||
|
||||
it('should calculate tax correctly for EU (VAT)', () => {
|
||||
const result = applyTaxes(100, { country: 'DE', rate: 0.19 });
|
||||
expect(result).toBe(119);
|
||||
});
|
||||
|
||||
it('should handle zero tax rate', () => {
|
||||
const result = applyTaxes(100, { country: 'US', rate: 0 });
|
||||
expect(result).toBe(100);
|
||||
});
|
||||
});
|
||||
|
||||
describe('calculateTotal', () => {
|
||||
it('should calculate total with discount and taxes', () => {
|
||||
const items = [
|
||||
{ price: 50, quantity: 2 }, // 100
|
||||
{ price: 30, quantity: 1 }, // 30
|
||||
];
|
||||
const discount = { type: 'percentage', value: 10 }; // -13
|
||||
const tax = { country: 'US', rate: 0.08 }; // +9.36
|
||||
|
||||
const result = calculateTotal(items, discount, tax);
|
||||
expect(result).toBeCloseTo(126.36, 2);
|
||||
});
|
||||
|
||||
it('should handle empty items array', () => {
|
||||
const result = calculateTotal([], { type: 'none', value: 0 }, { country: 'US', rate: 0 });
|
||||
expect(result).toBe(0);
|
||||
});
|
||||
|
||||
it('should calculate correctly without discount or tax', () => {
|
||||
const items = [{ price: 25, quantity: 4 }];
|
||||
const result = calculateTotal(items, { type: 'none', value: 0 }, { country: 'US', rate: 0 });
|
||||
expect(result).toBe(100);
|
||||
});
|
||||
});
|
||||
});
|
||||
```

**Key Points (Unit)**:

- Pure function testing—no framework dependencies
- Fast execution (milliseconds)
- Edge case coverage (zero, negative, empty inputs)
- High cyclomatic complexity handled at unit level

## When to Use Which Level

| Scenario               | Unit          | Integration       | E2E           |
| ---------------------- | ------------- | ----------------- | ------------- |
| Pure business logic    | ✅ Primary    | ❌ Overkill       | ❌ Overkill   |
| Database operations    | ❌ Can't test | ✅ Primary        | ❌ Overkill   |
| API contracts          | ❌ Can't test | ✅ Primary        | ⚠️ Supplement |
| User journeys          | ❌ Can't test | ❌ Can't test     | ✅ Primary    |
| Component props/events | ✅ Partial    | ⚠️ Component test | ❌ Overkill   |
| Visual regression      | ❌ Can't test | ⚠️ Component test | ✅ Primary    |
| Error handling (logic) | ✅ Primary    | ⚠️ Integration    | ❌ Overkill   |
| Error handling (UI)    | ❌ Partial    | ⚠️ Component test | ✅ Primary    |

## Anti-Pattern Examples

**❌ BAD: E2E test for business logic**

```typescript
// DON'T DO THIS
test('calculate discount via UI', async ({ page }) => {
  await page.goto('/calculator');
  await page.fill('[data-testid="price"]', '100');
  await page.fill('[data-testid="discount"]', '20');
  await page.click('[data-testid="calculate"]');
  await expect(page.getByText('$80')).toBeVisible();
});
// Problem: Slow, brittle, tests logic that should be unit tested
```

**✅ GOOD: Unit test for business logic**

```typescript
test('calculate discount', () => {
  expect(calculateDiscount(100, 20)).toBe(80);
});
// Fast, reliable, isolated
```

_Source: Murat Testing Philosophy (test pyramid), existing test-levels-framework.md structure._

@@ -172,3 +172,202 @@ Review and adjust priorities based on:

- Usage analytics
- Test failure history
- Business priority changes

---

## Automated Priority Classification

### Example: Priority Calculator (Risk-Based Automation)

```typescript
|
||||
// src/testing/priority-calculator.ts
|
||||
|
||||
export type Priority = 'P0' | 'P1' | 'P2' | 'P3';
|
||||
|
||||
export type PriorityFactors = {
|
||||
revenueImpact: 'critical' | 'high' | 'medium' | 'low' | 'none';
|
||||
userImpact: 'all' | 'majority' | 'some' | 'few' | 'minimal';
|
||||
securityRisk: boolean;
|
||||
complianceRequired: boolean;
|
||||
previousFailure: boolean;
|
||||
complexity: 'high' | 'medium' | 'low';
|
||||
usage: 'frequent' | 'regular' | 'occasional' | 'rare';
|
||||
};
|
||||
|
||||
/**
|
||||
* Calculate test priority based on multiple factors
|
||||
* Mirrors the priority decision tree with objective criteria
|
||||
*/
|
||||
export function calculatePriority(factors: PriorityFactors): Priority {
|
||||
const { revenueImpact, userImpact, securityRisk, complianceRequired, previousFailure, complexity, usage } = factors;
|
||||
|
||||
// P0: Revenue-critical, security, or compliance
|
||||
if (revenueImpact === 'critical' || securityRisk || complianceRequired || (previousFailure && revenueImpact === 'high')) {
|
||||
return 'P0';
|
||||
}
|
||||
|
||||
// P0: High revenue + high complexity + frequent usage
|
||||
if (revenueImpact === 'high' && complexity === 'high' && usage === 'frequent') {
|
||||
return 'P0';
|
||||
}
|
||||
|
||||
// P1: Core user journey (majority impacted + frequent usage)
|
||||
if (userImpact === 'all' || userImpact === 'majority') {
|
||||
if (usage === 'frequent' || complexity === 'high') {
|
||||
return 'P1';
|
||||
}
|
||||
}
|
||||
|
||||
// P1: High revenue OR high complexity with regular usage
|
||||
if ((revenueImpact === 'high' && usage === 'regular') || (complexity === 'high' && usage === 'frequent')) {
|
||||
return 'P1';
|
||||
}
|
||||
|
||||
// P2: Secondary features (some impact, occasional usage)
|
||||
if (userImpact === 'some' || usage === 'occasional') {
|
||||
return 'P2';
|
||||
}
|
||||
|
||||
// P3: Rarely used, low impact
|
||||
return 'P3';
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate priority justification (for audit trail)
|
||||
*/
|
||||
export function justifyPriority(factors: PriorityFactors): string {
|
||||
const priority = calculatePriority(factors);
|
||||
const reasons: string[] = [];
|
||||
|
||||
if (factors.revenueImpact === 'critical') reasons.push('critical revenue impact');
|
||||
if (factors.securityRisk) reasons.push('security-critical');
|
||||
if (factors.complianceRequired) reasons.push('compliance requirement');
|
||||
if (factors.previousFailure) reasons.push('regression prevention');
|
||||
if (factors.userImpact === 'all' || factors.userImpact === 'majority') {
|
||||
reasons.push(`impacts ${factors.userImpact} users`);
|
||||
}
|
||||
if (factors.complexity === 'high') reasons.push('high complexity');
|
||||
if (factors.usage === 'frequent') reasons.push('frequently used');
|
||||
|
||||
return `${priority}: ${reasons.join(', ')}`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Example: Payment scenario priority calculation
|
||||
*/
|
||||
const paymentScenario: PriorityFactors = {
|
||||
revenueImpact: 'critical',
|
||||
userImpact: 'all',
|
||||
securityRisk: true,
|
||||
complianceRequired: true,
|
||||
previousFailure: false,
|
||||
complexity: 'high',
|
||||
usage: 'frequent',
|
||||
};
|
||||
|
||||
console.log(calculatePriority(paymentScenario)); // 'P0'
|
||||
console.log(justifyPriority(paymentScenario));
|
||||
// 'P0: critical revenue impact, security-critical, compliance requirement, impacts all users, high complexity, frequently used'
|
||||
```

### Example: Test Suite Tagging Strategy

```typescript
|
||||
// tests/e2e/checkout.spec.ts
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
// Tag tests with priority for selective execution
|
||||
test.describe('Checkout Flow', () => {
|
||||
test('valid payment completes successfully @p0 @smoke @revenue', async ({ page }) => {
|
||||
// P0: Revenue-critical happy path
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('payment-method').selectOption('credit-card');
|
||||
await page.getByTestId('card-number').fill('4242424242424242');
|
||||
await page.getByRole('button', { name: 'Place Order' }).click();
|
||||
|
||||
await expect(page.getByText('Order confirmed')).toBeVisible();
|
||||
});
|
||||
|
||||
test('expired card shows user-friendly error @p1 @error-handling', async ({ page }) => {
|
||||
// P1: Core error scenario (frequent user impact)
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('payment-method').selectOption('credit-card');
|
||||
await page.getByTestId('card-number').fill('4000000000000069'); // Test card: expired
|
||||
await page.getByRole('button', { name: 'Place Order' }).click();
|
||||
|
||||
await expect(page.getByText('Card expired. Please use a different card.')).toBeVisible();
|
||||
});
|
||||
|
||||
test('coupon code applies discount correctly @p2', async ({ page }) => {
|
||||
// P2: Secondary feature (nice-to-have)
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('coupon-code').fill('SAVE10');
|
||||
await page.getByRole('button', { name: 'Apply' }).click();
|
||||
|
||||
await expect(page.getByText('10% discount applied')).toBeVisible();
|
||||
});
|
||||
|
||||
test('gift message formatting preserved @p3', async ({ page }) => {
|
||||
// P3: Cosmetic feature (rarely used)
|
||||
await page.goto('/checkout');
|
||||
await page.getByTestId('gift-message').fill('Happy Birthday!\n\nWith love.');
|
||||
await page.getByRole('button', { name: 'Place Order' }).click();
|
||||
|
||||
// Message formatting preserved (linebreaks intact)
|
||||
await expect(page.getByTestId('order-summary')).toContainText('Happy Birthday!');
|
||||
});
|
||||
});
|
||||
```

**Run tests by priority:**

```bash
# P0 only (smoke tests, 2-5 min)
npx playwright test --grep @p0

# P0 + P1 (core functionality, 10-15 min)
npx playwright test --grep "@p0|@p1"

# Full regression (all priorities, 30+ min)
npx playwright test
```

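The same selection can be driven from configuration instead of the command line. A small sketch using Playwright's standard `grep` config option follows; the `PRIORITY` environment variable name is illustrative.

```typescript
// playwright.config.ts (sketch): set PRIORITY="@p0" or PRIORITY="@p0|@p1" in the CI job
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Falls back to the full regression suite when PRIORITY is unset
  grep: process.env.PRIORITY ? new RegExp(process.env.PRIORITY) : undefined,
});
```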

---

## Integration with Risk Scoring

Priority should align with risk score from `probability-impact.md`:

| Risk Score | Typical Priority | Rationale                                  |
| ---------- | ---------------- | ------------------------------------------ |
| 9          | P0               | Critical blocker (probability=3, impact=3) |
| 6-8        | P0 or P1         | High risk (requires mitigation)            |
| 4-5        | P1 or P2         | Medium risk (monitor closely)              |
| 1-3        | P2 or P3         | Low risk (document and defer)              |

**Example**: Risk score 9 (checkout API failure) → P0 priority → comprehensive coverage required.

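To make that alignment mechanical, a small mapping from risk score to a default priority could reuse the `Priority` type from the calculator above. This is a sketch that picks the stricter option wherever the table allows two priorities; the import path is illustrative.

```typescript
// Illustrative sketch: default priority from a probability × impact risk score (1-9)
import type { Priority } from './priority-calculator';

export function priorityFromRiskScore(riskScore: number): Priority {
  if (riskScore >= 9) return 'P0'; // Critical blocker
  if (riskScore >= 6) return 'P0'; // High risk: P1 may be acceptable after mitigation review
  if (riskScore >= 4) return 'P1'; // Medium risk: monitor closely (P2 possible)
  return 'P2';                     // Low risk: document and defer (P3 for the lowest scores)
}
```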

---

## Priority Checklist

Before finalizing test priorities:

- [ ] **Revenue impact assessed**: Payment, subscription, billing features → P0
- [ ] **Security risks identified**: Auth, data exposure, injection attacks → P0
- [ ] **Compliance requirements documented**: GDPR, PCI-DSS, SOC2 → P0
- [ ] **User impact quantified**: >50% users → P0/P1, <10% → P2/P3
- [ ] **Previous failures reviewed**: Regression prevention → increase priority
- [ ] **Complexity evaluated**: >500 LOC or multiple dependencies → increase priority
- [ ] **Usage metrics consulted**: Frequent use → P0/P1, rare use → P2/P3
- [ ] **Monitoring coverage confirmed**: Strong monitoring → can decrease priority
- [ ] **Rollback capability verified**: Easy rollback → can decrease priority
- [ ] **Priorities tagged in tests**: @p0, @p1, @p2, @p3 for selective execution

## Integration Points

- **Used in workflows**: `*automate` (priority-based test generation), `*test-design` (scenario prioritization), `*trace` (coverage validation by priority)
- **Related fragments**: `risk-governance.md` (risk scoring), `probability-impact.md` (impact assessment), `selective-testing.md` (tag-based execution)
- **Tools**: Playwright/Cypress grep for tag filtering, CI scripts for priority-based execution

_Source: Risk-based testing practices, test prioritization strategies, production incident analysis_

@@ -1,10 +1,664 @@

# Test Quality Definition of Done

- No hard waits (`waitForTimeout`, `cy.wait(ms)`); rely on deterministic waits or event hooks.
- Each spec <300 lines and executes in ≤1.5 minutes.
- Tests are isolated, parallel-safe, and self-cleaning (seed via API/tasks, teardown after run).
- Assertions stay visible in test bodies; avoid conditional logic controlling test flow.
- Suites must pass locally and in CI with the same commands.
- Promote new tests only after they have failed for the intended reason at least once.

_Source: Murat quality checklist._

## Principle

Tests must be deterministic, isolated, explicit, focused, and fast. Every test should execute in under 1.5 minutes, contain fewer than 300 lines, avoid hard waits and conditionals, keep assertions visible in test bodies, and clean up after itself for parallel execution.

## Rationale

Quality tests provide reliable signal about application health. Flaky tests erode confidence and waste engineering time. Tests that use hard waits (`waitForTimeout(3000)`) are non-deterministic and slow. Tests with hidden assertions or conditional logic become unmaintainable. Large tests (>300 lines) are hard to understand and debug. Slow tests (>1.5 min) block CI pipelines. Self-cleaning tests prevent state pollution in parallel runs.

## Pattern Examples

### Example 1: Deterministic Test Pattern

**Context**: When writing tests, eliminate all sources of non-determinism: hard waits, conditionals controlling flow, try-catch for flow control, and random data without seeds.

**Implementation**:

```typescript
|
||||
// ❌ BAD: Non-deterministic test with conditionals and hard waits
|
||||
test('user can view dashboard - FLAKY', async ({ page }) => {
|
||||
await page.goto('/dashboard');
|
||||
await page.waitForTimeout(3000); // NEVER - arbitrary wait
|
||||
|
||||
// Conditional flow control - test behavior varies
|
||||
if (await page.locator('[data-testid="welcome-banner"]').isVisible()) {
|
||||
await page.click('[data-testid="dismiss-banner"]');
|
||||
await page.waitForTimeout(500);
|
||||
}
|
||||
|
||||
// Try-catch for flow control - hides real issues
|
||||
try {
|
||||
await page.click('[data-testid="load-more"]');
|
||||
} catch (e) {
|
||||
// Silently continue - test passes even if button missing
|
||||
}
|
||||
|
||||
// Random data without control
|
||||
const randomEmail = `user${Math.random()}@example.com`;
|
||||
await expect(page.getByText(randomEmail)).toBeVisible(); // Will fail randomly
|
||||
});
|
||||
|
||||
// ✅ GOOD: Deterministic test with explicit waits
|
||||
test('user can view dashboard', async ({ page, apiRequest }) => {
|
||||
const user = createUser({ email: 'test@example.com', hasSeenWelcome: true });
|
||||
|
||||
// Setup via API (fast, controlled)
|
||||
await apiRequest.post('/api/users', { data: user });
|
||||
|
||||
// Network-first: Intercept BEFORE navigate
|
||||
const dashboardPromise = page.waitForResponse((resp) => resp.url().includes('/api/dashboard') && resp.status() === 200);
|
||||
|
||||
await page.goto('/dashboard');
|
||||
|
||||
// Wait for actual response, not arbitrary time
|
||||
const dashboardResponse = await dashboardPromise;
|
||||
const dashboard = await dashboardResponse.json();
|
||||
|
||||
// Explicit assertions with controlled data
|
||||
await expect(page.getByText(`Welcome, ${user.name}`)).toBeVisible();
|
||||
await expect(page.getByTestId('dashboard-items')).toHaveCount(dashboard.items.length);
|
||||
|
||||
// No conditionals - test always executes same path
|
||||
// No try-catch - failures bubble up clearly
|
||||
});
|
||||
|
||||
// Cypress equivalent
|
||||
describe('Dashboard', () => {
|
||||
it('should display user dashboard', () => {
|
||||
const user = createUser({ email: 'test@example.com', hasSeenWelcome: true });
|
||||
|
||||
// Setup via task (fast, controlled)
|
||||
cy.task('db:seed', { users: [user] });
|
||||
|
||||
// Network-first interception
|
||||
cy.intercept('GET', '**/api/dashboard').as('getDashboard');
|
||||
|
||||
cy.visit('/dashboard');
|
||||
|
||||
// Deterministic wait for response
|
||||
cy.wait('@getDashboard').then((interception) => {
|
||||
const dashboard = interception.response.body;
|
||||
|
||||
// Explicit assertions
|
||||
cy.contains(`Welcome, ${user.name}`).should('be.visible');
|
||||
cy.get('[data-cy="dashboard-items"]').should('have.length', dashboard.items.length);
|
||||
});
|
||||
});
|
||||
});
|
||||
```
|

**Key Points**:

- Replace `waitForTimeout()` with `waitForResponse()` or element state checks
- Never use if/else to control test flow - tests should be deterministic
- Avoid try-catch for flow control - let failures bubble up clearly
- Use factory functions with controlled data, not `Math.random()`
- Network-first pattern prevents race conditions

### Example 2: Isolated Test with Cleanup

**Context**: When tests create data, they must clean up after themselves to prevent state pollution in parallel runs. Use fixture auto-cleanup or explicit teardown.

**Implementation**:

```typescript
|
||||
// ❌ BAD: Test leaves data behind, pollutes other tests
|
||||
test('admin can create user - POLLUTES STATE', async ({ page, apiRequest }) => {
|
||||
await page.goto('/admin/users');
|
||||
|
||||
// Hardcoded email - collides in parallel runs
|
||||
await page.fill('[data-testid="email"]', 'newuser@example.com');
|
||||
await page.fill('[data-testid="name"]', 'New User');
|
||||
await page.click('[data-testid="create-user"]');
|
||||
|
||||
await expect(page.getByText('User created')).toBeVisible();
|
||||
|
||||
// NO CLEANUP - user remains in database
|
||||
// Next test run fails: "Email already exists"
|
||||
});
|
||||
|
||||
// ✅ GOOD: Test cleans up with fixture auto-cleanup
|
||||
// playwright/support/fixtures/database-fixture.ts
|
||||
import { test as base } from '@playwright/test';
|
||||
import { deleteRecord, seedDatabase } from '../helpers/db-helpers';
|
||||
|
||||
type DatabaseFixture = {
|
||||
seedUser: (userData: Partial<User>) => Promise<User>;
|
||||
};
|
||||
|
||||
export const test = base.extend<DatabaseFixture>({
|
||||
seedUser: async ({}, use) => {
|
||||
const createdUsers: string[] = [];
|
||||
|
||||
const seedUser = async (userData: Partial<User>) => {
|
||||
const user = await seedDatabase('users', userData);
|
||||
createdUsers.push(user.id); // Track for cleanup
|
||||
return user;
|
||||
};
|
||||
|
||||
await use(seedUser);
|
||||
|
||||
// Auto-cleanup: Delete all users created during test
|
||||
for (const userId of createdUsers) {
|
||||
await deleteRecord('users', userId);
|
||||
}
|
||||
createdUsers.length = 0;
|
||||
},
|
||||
});
|
||||
|
||||
// Use the fixture
|
||||
test('admin can create user', async ({ page, seedUser }) => {
|
||||
// Create admin with unique data
|
||||
const admin = await seedUser({
|
||||
email: faker.internet.email(), // Unique each run
|
||||
role: 'admin',
|
||||
});
|
||||
|
||||
await page.goto('/admin/users');
|
||||
|
||||
const newUserEmail = faker.internet.email(); // Unique
|
||||
await page.fill('[data-testid="email"]', newUserEmail);
|
||||
await page.fill('[data-testid="name"]', 'New User');
|
||||
await page.click('[data-testid="create-user"]');
|
||||
|
||||
await expect(page.getByText('User created')).toBeVisible();
|
||||
|
||||
// Verify in database
|
||||
const createdUser = await seedUser({ email: newUserEmail });
|
||||
expect(createdUser.email).toBe(newUserEmail);
|
||||
|
||||
// Auto-cleanup happens via fixture teardown
|
||||
});
|
||||
|
||||
// Cypress equivalent with explicit cleanup
|
||||
describe('Admin User Management', () => {
|
||||
const createdUserIds: string[] = [];
|
||||
|
||||
afterEach(() => {
|
||||
// Cleanup: Delete all users created during test
|
||||
createdUserIds.forEach((userId) => {
|
||||
cy.task('db:delete', { table: 'users', id: userId });
|
||||
});
|
||||
createdUserIds.length = 0;
|
||||
});
|
||||
|
||||
it('should create user', () => {
|
||||
const admin = createUser({ role: 'admin' });
|
||||
const newUser = createUser(); // Unique data via faker
|
||||
|
||||
cy.task('db:seed', { users: [admin] }).then((result: any) => {
|
||||
createdUserIds.push(result.users[0].id);
|
||||
});
|
||||
|
||||
cy.visit('/admin/users');
|
||||
cy.get('[data-cy="email"]').type(newUser.email);
|
||||
cy.get('[data-cy="name"]').type(newUser.name);
|
||||
cy.get('[data-cy="create-user"]').click();
|
||||
|
||||
cy.contains('User created').should('be.visible');
|
||||
|
||||
// Track for cleanup
|
||||
cy.task('db:findByEmail', newUser.email).then((user: any) => {
|
||||
createdUserIds.push(user.id);
|
||||
});
|
||||
});
|
||||
});
|
||||
```

**Key Points**:

- Use fixtures with auto-cleanup via teardown (after `use()`)
- Track all created resources in an array during test execution
- Use `faker` for unique data - prevents parallel collisions
- Cypress: Use `afterEach()` with explicit cleanup
- Never hardcode IDs or emails - always generate unique values

### Example 3: Explicit Assertions in Tests

**Context**: When validating test results, keep assertions visible in test bodies. Never hide assertions in helper functions - this obscures test intent and makes failures harder to diagnose.

**Implementation**:

```typescript
|
||||
// ❌ BAD: Assertions hidden in helper functions
|
||||
// helpers/api-validators.ts
|
||||
export async function validateUserCreation(response: Response, expectedEmail: string) {
|
||||
const user = await response.json();
|
||||
expect(response.status()).toBe(201);
|
||||
expect(user.email).toBe(expectedEmail);
|
||||
expect(user.id).toBeTruthy();
|
||||
expect(user.createdAt).toBeTruthy();
|
||||
// Hidden assertions - not visible in test
|
||||
}
|
||||
|
||||
test('create user via API - OPAQUE', async ({ request }) => {
|
||||
const userData = createUser({ email: 'test@example.com' });
|
||||
|
||||
const response = await request.post('/api/users', { data: userData });
|
||||
|
||||
// What assertions are running? Have to check helper.
|
||||
await validateUserCreation(response, userData.email);
|
||||
// When this fails, error is: "validateUserCreation failed" - NOT helpful
|
||||
});
|
||||
|
||||
// ✅ GOOD: Assertions explicit in test
|
||||
test('create user via API', async ({ request }) => {
|
||||
const userData = createUser({ email: 'test@example.com' });
|
||||
|
||||
const response = await request.post('/api/users', { data: userData });
|
||||
|
||||
// All assertions visible - clear test intent
|
||||
expect(response.status()).toBe(201);
|
||||
|
||||
const createdUser = await response.json();
|
||||
expect(createdUser.id).toBeTruthy();
|
||||
expect(createdUser.email).toBe(userData.email);
|
||||
expect(createdUser.name).toBe(userData.name);
|
||||
expect(createdUser.role).toBe('user');
|
||||
expect(createdUser.createdAt).toBeTruthy();
|
||||
expect(createdUser.isActive).toBe(true);
|
||||
|
||||
// When this fails, error is: "Expected role to be 'user', got 'admin'" - HELPFUL
|
||||
});
|
||||
|
||||
// ✅ ACCEPTABLE: Helper for data extraction, NOT assertions
|
||||
// helpers/api-extractors.ts
|
||||
export async function extractUserFromResponse(response: Response): Promise<User> {
|
||||
const user = await response.json();
|
||||
return user; // Just extracts, no assertions
|
||||
}
|
||||
|
||||
test('create user with extraction helper', async ({ request }) => {
|
||||
const userData = createUser({ email: 'test@example.com' });
|
||||
|
||||
const response = await request.post('/api/users', { data: userData });
|
||||
|
||||
// Extract data with helper (OK)
|
||||
const createdUser = await extractUserFromResponse(response);
|
||||
|
||||
// But keep assertions in test (REQUIRED)
|
||||
expect(response.status()).toBe(201);
|
||||
expect(createdUser.email).toBe(userData.email);
|
||||
expect(createdUser.role).toBe('user');
|
||||
});
|
||||
|
||||
// Cypress equivalent
|
||||
describe('User API', () => {
|
||||
it('should create user with explicit assertions', () => {
|
||||
const userData = createUser({ email: 'test@example.com' });
|
||||
|
||||
cy.request('POST', '/api/users', userData).then((response) => {
|
||||
// All assertions visible in test
|
||||
expect(response.status).to.equal(201);
|
||||
expect(response.body.id).to.exist;
|
||||
expect(response.body.email).to.equal(userData.email);
|
||||
expect(response.body.name).to.equal(userData.name);
|
||||
expect(response.body.role).to.equal('user');
|
||||
expect(response.body.createdAt).to.exist;
|
||||
expect(response.body.isActive).to.be.true;
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
// ✅ GOOD: Parametrized tests for soft assertions (bulk validation)
|
||||
test.describe('User creation validation', () => {
|
||||
const testCases = [
|
||||
{ field: 'email', value: 'test@example.com', expected: 'test@example.com' },
|
||||
{ field: 'name', value: 'Test User', expected: 'Test User' },
|
||||
{ field: 'role', value: 'admin', expected: 'admin' },
|
||||
{ field: 'isActive', value: true, expected: true },
|
||||
];
|
||||
|
||||
for (const { field, value, expected } of testCases) {
|
||||
test(`should set ${field} correctly`, async ({ request }) => {
|
||||
const userData = createUser({ [field]: value });
|
||||
|
||||
const response = await request.post('/api/users', { data: userData });
|
||||
const user = await response.json();
|
||||
|
||||
// Parametrized assertion - still explicit
|
||||
expect(user[field]).toBe(expected);
|
||||
});
|
||||
}
|
||||
});
|
||||
```

**Key Points**:

- Never hide `expect()` calls in helper functions
- Helpers can extract/transform data, but assertions stay in tests
- Parametrized tests are acceptable for bulk validation (still explicit)
- Explicit assertions make failures actionable: "Expected X, got Y"
- Hidden assertions produce vague failures: "Helper function failed"

### Example 4: Test Length Limits

**Context**: When tests grow beyond 300 lines, they become hard to understand, debug, and maintain. Refactor long tests by extracting setup helpers, splitting scenarios, or using fixtures.

**Implementation**:

```typescript
|
||||
// ❌ BAD: 400-line monolithic test (truncated for example)
|
||||
test('complete user journey - TOO LONG', async ({ page, request }) => {
|
||||
// 50 lines of setup
|
||||
const admin = createUser({ role: 'admin' });
|
||||
await request.post('/api/users', { data: admin });
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email"]', admin.email);
|
||||
await page.fill('[data-testid="password"]', 'password123');
|
||||
await page.click('[data-testid="login"]');
|
||||
await expect(page).toHaveURL('/dashboard');
|
||||
|
||||
// 100 lines of user creation
|
||||
await page.goto('/admin/users');
|
||||
const newUser = createUser();
|
||||
await page.fill('[data-testid="email"]', newUser.email);
|
||||
// ... 95 more lines of form filling, validation, etc.
|
||||
|
||||
// 100 lines of permissions assignment
|
||||
await page.click('[data-testid="assign-permissions"]');
|
||||
// ... 95 more lines
|
||||
|
||||
// 100 lines of notification preferences
|
||||
await page.click('[data-testid="notification-settings"]');
|
||||
// ... 95 more lines
|
||||
|
||||
// 50 lines of cleanup
|
||||
await request.delete(`/api/users/${newUser.id}`);
|
||||
// ... 45 more lines
|
||||
|
||||
// TOTAL: 400 lines - impossible to understand or debug
|
||||
});
|
||||
|
||||
// ✅ GOOD: Split into focused tests with shared fixture
|
||||
// playwright/support/fixtures/admin-fixture.ts
|
||||
export const test = base.extend({
|
||||
adminPage: async ({ page, request }, use) => {
|
||||
// Shared setup: Login as admin
|
||||
const admin = createUser({ role: 'admin' });
|
||||
await request.post('/api/users', { data: admin });
|
||||
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email"]', admin.email);
|
||||
await page.fill('[data-testid="password"]', 'password123');
|
||||
await page.click('[data-testid="login"]');
|
||||
await expect(page).toHaveURL('/dashboard');
|
||||
|
||||
await use(page); // Provide logged-in page
|
||||
|
||||
// Cleanup handled by fixture
|
||||
},
|
||||
});
|
||||
|
||||
// Test 1: User creation (50 lines)
|
||||
test('admin can create user', async ({ adminPage, seedUser }) => {
|
||||
await adminPage.goto('/admin/users');
|
||||
|
||||
const newUser = createUser();
|
||||
await adminPage.fill('[data-testid="email"]', newUser.email);
|
||||
await adminPage.fill('[data-testid="name"]', newUser.name);
|
||||
await adminPage.click('[data-testid="role-dropdown"]');
|
||||
await adminPage.click('[data-testid="role-user"]');
|
||||
await adminPage.click('[data-testid="create-user"]');
|
||||
|
||||
await expect(adminPage.getByText('User created')).toBeVisible();
|
||||
await expect(adminPage.getByText(newUser.email)).toBeVisible();
|
||||
|
||||
// Verify in database
|
||||
const created = await seedUser({ email: newUser.email });
|
||||
expect(created.role).toBe('user');
|
||||
});
|
||||
|
||||
// Test 2: Permission assignment (60 lines)
|
||||
test('admin can assign permissions', async ({ adminPage, seedUser }) => {
|
||||
const user = await seedUser({ email: faker.internet.email() });
|
||||
|
||||
await adminPage.goto(`/admin/users/${user.id}`);
|
||||
await adminPage.click('[data-testid="assign-permissions"]');
|
||||
await adminPage.check('[data-testid="permission-read"]');
|
||||
await adminPage.check('[data-testid="permission-write"]');
|
||||
await adminPage.click('[data-testid="save-permissions"]');
|
||||
|
||||
await expect(adminPage.getByText('Permissions updated')).toBeVisible();
|
||||
|
||||
// Verify permissions assigned
|
||||
const response = await adminPage.request.get(`/api/users/${user.id}`);
|
||||
const updated = await response.json();
|
||||
expect(updated.permissions).toContain('read');
|
||||
expect(updated.permissions).toContain('write');
|
||||
});
|
||||
|
||||
// Test 3: Notification preferences (70 lines)
|
||||
test('admin can update notification preferences', async ({ adminPage, seedUser }) => {
|
||||
const user = await seedUser({ email: faker.internet.email() });
|
||||
|
||||
await adminPage.goto(`/admin/users/${user.id}/notifications`);
|
||||
await adminPage.check('[data-testid="email-notifications"]');
|
||||
await adminPage.uncheck('[data-testid="sms-notifications"]');
|
||||
await adminPage.selectOption('[data-testid="frequency"]', 'daily');
|
||||
await adminPage.click('[data-testid="save-preferences"]');
|
||||
|
||||
await expect(adminPage.getByText('Preferences saved')).toBeVisible();
|
||||
|
||||
// Verify preferences
|
||||
const response = await adminPage.request.get(`/api/users/${user.id}/preferences`);
|
||||
const prefs = await response.json();
|
||||
expect(prefs.emailEnabled).toBe(true);
|
||||
expect(prefs.smsEnabled).toBe(false);
|
||||
expect(prefs.frequency).toBe('daily');
|
||||
});
|
||||
|
||||
// TOTAL: 3 tests × 60 lines avg = 180 lines
|
||||
// Each test is focused, debuggable, and under 300 lines
|
||||
```

**Key Points**:

- Split monolithic tests into focused scenarios (<300 lines each)
- Extract common setup into fixtures (auto-runs for each test)
- Each test validates one concern (user creation, permissions, preferences)
- Failures are easier to diagnose: "Permission assignment failed" vs "Complete journey failed"
- Tests can run in parallel (isolated concerns)

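The length and hard-wait limits can also be enforced mechanically before review. The following is a rough sketch of a CI check; the script path and the `glob` dependency are assumptions, not part of this fragment.

```typescript
// scripts/check-spec-quality.ts (illustrative sketch)
import { readFileSync } from 'node:fs';
import { globSync } from 'glob';

const MAX_SPEC_LINES = 300;
const violations: string[] = [];

for (const file of globSync('tests/**/*.spec.ts')) {
  const lines = readFileSync(file, 'utf8').split('\n');

  // Rule: each spec stays under 300 lines
  if (lines.length > MAX_SPEC_LINES) {
    violations.push(`${file}: ${lines.length} lines (limit ${MAX_SPEC_LINES})`);
  }

  // Rule: no hard waits in test code
  lines.forEach((line, index) => {
    if (line.includes('waitForTimeout(') || /cy\.wait\(\d+\)/.test(line)) {
      violations.push(`${file}:${index + 1}: hard wait detected`);
    }
  });
}

if (violations.length > 0) {
  console.error(violations.join('\n'));
  process.exit(1); // Fail CI when the definition of done is violated
}
```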

### Example 5: Execution Time Optimization

**Context**: When tests take longer than 1.5 minutes, they slow CI pipelines and feedback loops. Optimize by using API setup instead of UI navigation, parallelizing independent operations, and avoiding unnecessary waits.

**Implementation**:

```typescript
|
||||
// ❌ BAD: 4-minute test (slow setup, sequential operations)
|
||||
test('user completes order - SLOW (4 min)', async ({ page }) => {
|
||||
// Step 1: Manual signup via UI (90 seconds)
|
||||
await page.goto('/signup');
|
||||
await page.fill('[data-testid="email"]', 'buyer@example.com');
|
||||
await page.fill('[data-testid="password"]', 'password123');
|
||||
await page.fill('[data-testid="confirm-password"]', 'password123');
|
||||
await page.fill('[data-testid="name"]', 'Buyer User');
|
||||
await page.click('[data-testid="signup"]');
|
||||
await page.waitForURL('/verify-email'); // Wait for email verification
|
||||
// ... manual email verification flow
|
||||
|
||||
// Step 2: Manual product creation via UI (60 seconds)
|
||||
await page.goto('/admin/products');
|
||||
await page.fill('[data-testid="product-name"]', 'Widget');
|
||||
// ... 20 more fields
|
||||
await page.click('[data-testid="create-product"]');
|
||||
|
||||
// Step 3: Navigate to checkout (30 seconds)
|
||||
await page.goto('/products');
|
||||
await page.waitForTimeout(5000); // Unnecessary hard wait
|
||||
await page.click('[data-testid="product-widget"]');
|
||||
await page.waitForTimeout(3000); // Unnecessary
|
||||
await page.click('[data-testid="add-to-cart"]');
|
||||
await page.waitForTimeout(2000); // Unnecessary
|
||||
|
||||
// Step 4: Complete checkout (40 seconds)
|
||||
await page.goto('/checkout');
|
||||
await page.waitForTimeout(5000); // Unnecessary
|
||||
await page.fill('[data-testid="credit-card"]', '4111111111111111');
|
||||
// ... more form filling
|
||||
await page.click('[data-testid="submit-order"]');
|
||||
await page.waitForTimeout(10000); // Unnecessary
|
||||
|
||||
await expect(page.getByText('Order Confirmed')).toBeVisible();
|
||||
|
||||
// TOTAL: ~240 seconds (4 minutes)
|
||||
});
|
||||
|
||||
// ✅ GOOD: 45-second test (API setup, parallel ops, deterministic waits)
|
||||
test('user completes order', async ({ page, apiRequest }) => {
|
||||
// Step 1: API setup (parallel, 5 seconds total)
|
||||
const [user, product] = await Promise.all([
|
||||
// Create user via API (fast)
|
||||
apiRequest
|
||||
.post('/api/users', {
|
||||
data: createUser({
|
||||
email: 'buyer@example.com',
|
||||
emailVerified: true, // Skip verification
|
||||
}),
|
||||
})
|
||||
.then((r) => r.json()),
|
||||
|
||||
// Create product via API (fast)
|
||||
apiRequest
|
||||
.post('/api/products', {
|
||||
data: createProduct({
|
||||
name: 'Widget',
|
||||
price: 29.99,
|
||||
stock: 10,
|
||||
}),
|
||||
})
|
||||
.then((r) => r.json()),
|
||||
]);
|
||||
|
||||
// Step 2: Auth setup via storage state (instant, 0 seconds)
|
||||
await page.context().addCookies([
|
||||
{
|
||||
name: 'auth_token',
|
||||
value: user.token,
|
||||
domain: 'localhost',
|
||||
path: '/',
|
||||
},
|
||||
]);
|
||||
|
||||
// Step 3: Network-first interception BEFORE navigation (10 seconds)
|
||||
const cartPromise = page.waitForResponse('**/api/cart');
|
||||
const orderPromise = page.waitForResponse('**/api/orders');
|
||||
|
||||
await page.goto(`/products/${product.id}`);
|
||||
await page.click('[data-testid="add-to-cart"]');
|
||||
await cartPromise; // Deterministic wait (no hard wait)
|
||||
|
||||
// Step 4: Checkout with network waits (30 seconds)
|
||||
await page.goto('/checkout');
|
||||
await page.fill('[data-testid="credit-card"]', '4111111111111111');
|
||||
await page.fill('[data-testid="cvv"]', '123');
|
||||
await page.fill('[data-testid="expiry"]', '12/25');
|
||||
await page.click('[data-testid="submit-order"]');
|
||||
await orderPromise; // Deterministic wait (no hard wait)
|
||||
|
||||
await expect(page.getByText('Order Confirmed')).toBeVisible();
|
||||
await expect(page.getByText(`Order #${product.id}`)).toBeVisible();
|
||||
|
||||
// TOTAL: ~45 seconds (6x faster)
|
||||
});
|
||||
|
||||
// Cypress equivalent
|
||||
describe('Order Flow', () => {
|
||||
it('should complete purchase quickly', () => {
|
||||
// Step 1: API setup (parallel, fast)
|
||||
const user = createUser({ emailVerified: true });
|
||||
const product = createProduct({ name: 'Widget', price: 29.99 });
|
||||
|
||||
cy.task('db:seed', { users: [user], products: [product] });
|
||||
|
||||
// Step 2: Auth setup via session (instant)
|
||||
cy.setCookie('auth_token', user.token);
|
||||
|
||||
// Step 3: Network-first interception
|
||||
cy.intercept('POST', '**/api/cart').as('addToCart');
|
||||
cy.intercept('POST', '**/api/orders').as('createOrder');
|
||||
|
||||
cy.visit(`/products/${product.id}`);
|
||||
cy.get('[data-cy="add-to-cart"]').click();
|
||||
cy.wait('@addToCart'); // Deterministic wait
|
||||
|
||||
// Step 4: Checkout
|
||||
cy.visit('/checkout');
|
||||
cy.get('[data-cy="credit-card"]').type('4111111111111111');
|
||||
cy.get('[data-cy="cvv"]').type('123');
|
||||
cy.get('[data-cy="expiry"]').type('12/25');
|
||||
cy.get('[data-cy="submit-order"]').click();
|
||||
cy.wait('@createOrder'); // Deterministic wait
|
||||
|
||||
cy.contains('Order Confirmed').should('be.visible');
|
||||
cy.contains(`Order #${product.id}`).should('be.visible');
|
||||
});
|
||||
});
|
||||
|
||||
// Additional optimization: Shared auth state (0 seconds per test)
|
||||
// playwright/support/global-setup.ts
|
||||
export default async function globalSetup() {
|
||||
const browser = await chromium.launch();
|
||||
const page = await browser.newPage();
|
||||
|
||||
// Create admin user once for all tests
|
||||
const admin = createUser({ role: 'admin', emailVerified: true });
|
||||
await page.request.post('/api/users', { data: admin });
|
||||
|
||||
// Login once, save session
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email"]', admin.email);
|
||||
await page.fill('[data-testid="password"]', 'password123');
|
||||
await page.click('[data-testid="login"]');
|
||||
|
||||
// Save auth state for reuse
|
||||
await page.context().storageState({ path: 'playwright/.auth/admin.json' });
|
||||
|
||||
await browser.close();
|
||||
}
|
||||
|
||||
// Use shared auth in tests (instant)
|
||||
test.use({ storageState: 'playwright/.auth/admin.json' });
|
||||
|
||||
test('admin action', async ({ page }) => {
|
||||
// Already logged in - no auth overhead (0 seconds)
|
||||
await page.goto('/admin');
|
||||
// ... test logic
|
||||
});
|
||||
```
|
||||
|
||||
**Key Points**:
|
||||
|
||||
- Use API for data setup (10-50x faster than UI)
|
||||
- Run independent operations in parallel (`Promise.all`)
|
||||
- Replace hard waits with deterministic waits (`waitForResponse`)
|
||||
- Reuse auth sessions via `storageState` (Playwright) or `setCookie` (Cypress)
|
||||
- Skip unnecessary flows (email verification, multi-step signups)
|
||||
|
||||
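
The Cypress example above sets the auth cookie inline; for suite-wide reuse (the Cypress analogue of `storageState`), `cy.session` caches the login across tests. A minimal sketch — the `/api/login` endpoint, cookie name, and `loginAsBuyer` command are illustrative assumptions, not part of the suite above:

```typescript
// cypress/support/commands.ts (hypothetical helper)
Cypress.Commands.add('loginAsBuyer', () => {
  cy.session('buyer@example.com', () => {
    // Runs once; later calls restore the cached cookies/localStorage
    cy.request('POST', '/api/login', {
      email: 'buyer@example.com',
      password: 'password123',
    }).then(({ body }) => cy.setCookie('auth_token', body.token));
  });
});

// In a spec file:
// beforeEach(() => cy.loginAsBuyer()); // auth restored from cache, ~0 seconds per test
```
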
## Integration Points

- **Used in workflows**: `*atdd` (test generation quality), `*automate` (test expansion quality), `*test-review` (quality validation)
- **Related fragments**:
  - `network-first.md` - Deterministic waiting strategies
  - `data-factories.md` - Isolated, parallel-safe data patterns
  - `fixture-architecture.md` - Setup extraction and cleanup
  - `test-levels-framework.md` - Choosing appropriate test granularity for speed

## Core Quality Checklist

Every test must pass these criteria:

- [ ] **No Hard Waits** - Use `waitForResponse`, `waitForLoadState`, or element state (not `waitForTimeout`)
- [ ] **No Conditionals** - Tests execute the same path every time (no if/else, try/catch for flow control)
- [ ] **< 300 Lines** - Keep tests focused; split large tests or extract setup to fixtures
- [ ] **< 1.5 Minutes** - Optimize with API setup, parallel operations, and shared auth
- [ ] **Self-Cleaning** - Use fixtures with auto-cleanup or explicit `afterEach()` teardown
- [ ] **Explicit Assertions** - Keep `expect()` calls in test bodies, not hidden in helpers
- [ ] **Unique Data** - Use `faker` for dynamic data; never hardcode IDs or emails
- [ ] **Parallel-Safe** - Tests don't share state; run successfully with `--workers=4`
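
The **Unique Data** and **Self-Cleaning** items pair naturally: a faker-backed factory plus a fixture that deletes what it created. A minimal sketch, assuming a `/api/users` endpoint and the factory/fixture naming used in the examples above:

```typescript
import { faker } from '@faker-js/faker';
import { test as base } from '@playwright/test';

// Unique data per run - safe under --workers=4
const createUser = (overrides: Record<string, unknown> = {}) => ({
  email: faker.internet.email(),
  name: faker.person.fullName(),
  password: `Pw-${faker.string.alphanumeric(12)}`,
  ...overrides,
});

export const test = base.extend<{ seededUser: { id: string; email: string } }>({
  seededUser: async ({ request }, use) => {
    const response = await request.post('/api/users', { data: createUser() });
    const user = await response.json();

    await use(user); // test body runs here

    await request.delete(`/api/users/${user.id}`); // auto-cleanup, even when the test fails
  },
});
```
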
_Source: Murat quality checklist, Definition of Done requirements (lines 370-381, 406-422)._

372
src/modules/bmm/testarch/knowledge/timing-debugging.md
Normal file
@@ -0,0 +1,372 @@

# Timing Debugging and Race Condition Fixes

## Principle

Race conditions arise when tests make assumptions about asynchronous timing (network, animations, state updates). **Deterministic waiting** eliminates flakiness by explicitly waiting for observable events (network responses, element state changes) instead of arbitrary timeouts.

## Rationale

**The Problem**: Tests pass locally but fail in CI (different timing), or pass/fail randomly (race conditions). Hard waits (`waitForTimeout`, `sleep`) mask timing issues without solving them.

**The Solution**: Replace all hard waits with event-based waits (`waitForResponse`, `waitFor({ state })`). Implement network-first pattern (intercept before navigate). Use explicit state checks (loading spinner detached, data loaded). This makes tests deterministic regardless of network speed or system load.

**Why This Matters**:

- Eliminates flaky tests (0 tolerance for timing-based failures)
- Works consistently across environments (local, CI, production-like)
- Faster test execution (no unnecessary waits)
- Clearer test intent (explicit about what we're waiting for)

## Pattern Examples

### Example 1: Race Condition Identification (Network-First Pattern)

**Context**: Prevent race conditions by intercepting network requests before navigation

**Implementation**:

```typescript
// tests/timing/race-condition-prevention.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Race Condition Prevention Patterns', () => {
  test('❌ Anti-Pattern: Navigate then intercept (race condition)', async ({ page, context }) => {
    // BAD: Navigation starts before interception ready
    await page.goto('/products'); // ⚠️ Race! API might load before route is set

    await context.route('**/api/products', (route) => {
      route.fulfill({ status: 200, body: JSON.stringify({ products: [] }) });
    });

    // Test may see real API response or mock (non-deterministic)
  });

  test('✅ Pattern: Intercept BEFORE navigate (deterministic)', async ({ page, context }) => {
    // GOOD: Interception ready before navigation
    await context.route('**/api/products', (route) => {
      route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({
          products: [
            { id: 1, name: 'Product A', price: 29.99 },
            { id: 2, name: 'Product B', price: 49.99 },
          ],
        }),
      });
    });

    const responsePromise = page.waitForResponse('**/api/products');

    await page.goto('/products'); // Navigation happens AFTER route is ready
    await responsePromise; // Explicit wait for network

    // Test sees mock response reliably (deterministic)
    await expect(page.getByText('Product A')).toBeVisible();
  });

  test('✅ Pattern: Wait for element state change (loading → loaded)', async ({ page }) => {
    await page.goto('/dashboard');

    // Wait for loading indicator to appear (confirms load started)
    await page.getByTestId('loading-spinner').waitFor({ state: 'visible' });

    // Wait for loading indicator to disappear (confirms load complete)
    await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });

    // Content now reliably visible
    await expect(page.getByTestId('dashboard-data')).toBeVisible();
  });

  test('✅ Pattern: Explicit visibility check (not just presence)', async ({ page }) => {
    await page.goto('/modal-demo');

    await page.getByRole('button', { name: 'Open Modal' }).click();

    // ❌ Bad: Element exists but may not be visible yet
    // await expect(page.getByTestId('modal')).toBeAttached()

    // ✅ Good: Wait for visibility (accounts for animations)
    await expect(page.getByTestId('modal')).toBeVisible();
    await expect(page.getByRole('heading', { name: 'Modal Title' })).toBeVisible();
  });

  test('❌ Anti-Pattern: waitForLoadState("networkidle") in SPAs', async ({ page }) => {
    // ⚠️ Deprecated for SPAs (WebSocket connections never idle)
    // await page.goto('/dashboard')
    // await page.waitForLoadState('networkidle') // May timeout in SPAs

    // ✅ Better: Wait for specific API response
    const responsePromise = page.waitForResponse('**/api/dashboard');
    await page.goto('/dashboard');
    await responsePromise;

    await expect(page.getByText('Dashboard loaded')).toBeVisible();
  });
});
```

**Key Points**:

- Network-first: ALWAYS intercept before navigate (prevents race conditions)
- State changes: Wait for loading spinner detached (explicit load completion)
- Visibility vs presence: `toBeVisible()` accounts for animations, `toBeAttached()` doesn't
- Avoid networkidle: Unreliable in SPAs (WebSocket, polling connections)
- Explicit waits: Document exactly what we're waiting for

---

### Example 2: Deterministic Waiting Patterns (Event-Based, Not Time-Based)

**Context**: Replace all hard waits with observable event waits

**Implementation**:

```typescript
// tests/timing/deterministic-waits.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Deterministic Waiting Patterns', () => {
  test('waitForResponse() with URL pattern', async ({ page }) => {
    const responsePromise = page.waitForResponse('**/api/products');

    await page.goto('/products');
    await responsePromise; // Deterministic (waits for exact API call)

    await expect(page.getByText('Products loaded')).toBeVisible();
  });

  test('waitForResponse() with predicate function', async ({ page }) => {
    const responsePromise = page.waitForResponse((resp) => resp.url().includes('/api/search') && resp.status() === 200);

    await page.goto('/search');
    await page.getByPlaceholder('Search').fill('laptop');
    await page.getByRole('button', { name: 'Search' }).click();

    await responsePromise; // Wait for successful search response

    await expect(page.getByTestId('search-results')).toBeVisible();
  });

  test('waitForFunction() for custom conditions', async ({ page }) => {
    await page.goto('/dashboard');

    // Wait for custom JavaScript condition
    await page.waitForFunction(() => {
      const element = document.querySelector('[data-testid="user-count"]');
      return element && parseInt(element.textContent || '0') > 0;
    });

    // User count now loaded
    await expect(page.getByTestId('user-count')).not.toHaveText('0');
  });

  test('waitFor() element state (attached, visible, hidden, detached)', async ({ page }) => {
    await page.goto('/products');

    // Wait for element to be attached to DOM
    await page.getByTestId('product-list').waitFor({ state: 'attached' });

    // Wait for element to be visible (animations complete)
    await page.getByTestId('product-list').waitFor({ state: 'visible' });

    // Perform action
    await page.getByText('Product A').click();

    // Wait for modal to be hidden (close animation complete)
    await page.getByTestId('modal').waitFor({ state: 'hidden' });
  });

  test('Cypress: cy.wait() with aliased intercepts', async () => {
    // Cypress example (not Playwright)
    /*
    cy.intercept('GET', '/api/products').as('getProducts')
    cy.visit('/products')
    cy.wait('@getProducts') // Deterministic wait for specific request

    cy.get('[data-testid="product-list"]').should('be.visible')
    */
  });
});
```

**Key Points**:

- `waitForResponse()`: Wait for specific API calls (URL pattern or predicate)
- `waitForFunction()`: Wait for custom JavaScript conditions
- `waitFor({ state })`: Wait for element state changes (attached, visible, hidden, detached)
- Cypress `cy.wait('@alias')`: Deterministic wait for aliased intercepts
- All waits are event-based (not time-based)

---

### Example 3: Timing Anti-Patterns (What NEVER to Do)

**Context**: Common timing mistakes that cause flakiness

**Problem Examples**:

```typescript
// tests/timing/anti-patterns.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Timing Anti-Patterns to Avoid', () => {
  test('❌ NEVER: page.waitForTimeout() (arbitrary delay)', async ({ page }) => {
    await page.goto('/dashboard');

    // ❌ Bad: Arbitrary 3-second wait (flaky)
    // await page.waitForTimeout(3000)
    // Problem: Might be too short (CI slower) or too long (wastes time)

    // ✅ Good: Wait for observable event
    await page.waitForResponse('**/api/dashboard');
    await expect(page.getByText('Dashboard loaded')).toBeVisible();
  });

  test('❌ NEVER: cy.wait(number) without alias (arbitrary delay)', async () => {
    // Cypress example
    /*
    // ❌ Bad: Arbitrary delay
    cy.visit('/products')
    cy.wait(2000) // Flaky!

    // ✅ Good: Wait for specific request
    cy.intercept('GET', '/api/products').as('getProducts')
    cy.visit('/products')
    cy.wait('@getProducts') // Deterministic
    */
  });

  test('❌ NEVER: Multiple hard waits in sequence (compounding delays)', async ({ page }) => {
    await page.goto('/checkout');

    // ❌ Bad: Stacked hard waits (6+ seconds wasted)
    // await page.waitForTimeout(2000) // Wait for form
    // await page.getByTestId('email').fill('test@example.com')
    // await page.waitForTimeout(1000) // Wait for validation
    // await page.getByTestId('submit').click()
    // await page.waitForTimeout(3000) // Wait for redirect

    // ✅ Good: Event-based waits (no wasted time)
    await page.getByTestId('checkout-form').waitFor({ state: 'visible' });
    await page.getByTestId('email').fill('test@example.com');
    await page.waitForResponse('**/api/validate-email');
    await page.getByTestId('submit').click();
    await page.waitForURL('**/confirmation');
  });

  test('❌ NEVER: waitForLoadState("networkidle") in SPAs', async ({ page }) => {
    // ❌ Bad: Unreliable in SPAs (WebSocket connections never idle)
    // await page.goto('/dashboard')
    // await page.waitForLoadState('networkidle') // Timeout in SPAs!

    // ✅ Good: Wait for specific API responses
    await page.goto('/dashboard');
    await page.waitForResponse('**/api/dashboard');
    await page.waitForResponse('**/api/user');
    await expect(page.getByTestId('dashboard-content')).toBeVisible();
  });

  test('❌ NEVER: Sleep/setTimeout in tests', async ({ page }) => {
    await page.goto('/products');

    // ❌ Bad: Node.js sleep (blocks test thread)
    // await new Promise(resolve => setTimeout(resolve, 2000))

    // ✅ Good: Playwright auto-waits for element
    await expect(page.getByText('Products loaded')).toBeVisible();
  });
});
```

**Why These Fail**:

- **Hard waits**: Arbitrary timeouts (too short → flaky, too long → slow)
- **Stacked waits**: Compound delays (wasteful, unreliable)
- **networkidle**: Broken in SPAs (WebSocket/polling never idle)
- **Sleep**: Blocks execution (wastes time, doesn't solve race conditions)

**Better Approach**: Use event-based waits from examples above

---

## Async Debugging Techniques

### Technique 1: Promise Chain Analysis

```typescript
test('debug async waterfall with console logs', async ({ page }) => {
  console.log('1. Starting navigation...');
  await page.goto('/products');

  console.log('2. Waiting for API response...');
  const response = await page.waitForResponse('**/api/products');
  console.log('3. API responded:', response.status());

  console.log('4. Waiting for UI update...');
  await expect(page.getByText('Products loaded')).toBeVisible();
  console.log('5. Test complete');

  // Console output shows exactly where timing issue occurs
});
```

### Technique 2: Network Waterfall Inspection (DevTools)

```typescript
test('inspect network timing with trace viewer', async ({ page }) => {
  await page.goto('/dashboard');

  // Generate trace for analysis
  // npx playwright test --trace on
  // npx playwright show-trace trace.zip

  // In trace viewer:
  // 1. Check Network tab for API call timing
  // 2. Identify slow requests (>1s response time)
  // 3. Find race conditions (overlapping requests)
  // 4. Verify request order (dependencies)
});
```

### Technique 3: Trace Viewer for Timing Visualization

```typescript
test('use trace viewer to debug timing', async ({ page }) => {
  // Run with trace: npx playwright test --trace on

  await page.goto('/checkout');
  await page.getByTestId('submit').click();

  // In trace viewer, examine:
  // - Timeline: See exact timing of each action
  // - Snapshots: Hover to see DOM state at each moment
  // - Network: Identify slow/failed requests
  // - Console: Check for async errors

  await expect(page.getByText('Success')).toBeVisible();
});
```

---

## Race Condition Checklist

Before deploying tests:

- [ ] **Network-first pattern**: All routes intercepted BEFORE navigation (no race conditions)
- [ ] **Explicit waits**: Every navigation followed by `waitForResponse()` or state check
- [ ] **No hard waits**: Zero instances of `waitForTimeout()`, `cy.wait(number)`, `sleep()`
- [ ] **Element state waits**: Loading spinners use `waitFor({ state: 'detached' })`
- [ ] **Visibility checks**: Use `toBeVisible()` (accounts for animations), not just `toBeAttached()`
- [ ] **Response validation**: Wait for successful responses (`resp.ok()` or `status === 200`)
- [ ] **Trace viewer analysis**: Generate traces to identify timing issues (network waterfall, console errors)
- [ ] **CI/local parity**: Tests pass reliably in both environments (no timing assumptions)
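
For the **Response validation** item, assert on the response object you already waited for rather than only on the UI. A minimal sketch to drop inside a test — the `/api/orders` route and selectors are illustrative:

```typescript
const orderPromise = page.waitForResponse(
  (resp) => resp.url().includes('/api/orders') && resp.request().method() === 'POST',
);
await page.getByTestId('submit-order').click();

const response = await orderPromise;
expect(response.ok()).toBeTruthy(); // fail fast on 4xx/5xx instead of timing out on a missing element
const order = await response.json();
await expect(page.getByText(`Order #${order.id}`)).toBeVisible();
```
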
## Integration Points

- **Used in workflows**: `*automate` (healing timing failures), `*test-review` (detect hard wait anti-patterns), `*framework` (configure timeout standards)
- **Related fragments**: `test-healing-patterns.md` (race condition diagnosis), `network-first.md` (interception patterns), `playwright-config.md` (timeout configuration), `visual-debugging.md` (trace viewer analysis)
- **Tools**: Playwright Inspector (`--debug`), Trace Viewer (`--trace on`), DevTools Network tab

_Source: Playwright timing best practices, network-first pattern from test-resources-for-ai, production race condition debugging_

@@ -1,9 +1,524 @@

# Visual Debugging and Developer Ergonomics

- Keep Playwright trace viewer, Cypress runner, and Storybook accessible in CI artifacts to speed up reproduction.
- Record short screen captures only-on-failure; pair them with HAR or console logs to avoid guesswork.
- Document common trace navigation steps (network tab, action timeline) so new contributors diagnose issues quickly.
- Encourage live-debug sessions with component harnesses to validate behaviour before writing full E2E specs.
- Integrate accessibility tooling (axe, Playwright audits) into the same debug workflow to catch regressions early.

## Principle

_Source: Murat DX blog posts, Playwright book appendix on debugging._
Fast feedback loops and transparent debugging artifacts are critical for maintaining test reliability and developer confidence. Visual debugging tools (trace viewers, screenshots, videos, HAR files) turn cryptic test failures into actionable insights, reducing triage time from hours to minutes.

## Rationale

**The Problem**: CI failures often provide minimal context—a timeout, a selector mismatch, or a network error—forcing developers to reproduce issues locally (if they can). This wastes time and discourages test maintenance.

**The Solution**: Capture rich debugging artifacts **only on failure** to balance storage costs with diagnostic value. Modern tools like Playwright Trace Viewer, Cypress Debug UI, and HAR recordings provide interactive, time-travel debugging that reveals exactly what the test saw at each step.

**Why This Matters**:

- Reduces failure triage time by 80-90% (visual context vs logs alone)
- Enables debugging without local reproduction
- Improves test maintenance confidence (clear failure root cause)
- Catches timing/race conditions that are hard to reproduce locally

## Pattern Examples

### Example 1: Playwright Trace Viewer Configuration (Production Pattern)

**Context**: Capture traces on first retry only (balances storage and diagnostics)

**Implementation**:

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Visual debugging artifacts (space-efficient)
    trace: 'on-first-retry', // Only when test fails once
    screenshot: 'only-on-failure', // Not on success
    video: 'retain-on-failure', // Delete on pass

    // Context for debugging
    baseURL: process.env.BASE_URL || 'http://localhost:3000',

    // Timeout context
    actionTimeout: 15_000, // 15s for clicks/fills
    navigationTimeout: 30_000, // 30s for page loads
  },

  // CI-specific artifact retention
  reporter: [
    ['html', { outputFolder: 'playwright-report', open: 'never' }],
    ['junit', { outputFile: 'results.xml' }],
    ['list'], // Console output
  ],

  // Failure handling
  retries: process.env.CI ? 2 : 0, // Retry in CI to capture trace
  workers: process.env.CI ? 1 : undefined,
});
```

**Opening and Using Trace Viewer**:

```bash
# After test failure in CI, download trace artifact
# Then open locally:
npx playwright show-trace path/to/trace.zip

# Or serve trace viewer:
npx playwright show-report
```

**Key Features to Use in Trace Viewer**:

1. **Timeline**: See each action (click, navigate, assertion) with timing
2. **Snapshots**: Hover over timeline to see DOM state at that moment
3. **Network Tab**: Inspect all API calls, headers, payloads, timing
4. **Console Tab**: View console.log/error messages
5. **Source Tab**: See test code with execution markers
6. **Metadata**: Browser, OS, test duration, screenshots

**Why This Works**:

- `on-first-retry` avoids capturing traces for flaky passes (saves storage)
- Screenshots + video give visual context without trace overhead
- Interactive timeline makes timing issues obvious (race conditions, slow API)

---

### Example 2: HAR File Recording for Network Debugging

**Context**: Capture all network activity for reproducible API debugging

**Implementation**:

```typescript
// tests/e2e/checkout-with-har.spec.ts
import { test, expect } from '@playwright/test';
import path from 'path';

test.describe('Checkout Flow with HAR Recording', () => {
  test('should complete payment with full network capture', async ({ page, context }) => {
    // Start HAR recording BEFORE navigation
    await context.routeFromHAR(path.join(__dirname, '../fixtures/checkout.har'), {
      url: '**/api/**', // Only capture API calls
      update: true, // Update HAR if file exists
    });

    await page.goto('/checkout');

    // Interact with page
    await page.getByTestId('payment-method').selectOption('credit-card');
    await page.getByTestId('card-number').fill('4242424242424242');
    await page.getByTestId('submit-payment').click();

    // Wait for payment confirmation
    await expect(page.getByTestId('success-message')).toBeVisible();

    // HAR file saved to fixtures/checkout.har
    // Contains all network requests/responses for replay
  });
});
```

**Using HAR for Deterministic Mocking**:

```typescript
// tests/e2e/checkout-replay-har.spec.ts
import { test, expect } from '@playwright/test';
import path from 'path';

test('should replay checkout flow from HAR', async ({ page, context }) => {
  // Replay network from HAR (no real API calls)
  await context.routeFromHAR(path.join(__dirname, '../fixtures/checkout.har'), {
    url: '**/api/**',
    update: false, // Read-only mode
  });

  await page.goto('/checkout');

  // Same test, but network responses come from HAR file
  await page.getByTestId('payment-method').selectOption('credit-card');
  await page.getByTestId('card-number').fill('4242424242424242');
  await page.getByTestId('submit-payment').click();

  await expect(page.getByTestId('success-message')).toBeVisible();
});
```

**Key Points**:

- **`update: true`** records new HAR or updates existing (for flaky API debugging)
- **`update: false`** replays from HAR (deterministic, no real API)
- Filter by URL pattern (`**/api/**`) to avoid capturing static assets
- HAR files are human-readable JSON (easy to inspect/modify)

**When to Use HAR**:

- Debugging flaky tests caused by API timing/responses
- Creating deterministic mocks for integration tests
- Analyzing third-party API behavior (Stripe, Auth0)
- Reproducing production issues locally (record HAR in staging)

---

### Example 3: Custom Artifact Capture (Console Logs + Network on Failure)

**Context**: Capture additional debugging context automatically on test failure

**Implementation**:

```typescript
// playwright/support/fixtures/debug-fixture.ts
import { test as base } from '@playwright/test';
import fs from 'fs';
import path from 'path';

type DebugFixture = {
  captureDebugArtifacts: () => Promise<void>;
};

export const test = base.extend<DebugFixture>({
  captureDebugArtifacts: async ({ page }, use, testInfo) => {
    const consoleLogs: string[] = [];
    const networkRequests: Array<{ url: string; status: number; method: string }> = [];

    // Capture console messages
    page.on('console', (msg) => {
      consoleLogs.push(`[${msg.type()}] ${msg.text()}`);
    });

    // Capture network requests
    page.on('request', (request) => {
      networkRequests.push({
        url: request.url(),
        method: request.method(),
        status: 0, // Will be updated on response
      });
    });

    page.on('response', (response) => {
      const req = networkRequests.find((r) => r.url === response.url());
      if (req) req.status = response.status();
    });

    await use(async () => {
      // This function can be called manually in tests
      // But it also runs automatically on failure via afterEach
    });

    // After test completes, save artifacts if failed
    if (testInfo.status !== testInfo.expectedStatus) {
      const artifactDir = path.join(testInfo.outputDir, 'debug-artifacts');
      fs.mkdirSync(artifactDir, { recursive: true });

      // Save console logs
      fs.writeFileSync(path.join(artifactDir, 'console.log'), consoleLogs.join('\n'), 'utf-8');

      // Save network summary
      fs.writeFileSync(path.join(artifactDir, 'network.json'), JSON.stringify(networkRequests, null, 2), 'utf-8');

      console.log(`Debug artifacts saved to: ${artifactDir}`);
    }
  },
});
```

**Usage in Tests**:

```typescript
// tests/e2e/payment-with-debug.spec.ts
import { test, expect } from '../support/fixtures/debug-fixture';

test('payment flow captures debug artifacts on failure', async ({ page, captureDebugArtifacts }) => {
  await page.goto('/checkout');

  // Test will automatically capture console + network on failure
  await page.getByTestId('submit-payment').click();
  await expect(page.getByTestId('success-message')).toBeVisible({ timeout: 5000 });

  // If this fails, console.log and network.json saved automatically
});
```

**CI Integration (GitHub Actions)**:

```yaml
# .github/workflows/e2e.yml
name: E2E Tests with Artifacts
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version-file: '.nvmrc'

      - name: Install dependencies
        run: npm ci

      - name: Run Playwright tests
        run: npm run test:e2e
        continue-on-error: true # Capture artifacts even on failure

      - name: Upload test artifacts on failure
        if: always() # failure() would be masked by continue-on-error above
        uses: actions/upload-artifact@v4
        with:
          name: playwright-artifacts
          path: |
            test-results/
            playwright-report/
          retention-days: 30
```

**Key Points**:

- Fixtures automatically capture context without polluting test code
- Only saves artifacts on failure (storage-efficient)
- CI uploads artifacts for post-mortem analysis
- `continue-on-error: true` plus `if: always()` ensure the artifact upload step still runs when tests fail

---
### Example 4: Accessibility Debugging Integration (axe-core in Trace Viewer)

**Context**: Catch accessibility regressions during visual debugging

**Implementation**:

```typescript
// playwright/support/fixtures/a11y-fixture.ts
import { test as base } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

type A11yFixture = {
  checkA11y: () => Promise<void>;
};

export const test = base.extend<A11yFixture>({
  checkA11y: async ({ page }, use) => {
    await use(async () => {
      // Run axe accessibility scan
      const results = await new AxeBuilder({ page }).analyze();

      // Attach results to test report (visible in trace viewer)
      if (results.violations.length > 0) {
        console.log(`Found ${results.violations.length} accessibility violations:`);
        results.violations.forEach((violation) => {
          console.log(`- [${violation.impact}] ${violation.id}: ${violation.description}`);
          console.log(`  Help: ${violation.helpUrl}`);
        });

        throw new Error(`Accessibility violations found: ${results.violations.length}`);
      }
    });
  },
});
```

**Usage with Visual Debugging**:

```typescript
// tests/e2e/checkout-a11y.spec.ts
import { test, expect } from '../support/fixtures/a11y-fixture';

test('checkout page is accessible', async ({ page, checkA11y }) => {
  await page.goto('/checkout');

  // Verify page loaded
  await expect(page.getByRole('heading', { name: 'Checkout' })).toBeVisible();

  // Run accessibility check
  await checkA11y();

  // If violations found, test fails and trace captures:
  // - Screenshot showing the problematic element
  // - Console log with violation details
  // - Network tab showing any failed resource loads
});
```

**Trace Viewer Benefits**:

- **Screenshot shows visual context** of accessibility issue (contrast, missing labels)
- **Console tab shows axe-core violations** with impact level and helpUrl
- **DOM snapshot** allows inspecting ARIA attributes at failure point
- **Network tab** reveals if icon fonts or images failed (common a11y issue)

**Cypress Equivalent**:

```javascript
// cypress/support/commands.ts
import 'cypress-axe';

// cypress-axe already registers cy.checkA11y, so wrap it under a new name to add logging
Cypress.Commands.add('checkA11yAndLog', (context = null, options = {}) => {
  cy.injectAxe(); // Inject axe-core
  cy.checkA11y(context, options, (violations) => {
    if (violations.length) {
      cy.task('log', `Found ${violations.length} accessibility violations`);
      violations.forEach((violation) => {
        cy.task('log', `- [${violation.impact}] ${violation.id}: ${violation.description}`);
      });
    }
  });
});

// tests/e2e/checkout-a11y.cy.ts
describe('Checkout Accessibility', () => {
  it('should have no a11y violations', () => {
    cy.visit('/checkout');
    cy.injectAxe();
    cy.checkA11y();
    // On failure, Cypress UI shows:
    // - Screenshot of page
    // - Console log with violation details
    // - Network tab with API calls
  });
});
```

**Key Points**:

- Accessibility checks integrate seamlessly with visual debugging
- Violations are captured in trace viewer/Cypress UI automatically
- Provides actionable links (helpUrl) to fix issues
- Screenshots show visual context (contrast, layout)

---
### Example 5: Time-Travel Debugging Workflow (Playwright Inspector)

**Context**: Debug tests interactively with step-through execution

**Implementation**:

```typescript
// tests/e2e/checkout-debug.spec.ts
import { test, expect } from '@playwright/test';

test('debug checkout flow step-by-step', async ({ page }) => {
  // Set breakpoint by uncommenting this:
  // await page.pause()

  await page.goto('/checkout');

  // Use Playwright Inspector to:
  // 1. Step through each action
  // 2. Inspect DOM at each step
  // 3. View network calls per action
  // 4. Take screenshots manually

  await page.getByTestId('payment-method').selectOption('credit-card');

  // Pause here to inspect form state
  // await page.pause()

  await page.getByTestId('card-number').fill('4242424242424242');
  await page.getByTestId('submit-payment').click();

  await expect(page.getByTestId('success-message')).toBeVisible();
});
```

**Running with Inspector**:

```bash
# Open Playwright Inspector (GUI debugger)
npx playwright test --debug

# Or use headed mode with slowMo
npx playwright test --headed --slow-mo=1000

# Debug specific test
npx playwright test checkout-debug.spec.ts --debug

# Set environment variable for persistent debugging
PWDEBUG=1 npx playwright test
```

**Inspector Features**:

1. **Step-through execution**: Click "Next" to execute one action at a time
2. **DOM inspector**: Hover over elements to see selectors
3. **Network panel**: See API calls with timing
4. **Console panel**: View console.log output
5. **Pick locator**: Click element in browser to get selector
6. **Record mode**: Record interactions to generate test code

**Common Debugging Patterns**:

```typescript
// Pattern 1: Debug selector issues
test('debug selector', async ({ page }) => {
  await page.goto('/dashboard');
  await page.pause(); // Inspector opens

  // In Inspector console, test selectors:
  // page.getByTestId('user-menu') ✅
  // page.getByRole('button', { name: 'Profile' }) ✅
  // page.locator('.btn-primary') ❌ (fragile)
});

// Pattern 2: Debug timing issues
test('debug network timing', async ({ page }) => {
  await page.goto('/dashboard');

  // Set up network listener BEFORE interaction
  const responsePromise = page.waitForResponse('**/api/users');
  await page.getByTestId('load-users').click();

  await page.pause(); // Check network panel for timing

  const response = await responsePromise;
  expect(response.status()).toBe(200);
});

// Pattern 3: Debug state changes
test('debug state mutation', async ({ page }) => {
  await page.goto('/cart');

  // Check initial state
  await expect(page.getByTestId('cart-count')).toHaveText('0');

  await page.pause(); // Inspect DOM

  await page.getByTestId('add-to-cart').click();

  await page.pause(); // Inspect DOM again (compare state)

  await expect(page.getByTestId('cart-count')).toHaveText('1');
});
```

**Key Points**:

- `page.pause()` opens Inspector at that exact moment
- Inspector shows DOM state, network activity, console at pause point
- "Pick locator" feature helps find robust selectors
- Record mode generates test code from manual interactions

---

## Visual Debugging Checklist

Before deploying tests to CI, ensure:

- [ ] **Artifact configuration**: `trace: 'on-first-retry'`, `screenshot: 'only-on-failure'`, `video: 'retain-on-failure'`
- [ ] **CI artifact upload**: GitHub Actions/GitLab CI configured to upload `test-results/` and `playwright-report/`
- [ ] **HAR recording**: Set up for flaky API tests (record once, replay deterministically)
- [ ] **Custom debug fixtures**: Console logs + network summary captured on failure
- [ ] **Accessibility integration**: axe-core violations visible in trace viewer
- [ ] **Trace viewer docs**: README explains how to open traces locally (`npx playwright show-trace`)
- [ ] **Inspector workflow**: Document `--debug` flag for interactive debugging
- [ ] **Storage optimization**: Artifacts deleted after 30 days (CI retention policy)

## Integration Points

- **Used in workflows**: `*framework` (initial setup), `*ci` (artifact upload), `*test-review` (validate artifact config)
- **Related fragments**: `playwright-config.md` (artifact configuration), `ci-burn-in.md` (CI artifact upload), `test-quality.md` (debugging best practices)
- **Tools**: Playwright Trace Viewer, Cypress Debug UI, axe-core, HAR files

_Source: Playwright official docs, Murat testing philosophy (visual debugging manifesto), SEON production debugging patterns_

@@ -17,3 +17,6 @@ test-quality,Test Quality Definition of Done,"Execution limits, isolation rules,
nfr-criteria,NFR Review Criteria,"Security, performance, reliability, maintainability status definitions","nfr,assessment,quality",knowledge/nfr-criteria.md
test-levels,Test Levels Framework,"Guidelines for choosing unit, integration, or end-to-end coverage","testing,levels,selection",knowledge/test-levels-framework.md
test-priorities,Test Priorities Matrix,"P0–P3 criteria, coverage targets, execution ordering","testing,prioritization,risk",knowledge/test-priorities-matrix.md
test-healing-patterns,Test Healing Patterns,"Common failure patterns and automated fixes","healing,debugging,patterns",knowledge/test-healing-patterns.md
selector-resilience,Selector Resilience,"Robust selector strategies and debugging techniques","selectors,locators,debugging",knowledge/selector-resilience.md
timing-debugging,Timing Debugging,"Race condition identification and deterministic wait fixes","timing,async,debugging",knowledge/timing-debugging.md

@@ -9,11 +9,12 @@ This directory houses the per-command workflows used by the Test Architect agent
- `automate` – expands regression coverage after implementation.
- `ci` – bootstraps CI/CD pipelines aligned with TEA practices.
- `test-design` – combines risk assessment and coverage planning.
- `trace` – maps requirements to implemented automated tests.
- `trace` – maps requirements to tests (Phase 1) and makes quality gate decisions (Phase 2).
- `nfr-assess` – evaluates non-functional requirements.
- `gate` – records the release decision in the gate file.
- `test-review` – reviews test quality using knowledge base patterns and generates quality score.

**Note**: The `gate` workflow has been merged into `trace` as Phase 2. The `*trace` command now performs both requirements-to-tests traceability mapping AND quality gate decision (PASS/CONCERNS/FAIL/WAIVED) in a single atomic operation.

Each subdirectory contains:

- `README.md` – comprehensive workflow documentation with usage, inputs, outputs, and integration notes.

@@ -141,6 +141,145 @@ The TEA agent runs this workflow when:

**Selection Strategy**: Avoid duplicate coverage. Use E2E for critical happy path, API for business logic variations, component for UI edge cases, unit for pure logic.

### Recording Mode (NEW - Phase 2.5)

**atdd** can record complex UI interactions instead of AI generation.

**Activation**: Automatic for complex UI when `config.tea_use_mcp_enhancements` is true and MCP is available

- Fallback: AI generation (silent, automatic)

**When to Use Recording Mode:**

- ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
- ✅ Visual workflows (modals, dialogs, animations)
- ✅ Unclear requirements (exploratory, discovering expected behavior)
- ✅ Multi-page flows (checkout, registration, onboarding)
- ❌ NOT for simple CRUD (AI generation faster)
- ❌ NOT for API-only tests (no UI to record)

**When to Use AI Generation (Default):**

- ✅ Clear acceptance criteria available
- ✅ Standard patterns (login, CRUD, navigation)
- ✅ Need many tests quickly
- ✅ API/backend tests (no UI interaction)

**How Test Generation Works (Default - AI-Based):**

TEA generates tests using AI by:

1. **Analyzing acceptance criteria** from story markdown
2. **Inferring selectors** from requirement descriptions (e.g., "login button" → `[data-testid="login-button"]`)
3. **Synthesizing test code** based on knowledge base patterns
4. **Estimating interactions** using common UI patterns (click, type, verify)
5. **Applying best practices** from knowledge fragments (Given-When-Then, network-first, fixtures)

**This works well for:**

- ✅ Clear requirements with known UI patterns
- ✅ Standard workflows (login, CRUD, navigation)
- ✅ When selectors follow conventions (data-testid attributes)

**What MCP Adds (Interactive Verification & Enhancement):**

When Playwright MCP is available, TEA **additionally**:

1. **Verifies generated tests** by:
   - **Launching real browser** with `generator_setup_page`
   - **Executing generated test steps** with `browser_*` tools (`navigate`, `click`, `type`)
   - **Seeing actual UI** with `browser_snapshot` (visual verification)
   - **Discovering real selectors** with `browser_generate_locator` (auto-generate from live DOM)

2. **Enhances AI-generated tests** by:
   - **Validating selectors exist** in actual DOM (not just guesses)
   - **Verifying behavior** with `browser_verify_text`, `browser_verify_visible`, `browser_verify_url`
   - **Capturing actual interaction log** with `generator_read_log`
   - **Refining test code** with real observed behavior

3. **Catches issues early** by:
   - **Finding missing selectors** before DEV implements (requirements clarification)
   - **Discovering edge cases** not in requirements (loading states, error messages)
   - **Validating assumptions** about UI structure and behavior

**Key Benefits of MCP Enhancement:**

- ✅ **AI generates tests** (fast, based on requirements) **+** **MCP verifies tests** (accurate, based on reality)
- ✅ **Accurate selectors**: Validated against actual DOM, not just inferred
- ✅ **Visual validation**: TEA sees what user sees (modals, animations, state changes)
- ✅ **Complex flows**: Records multi-step interactions precisely
- ✅ **Edge case discovery**: Observes actual app behavior beyond requirements
- ✅ **Selector resilience**: MCP generates robust locators from live page (role-based, text-based, fallback chains)

**Example Enhancement Flow:**

```
1. AI generates test based on acceptance criteria
   → await page.click('[data-testid="submit-button"]')

2. MCP verifies selector exists (browser_generate_locator)
   → Found: button[type="submit"].btn-primary
   → No data-testid attribute exists!

3. TEA refines test with actual selector
   → await page.locator('button[type="submit"]').click()
   → Documents requirement: "Add data-testid='submit-button' to button"
```

**Recording Workflow (MCP-Based):**

```
1. Set generation_mode: "recording"
2. Use generator_setup_page to init recording session
3. For each acceptance criterion:
   a. Execute scenario with browser_* tools:
      - browser_navigate, browser_click, browser_type
      - browser_select, browser_check
   b. Add verifications with browser_verify_* tools:
      - browser_verify_text, browser_verify_visible
      - browser_verify_url
   c. Capture log with generator_read_log
   d. Generate test with generator_write_test
4. Enhance generated tests with knowledge base patterns:
   - Add Given-When-Then comments
   - Replace selectors with data-testid
   - Add network-first interception
   - Add fixtures/factories
5. Verify tests fail (RED phase)
```

**Example: Recording a Checkout Flow**

```markdown
Recording session for: "User completes checkout with credit card"

Actions recorded:

1. browser_navigate('/cart')
2. browser_click('[data-testid="checkout-button"]')
3. browser_type('[data-testid="card-number"]', '4242424242424242')
4. browser_type('[data-testid="expiry"]', '12/25')
5. browser_type('[data-testid="cvv"]', '123')
6. browser_click('[data-testid="place-order"]')
7. browser_verify_text('Order confirmed')
8. browser_verify_url('/confirmation')

Generated test (enhanced):

- Given-When-Then structure added
- data-testid selectors used
- Network-first payment API mock added
- Card factory created for test data
- Test verified to FAIL (checkout not implemented)
```

**Graceful Degradation:**

- Recording mode is OPTIONAL (default: AI generation)
- Requires Playwright MCP (falls back to AI if unavailable)
- Generated tests enhanced with knowledge base patterns
- Same quality output regardless of generation method

### Given-When-Then Structure

All tests follow BDD format for clarity:
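
A minimal illustration of the shape TEA aims for — the selectors, route, and login flow below are placeholders, not generated output:

```typescript
import { test, expect } from '@playwright/test';

test('user sees dashboard after login', async ({ page }) => {
  // GIVEN an unauthenticated user on the login page
  await page.goto('/login');
  await page.getByTestId('email').fill('user@example.com');
  await page.getByTestId('password').fill('password123');

  // WHEN they submit valid credentials
  const dashboardResponse = page.waitForResponse('**/api/dashboard');
  await page.getByTestId('login-button').click();
  await dashboardResponse;

  // THEN the dashboard is displayed
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});
```
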
@@ -51,16 +51,126 @@ Generates failing acceptance tests BEFORE implementation following TDD's red-gre

4. **Load Knowledge Base Fragments**

   **Critical:** Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load:
   - `fixture-architecture.md` - Test fixture patterns with auto-cleanup
   - `data-factories.md` - Factory patterns using faker
   - `component-tdd.md` - Component test strategies
   - `network-first.md` - Route interception patterns
   - `test-quality.md` - Test design principles
   - `fixture-architecture.md` - Test fixture patterns with auto-cleanup (pure function → fixture → mergeTests composition, 406 lines, 5 examples)
   - `data-factories.md` - Factory patterns using faker (override patterns, nested factories, API seeding, 498 lines, 5 examples)
   - `component-tdd.md` - Component test strategies (red-green-refactor, provider isolation, accessibility, visual regression, 480 lines, 4 examples)
   - `network-first.md` - Route interception patterns (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)
   - `test-quality.md` - Test design principles (deterministic tests, isolated with cleanup, explicit assertions, length limits, execution time optimization, 658 lines, 5 examples)
   - `test-healing-patterns.md` - Common failure patterns and healing strategies (stale selectors, race conditions, dynamic data, network errors, hard waits, 648 lines, 5 examples)
   - `selector-resilience.md` - Selector best practices (data-testid > ARIA > text > CSS hierarchy, dynamic patterns, anti-patterns, 541 lines, 4 examples)
   - `timing-debugging.md` - Race condition prevention and async debugging (network-first, deterministic waiting, anti-patterns, 370 lines, 3 examples)

   **Halt Condition:** If story has no acceptance criteria or framework is missing, HALT with message: "ATDD requires clear acceptance criteria and test framework setup"

---

## Step 1.5: Generation Mode Selection (NEW - Phase 2.5)

### Actions

1. **Detect Generation Mode**

   Determine mode based on scenario complexity:

   **AI Generation Mode (DEFAULT)**:
   - Clear acceptance criteria with standard patterns
   - Uses: AI-generated tests from requirements
   - Appropriate for: CRUD, auth, navigation, API tests
   - Fastest approach

   **Recording Mode (OPTIONAL - Complex UI)**:
   - Complex UI interactions (drag-drop, wizards, multi-page flows)
   - Uses: Interactive test recording with Playwright MCP
   - Only if `config.tea_use_mcp_enhancements` is true AND MCP is available
   - Appropriate for: Visual workflows, unclear requirements

2. **AI Generation Mode (DEFAULT - Continue to Step 2)**

   For standard scenarios:
   - Continue with existing workflow (Step 2: Select Test Levels and Strategy)
   - AI generates tests based on acceptance criteria from Step 1
   - Use knowledge base patterns for test structure

3. **Recording Mode (OPTIONAL - Complex UI Only)**

   For complex UI scenarios, when `config.tea_use_mcp_enhancements` is true:

   **A. Check MCP Availability**

   If Playwright MCP tools are available in your IDE:
   - Use MCP recording mode (Step 3.B)

   If MCP unavailable:
   - Fall back to AI generation mode (silent, automatic)
   - Continue to Step 2

   **B. Interactive Test Recording (MCP-Based)**

   Use Playwright MCP test-generator tools:

   **Setup:**

   ```
   1. Use generator_setup_page to initialize recording session
   2. Navigate to application starting URL (from story context)
   3. Ready to record user interactions
   ```

   **Recording Process (Per Acceptance Criterion):**

   ```
   4. Read acceptance criterion from story
   5. Manually execute test scenario using browser_* tools:
      - browser_navigate: Navigate to pages
      - browser_click: Click buttons, links, elements
      - browser_type: Fill form fields
      - browser_select: Select dropdown options
      - browser_check: Check/uncheck checkboxes
   6. Add verification steps using browser_verify_* tools:
      - browser_verify_text: Verify text content
      - browser_verify_visible: Verify element visibility
      - browser_verify_url: Verify URL navigation
   7. Capture interaction log with generator_read_log
   8. Generate test file with generator_write_test
   9. Repeat for next acceptance criterion
   ```

   **Post-Recording Enhancement:**

   ```
   10. Review generated test code
   11. Enhance with knowledge base patterns:
       - Add Given-When-Then comments
       - Replace recorded selectors with data-testid (if needed)
       - Add network-first interception (from network-first.md)
       - Add fixtures for auth/data setup (from fixture-architecture.md)
       - Use factories for test data (from data-factories.md)
   12. Verify tests fail (missing implementation)
   13. Continue to Step 4 (Build Data Infrastructure)
   ```

   **When to Use Recording Mode:**
   - ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
   - ✅ Visual workflows (modals, dialogs, animations)
   - ✅ Unclear requirements (exploratory, discovering expected behavior)
   - ✅ Multi-page flows (checkout, registration, onboarding)
   - ❌ NOT for simple CRUD (AI generation faster)
   - ❌ NOT for API-only tests (no UI to record)

   **When to Use AI Generation (Default):**
   - ✅ Clear acceptance criteria available
   - ✅ Standard patterns (login, CRUD, navigation)
   - ✅ Need many tests quickly
   - ✅ API/backend tests (no UI interaction)

4. **Proceed to Test Level Selection**

   After mode selection:
   - AI Generation: Continue to Step 2 (Select Test Levels and Strategy)
   - Recording: Skip to Step 4 (Build Data Infrastructure) - tests already generated

---

## Step 2: Select Test Levels and Strategy

### Actions
@@ -583,18 +693,24 @@ test('should display user info', async ({ page }) => {

### Knowledge Base Integration

**Auto-load enabled:**
**Core Fragments (Auto-loaded in Step 1):**

- `fixture-architecture.md` - Fixture patterns
- `data-factories.md` - Factory patterns
- `component-tdd.md` - Component testing
- `network-first.md` - Route interception
- `fixture-architecture.md` - Pure function → fixture → mergeTests patterns (406 lines, 5 examples)
- `data-factories.md` - Factory patterns with faker, overrides, API seeding (498 lines, 5 examples)
- `component-tdd.md` - Red-green-refactor, provider isolation, accessibility, visual regression (480 lines, 4 examples)
- `network-first.md` - Intercept before navigate, HAR capture, deterministic waiting (489 lines, 5 examples)
- `test-quality.md` - Deterministic tests, cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
- `selector-resilience.md` - Selector hierarchy (data-testid > ARIA > text > CSS), dynamic patterns, anti-patterns (541 lines, 4 examples)
- `timing-debugging.md` - Race condition prevention, deterministic waiting, async debugging (370 lines, 3 examples)

**Manual reference:**
**Reference for Test Level Selection:**

- Use `tea-index.csv` to find additional fragments
- Load `test-levels-framework.md` for level selection
- Load `test-quality.md` for test design principles
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework (467 lines, 4 examples)

**Manual Reference (Optional):**

- Use `tea-index.csv` to find additional specialized fragments as needed

---

@@ -75,6 +75,13 @@ The TEA agent runs this workflow when:
|
||||
- `require_self_cleaning`: All tests must clean up data (default: true)
|
||||
- `auto_load_knowledge`: Load relevant knowledge fragments (default: true)
|
||||
- `run_tests_after_generation`: Verify tests pass/fail as expected (default: true)
|
||||
- `auto_validate`: Run generated tests after creation (default: true) **NEW**
|
||||
- `auto_heal_failures`: Enable automatic healing (default: false, opt-in) **NEW**
|
||||
- `max_healing_iterations`: Maximum healing attempts per test (default: 3) **NEW**
|
||||
- `fail_on_unhealable`: Fail workflow if tests can't be healed (default: false) **NEW**
|
||||
- `mark_unhealable_as_fixme`: Mark unfixable tests with test.fixme() (default: true) **NEW**
|
||||
- `use_mcp_healing`: Use Playwright MCP if available (default: true) **NEW**
|
||||
- `healing_knowledge_fragments`: Healing patterns to load (default: "test-healing-patterns,selector-resilience,timing-debugging") **NEW**
|
||||
|
||||
## Outputs
|
||||
|
||||
@@ -161,6 +168,269 @@ The TEA agent runs this workflow when:
|
||||
|
||||
Use E2E sparingly for critical paths. Use API/Component/Unit for variations and edge cases.
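
As a sketch of that split (the endpoint, selectors, and priorities here are hypothetical examples, not prescribed values): one E2E spec covers the critical path end to end, while edge cases stay at the API level using Playwright's `request` fixture.

```typescript
import { test, expect } from '@playwright/test';

// E2E: one critical-path journey only
test('[P0] checkout happy path', async ({ page }) => {
  await page.goto('/products/1');
  await page.getByTestId('add-to-cart').click();
  await page.getByTestId('checkout').click();
  await expect(page.getByTestId('order-confirmation')).toBeVisible();
});

// API: cheap coverage for variations and edge cases
test('[P1] checkout rejects an empty cart', async ({ request }) => {
  const response = await request.post('/api/checkout', { data: { items: [] } });
  expect(response.status()).toBe(400);
});
```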
|
||||
|
||||
### Healing Capabilities (NEW - Phase 2.5)
|
||||
|
||||
**automate** automatically validates and heals test failures after generation.
|
||||
|
||||
**Configuration**: Controlled by `config.tea_use_mcp_enhancements` (default: true)
|
||||
|
||||
- If true + MCP available → MCP-assisted healing
|
||||
- If true + MCP unavailable → Pattern-based healing
|
||||
- If false → No healing, document failures for manual review
|
||||
|
||||
**Constants**: Max 3 healing attempts, unfixable tests marked as `test.fixme()`
|
||||
|
||||
**How Healing Works (Default - Pattern-Based):**
|
||||
|
||||
TEA heals tests using pattern-based analysis by:
|
||||
|
||||
1. **Parsing error messages** from test output logs
|
||||
2. **Matching patterns** against known failure signatures
|
||||
3. **Applying fixes** from healing knowledge fragments:
|
||||
- `test-healing-patterns.md` - Common failure patterns (selectors, timing, data, network)
|
||||
- `selector-resilience.md` - Selector refactoring (CSS → data-testid, nth() → filter())
|
||||
- `timing-debugging.md` - Race condition fixes (hard waits → event-based waits)
|
||||
4. **Re-running tests** to verify fix (max 3 iterations)
|
||||
5. **Marking unfixable tests** as `test.fixme()` with detailed comments
|
||||
|
||||
**This works well for:**
|
||||
|
||||
- ✅ Common failure patterns (stale selectors, timing issues, dynamic data)
|
||||
- ✅ Text-based errors with clear signatures
|
||||
- ✅ Issues documented in knowledge base
|
||||
- ✅ Automated CI environments without browser access
|
||||
|
||||
**What MCP Adds (Interactive Debugging Enhancement):**
|
||||
|
||||
When Playwright MCP is available, TEA **additionally**:
|
||||
|
||||
1. **Debugs failures interactively** before applying pattern-based fixes:
|
||||
- **Pause test execution** with `playwright_test_debug_test` (step through, inspect state)
|
||||
- **See visual failure context** with `browser_snapshot` (screenshot of failure state)
|
||||
- **Inspect live DOM** with browser tools (find why selector doesn't match)
|
||||
- **Analyze console logs** with `browser_console_messages` (JS errors, warnings, debug output)
|
||||
- **Inspect network activity** with `browser_network_requests` (failed API calls, CORS errors, timeouts)
|
||||
|
||||
2. **Enhances pattern-based fixes** with real-world data:
|
||||
- **Pattern match identifies issue** (e.g., "stale selector")
|
||||
- **MCP discovers actual selector** with `browser_generate_locator` from live page
|
||||
- **TEA applies a refined fix** using the real DOM structure (not just a pattern guess)
|
||||
- **Verification happens in browser** (see if fix works visually)
|
||||
|
||||
3. **Catches root causes** pattern matching might miss:
|
||||
- **Network failures**: MCP shows 500 error on API call (not just timeout)
|
||||
- **JS errors**: MCP shows `TypeError: undefined` in console (not just "element not found")
|
||||
- **Timing issues**: MCP shows loading spinner still visible (not just "selector timeout")
|
||||
- **State problems**: MCP shows modal blocking button (not just "not clickable")
|
||||
|
||||
**Key Benefits of MCP Enhancement:**
|
||||
|
||||
- ✅ **Pattern-based fixes** (fast, automated) **+** **MCP verification** (accurate, context-aware)
|
||||
- ✅ **Visual debugging**: See exactly what user sees when test fails
|
||||
- ✅ **DOM inspection**: Discover why selectors don't match (element missing, wrong attributes, dynamic IDs)
|
||||
- ✅ **Network visibility**: Identify API failures, slow requests, CORS issues
|
||||
- ✅ **Console analysis**: Catch JS errors that break page functionality
|
||||
- ✅ **Robust selectors**: Generate locators from actual DOM (role, text, testid hierarchy)
|
||||
- ✅ **Faster iteration**: Debug and fix in same browser session (no restart needed)
|
||||
- ✅ **Higher success rate**: MCP helps diagnose failures pattern matching can't solve
|
||||
|
||||
**Example Enhancement Flow:**
|
||||
|
||||
```
|
||||
1. Pattern-based healing identifies issue
|
||||
→ Error: "Locator '.submit-btn' resolved to 0 elements"
|
||||
→ Pattern match: Stale selector (CSS class)
|
||||
→ Suggested fix: Replace with data-testid
|
||||
|
||||
2. MCP enhances diagnosis (if available)
|
||||
→ browser_snapshot shows button exists but has class ".submit-button" (not ".submit-btn")
|
||||
→ browser_generate_locator finds: button[type="submit"].submit-button
|
||||
→ browser_console_messages shows no errors
|
||||
|
||||
3. TEA applies refined fix
|
||||
→ await page.locator('button[type="submit"]').click()
|
||||
→ (More accurate than pattern-based guess)
|
||||
```
|
||||
|
||||
**Healing Modes:**
|
||||
|
||||
1. **MCP-Enhanced Healing** (when Playwright MCP available):
|
||||
- Pattern-based analysis **+** Interactive debugging
|
||||
- Visual context with `browser_snapshot`
|
||||
- Console log analysis with `browser_console_messages`
|
||||
- Network inspection with `browser_network_requests`
|
||||
- Live DOM inspection with `browser_generate_locator`
|
||||
- Step-by-step debugging with `playwright_test_debug_test`
|
||||
|
||||
2. **Pattern-Based Healing** (always available):
|
||||
- Error message parsing and pattern matching
|
||||
- Automated fixes from healing knowledge fragments
|
||||
- Text-based analysis (no visual/DOM inspection)
|
||||
- Works in CI without browser access
|
||||
|
||||
**Healing Workflow:**
|
||||
|
||||
```
|
||||
1. Generate tests → Run tests
|
||||
2. IF pass → Success ✅
|
||||
3. IF fail AND auto_heal_failures=false → Report failures ⚠️
|
||||
4. IF fail AND auto_heal_failures=true → Enter healing loop:
|
||||
a. Identify failure pattern (selector, timing, data, network)
|
||||
b. Apply automated fix from knowledge base
|
||||
c. Re-run test (max 3 iterations)
|
||||
d. IF healed → Success ✅
|
||||
e. IF unhealable → Mark test.fixme() with detailed comment
|
||||
```
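
A minimal TypeScript sketch of the loop above. The helpers are injected so the loop stays independent of any real tooling; every name here (`classify`, `applyFix`, `runTest`, `markAsFixme`) is illustrative, not part of TEA or Playwright.

```typescript
type FailurePattern = 'selector' | 'timing' | 'data' | 'network' | 'hard-wait';

interface HealingDeps {
  classify(error: string): FailurePattern;
  applyFix(testFile: string, pattern: FailurePattern): Promise<void>;
  runTest(testFile: string): Promise<{ passed: boolean; error?: string }>;
  markAsFixme(testFile: string, reason: string): Promise<void>;
}

export async function healTest(
  testFile: string,
  initialError: string,
  deps: HealingDeps,
  maxIterations = 3,
): Promise<boolean> {
  let error = initialError;
  for (let attempt = 1; attempt <= maxIterations; attempt++) {
    const pattern = deps.classify(error);        // a. identify failure pattern
    await deps.applyFix(testFile, pattern);      // b. apply automated fix from knowledge base
    const result = await deps.runTest(testFile); // c. re-run the test
    if (result.passed) return true;              // d. healed
    error = result.error ?? error;               // feed the new error into the next attempt
  }
  await deps.markAsFixme(testFile, error);       // e. unhealable → test.fixme() with comments
  return false;
}
```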
|
||||
|
||||
**Example Healing Outcomes:**
|
||||
|
||||
```typescript
|
||||
// ❌ Original (failing): CSS class selector
|
||||
await page.locator('.btn-primary').click();
|
||||
|
||||
// ✅ Healed: data-testid selector
|
||||
await page.getByTestId('submit-button').click();
|
||||
|
||||
// ❌ Original (failing): Hard wait
|
||||
await page.waitForTimeout(3000);
|
||||
|
||||
// ✅ Healed: Network-first pattern
|
||||
await page.waitForResponse('**/api/data');
|
||||
|
||||
// ❌ Original (failing): Hardcoded ID
|
||||
await expect(page.getByText('User 123')).toBeVisible();
|
||||
|
||||
// ✅ Healed: Regex pattern
|
||||
await expect(page.getByText(/User \d+/)).toBeVisible();
|
||||
```
|
||||
|
||||
**Unfixable Tests (Marked as test.fixme()):**
|
||||
|
||||
```typescript
|
||||
test.fixme('[P1] should handle complex interaction', async ({ page }) => {
|
||||
// FIXME: Test healing failed after 3 attempts
|
||||
// Failure: "Locator 'button[data-action="submit"]' resolved to 0 elements"
|
||||
// Attempted fixes:
|
||||
// 1. Replaced with page.getByTestId('submit-button') - still failing
|
||||
// 2. Replaced with page.getByRole('button', { name: 'Submit' }) - still failing
|
||||
// 3. Added waitForLoadState('networkidle') - still failing
|
||||
// Manual investigation needed: Selector may require application code changes
|
||||
// TODO: Review with team, may need data-testid added to button component
|
||||
// Original test code...
|
||||
});
|
||||
```
|
||||
|
||||
**When to Enable Healing:**
|
||||
|
||||
- ✅ Enable for greenfield projects (catch generated test issues early)
|
||||
- ✅ Enable for brownfield projects (auto-fix legacy selector patterns)
|
||||
- ❌ Disable if environment not ready (application not deployed/seeded)
|
||||
- ❌ Disable if preferring manual review of all generated tests
|
||||
|
||||
**Healing Report Example:**
|
||||
|
||||
```markdown
|
||||
## Test Healing Report
|
||||
|
||||
**Auto-Heal Enabled**: true
|
||||
**Healing Mode**: Pattern-based
|
||||
**Iterations Allowed**: 3
|
||||
|
||||
### Validation Results
|
||||
|
||||
- **Total tests**: 10
|
||||
- **Passing**: 7
|
||||
- **Failing**: 3
|
||||
|
||||
### Healing Outcomes
|
||||
|
||||
**Successfully Healed (2 tests):**
|
||||
|
||||
- `tests/e2e/login.spec.ts:15` - Stale selector (CSS class → data-testid)
|
||||
- `tests/e2e/checkout.spec.ts:42` - Race condition (added network-first interception)
|
||||
|
||||
**Unable to Heal (1 test):**
|
||||
|
||||
- `tests/e2e/complex-flow.spec.ts:67` - Marked as test.fixme()
|
||||
- Requires application code changes (add data-testid to component)
|
||||
|
||||
### Healing Patterns Applied
|
||||
|
||||
- **Selector fixes**: 1
|
||||
- **Timing fixes**: 1
|
||||
```
|
||||
|
||||
**Graceful Degradation:**
|
||||
|
||||
- Healing is OPTIONAL (default: disabled)
|
||||
- Works without Playwright MCP (pattern-based fallback)
|
||||
- Unfixable tests marked clearly (not silently broken)
|
||||
- Manual investigation path documented
|
||||
|
||||
### Recording Mode (NEW - Phase 2.5)
|
||||
|
||||
**automate** can record complex UI interactions instead of AI generation.
|
||||
|
||||
**Activation**: Automatic for complex UI scenarios when `config.tea_use_mcp_enhancements` is true and MCP is available
|
||||
|
||||
- Complex scenarios: drag-drop, wizards, multi-page flows
|
||||
- Fallback: AI generation (silent, automatic)
|
||||
|
||||
**When to Use Recording Mode:**
|
||||
|
||||
- ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
|
||||
- ✅ Visual workflows (modals, dialogs, animations, transitions)
|
||||
- ✅ Unclear requirements (exploratory, discovering behavior)
|
||||
- ✅ Multi-page flows (checkout, registration, onboarding)
|
||||
- ❌ NOT for simple CRUD (AI generation faster)
|
||||
- ❌ NOT for API-only tests (no UI to record)
|
||||
|
||||
**When to Use AI Generation (Default):**
|
||||
|
||||
- ✅ Clear requirements available
|
||||
- ✅ Standard patterns (login, CRUD, navigation)
|
||||
- ✅ Need many tests quickly
|
||||
- ✅ API/backend tests (no UI interaction)
|
||||
|
||||
**Recording Workflow (Same as atdd):**
|
||||
|
||||
```
|
||||
1. Set generation_mode: "recording"
|
||||
2. Use generator_setup_page to init recording
|
||||
3. For each test scenario:
|
||||
- Execute with browser_* tools (navigate, click, type, select)
|
||||
- Add verifications with browser_verify_* tools
|
||||
- Capture log and generate test file
|
||||
4. Enhance with knowledge base patterns:
|
||||
- Given-When-Then structure
|
||||
- data-testid selectors
|
||||
- Network-first interception
|
||||
- Fixtures/factories
|
||||
5. Validate (run tests if auto_validate enabled)
|
||||
6. Heal if needed (if auto_heal_failures enabled)
|
||||
```
|
||||
|
||||
**Combination: Recording + Healing:**
|
||||
|
||||
**automate** can use BOTH recording and healing together:
|
||||
|
||||
- Generate tests via recording (complex flows captured interactively)
|
||||
- Run tests to validate (auto_validate)
|
||||
- Heal failures automatically (auto_heal_failures)
|
||||
|
||||
This is particularly powerful for brownfield projects where:
|
||||
|
||||
- Requirements unclear → Use recording to capture existing behavior
|
||||
- Application complex → Recording captures nuances AI might miss
|
||||
- Tests may fail → Healing fixes common issues automatically
|
||||
|
||||
**Graceful Degradation:**
|
||||
|
||||
- Recording mode is OPTIONAL (default: AI generation)
|
||||
- Requires Playwright MCP (falls back to AI if unavailable)
|
||||
- Works with or without healing enabled
|
||||
- Same quality output regardless of generation method
|
||||
|
||||
### Test Level Selection Framework
|
||||
|
||||
**E2E (End-to-End)**:
|
||||
@@ -421,8 +691,7 @@ await element.click();
|
||||
|
||||
**After this workflow:**
|
||||
|
||||
- **trace** workflow: Update traceability matrix with new test coverage
|
||||
- **gate** workflow: Quality gate decision using test results
|
||||
- **trace** workflow: Update traceability matrix with new test coverage (Phase 1) and make quality gate decision (Phase 2)
|
||||
- **CI pipeline**: Run tests in burn-in loop to detect flaky patterns
|
||||
|
||||
**Coordinates with:**
|
||||
@@ -450,7 +719,7 @@ await element.click();
|
||||
|
||||
- **atdd**: REQUIRES story with acceptance criteria (halt if missing)
|
||||
- **test-design**: REQUIRES PRD/epic context (halt if missing)
|
||||
- **gate**: REQUIRES test results (halt if missing)
|
||||
- **trace (Phase 2)**: REQUIRES test results for gate decision (halt if missing)
|
||||
|
||||
### File Size Limits
|
||||
|
||||
@@ -494,7 +763,13 @@ This workflow automatically consults:
|
||||
- **ci-burn-in.md** - Flaky test detection patterns (10 iterations to catch intermittent failures)
|
||||
- **test-quality.md** - Test design principles (Given-When-Then, determinism, isolation, atomic assertions)
|
||||
|
||||
See `tea-index.csv` for complete knowledge fragment mapping.
|
||||
**Healing Knowledge (If `auto_heal_failures` enabled):**
|
||||
|
||||
- **test-healing-patterns.md** - Common failure patterns and automated fixes (selectors, timing, data, network, hard waits)
|
||||
- **selector-resilience.md** - Robust selector strategies and debugging (data-testid hierarchy, filter vs nth, anti-patterns)
|
||||
- **timing-debugging.md** - Race condition identification and deterministic wait fixes (network-first, event-based waits)
|
||||
|
||||
See `tea-index.csv` for complete knowledge fragment mapping (22 fragments total).
|
||||
|
||||
## Example Output
|
||||
|
||||
|
||||
@@ -233,7 +233,71 @@ Before starting this workflow, verify:
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Documentation and Scripts Updated
|
||||
## Step 5: Test Validation and Healing (NEW - Phase 2.5)
|
||||
|
||||
### Healing Configuration
|
||||
|
||||
- [ ] Healing configuration checked:
|
||||
- [ ] `{auto_validate}` setting noted (default: true)
|
||||
- [ ] `{auto_heal_failures}` setting noted (default: false)
|
||||
- [ ] `{max_healing_iterations}` setting noted (default: 3)
|
||||
- [ ] `{use_mcp_healing}` setting noted (default: true)
|
||||
|
||||
### Healing Knowledge Fragments Loaded (If Healing Enabled)
|
||||
|
||||
- [ ] `test-healing-patterns.md` loaded (common failure patterns and fixes)
|
||||
- [ ] `selector-resilience.md` loaded (selector refactoring guide)
|
||||
- [ ] `timing-debugging.md` loaded (race condition fixes)
|
||||
|
||||
### Test Execution and Validation
|
||||
|
||||
- [ ] Generated tests executed (if `{auto_validate}` true)
|
||||
- [ ] Test results captured:
|
||||
- [ ] Total tests run
|
||||
- [ ] Passing tests count
|
||||
- [ ] Failing tests count
|
||||
- [ ] Error messages and stack traces captured
|
||||
|
||||
### Healing Loop (If Enabled and Tests Failed)
|
||||
|
||||
- [ ] Healing loop entered (if `{auto_heal_failures}` true AND tests failed)
|
||||
- [ ] For each failing test:
|
||||
- [ ] Failure pattern identified (selector, timing, data, network, hard wait)
|
||||
- [ ] Appropriate healing strategy applied:
|
||||
- [ ] Stale selector → Replaced with data-testid or ARIA role
|
||||
- [ ] Race condition → Added network-first interception or state waits
|
||||
- [ ] Dynamic data → Replaced hardcoded values with regex/dynamic generation
|
||||
- [ ] Network error → Added route mocking
|
||||
- [ ] Hard wait → Replaced with event-based wait
|
||||
- [ ] Healed test re-run to validate fix
|
||||
- [ ] Iteration count tracked (max 3 attempts)
|
||||
|
||||
### Unfixable Tests Handling
|
||||
|
||||
- [ ] Tests that couldn't be healed after 3 iterations marked with `test.fixme()` (if `{mark_unhealable_as_fixme}` true)
|
||||
- [ ] Detailed comment added to test.fixme() tests:
|
||||
- [ ] What failure occurred
|
||||
- [ ] What healing was attempted (3 iterations)
|
||||
- [ ] Why healing failed
|
||||
- [ ] Manual investigation steps needed
|
||||
- [ ] Original test logic preserved in comments
|
||||
|
||||
### Healing Report Generated
|
||||
|
||||
- [ ] Healing report generated (if healing attempted)
|
||||
- [ ] Report includes:
|
||||
- [ ] Auto-heal enabled status
|
||||
- [ ] Healing mode (MCP-assisted or Pattern-based)
|
||||
- [ ] Iterations allowed (max_healing_iterations)
|
||||
- [ ] Validation results (total, passing, failing)
|
||||
- [ ] Successfully healed tests (count, file:line, fix applied)
|
||||
- [ ] Unable to heal tests (count, file:line, reason)
|
||||
- [ ] Healing patterns applied (selector fixes, timing fixes, data fixes)
|
||||
- [ ] Knowledge base references used
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Documentation and Scripts Updated
|
||||
|
||||
### Test README Updated
|
||||
|
||||
@@ -393,9 +457,13 @@ All of the following must be true before marking this workflow as complete:
|
||||
- [ ] **Test README updated** with execution instructions and patterns
|
||||
- [ ] **package.json scripts updated** with test execution commands
|
||||
- [ ] **Test suite run locally** (if run_tests_after_generation true)
|
||||
- [ ] **Tests validated** (if auto_validate enabled)
|
||||
- [ ] **Failures healed** (if auto_heal_failures enabled and tests failed)
|
||||
- [ ] **Healing report generated** (if healing attempted)
|
||||
- [ ] **Unfixable tests marked** with test.fixme() and detailed comments (if any)
|
||||
- [ ] **Automation summary created** and saved to correct location
|
||||
- [ ] **Output file formatted correctly**
|
||||
- [ ] **Knowledge base references applied** and documented
|
||||
- [ ] **Knowledge base references applied** and documented (including healing fragments if used)
|
||||
- [ ] **No test quality issues** (flaky patterns, race conditions, hardcoded data, page objects)
|
||||
|
||||
---
|
||||
@@ -505,5 +573,8 @@ All of the following must be true before marking this workflow as complete:
|
||||
- **Priority tagging enables selective execution:** P0 tests run on every commit, P1 on PR, P2 nightly
|
||||
- **Network-first pattern prevents race conditions:** Route interception BEFORE navigation
|
||||
- **No page objects:** Keep tests simple, direct, and maintainable
|
||||
- **Use knowledge base:** Load relevant fragments (test-levels, test-priorities, fixture-architecture, data-factories) for guidance
|
||||
- **Use knowledge base:** Load relevant fragments (test-levels, test-priorities, fixture-architecture, data-factories, healing patterns) for guidance
|
||||
- **Deterministic tests only:** No hard waits, no conditional flow, no flaky patterns allowed
|
||||
- **Optional healing:** auto_heal_failures disabled by default (opt-in for automatic test healing)
|
||||
- **Graceful degradation:** Healing works without Playwright MCP (pattern-based fallback)
|
||||
- **Unfixable tests handled:** Mark with test.fixme() and detailed comments (not silently broken)
|
||||
|
||||
@@ -84,13 +84,19 @@ Expands test automation coverage by generating comprehensive test suites at appr
|
||||
5. **Load Knowledge Base Fragments**
|
||||
|
||||
**Critical:** Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load:
|
||||
- `test-levels-framework.md` - Test level selection (E2E vs API vs Component vs Unit)
|
||||
- `test-priorities.md` - Priority classification (P0-P3)
|
||||
- `fixture-architecture.md` - Test fixture patterns
|
||||
- `data-factories.md` - Factory patterns with faker
|
||||
- `selective-testing.md` - Targeted test execution strategies
|
||||
- `ci-burn-in.md` - Flaky test detection patterns
|
||||
- `test-quality.md` - Test design principles
|
||||
- `test-levels-framework.md` - Test level selection (E2E vs API vs Component vs Unit with decision matrix, 467 lines, 4 examples)
|
||||
- `test-priorities-matrix.md` - Priority classification (P0-P3 with automated scoring, risk mapping, 389 lines, 2 examples)
|
||||
- `fixture-architecture.md` - Test fixture patterns (pure function → fixture → mergeTests, auto-cleanup, 406 lines, 5 examples)
|
||||
- `data-factories.md` - Factory patterns with faker (overrides, nested factories, API seeding, 498 lines, 5 examples)
|
||||
- `selective-testing.md` - Targeted test execution strategies (tag-based, spec filters, diff-based, promotion rules, 727 lines, 4 examples)
|
||||
- `ci-burn-in.md` - Flaky test detection patterns (10-iteration burn-in, sharding, selective execution, 678 lines, 4 examples)
|
||||
- `test-quality.md` - Test design principles (deterministic, isolated, explicit assertions, length/time limits, 658 lines, 5 examples)
|
||||
- `network-first.md` - Route interception patterns (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)
|
||||
|
||||
**Healing Knowledge (If `{auto_heal_failures}` is true):**
|
||||
- `test-healing-patterns.md` - Common failure patterns and automated fixes (stale selectors, race conditions, dynamic data, network errors, hard waits, 648 lines, 5 examples)
|
||||
- `selector-resilience.md` - Selector debugging and refactoring guide (data-testid > ARIA > text > CSS hierarchy, anti-patterns, 541 lines, 4 examples)
|
||||
- `timing-debugging.md` - Race condition identification and fixes (network-first, deterministic waiting, async debugging, 370 lines, 3 examples)
|
||||
|
||||
---
|
||||
|
||||
@@ -163,7 +169,7 @@ Expands test automation coverage by generating comprehensive test suites at appr
|
||||
|
||||
4. **Assign Test Priorities**
|
||||
|
||||
**Knowledge Base Reference**: `test-priorities.md`
|
||||
**Knowledge Base Reference**: `test-priorities-matrix.md`
|
||||
|
||||
**P0 (Critical - Every commit)**:
|
||||
- Critical user paths that must always work
|
||||
@@ -558,7 +564,208 @@ Expands test automation coverage by generating comprehensive test suites at appr
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Update Documentation and Scripts
|
||||
## Step 5: Execute, Validate & Heal Generated Tests (NEW - Phase 2.5)
|
||||
|
||||
**Purpose**: Automatically validate generated tests and heal common failures before delivery
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Validate Generated Tests**
|
||||
|
||||
Validation always runs (`auto_validate` is always true):
|
||||
- Run generated tests to verify they work
|
||||
- Continue with healing if `config.tea_use_mcp_enhancements` is true
|
||||
|
||||
2. **Run Generated Tests**
|
||||
|
||||
Execute the full test suite that was just generated:
|
||||
|
||||
```bash
|
||||
npx playwright test {generated_test_files}
|
||||
```
|
||||
|
||||
Capture results:
|
||||
- Total tests run
|
||||
- Passing tests count
|
||||
- Failing tests count
|
||||
- Error messages and stack traces for failures
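
One way to capture those numbers programmatically is to run the suite with Playwright's JSON reporter and count result statuses. The exact report shape varies by Playwright version, so this sketch walks the parsed JSON generically rather than assuming specific field names; the `tests/e2e/` path is illustrative.

```typescript
import { spawnSync } from 'node:child_process';

// Run the generated specs with the JSON reporter (spawnSync does not throw on test failures).
const run = spawnSync('npx', ['playwright', 'test', 'tests/e2e/', '--reporter=json'], {
  encoding: 'utf-8',
});
const report = JSON.parse(run.stdout);

// Generic walk: count every result object carrying a string "status" field.
const counts: Record<string, number> = {};
(function walk(node: unknown): void {
  if (Array.isArray(node)) return node.forEach(walk);
  if (node && typeof node === 'object') {
    const status = (node as { status?: unknown }).status;
    if (typeof status === 'string') counts[status] = (counts[status] ?? 0) + 1;
    Object.values(node).forEach(walk);
  }
})(report);

console.log(counts); // e.g. { passed: 7, failed: 3 }
```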
|
||||
|
||||
3. **Evaluate Results**
|
||||
|
||||
**If ALL tests pass:**
|
||||
- ✅ Generate report with success summary
|
||||
- Proceed to Step 6 (Documentation and Scripts)
|
||||
|
||||
**If tests FAIL:**
|
||||
- Check the `config.tea_use_mcp_enhancements` setting
|
||||
- If true: Enter healing loop (Step 5.4)
|
||||
- If false: Document failures for manual review, proceed to Step 6
|
||||
|
||||
4. **Healing Loop (If `config.tea_use_mcp_enhancements` is true)**
|
||||
|
||||
**Iteration limit**: 3 attempts per test (constant)
|
||||
|
||||
**For each failing test:**
|
||||
|
||||
**A. Load Healing Knowledge Fragments**
|
||||
|
||||
Consult `tea-index.csv` to load healing patterns:
|
||||
- `test-healing-patterns.md` - Common failure patterns and fixes
|
||||
- `selector-resilience.md` - Selector debugging and refactoring
|
||||
- `timing-debugging.md` - Race condition identification and fixes
|
||||
|
||||
**B. Identify Failure Pattern**
|
||||
|
||||
Analyze error message and stack trace to classify failure type:
|
||||
|
||||
**Stale Selector Failure:**
|
||||
- Error contains: "locator resolved to 0 elements", "element not found", "unable to find element"
|
||||
- Extract selector from error message
|
||||
- Apply selector healing (knowledge from `selector-resilience.md`):
|
||||
- If CSS class → Replace with `page.getByTestId()`
|
||||
- If nth() → Replace with `filter({ hasText })`
|
||||
- If ID → Replace with data-testid
|
||||
- If complex XPath → Replace with ARIA role
|
||||
|
||||
**Race Condition Failure:**
|
||||
- Error contains: "timeout waiting for", "element not visible", "timed out retrying"
|
||||
- Detect missing network waits or hard waits in test code
|
||||
- Apply timing healing (knowledge from `timing-debugging.md`):
|
||||
- Add network-first interception before navigate
|
||||
- Replace `waitForTimeout()` with `waitForResponse()`
|
||||
- Add explicit element state waits (`waitFor({ state: 'visible' })`)
|
||||
|
||||
**Dynamic Data Failure:**
|
||||
- Error contains: "Expected 'User 123' but received 'User 456'", timestamp mismatches
|
||||
- Identify hardcoded assertions
|
||||
- Apply data healing (knowledge from `test-healing-patterns.md`):
|
||||
- Replace hardcoded IDs with regex (`/User \d+/`)
|
||||
- Replace hardcoded dates with dynamic generation
|
||||
- Capture dynamic values and use in assertions
|
||||
|
||||
**Network Error Failure:**
|
||||
- Error contains: "API call failed", "500 error", "network error"
|
||||
- Detect missing route interception
|
||||
- Apply network healing (knowledge from `test-healing-patterns.md`):
|
||||
- Add `page.route()` or `cy.intercept()` for API mocking
|
||||
- Mock error scenarios (500, 429, timeout)
|
||||
|
||||
**Hard Wait Detection:**
|
||||
- Scan test code for `page.waitForTimeout()`, `cy.wait(number)`, `sleep()`
|
||||
- Apply hard wait healing (knowledge from `timing-debugging.md`):
|
||||
- Replace with event-based waits
|
||||
- Add network response waits
|
||||
- Use element state changes
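
A rough sketch of the classification described above, assuming the healing logic runs as a small TypeScript helper. The substrings mirror the error signatures listed for each failure type; they are illustrative, not an exhaustive or official list.

```typescript
type FailurePattern =
  | 'stale-selector'
  | 'race-condition'
  | 'dynamic-data'
  | 'network-error'
  | 'hard-wait'
  | 'unknown';

// errorMessage: test output / stack trace; testSource: the failing spec's code
export function classifyFailure(errorMessage: string, testSource: string): FailurePattern {
  const msg = errorMessage.toLowerCase();
  if (/resolved to 0 elements|element not found|unable to find element/.test(msg)) return 'stale-selector';
  if (/timeout waiting for|element not visible|timed out retrying/.test(msg)) return 'race-condition';
  if (/expected .+ but received/.test(msg)) return 'dynamic-data';
  if (/api call failed|500|network error/.test(msg)) return 'network-error';
  if (/waitfortimeout\(|cy\.wait\(\d+\)|sleep\(/.test(testSource.toLowerCase())) return 'hard-wait';
  return 'unknown';
}
```

The returned pattern then selects which knowledge fragment's fix to apply in the next part of the loop.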
|
||||
|
||||
**C. MCP Healing Mode (If MCP Tools Available)**
|
||||
|
||||
If Playwright MCP tools are available in your IDE:
|
||||
|
||||
Use MCP tools for interactive healing:
|
||||
- `playwright_test_debug_test`: Pause on failure for visual inspection
|
||||
- `browser_snapshot`: Capture visual context at failure point
|
||||
- `browser_console_messages`: Retrieve console logs for JS errors
|
||||
- `browser_network_requests`: Analyze network activity
|
||||
- `browser_generate_locator`: Generate better selectors interactively
|
||||
|
||||
Apply MCP-generated fixes to test code.
|
||||
|
||||
**D. Pattern-Based Healing Mode (Fallback)**
|
||||
|
||||
If MCP unavailable, use pattern-based analysis:
|
||||
- Parse error message and stack trace
|
||||
- Match against failure patterns from knowledge base
|
||||
- Apply fixes programmatically:
|
||||
- Selector fixes: Use suggestions from `selector-resilience.md`
|
||||
- Timing fixes: Apply patterns from `timing-debugging.md`
|
||||
- Data fixes: Use patterns from `test-healing-patterns.md`
|
||||
|
||||
**E. Apply Healing Fix**
|
||||
- Modify test file with healed code
|
||||
- Re-run test to validate fix
|
||||
- If test passes: Mark as healed, move to next failure
|
||||
- If test fails: Increment iteration count, try different pattern
|
||||
|
||||
**F. Iteration Limit Handling**
|
||||
|
||||
After 3 failed healing attempts:
|
||||
|
||||
Always mark the test as unfixable:
|
||||
- Mark test with `test.fixme()` instead of `test()`
|
||||
- Add detailed comment explaining:
|
||||
- What failure occurred
|
||||
- What healing was attempted (3 iterations)
|
||||
- Why healing failed
|
||||
- Manual investigation needed
|
||||
|
||||
```typescript
|
||||
test.fixme('[P1] should handle complex interaction', async ({ page }) => {
|
||||
// FIXME: Test healing failed after 3 attempts
|
||||
// Failure: "Locator 'button[data-action="submit"]' resolved to 0 elements"
|
||||
// Attempted fixes:
|
||||
// 1. Replaced with page.getByTestId('submit-button') - still failing
|
||||
// 2. Replaced with page.getByRole('button', { name: 'Submit' }) - still failing
|
||||
// 3. Added waitForLoadState('networkidle') - still failing
|
||||
// Manual investigation needed: Selector may require application code changes
|
||||
// TODO: Review with team, may need data-testid added to button component
|
||||
// Original test code...
|
||||
});
|
||||
```
|
||||
|
||||
**Note**: Workflow continues even with unfixable tests (marked as test.fixme() for manual review)
|
||||
|
||||
5. **Generate Healing Report**
|
||||
|
||||
Document healing outcomes:
|
||||
|
||||
```markdown
|
||||
## Test Healing Report
|
||||
|
||||
**Auto-Heal Enabled**: {auto_heal_failures}
|
||||
**Healing Mode**: {use_mcp_healing ? "MCP-assisted" : "Pattern-based"}
|
||||
**Iterations Allowed**: {max_healing_iterations}
|
||||
|
||||
### Validation Results
|
||||
|
||||
- **Total tests**: {total_tests}
|
||||
- **Passing**: {passing_tests}
|
||||
- **Failing**: {failing_tests}
|
||||
|
||||
### Healing Outcomes
|
||||
|
||||
**Successfully Healed ({healed_count} tests):**
|
||||
|
||||
- `tests/e2e/login.spec.ts:15` - Stale selector (CSS class → data-testid)
|
||||
- `tests/e2e/checkout.spec.ts:42` - Race condition (added network-first interception)
|
||||
- `tests/api/users.spec.ts:28` - Dynamic data (hardcoded ID → regex pattern)
|
||||
|
||||
**Unable to Heal ({unfixable_count} tests):**
|
||||
|
||||
- `tests/e2e/complex-flow.spec.ts:67` - Marked as test.fixme() with manual investigation needed
|
||||
- Failure: Locator not found after 3 healing attempts
|
||||
- Requires application code changes (add data-testid to component)
|
||||
|
||||
### Healing Patterns Applied
|
||||
|
||||
- **Selector fixes**: 2 (CSS class → data-testid, nth() → filter())
|
||||
- **Timing fixes**: 1 (added network-first interception)
|
||||
- **Data fixes**: 1 (hardcoded ID → regex)
|
||||
|
||||
### Knowledge Base References
|
||||
|
||||
- `test-healing-patterns.md` - Common failure patterns
|
||||
- `selector-resilience.md` - Selector refactoring guide
|
||||
- `timing-debugging.md` - Race condition prevention
|
||||
```
|
||||
|
||||
6. **Update Test Files with Healing Results**
|
||||
- Save healed test code to files
|
||||
- Mark unfixable tests with `test.fixme()` and detailed comments
|
||||
- Preserve original test logic in comments (for debugging)
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Update Documentation and Scripts
|
||||
|
||||
### Actions
|
||||
|
||||
@@ -956,20 +1163,26 @@ test('should login', async ({ page }) => {
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
**Auto-load enabled:**
|
||||
**Core Fragments (Auto-loaded in Step 1):**
|
||||
|
||||
- `test-levels-framework.md` - Test level selection
|
||||
- `test-priorities.md` - Priority classification
|
||||
- `fixture-architecture.md` - Fixture patterns
|
||||
- `data-factories.md` - Factory patterns
|
||||
- `selective-testing.md` - Targeted test execution
|
||||
- `ci-burn-in.md` - Flaky test detection
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
|
||||
- `test-priorities-matrix.md` - P0-P3 classification with automated scoring and risk mapping (389 lines, 2 examples)
|
||||
- `fixture-architecture.md` - Pure function → fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)
|
||||
- `data-factories.md` - Factory patterns with faker: overrides, nested factories, API seeding (498 lines, 5 examples)
|
||||
- `selective-testing.md` - Tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
|
||||
- `ci-burn-in.md` - 10-iteration burn-in loop, parallel sharding, selective execution (678 lines, 4 examples)
|
||||
- `test-quality.md` - Deterministic tests, isolated with cleanup, explicit assertions, length/time optimization (658 lines, 5 examples)
|
||||
- `network-first.md` - Intercept before navigate, HAR capture, deterministic waiting strategies (489 lines, 5 examples)
|
||||
|
||||
**Manual reference:**
|
||||
**Healing Fragments (Auto-loaded if `{auto_heal_failures}` enabled):**
|
||||
|
||||
- Use `tea-index.csv` to find additional fragments
|
||||
- Load `network-first.md` for route interception patterns
|
||||
- Load `test-quality.md` for test design principles
|
||||
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
|
||||
- `selector-resilience.md` - Selector hierarchy (data-testid > ARIA > text > CSS), dynamic patterns, anti-patterns refactoring (541 lines, 4 examples)
|
||||
- `timing-debugging.md` - Race condition prevention, deterministic waiting, async debugging techniques (370 lines, 3 examples)
|
||||
|
||||
**Manual Reference (Optional):**
|
||||
|
||||
- Use `tea-index.csv` to find additional specialized fragments as needed
|
||||
|
||||
---
|
||||
|
||||
@@ -1079,7 +1292,11 @@ After completing all steps, verify:
|
||||
- [ ] Test README updated (execution instructions, priority tagging, patterns)
|
||||
- [ ] package.json scripts updated (test execution commands)
|
||||
- [ ] Test suite run locally (results captured)
|
||||
- [ ] Automation summary created (tests, infrastructure, coverage, DoD)
|
||||
- [ ] Tests validated (if auto_validate enabled)
|
||||
- [ ] Failures healed (if auto_heal_failures enabled)
|
||||
- [ ] Healing report generated (if healing attempted)
|
||||
- [ ] Unfixable tests marked with test.fixme() (if any)
|
||||
- [ ] Automation summary created (tests, infrastructure, coverage, healing, DoD)
|
||||
- [ ] Output file formatted correctly
|
||||
|
||||
Refer to `checklist.md` for comprehensive validation criteria.
|
||||
|
||||
@@ -72,6 +72,7 @@ variables:
|
||||
# Advanced options
|
||||
auto_load_knowledge: true # Load test-levels, test-priorities, fixture-architecture, selective-testing, ci-burn-in
|
||||
run_tests_after_generation: true # Verify tests pass/fail as expected
|
||||
auto_validate: true # Always validate generated tests
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{output_folder}/automation-summary.md"
|
||||
|
||||
@@ -212,7 +212,7 @@ Automatically consults TEA knowledge base:
|
||||
|
||||
- **atdd**: Generate failing tests that run in CI
|
||||
- **automate**: Expand test coverage that CI executes
|
||||
- **gate**: Use CI results for quality gate decisions
|
||||
- **trace (Phase 2)**: Use CI results for quality gate decisions
|
||||
|
||||
**Coordinates with:**
|
||||
|
||||
@@ -484,7 +484,7 @@ bmad tea *ci
|
||||
- **framework**: Set up test infrastructure → [framework/README.md](../framework/README.md)
|
||||
- **atdd**: Generate acceptance tests → [atdd/README.md](../atdd/README.md)
|
||||
- **automate**: Expand test coverage → [automate/README.md](../automate/README.md)
|
||||
- **gate**: Quality gate decisions → [gate/README.md](../gate/README.md)
|
||||
- **trace**: Traceability and quality gate decisions → [trace/README.md](../trace/README.md)
|
||||
|
||||
## Version History
|
||||
|
||||
|
||||
@@ -355,10 +355,11 @@ Scaffolds a production-ready CI/CD quality pipeline with test execution, burn-in
|
||||
|
||||
**Critical:** Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to identify and load relevant knowledge fragments:
|
||||
|
||||
- `ci-burn-in.md` - Burn-in loop patterns and configuration
|
||||
- `selective-testing.md` - Changed test detection strategies
|
||||
- `visual-debugging.md` - Artifact collection best practices
|
||||
- `test-quality.md` - CI-specific test quality criteria
|
||||
- `ci-burn-in.md` - Burn-in loop patterns: 10-iteration detection, GitHub Actions workflow, shard orchestration, selective execution (678 lines, 4 examples)
|
||||
- `selective-testing.md` - Changed test detection strategies: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
|
||||
- `visual-debugging.md` - Artifact collection best practices: trace viewer, HAR recording, custom artifacts, accessibility integration (522 lines, 5 examples)
|
||||
- `test-quality.md` - CI-specific test quality criteria: deterministic tests, isolated with cleanup, explicit assertions, length/time optimization (658 lines, 5 examples)
|
||||
- `playwright-config.md` - CI-optimized configuration: parallelization, artifact output, project dependencies, sharding (722 lines, 5 examples)
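
For reference, a CI-leaning Playwright configuration touching the concerns above (parallelization, retries, artifact output) might look like this minimal sketch; the values are illustrative defaults, not the workflow's prescribed settings.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,                          // parallelization
  forbidOnly: !!process.env.CI,                 // fail CI if test.only slips through
  retries: process.env.CI ? 2 : 0,              // retry suspected-flaky failures in CI only
  workers: process.env.CI ? 4 : undefined,
  reporter: [
    ['html', { open: 'never' }],
    ['junit', { outputFile: 'results/junit.xml' }], // machine-readable artifact for the pipeline
  ],
  use: {
    trace: 'on-first-retry',                    // artifact collection for debugging
    video: 'retain-on-failure',
    screenshot: 'only-on-failure',
  },
});
```

Sharding and the 10-iteration burn-in are typically layered on top in the pipeline definition via CLI flags such as `--shard` and `--repeat-each`.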
|
||||
|
||||
### CI Platform-Specific Guidance
|
||||
|
||||
|
||||
@@ -351,11 +351,11 @@ The generated `tests/README.md` should include:
|
||||
|
||||
**Critical:** Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to identify and load relevant knowledge fragments:
|
||||
|
||||
- `fixture-architecture.md` - Pure function → fixture → `mergeTests` pattern
|
||||
- `data-factories.md` - Faker-based factories with auto-cleanup
|
||||
- `network-first.md` - Network-first testing safeguards
|
||||
- `playwright-config.md` - Playwright-specific configuration best practices
|
||||
- `test-config.md` - General configuration guidelines
|
||||
- `fixture-architecture.md` - Pure function → fixture → `mergeTests` composition with auto-cleanup (406 lines, 5 examples)
|
||||
- `data-factories.md` - Faker-based factories with overrides, nested factories, API seeding, auto-cleanup (498 lines, 5 examples)
|
||||
- `network-first.md` - Network-first testing safeguards: intercept before navigate, HAR capture, deterministic waiting (489 lines, 5 examples)
|
||||
- `playwright-config.md` - Playwright-specific configuration: environment-based, timeout standards, artifact output, parallelization, project config (722 lines, 5 examples)
|
||||
- `test-quality.md` - Test design principles: deterministic, isolated with cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
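
To make the first two fragments concrete, here is a minimal sketch of a faker-based factory composed into a Playwright fixture via `mergeTests`. The `User` shape and the `/api/users` endpoint are hypothetical, and a real setup would merge several such fixtures.

```typescript
import { test as base, mergeTests, expect } from '@playwright/test';
import { faker } from '@faker-js/faker';

// Pure factory: deterministic shape, randomized data, explicit overrides
type User = { name: string; email: string };
const createUser = (overrides: Partial<User> = {}): User => ({
  name: faker.person.fullName(),
  email: faker.internet.email(),
  ...overrides,
});

// Fixture wrapping the factory, with auto-cleanup after each test
const userTest = base.extend<{ user: User }>({
  user: async ({ request }, use) => {
    const user = createUser();
    await request.post('/api/users', { data: user });   // seed via API
    await use(user);
    await request.delete(`/api/users/${user.email}`);   // self-cleaning
  },
});

// mergeTests composes this fixture with others (auth, seeding, …)
export const test = mergeTests(userTest);
export { expect };
```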
|
||||
|
||||
### Framework-Specific Guidance
|
||||
|
||||
|
||||
@@ -1,493 +0,0 @@
|
||||
# Quality Gate Decision Workflow
|
||||
|
||||
The Quality Gate workflow makes deterministic release decisions (PASS/CONCERNS/FAIL/WAIVED) based on comprehensive quality evidence including test results, risk assessment, traceability, and non-functional requirements validation.
|
||||
|
||||
## Overview
|
||||
|
||||
This workflow is the final checkpoint before deploying a story, epic, or release to production. It evaluates all quality evidence against predefined criteria and makes a transparent, rule-based decision with a complete audit trail.
|
||||
|
||||
**Key Features:**
|
||||
|
||||
- **Deterministic Decision Rules**: Clear, objective criteria eliminate bias
|
||||
- **Four Decision States**: PASS (ready), CONCERNS (deploy with monitoring), FAIL (blocked), WAIVED (business override)
|
||||
- **P0-P3 Risk Framework**: Prioritized evaluation of critical vs nice-to-have features
|
||||
- **Evidence-Based**: Never guess - requires test results, coverage, NFR validation
|
||||
- **Waiver Management**: Business-approved exceptions with remediation plans
|
||||
- **Audit Trail**: Complete history of decisions with rationale
|
||||
- **CI/CD Integration**: Gate YAML snippets for pipeline automation
|
||||
- **Stakeholder Communication**: Auto-generated notifications with decision summary
|
||||
|
||||
---
|
||||
|
||||
## Usage
|
||||
|
||||
```bash
|
||||
bmad tea *gate
|
||||
```
|
||||
|
||||
The TEA agent runs this workflow when:
|
||||
|
||||
- Story is complete and ready for release (after `*dev story-approved`)
|
||||
- Epic is complete and needs quality validation before deployment
|
||||
- Release candidate needs final go/no-go decision
|
||||
- Hotfix requires expedited quality assessment
|
||||
- User explicitly requests gate decision: `bmad tea *gate`
|
||||
|
||||
**Typical workflow sequence:**
|
||||
|
||||
1. `*test-design` → Risk assessment with P0-P3 prioritization
|
||||
2. `*atdd` → Generate failing tests before implementation
|
||||
3. `*dev story` → Implement feature with tests passing
|
||||
4. `*automate` → Expand regression suite
|
||||
5. `*trace` → Verify requirements-to-tests coverage
|
||||
6. `*nfr-assess` → Validate non-functional requirements
|
||||
7. **`*gate`** → Make final release decision ⬅️ YOU ARE HERE
|
||||
|
||||
---
|
||||
|
||||
## Inputs
|
||||
|
||||
### Required Context Files
|
||||
|
||||
- **Test Results**: CI/CD pipeline results, test framework reports (Playwright HTML, Jest JSON, JUnit XML)
|
||||
- **Story/Epic File**: The feature being gated (e.g., `story-1.3.md`, `epic-2.md`)
|
||||
|
||||
### Recommended Context Files
|
||||
|
||||
- **test-design.md**: Risk assessment with P0/P1/P2/P3 scenario prioritization
|
||||
- **traceability-matrix.md**: Requirements-to-tests coverage analysis with gap identification
|
||||
- **nfr-assessment.md**: Non-functional requirements validation (security, performance, reliability, maintainability)
|
||||
- **Code Coverage Report**: Line/branch/function coverage metrics
|
||||
- **Burn-in Results**: 10-iteration flakiness detection from CI pipeline
|
||||
|
||||
### Workflow Variables
|
||||
|
||||
Key variables that control gate behavior (configured in `workflow.yaml`):
|
||||
|
||||
- **gate_type**: `story` | `epic` | `release` | `hotfix` (default: `story`)
|
||||
- **decision_mode**: `deterministic` | `manual` (default: `deterministic`)
|
||||
- **min_p0_pass_rate**: Threshold for P0 tests (default: `100` - must be perfect)
|
||||
- **min_p1_pass_rate**: Threshold for P1 tests (default: `95%`)
|
||||
- **min_overall_pass_rate**: Overall test threshold (default: `90%`)
|
||||
- **min_coverage**: Code coverage minimum (default: `80%`)
|
||||
- **allow_waivers**: Enable business-approved waivers (default: `true`)
|
||||
- **require_evidence**: Require links to test results/reports (default: `true`)
|
||||
- **validate_evidence_freshness**: Warn if assessments >7 days old (default: `true`)
|
||||
|
||||
---
|
||||
|
||||
## Outputs
|
||||
|
||||
### Primary Deliverable
|
||||
|
||||
**Gate Decision Document** (`gate-decision-{type}-{id}.md`):
|
||||
|
||||
- **Decision**: PASS / CONCERNS / FAIL / WAIVED with clear rationale
|
||||
- **Evidence Summary**: Test results, coverage, NFRs, flakiness validation
|
||||
- **Rationale**: Explanation of decision based on criteria
|
||||
- **Residual Risks**: Unresolved issues (for CONCERNS/WAIVED)
|
||||
- **Waiver Details**: Approver, expiry, remediation plan (for WAIVED)
|
||||
- **Critical Issues**: Top blockers with owners and due dates (for FAIL)
|
||||
- **Recommendations**: Next steps for each decision type
|
||||
- **Audit Trail**: Complete history for compliance/review
|
||||
|
||||
### Secondary Outputs
|
||||
|
||||
- **Gate YAML**: Machine-readable snippet for CI/CD integration
|
||||
- **Status Update**: Appends decision to `bmm-workflow-status.md` history
|
||||
- **Stakeholder Notification**: Auto-generated message with decision summary
|
||||
|
||||
### Validation Safeguards
|
||||
|
||||
- ✅ All required evidence sources discovered or explicitly provided
|
||||
- ✅ Evidence freshness validated (warns if >7 days old)
|
||||
- ✅ P0 criteria evaluated first (immediate FAIL if not met)
|
||||
- ✅ Decision rules applied deterministically (no human bias)
|
||||
- ✅ Waivers require business justification and remediation plan
|
||||
- ✅ Audit trail maintained for transparency
|
||||
|
||||
---
|
||||
|
||||
## Decision Logic
|
||||
|
||||
### PASS Decision
|
||||
|
||||
**All criteria met:**
|
||||
|
||||
- ✅ P0 test pass rate = 100%
|
||||
- ✅ P1 test pass rate ≥ 95%
|
||||
- ✅ Overall test pass rate ≥ 90%
|
||||
- ✅ Code coverage ≥ 80%
|
||||
- ✅ Security issues = 0
|
||||
- ✅ Critical NFR failures = 0
|
||||
- ✅ Flaky tests = 0
|
||||
|
||||
**Action:** Deploy to production with standard monitoring
|
||||
|
||||
---
|
||||
|
||||
### CONCERNS Decision
|
||||
|
||||
**P0 criteria met, but P1 criteria degraded:**
|
||||
|
||||
- ✅ P0 test pass rate = 100%
|
||||
- ⚠️ P1 test pass rate 90-94% (below 95% threshold)
|
||||
- ⚠️ Code coverage 75-79% (below 80% threshold)
|
||||
- ✅ No security issues
|
||||
- ✅ No critical NFR failures
|
||||
- ✅ No flaky tests
|
||||
|
||||
**Residual Risks:** Minor P1 issues, edge cases, non-critical gaps
|
||||
|
||||
**Action:** Deploy with enhanced monitoring, create backlog stories for fixes
|
||||
|
||||
---
|
||||
|
||||
### FAIL Decision
|
||||
|
||||
**Any P0 criterion failed:**
|
||||
|
||||
- ❌ P0 test pass rate <100%
|
||||
- OR ❌ Security issues >0
|
||||
- OR ❌ Critical NFR failures >0
|
||||
- OR ❌ Flaky tests detected
|
||||
|
||||
**Critical Blockers:** P0 test failures, security vulnerabilities, critical NFRs
|
||||
|
||||
**Action:** Block deployment, fix critical issues, re-run gate after fixes
|
||||
|
||||
---
|
||||
|
||||
### WAIVED Decision
|
||||
|
||||
**FAIL status + business-approved waiver:**
|
||||
|
||||
- ❌ Original decision: FAIL
|
||||
- 🔓 Waiver approved by: {VP Engineering / CTO / Product Owner}
|
||||
- 📋 Business justification: {regulatory deadline, contractual obligation, etc.}
|
||||
- 📅 Waiver expiry: {date - does NOT apply to future releases}
|
||||
- 🔧 Remediation plan: {fix in next release, due date}
|
||||
|
||||
**Action:** Deploy with business approval, aggressive monitoring, fix ASAP
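
Expressed as code, the deterministic rules above amount to something like the following sketch. The thresholds mirror the defaults listed under Workflow Variables, the evidence shape is illustrative, and WAIVED is intentionally omitted because it is a manual business override rather than a computed outcome.

```typescript
type Gate = 'PASS' | 'CONCERNS' | 'FAIL';

interface Evidence {
  p0PassRate: number;          // percent
  p1PassRate: number;
  overallPassRate: number;
  coverage: number;
  securityIssues: number;
  criticalNfrFailures: number;
  flakyTests: number;
}

export function decideGate(e: Evidence): Gate {
  // Any P0-level criterion failing → FAIL (waivers are handled separately)
  if (e.p0PassRate < 100 || e.securityIssues > 0 || e.criticalNfrFailures > 0 || e.flakyTests > 0) {
    return 'FAIL';
  }
  // All thresholds met → PASS
  if (e.p1PassRate >= 95 && e.overallPassRate >= 90 && e.coverage >= 80) {
    return 'PASS';
  }
  // P0 criteria met but P1/coverage degraded → CONCERNS
  // (the workflow's CONCERNS band assumes only modest degradation below these thresholds)
  return 'CONCERNS';
}
```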
|
||||
|
||||
---
|
||||
|
||||
## Integration with Other Workflows
|
||||
|
||||
### Before Gate
|
||||
|
||||
1. **test-design** (recommended) - Provides P0-P3 risk framework
|
||||
2. **atdd** (recommended) - Ensures acceptance criteria have tests
|
||||
3. **automate** (recommended) - Expands regression suite
|
||||
4. **trace** (recommended) - Verifies requirements coverage
|
||||
5. **nfr-assess** (recommended) - Validates non-functional requirements
|
||||
|
||||
### After Gate
|
||||
|
||||
- **PASS**: Proceed to deployment workflow
|
||||
- **CONCERNS**: Deploy with monitoring, create remediation backlog stories
|
||||
- **FAIL**: Block deployment, fix issues, re-run gate
|
||||
- **WAIVED**: Deploy with business approval, escalate monitoring
|
||||
|
||||
### Coordinates With
|
||||
|
||||
- **bmm-workflow-status.md**: Appends gate decision to history
|
||||
- **CI/CD Pipeline**: Gate YAML used for automated gates
|
||||
- **PM/SM**: Notification of decision and next steps
|
||||
|
||||
---
|
||||
|
||||
## Example Scenarios
|
||||
|
||||
### Scenario 1: Ideal Release (PASS)
|
||||
|
||||
```
|
||||
Evidence:
|
||||
- P0 tests: 15/15 passed (100%) ✅
|
||||
- P1 tests: 28/29 passed (96.5%) ✅
|
||||
- Overall: 98% pass rate ✅
|
||||
- Coverage: 87% ✅
|
||||
- Security: 0 issues ✅
|
||||
- Flakiness: 0 flaky tests ✅
|
||||
|
||||
Decision: ✅ PASS
|
||||
|
||||
Rationale: All criteria exceeded thresholds. Feature ready for production.
|
||||
|
||||
Next Steps:
|
||||
1. Deploy to staging
|
||||
2. Monitor for 24 hours
|
||||
3. Deploy to production
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Scenario 2: Minor Issues (CONCERNS)
|
||||
|
||||
```
|
||||
Evidence:
|
||||
- P0 tests: 12/12 passed (100%) ✅
|
||||
- P1 tests: 21/24 passed (87.5%) ⚠️
|
||||
- Overall: 91% pass rate ✅
|
||||
- Coverage: 78% ⚠️
|
||||
- Security: 0 issues ✅
|
||||
- Flakiness: 0 flaky tests ✅
|
||||
|
||||
Decision: ⚠️ CONCERNS
|
||||
|
||||
Rationale: P0 criteria met, but P1 pass rate (87.5%) below threshold (95%).
|
||||
Coverage (78%) slightly below target (80%). Issues are edge cases in
|
||||
international date handling - low probability, workaround exists.
|
||||
|
||||
Residual Risks:
|
||||
- P1: Date formatting edge case for Japan/Korea timezones
|
||||
- Coverage: Missing tests for admin override flow
|
||||
|
||||
Next Steps:
|
||||
1. Deploy with enhanced monitoring on date formatting
|
||||
2. Create backlog story: "Fix date formatting for Asia Pacific"
|
||||
3. Add admin override tests in next sprint
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Scenario 3: Critical Blocker (FAIL)
|
||||
|
||||
```
|
||||
Evidence:
|
||||
- P0 tests: 9/12 passed (75%) ❌
|
||||
- Security: 1 SQL injection in search filter ❌
|
||||
- Coverage: 68% ❌
|
||||
|
||||
Decision: ❌ FAIL
|
||||
|
||||
Rationale: CRITICAL BLOCKERS:
|
||||
1. P0 test failures in core search functionality
|
||||
2. Unresolved SQL injection vulnerability (CRITICAL)
|
||||
3. Coverage below minimum threshold
|
||||
|
||||
Critical Issues:
|
||||
| Priority | Issue | Owner | Due Date |
|
||||
|----------|-------|-------|----------|
|
||||
| P0 | Fix SQL injection in search filter | Backend | 2025-10-16 |
|
||||
| P0 | Fix search pagination crash | Backend | 2025-10-16 |
|
||||
| P0 | Fix search timeout for large datasets | Backend | 2025-10-17 |
|
||||
|
||||
Next Steps:
|
||||
1. BLOCK DEPLOYMENT IMMEDIATELY
|
||||
2. Fix P0 issues listed above
|
||||
3. Re-run full test suite
|
||||
4. Re-run gate after fixes verified
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Scenario 4: Business Override (WAIVED)
|
||||
|
||||
```
|
||||
Evidence:
|
||||
- P0 tests: 10/11 passed (90.9%) ❌
|
||||
- Issue: Legacy report export fails for Excel 2007
|
||||
|
||||
Original Decision: ❌ FAIL
|
||||
|
||||
Waiver Details:
|
||||
- Approver: Jane Doe, VP Engineering
|
||||
- Reason: GDPR compliance deadline (regulatory requirement, Oct 15)
|
||||
- Expiry: 2025-10-15 (does NOT apply to v2.5.0)
|
||||
- Monitoring: Enhanced error tracking on report export
|
||||
- Remediation: Fix in v2.4.1 hotfix (due Oct 20)
|
||||
|
||||
Decision: 🔓 WAIVED
|
||||
|
||||
Business Justification:
|
||||
Release contains critical GDPR features required by law on Oct 15. Failed
|
||||
test affects legacy Excel 2007 export used by <1% of users. Workaround
|
||||
available (use Excel 2010+). Risk acceptable given regulatory priority.
|
||||
|
||||
Next Steps:
|
||||
1. Deploy v2.4.0 with waiver approval
|
||||
2. Monitor error rates on report export
|
||||
3. Fix Excel 2007 export in v2.4.1 (Oct 20)
|
||||
4. Notify affected users of workaround
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
### Deterministic vs Manual Mode
|
||||
|
||||
- **Deterministic mode** (recommended): Rule-based decisions using predefined thresholds
|
||||
- Eliminates bias and ensures consistency
|
||||
- Clear audit trail of criteria evaluation
|
||||
- Faster decisions for routine releases
|
||||
|
||||
- **Manual mode**: Human judgment with guidance from criteria
|
||||
- Use for edge cases, unusual situations
|
||||
- Still requires evidence documentation
|
||||
- TEA provides recommendation, user makes final call
|
||||
|
||||
### P0 is Sacred
|
||||
|
||||
**P0 failures ALWAYS result in FAIL** (no exceptions except waivers):
|
||||
|
||||
- P0 = Critical user journeys, security, data integrity
|
||||
- Cannot deploy with P0 failures - too risky
|
||||
- Waivers require VP/CTO approval + business justification
|
||||
|
||||
### Waivers are Temporary
|
||||
|
||||
- Waiver applies ONLY to specific release
|
||||
- Issue must be fixed in next release
|
||||
- Waiver expiry date enforced
|
||||
- Never waive: security, data corruption, compliance violations
|
||||
|
||||
### Evidence Freshness Matters
|
||||
|
||||
- Assessments >7 days old may be stale
|
||||
- Code changes since assessment may invalidate conclusions
|
||||
- Re-run workflows if evidence is outdated
|
||||
|
||||
### Security Never Compromised
|
||||
|
||||
- Security issues ALWAYS block release
|
||||
- No waivers for security vulnerabilities
|
||||
- Fix security issues immediately, then re-gate
|
||||
|
||||
---
|
||||
|
||||
## Knowledge Base References
|
||||
|
||||
This workflow automatically consults:
|
||||
|
||||
- **risk-governance.md** - Risk-based quality gate criteria and decision framework
|
||||
- **probability-impact.md** - Risk scoring (probability × impact) for residual risks
|
||||
- **test-quality.md** - Definition of Done for tests, quality standards
|
||||
- **test-priorities.md** - P0/P1/P2/P3 priority classification framework
|
||||
- **ci-burn-in.md** - Flakiness detection and burn-in validation patterns
|
||||
|
||||
See `tea-index.csv` for complete knowledge fragment mapping.
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Problem: No test results found
|
||||
|
||||
**Solution:**
|
||||
|
||||
- Check CI/CD pipeline for test execution
|
||||
- Verify `test_results` variable points to correct path
|
||||
- Run tests locally and provide results explicitly
|
||||
|
||||
---
|
||||
|
||||
### Problem: Assessments are stale (>7 days old)
|
||||
|
||||
**Solution:**
|
||||
|
||||
- Re-run `*test-design` workflow
|
||||
- Re-run `*trace` workflow
|
||||
- Re-run `*nfr-assess` workflow
|
||||
- Update evidence files before gate decision
|
||||
|
||||
---
|
||||
|
||||
### Problem: Unclear decision (edge case)
|
||||
|
||||
**Solution:**
|
||||
|
||||
- Switch to manual mode: `decision_mode: manual`
|
||||
- Document assumptions and rationale clearly
|
||||
- Escalate to tech lead or architect for guidance
|
||||
- Consider waiver if business-critical
|
||||
|
||||
---
|
||||
|
||||
### Problem: Waiver requested but not justified
|
||||
|
||||
**Solution:**
|
||||
|
||||
- Require written business justification from stakeholder
|
||||
- Ensure approver is appropriate authority (VP/CTO/PO)
|
||||
- Verify remediation plan exists with concrete due date
|
||||
- Document monitoring plan for waived risk
|
||||
- Confirm waiver expiry date (must be fixed in next release)
|
||||
|
||||
---
|
||||
|
||||
## Integration with BMad Status File
|
||||
|
||||
This workflow updates `bmm-workflow-status.md` with gate decisions:
|
||||
|
||||
```markdown
|
||||
### Quality & Testing Progress (TEA Agent)
|
||||
|
||||
**Gate Decisions:**
|
||||
|
||||
- [2025-10-14] ✅ PASS - Story 1.3 (User Auth) - All criteria met, 98% pass rate
|
||||
- [2025-10-14] ⚠️ CONCERNS - Epic 2 (Payments) - P1 pass rate 89%, deploy with monitoring
|
||||
- [2025-10-14] ❌ FAIL - Story 3.2 (Export) - SQL injection blocking release
|
||||
- [2025-10-15] 🔓 WAIVED - Release v2.4.0 - GDPR deadline, VP approved
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuration Examples
|
||||
|
||||
### Strict Gate (Zero Tolerance)
|
||||
|
||||
```yaml
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 100
|
||||
min_overall_pass_rate: 95
|
||||
min_coverage: 90
|
||||
allow_waivers: false
|
||||
max_security_issues: 0
|
||||
max_critical_nfrs_fail: 0
|
||||
```
|
||||
|
||||
Use for: Financial systems, healthcare, security-critical features
|
||||
|
||||
---
|
||||
|
||||
### Balanced Gate (Production Standard)
|
||||
|
||||
```yaml
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 95
|
||||
min_overall_pass_rate: 90
|
||||
min_coverage: 80
|
||||
allow_waivers: true
|
||||
max_security_issues: 0
|
||||
max_critical_nfrs_fail: 0
|
||||
```
|
||||
|
||||
Use for: Most production releases (default configuration)
|
||||
|
||||
---
|
||||
|
||||
### Relaxed Gate (Early Development)
|
||||
|
||||
```yaml
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 85
|
||||
min_overall_pass_rate: 80
|
||||
min_coverage: 70
|
||||
allow_waivers: true
|
||||
allow_p2_failures: true
|
||||
allow_p3_failures: true
|
||||
```
|
||||
|
||||
Use for: Alpha/beta releases, internal tools, proof-of-concept
|
||||
|
||||
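However the thresholds are tuned, a workflow run only needs the chosen profile merged over sensible defaults. The sketch below shows one way that merge could look; it assumes PyYAML and an illustrative profile path, and is not the workflow's actual loader.

```python
# Load a gate threshold profile (e.g., one of the YAML examples above) over defaults.
import yaml  # PyYAML

DEFAULTS = {
    "min_p0_pass_rate": 100,
    "min_p1_pass_rate": 95,
    "min_overall_pass_rate": 90,
    "min_coverage": 80,
    "allow_waivers": True,
    "max_security_issues": 0,
    "max_critical_nfrs_fail": 0,
}

def load_gate_profile(path):
    """Return the default thresholds overridden by whatever the profile defines."""
    with open(path) as f:
        overrides = yaml.safe_load(f) or {}
    return {**DEFAULTS, **overrides}

profile = load_gate_profile("gate-profile.yaml")  # illustrative path
print(profile["min_p1_pass_rate"])
```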
---
|
||||
|
||||
## Related Commands
|
||||
|
||||
- `bmad tea *test-design` - Risk assessment before implementation
|
||||
- `bmad tea *trace` - Verify requirements-to-tests coverage
|
||||
- `bmad tea *nfr-assess` - Validate non-functional requirements
|
||||
- `bmad tea *automate` - Expand regression suite
|
||||
- `bmad sm story-approved` - Mark story as complete (triggers gate)
|
||||
@@ -1,393 +0,0 @@
|
||||
# Quality Gate Decision - Validation Checklist
|
||||
|
||||
Use this checklist to validate that the gate decision workflow completed successfully and all criteria were properly evaluated.
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Evidence Gathering
|
||||
|
||||
- [ ] Test execution results obtained (CI/CD pipeline, test framework reports)
|
||||
- [ ] Story/epic/release file identified and read
|
||||
- [ ] Test design document discovered or explicitly provided (if available)
|
||||
- [ ] Traceability matrix discovered or explicitly provided (if available)
|
||||
- [ ] NFR assessment discovered or explicitly provided (if available)
|
||||
- [ ] Code coverage report discovered or explicitly provided (if available)
|
||||
- [ ] Burn-in results discovered or explicitly provided (if available)
|
||||
|
||||
### Evidence Validation
|
||||
|
||||
- [ ] Evidence freshness validated (warn if >7 days old, recommend re-running workflows)
|
||||
- [ ] All required assessments available or user acknowledged gaps
|
||||
- [ ] Test results are complete (not partial or interrupted runs)
|
||||
- [ ] Test results match current codebase (not from outdated branch)
|
||||
|
||||
### Knowledge Base Loading
|
||||
|
||||
- [ ] `risk-governance.md` loaded successfully
|
||||
- [ ] `probability-impact.md` loaded successfully
|
||||
- [ ] `test-quality.md` loaded successfully
|
||||
- [ ] `test-priorities.md` loaded successfully
|
||||
- [ ] `ci-burn-in.md` loaded (if burn-in results available)
|
||||
|
||||
---
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Context Loading
|
||||
|
||||
- [ ] Gate type identified (story/epic/release/hotfix)
|
||||
- [ ] Target ID extracted (story_id, epic_num, or release_version)
|
||||
- [ ] Decision thresholds loaded from workflow variables
|
||||
- [ ] Risk tolerance configuration loaded
|
||||
- [ ] Waiver policy loaded
|
||||
|
||||
### Step 2: Evidence Parsing
|
||||
|
||||
**Test Results:**
|
||||
|
||||
- [ ] Total test count extracted
|
||||
- [ ] Passed test count extracted
|
||||
- [ ] Failed test count extracted
|
||||
- [ ] Skipped test count extracted
|
||||
- [ ] Test duration extracted
|
||||
- [ ] P0 test pass rate calculated
|
||||
- [ ] P1 test pass rate calculated
|
||||
- [ ] Overall test pass rate calculated
|
||||
|
||||
**Quality Assessments:**
|
||||
|
||||
- [ ] P0/P1/P2/P3 scenarios extracted from test-design.md (if available)
|
||||
- [ ] Risk scores extracted from test-design.md (if available)
|
||||
- [ ] Coverage percentages extracted from traceability-matrix.md (if available)
|
||||
- [ ] Coverage gaps extracted from traceability-matrix.md (if available)
|
||||
- [ ] NFR status extracted from nfr-assessment.md (if available)
|
||||
- [ ] Security issues count extracted from nfr-assessment.md (if available)
|
||||
|
||||
**Code Coverage:**
|
||||
|
||||
- [ ] Line coverage percentage extracted (if available)
|
||||
- [ ] Branch coverage percentage extracted (if available)
|
||||
- [ ] Function coverage percentage extracted (if available)
|
||||
- [ ] Critical path coverage validated (if available)
|
||||
|
||||
**Burn-in Results:**
|
||||
|
||||
- [ ] Burn-in iterations count extracted (if available)
|
||||
- [ ] Flaky tests count extracted (if available)
|
||||
- [ ] Stability score calculated (if available)
|
||||
|
||||
### Step 3: Decision Rules Application
|
||||
|
||||
**P0 Criteria Evaluation:**
|
||||
|
||||
- [ ] P0 test pass rate evaluated (must be 100%)
|
||||
- [ ] P0 acceptance criteria coverage evaluated (must be 100%)
|
||||
- [ ] Security issues count evaluated (must be 0)
|
||||
- [ ] Critical NFR failures evaluated (must be 0)
|
||||
- [ ] Flaky tests evaluated (must be 0 if burn-in enabled)
|
||||
- [ ] P0 decision recorded: PASS or FAIL
|
||||
|
||||
**P1 Criteria Evaluation:**
|
||||
|
||||
- [ ] P1 test pass rate evaluated (threshold: min_p1_pass_rate)
|
||||
- [ ] P1 acceptance criteria coverage evaluated (threshold: 95%)
|
||||
- [ ] Overall test pass rate evaluated (threshold: min_overall_pass_rate)
|
||||
- [ ] Code coverage evaluated (threshold: min_coverage)
|
||||
- [ ] P1 decision recorded: PASS or CONCERNS
|
||||
|
||||
**P2/P3 Criteria Evaluation:**
|
||||
|
||||
- [ ] P2 failures tracked (informational, don't block if allow_p2_failures: true)
|
||||
- [ ] P3 failures tracked (informational, don't block if allow_p3_failures: true)
|
||||
- [ ] Residual risks documented
|
||||
|
||||
**Final Decision:**
|
||||
|
||||
- [ ] Decision determined: PASS / CONCERNS / FAIL / WAIVED
|
||||
- [ ] Decision rationale documented
|
||||
- [ ] Decision is deterministic (follows rules, not arbitrary)
|
||||
|
||||
### Step 4: Documentation
|
||||
|
||||
**Gate Decision Document Created:**
|
||||
|
||||
- [ ] Story/epic/release info section complete (ID, title, description, links)
|
||||
- [ ] Decision clearly stated (PASS / CONCERNS / FAIL / WAIVED)
|
||||
- [ ] Decision date recorded
|
||||
- [ ] Evaluator recorded (user or agent name)
|
||||
|
||||
**Evidence Summary Documented:**
|
||||
|
||||
- [ ] Test results summary complete (total, passed, failed, pass rates)
|
||||
- [ ] Coverage summary complete (P0/P1 criteria, code coverage)
|
||||
- [ ] NFR validation summary complete (security, performance, reliability, maintainability)
|
||||
- [ ] Flakiness summary complete (burn-in iterations, flaky test count)
|
||||
|
||||
**Rationale Documented:**
|
||||
|
||||
- [ ] Decision rationale clearly explained
|
||||
- [ ] Key evidence highlighted
|
||||
- [ ] Assumptions and caveats noted (if any)
|
||||
|
||||
**Residual Risks Documented (if CONCERNS or WAIVED):**
|
||||
|
||||
- [ ] Unresolved P1/P2 issues listed
|
||||
- [ ] Probability × impact estimated for each risk
|
||||
- [ ] Mitigations or workarounds described
|
||||
|
||||
**Waivers Documented (if WAIVED):**
|
||||
|
||||
- [ ] Waiver reason documented (business justification)
|
||||
- [ ] Waiver approver documented (name, role)
|
||||
- [ ] Waiver expiry date documented
|
||||
- [ ] Remediation plan documented (fix in next release, due date)
|
||||
- [ ] Monitoring plan documented
|
||||
|
||||
**Critical Issues Documented (if FAIL or CONCERNS):**
|
||||
|
||||
- [ ] Top 5-10 critical issues listed
|
||||
- [ ] Priority assigned to each issue (P0/P1/P2)
|
||||
- [ ] Owner assigned to each issue
|
||||
- [ ] Due date assigned to each issue
|
||||
|
||||
**Recommendations Documented:**
|
||||
|
||||
- [ ] Next steps clearly stated for decision type
|
||||
- [ ] Deployment recommendation provided
|
||||
- [ ] Monitoring recommendations provided (if applicable)
|
||||
- [ ] Remediation recommendations provided (if applicable)
|
||||
|
||||
### Step 5: Status Updates and Notifications
|
||||
|
||||
**Status File Updated:**
|
||||
|
||||
- [ ] Gate decision appended to bmm-workflow-status.md (if append_to_history: true)
|
||||
- [ ] Format correct: `[DATE] Gate Decision: DECISION - Target {ID} - {rationale}`
|
||||
- [ ] Status file committed or staged for commit
|
||||
|
||||
**Gate YAML Created:**
|
||||
|
||||
- [ ] Gate YAML snippet generated with decision and criteria
|
||||
- [ ] Evidence references included in YAML
|
||||
- [ ] Next steps included in YAML
|
||||
- [ ] YAML file saved to output folder
|
||||
|
||||
**Stakeholder Notification Generated:**
|
||||
|
||||
- [ ] Notification subject line created
|
||||
- [ ] Notification body created with summary
|
||||
- [ ] Recipients identified (PM, SM, DEV lead, stakeholders)
|
||||
- [ ] Notification ready for delivery (if notify_stakeholders: true)
|
||||
|
||||
**Outputs Saved:**
|
||||
|
||||
- [ ] Gate decision document saved to `{output_file}`
|
||||
- [ ] Gate YAML saved to `{output_folder}/gate-decision-{target}.yaml`
|
||||
- [ ] All outputs are valid and readable
|
||||
|
||||
---
|
||||
|
||||
## Output Validation
|
||||
|
||||
### Gate Decision Document
|
||||
|
||||
**Completeness:**
|
||||
|
||||
- [ ] All required sections present (info, decision, evidence, rationale, next steps)
|
||||
- [ ] No placeholder text or TODOs left in document
|
||||
- [ ] All evidence references are accurate and complete
|
||||
- [ ] All links to artifacts are valid
|
||||
|
||||
**Accuracy:**
|
||||
|
||||
- [ ] Decision matches applied criteria rules
|
||||
- [ ] Test results match CI/CD pipeline output
|
||||
- [ ] Coverage percentages match reports
|
||||
- [ ] NFR status matches assessment document
|
||||
- [ ] No contradictions or inconsistencies
|
||||
|
||||
**Clarity:**
|
||||
|
||||
- [ ] Decision rationale is clear and unambiguous
|
||||
- [ ] Technical jargon is explained or avoided
|
||||
- [ ] Stakeholders can understand next steps
|
||||
- [ ] Recommendations are actionable
|
||||
|
||||
### Gate YAML
|
||||
|
||||
**Format:**
|
||||
|
||||
- [ ] YAML is valid (no syntax errors)
|
||||
- [ ] All required fields present (target, decision, date, evaluator, criteria, evidence)
|
||||
- [ ] Field values are correct data types (numbers, strings, dates)
|
||||
|
||||
**Content:**
|
||||
|
||||
- [ ] Criteria values match decision document
|
||||
- [ ] Evidence references are accurate
|
||||
- [ ] Next steps align with decision type
|
||||
|
||||
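A quick structural check can back most of the items above. The sketch below validates the fields listed in this checklist; it assumes PyYAML and the field names used by the gate YAML template in this workflow.

```python
# Structural validation of a gate-decision YAML before handing it to CI/CD.
import yaml

REQUIRED_FIELDS = ("target", "decision", "date", "evaluator", "criteria", "evidence")
VALID_DECISIONS = {"PASS", "CONCERNS", "FAIL", "WAIVED"}

def validate_gate_yaml(path):
    with open(path) as f:
        doc = yaml.safe_load(f)  # raises on invalid YAML syntax
    gate = doc.get("gate_decision", {})
    missing = [field for field in REQUIRED_FIELDS if field not in gate]
    if missing:
        raise ValueError(f"gate_decision is missing fields: {missing}")
    if gate["decision"] not in VALID_DECISIONS:
        raise ValueError(f"unknown decision value: {gate['decision']}")
    return gate
```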
---
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Decision Integrity
|
||||
|
||||
- [ ] Decision is deterministic (follows rules, not arbitrary)
|
||||
- [ ] P0 failures result in FAIL decision (unless waived)
|
||||
- [ ] Security issues result in FAIL decision (unless waived - but should never be waived)
|
||||
- [ ] Waivers have business justification and approver (if WAIVED)
|
||||
- [ ] Residual risks are documented (if CONCERNS or WAIVED)
|
||||
|
||||
### Evidence-Based
|
||||
|
||||
- [ ] Decision is based on actual test results (not guesses)
|
||||
- [ ] All claims are supported by evidence
|
||||
- [ ] No assumptions without documentation
|
||||
- [ ] Evidence sources are cited (CI run IDs, report URLs)
|
||||
|
||||
### Transparency
|
||||
|
||||
- [ ] Decision rationale is transparent and auditable
|
||||
- [ ] Criteria evaluation is documented step-by-step
|
||||
- [ ] Any deviations from standard process are explained
|
||||
- [ ] Waiver justifications are clear (if applicable)
|
||||
|
||||
### Consistency
|
||||
|
||||
- [ ] Decision aligns with risk-governance knowledge fragment
|
||||
- [ ] Priority framework (P0/P1/P2/P3) applied consistently
|
||||
- [ ] Terminology consistent with test-quality knowledge fragment
|
||||
- [ ] Decision matrix followed correctly
|
||||
|
||||
---
|
||||
|
||||
## Integration Points
|
||||
|
||||
### BMad Workflow Status
|
||||
|
||||
- [ ] Gate decision added to `bmm-workflow-status.md`
|
||||
- [ ] Format matches existing gate history entries
|
||||
- [ ] Timestamp is accurate
|
||||
- [ ] Decision summary is concise (<80 chars)
|
||||
|
||||
### CI/CD Pipeline
|
||||
|
||||
- [ ] Gate YAML is CI/CD-compatible
|
||||
- [ ] YAML can be parsed by pipeline automation
|
||||
- [ ] Decision can be used to block/allow deployments
|
||||
- [ ] Evidence references are accessible to pipeline
|
||||
|
||||
### Stakeholders
|
||||
|
||||
- [ ] Notification message is clear and actionable
|
||||
- [ ] Decision is explained in non-technical terms
|
||||
- [ ] Next steps are specific and time-bound
|
||||
- [ ] Recipients are appropriate for decision type
|
||||
|
||||
---
|
||||
|
||||
## Compliance and Audit
|
||||
|
||||
### Audit Trail
|
||||
|
||||
- [ ] Decision date and time recorded
|
||||
- [ ] Evaluator identified (user or agent)
|
||||
- [ ] All evidence sources cited
|
||||
- [ ] Decision criteria documented
|
||||
- [ ] Rationale clearly explained
|
||||
|
||||
### Traceability
|
||||
|
||||
- [ ] Gate decision traceable to story/epic/release
|
||||
- [ ] Evidence traceable to specific test runs
|
||||
- [ ] Assessments traceable to workflows that created them
|
||||
- [ ] Waiver traceable to approver (if applicable)
|
||||
|
||||
### Compliance
|
||||
|
||||
- [ ] Security requirements validated (no unresolved vulnerabilities)
|
||||
- [ ] Quality standards met or waived with justification
|
||||
- [ ] Regulatory requirements addressed (if applicable)
|
||||
- [ ] Documentation sufficient for external audit
|
||||
|
||||
---
|
||||
|
||||
## Edge Cases and Exceptions
|
||||
|
||||
### Missing Evidence
|
||||
|
||||
- [ ] If test-design.md missing, decision still possible with test results + trace
|
||||
- [ ] If traceability-matrix.md missing, decision still possible with test results
|
||||
- [ ] If nfr-assessment.md missing, NFR validation marked as NOT ASSESSED
|
||||
- [ ] If code coverage missing, coverage criterion marked as NOT ASSESSED
|
||||
- [ ] User acknowledged gaps in evidence or provided alternative proof
|
||||
|
||||
### Stale Evidence
|
||||
|
||||
- [ ] Evidence freshness checked (if validate_evidence_freshness: true)
|
||||
- [ ] Warnings issued for assessments >7 days old
|
||||
- [ ] User acknowledged stale evidence or re-ran workflows
|
||||
- [ ] Decision document notes any stale evidence used
|
||||
|
||||
### Conflicting Evidence
|
||||
|
||||
- [ ] Conflicts between test results and assessments resolved
|
||||
- [ ] Most recent/authoritative source identified
|
||||
- [ ] Conflict resolution documented in decision rationale
|
||||
- [ ] User consulted if conflict cannot be resolved
|
||||
|
||||
### Waiver Scenarios
|
||||
|
||||
- [ ] Waiver only used for FAIL decision (not PASS or CONCERNS)
|
||||
- [ ] Waiver has business justification (not technical convenience)
|
||||
- [ ] Waiver has named approver with authority (VP/CTO/PO)
|
||||
- [ ] Waiver has expiry date (does NOT apply to future releases)
|
||||
- [ ] Waiver has remediation plan with concrete due date
|
||||
- [ ] Security vulnerabilities are NOT waived (enforced)
|
||||
|
||||
---
|
||||
|
||||
## Final Validation
|
||||
|
||||
### Document Review
|
||||
|
||||
- [ ] Gate decision document reviewed for accuracy
|
||||
- [ ] Gate YAML reviewed for correctness
|
||||
- [ ] Notification message reviewed for clarity
|
||||
- [ ] Status file update reviewed for format
|
||||
|
||||
### Stakeholder Communication
|
||||
|
||||
- [ ] Decision communicated to PM (if applicable)
|
||||
- [ ] Decision communicated to SM (if applicable)
|
||||
- [ ] Decision communicated to DEV lead (if applicable)
|
||||
- [ ] Decision communicated to stakeholders (if notify_stakeholders: true)
|
||||
|
||||
### Next Steps Identified
|
||||
|
||||
- [ ] **For PASS**: Deployment steps documented, monitoring plan identified
|
||||
- [ ] **For CONCERNS**: Monitoring plan documented, remediation backlog created
|
||||
- [ ] **For FAIL**: Blockers documented, fix assignments confirmed, re-gate planned
|
||||
- [ ] **For WAIVED**: Business approval confirmed, monitoring escalated, remediation scheduled
|
||||
|
||||
### Workflow Complete
|
||||
|
||||
- [ ] All checklist items completed
|
||||
- [ ] All outputs validated and saved
|
||||
- [ ] All stakeholders notified
|
||||
- [ ] Gate decision is final and documented
|
||||
- [ ] Ready to proceed to next phase (deploy, fix, or escalate)
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
Record any issues, deviations, or important observations during workflow execution:
|
||||
|
||||
- **Evidence Issues**: [Note any missing, stale, or conflicting evidence]
|
||||
- **Decision Rationale**: [Document any nuanced reasoning or edge cases]
|
||||
- **Waiver Details**: [Document waiver negotiations or approvals]
|
||||
- **Follow-up Actions**: [List any actions required after gate decision]
|
||||
@@ -1,445 +0,0 @@
|
||||
# Gate Decision: {target_id} ({feature_name})
|
||||
|
||||
**Decision:** {PASS | CONCERNS | FAIL | WAIVED}
|
||||
**Date:** {YYYY-MM-DD}
|
||||
**Evaluator:** {user_name or TEA Agent}
|
||||
**Gate Type:** {story | epic | release | hotfix}
|
||||
|
||||
---
|
||||
|
||||
## Story/Epic/Release Information
|
||||
|
||||
- **ID**: {story_id | epic_num | release_version}
|
||||
- **Title**: {feature_name}
|
||||
- **Description**: {brief_description}
|
||||
- **Links**:
|
||||
- Story/Epic File: `{file_path}`
|
||||
- Test Design: `{test_design_file_path}` (if available)
|
||||
- Traceability Matrix: `{trace_file_path}` (if available)
|
||||
- NFR Assessment: `{nfr_file_path}` (if available)
|
||||
|
||||
---
|
||||
|
||||
## Evidence Summary
|
||||
|
||||
### Test Results
|
||||
|
||||
- **Total Tests**: {total_count}
|
||||
- **Passed**: {passed_count} ({pass_percentage}%)
|
||||
- **Failed**: {failed_count} ({fail_percentage}%)
|
||||
- **Skipped**: {skipped_count} ({skip_percentage}%)
|
||||
- **Duration**: {total_duration}
|
||||
|
||||
**Priority Breakdown:**
|
||||
|
||||
- **P0 Tests**: {p0_passed}/{p0_total} passed ({p0_pass_rate}%) {✅ | ❌}
|
||||
- **P1 Tests**: {p1_passed}/{p1_total} passed ({p1_pass_rate}%) {✅ | ⚠️ | ❌}
|
||||
- **P2 Tests**: {p2_passed}/{p2_total} passed ({p2_pass_rate}%) {informational}
|
||||
- **P3 Tests**: {p3_passed}/{p3_total} passed ({p3_pass_rate}%) {informational}
|
||||
|
||||
**Overall Pass Rate**: {overall_pass_rate}% {✅ | ⚠️ | ❌}
|
||||
|
||||
**Test Results Source**: {CI_run_id | test_report_url | local_run}
|
||||
|
||||
---
|
||||
|
||||
### Coverage Summary
|
||||
|
||||
**Requirements Coverage:**
|
||||
|
||||
- **P0 Acceptance Criteria**: {p0_covered}/{p0_total} covered ({p0_coverage}%) {✅ | ❌}
|
||||
- **P1 Acceptance Criteria**: {p1_covered}/{p1_total} covered ({p1_coverage}%) {✅ | ⚠️ | ❌}
|
||||
- **P2 Acceptance Criteria**: {p2_covered}/{p2_total} covered ({p2_coverage}%) {informational}
|
||||
- **Overall Coverage**: {overall_coverage}%
|
||||
|
||||
**Code Coverage** (if available):
|
||||
|
||||
- **Line Coverage**: {line_coverage}% {✅ | ⚠️ | ❌}
|
||||
- **Branch Coverage**: {branch_coverage}% {✅ | ⚠️ | ❌}
|
||||
- **Function Coverage**: {function_coverage}% {✅ | ⚠️ | ❌}
|
||||
|
||||
**Coverage Gaps**: {gap_count} gaps identified
|
||||
|
||||
- {list_of_critical_gaps}
|
||||
|
||||
**Coverage Source**: {coverage_report_url | coverage_file_path}
|
||||
|
||||
---
|
||||
|
||||
### Non-Functional Requirements (NFRs)
|
||||
|
||||
**Security**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- Security Issues: {security_issue_count}
|
||||
- {details_if_issues}
|
||||
|
||||
**Performance**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- {performance_metrics_summary}
|
||||
|
||||
**Reliability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- {reliability_metrics_summary}
|
||||
|
||||
**Maintainability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- {maintainability_metrics_summary}
|
||||
|
||||
**NFR Source**: {nfr_assessment_file_path | not_assessed}
|
||||
|
||||
---
|
||||
|
||||
### Flakiness Validation
|
||||
|
||||
**Burn-in Results** (if available):
|
||||
|
||||
- **Burn-in Iterations**: {iteration_count} (e.g., 10)
|
||||
- **Flaky Tests Detected**: {flaky_test_count} {✅ if 0 | ❌ if >0}
|
||||
- **Stability Score**: {stability_percentage}%
|
||||
|
||||
**Flaky Tests List** (if any):
|
||||
|
||||
- {flaky_test_1_name} - {failure_rate}
|
||||
- {flaky_test_2_name} - {failure_rate}
|
||||
- {flaky_test_3_name} - {failure_rate}
|
||||
|
||||
**Burn-in Source**: {CI_burn_in_run_id | not_available}
|
||||
|
||||
---
|
||||
|
||||
## Decision Criteria Evaluation
|
||||
|
||||
### P0 Criteria (Must ALL Pass)
|
||||
|
||||
| Criterion             | Threshold | Actual                    | Status               |
| --------------------- | --------- | ------------------------- | -------------------- |
| P0 Test Pass Rate     | 100%      | {p0_pass_rate}%           | {✅ PASS \| ❌ FAIL} |
| P0 Criteria Coverage  | 100%      | {p0_coverage}%            | {✅ PASS \| ❌ FAIL} |
| Security Issues       | 0         | {security_issue_count}    | {✅ PASS \| ❌ FAIL} |
| Critical NFR Failures | 0         | {critical_nfr_fail_count} | {✅ PASS \| ❌ FAIL} |
| Flaky Tests           | 0         | {flaky_test_count}        | {✅ PASS \| ❌ FAIL} |
|
||||
|
||||
**P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}
|
||||
|
||||
---
|
||||
|
||||
### P1 Criteria (Required for PASS, May Accept for CONCERNS)
|
||||
|
||||
| Criterion              | Threshold                 | Actual               | Status                             |
| ---------------------- | ------------------------- | -------------------- | ---------------------------------- |
| P1 Test Pass Rate      | ≥{min_p1_pass_rate}%      | {p1_pass_rate}%      | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| P1 Criteria Coverage   | ≥95%                      | {p1_coverage}%       | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| Overall Test Pass Rate | ≥{min_overall_pass_rate}% | {overall_pass_rate}% | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| Code Coverage          | ≥{min_coverage}%          | {code_coverage}%     | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
|
||||
|
||||
**P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}
|
||||
|
||||
---
|
||||
|
||||
### P2/P3 Criteria (Informational, Don't Block)
|
||||
|
||||
| Criterion | Actual | Notes |
|
||||
| ----------------- | --------------- | ------------------------------------------------------------ |
|
||||
| P2 Test Pass Rate | {p2_pass_rate}% | {allow_p2_failures ? "Tracked, doesn't block" : "Evaluated"} |
|
||||
| P3 Test Pass Rate | {p3_pass_rate}% | {allow_p3_failures ? "Tracked, doesn't block" : "Evaluated"} |
|
||||
|
||||
---
|
||||
|
||||
## Rationale
|
||||
|
||||
{Explain decision based on criteria evaluation}
|
||||
|
||||
{Highlight key evidence that drove decision}
|
||||
|
||||
{Note any assumptions or caveats}
|
||||
|
||||
**Example (PASS):**
|
||||
|
||||
> All P0 criteria met with 100% pass rates across critical tests. All P1 criteria exceeded thresholds with 98% overall pass rate and 87% code coverage. No security issues detected. No flaky tests in burn-in validation. Feature is ready for production deployment with standard monitoring.
|
||||
|
||||
**Example (CONCERNS):**
|
||||
|
||||
> All P0 criteria met, ensuring critical user journeys are protected. However, P1 pass rate (89%) falls below threshold (95%) due to edge cases in international currency handling. Code coverage (78%) is slightly below target (80%) due to missing tests for admin override flow. Issues are non-critical and have acceptable workarounds. Risk is low enough to deploy with enhanced monitoring.
|
||||
|
||||
**Example (FAIL):**
|
||||
|
||||
> CRITICAL BLOCKERS DETECTED:
|
||||
>
|
||||
> 1. P0 test failures (80% pass rate) in core search functionality prevent safe deployment
|
||||
> 2. Unresolved SQL injection vulnerability in search filter poses CRITICAL security risk
|
||||
> 3. Code coverage (68%) significantly below minimum threshold (80%)
|
||||
>
|
||||
> Release MUST BE BLOCKED until P0 issues are resolved. Security vulnerability cannot be waived.
|
||||
|
||||
**Example (WAIVED):**
|
||||
|
||||
> Original decision was FAIL due to P0 test failure in legacy Excel 2007 export module (affects <1% of users). However, release contains critical GDPR compliance features required by regulatory deadline (Oct 15). Business has approved waiver given:
|
||||
>
|
||||
> - Regulatory priority overrides legacy module risk
|
||||
> - Workaround available (use Excel 2010+)
|
||||
> - Issue will be fixed in v2.4.1 hotfix (due Oct 20)
|
||||
> - Enhanced monitoring in place
|
||||
|
||||
---
|
||||
|
||||
## {Section: Delete if not applicable}
|
||||
|
||||
### Residual Risks (For CONCERNS or WAIVED)
|
||||
|
||||
List unresolved P1/P2 issues that don't block release but should be tracked:
|
||||
|
||||
1. **{Risk Description}**
|
||||
- **Priority**: P1 | P2
|
||||
- **Probability**: Low | Medium | High
|
||||
- **Impact**: Low | Medium | High
|
||||
- **Risk Score**: {probability × impact}
|
||||
- **Mitigation**: {workaround or monitoring plan}
|
||||
- **Remediation**: {fix in next sprint/release}
|
||||
|
||||
2. **{Risk Description}**
|
||||
- **Priority**: P1 | P2
|
||||
- **Probability**: Low | Medium | High
|
||||
- **Impact**: Low | Medium | High
|
||||
- **Risk Score**: {probability × impact}
|
||||
- **Mitigation**: {workaround or monitoring plan}
|
||||
- **Remediation**: {fix in next sprint/release}
|
||||
|
||||
**Overall Residual Risk**: {LOW | MEDIUM | HIGH}
|
||||
|
||||
---
|
||||
|
||||
### Waiver Details (For WAIVED only)
|
||||
|
||||
**Original Decision**: ❌ FAIL
|
||||
|
||||
**Reason for Failure**:
|
||||
|
||||
- {list_of_blocking_issues}
|
||||
|
||||
**Waiver Information**:
|
||||
|
||||
- **Waiver Reason**: {business_justification}
|
||||
- **Waiver Approver**: {name}, {role} (e.g., Jane Doe, VP Engineering)
|
||||
- **Approval Date**: {YYYY-MM-DD}
|
||||
- **Waiver Expiry**: {YYYY-MM-DD} (**NOTE**: Does NOT apply to next release)
|
||||
|
||||
**Monitoring Plan**:
|
||||
|
||||
- {enhanced_monitoring_1}
|
||||
- {enhanced_monitoring_2}
|
||||
- {escalation_criteria}
|
||||
|
||||
**Remediation Plan**:
|
||||
|
||||
- **Fix Target**: {next_release_version} (e.g., v2.4.1 hotfix)
|
||||
- **Due Date**: {YYYY-MM-DD}
|
||||
- **Owner**: {team_or_person}
|
||||
- **Verification**: {how_fix_will_be_verified}
|
||||
|
||||
**Business Justification**:
|
||||
{detailed_explanation_of_why_waiver_is_acceptable}
|
||||
|
||||
---
|
||||
|
||||
### Critical Issues (For FAIL or CONCERNS)
|
||||
|
||||
Top blockers requiring immediate attention:
|
||||
|
||||
| Priority | Issue | Description | Owner | Due Date | Status |
|
||||
| -------- | ------------- | ------------------- | ------------ | ------------ | ------------------ |
|
||||
| P0 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
| P0 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
| P1 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
| P1 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
|
||||
**Blocking Issues Count**: {p0_blocker_count} P0 blockers, {p1_blocker_count} P1 issues
|
||||
|
||||
---
|
||||
|
||||
## Recommendations
|
||||
|
||||
### For PASS Decision
|
||||
|
||||
1. **Proceed to deployment**
|
||||
- Deploy to staging environment
|
||||
- Validate with smoke tests
|
||||
- Monitor key metrics for 24-48 hours
|
||||
- Deploy to production with standard monitoring
|
||||
|
||||
2. **Post-Deployment Monitoring**
|
||||
- {metric_1_to_monitor}
|
||||
- {metric_2_to_monitor}
|
||||
- {alert_thresholds}
|
||||
|
||||
3. **Success Criteria**
|
||||
- {success_criterion_1}
|
||||
- {success_criterion_2}
|
||||
|
||||
---
|
||||
|
||||
### For CONCERNS Decision
|
||||
|
||||
1. **Deploy with Enhanced Monitoring**
|
||||
- Deploy to staging with extended validation period
|
||||
- Enable enhanced logging/monitoring for known risk areas:
|
||||
- {risk_area_1}
|
||||
- {risk_area_2}
|
||||
- Set aggressive alerts for potential issues
|
||||
- Deploy to production with caution
|
||||
|
||||
2. **Create Remediation Backlog**
|
||||
- Create story: "{fix_title_1}" (Priority: {priority})
|
||||
- Create story: "{fix_title_2}" (Priority: {priority})
|
||||
- Target sprint: {next_sprint}
|
||||
|
||||
3. **Post-Deployment Actions**
|
||||
- Monitor {specific_areas} closely for {time_period}
|
||||
- Weekly status updates on remediation progress
|
||||
- Re-assess after fixes deployed
|
||||
|
||||
---
|
||||
|
||||
### For FAIL Decision
|
||||
|
||||
1. **Block Deployment Immediately**
|
||||
- Do NOT deploy to any environment
|
||||
- Notify stakeholders of blocking issues
|
||||
- Escalate to tech lead and PM
|
||||
|
||||
2. **Fix Critical Issues**
|
||||
- Address P0 blockers listed in Critical Issues section
|
||||
- Owner assignments confirmed
|
||||
- Due dates agreed upon
|
||||
- Daily standup on blocker resolution
|
||||
|
||||
3. **Re-Run Gate After Fixes**
|
||||
- Re-run full test suite after fixes
|
||||
- Re-run affected quality workflows:
|
||||
- `bmad tea *trace` (if coverage was issue)
|
||||
- `bmad tea *nfr-assess` (if NFRs were issue)
|
||||
- Re-run gate workflow: `bmad tea *gate`
|
||||
- Verify decision is PASS before deploying
|
||||
|
||||
---
|
||||
|
||||
### For WAIVED Decision
|
||||
|
||||
1. **Deploy with Business Approval**
|
||||
- Confirm waiver approver has signed off
|
||||
- Document waiver in release notes
|
||||
- Notify all stakeholders of waived risks
|
||||
|
||||
2. **Aggressive Monitoring**
|
||||
- {enhanced_monitoring_plan}
|
||||
- {escalation_procedures}
|
||||
- Daily checks on waived risk areas
|
||||
|
||||
3. **Mandatory Remediation**
|
||||
- Fix MUST be completed by {due_date}
|
||||
- Issue CANNOT be waived in next release
|
||||
- Track remediation progress weekly
|
||||
- Verify fix in next gate
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
**Immediate Actions** (next 24-48 hours):
|
||||
|
||||
1. {action_1}
|
||||
2. {action_2}
|
||||
3. {action_3}
|
||||
|
||||
**Follow-up Actions** (next sprint/release):
|
||||
|
||||
1. {action_1}
|
||||
2. {action_2}
|
||||
3. {action_3}
|
||||
|
||||
**Stakeholder Communication**:
|
||||
|
||||
- Notify PM: {decision_summary}
|
||||
- Notify SM: {decision_summary}
|
||||
- Notify DEV lead: {decision_summary}
|
||||
- Notify stakeholders: {decision_summary}
|
||||
|
||||
---
|
||||
|
||||
## Gate Decision YAML (CI/CD Integration)
|
||||
|
||||
```yaml
|
||||
gate_decision:
|
||||
target: '{target_id}'
|
||||
type: '{story | epic | release | hotfix}'
|
||||
decision: '{PASS | CONCERNS | FAIL | WAIVED}'
|
||||
date: '{YYYY-MM-DD}'
|
||||
evaluator: '{user_name or TEA Agent}'
|
||||
|
||||
criteria:
|
||||
p0_pass_rate: { p0_pass_rate }
|
||||
p1_pass_rate: { p1_pass_rate }
|
||||
overall_pass_rate: { overall_pass_rate }
|
||||
code_coverage: { code_coverage }
|
||||
security_issues: { security_issue_count }
|
||||
critical_nfrs_fail: { critical_nfr_fail_count }
|
||||
flaky_tests: { flaky_test_count }
|
||||
|
||||
thresholds:
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: { min_p1_pass_rate }
|
||||
min_overall_pass_rate: { min_overall_pass_rate }
|
||||
min_coverage: { min_coverage }
|
||||
|
||||
evidence:
|
||||
test_results: '{CI_run_id | test_report_url}'
|
||||
traceability: '{trace_file_path}'
|
||||
nfr_assessment: '{nfr_file_path}'
|
||||
code_coverage: '{coverage_report_url}'
|
||||
burn_in: '{burn_in_run_id}'
|
||||
|
||||
next_steps: '{brief_summary_of_recommendations}'
|
||||
|
||||
waiver: # Only if WAIVED
|
||||
reason: '{business_justification}'
|
||||
approver: '{name}, {role}'
|
||||
expiry: '{YYYY-MM-DD}'
|
||||
remediation_due: '{YYYY-MM-DD}'
|
||||
```
|
||||
|
||||
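A pipeline can consume this YAML directly to block or allow the deployment job. The following is a minimal sketch of such a step, assuming PyYAML; the exit-code convention and messages are illustrative, not prescribed by the template.

```python
# Example CI step: read the gate YAML and fail the job when the gate decision is FAIL.
import sys
import yaml

def enforce_gate(path):
    with open(path) as f:
        decision = yaml.safe_load(f)["gate_decision"]["decision"]
    if decision == "FAIL":
        print("Quality gate FAIL - blocking deployment")
        sys.exit(1)
    if decision in ("CONCERNS", "WAIVED"):
        print(f"Quality gate {decision} - deploy with enhanced monitoring")
    else:
        print("Quality gate PASS - proceeding with deployment")

if __name__ == "__main__":
    enforce_gate(sys.argv[1])  # e.g., gate-decision-story-1.3.yaml
```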
---
|
||||
|
||||
## Audit Trail
|
||||
|
||||
**Created**: {YYYY-MM-DD HH:MM:SS}
|
||||
**Modified**: {YYYY-MM-DD HH:MM:SS} (if updated)
|
||||
**Version**: 1.0
|
||||
**Document ID**: gate-decision-{target_id}-{YYYYMMDD}
|
||||
**Workflow**: testarch-gate v4.0
|
||||
|
||||
---
|
||||
|
||||
## Appendices
|
||||
|
||||
### Evidence Files Referenced
|
||||
|
||||
- Story/Epic: `{file_path}`
|
||||
- Test Design: `{test_design_file_path}`
|
||||
- Traceability Matrix: `{trace_file_path}`
|
||||
- NFR Assessment: `{nfr_file_path}`
|
||||
- Test Results: `{test_results_path}`
|
||||
- Code Coverage: `{coverage_report_path}`
|
||||
- Burn-in Results: `{burn_in_results_path}`
|
||||
|
||||
### Knowledge Base Fragments Consulted
|
||||
|
||||
- `risk-governance.md` - Risk-based quality gate criteria
|
||||
- `probability-impact.md` - Risk scoring framework
|
||||
- `test-quality.md` - Definition of Done for tests
|
||||
- `test-priorities.md` - P0/P1/P2/P3 priority framework
|
||||
- `ci-burn-in.md` - Flakiness detection patterns
|
||||
|
||||
### Related Documents
|
||||
|
||||
- PRD: `{prd_file_path}` (if applicable)
|
||||
- Tech Spec: `{tech_spec_file_path}` (if applicable)
|
||||
- Architecture: `{architecture_file_path}` (if applicable)
|
||||
@@ -1,494 +0,0 @@
|
||||
# Quality Gate Decision - Instructions v4.0
|
||||
|
||||
**Workflow:** `testarch-gate`
|
||||
**Purpose:** Make a deterministic quality gate decision (PASS/CONCERNS/FAIL/WAIVED) for a story, epic, or release based on test results, risk assessment, and non-functional validation
|
||||
**Agent:** Test Architect (TEA)
|
||||
**Format:** Pure Markdown v4.0 (no XML blocks)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This workflow evaluates all quality evidence (test results, traceability, NFRs, risk assessment) and makes a deterministic gate decision following predefined rules. It ensures that releases meet quality standards and provides an audit trail for decision-making.
|
||||
|
||||
**Key Capabilities:**
|
||||
|
||||
- Deterministic decision rules (PASS/CONCERNS/FAIL/WAIVED)
|
||||
- Evidence-based validation (test results, coverage, NFRs, risks)
|
||||
- P0-P3 risk framework integration
|
||||
- Waiver management (business-approved exceptions)
|
||||
- Audit trail with history tracking
|
||||
- Stakeholder notification generation
|
||||
- Gate YAML output for CI/CD integration
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
**Required:**
|
||||
|
||||
- Test execution results (CI/CD pipeline, local test runs)
|
||||
- Story or epic being gated
|
||||
- Completed quality workflows (at minimum test-design OR trace)
|
||||
|
||||
**Recommended:**
|
||||
|
||||
- `test-design.md` - Risk assessment with P0-P3 prioritization
|
||||
- `traceability-matrix.md` - Requirements-to-tests coverage analysis
|
||||
- `nfr-assessment.md` - Non-functional requirements validation
|
||||
- Code coverage report
|
||||
- Burn-in test results (flakiness validation)
|
||||
|
||||
**Halt Conditions:**
|
||||
|
||||
- If critical assessments are missing AND the user doesn't waive the requirement, halt and request them
|
||||
- If assessments are stale (>7 days old) AND `validate_evidence_freshness: true`, warn user
|
||||
- If test results are unavailable, halt and request test execution
|
||||
|
||||
---
|
||||
|
||||
## Workflow Steps
|
||||
|
||||
### Step 1: Load Context and Knowledge Base
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. Load relevant knowledge fragments from `{project-root}/bmad/bmm/testarch/tea-index.csv`:
|
||||
- `risk-governance.md` - Risk-based quality gate criteria
|
||||
- `probability-impact.md` - Risk scoring framework
|
||||
- `test-quality.md` - Definition of Done for tests
|
||||
- `test-priorities.md` - P0/P1/P2/P3 priority framework
|
||||
- `ci-burn-in.md` - Flakiness detection validation
|
||||
|
||||
2. Read gate configuration from workflow variables:
|
||||
- Gate type (story/epic/release/hotfix)
|
||||
- Decision thresholds (pass rates, coverage minimums)
|
||||
- Risk tolerance (allow P2/P3 failures, escalate P1)
|
||||
- Waiver policy
|
||||
|
||||
3. Identify gate target:
|
||||
- Extract story ID, epic number, or release version
|
||||
- Determine scope (single story vs full epic vs release)
|
||||
|
||||
**Output:** Complete understanding of gate criteria and target scope
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Gather Quality Evidence
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Auto-discover assessment files** (if not explicitly provided):
|
||||
- Search for `test-design-epic-{epic_num}.md` or `test-design-story-{story_id}.md`
|
||||
- Search for `traceability-matrix-{story_id}.md` or `traceability-matrix-epic-{epic_num}.md`
|
||||
- Search for `nfr-assessment-{story_id}.md` or `nfr-assessment-epic-{epic_num}.md`
|
||||
- Search for story file: `story-{story_id}.md`
|
||||
|
||||
2. **Validate evidence freshness** (if `validate_evidence_freshness: true`):
|
||||
- Check file modification dates
|
||||
- Warn if any assessment is >7 days old
|
||||
- Recommend re-running stale workflows
|
||||
|
||||
3. **Parse test execution results**:
|
||||
- CI/CD pipeline results (GitHub Actions, GitLab CI, Jenkins)
|
||||
- Test framework reports (Playwright HTML report, Jest JSON, JUnit XML)
|
||||
- Extract metrics: total tests, passed, failed, skipped, duration
|
||||
- Extract burn-in results: flaky test count, stability score
|
||||
|
||||
4. **Parse quality assessments**:
|
||||
- **test-design.md**: Extract P0/P1/P2/P3 scenarios, risk scores, mitigation status
|
||||
- **traceability-matrix.md**: Extract coverage percentages, gaps, unmapped criteria
|
||||
- **nfr-assessment.md**: Extract NFR status (PASS/CONCERNS/FAIL per category)
|
||||
|
||||
5. **Parse code coverage** (if available):
|
||||
- Line coverage, branch coverage, function coverage
|
||||
- Coverage by file/directory
|
||||
- Identify uncovered critical paths
|
||||
|
||||
**Output:** Comprehensive evidence package with all quality metrics
|
||||
|
||||
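To make the parsing step concrete, here is a minimal sketch for one common report format (JUnit XML). It only extracts the aggregate counts; per-priority (P0/P1) rates depend on how the suite tags tests, which varies by framework, so that part is left out.

```python
# Extract aggregate gate metrics from a JUnit XML report (one common CI output format).
import xml.etree.ElementTree as ET

def summarize_junit(path):
    """Return total/passed/failed/skipped counts and the overall pass rate."""
    root = ET.parse(path).getroot()
    total = failed = skipped = 0
    for suite in root.iter("testsuite"):  # handles <testsuites> and single <testsuite> roots
        total += int(suite.get("tests", 0))
        failed += int(suite.get("failures", 0)) + int(suite.get("errors", 0))
        skipped += int(suite.get("skipped", 0))
    passed = total - failed - skipped
    executed = total - skipped
    pass_rate = round(100 * passed / executed, 1) if executed else 0.0
    return {"total": total, "passed": passed, "failed": failed,
            "skipped": skipped, "overall_pass_rate": pass_rate}
```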
---
|
||||
|
||||
### Step 3: Apply Decision Rules (Deterministic Mode)
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Evaluate P0 criteria** (must ALL pass for gate to PASS):
|
||||
- ✅ P0 test pass rate = 100%
|
||||
- ✅ P0 acceptance criteria coverage = 100%
|
||||
- ✅ No critical security issues (max_security_issues = 0)
|
||||
- ✅ No critical NFR failures (max_critical_nfrs_fail = 0)
|
||||
- ✅ No flaky tests in burn-in (if burn-in enabled)
|
||||
|
||||
**If ANY P0 criterion fails → Decision = FAIL**
|
||||
|
||||
2. **Evaluate P1 criteria** (required for PASS, may be waived for CONCERNS):
|
||||
- ✅ P1 test pass rate ≥ min_p1_pass_rate (default: 95%)
|
||||
- ✅ P1 acceptance criteria coverage ≥ 95%
|
||||
- ✅ Overall test pass rate ≥ min_overall_pass_rate (default: 90%)
|
||||
- ✅ Code coverage ≥ min_coverage (default: 80%)
|
||||
|
||||
**If ANY P1 criterion fails → Decision = CONCERNS (may escalate to FAIL)**
|
||||
|
||||
3. **Evaluate P2/P3 criteria** (informational, don't block):
|
||||
- P2 failures tracked but don't affect gate decision (if allow_p2_failures: true)
|
||||
- P3 failures tracked but don't affect gate decision (if allow_p3_failures: true)
|
||||
- Document as residual risk
|
||||
|
||||
4. **Determine final decision**:
|
||||
- **PASS**: All P0 criteria met, all P1 criteria met, no critical blockers
|
||||
- **CONCERNS**: All P0 criteria met, some P1 criteria missed, residual risk acceptable
|
||||
- **FAIL**: Any P0 criterion missed, critical blockers present
|
||||
- **WAIVED**: FAIL status with business-approved waiver (if allow_waivers: true)
|
||||
|
||||
**Output:** Gate decision with deterministic justification
|
||||
|
||||
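The rules above reduce to a small pure function. The sketch below is one way to express them, using the documented default thresholds; the field names are illustrative, and the real workflow also records rationale and residual risks rather than returning a bare label.

```python
# Deterministic gate decision mirroring the P0/P1 rules above (default thresholds shown).
def gate_decision(ev, min_p1=95, min_overall=90, min_cov=80,
                  allow_waivers=True, waiver_approved=False):
    p0_ok = (
        ev["p0_pass_rate"] == 100
        and ev["p0_coverage"] == 100
        and ev["security_issues"] == 0
        and ev["critical_nfrs_fail"] == 0
        and ev.get("flaky_tests", 0) == 0
    )
    if not p0_ok:
        # WAIVED only downgrades a FAIL, and never for security issues.
        waivable = allow_waivers and waiver_approved and ev["security_issues"] == 0
        return "WAIVED" if waivable else "FAIL"
    p1_ok = (
        ev["p1_pass_rate"] >= min_p1
        and ev["p1_coverage"] >= 95
        and ev["overall_pass_rate"] >= min_overall
        and ev["code_coverage"] >= min_cov
    )
    return "PASS" if p1_ok else "CONCERNS"

# Example: a scenario like Epic 2 (Payments) below evaluates to CONCERNS.
print(gate_decision({
    "p0_pass_rate": 100, "p0_coverage": 100, "security_issues": 0,
    "critical_nfrs_fail": 0, "flaky_tests": 0,
    "p1_pass_rate": 89, "p1_coverage": 95, "overall_pass_rate": 91, "code_coverage": 78,
}))  # -> CONCERNS
```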
---
|
||||
|
||||
### Step 4: Document Decision and Evidence
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Create gate decision document** using `gate-template.md`:
|
||||
- **Story/Epic/Release Info**: ID, title, description, links
|
||||
- **Decision**: PASS / CONCERNS / FAIL / WAIVED
|
||||
- **Decision Date**: Timestamp of gate evaluation
|
||||
- **Evaluator**: User or agent who made decision
|
||||
|
||||
2. **Document evidence**:
|
||||
- **Test Results Summary**:
|
||||
- Total tests: X
|
||||
- Passed: Y (Z%)
|
||||
- Failed: N (M%)
|
||||
- P0 pass rate: 100% ✅ / <100% ❌
|
||||
- P1 pass rate: X% ✅ / <95% ⚠️
|
||||
- **Coverage Summary**:
|
||||
- P0 criteria: X/Y covered (Z%)
|
||||
- P1 criteria: X/Y covered (Z%)
|
||||
- Code coverage: X%
|
||||
- **NFR Validation**:
|
||||
- Security: PASS / CONCERNS / FAIL
|
||||
- Performance: PASS / CONCERNS / FAIL
|
||||
- Reliability: PASS / CONCERNS / FAIL
|
||||
- Maintainability: PASS / CONCERNS / FAIL
|
||||
- **Flakiness**:
|
||||
- Burn-in iterations: 10
|
||||
- Flaky tests detected: 0 ✅ / >0 ❌
|
||||
|
||||
3. **Document rationale**:
|
||||
- Explain decision based on criteria
|
||||
- Highlight key evidence that drove decision
|
||||
- Note any assumptions or caveats
|
||||
|
||||
4. **Document residual risks** (if CONCERNS or WAIVED):
|
||||
- List unresolved P1/P2 issues
|
||||
- Estimate probability × impact
|
||||
- Describe mitigations or workarounds
|
||||
|
||||
5. **Document waivers** (if WAIVED):
|
||||
- Waiver reason (business justification)
|
||||
- Waiver approver (name, role)
|
||||
- Waiver expiry date
|
||||
- Remediation plan
|
||||
|
||||
6. **List critical issues** (if FAIL or CONCERNS):
|
||||
- Top 5-10 issues blocking gate
|
||||
- Priority (P0/P1/P2)
|
||||
- Owner
|
||||
- Due date
|
||||
|
||||
7. **Provide recommendations**:
|
||||
- **For PASS**: Proceed to deployment, monitor post-release
|
||||
- **For CONCERNS**: Deploy with monitoring, address issues in next sprint
|
||||
- **For FAIL**: Block deployment, fix critical issues, re-run gate
|
||||
- **For WAIVED**: Deploy with business approval, aggressive monitoring
|
||||
|
||||
**Output:** Complete gate decision document ready for review
|
||||
|
||||
---
|
||||
|
||||
### Step 5: Update Status Tracking and Notify
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Append to bmm-workflow-status.md** (if `append_to_history: true`):
|
||||
- Add gate decision to history section
|
||||
- Format: `[DATE] Gate Decision: DECISION - Story/Epic/Release {ID} - {brief rationale}`
|
||||
- Example: `[2025-10-14] Gate Decision: PASS - Story 1.3 - All P0/P1 criteria met, 98% pass rate`
|
||||
|
||||
2. **Generate stakeholder notification** (if `notify_stakeholders: true`):
|
||||
- **Subject**: Gate Decision: DECISION - {Story/Epic/Release ID}
|
||||
- **Body**: Summary of decision, key metrics, next steps
|
||||
- **Recipients**: PM, SM, DEV lead, stakeholders
|
||||
|
||||
3. **Generate gate YAML snippet** for CI/CD integration:
|
||||
|
||||
```yaml
|
||||
gate_decision:
|
||||
target: 'story-1.3'
|
||||
decision: 'PASS' # or CONCERNS / FAIL / WAIVED
|
||||
date: '2025-10-14'
|
||||
evaluator: 'TEA Agent'
|
||||
criteria:
|
||||
p0_pass_rate: 100
|
||||
p1_pass_rate: 98
|
||||
overall_pass_rate: 96
|
||||
code_coverage: 85
|
||||
security_issues: 0
|
||||
critical_nfrs_fail: 0
|
||||
flaky_tests: 0
|
||||
evidence:
|
||||
test_results: 'CI Run #456'
|
||||
traceability: 'traceability-matrix-1.3.md'
|
||||
nfr_assessment: 'nfr-assessment-1.3.md'
|
||||
next_steps: 'Deploy to staging, monitor metrics'
|
||||
```
|
||||
|
||||
4. **Save outputs**:
|
||||
- Write gate decision document to `{output_file}`
|
||||
- Write gate YAML to `{output_folder}/gate-decision-{target}.yaml`
|
||||
- Update status file
|
||||
|
||||
**Output:** Gate decision documented, tracked, and communicated
|
||||
|
||||
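Appending to the status file is a one-liner once the decision exists. A minimal sketch, using the history format shown in action 1 above; the file name comes from the workflow configuration.

```python
# Append a gate decision entry to the BMad status file history.
from datetime import date

def append_gate_history(status_file, decision, target, rationale):
    entry = f"[{date.today().isoformat()}] Gate Decision: {decision} - {target} - {rationale}\n"
    with open(status_file, "a") as f:
        f.write(entry)

append_gate_history(
    "bmm-workflow-status.md",
    "PASS",
    "Story 1.3",
    "All P0/P1 criteria met, 98% pass rate",
)
```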
---
|
||||
|
||||
## Decision Matrix (Quick Reference)
|
||||
|
||||
| Scenario | P0 Pass Rate | P1 Pass Rate | Security Issues | Critical NFRs | Decision | Action |
|
||||
| ----------------- | ------------ | ------------ | --------------- | ------------- | ------------ | ---------------------- |
|
||||
| Ideal | 100% | ≥95% | 0 | 0 | **PASS** | Deploy |
|
||||
| Minor issues | 100% | 90-94% | 0 | 0 | **CONCERNS** | Deploy with monitoring |
|
||||
| P1 degradation | 100% | <90% | 0 | 0 | **CONCERNS** | Fix in next sprint |
|
||||
| P0 failure | <100% | any | any | any | **FAIL** | Block release |
|
||||
| Security issue | any | any | >0 | any | **FAIL** | Fix immediately |
|
||||
| Critical NFR fail | any | any | any | >0 | **FAIL** | Remediate first |
|
||||
| Business waiver | <100% | any | any | any | **WAIVED** | Deploy with approval |
|
||||
|
||||
---
|
||||
|
||||
## Waiver Management
|
||||
|
||||
**When to waive:**
|
||||
|
||||
- Business-critical deadline (e.g., regulatory requirement, contractual obligation)
|
||||
- Issue is low-probability edge case with acceptable risk
|
||||
- Workaround exists for known issue
|
||||
- Fix is in progress but can be deployed post-release
|
||||
|
||||
**Waiver requirements:**
|
||||
|
||||
- Named approver (VP Engineering, CTO, Product Owner)
|
||||
- Business justification documented
|
||||
- Remediation plan with due date
|
||||
- Expiry date (waiver does NOT apply to future releases)
|
||||
- Monitoring plan for waived risk
|
||||
|
||||
**Never waive:**
|
||||
|
||||
- Security vulnerabilities
|
||||
- Data corruption risks
|
||||
- Critical user journey failures
|
||||
- Compliance violations
|
||||
|
||||
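To keep these requirements machine-checkable, a waiver can be modeled as a small record with an explicit expiry. This is only an illustrative sketch; the workflow captures waivers in the gate document and YAML rather than in code.

```python
# Illustrative waiver record with the required fields and an expiry check.
from dataclasses import dataclass
from datetime import date

@dataclass
class Waiver:
    reason: str            # business justification
    approver: str          # named authority, e.g. "Jane Doe, VP Engineering"
    expiry: date           # waivers never carry over to future releases
    remediation_due: date  # concrete fix date
    monitoring_plan: str   # how the waived risk is watched

    def is_active(self, on=None):
        return (on or date.today()) <= self.expiry

w = Waiver(
    reason="GDPR regulatory deadline",
    approver="Jane Doe, VP Engineering",
    expiry=date(2025, 10, 15),
    remediation_due=date(2025, 10, 20),
    monitoring_plan="Enhanced error tracking on reporting module",
)
print(w.is_active(date(2025, 10, 14)))  # True
```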
---
|
||||
|
||||
## Example Gate Decisions
|
||||
|
||||
### Example 1: PASS Decision
|
||||
|
||||
```markdown
|
||||
# Gate Decision: story-1.3 (User Authentication Flow)
|
||||
|
||||
**Decision:** ✅ PASS
|
||||
**Date:** 2025-10-14
|
||||
**Evaluator:** TEA Agent
|
||||
|
||||
## Evidence Summary
|
||||
|
||||
- **P0 Tests:** 12/12 passed (100%) ✅
|
||||
- **P1 Tests:** 24/25 passed (96%) ✅
|
||||
- **Overall Pass Rate:** 98% ✅
|
||||
- **Code Coverage:** 87% ✅
|
||||
- **Security Issues:** 0 ✅
|
||||
- **Flaky Tests:** 0 ✅
|
||||
|
||||
## Rationale
|
||||
|
||||
All P0 criteria met. All P1 criteria exceeded thresholds. No critical issues detected. Feature is ready for production deployment.
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Deploy to staging environment
|
||||
2. Monitor authentication metrics for 24 hours
|
||||
3. Deploy to production if no issues
|
||||
```
|
||||
|
||||
### Example 2: CONCERNS Decision
|
||||
|
||||
```markdown
|
||||
# Gate Decision: epic-2 (Payment Processing)
|
||||
|
||||
**Decision:** ⚠️ CONCERNS
|
||||
**Date:** 2025-10-14
|
||||
**Evaluator:** TEA Agent
|
||||
|
||||
## Evidence Summary
|
||||
|
||||
- **P0 Tests:** 28/28 passed (100%) ✅
|
||||
- **P1 Tests:** 42/47 passed (89%) ⚠️
|
||||
- **Overall Pass Rate:** 91% ✅
|
||||
- **Code Coverage:** 78% ⚠️
|
||||
- **Security Issues:** 0 ✅
|
||||
- **Flaky Tests:** 0 ✅
|
||||
|
||||
## Rationale
|
||||
|
||||
All P0 criteria met, but P1 pass rate (89%) below threshold (95%). Coverage (78%) slightly below target (80%). Issues are non-critical and can be addressed post-release.
|
||||
|
||||
## Residual Risks
|
||||
|
||||
1. **P1 Issue**: Edge case in refund flow for international currencies (low probability)
|
||||
2. **Coverage Gap**: Missing tests for admin cancel flow (workaround exists)
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Deploy with enhanced monitoring on refund flows
|
||||
2. Create backlog stories for P1 fixes
|
||||
3. Add missing tests in next sprint
|
||||
```
|
||||
|
||||
### Example 3: FAIL Decision
|
||||
|
||||
```markdown
|
||||
# Gate Decision: story-3.2 (Data Export)
|
||||
|
||||
**Decision:** ❌ FAIL
|
||||
**Date:** 2025-10-14
|
||||
**Evaluator:** TEA Agent
|
||||
|
||||
## Evidence Summary
|
||||
|
||||
- **P0 Tests:** 8/10 passed (80%) ❌
|
||||
- **P1 Tests:** 18/22 passed (82%) ❌
|
||||
- **Security Issues:** 1 (SQL injection in export filter) ❌
|
||||
- **Code Coverage:** 65% ❌
|
||||
|
||||
## Rationale
|
||||
|
||||
**CRITICAL BLOCKERS:**
|
||||
|
||||
1. P0 test failures in core export functionality
|
||||
2. Unresolved SQL injection vulnerability (CRITICAL security issue)
|
||||
3. Coverage below minimum threshold
|
||||
|
||||
Release BLOCKED until critical issues are resolved.
|
||||
|
||||
## Critical Issues
|
||||
|
||||
| Priority | Issue | Owner | Due Date |
|
||||
| -------- | ------------------------------------- | ------------ | ---------- |
|
||||
| P0 | Fix SQL injection in export filter | Backend Team | 2025-10-16 |
|
||||
| P0 | Fix export pagination bug | Backend Team | 2025-10-16 |
|
||||
| P0 | Fix export timeout for large datasets | Backend Team | 2025-10-17 |
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Block deployment immediately**
|
||||
2. Fix P0 issues listed above
|
||||
3. Re-run full test suite
|
||||
4. Re-run gate workflow after fixes
|
||||
```
|
||||
|
||||
### Example 4: WAIVED Decision
|
||||
|
||||
```markdown
|
||||
# Gate Decision: release-v2.4.0
|
||||
|
||||
**Decision:** 🔓 WAIVED
|
||||
**Date:** 2025-10-14
|
||||
**Evaluator:** TEA Agent
|
||||
|
||||
## Original Decision: ❌ FAIL
|
||||
|
||||
**Reason for failure:**
|
||||
|
||||
- P0 test failure in legacy reporting module
|
||||
- Issue affects <1% of users (specific browser configuration)
|
||||
|
||||
## Waiver Details
|
||||
|
||||
- **Waiver Reason:** Regulatory deadline for GDPR compliance features (Oct 15)
|
||||
- **Waiver Approver:** Jane Doe, VP Engineering
|
||||
- **Waiver Expiry:** 2025-10-15 (does NOT apply to v2.4.1)
|
||||
- **Monitoring Plan:** Enhanced error tracking on reporting module
|
||||
- **Remediation Plan:** Fix in v2.4.1 hotfix (due Oct 20)
|
||||
|
||||
## Business Justification
|
||||
|
||||
Release contains critical GDPR compliance features required by regulatory deadline. Failed test affects legacy reporting module used by <1% of users in specific edge case (IE11 + Windows 7). Workaround available (use Chrome). Risk acceptable given regulatory priority.
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Deploy v2.4.0 with waiver
|
||||
2. Monitor error rates on reporting module
|
||||
3. Fix legacy module in v2.4.1 (Oct 20)
|
||||
4. Notify affected users of workaround
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Integration with BMad Status File
|
||||
|
||||
This workflow updates `bmm-workflow-status.md` with gate decisions for tracking:
|
||||
|
||||
```markdown
|
||||
### Quality & Testing Progress (TEA Agent)
|
||||
|
||||
**Gate Decisions:**
|
||||
|
||||
- [2025-10-14] ✅ PASS - Story 1.3 (User Auth) - All criteria met
|
||||
- [2025-10-14] ⚠️ CONCERNS - Epic 2 (Payments) - P1 pass rate 89%
|
||||
- [2025-10-14] ❌ FAIL - Story 3.2 (Export) - Security issue blocking
|
||||
- [2025-10-15] 🔓 WAIVED - Release v2.4.0 - GDPR deadline waiver
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
1. **Deterministic > Manual**: Use rule-based decisions to reduce bias and ensure consistency
|
||||
2. **Evidence Required**: Never make decisions without test results and assessments
|
||||
3. **P0 is Sacred**: P0 failures ALWAYS result in FAIL (no exceptions except waivers)
|
||||
4. **Waivers are Temporary**: Waiver does NOT apply to future releases - issue must be fixed
|
||||
5. **Security Never Waived**: Security vulnerabilities should never be waived
|
||||
6. **Transparency**: Document rationale clearly for audit trail
|
||||
7. **Freshness Matters**: Stale assessments (>7 days) should be re-run
|
||||
8. **Burn-in Counts**: Flaky tests detected in burn-in should block gate
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
**Problem: No test results found**
|
||||
|
||||
- Check CI/CD pipeline for test execution
|
||||
- Verify test results path in workflow variables
|
||||
- Run tests locally and provide results
|
||||
|
||||
**Problem: Assessments are stale**
|
||||
|
||||
- Re-run `*test-design`, `*trace`, `*nfr-assess` workflows
|
||||
- Update evidence files before gate decision
|
||||
|
||||
**Problem: Unclear decision (edge case)**
|
||||
|
||||
- Escalate to manual review
|
||||
- Document assumptions and rationale
|
||||
- Consider waiver if business-critical
|
||||
|
||||
**Problem: Waiver requested but not justified**
|
||||
|
||||
- Require business justification from stakeholder
|
||||
- Ensure named approver is appropriate authority
|
||||
- Verify remediation plan exists with due date
|
||||
@@ -1,94 +0,0 @@
|
||||
# Test Architect workflow: gate
|
||||
name: testarch-gate
|
||||
description: "Quality gate decision for story/epic/release with deterministic PASS/CONCERNS/FAIL/WAIVED status"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/bmad/bmm/workflows/testarch/gate"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
template: "{installed_path}/gate-template.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
# Gate target
|
||||
gate_type: "story" # story, epic, release, hotfix
|
||||
story_id: "" # e.g., "1.3" for story mode
|
||||
epic_num: "" # e.g., "1" for epic mode
|
||||
release_version: "" # e.g., "v2.4.0" for release mode
|
||||
|
||||
# Gate decision configuration
|
||||
decision_mode: "deterministic" # deterministic (rule-based) or manual (team decision)
|
||||
allow_waivers: true # Allow business-approved waivers for FAIL → WAIVED
|
||||
require_evidence: true # Require links to test results, reports, etc.
|
||||
|
||||
# Input sources (auto-discover if not provided)
|
||||
story_file: "" # Path to story markdown
|
||||
test_design_file: "" # Path to test-design.md (risk assessment)
|
||||
trace_file: "" # Path to traceability-matrix.md (coverage)
|
||||
nfr_file: "" # Path to nfr-assessment.md (non-functional validation)
|
||||
test_results: "" # Path to test execution results (CI artifacts, reports)
|
||||
|
||||
# Decision criteria (thresholds)
|
||||
min_p0_pass_rate: 100 # P0 tests must have 100% pass rate
|
||||
min_p1_pass_rate: 95 # P1 tests threshold
|
||||
min_overall_pass_rate: 90 # Overall test pass rate
|
||||
min_coverage: 80 # Code/requirement coverage minimum
|
||||
max_critical_nfrs_fail: 0 # No critical NFRs can fail
|
||||
max_security_issues: 0 # No unresolved security issues
|
||||
|
||||
# Risk tolerance
|
||||
allow_p2_failures: true # P2 failures don't block release
|
||||
allow_p3_failures: true # P3 failures don't block release
|
||||
escalate_p1_failures: true # P1 failures require escalation approval
|
||||
|
||||
# Output configuration
|
||||
output_file: "{output_folder}/gate-decision-{gate_type}-{story_id}{epic_num}{release_version}.md"
|
||||
append_to_history: true # Append to bmm-workflow-status.md gate history
|
||||
notify_stakeholders: true # Generate notification message for team
|
||||
|
||||
# Advanced options
|
||||
auto_load_knowledge: true # Load risk-governance, probability-impact, test-quality fragments
|
||||
check_all_workflows_complete: true # Verify test-design, trace, nfr-assess are complete
|
||||
validate_evidence_freshness: true # Warn if assessments are >7 days old
|
||||
require_sign_off: false # Require named approver for gate decision
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{output_folder}/gate-decision.md"
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read story, assessments, test results
|
||||
- write_file # Create gate decision document
|
||||
- search_repo # Find related artifacts
|
||||
- list_files # Discover assessments
|
||||
|
||||
# Recommended inputs
|
||||
recommended_inputs:
|
||||
- story: "Story or epic being gated (required)"
|
||||
- test_design: "Risk assessment with P0-P3 prioritization (required)"
|
||||
- trace: "Requirements-to-tests traceability matrix (required)"
|
||||
- nfr_assess: "Non-functional requirements validation (recommended)"
|
||||
- test_results: "CI/CD test execution results (required)"
|
||||
- code_coverage: "Code coverage report (recommended)"
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- gate
|
||||
- test-architect
|
||||
- release
|
||||
- decision
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: false # Gate decision is single-pass
|
||||
|
||||
web_bundle: false
|
||||
@@ -403,7 +403,7 @@ bmad tea *nfr-assess \
|
||||
- **testarch-test-design** → `*nfr-assess` - Define NFR requirements, then assess
|
||||
- **testarch-framework** → `*nfr-assess` - Set up frameworks, then validate NFRs
|
||||
- **testarch-ci** → `*nfr-assess` - Configure CI, then assess reliability with burn-in
|
||||
- `*nfr-assess` → **testarch-gate** - Assess NFRs, then apply quality gates
|
||||
- `*nfr-assess` → **testarch-trace (Phase 2)** - Assess NFRs, then apply quality gates
|
||||
- `*nfr-assess` → **testarch-test-review** - Assess maintainability, then review tests
|
||||
|
||||
---
|
||||
@@ -452,7 +452,7 @@ bmad tea *nfr-assess \
|
||||
- `bmad tea *test-design` - Define NFR requirements and test plan
|
||||
- `bmad tea *framework` - Set up performance/security testing frameworks
|
||||
- `bmad tea *ci` - Configure CI/CD for NFR validation
|
||||
- `bmad tea *gate` - Apply quality gates using NFR assessment metrics
|
||||
- `bmad tea *trace` (Phase 2) - Apply quality gates using NFR assessment metrics
|
||||
- `bmad tea *test-review` - Review test quality (maintainability NFR)
|
||||
|
||||
---
|
||||
|
||||
@@ -51,10 +51,11 @@ This workflow performs a comprehensive assessment of non-functional requirements
|
||||
**Actions:**
|
||||
|
||||
1. Load relevant knowledge fragments from `{project-root}/bmad/bmm/testarch/tea-index.csv`:
|
||||
- `nfr-criteria.md` - Non-functional requirements criteria and thresholds
|
||||
- `ci-burn-in.md` - CI/CD burn-in patterns for reliability validation
|
||||
- `test-quality.md` - Test quality expectations (related to maintainability)
|
||||
- `playwright-config.md` - Performance configuration patterns (if using Playwright)
|
||||
- `nfr-criteria.md` - Non-functional requirements criteria and thresholds (security, performance, reliability, maintainability with code examples, 658 lines, 4 examples)
|
||||
- `ci-burn-in.md` - CI/CD burn-in patterns for reliability validation (10-iteration detection, sharding, selective execution, 678 lines, 4 examples)
|
||||
- `test-quality.md` - Test quality expectations for maintainability (deterministic, isolated, explicit assertions, length/time limits, 658 lines, 5 examples)
|
||||
- `playwright-config.md` - Performance configuration patterns: parallelization, timeout standards, artifact output (722 lines, 5 examples)
|
||||
- `error-handling.md` - Reliability validation patterns: scoped exceptions, retry validation, telemetry logging, graceful degradation (736 lines, 4 examples)
|
||||
|
||||
2. Read story file (if provided):
|
||||
- Extract NFR requirements
|
||||
|
||||
@@ -176,6 +176,121 @@ The TEA agent runs this workflow when:
|
||||
|
||||
**Key principle**: Avoid duplicate coverage - don't test same behavior at multiple levels.
|
||||
|
||||
### Exploratory Mode (NEW - Phase 2.5)
|
||||
|
||||
**test-design** supports UI exploration for brownfield applications with missing documentation.
|
||||
|
||||
**Activation**: Automatic when requirements missing/incomplete for brownfield apps
|
||||
|
||||
- If config.tea_use_mcp_enhancements is true + MCP available → MCP-assisted exploration
|
||||
- Otherwise → Manual exploration with user documentation
|
||||
|
||||
**When to Use Exploratory Mode:**
|
||||
|
||||
- ✅ Brownfield projects with missing documentation
|
||||
- ✅ Legacy systems lacking requirements
|
||||
- ✅ Undocumented features needing test coverage
|
||||
- ✅ Unknown user journeys requiring discovery
|
||||
- ❌ NOT for greenfield projects with clear requirements
|
||||
|
||||
**Exploration Modes:**
|
||||
|
||||
1. **MCP-Assisted Exploration** (if Playwright MCP available):
|
||||
- Interactive browser exploration using MCP tools
|
||||
- `planner_setup_page` - Initialize browser
|
||||
- `browser_navigate` - Explore pages
|
||||
- `browser_click` - Interact with UI elements
|
||||
- `browser_hover` - Reveal hidden menus
|
||||
- `browser_snapshot` - Capture state at each step
|
||||
- `browser_screenshot` - Document visually
|
||||
- `browser_console_messages` - Find JavaScript errors
|
||||
- `browser_network_requests` - Identify API endpoints
|
||||
|
||||
2. **Manual Exploration** (fallback without MCP):
|
||||
- User explores application manually
|
||||
- Documents findings in markdown:
|
||||
- Pages/features discovered
|
||||
- User journeys identified
|
||||
- API endpoints observed (DevTools Network)
|
||||
- JavaScript errors noted (DevTools Console)
|
||||
- Critical workflows mapped
|
||||
- Provides exploration findings to workflow
|
||||
|
||||
**Exploration Workflow:**
|
||||
|
||||
```
|
||||
1. Enable exploratory_mode and set exploration_url
|
||||
2. IF MCP available:
|
||||
- Use planner_setup_page to init browser
|
||||
- Explore UI with browser_* tools
|
||||
- Capture snapshots and screenshots
|
||||
- Monitor console and network
|
||||
- Document discoveries
|
||||
3. IF MCP unavailable:
|
||||
- Notify user to explore manually
|
||||
- Wait for exploration findings
|
||||
4. Convert discoveries to testable requirements
|
||||
5. Continue with standard risk assessment (Step 2)
|
||||
```
|
||||
|
||||
**Example Output from Exploratory Mode:**
|
||||
|
||||
```markdown
|
||||
## Exploration Findings - Legacy Admin Panel
|
||||
|
||||
**Exploration URL**: https://admin.example.com
|
||||
**Mode**: MCP-Assisted
|
||||
|
||||
### Discovered Features:
|
||||
|
||||
1. User Management (/admin/users)
|
||||
- List users (table with 10 columns)
|
||||
- Edit user (modal form)
|
||||
- Delete user (confirmation dialog)
|
||||
- Export to CSV (download button)
|
||||
|
||||
2. Reporting Dashboard (/admin/reports)
|
||||
- Date range picker
|
||||
- Filter by department
|
||||
- Generate PDF report
|
||||
- Email report to stakeholders
|
||||
|
||||
3. API Endpoints Discovered:
|
||||
- GET /api/admin/users
|
||||
- PUT /api/admin/users/:id
|
||||
- DELETE /api/admin/users/:id
|
||||
- POST /api/reports/generate
|
||||
|
||||
### User Journeys Mapped:
|
||||
|
||||
1. Admin deletes inactive user
|
||||
- Navigate to /admin/users
|
||||
- Click delete icon
|
||||
- Confirm in modal
|
||||
- User removed from table
|
||||
|
||||
2. Admin generates monthly report
|
||||
- Navigate to /admin/reports
|
||||
- Select date range (last month)
|
||||
- Click generate
|
||||
- Download PDF
|
||||
|
||||
### Risks Identified (from exploration):
|
||||
|
||||
- R-001 (SEC): No RBAC check observed (any admin can delete any user)
|
||||
- R-002 (DATA): No confirmation on bulk delete
|
||||
- R-003 (PERF): User table loads slowly (5s for 1000 rows)
|
||||
|
||||
**Next**: Proceed to risk assessment with discovered requirements
|
||||
```
|
||||
|
||||
**Graceful Degradation:**
|
||||
|
||||
- Exploratory mode is OPTIONAL (default: disabled)
|
||||
- Works without Playwright MCP (manual fallback)
|
||||
- If exploration fails, can disable mode and provide requirements documentation
|
||||
- Seamlessly transitions to standard risk assessment workflow
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
Automatically consults TEA knowledge base:
|
||||
@@ -197,7 +312,7 @@ Automatically consults TEA knowledge base:
|
||||
|
||||
- **atdd**: Generate failing tests for P0 scenarios
|
||||
- **automate**: Expand coverage for P1/P2 scenarios
|
||||
- **gate**: Use quality gate criteria for release decisions
|
||||
- **trace (Phase 2)**: Use quality gate criteria for release decisions
|
||||
|
||||
**Coordinates with:**
|
||||
|
||||
@@ -368,7 +483,7 @@ Total effort: 65 hours (~8 days)
|
||||
|
||||
- **atdd**: Generate failing tests → [atdd/README.md](../atdd/README.md)
|
||||
- **automate**: Expand regression coverage → [automate/README.md](../automate/README.md)
|
||||
- **gate**: Quality gate decisions → [gate/README.md](../gate/README.md)
|
||||
- **trace**: Traceability and quality gate decisions → [trace/README.md](../trace/README.md)
|
||||
- **framework**: Test infrastructure → [framework/README.md](../framework/README.md)
|
||||
|
||||
## Version History
|
||||
|
||||
@@ -49,12 +49,126 @@ Plans comprehensive test coverage strategy with risk assessment, priority classi
|
||||
4. **Load Knowledge Base Fragments**
|
||||
|
||||
**Critical:** Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load:
|
||||
- `risk-governance.md` - Risk classification framework
|
||||
- `probability-impact.md` - Risk scoring methodology
|
||||
- `test-levels-framework.md` - Test level selection guidance
|
||||
- `test-priorities-matrix.md` - P0-P3 prioritization criteria
|
||||
- `risk-governance.md` - Risk classification framework (6 categories: TECH, SEC, PERF, DATA, BUS, OPS), automated scoring, gate decision engine, owner tracking (625 lines, 4 examples)
|
||||
- `probability-impact.md` - Risk scoring methodology (probability × impact matrix, automated classification, dynamic re-assessment, gate integration, 604 lines, 4 examples)
|
||||
- `test-levels-framework.md` - Test level selection guidance (E2E vs API vs Component vs Unit with decision matrix, characteristics, when to use each, 467 lines, 4 examples)
|
||||
- `test-priorities-matrix.md` - P0-P3 prioritization criteria (automated priority calculation, risk-based mapping, tagging strategy, time budgets, 389 lines, 2 examples)
|
||||
|
||||
**Halt Condition:** If story data or acceptance criteria are missing, HALT with message: "Test design requires clear requirements and acceptance criteria"
|
||||
**Halt Condition:** If story data or acceptance criteria are missing, check if brownfield exploration is an option. If neither requirements nor exploration is possible, HALT with message: "Test design requires clear requirements, acceptance criteria, or brownfield app URL for exploration"
|
||||
|
||||
---
|
||||
|
||||
## Step 1.5: Mode Selection (NEW - Phase 2.5)
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Detect Planning Mode**
|
||||
|
||||
Determine mode based on context:
|
||||
|
||||
**Requirements-Based Mode (DEFAULT)**:
|
||||
- Have clear story/PRD with acceptance criteria
|
||||
- Uses: Existing workflow (Steps 2-4)
|
||||
- Appropriate for: Documented features, greenfield projects
|
||||
|
||||
**Exploratory Mode (OPTIONAL - Brownfield)**:
|
||||
- Missing/incomplete requirements AND brownfield application exists
|
||||
- Uses: UI exploration to discover functionality
|
||||
- Appropriate for: Undocumented brownfield apps, legacy systems
|
||||
|
||||
2. **Requirements-Based Mode (DEFAULT - Skip to Step 2)**
|
||||
|
||||
If requirements are clear:
|
||||
- Continue with existing workflow (Step 2: Assess and Classify Risks)
|
||||
- Use loaded requirements from Step 1
|
||||
- Proceed with risk assessment based on documented requirements
|
||||
|
||||
3. **Exploratory Mode (OPTIONAL - Brownfield Apps)**
|
||||
|
||||
If exploring brownfield application:
|
||||
|
||||
**A. Check MCP Availability**
|
||||
|
||||
If config.tea_use_mcp_enhancements is true AND Playwright MCP tools available:
|
||||
- Use MCP-assisted exploration (Step 3.B)
|
||||
|
||||
If MCP unavailable OR config.tea_use_mcp_enhancements is false:
|
||||
- Use manual exploration fallback (Step 3.C)
|
||||
|
||||
**B. MCP-Assisted Exploration (If MCP Tools Available)**
|
||||
|
||||
Use Playwright MCP browser tools to explore UI:
|
||||
|
||||
**Setup:**
|
||||
|
||||
```
|
||||
1. Use planner_setup_page to initialize browser
|
||||
2. Navigate to {exploration_url}
|
||||
3. Capture initial state with browser_snapshot
|
||||
```
|
||||
|
||||
**Exploration Process:**
|
||||
|
||||
```
|
||||
4. Use browser_navigate to explore different pages
|
||||
5. Use browser_click to interact with buttons, links, forms
|
||||
6. Use browser_hover to reveal hidden menus/tooltips
|
||||
7. Capture browser_snapshot at each significant state
|
||||
8. Take browser_screenshot for documentation
|
||||
9. Monitor browser_console_messages for JavaScript errors
|
||||
10. Track browser_network_requests to identify API calls
|
||||
11. Map user flows and interactive elements
|
||||
12. Document discovered functionality
|
||||
```
|
||||
|
||||
**Discovery Documentation:**
|
||||
- Create list of discovered features (pages, workflows, forms)
|
||||
- Identify user journeys (navigation paths)
|
||||
- Map API endpoints (from network requests)
|
||||
- Note error states (from console messages)
|
||||
- Capture screenshots for visual reference
|
||||
|
||||
**Convert to Test Scenarios:**
|
||||
- Transform discoveries into testable requirements
|
||||
- Prioritize based on user flow criticality
|
||||
- Identify risks from discovered functionality
|
||||
- Continue with Step 2 (Assess and Classify Risks) using discovered requirements
|
||||
|
||||
**C. Manual Exploration Fallback (If MCP Unavailable)**
|
||||
|
||||
If Playwright MCP is not available:
|
||||
|
||||
**Notify User:**
|
||||
|
||||
```markdown
|
||||
Exploratory mode enabled but Playwright MCP unavailable.
|
||||
|
||||
**Manual exploration required:**
|
||||
|
||||
1. Open application at: {exploration_url}
|
||||
2. Explore all pages, workflows, and features
|
||||
3. Document findings in markdown:
|
||||
- List of pages/features discovered
|
||||
- User journeys identified
|
||||
- API endpoints observed (DevTools Network tab)
|
||||
- JavaScript errors noted (DevTools Console)
|
||||
- Critical workflows mapped
|
||||
|
||||
4. Provide exploration findings to continue workflow
|
||||
|
||||
**Alternative:** Disable exploratory_mode and provide requirements documentation
|
||||
```
|
||||
|
||||
Wait for user to provide exploration findings, then:
|
||||
- Parse user-provided discovery documentation
|
||||
- Convert to testable requirements
|
||||
- Continue with Step 2 (risk assessment)
|
||||
|
||||
4. **Proceed to Risk Assessment**
|
||||
|
||||
After mode selection (Requirements-Based OR Exploratory):
|
||||
- Continue to Step 2: Assess and Classify Risks
|
||||
- Use requirements from documentation (Requirements-Based) OR discoveries (Exploratory)
|
||||
|
||||
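The mode-selection rules in this step reduce to a small decision function. Below is a minimal, non-normative sketch: `TeaConfig`, `mcpToolsAvailable`, and the input flags are illustrative names, and only `tea_use_mcp_enhancements` corresponds to an actual module config key.

```typescript
// Sketch of Step 1.5 mode selection. Helper names are hypothetical.
type PlanningMode = 'requirements-based' | 'mcp-exploration' | 'manual-exploration';

interface TeaConfig {
  tea_use_mcp_enhancements: boolean; // from the bmm module config
}

function selectPlanningMode(
  hasClearRequirements: boolean,
  brownfieldAppUrl: string | undefined,
  config: TeaConfig,
  mcpToolsAvailable: () => boolean,
): PlanningMode {
  // Default path: documented requirements drive the standard workflow (Steps 2-4).
  if (hasClearRequirements) return 'requirements-based';

  // No requirements and nothing to explore: halt condition from Step 1.
  if (!brownfieldAppUrl) {
    throw new Error(
      'Test design requires clear requirements, acceptance criteria, or brownfield app URL for exploration',
    );
  }

  // Brownfield app without requirements: explore, preferring MCP when enabled and reachable.
  return config.tea_use_mcp_enhancements && mcpToolsAvailable()
    ? 'mcp-exploration'
    : 'manual-exploration';
}
```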
---
|
||||
|
||||
@@ -402,18 +516,21 @@ Examples:
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
**Auto-load enabled:**
|
||||
**Core Fragments (Auto-loaded in Step 1):**
|
||||
|
||||
- `risk-governance.md` - Risk framework
|
||||
- `probability-impact.md` - Scoring guide
|
||||
- `test-levels-framework.md` - Level selection
|
||||
- `test-priorities-matrix.md` - Priority assignment
|
||||
- `risk-governance.md` - Risk classification (6 categories), automated scoring, gate decision engine, coverage traceability, owner tracking (625 lines, 4 examples)
|
||||
- `probability-impact.md` - Probability × impact matrix, automated classification thresholds, dynamic re-assessment, gate integration (604 lines, 4 examples)
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
|
||||
- `test-priorities-matrix.md` - P0-P3 automated priority calculation, risk-based mapping, tagging strategy, time budgets (389 lines, 2 examples)
|
||||
|
||||
**Manual reference:**
|
||||
**Reference for Test Planning:**
|
||||
|
||||
- Use `tea-index.csv` to find additional fragments
|
||||
- Load `selective-testing.md` for execution strategy
|
||||
- Load `fixture-architecture.md` for data setup patterns
|
||||
- `selective-testing.md` - Execution strategy: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
|
||||
- `fixture-architecture.md` - Data setup patterns: pure function → fixture → mergeTests, auto-cleanup (406 lines, 5 examples)
|
||||
|
||||
**Manual Reference (Optional):**
|
||||
|
||||
- Use `tea-index.csv` to find additional specialized fragments as needed
|
||||
|
||||
### Evidence-Based Assessment
|
||||
|
||||
|
||||
@@ -50,14 +50,18 @@ This workflow performs comprehensive test quality reviews using TEA's knowledge
|
||||
**Actions:**
|
||||
|
||||
1. Load relevant knowledge fragments from `{project-root}/bmad/bmm/testarch/tea-index.csv`:
|
||||
- `test-quality.md` - Definition of Done (no hard waits, <300 lines, <1.5 min, self-cleaning)
|
||||
- `fixture-architecture.md` - Pure function → Fixture → mergeTests pattern
|
||||
- `network-first.md` - Route intercept before navigate (race condition prevention)
|
||||
- `data-factories.md` - Factory functions with overrides, API-first setup
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit appropriateness
|
||||
- `playwright-config.md` - Environment-based configuration (if Playwright detected)
|
||||
- `tdd-cycles.md` - Red-Green-Refactor patterns
|
||||
- `selective-testing.md` - Duplicate coverage detection
|
||||
- `test-quality.md` - Definition of Done (deterministic tests, isolated with cleanup, explicit assertions, <300 lines, <1.5 min, 658 lines, 5 examples)
|
||||
- `fixture-architecture.md` - Pure function → Fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)
|
||||
- `network-first.md` - Route intercept before navigate to prevent race conditions (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)
|
||||
- `data-factories.md` - Factory functions with faker: overrides, nested factories, API-first setup (498 lines, 5 examples)
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit appropriateness with decision matrix (467 lines, 4 examples)
|
||||
- `playwright-config.md` - Environment-based configuration with fail-fast validation (722 lines, 5 examples)
|
||||
- `component-tdd.md` - Red-Green-Refactor patterns with provider isolation, accessibility, visual regression (480 lines, 4 examples)
|
||||
- `selective-testing.md` - Duplicate coverage detection with tag-based, spec filter, diff-based selection (727 lines, 4 examples)
|
||||
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
|
||||
- `selector-resilience.md` - Selector best practices (data-testid > ARIA > text > CSS hierarchy, anti-patterns, 541 lines, 4 examples)
|
||||
- `timing-debugging.md` - Race condition prevention and async debugging techniques (370 lines, 3 examples)
|
||||
- `ci-burn-in.md` - Flaky test detection with 10-iteration burn-in loop (678 lines, 4 examples)
|
||||
|
||||
2. Determine review scope:
|
||||
- **single**: Review one test file (`test_file_path` provided)
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# Requirements Traceability Workflow
|
||||
# Requirements Traceability & Quality Gate Workflow
|
||||
|
||||
**Workflow ID:** `testarch-trace`
|
||||
**Agent:** Test Architect (TEA)
|
||||
@@ -8,14 +8,22 @@
|
||||
|
||||
## Overview
|
||||
|
||||
The **trace** workflow generates a comprehensive requirements-to-tests traceability matrix that maps acceptance criteria to implemented tests, identifies coverage gaps, and provides actionable recommendations for improving test coverage.
|
||||
The **trace** workflow operates in two sequential phases to validate test coverage and deployment readiness:
|
||||
|
||||
**PHASE 1 - REQUIREMENTS TRACEABILITY:** Generates comprehensive requirements-to-tests traceability matrix that maps acceptance criteria to implemented tests, identifies coverage gaps, and provides actionable recommendations.
|
||||
|
||||
**PHASE 2 - QUALITY GATE DECISION:** Makes deterministic release decisions (PASS/CONCERNS/FAIL/WAIVED) based on traceability results, test execution evidence, and non-functional requirements validation.
|
||||
|
||||
**Key Features:**
|
||||
|
||||
- Maps acceptance criteria to specific test cases across all levels (E2E, API, Component, Unit)
|
||||
- Classifies coverage status (FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY)
|
||||
- Prioritizes gaps by risk level (P0/P1/P2/P3)
|
||||
- Generates CI/CD-ready YAML snippets for quality gates
|
||||
- Applies deterministic decision rules for deployment readiness
|
||||
- Generates gate decisions with evidence and rationale
|
||||
- Supports waivers for business-approved exceptions
|
||||
- Updates workflow status and notifies stakeholders
|
||||
- Creates CI/CD-ready YAML snippets for quality gates
|
||||
- Detects duplicate coverage across test levels
|
||||
- Verifies test quality (assertions, structure, performance)
|
||||
|
||||
@@ -25,33 +33,49 @@ The **trace** workflow generates a comprehensive requirements-to-tests traceabil
|
||||
|
||||
Use `*trace` when you need to:
|
||||
|
||||
### Phase 1 - Traceability
|
||||
|
||||
- ✅ Validate that all acceptance criteria have test coverage
|
||||
- ✅ Identify coverage gaps before release or PR merge
|
||||
- ✅ Generate traceability documentation for compliance or audits
|
||||
- ✅ Ensure critical paths (P0/P1) are fully tested
|
||||
- ✅ Detect duplicate coverage across test levels
|
||||
- ✅ Assess test quality across your suite
|
||||
- ✅ Create gate-ready metrics for CI/CD pipelines
|
||||
|
||||
### Phase 2 - Gate Decision (Optional)
|
||||
|
||||
- ✅ Make final go/no-go deployment decision
|
||||
- ✅ Validate test execution results against thresholds
|
||||
- ✅ Evaluate non-functional requirements (security, performance)
|
||||
- ✅ Generate audit trail for release approval
|
||||
- ✅ Handle business waivers for critical deadlines
|
||||
- ✅ Notify stakeholders of gate decision
|
||||
|
||||
**Typical Timing:**
|
||||
|
||||
- After tests are implemented (post-ATDD or post-development)
|
||||
- Before merging a PR (validate P0/P1 coverage)
|
||||
- Before release (validate full coverage)
|
||||
- Before release (validate full coverage and make gate decision)
|
||||
- During sprint retrospectives (assess test quality)
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
**Required:**
|
||||
### Phase 1 - Traceability (Required)
|
||||
|
||||
- Acceptance criteria (from story file OR inline)
|
||||
- Implemented test suite (or acknowledged gaps)
|
||||
|
||||
**Recommended:**
|
||||
### Phase 2 - Gate Decision (Required if `enable_gate_decision: true`)
|
||||
|
||||
- Test execution results (CI/CD test reports, pass/fail rates)
|
||||
- Test design with risk priorities (P0/P1/P2/P3)
|
||||
|
||||
### Recommended
|
||||
|
||||
- `test-design.md` - Risk assessment and test priorities
|
||||
- `nfr-assessment.md` - Non-functional requirements validation (for release gates)
|
||||
- `tech-spec.md` - Technical implementation details
|
||||
- Test framework configuration (playwright.config.ts, jest.config.js)
|
||||
|
||||
@@ -59,12 +83,13 @@ Use `*trace` when you need to:
|
||||
|
||||
- Story lacks any tests AND gaps are not acknowledged → Run `*atdd` first
|
||||
- Acceptance criteria are completely missing → Provide criteria or story file
|
||||
- Phase 2 enabled but test execution results missing → Warn and skip gate decision
|
||||
|
||||
---
|
||||
|
||||
## Usage
|
||||
|
||||
### Basic Usage (BMad Mode)
|
||||
### Basic Usage (Both Phases)
|
||||
|
||||
```bash
|
||||
bmad tea *trace
|
||||
@@ -72,16 +97,15 @@ bmad tea *trace
|
||||
|
||||
The workflow will:
|
||||
|
||||
1. Read story file from `bmad/output/story-X.X.md`
|
||||
2. Extract acceptance criteria
|
||||
3. Auto-discover tests for this story
|
||||
4. Generate traceability matrix
|
||||
5. Save to `bmad/output/traceability-matrix.md`
|
||||
1. **Phase 1**: Read story file, extract acceptance criteria, auto-discover tests, generate traceability matrix
|
||||
2. **Phase 2**: Load test execution results, apply decision rules, generate gate decision document
|
||||
3. Save traceability matrix to `bmad/output/traceability-matrix.md`
|
||||
4. Save gate decision to `bmad/output/gate-decision-story-X.X.md`
|
||||
|
||||
### Standalone Mode (No Story File)
|
||||
### Phase 1 Only (Skip Gate Decision)
|
||||
|
||||
```bash
|
||||
bmad tea *trace --acceptance-criteria "AC-1: User can login with email..."
|
||||
bmad tea *trace --enable-gate-decision false
|
||||
```
|
||||
|
||||
### Custom Configuration
|
||||
@@ -89,15 +113,25 @@ bmad tea *trace --acceptance-criteria "AC-1: User can login with email..."
|
||||
```bash
|
||||
bmad tea *trace \
|
||||
--story-file "bmad/output/story-1.3.md" \
|
||||
--output-file "docs/qa/trace-1.3.md" \
|
||||
--test-results "ci-artifacts/test-report.xml" \
|
||||
--min-p0-coverage 100 \
|
||||
--min-p1-coverage 90
|
||||
--min-p1-coverage 90 \
|
||||
--min-p0-pass-rate 100 \
|
||||
--min-p1-pass-rate 95
|
||||
```
|
||||
|
||||
### Standalone Mode (No Story File)
|
||||
|
||||
```bash
|
||||
bmad tea *trace --acceptance-criteria "AC-1: User can login with email..."
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Workflow Steps
|
||||
|
||||
### PHASE 1: Requirements Traceability
|
||||
|
||||
1. **Load Context** - Read story, test design, tech spec, knowledge base
|
||||
2. **Discover Tests** - Auto-find tests related to story (by ID, describe blocks, file paths)
|
||||
3. **Map Criteria** - Link acceptance criteria to specific test cases
|
||||
@@ -105,11 +139,18 @@ bmad tea *trace \
|
||||
5. **Verify Quality** - Check test quality (assertions, structure, performance)
|
||||
6. **Generate Deliverables** - Create traceability matrix, gate YAML, coverage badge
|
||||
|
||||
### PHASE 2: Quality Gate Decision (if `enable_gate_decision: true`)
|
||||
|
||||
7. **Gather Evidence** - Load traceability results, test execution reports, NFR assessments
|
||||
8. **Apply Decision Rules** - Evaluate against thresholds (PASS/CONCERNS/FAIL/WAIVED)
|
||||
9. **Document Decision** - Create gate decision document with evidence and rationale
|
||||
10. **Update Status & Notify** - Append to bmm-workflow-status.md, notify stakeholders
|
||||
|
||||
---
|
||||
|
||||
## Outputs
|
||||
|
||||
### Traceability Matrix (`traceability-matrix.md`)
|
||||
### Phase 1: Traceability Matrix (`traceability-matrix.md`)
|
||||
|
||||
Comprehensive markdown file with:
|
||||
|
||||
@@ -119,32 +160,136 @@ Comprehensive markdown file with:
|
||||
- Quality assessment for each test
|
||||
- Gate YAML snippet
|
||||
|
||||
### Gate YAML Snippet
|
||||
**Example:**
|
||||
|
||||
```yaml
|
||||
traceability:
|
||||
story_id: '1.3'
|
||||
coverage:
|
||||
overall: 85%
|
||||
p0: 100%
|
||||
p1: 90%
|
||||
gaps:
|
||||
critical: 0
|
||||
high: 1
|
||||
status: 'PASS'
|
||||
```markdown
|
||||
# Traceability Matrix - Story 1.3
|
||||
|
||||
## Coverage Summary
|
||||
|
||||
| Priority | Total | FULL | Coverage % | Status |
|
||||
| -------- | ----- | ---- | ---------- | ------- |
|
||||
| P0 | 3 | 3 | 100% | ✅ PASS |
|
||||
| P1 | 5 | 4 | 80% | ⚠️ WARN |
|
||||
|
||||
Gate Status: CONCERNS ⚠️ (P1 coverage below 90%)
|
||||
```
|
||||
|
||||
### Updated Story File (Optional)
|
||||
### Phase 2: Gate Decision Document (`gate-decision-{type}-{id}.md`)
|
||||
|
||||
Adds "Traceability" section to story markdown with:
|
||||
**Decision Document** with:
|
||||
|
||||
- Link to traceability matrix
|
||||
- Coverage summary
|
||||
- Gate status
|
||||
- **Decision**: PASS / CONCERNS / FAIL / WAIVED with clear rationale
|
||||
- **Evidence Summary**: Test results, coverage, NFRs, quality validation
|
||||
- **Decision Criteria Table**: Each criterion with threshold, actual, status
|
||||
- **Rationale**: Explanation of decision based on evidence
|
||||
- **Residual Risks**: Unresolved issues (for CONCERNS/WAIVED)
|
||||
- **Waiver Details**: Approver, justification, remediation plan (for WAIVED)
|
||||
- **Next Steps**: Action items for each decision type
|
||||
|
||||
**Example:**
|
||||
|
||||
```markdown
|
||||
# Quality Gate Decision: Story 1.3 - User Login
|
||||
|
||||
**Decision**: ⚠️ CONCERNS
|
||||
**Date**: 2025-10-15
|
||||
|
||||
## Decision Criteria
|
||||
|
||||
| Criterion | Threshold | Actual | Status |
|
||||
| ------------ | --------- | ------ | ------- |
|
||||
| P0 Coverage | ≥100% | 100% | ✅ PASS |
|
||||
| P1 Coverage | ≥90% | 88% | ⚠️ FAIL |
|
||||
| Overall Pass | ≥90% | 96% | ✅ PASS |
|
||||
|
||||
**Decision**: CONCERNS (P1 coverage 88% below 90% threshold)
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Deploy with monitoring
|
||||
- Create follow-up story for AC-5 test
|
||||
```
|
||||
|
||||
### Secondary Outputs
|
||||
|
||||
- **Gate YAML**: Machine-readable snippet for CI/CD integration
|
||||
- **Status Update**: Appends decision to `bmm-workflow-status.md` history
|
||||
- **Stakeholder Notification**: Auto-generated summary message
|
||||
- **Updated Story File**: Traceability section added (optional)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Classifications
|
||||
## Decision Logic (Phase 2)
|
||||
|
||||
### PASS Decision ✅
|
||||
|
||||
**All criteria met:**
|
||||
|
||||
- ✅ P0 coverage ≥ 100%
|
||||
- ✅ P1 coverage ≥ 90%
|
||||
- ✅ Overall coverage ≥ 80%
|
||||
- ✅ P0 test pass rate = 100%
|
||||
- ✅ P1 test pass rate ≥ 95%
|
||||
- ✅ Overall test pass rate ≥ 90%
|
||||
- ✅ Security issues = 0
|
||||
- ✅ Critical NFR failures = 0
|
||||
|
||||
**Action:** Deploy to production with standard monitoring
|
||||
|
||||
---
|
||||
|
||||
### CONCERNS Decision ⚠️
|
||||
|
||||
**P0 criteria met, but P1 criteria degraded:**
|
||||
|
||||
- ✅ P0 coverage = 100%
|
||||
- ⚠️ P1 coverage 80-89% (below 90% threshold)
|
||||
- ⚠️ P1 test pass rate 90-94% (below 95% threshold)
|
||||
- ✅ No security issues
|
||||
- ✅ No critical NFR failures
|
||||
|
||||
**Residual Risks:** Minor P1 issues, edge cases, non-critical gaps
|
||||
|
||||
**Action:** Deploy with enhanced monitoring, create backlog stories for fixes
|
||||
|
||||
**Note:** CONCERNS does NOT block deployment but requires acknowledgment
|
||||
|
||||
---
|
||||
|
||||
### FAIL Decision ❌
|
||||
|
||||
**Any P0 criterion failed:**
|
||||
|
||||
- ❌ P0 coverage <100% (missing critical tests)
|
||||
- OR ❌ P0 test pass rate <100% (failing critical tests)
|
||||
- OR ❌ P1 coverage <80% (significant gap)
|
||||
- OR ❌ Security issues >0
|
||||
- OR ❌ Critical NFR failures >0
|
||||
|
||||
**Critical Blockers:** P0 test failures, security vulnerabilities, critical NFR failures
|
||||
|
||||
**Action:** Block deployment, fix critical issues, re-run gate after fixes
|
||||
|
||||
---
|
||||
|
||||
### WAIVED Decision 🔓
|
||||
|
||||
**FAIL status + business-approved waiver:**
|
||||
|
||||
- ❌ Original decision: FAIL
|
||||
- 🔓 Waiver approved by: {VP Engineering / CTO / Product Owner}
|
||||
- 📋 Business justification: {regulatory deadline, contractual obligation}
|
||||
- 📅 Waiver expiry: {date - does NOT apply to future releases}
|
||||
- 🔧 Remediation plan: {fix in next release, due date}
|
||||
|
||||
**Action:** Deploy with business approval, aggressive monitoring, fix ASAP
|
||||
|
||||
**Important:** Waivers NEVER apply to P0 security issues or data corruption risks
|
||||
|
||||
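Taken together, the rules above form a deterministic evaluation. The sketch below is illustrative only: field names are assumed, WAIVED is treated as a manual override applied on top of FAIL, and the thresholds mirror the balanced (default) configuration.

```typescript
// Non-normative sketch of the Phase 2 decision rules with default thresholds.
interface GateEvidence {
  p0Coverage: number;      // %
  p1Coverage: number;      // %
  overallCoverage: number; // %
  p0PassRate: number;      // %
  p1PassRate: number;      // %
  overallPassRate: number; // %
  securityIssues: number;
  criticalNfrFailures: number;
}

type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL';

function decideGate(e: GateEvidence): GateDecision {
  // Any P0-level criterion failing blocks the release.
  // A FAIL can only become WAIVED through an explicit, approved business waiver.
  const p0Blocked =
    e.p0Coverage < 100 ||
    e.p0PassRate < 100 ||
    e.p1Coverage < 80 ||
    e.securityIssues > 0 ||
    e.criticalNfrFailures > 0;
  if (p0Blocked) return 'FAIL';

  // P1-level degradation: deploy with monitoring and follow-up stories.
  const p1Degraded =
    e.p1Coverage < 90 ||
    e.p1PassRate < 95 ||
    e.overallPassRate < 90 ||
    e.overallCoverage < 80;
  if (p1Degraded) return 'CONCERNS';

  return 'PASS';
}
```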
---
|
||||
|
||||
## Coverage Classifications (Phase 1)
|
||||
|
||||
- **FULL** ✅ - All scenarios validated at appropriate level(s)
|
||||
- **PARTIAL** ⚠️ - Some coverage but missing edge cases or levels
|
||||
@@ -156,12 +301,12 @@ Adds "Traceability" section to story markdown with:
|
||||
|
||||
## Quality Gates
|
||||
|
||||
| Priority | Coverage Requirement | Severity | Action |
|
||||
| -------- | -------------------- | -------- | ------------------ |
|
||||
| P0 | 100% | BLOCKER | Do not release |
|
||||
| P1 | 90% | HIGH | Block PR merge |
|
||||
| P2 | 80% (recommended) | MEDIUM | Address in nightly |
|
||||
| P3 | No requirement | LOW | Optional |
|
||||
| Priority | Coverage Requirement | Pass Rate Requirement | Severity | Action |
|
||||
| -------- | -------------------- | --------------------- | -------- | ------------------ |
|
||||
| P0 | 100% | 100% | BLOCKER | Do not release |
|
||||
| P1 | 90% | 95% | HIGH | Block PR merge |
|
||||
| P2 | 80% (recommended) | 85% (recommended) | MEDIUM | Address in nightly |
|
||||
| P3 | No requirement | No requirement | LOW | Optional |
|
||||
|
||||
---
|
||||
|
||||
@@ -196,10 +341,47 @@ variables:
|
||||
generate_coverage_badge: true
|
||||
update_story_file: true
|
||||
|
||||
# Quality gates
|
||||
# Quality gates (Phase 1 recommendations)
|
||||
min_p0_coverage: 100
|
||||
min_p1_coverage: 90
|
||||
min_overall_coverage: 80
|
||||
|
||||
# PHASE 2: Gate Decision Variables
|
||||
enable_gate_decision: true # Run gate decision after traceability
|
||||
|
||||
# Gate target specification
|
||||
gate_type: 'story' # story | epic | release | hotfix
|
||||
|
||||
# Gate decision configuration
|
||||
decision_mode: 'deterministic' # deterministic | manual
|
||||
allow_waivers: true
|
||||
require_evidence: true
|
||||
|
||||
# Input sources for gate
|
||||
nfr_file: '' # Path to nfr-assessment.md (optional)
|
||||
test_results: '' # Path to test execution results (required for Phase 2)
|
||||
|
||||
# Decision criteria thresholds
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 95
|
||||
min_overall_pass_rate: 90
|
||||
max_critical_nfrs_fail: 0
|
||||
max_security_issues: 0
|
||||
|
||||
# Risk tolerance
|
||||
allow_p2_failures: true
|
||||
allow_p3_failures: true
|
||||
escalate_p1_failures: true
|
||||
|
||||
# Gate output configuration
|
||||
gate_output_file: '{output_folder}/gate-decision-{gate_type}-{story_id}.md'
|
||||
append_to_history: true
|
||||
notify_stakeholders: true
|
||||
|
||||
# Advanced gate options
|
||||
check_all_workflows_complete: true
|
||||
validate_evidence_freshness: true
|
||||
require_sign_off: false
|
||||
```
|
||||
|
||||
---
|
||||
@@ -208,24 +390,34 @@ variables:
|
||||
|
||||
This workflow automatically loads relevant knowledge fragments:
|
||||
|
||||
**Phase 1 (Traceability):**
|
||||
|
||||
- `traceability.md` - Requirements mapping patterns
|
||||
- `test-priorities.md` - P0/P1/P2/P3 risk framework
|
||||
- `risk-governance.md` - Risk-based testing approach
|
||||
- `test-quality.md` - Definition of Done for tests
|
||||
- `selective-testing.md` - Duplicate coverage patterns
|
||||
|
||||
**Phase 2 (Gate Decision):**
|
||||
|
||||
- `risk-governance.md` - Quality gate criteria and decision framework
|
||||
- `probability-impact.md` - Risk scoring for residual risks
|
||||
- `test-quality.md` - Quality standards validation
|
||||
- `test-priorities.md` - Priority classification framework
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
## Example Scenarios
|
||||
|
||||
### Example 1: Full Coverage Validation
|
||||
### Example 1: Full Coverage with Gate PASS
|
||||
|
||||
```bash
|
||||
# Validate P0/P1 coverage before PR merge
|
||||
bmad tea *trace --story-file "bmad/output/story-1.3.md"
|
||||
# Validate coverage and make gate decision
|
||||
bmad tea *trace --story-file "bmad/output/story-1.3.md" \
|
||||
--test-results "ci-artifacts/test-report.xml"
|
||||
```
|
||||
|
||||
**Output:**
|
||||
**Phase 1 Output:**
|
||||
|
||||
```markdown
|
||||
# Traceability Matrix - Story 1.3
|
||||
@@ -237,17 +429,42 @@ bmad tea *trace --story-file "bmad/output/story-1.3.md"
|
||||
| P0 | 3 | 3 | 100% | ✅ PASS |
|
||||
| P1 | 5 | 5 | 100% | ✅ PASS |
|
||||
|
||||
Gate Status: PASS ✅
|
||||
Gate Status: Ready for Phase 2 ✅
|
||||
```
|
||||
|
||||
### Example 2: Gap Identification
|
||||
**Phase 2 Output:**
|
||||
|
||||
```markdown
|
||||
# Quality Gate Decision: Story 1.3
|
||||
|
||||
**Decision**: ✅ PASS
|
||||
|
||||
Evidence:
|
||||
|
||||
- P0 Coverage: 100% ✅
|
||||
- P1 Coverage: 100% ✅
|
||||
- P0 Pass Rate: 100% (12/12 tests) ✅
|
||||
- P1 Pass Rate: 98% (45/46 tests) ✅
|
||||
- Overall Pass Rate: 96% ✅
|
||||
|
||||
Next Steps:
|
||||
|
||||
1. Deploy to staging
|
||||
2. Monitor for 24 hours
|
||||
3. Deploy to production
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Gap Identification with CONCERNS Decision
|
||||
|
||||
```bash
|
||||
# Find coverage gaps for existing feature
|
||||
bmad tea *trace --target-feature "user-authentication"
|
||||
# Find gaps and evaluate readiness
|
||||
bmad tea *trace --story-file "bmad/output/story-2.1.md" \
|
||||
--test-results "ci-artifacts/test-report.xml"
|
||||
```
|
||||
|
||||
**Output:**
|
||||
**Phase 1 Output:**
|
||||
|
||||
```markdown
|
||||
## Gap Analysis
|
||||
@@ -263,70 +480,201 @@ bmad tea *trace --target-feature "user-authentication"
|
||||
- Impact: Users may not recover accounts in error scenarios
|
||||
```
|
||||
|
||||
### Example 3: Duplicate Coverage Detection
|
||||
|
||||
```bash
|
||||
# Check for redundant tests
|
||||
bmad tea *trace --check-duplicate-coverage true
|
||||
```
|
||||
|
||||
**Output:**
|
||||
**Phase 2 Output:**
|
||||
|
||||
```markdown
|
||||
## Duplicate Coverage Detected
|
||||
# Quality Gate Decision: Story 2.1
|
||||
|
||||
⚠️ AC-1 (login validation) is tested at multiple levels:
|
||||
**Decision**: ⚠️ CONCERNS
|
||||
|
||||
- 1.3-E2E-001 (full user journey) ✅ Appropriate
|
||||
- 1.3-UNIT-001 (business logic) ✅ Appropriate
|
||||
- 1.3-COMPONENT-001 (form validation) ⚠️ Redundant with UNIT-001
|
||||
Evidence:
|
||||
|
||||
Recommendation: Remove 1.3-COMPONENT-001 or consolidate with UNIT-001
|
||||
- P0 Coverage: 100% ✅
|
||||
- P1 Coverage: 88% ⚠️ (below 90%)
|
||||
- Test Pass Rate: 96% ✅
|
||||
|
||||
Residual Risks:
|
||||
|
||||
- AC-3 missing E2E test for email error handling
|
||||
|
||||
Next Steps:
|
||||
|
||||
- Deploy with monitoring
|
||||
- Create follow-up story for AC-3 test
|
||||
- Monitor production for edge cases
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Critical Blocker with FAIL Decision
|
||||
|
||||
```bash
|
||||
# Critical issues detected
|
||||
bmad tea *trace --story-file "bmad/output/story-3.2.md" \
|
||||
--test-results "ci-artifacts/test-report.xml"
|
||||
```
|
||||
|
||||
**Phase 1 Output:**
|
||||
|
||||
```markdown
|
||||
## Gap Analysis
|
||||
|
||||
### Critical Gaps (BLOCKER)
|
||||
|
||||
1. **AC-2: Invalid login security validation**
|
||||
- Priority: P0
|
||||
- Status: NONE (no tests)
|
||||
- Impact: Security vulnerability - users can bypass login
|
||||
```
|
||||
|
||||
**Phase 2 Output:**
|
||||
|
||||
```markdown
|
||||
# Quality Gate Decision: Story 3.2
|
||||
|
||||
**Decision**: ❌ FAIL
|
||||
|
||||
Critical Blockers:
|
||||
|
||||
- P0 Coverage: 80% ❌ (AC-2 missing)
|
||||
- Security Risk: Login bypass vulnerability
|
||||
|
||||
Next Steps:
|
||||
|
||||
1. BLOCK DEPLOYMENT IMMEDIATELY
|
||||
2. Add P0 test for AC-2: 1.3-E2E-004
|
||||
3. Re-run full test suite
|
||||
4. Re-run gate after fixes verified
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Business Override with WAIVED Decision
|
||||
|
||||
```bash
|
||||
# FAIL with business waiver
|
||||
bmad tea *trace --story-file "bmad/output/release-2.4.0.md" \
|
||||
--test-results "ci-artifacts/test-report.xml" \
|
||||
--allow-waivers true
|
||||
```
|
||||
|
||||
**Phase 2 Output:**
|
||||
|
||||
```markdown
|
||||
# Quality Gate Decision: Release 2.4.0
|
||||
|
||||
**Original Decision**: ❌ FAIL
|
||||
**Final Decision**: 🔓 WAIVED
|
||||
|
||||
Waiver Details:
|
||||
|
||||
- Approver: Jane Doe, VP Engineering
|
||||
- Reason: GDPR compliance deadline (regulatory, Oct 15)
|
||||
- Expiry: 2025-10-15 (does NOT apply to v2.5.0)
|
||||
- Monitoring: Enhanced error tracking
|
||||
- Remediation: Fix in v2.4.1 hotfix (due Oct 20)
|
||||
|
||||
Business Justification:
|
||||
Release contains critical GDPR features required by law. Failed
|
||||
test affects legacy feature used by <1% of users. Workaround available.
|
||||
|
||||
Next Steps:
|
||||
|
||||
1. Deploy v2.4.0 with waiver approval
|
||||
2. Monitor error rates aggressively
|
||||
3. Fix issue in v2.4.1 (Oct 20)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "No tests found for this story"
|
||||
### Phase 1 Issues
|
||||
|
||||
#### "No tests found for this story"
|
||||
|
||||
- Run `*atdd` workflow first to generate failing acceptance tests
|
||||
- Check test file naming conventions (may not match story ID pattern)
|
||||
- Verify test directory path is correct (`test_dir` variable)
|
||||
|
||||
### "Cannot determine coverage status"
|
||||
#### "Cannot determine coverage status"
|
||||
|
||||
- Tests may lack explicit mapping (no test IDs, unclear describe blocks)
|
||||
- Add test IDs: `{STORY_ID}-{LEVEL}-{SEQ}` (e.g., `1.3-E2E-001`)
|
||||
- Use Given-When-Then narrative in test descriptions
|
||||
|
||||
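For example, a test titled with this ID convention is straightforward for trace to map back to Story 1.3 (the route, labels, and credentials below are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

// '1.3-E2E-001' = {STORY_ID}-{LEVEL}-{SEQ}; the title also carries a Given-When-Then narrative.
test('1.3-E2E-001: Given a registered user, When they submit valid credentials, Then the dashboard loads', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('user@example.com');
  await page.getByLabel('Password').fill('correct-horse-battery');
  await page.getByRole('button', { name: 'Log in' }).click();
  await expect(page).toHaveURL(/dashboard/);
});
```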
### "P0 coverage below 100%"
|
||||
#### "P0 coverage below 100%"
|
||||
|
||||
- This is a **BLOCKER** - do not release
|
||||
- Identify missing P0 tests in gap analysis
|
||||
- Run `*atdd` workflow to generate missing tests
|
||||
- Verify P0 classification is correct with stakeholders
|
||||
|
||||
### "Duplicate coverage detected"
|
||||
#### "Duplicate coverage detected"
|
||||
|
||||
- Review `selective-testing.md` knowledge fragment
|
||||
- Determine if overlap is acceptable (defense in depth) or wasteful
|
||||
- Consolidate tests at appropriate level (logic → unit, journey → E2E)
|
||||
|
||||
### Phase 2 Issues
|
||||
|
||||
#### "Test execution results missing"
|
||||
|
||||
- Phase 2 gate decision requires `test_results` (CI/CD test reports)
|
||||
- If missing, Phase 2 will be skipped with warning
|
||||
- Provide JUnit XML, TAP, or JSON test report path via `test_results` variable
|
||||
|
||||
#### "Gate decision is FAIL but deployment needed urgently"
|
||||
|
||||
- Request business waiver (if `allow_waivers: true`)
|
||||
- Document approver, justification, mitigation plan
|
||||
- Create follow-up stories to address gaps
|
||||
- Use WAIVED decision only for non-P0 gaps
|
||||
- **Never waive**: Security issues, data corruption risks
|
||||
|
||||
#### "Assessments are stale (>7 days old)"
|
||||
|
||||
- Re-run `*test-design` workflow
|
||||
- Re-run traceability (Phase 1)
|
||||
- Re-run `*nfr-assess` workflow
|
||||
- Update evidence files before gate decision
|
||||
|
||||
#### "Unclear decision (edge case)"
|
||||
|
||||
- Switch to manual mode: `decision_mode: manual`
|
||||
- Document assumptions and rationale clearly
|
||||
- Escalate to tech lead or architect for guidance
|
||||
- Consider waiver if business-critical
|
||||
|
||||
---
|
||||
|
||||
## Integration with Other Workflows
|
||||
|
||||
- **testarch-test-design** → `*trace` - Define priorities, then trace coverage
|
||||
- **testarch-atdd** → `*trace` - Generate tests, then validate coverage
|
||||
- `*trace` → **testarch-automate** - Identify gaps, then automate regression
|
||||
- `*trace` → **testarch-gate** - Generate metrics, then apply quality gates
|
||||
- `*trace` → **testarch-test-review** - Flag quality issues, then review tests
|
||||
### Before Trace
|
||||
|
||||
1. **testarch-test-design** - Define test priorities (P0/P1/P2/P3)
|
||||
2. **testarch-atdd** - Generate failing acceptance tests
|
||||
3. **testarch-automate** - Expand regression suite
|
||||
|
||||
### After Trace (Phase 2 Decision)
|
||||
|
||||
- **PASS**: Proceed to deployment workflow
|
||||
- **CONCERNS**: Deploy with monitoring, create remediation backlog stories
|
||||
- **FAIL**: Block deployment, fix issues, re-run trace workflow
|
||||
- **WAIVED**: Deploy with business approval, escalate monitoring
|
||||
|
||||
### Complements
|
||||
|
||||
- `*trace` → **testarch-nfr-assess** - Use NFR validation in gate decision
|
||||
- `*trace` → **testarch-test-review** - Flag quality issues for review
|
||||
- **CI/CD Pipeline** - Use gate YAML for automated quality gates
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Phase 1 - Traceability
|
||||
|
||||
1. **Run Trace After Test Implementation**
|
||||
- Don't run `*trace` before tests exist (run `*atdd` first)
|
||||
- Trace is most valuable after initial test suite is written
|
||||
@@ -346,26 +694,105 @@ Recommendation: Remove 1.3-COMPONENT-001 or consolidate with UNIT-001
|
||||
- Unit tests for logic, E2E for journeys
|
||||
- Only overlap for defense in depth on critical paths
|
||||
|
||||
5. **Generate Gate-Ready Artifacts**
|
||||
### Phase 2 - Gate Decision
|
||||
|
||||
5. **Evidence is King**
|
||||
- Never make gate decisions without fresh test results
|
||||
- Validate evidence freshness (<7 days old)
|
||||
- Link to all evidence sources (reports, logs, artifacts)
|
||||
|
||||
6. **P0 is Sacred**
|
||||
- P0 failures ALWAYS result in FAIL (no exceptions except waivers)
|
||||
- P0 = Critical user journeys, security, data integrity
|
||||
- Waivers require VP/CTO approval + business justification
|
||||
|
||||
7. **Waivers are Temporary**
|
||||
- Waiver applies ONLY to specific release
|
||||
- Issue must be fixed in next release
|
||||
- Never waive: security, data corruption, compliance violations
|
||||
|
||||
8. **CONCERNS is Not PASS**
|
||||
- CONCERNS means "deploy with monitoring"
|
||||
- Create follow-up stories for issues
|
||||
- Do not ignore CONCERNS repeatedly
|
||||
|
||||
9. **Automate Gate Integration**
|
||||
- Enable `generate_gate_yaml` for CI/CD integration
|
||||
- Use YAML snippets in pipeline quality gates
|
||||
- Export metrics for dashboard visualization
|
||||
|
||||
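As a rough illustration, a CI step could parse the generated gate YAML and block the pipeline on FAIL. The file path and the top-level `decision` field are assumptions here, not a documented schema:

```typescript
// Sketch of a CI quality-gate step consuming the gate YAML (schema assumed).
import { readFileSync } from 'node:fs';
import { load } from 'js-yaml';

const gate = load(readFileSync('bmad/output/gate-decision-story-1.3.yaml', 'utf8')) as {
  decision?: 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';
};

if (gate.decision === 'FAIL') {
  console.error('Quality gate FAIL - blocking deployment');
  process.exit(1);
}
if (gate.decision === 'CONCERNS') {
  console.warn('Quality gate CONCERNS - deploy with enhanced monitoring and follow-up stories');
}
```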
---
|
||||
|
||||
## Configuration Examples
|
||||
|
||||
### Strict Gate (Zero Tolerance)
|
||||
|
||||
```yaml
|
||||
min_p0_coverage: 100
|
||||
min_p1_coverage: 100
|
||||
min_overall_coverage: 90
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 100
|
||||
min_overall_pass_rate: 95
|
||||
allow_waivers: false
|
||||
max_security_issues: 0
|
||||
max_critical_nfrs_fail: 0
|
||||
```
|
||||
|
||||
Use for: Financial systems, healthcare, security-critical features
|
||||
|
||||
---
|
||||
|
||||
### Balanced Gate (Production Standard - Default)
|
||||
|
||||
```yaml
|
||||
min_p0_coverage: 100
|
||||
min_p1_coverage: 90
|
||||
min_overall_coverage: 80
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 95
|
||||
min_overall_pass_rate: 90
|
||||
allow_waivers: true
|
||||
max_security_issues: 0
|
||||
max_critical_nfrs_fail: 0
|
||||
```
|
||||
|
||||
Use for: Most production releases
|
||||
|
||||
---
|
||||
|
||||
### Relaxed Gate (Early Development)
|
||||
|
||||
```yaml
|
||||
min_p0_coverage: 100
|
||||
min_p1_coverage: 80
|
||||
min_overall_coverage: 70
|
||||
min_p0_pass_rate: 100
|
||||
min_p1_pass_rate: 85
|
||||
min_overall_pass_rate: 80
|
||||
allow_waivers: true
|
||||
allow_p2_failures: true
|
||||
allow_p3_failures: true
|
||||
```
|
||||
|
||||
Use for: Alpha/beta releases, internal tools, proof-of-concept
|
||||
|
||||
---
|
||||
|
||||
## Related Commands
|
||||
|
||||
- `bmad tea *test-design` - Define test priorities and risk assessment
|
||||
- `bmad tea *atdd` - Generate failing acceptance tests for gaps
|
||||
- `bmad tea *automate` - Expand regression suite based on gaps
|
||||
- `bmad tea *gate` - Apply quality gates using traceability metrics
|
||||
- `bmad tea *nfr-assess` - Validate non-functional requirements (for gate)
|
||||
- `bmad tea *test-review` - Review test quality issues flagged by trace
|
||||
- `bmad sm story-approved` - Mark story as complete (triggers gate)
|
||||
|
||||
---
|
||||
|
||||
## Resources
|
||||
|
||||
- [Instructions](./instructions.md) - Detailed workflow steps
|
||||
- [Instructions](./instructions.md) - Detailed workflow steps (both phases)
|
||||
- [Checklist](./checklist.md) - Validation checklist
|
||||
- [Template](./trace-template.md) - Traceability matrix template
|
||||
- [Knowledge Base](../../testarch/knowledge/) - Testing best practices
|
||||
|
||||
@@ -1,10 +1,17 @@
|
||||
# Requirements Traceability - Validation Checklist
|
||||
# Requirements Traceability & Gate Decision - Validation Checklist
|
||||
|
||||
**Workflow:** `testarch-trace`
|
||||
**Purpose:** Ensure complete and accurate traceability matrix with actionable gap analysis
|
||||
**Purpose:** Ensure complete traceability matrix with actionable gap analysis AND make deployment readiness decision (PASS/CONCERNS/FAIL/WAIVED)
|
||||
|
||||
This checklist covers **two sequential phases**:
|
||||
|
||||
- **PHASE 1**: Requirements Traceability (always executed)
|
||||
- **PHASE 2**: Quality Gate Decision (executed if `enable_gate_decision: true`)
|
||||
|
||||
---
|
||||
|
||||
# PHASE 1: REQUIREMENTS TRACEABILITY
|
||||
|
||||
## Prerequisites Validation
|
||||
|
||||
- [ ] Acceptance criteria are available (from story file OR inline)
|
||||
@@ -114,15 +121,6 @@
|
||||
|
||||
---
|
||||
|
||||
## Quality Gate Validation
|
||||
|
||||
- [ ] P0 coverage >= 100% (required) ✅ or BLOCKER documented ❌
|
||||
- [ ] P1 coverage >= 90% (recommended) ✅ or HIGH priority gap documented ⚠️
|
||||
- [ ] Overall coverage >= 80% (recommended) ✅ or MEDIUM priority gap documented ⚠️
|
||||
- [ ] Gate status determined: PASS / WARN / FAIL
|
||||
|
||||
---
|
||||
|
||||
## Test Quality Verification
|
||||
|
||||
For each mapped test, verify:
|
||||
@@ -149,7 +147,7 @@ Knowledge fragments referenced:
|
||||
|
||||
---
|
||||
|
||||
## Deliverables Generated
|
||||
## Phase 1 Deliverables Generated
|
||||
|
||||
### Traceability Matrix Markdown
|
||||
|
||||
@@ -161,15 +159,6 @@ Knowledge fragments referenced:
|
||||
- [ ] Quality assessment section included
|
||||
- [ ] Recommendations section included
|
||||
|
||||
### Gate YAML Snippet (if enabled)
|
||||
|
||||
- [ ] YAML snippet generated
|
||||
- [ ] Story ID included
|
||||
- [ ] Coverage metrics included (overall, p0, p1, p2)
|
||||
- [ ] Gap counts included (critical, high, medium, low)
|
||||
- [ ] Status included (PASS / WARN / FAIL)
|
||||
- [ ] Recommendations included
|
||||
|
||||
### Coverage Badge/Metric (if enabled)
|
||||
|
||||
- [ ] Badge markdown generated
|
||||
@@ -180,11 +169,10 @@ Knowledge fragments referenced:
|
||||
- [ ] "Traceability" section added to story markdown
|
||||
- [ ] Link to traceability matrix included
|
||||
- [ ] Coverage summary included
|
||||
- [ ] Gate status included
|
||||
|
||||
---
|
||||
|
||||
## Quality Assurance
|
||||
## Phase 1 Quality Assurance
|
||||
|
||||
### Accuracy Checks
|
||||
|
||||
@@ -213,6 +201,370 @@ Knowledge fragments referenced:
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 Documentation
|
||||
|
||||
- [ ] Traceability matrix is readable and well-formatted
|
||||
- [ ] Tables render correctly in markdown
|
||||
- [ ] Code blocks have proper syntax highlighting
|
||||
- [ ] Links are valid and accessible
|
||||
- [ ] Recommendations are clear and prioritized
|
||||
|
||||
---
|
||||
|
||||
# PHASE 2: QUALITY GATE DECISION
|
||||
|
||||
**Note**: Phase 2 executes only if `enable_gate_decision: true` in workflow.yaml
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Evidence Gathering
|
||||
|
||||
- [ ] Test execution results obtained (CI/CD pipeline, test framework reports)
|
||||
- [ ] Story/epic/release file identified and read
|
||||
- [ ] Test design document discovered or explicitly provided (if available)
|
||||
- [ ] Traceability matrix discovered or explicitly provided (available from Phase 1)
|
||||
- [ ] NFR assessment discovered or explicitly provided (if available)
|
||||
- [ ] Code coverage report discovered or explicitly provided (if available)
|
||||
- [ ] Burn-in results discovered or explicitly provided (if available)
|
||||
|
||||
### Evidence Validation
|
||||
|
||||
- [ ] Evidence freshness validated (warn if >7 days old, recommend re-running workflows)
|
||||
- [ ] All required assessments available or user acknowledged gaps
|
||||
- [ ] Test results are complete (not partial or interrupted runs)
|
||||
- [ ] Test results match current codebase (not from outdated branch)
|
||||
|
||||
### Knowledge Base Loading
|
||||
|
||||
- [ ] `risk-governance.md` loaded successfully
|
||||
- [ ] `probability-impact.md` loaded successfully
|
||||
- [ ] `test-quality.md` loaded successfully
|
||||
- [ ] `test-priorities.md` loaded successfully
|
||||
- [ ] `ci-burn-in.md` loaded (if burn-in results available)
|
||||
|
||||
---
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Context Loading
|
||||
|
||||
- [ ] Gate type identified (story/epic/release/hotfix)
|
||||
- [ ] Target ID extracted (story_id, epic_num, or release_version)
|
||||
- [ ] Decision thresholds loaded from workflow variables
|
||||
- [ ] Risk tolerance configuration loaded
|
||||
- [ ] Waiver policy loaded
|
||||
|
||||
### Step 2: Evidence Parsing
|
||||
|
||||
**Test Results:**
|
||||
|
||||
- [ ] Total test count extracted
|
||||
- [ ] Passed test count extracted
|
||||
- [ ] Failed test count extracted
|
||||
- [ ] Skipped test count extracted
|
||||
- [ ] Test duration extracted
|
||||
- [ ] P0 test pass rate calculated
|
||||
- [ ] P1 test pass rate calculated
|
||||
- [ ] Overall test pass rate calculated
|
||||
|
||||
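The pass-rate figures above are simple ratios over the parsed report. A minimal sketch (report path and shape are assumed; adapt to your reporter output):

```typescript
import { readFileSync } from 'node:fs';

interface TestResult { id: string; priority: 'P0' | 'P1' | 'P2' | 'P3'; passed: boolean }

// Hypothetical JSON report; JUnit XML or TAP would need their own parsing step.
const results: TestResult[] = JSON.parse(readFileSync('ci-artifacts/test-report.json', 'utf8'));

const passRate = (tests: TestResult[]) =>
  tests.length === 0 ? 100 : Math.round((100 * tests.filter((t) => t.passed).length) / tests.length);

const p0PassRate = passRate(results.filter((t) => t.priority === 'P0'));
const p1PassRate = passRate(results.filter((t) => t.priority === 'P1'));
const overallPassRate = passRate(results);
```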
**Quality Assessments:**
|
||||
|
||||
- [ ] P0/P1/P2/P3 scenarios extracted from test-design.md (if available)
|
||||
- [ ] Risk scores extracted from test-design.md (if available)
|
||||
- [ ] Coverage percentages extracted from traceability-matrix.md (available from Phase 1)
|
||||
- [ ] Coverage gaps extracted from traceability-matrix.md (available from Phase 1)
|
||||
- [ ] NFR status extracted from nfr-assessment.md (if available)
|
||||
- [ ] Security issues count extracted from nfr-assessment.md (if available)
|
||||
|
||||
**Code Coverage:**
|
||||
|
||||
- [ ] Line coverage percentage extracted (if available)
|
||||
- [ ] Branch coverage percentage extracted (if available)
|
||||
- [ ] Function coverage percentage extracted (if available)
|
||||
- [ ] Critical path coverage validated (if available)
|
||||
|
||||
**Burn-in Results:**
|
||||
|
||||
- [ ] Burn-in iterations count extracted (if available)
|
||||
- [ ] Flaky tests count extracted (if available)
|
||||
- [ ] Stability score calculated (if available)
|
||||
|
||||
### Step 3: Decision Rules Application
|
||||
|
||||
**P0 Criteria Evaluation:**
|
||||
|
||||
- [ ] P0 test pass rate evaluated (must be 100%)
|
||||
- [ ] P0 acceptance criteria coverage evaluated (must be 100%)
|
||||
- [ ] Security issues count evaluated (must be 0)
|
||||
- [ ] Critical NFR failures evaluated (must be 0)
|
||||
- [ ] Flaky tests evaluated (must be 0 if burn-in enabled)
|
||||
- [ ] P0 decision recorded: PASS or FAIL
|
||||
|
||||
**P1 Criteria Evaluation:**
|
||||
|
||||
- [ ] P1 test pass rate evaluated (threshold: min_p1_pass_rate)
|
||||
- [ ] P1 acceptance criteria coverage evaluated (threshold: min_p1_coverage, 90% by default)
|
||||
- [ ] Overall test pass rate evaluated (threshold: min_overall_pass_rate)
|
||||
- [ ] Code coverage evaluated (threshold: min_coverage)
|
||||
- [ ] P1 decision recorded: PASS or CONCERNS
|
||||
|
||||
**P2/P3 Criteria Evaluation:**
|
||||
|
||||
- [ ] P2 failures tracked (informational, don't block if allow_p2_failures: true)
|
||||
- [ ] P3 failures tracked (informational, don't block if allow_p3_failures: true)
|
||||
- [ ] Residual risks documented
|
||||
|
||||
**Final Decision:**
|
||||
|
||||
- [ ] Decision determined: PASS / CONCERNS / FAIL / WAIVED
|
||||
- [ ] Decision rationale documented
|
||||
- [ ] Decision is deterministic (follows rules, not arbitrary)
|
||||
|
||||
### Step 4: Documentation
|
||||
|
||||
**Gate Decision Document Created:**
|
||||
|
||||
- [ ] Story/epic/release info section complete (ID, title, description, links)
|
||||
- [ ] Decision clearly stated (PASS / CONCERNS / FAIL / WAIVED)
|
||||
- [ ] Decision date recorded
|
||||
- [ ] Evaluator recorded (user or agent name)
|
||||
|
||||
**Evidence Summary Documented:**
|
||||
|
||||
- [ ] Test results summary complete (total, passed, failed, pass rates)
|
||||
- [ ] Coverage summary complete (P0/P1 criteria, code coverage)
|
||||
- [ ] NFR validation summary complete (security, performance, reliability, maintainability)
|
||||
- [ ] Flakiness summary complete (burn-in iterations, flaky test count)
|
||||
|
||||
**Rationale Documented:**
|
||||
|
||||
- [ ] Decision rationale clearly explained
|
||||
- [ ] Key evidence highlighted
|
||||
- [ ] Assumptions and caveats noted (if any)
|
||||
|
||||
**Residual Risks Documented (if CONCERNS or WAIVED):**
|
||||
|
||||
- [ ] Unresolved P1/P2 issues listed
|
||||
- [ ] Probability × impact estimated for each risk
|
||||
- [ ] Mitigations or workarounds described
|
||||
|
||||
**Waivers Documented (if WAIVED):**
|
||||
|
||||
- [ ] Waiver reason documented (business justification)
|
||||
- [ ] Waiver approver documented (name, role)
|
||||
- [ ] Waiver expiry date documented
|
||||
- [ ] Remediation plan documented (fix in next release, due date)
|
||||
- [ ] Monitoring plan documented
|
||||
|
||||
**Critical Issues Documented (if FAIL or CONCERNS):**
|
||||
|
||||
- [ ] Top 5-10 critical issues listed
|
||||
- [ ] Priority assigned to each issue (P0/P1/P2)
|
||||
- [ ] Owner assigned to each issue
|
||||
- [ ] Due date assigned to each issue
|
||||
|
||||
**Recommendations Documented:**
|
||||
|
||||
- [ ] Next steps clearly stated for decision type
|
||||
- [ ] Deployment recommendation provided
|
||||
- [ ] Monitoring recommendations provided (if applicable)
|
||||
- [ ] Remediation recommendations provided (if applicable)
|
||||
|
||||
### Step 5: Status Updates and Notifications
|
||||
|
||||
**Status File Updated:**
|
||||
|
||||
- [ ] Gate decision appended to bmm-workflow-status.md (if append_to_history: true)
|
||||
- [ ] Format correct: `[DATE] Gate Decision: DECISION - Target {ID} - {rationale}`
|
||||
- [ ] Status file committed or staged for commit
|
||||
|
||||
**Gate YAML Created:**
|
||||
|
||||
- [ ] Gate YAML snippet generated with decision and criteria
|
||||
- [ ] Evidence references included in YAML
|
||||
- [ ] Next steps included in YAML
|
||||
- [ ] YAML file saved to output folder
|
||||
|
||||
**Stakeholder Notification Generated:**
|
||||
|
||||
- [ ] Notification subject line created
|
||||
- [ ] Notification body created with summary
|
||||
- [ ] Recipients identified (PM, SM, DEV lead, stakeholders)
|
||||
- [ ] Notification ready for delivery (if notify_stakeholders: true)
|
||||
|
||||
**Outputs Saved:**
|
||||
|
||||
- [ ] Gate decision document saved to `{output_file}`
|
||||
- [ ] Gate YAML saved to `{output_folder}/gate-decision-{target}.yaml`
|
||||
- [ ] All outputs are valid and readable
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 Output Validation
|
||||
|
||||
### Gate Decision Document
|
||||
|
||||
**Completeness:**
|
||||
|
||||
- [ ] All required sections present (info, decision, evidence, rationale, next steps)
|
||||
- [ ] No placeholder text or TODOs left in document
|
||||
- [ ] All evidence references are accurate and complete
|
||||
- [ ] All links to artifacts are valid
|
||||
|
||||
**Accuracy:**
|
||||
|
||||
- [ ] Decision matches applied criteria rules
|
||||
- [ ] Test results match CI/CD pipeline output
|
||||
- [ ] Coverage percentages match reports
|
||||
- [ ] NFR status matches assessment document
|
||||
- [ ] No contradictions or inconsistencies
|
||||
|
||||
**Clarity:**
|
||||
|
||||
- [ ] Decision rationale is clear and unambiguous
|
||||
- [ ] Technical jargon is explained or avoided
|
||||
- [ ] Stakeholders can understand next steps
|
||||
- [ ] Recommendations are actionable
|
||||
|
||||
### Gate YAML
|
||||
|
||||
**Format:**
|
||||
|
||||
- [ ] YAML is valid (no syntax errors)
|
||||
- [ ] All required fields present (target, decision, date, evaluator, criteria, evidence)
|
||||
- [ ] Field values are correct data types (numbers, strings, dates)
|
||||
|
||||
**Content:**
|
||||
|
||||
- [ ] Criteria values match decision document
|
||||
- [ ] Evidence references are accurate
|
||||
- [ ] Next steps align with decision type
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 Quality Checks
|
||||
|
||||
### Decision Integrity
|
||||
|
||||
- [ ] Decision is deterministic (follows rules, not arbitrary)
|
||||
- [ ] P0 failures result in FAIL decision (unless waived)
|
||||
- [ ] Security issues result in FAIL decision (security issues must never be waived)
|
||||
- [ ] Waivers have business justification and approver (if WAIVED)
|
||||
- [ ] Residual risks are documented (if CONCERNS or WAIVED)
|
||||
|
||||
### Evidence-Based
|
||||
|
||||
- [ ] Decision is based on actual test results (not guesses)
|
||||
- [ ] All claims are supported by evidence
|
||||
- [ ] No assumptions without documentation
|
||||
- [ ] Evidence sources are cited (CI run IDs, report URLs)
|
||||
|
||||
### Transparency
|
||||
|
||||
- [ ] Decision rationale is transparent and auditable
|
||||
- [ ] Criteria evaluation is documented step-by-step
|
||||
- [ ] Any deviations from standard process are explained
|
||||
- [ ] Waiver justifications are clear (if applicable)
|
||||
|
||||
### Consistency
|
||||
|
||||
- [ ] Decision aligns with risk-governance knowledge fragment
|
||||
- [ ] Priority framework (P0/P1/P2/P3) applied consistently
|
||||
- [ ] Terminology consistent with test-quality knowledge fragment
|
||||
- [ ] Decision matrix followed correctly
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 Integration Points
|
||||
|
||||
### BMad Workflow Status
|
||||
|
||||
- [ ] Gate decision added to `bmm-workflow-status.md`
|
||||
- [ ] Format matches existing gate history entries
|
||||
- [ ] Timestamp is accurate
|
||||
- [ ] Decision summary is concise (<80 chars)
|
||||
|
||||
### CI/CD Pipeline
|
||||
|
||||
- [ ] Gate YAML is CI/CD-compatible
|
||||
- [ ] YAML can be parsed by pipeline automation
|
||||
- [ ] Decision can be used to block/allow deployments
|
||||
- [ ] Evidence references are accessible to pipeline
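As a rough illustration (not part of the checklist itself), a pipeline step can parse the gate YAML and block deployment on FAIL. This sketch assumes Node with the `js-yaml` package and a file shaped like the `gate-decision-{target}.yaml` snippet this workflow generates:

```typescript
// ci-gate-check.ts - hypothetical sketch: block deployment unless the gate allows it
import { readFileSync } from "node:fs";
import { load } from "js-yaml"; // assumed dependency

interface GateYaml {
  gate_decision?: { decision?: string };
}

const file = process.argv[2] ?? "gate-decision.yaml"; // path is illustrative
const doc = load(readFileSync(file, "utf8")) as GateYaml;
const decision = doc?.gate_decision?.decision ?? "UNKNOWN";

// PASS, CONCERNS, and WAIVED allow deployment; FAIL (or unreadable data) blocks it
if (decision === "FAIL" || decision === "UNKNOWN") {
  console.error(`Gate decision is ${decision} - blocking deployment`);
  process.exit(1);
}
console.log(`Gate decision is ${decision} - deployment allowed`);
```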
|
||||
|
||||
### Stakeholders
|
||||
|
||||
- [ ] Notification message is clear and actionable
|
||||
- [ ] Decision is explained in non-technical terms
|
||||
- [ ] Next steps are specific and time-bound
|
||||
- [ ] Recipients are appropriate for decision type
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 Compliance and Audit
|
||||
|
||||
### Audit Trail
|
||||
|
||||
- [ ] Decision date and time recorded
|
||||
- [ ] Evaluator identified (user or agent)
|
||||
- [ ] All evidence sources cited
|
||||
- [ ] Decision criteria documented
|
||||
- [ ] Rationale clearly explained
|
||||
|
||||
### Traceability
|
||||
|
||||
- [ ] Gate decision traceable to story/epic/release
|
||||
- [ ] Evidence traceable to specific test runs
|
||||
- [ ] Assessments traceable to workflows that created them
|
||||
- [ ] Waiver traceable to approver (if applicable)
|
||||
|
||||
### Compliance
|
||||
|
||||
- [ ] Security requirements validated (no unresolved vulnerabilities)
|
||||
- [ ] Quality standards met or waived with justification
|
||||
- [ ] Regulatory requirements addressed (if applicable)
|
||||
- [ ] Documentation sufficient for external audit
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 Edge Cases and Exceptions
|
||||
|
||||
### Missing Evidence
|
||||
|
||||
- [ ] If test-design.md missing, decision still possible with test results + trace
|
||||
- [ ] If traceability-matrix.md missing, decision still possible with test results (but Phase 1 should provide it)
|
||||
- [ ] If nfr-assessment.md missing, NFR validation marked as NOT ASSESSED
|
||||
- [ ] If code coverage missing, coverage criterion marked as NOT ASSESSED
|
||||
- [ ] User acknowledged gaps in evidence or provided alternative proof
|
||||
|
||||
### Stale Evidence
|
||||
|
||||
- [ ] Evidence freshness checked (if validate_evidence_freshness: true)
|
||||
- [ ] Warnings issued for assessments >7 days old
|
||||
- [ ] User acknowledged stale evidence or re-ran workflows
|
||||
- [ ] Decision document notes any stale evidence used
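A minimal sketch of the freshness check behind these items, assuming a 7-day window (matching `validate_evidence_freshness`) and illustrative artifact paths:

```typescript
// evidence-freshness.ts - hypothetical sketch of the 7-day staleness warning
import { statSync } from "node:fs";

const MAX_AGE_DAYS = 7;

function warnIfStale(path: string): void {
  const ageDays = (Date.now() - statSync(path).mtimeMs) / (1000 * 60 * 60 * 24);
  if (ageDays > MAX_AGE_DAYS) {
    console.warn(`${path} is ${ageDays.toFixed(1)} days old (> ${MAX_AGE_DAYS} days)`);
  }
}

// Illustrative artifact paths - adjust to the project's output folder
["bmad/output/test-design-epic-2.md", "bmad/output/traceability-matrix.md"].forEach(warnIfStale);
```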
|
||||
|
||||
### Conflicting Evidence
|
||||
|
||||
- [ ] Conflicts between test results and assessments resolved
|
||||
- [ ] Most recent/authoritative source identified
|
||||
- [ ] Conflict resolution documented in decision rationale
|
||||
- [ ] User consulted if conflict cannot be resolved
|
||||
|
||||
### Waiver Scenarios
|
||||
|
||||
- [ ] Waiver only used for FAIL decision (not PASS or CONCERNS)
|
||||
- [ ] Waiver has business justification (not technical convenience)
|
||||
- [ ] Waiver has named approver with authority (VP/CTO/PO)
|
||||
- [ ] Waiver has expiry date (does NOT apply to future releases)
|
||||
- [ ] Waiver has remediation plan with concrete due date
|
||||
- [ ] Security vulnerabilities are NOT waived (enforced)
|
||||
|
||||
---
|
||||
|
||||
# FINAL VALIDATION (Both Phases)
|
||||
|
||||
## Non-Prescriptive Validation
|
||||
|
||||
- [ ] Traceability format adapted to team needs (not rigid template)
|
||||
@@ -225,42 +577,77 @@ Knowledge fragments referenced:
|
||||
|
||||
## Documentation and Communication
|
||||
|
||||
- [ ] Traceability matrix is readable and well-formatted
|
||||
- [ ] All documents are readable and well-formatted
|
||||
- [ ] Tables render correctly in markdown
|
||||
- [ ] Code blocks have proper syntax highlighting
|
||||
- [ ] Links are valid and accessible
|
||||
- [ ] Recommendations are clear and prioritized
|
||||
- [ ] Gate status is prominent and unambiguous
|
||||
- [ ] Gate decision is prominent and unambiguous (Phase 2)
|
||||
|
||||
---
|
||||
|
||||
## Final Validation
|
||||
|
||||
**Phase 1 (Traceability):**
|
||||
|
||||
- [ ] All prerequisites met
|
||||
- [ ] All acceptance criteria mapped or gaps documented
|
||||
- [ ] P0 coverage is 100% OR documented as BLOCKER
|
||||
- [ ] Gap analysis is complete and prioritized
|
||||
- [ ] Test quality issues identified and flagged
|
||||
- [ ] Deliverables generated and saved
|
||||
- [ ] Gate YAML ready for CI/CD integration (if enabled)
|
||||
- [ ] Story file updated (if enabled)
|
||||
- [ ] Workflow completed successfully
|
||||
|
||||
**Phase 2 (Gate Decision):**
|
||||
|
||||
- [ ] All quality evidence gathered
|
||||
- [ ] Decision criteria applied correctly
|
||||
- [ ] Decision rationale documented
|
||||
- [ ] Gate YAML ready for CI/CD integration
|
||||
- [ ] Status file updated (if enabled)
|
||||
- [ ] Stakeholders notified (if enabled)
|
||||
|
||||
**Workflow Complete:**
|
||||
|
||||
- [ ] Phase 1 completed successfully
|
||||
- [ ] Phase 2 completed successfully (if enabled)
|
||||
- [ ] All outputs validated and saved
|
||||
- [ ] Ready to proceed based on gate decision
|
||||
|
||||
---
|
||||
|
||||
## Sign-Off
|
||||
|
||||
**Traceability Status:**
|
||||
**Phase 1 - Traceability Status:**
|
||||
|
||||
- [ ] ✅ PASS - All quality gates met, no critical gaps
|
||||
- [ ] ⚠️ WARN - P1 gaps exist, address before PR merge
|
||||
- [ ] ❌ FAIL - P0 gaps exist, BLOCKER for release
|
||||
|
||||
**Phase 2 - Gate Decision Status (if enabled):**
|
||||
|
||||
- [ ] ✅ PASS - Deploy to production
|
||||
- [ ] ⚠️ CONCERNS - Deploy with monitoring
|
||||
- [ ] ❌ FAIL - Block deployment, fix issues
|
||||
- [ ] 🔓 WAIVED - Deploy with business approval and remediation plan
|
||||
|
||||
**Next Actions:**
|
||||
|
||||
- If PASS: Proceed to `*gate` workflow or PR merge
|
||||
- If WARN: Address HIGH priority gaps, re-run `*trace`
|
||||
- If FAIL: Run `*atdd` to generate missing P0 tests, re-run `*trace`
|
||||
- If PASS (both phases): Proceed to deployment
|
||||
- If WARN/CONCERNS: Address gaps/issues, proceed with monitoring
|
||||
- If FAIL (either phase): Run `*atdd` for missing tests, fix issues, re-run `*trace`
|
||||
- If WAIVED: Deploy with approved waiver, schedule remediation
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
Record any issues, deviations, or important observations during workflow execution:
|
||||
|
||||
- **Phase 1 Issues**: [Note any traceability mapping challenges, missing tests, quality concerns]
|
||||
- **Phase 2 Issues**: [Note any missing, stale, or conflicting evidence]
|
||||
- **Decision Rationale**: [Document any nuanced reasoning or edge cases]
|
||||
- **Waiver Details**: [Document waiver negotiations or approvals]
|
||||
- **Follow-up Actions**: [List any actions required after gate decision]
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
# Requirements Traceability - Instructions v4.0
|
||||
# Test Architect Workflow: Requirements Traceability & Quality Gate Decision
|
||||
|
||||
**Workflow:** `testarch-trace`
|
||||
**Purpose:** Generate requirements-to-tests traceability matrix with coverage analysis and gap identification
|
||||
**Purpose:** Generate requirements-to-tests traceability matrix, analyze coverage gaps, and make quality gate decisions (PASS/CONCERNS/FAIL/WAIVED)
|
||||
**Agent:** Test Architect (TEA)
|
||||
**Format:** Pure Markdown v4.0 (no XML blocks)
|
||||
|
||||
@@ -9,29 +9,40 @@
|
||||
|
||||
## Overview
|
||||
|
||||
This workflow creates a comprehensive traceability matrix that maps acceptance criteria to implemented tests, identifies coverage gaps, and provides actionable recommendations for improving test coverage. It supports both BMad-integrated mode (with story files and test design) and standalone mode (with inline acceptance criteria).
|
||||
This workflow operates in two sequential phases to validate test coverage and deployment readiness:
|
||||
|
||||
**PHASE 1 - REQUIREMENTS TRACEABILITY:** Create comprehensive traceability matrix mapping acceptance criteria to implemented tests, identify coverage gaps, and provide actionable recommendations.
|
||||
|
||||
**PHASE 2 - QUALITY GATE DECISION:** Use traceability results combined with test execution evidence to make gate decisions (PASS/CONCERNS/FAIL/WAIVED) that determine deployment readiness.
|
||||
|
||||
**Key Capabilities:**
|
||||
|
||||
- Map acceptance criteria to specific test cases across all levels (E2E, API, Component, Unit)
|
||||
- Classify coverage status (FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY)
|
||||
- Prioritize gaps by risk level (P0/P1/P2/P3) using test-priorities framework
|
||||
- Generate gate-ready YAML snippets for CI/CD integration
|
||||
- Detect duplicate coverage across test levels
|
||||
- Verify explicit assertions in test cases
|
||||
- Apply deterministic decision rules based on coverage and test execution results
|
||||
- Generate gate decisions with evidence and rationale
|
||||
- Support waivers for business-approved exceptions
|
||||
- Update workflow status and notify stakeholders
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
**Required:**
|
||||
**Required (Phase 1):**
|
||||
|
||||
- Acceptance criteria (from story file OR provided inline)
|
||||
- Implemented test suite (or acknowledge gaps to be addressed)
|
||||
|
||||
**Required (Phase 2 - if `enable_gate_decision: true`):**
|
||||
|
||||
- Test execution results (CI/CD test reports, pass/fail rates)
|
||||
- Test design with risk priorities (P0/P1/P2/P3)
|
||||
|
||||
**Recommended:**
|
||||
|
||||
- `test-design.md` (for risk assessment and priority context)
|
||||
- `nfr-assessment.md` (for release-level gates)
|
||||
- `tech-spec.md` (for technical implementation context)
|
||||
- Test framework configuration (playwright.config.ts, jest.config.js, etc.)
|
||||
|
||||
@@ -39,21 +50,26 @@ This workflow creates a comprehensive traceability matrix that maps acceptance c
|
||||
|
||||
- If story lacks any implemented tests AND no gaps are acknowledged, recommend running `*atdd` workflow first
|
||||
- If acceptance criteria are completely missing, halt and request them
|
||||
- If Phase 2 enabled but test execution results missing, warn and skip gate decision
|
||||
|
||||
---
|
||||
|
||||
## Workflow Steps
|
||||
## PHASE 1: REQUIREMENTS TRACEABILITY
|
||||
|
||||
This phase focuses on mapping requirements to tests, analyzing coverage, and identifying gaps.
|
||||
|
||||
---
|
||||
|
||||
### Step 1: Load Context and Knowledge Base
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. Load relevant knowledge fragments from `{project-root}/bmad/bmm/testarch/tea-index.csv`:
|
||||
- `traceability.md` - Requirements mapping patterns
|
||||
- `test-priorities.md` - P0/P1/P2/P3 risk framework
|
||||
- `risk-governance.md` - Risk-based testing approach
|
||||
- `test-quality.md` - Definition of Done for tests
|
||||
- `selective-testing.md` - Duplicate coverage patterns
|
||||
- `test-priorities-matrix.md` - P0/P1/P2/P3 risk framework with automated priority calculation, risk-based mapping, tagging strategy (389 lines, 2 examples)
|
||||
- `risk-governance.md` - Risk-based testing approach: 6 categories (TECH, SEC, PERF, DATA, BUS, OPS), automated scoring, gate decision engine, coverage traceability (625 lines, 4 examples)
|
||||
- `probability-impact.md` - Risk scoring methodology: probability × impact matrix, automated classification, dynamic re-assessment, gate integration (604 lines, 4 examples)
|
||||
- `test-quality.md` - Definition of Done for tests: deterministic, isolated with cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
|
||||
- `selective-testing.md` - Duplicate coverage patterns: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
|
||||
|
||||
2. Read story file (if provided):
|
||||
- Extract acceptance criteria
|
||||
@@ -160,7 +176,7 @@ This workflow creates a comprehensive traceability matrix that maps acceptance c
|
||||
- P1 coverage >= 90% (recommended)
|
||||
- Overall coverage >= 80% (recommended)
|
||||
|
||||
**Output:** Prioritized gap analysis with actionable recommendations
|
||||
**Output:** Prioritized gap analysis with actionable recommendations and coverage metrics
|
||||
|
||||
---
|
||||
|
||||
@@ -191,7 +207,7 @@ This workflow creates a comprehensive traceability matrix that maps acceptance c
|
||||
|
||||
---
|
||||
|
||||
### Step 6: Generate Deliverables
|
||||
### Step 6: Generate Deliverables (Phase 1)
|
||||
|
||||
**Actions:**
|
||||
|
||||
@@ -231,13 +247,442 @@ This workflow creates a comprehensive traceability matrix that maps acceptance c
|
||||
- Include coverage summary
|
||||
- Add gate status
|
||||
|
||||
**Output:** Complete traceability documentation ready for review and CI/CD integration
|
||||
**Output:** Complete Phase 1 traceability deliverables
|
||||
|
||||
**Next:** If `enable_gate_decision: true`, proceed to Phase 2. Otherwise, workflow complete.
|
||||
|
||||
---
|
||||
|
||||
## PHASE 2: QUALITY GATE DECISION
|
||||
|
||||
This phase uses traceability results to make a quality gate decision (PASS/CONCERNS/FAIL/WAIVED) based on evidence and decision rules.
|
||||
|
||||
**When Phase 2 Runs:** Automatically after Phase 1 if `enable_gate_decision: true` (default: true)
|
||||
|
||||
**Skip Conditions:** If test execution results (`test_results`) not provided, warn and skip Phase 2.
|
||||
|
||||
---
|
||||
|
||||
### Step 7: Gather Quality Evidence
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Load Phase 1 traceability results** (inherited context):
|
||||
- Coverage metrics (P0/P1/overall percentages)
|
||||
- Gap analysis (missing/partial tests)
|
||||
- Quality concerns (test quality flags)
|
||||
- Traceability matrix
|
||||
|
||||
2. **Load test execution results** (if `test_results` provided):
|
||||
- Read CI/CD test reports (JUnit XML, TAP, JSON)
|
||||
- Extract pass/fail counts by priority
|
||||
   - Calculate pass rates (see the sketch after this step's output):
|
||||
- **P0 pass rate**: `(P0 passed / P0 total) * 100`
|
||||
- **P1 pass rate**: `(P1 passed / P1 total) * 100`
|
||||
- **Overall pass rate**: `(All passed / All total) * 100`
|
||||
- Identify failing tests and map to criteria
|
||||
|
||||
3. **Load NFR assessment** (if `nfr_file` provided):
|
||||
- Read `nfr-assessment.md` or similar
|
||||
- Check critical NFR status (performance, security, scalability)
|
||||
- Flag any critical NFR failures
|
||||
|
||||
4. **Load supporting artifacts**:
|
||||
- `test-design.md` → Risk priorities, DoD checklist
|
||||
- `story-*.md` or `Epics.md` → Requirements context
|
||||
- `bmm-workflow-status.md` → Workflow completion status (if `check_all_workflows_complete: true`)
|
||||
|
||||
5. **Validate evidence freshness** (if `validate_evidence_freshness: true`):
|
||||
- Check timestamps of test-design, traceability, NFR assessments
|
||||
- Warn if artifacts are >7 days old
|
||||
|
||||
6. **Check prerequisite workflows** (if `check_all_workflows_complete: true`):
|
||||
- Verify test-design workflow complete
|
||||
- Verify trace workflow complete (Phase 1)
|
||||
- Verify nfr-assess workflow complete (if release-level gate)
|
||||
|
||||
**Output:** Consolidated evidence bundle with all quality signals
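To make the pass-rate arithmetic in step 2 concrete, here is a minimal sketch; the `PriorityCounts` shape and the priority buckets are assumptions, since actual report formats vary by framework:

```typescript
// pass-rates.ts - hypothetical sketch of the pass-rate math used during evidence gathering
interface PriorityCounts {
  passed: number;
  total: number;
}

type Priority = "P0" | "P1" | "P2" | "P3";

function passRate(c: PriorityCounts): number {
  // Guard against empty buckets (e.g., a story with no P3 tests)
  return c.total === 0 ? 100 : (c.passed / c.total) * 100;
}

function summarize(byPriority: Record<Priority, PriorityCounts>) {
  const all = Object.values(byPriority).reduce(
    (acc, c) => ({ passed: acc.passed + c.passed, total: acc.total + c.total }),
    { passed: 0, total: 0 },
  );
  return {
    p0PassRate: passRate(byPriority.P0),
    p1PassRate: passRate(byPriority.P1),
    overallPassRate: passRate(all),
  };
}

// Example: 12/12 P0, 45/46 P1, 8/10 P2, 2/2 P3 -> overall 67/70 ≈ 95.7%
console.log(summarize({
  P0: { passed: 12, total: 12 },
  P1: { passed: 45, total: 46 },
  P2: { passed: 8, total: 10 },
  P3: { passed: 2, total: 2 },
}));
```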
|
||||
|
||||
---
|
||||
|
||||
### Step 8: Apply Decision Rules
|
||||
|
||||
**If `decision_mode: "deterministic"`** (rule-based - default):
|
||||
|
||||
**Decision rules** (based on `workflow.yaml` thresholds):
|
||||
|
||||
1. **PASS** if ALL of the following are true:
|
||||
- P0 coverage ≥ `min_p0_coverage` (default: 100%)
|
||||
- P1 coverage ≥ `min_p1_coverage` (default: 90%)
|
||||
- Overall coverage ≥ `min_overall_coverage` (default: 80%)
|
||||
- P0 test pass rate = `min_p0_pass_rate` (default: 100%)
|
||||
- P1 test pass rate ≥ `min_p1_pass_rate` (default: 95%)
|
||||
- Overall test pass rate ≥ `min_overall_pass_rate` (default: 90%)
|
||||
- Critical NFRs passed (if `nfr_file` provided)
|
||||
- No unresolved security issues ≤ `max_security_issues` (default: 0)
|
||||
- No test quality red flags (hard waits, no assertions)
|
||||
|
||||
2. **CONCERNS** if ANY of the following are true:
|
||||
- P1 coverage 80-89% (below threshold but not critical)
|
||||
- P1 test pass rate 90-94% (below threshold but not critical)
|
||||
- Overall pass rate 85-89%
|
||||
- P2 coverage <50% (informational)
|
||||
- Some non-critical NFRs failing
|
||||
- Minor test quality concerns (large test files, inferred mappings)
|
||||
- **Note**: CONCERNS does NOT block deployment but requires acknowledgment
|
||||
|
||||
3. **FAIL** if ANY of the following are true:
|
||||
- P0 coverage <100% (missing critical tests)
|
||||
- P0 test pass rate <100% (failing critical tests)
|
||||
- P1 coverage <80% (significant gap)
|
||||
- P1 test pass rate <90% (significant failures)
|
||||
- Overall coverage <80%
|
||||
- Overall pass rate <85%
|
||||
- Critical NFRs failing (`max_critical_nfrs_fail` exceeded)
|
||||
- Unresolved security issues (`max_security_issues` exceeded)
|
||||
- Major test quality issues (tests with no assertions, pervasive hard waits)
|
||||
|
||||
4. **WAIVED** (only if `allow_waivers: true`):
|
||||
- Decision would be FAIL based on rules above
|
||||
- Business stakeholder has approved waiver
|
||||
- Waiver documented with:
|
||||
- Justification (time constraint, known limitation, acceptable risk)
|
||||
- Approver name and date
|
||||
- Mitigation plan (follow-up stories, manual testing)
|
||||
- Waiver evidence linked (email, Slack thread, ticket)
|
||||
|
||||
**Risk tolerance adjustments:**
|
||||
|
||||
- If `allow_p2_failures: true` → P2 test failures do NOT affect gate decision
|
||||
- If `allow_p3_failures: true` → P3 test failures do NOT affect gate decision
|
||||
- If `escalate_p1_failures: true` → P1 failures require explicit manager/lead approval
|
||||
|
||||
**If `decision_mode: "manual"`:**
|
||||
|
||||
- Present evidence summary to team
|
||||
- Recommend decision based on rules above
|
||||
- Team makes final call in meeting/chat
|
||||
- Document decision with approver names
|
||||
|
||||
**Output:** Gate decision (PASS/CONCERNS/FAIL/WAIVED) with rule-based rationale
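A minimal sketch of the deterministic rules above. Thresholds mirror the `workflow.yaml` defaults; NFR and security signals are reduced to counts, and WAIVED is treated as a manual overlay applied on top of a FAIL, so it is not computed here:

```typescript
// gate-decision.ts - hypothetical sketch of the deterministic decision rules
type Decision = "PASS" | "CONCERNS" | "FAIL";

interface Metrics {
  p0Coverage: number;       // %
  p1Coverage: number;       // %
  overallCoverage: number;  // %
  p0PassRate: number;       // %
  p1PassRate: number;       // %
  overallPassRate: number;  // %
  securityIssues: number;
  criticalNfrFailures: number;
}

function decide(m: Metrics): Decision {
  // FAIL: any hard blocker
  if (
    m.p0Coverage < 100 ||
    m.p0PassRate < 100 ||
    m.p1Coverage < 80 ||
    m.p1PassRate < 90 ||
    m.overallCoverage < 80 ||
    m.overallPassRate < 85 ||
    m.securityIssues > 0 ||
    m.criticalNfrFailures > 0
  ) {
    return "FAIL";
  }
  // PASS: every threshold met (min_p1_coverage, min_p1_pass_rate, min_overall_pass_rate defaults)
  if (m.p1Coverage >= 90 && m.p1PassRate >= 95 && m.overallPassRate >= 90) {
    return "PASS";
  }
  // Otherwise the shortfall sits in the non-blocking CONCERNS band
  return "CONCERNS";
}

// Example from this document: P1 coverage 88% with everything else green -> "CONCERNS"
console.log(decide({
  p0Coverage: 100, p1Coverage: 88, overallCoverage: 92,
  p0PassRate: 100, p1PassRate: 98, overallPassRate: 96,
  securityIssues: 0, criticalNfrFailures: 0,
}));
```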
|
||||
|
||||
---
|
||||
|
||||
### Step 9: Document Decision and Evidence
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Create gate decision document**:
|
||||
- Save to `gate_output_file` (default: `{output_folder}/gate-decision-{gate_type}-{story_id}.md`)
|
||||
- Use structure below
|
||||
|
||||
2. **Document structure**:
|
||||
|
||||
```markdown
|
||||
# Quality Gate Decision: {gate_type} {story_id/epic_num/release_version}
|
||||
|
||||
**Decision**: [PASS / CONCERNS / FAIL / WAIVED]
|
||||
**Date**: {date}
|
||||
**Decider**: {decision_mode} (deterministic | manual)
|
||||
**Evidence Date**: {test_results_date}
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
[1-2 sentence summary of decision and key factors]
|
||||
|
||||
---
|
||||
|
||||
## Decision Criteria
|
||||
|
||||
| Criterion | Threshold | Actual | Status |
|
||||
| ----------------- | --------- | -------- | ------- |
|
||||
| P0 Coverage | ≥100% | 100% | ✅ PASS |
|
||||
| P1 Coverage       | ≥90%      | 88%      | ⚠️ CONCERNS |
|
||||
| Overall Coverage | ≥80% | 92% | ✅ PASS |
|
||||
| P0 Pass Rate | 100% | 100% | ✅ PASS |
|
||||
| P1 Pass Rate | ≥95% | 98% | ✅ PASS |
|
||||
| Overall Pass Rate | ≥90% | 96% | ✅ PASS |
|
||||
| Critical NFRs | All Pass | All Pass | ✅ PASS |
|
||||
| Security Issues | 0 | 0 | ✅ PASS |
|
||||
|
||||
**Overall Status**: 7/8 criteria met → Decision: **CONCERNS**
|
||||
|
||||
---
|
||||
|
||||
## Evidence Summary
|
||||
|
||||
### Test Coverage (from Phase 1 Traceability)
|
||||
|
||||
- **P0 Coverage**: 100% (5/5 criteria fully covered)
|
||||
- **P1 Coverage**: 88% (7/8 criteria fully covered)
|
||||
- **Overall Coverage**: 92% (12/13 criteria covered)
|
||||
- **Gap**: AC-5 (P1) missing E2E test
|
||||
|
||||
### Test Execution Results
|
||||
|
||||
- **P0 Pass Rate**: 100% (12/12 tests passed)
|
||||
- **P1 Pass Rate**: 98% (45/46 tests passed)
|
||||
- **Overall Pass Rate**: 96% (67/70 tests passed)
|
||||
- **Failures**: 3 P2 tests (non-blocking)
|
||||
|
||||
### Non-Functional Requirements
|
||||
|
||||
- Performance: ✅ PASS (response time <500ms)
|
||||
- Security: ✅ PASS (no vulnerabilities)
|
||||
- Scalability: ✅ PASS (handles 10K users)
|
||||
|
||||
### Test Quality
|
||||
|
||||
- All tests have explicit assertions ✅
|
||||
- No hard waits detected ✅
|
||||
- Test files <300 lines ✅
|
||||
- Test IDs follow convention ✅
|
||||
|
||||
---
|
||||
|
||||
## Decision Rationale
|
||||
|
||||
**Why CONCERNS (not PASS)**:
|
||||
|
||||
- P1 coverage at 88% is below 90% threshold
|
||||
- AC-5 (P1 priority) missing E2E test for error handling scenario
|
||||
- This is a known gap from test-design phase
|
||||
|
||||
**Why CONCERNS (not FAIL)**:
|
||||
|
||||
- P0 coverage is 100% (critical paths validated)
|
||||
- Overall coverage is 92% (above 80% threshold)
|
||||
- Test pass rate is excellent (96% overall)
|
||||
- Gap is isolated to one P1 criterion (not systemic)
|
||||
|
||||
**Recommendation**:
|
||||
|
||||
- Acknowledge gap and proceed with deployment
|
||||
- Add missing AC-5 E2E test in next sprint
|
||||
- Create follow-up story: "Add E2E test for AC-5 error handling"
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [ ] Create follow-up story for AC-5 E2E test
|
||||
- [ ] Deploy to staging environment
|
||||
- [ ] Monitor production for edge cases related to AC-5
|
||||
- [ ] Update traceability matrix after follow-up test added
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- Traceability Matrix: `bmad/output/traceability-matrix.md`
|
||||
- Test Design: `bmad/output/test-design-epic-2.md`
|
||||
- Test Results: `ci-artifacts/test-report-2025-01-15.xml`
|
||||
- NFR Assessment: `bmad/output/nfr-assessment-release-1.2.md`
|
||||
```
|
||||
|
||||
3. **Include evidence links** (if `require_evidence: true`):
|
||||
- Link to traceability matrix
|
||||
- Link to test execution reports (CI artifacts)
|
||||
- Link to NFR assessment
|
||||
- Link to test-design document
|
||||
- Link to relevant PRs, commits, deployments
|
||||
|
||||
4. **Waiver documentation** (if decision is WAIVED):
|
||||
- Approver name and role (e.g., "Jane Doe, Engineering Manager")
|
||||
- Approval date and method (e.g., "2025-01-15, Slack thread")
|
||||
- Justification (e.g., "Time-boxed MVP, missing tests will be added in v1.1")
|
||||
- Mitigation plan (e.g., "Manual testing by QA, follow-up stories created")
|
||||
- Evidence link (e.g., "Slack: #engineering 2025-01-15 3:42pm")
|
||||
|
||||
**Output:** Complete gate decision document with evidence and rationale
|
||||
|
||||
---
|
||||
|
||||
### Step 10: Update Status Tracking and Notify
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Update workflow status** (if `append_to_history: true`):
|
||||
- Append gate decision to `bmm-workflow-status.md` under "Gate History" section
|
||||
- Format:
|
||||
|
||||
```markdown
|
||||
## Gate History
|
||||
|
||||
### Story 1.3 - User Login (2025-01-15)
|
||||
|
||||
- **Decision**: CONCERNS
|
||||
- **Reason**: P1 coverage 88% (below 90%)
|
||||
- **Document**: [gate-decision-story-1.3.md](bmad/output/gate-decision-story-1.3.md)
|
||||
- **Action**: Deploy with follow-up story for AC-5
|
||||
```
|
||||
|
||||
2. **Generate stakeholder notification** (if `notify_stakeholders: true`):
|
||||
- Create concise summary message for team communication
|
||||
- Include: Decision, key metrics, action items
|
||||
- Format for Slack/email/chat:
|
||||
|
||||
```
|
||||
🚦 Quality Gate Decision: Story 1.3 - User Login
|
||||
|
||||
Decision: ⚠️ CONCERNS
|
||||
- P0 Coverage: ✅ 100%
|
||||
- P1 Coverage: ⚠️ 88% (below 90%)
|
||||
- Test Pass Rate: ✅ 96%
|
||||
|
||||
Action Required:
|
||||
- Create follow-up story for AC-5 E2E test
|
||||
- Deploy to staging for validation
|
||||
|
||||
Full Report: bmad/output/gate-decision-story-1.3.md
|
||||
```
|
||||
|
||||
3. **Request sign-off** (if `require_sign_off: true`):
|
||||
- Prompt for named approver (tech lead, QA lead, PM)
|
||||
- Document approver name and timestamp in gate decision
|
||||
- Block until sign-off received (interactive prompt)
|
||||
|
||||
**Output:** Status tracking updated, stakeholders notified, sign-off obtained (if required)
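For the status-file update in step 1 above, a minimal sketch; it appends at the end of the file, whereas a real implementation would insert under the existing "Gate History" heading, and the paths shown are illustrative:

```typescript
// append-gate-history.ts - hypothetical sketch of recording a gate decision entry
import { appendFileSync } from "node:fs";

interface GateEntry {
  target: string;        // e.g. "Story 1.3 - User Login"
  date: string;          // YYYY-MM-DD
  decision: "PASS" | "CONCERNS" | "FAIL" | "WAIVED";
  reason: string;
  documentPath: string;
  action: string;
}

function appendGateHistory(statusFile: string, e: GateEntry): void {
  const entry = [
    "",
    `### ${e.target} (${e.date})`,
    "",
    `- **Decision**: ${e.decision}`,
    `- **Reason**: ${e.reason}`,
    `- **Document**: [${e.documentPath}](${e.documentPath})`,
    `- **Action**: ${e.action}`,
    "",
  ].join("\n");
  appendFileSync(statusFile, entry, "utf8");
}

appendGateHistory("bmm-workflow-status.md", {
  target: "Story 1.3 - User Login",
  date: "2025-01-15",
  decision: "CONCERNS",
  reason: "P1 coverage 88% (below 90%)",
  documentPath: "bmad/output/gate-decision-story-1.3.md",
  action: "Deploy with follow-up story for AC-5",
});
```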
|
||||
|
||||
**Workflow Complete**: Both Phase 1 (traceability) and Phase 2 (gate decision) deliverables generated.
|
||||
|
||||
---
|
||||
|
||||
## Decision Matrix (Quick Reference)
|
||||
|
||||
| Scenario | P0 Cov | P1 Cov | Overall Cov | P0 Pass | P1 Pass | Overall Pass | NFRs | Decision |
|
||||
| --------------- | ----------------- | ------ | ----------- | ------- | ------- | ------------ | ---- | ------------ |
|
||||
| All green | 100% | ≥90% | ≥80% | 100% | ≥95% | ≥90% | Pass | **PASS** |
|
||||
| Minor gap | 100% | 80-89% | ≥80% | 100% | 90-94% | 85-89% | Pass | **CONCERNS** |
|
||||
| Missing P0 | <100% | - | - | - | - | - | - | **FAIL** |
|
||||
| P0 test fail | 100% | - | - | <100% | - | - | - | **FAIL** |
|
||||
| P1 gap | 100% | <80% | - | 100% | - | - | - | **FAIL** |
|
||||
| NFR fail | 100% | ≥90% | ≥80% | 100% | ≥95% | ≥90% | Fail | **FAIL** |
|
||||
| Security issue | - | - | - | - | - | - | Yes | **FAIL** |
|
||||
| Business waiver | [FAIL conditions] | - | - | - | - | - | - | **WAIVED** |
|
||||
|
||||
---
|
||||
|
||||
## Waiver Management
|
||||
|
||||
**When to use waivers:**
|
||||
|
||||
- Time-boxed MVP releases (known gaps, follow-up planned)
|
||||
- Low-risk P1 gaps with mitigation (manual testing, monitoring)
|
||||
- Technical debt acknowledged by product/engineering leadership
|
||||
- External dependencies blocking test automation
|
||||
|
||||
**Waiver approval process:**
|
||||
|
||||
1. Document gap and risk in gate decision
|
||||
2. Propose mitigation plan (manual testing, follow-up stories, monitoring)
|
||||
3. Request approval from stakeholder (EM, PM, QA lead)
|
||||
4. Link approval evidence (email, chat thread, meeting notes)
|
||||
5. Add waiver to gate decision document
|
||||
6. Create follow-up stories to close gaps
|
||||
|
||||
**Waiver does NOT apply to:**
|
||||
|
||||
- P0 gaps (always blocking)
|
||||
- Critical security issues (always blocking)
|
||||
- Critical NFR failures (performance, data integrity)
|
||||
|
||||
---
|
||||
|
||||
## Example Gate Decisions
|
||||
|
||||
### Example 1: PASS (All Criteria Met)
|
||||
|
||||
```
|
||||
Decision: ✅ PASS
|
||||
|
||||
Summary: All quality criteria met. Story 1.3 is ready for production deployment.
|
||||
|
||||
Evidence:
|
||||
- P0 Coverage: 100% (5/5 criteria)
|
||||
- P1 Coverage: 95% (19/20 criteria)
|
||||
- Overall Coverage: 92% (24/26 criteria)
|
||||
- P0 Pass Rate: 100% (12/12 tests)
|
||||
- P1 Pass Rate: 98% (45/46 tests)
|
||||
- Overall Pass Rate: 96% (67/70 tests)
|
||||
- NFRs: All pass (performance, security, scalability)
|
||||
|
||||
Action: Deploy to production ✅
|
||||
```
|
||||
|
||||
### Example 2: CONCERNS (Minor Gap, Non-Blocking)
|
||||
|
||||
```
|
||||
Decision: ⚠️ CONCERNS
|
||||
|
||||
Summary: P1 coverage slightly below threshold (88% vs 90%). Recommend deploying with follow-up story.
|
||||
|
||||
Evidence:
|
||||
- P0 Coverage: 100% ✅
|
||||
- P1 Coverage: 88% ⚠️ (below 90%)
|
||||
- Overall Coverage: 92% ✅
|
||||
- Test Pass Rate: 96% ✅
|
||||
- Gap: AC-5 (P1) missing E2E test
|
||||
|
||||
Action:
|
||||
- Deploy to staging for validation
|
||||
- Create follow-up story for AC-5 E2E test
|
||||
- Monitor production for edge cases related to AC-5
|
||||
```
|
||||
|
||||
### Example 3: FAIL (P0 Gap, Blocking)
|
||||
|
||||
```
|
||||
Decision: ❌ FAIL
|
||||
|
||||
Summary: P0 coverage incomplete. Missing critical validation test. BLOCKING deployment.
|
||||
|
||||
Evidence:
|
||||
- P0 Coverage: 80% ❌ (4/5 criteria, AC-2 missing)
|
||||
- AC-2: "User cannot login with invalid credentials" (P0 priority)
|
||||
- No tests validate login security for invalid credentials
|
||||
- This is a critical security gap
|
||||
|
||||
Action:
|
||||
- Add P0 test for AC-2: 1.3-E2E-004 (invalid credentials)
|
||||
- Re-run traceability after test added
|
||||
- Re-evaluate gate decision after P0 coverage = 100%
|
||||
|
||||
Deployment BLOCKED until P0 gap resolved ❌
|
||||
```
|
||||
|
||||
### Example 4: WAIVED (Business Decision)
|
||||
|
||||
```
|
||||
Decision: 🔓 WAIVED
|
||||
|
||||
Summary: P1 coverage below threshold (75% vs 90%), but waived for MVP launch.
|
||||
|
||||
Evidence:
|
||||
- P0 Coverage: 100% ✅
|
||||
- P1 Coverage: 75% ❌ (below 90%)
|
||||
- Gap: 5 P1 criteria missing E2E tests (error handling, edge cases)
|
||||
|
||||
Waiver:
|
||||
- Approver: Jane Doe, Engineering Manager
|
||||
- Date: 2025-01-15
|
||||
- Justification: Time-boxed MVP for investor demo. Core functionality (P0) fully validated. P1 gaps are low-risk edge cases.
|
||||
- Mitigation: Manual QA testing for P1 scenarios, follow-up stories created for automated tests in v1.1
|
||||
- Evidence: Slack #engineering 2025-01-15 3:42pm
|
||||
|
||||
Action:
|
||||
- Deploy to production with manual QA validation ✅
|
||||
- Add 5 E2E tests for P1 gaps in v1.1 sprint
|
||||
- Monitor production logs for edge case occurrences
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Non-Prescriptive Approach
|
||||
|
||||
**Minimal Examples:** This workflow provides principles and patterns, not rigid templates. Teams should adapt the traceability format to their needs.
|
||||
**Minimal Examples:** This workflow provides principles and patterns, not rigid templates. Teams should adapt the traceability and gate decision formats to their needs.
|
||||
|
||||
**Key Patterns to Follow:**
|
||||
|
||||
@@ -245,7 +690,9 @@ This workflow creates a comprehensive traceability matrix that maps acceptance c
|
||||
- Prioritize by risk (P0 gaps are critical, P3 gaps are acceptable)
|
||||
- Check coverage at appropriate levels (E2E for journeys, Unit for logic)
|
||||
- Verify test quality (explicit assertions, no flakiness)
|
||||
- Generate gate-ready artifacts (YAML snippets for CI/CD)
|
||||
- Apply deterministic gate rules for consistency
|
||||
- Document gate decisions with clear evidence
|
||||
- Use waivers judiciously (business approved, mitigation planned)
|
||||
|
||||
**Extend as Needed:**
|
||||
|
||||
@@ -253,6 +700,8 @@ This workflow creates a comprehensive traceability matrix that maps acceptance c
|
||||
- Integrate with code coverage tools (Istanbul, NYC)
|
||||
- Link to external traceability systems (JIRA, Azure DevOps)
|
||||
- Add compliance or regulatory requirements
|
||||
- Customize gate decision thresholds per project
|
||||
- Add manual approval workflows for gate decisions
|
||||
|
||||
---
|
||||
|
||||
@@ -323,7 +772,7 @@ Use selective testing principles from `selective-testing.md`:
|
||||
### With test-design.md
|
||||
|
||||
- Use risk assessment to prioritize gap remediation
|
||||
- Reference test priorities (P0/P1/P2/P3) for severity classification
|
||||
- Reference test priorities (P0/P1/P2/P3) for severity classification and gate decision
|
||||
- Align traceability with originally planned test coverage
|
||||
|
||||
### With tech-spec.md
|
||||
@@ -338,9 +787,15 @@ Use selective testing principles from `selective-testing.md`:
|
||||
- Verify acceptance criteria align with product goals
|
||||
- Check for unstated requirements that need coverage
|
||||
|
||||
### With nfr-assessment.md
|
||||
|
||||
- Load non-functional validation results for gate decision
|
||||
- Check critical NFR status (performance, security, scalability)
|
||||
- Include NFR pass/fail in gate decision criteria
|
||||
|
||||
---
|
||||
|
||||
## Quality Gates
|
||||
## Quality Gates (Phase 1 Recommendations)
|
||||
|
||||
### P0 Coverage (Critical Paths)
|
||||
|
||||
@@ -496,6 +951,7 @@ traceability:
|
||||
|
||||
Before completing this workflow, verify:
|
||||
|
||||
**Phase 1 (Traceability):**
|
||||
- ✅ All acceptance criteria are mapped to tests (or gaps are documented)
|
||||
- ✅ Coverage status is classified (FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY)
|
||||
- ✅ Gaps are prioritized by risk level (P0/P1/P2/P3)
|
||||
@@ -503,19 +959,32 @@ Before completing this workflow, verify:
|
||||
- ✅ Duplicate coverage is identified and flagged
|
||||
- ✅ Test quality is assessed (assertions, structure, performance)
|
||||
- ✅ Traceability matrix is generated and saved
|
||||
- ✅ Gate YAML snippet is generated (if enabled)
|
||||
- ✅ Story file is updated with traceability section (if enabled)
|
||||
- ✅ Recommendations are actionable and specific
|
||||
|
||||
**Phase 2 (Gate Decision - if enabled):**
|
||||
- ✅ Test execution results loaded and pass rates calculated
|
||||
- ✅ NFR assessment results loaded (if applicable)
|
||||
- ✅ Decision rules applied consistently (PASS/CONCERNS/FAIL/WAIVED)
|
||||
- ✅ Gate decision document created with evidence
|
||||
- ✅ Waiver documented if decision is WAIVED (approver, justification, mitigation)
|
||||
- ✅ Workflow status updated (bmm-workflow-status.md)
|
||||
- ✅ Stakeholders notified (if enabled)
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
**Phase 1 (Traceability):**
|
||||
- **Explicit Mapping:** Require tests to reference criteria explicitly (test IDs, describe blocks) for maintainability
|
||||
- **Risk-Based Prioritization:** Use test-priorities framework (P0/P1/P2/P3) to determine gap severity
|
||||
- **Quality Over Quantity:** Better to have fewer high-quality tests with FULL coverage than many low-quality tests with PARTIAL coverage
|
||||
- **Selective Testing:** Avoid duplicate coverage - test each behavior at the appropriate level only
|
||||
- **Gate Integration:** Generate YAML snippets that can be consumed by CI/CD pipelines for automated quality gates
|
||||
|
||||
**Phase 2 (Gate Decision):**
|
||||
- **Deterministic Rules:** Use consistent thresholds (P0=100%, P1≥90%, overall≥80%) for objectivity
|
||||
- **Evidence-Based:** Every decision must cite specific metrics (coverage %, pass rates, NFRs)
|
||||
- **Waiver Discipline:** Waivers require approver name, justification, mitigation plan, and evidence link
|
||||
- **Non-Blocking CONCERNS:** Use CONCERNS for minor gaps that don't justify blocking deployment (e.g., P1 at 88% vs 90%)
|
||||
- **Automate in CI/CD:** Generate YAML snippets that can be consumed by CI/CD pipelines for automated quality gates
|
||||
|
||||
---
|
||||
|
||||
@@ -542,15 +1011,33 @@ Before completing this workflow, verify:
|
||||
- Determine if overlap is acceptable (defense in depth) or wasteful (same validation at multiple levels)
|
||||
- Consolidate tests at appropriate level (logic → unit, integration → API, journey → E2E)
|
||||
|
||||
### "Test execution results missing" (Phase 2)
|
||||
- Phase 2 gate decision requires `test_results` (CI/CD test reports)
|
||||
- If missing, Phase 2 will be skipped with warning
|
||||
- Provide JUnit XML, TAP, or JSON test report path via `test_results` variable
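As a rough sketch of what Phase 2 needs from such a report, the aggregate counts can usually be read from the root `<testsuites>`/`<testsuite>` attributes of a JUnit XML file. The regex approach below avoids an XML-parser dependency and assumes counts on the root element; deeply nested suites may need a proper parser:

```typescript
// junit-counts.ts - hypothetical sketch: pull aggregate counts from a JUnit XML report
import { readFileSync } from "node:fs";

interface JUnitCounts {
  tests: number;
  failures: number;
  errors: number;
  skipped: number;
}

function readJUnitCounts(path: string): JUnitCounts {
  const xml = readFileSync(path, "utf8");
  // JUnit roots commonly carry tests/failures/errors/skipped attributes
  const attr = (name: string): number => {
    const m = xml.match(new RegExp(`<testsuites?\\b[^>]*\\b${name}="(\\d+)"`));
    return m ? Number(m[1]) : 0;
  };
  return {
    tests: attr("tests"),
    failures: attr("failures"),
    errors: attr("errors"),
    skipped: attr("skipped"),
  };
}

const counts = readJUnitCounts(process.argv[2] ?? "test-report.xml");
const passed = counts.tests - counts.failures - counts.errors - counts.skipped;
console.log(`${passed}/${counts.tests} tests passed`);
```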
|
||||
|
||||
### "Gate decision is FAIL but deployment needed urgently"
|
||||
- Request business waiver (if `allow_waivers: true`)
|
||||
- Document approver, justification, mitigation plan
|
||||
- Create follow-up stories to address gaps
|
||||
- Use WAIVED decision only for non-P0 gaps
|
||||
|
||||
---
|
||||
|
||||
## Related Workflows
|
||||
|
||||
- **testarch-test-design** - Define test priorities (P0/P1/P2/P3) before tracing
|
||||
- **testarch-atdd** - Generate failing acceptance tests for gaps identified
|
||||
- **testarch-automate** - Expand regression suite based on traceability findings
|
||||
- **testarch-gate** - Use traceability matrix as input for quality gate decisions
|
||||
- **testarch-test-review** - Review test quality issues flagged in traceability
|
||||
**Prerequisites:**
|
||||
- `testarch-test-design` - Define test priorities (P0/P1/P2/P3) before tracing (required for Phase 2)
|
||||
- `testarch-atdd` or `testarch-automate` - Generate tests before tracing coverage
|
||||
|
||||
**Complements:**
|
||||
- `testarch-nfr-assess` - Non-functional requirements validation (recommended for release gates)
|
||||
- `testarch-test-review` - Review test quality issues flagged in traceability
|
||||
|
||||
**Next Steps:**
|
||||
- If gate decision is PASS/CONCERNS → Deploy and monitor
|
||||
- If gate decision is FAIL → Add missing tests, re-run trace workflow
|
||||
- If gate decision is WAIVED → Deploy with mitigation, create follow-up stories
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,12 +1,14 @@
|
||||
# Traceability Matrix - Story {STORY_ID}
|
||||
# Traceability Matrix & Gate Decision - Story {STORY_ID}
|
||||
|
||||
**Story:** {STORY_TITLE}
|
||||
**Date:** {DATE}
|
||||
**Status:** {OVERALL_COVERAGE}% Coverage ({GAP_COUNT} {GAP_SEVERITY} gap{s})
|
||||
**Evaluator:** {user_name or TEA Agent}
|
||||
|
||||
---
|
||||
|
||||
## Coverage Summary
|
||||
## PHASE 1: REQUIREMENTS TRACEABILITY
|
||||
|
||||
### Coverage Summary
|
||||
|
||||
| Priority | Total Criteria | FULL Coverage | Coverage % | Status |
|
||||
| --------- | -------------- | ------------- | ---------- | ------------ |
|
||||
@@ -24,9 +26,9 @@
|
||||
|
||||
---
|
||||
|
||||
## Detailed Mapping
|
||||
### Detailed Mapping
|
||||
|
||||
### {CRITERION_ID}: {CRITERION_DESCRIPTION} ({PRIORITY})
|
||||
#### {CRITERION_ID}: {CRITERION_DESCRIPTION} ({PRIORITY})
|
||||
|
||||
- **Coverage:** {COVERAGE_STATUS} {STATUS_ICON}
|
||||
- **Tests:**
|
||||
@@ -47,7 +49,7 @@
|
||||
|
||||
---
|
||||
|
||||
### Example: AC-1: User can login with email and password (P0)
|
||||
#### Example: AC-1: User can login with email and password (P0)
|
||||
|
||||
- **Coverage:** FULL ✅
|
||||
- **Tests:**
|
||||
@@ -62,7 +64,7 @@
|
||||
|
||||
---
|
||||
|
||||
### Example: AC-3: User can reset password via email (P1)
|
||||
#### Example: AC-3: User can reset password via email (P1)
|
||||
|
||||
- **Coverage:** PARTIAL ⚠️
|
||||
- **Tests:**
|
||||
@@ -81,26 +83,9 @@
|
||||
|
||||
---
|
||||
|
||||
### Example: AC-7: Session timeout handling (P2)
|
||||
### Gap Analysis
|
||||
|
||||
- **Coverage:** UNIT-ONLY ⚠️
|
||||
- **Tests:**
|
||||
- `1.3-UNIT-006` - tests/unit/session-manager.spec.ts:42
|
||||
- **Given:** Session has expired timestamp
|
||||
- **When:** isSessionValid is called
|
||||
- **Then:** Returns false
|
||||
|
||||
- **Gaps:**
|
||||
- Missing: E2E validation of timeout behavior in UI
|
||||
- Missing: API test for session refresh flow
|
||||
|
||||
- **Recommendation:** Add `1.3-E2E-005` to validate that user sees timeout message and is redirected to login. Add `1.3-API-002` to validate session refresh endpoint behavior.
|
||||
|
||||
---
|
||||
|
||||
## Gap Analysis
|
||||
|
||||
### Critical Gaps (BLOCKER) ❌
|
||||
#### Critical Gaps (BLOCKER) ❌
|
||||
|
||||
{CRITICAL_GAP_COUNT} gaps found. **Do not release until resolved.**
|
||||
|
||||
@@ -112,7 +97,7 @@
|
||||
|
||||
---
|
||||
|
||||
### High Priority Gaps (PR BLOCKER) ⚠️
|
||||
#### High Priority Gaps (PR BLOCKER) ⚠️
|
||||
|
||||
{HIGH_GAP_COUNT} gaps found. **Address before PR merge.**
|
||||
|
||||
@@ -124,7 +109,7 @@
|
||||
|
||||
---
|
||||
|
||||
### Medium Priority Gaps (Nightly) ⚠️
|
||||
#### Medium Priority Gaps (Nightly) ⚠️
|
||||
|
||||
{MEDIUM_GAP_COUNT} gaps found. **Address in nightly test improvements.**
|
||||
|
||||
@@ -134,7 +119,7 @@
|
||||
|
||||
---
|
||||
|
||||
### Low Priority Gaps (Optional) ℹ️
|
||||
#### Low Priority Gaps (Optional) ℹ️
|
||||
|
||||
{LOW_GAP_COUNT} gaps found. **Optional - add if time permits.**
|
||||
|
||||
@@ -143,9 +128,9 @@
|
||||
|
||||
---
|
||||
|
||||
## Quality Assessment
|
||||
### Quality Assessment
|
||||
|
||||
### Tests with Issues
|
||||
#### Tests with Issues
|
||||
|
||||
**BLOCKER Issues** ❌
|
||||
|
||||
@@ -161,7 +146,7 @@
|
||||
|
||||
---
|
||||
|
||||
### Example Quality Issues
|
||||
#### Example Quality Issues
|
||||
|
||||
**WARNING Issues** ⚠️
|
||||
|
||||
@@ -174,26 +159,26 @@
|
||||
|
||||
---
|
||||
|
||||
### Tests Passing Quality Gates
|
||||
#### Tests Passing Quality Gates
|
||||
|
||||
**{PASSING_TEST_COUNT}/{TOTAL_TEST_COUNT} tests ({PASSING_PCT}%) meet all quality criteria** ✅
|
||||
|
||||
---
|
||||
|
||||
## Duplicate Coverage Analysis
|
||||
### Duplicate Coverage Analysis
|
||||
|
||||
### Acceptable Overlap (Defense in Depth)
|
||||
#### Acceptable Overlap (Defense in Depth)
|
||||
|
||||
- {CRITERION_ID}: Tested at unit (business logic) and E2E (user journey) ✅
|
||||
|
||||
### Unacceptable Duplication ⚠️
|
||||
#### Unacceptable Duplication ⚠️
|
||||
|
||||
- {CRITERION_ID}: Same validation at E2E and Component level
|
||||
- Recommendation: Remove {TEST_ID} or consolidate with {OTHER_TEST_ID}
|
||||
|
||||
---
|
||||
|
||||
## Coverage by Test Level
|
||||
### Coverage by Test Level
|
||||
|
||||
| Test Level | Tests | Criteria Covered | Coverage % |
|
||||
| ---------- | ----------------- | -------------------- | ---------------- |
|
||||
@@ -205,85 +190,459 @@
|
||||
|
||||
---
|
||||
|
||||
## Gate YAML Snippet
|
||||
### Traceability Recommendations
|
||||
|
||||
```yaml
|
||||
traceability:
|
||||
story_id: "{STORY_ID}"
|
||||
date: "{DATE}"
|
||||
coverage:
|
||||
overall: {OVERALL_PCT}%
|
||||
p0: {P0_PCT}%
|
||||
p1: {P1_PCT}%
|
||||
p2: {P2_PCT}%
|
||||
p3: {P3_PCT}%
|
||||
gaps:
|
||||
critical: {CRITICAL_COUNT}
|
||||
high: {HIGH_COUNT}
|
||||
medium: {MEDIUM_COUNT}
|
||||
low: {LOW_COUNT}
|
||||
quality:
|
||||
passing_tests: {PASSING_COUNT}
|
||||
total_tests: {TOTAL_TESTS}
|
||||
blocker_issues: {BLOCKER_COUNT}
|
||||
warning_issues: {WARNING_COUNT}
|
||||
status: "{STATUS}" # PASS / WARN / FAIL
|
||||
recommendations:
|
||||
- "{RECOMMENDATION_1}"
|
||||
- "{RECOMMENDATION_2}"
|
||||
- "{RECOMMENDATION_3}"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Recommendations
|
||||
|
||||
### Immediate Actions (Before PR Merge)
|
||||
#### Immediate Actions (Before PR Merge)
|
||||
|
||||
1. **{ACTION_1}** - {DESCRIPTION}
|
||||
2. **{ACTION_2}** - {DESCRIPTION}
|
||||
|
||||
### Short-term Actions (This Sprint)
|
||||
#### Short-term Actions (This Sprint)
|
||||
|
||||
1. **{ACTION_1}** - {DESCRIPTION}
|
||||
2. **{ACTION_2}** - {DESCRIPTION}
|
||||
|
||||
### Long-term Actions (Backlog)
|
||||
#### Long-term Actions (Backlog)
|
||||
|
||||
1. **{ACTION_1}** - {DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
### Example Recommendations
|
||||
#### Example Recommendations
|
||||
|
||||
### Immediate Actions (Before PR Merge)
|
||||
**Immediate Actions (Before PR Merge)**
|
||||
|
||||
1. **Add P1 Password Reset Tests** - Implement `1.3-API-001` for email service integration and `1.3-E2E-004` for error path validation. P1 coverage currently at 80%, target is 90%.
|
||||
2. **Optimize Slow E2E Test** - Refactor `1.3-E2E-001` to use faster fixture setup. Currently 145s, target is <90s.
|
||||
|
||||
### Short-term Actions (This Sprint)
|
||||
**Short-term Actions (This Sprint)**
|
||||
|
||||
1. **Enhance P2 Coverage** - Add E2E validation for session timeout (`1.3-E2E-005`). Currently UNIT-ONLY coverage.
|
||||
2. **Split Large Test File** - Break `1.3-UNIT-005` (320 lines) into multiple focused test files (<300 lines each).
|
||||
|
||||
### Long-term Actions (Backlog)
|
||||
**Long-term Actions (Backlog)**
|
||||
|
||||
1. **Enrich P3 Coverage** - Add tests for edge cases in P3 criteria if time permits.
|
||||
|
||||
---
|
||||
|
||||
## PHASE 2: QUALITY GATE DECISION
|
||||
|
||||
**Gate Type:** {story | epic | release | hotfix}
|
||||
**Decision Mode:** {deterministic | manual}
|
||||
|
||||
---
|
||||
|
||||
### Evidence Summary
|
||||
|
||||
#### Test Execution Results
|
||||
|
||||
- **Total Tests**: {total_count}
|
||||
- **Passed**: {passed_count} ({pass_percentage}%)
|
||||
- **Failed**: {failed_count} ({fail_percentage}%)
|
||||
- **Skipped**: {skipped_count} ({skip_percentage}%)
|
||||
- **Duration**: {total_duration}
|
||||
|
||||
**Priority Breakdown:**
|
||||
|
||||
- **P0 Tests**: {p0_passed}/{p0_total} passed ({p0_pass_rate}%) {✅ | ❌}
|
||||
- **P1 Tests**: {p1_passed}/{p1_total} passed ({p1_pass_rate}%) {✅ | ⚠️ | ❌}
|
||||
- **P2 Tests**: {p2_passed}/{p2_total} passed ({p2_pass_rate}%) {informational}
|
||||
- **P3 Tests**: {p3_passed}/{p3_total} passed ({p3_pass_rate}%) {informational}
|
||||
|
||||
**Overall Pass Rate**: {overall_pass_rate}% {✅ | ⚠️ | ❌}
|
||||
|
||||
**Test Results Source**: {CI_run_id | test_report_url | local_run}
|
||||
|
||||
---
|
||||
|
||||
#### Coverage Summary (from Phase 1)
|
||||
|
||||
**Requirements Coverage:**
|
||||
|
||||
- **P0 Acceptance Criteria**: {p0_covered}/{p0_total} covered ({p0_coverage}%) {✅ | ❌}
|
||||
- **P1 Acceptance Criteria**: {p1_covered}/{p1_total} covered ({p1_coverage}%) {✅ | ⚠️ | ❌}
|
||||
- **P2 Acceptance Criteria**: {p2_covered}/{p2_total} covered ({p2_coverage}%) {informational}
|
||||
- **Overall Coverage**: {overall_coverage}%
|
||||
|
||||
**Code Coverage** (if available):
|
||||
|
||||
- **Line Coverage**: {line_coverage}% {✅ | ⚠️ | ❌}
|
||||
- **Branch Coverage**: {branch_coverage}% {✅ | ⚠️ | ❌}
|
||||
- **Function Coverage**: {function_coverage}% {✅ | ⚠️ | ❌}
|
||||
|
||||
**Coverage Source**: {coverage_report_url | coverage_file_path}
|
||||
|
||||
---
|
||||
|
||||
#### Non-Functional Requirements (NFRs)
|
||||
|
||||
**Security**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- Security Issues: {security_issue_count}
|
||||
- {details_if_issues}
|
||||
|
||||
**Performance**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- {performance_metrics_summary}
|
||||
|
||||
**Reliability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- {reliability_metrics_summary}
|
||||
|
||||
**Maintainability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
|
||||
|
||||
- {maintainability_metrics_summary}
|
||||
|
||||
**NFR Source**: {nfr_assessment_file_path | not_assessed}
|
||||
|
||||
---
|
||||
|
||||
#### Flakiness Validation
|
||||
|
||||
**Burn-in Results** (if available):
|
||||
|
||||
- **Burn-in Iterations**: {iteration_count} (e.g., 10)
|
||||
- **Flaky Tests Detected**: {flaky_test_count} {✅ if 0 | ❌ if >0}
|
||||
- **Stability Score**: {stability_percentage}%
|
||||
|
||||
**Flaky Tests List** (if any):
|
||||
|
||||
- {flaky_test_1_name} - {failure_rate}
|
||||
- {flaky_test_2_name} - {failure_rate}
|
||||
|
||||
**Burn-in Source**: {CI_burn_in_run_id | not_available}
|
||||
|
||||
---
|
||||
|
||||
### Decision Criteria Evaluation
|
||||
|
||||
#### P0 Criteria (Must ALL Pass)
|
||||
|
||||
| Criterion             | Threshold | Actual                    | Status              |
| --------------------- | --------- | ------------------------- | ------------------- |
| P0 Coverage           | 100%      | {p0_coverage}%            | {✅ PASS \| ❌ FAIL} |
| P0 Test Pass Rate     | 100%      | {p0_pass_rate}%           | {✅ PASS \| ❌ FAIL} |
| Security Issues       | 0         | {security_issue_count}    | {✅ PASS \| ❌ FAIL} |
| Critical NFR Failures | 0         | {critical_nfr_fail_count} | {✅ PASS \| ❌ FAIL} |
| Flaky Tests           | 0         | {flaky_test_count}        | {✅ PASS \| ❌ FAIL} |
|
||||
|
||||
**P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}
|
||||
|
||||
---
|
||||
|
||||
#### P1 Criteria (Required for PASS, May Accept for CONCERNS)
|
||||
|
||||
| Criterion              | Threshold                 | Actual               | Status                             |
| ---------------------- | ------------------------- | -------------------- | ---------------------------------- |
| P1 Coverage            | ≥{min_p1_coverage}%       | {p1_coverage}%       | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| P1 Test Pass Rate      | ≥{min_p1_pass_rate}%      | {p1_pass_rate}%      | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| Overall Test Pass Rate | ≥{min_overall_pass_rate}% | {overall_pass_rate}% | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| Overall Coverage       | ≥{min_coverage}%          | {overall_coverage}%  | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
|
||||
|
||||
**P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}
|
||||
|
||||
---
|
||||
|
||||
#### P2/P3 Criteria (Informational, Don't Block)
|
||||
|
||||
| Criterion | Actual | Notes |
|
||||
| ----------------- | --------------- | ------------------------------------------------------------ |
|
||||
| P2 Test Pass Rate | {p2_pass_rate}% | {allow_p2_failures ? "Tracked, doesn't block" : "Evaluated"} |
|
||||
| P3 Test Pass Rate | {p3_pass_rate}% | {allow_p3_failures ? "Tracked, doesn't block" : "Evaluated"} |
|
||||
|
||||
---
|
||||
|
||||
### GATE DECISION: {PASS | CONCERNS | FAIL | WAIVED}
|
||||
|
||||
---
|
||||
|
||||
### Rationale
|
||||
|
||||
{Explain decision based on criteria evaluation}
|
||||
|
||||
{Highlight key evidence that drove decision}
|
||||
|
||||
{Note any assumptions or caveats}
|
||||
|
||||
**Example (PASS):**
|
||||
|
||||
> All P0 criteria met with 100% coverage and pass rates across critical tests. All P1 criteria exceeded thresholds with 98% overall pass rate and 92% coverage. No security issues detected. No flaky tests in validation. Feature is ready for production deployment with standard monitoring.
|
||||
|
||||
**Example (CONCERNS):**
|
||||
|
||||
> All P0 criteria met, ensuring critical user journeys are protected. However, P1 coverage (88%) falls below threshold (90%) due to missing E2E test for AC-5 edge case. Overall pass rate (96%) is excellent. Issues are non-critical and have acceptable workarounds. Risk is low enough to deploy with enhanced monitoring.
|
||||
|
||||
**Example (FAIL):**
|
||||
|
||||
> CRITICAL BLOCKERS DETECTED:
|
||||
>
|
||||
> 1. P0 coverage incomplete (80%) - AC-2 security validation missing
|
||||
> 2. P0 test failures (75% pass rate) in core search functionality
|
||||
> 3. Unresolved SQL injection vulnerability in search filter (CRITICAL)
|
||||
>
|
||||
> Release MUST BE BLOCKED until P0 issues are resolved. Security vulnerability cannot be waived.
|
||||
|
||||
**Example (WAIVED):**
|
||||
|
||||
> Original decision was FAIL due to P0 test failure in legacy Excel 2007 export module (affects <1% of users). However, release contains critical GDPR compliance features required by regulatory deadline (Oct 15). Business has approved waiver given:
|
||||
>
|
||||
> - Regulatory priority overrides legacy module risk
|
||||
> - Workaround available (use Excel 2010+)
|
||||
> - Issue will be fixed in v2.4.1 hotfix (due Oct 20)
|
||||
> - Enhanced monitoring in place
|
||||
|
||||
---
|
||||
|
||||
### {Section: Delete if not applicable}
|
||||
|
||||
#### Residual Risks (For CONCERNS or WAIVED)
|
||||
|
||||
List unresolved P1/P2 issues that don't block release but should be tracked:
|
||||
|
||||
1. **{Risk Description}**
|
||||
- **Priority**: P1 | P2
|
||||
- **Probability**: Low | Medium | High
|
||||
- **Impact**: Low | Medium | High
|
||||
- **Risk Score**: {probability × impact}
|
||||
- **Mitigation**: {workaround or monitoring plan}
|
||||
- **Remediation**: {fix in next sprint/release}
|
||||
|
||||
**Overall Residual Risk**: {LOW | MEDIUM | HIGH}
|
||||
|
||||
---
|
||||
|
||||
#### Waiver Details (For WAIVED only)
|
||||
|
||||
**Original Decision**: ❌ FAIL
|
||||
|
||||
**Reason for Failure**:
|
||||
|
||||
- {list_of_blocking_issues}
|
||||
|
||||
**Waiver Information**:
|
||||
|
||||
- **Waiver Reason**: {business_justification}
|
||||
- **Waiver Approver**: {name}, {role} (e.g., Jane Doe, VP Engineering)
|
||||
- **Approval Date**: {YYYY-MM-DD}
|
||||
- **Waiver Expiry**: {YYYY-MM-DD} (**NOTE**: Does NOT apply to next release)
|
||||
|
||||
**Monitoring Plan**:
|
||||
|
||||
- {enhanced_monitoring_1}
|
||||
- {enhanced_monitoring_2}
|
||||
- {escalation_criteria}
|
||||
|
||||
**Remediation Plan**:
|
||||
|
||||
- **Fix Target**: {next_release_version} (e.g., v2.4.1 hotfix)
|
||||
- **Due Date**: {YYYY-MM-DD}
|
||||
- **Owner**: {team_or_person}
|
||||
- **Verification**: {how_fix_will_be_verified}
|
||||
|
||||
**Business Justification**:
|
||||
{detailed_explanation_of_why_waiver_is_acceptable}
|
||||
|
||||
---
|
||||
|
||||
#### Critical Issues (For FAIL or CONCERNS)
|
||||
|
||||
Top blockers requiring immediate attention:
|
||||
|
||||
| Priority | Issue | Description | Owner | Due Date | Status |
|
||||
| -------- | ------------- | ------------------- | ------------ | ------------ | ------------------ |
|
||||
| P0 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
| P0 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
| P1 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
|
||||
|
||||
**Blocking Issues Count**: {p0_blocker_count} P0 blockers, {p1_blocker_count} P1 issues
|
||||
|
||||
---
|
||||
|
||||
### Gate Recommendations
|
||||
|
||||
#### For PASS Decision ✅
|
||||
|
||||
1. **Proceed to deployment**
|
||||
- Deploy to staging environment
|
||||
- Validate with smoke tests
|
||||
- Monitor key metrics for 24-48 hours
|
||||
- Deploy to production with standard monitoring
|
||||
|
||||
2. **Post-Deployment Monitoring**
|
||||
- {metric_1_to_monitor}
|
||||
- {metric_2_to_monitor}
|
||||
- {alert_thresholds}
|
||||
|
||||
3. **Success Criteria**
|
||||
- {success_criterion_1}
|
||||
- {success_criterion_2}
|
||||
|
||||
---
|
||||
|
||||
#### For CONCERNS Decision ⚠️

1. **Deploy with Enhanced Monitoring**
   - Deploy to staging with extended validation period
   - Enable enhanced logging/monitoring for known risk areas:
     - {risk_area_1}
     - {risk_area_2}
   - Set aggressive alerts for potential issues
   - Deploy to production with caution

2. **Create Remediation Backlog**
   - Create story: "{fix_title_1}" (Priority: {priority})
   - Create story: "{fix_title_2}" (Priority: {priority})
   - Target sprint: {next_sprint}

3. **Post-Deployment Actions**
   - Monitor {specific_areas} closely for {time_period}
   - Weekly status updates on remediation progress
   - Re-assess after fixes deployed

---

#### For FAIL Decision ❌

1. **Block Deployment Immediately**
   - Do NOT deploy to any environment
   - Notify stakeholders of blocking issues
   - Escalate to tech lead and PM

2. **Fix Critical Issues**
   - Address P0 blockers listed in the Critical Issues section
   - Confirm owner assignments
   - Agree on due dates
   - Hold a daily standup on blocker resolution

3. **Re-Run Gate After Fixes**
   - Re-run full test suite after fixes
   - Re-run `bmad tea *trace` workflow
   - Verify decision is PASS before deploying

---

#### For WAIVED Decision 🔓

1. **Deploy with Business Approval**
   - Confirm waiver approver has signed off
   - Document waiver in release notes
   - Notify all stakeholders of waived risks

2. **Aggressive Monitoring**
   - {enhanced_monitoring_plan}
   - {escalation_procedures}
   - Daily checks on waived risk areas

3. **Mandatory Remediation**
   - Fix MUST be completed by {due_date}
   - Issue CANNOT be waived in next release
   - Track remediation progress weekly
   - Verify fix in next gate

---

### Next Steps

**Immediate Actions** (next 24-48 hours):

1. {action_1}
2. {action_2}
3. {action_3}

**Follow-up Actions** (next sprint/release):

1. {action_1}
2. {action_2}
3. {action_3}

**Stakeholder Communication**:

- Notify PM: {decision_summary}
- Notify SM: {decision_summary}
- Notify DEV lead: {decision_summary}

---

## Integrated YAML Snippet (CI/CD)

```yaml
traceability_and_gate:
  # Phase 1: Traceability
  traceability:
    story_id: "{STORY_ID}"
    date: "{DATE}"
    coverage:
      overall: {OVERALL_PCT}%
      p0: {P0_PCT}%
      p1: {P1_PCT}%
      p2: {P2_PCT}%
      p3: {P3_PCT}%
    gaps:
      critical: {CRITICAL_COUNT}
      high: {HIGH_COUNT}
      medium: {MEDIUM_COUNT}
      low: {LOW_COUNT}
    quality:
      passing_tests: {PASSING_COUNT}
      total_tests: {TOTAL_TESTS}
      blocker_issues: {BLOCKER_COUNT}
      warning_issues: {WARNING_COUNT}
    recommendations:
      - "{RECOMMENDATION_1}"
      - "{RECOMMENDATION_2}"

  # Phase 2: Gate Decision
  gate_decision:
    decision: "{PASS | CONCERNS | FAIL | WAIVED}"
    gate_type: "{story | epic | release | hotfix}"
    decision_mode: "{deterministic | manual}"
    criteria:
      p0_coverage: {p0_coverage}%
      p0_pass_rate: {p0_pass_rate}%
      p1_coverage: {p1_coverage}%
      p1_pass_rate: {p1_pass_rate}%
      overall_pass_rate: {overall_pass_rate}%
      overall_coverage: {overall_coverage}%
      security_issues: {security_issue_count}
      critical_nfrs_fail: {critical_nfr_fail_count}
      flaky_tests: {flaky_test_count}
    thresholds:
      min_p0_coverage: 100
      min_p0_pass_rate: 100
      min_p1_coverage: {min_p1_coverage}
      min_p1_pass_rate: {min_p1_pass_rate}
      min_overall_pass_rate: {min_overall_pass_rate}
      min_coverage: {min_coverage}
    evidence:
      test_results: "{CI_run_id | test_report_url}"
      traceability: "{trace_file_path}"
      nfr_assessment: "{nfr_file_path}"
      code_coverage: "{coverage_report_url}"
    next_steps: "{brief_summary_of_recommendations}"
    waiver: # Only if WAIVED
      reason: "{business_justification}"
      approver: "{name}, {role}"
      expiry: "{YYYY-MM-DD}"
      remediation_due: "{YYYY-MM-DD}"
```
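
The `decision_mode: deterministic` rules above can be enforced mechanically in CI. The sketch below is illustrative and not part of the BMad workflows: it assumes the snippet has been written to a file named `gate-decision.yaml` with the placeholders filled in as plain numbers (no `%` suffix), uses PyYAML, and treats CONCERNS as non-blocking, which is only one possible policy. It recomputes the decision from `criteria` and `thresholds` rather than trusting the recorded `decision` field.

```python
# Illustrative sketch (assumed file name and policy, not shipped with BMad):
# enforce the deterministic gate rules from gate-decision.yaml in a CI step.
import sys

import yaml  # PyYAML


def evaluate(criteria: dict, thresholds: dict) -> str:
    """Rule-based decision mirroring the criteria/thresholds keys above."""
    p0_or_security_miss = (
        criteria["p0_coverage"] < thresholds["min_p0_coverage"]
        or criteria["p0_pass_rate"] < thresholds["min_p0_pass_rate"]
        or criteria["security_issues"] > 0
        or criteria["critical_nfrs_fail"] > 0
    )
    if p0_or_security_miss:
        return "FAIL"
    p1_or_overall_shortfall = (
        criteria["p1_coverage"] < thresholds["min_p1_coverage"]
        or criteria["p1_pass_rate"] < thresholds["min_p1_pass_rate"]
        or criteria["overall_pass_rate"] < thresholds["min_overall_pass_rate"]
        or criteria["overall_coverage"] < thresholds["min_coverage"]
    )
    return "CONCERNS" if p1_or_overall_shortfall else "PASS"


if __name__ == "__main__":
    with open("gate-decision.yaml") as f:
        gate = yaml.safe_load(f)["traceability_and_gate"]["gate_decision"]
    decision = evaluate(gate["criteria"], gate["thresholds"])
    print(f"Gate decision: {decision}")
    # Block the pipeline only on an unwaived FAIL; PASS, CONCERNS, and WAIVED proceed.
    sys.exit(1 if decision == "FAIL" and not gate.get("waiver") else 0)
```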

---

## Related Artifacts

- **Story File:** {STORY_FILE_PATH}
- **Test Design:** {TEST_DESIGN_PATH} (if available)
- **Tech Spec:** {TECH_SPEC_PATH} (if available)
- **Test Results:** {TEST_RESULTS_PATH}
- **NFR Assessment:** {NFR_FILE_PATH} (if available)
- **Test Files:** {TEST_DIR_PATH}

---

## Sign-Off

**Traceability Assessment:**
**Phase 1 - Traceability Assessment:**

- Overall Coverage: {OVERALL_PCT}%
- P0 Coverage: {P0_PCT}% {P0_STATUS}
@@ -291,16 +650,23 @@ traceability:
- Critical Gaps: {CRITICAL_COUNT}
- High Priority Gaps: {HIGH_COUNT}

**Gate Status:** {STATUS} {STATUS_ICON}
**Phase 2 - Gate Decision:**

- **Decision**: {PASS | CONCERNS | FAIL | WAIVED} {STATUS_ICON}
- **P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}
- **P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}

**Overall Status:** {STATUS} {STATUS_ICON}

**Next Steps:**

- If PASS ✅: Proceed to `*gate` workflow or PR merge
- If WARN ⚠️: Address HIGH priority gaps, re-run `*trace`
- If FAIL ❌: Run `*atdd` to generate missing P0 tests, re-run `*trace`
- If PASS ✅: Proceed to deployment
- If CONCERNS ⚠️: Deploy with monitoring, create remediation backlog
- If FAIL ❌: Block deployment, fix critical issues, re-run workflow
- If WAIVED 🔓: Deploy with business approval and aggressive monitoring

**Generated:** {DATE}
**Workflow:** testarch-trace v4.0
**Workflow:** testarch-trace v4.0 (Enhanced with Gate Decision)

---

@@ -1,6 +1,6 @@
# Test Architect workflow: trace
# Test Architect workflow: trace (enhanced with gate decision)
name: testarch-trace
description: "Generate requirements-to-tests traceability matrix with coverage analysis and gap identification"
description: "Generate requirements-to-tests traceability matrix, analyze coverage, and make quality gate decision (PASS/CONCERNS/FAIL/WAIVED)"
author: "BMad"

# Critical variables from config
@@ -65,6 +65,46 @@ variables:
  include_code_coverage: false # Integrate with code coverage reports (Istanbul, NYC)
  check_assertions: true # Verify explicit assertions in tests

  # PHASE 2: Gate Decision Variables (runs after traceability)
  enable_gate_decision: true # Run gate decision after traceability (Phase 2)

  # Gate target specification
  gate_type: "story" # story | epic | release | hotfix
  # story_id, epic_num, release_version inherited from trace context

  # Gate decision configuration
  decision_mode: "deterministic" # deterministic (rule-based) | manual (team decision)
  allow_waivers: true # Allow business-approved waivers for FAIL → WAIVED
  require_evidence: true # Require links to test results, reports, etc.

  # Input sources for gate (auto-discovered from Phase 1 + external)
  # story_file, test_design_file inherited from trace
  nfr_file: "" # Path to nfr-assessment.md (optional, recommended for release gates)
  test_results: "" # Path to test execution results (CI artifacts, reports)

  # Decision criteria thresholds
  min_p0_pass_rate: 100 # P0 tests must have 100% pass rate
  min_p1_pass_rate: 95 # P1 tests threshold
  min_overall_pass_rate: 90 # Overall test pass rate
  # min_coverage already defined above (min_overall_coverage: 80)
  max_critical_nfrs_fail: 0 # No critical NFRs can fail
  max_security_issues: 0 # No unresolved security issues

  # Risk tolerance
  allow_p2_failures: true # P2 failures don't block release
  allow_p3_failures: true # P3 failures don't block release
  escalate_p1_failures: true # P1 failures require escalation approval

  # Gate output configuration
  gate_output_file: "{output_folder}/gate-decision-{gate_type}-{story_id}{epic_num}{release_version}.md"
  append_to_history: true # Append to bmm-workflow-status.md gate history
  notify_stakeholders: true # Generate notification message for team

  # Advanced gate options
  check_all_workflows_complete: true # Verify test-design, trace, nfr-assess complete
  validate_evidence_freshness: true # Warn if assessments are >7 days old
  require_sign_off: false # Require named approver for gate decision

  # Output configuration
  default_output_file: "{output_folder}/traceability-matrix.md"

@@ -80,9 +120,12 @@ required_tools:
recommended_inputs:
  - story: "Story markdown with acceptance criteria (required for BMad mode)"
  - test_files: "Test suite for the feature (auto-discovered if not provided)"
  - test_design: "Test design with risk/priority assessment (optional)"
  - test_design: "Test design with risk/priority assessment (required for Phase 2 gate)"
  - tech_spec: "Technical specification (optional)"
  - existing_tests: "Current test suite for analysis"
  - test_results: "CI/CD test execution results (required for Phase 2 gate)"
  - nfr_assess: "Non-functional requirements validation (recommended for release gates)"
  - code_coverage: "Code coverage report (optional)"

tags:
  - qa
@@ -90,6 +133,9 @@ tags:
  - test-architect
  - coverage
  - requirements
  - gate
  - decision
  - release

execution_hints:
  interactive: false # Minimize prompts
Block a user