BoMB updates

Brian Madison
2025-10-04 00:22:59 -05:00
parent 9e8c7f3503
commit 5ee4cf535c
326 changed files with 39464 additions and 402 deletions

bmad/bmm/testarch/README.md

@@ -0,0 +1,162 @@
---
last-redoc-date: 2025-09-30
---
# Test Architect (TEA) Agent Guide
## Overview
- **Persona:** Murat, Master Test Architect and Quality Advisor focused on risk-based testing, fixture architecture, ATDD, and CI/CD governance.
- **Mission:** Deliver actionable quality strategies, automation coverage, and gate decisions that scale with project level and compliance demands.
- **Use When:** Project level ≥2, integration risk is non-trivial, brownfield regression risk exists, or compliance/NFR evidence is required.
## Prerequisites and Setup
1. Run the core planning workflows first:
   - Analyst `*product-brief`
   - Product Manager `*plan-project`
   - Architect `*solution-architecture`
2. Confirm `bmad/bmm/config.yaml` defines `project_name`, `output_folder`, `dev_story_location`, and language settings.
3. Ensure a test framework setup exists; if not, run the `*framework` command to create one before development begins.
4. Skim supporting references (knowledge under `testarch/`, command workflows under `workflows/testarch/`).
   - `tea-index.csv` + `knowledge/*.md`
## High-Level Cheat Sheets
### Greenfield Feature Launch (Level 2)
| Phase | Test Architect | Dev / Team | Outputs |
| ------------------ | ------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
| Setup | - | Analyst `*product-brief`, PM `*plan-project`, Architect `*solution-architecture` | `{output_folder}/product-brief*.md`, `PRD.md`, `epics.md`, `solution-architecture.md` |
| Pre-Implementation | Run `*framework` (if harness missing), `*ci`, and `*test-design` | Review risk/design/CI guidance, align backlog | Test scaffold, CI pipeline, risk and coverage strategy |
| Story Prep | - | Scrum Master `*create-story`, `*story-context` | Story markdown + context XML |
| Implementation | (Optional) Trigger `*atdd` before dev to supply failing tests + checklist | Implement story guided by ATDD checklist | Failing acceptance tests + implementation checklist |
| Post-Dev | Execute `*automate`, re-run `*trace` | Address recommendations, update code/tests | Regression specs, refreshed coverage matrix |
| Release | Run `*gate` | Confirm Definition of Done, share release notes | Gate YAML + release summary (owners, waivers) |
<details>
<summary>Execution Notes</summary>
- Run `*framework` only once per repo or when modern harness support is missing.
- `*framework` followed by `*ci` establishes install + pipeline; `*test-design` then handles risk scoring, mitigations, and scenario planning in one pass.
- Use `*atdd` before coding when the team can adopt ATDD; share its checklist with the dev agent.
- Post-implementation, keep `*trace` current, expand coverage with `*automate`, and finish with `*gate`.
</details>
<details>
<summary>Worked Example: “Nova CRM” Greenfield Feature</summary>
1. **Planning:** Analyst runs `*product-brief`; PM executes `*plan-project` to produce PRD and epics; Architect completes `*solution-architecture` for the new module.
2. **Setup:** TEA checks harness via `*framework`, configures `*ci`, and runs `*test-design` to capture risk/coverage plans.
3. **Story Prep:** Scrum Master generates the story via `*create-story`; PO validates using `*assess-project-ready`.
4. **Implementation:** TEA optionally runs `*atdd`; Dev implements with guidance from failing tests and the plan.
5. **Post-Dev and Release:** TEA runs `*automate`, re-runs `*trace`, and finishes with `*gate` to document the decision.
</details>
### Brownfield Feature Enhancement (Level 3–4)
| Phase | Test Architect | Dev / Team | Outputs |
| ----------------- | ------------------------------------------------------------------- | ---------------------------------------------------------- | ------------------------------------------------------- |
| Refresh Context | - | Analyst/PM/Architect rerun planning workflows | Updated planning artifacts in `{output_folder}` |
| Baseline Coverage | Run `*trace` to inventory existing tests | Review matrix, flag hotspots | Coverage matrix + initial gate snippet |
| Risk Targeting | Run `*test-design` | Align remediation/backlog priorities | Brownfield risk memo + scenario matrix |
| Story Prep | - | Scrum Master `*create-story` | Updated story markdown |
| Implementation | (Optional) Run `*atdd` before dev | Implement story, referencing checklist/tests | Failing acceptance tests + implementation checklist |
| Post-Dev | Apply `*automate`, re-run `*trace`, trigger `*nfr-assess` if needed | Resolve gaps, update docs/tests | Regression specs, refreshed coverage matrix, NFR report |
| Release | Run `*gate` | Product Owner `*assess-project-ready`, share release notes | Gate YAML + release summary |
<details>
<summary>Execution Notes</summary>
- Lead with `*trace` so remediation plans target true coverage gaps. Ensure `*framework` and `*ci` are in place early in the engagement; if the brownfield lacks them, run those setup steps immediately after refreshing context.
- `*test-design` should highlight regression hotspots, mitigations, and P0 scenarios.
- Use `*atdd` when stories benefit from ATDD; otherwise proceed to implementation and rely on post-dev automation.
- After development, expand coverage with `*automate`, re-run `*trace`, and close with `*gate`. Run `*nfr-assess` now if non-functional risks weren't addressed earlier.
- Product Owner `*assess-project-ready` confirms the team has artifacts before handoff or release.
</details>
<details>
<summary>Worked Example: “Atlas Payments” Brownfield Story</summary>
1. **Context Refresh:** Analyst reruns `*product-brief`; PM executes `*plan-project` to update PRD, analysis, and `epics.md`; Architect triggers `*solution-architecture` capturing legacy payment flows.
2. **Baseline Coverage:** TEA executes `*trace` to record current coverage in `docs/qa/assessments/atlas-payment-trace.md`.
3. **Risk and Design:** `*test-design` flags settlement edge cases, plans mitigations, and allocates new API/E2E scenarios with P0 priorities.
4. **Story Prep:** Scrum Master generates `stories/story-1.1.md` via `*create-story`, automatically pulling updated context.
5. **ATDD First:** TEA runs `*atdd`, producing failing Playwright specs under `tests/e2e/payments/` plus an implementation checklist.
6. **Implementation:** Dev pairs with the checklist/tests to deliver the story.
7. **Post-Implementation:** TEA applies `*automate`, re-runs `*trace`, performs `*nfr-assess` to validate SLAs, and closes with `*gate` marking PASS with follow-ups.
</details>
### Enterprise / Compliance Program (Level 4)
| Phase | Test Architect | Dev / Team | Outputs |
| ------------------- | ------------------------------------------------ | ---------------------------------------------- | --------------------------------------------------------- |
| Strategic Planning | - | Analyst/PM/Architect standard workflows | Enterprise-grade PRD, epics, architecture |
| Quality Planning | Run `*framework`, `*test-design`, `*nfr-assess` | Review guidance, align compliance requirements | Harness scaffold, risk + coverage plan, NFR documentation |
| Pipeline Enablement | Configure `*ci` | Coordinate secrets, pipeline approvals | `.github/workflows/test.yml`, helper scripts |
| Execution | Enforce `*atdd`, `*automate`, `*trace` per story | Implement stories, resolve TEA findings | Tests, fixtures, coverage matrices |
| Release | Run `*gate` | Capture sign-offs, archive artifacts | Updated assessments, gate YAML, audit trail |
<details>
<summary>Execution Notes</summary>
- Use `*atdd` for every story when feasible so acceptance tests lead implementation in regulated environments.
- `*ci` scaffolds selective testing scripts, burn-in jobs, caching, and notifications for long-running suites.
- Prior to release, rerun coverage (`*trace`, `*automate`) and formalize the decision in `*gate`; store everything for audits. Call `*nfr-assess` here if compliance/performance requirements weren't captured during planning.
</details>
<details>
<summary>Worked Example: “Helios Ledger” Enterprise Release</summary>
1. **Strategic Planning:** Analyst/PM/Architect complete PRD, epics, and architecture using the standard workflows.
2. **Quality Planning:** TEA runs `*framework`, `*test-design`, and `*nfr-assess` to establish mitigations, coverage, and NFR targets.
3. **Pipeline Setup:** TEA configures CI via `*ci` with selective execution scripts.
4. **Execution:** For each story, TEA enforces `*atdd`, `*automate`, and `*trace`; Dev teams iterate on the findings.
5. **Release:** TEA re-checks coverage and logs the final gate decision via `*gate`, archiving artifacts for compliance.
</details>
## Command Catalog
| Command | Task File | Primary Outputs | Notes |
| -------------- | ------------------------------------------------ | ------------------------------------------------------------------- | ------------------------------------------------ |
| `*framework` | `workflows/testarch/framework/instructions.md` | Playwright/Cypress scaffold, `.env.example`, `.nvmrc`, sample specs | Use when no production-ready harness exists |
| `*atdd` | `workflows/testarch/atdd/instructions.md` | Failing acceptance tests + implementation checklist | Requires approved story + harness |
| `*automate` | `workflows/testarch/automate/instructions.md` | Prioritized specs, fixtures, README/script updates, DoD summary | Avoid duplicate coverage (see priority matrix) |
| `*ci` | `workflows/testarch/ci/instructions.md` | CI workflow, selective test scripts, secrets checklist | Platform-aware (GitHub Actions default) |
| `*test-design` | `workflows/testarch/test-design/instructions.md` | Combined risk assessment, mitigation plan, and coverage strategy | Handles risk scoring and test design in one pass |
| `*trace` | `workflows/testarch/trace/instructions.md` | Coverage matrix, recommendations, gate snippet | Requires access to story/tests repositories |
| `*nfr-assess` | `workflows/testarch/nfr-assess/instructions.md` | NFR assessment report with actions | Focus on security/performance/reliability |
| `*gate` | `workflows/testarch/gate/instructions.md` | Gate YAML + summary (PASS/CONCERNS/FAIL/WAIVED) | Deterministic decision rules + rationale |
<details>
<summary>Command Guidance and Context Loading</summary>
- Each task now carries its own preflight/flow/deliverable guidance inline.
- `tea-index.csv` maps workflow needs to knowledge fragments; keep tags accurate as you add guidance.
- Consider future modularization into orchestrated workflows if additional automation is needed.
- Update the fragment markdown files alongside workflow edits so guidance and outputs stay in sync.
</details>
## Workflow Placement
The TEA stack has three tightly linked layers:
1. **Agent spec (`agents/tea.md`)** declares the persona, critical actions, and the `run-workflow` entries for every TEA command. Critical actions instruct the agent to load `tea-index.csv` and then fetch only the fragments it needs from `knowledge/` before giving guidance.
2. **Knowledge index (`tea-index.csv`)** catalogues each fragment with tags and file paths. Workflows call out the IDs they need (e.g., `risk-governance`, `fixture-architecture`) so the agent loads targeted guidance instead of a monolithic brief.
3. **Workflows (`workflows/testarch/*`)** contain the task flows and reference `tea-index.csv` in their `<flow>`/`<notes>` sections to request specific fragments. Keeping all workflows in this directory ensures consistent discovery during planning (`*framework`), implementation (`*atdd`, `*automate`, `*trace`), and release (`*nfr-assess`, `*gate`).
This separation lets us expand the knowledge base without touching agent wiring and keeps every command remote-controllable via the standard BMAD workflow runner. As navigation improves, we can add lightweight entrypoints or tags in the index without changing where workflows live.
## Appendix
- **Supporting Knowledge:**
  - `tea-index.csv`: Catalog of knowledge fragments with tags and file paths under `knowledge/` for task-specific loading.
  - `knowledge/*.md`: Focused summaries (fixtures, network, CI, levels, priorities, etc.) distilled from Murat's external resources.
  - `test-resources-for-ai-flat.txt`: Raw 347KB archive retained for manual deep dives when a fragment needs source validation.


@@ -0,0 +1,9 @@
# CI Pipeline and Burn-In Strategy
- Stage jobs: install/caching once, run `test-changed` for quick feedback, then shard full suites with `fail-fast: false` so evidence isn't lost.
- Re-run changed specs 5–10x (burn-in) before merging to flush flakes; fail the pipeline on the first inconsistent run.
- Upload artifacts on failure (videos, traces, HAR) and keep retry counts explicit—hidden retries hide instability.
- Use `wait-on` for app startup, enforce time budgets (<10 min per job), and document required secrets alongside workflows.
- Mirror CI scripts locally (`npm run test:ci`, `scripts/burn-in-changed.sh`) so devs reproduce pipeline behaviour exactly.
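A minimal TypeScript sketch of mirroring the burn-in job locally (the script path, branch name, and spec pattern are assumptions; Playwright's built-in `--repeat-each` flag can replace the loop when per-run isolation isn't needed):

```ts
// scripts/burn-in-changed.ts - hypothetical local mirror of the CI burn-in job
import { execSync } from "node:child_process";

// Specs touched relative to main (assumes a git checkout with origin/main fetched).
const changed = execSync("git diff --name-only origin/main...HEAD", { encoding: "utf8" })
  .split("\n")
  .filter((file) => /\.spec\.ts$/.test(file));

if (changed.length === 0) {
  console.log("No changed specs - skipping burn-in.");
  process.exit(0);
}

const RUNS = 5; // push toward 10 for flake-prone areas
for (let run = 1; run <= RUNS; run++) {
  console.log(`Burn-in run ${run}/${RUNS}`);
  // Throws (and fails the pipeline) on the first inconsistent run.
  execSync(`npx playwright test ${changed.join(" ")}`, { stdio: "inherit" });
}
```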
_Source: Murat CI/CD strategy blog, Playwright/Cypress workflow examples._


@@ -0,0 +1,9 @@
# Component Test-Driven Development Loop
- Start every UI change with a failing component spec (`cy.mount` or RTL `render`); ship only after red → green → refactor passes.
- Recreate providers/stores per spec to prevent state bleed and keep parallel runs deterministic.
- Use factories to exercise prop/state permutations; cover accessibility by asserting against roles, labels, and keyboard flows.
- Keep component specs under ~100 lines: split by intent (rendering, state transitions, error messaging) to preserve clarity.
- Pair component tests with visual debugging (Cypress runner, Storybook, Playwright trace viewer) to accelerate diagnosis.
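A sketch of the red-first loop with RTL `render` (Vitest is assumed as the runner, and the `Counter` component is hypothetical; the spec is written before the component so it fails for the intended reason):

```tsx
import { render, screen } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
import { expect, test } from "vitest";
import { Counter } from "./Counter"; // does not exist yet: red first

test("increments via an accessible button", async () => {
  render(<Counter />); // fresh render per spec: no provider/state bleed
  // Assert through roles and accessible names, not test IDs.
  await userEvent.click(screen.getByRole("button", { name: /increment/i }));
  expect(screen.getByRole("status").textContent).toBe("1");
});
```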
_Source: CCTDD repository, Murat component testing talks._


@@ -0,0 +1,9 @@
# Contract Testing Essentials (Pact)
- Store consumer contracts beside the integration specs that generate them; version contracts semantically and publish on every CI run.
- Require provider verification before merge; failed verification blocks release and surfaces breaking changes immediately.
- Capture fallback behaviour inside interactions (timeouts, retries, error payloads) so resilience guarantees remain explicit.
- Automate broker housekeeping: tag releases, archive superseded contracts, and expire unused pacts to reduce noise.
- Pair contract suites with API smoke or component tests to validate data mapping and UI rendering in tandem.
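A hedged consumer-side sketch using `@pact-foundation/pact`'s V3 API (the consumer/provider names, provider state, and `/users/42` interaction are invented; Jest globals are assumed):

```ts
import { PactV3, MatchersV3 } from "@pact-foundation/pact";

const { like } = MatchersV3;
const provider = new PactV3({ consumer: "web-app", provider: "user-service" });

it("records the user-fetch contract", async () => {
  provider
    .given("user 42 exists") // provider state for verification
    .uponReceiving("a request for user 42")
    .withRequest({ method: "GET", path: "/users/42" })
    .willRespondWith({
      status: 200,
      headers: { "Content-Type": "application/json" },
      body: like({ id: 42, name: "Ada" }), // shape, not literal values
    });

  // executeTest spins up a mock server and writes the pact on success.
  await provider.executeTest(async (mockServer) => {
    const res = await fetch(`${mockServer.url}/users/42`);
    expect(res.status).toBe(200);
  });
});
```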
_Source: Pact consumer/provider sample repos, Murat contract testing blog._


@@ -0,0 +1,9 @@
# Data Factories and API-First Setup
- Prefer factory functions that accept overrides and return complete objects (`createUser(overrides)`)—never rely on static fixtures.
- Seed state through APIs, tasks, or direct DB helpers before visiting the UI; UI-based setup is for validation only.
- Ensure factories generate parallel-safe identifiers (UUIDs, timestamps) and perform cleanup after each test.
- Centralize factory exports to avoid duplication; version them alongside schema changes to catch drift in reviews.
- When working with shared environments, layer feature toggles or targeted cleanup so factories do not clobber concurrent runs.
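A minimal factory sketch along these lines (the `User` shape and the seeding endpoint are assumptions):

```ts
import { randomUUID } from "node:crypto";

type User = { id: string; email: string; role: "member" | "admin"; createdAt: string };

// Returns a complete object; callers override only what the test cares about.
export const createUser = (overrides: Partial<User> = {}): User => ({
  id: randomUUID(), // parallel-safe identifier
  email: `user-${randomUUID()}@example.test`,
  role: "member",
  createdAt: new Date().toISOString(),
  ...overrides,
});

// Usage: seed through the API before visiting the UI (endpoint is hypothetical).
// const admin = createUser({ role: "admin" });
// await api.post("/test-support/users", admin);
```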
_Source: Murat Testing Philosophy, blog posts on functional helpers and API-first testing._


@@ -0,0 +1,9 @@
# Email-Based Authentication Testing
- Use services like Mailosaur or in-house SMTP capture; extract magic links via regex or HTML parsing helpers.
- Preserve browser storage (local/session) when processing links—restore state before visiting the authenticated page.
- Cache email payloads with `cypress-data-session` or equivalent so retries don't exhaust inbox quotas.
- Cover negative cases: expired links, reused links, and multiple requests in rapid succession.
- Ensure the workflow logs the email ID and link for troubleshooting, but scrub PII before committing artifacts.
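A hedged helper sketch using the `mailosaur` Node client (the env vars, link position, and fallback regex are assumptions to adapt):

```ts
import MailosaurClient from "mailosaur";

const mailosaur = new MailosaurClient(process.env.MAILOSAUR_API_KEY!);

// Waits for the newest message to the address, then pulls the magic link.
export async function getMagicLink(email: string): Promise<string> {
  const message = await mailosaur.messages.get(process.env.MAILOSAUR_SERVER_ID!, {
    sentTo: email,
  });
  // Prefer parsed links; fall back to a regex over the HTML body.
  const link =
    message.html?.links?.[0]?.href ??
    message.html?.body?.match(/https?:\/\/\S*magic\S*/)?.[0];
  if (!link) throw new Error(`No magic link found for ${email} (message ${message.id})`);
  return link; // log the message id, never the payload, in artifacts
}
```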
_Source: Email authentication blog, Murat testing toolkit._


@@ -0,0 +1,9 @@
# Error Handling and Resilience Checks
- Treat expected failures explicitly: intercept network errors and assert UI fallbacks (`error-message` visible, retries triggered).
- In Cypress, use scoped `Cypress.on('uncaught:exception')` to ignore known errors; rethrow anything else so regressions fail.
- In Playwright, hook `page.on('pageerror')` and only swallow the specific, documented error messages.
- Test retry/backoff logic by forcing sequential failures (e.g., 500, timeout, success) and asserting telemetry gets recorded.
- Log captured errors with context (request payload, user/session) but redact secrets to keep artifacts safe for sharing.
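Hedged sketches of the scoped handlers (the `ResizeObserver` message stands in for whatever error the team has actually documented; `Cypress`, `page`, and `expect` come from their respective frameworks):

```ts
// Cypress (support file): return false only for the documented error.
Cypress.on("uncaught:exception", (err) => {
  if (err.message.includes("ResizeObserver loop limit exceeded")) return false;
  // Returning nothing lets Cypress fail the test as usual.
});

// Playwright: collect page errors, swallow only the documented one, assert at the end.
const pageErrors: Error[] = [];
page.on("pageerror", (err) => {
  if (!err.message.includes("ResizeObserver loop limit exceeded")) pageErrors.push(err);
});
// ...after the user flow:
expect(pageErrors).toEqual([]);
```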
_Source: Murat error-handling patterns, Pact resilience guidance._


@@ -0,0 +1,9 @@
# Feature Flag Governance
- Centralize flag definitions in a frozen enum; expose helpers to set, clear, and target specific audiences.
- Test both enabled and disabled states in CI; clean up targeting after each spec to keep shared environments stable.
- For LaunchDarkly-style systems, script API helpers to seed variations instead of mutating via UI.
- Maintain a checklist for new flags: default state, owners, expiry date, telemetry, rollback plan.
- Document flag dependencies in story/PR templates so QA and release reviews know which toggles must flip before launch.
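A sketch of the frozen-enum-plus-helper pattern (the management endpoint and env vars are invented, not a real LaunchDarkly API surface):

```ts
// Single source of truth for known flags.
export const FeatureFlag = Object.freeze({
  NewCheckout: "new-checkout",
  BetaDashboard: "beta-dashboard",
} as const);
export type FeatureFlag = (typeof FeatureFlag)[keyof typeof FeatureFlag];

// Seed a variation through the management API instead of mutating via UI.
export async function setFlag(flag: FeatureFlag, userKey: string, enabled: boolean): Promise<void> {
  const res = await fetch(`${process.env.FLAG_API_URL}/flags/${flag}/targeting`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${process.env.FLAG_API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ userKey, enabled }),
  });
  if (!res.ok) throw new Error(`Failed to set ${flag}: ${res.status}`);
}

// Clear targeting in afterEach so shared environments stay stable.
export const clearFlag = (flag: FeatureFlag, userKey: string) => setFlag(flag, userKey, false);
```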
_Source: LaunchDarkly strategy blog, Murat test architecture notes._


@@ -0,0 +1,9 @@
# Fixture Architecture Playbook
- Build helpers as pure functions first, then expose them via Playwright `extend` or Cypress commands so logic stays testable in isolation.
- Compose capabilities with `mergeTests` (Playwright) or layered Cypress commands instead of inheritance; each fixture should solve one concern (auth, api, logs, network).
- Keep HTTP helpers framework agnostic—accept all required params explicitly and return results so unit tests and runtime fixtures can share them.
- Export fixtures through package subpaths (`"./api-request"`, `"./api-request/fixtures"`) to make reuse trivial across suites and projects.
- Treat fixture files as infrastructure: document dependencies, enforce deterministic timeouts, and ban hidden retries that mask flakiness.
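A compact sketch of the pure-function → fixture → `mergeTests` progression (the `/auth/token` endpoint and credentials are placeholders):

```ts
import { test as base, mergeTests } from "@playwright/test";

// Pure helper first: framework-agnostic and unit-testable in isolation.
export async function getToken(baseURL: string, user: string, pass: string): Promise<string> {
  const res = await fetch(`${baseURL}/auth/token`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ user, pass }),
  });
  return (await res.json()).token;
}

// One concern per fixture: auth...
const authTest = base.extend<{ token: string }>({
  token: async ({ baseURL }, use) => {
    await use(await getToken(baseURL!, "qa-user", "qa-pass")); // placeholder credentials
  },
});

// ...and console capture, composed via mergeTests instead of inheritance.
const logTest = base.extend<{ logs: string[] }>({
  logs: async ({ page }, use) => {
    const lines: string[] = [];
    page.on("console", (msg) => lines.push(msg.text()));
    await use(lines);
  },
});

export const test = mergeTests(authTest, logTest);
```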
_Source: Murat Testing Philosophy, cy-vs-pw comparison, SEON production patterns._


@@ -0,0 +1,9 @@
# Network-First Safeguards
- Register interceptions before any navigation or user action; store the promise and await it immediately after the triggering step.
- Assert on structured responses (status, body schema, headers) instead of generic waits so failures surface with actionable context.
- Capture HAR files or Playwright traces on successful runs—reuse them for deterministic CI playback when upstream services flake.
- Prefer edge mocking: stub at service boundaries, never deep within the stack unless risk analysis demands it.
- Replace implicit waits with deterministic signals like `waitForResponse`, disappearance of spinners, or event hooks.
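A minimal Playwright sketch of intercept-before-navigate (the `/api/orders` route and response shape are assumptions):

```ts
import { test, expect } from "@playwright/test";

test("order list renders from a healthy API", async ({ page }) => {
  // Register the wait BEFORE the action that triggers the request...
  const ordersResponse = page.waitForResponse(
    (res) => res.url().includes("/api/orders") && res.request().method() === "GET",
  );
  await page.goto("/orders"); // ...perform the triggering step...
  const res = await ordersResponse; // ...and await immediately after.

  // Assert on structured response data, not a generic timeout.
  expect(res.status()).toBe(200);
  const body = await res.json();
  expect(Array.isArray(body.orders)).toBe(true);
});
```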
_Source: Murat Testing Philosophy, Playwright patterns book, blog on network interception._


@@ -0,0 +1,21 @@
# Non-Functional Review Criteria
- **Security**
  - PASS: auth/authz, secret handling, and threat mitigations in place.
  - CONCERNS: minor gaps with clear owners.
  - FAIL: critical exposure or missing controls.
- **Performance**
  - PASS: metrics meet targets with profiling evidence.
  - CONCERNS: trending toward limits or missing baselines.
  - FAIL: breaches SLO/SLA or introduces resource leaks.
- **Reliability**
  - PASS: error handling, retries, health checks verified.
  - CONCERNS: partial coverage or missing telemetry.
  - FAIL: no recovery path or crash scenarios unresolved.
- **Maintainability**
  - PASS: clean code, tests, and documentation shipped together.
  - CONCERNS: duplication, low coverage, or unclear ownership.
  - FAIL: absent tests, tangled implementations, or no observability.
- Default to CONCERNS when targets or evidence are undefined—force the team to clarify before sign-off.
_Source: Murat NFR assessment guidance._


@@ -0,0 +1,9 @@
# Playwright Configuration Guardrails
- Load environment configs via a central map (`envConfigMap`) and fail fast when `TEST_ENV` is missing or unsupported.
- Standardize timeouts: action 15s, navigation 30s, expect 10s, test 60s; expose overrides through fixtures rather than inline literals.
- Emit HTML + JUnit reporters, disable auto-open, and store artifacts under `test-results/` for CI upload.
- Keep `.env.example`, `.nvmrc`, and browser dependencies versioned so local and CI runs stay aligned.
- Use global setup for shared auth tokens or seeding, but prefer per-test fixtures for anything mutable to avoid cross-test leakage.
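A `playwright.config.ts` sketch wiring these guardrails together (the URLs are placeholders):

```ts
import { defineConfig } from "@playwright/test";

// Central environment map; fail fast on an unsupported TEST_ENV.
const envConfigMap = {
  local: { baseURL: "http://localhost:3000" },
  staging: { baseURL: "https://staging.example.com" },
} as const;

const env = (process.env.TEST_ENV ?? "local") as keyof typeof envConfigMap;
if (!envConfigMap[env]) throw new Error(`Unsupported TEST_ENV: ${process.env.TEST_ENV}`);

export default defineConfig({
  timeout: 60_000, // test
  expect: { timeout: 10_000 }, // expect
  use: {
    ...envConfigMap[env],
    actionTimeout: 15_000, // action
    navigationTimeout: 30_000, // navigation
    trace: "retain-on-failure",
  },
  reporter: [
    ["html", { open: "never", outputFolder: "test-results/html" }],
    ["junit", { outputFile: "test-results/junit.xml" }],
  ],
  outputDir: "test-results",
});
```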
_Source: Playwright book repo, SEON configuration example._


@@ -0,0 +1,17 @@
# Probability and Impact Scale
- **Probability**
  - 1 (Unlikely): standard implementation, low uncertainty.
  - 2 (Possible): edge cases or partial unknowns worth investigation.
  - 3 (Likely): known issues, new integrations, or high ambiguity.
- **Impact**
  - 1 (Minor): cosmetic issues or easy workarounds.
  - 2 (Degraded): partial feature loss or manual workaround required.
  - 3 (Critical): blockers, data/security/regulatory exposure.
- Multiply probability × impact to derive the risk score.
  - 1–3: document for awareness.
  - 4–5: monitor closely, plan mitigations.
  - 6–8: CONCERNS at the gate until mitigations are implemented.
  - 9: automatic gate FAIL until resolved or formally waived.
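The scale is mechanical enough to encode directly; a small TypeScript sketch of the buckets above (names are invented):

```ts
type Score = 1 | 2 | 3;
type RiskSignal = "DOCUMENT" | "MONITOR" | "CONCERNS" | "FAIL";

// Risk score = probability × impact, mapped onto the thresholds above.
export function riskSignal(probability: Score, impact: Score): RiskSignal {
  const score = probability * impact;
  if (score === 9) return "FAIL"; // resolve or formally waive
  if (score >= 6) return "CONCERNS"; // mitigate before the gate passes
  if (score >= 4) return "MONITOR";
  return "DOCUMENT";
}

// Example: likely (3) × degraded (2) = 6 → CONCERNS.
```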
_Source: Murat risk model summary._


@@ -0,0 +1,14 @@
# Risk Governance and Gatekeeping
- Score risk as probability (1–3) × impact (1–3); totals ≥6 demand mitigation before approval, and a 9 mandates a gate failure.
- Classify risks across TECH, SEC, PERF, DATA, BUS, OPS. Document owners, mitigation plans, and deadlines for any score above 4.
- Trace every acceptance criterion to implemented tests; missing coverage must be resolved or explicitly waived before release.
- Gate decisions:
  - **PASS**: no critical issues remain and evidence is current.
  - **CONCERNS**: residual risk exists but has owners, actions, and timelines.
  - **FAIL**: critical issues unresolved or evidence missing.
  - **WAIVED**: risk accepted with documented approver, rationale, and expiry.
- Maintain a gate history log capturing updates so auditors can follow the decision trail.
- Use the probability/impact scale fragment for shared definitions when scoring teams run the matrix.
_Source: Murat risk governance notes, gate schema guidance._


@@ -0,0 +1,9 @@
# Selective and Targeted Test Execution
- Use tags/grep (`--grep "@smoke"`, `--grep "@critical"`) to slice suites by risk, not directory.
- Filter by spec patterns (`--spec "**/*checkout*"`) or git diff (`npm run test:changed`) to focus on impacted areas.
- Combine priority metadata (P0–P3) with change detection to decide which levels to run pre-commit vs. in CI.
- Record burn-in history for newly added specs; promote to main suite only after consistent green runs.
- Document the selection strategy in README/CI so the team understands when full regression is mandatory.
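A sketch of tag-based slicing in Playwright (the tag names and route are examples; Cypress teams get the same effect with the `@cypress/grep` plugin):

```ts
import { test, expect } from "@playwright/test";

// Tags live in the title, so suites are sliced by risk rather than directory.
test("@smoke @critical checkout completes with saved card", async ({ page }) => {
  await page.goto("/checkout"); // placeholder route
  await expect(page.getByRole("button", { name: /place order/i })).toBeVisible();
});

// npx playwright test --grep "@smoke"     -> quick pre-commit slice
// npx playwright test --grep "@critical"  -> P0 set on every CI run
```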
_Source: 32+ selective testing strategies blog, Murat testing philosophy._


@@ -0,0 +1,148 @@
<!-- Powered by BMAD-CORE™ -->
# Test Levels Framework
Comprehensive guide for determining appropriate test levels (unit, integration, E2E) for different scenarios.
## Test Level Decision Matrix
### Unit Tests
**When to use:**
- Testing pure functions and business logic
- Algorithm correctness
- Input validation and data transformation
- Error handling in isolated components
- Complex calculations or state machines
**Characteristics:**
- Fast execution (immediate feedback)
- No external dependencies (DB, API, file system)
- Highly maintainable and stable
- Easy to debug failures
**Example scenarios:**
```yaml
unit_test:
  component: 'PriceCalculator'
  scenario: 'Calculate discount with multiple rules'
  justification: 'Complex business logic with multiple branches'
  mock_requirements: 'None - pure function'
```
### Integration Tests
**When to use:**
- Component interaction verification
- Database operations and transactions
- API endpoint contracts
- Service-to-service communication
- Middleware and interceptor behavior
**Characteristics:**
- Moderate execution time
- Tests component boundaries
- May use test databases or containers
- Validates system integration points
**Example scenarios:**
```yaml
integration_test:
  components: ['UserService', 'AuthRepository']
  scenario: 'Create user with role assignment'
  justification: 'Critical data flow between service and persistence'
  test_environment: 'In-memory database'
```
### End-to-End Tests
**When to use:**
- Critical user journeys
- Cross-system workflows
- Visual regression testing
- Compliance and regulatory requirements
- Final validation before release
**Characteristics:**
- Slower execution
- Tests complete workflows
- Requires full environment setup
- Most realistic but most brittle
**Example scenarios:**
```yaml
e2e_test:
  journey: 'Complete checkout process'
  scenario: 'User purchases with saved payment method'
  justification: 'Revenue-critical path requiring full validation'
  environment: 'Staging with test payment gateway'
```
## Test Level Selection Rules
### Favor Unit Tests When:
- Logic can be isolated
- No side effects involved
- Fast feedback needed
- High cyclomatic complexity
### Favor Integration Tests When:
- Testing persistence layer
- Validating service contracts
- Testing middleware/interceptors
- Component boundaries critical
### Favor E2E Tests When:
- User-facing critical paths
- Multi-system interactions
- Regulatory compliance scenarios
- Visual regression important
## Anti-patterns to Avoid
- E2E testing for business logic validation
- Unit testing framework behavior
- Integration testing third-party libraries
- Duplicate coverage across levels
## Duplicate Coverage Guard
**Before adding any test, check:**
1. Is this already tested at a lower level?
2. Can a unit test cover this instead of integration?
3. Can an integration test cover this instead of E2E?
**Coverage overlap is only acceptable when:**
- Testing different aspects (unit: logic, integration: interaction, e2e: user experience)
- Critical paths requiring defense in depth
- Regression prevention for previously broken functionality
## Test Naming Conventions
- Unit: `test_{component}_{scenario}`
- Integration: `test_{flow}_{interaction}`
- E2E: `test_{journey}_{outcome}`
## Test ID Format
`{EPIC}.{STORY}-{LEVEL}-{SEQ}`
Examples:
- `1.3-UNIT-001`
- `1.3-INT-002`
- `1.3-E2E-001`


@@ -0,0 +1,174 @@
<!-- Powered by BMAD-CORE™ -->
# Test Priorities Matrix
Guide for prioritizing test scenarios based on risk, criticality, and business impact.
## Priority Levels
### P0 - Critical (Must Test)
**Criteria:**
- Revenue-impacting functionality
- Security-critical paths
- Data integrity operations
- Regulatory compliance requirements
- Previously broken functionality (regression prevention)
**Examples:**
- Payment processing
- Authentication/authorization
- User data creation/deletion
- Financial calculations
- GDPR/privacy compliance
**Testing Requirements:**
- Comprehensive coverage at all levels
- Both happy and unhappy paths
- Edge cases and error scenarios
- Performance under load
### P1 - High (Should Test)
**Criteria:**
- Core user journeys
- Frequently used features
- Features with complex logic
- Integration points between systems
- Features affecting user experience
**Examples:**
- User registration flow
- Search functionality
- Data import/export
- Notification systems
- Dashboard displays
**Testing Requirements:**
- Primary happy paths required
- Key error scenarios
- Critical edge cases
- Basic performance validation
### P2 - Medium (Nice to Test)
**Criteria:**
- Secondary features
- Admin functionality
- Reporting features
- Configuration options
- UI polish and aesthetics
**Examples:**
- Admin settings panels
- Report generation
- Theme customization
- Help documentation
- Analytics tracking
**Testing Requirements:**
- Happy path coverage
- Basic error handling
- Can defer edge cases
### P3 - Low (Test if Time Permits)
**Criteria:**
- Rarely used features
- Nice-to-have functionality
- Cosmetic issues
- Non-critical optimizations
**Examples:**
- Advanced preferences
- Legacy feature support
- Experimental features
- Debug utilities
**Testing Requirements:**
- Smoke tests only
- Can rely on manual testing
- Document known limitations
## Risk-Based Priority Adjustments
### Increase Priority When:
- High user impact (affects >50% of users)
- High financial impact (>$10K potential loss)
- Security vulnerability potential
- Compliance/legal requirements
- Customer-reported issues
- Complex implementation (>500 LOC)
- Multiple system dependencies
### Decrease Priority When:
- Feature flag protected
- Gradual rollout planned
- Strong monitoring in place
- Easy rollback capability
- Low usage metrics
- Simple implementation
- Well-isolated component
## Test Coverage by Priority
| Priority | Unit Coverage | Integration Coverage | E2E Coverage |
| -------- | ------------- | -------------------- | ------------------ |
| P0 | >90% | >80% | All critical paths |
| P1 | >80% | >60% | Main happy paths |
| P2 | >60% | >40% | Smoke tests |
| P3 | Best effort | Best effort | Manual only |
## Priority Assignment Rules
1. **Start with business impact** - What happens if this fails?
2. **Consider probability** - How likely is failure?
3. **Factor in detectability** - Would we know if it failed?
4. **Account for recoverability** - Can we fix it quickly?
## Priority Decision Tree
```
Is it revenue-critical?
├─ YES → P0
└─ NO → Does it affect core user journey?
   ├─ YES → Is it high-risk?
   │  ├─ YES → P0
   │  └─ NO → P1
   └─ NO → Is it frequently used?
      ├─ YES → P1
      └─ NO → Is it customer-facing?
         ├─ YES → P2
         └─ NO → P3
```
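The tree is deterministic, so it can be transcribed directly; a TypeScript sketch (the trait names are invented for illustration):

```ts
type Priority = "P0" | "P1" | "P2" | "P3";

interface ScenarioTraits {
  revenueCritical: boolean;
  coreJourney: boolean;
  highRisk: boolean;
  frequentlyUsed: boolean;
  customerFacing: boolean;
}

// Direct transcription of the decision tree above.
export function assignPriority(t: ScenarioTraits): Priority {
  if (t.revenueCritical) return "P0";
  if (t.coreJourney) return t.highRisk ? "P0" : "P1";
  if (t.frequentlyUsed) return "P1";
  return t.customerFacing ? "P2" : "P3";
}
```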
## Test Execution Order
1. Execute P0 tests first (fail fast on critical issues)
2. Execute P1 tests second (core functionality)
3. Execute P2 tests if time permits
4. P3 tests only in full regression cycles
## Continuous Adjustment
Review and adjust priorities based on:
- Production incident patterns
- User feedback and complaints
- Usage analytics
- Test failure history
- Business priority changes


@@ -0,0 +1,10 @@
# Test Quality Definition of Done
- No hard waits (`waitForTimeout`, `cy.wait(ms)`); rely on deterministic waits or event hooks.
- Each spec is <300 lines and executes in <1.5 minutes.
- Tests are isolated, parallel-safe, and self-cleaning (seed via API/tasks, teardown after run).
- Assertions stay visible in test bodies; avoid conditional logic controlling test flow.
- Suites must pass locally and in CI with the same commands.
- Promote new tests only after they have failed for the intended reason at least once.
_Source: Murat quality checklist._


@@ -0,0 +1,9 @@
# Visual Debugging and Developer Ergonomics
- Keep Playwright trace viewer, Cypress runner, and Storybook accessible in CI artifacts to speed up reproduction.
- Record short screen captures only on failure; pair them with HAR or console logs to avoid guesswork.
- Document common trace navigation steps (network tab, action timeline) so new contributors diagnose issues quickly.
- Encourage live-debug sessions with component harnesses to validate behaviour before writing full E2E specs.
- Integrate accessibility tooling (axe, Playwright audits) into the same debug workflow to catch regressions early.
_Source: Murat DX blog posts, Playwright book appendix on debugging._


@@ -0,0 +1,19 @@
id,name,description,tags,fragment_file
fixture-architecture,Fixture Architecture,"Composable fixture patterns (pure function → fixture → merge) and reuse rules","fixtures,architecture,playwright,cypress",knowledge/fixture-architecture.md
network-first,Network-First Safeguards,"Intercept-before-navigate workflow, HAR capture, deterministic waits, edge mocking","network,stability,playwright,cypress",knowledge/network-first.md
data-factories,Data Factories and API Setup,"Factories with overrides, API seeding, cleanup discipline","data,factories,setup,api",knowledge/data-factories.md
component-tdd,Component TDD Loop,"Red→green→refactor workflow, provider isolation, accessibility assertions","component-testing,tdd,ui",knowledge/component-tdd.md
playwright-config,Playwright Config Guardrails,"Environment switching, timeout standards, artifact outputs","playwright,config,env",knowledge/playwright-config.md
ci-burn-in,CI and Burn-In Strategy,"Staged jobs, shard orchestration, burn-in loops, artifact policy","ci,automation,flakiness",knowledge/ci-burn-in.md
selective-testing,Selective Test Execution,"Tag/grep usage, spec filters, diff-based runs, promotion rules","risk-based,selection,strategy",knowledge/selective-testing.md
feature-flags,Feature Flag Governance,"Enum management, targeting helpers, cleanup, release checklists","feature-flags,governance,launchdarkly",knowledge/feature-flags.md
contract-testing,Contract Testing Essentials,"Pact publishing, provider verification, resilience coverage","contract-testing,pact,api",knowledge/contract-testing.md
email-auth,Email Authentication Testing,"Magic link extraction, state preservation, caching, negative flows","email-authentication,security,workflow",knowledge/email-auth.md
error-handling,Error Handling Checks,"Scoped exception handling, retry validation, telemetry logging","resilience,error-handling,stability",knowledge/error-handling.md
visual-debugging,Visual Debugging Toolkit,"Trace viewer usage, artifact expectations, accessibility integration","debugging,dx,tooling",knowledge/visual-debugging.md
risk-governance,Risk Governance,"Scoring matrix, category ownership, gate decision rules","risk,governance,gates",knowledge/risk-governance.md
probability-impact,Probability and Impact Scale,"Shared definitions for scoring matrix and gate thresholds","risk,scoring,scale",knowledge/probability-impact.md
test-quality,Test Quality Definition of Done,"Execution limits, isolation rules, green criteria","quality,definition-of-done,tests",knowledge/test-quality.md
nfr-criteria,NFR Review Criteria,"Security, performance, reliability, maintainability status definitions","nfr,assessment,quality",knowledge/nfr-criteria.md
test-levels,Test Levels Framework,"Guidelines for choosing unit, integration, or end-to-end coverage","testing,levels,selection",knowledge/test-levels-framework.md
test-priorities,Test Priorities Matrix,"P0–P3 criteria, coverage targets, execution ordering","testing,prioritization,risk",knowledge/test-priorities-matrix.md