Replaced the old TEA brief with an indexed knowledge system: the agent now loads topic-specific docs from knowledge/ via tea-index.csv, workflows reference those fragments, and risk/level/priority guidance lives in the new fragment files.
Murat Ozcan
2025-09-30 10:16:51 -05:00
parent bb2cb7e951
commit b8814d372f
20 changed files with 101 additions and 405 deletions


@@ -8,12 +8,12 @@
<role>Master Test Architect</role>
<identity>Expert test architect and CI specialist with comprehensive expertise across all software engineering disciplines, with primary focus on test discipline. Deep knowledge in test strategy, automated testing frameworks, quality gates, risk-based testing, and continuous integration/delivery. Proven track record in building robust testing infrastructure and establishing quality standards that scale.</identity>
<communication_style>Educational and advisory approach. Strong opinions, weakly held. Explains quality concerns with clear rationale. Balances thoroughness with pragmatism. Uses data and risk analysis to support recommendations while remaining approachable and collaborative.</communication_style>
<principles>I apply risk-based testing philosophy where depth of analysis scales with potential impact. My approach validates both functional requirements and critical NFRs through systematic assessment of controllability, observability, and debuggability while providing clear gate decisions backed by data-driven rationale. I serve as an educational quality advisor who identifies and quantifies technical debt with actionable improvement paths, leveraging modern tools including LLMs to accelerate analysis while distinguishing must-fix issues from nice-to-have enhancements. Testing and engineering are bound together - engineering is about assuming things will go wrong, learning from that, and defending against it with tests. One failing test proves software isn't good enough. The more tests resemble actual usage, the more confidence they give. I optimize for cost vs confidence where cost = creation + execution + maintenance. What you can avoid testing is more important than what you test. I apply composition over inheritance because components compose and abstracting with classes leads to over-abstraction. Quality is a whole team responsibility that we cannot abdicate. Story points must include testing - it's not tech debt, it's feature debt that impacts customers. In the AI era, E2E tests reign supreme as the ultimate acceptance criteria. I follow ATDD: write acceptance criteria as tests first, let AI propose implementation, validate with E2E suite. Simplicity is the ultimate sophistication.</principles>
<principles>I apply risk-based testing philosophy where depth of analysis scales with potential impact. My approach validates both functional requirements and critical NFRs through systematic assessment of controllability, observability, and debuggability while providing clear gate decisions backed by data-driven rationale. I serve as an educational quality advisor who identifies and quantifies technical debt with actionable improvement paths, leveraging modern tools including LLMs to accelerate analysis while distinguishing must-fix issues from nice-to-have enhancements. Testing and engineering are bound together - engineering is about assuming things will go wrong, learning from that, and defending against it with tests. One failing test proves software isn't good enough. The more tests resemble actual usage, the more confidence they give. I optimize for cost vs confidence where cost = creation + execution + maintenance. What you can avoid testing is more important than what you test. I apply composition over inheritance because components compose and abstracting with classes leads to over-abstraction. Quality is a whole team responsibility that we cannot abdicate. Story points must include testing - it's not tech debt, it's feature debt that impacts customers. I prioritise lower-level coverage before integration/E2E defenses and treat flakiness as non-negotiable debt. In the AI era, E2E tests serve as the living acceptance criteria. I follow ATDD: write acceptance criteria as tests first, let AI propose implementation, validate with the E2E suite. Simplicity is the ultimate sophistication.</principles>
</persona>
<critical-actions>
<i>Load into memory {project-root}/bmad/bmm/config.yaml and set variable project_name, output_folder, user_name, communication_language</i>
<i>Load into memory {project-root}/bmad/bmm/testarch/tea-knowledge.md for Murat's latest heuristics</i>
<i>Consult {project-root}/bmad/bmm/testarch/tea-index.csv to select knowledge fragments under `knowledge/` and load only the files needed for the current task</i>
<i>Load the referenced fragment(s) from `{project-root}/bmad/bmm/testarch/knowledge/` before giving recommendations</i>
<i>Cross-check recommendations with the current official Playwright, Cypress, Pact, and CI platform documentation; fall back to {project-root}/bmad/bmm/testarch/test-resources-for-ai-flat.txt only when deeper sourcing is required</i>
<i>Remember the user's name is {user_name}</i>
<i>ALWAYS communicate in {communication_language}</i>


@@ -19,9 +19,7 @@ last-redoc-date: 2025-09-30
2. Confirm `bmad/bmm/config.yaml` defines `project_name`, `output_folder`, `dev_story_location`, and language settings.
3. Ensure a test framework setup exists; if not, use the `*framework` command to create a test framework setup prior to development.
4. Skim supporting references (knowledge under `testarch/`, command workflows under `workflows/testarch/`).
- `tea-knowledge.md`
- `tea-index.csv` + `knowledge/*.md`
- `test-levels-framework.md`
- `test-priorities-matrix.md`
## High-Level Cheat Sheets
@@ -140,22 +138,25 @@ last-redoc-date: 2025-09-30
<summary>Command Guidance and Context Loading</summary>
- Each task now carries its own preflight/flow/deliverable guidance inline.
- `tea-knowledge.md` still stores heuristics; update the brief alongside task edits.
- `tea-index.csv` maps workflow needs to knowledge fragments; keep tags accurate as you add guidance.
- Consider future modularization into orchestrated workflows if additional automation is needed.
- `tea-knowledge.md` encapsulates Murat's philosophy—update both CSV and knowledge file together to avoid drift.
- Update the fragment markdown files alongside workflow edits so guidance and outputs stay in sync.
</details>
## Workflow Placement
We keep every Test Architect workflow under `workflows/testarch/` instead of scattering them across the phase folders. TEA steps show up during planning (`*framework`), implementation (`*atdd`, `*automate`, `*trace`), and release (`*gate`), so a single directory keeps the command catalog and examples coherent while still letting the orchestrator treat each command as a first-class workflow. When phase-specific navigation improves, we can add lightweight entrypoints without losing this central reference.
The TEA stack has three tightly-linked layers:
1. **Agent spec (`agents/tea.md`)** declares the persona, critical actions, and the `run-workflow` entries for every TEA command. Critical actions instruct the agent to load `tea-index.csv` and then fetch only the fragments it needs from `knowledge/` before giving guidance.
2. **Knowledge index (`tea-index.csv`)** catalogues each fragment with tags and file paths. Workflows call out the IDs they need (e.g., `risk-governance`, `fixture-architecture`) so the agent loads targeted guidance instead of a monolithic brief.
3. **Workflows (`workflows/testarch/*`)** contain the task flows and reference `tea-index.csv` in their `<flow>`/`<notes>` sections to request specific fragments. Keeping all workflows in this directory ensures consistent discovery during planning (`*framework`), implementation (`*atdd`, `*automate`, `*trace`), and release (`*nfr-assess`, `*gate`).
This separation lets us expand the knowledge base without touching agent wiring and keeps every command remote-controllable via the standard BMAD workflow runner. As navigation improves, we can add lightweight entrypoints or tags in the index without changing where workflows live.
## Appendix
- **Supporting Knowledge:**
- `tea-knowledge.md` — Murat's testing philosophy, heuristics, and risk scales.
- `tea-index.csv` — Catalog of knowledge fragments with tags and file paths under `knowledge/` for task-specific loading.
- `test-levels-framework.md` — Decision matrix for unit/integration/E2E selection.
- `knowledge/*.md` — Focused summaries (fixtures, network, CI, levels, priorities, etc.) distilled from Murat's external resources.
- `test-priorities-matrix.md` — Priority (P0–P3) criteria and target coverage percentages.
- `knowledge/*.md` — Focused summaries (fixtures, network, CI, etc.) distilled from Murat's external resources.
- `test-resources-for-ai-flat.txt` — Raw 347KB archive retained for manual deep dives when a fragment needs source validation.


@@ -0,0 +1,21 @@
# Non-Functional Review Criteria
- **Security**
- PASS: auth/authz, secret handling, and threat mitigations in place.
- CONCERNS: minor gaps with clear owners.
- FAIL: critical exposure or missing controls.
- **Performance**
- PASS: metrics meet targets with profiling evidence.
- CONCERNS: trending toward limits or missing baselines.
- FAIL: breaches SLO/SLA or introduces resource leaks.
- **Reliability**
- PASS: error handling, retries, health checks verified.
- CONCERNS: partial coverage or missing telemetry.
- FAIL: no recovery path or crash scenarios unresolved.
- **Maintainability**
- PASS: clean code, tests, and documentation shipped together.
- CONCERNS: duplication, low coverage, or unclear ownership.
- FAIL: absent tests, tangled implementations, or no observability.
- Default to CONCERNS when targets or evidence are undefined—force the team to clarify before sign-off.
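To make the rubric concrete, here is a hypothetical TypeScript encoding; the status names mirror the bullets above, while the type and helper names are invented for illustration:

```typescript
// Illustrative sketch only: statuses come from the rubric above,
// the type and function names are hypothetical.
type NfrStatus = 'PASS' | 'CONCERNS' | 'FAIL';

interface NfrReview {
  security: NfrStatus;
  performance: NfrStatus;
  reliability: NfrStatus;
  maintainability: NfrStatus;
}

// Default to CONCERNS when targets or evidence are undefined, per the rule above.
function statusFor(evidence?: { meetsTarget: boolean }): NfrStatus {
  if (evidence === undefined) return 'CONCERNS';
  return evidence.meetsTarget ? 'PASS' : 'FAIL';
}
```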
_Source: Murat NFR assessment guidance._


@@ -0,0 +1,17 @@
# Probability and Impact Scale
- **Probability**
- 1 – Unlikely: standard implementation, low uncertainty.
- 2 – Possible: edge cases or partial unknowns worth investigation.
- 3 – Likely: known issues, new integrations, or high ambiguity.
- **Impact**
- 1 – Minor: cosmetic issues or easy workarounds.
- 2 – Degraded: partial feature loss or manual workaround required.
- 3 – Critical: blockers, data/security/regulatory exposure.
- Multiply probability × impact to derive the risk score.
- 1–3: document for awareness.
- 4–5: monitor closely, plan mitigations.
- 6–8: CONCERNS at the gate until mitigations are implemented.
- 9: automatic gate FAIL until resolved or formally waived.
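For concreteness, the scoring arithmetic and thresholds above can be sketched in a few lines of TypeScript (names are illustrative, not a shipped API):

```typescript
// Sketch of the probability × impact scale; thresholds follow the bullets above.
type Level = 1 | 2 | 3;

function riskScore(probability: Level, impact: Level): number {
  return probability * impact; // 1..9
}

function gateSignal(score: number): 'NOTE' | 'MONITOR' | 'CONCERNS' | 'FAIL' {
  if (score >= 9) return 'FAIL'; // automatic gate FAIL until resolved or waived
  if (score >= 6) return 'CONCERNS'; // hold at the gate until mitigations land
  if (score >= 4) return 'MONITOR'; // monitor closely, plan mitigations
  return 'NOTE'; // document for awareness
}
```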
_Source: Murat risk model summary._


@@ -0,0 +1,14 @@
# Risk Governance and Gatekeeping
- Score risk as probability (1–3) × impact (1–3); totals ≥6 demand mitigation before approval, and a 9 mandates a gate failure.
- Classify risks across TECH, SEC, PERF, DATA, BUS, OPS. Document owners, mitigation plans, and deadlines for any score above 4.
- Trace every acceptance criterion to implemented tests; missing coverage must be resolved or explicitly waived before release.
- Gate decisions:
- **PASS** no critical issues remain and evidence is current.
- **CONCERNS** residual risk exists but has owners, actions, and timelines.
- **FAIL** critical issues unresolved or evidence missing.
- **WAIVED** risk accepted with documented approver, rationale, and expiry.
- Maintain a gate history log capturing updates so auditors can follow the decision trail.
- Use the probability/impact scale fragment for shared definitions when scoring teams run the matrix.
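As a sketch, the gate states and waiver requirement above could be modelled like this in TypeScript (field names are ours, not the repo's YAML gate schema):

```typescript
// Hypothetical model of the gate decision rules above.
type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';

interface Waiver {
  approver: string; // documented approver
  rationale: string;
  expires: string; // ISO date; accepted risk must carry an expiry
}

interface GateRecord {
  decision: GateDecision;
  waiver?: Waiver; // required whenever decision is 'WAIVED'
  history: Array<{ at: string; decision: GateDecision; note: string }>; // audit trail
}
```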
_Source: Murat risk governance notes, gate schema guidance._


@@ -0,0 +1,10 @@
# Test Quality Definition of Done
- No hard waits (`waitForTimeout`, `cy.wait(ms)`); rely on deterministic waits or event hooks.
- Each spec <300 lines and executes in ≤1.5 minutes.
- Tests are isolated, parallel-safe, and self-cleaning (seed via API/tasks, teardown after run).
- Assertions stay visible in test bodies; avoid conditional logic controlling test flow.
- Suites must pass locally and in CI with the same commands.
- Promote new tests only after they have failed for the intended reason at least once.
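As a minimal Playwright-flavoured illustration of the hard-wait rule (the route and test IDs are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('waits deterministically (sketch)', async ({ page }) => {
  await page.goto('/checkout');

  // Don't: await page.waitForTimeout(3000);
  // Do: await the specific response the UI depends on.
  const orders = page.waitForResponse('**/api/orders');
  await page.getByTestId('submit-order').click();
  await orders;

  await expect(page.getByTestId('order-confirmed')).toBeVisible();
});
```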
_Source: Murat quality checklist._


@@ -1,13 +1,19 @@
id,name,description,tags,fragment_file
fixture-architecture,Fixture Architecture,"Composable fixture patterns (pure function → fixture → merge) and reuse rules","fixtures,architecture,playwright,cypress",knowledge/fixture-architecture.md
network-first,Network-First Safeguards,"Intercept-before-navigate workflow, HAR capture, deterministic waits, edge mocking","network,stability,playwright,cypress",knowledge/network-first.md
data-factories,Data Factories & API Setup,"Factories with overrides, API seeding, cleanup discipline","data,factories,setup,api",knowledge/data-factories.md
data-factories,Data Factories and API Setup,"Factories with overrides, API seeding, cleanup discipline","data,factories,setup,api",knowledge/data-factories.md
component-tdd,Component TDD Loop,"Red→green→refactor workflow, provider isolation, accessibility assertions","component-testing,tdd,ui",knowledge/component-tdd.md
playwright-config,Playwright Config Guardrails,"Environment switching, timeout standards, artifact outputs","playwright,config,env",knowledge/playwright-config.md
ci-burn-in,CI & Burn-In Strategy,"Staged jobs, shard orchestration, burn-in loops, artifact policy","ci,automation,flakiness",knowledge/ci-burn-in.md
ci-burn-in,CI and Burn-In Strategy,"Staged jobs, shard orchestration, burn-in loops, artifact policy","ci,automation,flakiness",knowledge/ci-burn-in.md
selective-testing,Selective Test Execution,"Tag/grep usage, spec filters, diff-based runs, promotion rules","risk-based,selection,strategy",knowledge/selective-testing.md
feature-flags,Feature Flag Governance,"Enum management, targeting helpers, cleanup, release checklists","feature-flags,governance,launchdarkly",knowledge/feature-flags.md
contract-testing,Contract Testing Essentials,"Pact publishing, provider verification, resilience coverage","contract-testing,pact,api",knowledge/contract-testing.md
email-auth,Email Authentication Testing,"Magic link extraction, state preservation, caching, negative flows","email-authentication,security,workflow",knowledge/email-auth.md
error-handling,Error Handling Checks,"Scoped exception handling, retry validation, telemetry logging","resilience,error-handling,stability",knowledge/error-handling.md
visual-debugging,Visual Debugging Toolkit,"Trace viewer usage, artifact expectations, accessibility integration","debugging,dx,tooling",knowledge/visual-debugging.md
risk-governance,Risk Governance,"Scoring matrix, category ownership, gate decision rules","risk,governance,gates",knowledge/risk-governance.md
probability-impact,Probability and Impact Scale,"Shared definitions for scoring matrix and gate thresholds","risk,scoring,scale",knowledge/probability-impact.md
test-quality,Test Quality Definition of Done,"Execution limits, isolation rules, green criteria","quality,definition-of-done,tests",knowledge/test-quality.md
nfr-criteria,NFR Review Criteria,"Security, performance, reliability, maintainability status definitions","nfr,assessment,quality",knowledge/nfr-criteria.md
test-levels,Test Levels Framework,"Guidelines for choosing unit, integration, or end-to-end coverage","testing,levels,selection",knowledge/test-levels-framework.md
test-priorities,Test Priorities Matrix,"P0–P3 criteria, coverage targets, execution ordering","testing,prioritization,risk",knowledge/test-priorities-matrix.md


@@ -1,365 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Murat Test Architecture Foundations (Slim Brief)
This brief distills Murat Ozcan's testing philosophy used by the Test Architect agent. Use it as the north star while executing the TEA workflows, and rely on `tea-index.csv` to pull deeper fragments on demand.
## Core Principles
- Cost vs confidence: cost = creation + execution + maintenance. Push confidence where impact is highest and skip redundant checks.
- Engineering assumes failure: predict what breaks, defend with tests, learn from every failure. A single failing test means the software is not ready.
- Quality is team work. Story estimates include testing, documentation, and deployment work required to ship safely.
- Missing test coverage is feature debt (hurts customers), not mere tech debt—treat it with the same urgency as functionality gaps.
- Shared mutable state is the source of all evil: design fixtures and helpers so each test owns its data.
- Composition over inheritance: prefer functional helpers and fixtures that compose behaviour; page objects and deep class trees hide duplication.
- Setup via API, assert via UI. Keep tests user-centric while priming state through fast interfaces.
- One test = one concern. Explicit assertions live in the test body, not buried in helpers.
- Test at the lowest level possible first: favour component/unit coverage before integration/E2E (target ~1:3–1:5 ratio of high-level to low-level tests).
- Zero tolerance for flakiness: if a test flakes, fix the cause immediately or delete the test—shipping with flakes is not acceptable evidence.
## Patterns & Heuristics
- Selector order: `data-cy` / `data-testid` -> ARIA -> text. Avoid brittle CSS, IDs, or index based locators.
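A minimal Playwright-flavoured sketch of that ordering (the route and selectors are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('selector preference order (illustrative sketch)', async ({ page }) => {
  await page.goto('/checkout'); // hypothetical route
  // 1. Dedicated test attribute: most stable
  await expect(page.getByTestId('submit-button')).toBeVisible();
  // 2. ARIA role + accessible name: user-facing and resilient
  await expect(page.getByRole('button', { name: 'Submit' })).toBeVisible();
  // 3. Visible text: acceptable fallback
  await expect(page.getByText('Submit')).toBeVisible();
  // Avoid brittle CSS or index-based locators, e.g. page.locator('#app div:nth-child(3) button')
});
```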
- Network boundary is the mock boundary. Stub at the edge, never mid-service unless risk demands.
- **Network-first pattern**: ALWAYS intercept before navigation: `const call = interceptNetwork(); await page.goto(); await call;`
- Deterministic waits only: await specific network responses, elements disappearing, or event hooks. Ban fixed sleeps.
- **Fixture architecture (The Murat Way)**:
```typescript
// 1. Pure function first (testable independently)
export async function apiRequest({ request, method, url, data }) {
  /* implementation */
}

// 2. Fixture wrapper
export const apiRequestFixture = base.extend({
  apiRequest: async ({ request }, use) => {
    await use((params) => apiRequest({ request, ...params }));
  },
});

// 3. Compose via mergeTests
export const test = mergeTests(base, apiRequestFixture, authFixture, networkFixture);
```
- **Data factories pattern**:
```typescript
export const createUser = (overrides = {}) => ({
  id: faker.string.uuid(),
  email: faker.internet.email(),
  ...overrides,
});
```
- Standard test skeleton keeps intent clear—`describe` the feature, `context` specific scenarios, make setup visible, and follow Arrange → Act → Assert explicitly:
```javascript
describe('Checkout', () => {
  context('when inventory is available', () => {
    beforeEach(async () => {
      await seedInventory();
      await interceptOrders(); // intercept BEFORE navigation
      await test.step('navigate', () => page.goto('/checkout'));
    });

    it('completes purchase', async () => {
      await cart.fillDetails(validUser);
      await expect(page.getByTestId('order-confirmed')).toBeVisible();
    });
  });
});
```
- Helper/fixture thresholds: 3+ call sites → promote to fixture with subpath export, 2-3 → shared utility module, 1-off → keep inline to avoid premature abstraction.
- Deterministic waits only: prefer `page.waitForResponse`, `cy.wait('@alias')`, or element disappearance (e.g., `cy.get('[data-cy="spinner"]').should('not.exist')`). Ban `waitForTimeout`/`cy.wait(ms)` unless quarantined in TODO and slated for removal.
- Data is created via APIs or tasks, not UI flows:
```javascript
beforeEach(() => {
  cy.task('db:seed', { users: [createUser({ role: 'admin' })] });
});
```
- Assertions stay in tests; when shared state varies, assert on ranges (`expect(count).toBeGreaterThanOrEqual(3)`) rather than brittle exact values.
- Visual debugging: keep component/test runner UIs available (Playwright trace viewer, Cypress runner) to accelerate feedback.
## Risk & Coverage
- Risk score = probability (1-3) × impact (1-3). Score 9 => gate FAIL, ≥6 => CONCERNS. Most stories have 0-1 high risks.
- Test level ratio: heavy unit/component coverage, but always include E2E for critical journeys and integration seams.
- Traceability looks for reality: map each acceptance criterion to concrete tests and flag missing coverage or duplicate value.
- NFR focus areas: Security, Performance, Reliability, Maintainability. Demand evidence (tests, telemetry, alerts) before approving.
## Test Configuration
- **Timeouts**: actionTimeout 15s, navigationTimeout 30s, testTimeout 60s, expectTimeout 10s
- **Reporters**: HTML (never auto-open) + JUnit XML for CI integration
- **Media**: screenshot only-on-failure, video retain-on-failure
- **Language Matching**: Tests should match source code language (JS/TS frontend -> JS/TS tests)
## Automation & CI
- Prefer Playwright for multi-language teams, worker parallelism, rich debugging; Cypress suits smaller DX-first repos or component-heavy spikes.
- **Framework Selection**: Large repo + performance = Playwright, Small repo + DX = Cypress
- **Component Testing**: Large repos = Vitest (has UI, easy RTL conversion), Small repos = Cypress CT
- CI pipelines run lint -> unit -> component -> e2e, with selective reruns for flakes and artifacts (videos, traces) on failure.
- Shard suites to keep feedback tight; treat CI as shared safety net, not a bottleneck.
- Test selection ideas (32+ strategies): filter by tags/grep (`npm run test -- --grep "@smoke"`), file patterns (`--spec "**/*checkout*"`), changed files (`npm run test:changed`), or test level (`npm run test:unit` / `npm run test:e2e`).
- Burn-in testing: run new or changed specs multiple times (e.g., 3-10x) to flush flakes before they land in main.
- Keep helper scripts handy (`scripts/test-changed.sh`, `scripts/burn-in-changed.sh`) so CI and local workflows stay in sync.
## Project Structure & Config
- **Directory structure**:
```
project/
├── playwright.config.ts # Environment-based config loading
├── playwright/
│ ├── tests/ # All specs (group by domain: auth/, network/, feature-flags/…)
│ ├── support/ # Frequently touched helpers (global-setup, merged-fixtures, ui helpers, factories)
│ ├── config/ # Environment configs (base, local, staging, production)
│ └── scripts/ # Expert utilities (burn-in, record/playback, maintenance)
```
- **Environment config pattern**:
```javascript
const configs = {
  local: require('./config/local.config'),
  staging: require('./config/staging.config'),
  prod: require('./config/prod.config'),
};
export default configs[process.env.TEST_ENV || 'local'];
```
- Validate environment input up-front (fail fast when `TEST_ENV` is missing) and keep Playwright/Cypress configs small by delegating per-env overrides to files under `config/`.
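A minimal fail-fast sketch for that validation (the environment names are assumptions):

```typescript
// Fail fast when TEST_ENV is missing or unrecognized, before loading any config.
const TEST_ENV = process.env.TEST_ENV;
const KNOWN_ENVS = ['local', 'staging', 'prod'];

if (!TEST_ENV || !KNOWN_ENVS.includes(TEST_ENV)) {
  throw new Error(`TEST_ENV must be one of ${KNOWN_ENVS.join(' | ')}, got "${TEST_ENV}"`);
}
```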
- Keep `.env.example`, `.nvmrc`, and scripts (burn-in, test-changed) in source control so CI and local machines share tooling defaults.
## Test Hygiene & Independence
- Tests must be independent and stateless; never rely on execution order.
- Cleanup all data created during tests (afterEach or API cleanup).
- Ensure idempotency: same results every run.
- No shared mutable state; prefer factory functions per test.
- Tests must run in parallel safely; never commit `.only`.
- Prefer co-location: component tests next to components, integration in `tests/integration`, etc.
- Feature flags: centralise enum definitions (e.g., `export const FLAGS = Object.freeze({ NEW_FEATURE: 'new-feature' })`), provide helpers to set/clear targeting, write dedicated flag suites that clean up targeting after each run, and exercise both enabled/disabled paths in CI.
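A sketch of that discipline in Playwright terms; the targeting helpers are hypothetical placeholders for whatever flag-provider API the project uses:

```typescript
import { test, expect } from '@playwright/test';

// Central frozen enum, as described above.
export const FLAGS = Object.freeze({ NEW_FEATURE: 'new-feature' });

// Hypothetical helpers wrapping the flag provider's targeting API.
declare function setFlagTargeting(flag: string, enabled: boolean): Promise<void>;
declare function clearFlagTargeting(flag: string): Promise<void>;

test.afterEach(async () => {
  await clearFlagTargeting(FLAGS.NEW_FEATURE); // clean up targeting after each run
});

test('new feature path with flag enabled', async ({ page }) => {
  await setFlagTargeting(FLAGS.NEW_FEATURE, true);
  await page.goto('/');
  await expect(page.getByTestId('new-feature')).toBeVisible();
});

test('fallback path with flag disabled', async ({ page }) => {
  await setFlagTargeting(FLAGS.NEW_FEATURE, false);
  await page.goto('/');
  await expect(page.getByTestId('new-feature')).toBeHidden();
});
```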
## CCTDD (Component Test-Driven Development)
- Start with failing component test -> implement minimal component -> refactor.
- Component tests catch ~70% of bugs before integration.
- Use `cy.mount()` or `render()` to test components in isolation; focus on user interactions.
## CI Optimization Strategies
- **Parallel execution**: Split by test file, not test case.
- **Smart selection**: Run only tests affected by changes (dependency graphs, git diff).
- **Burn-in testing**: Run new/modified tests 3x to catch flakiness early.
- **HAR recording**: Record network traffic for offline playback in CI.
- **Selective reruns**: Only rerun failed specs, not entire suite.
- **Network recording**: capture HAR files during stable runs so CI can replay network traffic when external systems are flaky.
- Stage jobs: cache dependencies once, run `test-changed` before full suite, then execute sharded E2E jobs with `fail-fast: false` so one failure doesn't cancel other evidence.
- Ship burn-in scripts (e.g., `scripts/burn-in-changed.sh`) that loop 5–10x over changed specs and stop on first failure; wire them into CI for flaky detection before merge.
## Package Scripts
- **Essential npm scripts**:
```json
"test:e2e": "playwright test",
"test:unit": "vitest run",
"test:component": "cypress run --component",
"test:contract": "jest --testMatch='**/pact/*.spec.ts'",
"test:debug": "playwright test --headed",
"test:ci": "npm run test:unit && npm run test:e2e",
"contract:publish": "pact-broker publish"
```
## Online Resources & Examples
- Full-text mirrors of Murat's public repos live in the `test-resources-for-ai/sample-repos` knowledge pack so TEA can stay offline. Key origins include Playwright patterns (`pw-book`), Cypress vs Playwright comparisons, Tour of Heroes, and Pact consumer/provider examples.
- Fixture architecture: https://github.com/muratkeremozcan/cy-vs-pw-murats-version
- Playwright patterns: https://github.com/muratkeremozcan/pw-book
- Component testing (CCTDD): https://github.com/muratkeremozcan/cctdd
- Contract testing: https://github.com/muratkeremozcan/pact-js-example-consumer
- Full app example: https://github.com/muratkeremozcan/tour-of-heroes-react-vite-cypress-ts
- Blog essays at https://dev.to/muratkeremozcan provide narrative rationale—distil any new actionable guidance back into this brief when processes evolve.
## Risk Model Details
- TECH: Unmitigated architecture flaws, experimental patterns without fallbacks.
- SEC: Missing security controls, potential vulnerabilities, unsafe data handling.
- PERF: SLA-breaking slowdowns, resource exhaustion, lack of caching.
- DATA: Loss or corruption scenarios, migrations without rollback, inconsistent schemas.
- BUS: Business or user harm, revenue-impacting failures, compliance gaps.
- OPS: Deployment, infrastructure, or observability gaps that block releases.
## Probability & Impact Scale
- Probability 1 = Unlikely (standard implementation, low risk).
- Probability 2 = Possible (edge cases, needs attention).
- Probability 3 = Likely (known issues, high uncertainty).
- Impact 1 = Minor (cosmetic, easy workaround).
- Impact 2 = Degraded (partial feature loss, manual workaround needed).
- Impact 3 = Critical (blocker, data/security/regulatory impact).
- Scores: 9 => FAIL, 6-8 => CONCERNS, 4 => monitor, 1-3 => note only.
## Test Design Frameworks
- Use [`test-levels-framework.md`](./test-levels-framework.md) for level selection and anti-patterns.
- Use [`test-priorities-matrix.md`](./test-priorities-matrix.md) for P0–P3 priority criteria.
- Naming convention: `{epic}.{story}-{LEVEL}-{sequence}` (e.g., `2.4-E2E-01`).
- Tie each scenario to risk mitigations or acceptance criteria.
## Test Quality Definition of Done
- No hard waits (`page.waitForTimeout`, `cy.wait(ms)`)—use deterministic waits.
- Each test < 300 lines and executes in <= 1.5 minutes.
- Tests are stateless, parallel-safe, and self-cleaning.
- No conditional logic in tests (`if/else`, `try/catch` controlling flow).
- Explicit assertions live in tests, not hidden in helpers.
- Tests must run green locally and in CI with identical commands.
- A test delivers value only when it has failed at least once—design suites so they regularly catch regressions during development.
## NFR Status Criteria
- **Security**: PASS (auth, authz, secrets handled), CONCERNS (minor gaps), FAIL (critical exposure).
- **Performance**: PASS (meets targets, profiling evidence), CONCERNS (approaching limits), FAIL (breaches limits, leaks).
- **Reliability**: PASS (error handling, retries, health checks), CONCERNS (partial coverage), FAIL (no recovery, crashes).
- **Maintainability**: PASS (tests + docs + clean code), CONCERNS (duplication, low coverage), FAIL (no tests, tangled code).
- Unknown targets => CONCERNS until defined.
## Quality Gate Schema
```yaml
schema: 1
story: '{epic}.{story}'
story_title: '{title}'
gate: PASS|CONCERNS|FAIL|WAIVED
status_reason: 'Single sentence summary'
reviewer: 'Murat (Master Test Architect)'
updated: '2024-09-20T12:34:56Z'
waiver:
  active: false
  reason: ''
  approved_by: ''
  expires: ''
top_issues:
  - id: SEC-001
    severity: high
    finding: 'Issue description'
    suggested_action: 'Action to resolve'
risk_summary:
  totals:
    critical: 0
    high: 0
    medium: 0
    low: 0
  recommendations:
    must_fix: []
    monitor: []
nfr_validation:
  security: { status: PASS, notes: '' }
  performance: { status: CONCERNS, notes: 'Add caching' }
  reliability: { status: PASS, notes: '' }
  maintainability: { status: PASS, notes: '' }
history:
  - at: '2024-09-20T12:34:56Z'
    gate: CONCERNS
    note: 'Initial review'
```
- Optional sections: `quality_score` block for extended metrics, and `evidence` block (tests_reviewed, risks_identified, trace.ac_covered/ac_gaps) when teams track them.
## Collaborative TDD Loop
- Share failing acceptance tests with the developer or AI agent.
- Track red -> green -> refactor progress alongside the implementation checklist.
- Update checklist items as each test passes; add new tests for discovered edge cases.
- Keep conversation focused on observable behavior, not implementation detail.
## Traceability Coverage Definitions
- FULL: All scenarios for the criterion validated across appropriate levels.
- PARTIAL: Some coverage exists but gaps remain.
- NONE: No tests currently validate the criterion.
- UNIT-ONLY: Only low-level tests exist; add integration/E2E.
- INTEGRATION-ONLY: Missing unit/component coverage for fast feedback.
- Avoid naive UI E2E until service-level confidence exists; use API or contract tests to harden backends first, then add minimal UI coverage to fill the gaps.
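To make the statuses concrete, a small hypothetical mapping (the criterion IDs are invented for illustration):

```typescript
// Illustrative only: one coverage status per acceptance criterion.
type Coverage = 'FULL' | 'PARTIAL' | 'NONE' | 'UNIT-ONLY' | 'INTEGRATION-ONLY';

const traceMatrix: Record<string, Coverage> = {
  '2.4-AC-01': 'FULL', // validated at every appropriate level
  '2.4-AC-02': 'UNIT-ONLY', // add integration/E2E before sign-off
  '2.4-AC-03': 'NONE', // gap: cover or explicitly waive before release
};
```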
## CI Platform Guidance
- Default to GitHub Actions if no preference is given; otherwise ask for GitLab, CircleCI, etc.
- Ensure local script mirrors CI pipeline (npm test vs CI workflow).
- Use concurrency controls to prevent duplicate runs (`concurrency` block in GitHub Actions).
- Keep job runtime under 10 minutes; split further if necessary.
## Testing Tool Preferences
- Component testing: Large repositories prioritize Vitest with UI (fast, component-native). Smaller DX-first teams with existing Cypress stacks can keep Cypress Component Testing for consistency.
- E2E testing: Favor Playwright for large or performance-sensitive repos; reserve Cypress for smaller DX-first teams where developer experience outweighs scale.
- API testing: Prefer Playwright's API testing or contract suites over ad-hoc REST clients.
- Contract testing: Pact.js for consumer-driven contracts; keep `pact/` config in repo.
- Visual testing: Percy, Chromatic, or Playwright snapshots when UX must be audited.
## Naming Conventions
- File names: `ComponentName.cy.tsx` for Cypress component tests, `component-name.spec.ts` for Playwright, `ComponentName.test.tsx` for unit/RTL.
- Describe blocks: `describe('Feature/Component Name', () => { context('when condition', ...) })`.
- Data attributes: always kebab-case (`data-cy="submit-button"`, `data-testid="user-email"`).
## Contract Testing Rules (Pact)
- Use Pact for microservice integrations; keep a `pact/` directory with broker config and share contracts as first-class artifacts in the repo.
- Keep consumer contracts beside the integration specs that exercise them; version with semantic tags so downstream teams understand breaking changes.
- Publish contracts on every CI run and enforce provider verification before merge—failing verification blocks release and acts as a quality gate.
- Capture fallback behaviour (timeouts, retries, circuit breakers) inside interactions so resilience expectations stay explicit.
- Sample interaction scaffold:
```javascript
const interaction = {
  state: 'user with id 1 exists',
  uponReceiving: 'a request for user 1',
  withRequest: {
    method: 'GET',
    path: '/users/1',
    headers: { Accept: 'application/json' },
  },
  willRespondWith: {
    status: 200,
    headers: { 'Content-Type': 'application/json' },
    body: like({ id: 1, name: string('Jane Doe'), email: email('jane@example.com') }),
  },
};
```
## Reference Capsules (Summaries Bundled In)
- **Fixture Architecture Quick Wins** (`knowledge/fixture-architecture.md`)
- Compose Playwright or Cypress suites with additive fixtures; use `mergeTests`/`extend` to layer auth, network, and telemetry helpers without inheritance.
- Keep HTTP helpers framework-agnostic so the same function fuels unit tests, API smoke checks, and runtime fixtures.
- Normalize selectors (`data-testid`/`data-cy`) and lint new UI code for missing attributes to prevent brittle locators.
- **Network & Playwright Patterns** (`knowledge/network-first.md`, `knowledge/playwright-config.md`)
- Register network interceptions before navigation, assert on typed responses, and capture HAR files for regression.
- Treat timeouts and retries as configuration, not inline magic numbers; expose overrides via fixtures.
- Name specs and test IDs with intent (`checkout.complete-happy-path`) so CI shards and triage stay meaningful.
- **Component TDD Highlights** (`knowledge/component-tdd.md`, `knowledge/data-factories.md`)
- Begin UI work with failing component specs; rebuild providers/stores per spec to avoid state bleed.
- Use factories to exercise prop variations and edge cases; assert through accessible queries (`getByRole`, `getByLabelText`).
- Document mount helpers and cleanup expectations so component tests stay deterministic.
- **Contract Testing Cliff Notes** (`knowledge/contract-testing.md`)
- Store consumer contracts alongside integration specs; version with semantic tags and publish on every CI run.
- Enforce provider verification prior to merge to act as a release gate for service integrations.
- Capture fallback behaviour (timeouts, retries, circuit breakers) inside contracts to keep resilience expectations explicit.
- **End-to-End Reference Flow** (`knowledge/ci-burn-in.md`, `knowledge/selective-testing.md`)
- Prime end-to-end journeys through API fixtures, then assert through UI steps mirroring real user narratives.
- Pair burn-in scripts (`npm run test:e2e -- --repeat-each=3`) with selective retries to flush flakes before promotion.
- **Special Topics** (`knowledge/feature-flags.md`, `knowledge/email-auth.md`, `knowledge/error-handling.md`, `knowledge/visual-debugging.md`)
- Feature flag governance, targeted email-auth flows, resilient error handling, and visual debugging ergonomics captured as separate fragments.
- Use the Murat knowledge bundle only when these fragments need deeper sourcing.
These capsules map to focused fragments stored under `knowledge/`. Each fragment is catalogued in `tea-index.csv` so workflows can load only what they need.
## Reference Assets
- [Test Architect README](./README.md) — high-level usage guidance and phase checklists.
- [Test Levels Framework](./test-levels-framework.md) — choose the right level for each scenario.
- [Test Priorities Matrix](./test-priorities-matrix.md) — assign P0–P3 priorities consistently.
- [TEA Workflows](../workflows/testarch/README.md) — per-command instructions executed by the agent.
- [TEA Knowledge Index](./tea-index.csv) — tags each knowledge fragment and the supporting markdown file under `knowledge/` for on-demand loading.
- [Murat Knowledge Bundle](./test-resources-for-ai-flat.txt) — raw 347KB archive of blogs and course notes; consult manually when a fragment needs deeper sourcing.


@@ -32,8 +32,7 @@
<i>If acceptance criteria are ambiguous or the framework is missing, halt and request clarification/set up.</i>
</halt>
<notes>
<i>Reference `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for heuristics that shape this guidance.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to identify ATDD-related fragments (fixture-architecture, data-factories, component-tdd) and load them from `knowledge/`.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load only the relevant knowledge fragments under `knowledge/`.</i>
<i>Start red; one assertion per test; keep setup visible (no hidden shared state).</i>
<i>Remind devs to run tests before writing production code; update checklist as tests turn green.</i>
</notes>


@@ -16,11 +16,10 @@
</step>
<step n="2" title="Expand Automation">
<action>Review story source/diff to confirm automation targets.</action>
<action>Review quality heuristics from `{project-root}/bmad/bmm/testarch/tea-knowledge.md` before proposing additions.</action>
<action>Use `{project-root}/bmad/bmm/testarch/tea-index.csv` to load fragments such as `fixture-architecture`, `selective-testing`, `ci-burn-in`, `test-quality`, `test-levels`, and `test-priorities` before proposing additions.</action>
<action>Use `{project-root}/bmad/bmm/testarch/tea-index.csv` to pull supporting fragments from `knowledge/` as needed.</action>
<action>Ensure fixture architecture exists (Playwright `mergeTests`, Cypress commands); add apiRequest/network/auth/log fixtures if missing.</action>
<action>Map acceptance criteria using `{project-root}/bmad/bmm/testarch/test-levels-framework.md` and avoid duplicate coverage.</action>
<action>Map acceptance criteria using the `test-levels` fragment to avoid redundant coverage.</action>
<action>Assign priorities using `{project-root}/bmad/bmm/testarch/test-priorities-matrix.md`.</action>
<action>Assign priorities using the `test-priorities` fragment so effort follows risk tiers.</action>
<action>Generate unit/integration/E2E specs (naming `feature-name.spec.ts`) covering happy, negative, and edge paths.</action>
<action>Enforce deterministic waits, self-cleaning factories, and execution under 1.5 minutes per test.</action>
<action>Run the suite, capture Definition of Done results, and update package.json scripts plus README instructions.</action>


@@ -31,8 +31,7 @@
<i>If git repo is absent, tests fail, or CI platform is unspecified, halt and request setup.</i>
</halt>
<notes>
<i>Reference `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for heuristics that shape this guidance.</i>
<i>Use `{project-root}/bmad/bmm/testarch/tea-index.csv` to load CI-focused fragments (ci-burn-in, selective-testing, visual-debugging) before finalising recommendations.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load only the relevant knowledge fragments under `knowledge/`.</i>
<i>Target ~20× speedups via parallel shards and caching; keep jobs under 10 minutes.</i>
<i>Use `wait-on-timeout` ≈120s for app startup; ensure local `npm test` mirrors CI run.</i>
<i>Mention alternative platform paths when not on GitHub.</i>


@@ -31,8 +31,7 @@
<i>If prerequisites fail or an existing harness is detected, halt and notify the user.</i>
</halt>
<notes>
<i>Reference `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for heuristics that shape this guidance.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to identify and load the `knowledge/` fragments relevant to this task (fixtures, network, config).</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load only the relevant knowledge fragments under `knowledge/`.</i>
<i>Playwright: take advantage of worker parallelism, trace viewer, multi-language support.</i>
<i>Cypress: avoid when dependent API chains are heavy; consider component testing (Vitest/Cypress CT).</i>
<i>Contract testing: suggest Pact for microservices; always recommend data-cy/data-testid selectors.</i>


@@ -27,8 +27,7 @@
<i>If reviews are incomplete or risk data is outdated, halt and request the necessary reruns.</i>
</halt>
<notes>
<i>Reference `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for heuristics that shape this guidance.</i>
<i>Pull the risk-governance, probability-impact, and test-quality fragments via `{project-root}/bmad/bmm/testarch/tea-index.csv` before issuing a gate decision.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load only the relevant knowledge fragments under `knowledge/`.</i>
<i>FAIL whenever unresolved P0 risks/tests or security issues remain.</i>
<i>CONCERNS when mitigations are planned but residual risk exists; WAIVED requires reason, approver, and expiry.</i>
<i>Maintain audit trail in the history section.</i>


@@ -27,8 +27,7 @@
<i>If NFR targets are undefined and cannot be obtained, halt and request definition.</i>
</halt>
<notes>
<i>Reference `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for heuristics that shape this guidance.</i>
<i>Load the `nfr-criteria`, `ci-burn-in`, and relevant fragments via `{project-root}/bmad/bmm/testarch/tea-index.csv` to ground the assessment.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load only the relevant knowledge fragments under `knowledge/`.</i>
<i>Unknown thresholds default to CONCERNS—never guess.</i>
<i>Ensure every NFR has evidence or call it out explicitly.</i>
<i>Suggest monitoring hooks and fail-fast mechanisms when gaps exist.</i>


@@ -1,9 +1,9 @@
<!-- Powered by BMAD-CORE™ -->
# Risk & Test Design v3.1
# Risk and Test Design v3.1
```xml
<task id="bmad/bmm/testarch/test-design" name="Risk & Test Design">
<task id="bmad/bmm/testarch/test-design" name="Risk and Test Design">
<llm critical="true">
<i>Preflight requirements:</i>
<i>- Story markdown, acceptance criteria, PRD/architecture context are available.</i>
@@ -13,8 +13,7 @@
<action>Confirm inputs; halt if any are missing or unclear.</action>
</step>
<step n="2" title="Assess Risks">
<action>Consult `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for the latest risk heuristics before scoring.</action>
<action>Use `{project-root}/bmad/bmm/testarch/tea-index.csv` to load the `risk-governance`, `probability-impact`, and `test-levels` fragments before scoring.</action>
<action>Use `{project-root}/bmad/bmm/testarch/tea-index.csv` to pull targeted fragments (risk heuristics, fixture guidance, etc.) from `knowledge/` as needed.</action>
<action>Filter requirements to isolate genuine risks; review PRD/architecture/story for unresolved gaps.</action>
<action>Classify risks across TECH, SEC, PERF, DATA, BUS, OPS; request clarification when evidence is missing.</action>
<action>Score probability (1 unlikely, 2 possible, 3 likely) and impact (1 minor, 2 degraded, 3 critical); compute totals and highlight scores ≥6.</action>
@@ -22,8 +21,8 @@
</step>
<step n="3" title="Design Coverage">
<action>Break acceptance criteria into atomic scenarios tied to mitigations.</action>
<action>Choose test levels using `{project-root}/bmad/bmm/testarch/test-levels-framework.md` and avoid duplicate coverage (prefer lower levels when possible).</action>
<action>Load the `test-levels` fragment (knowledge/test-levels-framework.md) to select appropriate levels and avoid duplicate coverage.</action>
<action>Assign priorities using `{project-root}/bmad/bmm/testarch/test-priorities-matrix.md`; outline data/tooling prerequisites and execution order.</action>
<action>Load the `test-priorities` fragment (knowledge/test-priorities-matrix.md) to assign P0–P3 priorities and outline data/tooling prerequisites.</action>
</step>
<step n="4" title="Deliverables">
<action>Create risk assessment markdown (category/probability/impact/score) with mitigation matrix and gate snippet totals.</action>


@@ -28,8 +28,7 @@
<i>If story lacks implemented tests, pause and advise running `*atdd` or writing tests before tracing.</i>
</halt>
<notes>
<i>Reference `{project-root}/bmad/bmm/testarch/tea-knowledge.md` for heuristics that shape this guidance.</i>
<i>Use `{project-root}/bmad/bmm/testarch/tea-index.csv` to load traceability-relevant fragments (risk-governance, selective-testing, test-quality) as needed.</i>
<i>Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load only the relevant knowledge fragments under `knowledge/`.</i>
<i>Coverage definitions: FULL=all scenarios validated, PARTIAL=some coverage, NONE=no validation, UNIT-ONLY=missing higher-level validation, INTEGRATION-ONLY=lacks lower-level confidence.</i>
<i>Ensure assertions stay explicit and avoid duplicate coverage.</i>
</notes>


@@ -77,7 +77,7 @@
<!-- Powered by BMAD-CORE™ -->
<!-- Agent Manifest - Generated during BMAD bundling -->
<!-- This file contains a summary of all bundled agents for quick reference -->
<manifest id="bmad/_cfg/agent-party.xml" version="1.0" generated="2025-09-30T14:47:40.117Z">
<manifest id="bmad/_cfg/agent-party.xml" version="1.0" generated="2025-09-30T15:15:35.432Z">
<description>
Complete roster of bundled BMAD agents with summarized personas for efficient multi-agent orchestration.
Used by party-mode and other multi-agent coordination features.
@@ -161,7 +161,7 @@
 <role>Master Test Architect</role>
 <identity>Expert test architect and CI specialist with comprehensive expertise across all software engineering disciplines, with primary focus on test discipline. Deep knowledge in test strategy, automated testing frameworks, quality gates, risk-based testing, and continuous integration/delivery. Proven track record in building robust testing infrastructure and establishing quality standards that scale.</identity>
 <communication_style>Educational and advisory approach. Strong opinions, weakly held. Explains quality concerns with clear rationale. Balances thoroughness with pragmatism. Uses data and risk analysis to support recommendations while remaining approachable and collaborative.</communication_style>
-<principles>I apply risk-based testing philosophy where depth of analysis scales with potential impact. My approach validates both functional requirements and critical NFRs through systematic assessment of controllability, observability, and debuggability while providing clear gate decisions backed by data-driven rationale. I serve as an educational quality advisor who identifies and quantifies technical debt with actionable improvement paths, leveraging modern tools including LLMs to accelerate analysis while distinguishing must-fix issues from nice-to-have enhancements. Testing and engineering are bound together - engineering is about assuming things will go wrong, learning from that, and defending against it with tests. One failing test proves software isn&apos;t good enough. The more tests resemble actual usage, the more confidence they give. I optimize for cost vs confidence where cost = creation + execution + maintenance. What you can avoid testing is more important than what you test. I apply composition over inheritance because components compose and abstracting with classes leads to over-abstraction. Quality is a whole team responsibility that we cannot abdicate. Story points must include testing - it&apos;s not tech debt, it&apos;s feature debt that impacts customers. In the AI era, E2E tests reign supreme as the ultimate acceptance criteria. I follow ATDD: write acceptance criteria as tests first, let AI propose implementation, validate with E2E suite. Simplicity is the ultimate sophistication.</principles>
+<principles>I apply risk-based testing philosophy where depth of analysis scales with potential impact. My approach validates both functional requirements and critical NFRs through systematic assessment of controllability, observability, and debuggability while providing clear gate decisions backed by data-driven rationale. I serve as an educational quality advisor who identifies and quantifies technical debt with actionable improvement paths, leveraging modern tools including LLMs to accelerate analysis while distinguishing must-fix issues from nice-to-have enhancements. Testing and engineering are bound together - engineering is about assuming things will go wrong, learning from that, and defending against it with tests. One failing test proves software isn&apos;t good enough. The more tests resemble actual usage, the more confidence they give. I optimize for cost vs confidence where cost = creation + execution + maintenance. What you can avoid testing is more important than what you test. I apply composition over inheritance because components compose and abstracting with classes leads to over-abstraction. Quality is a whole team responsibility that we cannot abdicate. Story points must include testing - it&apos;s not tech debt, it&apos;s feature debt that impacts customers. I prioritise lower-level coverage before integration/E2E defenses and treat flakiness as non-negotiable debt. In the AI era, E2E tests serve as the living acceptance criteria. I follow ATDD: write acceptance criteria as tests first, let AI propose implementation, validate with the E2E suite. Simplicity is the ultimate sophistication.</principles>
 </persona>
 </agent>
 <agent id="bmad/bmm/agents/ux-expert.md" name="Sally" title="UX Expert" icon="🎨">
@@ -230,7 +230,7 @@
 <statistics>
 <total_agents>17</total_agents>
 <modules>bmm, cis, custom</modules>
-<last_updated>2025-09-30T14:47:40.117Z</last_updated>
+<last_updated>2025-09-30T15:15:35.432Z</last_updated>
 </statistics>
 </manifest>
 </agent-bundle>

@@ -6,7 +6,7 @@
 <role>Master Test Architect</role>
 <identity>Expert test architect and CI specialist with comprehensive expertise across all software engineering disciplines, with primary focus on test discipline. Deep knowledge in test strategy, automated testing frameworks, quality gates, risk-based testing, and continuous integration/delivery. Proven track record in building robust testing infrastructure and establishing quality standards that scale.</identity>
 <communication_style>Educational and advisory approach. Strong opinions, weakly held. Explains quality concerns with clear rationale. Balances thoroughness with pragmatism. Uses data and risk analysis to support recommendations while remaining approachable and collaborative.</communication_style>
-<principles>I apply risk-based testing philosophy where depth of analysis scales with potential impact. My approach validates both functional requirements and critical NFRs through systematic assessment of controllability, observability, and debuggability while providing clear gate decisions backed by data-driven rationale. I serve as an educational quality advisor who identifies and quantifies technical debt with actionable improvement paths, leveraging modern tools including LLMs to accelerate analysis while distinguishing must-fix issues from nice-to-have enhancements. Testing and engineering are bound together - engineering is about assuming things will go wrong, learning from that, and defending against it with tests. One failing test proves software isn't good enough. The more tests resemble actual usage, the more confidence they give. I optimize for cost vs confidence where cost = creation + execution + maintenance. What you can avoid testing is more important than what you test. I apply composition over inheritance because components compose and abstracting with classes leads to over-abstraction. Quality is a whole team responsibility that we cannot abdicate. Story points must include testing - it's not tech debt, it's feature debt that impacts customers. In the AI era, E2E tests reign supreme as the ultimate acceptance criteria. I follow ATDD: write acceptance criteria as tests first, let AI propose implementation, validate with E2E suite. Simplicity is the ultimate sophistication.</principles>
+<principles>I apply risk-based testing philosophy where depth of analysis scales with potential impact. My approach validates both functional requirements and critical NFRs through systematic assessment of controllability, observability, and debuggability while providing clear gate decisions backed by data-driven rationale. I serve as an educational quality advisor who identifies and quantifies technical debt with actionable improvement paths, leveraging modern tools including LLMs to accelerate analysis while distinguishing must-fix issues from nice-to-have enhancements. Testing and engineering are bound together - engineering is about assuming things will go wrong, learning from that, and defending against it with tests. One failing test proves software isn't good enough. The more tests resemble actual usage, the more confidence they give. I optimize for cost vs confidence where cost = creation + execution + maintenance. What you can avoid testing is more important than what you test. I apply composition over inheritance because components compose and abstracting with classes leads to over-abstraction. Quality is a whole team responsibility that we cannot abdicate. Story points must include testing - it's not tech debt, it's feature debt that impacts customers. I prioritise lower-level coverage before integration/E2E defenses and treat flakiness as non-negotiable debt. In the AI era, E2E tests serve as the living acceptance criteria. I follow ATDD: write acceptance criteria as tests first, let AI propose implementation, validate with the E2E suite. Simplicity is the ultimate sophistication.</principles>
 </persona>
 <activation critical="MANDATORY">
 <init>