---
title: "AI-Generated Testing: Why Most Approaches Fail"
description: "How Playwright-Utils, TEA workflows, and Playwright MCPs solve AI test quality problems"
---
AI-generated tests frequently fail in production because they lack systematic quality standards. This document explains the problem and presents a solution combining three components: Playwright-Utils, TEA (Test Architect), and Playwright MCPs.
:::note[Source]
This article is adapted from *The Testing Meta Most Teams Have Not Caught Up To Yet* by Murat K Ozcan.
:::
## The Problem with AI-Generated Tests
When teams use AI to generate tests without structure, they often produce what can be called "slop factory" outputs:
| Issue | Description |
|---|---|
| Redundant coverage | Multiple tests covering the same functionality |
| Incorrect assertions | Tests that pass but don't actually verify behavior |
| Flaky tests | Non-deterministic tests that randomly pass or fail |
| Unreviewable diffs | Generated code too verbose or inconsistent to review |
The core problem is that prompt-driven test generation leans into nondeterminism, the exact opposite of the determinism that testing exists to protect.
:::caution[The Paradox]
AI excels at generating code quickly, but testing requires precision and consistency. Without guardrails, AI-generated tests amplify the chaos they're meant to prevent.
:::
## The Solution: A Three-Part Stack
The solution combines three components that work together to enforce quality:
### Playwright-Utils
Bridges the gap between Cypress ergonomics and Playwright's capabilities by standardizing commonly reinvented primitives through utility functions.
| Utility | Purpose |
|---|---|
| api-request | API calls with schema validation |
| auth-session | Authentication handling |
| intercept-network-call | Network mocking and interception |
| recurse | Retry logic and polling |
| log | Structured logging |
| network-recorder | Record and replay network traffic |
| burn-in | Smart test selection for CI |
| network-error-monitor | HTTP error detection |
| file-utils | CSV/PDF handling |
These utilities eliminate the need to reinvent authentication, API calls, retries, and logging for every project.
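To make "commonly reinvented primitives" concrete, here is a minimal sketch of the kind of ad-hoc plumbing teams write without shared utilities, using only stock @playwright/test APIs; the /api/orders endpoint and SHIPPED status are invented for illustration. Utilities such as api-request and recurse exist so this validation, retry, and logging logic isn't rewritten in every repository.

```typescript
import { test, expect } from '@playwright/test';

// Hand-rolled API call plus polling: the pattern that utilities like
// api-request and recurse standardize. Endpoint and payload are hypothetical.
test('order eventually reaches SHIPPED status', async ({ request }) => {
  const created = await request.post('/api/orders', { data: { sku: 'ABC-123', qty: 1 } });
  expect(created.ok()).toBeTruthy();
  const { id } = await created.json();

  // Poll the order status until it settles, instead of sprinkling fixed waits.
  await expect
    .poll(
      async () => {
        const res = await request.get(`/api/orders/${id}`);
        return (await res.json()).status;
      },
      { timeout: 15_000 }
    )
    .toBe('SHIPPED');
});
```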
### TEA (Test Architect Agent)
A quality operating model packaged as eight executable workflows spanning test design, CI/CD gates, and release readiness. TEA encodes test architecture expertise into repeatable processes.
| Workflow | Purpose |
|---|---|
| test-design | Risk-based test planning per epic |
| framework | Scaffold production-ready test infrastructure |
| ci | CI pipeline with selective testing |
| atdd | Acceptance test-driven development |
| automate | Prioritized test automation |
| test-review | Test quality audits (0-100 score) |
| nfr-assess | Non-functional requirements assessment |
| trace | Coverage traceability and gate decisions |
:::tip[Key Insight]
TEA doesn't just generate tests; it provides a complete quality operating model with workflows for planning, execution, and release gates.
:::
### Playwright MCPs
Model Context Protocol (MCP) servers enable real-time verification during test generation. Instead of inferring selectors and behavior from documentation, MCPs allow agents to:
- Run flows and confirm the DOM against the accessibility tree
- Validate network responses in real-time
- Discover actual functionality through interactive exploration
- Verify generated tests against live applications
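How these MCP servers are wired up depends on the coding agent or IDE in use. As a rough sketch, a client that reads an mcpServers-style JSON config can register Microsoft's Playwright MCP server as shown below; the exact file name and top-level key vary by client, so treat this as illustrative rather than canonical setup:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```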
## How They Work Together
The three components form a quality pipeline:
| Stage | Component | Action |
|---|---|---|
| Standards | Playwright-Utils | Provides production-ready patterns and utilities |
| Process | TEA Workflows | Enforces systematic test planning and review |
| Verification | Playwright MCPs | Validates generated tests against live applications |
**Before (AI-only):** 20 tests with redundant coverage, incorrect assertions, and flaky behavior.

**After (Full Stack):** Risk-based selection, verified selectors, validated behavior, reviewable code.
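As a hedged sketch of what an "after" test tends to look like, the example below uses accessibility-tree locators (the kind an MCP-driven exploration can confirm against the live app) and asserts on both the network response and the rendered UI; the /checkout route, button name, and status code are invented for illustration:

```typescript
import { test, expect } from '@playwright/test';

test('checkout shows confirmation after a successful order', async ({ page }) => {
  await page.goto('/checkout');

  // Role + accessible name: a locator that can be verified against the
  // accessibility tree rather than guessed from markup.
  const placeOrder = page.getByRole('button', { name: 'Place order' });

  // Assert on backend behavior, not just the UI: capture the order request
  // triggered by the click and check its response status.
  const [response] = await Promise.all([
    page.waitForResponse(
      (res) => res.url().includes('/api/orders') && res.request().method() === 'POST'
    ),
    placeOrder.click(),
  ]);
  expect(response.status()).toBe(201);

  await expect(page.getByRole('heading', { name: 'Order confirmed' })).toBeVisible();
});
```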
## Why This Matters
Traditional AI testing approaches fail because they:
- Lack quality standards — No consistent patterns or utilities
- Skip planning — Jump straight to test generation without risk assessment
- Can't verify — Generate tests without validating against actual behavior
- Don't review — No systematic audit of generated test quality
The three-part stack addresses each gap:
| Gap | Solution |
|---|---|
| No standards | Playwright-Utils provides production-ready patterns |
| No planning | TEA test-design creates risk-based test plans |
| No verification | Playwright MCPs validate against live applications |
| No review | TEA test-review audits quality with scoring |
This approach is sometimes called context engineering—loading domain-specific standards into AI context automatically rather than relying on prompts alone. TEA's tea-index.csv manifest loads relevant knowledge fragments so the AI doesn't relearn testing patterns each session.
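To illustrate the manifest idea (not TEA's actual schema, which isn't reproduced here), a knowledge-fragment index of this kind could be as simple as a CSV mapping topics to fragment files that are loaded into context on demand; the columns and paths below are hypothetical:

```csv
# Hypothetical layout, for illustration only; not the real tea-index.csv schema.
topic,fragment,when_to_load
network-interception,fragments/network-interception.md,automate
risk-based-planning,fragments/risk-based-planning.md,test-design
test-quality-rubric,fragments/test-quality-rubric.md,test-review
```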