fix: Prevent mock data implementations with infrastructure features

Problem:
The coding agent can implement in-memory storage (e.g., `dev-store.ts` with
`globalThis`) instead of a real database. These implementations pass all tests
because data persists during runtime, but data is lost on server restart.

This is a root cause for #68 - agent "passes" features that don't actually work.

Solution:
1. Add 5 mandatory Infrastructure Features (indices 0-4) that run first:
   - Feature 0: Database connection established
   - Feature 1: Database schema applied correctly
   - Feature 2: Data persists across server restart (CRITICAL)
   - Feature 3: No mock data patterns in codebase
   - Feature 4: Backend API queries real database

2. Add STEP 5.7: Server Restart Persistence Test to coding prompt:
   - Create test data, stop server, restart, verify data still exists

3. Extend grep patterns for mock detection in STEP 5.6:
   - globalThis., devStore, dev-store, mockData, fakeData
   - TODO.*real, STUB, MOCK, new Map() as data stores

Changes:
- .claude/templates/initializer_prompt.template.md - Infrastructure features
- .claude/templates/coding_prompt.template.md - STEP 5.6/5.7 enhancements
- .claude/commands/create-spec.md - Phase 3b database question

Backwards Compatible:
- Works with YOLO mode (uses bash/grep, not browser automation)
- Stateless apps can skip database features via create-spec question

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
cabana8471
2026-01-25 08:01:30 +01:00
parent 486979c3d9
commit 8e23fee094
3 changed files with 223 additions and 32 deletions

View File

@@ -95,6 +95,27 @@ Ask the user about their involvement preference:
**For Detailed Mode users**, ask specific tech questions about frontend, backend, database, etc.
### Phase 3b: Database Requirements (MANDATORY)
**Always ask this question regardless of mode:**
> "One foundational question about data storage:
>
> **Does this application need to store user data persistently?**
>
> 1. **Yes, needs a database** - Users create, save, and retrieve data (most apps)
> 2. **No, stateless** - Pure frontend, no data storage needed (calculators, static sites)
> 3. **Not sure** - Let me describe what I need and you decide"
**Branching logic:**
- **If "Yes" or "Not sure"**: Continue normally. The spec will include database in tech stack and the initializer will create 5 mandatory Infrastructure features (indices 0-4) to verify database connectivity and persistence.
- **If "No, stateless"**: Note this in the spec. Skip database from tech stack. Infrastructure features will be simplified (no database persistence tests). Mark this clearly:
```xml
<database>none - stateless application</database>
```
## Phase 4: Features (THE MAIN PHASE)
This is where you spend most of your time. Ask questions in plain language that anyone can answer.
@@ -207,12 +228,23 @@ After gathering all features, **you** (the agent) should tally up the testable f
**Typical ranges for reference:**
- **Simple apps** (todo list, calculator, notes): ~20-50 features
- **Medium apps** (blog, task manager with auth): ~100 features
- **Advanced apps** (e-commerce, CRM, full SaaS): ~150-200 features
- **Simple apps** (todo list, calculator, notes): ~25-55 features (includes 5 infrastructure)
- **Medium apps** (blog, task manager with auth): ~105 features (includes 5 infrastructure)
- **Advanced apps** (e-commerce, CRM, full SaaS): ~155-205 features (includes 5 infrastructure)
These are just reference points - your actual count should come from the requirements discussed.
**MANDATORY: Infrastructure Features**
If the app requires a database (Phase 3b answer was "Yes" or "Not sure"), you MUST include 5 Infrastructure features (indices 0-4):
1. Database connection established
2. Database schema applied correctly
3. Data persists across server restart
4. No mock data patterns in codebase
5. Backend API queries real database
These features ensure the coding agent implements a real database, not mock data or in-memory storage.
**How to count features:**
For each feature area discussed, estimate the number of discrete, testable behaviors:
@@ -225,17 +257,20 @@ For each feature area discussed, estimate the number of discrete, testable behav
> "Based on what we discussed, here's my feature breakdown:
>
> - **Infrastructure (required)**: 5 features (database setup, persistence verification)
> - [Category 1]: ~X features
> - [Category 2]: ~Y features
> - [Category 3]: ~Z features
> - ...
>
> **Total: ~N features**
> **Total: ~N features** (including 5 infrastructure)
>
> Does this seem right, or should I adjust?"
Let the user confirm or adjust. This becomes your `feature_count` for the spec.
**Important:** The first 5 features (indices 0-4) created by the initializer MUST be the Infrastructure category with no dependencies. All other features depend on these.
## Phase 5: Technical Details (DERIVED OR DISCUSSED)
**For Quick Mode users:**