fix: Add infrastructure features to prevent mock data implementations

Problem:
The coding agent can implement in-memory storage (e.g., `dev-store.ts` with
`globalThis`) instead of a real database. These implementations pass all tests
because data persists during runtime, but data is lost on server restart.

This is a root cause of #68: the agent "passes" features that don't actually work.

Solution:
1. Add 5 mandatory Infrastructure Features (indices 0-4) that run first:
   - Feature 0: Database connection established
   - Feature 1: Database schema applied correctly
   - Feature 2: Data persists across server restart (CRITICAL)
   - Feature 3: No mock data patterns in codebase
   - Feature 4: Backend API queries real database

2. Add STEP 5.7: Server Restart Persistence Test to coding prompt:
   - Create test data, stop server, restart, verify data still exists

3. Extend grep patterns for mock detection in STEP 5.6:
   - globalThis., devStore, dev-store, mockData, fakeData
   - TODO.*real, STUB, MOCK, new Map() as data stores

Changes:
- .claude/templates/initializer_prompt.template.md - Infrastructure features
- .claude/templates/coding_prompt.template.md - STEP 5.6/5.7 enhancements
- .claude/commands/create-spec.md - Phase 3b database question

Backwards Compatible:
- Works with YOLO mode (uses bash/grep, not browser automation)
- Stateless apps can skip database features via create-spec question

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
cabana8471
2026-01-25 08:01:30 +01:00
parent 486979c3d9
commit 8e23fee094
3 changed files with 223 additions and 32 deletions

@@ -36,9 +36,9 @@ Use the feature_create_bulk tool to add all features at once. You can create fea
- Feature count must match the `feature_count` specified in app_spec.txt
- Reference tiers for other projects:
- **Simple apps**: ~150 tests
- **Medium apps**: ~250 tests
- **Complex apps**: ~400+ tests
- **Simple apps**: ~155 tests (includes 5 infrastructure)
- **Medium apps**: ~255 tests (includes 5 infrastructure)
- **Complex apps**: ~405+ tests (includes 5 infrastructure)
- Both "functional" and "style" categories
- Mix of narrow tests (2-5 steps) and comprehensive tests (10+ steps)
- At least 25 tests MUST have 10+ steps each (more for complex apps)
@@ -60,8 +60,9 @@ Dependencies enable **parallel execution** of independent features. When specifi
2. **Can only depend on EARLIER features** (index must be less than current position)
3. **No circular dependencies** allowed
4. **Maximum 20 dependencies** per feature
5. **Foundation features (index 0-9)** should have NO dependencies
6. **60% of features after index 10** should have at least one dependency
5. **Infrastructure features (indices 0-4)** have NO dependencies - they run FIRST
6. **ALL features after index 4** MUST depend on `[0, 1, 2, 3, 4]` (infrastructure)
7. **60% of features after index 10** should have additional dependencies beyond infrastructure
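
The structural rules above can be checked mechanically. The following is a minimal sketch, assuming each feature is an object with `name` and an optional `depends_on_indices` array, as in the examples in this prompt. `Feature` and `validateDependencies` are hypothetical names, not part of the feature tooling. The sketch treats the infrastructure requirement as met when indices 0-4 are reachable directly or through another dependency; that is an interpretation, and a stricter check would require them to be listed explicitly. It does not check the 60% guideline.

```typescript
// Hypothetical shape, modeled on the example entries in this prompt.
interface Feature {
  name: string;
  category?: string;
  depends_on_indices?: number[];
}

const INFRA_COUNT = 5; // infrastructure features occupy indices 0-4
const MAX_DEPENDENCIES = 20;

// Returns human-readable rule violations; an empty array means the list passed.
function validateDependencies(features: Feature[]): string[] {
  const errors: string[] = [];
  // reach[i][j] is true when feature i depends on feature j, directly or transitively.
  const reach: boolean[][] = [];

  features.forEach((feature, i) => {
    const deps = feature.depends_on_indices ?? [];
    const closure: boolean[] = [];
    const label = `#${i} "${feature.name}"`;

    if (i < INFRA_COUNT && deps.length > 0) {
      errors.push(`${label}: infrastructure features must have no dependencies`);
    }
    if (deps.length > MAX_DEPENDENCIES) {
      errors.push(`${label}: more than ${MAX_DEPENDENCIES} dependencies`);
    }
    for (const dep of deps) {
      if (Math.floor(dep) !== dep || dep < 0 || dep >= i) {
        errors.push(`${label}: dependency ${dep} is not an earlier feature`);
        continue;
      }
      closure[dep] = true;
      // Inherit everything the dependency itself can reach.
      (reach[dep] ?? []).forEach((reachable, j) => {
        if (reachable) closure[j] = true;
      });
    }
    reach.push(closure);

    // Every non-infrastructure feature must reach all of indices 0-4.
    if (i >= INFRA_COUNT) {
      for (let infra = 0; infra < INFRA_COUNT; infra++) {
        if (!closure[infra]) {
          errors.push(`${label}: does not depend on infrastructure feature ${infra}`);
        }
      }
    }
  });
  return errors;
}
```

Because every dependency must point at an earlier index, a list that passes this check cannot contain a cycle, so no separate cycle detection is needed.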
### Dependency Types
@@ -82,30 +83,107 @@ Create WIDE dependency graphs, not linear chains:
```json
[
// FOUNDATION TIER (indices 0-2, no dependencies) - run first
{ "name": "App loads without errors", "category": "functional" },
{ "name": "Navigation bar displays", "category": "style" },
{ "name": "Homepage renders correctly", "category": "functional" },
// INFRASTRUCTURE TIER (indices 0-4, no dependencies) - MUST run first
{ "name": "Database connection established", "category": "functional" },
{ "name": "Database schema applied correctly", "category": "functional" },
{ "name": "Data persists across server restart", "category": "functional" },
{ "name": "No mock data patterns in codebase", "category": "functional" },
{ "name": "Backend API queries real database", "category": "functional" },
// AUTH TIER (indices 3-5, depend on foundation) - run in parallel
{ "name": "User can register", "depends_on_indices": [0] },
{ "name": "User can login", "depends_on_indices": [0, 3] },
{ "name": "User can logout", "depends_on_indices": [4] },
// FOUNDATION TIER (indices 5-7, depend on infrastructure)
{ "name": "App loads without errors", "category": "functional", "depends_on_indices": [0, 1, 2, 3, 4] },
{ "name": "Navigation bar displays", "category": "style", "depends_on_indices": [0, 1, 2, 3, 4] },
{ "name": "Homepage renders correctly", "category": "functional", "depends_on_indices": [0, 1, 2, 3, 4] },
// CORE CRUD TIER (indices 6-9) - WIDE GRAPH: all 4 depend on login
// All 4 start as soon as login passes!
{ "name": "User can create todo", "depends_on_indices": [4] },
{ "name": "User can view todos", "depends_on_indices": [4] },
{ "name": "User can edit todo", "depends_on_indices": [4, 6] },
{ "name": "User can delete todo", "depends_on_indices": [4, 6] },
// AUTH TIER (indices 8-10, depend on foundation + infrastructure)
{ "name": "User can register", "depends_on_indices": [0, 1, 2, 3, 4, 5] },
{ "name": "User can login", "depends_on_indices": [0, 1, 2, 3, 4, 5, 8] },
{ "name": "User can logout", "depends_on_indices": [9] },
// ADVANCED TIER (indices 10-11) - both depend on view, not each other
{ "name": "User can filter todos", "depends_on_indices": [7] },
{ "name": "User can search todos", "depends_on_indices": [7] }
// CORE CRUD TIER (indices 11-14) - WIDE GRAPH: all 4 depend on login
{ "name": "User can create todo", "depends_on_indices": [9] },
{ "name": "User can view todos", "depends_on_indices": [9] },
{ "name": "User can edit todo", "depends_on_indices": [9, 11] },
{ "name": "User can delete todo", "depends_on_indices": [9, 11] },
// ADVANCED TIER (indices 15-16) - both depend on view, not each other
{ "name": "User can filter todos", "depends_on_indices": [12] },
{ "name": "User can search todos", "depends_on_indices": [12] }
]
```
**Result:** With 3 parallel agents, this 12-feature project completes in ~5-6 cycles instead of 12 sequential cycles.
**Result:** With 3 parallel agents, the infrastructure tier validates the database first; after that, features whose dependencies have passed run in parallel.
---
## MANDATORY INFRASTRUCTURE FEATURES (Indices 0-4)
**CRITICAL:** Create these FIRST, before any functional features. These features ensure the application uses a real database, not mock data or in-memory storage.
| Index | Name | Test Steps |
|-------|------|------------|
| 0 | Database connection established | Start server → check logs for DB connection → health endpoint returns DB status |
| 1 | Database schema applied correctly | Connect to DB directly → list tables → verify schema matches spec |
| 2 | Data persists across server restart | Create via API → STOP server completely → START server → query API → data still exists |
| 3 | No mock data patterns in codebase | Run grep for prohibited patterns → must return empty |
| 4 | Backend API queries real database | Check server logs → SQL/DB queries appear for API calls |
**ALL other features MUST depend on indices [0, 1, 2, 3, 4].**
### Infrastructure Feature Descriptions
**Feature 0 - Database connection established:**
```
Steps:
1. Start the development server
2. Check server logs for database connection message
3. Call health endpoint (e.g., GET /api/health)
4. Verify response includes database status: connected
```
**Feature 1 - Database schema applied correctly:**
```
Steps:
1. Connect to database directly (sqlite3, psql, etc.)
2. List all tables in the database
3. Verify tables match what's defined in app_spec.txt
4. Verify key columns exist on each table
```
**Feature 2 - Data persists across server restart (CRITICAL):**
```
Steps:
1. Create unique test data via API (e.g., POST /api/items with name "RESTART_TEST_12345")
2. Verify data appears in API response (GET /api/items)
3. STOP the server completely: pkill -f "node" && sleep 5
4. Verify server is stopped: pgrep -f "node" returns nothing
5. RESTART the server: ./init.sh & sleep 15
6. Query API again: GET /api/items
7. Verify "RESTART_TEST_12345" still exists
8. If data is GONE → CRITICAL FAILURE (in-memory storage detected)
9. Clean up test data
```
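
The reasoning behind these steps can be shown with a small simulation. This is a sketch under stated assumptions, not the actual test: `ItemApi`, `survivesRestart`, and both `start...Server` functions are hypothetical stand-ins, and a "restart" is modeled by discarding one server instance and creating a fresh one rather than by killing a process.

```typescript
// Hypothetical minimal API surface; the real check goes over HTTP.
interface ItemApi {
  createItem(name: string): void;
  listItems(): string[];
}

// `startServer` stands in for ./init.sh; each call models a fresh process.
function survivesRestart(startServer: () => ItemApi, marker: string): boolean {
  const before = startServer();
  before.createItem(marker);
  if (before.listItems().indexOf(marker) === -1) {
    throw new Error("marker was not visible before the restart");
  }
  // "Stop" the server by dropping the instance, then start a new one.
  const after = startServer();
  return after.listItems().indexOf(marker) !== -1;
}

// In-memory implementation: state dies with the instance.
function startInMemoryServer(): ItemApi {
  const items: string[] = [];
  return {
    createItem: (name) => { items.push(name); },
    listItems: () => items.slice(),
  };
}

// Stand-in for a database: state lives outside the server instance.
const disk: string[] = [];
function startDatabaseBackedServer(): ItemApi {
  return {
    createItem: (name) => { disk.push(name); },
    listItems: () => disk.slice(),
  };
}

console.log(survivesRestart(startInMemoryServer, "RESTART_TEST_12345"));       // false
console.log(survivesRestart(startDatabaseBackedServer, "RESTART_TEST_12345")); // true
```

Both implementations pass the "visible before restart" check, which is why a test that never restarts the server cannot tell them apart.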
**Feature 3 - No mock data patterns in codebase:**
```
Steps:
1. Run: grep -r "globalThis\." --include="*.ts" --include="*.tsx" src/
2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" src/
3. Run: grep -r "mockData\|fakeData\|sampleData\|dummyData" --include="*.ts" src/
4. Run: grep -r "new Map()\|new Set()" --include="*.ts" src/lib/ src/store/ src/data/
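5. ALL grep commands must print nothing (grep exits with code 1 when there are no matches; exit code 2 means grep itself failed, e.g. a missing directory, and is not a pass)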
6. If any returns results → investigate and fix before passing
```
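
To make explicit what the grep patterns above match, here is a sketch of the same rules as regular expressions applied to a single file's content. It is an illustration only: the steps themselves use grep, and `Finding`, `Rule`, `RULES`, and `scanSource` are hypothetical names. The caller is assumed to pass only `.ts`/`.tsx` files under `src/`, mirroring the `--include` flags.

```typescript
interface Finding {
  file: string;
  line: number; // 1-based line number
  rule: string;
  text: string;
}

interface Rule {
  name: string;
  regex: RegExp;
  onlyUnder?: string[]; // restrict the rule to these path prefixes
}

// One rule per grep command in the steps above.
const RULES: Rule[] = [
  { name: "globalThis", regex: /globalThis\./ },
  { name: "dev store", regex: /dev-store|devStore|DevStore|mock-db|mockDb/ },
  { name: "mock data", regex: /mockData|fakeData|sampleData|dummyData/ },
  {
    name: "Map/Set store",
    regex: /new Map\(\)|new Set\(\)/,
    onlyUnder: ["src/lib/", "src/store/", "src/data/"],
  },
];

// Returns every line of `content` that matches a rule in scope for `file`.
function scanSource(file: string, content: string): Finding[] {
  const findings: Finding[] = [];
  content.split("\n").forEach((text, index) => {
    for (const rule of RULES) {
      const inScope =
        rule.onlyUnder === undefined ||
        rule.onlyUnder.some((prefix) => file.indexOf(prefix) === 0);
      if (inScope && rule.regex.test(text)) {
        findings.push({ file, line: index + 1, rule: rule.name, text: text.trim() });
      }
    }
  });
  return findings;
}
```

Note that the step 4 pattern matches only the literal text `new Map()` or `new Set()`. A typed construction such as `new Map<string, Item>()` would not be reported by either the grep command or this sketch, so a clean result for that step is weaker evidence than it looks.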
**Feature 4 - Backend API queries real database:**
```
Steps:
1. Start server with verbose logging
2. Make API call (e.g., GET /api/items)
3. Check server logs
4. Verify SQL query appears (SELECT, INSERT, etc.) or ORM query log
5. If no DB queries in logs → implementation is using mock data
```
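
Step 4 can be approximated with a simple heuristic over the captured log text. This is a sketch only: the log lines in `sampleLog` are invented for illustration, real query-log formats depend on the database driver or ORM in use, and `SQL_STATEMENT` and `dbQueryLines` are hypothetical names.

```typescript
// Matches common SQL statement shapes. A heuristic: prose that happens to
// contain "select ... from" would also match.
const SQL_STATEMENT =
  /\b(SELECT\b.+\bFROM|INSERT\s+INTO|UPDATE\b.+\bSET|DELETE\s+FROM)\b/i;

// Returns the log lines that look like database queries.
function dbQueryLines(log: string): string[] {
  return log.split("\n").filter((line) => SQL_STATEMENT.test(line));
}

// Invented log excerpt: two request lines, two query lines.
const sampleLog = [
  "GET /api/items 200 12ms",
  "query: SELECT id, name FROM items",
  "POST /api/items 201 8ms",
  "query: INSERT INTO items (name) VALUES (?)",
].join("\n");

console.log(dbQueryLines(sampleLog).length); // 2
```

An empty result after real API calls is the signal described in step 5; a non-empty result still needs a quick read to confirm the queries correspond to the calls that were made.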
---
@@ -117,6 +195,7 @@ The feature_list.json **MUST** include tests from ALL 20 categories. Minimum cou
| Category | Simple | Medium | Complex |
| -------------------------------- | ------- | ------- | -------- |
| **0. Infrastructure (REQUIRED)** | 5 | 5 | 5 |
| A. Security & Access Control | 5 | 20 | 40 |
| B. Navigation Integrity | 15 | 25 | 40 |
| C. Real Data Verification | 20 | 30 | 50 |
@@ -137,12 +216,14 @@ The feature_list.json **MUST** include tests from ALL 20 categories. Minimum cou
| R. Concurrency & Race Conditions | 5 | 8 | 15 |
| S. Export/Import | 5 | 6 | 10 |
| T. Performance | 5 | 5 | 10 |
| **TOTAL** | **150** | **250** | **400+** |
| **TOTAL** | **155** | **255** | **405+** |
---
### Category Descriptions
**0. Infrastructure (REQUIRED - Priority 0)** - Database connectivity, schema existence, data persistence across server restart, absence of mock patterns. These features MUST pass before any functional features can begin. All tiers require exactly 5 infrastructure features (indices 0-4).
**A. Security & Access Control** - Test unauthorized access blocking, permission enforcement, session management, role-based access, and data isolation between users.
**B. Navigation Integrity** - Test all buttons, links, menus, breadcrumbs, deep links, back button behavior, 404 handling, and post-login/logout redirects.
@@ -205,6 +286,16 @@ The feature_list.json must include tests that **actively verify real data** and
- `setTimeout` simulating API delays with static data
- Static returns instead of database queries
**Additional prohibited patterns (in-memory stores):**
- `globalThis.` (in-memory storage pattern)
- `dev-store`, `devStore`, `DevStore` (development stores)
- `json-server`, `mirage`, `msw` (mock backends)
- `Map()` or `Set()` used as primary data store
- Environment checks like `if (process.env.NODE_ENV === 'development')` for data routing
**Why this matters:** In-memory stores (like `globalThis.devStore`) will pass simple tests because data persists during a single server run. But data is LOST on server restart, which is unacceptable for production. The Infrastructure features (0-4) specifically test for this by requiring data to survive a full server restart.
---
**CRITICAL INSTRUCTION:**