diff --git a/.claude/commands/create-spec.md b/.claude/commands/create-spec.md
index f8cae28..f8a1b96 100644
--- a/.claude/commands/create-spec.md
+++ b/.claude/commands/create-spec.md
@@ -95,6 +95,27 @@ Ask the user about their involvement preference:
 
 **For Detailed Mode users**, ask specific tech questions about frontend, backend, database, etc.
 
+### Phase 3b: Database Requirements (MANDATORY)
+
+**Always ask this question regardless of mode:**
+
+> "One foundational question about data storage:
+>
+> **Does this application need to store user data persistently?**
+>
+> 1. **Yes, needs a database** - Users create, save, and retrieve data (most apps)
+> 2. **No, stateless** - Pure frontend, no data storage needed (calculators, static sites)
+> 3. **Not sure** - Let me describe what I need and you decide"
+
+**Branching logic:**
+
+- **If "Yes" or "Not sure"**: Continue normally. The spec will include database in tech stack and the initializer will create 5 mandatory Infrastructure features (indices 0-4) to verify database connectivity and persistence.
+
+- **If "No, stateless"**: Note this in the spec. Skip database from tech stack. Infrastructure features will be simplified (no database persistence tests). Mark this clearly:
+  ```xml
+  none - stateless application
+  ```
+
 ## Phase 4: Features (THE MAIN PHASE)
 
 This is where you spend most of your time. Ask questions in plain language that anyone can answer.
@@ -207,12 +228,23 @@ After gathering all features, **you** (the agent) should tally up the testable f
 
 **Typical ranges for reference:**
 
-- **Simple apps** (todo list, calculator, notes): ~20-50 features
-- **Medium apps** (blog, task manager with auth): ~100 features
-- **Advanced apps** (e-commerce, CRM, full SaaS): ~150-200 features
+- **Simple apps** (todo list, calculator, notes): ~25-55 features (includes 5 infrastructure)
+- **Medium apps** (blog, task manager with auth): ~105 features (includes 5 infrastructure)
+- **Advanced apps** (e-commerce, CRM, full SaaS): ~155-205 features (includes 5 infrastructure)
 
 These are just reference points - your actual count should come from the requirements discussed.
 
+**MANDATORY: Infrastructure Features**
+
+If the app requires a database (Phase 3b answer was "Yes" or "Not sure"), you MUST include 5 Infrastructure features (indices 0-4):
+1. Database connection established
+2. Database schema applied correctly
+3. Data persists across server restart
+4. No mock data patterns in codebase
+5. Backend API queries real database
+
+These features ensure the coding agent implements a real database, not mock data or in-memory storage.
+
 **How to count features:**
 
 For each feature area discussed, estimate the number of discrete, testable behaviors:
@@ -225,17 +257,20 @@ For each feature area discussed, estimate the number of discrete, testable behav
 
 > "Based on what we discussed, here's my feature breakdown:
 >
+> - **Infrastructure (required)**: 5 features (database setup, persistence verification)
 > - [Category 1]: ~X features
 > - [Category 2]: ~Y features
 > - [Category 3]: ~Z features
 > - ...
 >
-> **Total: ~N features**
+> **Total: ~N features** (including 5 infrastructure)
 >
 > Does this seem right, or should I adjust?"
 
 Let the user confirm or adjust. This becomes your `feature_count` for the spec.
 
+**Important:** The first 5 features (indices 0-4) created by the initializer MUST be the Infrastructure category with no dependencies. All other features depend on these.
+
 ## Phase 5: Technical Details (DERIVED OR DISCUSSED)
 
 **For Quick Mode users:**
diff --git a/.claude/templates/coding_prompt.template.md b/.claude/templates/coding_prompt.template.md
index d72b933..9322404 100644
--- a/.claude/templates/coding_prompt.template.md
+++ b/.claude/templates/coding_prompt.template.md
@@ -156,6 +156,9 @@ Use browser automation tools:
 - [ ] Deleted the test data - verified it's gone everywhere
 - [ ] NO unexplained data appeared (would indicate mock data)
 - [ ] Dashboard/counts reflect real numbers after my changes
+- [ ] **Ran extended mock data grep (STEP 5.6) - no hits in src/ (excluding tests)**
+- [ ] **Verified no globalThis, devStore, or dev-store patterns**
+- [ ] **Server restart test passed (STEP 5.7) - data persists across restart**
 
 #### Navigation Verification
 
@@ -174,10 +177,92 @@ Use browser automation tools:
 
 ### STEP 5.6: MOCK DATA DETECTION (Before marking passing)
 
-1. **Search code:** `grep -r "mockData\|fakeData\|TODO\|STUB" --include="*.ts" --include="*.tsx"`
-2. **Runtime test:** Create unique data (e.g., "TEST_12345") → verify in UI → delete → verify gone
-3. **Check database:** All displayed data must come from real DB queries
-4. If unexplained data appears, it's mock data - fix before marking passing.
+**Run ALL these grep checks. Any hits in src/ (excluding test files) require investigation:**
+
+```bash
+# Common exclusions for test files
+EXCLUDE="--exclude=*.test.* --exclude=*.spec.* --exclude=*__test__* --exclude=*__mocks__*"
+
+# 1. In-memory storage patterns (CRITICAL - catches dev-store)
+grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+
+# 2. Mock data variables
+grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+
+# 3. TODO/incomplete markers
+grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+
+# 4. Development-only conditionals
+grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+
+# 5. In-memory collections as data stores
+grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/ 2>/dev/null
+```
+
+**Rule:** If ANY grep returns results in production code → investigate → FIX before marking passing.
+
+**Runtime verification:**
+1. Create unique data (e.g., "TEST_12345") → verify in UI → delete → verify gone
+2. Check database directly - all displayed data must come from real DB queries
+3. If unexplained data appears, it's mock data - fix before marking passing.
+
+### STEP 5.7: SERVER RESTART PERSISTENCE TEST (MANDATORY for data features)
+
+**When required:** Any feature involving CRUD operations or data persistence.
+
+**This test is NON-NEGOTIABLE. It catches in-memory storage implementations that pass all other tests.**
+
+**Steps:**
+
+1. Create unique test data via UI or API (e.g., item named "RESTART_TEST_12345")
+2. Verify data appears in UI and API response
+
+3. **STOP the server completely:**
+   ```bash
+   # Kill by port (safer - only kills the dev server, not VS Code/Claude Code/etc.)
+   # Unix/macOS:
+   lsof -ti :${PORT:-3000} | xargs kill -TERM 2>/dev/null || true
+   sleep 3
+   lsof -ti :${PORT:-3000} | xargs kill -9 2>/dev/null || true
+   sleep 2
+
+   # Windows alternative (use if lsof not available):
+   # netstat -ano | findstr :${PORT:-3000} | findstr LISTENING
+   # taskkill /F /PID 2>nul
+
+   # Verify server is stopped
+   if lsof -ti :${PORT:-3000} > /dev/null 2>&1; then
+     echo "ERROR: Server still running on port ${PORT:-3000}!"
+     exit 1
+   fi
+   ```
+
+4. **RESTART the server:**
+   ```bash
+   ./init.sh &
+   sleep 15  # Allow server to fully start
+   # Verify server is responding
+   if ! curl -f http://localhost:${PORT:-3000}/api/health && ! curl -f http://localhost:${PORT:-3000}; then
+     echo "ERROR: Server failed to start after restart"
+     exit 1
+   fi
+   ```
+
+5. **Query for test data - it MUST still exist**
+   - Via UI: Navigate to data location, verify data appears
+   - Via API: `curl http://localhost:${PORT:-3000}/api/items` - verify data in response
+
+6. **If data is GONE:** Implementation uses in-memory storage → CRITICAL FAIL
+   - Run all grep commands from STEP 5.6 to identify the mock pattern
+   - You MUST fix the in-memory storage implementation before proceeding
+   - Replace in-memory storage with real database queries
+
+7. **Clean up test data** after successful verification
+
+**Why this test exists:** In-memory stores like `globalThis.devStore` pass all other tests because data persists during a single server run. Only a full server restart reveals this bug. Skipping this step WILL allow dev-store implementations to slip through.
+
+**YOLO Mode Note:** Even in YOLO mode, this verification is MANDATORY for data features. Use curl instead of browser automation.
 
 ### STEP 6: UPDATE FEATURE STATUS (CAREFULLY!)
 
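Reviewer sketch (not part of the patch): the STEP 5.6 grep checks could also be run as one script that reports every hit with its file and line. The version below is a minimal illustration under stated assumptions. The function name `scan_for_mock_patterns` and the pattern labels are hypothetical. The in-memory collection regex matches the constructor name (`new Map`, `new Set`) instead of literal `()`, which is a deliberate change from the grep in the patch. As with `grep --exclude`, test files are skipped by file name only.

```python
import re
from pathlib import Path

# Patterns adapted from the STEP 5.6 grep checks; the labels are illustrative.
PATTERNS = {
    "in-memory global": re.compile(r"globalThis\."),
    "dev store": re.compile(r"dev-store|devStore|DevStore|mock-db|mockDb"),
    "mock variable": re.compile(r"mockData|fakeData|sampleData|dummyData|testData"),
    "incomplete marker": re.compile(r"TODO.*real|TODO.*database|TODO.*API|STUB|MOCK"),
    "dev-only branch": re.compile(r"isDevelopment|isDev|process\.env\.NODE_ENV.*development"),
    # Assumption: match the constructor name rather than a literal "()".
    "in-memory collection": re.compile(r"new (Map|Set)\b"),
}
SOURCE_SUFFIXES = {".ts", ".tsx", ".js"}
TEST_MARKERS = (".test.", ".spec.", "__test__", "__mocks__")


def scan_for_mock_patterns(root):
    """Return (path, line_number, label) for every suspicious line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        # Like grep --exclude, this checks only the file name, not its directories.
        if any(marker in path.name for marker in TEST_MARKERS):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for label, pattern in PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), line_number, label))
    return hits
```

Calling `scan_for_mock_patterns("src")` returns an empty list when nothing matches, which corresponds to every grep exiting with no output. Any hit still needs the manual investigation the prompt asks for, since a `Map` used as a cache is not necessarily a mock data store.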
diff --git a/.claude/templates/initializer_prompt.template.md b/.claude/templates/initializer_prompt.template.md
index c6ee081..bb914e2 100644
--- a/.claude/templates/initializer_prompt.template.md
+++ b/.claude/templates/initializer_prompt.template.md
@@ -36,9 +36,9 @@ Use the feature_create_bulk tool to add all features at once. You can create fea
 
 - Feature count must match the `feature_count` specified in app_spec.txt
 - Reference tiers for other projects:
-  - **Simple apps**: ~150 tests
-  - **Medium apps**: ~250 tests
-  - **Complex apps**: ~400+ tests
+  - **Simple apps**: ~165 tests (includes 5 infrastructure)
+  - **Medium apps**: ~265 tests (includes 5 infrastructure)
+  - **Advanced apps**: ~405+ tests (includes 5 infrastructure)
 - Both "functional" and "style" categories
 - Mix of narrow tests (2-5 steps) and comprehensive tests (10+ steps)
 - At least 25 tests MUST have 10+ steps each (more for complex apps)
@@ -60,8 +60,9 @@ Dependencies enable **parallel execution** of independent features. When specifi
 2. **Can only depend on EARLIER features** (index must be less than current position)
 3. **No circular dependencies** allowed
 4. **Maximum 20 dependencies** per feature
-5. **Foundation features (index 0-9)** should have NO dependencies
-6. **60% of features after index 10** should have at least one dependency
+5. **Infrastructure features (indices 0-4)** have NO dependencies - they run FIRST
+6. **ALL features after index 4** MUST depend on `[0, 1, 2, 3, 4]` (infrastructure)
+7. **60% of features after index 10** should have additional dependencies beyond infrastructure
 
 ### Dependency Types
 
@@ -82,30 +83,113 @@ Create WIDE dependency graphs, not linear chains:
 
 ```json
 [
-  // FOUNDATION TIER (indices 0-2, no dependencies) - run first
-  { "name": "App loads without errors", "category": "functional" },
-  { "name": "Navigation bar displays", "category": "style" },
-  { "name": "Homepage renders correctly", "category": "functional" },
+  // INFRASTRUCTURE TIER (indices 0-4, no dependencies) - MUST run first
+  { "name": "Database connection established", "category": "functional" },
+  { "name": "Database schema applied correctly", "category": "functional" },
+  { "name": "Data persists across server restart", "category": "functional" },
+  { "name": "No mock data patterns in codebase", "category": "functional" },
+  { "name": "Backend API queries real database", "category": "functional" },
 
-  // AUTH TIER (indices 3-5, depend on foundation) - run in parallel
-  { "name": "User can register", "depends_on_indices": [0] },
-  { "name": "User can login", "depends_on_indices": [0, 3] },
-  { "name": "User can logout", "depends_on_indices": [4] },
+  // FOUNDATION TIER (indices 5-7, depend on infrastructure)
+  { "name": "App loads without errors", "category": "functional", "depends_on_indices": [0, 1, 2, 3, 4] },
+  { "name": "Navigation bar displays", "category": "style", "depends_on_indices": [0, 1, 2, 3, 4] },
+  { "name": "Homepage renders correctly", "category": "functional", "depends_on_indices": [0, 1, 2, 3, 4] },
 
-  // CORE CRUD TIER (indices 6-9) - WIDE GRAPH: all 4 depend on login
-  // All 4 start as soon as login passes!
-  { "name": "User can create todo", "depends_on_indices": [4] },
-  { "name": "User can view todos", "depends_on_indices": [4] },
-  { "name": "User can edit todo", "depends_on_indices": [4, 6] },
-  { "name": "User can delete todo", "depends_on_indices": [4, 6] },
+  // AUTH TIER (indices 8-10, depend on foundation + infrastructure)
+  { "name": "User can register", "depends_on_indices": [0, 1, 2, 3, 4, 5] },
+  { "name": "User can login", "depends_on_indices": [0, 1, 2, 3, 4, 5, 8] },
+  { "name": "User can logout", "depends_on_indices": [0, 1, 2, 3, 4, 9] },
 
-  // ADVANCED TIER (indices 10-11) - both depend on view, not each other
-  { "name": "User can filter todos", "depends_on_indices": [7] },
-  { "name": "User can search todos", "depends_on_indices": [7] }
+  // CORE CRUD TIER (indices 11-14) - WIDE GRAPH: all 4 depend on login
+  { "name": "User can create todo", "depends_on_indices": [0, 1, 2, 3, 4, 9] },
+  { "name": "User can view todos", "depends_on_indices": [0, 1, 2, 3, 4, 9] },
+  { "name": "User can edit todo", "depends_on_indices": [0, 1, 2, 3, 4, 9, 11] },
+  { "name": "User can delete todo", "depends_on_indices": [0, 1, 2, 3, 4, 9, 11] },
+
+  // ADVANCED TIER (indices 15-16) - both depend on view, not each other
+  { "name": "User can filter todos", "depends_on_indices": [0, 1, 2, 3, 4, 12] },
+  { "name": "User can search todos", "depends_on_indices": [0, 1, 2, 3, 4, 12] }
 ]
 ```
 
-**Result:** With 3 parallel agents, this 12-feature project completes in ~5-6 cycles instead of 12 sequential cycles.
+**Result:** With 3 parallel agents, this project completes efficiently with proper database validation first.
+
+---
+
+## MANDATORY INFRASTRUCTURE FEATURES (Indices 0-4)
+
+**CRITICAL:** Create these FIRST, before any functional features. These features ensure the application uses a real database, not mock data or in-memory storage.
+
+| Index | Name | Test Steps |
+|-------|------|------------|
+| 0 | Database connection established | Start server → check logs for DB connection → health endpoint returns DB status |
+| 1 | Database schema applied correctly | Connect to DB directly → list tables → verify schema matches spec |
+| 2 | Data persists across server restart | Create via API → STOP server completely → START server → query API → data still exists |
+| 3 | No mock data patterns in codebase | Run grep for prohibited patterns → must return empty |
+| 4 | Backend API queries real database | Check server logs → SQL/DB queries appear for API calls |
+
+**ALL other features MUST depend on indices [0, 1, 2, 3, 4].**
+
+### Infrastructure Feature Descriptions
+
+**Feature 0 - Database connection established:**
+```text
+Steps:
+1. Start the development server
+2. Check server logs for database connection message
+3. Call health endpoint (e.g., GET /api/health)
+4. Verify response includes database status: connected
+```
+
+**Feature 1 - Database schema applied correctly:**
+```text
+Steps:
+1. Connect to database directly (sqlite3, psql, etc.)
+2. List all tables in the database
+3. Verify tables match what's defined in app_spec.txt
+4. Verify key columns exist on each table
+```
+
+**Feature 2 - Data persists across server restart (CRITICAL):**
+```text
+Steps:
+1. Create unique test data via API (e.g., POST /api/items with name "RESTART_TEST_12345")
+2. Verify data appears in API response (GET /api/items)
+3. STOP the server completely (kill by port to avoid killing unrelated Node processes):
+   - Unix/macOS: lsof -ti :$PORT | xargs kill -9 2>/dev/null || true && sleep 5
+   - Windows: FOR /F "tokens=5" %a IN ('netstat -aon ^| find ":$PORT"') DO taskkill /F /PID %a 2>nul
+   - Note: Replace $PORT with actual port (e.g., 3000)
+4. Verify server is stopped: lsof -ti :$PORT returns nothing (or netstat on Windows)
+5. RESTART the server: ./init.sh & sleep 15
+6. Query API again: GET /api/items
+7. Verify "RESTART_TEST_12345" still exists
+8. If data is GONE → CRITICAL FAILURE (in-memory storage detected)
+9. Clean up test data
+```
+
+**Feature 3 - No mock data patterns in codebase:**
+```text
+Steps:
+1. Run: grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" src/
+2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" src/
+3. Run: grep -r "mockData\|testData\|fakeData\|sampleData\|dummyData" --include="*.ts" --include="*.tsx" --include="*.js" src/
+4. Run: grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" src/
+5. Run: grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" src/
+6. Run: grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" src/ 2>/dev/null
+7. Run: grep -E "json-server|miragejs|msw" package.json
+8. ALL grep commands must return empty (exit code 1)
+9. If any returns results → investigate and fix before passing
+```
+
+**Feature 4 - Backend API queries real database:**
+```text
+Steps:
+1. Start server with verbose logging
+2. Make API call (e.g., GET /api/items)
+3. Check server logs
+4. Verify SQL query appears (SELECT, INSERT, etc.) or ORM query log
+5. If no DB queries in logs → implementation is using mock data
+```
 
 ---
 
@@ -115,8 +199,9 @@ The feature_list.json **MUST** include tests from ALL 20 categories. Minimum cou
 
 ### Category Distribution by Complexity Tier
 
-| Category                         | Simple  | Medium  | Complex  |
+| Category                         | Simple  | Medium  | Advanced |
 | -------------------------------- | ------- | ------- | -------- |
+| **0. Infrastructure (REQUIRED)** | 5       | 5       | 5        |
 | A. Security & Access Control     | 5       | 20      | 40       |
 | B. Navigation Integrity          | 15      | 25      | 40       |
 | C. Real Data Verification        | 20      | 30      | 50       |
@@ -137,12 +222,14 @@ The feature_list.json **MUST** include tests from ALL 20 categories. Minimum cou
 | R. Concurrency & Race Conditions | 5       | 8       | 15       |
 | S. Export/Import                 | 5       | 6       | 10       |
 | T. Performance                   | 5       | 5       | 10       |
-| **TOTAL**                        | **150** | **250** | **400+** |
+| **TOTAL**                        | **165** | **265** | **405+** |
 
 ---
 
 ### Category Descriptions
 
+**0. Infrastructure (REQUIRED - Priority 0)** - Database connectivity, schema existence, data persistence across server restart, absence of mock patterns. These features MUST pass before any functional features can begin. All tiers require exactly 5 infrastructure features (indices 0-4).
+
 **A. Security & Access Control** - Test unauthorized access blocking, permission enforcement, session management, role-based access, and data isolation between users.
 
 **B. Navigation Integrity** - Test all buttons, links, menus, breadcrumbs, deep links, back button behavior, 404 handling, and post-login/logout redirects.
@@ -205,6 +292,16 @@ The feature_list.json must include tests that **actively verify real data** and
 - `setTimeout` simulating API delays with static data
 - Static returns instead of database queries
 
+**Additional prohibited patterns (in-memory stores):**
+
+- `globalThis.` (in-memory storage pattern)
+- `dev-store`, `devStore`, `DevStore` (development stores)
+- `json-server`, `mirage`, `msw` (mock backends)
+- `Map()` or `Set()` used as primary data store
+- Environment checks like `if (process.env.NODE_ENV === 'development')` for data routing
+
+**Why this matters:** In-memory stores (like `globalThis.devStore`) will pass simple tests because data persists during a single server run. But data is LOST on server restart, which is unacceptable for production. The Infrastructure features (0-4) specifically test for this by requiring data to survive a full server restart.
+
 ---
 
 **CRITICAL INSTRUCTION:**
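
Reviewer sketch (not part of the patch): the dependency rules this change adds to the initializer prompt are mechanical enough to check before a feature list is submitted. The following is a minimal illustration, assuming each feature is a dict with an optional `depends_on_indices` list as in the JSON example from the patch. The name `validate_dependencies` and the error wording are hypothetical, and the "60% of features after index 10" guideline is not checked here.

```python
INFRASTRUCTURE_INDICES = {0, 1, 2, 3, 4}
MAX_DEPENDENCIES = 20


def validate_dependencies(features):
    """Return a list of rule violations; an empty list means the graph is valid."""
    errors = []
    for index, feature in enumerate(features):
        deps = feature.get("depends_on_indices", [])
        name = feature.get("name", f"feature {index}")
        if len(deps) > MAX_DEPENDENCIES:
            errors.append(f"{index} ({name}): more than {MAX_DEPENDENCIES} dependencies")
        # Depending only on earlier indices also rules out circular dependencies.
        if any(dep < 0 or dep >= index for dep in deps):
            errors.append(f"{index} ({name}): may only depend on earlier features")
        if index in INFRASTRUCTURE_INDICES:
            if deps:
                errors.append(f"{index} ({name}): infrastructure features take no dependencies")
        elif not INFRASTRUCTURE_INDICES.issubset(deps):
            errors.append(f"{index} ({name}): must depend on [0, 1, 2, 3, 4]")
    return errors
```

A list shaped like the example in the patch (five infrastructure entries with no dependencies, followed by features that each include `[0, 1, 2, 3, 4]`) yields an empty result. A later feature that lists only `[0]` is reported for omitting the infrastructure dependencies.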