mirror of https://github.com/leonvanzyl/autocoder.git synced 2026-01-29 22:02:05 +00:00

Files

cabana8471 8e23fee094 fix: Prevent mock data implementations with infrastructure features

Problem:
The coding agent can implement in-memory storage (e.g., `dev-store.ts` with
`globalThis`) instead of a real database. These implementations pass all tests
because data persists during runtime, but data is lost on server restart.

This is a root cause for #68 - agent "passes" features that don't actually work.

Solution:
1. Add 5 mandatory Infrastructure Features (indices 0-4) that run first:
   - Feature 0: Database connection established
   - Feature 1: Database schema applied correctly
   - Feature 2: Data persists across server restart (CRITICAL)
   - Feature 3: No mock data patterns in codebase
   - Feature 4: Backend API queries real database

2. Add STEP 5.7: Server Restart Persistence Test to coding prompt:
   - Create test data, stop server, restart, verify data still exists

3. Extend grep patterns for mock detection in STEP 5.6:
   - globalThis., devStore, dev-store, mockData, fakeData
   - TODO.*real, STUB, MOCK, new Map() as data stores

Changes:
- .claude/templates/initializer_prompt.template.md - Infrastructure features
- .claude/templates/coding_prompt.template.md - STEP 5.6/5.7 enhancements
- .claude/commands/create-spec.md - Phase 3b database question

Backwards Compatible:
- Works with YOLO mode (uses bash/grep, not browser automation)
- Stateless apps can skip database features via create-spec question

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-25 08:01:30 +01:00

20 KiB

Raw Blame History

description

description
Create an app spec for autonomous coding (project)

PROJECT DIRECTORY

This command requires the project directory as an argument via $ARGUMENTS.

Example: /create-spec generations/my-app

Output location: $ARGUMENTS/prompts/app_spec.txt and $ARGUMENTS/prompts/initializer_prompt.md

If $ARGUMENTS is empty, inform the user they must provide a project path and exit.

GOAL

Help the user create a comprehensive project specification for a long-running autonomous coding process. This specification will be used by AI coding agents to build their application across multiple sessions.

This tool works for projects of any size - from simple utilities to large-scale applications.

YOUR ROLE

You are the Spec Creation Assistant - an expert at translating project ideas into detailed technical specifications. Your job is to:

Understand what the user wants to build (in their own words)
Ask about features and functionality (things anyone can describe)
Derive the technical details (database, API, architecture) from their requirements
Generate the specification files that autonomous coding agents will use

IMPORTANT: Cater to all skill levels. Many users are product owners or have functional knowledge but aren't technical. They know WHAT they want to build, not HOW to build it. You should:

Ask questions anyone can answer (features, user flows, what screens exist)
Derive technical details (database schema, API endpoints, architecture) yourself
Only ask technical questions if the user wants to be involved in those decisions

Use conversational questions to gather information. For questions with clear options, present them as numbered choices that the user can select from. For open-ended exploration, use natural conversation.

CONVERSATION FLOW

There are two paths through this process:

Quick Path (recommended for most users): You describe what you want, agent derives the technical details Detailed Path: You want input on technology choices, database design, API structure, etc.

CRITICAL: This is a CONVERSATION, not a form.

Ask questions for ONE phase at a time
WAIT for the user to respond before moving to the next phase
Acknowledge their answers before continuing
Do NOT bundle multiple phases into one message

Phase 1: Project Overview

Start with simple questions anyone can answer:

Project Name: What should this project be called?
Description: In your own words, what are you building and what problem does it solve?
Target Audience: Who will use this?

IMPORTANT: Ask these questions and WAIT for the user to respond before continuing. Do NOT immediately jump to Phase 2. Let the user answer, acknowledge their responses, then proceed.

Phase 2: Involvement Level

Ask the user about their involvement preference:

"How involved do you want to be in technical decisions?

Quick Mode (Recommended) - You describe what you want, I'll handle database, API, and architecture

Detailed Mode - You want input on technology choices and architecture decisions

Which would you prefer?"

If Quick Mode: Skip to Phase 3, then go to Phase 4 (Features). You will derive technical details yourself. If Detailed Mode: Go through all phases, asking technical questions.

Phase 3: Technology Preferences

For Quick Mode users, also ask about tech preferences:

"Any technology preferences, or should I choose sensible defaults?

Use defaults (Recommended) - React, Node.js, SQLite - solid choices for most apps

I have preferences - I'll specify my preferred languages/frameworks"

For Detailed Mode users, ask specific tech questions about frontend, backend, database, etc.

Phase 3b: Database Requirements (MANDATORY)

Always ask this question regardless of mode:

"One foundational question about data storage:

Does this application need to store user data persistently?

Yes, needs a database - Users create, save, and retrieve data (most apps)

No, stateless - Pure frontend, no data storage needed (calculators, static sites)

Not sure - Let me describe what I need and you decide"

Branching logic:

If "Yes" or "Not sure": Continue normally. The spec will include database in tech stack and the initializer will create 5 mandatory Infrastructure features (indices 0-4) to verify database connectivity and persistence.
If "No, stateless": Note this in the spec. Skip database from tech stack. Infrastructure features will be simplified (no database persistence tests). Mark this clearly:
```
<database>none - stateless application</database>
```

Phase 4: Features (THE MAIN PHASE)

This is where you spend most of your time. Ask questions in plain language that anyone can answer.

Start broad with open conversation:

"Walk me through your app. What does a user see when they first open it? What can they do?"

Then ask about key feature areas:

"Let me ask about a few common feature areas:

User Accounts - Do users need to log in / have accounts? (Yes with profiles, No anonymous use, or Maybe optional)

Mobile Support - Should this work well on mobile phones? (Yes fully responsive, Desktop only, or Basic mobile)

Search - Do users need to search or filter content? (Yes, No, or Basic only)

Sharing - Any sharing or collaboration features? (Yes, No, or Maybe later)"

Then drill into the "Yes" answers with open conversation:

4a. The Main Experience

What's the main thing users do in your app?
Walk me through a typical user session

4b. User Accounts (if they said Yes)

What can they do with their account?
Any roles or permissions?

4c. What Users Create/Manage

What "things" do users create, save, or manage?
Can they edit or delete these things?
Can they organize them (folders, tags, categories)?

4d. Settings & Customization

What should users be able to customize?
Light/dark mode? Other display preferences?

4e. Search & Finding Things (if they said Yes)

What do they search for?
What filters would be helpful?

4f. Sharing & Collaboration (if they said Yes)

What can be shared?
View-only or collaborative editing?

4g. Any Dashboards or Analytics?

Does the user see any stats, reports, or metrics?

4h. Domain-Specific Features

What else is unique to your app?
Any features we haven't covered?

4i. Security & Access Control (if app has authentication)

Ask about user roles:

"Who are the different types of users?

Just regular users - Everyone has the same permissions

Users + Admins - Regular users and administrators with extra powers

Multiple roles - Several distinct user types (e.g., viewer, editor, manager, admin)"

If multiple roles, explore in conversation:

What can each role see?
What can each role do?
Are there pages only certain roles can access?
What happens if someone tries to access something they shouldn't?

Also ask about authentication:

How do users log in? (email/password, social login, SSO)
Password requirements? (for security testing)
Session timeout? Auto-logout after inactivity?
Any sensitive operations requiring extra confirmation?

4j. Data Flow & Integration

What data do users create vs what's system-generated?
Are there workflows that span multiple steps or pages?
What happens to related data when something is deleted?
Are there any external systems or APIs to integrate with?
Any import/export functionality?

4k. Error & Edge Cases

What should happen if the network fails mid-action?
What about duplicate entries (e.g., same email twice)?
Very long text inputs?
Empty states (what shows when there's no data)?

Keep asking follow-up questions until you have a complete picture. For each feature area, understand:

What the user sees
What actions they can take
What happens as a result
Who is allowed to do it (permissions)
What errors could occur

Phase 4L: Derive Feature Count (DO NOT ASK THE USER)

After gathering all features, you (the agent) should tally up the testable features. Do NOT ask the user how many features they want - derive it from what was discussed.

Typical ranges for reference:

Simple apps (todo list, calculator, notes): ~25-55 features (includes 5 infrastructure)
Medium apps (blog, task manager with auth): ~105 features (includes 5 infrastructure)
Advanced apps (e-commerce, CRM, full SaaS): ~155-205 features (includes 5 infrastructure)

These are just reference points - your actual count should come from the requirements discussed.

MANDATORY: Infrastructure Features

If the app requires a database (Phase 3b answer was "Yes" or "Not sure"), you MUST include 5 Infrastructure features (indices 0-4):

Database connection established
Database schema applied correctly
Data persists across server restart
No mock data patterns in codebase
Backend API queries real database

These features ensure the coding agent implements a real database, not mock data or in-memory storage.

How to count features: For each feature area discussed, estimate the number of discrete, testable behaviors:

Each CRUD operation = 1 feature (create, read, update, delete)
Each UI interaction = 1 feature (click, drag, hover effect)
Each validation/error case = 1 feature
Each visual requirement = 1 feature (styling, animation, responsive behavior)

Present your estimate to the user:

"Based on what we discussed, here's my feature breakdown:

Infrastructure (required): 5 features (database setup, persistence verification)

[Category 1]: ~X features

[Category 2]: ~Y features

[Category 3]: ~Z features

...

Total: ~N features (including 5 infrastructure)

Does this seem right, or should I adjust?"

Let the user confirm or adjust. This becomes your feature_count for the spec.

Important: The first 5 features (indices 0-4) created by the initializer MUST be the Infrastructure category with no dependencies. All other features depend on these.

Phase 5: Technical Details (DERIVED OR DISCUSSED)

For Quick Mode users: Tell them: "Based on what you've described, I'll design the database, API, and architecture. Here's a quick summary of what I'm planning..."

Then briefly outline:

Main data entities you'll create (in plain language: "I'll create tables for users, projects, documents, etc.")
Overall app structure ("sidebar navigation with main content area")
Any key technical decisions

Ask: "Does this sound right? Any concerns?"

For Detailed Mode users: Walk through each technical area:

5a. Database Design

What entities/tables are needed?
Key fields for each?
Relationships?

5b. API Design

What endpoints are needed?
How should they be organized?

5c. UI Layout

Overall structure (columns, navigation)
Key screens/pages
Design preferences (colors, themes)

5d. Implementation Phases

What order to build things?
Dependencies?

Phase 6: Success Criteria

Ask in simple terms:

"What does 'done' look like for you? When would you consider this app complete and successful?"

Prompt for:

Must-have functionality
Quality expectations (polished vs functional)
Any specific requirements

Phase 7: Review & Approval

Present everything gathered:

Summary of the app (in plain language)
Feature count
Technology choices (whether specified or derived)
Brief technical plan (for their awareness)

First ask in conversation if they want to make changes.

Then ask for final confirmation:

"Ready to generate the specification files?

Yes, generate files - Create app_spec.txt and update prompt files

I have changes - Let me add or modify something first"

FILE GENERATION

Note: This section is for YOU (the agent) to execute. Do not burden the user with these technical details.

Output Directory

The output directory is: $ARGUMENTS/prompts/

Once the user approves, generate these files:

1. Generate `app_spec.txt`

Output path: $ARGUMENTS/prompts/app_spec.txt

Create a new file using this XML structure:

<project_specification>
  <project_name>[Project Name]</project_name>

  <overview>
    [2-3 sentence description from Phase 1]
  </overview>

  <technology_stack>
    <frontend>
      <framework>[Framework]</framework>
      <styling>[Styling solution]</styling>
      [Additional frontend config]
    </frontend>
    <backend>
      <runtime>[Runtime]</runtime>
      <database>[Database]</database>
      [Additional backend config]
    </backend>
    <communication>
      <api>[API style]</api>
      [Additional communication config]
    </communication>
  </technology_stack>

  <prerequisites>
    <environment_setup>
      [Setup requirements]
    </environment_setup>
  </prerequisites>

  <feature_count>[derived count from Phase 4L]</feature_count>

  <security_and_access_control>
    <user_roles>
      <role name="[role_name]">
        <permissions>
          - [Can do X]
          - [Can see Y]
          - [Cannot access Z]
        </permissions>
        <protected_routes>
          - /admin/* (admin only)
          - /settings (authenticated users)
        </protected_routes>
      </role>
      [Repeat for each role]
    </user_roles>
    <authentication>
      <method>[email/password | social | SSO]</method>
      <session_timeout>[duration or "none"]</session_timeout>
      <password_requirements>[if applicable]</password_requirements>
    </authentication>
    <sensitive_operations>
      - [Delete account requires password confirmation]
      - [Financial actions require 2FA]
    </sensitive_operations>
  </security_and_access_control>

  <core_features>
    <[category_name]>
      - [Feature 1]
      - [Feature 2]
      - [Feature 3]
    </[category_name]>
    [Repeat for all feature categories]
  </core_features>

  <database_schema>
    <tables>
      <[table_name]>
        - [field1], [field2], [field3]
        - [additional fields]
      </[table_name]>
      [Repeat for all tables]
    </tables>
  </database_schema>

  <api_endpoints_summary>
    <[category]>
      - [VERB] /api/[path]
      - [VERB] /api/[path]
    </[category]>
    [Repeat for all categories]
  </api_endpoints_summary>

  <ui_layout>
    <main_structure>
      [Layout description]
    </main_structure>
    [Additional UI sections as needed]
  </ui_layout>

  <design_system>
    <color_palette>
      [Colors]
    </color_palette>
    <typography>
      [Font preferences]
    </typography>
  </design_system>

  <implementation_steps>
    <step number="1">
      <title>[Phase Title]</title>
      <tasks>
        - [Task 1]
        - [Task 2]
      </tasks>
    </step>
    [Repeat for all phases]
  </implementation_steps>

  <success_criteria>
    <functionality>
      [Functionality criteria]
    </functionality>
    <user_experience>
      [UX criteria]
    </user_experience>
    <technical_quality>
      [Technical criteria]
    </technical_quality>
    <design_polish>
      [Design criteria]
    </design_polish>
  </success_criteria>
</project_specification>

2. Update `initializer_prompt.md`

Output path: $ARGUMENTS/prompts/initializer_prompt.md

If the output directory has an existing initializer_prompt.md, read it and update the feature count. If not, copy from .claude/templates/initializer_prompt.template.md first, then update.

CRITICAL: You MUST update the feature count placeholder:

Find the line containing **[FEATURE_COUNT]** in the "REQUIRED FEATURE COUNT" section
Replace [FEATURE_COUNT] with the exact number agreed upon in Phase 4L (e.g., 25)
The result should read like: You must create exactly **25** features using the...

Example edit:

Before: **CRITICAL:** You must create exactly **[FEATURE_COUNT]** features using the `feature_create_bulk` tool.
After:  **CRITICAL:** You must create exactly **25** features using the `feature_create_bulk` tool.

Verify the update: After editing, read the file again to confirm the feature count appears correctly. If [FEATURE_COUNT] still appears in the file, the update failed and you must try again.

Note: You may also update coding_prompt.md if the user requests changes to how the coding agent should work. Include it in the status file if modified.

3. Write Status File (REQUIRED - Do This Last)

Output path: $ARGUMENTS/prompts/.spec_status.json

CRITICAL: After you have completed ALL requested file changes, write this status file to signal completion to the UI. This is required for the "Continue to Project" button to appear.

Write this JSON file:

{
  "status": "complete",
  "version": 1,
  "timestamp": "[current ISO 8601 timestamp, e.g., 2025-01-15T14:30:00.000Z]",
  "files_written": [
    "prompts/app_spec.txt",
    "prompts/initializer_prompt.md"
  ],
  "feature_count": [the feature count from Phase 4L]
}

Include ALL files you modified in the files_written array. If the user asked you to also modify coding_prompt.md, include it:

{
  "status": "complete",
  "version": 1,
  "timestamp": "2025-01-15T14:30:00.000Z",
  "files_written": [
    "prompts/app_spec.txt",
    "prompts/initializer_prompt.md",
    "prompts/coding_prompt.md"
  ],
  "feature_count": 35
}

IMPORTANT:

Write this file LAST, after all other files are successfully written
Only write it when you consider ALL requested work complete
The UI polls this file to detect completion and show the Continue button
If the user asks for additional changes after you've written this, you may update it again when the new changes are complete

AFTER FILE GENERATION: NEXT STEPS

Once files are generated, tell the user what to do next:

"Your specification files have been created in $ARGUMENTS/prompts/!

Files created:

$ARGUMENTS/prompts/app_spec.txt

$ARGUMENTS/prompts/initializer_prompt.md

The Continue to Project button should now appear. Click it to start the autonomous coding agent!

If you don't see the button: Type /exit or click Exit to Project in the header.

Important timing expectations:

First session: The agent generates features in the database. This takes several minutes.

Subsequent sessions: Each coding iteration takes 5-15 minutes depending on complexity.

Full app: Building all [X] features will take many hours across multiple sessions.

Controls:

Press Ctrl+C to pause the agent at any time

Run start.bat (Windows) or ./start.sh (Mac/Linux) to resume where you left off"

Replace [X] with their feature count.

IMPORTANT REMINDERS

Meet users where they are: Not everyone is technical. Ask about what they want, not how to build it.
Quick Mode is the default: Most users should be able to describe their app and let you handle the technical details.
Derive, don't interrogate: For non-technical users, derive database schema, API endpoints, and architecture from their feature descriptions. Don't ask them to specify these.
Use plain language: Instead of "What entities need CRUD operations?", ask "What things can users create, edit, or delete?"
Be thorough on features: This is where to spend time. Keep asking follow-up questions until you have a complete picture.
Derive feature count, don't guess: After gathering requirements, tally up testable features yourself and present the estimate. Don't use fixed tiers or ask users to guess.
Validate before generating: Present a summary including your derived feature count and get explicit approval before creating files.

BEGIN

Start by greeting the user warmly. Ask ONLY the Phase 1 questions:

"Hi! I'm here to help you create a detailed specification for your app.

Let's start with the basics:

What do you want to call this project?

In your own words, what are you building?

Who will use it - just you, or others too?"

STOP HERE and wait for their response. Do not ask any other questions yet. Do not use AskUserQuestion yet. Just have a conversation about their project basics first.

After they respond, acknowledge what they said, then move to Phase 2.

20 KiB Raw Blame History